rjbs forgot what he was saying


fixing my accidentally strict mail rules… or not

by rjbs, created 2014-02-24 20:36

I recently made some changes to Ywar, my personal goal tracker, and I couldn't be happier! Mostly.

Ywar is configured with a list of "checks." Each check looks up some datum, compares it to previous measurements, decides whether the goal state was met, and saves the current measurement. The checks used to run once a day, at 23:00. This meant that, for the most part, the feedback I got was the next morning in my daily agenda mail. I could hit refresh at 23:05, if I wanted, and if I was awake. If I did something at 8:00, I'd just have to remember. For the most part, this wasn't a big problem, but I wanted to be able to run things more often.

Last week, when I was working on my Goodreads/Ywar integration, I also made the changes needed to run ywar update more often. There were two main changes: every measurement now carries a log of whether it resulted in goal completion, and checks don't get the last measured value, but the "last state," which contains both the last value measured and the value measured at the last completion.

While I was at it, I added Pushover notifications. Now, when I get up in the morning, I step on my scale. A few minutes later, my phone bleeps, telling me, "Good job! You stepped on the scale!" Over breakfast, I might read an article I've saved to Instapaper. While I do the dishes, or maybe while I read a second article, my iPad bleeps. "Good job! You read something from Instapaper!"

This is surprisingly motivating. I'm completing goals much more often than I used to, now. (The Goodreads integration has also been really motivating.)

This change also inadvertently introduced a pretty significant change in my email rules. Most of them follow the same pattern, which is something like this:

  • at least once every five days, have less unread mail than the previous day

Some of them say "flagged" instead of "unread," or limit their checks to specific folders, but the pattern is pretty much always like the one above. When I started passing each check both the "last measured" and "last completion" values, I had to decide which they'd use for computing whether the goal was completed. In every case, I chose "last completion." That means that the difference checked is always between the now and the last time we met our goal. This has a massive impact here.

It used to be that all I had to do to keep my "keep reading email" goal alive was to reduce my unread mail count from the previous date. Imagine the following set of end-of-day unread message counts:

  • Sunday: 50
  • Monday: 100
  • Tuesday: 70
  • Wednesday: 100
  • Thursday: 75
  • Friday: 80
  • Saturday: 70

Under the old rules, I would get three completions in that period. On each of Tuesday, Thursday, and Saturday, the count of unread messages goes down from the previous day.

Under the new rules, I would get only one completion: Tuesday. After that, the only way, ever, to get another completion is to get down to below 70 unread messages. Maybe in a few days, I get to 60, and now that's my target. This gets pretty unforgiving pretty fast! My current low water mark for unread mail is 28, and I get an average of 126 new messages each day. These goals actually have a minimum threshold, so that anything under the threshold counts, even if last time I was further below it. Right now, it's set at 10 for my unread mail goal.
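To make the difference concrete, here's a sketch of both rules in Perl. This is my own illustration, not Ywar's actual code; in particular, the fallback for a first-ever completion (comparing against the previous measurement) is an assumption:

```perl
use strict;
use warnings;

# End-of-day unread message counts, Sunday through Saturday.
my @counts = (50, 100, 70, 100, 75, 80, 70);

# Old rule: a completion any day the count drops below the previous day's.
sub old_rule_completions {
  my (@c) = @_;
  my $done = 0;
  for my $i (1 .. $#c) {
    $done++ if $c[$i] < $c[$i - 1];
  }
  return $done;
}

# New rule: a completion only when the count drops below the count recorded
# at the last completion.  When there has been no completion yet, fall back
# to the previous measurement (my assumption about the first run).
sub new_rule_completions {
  my (@c) = @_;
  my $done = 0;
  my $last_completion;
  for my $i (1 .. $#c) {
    my $target = $last_completion // $c[$i - 1];
    if ($c[$i] < $target) {
      $done++;
      $last_completion = $c[$i];
    }
  }
  return $done;
}

printf "old rule: %d completions\n", old_rule_completions(@counts);  # 3
printf "new rule: %d completion\n",  new_rule_completions(@counts);  # 1
```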

It would be pretty easy to fix this to work like it used to work. I'd get the latest measurement made yesterday and compare to that. I'm just not sure that I should restore that behavior. The old behavior made it very easy to read the easy mail and ignore the stuff that really needed my time. I could let some mail pile up on Wednesday, read the light stuff on Thursday, and I'd still get my point. I kept thinking that I needed something "more forgiving," but I don't think that's true. I don't even think it makes sense. What would "more forgiving" mean? I'm not sure.

One thing to consider is that if I can never keep a streak alive, I won't bother trying. It can't be too difficult. It has to seem possible, and to be possible, without being a huge chore. It just shouldn't be so easy that no progress is really being made.

Also, I need to make sure that once I've broken my streak, any progress starts me up again. If I lose my streak and end up with 2000 messages, having to get back to 25 is going to be a nightmare. My original design was written with this in mind: any progress was progress enough. The new behavior ratchets the absolute maximum down, so that once I've gotten rid of most of those 2000 messages, I can't let them pile back up by ignoring 5 one day, 5 the next, and then reading 6 the third. Maybe the real solution will be to keep exactly the behavior I have, but to fiddle with the minimum threshold.

The other thing I want to think about, eventually, is message age. I don't want to ding myself for mail received "just now." If a message hasn't been read for a week, I should really read it immediately. If it's just come in this afternoon, though, it should basically be ignored in my counts. For now, though, I think I can ignore that. After all, my goal here is to read email, not to spend my email reading time on writing tools to remind me of that goal!

freemium: the abomination of desolation

by rjbs, created 2014-02-24 10:38
tagged with: @markup:md games journal

I've never been a fan of "freemium," although I understand that game developers need to get paid. It often feels like the way freemium games are developed goes something like this:

  • design a good game
  • focus on making the player want to keep playing
  • insert arbitrary points at which the player must stop playing for hours
  • allow the user to pay money to continue playing immediately

This model drives me batty. It's taking a game and making it worse to encourage the user to pay more. It is, in my mind, the opposite of making a good game that you can make better by paying more. I gladly fork over money for add-on content on games that were good to start with. I never, ever pay to repair a game that has been broken on purpose.

The whole thing reminds me of the 486SX processor, where you could buy a disabled 486 processor now and later upgrade it with a completely new processor that was pinned to only fit into the add-on slot. At least the 486SX could be somewhat explained away as a means to make some money on processors that didn't pass post-production inspection. These freemium games are just broken on purpose from the start.

I think the deciding factor for me is whether I can play the game as much as I want without hitting the pay screen. Years ago, everyone at work was playing Travian. It's a simple browser-based nation-building game, something like a very simplified Civilization. Your workers collect resources and you use them to build cities, troops, and so on. The game is multiplayer and online, so you are in competition with other nations with whom you may eventually go to war or with whom you may establish trade routes. You can keep playing as long as you have resources to spend and free workers. By paying money, you could speed up work or acquire more resources, but the game didn't throw up a barrier every half hour forcing you to wait. It was all a natural part of the game's design, and made sense to have in a simultaneous-play multiplayer game. (Of course, the problem here is that players willing to spend more money have a tactical advantage. That's a different kind of problem, though.)

I used to play an iOS game called Puzzle Craft. The basic game play is tile-matching, and it's all built around the idea that you're the founder of a village that you want to grow into a thriving kingdom. At first, you tile-match to grow crops. Over time, new kinds of tiles are added, and you can respond by developing new tools and by changing the matching rules. You can also build a mine, for a similar but not identical tile matching game. You'll need to deal with both resources to progress along your quest.

I was very excited to see that the makers of Puzzle Craft released a new game this week, Another Case Solved. It's a tile-matcher built in a larger framework, just like Puzzle Craft, but this game is a silly hard boiled detective game. Matching tiles helps you solve mysteries. The game is fun to look at and listen to, but playing it has made me angrier and angrier.

Unlocking major cases requires solving minor cases. Solving minor cases requires a newspaper in which to find them. Newspapers are delivered every fifteen minutes, and you can't have more than three or four of them at a time. In other words, if you want to play more than four (very short) games an hour, you have to spend "candy" to get more newspapers, and you get a piece or two of candy every 12 hours. Also, after a little while, the minor cases become extremely difficult to solve, meaning that every hour you're allowed to play the game three or four times, and that you will probably lose most of them, because there is a low turn limit in each game. Of course, you can keep playing after the turn limit by paying candy.

The whole setup makes it completely transparent that the time and turn limits are there to cajole the player into paying to be allowed to play the free game. It sticks in my craw! I like the game. It is fun. I would pay for it, were it something I could buy at a fixed price. Microtransactions to continue playing the game, though, burn me up.

Maybe I should keep telling myself that I pumped a lot of quarters into Gauntlet when I was a kid. How different is this?

I think it's pretty different. I've seen people play for a very, very long time on one quarter.

integrating Ywar with Goodreads

by rjbs, created 2014-02-17 20:37

Ywar is a little piece of productivity software that I wrote. I've written about Ywar before, so I won't rehash it much. The idea is that I use The Daily Practice to track whether I'm doing things that I've decided to do. I track a lot of these things already, and Ywar connects up my existing tracking with The Daily Practice so that I don't have to do more work on top of the work I'm already doing. In other words, it's an attempt to make my data work for me, rather than just sit there.

For quite a while now, only a few of my TDP goals needed manual entry, and most of them could clearly be automated. It wasn't clear, though, how to automate my "keep reading books" tasks. I knew Goodreads existed, but it seemed like using Goodreads would be just as much work as using TDP. Either way, I have to go to a site and click something for each book. I kept thinking about how to make my reading goals more motivating and more interesting, but nothing occurred to me until this weekend.

I was thinking about how it's hard for me to tell how long it will take me to finish a book. Lately, I'm taking an age to read anything. Catch-22 is about 500 pages and I've been working on it since January 2. Should I be able to do more? I'm not sure. My current reading goals have been very vague. I thought of them as, "spend 'enough time' reading a book from each shelf once every five days." This makes it easy to decide sloppily whether I've read enough, but it's always an all-at-once decision.

In Goodreads, I can keep track of my progress over several days. That means I can change my goal to "get at least 50 pages read a week." There's no fuzzy logic there, just simple page count. It might not be right for every book, but I can adjust it as needed. If it's too low or high, I can fix that too. It seemed like a marked improvement, and it also gave me a reason to consider looking at Goodreads a bit more, where I've seen some interesting recommendations.

With my mind made up, all I had to do was write the code. Almost every time I've wanted to write code to talk to the developer API of a service that's used mostly through its website rather than its API, the API has been sort of a mess: usable, but weird and a little annoying. So it was with Goodreads. The code for my Goodreads/Ywar integration is on GitHub. Below is just some of the weirdness I got to encounter.

This request gets the books on my "currently reading" shelf as XML.

sprintf 'https://www.goodreads.com/review/list?format=xml&v=2&id=%s&key=%s&shelf=currently-reading',

The resource is review/list because it's a list of reviews. Go figure! That doesn't mean that there are actually any reviews, though. In Goodreads, a review represents the intersection of a user and a book. If it's on your shelf, it has a review. If there's no review in the usual sense of the word, it just means that the review's body is empty.

The XML document that you get in reply has a little bit of uninteresting data, followed by a <reviews> element that contains all the reviews for the page of results. Here's a review:

  <id type="integer">168668</id>
  <text_reviews_count type="integer">7875</text_reviews_count>
  <title>Catch-22 (Catch-22, #1)</title>
  <publisher>Simon &amp; Schuster </publisher>
  <description>...omitted by rjbs...</description>
  <name>Joseph Heller</name>
  <shelf name="currently-reading" />
  <shelf name="literature" />
  <started_at>Thu Jan 02 17:04:20 -0800 2014</started_at>
  <date_added>Tue Nov 26 08:37:09 -0800 2013</date_added>
  <date_updated>Thu Jan 02 17:04:20 -0800 2014</date_updated>

It's XML. It's not really that bad, either. One problem, though, was that it didn't include my current position. My current position in the book is not a function of my review, but of my status updates. I'll need to get those, too.

I was intrigued, though, by the format=xml in the URL. Maybe I could get it as JSON! I tried, and I got this:


Well! That's certainly briefer. It's also, obviously, missing a ton of data. It doesn't include book titles, total page count, or any shelves other than the one that I requested. That is: note that in the XML you can see that the book is on both currently-reading and literature. In the JSON, only currently-reading is listed. Still, it turns out that this is all I need, so it's all I fetch. I get the JSON contents of my books in progress, and then once I have them, I can get each review in full from this resource:

  sprintf 'https://www.goodreads.com/review/show.xml?key=%s&id=%s',

Why does that help? I mean, what I got in the first request was a review, too, right? Well, yes, but when you get the review via review/show.xml, you get a very slightly different set of data. In fact, almost the only difference is the inclusion of comment and user_status items. It's a bit frustrating, because in both cases you're getting a review element, and their ids are the same, but their contents are not. It makes it a bit less straightforward to write an XML-to-object mapper.

When I get review 774476430, which is my copy of Catch-22, this is the first user status in the review:

    <chapter type="integer" nil="true"/>
    <comments_count type="integer">0</comments_count>
    <created_at type="datetime">2014-02-16T12:47:14+00:00</created_at>
    <id type="integer">39382590</id>
    <last_comment_at type="datetime" nil="true"/>
    <note_updated_at type="datetime" nil="true"/>
    <note_uri nil="true"/>
    <page type="integer" nil="true"/>
    <percent type="integer">68</percent>
    <ratings_count type="integer">0</ratings_count>
    <updated_at type="datetime">2014-02-16T12:47:14+00:00</updated_at>
    <work_id type="integer">814330</work_id>

By the way, the XML you get back isn't nicely indented as above. It's not entirely unindented, either. It's sometimes properly indented and sometimes just weird. I think I'd be less weirded out if it just stuck to being a long string of XML with no indentation at all, but mostly libxml2 reads the XML, not me, so I should shut up.

The important things above are the page and percent items. They tell me how far through the book I am as of that status update. If I gave a page number when updating, the page element won't have "true" as its nil attribute, and the text content of the element will be a number. If I gave a percentage when updating, as I did above, you get what you see above. I can convert a percentage to a page count by using the num_pages found on the book record. The whole book record is present in the review, as it was the first time, so I just get all the data I need this time via XML.
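The conversion itself is one line of arithmetic; here's a sketch (the 500-page count below is a round hypothetical, not Catch-22's actual length):

```perl
use strict;
use warnings;

# Convert a percent-complete status update into a page number, using the
# num_pages from the book record embedded in the review.
sub percent_to_page {
  my ($percent, $num_pages) = @_;
  return int($percent * $num_pages / 100);
}

# A status update reporting 68%, against a hypothetical 500-page book:
printf "on page %d\n", percent_to_page(68, 500);   # on page 340
```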

Actually, though, there's a reason to get the XML the first time. Each time that I do this check, it's for in-progress books on a certain shelf. If I start by getting the XML, I can then proceed only with books that are also on the right shelf, like, above, "literature." Although you can specify multiple shelves to the review/list endpoint, only one of them is respected. If there are four books on my "currently reading" shelf, but only one is "literature," then by getting XML first, I'll do two queries instead of five.

So I guess I should go back and start with the XML.

By the way, did you notice that review/list takes a query parameter called format, which can be either XML or JSON, and maybe other things... but that review/show.xml includes the type in the path? You can't change the xml to json and get JSON. You just get a 404 instead.

In the end, making Ywar get data from Goodreads wasn't so bad. It had some annoying moments, as is often the case when using a mostly-browser-based web service's API. It made me finally use XML::LibXML for some real work, and hopefully it will lead to me using Goodreads more and getting some value out of that.

games I've played lately

by rjbs, created 2014-02-03 21:51
tagged with: @markup:md games journal

About a year ago, I told Mark Dominus that I wanted to learn to play bridge, but that it was tough to find friends who were also interested. (I'd rather play with physically-present people whom I know than online with strangers.) He said, "Sackson's Gamut of Games has a two-player bridge variant." I had never heard of Sackson, A Gamut of Games, or the two-player variant. I said, "oh, cool," and went off to look into it all.

So, A Gamut of Games is a fantastic little book that you can get at Amazon for about ten bucks. (That's a kickback-to-rjbs link, by the way.) It's got about thirty games in it, most of which you've probably never heard of. I've played only about a fifth of them, or less, and so far they've been a lot of fun. The first game in the book is called Mate, and is meant to feel a bit like chess, which it does. It's a game of pure skill, which is quite unusual for a card game. When I first got the book, I would teach the game to everybody. It was easy to learn and play, fun, and a novelty. I also taught Martha how to play, using a Bavarian deck of cards, which made the game more fun for her. We'd go to the neighborhood bar, get some chicken tenders and beer (root and otherwise) and play a few hands.

I got to wondering why more of the games in Gamut weren't available electronically. I found a Lines of Action app for iOS, but it was a bit half-baked, on the network side. Then, through Board Game Geek, I found a site called Super Duper Games offering online Mate, in the guise of "Chess Cards." It's a really mixed experience, but I am a big fan.

The site's sort of ugly, and incredibly slow. There are some weird display issues and some things that are, if not bugs, darn close. On the other hand, it's got dozens of cool board games that you haven't played before, and you can play them online, against your friends or strangers. If you join, challenge me to something. I'm rjbs.

So far, I've only played a small number of the available games.

One of my favorites is Abande. Like many SDG games, I think it would be even better played in real time, and I'm hoping to produce a board for playing it. Another great one is Alfred's Wyke, which should be easy to play at a table with minimal equipment. I think I'll play it with wooden coins, which I bought in bulk several years ago.

There's also Amazons, Archimedes, Aries, and many more.

In many cases, the games would be improved by realtime play, I think, but only one game has struck me as greatly hampered by its electronic form. Tumblewords is a cross between Scrabble and Connect Four, which sounds pretty great. It seems like it probably is pretty great, too, but it's got a problem. In some games, like cribbage, part of the goal is to correctly identify what you've just scored. Similarly, in Tumblewords, part of the challenge should be noting all the words that each move introduces. On Super Duper Games, the computer does this for you, using a dictionary. It means you get points for all sorts of words that you'd never have noticed otherwise. I think I may have to play this one in real space before anything else!

Check out Super Duper Games, even if only to read the rules for the games there and play them. Or, maybe try playing something! If you don't want to challenge me, there are dozens of open challenges sitting around at any time.

Dist::Zilla is for lovers

by rjbs, created 2014-01-25 11:21
last modified 2014-01-25 14:20
tagged with: @markup:md journal perl

I don't like getting into the occasional arguments about whether Dist::Zilla is a bad thing or not. Tempers often seem to run strangely high around this point, and my position should, at least along some axes, be implicitly clear. I wrote it and I still use it and I still find it to have been worth the relatively limited time I spent doing it. Nonetheless, as David Golden said, "Dist::Zilla seems to rub some people the wrong way." These people complain, either politely or not, and that rubs people who are using Dist::Zilla the wrong way, and as people get irritated with one another, their arguments become oversimplified. "What you're doing shows that you don't care about users!" or "Users aren't inconvenienced at all because there are instructions in the repo!" or some other bad over-distillation.

The most important thing I've ever said on this front, or probably ever will, is that Dist::Zilla is a tool for adjusting the trade-offs in maintaining software projects. In many ways, it was born as a replacement for Module::Install, which was the same sort of thing, adjusting trade-offs from the baseline of ExtUtils::MakeMaker. I didn't like either of those, so I built something that could make things easier for me without making install-time weird or problematic. This meant that contributing to my repository would get weird or problematic for some people. That was obvious, and it was something I weighed and thought about and decided I was okay with. It also meant, for me, that if somebody wanted to contribute and was confused, it would be up to me to help them, because I wanted, personally, to make them feel like I was interested in working with them¹. At any rate, of course it's one more thing to know, to know what the heck to do when you look at a git repository and see no Makefile.PL or Build.PL, and having to know one more thing is a drag. Dist::Zilla imposes that drag on outsiders (at least in its most common configurations), and it has to be used with that in mind.

Another thing I've often said is that Dist::Zilla is something to be used thoughtfully. If it was a physical tool, it would be yellow with black stripes, with a big high voltage glyph on it. It's a force multiplier, and it lets you multiply all kinds of force, even force applied in the wrong direction. You have to aim really carefully before pulling the trigger, or you might shoot quite a lot of feet, a surprising number of which will belong to you.

If everybody who was using Dist::Zilla thought carefully about the ways that it's shifting around who gets inconvenienced by what, I like to imagine that there would be considerably fewer straw man arguments about how nobody's really being inconvenienced. Similarly, if everybody who felt inconvenienced by an author's choice in build tools started from the idea that the author has written and given away their software to try and help other users, there might be fewer ungracious complaints that the author's behavior is antisocial and hostile.

Hopefully my next post will be about some fun code or maybe D&D.

1: My big failure on this front, I think, is replying promptly, rather than not being a big jerk. I must improve, I must improve, I must improve...

Dist::Zilla and line numbering

by rjbs, created 2014-01-14 11:22

brian d foy wrote a few times lately about potential annoyances distributed across various parties through the use of Dist::Zilla. I agree that Dist::Zilla can shuffle around the usual distribution of annoyances, and am happy with the trade-offs that I think I'm making; other people want different trade-offs. What I don't like, though, is adding annoyance for no gain, or when it can be easily eliminated. Most of the time, if I write software that does something annoying and leave it that way for a long time, it's actually a sign that it doesn't annoy me. That's been the case, basically forever, with the fact that my Dist::Zilla configuration builds distributions where the .pm files' line numbers don't match the line numbers in my git repo. That means that when someone says "I get a warning from line 10," I have to compare the released version to the version in git. Sometimes, that someone is me. Either way, it's a cost I decided was worth the convenience.

Last week, just before heading out for dinner with ABE.pm, I had the sudden realization that I could probably avoid the line number differences in my shipped dists. The realization was sparked by a little coincidence: I was reminded of the problem just after having to make some unrelated changes to an unsung bit of code responsible for creating most of the problem.


Pod::Weaver is the tool I use to rewrite my sort-of-Pod into actual-Pod and to add boilerplate. I really don't like working with Pod::Simple or Pod::Parser, nor did I like a few of the other tools I looked at, so when building Pod::Weaver, I decided to also write my own lower-level Pod-munging tool. It's something like HTML::Tree, although much lousier, and it stops at the paragraph level. Formatting codes (aka "interior sequences") are not handled. Still, I've found it very useful in helping me build other Pod tools quickly, and I don't regret building it. (I sure would like to give it a better DAG-like abstraction, though!)

The library is Pod::Elemental, and there's a tool called Pod::Elemental::PerlMunger that bridges the gap between Dist::Zilla::Plugin::PodWeaver and Pod::Weaver. Given some Perl source code, it does this:

  1. make a PPI::Document from the source code
  2. extract the Pod elements from the PPI::Document
  3. build a Pod::Elemental::Document from the Pod
  4. pass the Pod and (Pod-free) PPI document to an arbitrary piece of code, which is expected to alter the documents
  5. recombine the two documents, generally by putting the Pod at the end of the Perl

The issue was that step two, extracting Pod, was deleting all the Pod from the source code. Given this document:

package X;

=head1 OVERVIEW

X is the best!

=cut

sub do_things { ... }

...we would rewrite it to look like this:

package X;

sub do_things { ... }

=head1 OVERVIEW

X is the best!

=cut

...we'd see do_things as being line 9 in the pre-munging Perl, but line 3 in the post-munging Perl. Given a more realistic piece of code with interleaved Pod, you'd expect the difference in line numbers to increase as you got later into the munged copy.
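The drift is easy to demonstrate without PPI at all. Here's a regex caricature of the old behavior, my own sketch rather than PerlMunger's actual code, which cuts Pod out and appends it at the end:

```perl
use strict;
use warnings;

# Find the 1-based line number of the first line containing $needle.
sub line_of {
  my ($src, $needle) = @_;
  my @lines = split /\n/, $src;
  for my $i (0 .. $#lines) {
    return $i + 1 if index($lines[$i], $needle) >= 0;
  }
  return;
}

# Cut every Pod block (=directive ... =cut) out of the source and append
# it at the end, collapsing the blank lines left behind.  (The real
# PerlMunger works on a PPI::Document, not on regexes.)
sub move_pod_to_end {
  my ($src) = @_;
  my @pod;
  $src =~ s/(^=\w+.*?^=cut\n)/push @pod, $1; q{}/gmse;
  $src =~ s/\n{3,}/\n\n/g;    # tidy the gaps the removed Pod left behind
  return $src . "\n" . join("\n", @pod);
}

my $doc = <<'END';
package X;

=head1 OVERVIEW

X is the best!

=cut

sub do_things { ... }
END

my $munged = move_pod_to_end($doc);
printf "before: sub do_things on line %d\n", line_of($doc,    'sub do_things');
printf "after:  sub do_things on line %d\n", line_of($munged, 'sub do_things');
# before: line 9; after: line 3
```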

I heard the suggestion, many times, to insert `#line` directives to keep the reported line numbers matching. I loathed this idea. Not only would it be an accounting nightmare in case anything else wanted to rewrite the file, but it meant that the line numbers in errors wouldn't match the file that the user would have installed! It would make it harder to debug problems in an emergency, which is never okay with me.

There was a much simpler solution, which occurred to me out of the blue and made me feel foolish for not having thought of it when writing the original code. I'd rewrite the document to look like this:

package X;

# =head1 OVERVIEW
#
# X is the best!
#
# =cut

sub do_things { ... }

=head1 OVERVIEW

X is the best!

=cut

Actually, my initial idea was to insert stretches of blank lines. David Golden suggested just commenting out the Pod. I implemented both and started off using blank lines myself. After a little while, it became clear that all that whitespace was going to drive me nuts. I switched my code to producing comments, instead. It's not the default, though. The default is to keep doing what it has been doing.

It works like this: PerlMunger now has an attribute called `replacer`, which refers to a subroutine or method name. It's passed the Pod token that's about to be removed, and it returns a list of tokens to put in its place. The default replacer returns nothing. Other replacers are built in to return blank lines or commented-out Pod. It's easy to write your own, if you can think of something you'd like better.
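The comment-producing replacer amounts to this idea, sketched here line-wise rather than as Pod::Elemental tokens: each Pod line is replaced in place, so the line count never changes and nothing below it moves.

```perl
use strict;
use warnings;

# Replace each line of every Pod block with the same line commented out.
# The line count is unchanged, so code below the Pod keeps its line number.
# (A sketch of the idea only; the real replacer works on Pod tokens.)
sub comment_out_pod {
  my ($src) = @_;
  my ($in_pod, @out) = (0);
  for my $line (split /\n/, $src, -1) {
    $in_pod = 1 if $line =~ /^=\w/;
    my $was_pod = $in_pod;
    $in_pod = 0 if $line =~ /^=cut/;
    push @out, $was_pod ? "# $line" : $line;
  }
  return join "\n", @out;
}

my $doc = <<'END';
package X;

=head1 OVERVIEW

X is the best!

=cut

sub do_things { ... }
END

my @lines = split /\n/, comment_out_pod($doc);
printf "line 3 is now: %s\n", $lines[2];   # line 3 is now: # =head1 OVERVIEW
printf "sub do_things is still on line %d\n",
  1 + (grep { $lines[$_] =~ /sub do_things/ } 0 .. $#lines)[0];
```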

Karen Etheridge suggested another little twist, which I also implemented. It may be the case that you've got Pod interleaved with your code, and that some of it ends up after the last bits of code. Or, maybe in some documents, you've got all your Pod after the code, but in others, you don't. If your concern is just keeping the line numbers of code the same, who cares about the Pod that won't affect those line numbers? You can specify a `post_code_replacer` for replacing the Pod tokens after any relevant code. I decided not to use that, though. I just comment it all out.


Pod rewriting wasn't the only thing affecting my line numbers. The other thing was the insertion of a $VERSION assignment, carried out by the core plugin PkgVersion. Its rules are simple:

  1. look for each package statement in each Perl file
  2. skip it if it's private (i.e., there's a line break between package and the package name)
  3. insert a version assignment on the line after the package statement

...and a version assignment looked like this:

  { $My::Package::VERSION = '1.234'; }

Another version-assignment-inserter exists, OurPkgVersion. It works like this:

  1. look for each comment like # VERSION
  2. put, on the same line: our $VERSION = '1.234';

I had two objections to just switching to OurPkgVersion. First, the idea of adding a magic comment that conveyed no information, and served only as a marker, bugged me. This is not entirely rational, but it bugged me, and I knew myself well enough to know that it would keep bugging me forever.

The other objection is more practical. Because the version assignment uses our and does not wrap itself in a bare block, it means that the lexical environment of the rest of the code differs between production and test. This is not likely to cause big problems, but when it does cause problems, I think they'll be bizarre. Best to avoid that.

Of course, I could have written a patch to OurPkgVersion to insert braces around the assignment, but I didn't, because of that comment thing. Instead, I changed PkgVersion. First off, I changed its assignment to look like this:

$My::Package::VERSION = '1.234';

Note: no enclosing braces. They were an artifact of an earlier time, and served no purpose.

Then, I updated its rules of operation:

  1. look for each package statement in each Perl file
  2. skip it if it's private (i.e., there's a line break between package and the package name)
  3. skip forward past any full-line comments following the package statement
  4. if you ended up at a blank line, put the version assignment there
  5. otherwise, insert a new line

This means that as long as you leave a blank line after your package statement, your code's line numbers won't change. I now leave that blank line after the # ABSTRACT comment that follows my package statements. (Why do the VERSION comments bug me, but not the ABSTRACT comments? The ABSTRACT comments contain more data — the abstract — that can't be computed from elsewhere.) Now, this can still fall back to inserting lines, but that's okay, because what I didn't include in the rules above is this: if configured with die_on_line_insertion = 1, PkgVersion will throw an exception rather than insert lines. This means that as I release the next version of all my dists, I'll hit cases once in a while where I can't build because I haven't made room for a version assignment. That's okay with me!
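The new rules can be sketched in a few lines. This is my own line-oriented illustration of the bookkeeping, not the real plugin, which works on a PPI document:

```perl
use strict;
use warnings;

# Insert a version assignment after each package statement, preferring to
# fill an existing blank line (skipping full-line comments first) so the
# file's line count never changes.  When there's no blank line to fill,
# die instead of inserting one, like die_on_line_insertion = 1.
sub add_version {
  my ($src, $version) = @_;
  my @lines = split /\n/, $src, -1;
  for my $i (0 .. $#lines) {
    # A "private" package statement -- a line break between "package" and
    # the name -- never matches this single-line pattern, so it's skipped.
    next unless $lines[$i] =~ /^package\s+(\S+);/;
    my $pkg = $1;
    my $j = $i + 1;
    $j++ while $j <= $#lines && $lines[$j] =~ /^#/;   # skip # ABSTRACT, etc.
    die "no room for a version assignment in $pkg\n"
      unless $j <= $#lines && $lines[$j] =~ /^\s*$/;
    $lines[$j] = "\$${pkg}::VERSION = '$version';";
  }
  return join "\n", @lines;
}

my $src = <<'END';
package X;
# ABSTRACT: do things

sub do_things { ... }
END

print add_version($src, '1.234');
```

The blank line on line 3 becomes `$X::VERSION = '1.234';`, and the file stays four lines long.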

I'm very happy to have made these changes. I might never notice the way in which I benefit from them, because they're mostly going to prevent me from having occasional annoyances in the future, but I feel good about that. I'm so sure that they're going to reduce my annoyance, that I'll just enjoy the idea of it now, and then forget, later, that I ever did this work.

making my daemon share more memory (body)

by rjbs, created 2014-01-10 19:45
last modified 2014-01-10 19:45
Quick refresher: when a unix process forks, the new child shares memory with its parent until one of them starts making changes. Lots of stuff is in memory, including your program's code. This means that if you're going to `require` a lot of Perl modules, you should strongly consider loading them early rather than late. Although a runtime `require` statement can make a program start faster, it's often a big loss for a forking daemon: the module gets re-compiled for every forked child, multiplying both the time and memory cost.
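
Here's a tiny self-contained demonstration of the idea, with File::Spec standing in for a genuinely heavy module like Moose or DBI:

```perl
use strict;
use warnings;
use File::Spec;   # loaded pre-fork, so its compiled code is shared

my $pid = fork;
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
  # Child: the module came along with the fork; it's already in %INC,
  # so no recompilation (and no new memory) is needed here.
  exit(exists $INC{'File/Spec.pm'} ? 0 : 1);
}

waitpid $pid, 0;
print $? == 0 ? "child inherited the preloaded module\n"
              : "child had to load it itself\n";
```

Had the `use` been a post-fork `require` instead, every child would pay the compilation cost again.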

Today I noticed that one of the daemons I care for was loading some code post-fork, and I thought to myself, "You know, I've never audited that program to see whether it does a good job at loading everything pre-fork." I realized that it might be a place to quickly get a lot of benefit, assuming I could figure out what was getting loaded post-fork. So I wrote this:

use strict;
use warnings;
package PostForkINC;

sub import {
  my ($self, $code) = @_;

  my $pid = $$;

  my $callback = sub {
    return if $pid == $$;
    my (undef, $filename) = @_;

    # A post-fork load!  Hand the filename to the user's callback, then
    # return nothing so the require continues down @INC as usual.
    $code->($filename);
    return;
  };

  unshift @INC, $callback;
}

1;

When loaded, PostForkINC puts a callback at the head of @INC so that any subsequent attempt to load a module hits the callback. As long as the process hasn't forked (that is, $$ is still what it was when PostForkINC was loaded), nothing happens. If it has forked, though, something happens. That "something" is left up to the user.

Sometimes I find a branch of code that I don't think is being traversed anymore. I love deleting code, so my first instinct is to just delete it… but of course that might be a mistake. It may be that the code is being run but that I don't see how. I could try to figure it out through testing or inspection, but it's easier to just put a little wiretap in the code to tell me when it runs. I built a little system called Alive. When called, it sends a UDP datagram about what code was called, where, and by whom (and what). A server receives the datagram (usually) and makes a note of it. By using UDP, we keep the impact on the code being inspected very low. This system has helped find a bunch of code being run that was thought long dead.
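
The wiretap idea can be sketched like this. To be clear, this is my own illustrative sketch, not Alive's actual code; the collector address and the datagram format are invented:

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Fire-and-forget instrumentation: send one UDP datagram saying "this code
# ran, here, called like this."  If the collector is down, the datagram is
# simply lost and the instrumented code never notices.
sub alive {
  my ($note) = @_;
  my ($pkg, $file, $line) = caller;

  my $sock = IO::Socket::INET->new(
    Proto    => 'udp',
    PeerAddr => '127.0.0.1',   # invented collector host
    PeerPort => 30303,         # invented collector port
  ) or return;                 # never let failure affect the caller

  $sock->send(join ' ', 'alive', $pkg, "$file:$line", $note // '');
  return;
}
```

Dropping a single `alive("old report path")` into a suspect branch is then a one-line change, cheap enough to leave in place for weeks.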

I combined PostForkINC with Alive and restarted the daemon. Within seconds, I had dozens of reports of libraries — often quite heavy ones — being loaded after the fork.

This is great! I now have lots of little improvements to make to my daemon.

There is one place where it's not as straightforward as it might have been. Sometimes, a program tries to load an "optional" module and, if the load fails, carries on without it. Here, PostForkINC can seem to produce a false positive: it reports that Optional::Module is being loaded post-fork, but in reality no new code is being added to the process.

When I told David Golden what I was up to, he predicted this edge case and said, "but you might not care." I didn't, and said so. Once I saw that this was happening in my program, though, I started to care. Even if I wasn't using more memory, I was looking all over @INC to try to find files that I knew couldn't exist. Loading them pre-fork wasn't going to work, but there are ways around it. I could put something in %INC to mark them as already loaded, but instead I opted to fix the code that was looking for them, avoiding the whole idea of optional modules, which was a pretty poor fit for the program in question, anyway.
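
For the record, the %INC trick I decided against is nearly a one-liner; Optional::Module here is a made-up name:

```perl
use strict;
use warnings;

# Pretend the module is already loaded, so any later require() becomes a
# cheap no-op instead of a futile search down @INC.
BEGIN { $INC{'Optional/Module.pm'} = __FILE__ }

# This no longer touches the filesystem at all; it sees the %INC entry
# and returns immediately.
require Optional::Module;
```

It works, but it papers over the real question of why the program wants optional modules in the first place, which is why I fixed the calling code instead.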

I've still got a bunch of tweaking to do before I've fixed all the post-fork loading, but I got quite a lot of it already, and I'm definitely going to apply this library to more daemons in the very near future.

todo for 2014 (body)

by rjbs, created 2014-01-07 11:33
last modified 2015-01-16 22:28
tagged with: @markup:md journal todo

Huh. It looks like I haven't written a todo list for the year since 2008. I don't know whether I wish I had, but I'm a bit surprised. I'm going to list some things from my lists 2005-2008 that I did not accomplish and start there.

write more prose and/or poetry

I keep meaning to write more, and I don't. In 2013, I wrote 145 opening sentences as part of my daily routine, but that's less a success than it is a reminder of how easy it is to produce an opening sentence to keep that Daily Practice streak going. I need to find some other set of assignments to do, and actually do them.

I've also changed my Daily Practice goal to give me a larger set of exercises to do, including writing synopses for things to try writing later. Then later, I hope to make a second goal for trying to flesh out some of those ideas.

write a Cocoa application

I just didn't do it. I need to find a project that's doable. I think what I want to do instead is first work on a little web service I've been wanting to tackle; then I can write a client for it as a Cocoa project. Like a number of other project ideas, I think part of what blocks me is solitude. I'd like to work on these projects with someone, for motivation, for company, and for second opinions. I just don't have anybody local who's interested, and doing this sort of thing remotely can be harder to sort out.

write more programs for fun

I spend a lot of time maintaining code that, for my purposes, already works, and I want to spend less time doing it. I've been doing a good job of chipping away at it steadily, which hopefully means I can keep up with the existing work, but I don't want to take on more. If anything, I want to do less. Handling feature requests for a bunch of features you don't need isn't very interesting, but on the other hand, accepting anything without regard for its impact on design quality isn't any good either.

Instead of spending more time on existing working code, I want to write more new (presumably pretty broken) code. I have a ton of ideas for things to do, and I just need to set aside the time to do it — but see above, regarding solitude.

spend more time with friends

I tend to go out for a beer with a friend only once every few months. This seems ridiculous. I also have a growing collection of unplayed board games. I just need to make more of an effort.

get my driver's license


cook and bake more often

I've gotten pretty good at making pancakes and fried eggs, and that's it. I want to keep working on the basics, and Martha likes helping, so I need to get her more involved in helping pick projects, and then doing them. I don't know what we should try next, though. Maybe roasts.

I got an Arduino! (body)

by rjbs, created 2013-12-30 09:34
last modified 2013-12-30 09:35

For Christmas, Gloria gave me an Arduino Starter Kit! It's got an Arduino Uno, a bunch of wires, some resistors and LEDs and stuff, a motor, and I don't know what else yet. I hadn't been very interested in Arduino until Rob Blackwell gave a pretty neat demo at the "Quack and Hack" at DuckDuckGo last year. Still, I knew it would just be another thing to eat up my time, and I decided to stay away. Finally, though, I started having ideas of things that might be fun, but not too ambitious. I put the starter kit on my Christmas wish list and got the Arduino Workshop book for cheap from O'Reilly.

The starter kit comes with its own book, but there are quite a few passages that appear verbatim in both books. I'm not sure of their relationship, but they're different enough that I've been reading both. I've gotten a decent idea of how to accomplish simple things, but I don't really understand the underlying ideas, yet. As I sat, squinting at a schematic, I wondered: Is this what beginning programmers feel like? "If I write these magic words, I know what will happen, but not why!" I already had a lot of sympathy for that kind of thinking, but it has been strengthened by this experience.

For example, I know that I can put a resistor on either side of a device in my circuit and it works, and I generally understand why, but I don't yet understand how a rectifying diode helps prevent problems with the spike caused by a closing relay. I need to find a good elementary course on electricity and electronics and really let it sink in. This is one of those topics, like special relativity, that I've often understood for a few minutes, but not longer. (Special relativity finally sunk in once I wrote some programs to compute time dilation.)

I'm going to keep working through the books, because there's clearly a lot more to learn. I'm not sure, though, what I'm hoping to do after I get through the whole thing. Even if I don't keep using it after I finish the work, though, I think it will have been a good experience and worth having done.

My favorite project so far was one of my own. The Arduino Workshop has a project where you build a KITT-like scanner with five LEDs using pulse width modulation to make them scan left to right. That looked like it would be neat, but I skipped the software for left to right scanning and instead wrote a little program to make it count to 2⁶-1 over and over on its LED fingers.

Here's the program:

void setup() {
  for (int pin = 2; pin < 7; pin++)  pinMode(pin, OUTPUT);
}

void loop() {
  for (int a = 0; a < 64; a++) {
    for (int pin = 2; pin < 7; pin++) {
      digitalWrite(pin, (a & ( 2 << pin-2 )) ? HIGH : LOW);
    }
    delay(250);  // pause between counts; this value is a guess, as the original listing was truncated
  }
}
I originally got stuck on the 2<<pin-2 expression because I wanted to use exponentiation, which introduces some minor type complications. I was about to sort them out when I remembered that I could just use bit shifting. That was a nice (if tiny) relief.

Here's what the device looks like in action:

Working with hardware is different from software in ways that are easy to imagine, but that don't really bug you until you're experiencing them. If I have a good idea about how to rebuild a circuit to make it simpler, I can't take a snapshot of the current state for a quick restore in case I was wrong. Or, I can, but it's an actual snapshot on my camera, and I'll have to rebuild by hand later.

If I make a particularly bad mistake, I can destroy a piece of hardware permanently. Given the very small amounts of power I'm using, this probably just means "I can burn out an LED," but it's still a real problem. I've been surprised that there's no "reset your device memory between projects" advice. I keep imagining that my old program will somehow cause harm to my new project's circuits. Also, plugging the whole thing into my laptop makes me nervous every time. It's a little silly, but there it is.

My next project involves a servo motor. That should be fun.

Trekpocalypse Now (body)

by rjbs, created 2013-12-22 21:02
last modified 2013-12-23 07:54
tagged with: @markup:md dnd journal rpg yapc

At YAPC::NA in Austin this year, I ran a sorta-D&D game on game night. I have been meaning to write it up nicely, but I think it's just not going to happen, so I'm going to write it up badly. Here we go…

The Ward

The PCs are wardens, residents of the Ward, a strange stronghold of almost perpetual night, along with two hundred others. The elite protect and provide for the rest of the populace. Sometimes a few venture out of the Ward, but rarely, and often never to return. Beyond the Ward lie the Sunlit Lands, as far as most wardens ever travel.

The Golden Circle of the Ward maintains communication with distant cells, including one at the War Sanctum. The War Sanctum is a small chamber where The Voice (the wardens' oft-witnessed deity) is much more responsive and where terrible powers are accessible to those who know the right secrets. After reports of an upcoming celestial event of great importance, the War Sanctum was attacked by the Alexandrians, an aggressive group of Mogh. With no other cells of the Golden Circle able to respond, the PCs are assembled and sent. They have 3 hours.

From the Ward, the players must first pass through the Sunlit Lands.

The exact terrain of the Sunlit Lands changes from time to time, although there are a limited number of configurations it takes on. Now, it is an expansive field in a bay, surrounded by cliffs. There are huts built here and there. The sun is warm and bright. A few dozen people are scattered around, picnicking. The PCs exit the Ward through a steel blast door set into a cliff wall.

The inhabitants of the Sunlit Lands are the Hollow Men. They are pleasant, somewhat vapid people, with very few worries. They cannot leave the Sunlit Lands, having lived on its food for too long. The same is true of anyone who lives on their food for more than a few days.

The PCs know the next step is London. London is currently accessible only through an undersea tunnel — one of its less helpful locations. ("Can't you wait a few weeks?" ask the Hollow Men, if asked.) Someone will have to wade out, dive under, and get the hatch open. This leads to an expanse of metal tubing long enough for everyone to get in, but it will fill with water, though the water will drain away as it fills. It fills fast enough that this isn't a big help, but once the entrance is closed, things will dry off.

London is a twilit metropolis of narrow alleys and dreary weather. The streets are crowded with anxious citizens of all ranks. The PCs will, unless disguised quickly, draw attention.

The vast majority of citizens of London just want to be left alone. They will react to weird stuff by gasping or acting scandalized, but will not raise an alarm. The people to fear are the League of Yellow Men and their henchmen. The League rules the city with an iron fist, opposed only by the network of thieves and urchins run by Abdul Amir.

The game didn't get much further than that, due to a TPK, but a few more areas stood between London and the War Sanctum.


The game was set aboard the Enterprise (NCC-1701-D), several generations after a great catastrophe in space. The challenge was to make it from Ten Forward to the battle bridge, passing through decks controlled by several of the factions now struggling for control of the ship. The Alexandrian faction of the Mogh were the final boss in this scenario, but the Yellow Men, Machine Men, and others were around. I'd started to scribble down ideas for what else I'd do if I had time. The Cult of the Immortal seemed like they'd be fun, and the nanite-possessed Skull Crushers led me to thinking about even more ridiculous backstory.

Eventually, I forced myself to stop thinking of new ideas, since I already had way too many, and get to work on some rules.

I started with the Moldvay rules, which is a pretty nice simple set of rules onto which to bolt hacks. I wanted to have distinct classes that felt like classes, weren't too complicated, and captured at least some of the Trek flavor. I looked at stealing from Hulks & Horrors and Starships & Spacemen but I decided I didn't quite like them.

Instead, I sketched out six (or seven) distinct classes and, rather than come up with complete rules up through level four, I just threw together characters at that level. The character sheet image, above, links to the full set of PC pregens that I made.


PCs have six attributes:

  • Strength
  • Dexterity
  • Intelligence
  • Technology
  • Command
  • Empathy


The Crimson are the leaders of the wardens. They act both as warriors and priests, having the ability to influence the Voice. Each Crimson PC has a pool of Access points. They can get information or favors from the Voice by rolling 1d20 vs. 20, and they can add their total Access points to the roll, or they can spend a number of Access points equal to the level of the effect they want to achieve. The pregens all had 5 Access, meaning that all effects needed a 15 or better, but every time the PC got a guaranteed success, all subsequent attempts in the episode got harder.

The Gold Circle specialize in more powerful effects, but their effects take much longer. For example, the Remote Viewing power takes 3 turns to invoke; that's a half hour! Gold Circle PCs get a number of Bright Ideas per episode, though, and these can drastically reduce the time required. The first few Bright Ideas spent on an effect shrink the time units: turns become rounds, and rounds become instant. A power can only be used once, unless you spend another Bright Idea. Powers have a minimum required time, usually one round. I provided two Gold Circle builds. One was a support character, with Remote Viewing, Shields Up!, Dispel Illusion, and Cornucopia. The other was a stealth attacker with access to the teleporter and a personal cloaking device.

The Azure are experts in understanding (rather than just using) the technology of past generations. They don't have any powers on their own, but they're the only characters who can use tech more complicated than ray guns and comm badges. Azure PCs are trained in using specific devices. Each device has a number of charges, and once they're used up the device can't be used again until it's recharged, which takes … well, too long to do in the game's three hour window.

Empath PCs have psychic powers! They're pretty close to AD&D 2nd Edition psionicist characters. They have a pool of psi power points, and powers cost points. Their prime attributes are Command and Empathy, depending on their build. They'd probably all have some of the same powers, then some different ones. One power was Weaken Will, which lets the empath attack the target's effective Command stat, making them more susceptible to later powers.

The Mogh (aka Warface (get it, Worf-face?)) are Klingons. Or, I guess, part-Klingons. I like to imagine they're all descended from Alexander and Worf, but who knows? They can regenerate 2 hp per round (but their shields stink; see below). In combat, they can switch between different stances that give them different advantages or disadvantages.

Twinlings were the final PC race, and I was very happy to include them. Why don't we see more of those guys on the show? If I ever wrote an episode… Well, anyway, they were very much a support class. A single twinling PC is two entities, and gets two actions per round, but if either one dies, that's it. I'd probably give the survivor a round or two to be heroic, but that's it. All the twinlings' powers were there to enhance the other characters' powers. They could restore charges to devices for the Azure, repair device (or shield) damage, temporarily boost shield strength, extend the range or duration of other powers, or (and I was sad nobody used this) duplicate any device or power they'd seen used in the last day. Using any twinling power consumed a set of Spare Parts.

The Green were going to be descendants of Orions, for no good reason other than I had a bunch of other colors represented. I didn't come up with any good ideas, though.


Everybody got two sets of attack and defense stats, one for melee and one for beam. Beam weapons have an effectively infinite range. (I guess if there'd been extravehicular combat, I'd think about that harder.) Melee weapons… well, you know.

Everybody got a shield, except for the PC with the personal cloak. Shields have a strength stat, which is the amount of damage they can absorb, in hp, before failing. They always absorb whole attacks, so when a 10 hp attack hits your strength 1 shield, you take no damage. Phew! Shields also have their own hit points, and every time they fail, they lose a hit point. As long as a shield isn't reduced to zero hit points, it will recharge after a few rounds in which the character takes no damage. So, if your shield is strength 5, charge 2, hp 5, then it can absorb two 4 hp attacks before failing. Then its wearer needs to stay out of the line of fire for two rounds, and the shield will be up again. This can happen five times.
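
My reading of those shield rules, as a quick Perl sketch (just an illustration, not anything we ran at the table):

```perl
use strict;
use warnings;

{
  package Shield;

  sub new {
    my ($class, %arg) = @_;
    return bless {
      %arg,
      remaining => $arg{strength},  # absorption left before the next failure
      quiet     => 0,               # consecutive damage-free rounds
      up        => 1,
    }, $class;
  }

  # Apply an attack; returns the damage that reaches the wearer.
  sub absorb {
    my ($self, $damage) = @_;
    $self->{quiet} = 0;
    return $damage unless $self->{up};   # shield down: it all gets through

    # Shields always absorb whole attacks, even ones bigger than what's left.
    $self->{remaining} -= $damage;
    if ($self->{remaining} <= 0) {
      $self->{up} = 0;
      $self->{hp}--;                     # each failure costs a hit point
    }
    return 0;
  }

  # Call once for each round in which the wearer takes no damage.
  sub rest {
    my ($self) = @_;
    return if $self->{up} || $self->{hp} <= 0;   # dead shields stay down
    if (++$self->{quiet} >= $self->{charge}) {
      @{$self}{qw(up remaining quiet)} = (1, $self->{strength}, 0);
    }
  }
}
```

With strength 5, charge 2, hp 5, this absorbs two 4 hp attacks, fails, and comes back up at full strength after two quiet rounds, matching the example above.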

The Mogh PCs have specialized shields. They have strength 1, but infinite hit points. They're really just there to give them one attack worth of immunity while they charge into melee combat. Shooting at your enemy from full cover is dishonorable! Or, at least, not what those guys did best. The hilarious side effect of this was that the Mogh PCs always ran right into combat, thus becoming targets for friendly fire. At least one died that way. I believe the rule for friendly fire was that if you missed by more than your beam bonus, and the target was engaged in melee with your ally, you hit your ally.


It was fun! I needed more prep and a more streamlined scenario, but I would definitely run it again. It could clearly be run as a mini-campaign and remain fun the whole time, and there's a huge universe of stuff to steal from.

keeping track of the (dumb) things I do (body)

by rjbs, created 2013-11-25 22:39

Last week, I was thinking about how sometimes I do something I have to do and then feel great, and sometimes I do something I have to do and then feel lousy. I decided I should keep track of what I do and how it makes me feel. (I have some dark predictions, but am trying to hold off until I have more recorded.) To do this, I needed a way to record the facts, and it needed to be really, really easy to use. I'd never take the time to say "I did something" if it was a hassle.

I decided I wanted to run commands something like this:

  rjbs:~$ did some code review on DZ patches :)

So, I did some code review and it made me happy. Then I thought of some embellishments:

  rjbs:~$ did some code review on DZ patches :) +code ~45m ++

I spent about 45 minutes doing it, it was code, and this was an improvement to my mood from when I started. There's a problem, though: :) isn't valid shell. I solved this by making did read from standard input when given no arguments, and I renamed did to D. I also think I might make it accept emoji, so I could run:

  rjbs:~$ D haggled with mortgage provider 😠 +money
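
That entry syntax could be parsed with something like this; it's a guess at the grammar, not D's actual code, and it doesn't try to handle the emoji case:

```perl
use strict;
use warnings;

# Peel recognized tokens (mood, +tags, ~duration, mood delta) off the end
# of the entry; whatever remains is the description.
sub parse_entry {
  my ($text) = @_;
  my %entry = (tags => []);

  my @words = split ' ', $text;
  while (@words) {
    my $w = $words[-1];
    if    ($w =~ /\A[:;][()]\z/)   { $entry{mood}    = pop @words }
    elsif ($w =~ /\A(?:\+\+|--)\z/){ $entry{delta}   = pop @words }
    elsif ($w =~ /\A\+(\w+)\z/)    { push @{ $entry{tags} }, $1; pop @words }
    elsif ($w =~ /\A~(\d+)m\z/)    { $entry{minutes} = $1; pop @words }
    else                           { last }
  }

  $entry{desc} = join ' ', @words;
  return \%entry;
}
```

Parsing the earlier example, "some code review on DZ patches :) +code ~45m ++", yields the description, the mood, one tag, 45 minutes, and a positive mood delta.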

Later, I'll write something to build a blog post template from a week's work, maybe. I'm still not sure whether I'll keep using this. I need to get into the habit, and I'm not sure how, although connecting it to Ywar might help.

Anyway, the code is a real mess right now, and it kind of stinks, but D is on GitHub in case it's of interest to anyone.

in search of excellent conference presentations (body)

by rjbs, created 2013-11-18 18:24
tagged with: @markup:md journal

At OSCON this past year, I was just a little surprised by the still-shrinking Perl track. What really surprised me, though, was the entirely absent Ruby track. I tried to figure out what it meant, and whether it meant anything, but I didn't come to any conclusions. Even if I'd more carefully collected actual data, I'm not sure I could've drawn any really useful conclusions.

Instead, I came to a flimsier, wobblier conclusion: the Perl track could have more, better talks that would appeal to more people, including people from outside of Perl. I spoke to some OSCON regulars about this and nobody told me that I was deluded. When I got home, I asked a few people whether they'd ever considered coming to give a talk at OSCON. I got a few replies something like this:

I hadn't really, but maybe I should. What would I talk about, though? Talking about stuff I do in Perl wouldn't make sense, because OSCON isn't a Perl conference.

OSCON is an interesting conference. It's ecumenical — or it could and should be. In practice, though, it can be a bit cliquish. I was disappointed when I first saw lunch tables marked as the "Python table" or "JavaScript table." I was told (and believe) that people asked for this sort of thing as a way to find people with the same interests, but I think that one of the most interesting things about OSCON is the ability to talk shop with people whose shop is quite unlike your own. It leads to interesting discoveries.

This only works, though, if you really talk about what you really do. If I said, "Well, I filter and route email with a lot of Perl and a little C," nobody's going to learn anything interesting from me. On the other hand, I could tack on a few more sentences about the specific problems we encounter and how we get past them. "High performance, highly configurable email filtering is stymied by the specific 'commit' phases of SMTP, so we've had to spend a little time figuring out how to do as much rejecting as early as possible, but everything else as late as possible." Once you're talking about specific problems, people can relate, even if they don't know much about the domain.

Hearing about interesting solutions to problems can often help me think about new possible solutions to my own problems, so what I like is to hear people talk about their specific solutions to specific problems. I seek these talks out. I've basically given up on talks like "The Ten Best Things About Go" or "A Quick Intro to Clojure." They can be interesting, but generally I find them wishy-washy. They're not compelling enough to get me to commit to doing serious work in a new language, and they don't discuss any single problem in enough detail to inspire my to rethink things.

So I think that, in general, talks about really specific pieces of software are the best, and that means talks about software in Perl (or Python or Bash or Go...) because that shows the actual solution that was made. Most of these talks, I think, would be interesting to all sorts of people who don't use the underlying language or system. If you work on an ORM in Python, would a talk on DBIx::Class be interesting? Yes, I think it could be. Could a talk on q.py be useful for just about anybody who debugs code? Yes. And so on.

I'm really hoping to see some interesting real-problem-related talks show up this year, and plan to go to whichever ones look the most concrete. I also hope to give some talks like that. Talks like that are my favorite to give, and I look forward to spending more time talking about solving real problems than talking about abstractions.

OSCON's call for participation tends to come out in January. That should be plenty of time to think about our most interesting solved problems!

moving my homedir into the 21st century (body)

by rjbs, created 2013-11-14 23:26
tagged with: @markup:md journal

Over the last few weeks, I've done a bit of pair programming across the Internet, which I haven't done in years. It was great! Most of this was with Ingy döt Net and Frew Schmidt.

As is often the case, the value wasn't only in the work we did, but in the exchange of ideas while doing it. I got to see both Ingy and Frew using their tools, and it made me want to steal from them. It also helped me get a handle on what things I didn't want to change in my own setup, and why. It's definitely something I'd like to do more often.

Both Ingy and Frew were using tmux, the terminal multiplexer. tmux is a lot like GNU screen, which I've been using for at least fifteen years. If you're not using either one, and you use a unix, you really ought to start! They help me keep a lot of my work parallelized and simplified. I first learned of tmux a few years ago, when I learned that several members of the Moose core dev team had started using it instead of screen. I tried to switch at the time, but it didn't work out: it crashed too much, its Solaris support seemed spotty, and basically it got in my way. Now, inspired by watching Ingy and Frew work, I felt like trying again. I sat down, read most of the tmux book, and was convinced in theory. Although I don't like every difference between screen and tmux, there were clear benefits.

Then I got to work actually switching, which meant producing a tolerable .tmux.conf. I started with the one I'd made years before and slowly added to it as I read more about tmux's features. It's clear that I've got more improvements to make, but they're going to require a few months of using my current config to figure things out.

When I paired with Ingy, we used PairUp, his instant pairing environment. Basically, you provision a Debian-like VM using whatever system you want (we were using RackSpace, but I tried it with EC2, also) and, with one command, create a useful environment for pairing in a shared tmux session. We didn't actually work on anything. Instead, he showed me PairUp, and we encountered enough foibles along the way that we got to pair on fixing up the pairing environment itself. It was fun.

I saw a lot of the tools he was using as we went, and one of them was his dotfile manager. I've seen a lot of dotfile managers, although I've never really switched to using one. Instead, I was using a fairly gross hack of my own: installing my dotfiles with GNU make. The tool that Ingy was using, ..., was interesting enough to get me to switch. I've converted almost all of my config repositories to using it, and I feel good about this.

... isn't a huge magic change in how to look at config files, and that's why I like it. It's also not just "your dotfiles in a repo." It's got two bits that make it very cool.

First, it is configured with a list of repositories containing your configuration:

- repo: git@github.com:sharpsaw/loop-dots.git
- repo: git@github.com:rjbs/rjbs-dots.git
- repo: git@github.com:rjbs/rjbs-osx-dots.git
- repo: git@github.com:rjbs/vim-dots.git
- repo: rjbs@....:git/rjbs-private-dots.git

Each one of these repositories is kept in sync in ~/.../src, and the files in them are symlinked into your home directory. Any file in the first repo takes precedence over files in later repositories, so you can establish canonical behaviors early and add specialized ones later.
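
The precedence rule is simple enough to sketch; this is an illustration of the behavior as I understand it, not ...'s actual code:

```perl
use strict;
use warnings;

# Walk the repos in configuration order; the first repo to provide a given
# dotfile wins, and later repos only fill in files nobody has claimed yet.
sub plan_links {
  my (@repos) = @_;   # each repo: { name => ..., files => [...] }
  my %link;
  for my $repo (@repos) {
    for my $file (@{ $repo->{files} }) {
      $link{$file} //= $repo->{name};   # earlier repos take precedence
    }
  }
  return \%link;
}
```

So a .zshrc in loop-dots shadows the one in rjbs-dots, while files unique to later repos still get linked.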

The second interesting bit is provided by the loop-dots repository above. It sets up a number of config files (like .zshrc and .vimrc, to name just two) that loop over the rest of the dots repositories, sourcing subsidiary files. So there's a global .zshrc, but almost the only thing it does is load the .zshrc files of the other repositories. This makes it very simple to divide your config files into roles. I can have an rjbs-private-dots that just adds my "secret data" to my normal dot files. At work, I'll have an rjbs-work-dots that sets up variables needed there.

Finally, there's another key benefit: each repository is basically just a bunch of dot files in a repo, even though ... is more than that. If I ever decide that ... is nuts, bailing out of using it is very simple. I don't need to convert lots of things out of it, I just need to replace the ... program with, say, cp.

I'm only about a week into this big set of updates, but so far I think it's going well. Of course, time will tell. I haven't yet updated my Linode box, where I do quite a lot of my work, to use my ... config. Tomorrow…

Office Mode DEFCON (body)

by rjbs, created 2013-11-07 22:29
tagged with: @markup:md games journal

I picked up DEFCON a few months ago on Steam. It's a game inspired by the "big nuclear war boards" we saw in movies like Dr. Strangelove or, closer to the mark, WarGames. Each player controls a section of the world. The game starts with a few very short phases of placing units and quickly turns into a shooting war. Players launch fighters, deploy fleets, and eventually send out bombers, subs, and ICBMs. The game looks gorgeous.

I was intrigued by "office mode." In this mode, the game is time limited, runs in a window, and stays mostly out of your way. My understanding was that it would be a good fit to run while working my day job. I'd just check in on it once in a while to issue new orders, but mostly I could ignore it. After all, a lot of time was sure to be missiles flying through the air. I got in touch with some friends to organize a game, and Florian Ragwitz and I gave it a shot today.

Unfortunately, it wasn't quite what I expected. The first few phases of the game were quite rapid-fire and required a good bit of my brain for about twenty minutes. After that, things calmed down, but not enough. It was not a great background activity. I couldn't, for example, check in for five minutes at a time between pomodori.

On the other hand, it was fun. I guess I should spend a bit more time fighting AIs, though, because Florian utterly destroyed me. I think I had one weapon hit a target, doing a fair bit of damage to Naples. Meanwhile, he destroyed about half the population of the USSR. (Florian was playing as Europe, and I was the USSR. Strangely, Europe, not the USSR, controls Kiev, Warsaw, and Dnipropetrovsk.)

Mazes & Minotaurs (body)

by rjbs, created 2013-10-30 23:06
tagged with: @markup:md journal mnm rpg

Over a decade ago, Paul Elliott wrote a tiny piece of counterfactual history called The Gygax/Arneson Tapes. It recounts the history of the world's most famous role-playing game, Mazes & Minotaurs, in which the players take on larger-than-life Greek-style heroes in Sword and Sandal adventures.

A while later, the amazing Olivier Legrand "dug up and published" the original 1972 rules for Mazes & Minotaurs. Of course, in reality he wrote it. All of it. It's a complete, good, playable RPG written based on a little half page of inspiration, also inspired by the little brown books of D&D.

Then, later, he produced the 1987 "revised" edition. This gives us the three core books you'd expect: the player's manual, the Maze Master's guide, and the creature compendium. Later came the M&M Companion, Viking & Valkyries (an alternate setting), and perhaps most amazingly of all, Minotaur Quarterly, an excellent magazine of add-on material for RM&M. Of course, sometimes it included "republished" articles from the days of OM&M.

The whole set of books is well done. They're all written as if the false history is true, and with a bit of tongue in cheek, but they're still good, playable games.

For about a year and a half, give or take, I ran a modified M&M game and it went well. I might run it again some day, either in that same setting or in the canonical Mythika, if I get around to watching a bunch more peplum films. I advise all fellow old school RPG fans to give M&M a look.

modules seeking homes (body)

by rjbs, created 2013-10-24 22:26
last modified 2013-10-24 22:26

I don't use Module::Install or Module::Starter anymore. For the most part, I don't think anyone should. I think there are better tools to use instead.

That said, if you really like using them, I have some plugins that I am no longer interested in maintaining:

  • Module::Install::AuthorTests
  • Module::Install::ExtraTests
  • Module::Starter::Plugin::SimpleStore
  • Module::Starter::Plugin::TT2

That's all I have to say.

Dist::Zilla v5 will break and/or fix your code (body)

by rjbs, created 2013-10-20 10:51
last modified 2013-10-20 21:24


When I wrote Dist::Zilla, there were a few times that I knew I was introducing encoding bugs, mostly around Pod handling and configuration reading. (There were other bugs, too, that I didn't recognize at the time.) My feeling at the time was, "These bugs won't affect me, and if they do I can work around them." My feeling was right, and everything was okay for a long time.

In fact, the first bug report I remember getting was from Olivier Mengué in 2011. He complained that dzil setup was not doing the right thing with encoding, which basically meant that he would be known by his mojibake name, Olivier MenguÃ©. Oops.
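That flavor of breakage is easy to reproduce. A quick illustration in Python (just a demonstration of the mechanism, not anything from Dist::Zilla):

```python
# The classic mojibake recipe: UTF-8 bytes misread as Latin-1.
name = "Mengué"
mojibake = name.encode("utf-8").decode("latin-1")
print(mojibake)  # MenguÃ©
```

Write the right bytes, read them back with the wrong decoder, and every non-ASCII character sprouts an extra letter.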

I put off fixing this for a long time, because I knew how deeply the bugs ran into the foundation. I'd laid them there myself! There were a number of RT tickets or GitHub pull requests about this, but they all tended to address the surface issues. This is really not the way to deal with encoding problems. The right thing to do is to write all internal code expecting text where possible, and then to enforce encode/decode at the I/O borders. If you've spent a bunch of time writing fixes to specific problems inside the code, then when you fix the border security you need to go find and undo all your internal fixes.
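The border discipline is simple to state: bytes live in files, text lives in memory, and conversion happens exactly once in each direction. A minimal sketch (in Python, for illustration; not Dist::Zilla code):

```python
def read_text(path):
    # Decode exactly once, at the input border; everything in memory is text.
    with open(path, "rb") as f:
        return f.read().decode("utf-8")

def write_text(path, text):
    # Encode exactly once, at the output border.
    with open(path, "wb") as f:
        f.write(text.encode("utf-8"))
```

Everything between those two calls can then treat strings as strings, with no guessing about what's already been encoded.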

My stubborn refusal to fix symptoms instead of the root cause left a lot of tickets mouldering, which was probably very frustrating for anybody affected. I sincerely apologize for the delay, but I'm pretty sure that we'll be much better off having the right fix in place.

The work ended up getting done because David Golden and I had been planning for months to get together for a weekend of hacking. We decided that we'd try to do the work to fix the Dist::Zilla encoding problems, and hashed out a plan. This weekend, we carried it out.

The Plan

As things were, Dist::Zilla got its input from a bunch of different sources, and didn't make any real demand of what got read in. Files were read raw, but strings in memory were … well, it wasn't clear what they were. Then we'd jam in-memory strings and file content together, and then either encode or not encode it at the end. Ugh.

What we needed was strict I/O discipline, which we added by fixing libraries like Mixin::Linewise and Data::Section. These now assume that you want text and that bytes read from handles should be UTF-8 decoded. (Their documentation goes into greater detail.) Now we'd know that we had a bunch of text coming in from those sources, great! What about files in your working directory?

Dist::Zilla's GatherDir plugin creates OnDisk file objects, which get their content by reading the file in. It had been read in raw, and would then be mucked about with in memory and then written back out raw. This meant that things tended to work, except when they didn't. What we wanted was for the files' content to be decoded when it was going to be treated as a string, but encoded when written to disk. We agreed on the solution right away:

Files now have both content and encoded_content and have an encoding.

When a file is read from disk, we only set the encoded content. If you try reading its content (which is always text) then it is decoded according to its encoding. The default encoding is UTF-8.

When a file is written out to disk, we write out the encoded content.

There's a good hunk of code making sure that, in general, you can update either the encoded or decoded content and they will both be kept up to date as needed. If you gather a file and never read its decoded content before writing it to disk, it is never decoded. In fact, its encoding attribute is never initialized… but you might be surprised by how often your files' decoded content is read. For example, do you have a script that selects files by checking the shebang line? You just decoded the content.
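The shape of that arrangement can be sketched like this (a hypothetical Python class for illustration; the real attributes live in Dist::Zilla's file roles):

```python
class DistFile:
    """A file with raw encoded_content and lazily decoded content."""

    def __init__(self, encoded_content, encoding="UTF-8"):
        self._bytes = encoded_content
        self._text = None          # decoded only on demand
        self.encoding = encoding

    @property
    def content(self):
        # Decoding happens the first time someone asks for text.
        if self._text is None:
            if self.encoding == "bytes":
                raise ValueError("cannot decode binary content")
            self._text = self._bytes.decode(self.encoding)
        return self._text

    @content.setter
    def content(self, text):
        # Keep the encoded form in sync with the text form.
        self._text = text
        self._bytes = text.encode(self.encoding)

    @property
    def encoded_content(self):
        return self._bytes
```

A file that is gathered and written out without anyone touching `content` never pays for (or risks) a decode, which is exactly the lazy behavior described above.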

This led to some pretty good bugs in late tests, hitting a file like t/lib/Latin1.pm. This was a file intentionally written in Latin-1. When a test tried to read it, it threw an exception: it couldn't decode the file! Fortunately, we'd already planned a solution for this, and it was just fifteen minutes' work to implement.

There is a way to declare the encoding of files.

We've added a new plugin role, EncodingProvider, and a new plugin, Encoding, to deal with this. EncodingProvider plugins have their set_file_encodings method called between file gathering and file munging, and they can set the encoding attribute of a file before its contents are likely to be read. For example, to fix my Latin-1 test file, I added this to my dist.ini:

[Encoding]
filename = t/lib/Latin1.pm
encoding = Latin-1

The Encoding plugin takes the same file-specifying arguments as PruneFiles. It would be easy for someone to write a plugin that will check magic numbers, or file extensions, or whatever else. I think the above example is all that the core will be providing for now.

You can set a file's encoding to bytes to say that it can't be decoded and nothing should try. If something does try to get the decoded content, an exception is raised. That's useful for, say, shipped tarballs or images.

Pod::Weaver now tries to force an =encoding on you by @Default

The @Default pluginbundle for Pod::Weaver now includes a new Pod::Weaver plugin, SingleEncoding. If your input has any =encoding directives, they're consolidated into a single directive at the top of the document… unless they disagree, in which case an exception is raised. If no directives are found, a declaration of UTF-8 is added.

For sanity's sake, UTF-8 and utf8 are treated as equivalent… but you'll end up with UTF-8 in the output.
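The consolidation logic works something like this Python sketch (hypothetical; not the plugin's actual source):

```python
def single_encoding(pod_lines, default="UTF-8"):
    """Hoist one =encoding directive to the top of a Pod document,
    raise on disagreement, and default to UTF-8 if none is declared."""
    def norm(enc):
        # UTF-8 and utf8 are treated as equivalent; UTF-8 wins.
        return "UTF-8" if enc.lower() in ("utf-8", "utf8") else enc

    found = {norm(line.split(None, 1)[1].strip())
             for line in pod_lines if line.startswith("=encoding")}
    if len(found) > 1:
        raise ValueError("mismatched =encoding directives: %s" % sorted(found))

    encoding = found.pop() if found else default
    body = [l for l in pod_lines if not l.startswith("=encoding")]
    return ["=encoding %s" % encoding] + body
```

Feed it a document with `=encoding utf8` scattered through it and you get a single `=encoding UTF-8` at the top; feed it conflicting declarations and it refuses.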

You can probably stop using Keedi Kim's Encoding Pod::Weaver plugin now. If you don't, the worst case is that you might end up with two mismatched encoding directives.

Your dist (or plugin) might be fixed!

If you had been experiencing double-encoded or wrongly-encoded content, things might just be fixed. We (almost entirely David) did a survey of dists on the CPAN and we think that most things will be fixed, rather than broken by this change. You should test with the trial release!

Your dist (or plugin) might be broken!

...then again, maybe your code was relying, in some way, on weird text/byte interactions or raw file slurping to set content. Now that we think we've fixed these in the general case, we may have broken your code specifically. You should test with the trial release!

The important things to consider when trying to fix any problems are:

  • files read from disk are assumed to be encoded UTF-8
  • the value given as content in InMemory file constructors is expected to be text
  • FromCode files are, by default, expected to have code that returns text; you can set (code_return_type => 'bytes') to change that
  • your dist.ini and config.ini files must be UTF-8 encoded
  • DATA content used by InlineFiles must be UTF-8 encoded
  • if you want to munge a file's content like a string, you need to use content
  • if you want to munge a file's content as bytes, you need to use encoded_content

If you stick to those rules, you should have no problems… I think! You should also report your experiences to me or, better yet, to the Dist::Zilla mailing list.

Most importantly, though, you should test with the trial release!

The Trial Release

You'll need to install these:

…and if you use Pod::Weaver:


I'd like to thank everyone who kept using Dist::Zilla without constantly telling me how awful the encoding situation was. It was awful, and I never got more than a few little nudges. Everyone was patient well beyond reason. Thanks!

Also, thanks to David Golden for helping me block out the time to get the work done, and for doing so much work on this. When he arrived on Friday, I was caught up in a hardware failure at the office and was mostly limited to offering suggestions and criticisms while he actually wrote code. Thanks, David!

writing OAuthy code (body)

by rjbs, created 2013-10-14 10:56

I've written a bunch of code that deals with APIs behind OAuth before. I wrote code for the Twitter API and for GitHub and for others. I knew roughly what happened when using OAuth, but in general everything was taken care of behind the scenes. Now, as I work on extending my programmatic day planner, I need to deal with web services that don't have pre-built Perl libraries, and that means dealing with OAuth. So far, it's been a big pain, but I think it's been a pain that's helped me understand what I'm doing, so I won't have to flail around as much next time.

I wanted to tackle Instapaper first. I knew just what my goal automation would look like, and I'd spent enough time bugging their support to get my API keys. It seemed like the right place to start. Unfortunately, I think it wasn't the best service to start with. It felt a bit like this:

Hi! Welcome to the Instapaper API! For authentication and authorization, we use OAuth. OAuth can be daunting, but don't worry! There are a lot of libraries to help, because OAuth is a popular standard!

By the way, we've made our own changes to OAuth so that it isn't quite standard anymore!

For one thing, they require xAuth. Why? I don't know, but they do. I futzed around trying to figure out how to use Net::OAuth. It didn't work. Part of it seemed to be that no matter what I did, the xAuth parameters ended up in the HTTP headers instead of the post body, and it wasn't easy to change the request body because of the various layers in play. I searched and searched and found what seemed like it would be a big help: LWP::Authen::OAuth.

It looked like just what I wanted. It would let me work with normal web requests using an API that I knew, but it would sign things transparently. I bodged together this program:

use JSON;
use LWP::Authen::OAuth;

my $c_key     = 'my-consumer-key';
my $c_secret  = 'my-consumer-secret';

my $ua = LWP::Authen::OAuth->new(oauth_consumer_secret => $c_secret);

my $r = $ua->post(
  'https://www.instapaper.com/api/1/oauth/access_token', [
    x_auth_username => 'my-instapaper-username',
    x_auth_password => 'my-instapaper-password',
    x_auth_mode     => 'client_auth',

    oauth_consumer_key    => $c_key,
    oauth_consumer_secret => $c_secret,
  ],
);

print $r->as_string;

This program spits out a query string with my token and token secret! Great, from there I can get to work writing requests that actually talk to the API! For example, I can list my bookmarks:

use 5.010;
use JSON;
use LWP::Authen::OAuth;

my $c_key     = 'my-consumer-key';
my $c_secret  = 'my-consumer-secret';

my $ua = LWP::Authen::OAuth->new(
 oauth_consumer_secret => $c_secret,
 oauth_token           => 'my-token',
 oauth_token_secret    => 'my-token-secret',
);

my $r = $ua->post(
  'https://www.instapaper.com/api/1/bookmarks/list', [
    limit => 200,
    oauth_consumer_key    => $c_key,
  ],
);
my $JSON = JSON->new;
my @bookmarks = sort {; $a->{time} <=> $b->{time} }
                grep {; $_->{type} eq 'bookmark' }
                @{ $JSON->decode($r->decoded_content) };

for my $bookmark (@bookmarks) {
  say "$bookmark->{time} - $bookmark->{title}";
  say "  " . $JSON->encode($bookmark);
}

Great! With this done, I can get my list of bookmarks and give myself points for reading stuff that I wanted to read, and that's a big success right there. I mentioned my happiness about this in #net-twitter, where the OAuth experts I know hang out. Marc Mims said, basically, "That looks fine, except that it's got a big glaring bug in how it handles requests." URIs and OAuth encode things differently, so once you're outside of ASCII (and maybe before then), things break down. I also think there might be other issues you run into, based on later experience. I'm not sure LWP::Authen::OAuth can be entirely salvaged for general use, but I haven't tried much, and I'd be the wrong person to figure it out, anyway.

Still, I was feeling pretty good! It was time, I decided, to go for my next target. Unfortunately, my next target was Feedly, and they've been sitting on my API key request for quite a while. They seem to be doing this for just about everybody. Why do they need to scrutinize my API key anyway? I'm a paid lifetime account. Just give me the darn keys!

Well, fine. I couldn't write my Feedly automation, so I moved on to my third and, currently, final target: Withings. I wanted code to get my last few weight measurements from my Withings scale. I pulled up their API and got to work.

The first roadblock I hit was that I needed to know my numeric user id, which they really don't put anyplace you can find it. I had to dig for about half an hour before I found it embedded in a URL on one of their legacy UI pages. Yeesh!

After that, though, things went from tedious to confusing. I was getting directed to a URL that returned a bodyless 500 response. I'd get complaints about bogus signatures. I couldn't figure out how to get token data out of LWP::Authen::OAuth. I decided to bite the bullet and figure out what to do with Net::OAuth::Client.

As a side note: Net::OAuth says "you should probably use Net::OAuth::Client," and is documented in terms of it. Net::OAuth::Client says, "Net::OAuth::Client is alpha code. The rest of Net::OAuth is quite stable but this particular module is new, and is under-documented and under-tested." The other module I ended up needing to use directly, Net::OAuth::AccessToken, has the same warning. It was a little worrying.

This is how OAuth works: first, I'd need to make a client and use it to get a request token; second, I'd need to get the token approved by the user (me) and turned into an access token; finally, I'd use that token to make my actual requests. While at first, writing for Instapaper, I found Net::OAuth to feel overwhelming and weird, I ended up liking it much better when working on the Withings stuff. First, code to get the token:

use 5.010;
use Data::Dumper;
use JSON 2;
use Net::OAuth::Client;

my $userid  = 'my-hard-to-find-user-id';
my $api_key = 'my-consumer-key';
my $secret  = 'my-consumer-secret';

my $session = sub {
  state %session;
  my $key = shift;
  return $session{$key} unless @_;
  $session{$key} = shift;
};

my $client = Net::OAuth::Client->new(
  site               => 'https://oauth.withings.com/',
  request_token_path => '/account/request_token',
  authorize_path     => '/account/authorize',
  access_token_path  => '/account/access_token',
  callback           => 'oob',
  session            => $session,
);

say $client->authorize_url; # <-- will have to go visit in browser

my $token = <STDIN>;
chomp $token;

my $verifier = <STDIN>;
chomp $verifier;

my $access_token = $client->get_access_token($token, $verifier);

say "token : " . $access_token->token;
say "secret: " . $access_token->token_secret;

The thing that had me confused the longest was that coderef in $session. Why do I need it? Under the hood, it looks optional, and it can be, but it's easier to just provide it. I'll come back to that. Here's how you use the program:

When you run the program, authorize_url generates a new URL that can be visited to authorize a token to be used for future requests. The URL is printed to the screen, and the user can open the URL in a browser. From there, the user should be prompted to authorize access for the requesting application (as authenticated by the consumer id and secret). The website then redirects the user to the callback URL. I gave "oob" which is obviously junk. That's okay because the URL will sit in my browser's address bar and I can copy out two of its query parameters: the token and the verifier. I paste these into the silently waiting Perl program. (I could've printed a prompt, but I didn't.)
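Copying those two query parameters out of the address bar by hand works, but it's the sort of thing easily scripted. A small Python helper (hypothetical, but the parameter names are the standard OAuth 1.0a ones) that pulls them out of the redirect URL:

```python
from urllib.parse import urlparse, parse_qs

def token_and_verifier(callback_url):
    # Extract oauth_token and oauth_verifier from the callback redirect.
    query = parse_qs(urlparse(callback_url).query)
    return query["oauth_token"][0], query["oauth_verifier"][0]
```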

Now that the token is approved for access, we can get an "access token." What? Well, the get_access_token method returns a Net::OAuth::AccessToken, which we'll use something like an LWP::UserAgent to perform requests against the API. I'll come back to how to use that a little later. For now, let's get back to the $session callback!

To use a token, you need to have both the token itself and the token secret. They're both generated during the call to authorize_url, but only the token's value is exposed. The secret is never shared. It is available, though, if you've set up a session callback to save and retrieve values. (The session callback is expected to behave sort of like CGI's venerable param routine.) This is one of those places where the API seems tortured to me, but I'm putting my doubts aside because (a) I don't want to rewrite this library and (b) I don't know enough about the problem space to know whether my feeling is warranted.
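The param-style interface amounts to "one callable: read with one argument, write with two." A Python equivalent of that $session coderef, for illustration:

```python
def make_session():
    """Return a closure that reads with one arg, writes with two."""
    store = {}
    def session(key, *value):
        if value:
            store[key] = value[0]   # session(key, value) stores
        return store.get(key)       # session(key) retrieves
    return session
```

The OAuth client calls it to stash the request token secret during authorize_url, then calls it again to fetch that secret when building the access-token request, which is why skipping the callback loses the secret.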

Anyway, at the end of this program we spit out the token and token secret and we exit. We could instead start making requests, but I always wanted to have two programs for this. It helps me ensure that I've saved the right data for future use, rather than lucking out by getting the program into the right state. After all, I'm only going to get a fresh auth token the first time. Every other time, I'll be running from my saved credentials.

My program to actually fetch my Withings measurements looks like this:

use 5.010;
use Data::Dumper;
use JSON 2;
use Net::OAuth::Client;

my $userid  = 'my-hard-to-find-user-id';
my $api_key = 'my-consumer-key';
my $secret  = 'my-consumer-secret';

my $session = sub {
  state %session;
  my $key = shift;
  return $session{$key} unless @_;
  $session{$key} = shift;
};

my $client = Net::OAuth::Client->new(
  site               => 'https://oauth.withings.com/',
  request_token_path => '/account/request_token',
  authorize_path     => '/account/authorize',
  access_token_path  => '/account/access_token',
  callback           => 'oob',
  session            => $session,
);

my $token   = 'token-from-previous-program';
my $tsecret = 'token-secret-from-previous-program';

my $access_token = Net::OAuth::AccessToken->new(
  client => $client,
  token  => $token,
  token_secret => $tsecret,
);

my $month_ago = $^T - 30 * 86_400;
my $res = $access_token->get(
  'https://wbsapi.withings.net/measure'
  . "?action=getmeas&startdate=$month_ago&userid=$userid"
);

my $payload = JSON->new->decode($res->decoded_content);
my @groups =
  sort { $a->{date} <=> $b->{date} } @{ $payload->{body}{measuregrps} };

for my $group (@groups) {
  my $when   = localtime $group->{date};
  my ($meas) = grep { $_->{type} == 1 } @{ $group->{measures} };

  unless ($meas) { warn "no weight on $when!\n"; next }
  my $kg = $meas->{value} * (10 ** $meas->{unit});
  my $lb = $kg * 2.2046226;
  printf "%s : %5.2f lbs\n", $when, $lb;
}

This starts to look good, to me. I make an OAuth client (the code here is identical to that in the previous program) and then make an AccessToken. Remember, that's the thing that I use like an LWP::UserAgent. Here, once I've got the AccessToken, I get a resource, and from there it's just decoding JSON and mucking about with the result. (The data provided by the Withings measurements API is a bit weird, but not bad. It's certainly not as weird as data I've been given by plenty of other APIs!)
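The weight math from the Perl above, restated: each measure is value * 10**unit kilograms, so a reading of 75000 with unit -3 is 75.000 kg. A Python rendering, for illustration:

```python
def weight_lbs(measure):
    # Withings encodes measurements as value * 10**unit (kilograms),
    # e.g. {"value": 75000, "unit": -3} is 75.000 kg.
    kg = measure["value"] * (10 ** measure["unit"])
    return kg * 2.2046226
```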

I may even go back to update my Instapaper code to use Net::OAuth, if I get a burst of energy. After all, the thing that gave me trouble was dealing with xAuth using Net::OAuth. Now that I have my token, it should just work… right? We'll see.

Microscope (body)

by rjbs, created 2013-10-09 16:58
tagged with: @markup:md games journal rpg

A few years ago I heard about the game Microscope and it sounded way cool. In summary: it is.

It is in some ways like a role-playing game, but in other ways it's something else entirely. When you play Microscope, you're not telling the story of a few characters, and you're not trying to solve a puzzle. You're building a history on a large scale. It's meant for building stories on the scale of decades, centuries, or millennia.

The game starts with a few things being decided before play really begins:

  • what's the general theme of the history being built?
  • what things are out of bounds
  • what things are explicitly allowed

From the start, play rotates. It is a game, although it's a game without victory conditions. Each round of the game, each player makes one or two moves from the short list of possible moves. The possible moves, though, are all of great importance to the final outcome. Basically, each player may:

  • declare the occurrence of a player-described epoch anywhere within the timeline
  • add a major event to an existing epoch
  • invite the rest of the table to narrate a specific scene within the timeline

As with many other story-building games, once a fact is established, it cannot be contradicted. Since there's not really any challenge to getting your facts onto the table, the game is entirely co-operative. There is no fighting over the story allowed. Instead, there's a rule for suggesting "wait, before you write that down, maybe it would be cooler if…"

I only managed to play Microscope once, but it went pretty well. I think after two or three more games, it would be great fun.

I had originally wanted to start a regular set of Microscope games. Whoever committed to each round would show up first for a game of Microscope, establishing a setting. At the end of the session, the players could pick a point (or points) within the history where they'd like to play a traditional RPG, and then we'd have three sessions of that. It struck me as likely to be a ton of fun, but I'm not sure I can really wrangle up players for it. Here's the pitch I wrote myself:

Monthly Microscopy

Microscope is a game of fractal history building. When you play Microscope, you start with a big picture and you end with a complex history spanning decades or centuries. Microscope is a world-building game.

My plan is to play Microscope over and over, building new worlds, and then running traditional tabletop games in those worlds.

Every month, we'll play a game of Microscope. The big picture will be determined before we play, so everyone who shows up will have at least some idea what to expect. (Knowing the big picture only gets you so far in Microscope, though!)

At the end of the game, we'll have our setting described by a set of genre boundaries and specific facts about the world. We'll have to figure out, now, what kind of RPG we want to play in that world. When during the timeline does it take place? Who are the characters? These are answered by bidding.

At the end of each session, each player in attendance gets five points. If it's your first game, you get twenty. At the end of every Microscope game, players can suggest scenarios for the month's RPG, and then bid on a winning suggestion using their points. Each player may bid as many of his or her points across as many of the suggestions as he or she would like. The bids are made in secret, and all bid points are used up.

There will be three post-Microscope sessions each month. They might form a mini-campaign, or they might be three unrelated groups of characters, as determined by the winner of the plot auction.

Each game will be scheduled at least a week in advance, but won't have a fixed schedule. Times and days will move around to be friendly to different time zones and schedules. Microscope games will be played with G+ Hangouts and Docs. RPG sessions will be played on Roll20 — but we might use Skype for voice chat if their voice chat remains as problematic as it's been.

the perl 5.10 lexical topic (body)

by rjbs, created 2013-10-02 20:18
last modified 2013-10-03 10:55

In Perl 5.10, the idea of a lexical topic was introduced. The topic is $_, also known as "the default variable." If you use a built-in routine that really requires a parameter, but don't give it one, the odds are good that it will use $_. For example:

s/old/new/;
chomp;
say;

These three operations, all of which really need a parameter, will use $_. The topic will be substituted-in by s///, chomped by chomp, and said by say. Lots of things use the topic to make the language a little easier to write. In constrained contexts, we can know what we're doing without being explicit about every little thing, because our conversation with the language has been topicalized.

Often, this leads to clear, concise code. Other times, it leads to horrible, hateful action at a distance. Those times are the worst.

Someone somewhere deep in your dependency chain has written:

sub utter {
  $_ = $_[0];
  print "$_\n";
}

And you write some code like:

for (@files) {
  log_event("About to investigate file $_");
  next unless -f && -r;
  investigate_file;
}

Somewhere down the call stack, log_event sometimes calls utter. utter assigned to the topic, but didn't localize, and if nothing between your code and utter localized, then it will assign to your topic, which happens to be aliased to an element in @files. The filename gets replaced with a logging string; the string fails the (-f && -r) test, so the file isn't investigated. This is a bug, but it's not a bug in perl, it's a bug in your code. Is it a bug that this bug is so easy to write?

Well, that's hard to say. I don't think so. It's quite a bit of rope, though, that we're giving you with a default, global variable that often gets aliased by default!

If the variable wasn't global, though, this problem would be cleared up. We'd have a topic just for the piece of code you're looking at, and you could hold the whole thing in your head, and you'd be okay. We already have a kind of variable for that: lexical variables! So, Perl 5.10 introduced my $_.

So, to avoid having your topic clobbered, you could rewrite that loop:

for my $_ (@files) {
  log_event("About to investigate file $_");
  next unless -f && -r;
  investigate_file;
}

When log_event is entered, it has no way to see your lexical topic — the one with a filename in it. It can't alter it, either. It's like you've finally graduated to a language with proper scoping! The built-in filetest operators know to look at the lexical topic, if it's in effect, so they just work. What about investigate_file? It's a user-defined subroutine, and it wants to be able to default to $_ if no argument was passed.

Well, it will need to get updated, too. Previously it was written as:

sub investigate_file {
  my ($filename, @rest) = @_ ? (@_) : ($_);
  ...;
}

Now, though, that $_ wouldn't be able to see your lexical topic. You need to tell perl that you want to pierce the veil of lexicality.

sub investigate_file (_) {
  my ($filename, @rest) = @_ ? (@_) : ($_);
  ...;
}

That underscore in the prototype says "if I get no arguments, alias $_[0] to whichever topic is in effect where I was called." That's great and does just what we want, but there's another problem. We put a (_) prototype on our function. We actually needed (_@), because we take more than one argument. Or stated more simply: the other problem is that now we're thinking about prototypes, which is almost always a road to depression.

Anyway, what we've seen so far is that to gain much benefit from the lexical topic, we also need to update any topic-handling subroutine that's called while the topic is lexicalized. This starts to mean that you're auditing the code you call to make sure that it will work. This is a bummer, but it's only one layer deep that you need to worry about, because your lexical topic ends up in the subroutine's @_. It does not, for example, end up in a similarly-lexicalized topic in that subroutine. Phew!

The story doesn't end here, though. There's another wrinkle, and it's a pretty wrinkly one.

One of the cool things we can do with lexical variables is build closures over them. Behold, the canonical example:

sub counter {
  my $i = 0;
  return sub { $i++ };
}

Once $_ is a lexical variable, we can close over it, too. Is this a problem? Maybe not. Maybe this is really cool:

for my $_ (@messages) {
  push @callbacks, sub { chomp; say };
}

Those nice compact callbacks use the default variable, but they have closed over the lexical topic as their default variable. Nice!

Unfortunately it isn't always that nice:

use Try::Tiny;

for my $_ (@messages) {
  # a dozen lines of code

  try {
    ...;
  } catch {
    return unless /^critical: /;
    log_exception;
  };
}

Even though they look like blocks, the things between squiggly braces at try and catch are subroutines, so there's a calling boundary there. When the sub passed to catch is going to get called, the exception that was thrown has been put into $_. It's been put into the global topic, because otherwise it just couldn't work. It can't communicate its lexical topic into a subroutine that wasn't defined within its lexical environment. Subroutines only close over lexicals in their defining environment.

Speaking of which, there's a lexical $_ in the environment in which the catch sub is defined. In case you're on the edge of your seat wondering: yes, it will close over that topic. The $_ in the catch block won't match a regex against the $_ that has the exception in it, it will match against the lexical topic established way back up at the top of the for loop. What about log_exception? Well, it will get one topic or the other, depending on its subroutine prototype.

And, hey, that's one of the two ways we can fix the catch block above:

catch sub (_) {                 ┃   catch {
  return unless /^critical: /;  ┃     return unless $::_ =~ /^critical: /;
  log_exception;                ┃     log_exception($::_);
};                              ┃   };

You can take your pick about which is worse.

The last time this topic (ha ha) came up on perl5-porters, it wasn't clear how this would all get fixed, boiling down to something like "maybe the feature can be fixed (in a way to be specified later)".

…and that's why my $_ became experimental in Perl 5.18.0. It seems like it just didn't work. It was a good idea to start with, and it solves a real problem, and it seems like it could make the whole language make more sense. In practice, though, it leads to confusing action-at-a-distance-y problems, because it pits the language's fundamentals against each other. If we fix the lexical topic, it will almost certainly change how it works or is used, so relying heavily on its current behavior would be a bad idea. If we can't fix the lexical topic, we'll remove it. That makes relying on its behavior just as bad. When relying on a feature's current behavior is a bad idea, we mark it experimental and issue warnings, and that's just what we've done in v5.18.0.

I went to Tokyo! (body)

by rjbs, created 2013-09-26 13:30
tagged with: @markup:md journal perl yapc
I must have done something right when I attended YAPC::Asia 2011, because they
invited me back this year. I was *delighted* to accept the invitation, and I'm
glad I did.

night in Tokyo

I said I'd give a talk on the state of things in Perl 5, which I'd done at YAPC::NA and OSCON, and which had gone well. It seemed like the topic to cover, given that I was presumably being invited over in large part due to my current work as pumpking. I only realized at the last minute that I was giving the talk as a keynote to a plenary session. This is probably good. If I'd known further in advance, I might have been tempted to do more editing, which would likely have been a mistake.

Closer to the conference, I was asked whether I could pick up an empty slot and do something, and of course I agreed. I had some pipe dreams of making a new talk on the spot, but cooler heads prevailed and I did a long-shelved talk about Dist::Zilla.

Both talks went acceptably, although I was unhappy with the Dist::Zilla talk. I think there's probably a reason I shelved it. If I do talk about Dist::Zilla again, I'll write new material. The keynote went very well, although it wasn't quite a full house. I wasn't opposite any speaker, sure, but I was competing with the iPhone 5s/5c launch. Ah, well! I got laughs and questions, both of which were not guaranteed. I also think I got played off about ten minutes early, so I rushed through the end when I didn't need to.

This wouldn't have happened, if I'd stuck to my usual practices. Normally when I give a talk, I put my iPhone on the podium with a clock or timer on it, and I time myself. I had been using an iOS app called Night Stand for this, the last few years, but I couldn't, on Friday. I had, for no very good reason, decided to upgrade my iPhone and iPad to iOS 7 on the morning before the conference. Despite briefly bricking both devices, I only encountered one real problem: Night Stand was no longer installing. After my keynote, I went and installed a replacement app, and chastised myself for not sticking to my usual routine.

By the time I was giving that keynote, I'd been in town for four days. A lot of activity tends to follow YAPCs, so it would've been nice to stick around afterward instead, but I was concerned about getting my body at least somewhat onto Tokyo time beforehand. Showing up to give a presentation half dead didn't seem like a good plan.

The trip wasn't great. I left home around 5:30 in the morning and headed to the bus stop. Even though it was going to be 80°F most of my time in Tokyo, it was only 40°F that morning, and I traveled in long pants. I agonized over this, and had thought about wearing sweat pants over shorts, or changing once I got to the airport. I decided this was ridiculous, though. It turned out, later, that I was wrong.

I flew out of Newark, which was just the way it always is. I avoided eating anything much because prices there are insane, but when my flight was delayed for three hours, I broke down and had a slice of pizza and an orangina. I also used the time to complete my "learn a new game each week" goal by learning backgammon. I killed a lot of time over the next few days with that app. It didn't take long to get bored of my AI opponent, but I haven't yet played against a human being.

The flight was pretty lousy. I'd been unable to get an aisle seat, so I wasn't able to get up and move around as much as I wanted. Worse, the plane was hot. I've always found planes to be a little too warm on the ground and a little too cool in the air. The sun was constantly baking my side of the plane, though, so it was nearly hot to the touch. I was sweating and gross, and I wished I had switched to shorts. The food was below average. I chose a bad movie to watch. When we finally landed, immigration took about an hour. I began to despair. It would be 24 hours of travel by the time I reached the Pauleys', where I would stay. Was I really going to endure another awful 24 hours in just six days?

My spirits were lifted once I got out of the airport. (Isn't that always the way?) I changed my dollars to yen, bought a tiny bottle of some form of Pepsi, and went to squint at a subway map.

pepsi nex: king of zero

On my previous trip, I had been utterly defeated by the map of the subway at Narita. It looks a lot like any other subway map, but at each station are two numbers, each 3-4 digits. Were they time? Station numbers? Did I need to specify these to buy a ticket? The ticketing machines, though they spoke English, were also baffling. I was lost and finally just asked the station agent for help getting to Ueno Station.

This time, I felt like an old hand. I had forgotten all about the sign, but its meaning was immediately clear. They were prices for travel to each station, in yen, for each of the two lines that serviced the route. I fed ¥1200 into a ticket machine, bought a ticket, and got on the Keisei line toward Ueno. I probably could've done it with the machine's Japanese interface! I felt like a champ. Later, of course, I'd realize that the Keisei line takes a lot longer than the Skyliner, so maybe it wasn't the best choice… but I still felt good. Also, that long ride gave me time to finally finish reading It Can't Happen Here. Good riddance to that book!

My sense of accomplishment continued as I remembered the way to Marty and Karen's place. When I got in, I called home and confirmed that I was alive. I said that before we did anything else, I needed a shower. Then we chatted about this and that for a few hours and I decided that I didn't need to eat, just sleep. When I woke up, the sun was already up! It was a great victory over jet lag! Then I realized that it was 5:30 a.m., and the sun just gets up quite early in Tokyo. Land of the rising sun, indeed!

I got some work done and called home again. (Every time I travel, FaceTime grows more excellent!) Eventually, Karen and I headed out to check out the things I'd put onto my "check out while in Tokyo" list. First up, the Meiji Shrine!

Meiji Shrine entrance

We went to shops, did some wandering, and did not eat at Joël Robuchon's place in Roppongi. (Drat!) We got soba and retired for the night. The next day, we met up with Shawn Moore for more adventures. We went to Yoyogi Park, got izakaya with Keith Bawden, Daisuke Maki, et al., and Shawn and I ended our night with our Japanese Perl Monger hosts. We had a variety of izakaya food, but nothing compared, for me, to a plate of sauteed cabbage and anchovy. I could've eaten that all night. I also learned, I think, that I don't like uni. Good to know!

The next day, Shawn, Karen, and I headed down to Yokohama. Shawn and I had to get checked into our hotel. We planned to get to Kamakura to see the statue of the Amida Buddha, but got too late of a start. They both shrugged it off, but I felt I was to blame: we had to wait while I un-bricked my iPhone after my first attempt to upgrade its OS. Sorry! (Of course, they got to go later, so I'm not that sorry!)

Before leaving Minami-Senju, though, we got curry. Shawn had been very excited for CoCo Curry on our 2011 trip, and I was excited for it this time. Their curry comes in ten levels of hotness. I'd gotten level five, last time, and this time got six. In theory, you have to provide proof that you've had level five before (and, you know, lived) in order to get level six. I didn't have my proof, though, and I thought I might need Shawn to badger the waitress for me. Nope! I got served without being carded. I had found level five to be fairly bland, and so I expected six to be just a bit spicy. It was hot! I didn't get a photo! I really enjoyed it, and would definitely order it regularly if we had a CoCo Curry place in Pennsylvania.

If I go back to Tokyo, I will eat level seven CoCo Curry. This is my promise to you, future YAPC::Asia organizer. Yes, you may watch to see if I cry.

Our hotel was just fine. The only room I could get was a smoking room (yuck) but that was the only complaint I had, and I knew what I was getting into there. For some reason we turned on the television, and sumo was on. We stared at this for a while, transfixed. It didn't last long, though. The spectacle was interesting, but the sport much less so, at least to me. Karen hit the road, Shawn and I worked on slides in earnest, and then we headed out to look for food. I put Shawn in charge (this was a common theme of my trip) and he found an excellent yakiniku place. We ordered a bunch of stuff with no idea what it was, except the tongue, and were not disappointed. (Shawn warned me at the outset: "I don't know a lot of food words.")

"What did I just eat?"

After some more slide wrangling, we crashed and, the next morning, were off to the conference.

YAPC::Asia is a strange conference for me. On both of my trips there, I've been an invited speaker, and felt very welcome… but feeling welcome isn't the same as feeling like a part of things. The language barrier is very difficult to get past. It's frustrating, because you find yourself in a room full of brilliant, funny, interesting people, but you can't quite participate. It's sort of like being a young child again.

Of course, that's what happens when the room is full of Japanese-speakers listening to another Japanese-speaker. It certainly need not be the case in one-on-one conversation. I chatted with Daisuke Maki, Kenichi Ishigaki, Hiroaki Kobayashi, and some others, but it was far too few and too infrequent. It was much easier to stick to talking to the people I already knew. In retrospect, this was pretty stupid. While it's true that I don't see (say) Paul and Shawn and Karen very often, I can talk to them whenever I want, and I know what topics to ask them about and so on.

This year, YAPC::Asia had eleven hundred people. So, that's something like a dozen that I knew and 1088 that I didn't. Heck, there were even a few westerners I didn't go pester, where there'd be no language issue. I wanted to try to convince more of the amazing talent in the Japanese Perl community to come hack on perl5.git, and for the most part, I did not do this outside of my in-talk exhortation. In that sense, my YAPC::Asia was a failure of my own making, and I regret my timidity.

In every other aspect, the conference was an amazing success as far as I could tell. It was extremely friendly, professional, energetic, and informative. I sat through a number of talks in Japanese, and they were really interesting. People sometimes talk about how there's "CPAN" and "Darkpan" and that's that. You're either working with "the community" or you're not. The reality is that there are multiple groups. Of course "we" know that in "the" community. How much crossover is there between the Dancer community and the Perl 5 Porters? Some. Well, the Japanese Perl community — or, rather, the community in Japan that made YAPC::Asia happen — has some crossover with the community that makes YAPC::NA happen, but there are large disjunct segments, and they're solving problems differently, and it's ridiculous to imagine that we can't learn from each other. Even if it wasn't self-evident, it was evident in the presentations that were given.

look at all those volunteers

After attending the largest YAPC ever, by quite a lot (at 1100 people!) it was also sad to learn that this may be the last YAPC::Asia in Tokyo for some time. The organizers, Daisuke Maki and the enigmatic "941" have been doing it for years, and have declared that they're done with it. It seems unlikely that anyone will step in and run the conference in their stead, at least in Tokyo. There may be, they suggested, a change to regional Perl workshops: one in Sapporo, one in Osaka, and so on. Perl workshops are great, but will I make it to the Osaka Perl Workshop? Well, we'll see.

If I do, though, I'm going to do my best Paul Fenwick impression and force everyone there to talk to me all the time.

When the conference was over, Karen, Marty, Paul, Shawn and I headed to dinner (with Marcel, Reini, and Mirjam) and then to… karaoke! At first, Marty was reticent and not sure he'd stick around. Paul's opening number changed his mind, though, and we sang ridiculous songs for ninety minutes. I drank a Zima. A Zima! I thought this was pretty ridiculous, but Paul one-upped me, or perhaps million-upped me, by ordering a cocktail made with pig placenta. I declined to sample it.

The next day, after a final FaceTime chat with Gloria and a final high five for Paul, I headed out to the airport. In 2011, I cut it incredibly close and nearly missed my plane, and I wasn't going to do that this time. Miyagawa pointed me toward the Musashikosugi JR line and warned me that the ticket terminals there were confusing. He was right, too. I wasted ten minutes trying to figure them out before finally asking the station agent for help. If I'd just started there, I would've made an earlier train and not ended up sitting on a bench for forty minutes. So I ended my last train ride in Tokyo much as I began my first one: baffled by the system, reduced to pleading for help. I didn't mind, really. I'd just finished an excellent trip and was feeling great. (I also felt pretty good about blaming the computer and not myself, but that's another matter.)

Hello Kitty plane

Narita was fine. Great, even! The airline staff treated me like a king. I got moved to an aisle seat with nobody beside me! I killed time in the United lounge, had a few free beers, and transferred some movies to my iPad. In short order, we were aboard and headed home. The flight was only eleven hours, customs was quick, and soon (finally!) I was reunited with my family and off to Cracker Barrel for a "welcome back to America" dinner.

It was a great YAPC, and the most important thing I learned was the same as always: I'm there to talk to the people, not listen to the talks. I'll do better next time!

lexical subroutines in perl 5 (body)

by rjbs, created 2013-09-25 19:50
last modified 2013-09-26 20:22

One of the big new experimental features in Perl 5.18.0 is lexical subroutines. In other words, you can write this:

my sub quickly { ... }
my @sorted = sort quickly @list;

my sub greppy (&@) { ... }
my @grepped = greppy { ... } @input;

These two examples highlight cases where lexical references to anonymous subroutines would not have worked. The first argument to sort must be a block or a subroutine name, which leads to awful code like this:

sort { $subref->($a, $b) } @list

With our greppy, above, we get to benefit from the parser-affecting behaviors of subroutine prototypes. Although you can write sub (&@) { ... }, it has no effect unless you install that into a named subroutine, and it needs to be done early enough.

On the other hand, lexical subroutines aren't just drop-in replacements for code refs. You can't pass them around and have them retain their named-sub behavior, because you'll still just have a reference to them. They won't be "really named." So if you can't use them as parameters, what are their benefits over named subs?

First of all, privacy. Sometimes, I see code like this:

package Abulafia;

our $Counter = 0;


Why isn't $Counter lexical? Is it part of the interface? Is it useful to have it shared? Would my code be safer if that was lexical, and thus hidden from casual accidents or stupid ideas? In general, I make all those sorts of variables lexical, just to make myself think harder before messing around with their values. If I need to be able to change them, after all, it's only a one word diff!

Well, named subroutines are, like our variables, global in scope. If you think you should be using lexical variables for things that aren't API, maybe you should be using lexical subroutines, too. Then again, you may have to be careful in thinking about what "aren't API" means. Consider this:

package Service::Client;
sub _ua { LWP::UserAgent->new(...) }

In testing, you've been making a subclass of Service::Client that overrides _ua to use a test UA. If you make that subroutine lexical, you can't override it in the subclass. In fact, if it's lexical, it won't participate in method dispatch at all, which means you're probably breaking your main class, too! After all, method dispatch starts in the package on which a method was invoked, then works its way up the packages in @ISA. Well, package means package variables, and that excludes lexical subroutines.
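A tiny sketch of that trade-off (hypothetical package; under v5.18's experimental feature):

```perl
use v5.18;
use feature 'lexical_subs';
no warnings 'experimental::lexical_subs';

package Service::Client {
  my sub _ua { 'real ua' }

  # Method dispatch looks in the package stash (and then @ISA), never in
  # the enclosing lexical scope, so this call can't find _ua and dies.
  sub fetch_via_method { $_[0]->_ua }

  # A plain sub call is resolved lexically, so this works -- but now a
  # subclass has no way to override _ua.
  sub fetch_direct { _ua() }
}

say Service::Client->fetch_direct;                            # real ua
say eval { Service::Client->fetch_via_method } // 'no dice';  # no dice
```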

So, it may be worth doing, but it means more thinking (about whether or not to lexicalize each non-public sub), which is something I try to avoid when coding.

So when is it useful? I see two scenarios.

The first is when you want to build a closure that's only used in one subroutine. You could make a big stretch, here, and talk about creating a DSL within your subroutine. I wouldn't, though.

# Please forgive this extremely contrived example. -- rjbs, 2013-09-25
sub dothings {
  my ($x, $y, @rest) = @_;

  my sub with_rest (&) { map $_[0]->(), @rest; }

  my @to_x = with_rest { $_ ** $x };
  my @to_y = with_rest { $_ ** $y };
  ...
}

I have no doubt that I will end up using this pattern someday. Why do I know this? Because I have written Python, and this is how named functions work there, and I use them!

There's another form, though, which I find even more interesting.

In my tests, I often make a bunch of little packages or classes in one file.

package Tester {
  sub do_testing { ... }
}

package Targeter {
  sub get_targets { ... }
}

Tester->do_testing($_) for Targeter->get_targets(%param);

Sometimes, I want to have some helper that they can all use, which I might write like this:

sub logger { diag shift; diag explain(shift) }

package Tester {
  sub do_testing {
    logger(testing => \@_);
    ...
  }
}

package Targeter {
  sub get_targets {
    logger(targeting => \@_);
    ...
  }
}

Tester->do_testing($_) for Targeter->get_targets;

Well… I might write it like that, but it won't work. logger is defined in one package (presumably main::) and then called from two different packages. Subroutine lookup is per-package, so you won't find logger. What you need is a name lookup that isn't package based, but, well, what's the word? Lexical!

So, you could make that a lexical subroutine by sticking my in front of the subroutine declaration (and adding use feature 'lexical_subs' and, for now, no warnings 'experimental::lexical_subs'). There are problems, though, like the fact that caller doesn't give great answers, yet. And we can't really monkeypatch that subroutine, if we wanted, which we might. (Strangely abusing stuff is more acceptable in tests than in the production code, in my book.) What we might want instead is a lexical name for a package subroutine. We have that already! We just write this:

our sub logger { ... }
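Put together as a runnable sketch (hypothetical helper; diag swapped for a plain warn so it runs outside a test file), our sub gives the helper both a package name and a lexical name, so both packages below can call it unqualified:

```perl
use v5.18;
use feature 'lexical_subs';
no warnings 'experimental::lexical_subs';

# our sub: the sub lives in the current package (so caller and
# monkeypatching behave normally) but also gets a lexical name that the
# package blocks below can see.
our sub logger { warn "# $_[0]\n"; return $_[0] }

package Tester {
  sub do_testing { logger("testing $_[1]") }
}

package Targeter {
  sub get_targets { logger('targeting'); return qw(t1 t2) }
}

Tester->do_testing($_) for Targeter->get_targets;
```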

I'm not using lexical subs much, yet, but I'm pretty sure I will use them a good bit more in the future!

The Great Infocom Replay: Starcross (body)

by rjbs, created 2013-09-22 19:51

Having finished the Zork trilogy, it was time for me to continue on into the great post-Zork canon. I was excited for this, because it means lots of games that I haven't played yet. First up: Starcross. I was especially excited for Starcross! It's the first of Infocom's sci-fi games, and I only remembered hearing good things. I'd meant to get started on the flight to YAPC::Asia, but didn't manage until I'd begun coming home. On the train to Narita, things got off to a weird start.

First, I realized I needed to consult the game's manual to get started. I'm not sure if this was done for fun or as copy protection, but fortunately I had a scan of the file I needed. After getting into the meat of the game, it was time to get mapping. Mapping Starcross took a while to get right, but it was fun. The game takes place on a huge space station, a rotating cylinder, in which some of the hallways are endless rings. I liked the idea, but I think that up/down, port/starboard, and fore/aft were used in a pretty confusing way. I'm not sure the map really made sense, but it was a nice change of pace without being totally incomprehensible.

The game's puzzles had a lot going for them. It was clear when there was a puzzle to solve, and it was often clear what had to be done, but not quite how. Some objects had multiple uses, and some puzzles had multiple solutions. Unfortunately, it has a ton of the classic text adventure problems, and they drained the fun from the game at nearly every turn.

The game can silently enter an unwinnable state, which you don't work out until you can't solve the next puzzle. (It occurs to me that an interpreter with its own UNDO would be a big help here, since I don't save enough.)

There are tasks that need to be repeated, despite appearances. Something like this happens:

You root around but don't find anything.

You still don't find anything.

Hey, look, a vital object for solving the game!
[ Your score has gone up 25 points. ]

…and my head explodes.

There are guess-the-verb puzzles, which far too often have as the "right" verb a really strange option. For example, there's a long-dead spaceman, now just a skeleton in a space suit.

It's a space suit with a dead alien in it.

You don't see anything special.

It sure is dead.

Something falls out of the sleeve of the suit!


There's a "thief" character that picks up objects and moves them around. It's used to good effect (as was the thief in Zork Ⅰ) but it wastes time. Wasting time wouldn't be a problem, if there wasn't a sort of time limit built into the game. The time limit can be worked around, but it means you need to play the game in the right order, which might mean going back to an early save once you work that out. (Why is it that I love figuring out the best play order in Suspended, but not anything else?) Even that wouldn't be so bad, in part because, happily, I had started by solving a number of puzzles that can be solved in any order, but there was a problem. Most of the game's puzzles center around collecting keys, so by the end of the game you're carrying a bunch of keys, not to mention a few objects key to getting the remaining keys… and there's an inventory limit. It's not even a good inventory limit, where the game just says "you can't carry anything more." Instead, it's the kind where, when you're carrying too much, you start dropping random things.


It did lead to one amusing thing, at least, when I tried to pick up a key and accidentally dropped the space suit I was wearing.

Still, the game is good. I particularly like the representational puzzles, like the solar system and repair room. Its prose is good, but neither as economical as earlier games nor as rich as later ones, making it inferior to both. As in earlier games, I'm frustrated by the number of things mentioned but not examinable. Getting "I don't know that word [which I just used]" is worse than "you won't need to refer to that." I'm hoping that the larger dictionaries of v5 games will allow for better messages like that. I've got a good dozen games until I get to those, though.

Next up will be Suspended. I'm not sure how that will go, since I've played that game many times every year for the past decade or so. After that, The Witness, about which I know nearly nothing!

the Zork Standard Code for Information Interchange (body)

by rjbs, created 2013-09-15 11:33
last modified 2013-09-16 12:15

I always feel a little amazed when I realize how many of the things that really interest me, today, are things that I was introduced to by my father. Often, they're not even things that I think he's passionate about. They're just things we did together, and that was enough.

One of the things I really enjoyed doing with him was playing text adventures. It's strange, because I think we only did three (the Zork trilogy) and I was not very good at them. I got in trouble for sneaking out the Invisi-Clues hint book at one point and looking up answers for problems we hadn't seen yet. What was I thinking?

Still, it's stuck with me, and I'm glad, because I still enjoy replaying those games, trying to write my own, and reading about the craft. Most of my (lousy, unfinished) attempts to make good text adventures have been about making the game using existing tools. (Generally, Inform 6. Inform 7 looks amazing, but also like it's not for me.) Sometimes, though, I've felt like dabbling in the technical side of things, and that usually means playing around with the Z-Machine.

Most recently, I was thinking about writing an assembler to build Z-Machine code, and my thinking was that I'd write it in Perl 6. It didn't go too badly, at first. I wrote a Perl 6 program that built a very simple Z-Machine executable, I learned more Perl 6, and I even got my first commit into the Rakudo project. The very simple program was basically "Hello, World!" but it was just a bit more complicated than it might sound, because the Z-Machine has its own text encoding format called ZSCII, the Zork Standard Code for Information Interchange, and dealing with ZSCII took up about a third of my program. Almost all the rest was boilerplate to output required fields of the output binary, so really the ZSCII code was most of the significant code in this program. I wanted to write about ZSCII, how it works, and my experience writing (in Perl 5) ZMachine::ZSCII.

First, a quick refresher on some terminology, at least as I'll be using it:

  • a character set maps abstract characters to numbers (called code points) and back
  • an encoding maps from those numbers to octets and back, making it possible to store them in memory

We often hear people talking about how Latin-1 is both of these things, but in Unicode they are distinct. That is: there are fewer than 256 characters in Latin-1, so we can always store a character's code point in a single octet. In Unicode, there are vastly more than 256 characters, so we must use a non-identity encoding scheme. UTF-8 is very common, and uses variable-length sequences of bytes. UTF-16 is also common, and uses different variable-length byte sequences. There are plenty of other encodings for Unicode characters, too.
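To make that concrete, here's a quick sketch using the core Encode module: one abstract character, three different octet sequences.

```perl
use v5.10;
use Encode qw(encode);

my $y = "\x{FF}";    # ÿ, one abstract character

# Latin-1 can store its code point in a single octet; Unicode encodings
# of the same code point need two octets each.
say unpack 'H*', encode('Latin-1',  $y);   # ff
say unpack 'H*', encode('UTF-8',    $y);   # c3bf
say unpack 'H*', encode('UTF-16BE', $y);   # 00ff
```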

The Z-Machine's text representation has distinct character set and encoding layers, and they are weird.

The Z-Machine Character Set

Let's start with the character set. The Z-Machine character set is not one character set, but a per-program set. The basic mapping looks something like this:

| 000 - 01F | unassigned, save for (␀, ␡, ␉, ␤, and "sentence space") |
| 020 - 07E | same as ASCII                                           |
| 07F - 080 | unassigned                                              |
| 081 - 09A | control characters                                      |
| 09B - 0FB | extra characters                                        |
| 0FC - 0FE | control characters                                      |
| 0FF - 3FF | unassigned                                              |

There are a few things of note: first, the overlap with ASCII is great if you're American:

20-2F: ␠ ! " # $ % & ' ( ) * + , - . /
30-39: 0 1 2 3 4 5 6 7 8 9
3A-40: : ; < = > ? @
41-5A: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
5B-60: [ \ ] ^ _ `
61-7A: a b c d e f g h i j k l m n o p q r s t u v w x y z
7B-7E: { | } ~

The next thing to note is the "extra characters," which is where you'll be headed if you're not just speaking English. Those 97 code points can be defined by the programmer. Most of the time, they basically extend the character repertoire to cover Latin-1. When that's not useful, though, the Z-Machine executable may provide its own mapping of these extra characters by providing an array of words called the Unicode translation table. Each position in the array maps to one extra character, and each value maps to a Unicode codepoint in the basic multilingual plane. In other words, the Z-Machine does not support Emoji.

So: ZSCII is not actually a character set, but a vast family of many possible user-defined character sets.

Finally, you may have noticed that the basic mapping table gave (unassigned) code points from 0x0FF to 0x3FF. Why's that? Well, the encoding mechanism, which we'll get to soon, lets you decode to 10-bit codepoints. My understanding, though, is that the only possible uses for this would be extremely esoteric. They can't form useful sentinel values because, as best as I can tell, there is no way to read a sequence of decoded codepoints from memory. Instead, they're always printed, and presumably the best output you'll get from one of these codepoints will be �.

Here's a string of text: Queensrÿche

Assuming the default Unicode translation table, here are the codepoints:

Unicode: 51 75 65 65 6E 73 72 FF 63 68 65

ZSCII  : 51 75 65 65 6E 73 72 A6 63 68 65
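A sketch of that mapping (not the actual ZMachine::ZSCII interface; the reverse translation table here covers just a couple of entries from the default table):

```perl
use v5.10;

# ZSCII code points for Unicode characters outside the ASCII overlap,
# per the default Unicode translation table (two entries shown).
my %zscii_for = (
  0x00FF => 0xA6,   # ÿ is extra character 166
  0x00DF => 0xA1,   # ß is extra character 161
);

sub unicode_to_zscii {
  my ($string) = @_;
  return map {
    my $cp = ord;
    $cp >= 0x20 && $cp <= 0x7E ? $cp                    # ASCII overlap
      : $zscii_for{$cp} // die sprintf "no ZSCII for U+%04X", $cp;
  } split //, $string;
}

my @zscii = unicode_to_zscii("Queensr\x{FF}che");
say join q{ }, map { sprintf '%02X', $_ } @zscii;
# 51 75 65 65 6E 73 72 A6 63 68 65
```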

This all seems pretty simple so far, I think. The per-program table of extra characters is a bit weird, and the set of control characters (which I didn't discuss) is sometimes a bit weird. Mostly, though, it's all simple and reasonable. That's good, because things will get weirder as we try putting this into octets.

Z-Machine Character Encoding

The first thing you need to know is that we encode in two layers to get to octets. We're starting with ZSCII text. Any given piece of text is a sequence of ZSCII code points, each between 0 and 1023 (really 255) inclusive. Before we can get to octets, we first build pentets. I just made that word up. I hope you like it. It's a five-bit value, meaning it ranges from 0 to 31, inclusive.

What we actually talk about in Z-Machine jargon isn't pentets, but Z-characters. Keep that in mind: a character in ZSCII is distinct from a Z-character!

Obviously, we can't fit a ZSCII character, which ranges over 255 points, into a Z-character. We can't even fit the range of the ZSCII/ASCII intersection into five bits. What's going on?

We start by looking up Z-characters in this table:

  0                               1
  0 1 2 3 4 5 6 7 8 9 A B C D E F 0 1 2 3 4 5 6 7 8 9 A B C D E F
  ␠       ␏ ␏ a b c d e f g h i j k l m n o p q r s t u v w x y z

In all cases, the value at the bottom is a ZSCII character, so you can represent a space (␠) with ZSCII character 0x020, and encode that to the Z-character 0x00. So, where's everything else? It's got to be in that range from 0x00 to 0x1F, somehow! The answer lies with those funny little "shift in" glyphs under 0x04 and 0x05. The table above was incomplete. It is only the first of the three "alphabets" of available Z-characters. The full table would look like this:

      0                               1
      0 1 2 3 4 5 6 7 8 9 A B C D E F 0 1 2 3 4 5 6 7 8 9 A B C D E F
  A0  ␠       ␏ ␏ a b c d e f g h i j k l m n o p q r s t u v w x y z
  A1  ␠       ␏ ␏ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
  A2  ␠       ␏ ␏ … ␤ 0 1 2 3 4 5 6 7 8 9 . , ! ? _ # ' " / \ - : ( )

Strings always begin in alphabet 0. Z-characters 0x04 and 0x05 mark the next character as being in alphabet 1 or alphabet 2, respectively. After that character, the shift is over, so there's no shift character to get to alphabet 0. You won't need it.

So, this gets us all the ZSCII/ASCII intersection characters… almost. The percent sign, for example, is missing. Beyond that, there's no sign of the "extra characters." Now what?

We get to the next layer of mapping via A2-06, represented above as an ellipsis. When we encounter A2-06, we read two more Z-characters, join the two pentets, and interpret the resulting dectet as a big-endian 10-bit integer (the first Z-character supplies the top five bits, the second the bottom five): that's the ZSCII character being represented. So, in a given string of Z-characters, any given ZSCII character might take up:

  • one Z-character (a lowercase ASCII letter)
  • two Z-characters (an uppercase ASCII letter or one of the symbols in A2)
  • four Z-characters (anything else as 0x05 0x06 X Y, where X-Y points to ZSCII)
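To make the decoding direction concrete, here's a sketch in Python. This is illustrative code of my own, not ZMachine::ZSCII's actual implementation, and it assumes the default alphabets:

```python
# Illustrative Python, not ZMachine::ZSCII's code; default alphabets assumed.
A0 = "abcdefghijklmnopqrstuvwxyz"
A1 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
# A2 slot 6 is the ten-bit escape (never looked up); slot 7 is newline.
A2 = "\x00\x0D0123456789.,!?_#'\"/\\-:()"

def zchars_to_zscii(zchars):
    """Decode a list of 5-bit Z-characters into ZSCII code points."""
    out, i = [], 0
    while i < len(zchars):
        z = zchars[i]
        if z == 0:                        # Z-character 0 is a space
            out.append(0x20)
            i += 1
        elif z in (1, 2, 3):              # abbreviations; not expanded here
            i += 2
        elif z in (4, 5):                 # shift into A1 or A2 for one char
            nxt = zchars[i + 1]
            if z == 5 and nxt == 6:       # A2-06: ten-bit escape follows
                top, bottom = zchars[i + 2], zchars[i + 3]
                out.append((top << 5) | bottom)   # first pentet is the top bits
                i += 4
            else:
                out.append(ord((A1 if z == 4 else A2)[nxt - 6]))
                i += 2
        else:                             # 6-31: plain alphabet 0
            out.append(ord(A0[z - 6]))
            i += 1
    return out
```

Feeding this the twenty-four Z-characters from the example below yields the ZSCII points for »Gruß Gott!«, with 0xA1 through 0xA3 being the default "extra character" slots for ß, », and «.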

So, now that we know how to convert a ZSCII character to Z-characters without fail, how do we store that in octets? Easy. Let's encode this string:

»Gruß Gott!«

That maps to these twenty-four Z-characters:

»   05 06 05 02
G   04 0C
r   17
u   1A
ß   05 06 05 01
␠   00
G   04 0C
o   14
t   19
t   19
!   05 14
«   05 06 05 03

We start off with a four Z-character sequence, then a two Z-character sequence, then a few single Z-characters. The whole string of Z-characters should be fairly straightforward. We could just encode each Z-character as an octet, but that would be pretty wasteful. We'd have three unused bits per Z-character, and in 1979 every byte of memory was (in theory) precious. Instead, we'll pack three Z-characters into every word, saving the word's high bit for later. That means we can fit "!«" into two words like so:

!   05 14         0b00101 0b10100
«   05 06 05 03   0b00101 0b00110 0b00101 0b00011


0001 0110   1000 0101
1001 1000   1010 0011

The five-bit runs in each word are the bits of our Z-characters; each word holds three complete Z-characters. The leading bit of each word is the per-word high bit. This bit is always zero, except for the last word in a packed string. If we're given a pointer to a packed string in memory (this, for example, is the argument to the print_addr opcode in the Z-Machine instruction set), we know when to stop reading from memory because we encounter a word with the high bit set.
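The packing step itself is tiny. Here's a Python sketch (again, my own illustration, not the module's code); padding with Z-character 5 is the conventional filler:

```python
def pack_zchars(zchars):
    """Pack 5-bit Z-characters three per 16-bit word, setting the high
    bit on the last word to mark the end of the packed string.
    Pads the final triplet with Z-character 5, the conventional filler."""
    padded = list(zchars) + [5] * (-len(zchars) % 3)
    words = []
    for i in range(0, len(padded), 3):
        a, b, c = padded[i:i + 3]
        words.append((a << 10) | (b << 5) | c)
    words[-1] |= 0x8000                   # terminate the packed string
    return words
```

Packing the six Z-characters for "!«" — `[0x05, 0x14, 0x05, 0x06, 0x05, 0x03]` — produces the two words shown above, 0x1685 and 0x98A3.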

Okay! Now we can take a string of text, represent it as ZSCII characters, convert those to Z-characters, and then pack the whole thing into pairs of octets. Are we done?

Not quite. There are just two things I think are still worth mentioning.

The first is that the three alphabet tables that I named above are not constant. Just like the Unicode translation table, they can be overridden on a per-program basis. Some things are constant, like shift bits and the use of A2-06 as the leader for a four Z-character sequence, but most of the alphabet is up for grabs. The alphabet tables are stored as 78 bytes in memory, with each byte referring to a ZSCII code point. (Once again we see code points between 0x100 and 0x3FF getting snubbed!)
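Reading a custom alphabet table out of those 78 bytes is as simple as it sounds; a Python sketch (names are mine, for illustration):

```python
def read_alphabet_tables(table):
    """Split a 78-byte alphabet table into three rows of 26 ZSCII code
    points, covering Z-characters 6-31 of alphabets A0, A1, and A2."""
    assert len(table) == 78
    return [list(table[row * 26:(row + 1) * 26]) for row in range(3)]
```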

The other thing is abbreviations.

Abbreviations make use of the Z-characters I ignored above: 0x01 through 0x03. When one of these Z-characters is seen, the next character is read. Then this happens:

if (just_saw in (1, 2, 3)) {
  next   = read_another
  offset = 32 * (just_saw - 1) + next
}
offset is the offset into the "abbreviations table." Values in that table are pointers to the memory locations of strings. When the Z-Machine is printing a string of Z-characters and encounters an abbreviation, it looks up the memory address and prints the string there before continuing on with the original string. Abbreviation expansion does not recurse. This can save you a lot of storage if you keep referring to the "localized chronosynclastic infundibulum" throughout your program.
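In other words, Z-characters 1 through 3 each name a 32-entry bank of the abbreviations table; a Python sketch of the offset computation (my function name, for illustration):

```python
def abbreviation_offset(z, nxt):
    """Z-characters 1-3 each select a 32-entry bank of the abbreviations
    table; the following Z-character picks the entry within that bank."""
    assert z in (1, 2, 3)
    return 32 * (z - 1) + nxt
```

So the Z-character pair 02 05 names entry 37 of the table.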


The two main methods of ZMachine::ZSCII should make good sense now:

sub encode {
  my ($self, $string) = @_;

  $string =~ s/\n/\x0D/g; # so we can just use \n instead of \r

  my $zscii  = $self->unicode_to_zscii($string);
  my $zchars = $self->zscii_to_zchars($zscii);

  return $self->pack_zchars($zchars);
}

First we fix up newlines. Then we map the Unicode string's characters to a string of ZSCII characters. Then we map the ZSCII characters into a sequence of Z-characters. Then we pack the Z-characters into words.

At every point, we're dealing with Perl strings, which are just sequences of code points. That is, they're like arrays of non-negative integers. It doesn't matter that $zscii is neither a string of Unicode text nor a string of octets to be printed or stored. After all, if someone has figured out that esoteric use of Z+03FF, then $zscii will contain what Perl calls "wide characters." Printing it will print the internal ("utf8") representation, which won't do anybody a lick of good. Nonetheless, using Perl strings keeps the code simple. Everything uses one abstraction (strings) instead of two (strings and arrays).

Originally, I wrote my ZSCII code in Perl 6, but the Perl 6 implementation was very crude, barely supporting the basics of ASCII-only ZSCII. I'm looking forward to (someday) bringing all the features in my Perl 5 code to the Perl 6 implementation, where I'll get to use distinct types (Str and Buf) for the text and non-text strings, sharing some, but not all, of the abstractions as appropriate.

Until then, I'm not sure what, if anything, I'll use this library for. Writing more of that Z-Machine assembler is tempting, or I might just add abbreviation support. First, though, I think it's time for me to make some more progress on my Great Infocom Replay…

random tables with Roland

by rjbs, created 2013-09-10 08:42
last modified 2013-09-10 08:43

This post is tagged programming and dnd. I don't get to do that often, and I am pleased.

For quite a while, I've been using random tables to avoid responsibility for the things that happen in my D&D games. Instead of deciding on the events that occur at every turn, I create tables that describe the general feeling of a region and then let the dice decide what aspects are visible at any given moment. It has been extremely freeing. There's definitely a different kind of skill needed to get things right and to deal with what the random number gods decide, but I really enjoy it. Among other things, it means that I can do more planning well in advance and have more options at any moment. I don't need to plan a specific adventure or module each week, but instead prepare general ideas of regions on different scales, depending on the amount of time likely to be spent in each place.

Initially, I put these charts in Numbers, which worked pretty well.

Random Encounters spreadsheet

I was happy with some stupid little gimmicks. I color-coded tables to remind me which dice they'd need. The color codes matched up to colored boxes that showed me the distribution of probability on those dice, so I could build the tables with a bit more confidence. It was easy, but I found myself wanting to be able to drill further and further down. What would happen is this: I'd start with an encounter table with 19 entries, using 1d12+1d8 as the number generator. This would do pretty well for a while, but after you've gotten "goblin" a few times, you need more variety. So, next up "goblin" would stop being a result and would start being a redirection. "Go roll on the goblin encounter table."

As these tables multiplied, they became impossible to deal with in Numbers. Beyond that, I wanted more detail to be readily available. The encounter entry might have originally been "2d4 goblins," but now I wanted it to pick between twelve possible kinds of goblin encounters, each with their own number appearing, hit dice, treasure types, reaction modifiers, and so on. I'd be flipping through pages like a lunatic. It would have been possible to inch a bit closer to planning the adventure by pre-rolling all the tables to set up the encounter beforehand and fleshing it out with time to spare, but I wasn't interested in that. Even if I had been, it would have been a lot of boring rolling of dice. That's not what I want out of a D&D game. I want exciting rolling of dice!

I started a program for random encounters in the simplest way I could. A table might look something like this:

type: list
pick: 1
  - Cat
  - Dog
  - Wolf

When that table is consulted, one of its entries is picked at random, all with equal probability. If I wanted to stack the odds, I could put an entry in there multiple times. If I wanted to add new options, I'd just add them to the list. If I wanted to make the table more dice-like, I'd write this:

dice: 2d4
  2: Robot
  3: Hovering Squid
  4: Goblin
  5: Weredog
  6: Quarter-giant
  7: Rival adventurers
  8: Census-taking drone

As you'd expect, it rolls 2d4 to pick from the list.
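Under the hood, a dice table needs only two pieces: a dice-expression roller and a lookup. A minimal Python sketch (Roland itself is Perl; these names are mine, not Roland's):

```python
import random
import re

def roll(spec, rng=random):
    """Sum a dice expression like '2d4' or '1d12 + 1d8'."""
    return sum(rng.randint(1, int(sides))
               for count, sides in re.findall(r"(\d+)d(\d+)", spec)
               for _ in range(int(count)))

# A dice table is just a mapping from totals to entries.
encounters = {2: "Robot", 3: "Hovering Squid", 4: "Goblin", 5: "Weredog",
              6: "Quarter-giant", 7: "Rival adventurers",
              8: "Census-taking drone"}
result = encounters[roll("2d4")]
```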

This was fine for replacing the first, very simple set of tables, but I wanted more, and it was easy to add by making this all nest. For example, this is a table from my test/example files:

dice: 1d4
times: 2
  1: Robot
  2: Hovering Squid
  3:
    dice: 1d4
    times: 1d4
      1: Childhood friend
      2-3: Kitten
      4: Goblin
  4: Ghost

This rolls a d4 to get a result, then rolls it again for another result, and gives both. If either of the results is a 3, then it rolls 1-4 more times for additional options. The output looks like this:

~/code/Roland$ roland eg/dungeon/encounters-L3
Hovering Squid
  Childhood friend

Why are some of those things indented? Because the whole presentation of results stinks, because it's just good enough to get the point across. Oh well.

In the end, in the above examples, the final result is always a string. This isn't really all that useful. There are a bunch of other kinds of results that would be useful. For example, when rolling for an encounter on the first level of a dungeon, it's nice to have a result that says "actually, go roll on the second level, because something decided to come upstairs and look around." It's also great to be able to say, "the encounter is goblins; go use the goblin encounter generator."

Here's a much more full-featured table:

dice: 1d4 + 1d4
  2:  Instant death
  3:  { file: eg/dungeon/encounters-L2 }
  4:  { file: eg/monster/zombie }
  5:
    - { file: [ eg/monster/man, { num: 1 } ] }
    - { file: eg/plan }
    - type: list
      pick: 1
        - canal
        - creek
        - river
        - stream
    - Panama
  6:
    dice: 1d6
      1-2: Tinker
      3-4: Tailor
      5: Soldier
      6:
        type: dict
          - Actually: Spy
          - Cover: { file: eg/job }
  7: { times: 2 }
  8: ~

(No, this is not from an actual campaign. "Instant death" is a bit much, even for me.)

Here, we see a few of Roland's other features. The mapping with file in it tells us to go roll the table found in another file, sometimes (as in the case of the first result under result 5) with extra parameters. We can mix table types. The top-level table is a die-rolling table, but result 5 is not. It's a list table, meaning we get each thing it includes. One of those things is a list table with a pick option, meaning we get that many things picked randomly from the list. Result 7 says "roll again on this table two more times and keep both results." Result 8 says, "nothing happens after all."

Result 6 under result 6 is one I've used pretty rarely. It returns a hash of data. In this case, the encounter is with a spy, but he has a cover job, found by consulting the job table.

Sometimes, in a table like this, I know that I need to force a given result. If I haven't factored all the tables into their own results, I can pass -m to Roland to tell it to let me manually pick the die results, but to let each result have a default-random value. If I want to force result six on the above table, but want its details to be random, I can enter 6 manually and then hit enter until it's done:

~/code/Roland$ roland -m eg/dungeon/encounters-L1
rolling 1d4 + 1d4 for eg/dungeon/encounters-L1 [4]: 6
rolling 1d6 for eg/dungeon/encounters-L1, result 6 [3]: 

Finally, there are the monster-type results. We had this line:

- { file: [ eg/monster/man, { num: 1 } ] }

What's in that file?

type: monster
name: Man
ac: 9
hd: 1
mv: 120'
attacks: 1
damage: 1d4
num: 2d4
save: N
morale: 7
loot: ~
alignment: Lawful
description: Just this guy, you know?
per-unit:
  - label: Is Zombie?
    dice: 1d100
      1: { replace: { file: [ eg/monster/zombie, { num: 1 } ] } }
      2: Infected
      3-100: No

In other words, it's basically a YAML-ified version of a Basic D&D monster block. There are a few additional fields that can be put on here, and we see some of them. For example, per-unit can decorate each unit. (We're expecting 2d4 men, because of the num field, but if you look up at the previous encounter table, you'll see that we can override this to do things like force an encounter with a single creature.) In this case, we'll get a bunch of men, some of whom may be infected or zombified.

Not every value is treated the same way. The number encountered is rolled and used to generate units, and the hd value is used to produce hit points for each one. Even though it looks like a dice specification, damage is left verbatim, since it will get rolled during combat. It's all a bit too special-casey for my tastes, but it works, and that's what matters.
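That rolled-versus-verbatim split might look like this in a Python sketch. None of this is Roland's actual code, and the d8 hit die is my assumption (the usual die for Basic D&D hit dice):

```python
import random
import re

def roll(spec, rng):
    """Sum a dice expression like '2d4' or '2d8'."""
    return sum(rng.randint(1, int(sides))
               for count, sides in re.findall(r"(\d+)d(\d+)", spec)
               for _ in range(int(count)))

def generate_encounter(monster, rng=None):
    """Roll number appearing, then hit points per unit from hit dice;
    'damage' is kept verbatim, to be rolled later during combat."""
    rng = rng or random.Random()
    n = roll(monster["num"], rng)
    units = [{"hp": roll("%dd8" % monster["hd"], rng)} for _ in range(n)]
    return {"name": monster["name"], "appearing": n,
            "damage": monster["damage"], "units": units}

# A hypothetical monster block, shaped like the YAML above.
man = {"name": "Man", "num": "2d4", "hd": 1, "damage": "1d4"}
encounter = generate_encounter(man, random.Random(7))
```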

~/code/Roland$ roland eg/monster/man

Man (wandering)
  No. Appearing: 5
  Hit Dice: 1
  Stats: [ AC 9, Mv 120', Dmg 1d4 ]
  Total XP: 50 (5 x 10 xp)
- Hit points: 6
  Is Zombie?: No
- Hit points: 1
  Is Zombie?: No
- Hit points: 2
  Is Zombie?: No
- Hit points: 5
  Is Zombie?: No
- Hit points: 5
  Is Zombie?: Infected

(Notice the "wandering" up top? You can specify different bits of stats for encountered-in-lair, as described in the old monster blocks.)

In the encounter we just rolled, there were no zombies. If there had been, this line would've come into play:

1: { replace: { file: [ eg/monster/zombie, { num: 1 } ] } }

This replaces the unit with the results of that roll. Let's force the issue:

~/code/Roland$ roland -m eg/monster/man
rolling 2d4 for number of Man [5]: 4
rolling 1d8 for Man #1 hp [7]: 
rolling 1d100 for unit-extra::Is Zombie? [38]: 1
rolling 2d8 for Zombie #1 hp [4]: 
rolling 1d8 for Man #2 hp [8]: 
rolling 1d100 for unit-extra::Is Zombie? [10]: 
rolling 1d8 for Man #3 hp [7]: 
rolling 1d100 for unit-extra::Is Zombie? [90]: 
rolling 1d8 for Man #4 hp [2]: 
rolling 1d100 for unit-extra::Is Zombie? [13]: 

Man (wandering)
  No. Appearing: 3
  Hit Dice: 1
  Stats: [ AC 9, Mv 120', Dmg 1d4 ]
  Total XP: 30 (3 x 10 xp)
- Hit points: 8
  Is Zombie?: No
- Hit points: 7
  Is Zombie?: No
- Hit points: 2
  Is Zombie?: No

Zombie (wandering)
  No. Appearing: 1
  Hit Dice: 2
  Stats: [ AC 7, Mv 90', Dmg 1d6 ]
  Total XP: 20 (1 x 20 xp)
- Hit points: 4

Note that I only supplied overrides for two of the rolls.

You can specify encounter extras, which act like per-unit extras, but for the whole group:

  Hug Price:
    dice: 1d10
      1-9: Free
      10:  10 cp

Finally, sometimes one kind of encounter implies another:

    dice: 1d10
      1-9: ~
      10:  { append: Monolith }

Here, one time out of ten, roboelfs are encountered with a Monolith. That could've been a redirect to describe a monolith, but for now I've just used a string. Later, I can write up a monolith table using whatever form I want. (Most likely, this kind of thing would become a dict with different properties all having embedded subtables.)

Right now, I'm really happy with Roland. Even though it's sort of a mess on many levels, it's good enough to let me get the job done. I think the problem I'm trying to solve is inherently wobbly, and trying to have an extremely robust model for it is going to be a big pain. Even though it goes against my impulses, I'm trying to leave things sort of a mess so that I can keep up with my real goal: making cool random tables.

Roland is on GitHub.
