rjbs forgot what he was saying

Remember the Milk, its API, and a new Perl client

by rjbs, created 2019-11-05 21:11

Hey, I'm finally writing another post about things I did on my week off in August!

I use Remember the Milk for my personal todo lists. It's pretty good! I've been using it for years, and I wax and wane in my attention to my tasks, but it's been a good help and I'm glad to have it. I'd be even happier with some changes, but more on that later.

Years ago, I built Ywar, and I still use it. It tracks my habits, when possible, using the APIs of services where I leave footprints. Am I weighing myself? Am I exercising? Am I doing some reading? Am I closing todo items in Remember the Milk? I get a congratulatory push notification when I hit a goal, and I get an email in the morning telling me what I should do today. These notices help keep me paying attention and motivated. One of the reasons this has worked okay (although I'll definitely admit that Ywar has not remained a massive force for productivity of late) is that there's not much extra work involved. It looks at what I'm already doing and records whether I did what I wanted. This means I have good "did exercise" feedback when I go for a run, because my running app logs to RunKeeper, but nearly no feedback when I lift weights, because my weightlifting app has no API. Less friction leads to greater success.

So, I wanted to apply this to my interactions with RTM. Its web UI is pretty good, and there's a native macOS app that's pretty good, too. (Its macOS app is just the web UI, but I'll take it!) They're both extra apps, though, and I have complicated feelings about how many distinct apps or tabs I want. I won't try to spell it out here, I'll just say: I wasn't as happy as I could be using their UI.

At work, we have a cool bot, Synergy, that provides a chat interface to some of our internal services. I wanted to do the same thing for RTM, which should have been no big deal, except that the existing Perl client library for RTM, WebService::RTMAgent, is a synchronous, blocking interface, and Synergy is event-driven and async. I looked at making it work with futures, but I didn't really want to. It was built on the XML interface, it uses AUTOLOAD, and I just didn't really like its construction. (I have used it for years, though, and it's never really been a problem. It's just not what I wanted to build on.)

Instead, I built a new client library, CamelMilk, modeled lightly on something we'd built at work recently. You feed it your API key and secret, and it can manage auth tokens for users and call API methods. API calls return futures that, when ready, produce simple objects. Here's how it looks, more or less, in use in the Synergy plugin:

  my $rsp_f = $self->rtm_client->api_call('rtm.tasks.add' => {
    auth_token => $token,
    timeline   => $tl,
    name  => $todo_description,
    parse => 1,
  });

  $rsp_f->then(sub ($rsp) {
    unless ($rsp->is_success) {
      $Logger->log([
        "failed to cope with a request to make a task: %s", $rsp->_response,
      ]);
      return $event->reply("Something went wrong creating that task, sorry.");
    }

    $Logger->log([ "made task: %s", $rsp->_response ]);
    return $event->reply("Task created!");
  })->else(sub (@fail) {
    # failure handling is not shown in this excerpt
  });

Nice! This code is called in response to a user saying todo eat a whole pie ^tomorrow. It returns immediately while the API call happens in the background and it replies when it's all done. I wrote a little command line program to go along with the library for setting up auth tokens and making one-off API calls.

While writing this library, though, I ended up feeling less excited than when I started. It turns out: I don't like the Remember the Milk API. The first problem is timelines. Here's what the API docs say:

Timelines enable the Remember The Milk API to allow certain actions to be undone. The Remember The Milk web application requests a new timeline every time the application is visited — it is up to the API user to determine how often to request a new timeline. Timelines do not expire, but they must always be used.

Timelines can be thought of as long-running database transactions within which individual sub-transactions (API method calls) can be reverted. The start of a timeline is a snapshot of the state of a users' contacts, groups, lists and tasks at that point in time. Method calls can be reverted continuously until the start of the timeline is reached.

So, that api_call call above was actually inside another call:

    my $tl_f = $self->timeline_for($event->from_user);

    $tl_f->then(sub ($tl) {
      my $rsp_f = $self->rtm_client->api_call('rtm.tasks.add' => {…});
      # the response handling shown earlier goes here
    });

Either we have a timeline id for that user already ready or we go get one, meaning there's either an additional API call or local state management. That timeline id, though, means that you can later undo some methods. Sort of a niche use, but neat, but it complicates all sorts of actions. As long as you're tracking known timelines for undo, why not track transactions on your own so you can compute the inverse transactions and replay them when needed? Then you'd only do that when you might want to undo.

In practice, what I do, and what at least some other clients do, is make a timeline, cache the id, and never, ever think about it unless you call undo which (I predict, sans evidence) almost nobody ever, ever does.
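The "make a timeline, cache the id, and forget about it" approach might look something like this minimal sketch. It's synchronous for brevity, and the names here are assumptions made up for illustration: `make_timeline_cache` is hypothetical, and the `$create` callback merely stands in for a real call to `rtm.timelines.create`. It is not how CamelMilk or the Synergy plugin actually does it.

```perl
use v5.20.0;
use warnings;

# Returns a closure that hands back a cached timeline id per user, calling
# $create (a stand-in for the real API request) only on a cache miss.
sub make_timeline_cache {
  my ($create) = @_;
  my %cache;

  return sub {
    my ($user) = @_;
    return $cache{$user} //= $create->($user);
  };
}

# Fake "API" for the demo: counts how often a new timeline was requested.
my $api_calls = 0;
my $timeline_for = make_timeline_cache(sub {
  my ($user) = @_;
  $api_calls++;
  return "tl-$user-$api_calls";
});

say $timeline_for->('rjbs');        # first use: one (fake) API call
say $timeline_for->('rjbs');        # cached: no further API call
say "API calls made: $api_calls";
```

The cache lives only as long as the process does, which is fine if you never plan to undo anything across restarts.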

This isn't my real beef, though, it's just a foreshadowing of it. The real beef is that it takes too many HTTP requests to do anything non-trivial. Let's say that I have some paperwork I need to file, so I have a todo for it. I just found out that it's due Friday! I want to set it to priority 1, add the due date, remove the "boring" tag and add the "omg" tag. That will require I call these methods:

  1. rtm.tasks.setDueDate
  2. rtm.tasks.setPriority
  3. rtm.tasks.removeTags
  4. rtm.tasks.setTags

Every one of these is its own HTTP transaction. What happens when you fail partway through your series of calls? I guess it's time to call undo, possibly more than once, since you may need to undo several transactions.

Here's how it might work in JMAP if JMAP task lists were a standard. You have a task with id 123 and you want to do the updates above. You'd make a single JMAP call:

  methodCalls: [
    [ "Task/set", {
        "update": { "123": {
          "dueDate": "2019-11-08T00:00:00Z",
          "priority": 1,
          "tags/boring": false,
          "tags/omg": true,
        } },
    } ]
  ]

Note that the update argument is an object with the id as a key. You can update many tasks at once. Note, too, that Task/set and its arguments are in an array. You can update multiple kinds of things at once. I could create two new lists and all their items like this...

  methodCalls: [
    [ "TaskList/set", {
        "create": {
          "work": { "name": "Work Stuff" },
          "home": { "name": "Home Todos" }
        }
    } ],
    [ "Task/set", {
        "create": {
          "w1": { "name": "Get Hired", "listId": "#work" },
          "w2": { "name": "Get Raise", "listId": "#work" },
          "h1": { "name": "Take Nap",  "listId": "#home" },
        }
    } ]
  ]

Working with a protocol like this makes working in an event loop driven system really nice. You have a lot of options, but a simple one is to stick all your updates into one call, then report back all their results to the individual calling futures, with only a single HTTP transaction required.
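Here's a rough sketch of that "stick all your updates into one call" idea. It uses plain callbacks instead of futures so that it needs nothing outside core Perl, and everything named here (`make_batcher`, the `$send` transport, the `c0`, `c1` call ids) is an assumption for illustration, not a real JMAP client. It does lean on one real property of the protocol: each response carries the call id of the request it answers.

```perl
use v5.20.0;
use warnings;

# A tiny batcher: callers queue a method call plus a callback, and flushing
# sends everything in one request, then routes each result to its caller.
sub make_batcher {
  my ($send) = @_;  # hypothetical transport: arrayref of calls in, responses out
  my @pending;

  my $queue = sub {
    my ($method, $args, $on_result) = @_;
    push @pending, { method => $method, args => $args, cb => $on_result };
    return;
  };

  my $flush = sub {
    my @batch = splice @pending;  # take everything, leaving the queue empty
    return 0 unless @batch;

    # Use each call's position in the batch as its call id.
    my @calls  = map {; [ $batch[$_]{method}, $batch[$_]{args}, "c$_" ] } 0 .. $#batch;
    my %cb_for = map {; ("c$_" => $batch[$_]{cb}) } 0 .. $#batch;

    for my $response (@{ $send->(\@calls) }) {
      my ($name, $result, $call_id) = @$response;
      $cb_for{$call_id}->($name, $result);
    }

    return scalar @batch;
  };

  return ($queue, $flush);
}

# Fake transport for the demo: counts requests and echoes each call back.
my $http_requests = 0;
my $fake_send = sub {
  my ($calls) = @_;
  $http_requests++;
  return [ map {; [ $_->[0], { echo => $_->[1] }, $_->[2] ] } @$calls ];
};

my ($queue, $flush) = make_batcher($fake_send);

my @log;
$queue->('Task/set', { update => { 123 => { priority => 1 } } },
  sub { my ($name, $result) = @_; push @log, "first caller got $name" });
$queue->('TaskList/set', { create => { work => { name => 'Work Stuff' } } },
  sub { my ($name, $result) = @_; push @log, "second caller got $name" });

$flush->();
say for @log;
say "HTTP requests: $http_requests";
```

In an event loop, the flush would presumably run on a short timer or at the end of the current tick, so that calls queued close together share a request.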

Turns out, working with JMAP can really spoil you for other APIs.

Anyway, despite the API being a bit of a drag, Remember The Milk remains great, and I continue to get things done by using it. Now I can get more things done by talking to my Slack bot about my agenda, which is great, and if you want, you can go use the code I wrote for it, too. I've had a look at the network inspector while using RTM, and they've clearly got a better protocol for their UI to use. Maybe someday it will be published for mere users like me, too!

I took some time off!

by rjbs, created 2019-08-11 22:27
tagged with: @markup:md journal

With a lot of PTO hours piled up, leave accounting somewhat in flux at work, and MoMA incredibly closed until October, I resolved to take some time off from work, during which I would stay home and work on stuff that I'd been ignoring. For example, my big queue of "conference presentations to watch" and my backlog of articles to read, personal coding projects to poke at, some little quality-of-life fixes to my home setup. Then, of course, I also wanted to do some actually fun things: hang out with my family, have some nice meals, watch some movies, go to a museum or two, and so on.

Short summary: I think this was a big success. I might shoot for doing this twice a year or so, in the future, and still have time left to take a proper out-of-town vacation.

I think I did pretty well on the "do good relaxing stuff" front. Gloria and I went to the Barnes Foundation, we all got a few meals out, I got ice cream, and I went to the local cider place for drinks and pizza with some friends. Also, Gloria and I are now about halfway through season two of Star Trek: Discovery.

I'll write up a bit more about specific things of note over the next few days, I hope, but in brief:

I fixed a bunch of my config files, especially for offlineimap and Vim. Doing this briefly added about sixty extraneous folders to my IMAP store, but fortunately, as an email professional, I knew how to delete them.

Ages ago, I lost a lot of my mp3s due to my own idiocy. Since then, I've really pulled back on my use of iTunes. In part, because I lost so much music. In part, because iTunes has continued to get worse. In part, because I just like Spotify a whole lot. Despite this, I do want to have my iTunes library on my phone, and I have a phone big enough that I can keep most of my library on it. Unfortunately, somehow iTunes got into its head that it couldn't sync. Further, although I could tell it to delete all local content, it wouldn't stop thinking that it basically knew about the content, which seemed to be getting in the way. Finally, I deleted the playlists from the phone… which then synchronized to my library. My favorite playlists are smart playlists about three deep (for example, "Radio RJBS" is every song on one of three playlists, each of which is all the songs on two other playlists plus extra criteria), and the playlists I deleted were deep in this hierarchy of smart playlists. I had to get the base64 encoded playlist rule definitions out of the XML of a backup of my iTunes library, decompile the binary rules, and recreate things. It stank, but in the end, I got my music onto my phone!

This kind of horrible job was a good example of something I felt I could do this week that I couldn't have done other weeks. Very often, I feel very pressed for time, and once a task becomes too time-consuming, I move on, even though it leaves me unhappy to do so. I feel like I don't have time to spend on things that are of value only to me and are slow grinds. This week, I did quite a bit of that. A bunch of it was tedious, but it got done, and that felt good.

Another kind of tedious but necessary task: My little goal-tracking API-masher-up, Ywar, has been suffering from a bit of bit rot, especially with its Withings integration, which I used to fetch my daily weight measurements. They recently turned off OAuth1 support, so I had to switch to OAuth2. I find OAuth pretty tedious in all its manifestations, so I'd put this off, but I'm trying to pay more attention to my Ywar emails, and so I need things that can be automated to stay automated. I got it working, it's stupid, it's ugly, but it works. I got a push notification to my phone within five minutes of the fix, and that was terrific.

I also put in some calm brain time on things that have felt like overwhelming tasks that I should probably do, but didn't really need to. For example, I want to rebuild my Linodes to be easier to maintain and redeploy in the future, but they work fine, so the pressure is low. I didn't do this, but I did write down a list of the things I'll have to do, so I can do it later without as much thinking. I ended the week with a couple new small todo lists in Remember the Milk for these projects.

I also got to the gym on seven of the nine days I was on leave, sometimes more than once. This was pretty good, although a couple times I dropped an exercise from my scheduled routine. (I am excited to try "explosive pushups" in theory, but in practice, I was tired as heck!) On a few days, I got there more than once a day so I could do some cardio in addition to lifting. This was great because, mostly, it meant I felt more motivated to sit in the steam room, which is always a treat. Also, I watched some Better Call Saul.

More soon.

drawing (but not generating) mazes

by rjbs, created 2019-05-05 22:26
last modified 2019-08-11 21:34
tagged with: @markup:md journal programming

I've started a sort of book club here in Philly. It works like this: it's for people who want to do computer programming. We pick books that have programming problems, especially language-agnostic books, and then we commit to showing up to the book club meeting with answers for the exercises. There, we can show off our great work, or ask for help because we got stuck, or compare notes on what languages made things easier or harder. We haven't had an actual meeting yet, so I have no idea how well this is going to go.

Our first book is Mazes for Programmers, which I started in on quite a while ago but didn't make much progress through. The book's examples are in Ruby. My goal is to do work in a language that I don't know well, and I know Ruby well enough that it's a bad choice. I haven't decided yet what I'll do most of the work in, but I didn't want to do it all in Perl 5, which I already know well, and reach for when solving most daily problems. On the other hand, I knew a lot of the early material in the book (and maybe lots of the material in general) would be on generating mazes, which would be fairly algorithmic work and produce a data structure. I didn't want to get all caught up in drawing the data structure as a human-friendly maze, since that seemed like it would be a fiddly problem and would distract me from the actual maze generation.

This weekend, I wrote a program in Perl 5 that would take a very primitive description of a maze on standard input and print out a map on standard output. It was, as I predicted, sort of fiddly and tedious, but when I finished, I felt pretty good about it. I put my maze-drawing program on GitHub, but I thought it might be fun to write up what I did.

First, I needed a simple protocol. My goal was to accept input that would be easy to produce given any data structure describing a maze, even if it would be a sort of stupid format to actually store a maze in. I went with a line-oriented format like this:

  1 2 3
  4 5 6
  7 8 9

Every line in this example is a row of three rooms in the maze. This input would actually be illegal, but it's a useful starting point. Every room in the maze is represented by an integer, which in turn represents a four-bit bitfield, where each bit tells us whether the room links in the indicated direction:

     1
    8•2
     4

So if a cell in the maze has passages leading south and east, it would be represented in the file by a 6. This means some kinds of input are nonsensical. What does this input mean?

  0 0 0
  0 2 0
  0 0 0

The center cell has a passage east, but the cell to its east has no passage west. Easy solution: this is illegal.
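To make the encoding concrete, here's a small sketch that decodes a cell number into direction names and checks one west-to-east pair of cells for agreement. The bit values (north 1, east 2, south 4, west 8) are the ones the full program later in this post uses. The helper names `links_of` and `east_west_consistent` are made up for this example and aren't part of that program.

```perl
use v5.20.0;
use warnings;

use constant { NORTH => 1, EAST => 2, SOUTH => 4, WEST => 8 };

# Names of the directions in which a cell has passages.
sub links_of {
  my ($cell) = @_;
  my @names;
  push @names, 'NORTH' if $cell & NORTH;
  push @names, 'EAST'  if $cell & EAST;
  push @names, 'SOUTH' if $cell & SOUTH;
  push @names, 'WEST'  if $cell & WEST;
  return @names;
}

# A west-to-east pair of neighbors is consistent only when both cells agree
# about whether there is a passage between them.
sub east_west_consistent {
  my ($west_cell, $east_cell) = @_;
  my $west_says = ($west_cell & EAST) ? 1 : 0;
  my $east_says = ($east_cell & WEST) ? 1 : 0;
  return $west_says == $east_says;
}

say join ' ', links_of(6);                             # EAST SOUTH
say east_west_consistent(2, 0) ? 'legal' : 'illegal';  # illegal
say east_west_consistent(2, 8) ? 'legal' : 'illegal';  # legal
```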

I made quite a few attempts to get from that to a working drawing algorithm. It was sort of painful, and I ended up feeling pretty stupid for a while. Eventually, though, I decided that the key was not to draw cells (rooms), but to draw lines. That meant that for a three by three grid of cells, I'd need to draw a four by four grid of lines. It's that old fencepost problem.

   1   2   3   4
 1 +---+---+---+
   | 0 | 0 | 0 |
 2 +---+---+---+
   | 0 | 2 | 8 |
 3 +---+---+---+
   | 0 | 0 | 0 |
 4 +---+---+---+

Here, there's only one linkage, so really the map could be drawn like this:

   1   2   3   4
 1 +---+---+---+
   | 0 | 0 | 0 |
 2 +---+---+---+
   | 0 | 2   8 |
 3 +---+---+---+
   | 0 | 0 | 0 |
 4 +---+---+---+

My reference map while testing was:

   1   2   3   4
 1 +---+---+---+
    10  12 | 0 |
 2 +---+   +---+
   | 0 | 5 | 0 |
 3 +---+   +---+
   | 0 | 3  12 |
 4 +---+---+   +

This wasn't too, too difficult to get, but it was pretty ugly. What I actually wanted was something drawn from nice box-drawing characters, which would look like this:

   1   2   3   4
 1 ╶───────┬───┐
    10  12 │ 0 │
 2 ┌───┐   ├───┤
   │ 0 │ 5 │ 0 │
 3 ├───┤   └───┤
   │ 0 │ 3  12 │
 4 └───┴───╴   ╵

Drawing this was going to be trickier. I couldn't just assume that every intersection was a +. I needed to decide how to pick the character at every intersection. I decided that for every intersection, like (2,2), I'd have to decide the direction of lines based on the links of the cells abutting the intersection. So, for (2,2) on the line axes, I'd have to look at the cells at (2,1) and (2,2) and (1,2) and (1,1). I called these the northeast, southeast, southwest, and northwest cells, relative to the intersection, respectively. Then I determined whether a line extended from the middle of an intersection in a given direction as follows:

  # Remember, if the bit is set, then a link (or passageway) in that
  # direction exists.
  my $n = (defined $ne && ! ($ne & WEST ))
       || (defined $nw && ! ($nw & EAST ));
  my $e = (defined $se && ! ($se & NORTH))
       || (defined $ne && ! ($ne & SOUTH));
  my $s = (defined $se && ! ($se & WEST ))
       || (defined $sw && ! ($sw & EAST ));
  my $w = (defined $sw && ! ($sw & NORTH))
       || (defined $nw && ! ($nw & SOUTH));

For example, how do I know that at (2,2) the intersection should only have limbs headed west and south? Well, it has cells to the northeast and northwest, but they link west and east respectively, so there can be no limb headed north. On the other hand, the cells to its southeast and southwest do not link to one another, so there is a limb headed south.

This can be a bit weird to think about, so think about it while looking at the map and code.

Now, for each intersection, we'd have a four-bit number. What did that mean? Well, it was easy to make a little hash table with some bitwise operators and the Unicode character set…

  my %WALL = (
    0     | 0     | 0     | 0     ,=> ' ',
    0     | 0     | 0     | WEST  ,=> '╴',
    0     | 0     | SOUTH | 0     ,=> '╷',
    0     | 0     | SOUTH | WEST  ,=> '┐',
    0     | EAST  | 0     | 0     ,=> '╶',
    0     | EAST  | 0     | WEST  ,=> '─',
    0     | EAST  | SOUTH | 0     ,=> '┌',
    0     | EAST  | SOUTH | WEST  ,=> '┬',
    NORTH | 0     | 0     | 0     ,=> '╵',
    NORTH | 0     | 0     | WEST  ,=> '┘',
    NORTH | 0     | SOUTH | 0     ,=> '│',
    NORTH | 0     | SOUTH | WEST  ,=> '┤',
    NORTH | EAST  | 0     | 0     ,=> '└',
    NORTH | EAST  | 0     | WEST  ,=> '┴',
    NORTH | EAST  | SOUTH | 0     ,=> '├',
    NORTH | EAST  | SOUTH | WEST  ,=> '┼',
  );

At first, I only drew the intersections, so my reference map looked like this:

  ╶─┬┐
  ┌┐├┤
  ├┤└┤
  └┴╴╵

When that worked -- which took quite a while -- I added code so that cells could have both horizontal and vertical filler. My reference map had a width of 3 and a height of 1, meaning that it was drawn with 1 row of vertical-only filler and 3 columns of horizontal-only drawing per cell. The weird map just above had a zero height and width. Here's the same map with a width of 6 and a height of zero:

  ╶─────────────┬──────┐
  ┌──────┐      ├──────┤
  ├──────┤      └──────┤
  └──────┴──────╴      ╵

I have no idea whether this program will end up being useful in my maze testing, but it was (sort of) fun to write. At this point, I'm mostly wondering whether it will be proven to be terrible later on.

As a side note, my decision to do the drawing in text was a major factor in the difficulty. Had I drawn the maps with a graphical canvas, it would have been nearly trivial. I'd just draw each cell, and then start adjacent cells with overlapping positions. If two walls drew over one another, it would be the union of drawn pixels that would display, which would be exactly what we wanted. Text can't work that way, because every visual division of the terminal can show only one glyph. In this way, a typewriter is more like a canvas than a text terminal. When it overstrikes two characters, the union of their inked surfaces really is seen. In a terminal, an overstruck character is fully replaced by the overstriking character.
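One way to picture the typewriter behavior in code: if each glyph is treated as the set of limbs it draws, overstriking two glyphs amounts to OR-ing their bits and looking the result back up. This is only a sketch. `%LIMBS`, `%GLYPH`, and `overstrike` are names made up for illustration and aren't part of my program, though the bit values are the same north, east, south, west constants it uses.

```perl
use v5.20.0;
use warnings;
use utf8;
binmode *STDOUT, ':encoding(UTF-8)';

use constant { NORTH => 1, EAST => 2, SOUTH => 4, WEST => 8 };

# Which limbs each box-drawing glyph has, as a four-bit number.
my %LIMBS = (
  ' ' => 0,
  '╴' => WEST,
  '╷' => SOUTH,
  '┐' => SOUTH | WEST,
  '╶' => EAST,
  '─' => EAST | WEST,
  '┌' => EAST | SOUTH,
  '┬' => EAST | SOUTH | WEST,
  '╵' => NORTH,
  '┘' => NORTH | WEST,
  '│' => NORTH | SOUTH,
  '┤' => NORTH | SOUTH | WEST,
  '└' => NORTH | EAST,
  '┴' => NORTH | EAST | WEST,
  '├' => NORTH | EAST | SOUTH,
  '┼' => NORTH | EAST | SOUTH | WEST,
);

# The same table the other way around: limb bits to glyph.
my %GLYPH = reverse %LIMBS;

# "Overstrike" two glyphs typewriter-style: the result shows every limb
# that either glyph had.
sub overstrike {
  my ($first, $second) = @_;
  return $GLYPH{ $LIMBS{$first} | $LIMBS{$second} };
}

say overstrike('─', '│');  # ┼
say overstrike('╶', '╷');  # ┌
```

A terminal, by contrast, behaves as if `overstrike` just returned its second argument.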

It's all on GitHub, but here's my program as it stands tonight:

  use v5.20.0;
  use warnings;

  use Getopt::Long::Descriptive;

  my ($opt, $usage) = describe_options(
    '%c %o',
    [ 'debug|D',    'show debugging output' ],
    [ 'width|w=i',  'width of cells', { default => 3 } ],
    [ 'height|h=i', 'height of cells', { default => 1 } ],
  );

  use utf8;
  binmode *STDOUT, ':encoding(UTF-8)';

  #  1   A maze file, in the first and stupidest form, is a sequence of lines.
  # 8•2  Every line is a sequence of numbers.
  #  4   Every number is a 4-bit number.  *On* sides are linked.
  # Here are some (-w 3 -h 1) depictions of mazes as described by the numbers
  # shown in their cells:
  # ┌───┬───┬───┐ ╶───────┬───┐
  # │ 0 │ 0 │ 0 │  10  12 │ 0 │
  # ├───┼───┼───┤ ┌───┐   ├───┤
  # │ 0 │ 0 │ 0 │ │ 0 │ 5 │ 0 │
  # ├───┼───┼───┤ ├───┤   └───┤
  # │ 0 │ 0 │ 0 │ │ 0 │ 3  12 │
  # └───┴───┴───┘ └───┴───╴   ╵

  use constant {
    NORTH => 1,
    EAST  => 2,
    SOUTH => 4,
    WEST  => 8,
  };

  my @lines = <>;
  chomp @lines;

  my $grid = [ map {; [ split /\s+/, $_ ] } @lines ];

  die "bogus input\n" if grep {; grep {; /[^0-9]/ } @$_ } @$grid;

  my $max_x = $grid->[0]->$#*;
  my $max_y = $grid->$#*;

  die "not all rows of uniform length\n" if grep {; $#$_ != $max_x } @$grid;

  for my $y (0 .. $max_y) {
    for my $x (0 .. $max_x) {
      my $cell  = $grid->[$y][$x];
      my $south = $y < $max_y ? $grid->[$y+1][$x] : undef;
      my $east  = $x < $max_x ? $grid->[$y][$x+1] : undef;

      # Test definedness, not truth: a neighbor of 0 (no links at all) must
      # still be checked against this cell's link toward it.
      die "inconsistent vertical linkage at ($x, $y) ($cell v $south)"
        if defined $south && ($cell & SOUTH  xor  $south & NORTH);

      die "inconsistent horizontal linkage at ($x, $y) ($cell v $east)"
        if defined $east  && ($cell & EAST   xor  $east  & WEST );
    }
  }

  my %WALL = (
    0     | 0     | 0     | 0     ,=> ' ',
    0     | 0     | 0     | WEST  ,=> '╴',
    0     | 0     | SOUTH | 0     ,=> '╷',
    0     | 0     | SOUTH | WEST  ,=> '┐',
    0     | EAST  | 0     | 0     ,=> '╶',
    0     | EAST  | 0     | WEST  ,=> '─',
    0     | EAST  | SOUTH | 0     ,=> '┌',
    0     | EAST  | SOUTH | WEST  ,=> '┬',
    NORTH | 0     | 0     | 0     ,=> '╵',
    NORTH | 0     | 0     | WEST  ,=> '┘',
    NORTH | 0     | SOUTH | 0     ,=> '│',
    NORTH | 0     | SOUTH | WEST  ,=> '┤',
    NORTH | EAST  | 0     | 0     ,=> '└',
    NORTH | EAST  | 0     | WEST  ,=> '┴',
    NORTH | EAST  | SOUTH | 0     ,=> '├',
    NORTH | EAST  | SOUTH | WEST  ,=> '┼',
  );

  sub wall {
    my ($n, $e, $s, $w) = @_;
    return $WALL{ ($n ? NORTH : 0)
                | ($e ? EAST : 0)
                | ($s ? SOUTH : 0)
                | ($w ? WEST : 0) } || '+';
  }

  sub get_at {
    my ($x, $y) = @_;
    return undef if $x < 0 or $y < 0;
    return undef if $x > $max_x or $y > $max_y;
    return $grid->[$y][$x];
  }

  my @output;

  for my $y (0 .. $max_y+1) {
    my $row = q{};

    my $filler;

    for my $x (0 .. $max_x+1) {
      my $ne = get_at($x    , $y - 1);
      my $se = get_at($x    , $y    );
      my $sw = get_at($x - 1, $y    );
      my $nw = get_at($x - 1, $y - 1);

      my $n = (defined $ne && ! ($ne & WEST ))
           || (defined $nw && ! ($nw & EAST ));
      my $e = (defined $se && ! ($se & NORTH))
           || (defined $ne && ! ($ne & SOUTH));
      my $s = (defined $se && ! ($se & WEST ))
           || (defined $sw && ! ($sw & EAST ));
      my $w = (defined $sw && ! ($sw & NORTH))
           || (defined $nw && ! ($nw & SOUTH));

      if ($opt->debug) {
        printf "(%u, %u) -> NE:%2s SE:%2s SW:%2s NW:%2s -> (%s %s %s %s) -> %s\n",
          $x, $y,
          (map {; $_ // '--'  } ($ne, $se, $sw, $nw)),
          (map {; $_ ? 1 : 0 } ($n,  $e,  $s,  $w)),
          wall($n, $e, $s, $w);
      }

      $row .= wall($n, $e, $s, $w);

      if ($x > $max_x) {
        # The rightmost wall is just the right joiner.
        $filler .=  wall($s, 0, $s, 0);
      } else {
        # Every wall but the last gets post-wall spacing.
        $row .= ($e ? wall(0,1,0,1) : ' ') x $opt->width;
        $filler .=  wall($s, 0, $s, 0);
        $filler .= ' ' x $opt->width;
      }
    }

    push @output, $row;
    if ($y <= $max_y) {
      push @output, ($filler) x $opt->height;
    }
  }

  say for @output;

PTS 2019: I went to the Perl Toolchain Summit!

by rjbs, created 2019-04-29 20:47
last modified 2019-04-30 20:54

Once again, it is spring in the northern hemisphere, and so time for the Perl Toolchain Summit, aka the QA Hackathon. I've made it to most of these, and have usually found them to be productive and invigorating. Everybody shows up with something to do, most of the people you need to help you are there, and everybody is interested in what everybody else is doing and will offer good feedback, advice, or just expressions of appreciation (or sympathy, as the case may require).

Bisham Abbey

The most common topics for me at past summits have been PAUSE, CPAN Testers, Dist::Zilla, and the CPAN::Meta specification. This year, I focused almost entirely on PAUSE, and I think it paid off. Last year, I did quite a bit of work on Dist::Zilla and wasn't happy with the end result. It might have been a good time to have a second go, but I decided I had a lot more support personnel on hand for PAUSE, and stuck with it. I'll try to give a general run-down of what I worked on in my posts, and some of it I'll probably try to expand into some reference material for future PAUSE contributors.

Here are my posts about the summit:

  1. Marlow
  2. Getopt::Long::Descriptive
  3. Module::Faker
  4. Automated PAUSE Testing
  5. PAUSE Inspection Tools

PTS is made possible by the generosity of its sponsors, who are making it possible for useful work to happen that would, quite simply, not happen otherwise. Thanks to them: Booking.com, cPanel, MaxMind, ZipRecruiter, Cogendo, Elastic, OpenCage Data, Perl Services, Zoopla, Archer Education, OpusVL, Oetiker+Partner, SureVoIP, YEF, and of course FastMail.

PTS 2019: Testing PAUSE by Hand (5/5)

by rjbs, created 2019-04-29 20:47
last modified 2019-04-30 07:07

Writing tests is great, but sometimes you're not sure what you're looking for. Maybe you don't know what will happen at all if you try something new. Maybe you know it fails, but not how or where. This is the kind of thing where normally I'd fire up the perl debugger. The debugger is too often ignored by programmers who could solve problems a lot faster by using it. I love adding print statements as much as the next person -- maybe more -- but the debugger is a specialized tool that is worth bringing out now and then.

That said, I am not here to advocate that you attach the debugger to the PAUSE indexer. It's just that tools for investigating are sometimes more appropriate than tools for asserting. So, I built one.

Actually, I built it in 2015, but I've only used it sparingly until this week. With the improvements I made to my module faking code, it became much, much easier to use this tool to investigate hypothetical scenarios.

It used to work like this:

  $ ./one-off-utils/build-fresh-cpan \
    Some-Tarball-1.23.tar.gz \

You'd provide it a bunch of pre-built tarballs and it would make a new TestPAUSE, index the tarballs, and drop you into a shell. Ugh! I mean, it was useful, but we're back to a world where we have to write a program to generate fake data. Instead, I updated the program to take a list of instructions of what to do. For example:

  $ ./one-off-utils/build-fresh-cpan \
    fake:RJBS:Some-Cool-Dist-1.234.tar.gz \
    fake:ANDK:CPAN-Client-3.000.tar.gz \
    file:RJBS:~/tmp/perl-5.32.0.tar.gz \
    index \

This does what you might expect: it fakes up two tarballs and uploads those. Then it runs the indexer. Then it makes another fake with the given snippet of Perl. Finally, it drops you into a shell. The shell might look like this:

/private/var/folders/tp/xbk5yqfj7vv86jjcgk_cp4wh0000gq/T/wZwYj2Cgp0$ ls -l
total 4
drwxr-xr-x 5 rjbs staff  160 Apr 29 12:25 Maildir/
drwxr-xr-x 4 rjbs staff  128 Apr 29 12:25 cpan/
drwxr-xr-x 4 rjbs staff  128 Apr 29 12:25 db/
drwxr-xr-x 5 rjbs staff  160 Apr 29 12:25 git/
-rw-r--r-- 1 rjbs staff 3086 Apr 29 12:25 pause.log
drwxr-xr-x 3 rjbs staff   96 Apr 29 12:25 run/

Maildir contains all the mail that PAUSE would've sent out.

cpan is the CPAN that would be published, so you can find the tarballs and the index files there.

db has SQLite databases storing all the PAUSE data.

git is a git index storing every state that the index files have ever had.

run is not interesting, and stores the lockfile for the indexer.

Then there's pause.log. As you might guess, it's the PAUSE log file. We'll come back to that below.

Anyway, you can probably imagine that this is a pretty useful collection of data for investigation! You can very quickly answer questions like "if the system is in state X and events A, B, C happen, what's the new state and why?" In fact, we had quite a few questions that we'd put off answering in the past because they were just too tedious to sort out. Now they've become a matter of writing a few lines at the command prompt.

Of course, the command prompt can be a tedious place to write what is, effectively, a program, so build-fresh-cpan can also get instructions from a file. For example, I can write instructions to the file test.pause and then run ./one-off-utils/build-fresh-cpan prog:test.pause. Here's what a program might look like:

  # First, we import some boring stuff.


  # Then some more boring stuff, though slightly less boring.
      name      => 'Some-Cool-Dist',
      version   => 1.302,
      packages  => [
        qw( Some::Cool::Dist Some::Cool::Package ),
        'Some::Cool::Util' => { in_file => "lib/Some/Cool/Package.pm" },


  # And now RJBS tries to steal ownership of one of ANDK's packages!
      name      => 'Some-Cool-Dist',
      version   => '4.000',
      packages  => [
        qw( Some::Cool::Dist Some::Cool::Package ),
        'Some::Cool::Util' => { in_file => "lib/Some/Cool/Package.pm" },


  # Note that we didn't index CPAN::Client from RJBS.
  cmd:zgrep Client cpan/modules/02packages.details.txt.gz

  # Then ANDK takes pity on RJBS and grants him permission.  RJBS just re-indexes
  # the old dist.

  # We pick the new file for individual indexing...

  # ...but that doesn't rebuild the indexes, it only updates the database.
  cmd:zgrep Client cpan/modules/02packages.details.txt.gz

  # Let's poke around.

  # Everything looks fine in the logs, but it doesn't look like there has
  # been an update to the files on disk.  Maybe only full reindexes update
  # the files!

  # ...so we do a full reindex.

  # ...and now it's there.
  cmd:zgrep Client cpan/modules/02packages.details.txt.gz

Turning a program like this into an automated test is trivial. You copy in and tweak a few lines, then add the assertions you want to make based on what you've learned. I expect to do a lot of investigation in this format.

I also might break this program as I work on it, but I don't expect it needs much long-term stability, beyond the basics. Its documentation is in the command line usage:

  build-fresh-cpan [-IPpv] [long options...] TYPE:WHAT...
          --dir STR              target directory; by default uses a tempdir
          -v --verbose           print logs to STDERR as it goes
          -p STR --packages STR  02packages file to prefill packages table
          -P STR --perms STR     06perms file to prefill mods/primeur/perms
          --default-user STR     default PAUSEID for uploads; default: LOCAL
          -I --stdin             read instructions from STDIN to run before ARGV
          --each                 index at start and after each upload
          --[no-]shell           add an implicit "shell" as last instruction;
                                 on by default

  Other than the --switches, arguments are instructions in one of the forms
  below.  For those that list PAUSEID, it may be omitted, and the default
  user is used instead.

  Valid instructions are:

    form                  | meaning
    index                 | index now
    index:FILE            | index just one file
    perm:PAUSEID:PKG:PERM | set perm (f or c or 0) for PAUSEID on PKG
    file:PAUSEID:FILE     | upload the named file as the named PAUSE user
    fake:PAUSEID:FILE     | generate a dist based on the given filename
    json:PAUSEID:JSON     | interpret the given JSON string as a faker struct
    perl:PAUSEID:PERL     | interpret the given Perl string as a faker struct
    adir:DIRECTORY        | author dir: dir with A/AB/ABC/Dist-1.0.tar.gz files
    fdir:DIRECTORY        | flat dir: upload all the files as default user
    prog:file             | read a file containing a list of instructions
    progcmd:"program"     | run program, which should print out instructions
    cmd:"program"         | run a command in the working directory
    shell                 | run a shell in the working directory

  prog and progcmd output can split instructions across multiple lines.  Lines
  that begin with whitespace will be appended to the line preceding them.

Oh, and see that big table? That's why I needed to fix Getopt::Long::Descriptive!
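The continuation rule from the usage text is easy to picture in code. Here's a minimal sketch of it, not the code in build-fresh-cpan itself: the sub names are made up for illustration, and both the handling of "#" comment lines and the single space used to join continuation lines are assumptions.

```perl
use strict;
use warnings;

# Join continuation lines (those that begin with whitespace) onto the
# instruction before them.  Skipping blank lines and "#" comments is an
# assumption, not something the usage text promises.
sub join_instruction_lines {
  my (@lines) = @_;
  my @instructions;

  for my $line (@lines) {
    chomp $line;
    next if $line =~ /\A\s*\z/;   # blank line
    next if $line =~ /\A\s*#/;    # comment line

    if ($line =~ /\A\s+(.*)\z/s) {
      die "continuation line with nothing to continue\n" unless @instructions;
      $instructions[-1] .= " $1";
      next;
    }

    push @instructions, $line;
  }

  return @instructions;
}

# Split "TYPE:WHAT" into its two halves.  Bare instructions like "index"
# and "shell" have no WHAT, so it comes back undefined.
sub parse_instruction {
  my ($instruction) = @_;
  my ($type, $what) = split /:/, $instruction, 2;
  return { type => $type, what => $what };
}

my @program = (
  "# upload two dists, then index\n",
  "fake:RJBS:Some-Cool-Dist-1.234.tar.gz\n",
  "perl:ANDK:{ name => 'CPAN-Client',\n",
  "    version => '3.000' }\n",
  "index\n",
);

for my $instruction (join_instruction_lines(@program)) {
  my $parsed = parse_instruction($instruction);
  printf "%-5s %s\n", $parsed->{type}, $parsed->{what} // '(no argument)';
}
```

Splitting on only the first colon keeps the PAUSEID and the rest of the argument together, so each instruction type can decide for itself how to pick the remainder apart.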

...but what about logging?

The PAUSE indexer log file has often stymied me. It has a lot of useful information, and some not useful information, but it can be hard to tell just what's going on. For example, here's the log output from indexing just one dist:

  >>>> Just uploaded fake from RJBS/Fake-Dist-1.23.tar.gz
  PAUSE::mldistwatch object created
  Running manifind
  Collecting distmtimes from DB
  Registering new users
  Info: new_active_users[RJBS]
  Starting BIGLOOP over 1 files
  . R/RJ/RJBS/Fake-Dist-1.23.tar.gz ..
  Assigned mtime '1556373013' to dist 'R/RJ/RJBS/Fake-Dist-1.23.tar.gz'
  Examining R/RJ/RJBS/Fake-Dist-1.23.tar.gz ...
  Going to untar. Running '/usr/bin/tar' 'xzf' '/var/folders/tp/xbk5yqfj7vv86jjcgk_cp4wh0000gq/T/jzPcAZf3Xg/cpan/authors/id/R/RJ/RJBS/Fake-Dist-1.23.tar.gz'
  Untarred '/var/folders/tp/xbk5yqfj7vv86jjcgk_cp4wh0000gq/T/jzPcAZf3Xg/cpan/authors/id/R/RJ/RJBS/Fake-Dist-1.23.tar.gz'
  Found 6 files in dist R/RJ/RJBS/Fake-Dist-1.23.tar.gz, first Fake-Dist-1.23/MANIFEST
  No readme in R/RJ/RJBS/Fake-Dist-1.23.tar.gz
  Finished with pmfile[Fake-Dist-1.23/lib/Fake/Dist.pm]
  Result of normalize_version: sdv[1.23]
  Result of simile(): file[Dist] package[Fake::Dist] ret[1]
  No keyword 'no_index' or 'private' in META_CONTENT
  Result of filter_ppps: res[Fake::Dist]
  Will check keys_ppp[Fake::Dist]
  (uploader) Inserted into perms package[Fake::Dist]userid[RJBS]ret[1]err[]
  02maybe: Fake::Dist                      1.23 Fake-Dist-1.23/lib/Fake/Dist.pm (1556373013) R/RJ/RJBS/Fake-Dist-1.23.tar.gz
  Inserting package: [INSERT INTO packages (package, version, dist, file, filemtime, pause_reg, distname) VALUES (?,?,?,?,?,?,?) ] Fake::Dist,1.23,R/RJ/RJBS/Fake-Dist-1.23.tar.gz,Fake-Dist-1.23/lib/Fake/Dist.pm,1556373013,1556373013
  Inserted into perms package[Fake::Dist]userid[RJBS]ret[]err[]
  Inserted into primeur package[Fake::Dist]userid[RJBS]ret[1]err[]
  Sent "indexer report" mail about RJBS/Fake-Dist-1.23.tar.gz
  Entering rewrite02
  Number of indexed packages: 0
  Entering rewrite01
  No 01modules exist; won't try to read it
  cared about 0 symlinks
  Entering rewrite03
  No 03modlists exist; won't try to read it
  Entering rewrite06
  Directory '/home/ftp/run/mirroryaml' not found
  Finished rewrite03 and everything at Sat Apr 27 14:50:14 2019

To make the log file easier to read, and logging easier to manage, I converted PAUSE to use Log::Dispatchouli::Global. Then I went through every log line and rewrote many of them to have a more self-similar format, and used Log::Dispatchouli's data printing facilities. Here's the same operation in the new log format:

  2019-03-29 13:27:28.0774 [23234] FRESH: just uploaded fake: cpan/authors/id/R/RJ/RJBS/Fake-Dist-1.23.tar.gz
  2019-03-29 13:27:28.0861 [23234] PAUSE::mldistwatch object created
  2019-03-29 13:27:28.0901 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: assigned mtime 1556558848
  2019-03-29 13:27:28.0902 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: beginning examination
  2019-03-29 13:27:28.0913 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: going to untar with: {{["/usr/bin/tar", "xzf", "/var/folders/tp/xbk5yqfj7vv86jjcgk_cp4wh0000gq/T/uRlTDQD954/cpan/authors/id/R/RJ/RJBS/Fake-Dist-1.23.tar.gz"]}}
  2019-03-29 13:27:28.0920 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: untarred /var/folders/tp/xbk5yqfj7vv86jjcgk_cp4wh0000gq/T/uRlTDQD954/cpan/authors/id/R/RJ/RJBS/Fake-Dist-1.23.tar.gz
  2019-03-29 13:27:28.0921 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: found 6 files in dist, first is [Fake-Dist-1.23/MANIFEST]
  2019-03-29 13:27:28.0922 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: no README found
  2019-03-29 13:27:28.0924 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: selected pmfiles to index: {{["Fake-Dist-1.23/lib/Fake/Dist.pm"]}}
  2019-03-29 13:27:29.0009 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: Fake-Dist-1.23/lib/Fake/Dist.pm: result of normalize_version: 1.23
  2019-03-29 13:27:29.0009 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: Fake-Dist-1.23/lib/Fake/Dist.pm: result of basename_matches_package: {{{"file": "Dist", "package": "Fake::Dist", "ret": 1}}}
  2019-03-29 13:27:29.0010 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: Fake-Dist-1.23/lib/Fake/Dist.pm: will examine packages: {{["Fake::Dist"]}}
  2019-03-29 13:27:29.0012 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: Fake-Dist-1.23/lib/Fake/Dist.pm: inserted into perms: {{{"err": "", "package": "Fake::Dist", "reason": "(uploader)", "ret": 1, "userid": "RJBS"}}}
  2019-03-29 13:27:29.0012 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: Fake-Dist-1.23/lib/Fake/Dist.pm: inserting package: {{{"dist": "R/RJ/RJBS/Fake-Dist-1.23.tar.gz", "disttime": 1556558848, "file": "Fake-Dist-1.23/lib/Fake/Dist.pm", "filetime": 1556558848, "package": "Fake::Dist", "version": "1.23"}}}
  2019-03-29 13:27:29.0013 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: Fake-Dist-1.23/lib/Fake/Dist.pm: inserted into perms: {{{"err": "", "package": "Fake::Dist", "reason": null, "ret": "", "userid": "RJBS"}}}
  2019-03-29 13:27:29.0013 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: Fake-Dist-1.23/lib/Fake/Dist.pm: inserted into primeur: {{{"err": "", "package": "Fake::Dist", "ret": 1, "userid": "RJBS"}}}
  2019-03-29 13:27:29.0015 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: ensuring canonicalized case of Fake::Dist
  2019-03-29 13:27:29.0035 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: sent indexer report email
  2019-03-29 13:27:29.0040 [23234] rewriting 02packages
  2019-03-29 13:27:29.0069 [23234] no 01modules exist; won't try to read it
  2019-03-29 13:27:29.0069 [23234] symlinks updated: 0
  2019-03-29 13:27:29.0070 [23234] no 03modlists exist; won't try to read it
  2019-03-29 13:27:29.0171 [23234] FTP_RUN directory [/home/ftp/run/mirroryaml] does not exist
  2019-03-29 13:27:29.0213 [23234] finished rewriting indexes
  2019-03-29 13:27:29.0268 [23234] running a shell (/opt/local/bin/zsh)

It's not perfect, but I think it's much easier to read.
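If you haven't seen this style of logging before, the idea is roughly: every line gets a timestamp, a pid, and a prefix naming the thing being worked on, and any data structure is flattened to JSON inside double braces. The sketch below imitates that shape using only core modules. It is a stand-in for illustration, not Log::Dispatchouli's code, and format_log_line is a made-up name. It also skips the fractional seconds seen in the real output.

```perl
use strict;
use warnings;
use JSON::PP ();
use POSIX qw(strftime);

# Canonical (sorted-key) JSON, so the same data always logs the same way.
my $JSON = JSON::PP->new->canonical->allow_nonref;

# Build one log line: timestamp, pid, optional prefix, then the message,
# with any reference arguments rendered as {{JSON}}.
sub format_log_line {
  my ($prefix, $format, @args) = @_;

  my @flat = map { ref $_ ? '{{' . $JSON->encode($_) . '}}' : $_ } @args;
  my $message = sprintf $format, @flat;
  $message = "$prefix: $message" if defined $prefix && length $prefix;

  my $stamp = strftime('%Y-%m-%d %H:%M:%S', localtime);
  return sprintf '%s [%d] %s', $stamp, $$, $message;
}

print format_log_line(
  'R/RJ/RJBS/Fake-Dist-1.23.tar.gz',
  'selected pmfiles to index: %s',
  [ 'Fake-Dist-1.23/lib/Fake/Dist.pm' ],
), "\n";
```

Putting the dist (and, where relevant, the file) in the prefix is what makes the new log greppable: every line about one upload starts the same way.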

My hope is that the improved testing and investigation facilities will help us make serious strides toward overhauling the indexer before PTS 2020. We'll see, but right now, I feel pretty good about it!

PTS 2019: The PAUSE Test Suite (4/5)

by rjbs, created 2019-04-29 20:45

PAUSE and case insensitivity

All that Module::Faker and Getopt work was really in furtherance of PAUSE work. When I arrived in Marlow, I had been given just one request by Neil: sort out PAUSE!320, a change in behavior we've been slowly getting right for years now.

Once upon a time, the PAUSE index was case sensitive. Andreas König, the author and maintainer of PAUSE, says he's not sure whether or not this was actually intentional. At any rate, it meant that if one person uploaded the package "Parse::PERL" and another uploaded "Parse::Perl", the indexer would let them each have permissions on the name they uploaded. This is a problem because not all filesystems are case-sensitive. So a user might rely on Parse::PERL, install Parse::Perl, and then get bizarre errors when trying to use Parse::PERL in their code. The runtime, after all, does not verify that when you load the module Foo::Bar, it actually defines the package Foo::Bar. D'oh!

We've made various changes to address this problem, with the basic rule being that permissions are case-preserving and not case-sensitive. If you have permissions on Foo::Bar, then you also own foo::bar and FOO::BAR and foO::bAr and whatever else you like. The index and permissions data will show "Foo::Bar", because that's what you uploaded. Maintainers of the Foo::Bar namespace are free to rename it to fOO::BAr, if they want, by uploading a new distribution, but nobody else can sneak their way into the namespace.
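To make "case-preserving and not case-sensitive" concrete, here's a minimal sketch of the rule. It is not the PAUSE indexer's code: the in-memory table, its shape, and the name may_index are all assumptions made up for the example.

```perl
use strict;
use warnings;

# Decide whether $user may index $package, given a permissions table keyed
# by the lowercased package name.  Unowned names are claimed by the uploader.
sub may_index {
  my ($perms, $user, $package) = @_;
  my $entry = $perms->{ lc $package };

  unless ($entry) {
    $perms->{ lc $package } = { name => $package, owners => { $user => 1 } };
    return 1;
  }

  # Somebody else owns this name, however it happens to be capitalized.
  return 0 unless $entry->{owners}{$user};

  # An existing owner may respell the name by uploading it that way.
  $entry->{name} = $package;
  return 1;
}

my %perms;
print may_index(\%perms, RJBS => 'Foo::Bar') ? "ok\n" : "denied\n";
print may_index(\%perms, ANDK => 'foo::bar') ? "ok\n" : "denied\n";
print "indexed as $perms{'foo::bar'}{name}\n";
```

The lookup key is lowercased, but the stored name keeps whatever capitalization the owner last uploaded, which is the spelling the index would show.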

This was basically implemented at last year's PTS, but wasn't deployed because it was broken in subtle ways. The fix was easy, once I worked out what the problem was, which required a bunch of testing. More on that shortly!

The upshot was that I did get the behavior changed successfully. Neil, who had been working for years to eliminate all the case conflicts in the permissions list, had gotten things down to few enough to count on one hand. With the bug fixed and the last few conflicts resolved, we believe we have ended the age of conflicting-case index entries. This was a nice milestone to reach. There was high-fiving, and Neil may have shed a single tear, though I couldn't swear to it in court.

As an aside: you might be wondering, "Why didn't you just put case-insensitive unique indexes in the database?" PAUSE's indexer is sort of a strange beast, which simultaneously updates the database, analyzes the contents of uploads, and decides what to do as it goes along. Triggering a database error would jump past a lot of behavior, and we could've done it, but it felt saner to try to detect the problem. I have plans for the future to tease apart the indexer's several behaviors.

testing PAUSE

In August 2011, David Golden and I got together in Brooklyn and began writing a test suite for the indexer. To be fair, a couple tests existed already, but they tested very, very few things. By just a bit before four in the afternoon on that day in 2011, David and I had a passing test that looked like this:

  my $result = PAUSE::TestPAUSE->new({
    author_root => 'corpus/authors',
  })->test_reindex;

  ok(
    -e $result->tmpdir->file(qw(cpan modules 02packages.details.txt.gz)),
    "our indexer indexed",
  );

  my $pkg_rows = $result->connect_mod_db->selectall_arrayref(
    'SELECT * FROM packages ORDER BY package, version',
    { Slice => {} },
  );

  my @want = (
    { package => 'Bug::Gold',      version => '9.001' },
    { package => 'Hall::MtKing',   version => '0.01'  },
    { package => 'XForm::Rollout', version => '1.00'  },
    { package => 'Y',              version => 2       },
  );

  cmp_deeply(
    $pkg_rows,
    [ map {; superhashof($_) } @want ],
    "we indexed exactly the dists we expected to",
  );

Further refinements would come, but many of the tests still look quite a lot like this. Note how it begins: we name an author_root. This is a directory full of pretend CPAN uploads by pretend CPAN authors. Every file in the directory is copied into the test PAUSE, simulating an upload, and then the indexer is run. To understand these tests, you need to know what's in that directory. It's not just a matter of running ls, either. The directory contains tarballs, and those tarballs are more or less opaque unless you unpack them. Ugh. In this test, the contents were all entirely uninteresting, but in later tests, you'd end up wondering what was being tested. Something would import corpus/mld/009 and then assert that the index should look one way or another, rarely noting that one dist in the directory had strange properties known only at the time of test-writing.

To make matters worse, the tests were split into two files. In one, each tested behavior was tested with a distinct TestPAUSE, so no two tests would interact. In the other file, though, every behavior was tested on top of the state left behind by the previous test, resulting in a very cluttered test file in which the intent of any given test might be pretty hard to determine, especially when you're reading it five years later.

Splitting those tests up so that each would use a distinct TestPAUSE wasn't going to be difficult as a matter of programming, but it meant each one needed to be teased apart from the one before it, meaning its intent needed to be sussed out, which meant unpacking tarballs and reading their contents. I shook my fist and cried, "Never again!"

By 2019, tests had changed so that rather than this:

  my $result = PAUSE::TestPAUSE->new({
    author_root => 'corpus/authors',
  })->test_reindex;

You'd be likely to write:

  my $pause = PAUSE::TestPAUSE->new;


  my $result = $pause->test_reindex;

(This means you'd be able to later add more files, index again, and see what changed. Useful!)

To make the tests clearer, I added a new method, upload_author_fake:

  $pause->upload_author_fake(JBLOE => {
    name     => 'Example-Dist',
    version  => '1.0',
    packages => [ qw(Example::Dist Example::Dist::Package) ],
    more_metadata => { x_Example => 'This is an example, too.' },
  });

Hey, it's using from_struct like we saw in my Module::Faker report from this PTS! Now you can always know exactly what is interesting about a fake. Sometimes, though, you don't need an interesting fake, you just need a totally boring dist to be uploaded. In those cases, now you can just write:

  $pause->upload_author_fake(JBLOE => 'Example-Dist-1.0.tar.gz');

...and Module::Faker will know what you mean.
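What "knowing what you mean" amounts to is pulling a dist name and a version out of the filename. Here's a minimal sketch of that step. The pattern and the name dist_from_filename are assumptions for illustration, not Module::Faker's own parsing rule.

```perl
use strict;
use warnings;

# Split a dist filename into its name and version.  The pattern below is a
# guess at a reasonable rule, not a copy of what Module::Faker does.
sub dist_from_filename {
  my ($filename) = @_;

  my ($name, $version) =
    $filename =~ /\A(.+)-(v?[0-9][0-9._]*)\.(?:tar\.gz|tgz|zip)\z/
      or die "can't parse dist filename: $filename\n";

  return { name => $name, version => $version };
}

my $dist = dist_from_filename('Example-Dist-1.0.tar.gz');
print "$dist->{name} $dist->{version}\n";
```

The resulting name and version are the same two fields you'd otherwise pass in the struct form by hand.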

With this tool available, and the new Module::Faker features to help produce weird distributions, I was able to rewrite the tests to be entirely isolated. I also deleted quite a few of the prebuilt tarballs from the corpus directory, but not all of them yet. One or two are a bit tedious to produce with Faker, and one or two others I just didn't get to.

I look forward to replacing those, in part because I know it will mean cool improvements to Module::Faker, and in part because every time I make the test suite saner, I make it easier to get more people confident that they can write more PAUSE code.

PTS 2019: Module::Faker (3/5)

by rjbs, created 2019-04-29 20:40
last modified 2019-04-29 20:40

the basics

In 2008, I wrote Module::Faker. In fact, I wrote it at the first QA Hackathon! I'd had the idea to write it because at Pobox, we were writing a PAUSE-like module indexer for changing our internal deployment practices. It became clear that we could use it for testing actual PAUSE as well as other code, and I got to work. Since then, I've used it (and the related CPAN::Faker) quite a lot for testing, especially of PAUSE. At first, it was just a quick way to build a tarball that contained something that looked more or less like a CPAN distribution, with the files in the right places, the package statements, the version declarations, and so on.

As time went on, I needed weirder and more broken distributions, or distributions with more subtle changes than just a different version. In these cases, I'd build a fake, untar it, edit the contents, and tar it up again. Further, the usual way to fake up a dist was to write a META.yml-style file, stick it in a directory with other ones, and then build them all. Why? Honestly, I can't remember. It's sort of a bizarre choice, and I think probably it was trying to be too clever. By default, you pass the contents of the META file to Module::Faker::Dist->new, which means its internals go to sort of annoying lengths to do the right thing. Why not a simpler constructor and a translation layer in between?

Well, that's not what I implemented, but I'll probably make that happen eventually. Instead, I added yet another constructor, this one much more suitable for building one-off fakes. Using it looks something like this:

  my $dist = Module::Faker::Dist->from_struct({
    cpan_author => 'DENOMOLOS',
    name        => 'Totally-Fake-Software',
    version     => '2.002',
  });

And when you write that to disk, you get this:

  ~/code/Module-Faker$ tar zxvf Totally-Fake-Software-2.002.tar.gz
  x Totally-Fake-Software-2.002/lib/Totally/Fake/Software.pm
  x Totally-Fake-Software-2.002/Makefile.PL
  x Totally-Fake-Software-2.002/t/00-nop.t
  x Totally-Fake-Software-2.002/META.json
  x Totally-Fake-Software-2.002/META.yml
  x Totally-Fake-Software-2.002/MANIFEST

  ~/code/Module-Faker$ cat Totally-Fake-Software-2.002/lib/Totally/Fake/Software.pm

  =head1 NAME

  Totally::Fake::Software - a cool package

  =cut

  package Totally::Fake::Software;
  our $VERSION = '2.002';


Quite a few things are inferred, but the most important inference is that in a dist called Totally-Fake-Software with a version of 2.002, you'll want a single package named Totally::Fake::Software with a version of 2.002. It's pretty reasonable, as long as it can be overridden, and now it can:

  my $dist = Module::Faker::Dist->from_struct({
    cpan_author => 'DENOMOLOS',
    name        => 'Totally-Fake-Software',
    version     => '2.002',
    packages    => [
      'Totally::Fake::Firmware' => {
        version => '2.001',
        in_file => "lib/Totally/Fake/Hardware.pm"
      },
    ],
  });

This does what you might imagine: it makes two modules (read: .pm files), one of which contains two package definitions, and those two packages have differing version numbers. That is:

  ~/code/Module-Faker$ cat Totally-Fake-Software-2.002/lib/Totally/Fake/Hardware.pm

  =head1 NAME

  Totally::Fake::Hardware - a cool package

  =cut

  package Totally::Fake::Hardware;
  our $VERSION = '2.002';

  package Totally::Fake::Firmware;
  our $VERSION = '2.001';

I think I'm still going to tinker with how faker objects are generated, but by now you should get the point.
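The default inference described above (one package, named after the dist, at the dist's version) boils down to a couple of substitutions. This is a sketch of the idea rather than Module::Faker's internals, and default_package_for is a hypothetical name.

```perl
use strict;
use warnings;

# Given a dist name and version, work out the single package a boring fake
# would contain and the file it would live in.
sub default_package_for {
  my ($dist_name, $version) = @_;

  (my $package = $dist_name) =~ s/-/::/g;
  (my $path    = $dist_name) =~ s{-}{/}g;

  return {
    package => $package,
    version => $version,
    in_file => "lib/$path.pm",
  };
}

my $pkg = default_package_for('Totally-Fake-Software', '2.002');
print "$pkg->{package} $pkg->{version} in $pkg->{in_file}\n";
```

Anything listed explicitly in packages, like the in_file and version overrides shown earlier, would take the place of these defaults.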

weird stuff

I use Module::Faker for testing PAUSE, and one thing that PAUSE has to do is deal with slightly (or not so slightly) malformed input. With no specification for the shape of a distribution (other than the strong suggestion that you follow what's in the CPAN::Meta spec), malformedness is in the eye of the beholder. I can tell you, though, that I have beheld a lot of malformed stuff in my time.

One example was pretty innocuous: in META.json, there can be a "provides" entry saying what packages are provided by this distribution, and in what file they're found. It might look like this:

  "provides": {
    "Some::Package::Here": { "file": "lib/Some/Package/Here.pm" }
  }

From time to time, people have inserted an entry with an undefined file value. This was used as a means to say, "I am claiming ownership of this name, which I do not provide." A few years ago, the toolchain gang came to a consensus: file needs to be provided, naming some actual file. Unfortunately, CPAN::Meta features footgun protection, shielding you from accidentally (read: on purpose) making this kind of bogus entry. I needed a way to produce it, and I really did not want to resort to editing and repacking tarballs by hand.

To do this, I added a new option, meta_munger:

  $dist = Module::Faker::Dist->from_struct({
    name    => 'Some-Package-Here',
    version => '1.0',
    cpan_author => 'TSWIFT',
    meta_munger => sub {
      $_[0]{provides} = {
        "Some::Package::Here" => { file => undef },
      };

      return $_[0];
    },
  });

Doing this involved disassembling and duplicating a bit of CPAN::Meta, but it was worth it. All manner of stupid garbage can now be stuffed into fake dists' metadata. In fact, the munger can decide it's going to act like 2009-era RJBS by putting JSON into the META.yml file, it can use YAML::Syck, it can return syntactically invalid JSON, or whatever you like. Do your worst!

The munger runs quite late in the metadata generation process. Especially noteworthy, it runs after the metadata structure has been generated for a specific version. You might just want to provide a little bit of extra metadata, like an x_favorite_treat to be handled like any other bit of metadata, with CPAN::Meta::Merge. Easy enough:

  $dist = Module::Faker::Dist->from_struct({
    name    => 'DateTime-Breakfast',
    version => '1.0',
    cpan_author => 'BURGERS',
    more_metadata => { x_favorite_treat => 'cruffins' },
  });

It's certainly less work to write than a munging subroutine.

I added support for different "style" packages. What does that mean? Well, given this code:

  my $dist = Module::Faker::Dist->from_struct({
    cpan_author => 'MRXII',
    name        => 'Cube-Solver',
    version     => 'v3.3.3',
    packages    => [
      'Cube-Solver-Rubik' => { in_file => 'Solver.pm', style => 'legacy' },
      'Cube-Solver-GAN'   => { in_file => 'Solver.pm', style => 'statement' },
      'Cube-Solver-LHR'   => { in_file => 'Solver.pm', style => 'block' },
    ],
  });

You get this result:

  ~/code/Module-Faker$ cat Cube-Solver-v3.3.3/Solver.pm

  =head1 NAME

  Cube-Solver-Rubik - a cool package

  =cut

  package Cube-Solver-Rubik;
  our $VERSION = 'v3.3.3';

  package Cube-Solver-GAN v3.3.3;

  package Cube-Solver-LHR v3.3.3 {

    # Your code here

  }


I actually haven't documented this feature, yet, because I'm not happy enough with it. For example, you can't get an assignment to $VERSION inside a package block. I'd like to make this all a bit more flexible. Sometimes I half think that it would be cool to merge Module::Faker's behavior into Dist::Zilla's dist minting features, but I think this would be more trouble than it's worth. Probably.

I didn't document that, but I did document quite a lot of Module::Faker::Dist, which was previously entirely undocumented. Great!

PTS 2019: Getopt::Long::Descriptive (2/5)

by rjbs, created 2019-04-29 20:37
last modified 2019-04-30 15:43

One non-PAUSE thing I worked on was Getopt::Long::Descriptive. I added a small new feature. It supports something like this:

  describe_options(
    "%c %o ARG...",
    [ "foo",  "should we do foo?" ],
    [ "bar",  "should we do bar?" ],
    [ <<~'EOT' ],

      How should you know whether to use --foo and --bar?  Well, the choice
      is simple.  If you want to foo, use --foo, and if you want to use bar,
      don't, because --bar hasn't been implemented.
      EOT
  );

This is okay, but when you run it, you get:

  examine-program [long options...] ARG...
    --foo  should we do foo?
    --bar  should we do bar?
    How should you know whether to use --foo and --bar?  Well, the
  is simple.  If you want to foo, use --foo, and if you
    want to use bar,
  don't, because --bar hasn't been implemented.

Waaaa? Well, GLD is helpfully trying to word wrap and indent text for you, and it does a terrible job in the case of large hunks of text that you want displayed verbatim. I added a way to tell it to trust you, the author, on the indenting.

  describe_options(
    "%c %o ARG...",
    [ "foo",  "should we do foo?" ],
    [ "bar",  "should we do bar?" ],
    [ \<<~'EOT' ],

      How should you know whether to use --foo and --bar?  Well, the choice
      is simple.  If you want to foo, use --foo, and if you want to use bar,
      don't, because --bar hasn't been implemented.
      EOT
  );

Did you miss it? It's the \ turning the heredoc string into a reference.

Of course, the code to implement this was nearly trivial. I spent more time on figuring out what was going on where than I did fixing it. I expect to start using this feature in other code more or less immediately.
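The idea is simple enough to show in a few lines: if the text arrives as a reference to a string, print it untouched, and otherwise wrap and indent it as before. The sketch below is not GLD's actual implementation. render_usage_text is a made-up name, and using the core Text::Wrap module for the wrapping is an assumption for the sake of a runnable example.

```perl
use strict;
use warnings;
use Text::Wrap ();

# Render one chunk of usage text.  A reference to a string means "trust the
# author, print verbatim"; a plain string gets wrapped and indented.
sub render_usage_text {
  my ($text, $width) = @_;

  return $$text if ref $text eq 'SCALAR';

  local $Text::Wrap::columns  = $width // 78;
  local $Text::Wrap::unexpand = 0;   # keep spaces, don't convert to tabs
  return Text::Wrap::wrap('    ', '    ', $text) . "\n";
}

my $verbatim = "  If you want to foo, use --foo.\n  Otherwise, don't.\n";

print render_usage_text("should we do foo?");
print render_usage_text(\$verbatim);
```

Using a reference as the signal means existing callers, who all pass plain strings, see no change in behavior.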

PTS 2019: Marlow (1/5)

by rjbs, created 2019-04-29 20:36

For several reasons, I had considered not going to this year's PTS. Going was a good idea, and I'm glad I did it, and I definitely got valuable things accomplished, but the main reason I decided to go wasn't the morale boost I'd get from writing code. It was Marlow, the town where the event was hosted. Marlow is Neil Bowers' home town, and I've heard about it quite a bit over the years. PTS was an opportunity to see it in person, to have a bit of a visit with Neil, to see the oft-praised Marlow Bookshop, and as I learned fairly late into the planning, to eat at a two Michelin star restaurant. I knew that the price of doing this would be hard work at the summit and I accepted it. Fortunately, I got to do all those things, they were great, and then the hard work was rewarding on its own. Two thumbs up.

Marlow felt just one notch up from tiny. The entirety of downtown seemed to be two roads of shops, each one about three blocks long. How tiny is it really? I couldn't say, I only spent a few hours there. Almost everything was recommended, except the fish and chips shop, about which apparently the less said the better. I did make it to the bookshop, which was charming, but I didn't spend enough time there to really take it in. I ate some great Thai and Vietnamese food, but the star meal was at the Hand and Flowers, the star-bearing "pub" I mentioned earlier. While we ate, Neil said, "Oh, and there are two three-star restaurants about five miles from here." I guess PTS will be back in Marlow sometime in the future, right?

Meeting Neil's family was also the impetus for me to finally learn how to play Dutch Blitz, and left me thinking I should really be able to solve a Rubik's cube. Also, who knew how many English people would be distressed at the prospect of watching someone drink their tea without milk?

So the town was nice, and about a mile's walk from the venue, Bisham Abbey. The manor house was built about 750 years ago, although significantly modified since. The building was imposing and attractive, and had a beautiful lawn that led right up to the banks of the Thames. Despite a few weirdly shaped doorways and slightly too-steep stairs, it was a great venue for the summit. It would've been nice to claim a few more of its rooms, but it was never a problem.


Bisham Abbey

We didn't stay at the abbey. Instead, we stayed just across a driveway at the national sports center on the abbey grounds. It's where Britain has some of its Olympians train, so it had a great gym, which I managed to use a few times. It also had a presumably even better gym, labeled the "Elite Gym", but it wasn't for mere mortals like me. The room was just fine. The food was fine. I failed to get any dinner on Saturday night, which made me realize that one of my travel bag necessities on future trips should be some kind of food for when I do that. A couple MetRx bars might be a reasonable emergency backup.

I left Marlow this morning the same way that I arrived: in a hired car driven by somebody hired by Neil, who knew exactly where I had to go, so all I had to do was sit down and wait. Top marks for that. Who doesn't love being picked up by a driver with your name on a slate? (It said "RJBS". Heh.)

Looking back, I almost wish that I'd had a bike while in Marlow, but the way that Brits drive on small country roads terrifies me. I'll just stick to walking next time.

I went to the Perl Toolchain Summit in Oslo!

by rjbs, created 2018-04-27 19:58
last modified 2018-04-28 18:13

The last time I wrote in this journal was in January, and I felt committed, at the time, to make regular updates. Really, though, my first half of 2018 has been pretty overwhelming, and journal posts are one of the many things I planned that didn't really happen. I'm hoping I can get back on track for the second half, for reasons I'll write about sometime soon. Maybe?

But I just got back from Oslo and the Perl Toolchain Summit, and it's practically a rule that if you go, you have to write up what you did. I like the PTS too much to thumb my nose at it, so here I am!

This was the 11th summit, and the second to be held in Oslo, where the first one was held in 2008. I've attended ten of them, including that first one, and they've all been great! They're small events with about thirty people, all of whom are there by invitation. The idea is that the invitees are people who do important work on "the toolchain," which basically means "the code used to distribute, install, and test Perl software modules." People show up with plans in mind, and then work on them together with the other people whose work will likely be affected or involved. There's always a feeling that everyone is working hard and getting things done, which helps me feel energetic. Also, I like to take breaks and walk around and ask what people are doing. Sometimes they're stuck, and I can help, and then I go back to my own work feeling like a champ. It's good stuff.

I've worked on a number of parts of the toolchain, over the years, but most often I spend my time on PAUSE, with just a little time on Dist::Zilla. That's where my time went again this year. I wasn't at PTS last year, as I was in Melbourne for work. In the last two years, I've been doing much less work on CPAN libraries. That, plus some recent tense conversations on related mailing lists, had me feeling unsure about the value of my going. By the end, though, I was very glad I went, and felt like I can be twice as productive next year if I spend just a little time preparing my work before I head out.

Here's what I actually did, more or less. Mostly less, as it's been a few days. I should have kept day to day notes as I've done in the past! There's another way to improve for next year.


I got to Oslo on Tuesday the 17th and walked around the city with David Golden. We didn't do much of anything, but I bought some things I'd forgotten to pack — this was my worst travel packing in years — and ate dinner with Todd Rinaldo. Mostly, my goal was to stay awake after taking a redeye flight, and I succeeded. The next day, we did more of the same, but also went to the National Gallery, which was a good stop. I wish I'd gone to the contemporary art museum down near the shore, but instead I got sidetracked into conversations about Perl. It could've been worse!

PAUSE work

PAUSE is the Perl Author Upload SErver. When you upload a file "to the CPAN," PAUSE decides whether you had permission to do what you're doing and then updates the indexes used by CPAN clients to install things.

Two years ago, in Rugby, we added a feature to PAUSE that would normalize distribution permissions on new uploads, making sure that if a co-maintainer uploaded a new version of a distribution, including some new package for the first time, the new package got the same permissions as the distribution's main module had. This is part of the slow shuffle of PAUSE from dealing only in package permissions to having a sense of distribution permissions.
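Here's a minimal sketch of that normalization idea. It is not PAUSE's code: the in-memory permission table, the role names, and the `normalize_perms` function are all made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical in-memory permission table: package => { user => role }.
my %perms = (
  'Foo::Bar' => { ALICE => 'first-come', BOB => 'co-maint' },
);

# Give every package that has no permissions yet a copy of the
# permissions held on the distribution's main module.
sub normalize_perms {
  my ($perms, $main_module, @packages) = @_;

  for my $package (@packages) {
    next if exists $perms->{$package};    # already has permissions

    # Shallow copy, so later changes to one package don't leak to another.
    $perms->{$package} = { %{ $perms->{$main_module} } };
  }

  return $perms;
}

# BOB, a co-maintainer, uploads a release that adds Foo::Bar::Util.
normalize_perms(\%perms, 'Foo::Bar', 'Foo::Bar', 'Foo::Bar::Util');
```

After the call, Foo::Bar::Util carries the same two entries as Foo::Bar, rather than being owned only by whoever happened to upload it first.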

I think the change was a good idea, but it had a few significant bugs. Andreas König had fixed at least some of them since then, but he had seen one file recently that hit a security problem. Unfortunately, his reproducer no longer worked, and we weren't sure why. While Andreas tried to reproduce the bug, I sat down and read the related code closely until I had a guess what the problem was.

I was delayed by an astounding bug in DB_File that manifested on my machine, but not his. This code:

  tie my %hash, 'DB_File', ...;

  $hash{ $_ } = $_ for @data;

  say "Exists" if exists $hash{foo};
  say "Grep"   if grep {; $_ eq 'foo' } keys %hash;

...would print "Grep" but not "Exists". In other words, exists was broken on these tied hashes. Rather than go further down this rabbit hole, we removed the tie. There's a lot more memory available for processes now than when the code was written maybe twenty years ago!
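For comparison, here is the same check against a plain in-memory hash, which is roughly what removing the tie amounts to. The contents of `@data` are stand-ins chosen for this sketch.

```perl
use strict;
use warnings;
use feature 'say';

my @data = qw(foo bar baz);   # stand-in data, assumed for this sketch

my %hash;                     # plain hash, no tie
$hash{ $_ } = $_ for @data;

# With an ordinary hash, the two ways of asking must agree.
my $exists = exists $hash{foo} ? 1 : 0;
my $grep   = (grep {; $_ eq 'foo' } keys %hash) ? 1 : 0;

say "Exists" if $exists;
say "Grep"   if $grep;
```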

When Andreas had a reproducer working, we tried my fix… and it failed. The theory was right, though, and I had a real fix pretty quickly. Once we knew just what the problem was, it was pretty easy to write a simple test case for the PAUSE test suite.

After that, Andreas mentioned we'd seen some errors with transaction failures. I did a bit of work to make PAUSE retry transactions when appropriate and, while I was doing that, made what I thought were significant improvements to the emails PAUSE sends when things go wrong.
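A retry wrapper of the general kind described might look like the sketch below. The name `retry_txn`, the retry limit, and the error pattern used to spot a transient failure are all assumptions, not what PAUSE does.

```perl
use strict;
use warnings;

# Run $code, retrying up to $max_tries times if it dies with an error
# that looks transient. The pattern is a guess at what such errors say.
sub retry_txn {
  my ($max_tries, $code) = @_;

  for my $try (1 .. $max_tries) {
    my @result;
    my $ok = eval { @result = $code->(); 1 };
    return @result if $ok;

    my $error = $@;
    die $error unless $error =~ /deadlock|lock wait timeout/i;  # not transient
    die $error if $try == $max_tries;                           # out of retries
  }
}

# Usage: a stand-in "transaction" that fails twice, then succeeds.
my $attempts = 0;
my ($status) = retry_txn(3, sub {
  $attempts++;
  die "Deadlock found when trying to get lock\n" if $attempts < 3;
  return 'committed';
});
```

Errors that don't match the pattern are rethrown immediately, so a genuine bug isn't retried and hidden.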

Doing this work, like a lot of PAUSE work, involved working out a lot of issues of action at a distance and vestigial code. In fact, the bug boiled down to a piece of code being added in the wrong phase, and the phase to which it was added should've been deleted a year before the bug was introduced. Since it wasn't, it was a deceptively inviting place to add the new feature. I decided I would try to purge some code that was long overdue for purging, and to refactor some large and confusing methods.

I spent a fair bit of Thursday afternoon on this, as a first pass. I think I made some good improvements. My biggest target was "the big loop", a while loop labeled BIGLOOP in PAUSE::mldistwatch aka "the indexer". I pulled this code apart into a few subroutines, and then did the same thing to some of the routines it called. More and more, I felt confident that there were (and still are) two main problems to address:

  • Distinct concerns like permissions and side-effects are interwoven and difficult to separate out. I started to discuss the idea of making the first phase of indexing construct a plan of side effects which would then all be taken in a second phase. Easier said than done! Later in the week, David Golden did some work toward this, though, and I think next year we can make big strides.
  • Often, decisions are made at a distance. During phase 1, a check might cause a variable on some parent object to be set. Later, in phase 6, a different object might go and find that parent and check the flag in order to make a decision. This is all somewhat ad hoc, so it's often unclear why a flag is being set or what the full implications of setting it might be. Achieving a desired effect late in the program might require changing the actions taken very early.
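The "plan first, act second" idea from the first point can be sketched like this. Everything here is hypothetical (the `plan_indexing` and `execute_plan` names, the action names, and the shape of the upload data); it only shows the separation of deciding from doing.

```perl
use strict;
use warnings;

# Phase 1: inspect the upload and return a plan; nothing is changed yet.
sub plan_indexing {
  my ($upload) = @_;
  my @plan;

  for my $package (sort keys %{ $upload->{packages} }) {
    push @plan, {
      action  => 'index_package',
      package => $package,
      version => $upload->{packages}{$package},
    };
  }

  push @plan, { action => 'send_report', to => $upload->{author} };
  return \@plan;
}

# Phase 2: carry out the planned side effects, in order.
sub execute_plan {
  my ($plan, $index, $outbox) = @_;

  for my $step (@$plan) {
    if ($step->{action} eq 'index_package') {
      $index->{ $step->{package} } = $step->{version};
    } elsif ($step->{action} eq 'send_report') {
      push @$outbox, "report for $step->{to}";
    } else {
      die "unknown action: $step->{action}";
    }
  }
}

my $upload = { author => 'FCOME', packages => { 'Foo::Bar' => '0.001' } };
my $plan   = plan_indexing($upload);

my (%index, @outbox);
execute_plan($plan, \%index, \@outbox);
```

Because the plan is plain data, a test can inspect it without touching an index or sending mail, and every decision is visible in one place instead of being a flag set on some distant object.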

On Friday, I carried on work to deal with this, but eventually I stopped short of any major changes. It was going to take me all of Friday to come up with a solution I'd like, and I didn't think I could have the implementation done in time to want to deploy it. The last thing I wanted was to push Andreas to deploy a massive refactoring on Sunday night before we all went home! Instead, this problem is on my list of things to think hard about in the weeks leading up to the 2019 PTS.

Finally, I had a go at the final (we hope) change needed for the full case-desensitization of PAUSE. Right now, if you've uploaded Foo::Bar, PAUSE prevents you from later uploading FOO::BAR. This was a conscious decision to avoid case conflicts, but we've since decided that you should be able to change the case, if you have permissions over the flattened version, but we must keep everything consistent. I made some good progress here, but then hit the problems above: side effects and checks happen in interleaved ways, so getting everything just right is tricky. I have a branch that's nearly done, and I hope to finish it up this year.
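The rule being aimed at can be sketched as a lookup on the flattened (lowercased) package name. The `%owner_of` table and the `may_upload` function are invented for this example and ignore everything else PAUSE has to keep consistent.

```perl
use strict;
use warnings;

# Hypothetical permission table, keyed by the lowercased package name.
my %owner_of = ( 'foo::bar' => 'ALICE' );

# A user may upload a package, under any capitalization, if nobody owns
# the flattened name or if that user owns it.
sub may_upload {
  my ($user, $package) = @_;
  my $owner = $owner_of{ lc $package };
  return 1 if ! defined $owner;
  return $owner eq $user ? 1 : 0;
}
```

With this table, `may_upload('ALICE', 'FOO::BAR')` is true, while `may_upload('BOB', 'FOO::BAR')` is false, because only ALICE holds permissions over the flattened name.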

While doing that, I also made some improvements to the test system. I'm proud of the PAUSE test suite! It's easy to add new tests, and now it's even easier. In the past, if you wanted to test how the indexer would behave, you'd build some fake CPAN distributions and mock-upload them. These distributions could be made by hand, or by using Module::Faker. Either way, they took the form of tarballs sitting in a corpus directory. Making them was a minor drag, and once they're made, you're not always sure what their point is.

I added a new method, upload_fake, that takes a META.yml file, builds the dist that that file might represent, and uploads that file for indexing. For example, given this metafile:

# filename: Foo-Bar-0.001.yml
name: Foo-Bar
version: 0.001
provides:
  Foo::Bar:
    version: 0.001
    file: lib/Foo/Bar.pm
X_Module_Faker:
  cpan_author: FCOME

...the test suite will make a file with lib/Foo/Bar.pm in it with the code needed: package statements, version declarations, and so on. Then it will upload it to F/FC/FCOME/Foo-Bar-0.001.yml. This uses Module::Faker under the hood, and I took a little time to make some small tweaks to Module::Faker. I have some good ideas for my next round of work on it, too.
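The F/FC/FCOME part of that path follows the usual CPAN author directory layout: first letter, first two letters, then the full ID. A small helper makes the pattern concrete; `author_dir` is a name made up for this sketch, not part of PAUSE or Module::Faker.

```perl
use strict;
use warnings;

# Build the CPAN-style author directory for an author ID: first letter,
# first two letters, then the full ID.
sub author_dir {
  my ($cpanid) = @_;
  return join '/', substr($cpanid, 0, 1), substr($cpanid, 0, 2), $cpanid;
}

my $dir = author_dir('FCOME');   # F/FC/FCOME
```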

Oh, and finally: we want to add a new kind of permission, called "admin", which lets users upload new versions of code and grant uploading permissions to others, but is clearly not the primary owner. Right now, we don't have that. David and I both made some inroads to making it possible, but it's not there yet.


Dist::Zilla work

First, I applied a bunch of small fixes and made a new v6 release. These were all worthy changes, but fairly uninteresting, with the possible exception of an update to the configuration loader, which will now correctly load plugins from ./inc on perl v5.26, where . is not in @INC.
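As background on the @INC issue: one general way to load code from a local directory without relying on "." being in @INC is to add that directory to the search path for a limited scope. The sketch below shows only that general technique; it is not a description of how the Dist::Zilla configuration loader does it, and `with_inc_dir` is a hypothetical helper.

```perl
use strict;
use warnings;

# Run $code with $dir temporarily at the front of @INC. The change is
# undone automatically when the sub returns, thanks to local.
sub with_inc_dir {
  my ($dir, $code) = @_;
  local @INC = ($dir, @INC);
  return $code->();
}

# Inside the callback, a require would search ./inc first.
my $first_inside = with_inc_dir('./inc', sub { $INC[0] });
```

Keeping the change scoped means the rest of the program doesn't start picking up modules from the working directory by accident.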

After that, I made a new release of v5. It includes a tiny tweak to work better with newer Moose, so you can still get a working Dist::Zilla on a pre-5.14 perl. Don't get too excited, though. I still don't support v5. This release was made at the request of Karen Etheridge, but I'm not sure she's eager to field any support requests, either. Consider upgrading to v6!

After that, I started work on v7. At work, newer code is being written against perl v5.24, and we use lots of new features: lexical subroutines, pair slicing, postfix dereferencing, subroutine signatures, /n, and so on. If practical, I wanted to be able to start doing that in Dist::Zilla.

I have a program that crawls over the CPAN, unpacking every distribution and building a small report about its contents. Here's an example report on one dist:

  sqlite> select * from dists where dist = 'Dist-Zilla';
           distfile = RJBS/Dist-Zilla-6.012.tar.gz
               dist = Dist-Zilla
       dist_version = 6.012
             cpanid = RJBS
              mtime = 1524298921
       has_meta_yml = 1
      has_meta_json = 1
          meta_spec = 2
  meta_dist_version = 6.012
     meta_generator = Dist::Zilla version 6.012, CPAN::Meta::Converter version 2.150010
   meta_gen_package = Dist::Zilla
   meta_gen_version = 6.012
      meta_gen_perl = v5.26.1
       meta_license = perl_5
     meta_yml_error = {}
   meta_yml_backend = YAML::Tiny version 1.70
    meta_json_error = {}
  meta_json_backend = YAML::Tiny version 1.70
  meta_struct_error = {}
       has_dist_ini = 1

I originally wrote this to tell me how many people were using Dist::Zilla, but it's useful for other things, like dependency analysis (not shown, above, is the dump of all the module requirements in the metafile) or common YAML errors.

The meta_gen_perl field looks for a new field I just added to all Dist::Zilla distributions, telling me the perl used to build the dist. Failing that, it looks for output from MetaConfig. You won't yet see these data for dists not built by Dist::Zilla. I looked for what perl version was being used to build distributions with Dist::Zilla v6:

  sqlite> SELECT SUBSTR(meta_gen_perl, 1, 5) AS perl, COUNT(*) AS dists
          FROM dists
          WHERE meta_gen_package = 'Dist::Zilla'
            AND meta_gen_perl IS NOT NULL
            AND SUBSTR(meta_gen_version,1,1)='6'
          GROUP BY SUBSTR(meta_gen_perl, 1, 5);

  perl        dists
  ----------  ----------
  v5.14       2
  v5.16       5
  v5.18       4
  v5.20       29
  v5.22       98
  v5.23       7
  v5.24       675
  v5.25       204
  v5.26       563
  v5.27       54

This isn't the best data-gathering in the world, but it made me feel confident about moving to v5.20. I started a branch, applied the commits that had been waiting for v5.20, and then got to work with other changes I wanted:

  • lexical subroutines
  • subroutine signatures
  • eliminating circumfix dereference

Lexical subs were only useful in a small number of places, as I expected. (The use here is "making the code a bit nicer to read".) Subroutine signatures were much more useful, and found a number of bugs or sloppy pieces of code, but they introduced a new problem.

Subroutine signatures enforce strict arity checking by default. That is, if you write this:

  sub add ($x, $y) { $x + $y }

  add(1, 2, 3);

...then you get an error about too many arguments. This is good! (It's also easy to make your subroutine accept and throw away unwanted arguments.) The not so good part is that the error you get tells you what subroutine was called incorrectly, but not by what calling line. This has been a known problem since signatures were introduced. For the most part, even though I use signatures daily, I hadn't found this to be a major problem. This time, though, a new pattern kept coming up:
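To make the "accept and throw away unwanted arguments" remark concrete, here is a small sketch. The name `add_lax` is invented for the example; the trailing bare `@` is the standard way to write a nameless slurpy parameter.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# Strict arity: exactly two arguments are allowed.
sub add ($x, $y) { $x + $y }

# A trailing bare @ accepts and discards any extra arguments.
sub add_lax ($x, $y, @) { $x + $y }

my $lax_sum = add_lax(1, 2, 3);          # extras ignored
my $ok      = eval { add(1, 2, 3); 1 };  # dies: too many arguments
my $error   = $@;
```

The arity check happens at run time, which is why it can be caught with eval here.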

  around some_method ($orig, $self, @rest) { ... }

Now, if the caller of some_method got the argument count wrong, I'd only be told that Class::MOP::around was called incorrectly. This could be anything! I'm going to push for the diagnostics to be fixed in v5.30.

In eliminating circumfix dereferencing, what I found was that I was always happier with postfix — and I knew I would be. I had already made a postfix-deref branch of Dist::Zilla years ago when the feature was experimental in a branch of v5.19. What I also found, though, was that I often wanted to eliminate the dereferencing altogether. Often, Dist::Zilla objects have attribute accessors that return references, often directly into the objects' guts. In those cases the reference doesn't just make things ugly, it makes things unsafe. I began converting some accessors to dereference and return lists. This broke a few downstream distributions, but nothing too badly. Karen helped me do some testing on this, along with some other v7 changes, and will probably end up dealing with more maybe-breakage based on v7 than anybody but me. I think I'll definitely keep these changes in the branch, and try to make sure everything is fixed well in advance.
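The difference between an accessor that returns a reference into the object and one that returns a flattened list can be shown with a toy class. `Hypothetical::Dist` and its methods are invented for this sketch and are not Dist::Zilla's API.

```perl
use v5.24;   # postfix dereferencing is available without a feature flag
use warnings;

package Hypothetical::Dist {
  sub new {
    my ($class, @authors) = @_;
    return bless { authors => [ @authors ] }, $class;
  }

  # Unsafe: hands out a reference straight into the object's guts.
  sub authors_ref { $_[0]{authors} }

  # Safer: dereference and return a list, so callers get copies.
  sub authors { $_[0]{authors}->@* }
}

my $dist = Hypothetical::Dist->new('rjbs');

my @copy = $dist->authors;
push @copy, 'nobody';                       # the object is unaffected

push $dist->authors_ref->@*, 'intruder';    # the object itself is modified
```

After this runs, the object's author list contains 'intruder' but not 'nobody', which is the safety problem with handing out references.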

The attribute I didn't want to just change to flattening, though, was $zilla->files, which returns the list of files in the distribution. For years, I've wanted to replace the terrible "array of file objects" with something a bit more like a filesystem. This would make "replace this file" or "delete the file named" or "rename this file" easier to write, check, and observe. It felt like fixing that might be best done at the same time as fixing other reference attributes.
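One way to picture "something a bit more like a filesystem" is a store keyed by file name, where replace, delete, and rename are single, checkable operations. This is only a sketch of the idea; `Hypothetical::FileStore` and its method names are made up and say nothing about what v7 will actually look like.

```perl
use strict;
use warnings;

package Hypothetical::FileStore {
  sub new { return bless { files => {} }, shift }

  # Add a file, or replace the one already stored under that name.
  sub put {
    my ($self, $name, $content) = @_;
    $self->{files}{$name} = $content;
    return;
  }

  # Remove a file by name and return its content; unknown names are errors.
  sub delete_file {
    my ($self, $name) = @_;
    die "no such file: $name\n" unless exists $self->{files}{$name};
    return delete $self->{files}{$name};
  }

  # Move a file to a new name, refusing to clobber an existing file.
  sub rename_file {
    my ($self, $old, $new) = @_;
    die "file exists: $new\n" if exists $self->{files}{$new};
    $self->put($new, $self->delete_file($old));
    return;
  }

  sub names {
    my ($self) = @_;
    my @names = sort keys %{ $self->{files} };
    return @names;
  }
}

my $store = Hypothetical::FileStore->new;
$store->put('lib/Foo.pm', "package Foo;\n1;\n");
$store->put('README', "old text\n");
$store->put('README', "new text\n");          # replace this file
$store->rename_file('README', 'README.md');   # rename this file
$store->delete_file('lib/Foo.pm');            # delete the file named
my @names = $store->names;
```

Compared with scanning an array of file objects, each operation names the file it affects and fails loudly when that file isn't there.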

So, v7 isn't abandoned or aborted, but it's definitely going to get some more thinking before release. That also gives me more time to collect more perl version usage data.


I made a new release of Test::Deep, mostly improving documentation. I also had a go at converting it to Test2::API. This caused some nervous noises from people who didn't want to see Test2 become required by something so high up the dependency river. What I found was that Test::Tester, which Test::Deep uses to test its own behavior, basically can't be used to test libraries using Test2::API without Test::Builder. If nothing else, that puts a significant damper on my musings about the Test::Deep change. It was only a quick side-project, though. I'm not in a rush or particularly determined to make the change.

I also talked with Chad Granum about Test2::Compare. I like parts of it, and other parts I'm not so keen on. I've wondered whether I can produce a sugar for Test2::Compare that I'd like. So far: maybe.


This year, there were fairly few big talking groups, which was fine by me. Sometimes, they're important, but sometimes they can be a big energy drain. (Sometimes, they're both.) We had a talk about building a page to help CPAN authors prepare for making new releases of highly-depended-upon distributions. I did a bit of work on that, but mostly just enough to let others contribute. I'm not sure how much success we've really had in the past with building how-tos.

There was some talk about converting PAUSE to have more dist-centric, rather than package-centric permissions. I agree that we should do that, but I knew it wasn't going to happen this year, so I stayed out of it as much as I could.

I interrupted a lot of people to ask what they were doing, which was often interesting and, maybe just once or twice, helpful to my victim.

Nico and Breno found that \1 is flagged read-only, but \-1 is not. "Oh good grief," I said, "it's going to be because -1 isn't a literal, it's an expression."

Sadly, I was right.

Chad showed me the tool he's working on for displaying test suite run results on the web, and it looked very nice. I asked if he'd ever heard of Smolder, and he said no, and I felt like an old-timer. Smolder was a project by Michael Peters, who was one of the attendees of the first summit (then the Perl QA Hackathon).

I talked with Breno about Data::Printer, and was utterly gobsmacked to learn that he implemented the "multiple Printer configurations in one process" feature that I've been moaning about (and blowing hot air about implementing) for years. I also showed off my Spotify playlist tracker to him and to Babs. I did not manage to get the Discover Weekly playlist of every attendee, though. Fortunately, this is easy to follow up on after the fact.


Other things of note: we went to a bar called RØØR and played shuffleboard. That was terrific, and I would like to do it again. We had a number of good dinners, and the price of beer in Oslo prevented me from overindulging. I couldn't find double-edged safety razors anywhere, so didn't shave until I couldn't stand it anymore and used a disposable razor from the hotel's shop. It was terrible, but I felt much better afterwards. I had skyr in Iceland, and it was good. I had salted liquorice in Oslo and it was utterly revolting, every single time I tried another piece.

Unlike some previous hackathons, we had lunch and dinner served at the workspace most days. I was worried that this might introduce the "20 hour days of doing terrible work because you're not sleeping" vibe of some hackathons. It did not. We still wrapped up whenever we felt done, but if we wanted to work just a little later to finish something, we could. It was good.


All the organizers did a great job, and it was a great event. I'm definitely looking forward to next year (in England), and I realize now that if I do some more prep work up front, I'll be much more successful. (I worked like this in the past, but I have since lost my way.)

The Perl Toolchain Summit is paid for by sponsors who make it possible to get all these people into one place to work without distraction. Those sponsors deserve lots of thanks! They've helped produce a lot of positive change over the years.

Thanks very much to Salve, Stig, Philippe, Laurent, Neil, and our sponsors: NUUG Foundation, Teknologihuset, Booking.com, cPanel, FastMail, Elastic, ZipRecruiter, MaxMind, MongoDB, SureVoIP, Campus Explorer, Bytemark, Infinity Interactive, OpusVL, Eligo, Perl Services, and Oetiker+Partner.

2017 in review, the 2018 plan

by rjbs, created 2018-01-01 15:40
tagged with: @markup:md journal todo

It has been almost a year since my last blog post, in which I complained that Slack silently drops some messages for people using the IRC gateway. That bug was about a year old then, and it's about two years old now. I'm still annoyed… but that's not why I'm here. I'm here to summarize what I did this year, since I forgot to post anything about it during the year.

I have been keeping busy, despite appearances to the contrary. My involvement in the Perl 5 community has been way down: I'm doing less on p5p, touching less on the CPAN, and I didn't even make it to the Perl Toolchain Summit this year, for the first time in seven years! What has caused this disruption? In part it's just a continued trend, but mostly it has been work. I have been busy, busy, busy.

There have been two parts to that. The first, at least chronologically, was Topicbox. Topicbox is FastMail's newest product, and I spent about a year and a half hard at work on building it (but not alone). The lousy five cent explanation is "it's a mailing list system," but that description is lousy. It calls to mind Mailman or Majordomo, or even our previous email discussion product, Listbox. I really enjoyed using Listbox, but Topicbox puts it to shame. Beyond being much (much) faster and simpler, it changes the focus from mailing lists to organizations. You create an organization, invite its members, and then discussions can be organized into topics.

Topicbox is built on JMAP, a developing standard for efficient client/server applications based on simple synchronized object storage. We wrote it in Perl 5, in Ix, a framework we built for writing JMAP applications. I spent much of 2016 working alone on Ix, but now several members of the internal dev team spend a lot of their time on it and Topicbox. At first, this was mainly a quality of life improvement for me. I'd been largely working in isolation, which was fine, but not great, and having people to discuss my work problems with was a big win.

Since then, the kind of work problems I have has changed substantially, which is the second thing that's kept me so busy with work this year. Pobox became part of FastMail in late 2015, and I think everybody realized pretty quickly that nobody on either side of the deal thought anybody on the other side was a boor or a cretin. Still, things moved slowly. The acquisition took (as I remember it) about fifty-two years to complete, and further integration of the teams was slow going. By early 2016, we knew we needed to re-organize the company at least somewhat to provide some structure for the growing team. We went through a few iterations of this, and by mid-year I ended up with business cards that said, "Ricardo Signes, CTO."

I've long wanted to be doing technical management, and this change has felt like I skipped a step or two on my imagined career path — which is just fine, since I felt like I'd spent a few too many years as an individual contributor. It's been a really enjoyable challenge to bring myself up to speed on parts of the system I've previously half-ignored, to start to build a coordinated plan for the future of our systems, and to work with all of the technical staff to execute that vision. I feel pretty good about our plans for 2018, and about the team's general excitement for the future. Beyond my project plan for 2018, I want this to be the year where I figure out and (more or less) lock down the amount of time I spend on different kinds of work.

I wrote todo lists for a few years in the past, and most of the time, I did very poorly. My finding was generally that I'd make pretty good progress on all my ongoing work projects, and could keep up with the random things I had going on, but I rarely made good progress on goals stated up front.

Probably many different things contributed to this, ranging from laziness to changing interests to bad time management to lack of accountability. I've tried to address some of these problems in the past, and I'm still trying, and I'm sure I'll never make as much of a one-year improvement as I want, but I'm going to keep trying.

My plan for 2018 is to try to stick to a three-tier routine: a daily routine, a weekly routine, and a monthly routine. Each one will start with a "make specific goals beyond the routine ones" and end with "write something about how you did." This writing will mostly go into my Day One journal, rather than boring everybody who subscribes to my blog's long-dead RSS feed… but I'll try to post some updates when I think it's interesting.

In 2017, I set my Goodreads reading goal to 52 books, and I hit 50% of that goal with a few hours to spare. For 2018, I've scaled back to 48, so I'll still have to push to hit it. I've set a few more specific goals about socializing with people, and I've said that every month I need to pick a skill to get better at, and then work on it. Maybe in time I'll realize that a month is longer than I need for most things, and make it a weekly pick.

This month, my goal is to get a better handle on IMAP. I have a tolerable understanding of it, but there are a few parts I could brush up on, and I think this will be a nice place to start. (It might also make it clear whether a month is too long for something small in scope.)

The other thing I need to keep in mind is that lots of other things are bound to come up and try to disrupt my routine. Sometimes, I might have to let them. When that happens, I need to be sure that I recover from the disruption, and that I don't view it as a failure, but just as a thing that happened. I think that a routine is a big help at getting things done, and that I've traditionally been so-so at sticking to non-work routines. I need to focus energy on getting into one, so that I can eventually have one without having to keep spending energy on it. Right? I think so.

The Slack IRC gateway drops your messages.

by rjbs, created 2017-01-23 20:42
last modified 2017-01-23 22:27

So, imagine the following exchange in private message with one of your team members:

  <alice> So, how have I been doing?
  <bob> Frankly, I don't think it's working out.
  * bob is joking!  You're doing great.

Maybe Bob shouldn't be such a joker here, but sometimes Bob can't help himself. Unfortunately, Bob has just caused Alice an incredible amount of stress. Or, more to the point, Slack has gotten in the way of team communication, leading to a terrible misunderstanding.

The problem is that when you use /me on the IRC gateway while in private chat with someone, Slack just drops the message on the floor and doesn't deliver it! Alice never got that final message.

Read that again, then ask yourself how often you may have miscommunicated with your team because of this. Remember that it goes both ways: maybe you never used /me in privmsg, but did your coworkers? Who knows! I reported this bug to Slack feedback in August 2016 and again in September. When I reported it the second time, I went back in my IRC logs to compare them to Slack logs. If I found a time when an emote showed up in both, I'd know when the problem started.

The answer is that it started around January or February 2016. In other words, this silent data loss bug has been in place for a year and known about for, at the very least, five months. It hasn't been fixed. More than a few times in this period, I've realized that I missed an invitation to a meeting or other important communication because it was /me-ed at me in privmsg. This is garbage.

I can't fix Slack from my desk, but I can fix my IRC client to prevent me from making this error. I wrote this irssi plugin:

use warnings;
use strict;

use Irssi ();

our $VERSION = '0.001';
our %IRSSI = (
  authors => 'rjbs',
  name    => 'slack-privmsg-me',
);

Irssi::signal_add('message irc own_action'  => sub {
  my ($server, $message, $target) = @_;

  # only stop /me on Slack
  return unless $server->{address} =~ /\.slack\.com\z/i;

  return if $target =~ /^#/; # allow /me in channels

  Irssi::print("Sorry, Slack drops /me silently in private chat.");
  Irssi::signal_stop(); # abort, so the action isn't sent
});

This intercepts every "about to send an action" event and, if it's to a Slack chatnet and in a private message, reports an error to the user and aborts. I could've made it turn the message into a normal text message, but I thought I'd keep it inconvenient to keep me angry.
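The decision the plugin makes can be pulled out into a plain function, which is handy for anyone porting it to another client. This sketch has no Irssi dependency; the function name `slack_would_drop_action` is made up for the example.

```perl
use strict;
use warnings;

# True if a /me sent via $server_address to $target would be dropped.
sub slack_would_drop_action {
  my ($server_address, $target) = @_;
  return 0 unless $server_address =~ /\.slack\.com\z/i;  # not Slack
  return 0 if $target =~ /^#/;                           # channels are fine
  return 1;                                              # private chat on Slack
}
```

For instance, an address ending in ".slack.com" with a target of "alice" gives a true result, while the same address with a target of "#general" does not.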

Please use this plugin. If you port it to WeeChat, I'll add a link here.

When people tell you, "Of course your free software project should use Slack. There's even an IRC gateway!" remember that Slack doesn't seem to give a darn about the IRC gateway losing messages. It's saying that gateway users are second-class users.



I will make friends by programming.

by rjbs, created 2017-01-15 22:25

Every once in a while I randomly think of some friend I haven't talked to in a while and I drop them an email. Half the time (probably more), I never hear back, but sometimes I do, and that's pretty great. This week, I read an article about Eagle Scouts and it made me realize I hadn't talked to my high school friend Bill Campbell for a while, so I dropped him an email and he wrote right back, and I was happy to have sent the email.

Today, I decided it was foolish to wait for random thoughts to suggest I should write to people, so I went into my macOS Contacts application and made a "Correspondents" group with all the people whom it might be fun to email out of the blue once in a while.

Next, I wrote a program to pick somebody for me to email. Right now, it's an incredibly stupid program, but it does the job. Later, I can make it smarter. I figure I'll run it once every few days and see how that goes.

I wrote the program in JavaScript. It's the sort of thing you used to have to write in AppleScript (or Objective-C), but JavaScript now works well for scripting OS X applications, which is pretty great. This was my first time writing any JavaScript for OSA scripting, and I'm definitely looking forward to writing more. Probably my next step will be to rewrite some of the things I once wrote in Perl, using Mac::Glue, which stopped working years ago when Carbon was retired.

Here's the JavaScript program in its entirety:

  var Contacts = Application("Contacts");

  var people = Contacts.groups.byName("Correspondents").people;
  var target = people[ Math.floor( Math.random() * people.length ) ];

  var first = target.firstName.get();
  var last  = target.lastName.get();

  first + " " + last;

what I read in 2016

by rjbs, created 2017-01-03 10:46
tagged with: @markup:md books journal

According to Goodreads, which should be accurate, these are the books I read in 2016. I meant to read more, but I didn't.

The Pleasure of the Text

I'd been meaning to read this since college, as one of my favorite professors spoke highly of Barthes. I found the book unpleasantly difficult and basically uninteresting. I like to imagine that I would have enjoyed the book much more in the original French, as the language in translation felt pretentious. That said, I think the ceiling on my enjoyment was going to be pretty low. I'll stick with Derrida and Debord.

Eclipse Phase: After the Fall

Eclipse Phase is a transhumanist RPG with a lot of interesting ideas. This is a collection of stories set in that game's universe. I did not enjoy it. There were one or two good bits, but mostly it was just not good.

Puttering About in a Small Land

This is one of Philip K. Dick's realistic novels. He wrote a few of these before settling into all sci-fi all the time. I've found them to be surprisingly good. They're mostly just slice-of-life stories about slightly unhappy people in mid-20th century California, but I have enjoyed them, as I did this one.

Frost: Poems

I was still making progress on my "one book of poetry each month" project! Frost was spectacular! Much better than I had anticipated. Reading him in middle and high school was too early. I couldn't yet appreciate the subtext of his works. I still think back on these poems now.

Ancillary Mercy

I finished the Ancillary Justice trilogy, and it was good. I seem to recall thinking that Mercy wasn't the best of the three books, but the whole trilogy was very good, and I'm glad to have read it.

The Girl with All The Gifts

Gloria recommended this to me a number of times, and I finally took her advice, and it was good advice. This was a good post-apocalypse story with an interesting premise. I think I was happy with the ending. They made a movie of it, but I haven't seen it yet.

Jane Eyre

I'm going to try to keep reading some classic 19th century novels over the next few years. I should've read more last year, but instead read only Jane Eyre. I feel I could've done a lot worse. It had problems (that ending!), but I really enjoyed it. It was charming and funny and well-written. Its position in the canon is well-earned. I look forward to reading Wide Sargasso Sea this year, too.


I felt like this book played on some of the same ideas as Ready Player One, but was a substantially better book. I found the character and premise more interesting than those in Ready Player One, which seemed to be built mostly on evoking nostalgia.

Programming Pearls

This is a book that other computer books recommended to me often enough that I asked for a copy, received it, and put it on my shelf for years. I mostly enjoyed it, although sometimes I found it a bit boring. The interesting bits were very interesting, though. I think my recommendation here is "read it, but skip the parts you don't find interesting."

The Fine Art of Mixing Drinks

I'd been nursing this book for a few years, and finally decided to just finish it. It's a good book to have a run through, although I wouldn't say it revolutionized my drink making.

Uncertainty in Games

This short book discusses the different kinds of uncertainty that can be introduced to games to make them more interesting. I enjoyed the overview, and it helped me think about what makes games that I like interesting (or not). It also didn't fall into the sort of stodgy prose that this sort of theoretical work often does.


This was a tolerable sci-fi story rendered intolerable by the writing of its characters. The romance and love scenes made me groan and roll my eyes. Skip it.

Dr. Adder

This book got a strong recommendation from Philip K. Dick, who said it would've defined cyberpunk if it had been published when written, instead of many years later. It definitely felt like a bridge between Dick and cyberpunk, but it was a big mess and spent a lot of its time reminding the reader how very transgressive it was. I'm glad I finally read it, but you probably shouldn't bother.

The Library at Mount Char

This might be the best recent book I read this year. It's about a group of weird people raised by a dangerous madman. The madman is missing, and they seem to be looking for him. It's pretty dark, but also funny in places. It was a good read.

Saturn Run

The US and China launch missions to Saturn (and back) after seeing evidence of an alien space ship there. The book had a lot going for it, idea-wise, but I found its characters uninteresting and its plot predictable.

Shadow of the Torturer

This is the first part of the Book of the New Sun. I think these books are amazing, especially in that they greatly reward repeat reading. I got a lot more out of them on this read than I did on my last, and I will surely read them again in a few years. Next time, I may take notes. This year, I plan to read Peace, another novel by Wolfe. This one also, I am told, rewards repeat reading, but it's only about 300 pages, which is a relief.

Starting Forth

I had read most of this book a few years ago, but never finished the exercises in the later chapters. This year I forced myself to do so, which led to re-reading a few chapters to get back my Forth legs. I'm very glad I did this, because it renewed my appreciation for Forth and got me to write a simple but instructive program in Forth.

Forth is great and more people should learn it. (Probably almost nobody should be using it for much real work, though.)

Claw of the Conciliator

This is the second part of the Book of the New Sun.

Thinking Forth

Stack Computers: The New Wave

Sword of the Lictor

This is the third part of the Book of the New Sun.

Citadel of the Autarch

This is the fourth part of the Book of the New Sun.

The Urth of the New Sun

This is the fifth and final part of the Book of the New Sun.

The Sonnets (Berrigan)

The most compelling part of this collection is the introduction, which makes big claims. By the end of the book, I felt like maybe those claims hadn't been malarky, after all. The sonnets and general structure of the book became more interesting as it went on, and I began to see more of the simultaneity of the work's poems. I almost felt like starting over with that in mind. Almost.

Who Censored Roger Rabbit

This is the book on which Who Framed Roger Rabbit was based. The film is superior in every way. I am irritated that I spent time on this book.


This is a short story about how libertarianism isn't all that it's cracked up to be. It was good, but maybe I would've liked it better if Vance, rather than Heinlein, had written it.

Learn Vimscript the Hard Way

This is a good book on Vim. It didn't change my life, but I learned things.

The Skin

I'd had this book on my list for years after a favorable review when the book was first published unexpurgated in English. It's a semi-real memoir of Curzio Malaparte's experience of the final months of WWII in Italy. There were parts where I felt certain that Joseph Heller must have read it before writing Catch-22, most especially in the rantings of a man who insisted that Italy had won by losing the war, just like the dirty old man in Catch-22's brothel.

I'm glad I read the book. It was an interesting read and definitely had its moments, but I can't say that I'd strongly recommend it to anybody else. It was disjointed and sometimes difficult to enjoy. Still, as I say, I'm glad I read it.

Also, I should admit that one of the things that bothered me is the amount of dialog in French. It made me feel poorly educated, which I am.

Something Happened

This is Joseph Heller's second book, published 13 years after Catch-22. From Wikipedia:

Something Happened has frequently been criticized as overlong, rambling, and deeply unhappy. These sentiments are echoed in a review of the novel by Kurt Vonnegut Jr., but are balanced with praise for the novel's prose and the meticulous patience Heller took in the creation of the novel.

I agree with those remarks, except for the idea that the prose and patience balanced out the book's painful length. (It's only about 500 pages, but they're long pages.) There was a lot that I really liked about this book, but eventually I couldn't stay engaged and skimmed my way to the irritating conclusion.

Seven Databases in Seven Weeks

I found this about as good as the Seven Languages books. In other words: it was ... okay. It was a tolerable crash course, but I wasn't interested enough to do the exercises. Maybe doing them would've helped, but the book itself didn't get me interested enough to bother. Also, I have generally felt, in these books, that none of the authors has a really interesting narrative or voice.

Still, I understand Neo4j a lot better, now.

Luminous

This is the other short story collection by Greg Egan. In general, I thought every story in it was a failure. Some had promise, but most were lousy, especially compared to his other (better, but still flawed) story collection, Axiomatic. I did sort of like Reasons to Be Cheerful, but not enough to make the book worth reading. Maybe not even enough to make the story worth reading.

Wyst: Alastor 1716

After Luminous, I wanted to read something I was guaranteed to enjoy. Clearly, I thought, I should read some Jack Vance. I asked Mark Dominus for a recommendation, and this was the shorter of the two he recommended. I enjoyed the heck out of it, because Vance is a delight.

Discovering Scarfolk

This is a book from the Scarfolk blog, which presents surreal artifacts and articles from a fictional 1970s English town, Scarfolk. It was a fun and quick read, but I find the blog more fun.

The Informers

While on business in Melbourne, I got thinking about Bret Easton Ellis. Years ago I decided that I needed to space out my reading of his books, so that I wouldn't run out of books too quickly. He's one of my favorite authors, although I'm not sure I could say why. It has something to do with the very, very precise kind of bleakness he presents.

The Informers is (I think) his fourth book, and the fifth that I've read. It's a short story collection, and each story is very, very loosely connected to some of the other stories in the book. This mirrors his usual practice of tying his books together with very thin threads. I found that a story collection was a great format for him, because it was able to further spread out the pointlessness depicted. It didn't need much motion at all, because the break between stories was a substitute for any actual rising or falling action.

I was very pleased with my decision to read it.

The Speechwriter

This is a political memoir by one of the speechwriters for Mark Sanford (https://en.wikipedia.org/wiki/Mark_Sanford), former governor of South Carolina. I had read that it provided some great insights into American politics, but it didn't. On the other hand, it was a light, enjoyable read. It was well written and funny when it wanted to be.

The Plagiarist

In a desperate attempt to get one more book read before 2016 ended, I decided to read this sixty-four page novella by Hugh Howey. It's a sci-fi story about a man who visits computer-generated worlds so that he can steal their original works of literature. It was mediocre.

Horror Movie Month 2016 (body)

by rjbs, created 2016-11-02 23:21
last modified 2019-09-28 19:03
tagged with: @markup:md journal movies

Another year, another thirty-one days of horror movies! I think our selections this year had fewer losers than some past years, but probably also fewer stand-out winners.

What We Watched

October 1: The Thing (1982)

A classic! I forgot how good the creature effects were, and how effective the moments of suspense. Also, as with all of Carpenter's original scores: great music!

We watched this one with Martha, who said it was much less boring than the 1951 version, which we previously tried to watch and abandoned, in preparation for this one. On the other hand, she said it was not scary.

October 2: Don't Breathe

We saw this one in the theatre!

It had a lot going for it, but in the end, we thought it was only okay, at best. I think I might have given it a "pretty good, if flawed," if it hadn't basically become a super-gross rape movie.

No, thanks. We want fun horror movies.

October 3: Paranormal Activity: The Ghost Dimension

a.k.a. Paranormal Activity 6

It was the best Paranormal Activity movie in a long time. This does not mean it was very good. I liked a few things that it did, especially in how it tied itself back to the original film. Still, though, these movies are just not that great. I think it was about as good a capstone as we were going to get.

October 4: Silent Scream

We've spent some time working through a BuzzFeed list of horror movies. Some of the movies we'd already seen, and some we saw over the last two years. Now we're down to movies that I've had a hard time finding. Silent Scream is one of those.

Short version: it wasn't worth it. It had a couple funny bits, but mostly it was one of those 1980-ish films where they were trying to figure out how to make a slasher the right way. This one didn't hit the mark.

October 5: The Gate

Some kids stumble across a hole in the back yard. Their parents go away. The hole is full of tiny demons. They spend the movie fighting the demons before finally banishing them. It's from 1987, and you can tell. It's time to market horror-style movies to young audiences, and that's what this is. It's not very good.

Instead, watch Joe Dante's The Hole.

October 6: He Never Died

Don't read the Wikipedia article! Just go see the movie!

This was probably my favorite new film of the month. It wasn't perfect, and as the movie went on it got less interesting to me, but Henry Rollins was just perfect. He plays a weird loner who lives in the big city and tries to avoid doing anything interesting. He eats eggplant parms every day and plays bingo a few times a week. There is something seriously weird going on with this boring guy.

I enjoyed it.

October 7: Extraordinary Tales

It's an anthology of animated Edgar Allan Poe stories. I was entirely unimpressed. Read them instead.

October 8: We Are Still Here

Middle-aged couple moves into a haunted house. Their neighbors visit but are sort of creepy weirdos. Friends come to visit and things get worse. On its surface, this movie didn't look very interesting, but I enjoyed it. It had good pacing. It was creepy. I wish the story held together a bit better, but it was fun. I would watch a sequel.

October 9: The Others

This was our second horror movie with Martha this year!

Gloria and I had seen this before, and I remembered thinking it was okay. On rewatching, I found it more interesting. It was a pretty solid ghost story.

Martha's verdict: good movie, not scary.

October 10: Ava's Possessions

The movie starts when a young woman has a possessing demon exorcised from her. She's found guilty of crimes she committed while possessed, and one option she's given is to go into a group therapy program for possession victims. This is a good setup, and they do not entirely squander it. I think I would've liked the movie better if there had been less horror action and more frank discussion of the problems of being possessed.

Still, I enjoyed it.

October 11: American Horror Story: Roanoke

Promising start. We liked the format as a ghost stories TV show.

October 12: Green Room

So, it was okay. A lefty punk band gets booked to play a white supremacist skinhead club. Things go sour.

We were excited because it had Patrick Stewart. He was wasted. The rest of the cast was good, though.

In the end, it was just an "everybody tries not to get killed" movie with a touch of torture porn. I was sure it was going to be a werewolf movie, and I was sorely disappointed when it wasn't.

October 13: American Horror Story: Roanoke

Uh oh. The usual American Horror Story thing is starting: too many plot lines, too many characters. What's going on here? Can this possibly remain coherent?

October 14: Southbound

An anthology! I love a good anthology, but so many horror anthologies are just crap. This one was not! It wasn't perfect, but it was good. There were five stories, each one linking to the next. I liked every story (but The Accident may have been my favorite), and I thought they were just connected enough to make it fun.

October 15: Krampus

It's a horror-comedy about Christmas. Krampus, Santa's evil buddy, comes to punish the grinchy. It was okay. It was not great, nor was it terrible.

October 16: The Fog (1980)

Another film that we'd seen before, but now watched with Martha! I like a lot about The Fog, and there's also a lot that I don't like. Maybe my big problem is that I find ghost pirates altogether too corny as villains.

Martha's verdict: she liked it, even though it wasn't scary.

October 16: Big Bad Wolves

I was a little worried when I realized this one would have subtitles, but I didn't mind. Also, this movie sounded very good. That is: the spoken Hebrew sounded really good in the mouths of these characters, even if I didn't understand a word of it.

The movie was well made, and I liked quite a few things about it, but its effective black comedy was undermined by the fact that it was about someone who raped and murdered children.

October 17: American Horror Story

Ugh, we give up. Too much stuff going on. Even if some of it was good, it was too much of a mess.

My proposal was that they try to do this series differently. Instead of acting like it's a whole season of a ghost stories TV series, they should've run a new ghost story each episode, slowly letting the viewer realize that they were somehow connected. Maybe by the end, we break the fourth wall and realize that the horror is now hunting the producers of the show.

Except the producers of American Horror Story would have to add eighteen subplots to the story.

October 18: Glitch

This ended up not seeming very horror-y. It's a new-to-us Australian TV series about a very small number of the dead rising from their graves. It looked pretty good, and we'll definitely watch more of it.

October 19: the third presidential debate

Okay, we skipped any traditional horror viewing to watch the debate between Clinton and Trump. I was left shaken.

October 20: Viral

Viral is an outbreak movie in which a parasite begins to spread rapidly, turning people into murder monsters. The film focuses on the experiences of two sisters trapped in a quarantined LA suburb during the outbreak while their parents are away. It was okay, but I felt like it could've been a lot better with some more work on the script.

October 21: Night of the Living Deb

Weird-o awkward protagonist has a one night stand with the guy she's been crushing on for ages. When they wake up, zombies have taken over their town. As they try to escape they discover what's really happening to their city… and their hearts.

It was okay. It really needed better writing. It occasionally felt like an SNL sketch.

October 22: Detention

We watched this one a few years ago, too. I really, really enjoyed this movie. I liked that it was weird, and unlike anything else, but not pretentious or self-important. It's just fun.

October 23: The Blob (1988)

Another movie watched with Martha! I hadn't seen this movie for maybe twenty years or more. When we watched it, I became sure of something I had forgotten: there was a novelization of this movie, and I read it. I can't be sure, but I feel pretty confident that this is true. I would've been about ten, so I probably thought it was awesome.

Anyway, the movie was cheesy, but not terrible. It was definitely much better than the 1958 version, which I considered watching with Martha a year or so ago. That was tiresome enough that I gave up during my previewing.

This one had some good bits. I especially liked the line cook being pulled bodily down the sink drain, making it bulge like a snake eating a rat.

Martha's verdict: fun, but not scary.

October 23: Insidious: Chapter 3

I may have liked this best of all the Insidious films. A teenage girl wants to contact the spirit of her dead mother, but instead gets the attention of malicious spirits living in her building. A friendly medium is called in to help.

There wasn't really anything in particular that I liked about this movie, I just liked it. I liked the cast, and I was pleased that the movie was neither very weird nor overly hampered by its formula.

All that said, it wasn't great. It was okay.

October 24: The Conjuring

This movie had a lot of problems. I wanted to yell at the TV, "Why are you doing this stupid thing that is obviously going to get someone killed?" Despite this, it was a decent horror movie. The creepy parts were creepy, the relaxing middle parts were relaxing, and it wasn't just jump scares.

Gloria and I are both pretty tired of possessions and haunted houses, though. The ideas aren't spent, but they've both been covered the same way in many, many films, especially in the last ten years. We need to see new ideas.

October 25: The Strain

We watched the first episode of this TV series about vampires invading New York City. (This is my understanding about what the series will be about. Even as I write this, we've only made it to episode three. I'm not sure what's really going to happen.)

I found the first episode sort of interesting, but also a mess. I didn't care enough to keep watching, but I felt like if I watched another episode, I might realize that then I'd want to keep watching. This turned out to be true.

I read somewhere that the producers of The Strain kept wanting to push the limits to see what they could get away with. I can see that, in the show.

October 26: The Purge: Election Year

Each subsequent Purge movie gives us more information about what life is like in the world of The Purge. This is a mixed blessing. On one hand, we want to know how the world has come to this, and what is really going on in America. On the other hand, the movie's answer is stupid and doesn't amount to much more than "bad stuff happened I guess."

I would like to see a short-run TV series about The Purge, giving us better stories of how it started, what life is like when the Purge isn't going on, and how cleanup and recovery work the next day.

Just watching people try to make it across town while the laws are on hold gets old, and I could just go watch Escape from New York instead.

October 27: Sinister

Dude moves into a murder house to write about it. He discovers home movies of all kinds of awful murders. He realizes that there is a supernatural force that has been killing families for years. He tries to escape, but he can't, because there is a TWIST!

I liked a lot of things in this movie. It had plenty of good creepy bits. We really liked James Ransone's character and his scenes with Ethan Hawke. In the end, the explanation was not great. Vincent D'Onofrio's character makes things much more confusing than they needed to be.

I was pretty willing to watch more, though. So we did.

October 28: Sinister 2

We were sold on this when we found out that James Ransone's character from the first film would have a major role. Unfortunately, I found this one a lot less interesting. We mostly watched the lives of the two kids, and it was more unpleasant than interesting. The first movie did a good job of making it clear that creepy things happened to kids, but we didn't have to watch. In the second movie, we had to watch a lot of unpleasant kid scenes. Meh.

October 29: Ash vs. Evil Dead

I didn't know this series existed until we heard about it on NPR. We've only watched one episode so far, but I enjoyed it. It was very reminiscent of the original films. I know this shouldn't be surprising, given the people involved, but I was worried.

Gloria said, "It's hard to imagine that this kind of thing will work for two whole seasons." I agree, but I look forward to finding out what happens.

October 30: Poltergeist

Martha had been begging to watch this movie for over a year. Why oh why would we not let her watch it? She would not be scared. So, after a month of "this movie isn't scary" we said she could watch Poltergeist with us.

We settled in to watch the movie. Children were eaten by trees, vanished into televisions, attacked by clowns. Parents floated amid rotting corpses in the pool. A guy peeled his own face off.

Martha's verdict: Good movie, not scary.

I guess I'm just glad she liked it.

October 31: The Witch

I really liked It Follows, and thought it was a good movie in that every part of it was focused on creating a particular mood, and it worked. I heard people compare The Witch to It Follows just for this reason, so I was keen to see it.

Gloria and I both found it boring and uninteresting.

We didn't like the characters. We didn't find the evil things creepy. We didn't think the conflicts were interesting. I was especially irritated that the parents were so religious that it made them stupid. I tweeted this, and of course got replies in support of the idea that all religious people are stupid. Putting this tiresome idea aside: it doesn't make good viewing.


Is this the last 31-movie Horror Movie Month? Maybe.

We've found fewer great new horror movies the last few years, and often the best ones we find, we watch during the year, meaning that October is a bit of a slog. This year was much better, but I'm not sure we'll have that luck in 2017. Maybe we'll switch to mostly watching horror movies with Martha in October instead.

We've got a good eleven months to decide, though.

Horror Movie Month 2015 (body)

by rjbs, created 2016-10-29 19:57
last modified 2016-11-02 21:15
tagged with: @markup:md journal movies

So, it turns out I never posted a summary of our Horror Movie Month for 2015! I tried to recreate our viewing list by looking up my tweets, but there are a bunch of days with no movie tweet. What happened? I'm not sure. Anyway, here's a very late summary, missing days, and with a year's clouding of my memory. How accurate will it be? Who knows!

If I was more dedicated, I'd go find all my tweets from these days to reconstruct my thinking at the time. But I'm not.

The Whole List

October 1: Incident On and Off a Mountain Road

This was the first of several Masters of Horror films that we watched. As with most of them, I thought it had an okay idea that didn't really work out. We were annoyed that the movie seemed to be setting up one kind of scenario, but then it was something else entirely.

October 2: Cam2Cam

There have been a few horror movies told through video chat. This was one of the better ones, but it still wasn't great. I guess I don't regret it.

October 3: All the Boys Love Mandy Lane

All the Boys Love Mandy Lane, too, wasn't great, but I liked the ways it defied or subverted several genre norms.

October 4: Scream


It was still very good. I think this is probably one of my (and our?) favorite slasher movies, because it's scary, funny, and smart. Like many of the best slashers, though, it only works because you know the genre tropes. That's okay, though, because we all do!

October 5: Afflicted

Afflicted was a good take on the "something bad happens on a road trip" movie, and much better than many similar movies I've seen. Found footage, though… even though it often makes sense in the movies being made, it's a technique that needs a long rest.

October 6: 13 Sins

I really liked 13 Sins. It wasn't great, but I enjoyed its little twists, and it worked as a (very) dark comedy.

October 7: American Horror Story

We hated the 2015 season of American Horror Story and gave up after a couple episodes. The story was a mess, there was too much going on, and we didn't care about any of it. Also, we found a number of the actors' performances to be really dull.

October 7, later: Late Phases

I liked it. It wasn't the best thing ever, but I liked the characters, I liked the story, and I liked the way its tone varied into places we don't often see in movies like these. There's a great scene in which the protagonist is riding to town with a busload of (other) senior citizens and nothing much happens.

October 9: Open Grave

I don't remember it well. My recollection is that it had a bunch of really tired tropes, but did okay despite them… but it wasn't great. "Everyone is locked in a room and has amnesia" is something we've probably had enough of.

October 10: Scream 2

It was good!

October 11: Housebound

Gloria had already seen this and consented to watch it again. Why? Because it was good! Gloria likes horror comedies, and so do I, but I keep thinking we should watch serious ones, too, and I'm almost always wrong. In Housebound, a young woman in New Zealand is sentenced to house arrest and slowly begins to realize that the house is haunted. Hollywood would probably make this movie really gritty and creepy and shot in lots of sepia tones. Housebound was funny and surprising and weird. It took a couple twists that surprised me and made me laugh. Endorsed!

October 12: Curse of Chucky

I'd put it in the same bin as the earlier Child's Play movies, but better than most of them. That said, it was only okay, at best.

October 13: Hellraiser Ⅲ

Terrible. Awful. Why did they keep making these? I do mean to finish watching the series, but only because I am an idiot. The only thing I remember liking was "The DJ," a cenobite who throws compact discs with lethal force.

Really, it was just terrible.

October 15: Pick Me Up!

This was another Masters of Horror that didn't deliver on its potential. It was okay, and I'm glad I saw it, but I'd like to see it done over again. It's about two competing serial killers, which was a fun concept.

I learned about Masters of Horror existing because of this movie. I was looking for more films by Larry Cohen, whose work I've largely enjoyed.

October 16: Cheap Thrills

This was fun. Like 13 Sins, it's about some guy who gets told to do increasingly weird things to make money. It's also got a darkly comedic streak. I enjoyed it!

October 18: Mischief Night

Blind woman terrorized by masked killer. I barely remember it at all. We've seen enough of these movies that I do remember, so this one is probably best ignored.

October 19: The Houses October Built

Road trip! A bunch of friends are traveling around and looking for the scariest haunted house experiences around the country. There's this rumor that a super-secret one exists that is way scarier than you've ever seen before.

Sounds like a great concept, I thought! I found it disappointing.

October 20: Dreams in the Witch House (Masters of Horror)

This was one of the better Masters of Horror that we watched, but it was still mostly just okay. I enjoyed that it was a pretty faithful adaptation of Lovecraft, since so many adaptations change too much.

October 21: Razorback

This is a 1984 horror movie about an enormous bloodthirsty razorback boar that kills people. It's basically Jaws, but in the outback. This movie was not good, but it was pretty weird. It seemed like, "If this is even sort of what 1984 rural Australia was like, then Mad Max seems like a much more plausible vision of the future than I thought."

October 22: Jenifer

It's a Masters of Horror film. Like many of the others, it was interesting, but not really great. That said, it's probably my favorite piece of horror work by Dario Argento, whose films I usually find sort of overwrought. It was nice and creepy.

October 23: Case 39

I had entirely forgotten this film until I re-read the Wikipedia page just now. I can't say whether I liked it or not. I think, if I recall correctly, that it had a few good moments, but was otherwise mediocre.

October 25: Stonehearst Asylum

This movie had a good twist, but I think it maybe gave it up too soon… but it might have been pretty hard to keep it hidden for much longer. It had a lot of good talent (Ben Kingsley!), but it could've done more to be creepy and not just weird.

October 26: Nightbreed

I heard so many good things about this film, but I was really hesitant, because I'm not much of a fan of Clive Barker. It was so-so. The Nightbreed themselves were weird, but not very compelling. I would've rather re-watched Basket Case 2.

October 27: Cigarette Burns

This was probably my favorite of the Masters of Horror movies that we watched. It's very much in the spirit of H.P. Lovecraft, at least until the strange (but good) ending. It reminded me in many ways of Carpenter's earlier In the Mouth of Madness, which I think was (as I recall) a better film, despite a number of scenes in Cigarette Burns that were more nuanced and interesting than anything in Madness.

October 28: Vlog

I barely remember this. As I recall, it was much creepier than I expected, but had a bunch of other problems. It was one of the many (many? well, enough) webcam movies we watched in 2015.

October 29: Homecoming (Masters of Horror)

This was a dark comedy in which dead American soldiers rise from their graves as zombies, with just one desire: the vote. They plan to vote for anyone who will end the war. The film focuses on the campaign staff of the sitting president, who supports the war, and has to spin this the right way. The movie had a lot of problems, but it was good enough in other ways to overcome them.

October 30: The Fury

So, this was something like Carrie, or Firestarter, or Scanners. There's a secret government plan to weaponize psychics, and of course things go wrong. It was mediocre.

October 31: A Girl Walks Home Alone at Night

This was an Iranian film — I think. It was filmed in Persian, anyway, and directed by an Iranian woman. I remember it only faintly. I recall liking what they did with a cat.

October 31: The Stuff

This was my then-eight-year-old daughter's first admission to Horror Movie Month. It's a film by Larry Cohen, whose films I really like. In it, a weird white goo is found bubbling up from the frozen ground by someone who seems to be an industrial site night watchman. He looks at it quizzically and then, of course, tastes it. It's delicious!

Months later, The Stuff is the number one dessert in the country. People love it. They eat it for all three meals, plus snacks. It's a huge sensation, even if the ingredients just say "natural ingredients."

A young boy runs away from his family because they've lost their minds for The Stuff. He teams up with Michael Moriarty (who appears often in Cohen's films) to find the secret of The Stuff and stop it from consuming all the consumers in the world!

After the movie, we treated ourselves to some of that wonderful Stuff!

The Stuff

sending email with TLS (in Perl) (body)

by rjbs, created 2016-10-22 22:55

Every once in a while I hear or see somebody using one of the two obsolete secure SMTP transports for Email::Sender, and I wanted to make one more attempt to get people to switch, or to get them to tell me why switching won't work.

When you send mail via SMTP, and need to use SMTP AUTH to authenticate, you want to use a secure channel. There are two ways you're likely to do that. You might connect with TLS, conducting the entire SMTP transaction over a secure connection. Alternatively, you might connect in the clear and then issue a STARTTLS command to begin secure communication. For a long time, Perl's default SMTP library, Net::SMTP, did not support either of these, and it was sort of a pain to use them.

Email::Sender is probably the best library for sending mail in Perl, and it's shipped with an SMTP transport that uses Net::SMTP. That meant that if you wanted to use TLS or STARTTLS, you needed to use another transport. These were around as Email::Sender::Transport::SMTPS and Email::Sender::Transport::SMTP::TLS. These worked, but you needed to know that they existed, and might rely on libraries (like Net::SMTPS) not quite as widely tested as Net::SMTP.

About two years ago, Net::SMTP got native support for TLS and STARTTLS. About six months ago, the stock Email::Sender SMTP transport was upgraded to use it. Now you can just write:

my $xport = Email::Sender::Transport::SMTP->new({
  host => 'smtp.pobox.com',
  ssl  => 'starttls', # or 'ssl'
  sasl_username => 'devnull@example.com',
  sasl_password => 'aij2$j3!aa(',
});

...and not think about installing anything else. This is what I suggest you do.
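
For readers who don't live in Perl, the same two choices (TLS from the first byte, or a clear connection upgraded with STARTTLS) look roughly like this with Python's standard smtplib. This is only a sketch: the open_smtp name and the port numbers 465 and 587 are my assumptions, not anything Email::Sender does for you.

```python
import smtplib
import ssl


def open_smtp(host: str, mode: str = "starttls", timeout: float = 10.0) -> smtplib.SMTP:
    """Open a secured SMTP connection; `mode` mirrors the `ssl` option above."""
    if mode not in ("ssl", "starttls"):
        raise ValueError(f"unknown mode: {mode!r}")

    context = ssl.create_default_context()

    if mode == "ssl":
        # Implicit TLS: the whole SMTP conversation runs inside TLS.
        # Port 465 is an assumption; check your provider.
        return smtplib.SMTP_SSL(host, 465, timeout=timeout, context=context)

    # STARTTLS: connect in the clear, then upgrade before authenticating.
    # Port 587 is an assumption; check your provider.
    client = smtplib.SMTP(host, 587, timeout=timeout)
    client.starttls(context=context)
    return client
```

After that, calling login(username, password) on the returned client performs SMTP AUTH over the secured channel in either mode.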

I'm learning Rust! (body)

by rjbs, created 2016-10-15 22:45

I've been meaning to learn Rust for a long time. I read the book a while ago, and did some trivial exercises, but I didn't write any real programs. This is a pretty common problem for me: I learn the basics of a language, but don't put it to any real use. Writing my stupid 24 game solver in Forth definitely helped me think about writing real Forth programs, even if it was just a goof.

Now I'm working on implementing the Software Tools programs in Rust. These are simple programs that solve real world problems, or at least approximations of real world problems. I've written programs to copy files, expand and collapse tabs, count words, and compress files. So far, all my programs are pretty obviously mediocre, even to me, but I'm having fun and definitely learning a lot. At first, I thought I'd be working my way through the book program by program, but now I realize that I'm going to be continually going back to earlier work to improve it with the things I'm learning as I go.

For example, I started off by buffering all my I/O manually, which worked, but made everything I did a bit gross to look at. Later, I found that you can wrap a thing that reads from a file (or other data source) in something that buffers it but then provides the same interface. I went back and added that to my old programs, deleting a bunch of code.
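
The wrap-it-in-a-buffer idea isn't specific to Rust. Here is a small sketch of the same shape in Python (not the Rust code from this project): a word counter that wraps whatever binary stream it is given in io.BufferedReader and then just reads lines. The count_words name is made up for the example.

```python
import io
from typing import BinaryIO


def count_words(source: BinaryIO) -> int:
    """Count whitespace-separated words from a binary stream."""
    # BufferedReader wraps the underlying stream and offers the same
    # read-style interface, so this code never manages a buffer itself.
    reader = io.BufferedReader(source)
    return sum(len(line.split()) for line in reader)
```

The counting logic doesn't change whether the source is a file, a pipe, or an in-memory stream, which is what let the manual buffering code be deleted.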

Soon, I know I'm going to be going back to add better command line argument handling. I'm pretty sure my error handling is all garbage, too.

Still, the general concept has been a great success: I'm writing programs that actually do stuff, and they have fun edge cases, and it's just a lot less tedious than exercises in a text book.

So far, so good!

solving the 24 game in Forth (body)

by rjbs, created 2016-08-23 10:46
last modified 2016-08-23 18:41

About a month ago, Mark Jason Dominus posted a simple but difficult arithmetic puzzle, in which the solver had to use the basic four arithmetic operations to get from the numbers (6, 6, 5, 2) to 17. This reminded me of the 24 Game, which I played when I paid my infrequent visits to middle school math club. I knew I could solve this with a very simple Perl program that would do something like this:

  for my $inputs ( permutations_of( 6, 6, 5, 2 ) ) {
    for my $ops ( pick3s_of( qw( + - / * ) ) ) {
      for my $grouping ( 'linear', 'two-and-two' ) {
        next unless $target == solve($inputs, $ops, $grouping);
        say "solved it: ", explain($inputs, $ops, $grouping);
      }
    }
  }

All those functions are easy to imagine, especially if we're willing to use string eval, which I would have been. I didn't write the program because it seemed obvious.
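
To make the "obvious" version concrete, here is a runnable sketch of that brute force in Python rather than Perl. It swaps string eval for function calls and uses exact fractions instead of floats; the solve name and the output format are made up for the example. It only tries the same two groupings as the loop above, not every possible parenthesization.

```python
from fractions import Fraction
from itertools import permutations, product
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}


def solve(inputs, target):
    """Return every expression string over `inputs` that equals `target`."""
    nums = [Fraction(n) for n in inputs]
    found = []
    # set() drops duplicate orderings caused by repeated inputs (the two 6s)
    for a, b, c, d in sorted(set(permutations(nums))):
        for o1, o2, o3 in product(OPS, repeat=3):
            f1, f2, f3 = OPS[o1], OPS[o2], OPS[o3]
            candidates = (
                # "linear" grouping: a ~ (b ~ (c ~ d))
                (f"{a} {o1} ({b} {o2} ({c} {o3} {d}))",
                 lambda: f1(a, f2(b, f3(c, d)))),
                # "two-and-two" grouping: (a ~ b) ~ (c ~ d)
                (f"({a} {o1} {b}) {o2} ({c} {o3} {d})",
                 lambda: f2(f1(a, b), f3(c, d))),
            )
            for text, evaluate in candidates:
                try:
                    if evaluate() == target:
                        found.append(text)
                except ZeroDivisionError:
                    pass  # skip expressions that divide by zero
    return found
```

Calling solve([6, 6, 5, 2], 17) returns a list that includes "6 * (2 + (5 / 6))", which is the fraction-based answer the puzzle needs.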

On the other hand, I had Forth on my brain at the time, so I decided I'd try to solve the problem in Forth. I told Dominus, saying, "As long as it's all integer division! Forth '83 doesn't have floats, after all." First he laughed at me for using a language with only integer math. Then he told me I'd need to deal with fractions. I thought about how I'd tackle this, but I had a realization: I use GNU Forth. GNU's version of almost everything is weighed down with oodles of excess features. Surely there would be floats!

In fact, there are floats in GNU Forth. They're fun and weird, like most things in Forth, and they live on their own stack. If you want to add the integer 1 to the float 2.5, you don't just cast 1 to a float, you move it from the data stack to the float stack:

2.5e0 1. d>f f+

This puts 2.5 on the float stack and 1 on the data stack. The dot in 1. doesn't indicate that the number is a float, but that it's a double. Not a double-precision float, but a two-cell value. In the Forth implementation I'm using, 1 gets you an 8-byte 1 and 1. gets you a 16-byte 1. They're both integer values. (If you wrote 1.0 instead, as I was often tempted to do, you'd be making a double that stored 10, because the position of the dot doesn't matter.) d>f takes a double from the top of the data stack, converts it to a float, and puts it on the top of the float stack. f+ pops the top two floats, float-adds them, and pushes the result back onto the float stack. Then we could verify that it worked by using f.s to print the entire float stack to the console.
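
If the two-stack dance is hard to picture, here is a toy model of it in Python. It is a simplification, not how GNU Forth is implemented: each stack is a plain list, and a double is modelled as one Python integer rather than two cells. The d_to_f and f_plus names are stand-ins for d>f and f+.

```python
data_stack = []   # integer cells
float_stack = []  # floats live on their own stack


def d_to_f():
    """d>f: pop a double from the data stack and push it as a float."""
    float_stack.append(float(data_stack.pop()))


def f_plus():
    """f+: pop two floats and push their sum."""
    b = float_stack.pop()
    a = float_stack.pop()
    float_stack.append(a + b)


# Equivalent of: 2.5e0 1. d>f f+
float_stack.append(2.5)  # 2.5e0 goes straight onto the float stack
data_stack.append(1)     # 1. is an integer, so it lands on the data stack
d_to_f()                 # move it across, converting on the way
f_plus()                 # float stack now holds one value, 3.5
```

Forgetting the d_to_f step leaves the 1 sitting on the data stack while f_plus finds only one float to pop, which is the "wrong stack" trap in miniature.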

Important: You have to keep in mind that there are two stacks, here, because it's very easy to manipulate the wrong stack and end up with really bizarre results. GNU Forth has locally named variables, but I chose to avoid them to keep the program feeling more like Forth to me.


I'm going to run through how my Forth 24 solver works, not in the order it's written, but top-down, from most to least abstract. The last few lines of the program, something like int main, are:

  17e set-target
  6e 6e 5e 2e set-inputs

  ." Inputs are: " .inputs
  ." Target is : " target f@ fe. cr
  ' check-solved each-expression

This sets up the target number and the inputs. Both of these are stored, not in the stack, but in memory. It would be possible to keep every piece of the program's data on the stack, I guess, but it would be a nightmare to manage. Having words that use more than two or three pieces of data from the stack gets confusing very quickly. (In fact, for me, having even one or two pieces can test my concentration!)

set-target and set-inputs are words meant to abstract a bit of the mechanics of initializing these memory locations. The code to name these locations, and to work with them, looks like this:

  create inputs 4 floats allot              \ the starting set of numbers
  create target 24 ,                        \ the target number

  : set-target target f! ;

  \ sugar for input number access
  : input-addr floats inputs + ;
  : input@ input-addr f@ ;
  : input! input-addr f! ;
  : set-inputs 4 0 do i input-addr f! loop ;

create names the current memory location. allot moves the next allocation forward by the size it's given on the stack, so create inputs 4 floats allot names the current allocation space to inputs and then saves the next four floats worth of space for use. The comma is a word that compiles a value into the current allocation slot, so create target 24 , allocates one cell of storage and puts a single-width integer 24 in it.

The words @ and ! read from and write to a memory address, respectively. set-target is trivial, just writing the number on the stack to a known memory location. Note, though, that it uses f!, a variant of ! that pops the value to set from the float stack.

set-inputs is built in terms of input-addr, which returns the memory address for a given offset from inputs. If you want the final (3rd) input, it's stored at inputs plus the size of three floats. That's:

  inputs 3 floats +

When we make the three a parameter, we swap the order of the operands to plus so we can write:

  floats inputs + ( the definition of input-addr )

set-inputs loops from zero to three, each time popping a value off of the float stack and storing it in the next slot in our four-float array at inputs.


Now we have an array in memory storing our four inputs. We also want one for storing our operators. In fact, we want two: one for the code that implements an operator and one for a name for the operator. (In fact, we could store only the name, and then interpret the name to get the code, but I decided I'd rather have two arrays.)

  create op-xts ' f+ , ' f- , ' f* , ' f/ ,
  create op-chr '+  c, '-  c, '*  c, '/  c,

These are pretty similar to the previous declarations: they use create to name a memory address and commas to compile values into those addresses. (Just as the comma word compiles a cell, c, compiles a single char.) Now, we're also using ticks. We're using tick in two ways. In ' f+, the tick means "get the address of the next word and compile that instead of executing the word." It's a way of saying "give me a function pointer to the next word I name." In '+, the tick means "give me the ASCII value of the next character in the input stream."

Now we've got two arrays with parallel indexes, one storing function pointers (called execution tokens, or xts, in Forth parlance) and one storing single-character names. We also want some code to get items out of these arrays, but there's a twist. When we iterate through all the possible permutations of the inputs, we can just shuffle the elements in our array and use it directly. When we work with the operators, we need to allow for repeated operators, so we can't just shuffle the source list. Instead, we'll make a three-element array to store the indexes of the operators being considered at any given moment:

  create curr-ops 0 , 0 , 0 ,

We'll make a word curr-op!, like ones we've seen before, for setting the op in position i.

  : curr-op! cells curr-ops + ! ;

If we want the 0th current operator to be the 3rd one from the operators array, we'd write:

  3 0 curr-op!

Then when we want to execute the operator currently assigned to position i, we'd use op-do. To get the name (a single character) of the operator at position i, we'd use op-c@:

  : op-do    cells curr-ops + @ cells op-xts + @ execute ;
  : op-c@    cells curr-ops + @ op-chr + c@ ;

These first get the value j stored in the ith position of curr-ops, then get the jth value from either op-xts or op-chr.
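
The double lookup is easier to see without the address arithmetic. Here is a sketch of the same layout in Python, with lists standing in for the memory blocks. The names mirror the Forth words, but op_do takes its operands as arguments instead of popping them from a float stack, which is a simplification of mine.

```python
import operator

OP_XTS = [operator.add, operator.sub, operator.mul, operator.truediv]  # like op-xts
OP_CHR = "+-*/"                                                         # like op-chr
curr_ops = [0, 0, 0]  # like curr-ops: three indexes into the tables above


def set_curr_op(j, i):
    """Like `j i curr-op!`: position i now refers to operator j."""
    curr_ops[i] = j


def op_do(i, a, b):
    """Look up position i's operator index, then call that operator."""
    return OP_XTS[curr_ops[i]](a, b)


def op_chr(i):
    """Look up position i's operator index, then fetch its one-character name."""
    return OP_CHR[curr_ops[i]]
```

Because curr_ops stores indexes rather than the operators themselves, the same operator can appear in all three positions, which shuffling the source table could never give us.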

permutations of inputs

To get every permutation of the input array, I implemented Heap's algorithm, which has the benefit of being not just efficient, but also dead simple to implement. At first, I began implementing a recursive form, but ended up writing it iteratively because I kept hitting difficulties in stack management. In my experience, when you manage your own stacks, recursion gets significantly harder.

  : each-permutation ( xt -- )
    init-state

    dup execute

    0 >r
    begin
      4 i <= if rdrop drop exit then

      i i hstate@ > if
        i do-ith-swap
        dup execute
        i hstate1+!
        zero-i
      else
        0 hstate i cells + !
        inc-i
      then
    again ;

This word is meant to be called with an xt on the stack, which is the code that will be executed with each distinct permutation of the inputs. That's what the comment (in parentheses, like this) tells us. The left side of the double dash describes the elements consumed from the stack, and the right side is elements left on the stack.

init-state sets the procedure's state to zero. The state is an array of counters with as many elements as the array being permuted. Our implementation of each-permutation isn't generic. It only works with a four-element array, because init-state works off of hstate, a global four-element array. It would be possible to make the permutor work on different sizes of input, but it still wouldn't be reentrant, because every call to each-permutation shares a single state array. You can't just get a new array inside each call, because there's no heap allocator to keep track of temporary use of memory.

(That last bit is stretching the truth. GNU Forth does have words for heap allocation, which just delegate to C's malloc and friends. I think using them would've been against the spirit of the thing.)

The main body of each-permutation is a loop, built using the most generic form of Forth loop, begin and again. begin tucks away its position in the program, and again jumps back to it. This isn't the only kind of loop in Forth. For example, init-state initializes our four-element state array like this:

  : init-state 4 0 do 0 i hstate! loop ;

The do loop there iterates from 0 to 3. Inside the loop body (between do and loop) the word i will put the current iteration value onto the top of the stack. It's not a variable, it's a word, and it gets the value by looking in another stack: the return stack. Forth words are like subroutines. Every time you call one, you are planning to return to your call site. When you call a word, your program's current execution point (the program counter), plus one, is pushed onto the return stack. Later, when your word hits an exit, it pops off that address and jumps to it.

The ; in a Forth word definition compiles to exit, in fact.

You can do really cool things with this. They're dangerous too, but who wants to live forever? For example, you can drop the top item from the return stack before returning, and do a non-local return to your caller's caller. Or you can replace your caller with some other location, and return to that word -- but it will return to your caller's caller when it finishes. Nice!

Because it's a convenient place to put stuff, Forth ends up using the return stack to store iteration variables. They have nothing to do with returning, but that's okay. In a tiny language machine like those that Forth targets, some features have to pull double duty!

begin isn't an iterating loop, so there's no special value on top of the return stack. That's why I put one there before the loop starts with 0 >r, which puts a 0 on the data stack, then pops the top of the data stack to the top of the return stack. I'm using this kind of loop because I want to be able to reset the iterator to zero. I could have done that with a normal iterating loop, I guess, but it didn't occur to me at the time, and now that I have working code, why change it?

Iterator reset works by setting i back to 0 with the zero-i word. In a non-resetting loop iteration, we increment i with inc-i. Of course, i isn't a variable, it's a thing on the return stack. I made these words up, and they're implemented like this:

  : zero-i r> rdrop 0 >r >r ;
  : inc-i  r> r> 1+ >r >r ;

Notice that both of them start with r> and end with >r. That's me saving and restoring the top item of the return stack. You see, once I call zero-i, the top element of the return stack is the call site! (Well, the call site plus one.) I can't just replace it, so I save it to the data stack, mess around with the second item on the return stack (which is now the top item) and then restore the actual caller so that when I hit the exit generated by the semicolon, I go back to the right place. Got it? Good!

Apart from that stuff, this word is really just the iterative Heap's algorithm from Wikipedia!
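
For reference, here is the iterative Heap's algorithm as a Python sketch, so the control flow can be read without any return-stack juggling. It is the textbook form, not a translation of my Forth: the swap rule shown is the standard one, and I'm not claiming do-ith-swap is written exactly this way.

```python
def each_permutation(items, visit):
    """Call visit(items) once per ordering, permuting items in place."""
    n = len(items)
    state = [0] * n      # plays the role of hstate
    visit(items)         # the starting order counts as a permutation
    i = 0
    while i < n:
        if state[i] < i:
            # Heap's rule: even i swaps with slot 0, odd i with slot state[i]
            j = 0 if i % 2 == 0 else state[i]
            items[i], items[j] = items[j], items[i]
            visit(items)
            state[i] += 1
            i = 0        # reset the iterator, like zero-i
        else:
            state[i] = 0
            i += 1       # move on, like inc-i
```

Unlike the Forth word, this version allocates a fresh state list on every call, so it works for any length and is reentrant.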

nested iteration

Now, the program didn't start by using each-permutation, but each-expression. Remember?

  ' check-solved each-expression

That doesn't just iterate over operand permutations, but also over operations and groupings. It looks like this:

  : each-expression ( xt -- )
    2 0 do
      i 0= linear !
      dup each-opset
      loop drop ;

It expects an execution token on the stack, and then calls each-opset twice with that token, setting linear to true (-1, because i is 0 and 0= yields a true flag) for the first call and to false (0) for the second. linear controls which grouping we'll use, meaning which of the two ways we'll evaluate the expression we're building:

  Linear    : o1 ~ ( o2 ~ ( o3 ~ o4 ) )
  Non-linear: (o1 ~ o2) ~ (o3 ~ o4)

each-opset is another iterator. It, too, expects an execution token and repeatedly passes it to something else. This time, it calls each-permutation, above, once with each possible combination of operator indexes in curr-ops.

  : each-opset ( xt -- )
    4 0 do i 0 curr-op!
      4 0 do i 1 curr-op!
        4 0 do i 2 curr-op!
          dup each-permutation
          loop loop loop drop ;

This couldn't be much simpler! It's exactly like this:

  for i in (0 .. 3) {
    op[0] = i
    for j in (0 .. 3) {
      op[1] = j
      for k in (0 .. 3) {
        op[2] = k
        each_permutation(xt)
      }
    }
  }
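
Put another way, the three nested loops are a Cartesian product of the operator table with itself, three times over. Here is a sketch in Python that enumerates the whole search space with itertools, which is a handy way to check the arithmetic. It treats the four input positions as distinct, the way the Forth program does, and the variable names are made up for the example.

```python
from itertools import permutations, product

# Every way to fill three operator slots from four operators, repeats allowed.
opsets = list(product(range(4), repeat=3))

# Every ordering of the four input positions.
orderings = list(permutations(range(4)))

groupings = ("linear", "two-and-two")

# One entry per expression the solver will try.
expressions = [
    (grouping, ops, order)
    for grouping in groupings
    for ops in opsets
    for order in orderings
]
```

That's 64 operator triples, 24 orderings, and 2 groupings, so expressions ends up with 3072 entries.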

inspecting state as we run

Now we have the full stack needed to call a given word for every possible expression. We have three slots each for one of four operators. We have four operands to rearrange. We have two possible groupings. We should end up with 4! x 4³ x 2 expressions. That's 3072. It should be easy to count them by passing a counting function to the iterator!

create counter 0 ,
: count-iteration
  1 counter +!    \ add one to the counter
  counter @ . cr  \ then print it and a newline
  ;

' count-iteration each-expression

When run, we get a nice count up from 1 to 3072. It works! Similarly, I wanted to eyeball whether I got the right equations, so I wrote a number of different state-printing words, but I'll only show two here. First was .inputs, which prints the state of the input array. (It's conventional in Forth to start a string printing word's name with a dot, and to end a number printing word's name with a dot.)

  : .input  input@ fe. ;
  : .inputs 4 0 do i .input loop cr ;

.inputs loops over the indexes to the array and for each one calls i .input, which gets and prints the value. fe. prints a formatted float. Here's where I hit one of the biggest problems I'd have! This word prints the floats in their order in memory, which we might think of as left to right. If the array has [8, 6, 2, 1], we print that.

On the other hand, when we actually evaluate the expression, which we'll do a bit further on, we get the values like this:

4 0 do i input@ loop \ get all four inputs onto the float stack

Now the stack contains [1, 2, 6, 8]. The order in which we'll evaluate them is the reverse of the order we had stored them in memory. This is a big deal! It would've been possible to ensure that we operated on them the same way, for example by iterating from 3 to 0 instead of 0 to 3, but I decided to just leave it and force myself to think harder. I'm not sure if this was a good idea or just self-torture, but it's what I did.

The other printing word I wanted to show is .equation, which prints out the equation currently being considered.

  : .equation
    linear @
    if
      0 .input 0 .op
        1 .input 1 .op
        (( 2 .input 2 .op 3 .input ))
    else
      (( 0 .input 0 .op 1 .input ))
      1 .op
      (( 2 .input 2 .op 3 .input ))
    then
    ." = " target f@ fe. cr ;

Here, we pick one of two formatters, based on whether or not we're doing linear evaluation. Then we print out the ops and inputs in the right order, adding parentheses as needed. We're printing the parens with (( and )), which are words I wrote. The alternative would have been to write things like:

  ." ( " 2 .input 2 .op 3 .input ." ) "

...or maybe...

  .oparen 2 .input 2 .op 3 .input

My program is tiny, so having very specialized words makes sense. Forth programmers talk about how you don't program in Forth. Instead, you program Forth itself to build the language you want, then do that. This is my pathetic dime store version of doing that. The paren-printing functions look like:

  : (( ." ( " ;
  : )) ." ) " ;

testing the equation

Now all we need to do is write something to actually test whether the equations hold and tell us when we get a winner. That looks like this:

  : check-solved
    this-solution target f@ 0.001e f~rel
    if .equation then ;

This is what we passed to each-expression at the beginning! We must be close to done now...

this-solution puts the value of the current expression onto the top of the (float) stack. target f@ gets the target number. Then we use f~rel. GNU Forth doesn't give you a f= operator to test float equality, because testing float equality without thinking about it is a bad idea: it's too easy to lose precision to floating point mechanics. Instead, there are a bunch of float comparison operators. f~rel takes three items from the float stack and puts a boolean onto the data stack. Those items are two values to compare, and an allowed relative margin of error. We're going to call the problem solved if we're within a relative error of 0.001 of the target. If we are, we'll call .equation and print out the solution we found.
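
Here is what that comparison amounts to, as a Python sketch. The formula is the one I understand the Gforth manual to give for f~rel, |r1 - r2| < r3 * |r1 + r2|; treat that as an assumption and check the manual if the exact bound matters to you.

```python
def f_rel(r1: float, r2: float, r3: float) -> bool:
    """Approximate equality with relative error: |r1 - r2| < r3 * |r1 + r2|."""
    return abs(r1 - r2) < r3 * abs(r1 + r2)


# Why plain equality is a trap: this sum is not exactly 0.3 in binary floats.
exact_match = (0.1 + 0.2) == 0.3
close_enough = f_rel(0.1 + 0.2, 0.3, 0.001)
```

Because the margin scales with the size of the operands, a tolerance of 0.001 near a target of 17 accepts anything within about 0.034, which is plenty tight for telling 17 apart from its near misses.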

The evaluator, this-solution, looks like this:

  : this-solution
    4 0 do i input@ loop

    linear @ if
      2 op-do 1 op-do 0 op-do
    else
      2 op-do
      frot frot
      0 op-do
      fswap
      1 op-do
    then ;

What could be simpler, right? We get the inputs out of memory (meaning they're now in reverse order on the stack) and pick an evaluation strategy based on the linear flag. If we're evaluating linearly, we execute each operator's execution token in order. If we're grouping, it works like this:

          ( r1 r2 r3 r4 ) \ first, all four inputs are on the stack
  2 op-do ( r1 r2 r5    ) \ we do first op, putting its result on stack
  frot    ( r2 r5 r1    ) \ we rotate the third float to the top
  frot    ( r5 r1 r2    ) \ we rotate the third float to the top again
                          \ ...so now the "bottom" group of inputs is on top
  0 op-do ( r5 r6       ) \ we do the last op, evaluating the bottom group
  fswap   ( r6 r5       ) \ we restore the "real" order of the two groups
  1 op-do ( r7          ) \ we do the middle op, and have our solution
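
If the stack shuffling is hard to follow, here is the grouped evaluation replayed in Python with a list as the float stack (top of stack on the right). This is a sketch to check the trace above, not the program itself: the operators are passed in as functions rather than looked up through curr-ops, and the helper names are mine.

```python
import operator


def frot(stack):
    """frot ( a b c -- b c a ): move the third item to the top."""
    stack.append(stack.pop(-3))


def fswap(stack):
    """fswap ( a b -- b a ): exchange the top two items."""
    stack[-1], stack[-2] = stack[-2], stack[-1]


def apply_op(stack, fn):
    """Pop two floats and push fn(second, top), as f+ f- f* f/ do."""
    b = stack.pop()
    a = stack.pop()
    stack.append(fn(a, b))


def grouped(inputs, op0, op1, op2):
    """Evaluate (i0 op0 i1) op1 (i2 op2 i3) using only stack operations."""
    stack = list(inputs)   # r1 r2 r3 r4
    apply_op(stack, op2)   # r1 r2 r5
    frot(stack)            # r2 r5 r1
    frot(stack)            # r5 r1 r2
    apply_op(stack, op0)   # r5 r6
    fswap(stack)           # r6 r5
    apply_op(stack, op1)   # r7
    return stack.pop()
```

With inputs 8, 6, 2, 1 and the operators minus, times, plus, this computes (8 - 6) * (2 + 1).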

That's it! That's the whole 24 solver, minus a few tiny bits of trivia. I've published the full source of the program on GitHub.

JSON::Typist (body)

by rjbs, created 2016-08-06 13:17

I've been meaning, for a while, to make a little post about a library I wrote a while ago.

Perl 5's type system is a mixed bag. Sometimes it's great, because you don't need to worry about types, and sometimes it's a pain, because you wish you could worry about types. There have been a number of proposals or attempts to sort this out over time, but basically nothing has happened. My guess is that not much is ever really going to happen, and that's okay. I still like Perl 5.

Sometimes, though, the lack of typing really does get in the way. In my experience, it's mostly when you need to deal with something outside of Perl that does have a strong distinction between numbers and strings. This can often be in the interchange of serialized data structures. JSON, for example, has three fundamental types that Perl more or less muddles together: numbers, strings, and booleans.

When using JSON.pm, booleans can be produced by using \0 and \1, which is a bit weird, but ends up working really nicely. When read in, booleans become objects. Okay!

Strings and numbers can be produced by serializing "$x" or 0+$x directly, or by starting with string or number literals, which is maybe okay, but inspecting the data before it gets serialized can ruin this effect:

  ~$ perl -MJSON -E '$x = 0; say $x; say JSON->new->encode([$x])'

  ~$ perl -MJSON -E '$x = 0; say "$x"; say JSON->new->encode([$x])'

That say "$x" could always be buried deep in some subroutine, and you end up with spooky action at a distance.

Similarly, if you read in JSON and wanted to check what types the data had, you'd end up using B::svref_2object or other much-too-low-level tools. I wanted to be able to get objects back, just like I did with a boolean. I don't want this all the time, only sometimes, but when I want it, I want it!

I wrote JSON::Typist, which walks a structure produced by a JSON decode and returns a new structure, replacing string and number leaves with objects:

  my $content = q<{ "number": 5, "string": "5" }>;

  my $json = JSON->new->convert_blessed->canonical;

  my $payload = $json->decode( $content );
  my $typed   = JSON::Typist->new->apply_types( $payload );

  $typed->{string}->isa('JSON::Typist::String'); # true
  $typed->{number}->isa('JSON::Typist::Number'); # true

I'm using it for testing a web service that must provide data in the right types. It isn't enough to make sure that $data->{id} eq $expected, I also need to know that it was provided as a string. With JSON::Typist, I can.
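
The walking part is simple enough to sketch. Here it is in Python, purely to show the shape of the traversal; this is not JSON::Typist's implementation. Python's json module already keeps numbers and strings apart, so there the wrapping is redundant, and the class and function names below are made up for the example.

```python
import json
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class TypedString:
    value: str


@dataclass(frozen=True)
class TypedNumber:
    value: Union[int, float]


def apply_types(node):
    """Return a copy of a decoded JSON structure with its leaves wrapped."""
    if isinstance(node, dict):
        return {key: apply_types(val) for key, val in node.items()}
    if isinstance(node, list):
        return [apply_types(val) for val in node]
    # bool must be checked before int, because bool is a subclass of int
    if isinstance(node, bool) or node is None:
        return node
    if isinstance(node, str):
        return TypedString(node)
    if isinstance(node, (int, float)):
        return TypedNumber(node)
    raise TypeError(f"unexpected JSON value: {node!r}")


typed = apply_types(json.loads('{ "number": 5, "string": "5" }'))
```

A test can then assert on the wrapper class as well as the value, which is the whole point: "5" and 5 stop being interchangeable.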

I know this library needs some more work, and I need to build some test tools (maybe adding on to Test2::Compare?) to work with the structures I get back, but this has allowed me to test for (and then fix) a bunch of bugs, so I'm pretty happy with having gotten started.

HTML::MasonX::Free (body)

by rjbs, created 2016-05-24 21:45
tagged with: @markup:md journal

Who's ready to live in the past? Me!

Every time I try to like some other templating system in Perl, I fail. The only one I sort of like is Mason. No, no, not Mason 2. I don't like that. I like HTML::Mason. You know, Mason 1.

It has about a zillion problems, but the biggest problem, I think, is just its reputation. People think it's guaranteed to lead to some kind of awful "whole app written inside your templates," just because its original use case was "you can write your whole app inside your templates." But we believe in second chances, right?

For years now, I've wanted to write a Mason-inspired Mason replacement. I just haven't. I did, though, write a bunch of plugins to Mason to change how it behaved. They've made it a lot nicer to work with, and I thought I'd give a bit of a quick run-through on what they do. Maybe someone else will find them useful, although… well, I guess it could happen!

Stricter Component Interfaces

So, a typical Mason component might look like:


  <%method greeting>
  Hey, <% $name %>
  </%method>

    <head><title>Your face</title></head>
    <& SELF:body, name => $name &>

  <%method body>
    <div><& SELF:greeting, name => $name &></div>
  </%method>

  <!-- good night! -->

Even in this dumb contrived example, it can be hard to figure out the entry point. Basically, anything that isn't part of some other special block like <%method> or <%args> or <%def> is "the main thing that gets run." You could write your Perl programs like this, too, switching between the main code and subroutine definitions as you go, mixing them together, but you wouldn't. Right? No, you wouldn't.

Sometimes, we even encapsulate the main part of a program in sub main, like some other languages do. Then you run the program by calling main() at the end.

HTML::MasonX::Free::Compiler lets you do this with your Mason components. First, it forbids stray content. Everything must be inside a method or doc block (or similar structures), or the compiler barfs.

Then, when you render a component, there's a default method to call. So, if you call <& /some/component &> — which is what happens when you find and render a path — then it actually ends up calling /some/component:main. This forces a non-nesting structure where you're not interleaving a bunch of blocks inside of your main content.

Component Roots as Subclass Overlays

The Mason resolver maps component paths (which look like file paths) to components. In general it does that by looking through a file tree, but it can be more abstract, like in MasonX::Resolver::WidgetFactory. By default, though, it works like this:

Say you have three roots, /X and /Y and /Z. Then these two things exist:


…and then you call /vehicle/car.

Traditionally, Mason will look through the component roots and find the one in the first root. In this case, that's in /X. The component at /X/vehicle/car is then called. Calling (exec-ing) that component actually means walking up its ancestry to its inheritance root and calling that, which will then call $m->next until it gets back down to the actually-requested component.

This is nuts.

It made a bit of sense once upon a time when the default parent, autohandler, was used for things like permissions checks. I'm only using Mason for templates, though, so forget that! I want to use inheritance in a more traditional way, for a more specialized version of a general thing. For this, I wrote HTML::MasonX::Free::Resolver. It gets a list of roots, but they're treated like overlays.

I'll elaborate. In the standard configuration, /X/vehicle/car can never have a parent under /Z. The default tree is:

    -> /X/vehicle/autohandler
      -> /X/autohandler

With HTML::MasonX::Free::Resolver, we'd get:

    -> /Z/vehicle/car

And while traditional Mason would call its tree from the bottom up, ours calls from the top down. Since all our components have a main method, then a pretty simple thing to do is to have this in the "base" template /Z/vehicle/car:

<%method main>
This is a <% SELF:color %> <% SELF:type %> car.
</%method>

<%method color>grey</%method>
<%method type>motorized</%method>

…and in your "derived" template, /X/vehicle/car just:

<%method type>hybrid</%method>

This makes it easy to have a generic pack of templates that you customize on a per-install basis by adding a new root at the derived end of the list.
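
Stripped of Mason, the overlay idea is just method lookup across an ordered list of roots. Here is a toy model in Python, assuming a made-up data structure where each root maps component paths to named methods. It is not how HTML::MasonX::Free::Resolver is implemented, only the lookup rule it gives you.

```python
# Roots listed from most derived to most general. Each maps a component
# path to the methods that root defines for it. (Hypothetical data model.)
ROOTS = [
    ("/X", {"/vehicle/car": {"type": lambda call: "hybrid"}}),
    ("/Y", {}),
    ("/Z", {"/vehicle/car": {
        "main": lambda call: f"This is a {call('color')} {call('type')} car.",
        "color": lambda call: "grey",
        "type": lambda call: "motorized",
    }}),
]


def call_method(path, name):
    """Run `name` from the most derived root that defines it for `path`."""
    def dispatch(other):
        # methods call each other through the same overlay lookup
        return call_method(path, other)

    for _root, components in ROOTS:
        methods = components.get(path, {})
        if name in methods:
            return methods[name](dispatch)
    raise LookupError(f"{path}:{name} not found in any root")
```

Calling call_method("/vehicle/car", "main") finds main in /Z, which then picks up type from /X and color from /Z, giving "This is a grey hybrid car."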

One fun fact: the component roots in Mason aren't stored in the resolver, but in the interpreter, even though the resolver is the thing that does the resolving. In order to have HTML::MasonX::Free::Resolver be in charge of its roots, you have to put a special value into comp_roots to indicate, "yes, I realize this won't ever get used."

HTML Entity Encoding with Fewer Screw-ups

Say your template has this:

  <input value='<% $value %>' />

Well, you'd never do that, right, because you'd use a widget generator? But let's pretend you would. The other bug is that you probably didn't escape the entities in $value, so maybe there's an HTML injection attack there. You might have wanted:

  <input value='<% $value |html %>' />

That weird-o pipe thing is Mason's filtering syntax. You probably almost always want to entity encode things, so you might set the default_escape_flags on your compiler to html. Then, when you don't want to encode, you do this:

  <div><% $known_html |n %></div>

This means, "no escaping for this, please." The problem is that you might want to write a method that accepts a parameter that could be of either type. There's no default way to know, and if you get it wrong, you're screwed up. You can find yourself in that situation in a number of ways.

HTML::MasonX::Free::Escape provides a replacement for the default html filter that can be given an argument that is known to be HTML. You generate it by using the html_hunk routine, like this:

  % my $text = "D&D";
  % my $html = html_hunk("D&amp;D");
  I like playing <% $text %> and more <% $html %>.

The rendered text will encode $text without double-encoding $html. You also can't accidentally do this:

  % my $html = html_hunk("D&amp;D");
  % my $string = "My favorite game is $html.";

Or, rather, you can, but it will be a runtime error instead of a weird-o double encoding showing up somewhere.
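For the curious, the trick can be sketched in a few lines of plain Perl. This is only an illustration of the idea, not the code in HTML::MasonX::Free::Escape: My::HTMLHunk, as_html, and escape_html are hypothetical names, and this html_hunk is a stand-in for the real routine.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

{
  package My::HTMLHunk;   # hypothetical class name
  use Carp ();

  # Refuse to stringify, so interpolating a hunk into a string dies at runtime.
  use overload
    '""'     => sub { Carp::croak("HTML hunk used as a plain string") },
    fallback => 1;

  sub new     { my ($class, $html) = @_; return bless { html => $html }, $class }
  sub as_html { return $_[0]{html} }
}

# Stand-in for html_hunk: mark a string as already being HTML.
sub html_hunk { return My::HTMLHunk->new($_[0]) }

# Entity-encode plain text; pass known HTML through untouched.
sub escape_html {
  my ($value) = @_;
  return $value->as_html if blessed($value) && $value->isa('My::HTMLHunk');

  my %entity = (
    '&' => '&amp;', '<' => '&lt;', '>' => '&gt;',
    '"' => '&quot;', "'" => '&#39;',
  );
  (my $copy = $value) =~ s/([&<>"'])/$entity{$1}/g;
  return $copy;
}

print "I like playing ", escape_html("D&D"),
      " and more ", escape_html(html_hunk("D&amp;D")), ".\n";
```

Making stringification fatal is the design choice that turns a silent double encoding into a loud error at the point where the mistake is made.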

That's it!

So, these don't really make Mason an amazingly modern thing, but help sand down a few of its most obvious warts, and that's been good enough for me!

Test::PgMonger, which you should probably not use

by rjbs, created 2016-05-14 13:52

JMAP is a protocol that is meant to replace IMAP, CalDAV, CardDAV, SMTP, ACAP (ha ha), and probably some other protocols that aren't springing to mind. Like IMAP, it's meant to make it easy to synchronize offline work with an authoritative server. It does this by dividing up the data model into collections of discrete types, with each collection in a known and addressable state.

I'm not writing this post to talk about JMAP itself, though.

The JMAP model can be useful for things other than email, contacts, and that sort of thing. Why not make other things syncable in the same way? I've been writing a library to make this easy (or at least less difficult) to do. Given a DBIx::Class schema, my library Ix constructs a JMAP-like method dispatcher as well as a Plack application to publish it.

Ix is not even remotely ready for doing real work, so I'm not writing this post to talk about Ix, either.

Since an Ix application uses a database for storing all its entities, its test suite needed a database. I started out by using my usual strategy for testing simple database stuff: SQLite! I love SQLite. It is great. For each test, I could make a new SQLite database, deploy the DBIx::Class schema, and run tests. Then I'd delete the file. Done!

As the SQL that I was generating got more complex, I realized that using SQLite was no longer a good idea. It was great for getting started, but now I needed to run my tests against the same setup I'd have in production. I installed postgresql on my testing box and Postgres.app on my laptop. (By the way, have you seen Postgres.app? It runs Postgres, as you, on your Mac, just like a normal OS X app. It puts an elephant in the menu bar. Neat!) I still needed something to create and destroy my Postgres databases, though, since they weren't just files anymore.

I had a look at Test::Database, but it didn't do what I needed. I'll write (at least to BooK!) about the specific problems, but basically Test::Database's view of test databases is that they aren't nearly as single-use or disposable as what I wanted, and it wasn't easy to extend. Eventually, I wrote my own dumb little library, inspired by parts of Test::Database. It is called Test::PgMonger (pronounced "pig monger"), and it's stupid and effective.

The PgMonger object has credentials to PostgreSQL with permissions to create new users and databases. For now, I'm just assuming that localhost is trusted and I can use the postgres user. It uses those credentials to create a new user and a new database under a unique prefix. That database gets cleaned up at program exit, and there's a way to tell the PgMonger to kill all the databases that match its creation pattern, in case some escape deletion due to crashes or other screw-ups.
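The naming-and-cleanup scheme is simple enough to sketch without a database. Everything below is an assumption made for illustration (the prefix, the function names, and the exact SQL), not Test::PgMonger's real interface; it only builds names and statements and never connects to PostgreSQL.

```perl
use strict;
use warnings;

# Hypothetical prefix; every throwaway database and user name starts with it.
my $PREFIX  = 'pgmonger_test';
my $counter = 0;

# Build a name that is unique per process and per call.
sub fresh_name {
  return join '_', $PREFIX, $$, ++$counter;
}

# Statements a monger-like object might run as the privileged user.
sub create_statements {
  my ($name) = @_;
  return (
    qq{CREATE USER "$name"},
    qq{CREATE DATABASE "$name" OWNER "$name"},
  );
}

# Given the database names on the server, pick out the ones that match
# the naming pattern, so leftovers from crashed runs can be dropped.
sub leftover_databases {
  my (@all) = @_;
  return grep { /\A\Q$PREFIX\E_\d+_\d+\z/ } @all;
}

my $name = fresh_name();
print "$_;\n" for create_statements($name);
print "would drop: $_\n"
  for leftover_databases('postgres', $name, 'pgmonger_test_123_4');
```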

This is a really simple hunk of code, and even so it needs more refinement. Hopefully Test::Database can pick up the things I need so that I'm freed from thinking about this one-off thing. For now, though, this has made my testing really painless!

I went to the Perl QA Hackathon in Rugby!

by rjbs, created 2016-04-26 22:45
last modified 2016-04-29 08:22

I've long said that the Perl QA Hackathon is my favorite professional event of the year. It's better than any conference, where there are some good talks and some bad talks, and some nice socializing. At the Perl QA Hackathon, stuff gets done. I usually leave feeling like a champ, and that was generally the case this time, too.

I flew over with Matt Horsfall, and the trip was fine. We got to the hotel in the early afternoon, settled in, played some chess (stalemate) and then got dinner with the folks there so far. I was delighted to get a (totally adequate) steak and ale pie. Why can't I find these in Philly? No idea.

steak and ale pie!!

The next day, we got down to business quickly. We started, as usual, with about thirty seconds of introduction from each person, and then we shut up and got to work. This year, we had most of a small hotel entirely to ourselves. This gave us a dining room, a small banquet hall, a meeting room, and a bar. I spent most of my time in the banquet hall, near the window. It seemed like the easiest place to work. Although there were many good things about the hotel, the chairs were not one of them! Still, it worked out just fine.

The MetaCPAN crew were in the dining room, a few people stayed at the bar seating most of the time, and the board room got used by various meetings, most of which I attended.

The view over my shoulder most of the time, though, was this:

getting to work!

Philippe wasn't always there with that camera, though. Just most of the time.

I think my work falls into three categories: Dist::Zilla work, meeting work, and pairing work.

Dist::Zilla


Two big releases of Dist::Zilla came out of the QAH. One was v5.047 (and the two preceding it), which closed about 40 issues and pull requests. Some of those just needed application, but others needed tests, or rework, or review, or whatever. Amusingly enough, other people at the QAH were working on Dist::Zilla issues, so as I tried to close out the obvious branches, more easy-to-merge branches kept popping up!

Eventually I got down to things that I didn't think I could handle and moved on to my next big Dist::Zilla task for the day: version six!

My goal with Dist::Zilla has been to have a new major version every year or two, breaking backward compatibility if needed, to fix things that seemed worth fixing. I've been very clear that while I value backcompat quite a lot in most software, Dist::Zilla will remain something of a wild west, where I will consider nothing sacred, if it gets me a big win. The biggest change for v6 was replacing Dist::Zilla's use of Path::Class with Path::Tiny. This was not a huge win, except insofar as it lets me focus on knowing and using a single API. It's also a bit faster, although it's hard to notice that under Dist::Zilla's lumbering pace.

Karen Etheridge and I puzzled over some encoding issues, specifically around PPI. The PPI plugin had changed, about two years ago, to passing octets rather than characters to PPI, and we weren't sure why. Karen was convinced that PPI did better with characters, but I had seen it hit a fatal bug that using octets avoided. Eventually, with the help of git blame and IRC logs, we determined that the problem was... a byte order mark. Worse yet, a BOM on UTF-8!

When parsing a string, PPI does in fact expect characters, but does not expect that the first one might be U+FEFF, the space character used at offset 0 in files to indicate the UTF encoding type. Perl's UTF-16 encoding layers will notice and use the BOM, but the UTF-8 layer will not, because a BOM on a UTF-8 file is a bad idea. Rather than try to do anything incredibly clever, I did something quite crude: I strip off leading U+FEFF when reading UTF-8 files and, sometimes, strings. Although this isn't always correct, I feel pretty confident that anybody who has put a literal ZERO WIDTH NO-BREAK SPACE in their code is going to deserve whatever they get.
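The crude fix amounts to something like the following. This is a minimal sketch using the core Encode module, not the code that went into Dist::Zilla, and decode_utf8_sans_bom is a made-up name.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Decode UTF-8 octets to characters, then drop a leading U+FEFF if present.
sub decode_utf8_sans_bom {
  my ($octets) = @_;
  my $chars = decode('UTF-8', $octets, Encode::FB_CROAK);
  $chars =~ s/\A\x{FEFF}//;
  return $chars;
}

my $with_bom = "\xEF\xBB\xBF" . "my \$x = 1;\n";   # UTF-8 BOM, then code
my $chars    = decode_utf8_sans_bom($with_bom);
printf "first character is U+%04X\n", ord $chars;
```

Only a U+FEFF at the very start of the text is removed; one that appears later in the file is left alone.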

With that done, a bunch of encoding issues go away and you can once again use Dist::Zilla on code like:

my $π = 22 / 7;

This also led to some fixes for Unicode text in __DATA__ sections. As with the Path::Tiny change, a number of downstream plugins were affected in one way or another, and I did my best to mitigate the damage. In most cases, anything broken was only working accidentally before.

Dist::Zilla v6.003 is currently in trial on CPAN, and a few more releases with a few more changes will happen before v6 is stable.

Oh, and it requires perl v5.14.0 now. That's perl from about five years ago.

Meetings


I was in a number of meetings, but I'm only going to mention one: the Test2 meeting. We wanted to discuss the way forward for Test2 and Test::Builder. I think this needs an entire post of its own, which I'll try to get to soon. In short, the majority view in the room was that we should merge Test2 into Test-Simple and carry on. I am looking forward to this upgrade.

Other meetings included:

  • renaming the QAH (I'm not excited either way)
  • using Test2 directly in core for speed (turns out it's not a big win)
  • getting more review of issues filed on Software-License

A bit more about that last one: I wrote Software-License, and I feel I've done as much work on it as I care to, at least in the large. Now it gets a steady trickle of issues, and I'm not excited to keep doing it all myself. I recruited some helpers, but mostly nothing has come of it. I tried to rally the troops a bit to encourage more regular review, even if that just means each person giving a +1 or -1 on each pull request. Otherwise, Software-License is likely to languish.

Pairing


I really enjoy being "free floating helper guy" at QAH. It's something I've done a lot ever since Oslo. What I mean is this: I look around for people who look frustrated and say, "Hey, how's it going?" Then they say what's up, and we talk about the issue. Sometimes, they just need to say things out loud, and I'm their rubber duck. Other times, we have a real discussion about the problem, do a bit of pair programming, debate the benefits of different options, or whatever. Even when I'm not really involved in the part of the toolchain being worked on, I feel like I have been able to contribute a lot this way, and I know it makes me more valuable to the group in general, because it leaves me with more of an understanding of more parts of the system.

This time, I was involved in review, pairing, or discussion on:

  • fixing Pod::Simple::Search with Neil Bowers
  • testing PAUSE web stuff with Pete Sergeant
  • CPAN::Reporter client code with Breno G. de Oliveira
  • DZ plugin issues with Karen Etheridge
  • Log::Dispatchouli logging problems with Sawyer X
  • PAUSE permissions updates with Neil Bowers
  • PAUSE indexing updates with Colin Newell (I owe him more code review!)
  • improvements to PAUSE's testing tools with Matthew Horsfall
  • PPI improvements with Matthew Horsfall

...and probably other things I've already forgotten.

more hard work

Pumpking Updates

A few weeks ago, I announced that I'm retiring as pumpking after a good four and a half years on the job. On the second night of the hackathon, the day ended with a few people saying some very nice things about me and giving me both a lovely "Silver Camel" award and also a staggering collection of bottles of booze. I had to borrow some extra luggage to bring it all home. (Also, a plush camel, a very nice hardbound notebook, and a book on cocktails!) I was asked to say something, and tried my best to sound at least slightly articulate.

Meanwhile, there was a lot of discussion going on — a bit at the QAH but more via email — about who would be taking over. In the end, Sawyer X agreed to take on the job. The reaction from the group, when this was announced, was strong and positive, except possibly from Sawyer himself, as he quickly fled the room, presumably to consider his grave mistake. He did not, however, recant.

perl v5.24.0

I didn't want to spend too much time on perl v5.24.0 at the QAH, but I did spend a bit, rolling out RC2 after discussing Configure updates with Tux and (newly-minted Configure expert) Aaron Crane. I'm hoping that we'll have v5.24.0 final in about a week.

Perl QAH 2017

I'm definitely looking forward to next year's QAH, wherever it may be. This year, I had hoped to do some significant refactoring of the internals of PAUSE, but as the QAH approached, I realized that this was a task I'd need to plan ahead for. I'm hoping that between now and QAH 2017, I can develop a plan to rework the guts to make them easier to unit test and then to re-use.

Thanks, sponsors!

The QAH is a really special event, in that most of the attendees are brought to it on the sponsors' dime. It's not a conference or a fun code jam, but a summit paid for by people and corporations who know they'll benefit from it. There's a list of all the sponsors on the event page, including, but not limited to:

Raise a glass to them!

Dist::Zilla v6 is here (in trial format)

by rjbs, created 2016-04-24 04:38

I've been meaning to release Dist::Zilla v6 for quite a while now, and I've finally done it as a trial release. I'll make a stable release in a week or two. So far, I see no breakage, which is about what I expected. Here's the scoop:

Path::Class has been dropped for Path::Tiny.

Actually, you get a subclass of Path::Tiny. This isn't really supported behavior. In fact, Path::Tiny tells you not to do this. It won't be here long, though, and it only needs to work one level deep, which it does. It's just enough to give people downstream a warning instead of an exception. A lot of the grotty work of updating the internals to use Path::Tiny methods instead of Path::Class methods was done by Kent Fredric. Thanks, Kent!

-v no longer takes an argument

It used to be that dzil test -v put things in verbose mode, dzil test -v Plugin put just that plugin in verbose mode, and dzil -v test screwed up because it decided you meant test as a plugin name, and then couldn't figure out what command to run.

Now -v is all-things-verbose and -V is one plugin. It turns out that single-plugin verbosity has been broken for some time, and still is. I'll fix it very soon.

Deprecated plugins deleted

I've removed [Prereq] and [AutoPrereq] and [BumpVersion]. These were long marked as deprecated. The first two are just old spellings of the now-canonically-plural versions. BumpVersion is awful and nobody should use it ever.

PkgVersion can generate "package NAME VERSION" lines

So, now you can avoid deciding how to assign to $VERSION and add the version number directly to the package declaration. This also avoids the need to have any room for blank lines in which to add $VERSION.
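Side by side, the two styles look like this. The package names are invented for the example, and the "package NAME VERSION" form needs perl v5.12 or later.

```perl
use strict;
use warnings;

# Older style: the version is assigned to a package variable.
{
  package Old::Style;   # hypothetical package
  our $VERSION = '1.234';
}

# "package NAME VERSION": the version rides along on the declaration,
# so no separate assignment line is needed.
package New::Style 1.234;   # hypothetical package

package main;

print "Old::Style is at ", Old::Style->VERSION, "\n";
print "New::Style is at ", New::Style->VERSION, "\n";
```

Both packages answer ->VERSION the same way, so code that asks for the version can't tell which style was used.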

Dist::Zilla now requires v5.14.0

Party like it's 2011.

the "credit the last uploader" problem

by rjbs, created 2016-02-12 09:16
last modified 2016-02-12 09:40
tagged with: @markup:md cpan journal perl

First, a refresher…

At its simplest, the CPAN is a bunch of files and an index. The index directs you from package names to the files that contain the latest authorized release of that package. Everything else builds on top of that.

If you want to publish Foo::Bar to the CPAN, you need to use PAUSE. PAUSE manages users and permissions, authenticates users, accepts uploads, and then decides how and whether to index them. To make those indexing decisions, first PAUSE analyzes an uploaded file to see what packages it contains. Then it compares those packages to the permissions of the uploading user. If the user has permission, and if the uploaded package is later-versioned than the existing indexed package, the package is indexed.

I have skipped some details, but I believe that for the purpose of everything else I'm going to write about, this is a sufficient explanation.

To get permissions on a package that isn't indexed at all, you upload it. Then you have permissions. If you want to work with a package that already exists, the person who uploaded it needs to give you permission. There are two kinds of permission:

  • first-come: you're the person who first uploaded it, or the person to whom that person has handed over the keys; there is only one first-come user per package; you can upload new versions and you can assign and revoke co-maint permissions
  • co-maint: you are permitted to upload new versions, but you may not alter the permissions of the package
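The rules so far fit in a short toy model. This is a sketch of the simplified description above, not PAUSE's actual code or schema; the data layout and the names index_upload and grant_comaint are assumptions made for the example.

```perl
use strict;
use warnings;
use version;

# Toy data, not PAUSE's real schema.
my %perms = ('Pie::Packer' => { ALICE => 'first-come', BOB => 'co-maint' });
my %index = ('Pie::Packer' => '1.233');   # latest indexed version

# Handle one package from an upload; returns true if it gets indexed.
sub index_upload {
  my ($user, $package, $version) = @_;

  # Uploading an unclaimed package makes the uploader its first-come user.
  $perms{$package} //= { $user => 'first-come' };

  # The uploader needs some permission on the package...
  return 0 unless $perms{$package}{$user};

  # ...and the upload must be later-versioned than what is indexed.
  return 0 if defined $index{$package}
    && version->parse($version) <= version->parse($index{$package});

  $index{$package} = $version;
  return 1;
}

# Only the first-come user may hand out co-maint.
sub grant_comaint {
  my ($granter, $package, $grantee) = @_;
  return 0 unless exists $perms{$package}
    && ($perms{$package}{$granter} // '') eq 'first-come';
  $perms{$package}{$grantee} //= 'co-maint';
  return 1;
}

print index_upload('BOB', 'Pie::Packer', '1.234') ? "indexed\n" : "skipped\n";
```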

The Complaint

When you view code on MetaCPAN or search.cpan.org, one of the most visible details is the name (and avatar) of the last user to have uploaded that package. This creates a strong impression that this is the contact point for the package. Sometimes, this is true, or true enough. On the other hand, sometimes it's not, and that's a problem. It may be that the last person to upload the library only did so as a one-off act, or that they were a member of the team working on a project years ago when it was last released. Now, though, they will be boldly listed as the contact person.

Here's a scenario:

  • in 2002, a library, Pie::Packer, is uploaded by Alice and is popular for a while
  • in 2008, Bob finds a bug and finds that Alice isn't really working on Perl anymore; Bob offers to do a release for just this bug fix
  • Alice gives Bob co-maint on Pie::Packer
  • Bob uploads Pie::Packer v1.234, the only release he ever plans to make
  • from 2008 through 2016, Bob is sent requests for help with Pie::Packer

Bob can't just pass on permissions to stop it. He can give up permissions, but he'll still be the last uploader.

You might object: "Alice should have given Bob first-come! Then he could pass along permissions!"

This is true. Maybe in 2010, Bob gives permissions to Charlotte... but now Charlotte is stuck in the same position. If nobody ever comes along to take it over, Charlotte can't usefully get out from under the distribution.

Half a Solution

In 2013, the QA Hackathon led to a consensus about a mechanism for permission transitions. It goes something like this:

  • give user "ADOPTME" co-maint to indicate that first-come permissions can be given to someone who wants them, and you don't need to be consulted
  • give user "HANDOFF" co-maint to indicate that you're looking to pass along first-come to someone else, but they should go through you

(The third magic user, "NEEDHELP," is not relevant to the topic at hand.)

Marking a library with ADOPTME or HANDOFF is useful in theory, but not in practice, because it's almost impossible to know that it has happened. Yesterday, I filed a bug about making ADOPTME/HANDOFF visible on MetaCPAN, and I think it's critically important to making ADOPTME/HANDOFF worth having.

So, why is this section headed "half a solution"?

Because this solution helps you if you have first-come, but not if you have co-maint. Imagine poor Bob, above, in 2016. By this point, Alice has moved off the grid and can't be contacted. Bob can't mark the dist as ADOPTME. He can ask the PAUSE admins to do so, but that's it. It's also a bit of a burden to put onto the PAUSE admins, who may not know whether Bob has really made a good faith effort to contact Alice.

The final remaining problem is this: There is no escape hatch for someone who has co-maint permissions and wants to get out from under the shadow of an unwanted upload.

The Simplest Thing That Could Possibly Work

This problem could be solved by adding a "GitHub Organizations"-like layer to PAUSE… but I think there's a much, much simpler mechanism.

We should always treat the first-come owner as the authoritative source, including when displaying a distribution on the web. MetaCPAN Web should stop showing the name and image of the latest uploader as prominently, and should show the first-come user instead. The same goes for search.cpan.org and other such sites. MetaCPAN already has a place for listing other contributors, which should contain the last uploader. Adding a note like "last upload by BOB" seems okay, too, but the emphasis should be on connecting the distribution with the one person who can actually make decisions about its future.
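In display terms, the proposal boils down to something like this. It's a toy sketch: the record layout and the contact_line function are invented for illustration and don't reflect how MetaCPAN stores or renders anything.

```perl
use strict;
use warnings;

# Toy record for one distribution; the field names are made up.
my %dist = (
  name          => 'Pie-Packer',
  last_uploader => 'BOB',
  perms         => { ALICE => 'first-come', BOB => 'co-maint' },
);

# Headline the first-come user, and mention the last uploader only
# as a secondary note.
sub contact_line {
  my ($dist) = @_;
  my ($owner) = grep { $dist->{perms}{$_} eq 'first-come' }
                sort keys %{ $dist->{perms} };

  my $line = "$dist->{name}: maintained by " . ($owner // 'nobody');
  $line .= " (last upload by $dist->{last_uploader})"
    if defined $dist->{last_uploader}
    && $dist->{last_uploader} ne ($owner // '');
  return $line;
}

print contact_line(\%dist), "\n";
```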
