rjbs forgot what he was saying

I closed a lot of browser tabs

by rjbs, created 2021-08-08 11:39
last modified 2021-08-09 10:54
tagged with: @markup:md journal programming

I am widely admired at work for my ability to have many, many browser tabs open. (That, at least, is what I take from the frequent shouts of "holy cow, man, look at your browser!") Nonetheless, I have long thought that it would be worth getting my total tab count down. I have tabs open for a bunch of reasons.

  • document I meant to read eventually
  • document I meant to read in time for some meeting
  • web application I always keep open
  • thing I am in the middle of working on but should finish at some point
  • thing I was reading or editing but am basically done with
  • duplicate tab of any of the above

One problem here is that once the count of tabs is large enough, "clean up tabs" starts with a whole step of "figure out which tab is what kind of tab". Can I just close it because I'm done? Is it a thing I should read? Soon? Someday? I need to improve my real-time habits: closing things when done, grouping things by purpose, and so on. Until then, though, I need to see when I've let things get out of control.

For years, I've thought, "I should visualize my tab count over time." Once or twice I even wrote programs to help, but about a week ago I finally made the whole thing go. When I tweeted about it, I got a couple of "how'd you do that?" replies, so I thought I'd write it up. It's simple, except sort of not.

Here's the output:

a graph of my browser tabs

If you can't tell, this graph is generated by Grafana, a data visualizer that can produce graphs and many other visualizations of data from all kinds of systems. So, I needed to set up Grafana. That was easy: I used their free cloud hosting, but setting up Grafana on a Linode is also dead simple. Basically, you install it and tell it where to get the data. There are a lot of Grafana tutorials.

I have my Grafana pointed at Prometheus. Prometheus is a time series database that gathers data by hitting HTTP endpoints. To grossly simplify, if you want your application to expose metrics to be gathered and later analyzed, you give it an HTTP listener that replies with data in a specific format. There are a lot of Prometheus tutorials.

What I needed to do was to provide an HTTP service that would provide tab counts. I picked the simplest possible way to provide that count, meaning a response like this:

Content-Type: text/plain

chrome_open_tabs 234
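A minimal sketch of producing that kind of response body in JavaScript. The function name `renderTabMetrics`, the second metric `chrome_window_tabs`, and its `window` label are assumptions made up for illustration; only `chrome_open_tabs` comes from the example above. The `# TYPE` lines and the `name{label="value"} number` syntax follow the Prometheus text exposition format.

```javascript
// Render per-window tab counts in the Prometheus text exposition format.
// "chrome_open_tabs" is the metric from the post; "chrome_window_tabs" and
// its "window" label are hypothetical additions for per-window graphing.
function renderTabMetrics(tabCounts) {
  const total = tabCounts.reduce((sum, n) => sum + n, 0);
  const lines = [
    "# TYPE chrome_open_tabs gauge",
    `chrome_open_tabs ${total}`,
    "# TYPE chrome_window_tabs gauge",
  ];
  tabCounts.forEach((count, index) => {
    lines.push(`chrome_window_tabs{window="${index}"} ${count}`);
  });
  return lines.join("\n") + "\n";
}

console.log(renderTabMetrics([200, 30, 4]));
```

A gauge is the natural metric type here, since a tab count goes down as well as up.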

First problem: how to count tabs?

When I originally looked at doing this, years ago, I was using Firefox. I tend to switch back and forth between Chrome and Firefox every few years. In Firefox, it was pretty easy to get tab counts. In the profile directory, there's a file called something like sessionstore-backups/recovery.jsonlz4. The exact place this lives has changed over time, but generally there has been a JSON file in your profile that you could read to see tabs.
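As a rough sketch of that Firefox approach: once the session file has been decompressed and parsed (not shown here, and the file's location and compression have changed over time), counting tabs is a short walk over the data. The shape assumed below, a top-level `windows` array whose entries each have a `tabs` array, is an assumption about the session format, and `countSessionTabs` is a made-up name.

```javascript
// Count tabs in an already-parsed Firefox session object.
// Assumes the shape { windows: [ { tabs: [...] }, ... ] }; anything missing
// or malformed is treated as contributing zero tabs.
function countSessionTabs(session) {
  const windows = Array.isArray(session.windows) ? session.windows : [];
  return windows.reduce(
    (total, win) => total + (Array.isArray(win.tabs) ? win.tabs.length : 0),
    0
  );
}

// Made-up example data: two windows with three tabs and one tab.
const exampleSession = {
  windows: [
    { tabs: [{}, {}, {}] },
    { tabs: [{}] },
  ],
};

console.log(countSessionTabs(exampleSession));
```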

Chrome stores its sessions in a seemingly obnoxious binary format called SSNS. I looked at decoding it and groaned. I knew I could write a browser extension to get at the tab counts, but I had a vague sense of unease about two things. First, I wasn't sure I'd have any means to embed a web server in Chrome to serve this data. That meant I'd want to write the data to a file to be served by something else. That gets to my second concern: I doubted I'd be able to reliably write the tab count to a file from an extension.

I had a flash of inspiration, though. If, by some great mercy, Chrome was automatable by AppleScript, I could make a go of it. I opened up the macOS Script Editor, hit Cmd-Shift-O to "Open Dictionary", and looked for Google Chrome. It was there! Its automation suite is tiny, but it exposes windows and tabs. I was able to write this AppleScript:

set tabcount to 0
if application "Google Chrome" is running then
  tell application "Google Chrome"
    repeat with w in windows
      set tabcount to tabcount + (count of tabs of w)
    end repeat
  end tell
end if

AppleScript is weird, but I have had decent success in using it for lots of little tasks in the past. These days, what I tend to do is prove something will work in AppleScript, then port it to JavaScript, using JXA. JXA is "JavaScript for Automation", which is just "what if you could write JavaScript instead of AppleScript to do the same stuff?" This is pretty appealing! JavaScript is a much less weird language, and you can combine your macOS automation with other code you've written in JavaScript.

It's not perfect, though. The objects you get to represent AppleScriptable entities are pretty clumsy. They don't let you get a list of their properties, and if you make a guess, it won't help much. object.foo will always evaluate to a function, but that function may throw a "no such function" exception when called. Array-like objects aren't iterable, so you'll do a lot of looping over indexes instead of iterating over values. Still, this isn't so bad:

let tabcount = 0;
const Chrome = new Application("Google Chrome");

// Only ask Chrome for its windows if it's already running.
if (Chrome.running()) {
  for (const i in Chrome.windows) {
    tabcount += Chrome.windows[i].tabs.length;
  }
}

That "check if Chrome is running" step is important. On one hand, it would be nice to act like quitting Chrome didn't really eliminate the mental weight of its tabs. On the other, using AppleScript to talk to an application that isn't running will launch that application, and I sure don't want that.

Second problem: how to spin up the HTTP service?

Now I had a means to count tabs. The code I actually wrote was a little different, because I gathered an array of per-window tab counts, in case I wanted to graph that, too, but it was a lot like the code above. Now I needed to get it running regularly. The most obvious option for this was to have it run under launchd, the macOS service manager. This would be very sensible, but would require I think about launchd configuration, which I don't like doing. I thought about setting up daemontools to run things out of my home directory. I'd have to run it from launchd, but having set that up once, I wouldn't have to think about it again. I didn't even want to think about it once, though!

I had another weird realization. I use a program called Hammerspoon, which is sort of an all-purpose tool for doing macOS automation. I use it to inject some menu bar icons that reorganize my desktop or run timers, and to set up keyboard shortcuts for a few things. Among its many other functions, it has facilities for embedded HTTP service. It can also run JavaScript using JXA. I wrote this:

tabulator = hs.httpserver.new()
tabulator:setPort(9876)
tabulator:setCallback(function (method, path, headers, body)
  if (method == "GET") and (path == "/metrics") then
    -- Ask Chrome (via JXA) for a list of per-window tab counts.
    bool, tabcounts, descriptor = hs.osascript.javascript([[
      const Chrome  = new Application("/Applications/Google Chrome.app");

      let tabCounts = [];

      if (Chrome.running()) {
        for (i in Chrome.windows) {
          tabCounts.push(Chrome.windows[i].tabs.length);
        }

        tabCounts.sort((a,b) => b - a);
      }

      tabCounts;
    ]])

    if not bool then
      return "Error\n", 500, {}
    end

    local sum = 0
    for i, tabs in ipairs(tabcounts) do
      sum = sum + tabs
    end

    return "chrome_open_tabs " .. sum .. "\n", 200, {}
  end

  return "No good.\n", 404, {}
end)

tabulator:start()

This creates a new HTTP listener on port 9876. If it receives a GET request for /metrics, it runs my JXA to ask Chrome (if running) about its tab count. The hs.osascript.javascript function returns a tuple, and the second item in it is the final statement of the JavaScript code, where we've ended with tabCounts. If the code ran without an error, I return my metrics in a 200 response.

To collect this locally and not let just anybody on the internet trigger this JavaScript running, I have a locally running Prometheus instance on my MacBook. It hits this endpoint and then relays the results to my Prometheus instance in the cloud. Grafana looks at that and shows me my tab count. When it's in the red, I sigh, look through my tabs, and close what I can.
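Before pointing Prometheus at an endpoint like this, it can help to check the response by hand. The sketch below parses only the simple subset of the text format used here (one `name value` pair per line); labels, timestamps, and escaping are deliberately ignored. The function name `parseSimpleMetrics` and the threshold of 100 are assumptions for illustration, not values from my dashboard.

```javascript
// Parse "name value" lines, skipping blank lines and "#" comment lines.
// This handles only the unlabelled form shown in this post.
function parseSimpleMetrics(text) {
  const samples = {};
  for (const rawLine of text.split("\n")) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const [name, value] = line.split(/\s+/);
    samples[name] = Number(value);
  }
  return samples;
}

// A made-up response body in the same shape the endpoint returns.
const exampleBody = "# TYPE chrome_open_tabs gauge\nchrome_open_tabs 234\n";
const metrics = parseSimpleMetrics(exampleBody);

// 100 is an arbitrary, hypothetical "too many tabs" threshold.
console.log(metrics.chrome_open_tabs > 100 ? "in the red" : "fine");
```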

You can see my tab count crashing a few times. First, I closed obvious duplicates or dead documents. Later, I finished easy tasks represented by open tabs. On the weekend, I read a lot of backlogged articles and closed them. Now I'm around 30 tabs, which seems like it's probably about the right number for me.

I meant to build this to help me get better at doing things in Prometheus and Grafana, but I think mostly it was just sort of fun weird general purpose programming, and I enjoyed it. It was a nice reminder that lots of tedious problems have silly solutions.

what CPAN code did I install when?

by rjbs, created 2021-07-03 13:56
last modified 2021-07-04 12:14

When I upgrade my perl, which I do pretty often, the first thing I do is install Task::BeLike::RJBS (by running cpanm rjbs). This installs most of the stuff I'm going to need to do my normal work. Over time, I tend to find that it needs an update, because over the course of the last year or so I started using some new libraries that didn't get into the bundle. (This will happen less now that I'm using the monthly blead snapshots day to day again, but it's a real thing.)

I don't use plenv's "install everything I had before onto the new one," because I want to keep track of what I install every time. That means that for the first few days after installing a new perl, I end up having to install some library that's not there when I go to run some program that I run now and then. When that happens, I don't want to pull up a notepad and write down what's missing from my bundle. Instead, I wrote a little program to look at my installation history and show me clusters of installed libraries. After a week or two, I look at the output from this program and consider updating my bundle accordingly.

Here it is:

  use v5.34.0;
  use warnings;

  use File::stat;
  use Term::ANSIColor;

  my @perl_inc = `perl -E 'say for grep { m{/.plenv/versions/} } \@INC'`;
  chomp @perl_inc;

  my @lines = `find @perl_inc -name MYMETA.json`;
  chomp @lines;

  my %mtime;

  # Record the install time of each distribution, keyed by its directory name.
  for my $line (@lines) {
    my ($dist) = $line =~ m{/([^/]+)/MYMETA.json\z};
    my $mtime  = stat($line)->mtime;
    $mtime{$dist} = $mtime;
  }

  # Print dists in install order, with a header whenever more than an hour
  # passed since the previous install.
  my $prev = 0;
  for my $dist (sort { $mtime{$a} <=> $mtime{$b} } keys %mtime) {
    my $mtime = $mtime{$dist};
    if ($mtime - $prev > 3600) {
      print "\n";
      printf "%s %s %s\n",
        colored(['bright_cyan'], '==['),
        colored(['bright_yellow'], scalar localtime $mtime),
        colored(['bright_cyan'], ']==');
    }
    $prev = $mtime;
    say "$dist";
  }
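The clustering idea in that program is independent of Perl: sort by time, and start a new group whenever the gap since the previous item exceeds a threshold. Here is a small JavaScript sketch of just that step. The name `clusterByGap` and the sample data are made up; the 3600-second default mirrors the Perl program.

```javascript
// Group [name, mtime] pairs into clusters, starting a new cluster whenever
// more than `gapSeconds` passed since the previous item. Times are epoch
// seconds.
function clusterByGap(entries, gapSeconds = 3600) {
  const sorted = [...entries].sort((a, b) => a[1] - b[1]);
  const clusters = [];
  let prev = -Infinity; // guarantees the first entry opens a cluster
  for (const [name, mtime] of sorted) {
    if (mtime - prev > gapSeconds) {
      clusters.push({ startedAt: mtime, names: [] });
    }
    clusters[clusters.length - 1].names.push(name);
    prev = mtime;
  }
  return clusters;
}

// Made-up install times: two installs close together, one much later.
const installs = [
  ["JSON-MaybeXS", 9000],
  ["Moose", 1000],
  ["Dist-Zilla", 1200],
];

console.log(clusterByGap(installs));
```

Note that the gap is measured from the previous item, not from the start of the cluster, so a long run of installs spaced a few minutes apart stays in one group.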

adding the perl-support section to my CPAN modules

by rjbs, created 2021-07-02 19:39
last modified 2021-07-04 12:14

I have quite a few Perl software libraries available on the CPAN. I've written these at different points over the last twenty years, but almost all of them, until pretty recently, were written to support perl v5.8, which was released about 18 years ago. Perl v5.10 took almost four years to come out after v5.8. It had some teething problems, and then v5.12 took another three years.

During all this, with a lot of people pretty firmly entrenched in v5.8, it became pretty normal to assume that your users might very well be stuck on v5.8. That meant it was a courtesy to support that version of perl. Over the years, though, this didn't seem to move forward too much. Perl v5.8 seems to remain a pretty standard default version to target. I find this annoying, and I am going to be mostly ignoring this convention.

I don't think I really need a lot of justification for this, but I'll provide just a little. When I write code, I want to write it in one way. To me, the reason that Perl's "more than one way to do it" is great is that I can pick the one I like best, then always do that. Over time, better ways become possible, and I pick those. I don't write in every possible way, I write in one. The more versions of perl I support, the more I need to think about how to write something beyond "the way I most like to write things."

Being stuck on an old version of perl is a pain. I have had production systems stuck on old perls, and it's not fun. Sometimes, though, that's just how it is. It's part of dealing with the realities of operating software systems. When I get stuck in that situation, I don't expect all the new releases of code that I use to still work on my old perl. It's nice if they do, but if they don't, it's just more of the debt I'm incurring by virtue of being stuck. It's my problem, and not the problem of people who write new versions of things.

In the Perl Toolchain world, home to things like "that program that runs your tests and installs your code", our rule is that we must support very old perls. This is reasonable, because this code is the thing that everybody wants to rely on. Possibly there are a small number of distributions I maintain that are sufficiently relied-upon that it would be important for new releases to keep working on old perls. Mostly, though, I think this is a problem best foisted downstream to operating systems still supporting and shipping ancient perls.

A while ago, I bumped up the minimum version on a few of my libraries to v5.20. (That version came out in 2014, so this is something like saying you support MSIE 11 but not older, to make a somewhat sloppy comparison.) Some of the pushback I got was, "The normal thing is to support 5.8, and you gave no notice you might not!" That's fair. It's especially fair if there are some libraries where I think I'd obviously keep 5.8 working, some where I'd be happy to require the latest monthly release, and some where I think there's some reasonable medium!

So, I have made a plan and written a little code, and it goes like this:

I added a new configuration option to my Dist::Zilla bundle, to let me declare how far in the past I'm prepared to live. I called this setting perl-support, but I think I'll end up changing that. It's not really related to me supporting anything. It's just about my intentions toward changing prerequisites.

I started with these options:

  • standard - If code won't work on every version of perl still in its perl5-porters maintenance period, I won't ship it. That means that once a perl is about three years old, I won't worry about it. I don't actually expect to use this one very often, so possibly I'll rename it.
  • long-term - I won't require a perl less than five years old.
  • extreme - Extreme as in "extremely long term". I won't require a perl less than ten years old.
  • toolchain - I will abide by the Lancaster Consensus, meaning I'll stick to whatever the toolchain promises to support. Right now, that's v5.8.1.
  • no-mercy - I do what I want.

Over the past few weeks, I've been adding this setting to my bundle. It does a couple things, but by far the most important is add some text to my documentation that says something like this:

This module has a long-term perl support period. That means it will not require a version of perl released fewer than five years ago.

Although it may work on older versions of perl, no guarantee is made that the minimum required version will not be increased. The version may be increased for any reason, and there is no promise that patches will be accepted to lower the minimum required perl.

I don't actually plan to start cranking up the required versions of things just because, but when I do work on my code, I do want to modernize it. Now I'm just setting a guideline to myself about how modern I'll go.

At the time I'm writing this, here's what level means what perl:

  • standard - v5.32.0 (or v5.30.0 if I go by security support)
  • long-term - v5.24.0
  • extreme - v5.14.0
  • toolchain - v5.8.1
  • no-mercy - v5.35.1? Blead? Only my own private build? I do what I want!
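The age-based levels in the list above boil down to "the newest release that is at least N years old". A minimal JavaScript sketch of that rule follows. The function name, the table layout, and the choice of releases are assumptions for illustration; the release dates are believed to match perlhist but should be double-checked there before being relied on.

```javascript
// A few stable perl releases, oldest first, with release dates.
// Dates are believed to match perlhist; treat them as illustrative.
const RELEASES = [
  ["v5.12.0", "2010-04-12"],
  ["v5.14.0", "2011-05-14"],
  ["v5.24.0", "2016-05-09"],
  ["v5.26.0", "2017-05-30"],
  ["v5.32.0", "2020-06-20"],
];

// Years of history promised by each age-based level from the post.
const SUPPORT_YEARS = { "long-term": 5, "extreme": 10 };

// Return the newest listed release that is at least N years old on `today`.
function newestRequirablePerl(level, today) {
  const years = SUPPORT_YEARS[level];
  if (years === undefined) throw new Error(`unknown level: ${level}`);
  const cutoff = new Date(
    Date.UTC(today.getUTCFullYear() - years, today.getUTCMonth(), today.getUTCDate())
  );
  let answer = null;
  for (const [version, released] of RELEASES) {
    if (new Date(released) <= cutoff) answer = version;
  }
  return answer;
}

const writtenOn = new Date("2021-07-02");
console.log(newestRequirablePerl("long-term", writtenOn));
console.log(newestRequirablePerl("extreme", writtenOn));
```

The standard and toolchain levels depend on policy (the maintenance window and the Lancaster Consensus) rather than on a simple age, so they are left out of this sketch.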

So far, I've been marking most of my code long-term. I'm pretty happy with that. v5.24.0 is the last version that I helped produce as pumpking, so it's a bit like I'm saying "I really want to use all the features I was helping to get out the door." And I do really want to!

adding some safety to git force-push

by rjbs, created 2021-06-15 12:29
last modified 2021-07-04 12:14

Just the other day, I wrote about my little git branch manager tool. I use it to make sure I know what branches I have lying around, and to delete branches that have already been merged. I wrote the post because I had updated the code to work on more kinds of repository.

I updated the code to work on more kinds of repository because I wanted to write a different program to do something similar. See, about a week ago, I did a force-push that I shouldn't have. The history looked something like this:

  * [733401c] (github/rewrites) add the EIGHT file
  * [036a97a] SIX: rewrite file number six
  * [5d8e998] FIVE: rewrite file number five
  * [40778dd] FOUR: rewrite file number four
  * [624f233] THREE: rewrite file number three
  * [249af25] TWO: rewrite file number two
  * [2ac9796] ONE: rewrite file number one
  | * [d2e2cfb] (HEAD -> rewrites) add the ZERO file
  | * [2fadf99] SIX: rewrite file number six
  | * [6a2218f] FIVE: rewrite file number five
  | * [bbbe25a] FOUR: rewrite file number four
  | * [adda952] THREE: rewrite file number three
  | * [5d7c662] TWO: rewrite file number two
  | * [f656f0f] ONE: rewrite file number one
  | * [4623ca5] (github/main, main) add a seventh file for good luck
  * [4c74ba5] add some random content

What you'll see here is that there's a common base, then two versions of the same branch. One version (the local one) has been rebased on main. The other one is up on GitHub. Both of them have had new commits added. If you did a force-push of the rewrites branch, you'd clobber the "EIGHT" commit.
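To make "what would I clobber?" concrete, here is a small JavaScript sketch that compares the two versions of the branch and lists remote commits with no equivalent locally. The `patchId` field is a stand-in for whatever identifies "the same change" across a rebase (for example, the output of `git patch-id`); the commit subject is used for it here purely for illustration, and all names are hypothetical.

```javascript
// Find remote commits that a force-push would discard: those whose patch
// has no equivalent on the local branch.
function commitsLostByForcePush(remoteCommits, localCommits) {
  const localPatches = new Set(localCommits.map((c) => c.patchId));
  return remoteCommits.filter((c) => !localPatches.has(c.patchId));
}

// Model the history shown above: six shared rewrites, then one extra
// commit on each side.
const numbers = ["ONE", "TWO", "THREE", "FOUR", "FIVE", "SIX"];
const shared = numbers.map((n) => ({ patchId: n, subject: `${n}: rewrite` }));
const remoteBranch = [...shared, { patchId: "EIGHT", subject: "add the EIGHT file" }];
const localBranch = [...shared, { patchId: "ZERO", subject: "add the ZERO file" }];

console.log(commitsLostByForcePush(remoteBranch, localBranch).map((c) => c.subject));
```

Commit hashes can't be used for this comparison, because rebasing changes every hash even when the change itself is identical.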

To avoid screwing up a force push, the usual advice includes "use --force-with-lease", which is good advice. It means "don't replace the remote if it's different than my fetch of it." That protects you against having it changed out from under you. It still means you have to read and compare the version you see with the version you want to push.

What I wanted was to avoid having to think about that whenever possible, and I knew there was a way to do it.

I wrote a new program, git-publish, that tries to turn the problem into this one:

  • While working on the foo branch, I run git publish
  • If there is no foo branch on my remote, my local branch is published to my remote.
  • If there is a remote branch foo, and my changes are a fast forward, they're pushed up.
  • If there's a remote branch foo, and my copy has rebased it and then (maybe) added commits to the end, a force-with-lease push is made.

Already, this is a pretty good win, covering some of the "is it safe?" questions in code so that I don't need to. The last case is "the difference is more than a rebase", and what I wanted was a means to see that explained. So, the last option is:

  • If the remote branch exists and my local branch doesn't just rebase-and/or-add, then an explanation of the changes (and non-changes) is printed, showing me what I'd really be doing if I force-pushed. I can instruct the program to continue (and it will force-with-lease push) or I can abort and sort things out by hand.
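The cases above amount to a small decision table. This is a sketch of that table only, not the real git-publish code; the field names and action strings are hypothetical, and working out the three facts from the repository is the hard part that is not shown.

```javascript
// Decide what a "git publish" should do, given three facts about the branch:
//   remoteExists          - the branch already exists on the remote
//   isFastForward         - local simply adds commits to the remote version
//   isRebasePlusAdditions - local is the remote version rebased, plus
//                           (maybe) new commits at the end
function choosePublishAction({ remoteExists, isFastForward, isRebasePlusAdditions }) {
  if (!remoteExists) return "push";
  if (isFastForward) return "push";
  if (isRebasePlusAdditions) return "force-with-lease";
  return "explain-and-confirm";
}

console.log(choosePublishAction({ remoteExists: false }));
console.log(
  choosePublishAction({ remoteExists: true, isFastForward: false, isRebasePlusAdditions: true })
);
console.log(
  choosePublishAction({ remoteExists: true, isFastForward: false, isRebasePlusAdditions: false })
);
```

Only the last branch needs a human, which is the point: the safe cases are decided in code.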

I'll probably refine this last step a bit more. I think I can at least add a step that's "cherry pick commits from the remote that aren't on this branch", and I'd put them after the common commits but before the new commits on my local branch. Even without that, though, this is a pretty nice little tool. I hope to use it more than git push from now on.

screenshot of git-publish

my git branch manager

by rjbs, created 2021-06-13 10:30

Like a lot of people, I have not been great at cleaning up my old git branches over time. Sometimes they get merged but I don't delete them. (The "delete branch after merge" option in GitLab and GitHub helps, but it's not 100%.) Sometimes I forget I even had a branch, because I never filed a pull request. Also, with all those already-merged branches lying around, it's easy to overlook the not-even-requested branches, especially if I haven't touched them in a while.

The problem is that when you (or I, in this case!) work on a team with other people, they routinely fetch from your remote, and you're effectively cluttering up their clone with a bunch of dead branches. Sure, they can just ignore it, but it's just a little rude. Or, at least, it would be nicer to keep things tidier.

Around a year ago, I wrote a tool to help me clean up (and track) my branches in the primary Fastmail repository. It's got two parts: git-work-status and git-scrub-branches.

git-work-status tells me about all my branches, both local and remote, and whether they're rebased on the primary branch and whether the local and remote have the same head. It tells me which of the branches have an open merge request (the GitLab version of a pull request), whether they're approved, and also shows some labels of note.

git-scrub-branches tries to detect branches that can be destroyed, by doing this, more or less:

  1. fetch from the primary remote ("origin", for example)
  2. fetch from my personal remote
  3. detect whether any branches, local or remote, have already been merged
  4. rebase all branches
  5. detect whether any branches become zero-commit when rebased; they're merged!
  6. delete any known-merged branches
  7. update remaining remote branches to the rebased version

…and then running git-work-status to show where I ended up.
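The decision part of the steps above (everything after the fetching and rebasing) can be sketched as a pure function. This is an illustration of the logic, not the actual tool; the record fields and the name `planScrub` are hypothetical.

```javascript
// Plan cleanup for branches that have already been rebased.
// Each branch record is hypothetical:
//   merged             - detected as merged before rebasing (step 3)
//   commitsAfterRebase - commits left on the branch after rebasing (step 5)
//   hasRemote          - whether a remote copy exists to update (step 7)
function planScrub(branches) {
  const plan = { toDelete: [], toUpdate: [] };
  for (const branch of branches) {
    if (branch.merged || branch.commitsAfterRebase === 0) {
      // Zero commits after a rebase means everything already landed.
      plan.toDelete.push(branch.name);
    } else if (branch.hasRemote) {
      plan.toUpdate.push(branch.name);
    }
  }
  return plan;
}

// Made-up branches covering each case.
const exampleBranches = [
  { name: "fix-typo", merged: true, commitsAfterRebase: 0, hasRemote: true },
  { name: "squashed-upstream", merged: false, commitsAfterRebase: 0, hasRemote: true },
  { name: "in-progress", merged: false, commitsAfterRebase: 3, hasRemote: true },
  { name: "local-only", merged: false, commitsAfterRebase: 1, hasRemote: false },
];

console.log(planScrub(exampleBranches));
```

The zero-commit check catches branches that were squashed or cherry-picked upstream, which a plain "is this commit an ancestor?" test would miss.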

I have a lot of branches in my work code, so here's the program running on a repo where I have fewer, the perl5.git repo:

screenshot of branch scrubber

The existence of this screenshot might alert you to the fact that I've now made the tool work on perl5.git. In fact, it should work on any GitHub or GitLab repository, but if you want to use it, you'll have to set up some configuration in your .git/config something like this:

  [branch-manager]
    primary-remote = github
    primary-branch = blead

Really, though, you should read the source code to make it go. This is definitely not code I'm looking to maintain as a public utility, but I did put the code online, so feel free. I expect I'll keep messing around with it over time, without concern for anybody's use case but my own.

Finally, I should note that it's a bit of a double-edged sword, on that whole rudeness front. On one hand, it really does help keep your branches nice and tidy. I've used this to clean up literally hundreds of dead branches. On the other hand, GitLab pretty eagerly subscribes people to update notifications, and those notifications include "Rik just rebased his two-year-old branch for the 15th time." This can be annoying, but I'd rather receive those emails and know that branches were cleaned up. Also, I don't think GitHub sends them. Let me know if I'm wrong!

Remember the Milk, its API, and a new Perl client

by rjbs, created 2019-11-05 21:11

Hey, I'm finally writing another post about things I did on my week off in August!

I use Remember the Milk for my personal todo lists. It's pretty good! I've been using it for years, and I wax and wane in my attention to my tasks, but it's been a good help and I'm glad to have it. I'd be even happier with some changes, but more on that later.

Years ago, I built Ywar, and I still use it. It tracks my habits, when possible, using the APIs of services where I leave footprints. Am I weighing myself? Am I exercising? Am I doing some reading? Am I closing todo items in Remember the Milk? I get a congratulatory push notification when I hit a goal, and I get an email in the morning telling me what I should do today. These notices help keep me paying attention and motivated. One of the reasons this has worked okay (although I'll definitely admit that Ywar has not remained a massive force for productivity of late) is that there's not much extra work involved. It looks at what I'm already doing and records whether I did what I wanted. This means I have good "did exercise" feedback when I go for a run, because my running app logs to RunKeeper, but nearly no feedback when I lift weights, because my weightlifting app has no API. Less friction leads to greater success.

So, I wanted to apply this to my interactions with RTM. Its web UI is pretty good, and there's a native macOS app that's pretty good, too. (Its macOS app is just the web UI, but I'll take it!) They're both extra apps, though, and I have complicated feelings about how many distinct apps or tabs I want. I won't try to spell it out here, I'll just say: I wasn't as happy as I could be using their UI.

At work, we have a cool bot, Synergy, that provides a chat interface to some of our internal services. I wanted to do the same thing for RTM, which should have been no big deal, except that the existing Perl client library for RTM, WebService::RTMAgent, is a synchronous, blocking interface, and Synergy is event-driven and async. I looked at making it work with futures, but I didn't really want to. It was built on the XML interface, it uses AUTOLOAD, and I just didn't really like its construction. (I have used it for years, though, and it's never really been a problem. It's just not what I wanted to build on.)

Instead, I built a new client library, CamelMilk, modeled lightly on something we'd built at work recently. You feed it your API key and secret, and it can manage auth tokens for users and call API methods. API calls return futures that, when ready, produce simple objects. Here's how it looks, more or less, in use in the Synergy plugin:

  my $rsp_f = $self->rtm_client->api_call('rtm.tasks.add' => {
    auth_token => $token,
    timeline   => $tl,
    name  => $todo_description,
    parse => 1,
  });

  $rsp_f->then(sub ($rsp) {
    unless ($rsp->is_success) {
      $Logger->log([
        "failed to cope with a request to make a task: %s", $rsp->_response,
      ]);
      return $event->reply("Something went wrong creating that task, sorry.");
    }

    $Logger->log([ "made task: %s", $rsp->_response ]);
    return $event->reply("Task created!");
  })->else(sub (@fail) {
    # (failure handling not shown)
  });

Nice! This code is called in response to a user saying todo eat a whole pie ^tomorrow. It returns immediately while the API call happens in the background and it replies when it's all done. I wrote a little command line program to go along with the library for setting up auth tokens and making one-off API calls.

While writing this library, though, I ended up feeling less excited than when I started. It turns out: I don't like the Remember the Milk API. The first problem is timelines. Here's what the API docs say:

Timelines enable the Remember The Milk API to allow certain actions to be undone. The Remember The Milk web application requests a new timeline every time the application is visited — it is up to the API user to determine how often to request a new timeline. Timelines do not expire, but they must always be used.

Timelines can be thought of as long-running database transactions within which individual sub-transactions (API method calls) can be reverted. The start of a timeline is a snapshot of the state of a users' contacts, groups, lists and tasks at that point in time. Method calls can be reverted continuously until the start of the timeline is reached.

So, that api_call call above was actually inside another call:

    my $tl_f = $self->timeline_for($event->from_user);

    $tl_f->then(sub ($tl) {
      my $rsp_f = $self->rtm_client->api_call('rtm.tasks.add' => {…});
      …
    });

Either we have a timeline id for that user already ready or we go get one, meaning there's either an additional API call or local state management. That timeline id, though, means that you can later undo some methods. Sort of a niche use, but neat, but it complicates all sorts of actions. As long as you're tracking known timelines for undo, why not track transactions on your own so you can compute the inverse transactions and replay them when needed? Then you'd only do that when you might want to undo.

In practice, what I do, and what at least some other clients do, is make a timeline, cache the id, and never, ever think about it unless you call undo which (I predict, sans evidence) almost nobody ever, ever does.
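The "make one, cache it, forget it" approach can be sketched in a few lines. This JavaScript version uses promises in place of the futures in the Perl code; `makeTimelineCache` and the `createTimeline` callback are hypothetical names, and the callback stands in for the API call that requests a new timeline.

```javascript
// Cache one timeline per user. The cached value is the promise itself, so
// concurrent callers share a single in-flight request.
function makeTimelineCache(createTimeline) {
  const cache = new Map();
  return function timelineFor(user) {
    if (!cache.has(user)) {
      const pending = Promise.resolve(createTimeline(user));
      // If the request fails, forget it so a later call can try again.
      pending.catch(() => cache.delete(user));
      cache.set(user, pending);
    }
    return cache.get(user);
  };
}

// Demo with a fake API call that counts how often it is invoked.
let timelineRequests = 0;
const timelineFor = makeTimelineCache(async (user) => {
  timelineRequests += 1;
  return `timeline-${timelineRequests}-for-${user}`;
});

const firstLookup = timelineFor("alice");
const secondLookup = timelineFor("alice");
console.log(firstLookup === secondLookup, timelineRequests);
```

Caching the promise rather than the resolved id avoids requesting two timelines when two events arrive before the first request finishes.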

This isn't my real beef, though, it's just a foreshadowing of it. The real beef is that it takes too many HTTP requests to do anything non-trivial. Let's say that I have some paperwork I need to file, so I have a todo for it. I just found out that it's due Friday! I want to set it to priority 1, add the due date, remove the "boring" tag and add the "omg" tag. That will require I call these methods:

  1. rtm.tasks.setDueDate
  2. rtm.tasks.setPriority
  3. rtm.tasks.removeTags
  4. rtm.tasks.setTags

Every one of these is its own HTTP transaction. What happens when you fail partway through your series of calls? I guess it's time to call undo, possibly more than once, since you may need to undo several transactions.

Here's how it might work in JMAP if JMAP task lists were a standard. You have a task with id 123 and you want to do the updates above. You'd make a single JMAP call:

  methodCalls: [
    [ "Task/set", {
        "update": { "123": {
          "dueDate": "2019-11-08T00:00:00Z",
          "priority": 1,
          "tags/boring": false,
          "tags/omg": true,
        } },
    } ]
  ]

Note that the update argument is an object with the id as a key. You can update many tasks at once. Note, too, that Task/set and its arguments are in an array. You can update multiple kinds of things at once. I could create two new lists and all their items like this...

  methodCalls: [
    [ "TaskList/set", {
        "create": {
          "work": { "name": "Work Stuff" },
          "home": { "name": "Home Todos" }
        }
    } ],
    [ "Task/set", {
        "create": {
          "w1": { "name": "Get Hired", "listId": "#work" },
          "w2": { "name": "Get Raise", "listId": "#work" },
          "h1": { "name": "Take Nap",  "listId": "#home" },
        }
    } ]
  ]

Working with a protocol like this makes working in an event loop driven system really nice. You have a lot of options, but a simple one is to stick all your updates into one call, then report back all their results to the individual calling futures, with only a single HTTP transaction required.
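Here is a minimal JavaScript sketch of that batching idea, with promises standing in for futures. Everything named here is an assumption for illustration: `makeBatcher`, `enqueue`, and `flush` are made up, and `send` is a hypothetical transport that takes an array of `[name, args, callId]` triples and resolves to an array of `[name, result, callId]` responses, matching the call-id pairing JMAP uses.

```javascript
// Collect method calls, send them as one request, and hand each caller
// its own result.
function makeBatcher(send) {
  let pending = [];
  let nextId = 0;

  function enqueue(name, args) {
    return new Promise((resolve, reject) => {
      pending.push({ call: [name, args, `c${nextId++}`], resolve, reject });
    });
  }

  function flush() {
    const batch = pending;
    pending = [];
    if (batch.length === 0) return Promise.resolve();
    return send(batch.map((entry) => entry.call)).then(
      (responses) => {
        // Match responses to callers by call id.
        const byId = new Map(responses.map((r) => [r[2], r]));
        for (const entry of batch) {
          const response = byId.get(entry.call[2]);
          if (response) entry.resolve(response[1]);
          else entry.reject(new Error(`no response for ${entry.call[0]}`));
        }
      },
      (error) => batch.forEach((entry) => entry.reject(error))
    );
  }

  return { enqueue, flush };
}

// Demo with a fake transport that records each request and reports success.
const sentBatches = [];
const batcher = makeBatcher(async (methodCalls) => {
  sentBatches.push(methodCalls);
  return methodCalls.map(([name, args, id]) => [name, { ok: true }, id]);
});

batcher
  .enqueue("TaskList/set", { create: { work: { name: "Work Stuff" } } })
  .then((result) => console.log("lists:", result));
batcher
  .enqueue("Task/set", { create: { w1: { name: "Get Hired", listId: "#work" } } })
  .then((result) => console.log("tasks:", result));
batcher.flush();
```

In an event loop, `flush` could be scheduled once per tick, so every caller that enqueued during that tick shares one HTTP transaction.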

Turns out, working with JMAP can really spoil you for other APIs.

Anyway, despite the API being a bit of a drag, Remember The Milk remains great, and I continue to get things done by using it. Now I can get more things done by talking to my Slack bot about my agenda, which is great, and if you want, you can go use the code I wrote for it, too. I've had a look at the network inspector while using RTM, and they've clearly got a better protocol for their UI to use. Maybe someday it will be published for mere users like me, too!

I took some time off!

by rjbs, created 2019-08-11 22:27
tagged with: @markup:md journal

With a lot of PTO hours piled up, leave accounting somewhat in flux at work, and MoMA incredibly closed until October, I resolved to take some time off from work, during which I would stay home and work on stuff that I'd been ignoring. For example, my big queue of "conference presentations to watch" and my backlog of articles to read, personal coding projects to poke at, some little quality-of-life fixes to my home setup. Then, of course, I also wanted to do some actually fun things: hang out with my family, have some nice meals, watch some movies, go to a museum or two, and so on.

Short summary: I think this was a big success. I might shoot for doing this twice a year or so, in the future, and still have time left to take a proper out-of-town vacation.

I think I did pretty well on the "do good relaxing stuff" front. Gloria and I went to the Barnes Foundation, we all got a few meals out, I got ice cream, and I went to the local cider place for drinks and pizza with some friends. Also, Gloria and I are now about halfway through season two of Star Trek: Discovery.

I'll write up a bit more about specific things of note over the next few days, I hope, but in brief:

I fixed a bunch of my config files, especially for offlineimap and Vim. Doing this briefly added about sixty extraneous folders to my IMAP store, but fortunately, as an email professional, I knew how to delete them.

Ages ago, I lost a lot of my mp3s due to my own idiocy. Since then, I've really pulled back on my use of iTunes. In part, because I lost so much music. In part, because iTunes has continued to get worse. In part, because I just like Spotify a whole lot. Despite this, I do want to have my iTunes library on my phone, and I have a phone big enough that I can keep most of my library on it. Unfortunately, somehow iTunes got into its head that it couldn't sync. Further, although I could tell it to delete all local content, it wouldn't stop thinking that it basically knew about the content, which seemed to be getting in the way. Finally, I deleted the playlists from the phone… which then synchronized to my library. My favorite playlists are smart playlists about three deep (for example, "Radio RJBS" is every song on one of three playlists, each of which is all the songs on two other playlists plus extra criteria), and the playlists I deleted were deep in this hierarchy of smart playlists. I had to get the base64 encoded playlist rule definitions out of the XML of a backup of my iTunes library, decompile the binary rules, and recreate things. It stank, but in the end, I got my music onto my phone!

This kind of horrible job was a good example of something I felt I could do this week that I couldn't have done other weeks. Very often, I feel very pressed for time, and once a task becomes too time-consuming, I move on, even though it leaves me unhappy to do so. I feel like I don't have time to spend on things that are of value only to me and are slow grinds. This week, I did quite a bit of that. A bunch of it was tedious, but it got done, and that felt good.

Another kind of tedious but necessary task: My little goal-tracking API-masher-up, Ywar, has been suffering from a bit of bit rot, especially with its Withings integration, which I used to fetch my daily weight measurements. They recently turned off OAuth1 support, so I had to switch to OAuth2. I find OAuth pretty tedious in all its manifestations, so I'd put this off, but I'm trying to pay more attention to my Ywar emails, and so I need things that can be automated to stay automated. I got it working, it's stupid, it's ugly, but it works. I got a push notification to my phone within five minutes of the fix, and that was terrific.

I also put in some calm brain time on things that have felt like overwhelming tasks that I should probably do, but didn't really need to. For example, I want to rebuild my Linodes to be easier to maintain and redeploy in the future, but they work fine, so the pressure is low. I didn't do this, but I did write down a list of the things I'll have to do, so I can do it later without as much thinking. I ended the week with a couple new small todo lists in Remember the Milk for these projects.

I also got to the gym on seven of the nine days I was on leave, sometimes more than once. This was pretty good, although a couple times I dropped an exercise from my scheduled routine. (I am excited to try "explosive pushups" in theory, but in practice, I was tired as heck!) On a few days, I got there more than once a day so I could do some cardio in addition to lifting. This was great because, mostly, it meant I felt more motivated to sit in the steam room, which is always a treat. Also, I watched some Better Call Saul.

More soon.

drawing (but not generating) mazes (body)

by rjbs, created 2019-05-05 22:26
last modified 2019-08-11 21:34
tagged with: @markup:md journal programming

I've started a sort of book club here in Philly. It works like this: it's for people who want to do computer programming. We pick books that have programming problems, especially language-agnostic books, and then we commit to showing up to the book club meeting with answers for the exercises. There, we can show off our great work, or ask for help because we got stuck, or compare notes on what languages made things easier or harder. We haven't had an actual meeting yet, so I have no idea how well this is going to go.

Our first book is Mazes for Programmers, which I started in on quite a while ago but didn't make much progress through. The book's examples are in Ruby. My goal is to do work in a language that I don't know well, and I know Ruby well enough that it's a bad choice. I haven't decided yet what I'll do most of the work in, but I didn't want to do it all in Perl 5, which I already know well, and reach for when solving most daily problems. On the other hand, I knew a lot of the early material in the book (and maybe lots of the material in general) would be on generating mazes, which would be fairly algorithmic work and produce a data structure. I didn't want to get all caught up in drawing the data structure as a human-friendly maze, since that seemed like it would be a fiddly problem and would distract me from the actual maze generation.

This weekend, I wrote a program in Perl 5 that would take a very primitive description of a maze on standard input and print out a map on standard output. It was, as I predicted, sort of fiddly and tedious, but when I finished, I felt pretty good about it. I put my maze-drawing program on GitHub, but I thought it might be fun to write up what I did.

First, I needed a simple protocol. My goal was to accept input that would be easy to produce given any data structure describing a maze, even if it would be a sort of stupid format to actually store a maze in. I went with a line-oriented format like this:

  1 2 3
  4 5 6
  7 8 9

Every line in this example is a row of three rooms in the maze. This input would actually be illegal, but it's a useful starting point. Every room in the maze is represented by an integer, which in turn represents a four-bit bitfield, where each bit tells us whether the room links in the indicated direction:

   1
  8•2
   4

So if a cell in the maze has passages leading south and east, it would be represented in the file by a 6. This means some kinds of input are nonsensical. What does this input mean?

  0 0 0
  0 2 0
  0 0 0

The center cell has a passage east, but the cell to its east has no passage west. Easy solution: this is illegal.
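Both the encoding and the legality rule are easy to check mechanically. Here is a minimal sketch: the bit values match the constants in the full program further down, while `directions_of` and `find_mismatches` are names invented for this example. A passage that leads off the edge of the grid is not treated as a problem, since the reference map used later has such passages.

```perl
use v5.20.0;
use warnings;

# Bit values for each direction, matching the constants in the full program.
use constant { NORTH => 1, EAST => 2, SOUTH => 4, WEST => 8 };

# Lists the directions in which a cell has passages.
sub directions_of {
  my ($cell) = @_;
  my %bit_for = (north => NORTH, east => EAST, south => SOUTH, west => WEST);
  return grep {; $cell & $bit_for{$_} } qw( north east south west );
}

# Takes a grid (an arrayref of rows, each an arrayref of cell numbers) and
# returns a description of every pair of neighbors that disagree about
# whether they are linked.
sub find_mismatches {
  my ($grid) = @_;
  my @problems;

  for my $y (0 .. $#$grid) {
    for my $x (0 .. $#{ $grid->[$y] }) {
      my $cell  = $grid->[$y][$x];
      my $south = $y < $#$grid           ? $grid->[$y+1][$x] : undef;
      my $east  = $x < $#{ $grid->[$y] } ? $grid->[$y][$x+1] : undef;

      # Test "defined", not truth: a neighbor numbered 0 is a real cell with
      # no passages, and it still has to be compared.
      push @problems, "($x, $y) and its southern neighbor disagree"
        if defined $south && (($cell & SOUTH) xor ($south & NORTH));

      push @problems, "($x, $y) and its eastern neighbor disagree"
        if defined $east && (($cell & EAST) xor ($east & WEST));
    }
  }

  return @problems;
}

say "6 links: ", join q{, }, directions_of(6);
say for find_mismatches([ [0, 0, 0], [0, 2, 0], [0, 0, 0] ]);
```

For the input above, the only complaint is about the center cell and its eastern neighbor, which is the mismatch just described.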

I made quite a few attempts to get from that to a working drawing algorithm. It was sort of painful, and I ended up feeling pretty stupid for a while. Eventually, though, I decided that the key was not to draw cells (rooms), but to draw lines. That meant that for a three by three grid of cells, I'd need to draw a four by four grid of lines. It's that old fencepost problem.

   1   2   3   4
 1 +---+---+---+
   | 0 | 0 | 0 |
 2 +---+---+---+
   | 0 | 2 | 8 |
 3 +---+---+---+
   | 0 | 0 | 0 |
 4 +---+---+---+

Here, there's only one linkage, so really the map could be drawn like this:

   1   2   3   4
 1 +---+---+---+
   | 0 | 0 | 0 |
 2 +---+---+---+
   | 0 | 2   8 |
 3 +---+---+---+
   | 0 | 0 | 0 |
 4 +---+---+---+

My reference map while testing was:

   1   2   3   4
 1 +---+---+---+
    10  12 | 0 |
 2 +---+   +---+
   | 0 | 5 | 0 |
 3 +---+   +---+
   | 0 | 3  12 |
 4 +---+---+   +

This wasn't too, too difficult to get, but it was pretty ugly. What I actually wanted was something drawn from nice box-drawing characters, which would look like this:

   1   2   3   4
 1 ╶───────┬───┐
    10  12 │ 0 │
 2 ┌───┐   ├───┤
   │ 0 │ 5 │ 0 │
 3 ├───┤   └───┤
   │ 0 │ 3  12 │
 4 └───┴───╴   ╵

Drawing this was going to be trickier. I couldn't just assume that every intersection was a +. I needed to decide how to pick the character at every intersection. I decided that for every intersection, like (2,2), I'd have to decide the direction of lines based on the links of the cells abutting the intersection. So, for (2,2) on the line axes, I'd have to look at the cells at (2,1) and (2,2) and (1,2) and (1,1). I called these the northeast, southeast, southwest, and northwest cells, relative to the intersection, respectively. Then I determined whether a line extended from the middle of an intersection in a given direction as follows:

  # Remember, if the bit is set, then a link (or passageway) in that
  # direction exists.
  my $n = (defined $ne && ! ($ne & WEST ))
       || (defined $nw && ! ($nw & EAST ));
  my $e = (defined $se && ! ($se & NORTH))
       || (defined $ne && ! ($ne & SOUTH));
  my $s = (defined $se && ! ($se & WEST ))
       || (defined $sw && ! ($sw & EAST ));
  my $w = (defined $sw && ! ($sw & NORTH))
       || (defined $nw && ! ($nw & SOUTH));

For example, how do I know that at (2,2) the intersection should only have limbs headed west and south? Well, it has cells to the northeast and northwest, but they link west and east respectively, so there can be no limb headed north. On the other hand, the cells to its southeast and southwest do not link to one another, so there is a limb headed south.

This can be a bit weird to think about, so think about it while looking at the map and code.
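To make that concrete, here is a small sketch that works out the limbs for single intersections of the reference map. It uses zero-based indexes, as the full program does, so the intersection called (2,2) above is (1,1) here. The names `cell_at` and `limbs_at` are invented for this example; the four tests are the ones shown above.

```perl
use v5.20.0;
use warnings;

use constant { NORTH => 1, EAST => 2, SOUTH => 4, WEST => 8 };

# The reference map, one row per line of input.
my @grid = ([10, 12, 0], [0, 5, 0], [0, 3, 12]);

# Returns the cell at column $x, row $y, or undef when off the grid.
sub cell_at {
  my ($x, $y) = @_;
  return undef if $x < 0 || $y < 0 || $y > $#grid || $x > $#{ $grid[$y] };
  return $grid[$y][$x];
}

# Returns the limbs, as a bitfield of NORTH, EAST, SOUTH, and WEST, for the
# intersection whose southeast cell is at ($x, $y).
sub limbs_at {
  my ($x, $y) = @_;

  my $ne = cell_at($x    , $y - 1);
  my $se = cell_at($x    , $y    );
  my $sw = cell_at($x - 1, $y    );
  my $nw = cell_at($x - 1, $y - 1);

  my $n = (defined $ne && ! ($ne & WEST ))
       || (defined $nw && ! ($nw & EAST ));
  my $e = (defined $se && ! ($se & NORTH))
       || (defined $ne && ! ($ne & SOUTH));
  my $s = (defined $se && ! ($se & WEST ))
       || (defined $sw && ! ($sw & EAST ));
  my $w = (defined $sw && ! ($sw & NORTH))
       || (defined $nw && ! ($nw & SOUTH));

  return ($n ? NORTH : 0) | ($e ? EAST : 0) | ($s ? SOUTH : 0) | ($w ? WEST : 0);
}

my $limbs = limbs_at(1, 1);
say "limbs at (1,1): ", join q{ }, grep {; $_ } (
  ($limbs & NORTH ? 'north' : ()),
  ($limbs & EAST  ? 'east'  : ()),
  ($limbs & SOUTH ? 'south' : ()),
  ($limbs & WEST  ? 'west'  : ()),
);
```

The result for (1,1) is south and west only, which is why that intersection is drawn as ┐ in the map.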

Now, for each intersection, we'd have a four-bit number. What did that mean? Well, it was easy to make a little hash table with some bitwise operators and the Unicode character set…

  my %WALL = (
    0     | 0     | 0     | 0     ,=> ' ',
    0     | 0     | 0     | WEST  ,=> '╴',
    0     | 0     | SOUTH | 0     ,=> '╷',
    0     | 0     | SOUTH | WEST  ,=> '┐',
    0     | EAST  | 0     | 0     ,=> '╶',
    0     | EAST  | 0     | WEST  ,=> '─',
    0     | EAST  | SOUTH | 0     ,=> '┌',
    0     | EAST  | SOUTH | WEST  ,=> '┬',
    NORTH | 0     | 0     | 0     ,=> '╵',
    NORTH | 0     | 0     | WEST  ,=> '┘',
    NORTH | 0     | SOUTH | 0     ,=> '│',
    NORTH | 0     | SOUTH | WEST  ,=> '┤',
    NORTH | EAST  | 0     | 0     ,=> '└',
    NORTH | EAST  | 0     | WEST  ,=> '┴',
    NORTH | EAST  | SOUTH | 0     ,=> '├',
    NORTH | EAST  | SOUTH | WEST  ,=> '┼',
  );

At first, I only drew the intersections, so my reference map looked like this:

  ╶─┬┐
  ┌┐├┤
  ├┤└┤
  └┴╴╵

When that worked -- which took quite a while -- I added code so that cells could have both horizontal and vertical filler. My reference map had a width of 3 and a height of 1, meaning that it was drawn with 1 row of vertical-only filler and 3 columns of horizontal-only drawing per cell. The weird map just above had a zero height and width. Here's the same map with a width of 6 and a height of zero:

  ╶─────────────┬──────┐
  ┌──────┐      ├──────┤
  ├──────┤      └──────┤
  └──────┴──────╴      ╵

I have no idea whether this program will end up being useful in my maze testing, but it was (sort of) fun to write. At this point, I'm mostly wondering whether it will be proven to be terrible later on.

As a side note, my decision to do the drawing in text was a major factor in the difficulty. Had I drawn the maps with a graphical canvas, it would have been nearly trivial. I'd just draw each cell, and then start adjacent cells with overlapping positions. If two walls drew over one another, it would be the union of drawn pixels that would display, which would be exactly what we wanted. Text can't work that way, because every visual division of the terminal can show only one glyph. In this way, a typewriter is more like a canvas than a text terminal. When it overstrikes two characters, the union of their inked surfaces really is seen. In a terminal, an overstruck character is fully replaced by the overstriking character.

It's all on GitHub, but here's my program as it stands tonight:

  use v5.20.0;
  use warnings;

  use Getopt::Long::Descriptive;

  my ($opt, $usage) = describe_options(
    '%c %o',
    [ 'debug|D',    'show debugging output' ],
    [ 'width|w=i',  'width of cells', { default => 3 } ],
    [ 'height|h=i', 'height of cells', { default => 1 } ],
  );

  use utf8;
  binmode *STDOUT, ':encoding(UTF-8)';

  #  1   A maze file, in the first and stupidest form, is a sequence of lines.
  # 8•2  Every line is a sequence of numbers.
  #  4   Every number is a 4-bit number.  *On* sides are linked.
  # Here are some (-w 3 -h 1) depictions of mazes as described by the numbers
  # shown in their cells:
  # ┌───┬───┬───┐ ╶───────┬───┐
  # │ 0 │ 0 │ 0 │  10  12 │ 0 │
  # ├───┼───┼───┤ ┌───┐   ├───┤
  # │ 0 │ 0 │ 0 │ │ 0 │ 5 │ 0 │
  # ├───┼───┼───┤ ├───┤   └───┤
  # │ 0 │ 0 │ 0 │ │ 0 │ 3  12 │
  # └───┴───┴───┘ └───┴───╴   ╵

  use constant {
    NORTH => 1,
    EAST  => 2,
    SOUTH => 4,
    WEST  => 8,
  };

  my @lines = <>;
  chomp @lines;

  my $grid = [ map {; [ split /\s+/, $_ ] } @lines ];

  die "bogus input\n" if grep {; grep {; /[^0-9]/ } @$_ } @$grid;

  my $max_x = $grid->[0]->$#*;
  my $max_y = $grid->$#*;

  die "not all rows of uniform length\n" if grep {; $#$_ != $max_x } @$grid;

  for my $y (0 .. $max_y) {
    for my $x (0 .. $max_x) {
      my $cell  = $grid->[$y][$x];
      my $south = $y < $max_y ? $grid->[$y+1][$x] : undef;
      my $east  = $x < $max_x ? $grid->[$y][$x+1] : undef;

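      # Test for defined, not truth: a neighbor numbered 0 is still a cell,
      # and a link leading into it is inconsistent.
      die "inconsistent vertical linkage at ($x, $y) ($cell v $south)"
        if defined $south && ($cell & SOUTH  xor  $south & NORTH);

      die "inconsistent horizontal linkage at ($x, $y) ($cell v $east)"
        if defined $east  && ($cell & EAST   xor  $east  & WEST );
    }
  }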

  my %WALL = (
    0     | 0     | 0     | 0     ,=> ' ',
    0     | 0     | 0     | WEST  ,=> '╴',
    0     | 0     | SOUTH | 0     ,=> '╷',
    0     | 0     | SOUTH | WEST  ,=> '┐',
    0     | EAST  | 0     | 0     ,=> '╶',
    0     | EAST  | 0     | WEST  ,=> '─',
    0     | EAST  | SOUTH | 0     ,=> '┌',
    0     | EAST  | SOUTH | WEST  ,=> '┬',
    NORTH | 0     | 0     | 0     ,=> '╵',
    NORTH | 0     | 0     | WEST  ,=> '┘',
    NORTH | 0     | SOUTH | 0     ,=> '│',
    NORTH | 0     | SOUTH | WEST  ,=> '┤',
    NORTH | EAST  | 0     | 0     ,=> '└',
    NORTH | EAST  | 0     | WEST  ,=> '┴',
    NORTH | EAST  | SOUTH | 0     ,=> '├',
    NORTH | EAST  | SOUTH | WEST  ,=> '┼',
  );

  sub wall {
    my ($n, $e, $s, $w) = @_;
    return $WALL{ ($n ? NORTH : 0)
                | ($e ? EAST : 0)
                | ($s ? SOUTH : 0)
                | ($w ? WEST : 0) } || '+';
  }

  sub get_at {
    my ($x, $y) = @_;
    return undef if $x < 0 or $y < 0;
    return undef if $x > $max_x or $y > $max_y;
    return $grid->[$y][$x];
  }

  my @output;

  for my $y (0 .. $max_y+1) {
    my $row = q{};

    my $filler;

    for my $x (0 .. $max_x+1) {
      my $ne = get_at($x    , $y - 1);
      my $se = get_at($x    , $y    );
      my $sw = get_at($x - 1, $y    );
      my $nw = get_at($x - 1, $y - 1);

      my $n = (defined $ne && ! ($ne & WEST ))
           || (defined $nw && ! ($nw & EAST ));
      my $e = (defined $se && ! ($se & NORTH))
           || (defined $ne && ! ($ne & SOUTH));
      my $s = (defined $se && ! ($se & WEST ))
           || (defined $sw && ! ($sw & EAST ));
      my $w = (defined $sw && ! ($sw & NORTH))
           || (defined $nw && ! ($nw & SOUTH));

      if ($opt->debug) {
        printf "(%u, %u) -> NE:%2s SE:%2s SW:%2s NW:%2s -> (%s %s %s %s) -> %s\n",
          $x, $y,
          (map {; $_ // '--'  } ($ne, $se, $sw, $nw)),
          (map {; $_ ? 1 : 0 } ($n,  $e,  $s,  $w)),
          wall($n, $e, $s, $w);
      }

      $row .= wall($n, $e, $s, $w);

      if ($x > $max_x) {
        # The rightmost wall is just the right joiner.
        $filler .=  wall($s, 0, $s, 0);
      } else {
        # Every wall but the last gets post-wall spacing.
        $row .= ($e ? wall(0,1,0,1) : ' ') x $opt->width;
        $filler .=  wall($s, 0, $s, 0);
        $filler .= ' ' x $opt->width;
      }
    }

    push @output, $row;
    if ($y <= $max_y) {
      push @output, ($filler) x $opt->height;
    }
  }

  say for @output;

PTS 2019: I went to the Perl Toolchain Summit! (index) (body)

by rjbs, created 2019-04-29 20:47
last modified 2019-04-30 20:54

Once again, it is spring in the northern hemisphere, and so time for the Perl Toolchain Summit, aka the QA Hackathon. I've made it to most of these, and have usually found them to be productive and invigorating. Everybody shows up with something to do, most of the people you need to help you are there, and everybody is interested in what everybody else is doing and will offer good feedback, advice, or just expressions of appreciation (or sympathy, as the case may require).

Bisham Abbey

The most common topics for me at past summits have been PAUSE, CPAN Testers, Dist::Zilla, and the CPAN::Meta specification. This year, I focused almost entirely on PAUSE, and I think it paid off. Last year, I did quite a bit of work on Dist::Zilla and wasn't happy with the end result. It might have been a good time to have a second go, but I decided I had a lot more support personnel on hand for PAUSE, and stuck with it. I'll try to give a general run-down of what I worked on in my posts, and some of it I'll probably try to expand into some reference material for future PAUSE contributors.

Here are my posts about the summit:

  1. Marlow
  2. Getopt::Long::Descriptive
  3. Module::Faker
  4. Automated PAUSE Testing
  5. PAUSE Inspection Tools

PTS is made possible by the generosity of its sponsors, who are making it possible for useful work to happen that would, quite simply, not happen otherwise. Thanks to them: Booking.com, cPanel, MaxMind, ZipRecruiter, Cogendo, Elastic, OpenCage Data, Perl Services, Zoopla, Archer Education, OpusVL, Oetiker+Partner, SureVoIP, YEF, and of course FastMail.

PTS 2019: Testing PAUSE by Hand (5/5) (body)

by rjbs, created 2019-04-29 20:47
last modified 2019-04-30 07:07

Writing tests is great, but sometimes you're not sure what you're looking for. Maybe you don't know what will happen at all if you try something new. Maybe you know it fails, but not how or where. This is the kind of thing where normally I'd fire up the perl debugger. The debugger is too often ignored by programmers who could solve problems a lot faster by using it. I love adding print statements as much as the next person -- maybe more -- but the debugger is a specialized tool that is worth bringing out now and then.

That said, I am not here to advocate that you attach the debugger to the PAUSE indexer. It's just that tools for investigating are sometimes more appropriate than tools for asserting. So, I built one.

Actually, I built it in 2015, but I've only used it sparingly until this week. With the improvements I made to my module faking code, it became much, much easier to use this tool to investigate hypothetical scenarios.

It used to work like this:

  $ ./one-off-utils/build-fresh-cpan \
    Some-Tarball-1.23.tar.gz \

You'd provide it a bunch of pre-built tarballs and it would make a new TestPAUSE, index the tarballs, and drop you into a shell. Ugh! I mean, it was useful, but we're back to a world where we have to write a program to generate fake data. Instead, I updated the program to take a list of instructions of what to do. For example:

  $ ./one-off-utils/build-fresh-cpan \
    fake:RJBS:Some-Cool-Dist-1.234.tar.gz \
    fake:ANDK:CPAN-Client-3.000.tar.gz \
    file:RJBS:~/tmp/perl-5.32.0.tar.gz \
    index \

This does what you might expect: it fakes up two tarballs and uploads those, along with the real tarball named by the file instruction. Then it runs the indexer. Then it makes another fake with the given snippet of Perl. Finally, it drops you into a shell. The shell might look like this:

/private/var/folders/tp/xbk5yqfj7vv86jjcgk_cp4wh0000gq/T/wZwYj2Cgp0$ ls -l
total 4
drwxr-xr-x 5 rjbs staff  160 Apr 29 12:25 Maildir/
drwxr-xr-x 4 rjbs staff  128 Apr 29 12:25 cpan/
drwxr-xr-x 4 rjbs staff  128 Apr 29 12:25 db/
drwxr-xr-x 5 rjbs staff  160 Apr 29 12:25 git/
-rw-r--r-- 1 rjbs staff 3086 Apr 29 12:25 pause.log
drwxr-xr-x 3 rjbs staff   96 Apr 29 12:25 run/

Maildir contains all the mail that PAUSE would've sent out.

cpan is the CPAN that would be published, so you can find the tarballs and the index files there.

db has SQLite databases storing all the PAUSE data.

git is a git index storing every state that the index files have ever had.

run is not interesting, and stores the lockfile for the indexer.

Then there's pause.log. As you might guess, it's the PAUSE log file. We'll come back to that below.

Anyway, you can probably imagine that this is a pretty useful collection of data for investigation! You can very quickly answer questions like "if the system is in state X and events A, B, C happen, what's the new state and why?" In fact, we had quite a few questions that we'd put off answering in the past because they were just too tedious to sort out. Now they've become a matter of writing a few lines at the command prompt.

Of course, the command prompt can be a tedious place to write what is, effectively, a program, so build-fresh-cpan can also get instructions from a file. For example, I can write instructions to the file test.pause and then run ./one-off-utils/build-fresh-cpan prog:test.pause. Here's what a program might look like:

  # First, we import some boring stuff.


  # Then some more boring stuff, though slightly less boring.
      name      => 'Some-Cool-Dist',
      version   => 1.302,
      packages  => [
        qw( Some::Cool::Dist Some::Cool::Package ),
        'Some::Cool::Util' => { in_file => "lib/Some/Cool/Package.pm" },


  # And now RJBS tries to steal ownership of one of ANDK's packages!
      name      => 'Some-Cool-Dist',
      version   => '4.000',
      packages  => [
        qw( Some::Cool::Dist Some::Cool::Package ),
        'Some::Cool::Util' => { in_file => "lib/Some/Cool/Package.pm" },


  # Note that we didn't index CPAN::Client from RJBS.
  cmd:zgrep Client cpan/modules/02packages.details.txt.gz

  # Then ANDK takes pity on RJBS and grants him permission.  RJBS just re-indexes
  # the old dist.

  # We pick the new file for individual indexing...

  # ...but that doesn't rebuild the indexes, it only updates the database.
  cmd:zgrep Client cpan/modules/02packages.details.txt.gz

  # Let's poke around.

  # Everything looks fine in the logs, but it doesn't look like there has
  # been an update to the files on disk.  Maybe only full reindexes update
  # the files!

  # ...so we do a full reindex.

  # ...and now it's there.
  cmd:zgrep Client cpan/modules/02packages.details.txt.gz

Turning a program like this into an automated test is trivial. You copy in and tweak a few lines, then add the assertions you want to make based on what you've learned. I expect to do a lot of investigation in this format.

I also might break this program as I work on it, but I don't expect it needs much long-term stability, beyond the basics. Its documentation is in the command line usage:

  build-fresh-cpan [-IPpv] [long options...] TYPE:WHAT...
          --dir STR              target directory; by default uses a tempdir
          -v --verbose           print logs to STDERR as it goes
          -p STR --packages STR  02packages file to prefill packages table
          -P STR --perms STR     06perms file to prefill mods/primeur/perms
          --default-user STR     default PAUSEID for uploads; default: LOCAL
          -I --stdin             read instructions from STDIN to run before ARGV
          --each                 index at start and after each upload
          --[no-]shell           add an implicit "shell" as last instruction;
                                 on by default

  Other than the --switches, arguments are instructions in one of the forms
  below.  For those that list PAUSEID, it may be omitted, and the default
  user is used instead.

  Valid instructions are:

    form                  | meaning
    index                 | index now
    index:FILE            | index just one file
    perm:PAUSEID:PKG:PERM | set perm (f or c or 0) for PAUSEID on PKG
    file:PAUSEID:FILE     | upload the named file as the named PAUSE user
    fake:PAUSEID:FILE     | generate a dist based on the given filename
    json:PAUSEID:JSON     | interpret the given JSON string as a faker struct
    perl:PAUSEID:PERL     | interpret the given Perl string as a faker struct
    adir:DIRECTORY        | author dir: dir with A/AB/ABC/Dist-1.0.tar.gz files
    fdir:DIRECTORY        | flat dir: upload all the files as default user
    prog:file             | read a file containing a list of instructions
    progcmd:"program"     | run program, which should print out instructions
    cmd:"program"         | run a command in the working directory
    shell                 | run a shell in the working directory

  prog and progcmd output can split instructions across multiple lines.  Lines
  that begin with whitespace will be appended to the line preceding them.

Oh, and see that big table? That's why I needed to fix Getopt::Long::Descriptive!
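The continuation rule from the usage text is simple enough to sketch. This is not code from build-fresh-cpan: `join_instructions` is a made-up name, and skipping blank lines and `#` comments is an assumption based on how the example program above is written. Continuation lines are appended exactly as they are, leading whitespace included.

```perl
use v5.20.0;
use warnings;

# Turns the lines of a program file into a list of complete instructions.
sub join_instructions {
  my (@lines) = @_;
  my @instructions;

  for my $line (@lines) {
    chomp $line;
    next if $line =~ /\A\s*\z/;  # blank line (assumed to be ignored)
    next if $line =~ /\A#/;      # comment (assumed to be ignored)

    if ($line =~ /\A\s/ && @instructions) {
      # A line that begins with whitespace is appended to the one before it.
      $instructions[-1] .= $line;
    } else {
      push @instructions, $line;
    }
  }

  return @instructions;
}

my @program = (
  "# upload a fake dist, index, then look at the result\n",
  "fake:RJBS:Some-Cool-Dist-1.234.tar.gz\n",
  "index\n",
  "cmd:zgrep Client\n",
  "  cpan/modules/02packages.details.txt.gz\n",
);

for my $instruction (join_instructions(@program)) {
  # Everything before the first colon is the instruction type.
  my ($type, $rest) = split /:/, $instruction, 2;
  say "$type => ", $rest // '(no argument)';
}
```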

...but what about logging?

The PAUSE indexer log file has often stymied me. It has a lot of useful information, and some not useful information, but it can be hard to tell just what's going on. For example, here's the log output from indexing just one dist:

  >>>> Just uploaded fake from RJBS/Fake-Dist-1.23.tar.gz
  PAUSE::mldistwatch object created
  Running manifind
  Collecting distmtimes from DB
  Registering new users
  Info: new_active_users[RJBS]
  Starting BIGLOOP over 1 files
  . R/RJ/RJBS/Fake-Dist-1.23.tar.gz ..
  Assigned mtime '1556373013' to dist 'R/RJ/RJBS/Fake-Dist-1.23.tar.gz'
  Examining R/RJ/RJBS/Fake-Dist-1.23.tar.gz ...
  Going to untar. Running '/usr/bin/tar' 'xzf' '/var/folders/tp/xbk5yqfj7vv86jjcgk_cp4wh0000gq/T/jzPcAZf3Xg/cpan/authors/id/R/RJ/RJBS/Fake-Dist-1.23.tar.gz'
  Untarred '/var/folders/tp/xbk5yqfj7vv86jjcgk_cp4wh0000gq/T/jzPcAZf3Xg/cpan/authors/id/R/RJ/RJBS/Fake-Dist-1.23.tar.gz'
  Found 6 files in dist R/RJ/RJBS/Fake-Dist-1.23.tar.gz, first Fake-Dist-1.23/MANIFEST
  No readme in R/RJ/RJBS/Fake-Dist-1.23.tar.gz
  Finished with pmfile[Fake-Dist-1.23/lib/Fake/Dist.pm]
  Result of normalize_version: sdv[1.23]
  Result of simile(): file[Dist] package[Fake::Dist] ret[1]
  No keyword 'no_index' or 'private' in META_CONTENT
  Result of filter_ppps: res[Fake::Dist]
  Will check keys_ppp[Fake::Dist]
  (uploader) Inserted into perms package[Fake::Dist]userid[RJBS]ret[1]err[]
  02maybe: Fake::Dist                      1.23 Fake-Dist-1.23/lib/Fake/Dist.pm (1556373013) R/RJ/RJBS/Fake-Dist-1.23.tar.gz
  Inserting package: [INSERT INTO packages (package, version, dist, file, filemtime, pause_reg, distname) VALUES (?,?,?,?,?,?,?) ] Fake::Dist,1.23,R/RJ/RJBS/Fake-Dist-1.23.tar.gz,Fake-Dist-1.23/lib/Fake/Dist.pm,1556373013,1556373013
  Inserted into perms package[Fake::Dist]userid[RJBS]ret[]err[]
  Inserted into primeur package[Fake::Dist]userid[RJBS]ret[1]err[]
  Sent "indexer report" mail about RJBS/Fake-Dist-1.23.tar.gz
  Entering rewrite02
  Number of indexed packages: 0
  Entering rewrite01
  No 01modules exist; won't try to read it
  cared about 0 symlinks
  Entering rewrite03
  No 03modlists exist; won't try to read it
  Entering rewrite06
  Directory '/home/ftp/run/mirroryaml' not found
  Finished rewrite03 and everything at Sat Apr 27 14:50:14 2019

To make the log file easier to read, and logging easier to manage, I converted PAUSE to use Log::Dispatchouli::Global. Then I went through every log line and rewrote many of them to have a more self-similar format, and used Log::Dispatchouli's data printing facilities. Here's the same operation in the new log format:

  2019-03-29 13:27:28.0774 [23234] FRESH: just uploaded fake: cpan/authors/id/R/RJ/RJBS/Fake-Dist-1.23.tar.gz
  2019-03-29 13:27:28.0861 [23234] PAUSE::mldistwatch object created
  2019-03-29 13:27:28.0901 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: assigned mtime 1556558848
  2019-03-29 13:27:28.0902 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: beginning examination
  2019-03-29 13:27:28.0913 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: going to untar with: {{["/usr/bin/tar", "xzf", "/var/folders/tp/xbk5yqfj7vv86jjcgk_cp4wh0000gq/T/uRlTDQD954/cpan/authors/id/R/RJ/RJBS/Fake-Dist-1.23.tar.gz"]}}
  2019-03-29 13:27:28.0920 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: untarred /var/folders/tp/xbk5yqfj7vv86jjcgk_cp4wh0000gq/T/uRlTDQD954/cpan/authors/id/R/RJ/RJBS/Fake-Dist-1.23.tar.gz
  2019-03-29 13:27:28.0921 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: found 6 files in dist, first is [Fake-Dist-1.23/MANIFEST]
  2019-03-29 13:27:28.0922 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: no README found
  2019-03-29 13:27:28.0924 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: selected pmfiles to index: {{["Fake-Dist-1.23/lib/Fake/Dist.pm"]}}
  2019-03-29 13:27:29.0009 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: Fake-Dist-1.23/lib/Fake/Dist.pm: result of normalize_version: 1.23
  2019-03-29 13:27:29.0009 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: Fake-Dist-1.23/lib/Fake/Dist.pm: result of basename_matches_package: {{{"file": "Dist", "package": "Fake::Dist", "ret": 1}}}
  2019-03-29 13:27:29.0010 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: Fake-Dist-1.23/lib/Fake/Dist.pm: will examine packages: {{["Fake::Dist"]}}
  2019-03-29 13:27:29.0012 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: Fake-Dist-1.23/lib/Fake/Dist.pm: inserted into perms: {{{"err": "", "package": "Fake::Dist", "reason": "(uploader)", "ret": 1, "userid": "RJBS"}}}
  2019-03-29 13:27:29.0012 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: Fake-Dist-1.23/lib/Fake/Dist.pm: inserting package: {{{"dist": "R/RJ/RJBS/Fake-Dist-1.23.tar.gz", "disttime": 1556558848, "file": "Fake-Dist-1.23/lib/Fake/Dist.pm", "filetime": 1556558848, "package": "Fake::Dist", "version": "1.23"}}}
  2019-03-29 13:27:29.0013 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: Fake-Dist-1.23/lib/Fake/Dist.pm: inserted into perms: {{{"err": "", "package": "Fake::Dist", "reason": null, "ret": "", "userid": "RJBS"}}}
  2019-03-29 13:27:29.0013 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: Fake-Dist-1.23/lib/Fake/Dist.pm: inserted into primeur: {{{"err": "", "package": "Fake::Dist", "ret": 1, "userid": "RJBS"}}}
  2019-03-29 13:27:29.0015 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: ensuring canonicalized case of Fake::Dist
  2019-03-29 13:27:29.0035 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: sent indexer report email
  2019-03-29 13:27:29.0040 [23234] rewriting 02packages
  2019-03-29 13:27:29.0069 [23234] no 01modules exist; won't try to read it
  2019-03-29 13:27:29.0069 [23234] symlinks updated: 0
  2019-03-29 13:27:29.0070 [23234] no 03modlists exist; won't try to read it
  2019-03-29 13:27:29.0171 [23234] FTP_RUN directory [/home/ftp/run/mirroryaml] does not exist
  2019-03-29 13:27:29.0213 [23234] finished rewriting indexes
  2019-03-29 13:27:29.0268 [23234] running a shell (/opt/local/bin/zsh)

It's not perfect, but I think it's much easier to read.
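A side benefit of the more regular format is that it is easy to pick apart with a program. Here is a rough sketch using only core modules. `parse_log_line` is a hypothetical helper, and it assumes that whatever sits between the doubled braces is valid JSON, which is how the sample lines above appear to work.

```perl
use v5.20.0;
use warnings;

use JSON::PP ();

# Splits one line of the new log format into a timestamp, a process id, and
# a message, and decodes the {{...}} section of the message if there is one.
# Returns nothing if the line doesn't look like a log line.
sub parse_log_line {
  my ($line) = @_;

  my ($when, $pid, $message) = $line =~ m{
    \A \s*
    (\d{4}-\d\d-\d\d \s \d\d:\d\d:\d\d\.\d+) \s
    \[ (\d+) \] \s
    (.*?) \s* \z
  }xs or return;

  my $data;
  if ($message =~ / \{\{ (.*) \}\} /xs) {
    # Assumption: the braces wrap JSON.  If they don't, just leave $data
    # undefined rather than dying.
    $data = eval { JSON::PP->new->decode($1) };
  }

  return { when => $when, pid => $pid, message => $message, data => $data };
}

my $entry = parse_log_line(
  '2019-03-29 13:27:29.0010 [23234] R/RJ/RJBS/Fake-Dist-1.23.tar.gz: '
  . 'Fake-Dist-1.23/lib/Fake/Dist.pm: will examine packages: {{["Fake::Dist"]}}'
);

say "pid $entry->{pid} will examine: @{ $entry->{data} }";
```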

My hope is that the improved testing and investigation facilities will help us make serious strides toward overhauling the indexer before PTS 2020. We'll see, but right now, I feel pretty good about it!

PTS 2019: The PAUSE Test Suite (4/5) (body)

by rjbs, created 2019-04-29 20:45

PAUSE and case insensitivity

All that Module::Faker and Getopt work was really in furtherance of PAUSE work. When I arrived in Marlow, I had been given just one request by Neil: sort out PAUSE!320, a change in behavior we've been slowly getting right for years now.

Once upon a time, the PAUSE index was case sensitive. Andreas König, the author and maintainer of PAUSE, says he's not sure whether or not this was actually intentional. At any rate, it meant that if one person uploaded the package "Parse::PERL" and another uploaded "Parse::Perl", the indexer would let them each have permissions on the name they uploaded. This is a problem because not all filesystems are case-sensitive. So a user might rely on Parse::PERL, install Parse::Perl, and then get bizarre errors when trying to use Parse::PERL in their code. The runtime, after all, does not verify that when you load the module Foo::Bar, it actually defines the package Foo::Bar. D'oh!

We've made various changes to address this problem, with the basic rule being that permissions are case-preserving and not case-sensitive. If you have permissions on Foo::Bar, then you also own foo::bar and FOO::BAR and foO::bAr and whatever else you like. The index and permissions data will show "Foo::Bar", because that's what you uploaded. Maintainers of the Foo::Bar namespace are free to rename it to fOO::BAr, if they want, by uploading a new distribution, but nobody else can sneak their way into the namespace.
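The rule itself fits in a few lines. The sketch below only illustrates case-preserving, case-insensitive matching; it is not how PAUSE stores or checks permissions, and `claim` and `%owner_of` are made-up names.

```perl
use v5.20.0;
use warnings;

# Maps the lowercased package name to its owner and its displayed spelling.
my %owner_of;

# $user tries to upload $package.  Returns true if that's allowed, and false
# if somebody else already holds the name under any capitalization.
sub claim {
  my ($user, $package) = @_;
  my $key = lc $package;

  my $entry = $owner_of{$key};
  return 0 if $entry && $entry->{user} ne $user;

  # The owner can change the displayed capitalization by uploading again.
  $owner_of{$key} = { user => $user, name => $package };
  return 1;
}

for my $upload (
  [ RJBS => 'Parse::Perl' ],
  [ ANDK => 'Parse::PERL' ],
  [ RJBS => 'PARSE::Perl' ],
) {
  my ($user, $package) = @$upload;
  say "$user uploads $package: ", claim($user, $package) ? 'ok' : 'refused';
}

say "index shows: $owner_of{'parse::perl'}{name}";
```

The lookup key is folded, so every spelling collides, but the stored name keeps whatever capitalization the owner last uploaded.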

This was basically implemented at last year's PTS, but wasn't deployed because it was broken in subtle ways. The fix was easy, once I worked out what the problem was, which required a bunch of testing. More on that shortly!

The upshot was that I did get the behavior changed successfully. Neil, who had been working for years to eliminate all the case conflicts in the permissions list, had gotten things down to few enough to count on one hand. With the bug fixed and the last few conflicts resolved, we believe we have ended the age of conflicting-case index entries. This was a nice milestone to reach. There was high-fiving, and Neil may have shed a single tear, though I couldn't swear to it in court.

As an aside: you might be wondering, "Why didn't you just put case-insensitive unique indexes in the database?" PAUSE's indexer is sort of a strange beast, which simultaneously updates the database, analyzes the contents of uploads, and decides what to do as it goes along. Triggering a database error would jump past a lot of behavior, and we could've done it, but it felt saner to try to detect the problem. I have plans for the future to tease apart the indexer's several behaviors.

testing PAUSE

In August 2011, David Golden and I got together in Brooklyn and began writing a test suite for the indexer. To be fair, a couple tests existed already, but they tested very, very few things. By just a bit before four in the afternoon on that day in 2011, David and I had a passing test that looked like this:

  my $result = PAUSE::TestPAUSE->new({
    author_root => 'corpus/authors',
  })->test_reindex;

  ok(
    -e $result->tmpdir->file(qw(cpan modules 02packages.details.txt.gz)),
    "our indexer indexed",
  );

  my $pkg_rows = $result->connect_mod_db->selectall_arrayref(
    'SELECT * FROM packages ORDER BY package, version',
    { Slice => {} },
  );

  my @want = (
    { package => 'Bug::Gold',      version => '9.001' },
    { package => 'Hall::MtKing',   version => '0.01'  },
    { package => 'XForm::Rollout', version => '1.00'  },
    { package => 'Y',              version => 2       },
  );

  cmp_deeply(
    $pkg_rows,
    [ map {; superhashof($_) } @want ],
    "we indexed exactly the dists we expected to",
  );

Further refinements would come, but many of the tests still look quite a lot like this. Note how it begins: we name an author_root. This is a directory full of pretend CPAN uploads by pretend CPAN authors. Every file in the directory is copied into the test PAUSE, simulating an upload, and then the indexer is run. To understand these tests, you need to know what's in that directory. It's not just a matter of running ls, either. The directory contains tarballs, and those tarballs are more or less opaque unless you unpack them. Ugh. In this test, the contents were all entirely uninteresting, but in later tests, you'd end up wondering what was being tested. Something would import corpus/mld/009 and then assert that the index should look one way or another, rarely noting that one dist in the directory had strange properties known only at the time of test-writing.

To make matters worse, the tests were split into two files. In one, each tested behavior was tested with a distinct TestPAUSE, so no two tests would interact. In the other file, though, every behavior was tested on top of the state left behind by the previous tests, resulting in a very cluttered test file in which the intent of any given test might be pretty hard to determine, especially when you're reading it five years later.

Splitting those tests up so that each would use a distinct TestPAUSE wasn't going to be difficult as a matter of programming, but it meant each one needed to be teased apart from the one before it, meaning its intent needed to be sussed out, which meant unpacking tarballs and reading their contents. I shook my fist and cried, "Never again!"

By 2019, the tests had changed so that rather than this:

  my $result = PAUSE::TestPAUSE->new({
    author_root => 'corpus/authors',
  })->test_reindex;

You'd be likely to write:

  my $pause = PAUSE::TestPAUSE->new;


  my $result = $pause->test_reindex;

(This means you'd be able to later add more files, index again, and see what changed. Useful!)

To make the tests clearer, I added a new method, upload_author_fake:

  $pause->upload_author_fake(JBLOE => {
    name     => 'Example-Dist',
    version  => '1.0',
    packages => [ qw(Example::Dist Example::Dist::Package) ],
    more_metadata => { x_Example => 'This is an example, too.' },
  });

Hey, it's using from_struct like we saw in my Module::Faker report from this PTS! Now you can always know exactly what is interesting about a fake. Sometimes, though, you don't need an interesting fake, you just need a totally boring dist to be uploaded. In those cases, now you can just write

  $pause->upload_author_fake(JBLOE => 'Example-Dist-1.0.tar.gz');

...and Module::Faker will know what you mean.

With this tool available, and the new Module::Faker features to help produce weird distributions, I was able to rewrite the tests to be entirely isolated. I also deleted quite a few of the prebuilt tarballs from the corpus directory, but not all of them yet. One or two are a bit tedious to produce with Faker, and one or two others I just didn't get to.

I look forward to replacing those, in part because I know it will mean cool improvements to Module::Faker, and in part because every time I make the test suite saner, I make it easier to get more people confident that they can write more PAUSE code.

PTS 2019: Module::Faker (3/5)

by rjbs, created 2019-04-29 20:40
last modified 2019-04-29 20:40

the basics

In 2008, I wrote Module::Faker. In fact, I wrote it at the first QA Hackathon! I'd had the idea to write it because at Pobox, we were writing a PAUSE-like module indexer for changing our internal deployment practices. It became clear that we could use it for testing actual PAUSE as well as other code, and I got to work. Since then, I've used it (and the related CPAN::Faker) quite a lot for testing, especially of PAUSE. At first, it was just a quick way to build a tarball that contained something that looked more or less like a CPAN distribution, with the files in the right places, the package statements, the version declarations, and so on.

As time went on, I needed weirder and more broken distributions, or distributions with more subtle changes than just a different version. In these cases, I'd build a fake, untar it, edit the contents, and tar it up again. Further, the usual way to fake up a dist was to write a META.yml-style file, stick it in a directory with other ones, and then build them all. Why? Honestly, I can't remember. It's sort of a bizarre choice, and I think probably it was trying to be too clever. By default, you pass the contents of the META file to Module::Faker::Dist->new, which means its internals go to sort of annoying lengths to do the right thing. Why not a simpler constructor and a translation layer in between?

Well, that's not what I implemented, but I'll probably make that happen eventually. Instead, I added yet another constructor, this one much more suitable for building one-off fakes. Using it looks something like this:

  my $dist = Module::Faker::Dist->from_struct({
    cpan_author => 'DENOMOLOS',
    name        => 'Totally-Fake-Software',
    version     => '2.002',
  });

And when you write that to disk, you get this:

  ~/code/Module-Faker$ tar zxvf Totally-Fake-Software-2.002.tar.gz
  x Totally-Fake-Software-2.002/lib/Totally/Fake/Software.pm
  x Totally-Fake-Software-2.002/Makefile.PL
  x Totally-Fake-Software-2.002/t/00-nop.t
  x Totally-Fake-Software-2.002/META.json
  x Totally-Fake-Software-2.002/META.yml
  x Totally-Fake-Software-2.002/MANIFEST

  ~/code/Module-Faker$ cat Totally-Fake-Software-2.002/lib/Totally/Fake/Software.pm

  =head1 NAME

  Totally::Fake::Software - a cool package


  package Totally::Fake::Software;
  our $VERSION = '2.002';


Quite a few things are inferred, but the most important inference is that in a dist called Totally-Fake-Software with a version of 2.002, you'll want a single package named Totally::Fake::Software with a version of 2.002. It's pretty reasonable, as long as it can be overridden, and now it can:

  my $dist = Module::Faker::Dist->from_struct({
    cpan_author => 'DENOMOLOS',
    name        => 'Totally-Fake-Software',
    version     => '2.002',
    packages    => [
      'Totally::Fake::Firmware' => {
        version => '2.001',
        in_file => "lib/Totally/Fake/Hardware.pm"
      },
    ],
  });

This does what you might imagine: it makes two modules (read: .pm files), one of which contains two package definitions, and those two packages have differing version numbers. That is:

  ~/code/Module-Faker$ cat Totally-Fake-Software-2.002/lib/Totally/Fake/Hardware.pm

  =head1 NAME

  Totally::Fake::Hardware - a cool package


  package Totally::Fake::Hardware;
  our $VERSION = '2.002';

  package Totally::Fake::Firmware;
  our $VERSION = '2.001';


I think I'm still going to tinker with how faker objects are generated, but by now you should get the point.
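The default inference described above (dist name to main package name and file) can be sketched in a few lines. This is only an illustration of the naming convention, not Module::Faker's real code, and `guess_main_package` is a made-up name.

```perl
use strict;
use warnings;

# Hypothetical helper, not Module::Faker's real code: guess the main
# package and its file from a dist name like "Totally-Fake-Software".
sub guess_main_package {
  my ($dist_name) = @_;

  my @parts   = split /-/, $dist_name;
  my $package = join '::', @parts;
  my $file    = 'lib/' . join('/', @parts) . '.pm';

  return ($package, $file);
}

my ($package, $file) = guess_main_package('Totally-Fake-Software');
print "$package in $file\n";
```

Anything that doesn't fit this guess, like a second package living in somebody else's file, is what the explicit packages list is for.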

weird stuff

I use Module::Faker for testing PAUSE, and one thing that PAUSE has to do is deal with slightly (or not so slightly) malformed input. With no specification for the shape of a distribution (other than the strong suggestion that you follow what's in the CPAN::Meta spec), malformedness is in the eye of the beholder. I can tell you, though, that I have beheld a lot of malformed stuff in my time.

One example was pretty innocuous: in META.json, there can be a "provides" entry saying what packages are provided by this distribution, and in what file they're found. It might look like this:

  "provides": {
    "Some::Package::Here": { "file": "lib/Some/Package/Here.pm" }
  }

From time to time, people have inserted an entry with an undefined file value. This was used as a means to say, "I am claiming ownership of this name, which I do not provide." A few years ago, the toolchain gang came to a consensus: file needs to be provided, naming some actual file. Unfortunately, CPAN::Meta features footgun protection, shielding you from accidentally (read: on purpose) making this kind of bogus entry. I needed a way to produce it, and I really did not want to resort to editing and repacking tarballs by hand.

To do this, I added a new option, meta_munger:

  $dist = Module::Faker::Dist->from_struct({
    name    => 'Some-Package-Here',
    version => '1.0',
    cpan_author => 'TSWIFT',
    meta_munger => sub {
      $_[0]{provides} = {
        "Some::Package::Here" => { file => undef },
      };

      return $_[0];
    },
  });

Doing this involved disassembling and duplicating a bit of CPAN::Meta, but it was worth it. All manner of stupid garbage can now be stuffed into fake dists' metadata. In fact, the munger can decide it's going to act like 2009-era RJBS by putting JSON into the META.yml file, it can use YAML::Syck, it can return syntactically invalid JSON, or whatever you like. Do your worst!

The munger runs quite late in the metadata generation process. Especially noteworthy, it runs after the metadata structure has been generated for a specific version. You might just want to provide a little bit of extra metadata, like an x_favorite_treat to be handled like any other bit of metadata, with CPAN::Meta::Merge. Easy enough:

  $dist = Module::Faker::Dist->from_struct({
    name    => 'DateTime-Breakfast',
    version => '1.0',
    cpan_author => 'BURGERS',
    more_metadata => { x_favorite_treat => 'cruffins' },
  });

It's certainly less work to write than a munging subroutine.

I added support for different "style" packages. What does that mean? Well, given this code:

  my $dist = Module::Faker::Dist->from_struct({
    cpan_author => 'MRXII',
    name        => 'Cube-Solver',
    version     => 'v3.3.3',
    packages    => [
      'Cube-Solver-Rubik' => { in_file => 'Solver.pm', style => 'legacy' },
      'Cube-Solver-GAN'   => { in_file => 'Solver.pm', style => 'statement' },
      'Cube-Solver-LHR'   => { in_file => 'Solver.pm', style => 'block' },
    ],
  });

You get this result:

  ~/code/Module-Faker$ cat Cube-Solver-v3.3.3/Solver.pm

  =head1 NAME

  Cube-Solver-Rubik - a cool package


  package Cube-Solver-Rubik;
  our $VERSION = 'v3.3.3';

  package Cube-Solver-GAN v3.3.3;

  package Cube-Solver-LHR v3.3.3 {

    # Your code here

  }


I actually haven't documented this feature, yet, because I'm not happy enough with it. For example, you can't get an assignment to $VERSION inside a package block. I'd like to make this all a bit more flexible. Sometimes I half think that it would be cool to merge Module::Faker's behavior into Dist::Zilla's dist minting features, but I think this would be more trouble than it's worth. Probably.

I didn't document that, but I did document quite a lot of Module::Faker::Dist, which was previously entirely undocumented. Great!

PTS 2019: Getopt::Long::Descriptive (2/5)

by rjbs, created 2019-04-29 20:37
last modified 2019-04-30 15:43

One non-PAUSE thing I worked on was Getopt::Long::Descriptive. I added a small new feature. It supports something like this:

  describe_options(
    "%c %o ARG...",
    [ "foo",  "should we do foo?" ],
    [ "bar",  "should we do bar?" ],
    [ <<~'EOT' ],

      How should you know whether to use --foo and --bar?  Well, the choice
      is simple.  If you want to foo, use --foo, and if you want to use bar,
      don't, because --bar hasn't been implemented.
      EOT
  );

This is okay, but when you run it, you get:

  examine-program [long options...] ARG...
    --foo  should we do foo?
    --bar  should we do bar?
    How should you know whether to use --foo and --bar?  Well, the
  is simple.  If you want to foo, use --foo, and if you
    want to use bar,
  don't, because --bar hasn't been implemented.

Waaaa? Well, GLD is helpfully trying to word wrap and indent text for you, and it does a terrible job in the case of large hunks of text that you want displayed verbatim. I added a way to tell it to trust you, the author, on the indenting.

  describe_options(
    "%c %o ARG...",
    [ "foo",  "should we do foo?" ],
    [ "bar",  "should we do bar?" ],
    [ \<<~'EOT' ],

      How should you know whether to use --foo and --bar?  Well, the choice
      is simple.  If you want to foo, use --foo, and if you want to use bar,
      don't, because --bar hasn't been implemented.
      EOT
  );

Did you miss it? It's the \ turning the heredoc string into a reference.

Of course, the code to implement this was nearly trivial. I spent more time on figuring out what was going on where than I did fixing it. I expect to start using this feature in other code more or less immediately.
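To show the idea rather than the actual patch: a formatter can tell "wrap this for me" from "leave this alone" by checking whether it was handed a string or a reference to one. The sketch below is not GLD's implementation. `render_text_entry` is a made-up name, and Text::Wrap, the four-space indent, and the 40-column width are stand-ins for whatever GLD really does.

```perl
use strict;
use warnings;
use Text::Wrap qw(wrap);

# A sketch only: render one free-text spec entry. A plain string is
# re-wrapped and indented; a reference to a string is returned verbatim.
sub render_text_entry {
  my ($entry) = @_;
  my $text = $entry->[0];

  # The author passed \"...", so trust their line breaks and indenting.
  return $$text if ref($text) eq 'SCALAR';

  local $Text::Wrap::columns = 40;
  return wrap("    ", "    ", $text);
}

my $help = "Use --foo to foo.\n  (This line is indented on purpose.)\n";

print render_text_entry([  $help ]);   # wrapped and re-indented
print render_text_entry([ \$help ]);   # exactly as written
```

The nice part of using a reference as the signal is that existing callers, who all pass plain strings, keep getting the old behavior.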

PTS 2019: Marlow (1/5)

by rjbs, created 2019-04-29 20:36
last modified 2021-10-09 22:02

For several reasons, I had considered not going to this year's PTS. Going was a good idea, and I'm glad I did it, and I definitely got valuable things accomplished, but the main reason I decided to go wasn't the morale boost I'd get from writing code. It was Marlow, the town where the event was hosted. Marlow is Neil Bowers' home town, and I've heard about it quite a bit over the years. PTS was an opportunity to see it in person, to have a bit of a visit with Neil, to see the oft-praised Marlow Bookshop, and as I learned fairly late into the planning, to eat at a two Michelin star restaurant. I knew that the price of doing this would be hard work at the summit and I accepted it. Fortunately, I got to do all those things, they were great, and then the hard work was rewarding on its own. Two thumbs up.

Marlow felt just one notch up from tiny. The entirety of downtown seemed to be two roads of shops, each one about three blocks long. How tiny is it really? I couldn't say, I only spent a few hours there. Almost everything was recommended, except the fish and chips shop, about which apparently the less said the better. I did make it to the bookshop, which was charming, but I didn't spend enough time there to really take it in. I ate some great Thai and Vietnamese food, but the star meal was at the Hand and Flowers, the star-bearing "pub" I mentioned earlier. While we ate, Neil said, "Oh, and there are two three-star restaurants about five miles from here." I guess PTS will be back in Marlow sometime in the future, right?

Meeting Neil's family was also the impetus for me to finally learn how to play Dutch Blitz, and left me thinking I should really be able to solve a Rubik's cube. Also, who knew how many English people would be distressed at the prospect of watching someone drink their tea without milk?

So the town was nice, and about a mile's walk from the venue, Bisham Abbey. The manor house was built about 750 years ago, although significantly modified since. The building was imposing and attractive, and had a beautiful lawn that led right up to the banks of the Thames. Despite a few weirdly shaped doorways and slightly too-steep stairs, it was a great venue for the summit. It would've been nice to claim a few more of its rooms, but it was never a problem.


Bisham Abbey

We didn't stay at the abbey. Instead, we stayed just across a driveway at the national sports center on the abbey grounds. It's where Britain has some of its Olympians train, so it had a great gym, which I managed to use a few times. It also had a presumably even better gym, labeled the "Elite Gym", but it wasn't for mere mortals like me. The room was just fine. The food was fine. I failed to get any dinner on Saturday night, which made me realize that one of my travel bag necessities on future trips should be some kind of food for when I do that. A couple MetRx bars might be a reasonable emergency backup.

I left Marlow this morning the same way that I arrived: in a hired car driven by somebody hired by Neil, who knew exactly where I had to go, so all I had to do was sit down and wait. Top marks for that. Who doesn't love being picked up by a driver with your name on a slate? (It said "RJBS". Heh.)

Looking back, I almost wish that I'd had a bike while in Marlow, but the way that Brits drive on small country roads terrifies me. I'll just stick to walking next time.

I went to the Perl Toolchain Summit in Oslo!

by rjbs, created 2018-04-27 19:58
last modified 2018-04-28 18:13

The last time I wrote in this journal was in January, and I felt committed, at the time, to make regular updates. Really, though, my first half of 2018 has been pretty overwhelming, and journal posts are one of the many things I planned that didn't really happen. I'm hoping I can get back on track for the second half, for reasons I'll write about sometime soon. Maybe?

But I just got back from Oslo and the Perl Toolchain Summit, and it's practically a rule that if you go, you have to write up what you did. I like the PTS too much to thumb my nose at it, so here I am!

This was the 11th summit, and the second to be held in Oslo, where the first one was held in 2008. I've attended ten of them, including that first one, and they've all been great! They're small events with about thirty people, all of whom are there by invitation. The idea is that the invitees are people who do important work on "the toolchain," which basically means "the code used to distribute, install, and test Perl software modules." People show up with plans in mind, and then work on them together with the other people whose work will likely be affected or involved. There's always a feeling that everyone is working hard and getting things done, which helps me feel energetic. Also, I like to take breaks and walk around and ask what people are doing. Sometimes they're stuck, and I can help, and then I go back to my own work feeling like a champ. It's good stuff.

I've worked on a number of parts of the toolchain, over the years, but most often I spend my time on PAUSE, with just a little time on Dist::Zilla. That's where my time went again this year. I wasn't at PTS last year, as I was in Melbourne for work. In the last two years, I've been doing much less work on CPAN libraries. That, plus some recent tense conversations on related mailing lists, had me feeling unsure about the value of my going. By the end, though, I was very glad I went, and felt like I can be twice as productive next year if I spend just a little time preparing my work before I head out.

Here's what I actually did, more or less. Mostly less, as it's been a few days. I should have kept day to day notes as I've done in the past! There's another way to improve for next year.


I got to Oslo on Tuesday the 17th and walked around the city with David Golden. We didn't do much of anything, but I bought some things I'd forgotten to pack — this was my worst travel packing in years — and ate dinner with Todd Rinaldo. Mostly, my goal was to stay awake after taking a redeye flight, and I succeeded. The next day, we did more of the same, but also went to the National Gallery, which was a good stop. I wish I'd gone to the contemporary art museum down near the shore, but instead I got sidetracked into conversations about Perl. It could've been worse!

PAUSE work

PAUSE is the Perl Author Upload SErver. When you upload a file "to the CPAN," PAUSE decides whether you had permission to do what you're doing and then updates the indexes used by CPAN clients to install things.

Two years ago, in Rugby, we added a feature to PAUSE that would normalize distribution permissions on new uploads, making sure that if a co-maintainer uploaded a new version of a distribution, including some new package for the first time, the new package got the same permissions as the distribution's main module had. This is part of the slow shuffle of PAUSE from dealing only in package permissions to having a sense of distribution permissions.

I think the change was a good idea, but it had a few significant bugs. Andreas König had fixed at least some of them since then, but he had seen one file recently that hit a security problem. Unfortunately, his reproducer no longer worked, and we weren't sure why. While Andreas tried to reproduce the bug, I sat down and read the related code closely until I had a guess what the problem was.

I was delayed by an astounding bug in DB_File that manifested on my machine, but not his. This code:

  tie my %hash, 'DB_File', ...;

  $hash{ $_ } = $_ for @data;

  say "Exists" if exists $hash{foo};
  say "Grep"   if grep {; $_ eq 'foo' } keys %hash;

...would print "Grep" but not "Exists". In other words, exists was broken on these tied hashes. Rather than go further down this rabbit hole, we removed the tie. There's a lot more memory available for processes now than when the code was written maybe twenty years ago!

When Andreas had a reproducer working, we tried my fix… and it failed. The theory was right, though, and I had a real fix pretty quickly. Once we knew just what the problem was, it was pretty easy to write a simple test case for the PAUSE test suite.

After that, Andreas mentioned we'd seen some errors with transaction failures. I did a bit of work to make PAUSE retry transactions when appropriate and, while I was doing that, made what I thought were significant improvements to the emails PAUSE sends when things go wrong.
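The retry idea, stripped of everything PAUSE-specific, looks something like the sketch below. `with_retries` is a hypothetical helper, and the error patterns are an assumption about what a transient failure might look like. In real use the callback would begin and commit a database transaction instead of the fake one here.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code, retrying up to $tries times if it
# dies with an error that looks like a transient transaction failure.
sub with_retries {
  my ($tries, $code) = @_;

  for my $attempt (1 .. $tries) {
    my $result;
    return $result if eval { $result = $code->(); 1 };

    # Only retry errors we believe are transient; rethrow the rest.
    my $error = $@;
    die $error unless $error =~ /deadlock|lock wait timeout/i;
    die "giving up after $tries attempts: $error" if $attempt == $tries;
  }
}

# A fake "transaction" that fails twice before it succeeds.
my $calls = 0;
my $value = with_retries(3, sub {
  die "Deadlock found when trying to get lock\n" if ++$calls < 3;
  return "committed";
});

print "$value after $calls calls\n";
```

Rethrowing anything that doesn't look transient matters: retrying a genuine bug three times just produces three copies of the same failure.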

Doing this work, like a lot of PAUSE work, involved working out a lot of issues of action at a distance and vestigial code. In fact, the bug boiled down to a piece of code being added in the wrong phase, and the phase to which it was added should've been deleted a year before the bug was introduced. Since it wasn't, it was a deceptively inviting place to add the new feature. I decided I would try to purge some code that was long overdue for purging, and to refactor some large and confusing methods.

I spent a fair bit of Thursday afternoon on this, as a first pass. I think I made some good improvements. My biggest target was "the big loop", a while loop labeled BIGLOOP in PAUSE::mldistwatch aka "the indexer". I pulled this code apart into a few subroutines, and then did the same thing to some of the routines it called. More and more, I felt confident that there were (and still are) two main problems to address:

  • Distinct concerns like permissions and side-effects are interwoven and difficult to separate out. I started to discuss the idea of making the first phase of indexing construct a plan of side effects which would then all be taken in a second phase. Easier said than done! Later in the week, David Golden did some work toward this, though, and I think next year we can make big strides.
  • Often, decisions are made at a distance. During phase 1, a check might cause a variable on some parent object to be set. Later, in phase 6, a different object might go and find that parent and check the flag in order to make a decision. This is all somewhat ad hoc, so it's often unclear why a flag is being set or what the full implications of setting it might be. Achieving a desired effect late in the program might require changing the actions taken very early.
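The "plan first, act second" idea from the first point can be sketched like this. None of it is PAUSE code. The data shapes and the names `plan_indexing` and `apply_plan` are invented for illustration. The point is only that the first phase returns plain data and the second phase is the only place anything changes.

```perl
use strict;
use warnings;

# Phase 1 (hypothetical): inspect an upload and return a list of
# planned actions as plain data. Nothing is changed yet.
sub plan_indexing {
  my ($upload, $perms) = @_;
  my @plan;

  for my $pkg (@{ $upload->{packages} }) {
    my $owner = $perms->{ lc $pkg };
    if (defined $owner && $owner ne $upload->{author}) {
      push @plan, { action => 'reject', name => $pkg, reason => "owned by $owner" };
    } else {
      push @plan, { action => 'index', name => $pkg };
    }
  }

  return \@plan;
}

# Phase 2: carry out the plan. All side effects live here.
sub apply_plan {
  my ($plan, $index) = @_;
  for my $step (@$plan) {
    $index->{ $step->{name} } = 1 if $step->{action} eq 'index';
  }
  return $index;
}

my $plan = plan_indexing(
  { author => 'ALICE', packages => [ 'Foo::Bar', 'Foo::Baz' ] },
  { 'foo::baz' => 'BOB' },
);

print "$_->{action} $_->{name}\n" for @$plan;
my $index = apply_plan($plan, {});
```

A plan like this can be printed, tested, or thrown away without touching a database, and each step can carry its own reason, which also helps with the second point about decisions made at a distance.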

On Friday, I carried on work to deal with this, but eventually I stopped short of any major changes. It was going to take me all of Friday to come up with a solution I'd like, and I didn't think I could have the implementation done in time to want to deploy it. The last thing I wanted was to push Andreas to deploy a massive refactoring on Sunday night before we all went home! Instead, this problem is on my list of things to think hard about in the weeks leading up to the 2019 PTS.

Finally, I had a go at the final (we hope) change needed for the full case-desensitization of PAUSE. Right now, if you've uploaded Foo::Bar, PAUSE prevents you from later uploading FOO::BAR. This was a conscious decision to avoid case conflicts, but we've since decided that you should be able to change the case, if you have permissions over the flattened version, but we must keep everything consistent. I made some good progress here, but then hit the problems above: side effects and checks happen in interleaved ways, so getting everything just right is tricky. I have a branch that's nearly done, and I hope to finish it up this year.

While doing that, I also made some improvements to the test system. I'm proud of the PAUSE test suite! It's easy to add new tests, and now it's even easier. In the past, if you wanted to test how the indexer would behave, you'd build some fake CPAN distributions and mock-upload them. These distributions could be made by hand, or by using Module::Faker. Either way, they took the form of tarballs sitting in a corpus directory. Making them was a minor drag, and once they were made, you weren't always sure what their point was.

I added a new method, upload_fake, that takes a META.yml file, builds the dist that that file might represent, and uploads that file for indexing. For example, given this metafile:

  # filename: Foo-Bar-0.001.yml
  name: Foo-Bar
  version: 0.001
  provides:
    Foo::Bar:
      version: 0.001
      file: lib/Foo/Bar.pm
  X_Module_Faker:
    cpan_author: FCOME

...the test suite will make a file with lib/Foo/Bar.pm in it with the code needed: package statements, version declarations, and so on. Then it will upload it to F/FC/FCOME/Foo-Bar-0.001.yml. This uses Module::Faker under the hood, and I took a little time to make some small tweaks to Module::Faker. I have some good ideas for my next round of work on it, too.

Oh, and finally: we want to add a new kind of permission, called "admin", which lets users upload new versions of code and grant uploading permissions to others, but is clearly not the primary owner. Right now, we don't have that. David and I both made some inroads to making it possible, but it's not there yet.


First, I applied a bunch of small fixes and made a new v6 release. These were all worthy changes, but fairly uninteresting, with the possible exception of an update to the configuration loader, which will now correctly load plugins from ./inc on perl v5.26, where . is not in @INC.

After that, I made a new release of v5. It includes a tiny tweak to work better with newer Moose, so you can still get a working Dist::Zilla on a pre-5.14 perl. Don't get too excited, though. I still don't support v5. This release was made at the request of Karen Etheridge, but I'm not sure she's eager to field any support requests, either. Consider upgrading to v6!

After that, I started work on v7. At work, newer code is being written against perl v5.24, and we use lots of new features: lexical subroutines, pair slicing, postfix dereferencing, subroutine signatures, /n, and so on. If practical, I wanted to be able to start doing that in Dist::Zilla.

I have a program that crawls over the CPAN, unpacking every distribution and building a small report about its contents. Here's an example report on one dist:

  sqlite> select * from dists where dist = 'Dist-Zilla';
           distfile = RJBS/Dist-Zilla-6.012.tar.gz
               dist = Dist-Zilla
       dist_version = 6.012
             cpanid = RJBS
              mtime = 1524298921
       has_meta_yml = 1
      has_meta_json = 1
          meta_spec = 2
  meta_dist_version = 6.012
     meta_generator = Dist::Zilla version 6.012, CPAN::Meta::Converter version 2.150010
   meta_gen_package = Dist::Zilla
   meta_gen_version = 6.012
      meta_gen_perl = v5.26.1
       meta_license = perl_5
     meta_yml_error = {}
   meta_yml_backend = YAML::Tiny version 1.70
    meta_json_error = {}
  meta_json_backend = YAML::Tiny version 1.70
  meta_struct_error = {}
       has_dist_ini = 1

I originally wrote this to tell me how many people were using Dist::Zilla, but it's useful for other things, like dependency analysis (not shown, above, is the dump of all the module requirements in the metafile) or common YAML errors.

The meta_gen_perl field looks for a new field I just added to all Dist::Zilla distributions, telling me the perl used to build the dist. Failing that, it looks for output from MetaConfig. You won't yet see these data for dists not built by Dist::Zilla. I looked for what perl version was being used to build distributions with Dist::Zilla v6:

  sqlite> SELECT SUBSTR(meta_gen_perl, 1, 5) AS perl, COUNT(*) AS dists
          FROM dists
          WHERE meta_gen_package = 'Dist::Zilla'
            AND meta_gen_perl IS NOT NULL
            AND SUBSTR(meta_gen_version,1,1)='6'
          GROUP BY SUBSTR(meta_gen_perl, 1, 5);

  perl        dists
  ----------  ----------
  v5.14       2
  v5.16       5
  v5.18       4
  v5.20       29
  v5.22       98
  v5.23       7
  v5.24       675
  v5.25       204
  v5.26       563
  v5.27       54

This isn't the best data-gathering in the world, but it made me feel confident about moving to v5.20. I started a branch, applied the commits that had been waiting for v5.20, and then got to work with other changes I wanted:

  • lexical subroutines
  • subroutine signatures
  • eliminating circumfix dereference

Lexical subs were only useful in a small number of places, as I expected. (The use here is "making the code a bit nicer to read".) Subroutine signatures were much more useful, and found a number of bugs or sloppy pieces of code, but they introduced a new problem.

Subroutine signatures enforce strict arity checking by default. That is, if you write this:

  sub add ($x, $y) { $x + $y }

  add(1, 2, 3);

...then you get an error about too many arguments. This is good! (It's also easy to make your subroutine accept and throw away unwanted arguments.) The not so good part is that the error you get tells you what subroutine was called incorrectly, but not by what calling line. This has been a known problem since signatures were introduced. For the most part, even though I use signatures daily, I hadn't found this to be a major problem. This time, though, a new pattern kept coming up:

  around some_method ($orig, $self, @rest) { ... }

Now, if the caller of some_method got the argument count wrong, I'd only be told that Class::MOP::around was called incorrectly. This could be anything! I'm going to push for the diagnostics to be fixed in v5.30.
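For the parenthetical above about accepting and throwing away unwanted arguments: a bare @ at the end of the signature does it. A small sketch follows, where `add_lenient` is just a name made up for the comparison. The exact wording of the arity error varies by perl version, so it's printed here rather than matched.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# Strict arity: exactly two arguments, or die at call time.
sub add ($x, $y) { $x + $y }

# A bare @ at the end accepts and discards any extra arguments.
sub add_lenient ($x, $y, @) { $x + $y }

print add_lenient(1, 2, 3), "\n";

my $ok = eval { add(1, 2, 3); 1 };
print "add(1, 2, 3) died: $@" unless $ok;
```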

In eliminating circumfix dereferencing, what I found was that I was always happier with postfix — and I knew I would be. I had already made a postfix-deref branch of Dist::Zilla years ago when the feature was experimental in a branch of v5.19. What I also found, though, was that I often wanted to eliminate the dereferencing altogether. Often, Dist::Zilla objects have attribute accessors that return references, often directly into the objects' guts. In those cases the reference doesn't just make things ugly, it makes things unsafe. I began converting some accessors to dereference and return lists. This broke a few downstream distributions, but nothing too badly. Karen helped me do some testing on this, along with some other v7 changes, and will probably end up dealing with more maybe-breakage based on v7 than anybody but me. I think I'll definitely keep these changes in the branch, and try to make sure everything is fixed well in advance.
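Here's the kind of unsafety I mean, shown with a toy class rather than real Dist::Zilla code. `Hypothetical::Dist` and both accessor names are invented for the example.

```perl
use strict;
use warnings;

package Hypothetical::Dist {
  sub new {
    my ($class, @names) = @_;
    return bless { names => [ @names ] }, $class;
  }

  # Returns a reference straight into the object's guts.
  sub names_ref  { $_[0]{names} }

  # Returns a flat list: a copy the caller can't use to alter the object.
  sub names_list { @{ $_[0]{names} } }
}

my $dist = Hypothetical::Dist->new(qw(README Changes));

my @copy = $dist->names_list;
push @copy, 'junk';                   # the object is unaffected

push @{ $dist->names_ref }, 'junk';   # the object now has three names

my @now = $dist->names_list;
print scalar(@now), " names: @now\n";
```

With the reference-returning accessor, any caller can change the object's state without the object ever hearing about it. With the list-returning one, callers only ever get a copy.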

The attribute I didn't want to just change to flattening, though, was $zilla->files, which returns the list of files in the distribution. For years, I've wanted to replace the terrible "array of file objects" with something a bit more like a filesystem. This would make "replace this file" or "delete the file named" or "rename this file" easier to write, check, and observe. It felt like fixing that might be best done at the same time as fixing other reference attributes.

So, v7 isn't abandoned or aborted, but it's definitely going to get some more thinking before release. That also gives me more time to collect more perl version usage data.


I made a new release of Test::Deep, mostly improving documentation. I also had a go at converting it to Test2::API. This caused some nervous noises from people who didn't want to see Test2 become required by something so high up the dependency river. What I found was that Test::Tester, which Test::Deep uses to test its own behavior, basically can't be used to test libraries using Test2::API without Test::Builder. If nothing else, that puts a significant damper on my musings about the Test::Deep change. It was only a quick side-project, though. I'm not in a rush or particularly determined to make the change.

I also talked with Chad Granum about Test2::Compare. I like parts and other parts I am not so keen on. I've wondered whether I can produce a sugar for Test2::Compare that I'd like. So far: maybe.


This year, there were fairly few big talking groups, which was fine by me. Sometimes, they're important, but sometimes they can be a big energy drain. (Sometimes, they're both.) We had a talk about building a page to help CPAN authors prepare for making new releases of highly-depended-upon distributions. I did a bit of work on that, but mostly just enough to let others contribute. I'm not sure how much success we've really had in the past with building how-tos.

There was some talk about converting PAUSE to have more dist-centric, rather than package-centric permissions. I agree that we should do that, but I knew it wasn't going to happen this year, so I stayed out of it as much as I could.

I interrupted a lot of people to ask what they were doing, which was often interesting and, maybe just once or twice, helpful to my victim.

Nico and Breno found that \1 is flagged read-only, but \-1 is not. "Oh good grief," I said, "it's going to be because -1 isn't a literal, it's an expression."

Sadly, I was right.
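A quick way to poke at this yourself is to try assigning through each reference and see whether perl objects. This is only a sketch: try_write is a made-up helper, and what it reports for \-1 may differ between perl versions, so no output is promised here.

```perl
use strict;
use warnings;

# Try to assign through a reference and report whether perl allowed it.
sub try_write {
  my ($label, $ref) = @_;
  my $ok = eval { $$ref = 42; 1 };
  printf "%-4s %s\n", $label, $ok ? 'writable' : 'read-only';
  return $ok ? 1 : 0;
}

my $one_ok   = try_write('\1',  \1);    # reference to a literal
my $minus_ok = try_write('\-1', \-1);   # reference to the result of negating 1
```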

Chad showed me the tool he's working on for displaying test suite run results on the web, and it looked very nice. I asked if he'd ever heard of Smolder, and he said no, and I felt like an old-timer. Smolder was a project by Michael Peters, who was one of the attendees of the first summit (then the Perl QA Hackathon).

I talked with Breno about Data::Printer, and was utterly gobsmacked to learn that he implemented the "multiple Printer configurations in one process" feature that I've been moaning about (and blowing hot air about implementing) for years. I also showed off my Spotify playlist tracker to him and to Babs. I did not manage to get the Discover Weekly playlist of every attendee, though. Fortunately, this is easy to follow up on after the fact.


Other things of note: we went to a bar called RØØR and played shuffleboard. That was terrific, and I would like to do it again. We had a number of good dinners, and the price of beer in Oslo prevented me from overindulging. I couldn't find double-edged safety razors anywhere, so didn't shave until I couldn't stand it anymore and used a disposable razor from the hotel's shop. It was terrible, but I felt much better afterwards. I had skyr in Iceland, and it was good. I had salted liquorice in Oslo and it was utterly revolting, every single time I tried another piece.

Unlike some previous hackathons, we had lunch and dinner served at the workspace most days. I was worried that this might introduce the "20 hour days of doing terrible work because you're not sleeping" vibe of some hackathons. It did not. We still wrapped up whenever we felt done, but if we wanted to work just a little later to finish something, we could. It was good.


All the organizers did a great job, and it was a great event. I'm definitely looking forward to next year (in England), and I realize now that if I do some more prep work up front, I'll be much more successful. (I worked like this in the past, but I have since lost my way.)

The Perl Toolchain Summit is paid for by sponsors who make it possible to get all these people into one place to work without distraction. Those sponsors deserve lots of thanks! They've helped produce a lot of positive change over the years.

Thanks very much to Salve, Stig, Philippe, Laurent, Neil, and our sponsors: NUUG Foundation, Teknologihuset, Booking.com, cPanel, FastMail, Elastic, ZipRecruiter, MaxMind, MongoDB, SureVoIP, Campus Explorer, Bytemark, Infinity Interactive, OpusVL, Eligo, Perl Services, and Oetiker+Partner.

2017 in review, the 2018 plan

by rjbs, created 2018-01-01 15:40
tagged with: @markup:md journal todo

It has been almost a year since my last blog post, in which I complained that Slack silently drops some messages for people using the IRC gateway. That bug was about a year old then, and it's about two years old now. I'm still annoyed… but that's not why I'm here. I'm here to summarize what I did this year, since I forgot to post anything about it during the year.

I have been keeping busy, despite appearances to the contrary. My involvement in the Perl 5 community has been way down: I'm doing less on p5p, touching less on the CPAN, and I didn't even make it to the Perl Toolchain Summit this year, for the first time in seven years! What has caused this disruption? In part it's just a continued trend, but mostly it has been work. I have been busy, busy, busy.

There have been two parts to that. The first, at least chronologically, was Topicbox. Topicbox is FastMail's newest product, and I spent about a year and a half hard at work on building it (but not alone). The lousy five cent explanation is "it's a mailing list system," but that description is lousy. It calls to mind Mailman or Majordomo, or even our previous email discussion product, Listbox. I really enjoyed using Listbox, but Topicbox puts it to shame. Beyond being much (much) faster and simpler, it changes the focus from mailing lists to organizations. You create an organization, invite its members, and then discussions can be organized into topics.

Topicbox is built on JMAP, a developing standard for efficient client/server applications based on simple synchronized object storage. We wrote it in Perl 5, in Ix, a framework we built for writing JMAP applications. I spent much of 2016 working alone on Ix, but now several members of the internal dev team spend a lot of their time on it and on Topicbox. At first, this was mainly a quality of life improvement for me. I'd been largely working in isolation, which was fine, but not great, and having people to discuss my work problems with was a big win.

Since then, the kind of work problems I have has changed substantially, which is the second thing that's kept me so busy with work this year. Pobox became part of FastMail in late 2015, and I think everybody realized pretty quickly that nobody on either side of the deal thought anybody on the other side was a boor or a cretin. Still, things moved slowly. The acquisition took (as I remember it) about fifty-two years to complete, and further integration of the teams was slow going. By early 2016, we knew we needed to re-organize the company at least somewhat to provide some structure for the growing team. We went through a few iterations of this, and by mid-year I ended up with business cards that said, "Ricardo Signes, CTO."

I've long wanted to be doing technical management, and this change has felt like I skipped a step or two on my imagined career path — which is just fine, since I felt like I'd spent a few too many years as an individual contributor. It's been a really enjoyable challenge to bring myself up to speed on parts of the system I've previously half-ignored, to start to build a coordinated plan for the future of our systems, and to work with all of the technical staff to execute that vision. I feel pretty good about our plans for 2018, and about the team's general excitement for the future. Beyond my project plan for 2018, I want this to be the year where I figure out and (more or less) lock down the amount of time I spend on different kinds of work.

I wrote todo lists for a few years in the past, and most of the time, I did very poorly. My finding was generally that I'd make pretty good progress on all my ongoing work projects, and could keep up with the random things I had going on, but I rarely made good progress on goals stated up front.

Probably many different things contributed to this, ranging from laziness to changing interests to bad time management to lack of accountability. I've tried to address some of these problems in the past, and I'm still trying, and I'm sure I'll never make as much of a one-year improvement as I want, but I'm going to keep trying.

My plan for 2018 is to try to stick to a three-tier routine: a daily routine, a weekly routine, and a monthly routine. Each one will start with a "make specific goals beyond the routine ones" and end with "write something about how you did." This writing will mostly go into my Day One journal, rather than boring everybody who subscribes to my blog's long-dead RSS feed… but I'll try to post some updates when I think it's interesting.

In 2017, I set my Goodreads reading goal to 52 books, and I hit 50% of that goal with a few hours to spare. For 2018, I've scaled back to 48, so I'll still have to push to hit it. I've set a few more specific goals about socializing with people, and I've said that every month I need to pick a skill to get better at, and then work on it. Maybe in time I'll realize that a month is longer than I need for most things, and make it a weekly pick.

This month, my goal is to get a better handle on IMAP. I have a tolerable understanding of it, but there are a few parts I could brush up on, and I think this will be a nice place to start. (It might also make it clear whether a month is too long for something small in scope.)

The other thing I need to keep in mind is that lots of other things are bound to come up and try to disrupt my routine. Sometimes, I might have to let them. When that happens, I need to be sure that I recover from the disruption, and that I don't view it as a failure, but just as a thing that happened. I think that a routine is a big help at getting things done, and that I've traditionally been so-so at sticking to non-work routines. I need to focus energy on getting into one, so that I can eventually have one without having to keep spending energy on it. Right? I think so.

The Slack IRC gateway drops your messages.

by rjbs, created 2017-01-23 20:42
last modified 2017-01-23 22:27

So, imagine the following exchange in private message with one of your team members:

  <alice> So, how have I been doing?
  <bob> Frankly, I don't think it's working out.
  * bob is joking!  You're doing great.

Maybe Bob shouldn't be such a joker here, but sometimes Bob can't help himself. Unfortunately, Bob has just caused Alice an incredible amount of stress. Or, more to the point, Slack has gotten in the way of team communication, leading to a terrible misunderstanding.

The problem is that when you use /me on the IRC gateway while in private chat with someone, Slack just drops the message on the floor and doesn't deliver it! Alice never got that final message.

Read that again, then ask yourself how often you may have miscommunicated with your team because of this. Remember that it goes both ways: maybe you never used /me in privmsg, but did your coworkers? Who knows! I reported this bug to Slack feedback in August 2016 and again in September. When I reported it the second time, I went back in my IRC logs to compare them to Slack logs. If I found a time when an emote showed up in both, I'd know when the problem started.

The answer is that it started around January or February 2016. In other words, this silent data loss bug has been in place for a year and known about for, at the very least, five months. It hasn't been fixed. More than a few times in this period, I've realized that I missed an invitation to a meeting or other important communication because it was /me-ed at me in privmsg. This is garbage.

I can't fix Slack from my desk, but I can fix my IRC client to prevent me from making this error. I wrote this irssi plugin:

use warnings;
use strict;

use Irssi ();

our $VERSION = '0.001';
our %IRSSI = (
  authors => 'rjbs',
  name    => 'slack-privmsg-me',
);

Irssi::signal_add('message irc own_action'  => sub {
  my ($server, $message, $target) = @_;

  # only stop /me on Slack
  return unless $server->{address} =~ /\.slack\.com\z/i;

  return if $target =~ /^#/; # allow /me in channels

  Irssi::print("Sorry, Slack drops /me silently in private chat.");

  Irssi::signal_stop(); # abort the event
});

This intercepts every "about to send an action" event and, if it's to a Slack chatnet and in a private message, reports an error to the user and aborts. I could've made it turn the message into a normal text message, but I thought I'd keep it inconvenient to keep me angry.

Please use this plugin. If you port it to WeeChat, I'll add a link here.

When people tell you, "Of course your free software project should use Slack. There's even an IRC gateway!" remember that Slack doesn't seem to give a darn about the IRC gateway losing messages. It's saying that gateway users are second-class users.



I will make friends by programming.

by rjbs, created 2017-01-15 22:25

Every once in a while I randomly think of some friend I haven't talked to in a while and I drop them an email. Half the time (probably more), I never hear back, but sometimes I do, and that's pretty great. This week, I read an article about Eagle Scouts and it made me realize I hadn't talked to my high school friend Bill Campbell for a while, so I dropped him an email and he wrote right back, and I was happy to have sent the email.

Today, I decided it was foolish to wait for random thoughts to suggest I should write to people, so I went into my macOS Contacts application and made a "Correspondents" group with all the people whom it might be fun to email out of the blue once in a while.

Next, I wrote a program to pick somebody for me to email. Right now, it's an incredibly stupid program, but it does the job. Later, I can make it smarter. I figure I'll run it once every few days and see how that goes.

I wrote the program in JavaScript. It's the sort of thing you used to have to write in AppleScript (or Objective-C), but JavaScript now works well for scripting OS X applications, which is pretty great. This was my first time writing any JavaScript for OSA scripting, and I'm definitely looking forward to writing more. Probably my next step will be to rewrite some of the things I once wrote in Perl, using Mac::Glue, which stopped working years ago when Carbon was retired.

Here's the JavaScript program in its entirety:

  Contacts = Application("Contacts");

  var people = Contacts.groups.byName("Correspondents").people;
  var target = people[ Math.floor( Math.random() * people.length ) ];

  var first = target.firstName.get();
  var last  = target.lastName.get();

  first + " " + last;

what I read in 2016

by rjbs, created 2017-01-03 10:46
tagged with: @markup:md books journal

According to Goodreads, which should be accurate, these are the books I read in 2016. I meant to read more, but I didn't.

The Pleasure of the Text

I'd been meaning to read this since college, as one of my favorite professors spoke highly of Barthes. I found the book unpleasantly difficult and basically uninteresting. I like to imagine that I would have enjoyed the book much more in the original French, as the language in translation felt pretentious. That said, I think the ceiling on my enjoyment was going to be pretty low. I'll stick with Derrida and Debord.

Eclipse Phase: After the Fall

Eclipse Phase is a transhumanist RPG with a lot of interesting ideas. This is a collection of stories set in that game's universe. I did not enjoy it. There were one or two good bits, but mostly it was just not good.

Puttering About in a Small Land

This is one of Philip K. Dick's realistic novels. He wrote a few of these before settling into all sci-fi all the time. I've found them to be surprisingly good. They're mostly just slice of life stories about slightly unhappy people in mid-20th century California, but I have enjoyed them, as I did this one.

Frost: Poems

I was still making progress on my "one book of poetry each month" project! Frost was spectacular! Much better than I had anticipated. Reading him in middle and high school was too early. I couldn't yet appreciate the subtext of his works. I still think back on these poems now.

Ancillary Mercy

I finished the Ancillary Justice trilogy, and it was good. I seem to recall thinking that Mercy wasn't the best of the three books, but the whole trilogy was very good, and I'm glad to have read it.

The Girl with All The Gifts

Gloria recommended this to me a number of times, and I finally took her advice, and it was good advice. This was a good post-apocalypse story with an interesting premise. I think I was happy with the ending. They made a movie of it, but I haven't seen it yet.

Jane Eyre

I'm going to try to keep reading some classic 19th century novels over the next few years. I should've read more last year, but instead read only Jane Eyre. I feel I could've done a lot worse. It had problems (that ending!), but I really enjoyed it. It was charming and funny and well-written. Its position in the canon is well-earned. I look forward to reading Wide Sargasso Sea this year, too.


I felt like this book played on some of the same ideas as Ready Player One, but was a substantially better book. I found the character and premise more interesting than those in Ready Player One, which seemed to be built mostly on evoking nostalgia.

Programming Pearls

This is a book that other computer books recommended to me often enough that I asked for a copy, received it, and put it on my shelf for years. I mostly enjoyed it, although sometimes I found it a bit boring. The interesting bits were very interesting, though. I think my recommendation here is "read it, but skip the parts you don't find interesting."

The Fine Art of Mixing Drinks

I'd been nursing this book for a few years, and finally decided to just finish it. It's a good book to have a run through, although I wouldn't say it revolutionized my drink making.

Uncertainty in Games

This short book discusses the different kinds of uncertainty that can be introduced to games to make them more interesting. I enjoyed the overview, and it helped me think about what makes games that I like interesting (or not). It also didn't fall into the sort of stodgy prose that this sort of theoretical work often does.


This was a tolerable sci-fi story rendered intolerable by the writing of its characters. The romance and love scenes made me groan and roll my eyes. Skip it.

Dr. Adder

This book got a strong recommendation from Philip K. Dick, who said it would've defined cyberpunk if it had been published when written, instead of many years later. It definitely felt like a bridge between Dick and cyberpunk, but it was a big mess and spent a lot of its time reminding the reader how very transgressive it was. I'm glad I finally read it, but you probably shouldn't bother.

The Library at Mount Char

This might be the best recent book I read this year. It's about a group of weird people raised by a dangerous madman. The madman is missing, and they seem to be looking for him. It's pretty dark, but also funny in places. It was a good read.

Saturn Run

The US and China launch missions to Saturn (and back) after seeing evidence of an alien space ship there. The book had a lot going for it, idea-wise, but I found its characters uninteresting and its plot predictable.

Shadow of the Torturer

This is the first book of the Book of the New Sun books. I think they're amazing, especially in that they greatly reward repeat reading. I got a lot more out of them on this read than I did on my last, and I will surely read them again in a few years. Next time, I may take notes. This year, I plan to read Peace, another novel by Wolfe. This one also, I am told, rewards repeat reading, but it's only about 300 pages, which is a relief.

Starting Forth

I had read most of this book a few years ago, but never finished the exercises in the later chapters. This year I forced myself to do so, which led to re-reading a few chapters to get back my Forth legs. I'm very glad I did this, because it renewed my appreciation for Forth and got me to write a simple but instructive program in Forth.

Forth is great and more people should learn it. (Probably almost nobody should be using it for much real work, though.)

Claw of the Conciliator

This is the second part of the Book of the New Sun.

Thinking Forth

Stack Computers: The New Wave

Sword of the Lictor

This is the third part of the Book of the New Sun.

Citadel of the Autarch

This is the fourth part of the Book of the New Sun.

The Urth of the New Sun

This is the fifth and final part of the Book of the New Sun.

The Sonnets (Berrigan)

The most compelling part of this collection is the introduction, which makes big claims. By the end of the book, I felt like maybe those claims hadn't been malarky, after all. The sonnets and general structure of the book became more interesting as it went on, and I began to see more of the simultaneity of the work's poems. I almost felt like starting over with that in mind. Almost.

Who Censored Roger Rabbit

This is the book on which Who Framed Roger Rabbit was based. The film is superior in every way. I am irritated that I spent time on this book.


This is a short story about how libertarianism isn't all that it's cracked up to be. It was good, but maybe I would've liked it better if Vance, rather than Heinlein, had written it.

Learn Vimscript the Hard Way

This is a good book on Vim. It didn't change my life, but I learned things.

The Skin

I'd had this book on my list for years after a favorable review when the book was first published unexpurgated in English. It's a semi-real memoir of Curzio Malaparte's experience of the final months of WWII in Italy. There were parts where I felt certain that Joseph Heller must have read it before writing Catch-22, most especially in the rantings of a man who claimed that Italy had won by losing the war, just like the dirty old man in Catch-22's brothel.

I'm glad I read the book. It was an interesting read and definitely had its moments, but I can't say that I'd strongly recommend it to anybody else. It was disjointed and sometimes difficult to enjoy. Still, as I say, I'm glad I read it.

Also, I should admit that one of the things that bothered me is the amount of dialog in French. It made me feel poorly educated, which I am.

Something Happened

This is Joseph Heller's second book, published 13 years after Catch-22. From Wikipedia:

Something Happened has frequently been criticized as overlong, rambling, and deeply unhappy. These sentiments are echoed in a review of the novel by Kurt Vonnegut Jr., but are balanced with praise for the novel's prose and the meticulous patience Heller took in the creation of the novel.

I agree with those remarks, except for the idea that the prose and patience balanced out the book's painful length. (It's only about 500 pages, but they're long pages.) There was a lot that I really liked about this book, but eventually I couldn't stay engaged and skimmed my way to the irritating conclusion.

Seven Databases in Seven Weeks

I found this about as good as the Seven Languages books. In other words: it was ... okay. It was a tolerable crash course, but I wasn't interested enough to do the exercises. Maybe doing them would've helped, but the book itself didn't get me interested enough to bother. Also, I have generally felt, in these books, that none of the authors has a really interesting narrative or voice.

Still, I understand Neo4j a lot better, now.


Luminous

This is the other short story collection by Greg Egan. In general, I thought every story in it was a failure. Some had promise, but most were lousy, especially compared to his other (better, but still flawed) story collection, Axiomatic. I did sort of like Reasons to Be Cheerful, but not enough to make the book worth reading. Maybe not even enough to make the story worth reading.

Wyst: Alastor 1716

After Luminous, I wanted to read something I was guaranteed to enjoy. Clearly, I thought, I should read some Jack Vance. I asked Mark Dominus for a recommendation, and this was the shorter of the two he recommended. I enjoyed the heck out of it, because Vance is a delight.

Discovering Scarfolk

This is a book from the Scarfolk blog, which presents surreal artifacts and articles from a fictional 1970s English town, Scarfolk. It was a fun and quick read, but I find the blog more fun.

The Informers

While on business in Melbourne, I got thinking about Bret Easton Ellis. Years ago I decided that I needed to space out my reading of his books, so that I wouldn't run out of books too quickly. He's one of my favorite authors, although I'm not sure I could say why. It has something to do with the very, very precise kind of bleakness he presents.

The Informers is (I think) his fourth book, and the fifth that I've read. It's a short story collection, and each story is very very loosely connected to some of the other stories in the book. This mirrors his usual practice of tying his books together with very thin threads. I found that a story collection was a great format for him, because it was able to further spread out the pointlessness depicted. It didn't need much motion at all, because the break between stories was a substitute for any actual rising or falling action.

I was very pleased with my decision to read it.

The Speechwriter

This is a political memoir by one of the speechwriters for Mark Sanford (https://en.wikipedia.org/wiki/Mark_Sanford), former governor of South Carolina. I had read that it provided some great insights into American politics, but it didn't. On the other hand, it was a light, enjoyable read. It was well written and funny when it wanted to be.

The Plagiarist

In a desperate attempt to get one more book read before 2016 ended, I decided to read this sixty-four page novella by Hugh Howey. It's a sci-fi story about a man who visits computer-generated worlds so that he can steal their original works of literature. It was mediocre.

To Build a Better Ballot

by rjbs, created 2016-12-31 08:49
tagged with: voting

Horror Movie Month 2016

by rjbs, created 2016-11-02 23:21
last modified 2019-09-28 19:03
tagged with: @markup:md journal movies

Another year, another thirty-one days of horror movies! I think our selections this year had fewer losers than some past years, but probably also fewer stand-out winners.

What We Watched

October 1: The Thing (1982)

A classic! I forgot how good the creature effects were, and how effective the moments of suspense. Also, as with all of Carpenter's original scores: great music!

We watched this one with Martha, who said it was much less boring than the 1951 version, which we previously tried to watch and abandoned, in preparation for this one. On the other hand, she said it was not scary.

October 2: Don't Breathe

We saw this one in the theatre!

It had a lot going for it, but in the end, we thought it was only okay, at best. I think I might have given it a "pretty good, if flawed," if it hadn't basically become a super-gross rape movie.

No, thanks. We want fun horror movies.

October 3: Paranormal Activity: The Ghost Dimension

a.k.a. Paranormal Activity 6

It was the best Paranormal Activity movie in a long time. This does not mean it was very good. I liked a few things that it did, especially in how it tied itself back to the original film. Still, though, these movies are just not that great. I think it was about as good a capstone as we were going to get.

October 4: Silent Scream

We've spent some time working through a BuzzFeed list of horror movies. Some of the movies we'd already seen, and some we saw over the last two years. Now we're down to movies that I've had a hard time finding. Silent Scream is one of those.

Short version: it wasn't worth it. It had a couple funny bits, but mostly it was one of those 1980-ish films where they were trying to figure out how to make a slasher the right way. This one didn't hit the mark.

October 5: The Gate

Some kids stumble across a hole in the back yard. Their parents go away. The hole is full of tiny demons. They spend the movie fighting the demons before finally banishing them. It's from 1987, and you can tell. It's time to market horror-style movies to young audiences, and that's what this is. It's not very good.

Instead, watch Joe Dante's The Hole.

October 6: He Never Died

Don't read the Wikipedia article! Just go see the movie!

This was probably my favorite new film of the month. It wasn't perfect, and as the movie went on it got less interesting to me, but Henry Rollins was just perfect. He plays a weird loner who lives in the big city and tries to avoid doing anything interesting. He eats eggplant parms every day and plays bingo a few times a week. There is something seriously weird going on with this boring guy.

I enjoyed it.

October 7: Extraordinary Tales

It's an anthology of animated Edgar Allan Poe stories. I was entirely unimpressed. Read them instead.

October 8: We Are Still Here

Middle-aged couple moves into a haunted house. Their neighbors visit but are sort of creepy weirdos. Friends come to visit and things get worse. On its surface, this movie didn't look very interesting, but I enjoyed it. It had good pacing. It was creepy. I wish the story held together a bit better, but it was fun. I would watch a sequel.

October 9: The Others

This was our second horror movie with Martha this year!

Gloria and I had seen this before, and I remembered thinking it was okay. On rewatching, I found it more interesting. It was a pretty solid ghost story.

Martha's verdict: good movie, not scary.

October 10: Ava's Possessions

The movie starts when a young woman has a possessing demon exorcised from her. She's found guilty of crimes she committed while possessed, and one option she's given is to go into a group therapy program for possession victims. This is a good setup, and they do not entirely squander it. I think I would've liked the movie better if there had been less horror action and more frank discussion of the problems of being possessed.

Still, I enjoyed it.

October 11: American Horror Story: Roanoke

Promising start. We liked the format as a ghost stories TV show.

October 12: Green Room

So, it was okay. A lefty punk band gets booked to play a white supremacist skinhead club. Things go sour.

We were excited because it had Patrick Stewart. He was wasted. The rest of the cast was good, though.

In the end, it was just an "everybody tries not to get killed" movie with a touch of torture porn. I was sure it was going to be a werewolf movie, and I was sorely disappointed when it wasn't.

October 13: American Horror Story: Roanoke

Uh oh. The usual American Horror Story thing is starting: too many plot lines, too many characters. What's going on here? Can this possibly remain coherent?

October 14: Southbound

An anthology! I love a good anthology, but so many horror anthologies are just crap. This one was not! It wasn't perfect, but it was good. There were five stories, each one linking to the next. I liked every story (but The Accident may have been my favorite), and I thought they were just connected enough to make it fun.

October 15: Krampus

It's a horror-comedy about Christmas. Krampus, Santa's evil buddy, comes to punish the grinchy. It was okay. It was not great, nor was it terrible.

October 16: The Fog (1980)

Another film that we'd seen before, but now watched with Martha! I like a lot about The Fog, and there's also a lot that I don't like. Maybe my big problem is that I find ghost pirates altogether too corny as villains.

Martha's verdict: she liked it, even though it wasn't scary.

October 16: Big Bad Wolves

I was a little worried when I realized this one would have subtitles, but I didn't mind. Also, this movie sounded very good. That is: the spoken Hebrew sounded really good in the mouths of these characters, even if I didn't understand a word of it.

The movie was well made, and I liked quite a few things about it, but its effective black comedy was undermined by the fact that it was about someone who raped and murdered children.

October 17: American Horror Story

Ugh, we give up. Too much stuff going on. Even if some of it was good, it was too much of a mess.

My proposal was that they try to do this series differently. Instead of acting like it's a whole season of a ghost stories TV series, they should've run a new ghost story each episode, slowly letting the viewer realize that they were somehow connected. Maybe by the end, we break the fourth wall and realize that the horror is now hunting the producers of the show.

Except the producers of American Horror Story would have to add eighteen subplots to the story.

October 18: Glitch

This ended up not seeming very horror-y. It's a new-to-us Australian TV series about a very small number of the dead rising from their graves. It looked pretty good, and we'll definitely watch more of it.

October 19: the third presidential debate

Okay, we skipped any traditional horror viewing to watch the debate between Clinton and Trump. I was left shaken.

October 20: Viral

Viral is an outbreak movie in which a parasite begins to spread rapidly, turning people into murder monsters. The film focuses on the experiences of two sisters trapped in a quarantined LA suburb during the outbreak while their parents are away. It was okay, but I felt like it could've been a lot better with some more work on the script.

October 21: Night of the Living Deb

Weird-o awkward protagonist has a one night stand with the guy she's been crushing on for ages. When they wake up, zombies have taken over their town. As they try to escape they discover what's really happening to their city… and their hearts.

It was okay. It really needed better writing. It occasionally felt like an SNL sketch.

October 22: Detention

We watched this one a few years ago, too. I really, really enjoyed this movie. I liked that it was weird, and unlike anything else, but not pretentious or self-important. It's just fun.

October 23: The Blob (1988)

Another movie watched with Martha! I hadn't seen this movie for maybe twenty years or more. When we watched it, I became sure of something I had forgotten: there was a novelization of this movie, and I read it. I can't be sure, but I feel pretty confident that this is true. I would've been about ten, so I probably thought it was awesome.

Anyway, the movie was cheesy, but not terrible. It was definitely much better than the 1958 version, which I considered watching with Martha a year or so ago. That was tiresome enough that I gave up during my previewing.

This one had some good bits. I especially liked the line cook being pulled bodily down the sink drain, making it bulge like a snake eating a rat.

Martha's verdict: fun, but not scary.

October 23: Insidious: Chapter 3

I may have liked this best of all the Insidious films. A teenage girl wants to contact the spirit of her dead mother, but instead gets the attention of malicious spirits living in her building. A friendly medium is called in to help.

There wasn't really anything in particular that I liked about this movie, I just liked it. I liked the cast, and I was pleased that the movie was neither very weird nor overly hampered by its formula.

All that said, it wasn't great. It was okay.

October 24: The Conjuring

This movie had a lot of problems. I wanted to yell at the TV, "Why are you doing this stupid thing that is obviously going to get someone killed?" Despite this, it was a decent horror movie. The creepy parts were creepy, the relaxing middle parts were relaxing, and it wasn't just jump scares.

Gloria and I are both pretty tired of possessions and haunted houses, though. The ideas aren't spent, but they've both been covered the same way in many, many films, especially in the last ten years. We need to see new ideas.

October 25: The Strain

We watched the first episode of this TV series about vampires invading New York City. (This is my understanding about what the series will be about. Even as I write this, we've only made it to episode three. I'm not sure what's really going to happen.)

I found the first episode sort of interesting, but also a mess. I didn't care enough to keep watching, but I felt like if I watched another episode, I might realize that then I'd want to keep watching. This turned out to be true.

I read somewhere that the producers of The Strain kept wanting to push the limits to see what they could get away with. I can see that, in the show.

October 26: The Purge: Election Year

Each subsequent Purge movie gives us more information about what life is like in the world of The Purge. This is a mixed blessing. On one hand, we want to know how the world has come to this, and what is really going on in America. On the other hand, the movie's answer is stupid and doesn't amount to much more than "bad stuff happened I guess."

I would like to see a short run TV series about The Purge, giving us better stories of how it started, what life is like when the Purge isn't going on, and how cleanup and recovery works the next day.

Just watching people try to make it across town while the laws are on hold gets old, and I could just go watch Escape from New York instead.

October 27: Sinister

Dude moves into a murder house to write about it. He discovers home movies of all kinds of awful murders. He realizes that there is a supernatural force that has been killing families for years. He tries to escape, but he can't, because there is a TWIST!

I liked a lot of things in this movie. It had plenty of good creepy bits. We really liked James Ransone's character and his scenes with Ethan Hawke. In the end, the explanation was not great. Vincent D'Onofrio's character makes things much more confusing than they needed to be.

I was pretty willing to watch more, though. So we did.

October 28: Sinister 2

We were sold on this when we found out that James Ransone's character from the first film would have a major role. Unfortunately, I found this one a lot less interesting. We mostly watched the lives of the two kids, and it was more unpleasant than interesting. The first movie did a good job of making it clear that creepy things happened to kids, but we didn't have to watch. In the second movie, we had to watch a lot of unpleasant kid scenes. Meh.

October 29: Ash vs. Evil Dead

I didn't know this series existed until we heard about it on NPR. We've only watched one episode so far, but I enjoyed it. It was very reminiscent of the original films. I know this shouldn't be surprising, given the people involved, but I was worried.

Gloria said, "It's hard to imagine that this kind of thing will work for two whole seasons." I agree, but I look forward to finding out what happens.

October 30: Poltergeist

Martha had been begging to watch this movie for over a year. Why oh why would we not let her watch it? She would not be scared. So, after a month of "this movie isn't scary" we said she could watch Poltergeist with us.

We settled in to watch the movie. Children were eaten by trees, vanished into televisions, attacked by clowns. Parents floated amid rotting corpses in the pool. A guy peeled his own face off.

Martha's verdict: Good movie, not scary.

I guess I'm just glad she liked it.

October 31: The Witch

I really liked It Follows, and thought it was a good movie in that every part of it was focused on creating a particular mood, and it worked. I heard people compare The Witch to It Follows just for this reason, so I was keen to see it.

Gloria and I both found it boring and uninteresting.

We didn't like the characters. We didn't find the evil things creepy. We didn't think the conflicts were interesting. I was especially irritated that the parents were so religious that it made them stupid. I tweeted this, and of course got replies in support of the idea that all religious people are stupid. Putting this tiresome idea aside: it doesn't make good viewing.


Is this the last 31-movie Horror Movie Month? Maybe.

We've found fewer great new horror movies the last few years, and often the best ones we find, we watch during the year, meaning that October is a bit of a slog. This year was much better, but I'm not sure we'll have that luck in 2017. Maybe we'll switch to mostly watching horror movies with Martha in October instead.

We've got a good eleven months to decide, though.

Horror Movie Month 2015 (body)

by rjbs, created 2016-10-29 19:57
last modified 2016-11-02 21:15
tagged with: @markup:md journal movies

So, it turns out I never posted a summary of our Horror Movie Month for 2015! I tried to recreate our viewing list by looking up my tweets, but there are a bunch of days with no movie tweet. What happened? I'm not sure. Anyway, here's a very late summary, missing days, and with a year's clouding of my memory. How accurate will it be? Who knows!

If I was more dedicated, I'd go find all my tweets from these days to reconstruct my thinking at the time. But I'm not.

The Whole List

October 1: Incident On and Off a Mountain Road

This was the first of several Masters of Horror films that we watched. As with most of them, I thought it had an okay idea that didn't really work out. We were annoyed that the movie seemed to be setting up one kind of scenario, but then it was something else entirely.

October 2: Cam2Cam

There have been a few horror movies told through video chat. This was one of the better ones, but it still wasn't great. I guess I don't regret it.

October 3: All the Boys Love Mandy Lane

All the Boys Love Mandy Lane, too, wasn't great, but I liked the ways it defied or subverted several genre norms.

October 4: Scream


It was still very good. I think this is probably one of my (and our?) favorite slasher movies, because it's scary, funny, and smart. Like many of the best slashers, though, it only works because you know the genre tropes. That's okay, though, because we all do!

October 5: Afflicted

Afflicted was a good take on the "something bad happens on a road trip" movie, and much better than many similar movies I've seen. Found footage, though… even though it often makes sense in the movies being made, it's a technique that needs a long rest.

October 6: 13 Sins

I really liked 13 Sins. It wasn't great, but I enjoyed its little twists, and it worked as a (very) dark comedy.

October 7: American Horror Story

We hated the 2015 season of American Horror Story and gave up after a couple episodes. The story was a mess, there was too much going on, and we didn't care about any of it. Also, we found a number of the actors' performances to be really dull.

October 7, later: Late Phases

I liked it. It wasn't the best thing ever, but I liked the characters, I liked the story, and I liked the way its tone varied into places we don't often see in movies like these. There's a great scene in which the protagonist is riding to town with a busload of (other) senior citizens and nothing much happens.

October 9: Open Grave

I don't remember it well. My recollection is that it had a bunch of really tired tropes, but did okay despite them… but it wasn't great. "Everyone is locked in a room and has amnesia" is something we've probably had enough of.

October 10: Scream 2

It was good!

October 11: Housebound

Gloria had already seen this and consented to watch it again. Why? Because it was good! Gloria likes horror comedies, and so do I, but I keep thinking we should watch serious ones, too, and I'm almost always wrong. In Housebound, a young woman in New Zealand is sentenced to house arrest and slowly begins to realize that the house is haunted. Hollywood would probably make this movie really gritty and creepy and shot in lots of sepia tones. Housebound was funny and surprising and weird. It took a couple twists that surprised me and made me laugh. Endorsed!

October 12: Curse of Chucky

I'd put it in the same bin as the earlier Child's Play movies, but better than most of them. That said, it was only okay, at best.

October 13: Hellraiser Ⅲ

Terrible. Awful. Why did they keep making these? I do mean to finish watching the series, but only because I am an idiot. The only thing I remember liking was "The DJ," a cenobite who throws compact discs with lethal force.

Really, it was just terrible.

October 15: Pick Me Up!

This was another Masters of Horror that didn't deliver on its potential. It was okay, and I'm glad I saw it, but I'd like to see it done over again. It's about two competing serial killers, which was a fun concept.

I learned about Masters of Horror existing because of this movie. I was looking for more films by Larry Cohen, whose work I've largely enjoyed.

October 16: Cheap Thrills

This was fun. Like 13 Sins, it's about some guy who gets told to do increasingly weird things to make money. It's also got a darkly comedic streak. I enjoyed it!

October 18: Mischief Night

Blind woman terrorized by masked killer. I barely remember it at all. We've seen enough of these movies that I do remember, so this one is probably best ignored.

October 19: The Houses October Built

Road trip! A bunch of friends are traveling around and looking for the scariest haunted house experiences around the country. There's this rumor that a super-secret one exists that is way scarier than you've ever seen before.

Sounds like a great concept, I thought! I found it disappointing.

October 20: Dreams in the Witch House (Masters of Horror)

This was one of the better Masters of Horror that we watched, but it was still mostly just okay. I enjoyed that it was a pretty faithful adaptation of Lovecraft, since so many adaptations change too much.

October 21: Razorback

This is a 1984 horror movie about an enormous bloodthirsty razorback boar that kills people. It's basically Jaws, but in the outback. This movie was not good, but it was pretty weird. It seemed like, "If this is even sort of what 1984 rural Australia was like, then Mad Max seems like a much more plausible vision of the future than I thought."

October 22: Jenifer

It's a Masters of Horror film. Like many of the others, it was interesting, but not really great. That said, it's probably my favorite piece of horror work by Dario Argento, whose films I usually find sort of overwrought. It was nice and creepy.

October 23: Case 39

I had entirely forgotten this film until I re-read the Wikipedia page just now. I can't say whether I liked it or not. I think, if I recall correctly, that it had a few good moments, but was otherwise mediocre.

October 25: Stonehearst Asylum

This movie had a good twist, but I think it maybe gave it up too soon… but it might have been pretty hard to keep it hidden for much longer. It had a lot of good talent (Ben Kingsley!), but it could've done more to be creepy and not just weird.

October 26: Nightbreed

I heard so many good things about this film, but I was really hesitant, because I'm not much of a fan of Clive Barker. It was so-so. The Nightbreed themselves were weird, but not very compelling. I would've rather re-watched Basket Case 2.

October 27: Cigarette Burns

This was probably my favorite of the Masters of Horror movies that we watched. It's very much in the spirit of H.P. Lovecraft, at least until the strange (but good) ending. It reminded me in many ways of Carpenter's earlier In the Mouth of Madness, which I think was (as I recall) a better film, despite a number of scenes in Cigarette Burns that were more nuanced and interesting than anything in Madness.

October 28: Vlog

I barely remember this. As I recall, it was much creepier than I expected, but had a bunch of other problems. It was one of the many (many? well, enough) webcam movies we watched in 2015.

October 29: Homecoming (Masters of Horror)

This was a dark comedy in which dead American soldiers rise from their grave as zombies, with just one desire: the vote. They plan to vote for anyone who will end the war. The film focuses on the campaign staff of the sitting president, who supports the war, and has to spin this the right way. The movie had a lot of problems, but it was good enough in other ways to overcome them.

October 30: The Fury

So, this was something like Carrie, or Firestarter, or Scanners. There's a secret government plan to weaponize psychics, and of course things go wrong. It was mediocre.

October 31: A Girl Walks Home Alone at Night

This was an Iranian film — I think. It was filmed in Persian, anyway, and directed by an Iranian woman. I remember it only faintly. I recall liking what they did with a cat.

October 31: The Stuff

This was my then-eight-year-old daughter's first admission to Horror Movie Month. It's a film by Larry Cohen, whose films I really like. In it, a weird white goo is found bubbling up from the frozen ground by someone who seems to be an industrial site night watchman. He looks at it quizzically and then, of course, tastes it. It's delicious!

Months later, The Stuff is the number one dessert in the country. People love it. They eat it for all three meals, plus snacks. It's a huge sensation, even if the ingredients just say "natural ingredients."

A young boy runs away from his family because they've lost their minds for The Stuff. He teams up with Michael Moriarty (who appears often in Cohen's films) to find the secret of The Stuff and stop it from consuming all the consumers in the world!

After the movie, we treated ourselves to some of that wonderful Stuff!

The Stuff

sending email with TLS (in Perl) (body)

by rjbs, created 2016-10-22 22:55

Every once in a while I hear or see somebody using one of the two obsolete secure SMTP transports for Email::Sender, and I wanted to make one more attempt to get people to switch, or to get them to tell me why switching won't work.

When you send mail via SMTP, and need to use SMTP AUTH to authenticate, you want to use a secure channel. There are two ways you're likely to do that. You might connect with TLS, conducting the entire SMTP transaction over a secure connection. Alternatively, you might connect in the clear and then issue a STARTTLS command to begin secure communication. For a long time, Perl's default SMTP library, Net::SMTP, did not support either of these, and it was sort of a pain to use them.

Email::Sender is probably the best library for sending mail in Perl, and it's shipped with an SMTP transport that uses Net::SMTP. That meant that if you wanted to use TLS or STARTTLS, you needed to use another transport. These were around as Email::Sender::Transport::SMTPS and Email::Sender::Transport::SMTP::TLS. These worked, but you needed to know that they existed, and might rely on libraries (like Net::SMTPS) not quite as widely tested as Net::SMTP.

About two years ago, Net::SMTP got native support for TLS and STARTTLS. About six months ago, the stock Email::Sender SMTP transport was upgraded to use it. Now you can just write:

my $xport = Email::Sender::Transport::SMTP->new({
  host => 'smtp.pobox.com',
  ssl  => 'starttls', # or 'ssl'
  sasl_username => 'devnull@example.com',
  sasl_password => 'aij2$j3!aa(',
});

...and not think about installing anything else. This is what I suggest you do.
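
For readers who don't live in Perl, the two ways of securing the channel described above can be sketched with Python's standard `smtplib`. This is only an illustration of the protocol difference, not part of Email::Sender: the helper names `default_port` and `open_transport` are made up for this example, and the port numbers are the conventional submission ports (an assumption, since the post names none).

```python
import smtplib
import ssl

# Conventional ports, assumed for illustration: 465 for TLS from the first
# byte, 587 for connect-in-the-clear followed by STARTTLS.
DEFAULT_PORTS = {"ssl": 465, "starttls": 587}


def default_port(mode):
    """Return the conventional port for the given mode ('ssl' or 'starttls')."""
    return DEFAULT_PORTS[mode]


def open_transport(host, mode, username, password, port=None):
    """Connect, secure the channel, then authenticate (hypothetical helper)."""
    if mode not in DEFAULT_PORTS:
        raise ValueError(f"unknown mode: {mode!r}")
    context = ssl.create_default_context()
    port = port or default_port(mode)
    if mode == "ssl":
        # The whole SMTP conversation happens inside TLS.
        server = smtplib.SMTP_SSL(host, port, context=context)
    else:
        # Connect in the clear, then upgrade before any credentials are sent.
        server = smtplib.SMTP(host, port)
        server.starttls(context=context)
    server.login(username, password)
    return server
```

Either way, authentication only happens once the connection is encrypted, which is the point of both modes.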

I'm learning Rust! (body)

by rjbs, created 2016-10-15 22:45

I've been meaning to learn Rust for a long time. I read the book a while ago, and did some trivial exercises, but I didn't write any real programs. This is a pretty common problem for me: I learn the basics of a language, but don't put it to any real use. Writing my stupid 24 game solver in Forth definitely helped me think about writing real Forth programs, even if it was just a goof.

Now I'm working on implementing the Software Tools programs in Rust. These are simple programs that solve real world problems, or at least approximations of real world problems. I've written programs to copy files, expand and collapse tabs, count words, and compress files. So far, all my programs are pretty obviously mediocre, even to me, but I'm having fun and definitely learning a lot. At first, I thought I'd be working my way through the book program by program, but now I realize that I'm going to be continually going back to earlier work to improve it with the things I'm learning as I go.

For example, I started off by buffering all my I/O manually, which worked, but made everything I did a bit gross to look at. Later, I found that you can wrap a thing that reads from a file (or other data source) in something that buffers it but then provides the same interface. I went back and added that to my old programs, deleting a bunch of code.
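
The "wrap a reader in a buffering reader with the same interface" idea is not specific to Rust. As a rough sketch in Python (not the Rust code from these exercises, which isn't shown here), the function below only ever calls `.read(1)`, so it works unchanged on a raw file or on a buffered wrapper around one. The names `count_lines` and `demo` are invented for the example.

```python
import io
import os
import tempfile


def count_lines(reader):
    """Count newline bytes one byte at a time, using only reader.read()."""
    total = 0
    while True:
        byte = reader.read(1)
        if not byte:
            return total
        if byte == b"\n":
            total += 1


def demo():
    """Run count_lines on an unbuffered file and on a buffered wrapper."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"one\ntwo\nthree\n")
        path = tmp.name
    try:
        # buffering=0 gives a raw file object: every read(1) hits the OS.
        with open(path, "rb", buffering=0) as raw:
            unbuffered = count_lines(raw)
        # Same call, but the wrapper fetches big chunks behind the scenes.
        with open(path, "rb", buffering=0) as raw:
            buffered = count_lines(io.BufferedReader(raw))
        return unbuffered, buffered
    finally:
        os.remove(path)
```

The calling code doesn't change at all between the two cases, which is why adding the wrapper lets you delete hand-rolled buffering.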

Soon, I know I'm going to be going back to add better command line argument handling. I'm pretty sure my error handling is all garbage, too.

Still, the general concept has been a great success: I'm writing programs that actually do stuff, and they have fun edge cases, and it's just a lot less tedious than exercises in a text book.

So far, so good!

solving the 24 game in Forth (body)

by rjbs, created 2016-08-23 10:46
last modified 2016-08-23 18:41

About a month ago, Mark Jason Dominus posted a simple but difficult arithmetic puzzle, in which the solver had to use the basic four arithmetic operations to get from the numbers (6, 6, 5, 2) to 17. This reminded me of the 24 Game, which I played when I paid my infrequent visits to middle school math club. I knew I could solve this with a very simple Perl program that would do something like this:

  for my $inputs ( permutations_of( 6, 6, 5, 2 ) ) {
    for my $ops ( pick3s_of( qw( + - / * ) ) ) {
      for my $grouping ( 'linear', 'two-and-two' ) {
        next unless $target == solve($inputs, $ops, $grouping);
        say "solved it: ", explain($inputs, $ops, $grouping);
      }
    }
  }

All those functions are easy to imagine, especially if we're willing to use string eval, which I would have been. I didn't write the program because it seemed obvious.
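
For the curious, here is one way that sketch could be filled in, written in Python rather than Perl and using exact fractions instead of string eval. Treat it as a guess at the shape of the program, not the program itself: the names mirror the pseudocode above, "linear" is assumed to mean `((a op b) op c) op d`, and "two-and-two" is assumed to mean `(a op b) op (c op d)`.

```python
from fractions import Fraction
from itertools import permutations, product
import operator

OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def solve(inputs, ops, grouping):
    """Evaluate one arrangement exactly; None means it divided by zero."""
    a, b, c, d = (Fraction(n) for n in inputs)
    f, g, h = (OPS[name] for name in ops)
    try:
        if grouping == "linear":
            return h(g(f(a, b), c), d)
        return g(f(a, b), h(c, d))
    except ZeroDivisionError:
        return None


def explain(inputs, ops, grouping):
    """Render the arrangement as a parenthesized expression."""
    a, b, c, d = inputs
    f, g, h = ops
    if grouping == "linear":
        return f"(({a} {f} {b}) {g} {c}) {h} {d}"
    return f"({a} {f} {b}) {g} ({c} {h} {d})"


def solutions(numbers, target):
    """Try every ordering, every choice of three operators, both groupings."""
    found = set()
    for inputs in permutations(numbers):
        for ops in product(OPS, repeat=3):
            for grouping in ("linear", "two-and-two"):
                if solve(inputs, ops, grouping) == target:
                    found.add(explain(inputs, ops, grouping))
    return sorted(found)
```

Calling `solutions((6, 6, 5, 2), 17)` turns up `((5 / 6) + 2) * 6` among its answers, which only works because the intermediate value 5/6 is kept as a fraction.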

On the other hand, I had Forth on my brain at the time, so I decided I'd try to solve the problem in Forth. I told Dominus, saying, "As long as it's all integer division! Forth '83 doesn't have floats, after all." First he laughed at me for using a language with only integer math. Then he told me I'd need to deal with fractions. I thought about how I'd tackle this, but I had a realization: I use GNU Forth. GNU's version of almost everything is weighed down with oodles of excess features. Surely there would be floats!

In fact, there are floats in GNU Forth. They're fun and weird, like most things in Forth, and they live on their own stack. If you want to add the integer 1 to the float 2.5, you don't just cast 1 to a float, you move it from the data stack to the float stack:

2.5e0 1. d>f f+

This puts 2.5 on the float stack and 1 on the data stack. The dot in 1. doesn't indicate that the number is a float, but that it's a double. Not a double-precision float, but a two-cell value. In the Forth implementation I'm using, 1 gets you an 8-byte 1 and 1. gets you a 16-byte 1. They're both integer values. (If you wrote 1.0 instead, as I was often tempted to do, you'd be making a double that stored 10, because the position of the dot doesn't matter.) d>f takes a double from the top of the data stack, converts it to a float, and puts it on the top of the float stack. f+ pops the top two floats, float-adds them, and pushes the result back onto the float stack. Then we could verify that it worked by using f.s to print the entire float stack to the console.
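
If the two-stack business is hard to picture, a toy model may help. The Python class below is only a sketch of the behavior described above, not of how GNU Forth is implemented: `TwoStacks` and its method names are invented, and a double is modeled as a single Python integer rather than two cells.

```python
class TwoStacks:
    """Toy model of Forth's separate data and float stacks."""

    def __init__(self):
        self.data = []    # integers live here
        self.floats = []  # floats live on their own stack

    def push_float(self, text):
        # "2.5e0": a literal with an exponent goes to the float stack.
        self.floats.append(float(text))

    def push_double(self, text):
        # "1." or "1.0": the dot marks a double integer, and its position
        # is ignored, so "1.0" stores 10.
        self.data.append(int(text.replace(".", "")))

    def d_to_f(self):
        # d>f: move the top double from the data stack to the float stack.
        self.floats.append(float(self.data.pop()))

    def f_plus(self):
        # f+: pop two floats, add them, push the sum.
        b = self.floats.pop()
        a = self.floats.pop()
        self.floats.append(a + b)


machine = TwoStacks()
machine.push_float("2.5e0")
machine.push_double("1.")
machine.d_to_f()
machine.f_plus()
```

After those four steps the data stack is empty and the float stack holds the single sum.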

Important: You have to keep in mind that there are two stacks, here, because it's very easy to manipulate the wrong stack and end up with really bizarre results. GNU Forth has locally named variables, but I chose to avoid them to keep the program feeling more like Forth to me.


I'm going to run through how my Forth 24 solver works, not in the order it's written, but top-down, from most to least abstract. The last few lines of the program, something like int main are:

  17e set-target
  6e 6e 5e 2e set-inputs

  ." Inputs are: " .inputs
  ." Target is : " target f@ fe. cr
  ' check-solved each-expression

This sets up the target number and the inputs. Both of these are stored, not in the stack, but in memory. It would be possible to keep every piece of the program's data on the stack, I guess, but it would be a nightmare to manage. Having words that use more than two or three pieces of data from the stack gets confusing very quickly. (In fact, for me, having even one or two pieces can test my concentration!)

set-target and set-inputs are words meant to abstract a bit of the mechanics of initializing these memory locations. The code to name these locations, and to work with them, looks like this:

  create inputs 4 floats allot              \ the starting set of numbers
  create target 24 ,                        \ the target number

  : set-target target f! ;

  \ sugar for input number access
  : input-addr floats inputs + ;
  : input@ input-addr f@ ;
  : input! input-addr f! ;
  : set-inputs 4 0 do i input-addr f! loop ;

create names the current memory location. allot moves the next allocation forward by the size it's given on the stack, so create inputs 4 floats allot gives the current allocation space the name inputs and then saves the next four floats' worth of space for use. The comma is a word that compiles a value into the current allocation slot, so create target 24 , allocates one cell of storage and puts a single-width integer 24 in it.

The words @ and ! read from and write to a memory address, respectively. set-target is trivial, just writing the number on the stack to a known memory location. Note, though, that it uses f!, a variant of ! that pops the value to set from the float stack.

set-inputs is built in terms of input-addr, which returns the memory address for a given offset from inputs. If you want the final (3rd) input, it's stored at inputs plus the size of three floats. That's:

  inputs 3 floats +

When we make the three a parameter, we swap the order of the operands to plus so we can write:

  floats inputs + ( the definition of input-addr )

set-inputs loops from zero to three, each time popping a value off of the float stack and storing it in the next slot in our four-float array at inputs.
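
The address arithmetic is easier to see in a language with fewer stacks. This Python sketch models the four-float array as raw bytes and mirrors the words above. It's an approximation under stated assumptions: the byte array stands in for Forth's memory, `INPUTS` is an invented base address of 0, and a Python list plays the float stack.

```python
import struct

FLOAT_SIZE = struct.calcsize("d")    # bytes in one float (8)
memory = bytearray(4 * FLOAT_SIZE)   # create inputs 4 floats allot
INPUTS = 0                           # assumed address of slot 0


def input_addr(index):
    """floats inputs + : scale the index by the float size, add the base."""
    return index * FLOAT_SIZE + INPUTS


def input_store(value, index):
    """input! : write a float into the slot at the given index."""
    struct.pack_into("d", memory, input_addr(index), value)


def input_fetch(index):
    """input@ : read the float stored at the given index."""
    return struct.unpack_from("d", memory, input_addr(index))[0]


def set_inputs(float_stack):
    """4 0 do i input-addr f! loop : pop four floats into slots 0 to 3."""
    for i in range(4):
        input_store(float_stack.pop(), i)


stack = [6.0, 6.0, 5.0, 2.0]         # 6e 6e 5e 2e
set_inputs(stack)
stored = [input_fetch(i) for i in range(4)]
```

Because each store pops the top of the stack, the last number typed lands in slot 0. That's harmless here, since every ordering of the inputs gets tried anyway.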


Now we have an array in memory storing our four inputs. We also want one for storing our operators. In fact, we want two: one for the code that implements an operator and one for a name for the operator. (In fact, we could store only the name, and then interpret the name to get the code, but I decided I'd rather have two arrays.)

  create op-xts ' f+ , ' f- , ' f* , ' f/ ,
  create op-chr '+  c, '-  c, '*  c, '/  c,

These are pretty similar to the previous declarations: they use create to name a memory address and commas to compile values into those addresses. (Just like f, compiled a float, c, compiles a single char.) Now, we're also using ticks. We're using tick in two ways. In ' f+, the tick means "get the address of the next word and compile that instead of executing the word." It's a way of saying "give me a function pointer to the next word I name." In '+, the tick means "give me the ASCII value of the next character in the input stream."

Now we've got two arrays with parallel indexes, one storing function pointers (called execution tokens, or xts, in Forth parlance) and one storing single-character names. We also want some code to get items out of these arrays, but there's a twist. When we iterate through all the possible permutations of the inputs, we can just shuffle the elements in our array and use it directly. When we work with the operators, we need to allow for repeated operators, so we can't just shuffle the source list. Instead, we'll make a three-element array to store the indexes of the operators being considered at any given moment:

  create curr-ops 0 , 0 , 0 ,

We'll make a word curr-op!, like ones we've seen before, for setting the op in position i.

  : curr-op! cells curr-ops + ! ;

If we want the 0th current operator to be the 3rd one from the operators array, we'd write:

  3 0 curr-op!

Then when we want to execute the operator currently assigned to position i, we'd use op-do. To get the name (a single character) of the operator at position i, we'd use op-c@:

  : op-do    cells curr-ops + @ cells op-xts + @ execute ;
  : op-c@    cells curr-ops + @ op-chr + c@ ;

These first get the value j stored in the ith position of curr-ops, then get the jth value from either op-xts or op-chr.
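
The double lookup (position, then operator index, then code or name) can be modeled in a few lines of Python. This is a sketch of the idea only: function objects stand in for execution tokens, the operands are passed as arguments instead of being taken from the float stack, and the names are loose transliterations of the Forth words.

```python
import operator

# Parallel arrays: the same index picks an operator's code and its name.
OP_XTS = [operator.add, operator.sub, operator.mul, operator.truediv]
OP_CHR = ["+", "-", "*", "/"]

# Indexes into the arrays above for the three operator positions.
curr_ops = [0, 0, 0]


def curr_op_store(op_index, position):
    """curr-op! : choose which operator sits at the given position."""
    curr_ops[position] = op_index


def op_do(position, a, b):
    """op-do : look up the operator at the position and apply it."""
    return OP_XTS[curr_ops[position]](a, b)


def op_char(position):
    """op-c@ : look up the one-character name of the operator."""
    return OP_CHR[curr_ops[position]]


curr_op_store(3, 0)                  # 3 0 curr-op!
```

Storing indexes instead of the operators themselves is what allows the same operator to appear in more than one position.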

permutations of inputs

To get every permutation of the input array, I implemented Heap's algorithm, which has the benefit of being not just efficient, but also dead simple to implement. At first, I began implementing a recursive form, but ended up writing it iteratively because I kept hitting difficulties in stack management. In my experience, when you manage your own stacks, recursion gets significantly harder.

  : each-permutation ( xt -- )
    init-state

    dup execute

    0 >r
    begin
      4 i <= if rdrop drop exit then

      i i hstate@ > if
        i do-ith-swap
        dup execute
        i hstate1+!
        zero-i
      else
        0 hstate i cells + !
        inc-i
      then
    again ;

This word is meant to be called with an xt on the stack, which is the code that will be executed with each distinct permutation of the inputs. That's what the comment (in parentheses, like this) tells us. The left side of the double dash describes the elements consumed from the stack, and the right side is the elements left on the stack.

init-state sets the procedure's state to zero. The state is an array of counters with as many elements as the array being permuted. Our implementation of each-permutation isn't generic. It only works with a four-element array, because init-state works off of hstate, a global four-element array. It would be possible to make the permutor work on different sizes of input, but it still wouldn't be reentrant, because every call to each-permutation shares a single state array. You can't just get a new array inside each call, because there's no heap allocator to keep track of temporary use of memory.

(That last bit is stretching the truth. GNU Forth does have words for heap allocation, which just delegate to C's malloc and friends. I think using them would've been against the spirit of the thing.)
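
For comparison, here is the iterative form of Heap's algorithm in Python, with one counter per element playing the role of hstate. It's a sketch, not a translation: the definition of do-ith-swap isn't shown in this post, so the swap rule below is the textbook one (swap with element 0 when i is even, with element state[i] when i is odd), and unlike the Forth word, this version allocates its own state and takes an array of any length.

```python
def each_permutation(items, visit):
    """Call visit(items) once for each ordering, permuting items in place."""
    n = len(items)
    state = [0] * n          # like init-state: one counter per element
    visit(items)             # the starting order counts as a permutation
    i = 0
    while i < n:
        if state[i] < i:
            # Textbook Heap's swap rule (assumed; do-ith-swap isn't shown).
            j = 0 if i % 2 == 0 else state[i]
            items[i], items[j] = items[j], items[i]
            visit(items)
            state[i] += 1
            i = 0            # reset the iterator, like zero-i
        else:
            state[i] = 0
            i += 1           # move to the next position, like inc-i


seen = []
each_permutation([1, 2, 3, 4], lambda p: seen.append(tuple(p)))
```

Each new ordering costs exactly one swap, which is what makes the algorithm cheap and short.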

The main body of each-permutation is a loop, built using the most generic form of Forth loop, begin and again. begin tucks away its position in the program, and again jumps back to it. This isn't the only kind of loop in Forth. For example, init-state initializes our four-element state array like this:

  : init-state 4 0 do 0 i hstate! loop ;

The do loop there iterates from 0 to 3. Inside the loop body (between do and loop) the word i will put the current iteration value onto the top of the stack. It's not a variable, it's a word, and it gets the value by looking in another stack: the return stack. Forth words are like subroutines. Every time you call one, you are planning to return to your call site. When you call a word, your program's current execution point (the program counter), plus one, is pushed onto the return stack. Later, when your word hits an exit, it pops off that address and jumps to it.

The ; in a Forth word definition compiles to exit, in fact.

You can do really cool things with this. They're dangerous too, but who wants to live forever? For example, you can drop the top item from the return stack before returning, and do a non-local return to your caller's caller. Or you can replace your caller with some other location, and return to that word -- but it will return to your caller's caller when it finishes. Nice!

Because it's a convenient place to put stuff, Forth ends up using the return stack to store iteration variables. They have nothing to do with returning, but that's okay. In a tiny language machine like those that Forth targets, some features have to pull double duty!

begin isn't an iterating loop, so there's no special value on top of the return stack. That's why I put one there before the loop starts with 0 >r, which puts a 0 on the data stack, then pops the top of the data stack to the top of the return stack. I'm using this kind of loop because I want to be able to reset the iterator to zero. I could have done that with a normal iterating loop, I guess, but it didn't occur to me at the time, and now that I have working code, why change it?

Iterator reset works by setting i back to 0 with the zero-i word. In a non-resetting loop iteration, we increment i with inc-i. Of course, i isn't a variable, it's a thing on the return stack. I made these words up, and they're implemented like this:

  : zero-i r> rdrop 0 >r >r ;
  : inc-i  r> r> 1+ >r >r ;

Notice that both of them start with r> and end with >r. That's me saving and restoring the top item of the return stack. You see, once I call zero-i, the top element of the return stack is the call site! (Well, the call site plus one.) I can't just replace it, so I save it to the data stack, mess around with the second item on the return stack (which is now the top item) and then restore the actual caller so that when I hit the exit generated by the semicolon, I go back to the right place. Got it? Good!
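
If the return stack juggling is hard to picture, here's a toy Python model of it. It's a sketch, not how Gforth works inside: the `call` helper and the fake return addresses (101, 102, 103) are made up for illustration, and the Python list stands in for the return stack with its top at the end.

```python
# A toy model of Forth's return stack, just enough to trace zero-i and inc-i.
# The call helper and the addresses are hypothetical, for illustration only.

def call(rstack, return_addr, word):
    """Model calling a word: push the return address, then run the body."""
    rstack.append(return_addr)
    word(rstack)
    return rstack.pop()      # the exit compiled by ; pops this and jumps to it

def zero_i(rstack):
    saved = rstack.pop()     # r>     save our own return address
    rstack.pop()             # rdrop  throw away the old counter
    rstack.append(0)         # 0 >r   put a fresh zero in its place
    rstack.append(saved)     # >r     restore the return address

def inc_i(rstack):
    saved = rstack.pop()     # r>     save our own return address
    i = rstack.pop()         # r>     fetch the counter
    rstack.append(i + 1)     # 1+ >r  put it back, incremented
    rstack.append(saved)     # >r     restore the return address

rstack = [0]                 # what 0 >r leaves behind before the loop starts
returned_to = [call(rstack, 101, inc_i), call(rstack, 102, inc_i)]
after_incs = list(rstack)    # the counter after two increments
returned_to.append(call(rstack, 103, zero_i))
after_zero = list(rstack)    # the counter after the reset
```

In both words the return address comes back off the stack unchanged, which is the whole point of the save and restore dance.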

Apart from that stuff, this word is really just the iterative Heap's algorithm from Wikipedia!
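
For reference, here's roughly what the iterative Heap's algorithm looks like in Python, with a callback standing in for the xt. This is a sketch of the textbook algorithm, not a translation of my Forth: the names `items`, `visit` and `state` are just for illustration, the state array is allocated per call (which the Forth version can't do), and the `while` condition replaces whatever exit the begin/again loop uses.

```python
def each_permutation(items, visit):
    """Call visit(items) once for every permutation of items, in place."""
    n = len(items)
    state = [0] * n              # one counter per element, like hstate
    visit(items)                 # the starting order counts as a permutation
    i = 0
    while i < n:
        if state[i] < i:
            # Even positions swap with the first element,
            # odd positions swap with the element the counter points at.
            j = 0 if i % 2 == 0 else state[i]
            items[j], items[i] = items[i], items[j]
            visit(items)
            state[i] += 1
            i = 0                # the equivalent of zero-i
        else:
            state[i] = 0
            i += 1               # the equivalent of inc-i

seen = []
each_permutation([8, 6, 2, 1], lambda xs: seen.append(tuple(xs)))
```

With four distinct inputs, `seen` should end up holding 24 distinct orderings.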

nested iteration

Now, the program didn't start by using each-permutation, but each-expression. Remember?

  ' check-solved each-expression

That doesn't just iterate over operand permutations, but also over operations and groupings. It looks like this:

  : each-expression ( xt -- )
    2 0 do
      i 0= linear !
      dup each-opset
      loop drop ;

It expects an execution token on the stack, and then calls each-opset twice with that token. Because of the i 0=, linear is set to true (-1) for the first call, when i is 0, and to false (0) for the second. linear controls which grouping we'll use, meaning which of the two ways we'll evaluate the expression we're building:

  Linear    : o1 ~ ( o2 ~ ( o3 ~ o4 ) )
  Non-linear: (o1 ~ o2) ~ (o3 ~ o4)

each-opset is another iterator. It, too, expects an execution token and repeatedly passes it to something else. This time, it calls each-permutation, above, once with each possible combination of operator indexes in curr-op.

  : each-opset ( xt -- )
    4 0 do i 0 curr-op!
      4 0 do i 1 curr-op!
        4 0 do i 2 curr-op!
          dup each-permutation
          loop loop loop drop ;

This couldn't be much simpler! It's exactly like this:

  for i in (0 .. 3) {
    op[0] = i
    for j in (0 .. 3) {
      op[1] = j
      for k in (0 .. 3) {
        op[2] = k
        each-permutation(xt)
      }
    }
  }
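
To see all three layers of iteration working together, here's a small Python model. It's a sketch under some assumptions: the `state` dict stands in for the program's globals (linear, curr-op, and the input array), `itertools.permutations` stands in for the Heap's algorithm word, and the inputs 8, 6, 2, 1 are just sample values.

```python
from itertools import permutations

# Stand-ins for the program's global state (names are illustrative).
state = {"linear": None, "ops": [0, 0, 0], "inputs": [8, 6, 2, 1]}

def each_permutation(visit):
    """Innermost layer: call visit once per ordering of the inputs."""
    original = list(state["inputs"])
    for perm in permutations(original):
        state["inputs"] = list(perm)
        visit()
    state["inputs"] = original

def each_opset(visit):
    """Middle layer: every combination of operator index in three slots."""
    for a in range(4):
        state["ops"][0] = a
        for b in range(4):
            state["ops"][1] = b
            for c in range(4):
                state["ops"][2] = c
                each_permutation(visit)

def each_expression(visit):
    """Outermost layer: both groupings, linear first."""
    for i in range(2):
        state["linear"] = (i == 0)
        each_opset(visit)

calls = []
each_expression(lambda: calls.append(
    (state["linear"], tuple(state["ops"]), tuple(state["inputs"]))))
```

Each layer just sets its bit of global state and hands the same callback down to the next layer, so the callback never takes arguments. It reads whatever it needs from the globals.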

inspecting state as we run

Now we have the full stack needed to call a given word for every possible expression. We have three slots each for one of four operators. We have four operands to rearrange. We have two possible groupings. We should end up with 4! x 4³ x 2 expressions. That's 3072. It should be easy to count them by passing a counting function to the iterator!

  create counter 0 ,
  : count-iteration
    1 counter +!    \ add one to the counter
    counter @ . cr  \ then print it and a newline
    ;

  ' count-iteration each-expression

When run, we get a nice count up from 1 to 3072. It works! Similarly, I wanted to eyeball whether I got the right equations, so I wrote a number of different state-printing words, but I'll only show two here. First was .inputs, which prints the state of the input array. (It's conventional in Forth to start a string printing word's name with a dot, and to end a number printing word's name with a dot.)

  : .input  input@ fe. ;
  : .inputs 4 0 do i .input loop cr ;

.inputs loops over the indexes to the array and for each one calls i .input, which gets and prints the value. fe. prints a formatted float. Here's where I hit one of the biggest problems I'd have! This word prints the floats in their order in memory, which we might think of as left to right. If the array has [8, 6, 2, 1], we print that.

On the other hand, when we actually evaluate the expression, which we'll do a bit further on, we get the values like this:

4 0 do i input@ loop \ get all four inputs onto the float stack

Now the stack contains [1, 2, 6, 8], reading from the top down. The order in which we'll evaluate them is the reverse of the order we had stored them in memory. This is a big deal! It would've been possible to ensure that we operated on them the same way, for example by iterating from 3 to 0 instead of 0 to 3, but I decided to just leave it and force myself to think harder. I'm not sure if this was a good idea or just self-torture, but it's what I did.
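
Here's the same reversal in a few lines of Python, in case it helps. It's only a model: a list plays the float stack with its top at the end, and the sample values are the ones from the array above.

```python
inputs = [8.0, 6.0, 2.0, 1.0]     # the array, in memory order

fstack = []                       # float stack, top of stack is the last item
for i in range(4):                # 4 0 do i input@ loop
    fstack.append(inputs[i])

top_first = fstack[::-1]          # the stack as you'd read it from the top down

# A two-operand word pops the top item as its right operand and the
# next item as its left operand, so it sees the last two array elements.
right = fstack[-1]
left = fstack[-2]
```

So the first operator to run works on the last two elements of the array, not the first two.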

The other printing word I wanted to show is .equation, which prints out the equation currently being considered.

  : .equation
    linear @
    if
      0 .input 0 .op
        1 .input 1 .op
        (( 2 .input 2 .op 3 .input ))
    else
      (( 0 .input 0 .op 1 .input ))
      1 .op
      (( 2 .input 2 .op 3 .input ))
    then
    ." = " target f@ fe. cr ;

Here, we pick one of two formatters, based on whether or not we're doing linear evaluation. Then we print out the ops and inputs in the right order, adding parentheses as needed. We're printing the parens with (( and )), which are words I wrote. The alternative would have been to write things like:

  ." ( " 2 .input 2 .op 3 .input ." ) "

...or maybe...

  .oparen 2 .input 2 .op 3 .input

My program is tiny, so having very specialized words makes sense. Forth programmers talk about how you don't program in Forth. Instead, you program Forth itself to build the language you want, then do that. This is my pathetic dime store version of doing that. The paren-printing functions look like:

  : (( ." ( " ;
  : )) ." ) " ;

testing the equation

Now all we need to do is write something to actually test whether the equations hold and tell us when we get a winner. That looks like this:

  : check-solved
    this-solution target f@ 0.001e f~rel
    if .equation then ;

This is what we passed to each-expression at the beginning! We must be close to done now...

this-solution puts the value of the current expression onto the top of the (float) stack. target f@ gets the target number. Then we use f~rel. GNU Forth doesn't give you a f= operator to test float equality, because testing float equality without thinking about it is a bad idea: it's too easy to lose precision to floating point mechanics. Instead, there are a bunch of float comparison operators. f~rel takes three items from the float stack and puts a boolean onto the data stack. Those items are two values to compare, and an allowed margin of error, which f~rel treats as relative to the size of the values rather than as an absolute distance. We're going to call the problem solved if we're within a relative error of 0.001 of the target. If we are, we'll call .equation and print out the solution we found.
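
As I read the Gforth manual, f~rel reports true when |r1 - r2| < r3 * |r1 + r2|. Here's a Python model of that comparison, as a sketch. The function name `f_tilde_rel` is mine, and the formula is my reading of the docs, so check the manual before leaning on it.

```python
def f_tilde_rel(r1, r2, r3):
    """Model of f~rel, assuming the rule |r1 - r2| < r3 * |r1 + r2|."""
    return abs(r1 - r2) < r3 * abs(r1 + r2)

# The classic reason not to compare floats exactly:
exactly_equal = (0.1 + 0.2 == 0.3)                    # False
close_enough = f_tilde_rel(0.1 + 0.2, 0.3, 0.001)     # True

# Against a target of 24, a near miss passes and a real miss doesn't.
near_24 = f_tilde_rel(8.0 / (3.0 - 8.0 / 3.0), 24.0, 0.001)
not_24 = f_tilde_rel(25.0, 24.0, 0.001)
```

Because the margin scales with the sum of the two values, a tolerance of 0.001 against a target of 24 allows a difference of a bit under 0.05.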

The evaluator, this-solution, looks like this:

  : this-solution
    4 0 do i input@ loop

    linear @ if
      2 op-do 1 op-do 0 op-do
    else
      2 op-do
      frot frot
      0 op-do
      fswap
      1 op-do
    then ;

What could be simpler, right? We get the inputs out of memory (meaning they're now in reverse order on the stack) and pick an evaluation strategy based on the linear flag. If we're evaluating linearly, we execute each operator's execution token in order. If we're grouping, it works like this:

          ( r1 r2 r3 r4 ) \ first, all four inputs are on the stack
  2 op-do ( r1 r2 r5    ) \ we do first op, putting its result on stack
  frot    ( r2 r5 r1    ) \ we rotate the third float to the top
  frot    ( r5 r1 r2    ) \ we rotate the third float to the top again
                          \ ...so now the "bottom" group of inputs is on top
  0 op-do ( r5 r6       ) \ we do the last op, evaluating the bottom group
  fswap   ( r6 r5       ) \ we restore the "real" order of the two groups
  1 op-do ( r7          ) \ we do the middle op, and have our solution
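
To double check the stack shuffling, here's a Python model of both evaluation strategies. It's a sketch with assumptions spelled out: the operator table order (+, -, *, /) is a guess made for illustration, `op_do` is assumed to apply the operator in the given slot to the top two floats with the deeper one on the left, and a list plays the float stack with its top at the end.

```python
import operator

# Assumed operator table; the real program's order may differ.
OPS = [operator.add, operator.sub, operator.mul, operator.truediv]

def this_solution(inputs, curr_op, linear):
    fstack = list(inputs)                 # 4 0 do i input@ loop

    def op_do(slot):
        right = fstack.pop()              # top of stack is the right operand
        left = fstack.pop()
        fstack.append(OPS[curr_op[slot]](left, right))

    def frot():                           # ( r1 r2 r3 -- r2 r3 r1 )
        fstack.append(fstack.pop(-3))

    def fswap():                          # ( r1 r2 -- r2 r1 )
        fstack[-1], fstack[-2] = fstack[-2], fstack[-1]

    if linear:
        op_do(2); op_do(1); op_do(0)      # o1 ~ ( o2 ~ ( o3 ~ o4 ) )
    else:
        op_do(2)                          # evaluate (o3 ~ o4)
        frot(); frot()                    # bring o1 and o2 back to the top
        op_do(0)                          # evaluate (o1 ~ o2)
        fswap()                           # put the two groups back in order
        op_do(1)                          # combine the groups
    return fstack.pop()

all_minus = [1, 1, 1]                     # subtraction in every slot
linear_result = this_solution([1.0, 2.0, 3.0, 4.0], all_minus, True)
grouped_result = this_solution([1.0, 2.0, 3.0, 4.0], all_minus, False)
```

Subtraction in every slot makes the two groupings easy to tell apart: 1 - (2 - (3 - 4)) is -2, while (1 - 2) - (3 - 4) is 0.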

That's it! That's the whole 24 solver, minus a few tiny bits of trivia. I've published the full source of the program on GitHub.
