rjbs forgot what he was saying

not logged in (root) | by date | tagcloud | help | login

PTS 2019: Module::Faker (3/5)

by rjbs, created 2019-04-29 20:40
last modified 2019-04-29 20:40
tagged with: @markup:md journal perl programming

the basics

In 2008, I wrote Module::Faker. In fact, I wrote it at the first QA Hackathon! I'd had the idea to write it because at Pobox, we were writing a PAUSE-like module indexer for changing our internal deployment practices. It became clear that we could use it for testing actual PAUSE as well as other code, and I got to work. Since then, I've used it (and the related CPAN::Faker) quite a lot for testing, especially of PAUSE. At first, it was just a quick way to build a tarball that contained something that looked more or less like a CPAN distribution, with the files in the right places, the package statements, the version declarations, and so on.

As time went on, I needed weirder and more broken distributions, or distributions with more subtle changes than just a different version. In these cases, I'd build a fake, untar it, edit the contents, and tar it up again. Further, the usual way to fake up a dist was to write a META.yml-style file, stick in a directory with other ones, and then build them all. Why? Honestly, I can't remember. It's sort of a bizarre choice, and I think probably it was trying to be too clever. By default, you pass the contents of the META file to Module::Faker::Dist->new, which means its internals go to sort of annoying lengths to do the right thing. Why not a simpler constructor and a translation layer in between?

Well, that's not what I implemented, but I'll probably make that happen eventually. Instead, I added yet another constructor, this one much more suitable for building one-off fakes. Using it looks something like this:

  my $dist = Module::Faker::Dist->from_struct({
    cpan_author => 'DENOMOLOS',
    name        => 'Totally-Fake-Software',
    version     => '2.002',
  });

And when you write that to disk, you get this:

  ~/code/Module-Faker$ tar zxvf Totally-Fake-Software-2.002.tar.gz
  x Totally-Fake-Software-2.002/lib/Totally/Fake/Software.pm
  x Totally-Fake-Software-2.002/Makefile.PL
  x Totally-Fake-Software-2.002/t/00-nop.t
  x Totally-Fake-Software-2.002/META.json
  x Totally-Fake-Software-2.002/META.yml
  x Totally-Fake-Software-2.002/MANIFEST

  ~/code/Module-Faker$ cat Totally-Fake-Software-2.002/lib/Totally/Fake/Software.pm

  =head1 NAME

  Totally::Fake::Software - a cool package

  =cut

  package Totally::Fake::Software;
  our $VERSION = '2.002';

  1

Quite a few things are inferred, but the most important inference is that in a dist called Totally-Fake-Software with a version of 2.002, you'll want a single package named Totally::Fake::Software with a version of 2.002. It's pretty resaonable, as long as it can be overridden, and now it can:

  my $dist = Module::Faker::Dist->from_struct({
    cpan_author => 'DENOMOLOS',
    name        => 'Totally-Fake-Software',
    version     => '2.002',
    packages    => [
      'Totally::Fake::Software',
      'Totally::Fake::Hardware',
      'Totally::Fake::Firmware' => {
        version => '2.001',
        in_file => "lib/Totally/Fake/Hardware.pm"
      },
    ],
  });

This does what you might imagine: it makes two modules (read: .pm files), one of which contains two package definitions, and those two packages have differing version numbers. That is:

~/code/Module-Faker$ cat Totally-Fake-Software-2.002/lib/Totally/Fake/Hardware.pm

=head1 NAME

Totally::Fake::Hardware - a cool package

=cut

package Totally::Fake::Hardware;
our $VERSION = '2.002';

package Totally::Fake::Firmware;
our $VERSION = '2.001';

1

I think I'm still going to tinker with how faker objects are generated, but by now you should get the point.

weird stuff

I use Module::Faker for testing PAUSE, and one thing that PAUSE has to do is deal with slightly (or not so slightly) malformed input. With no specification for the shape of a distribution (other than the strong suggestion that you follow what's in the CPAN::Meta spec), malformedness is in the eye of the beholder. I can tell you, though, that I have beheld a lot of malformed stuff in my time.

One example was pretty innocuous: in META.json, there can be a "provides" entry saying what packages are provided by this distribution, and in what file they're found. It might look like this:

  "provides": {
    "Some::Package::Here": { "file": "lib/Some/Package/Here.pm" }
  }

From time to time, people have inserted an entry with an undefined file value. This was used as a means to say, "I am claiming ownership of this name, which I do not provide." A few years ago, the toolchaing gang came to a consensus: file needs to be provided, naming some actual file. Unfortunately, CPAN::Meta features footgun protection, shielding you from accidentally (read: on purpose) making this kind of bogus entry. I needed a way to produce it, and I really did not want to resort to editing and repacking tarballs by hand.

To do this, I added a new option, meta_munger:

  $dist = Module::Faker::Dist->from_struct({
    name    => 'Some-Package-Here',
    version => '1.0',
    cpan_author => 'TSWIFT',
    meta_munger => sub {
      $_[0]{provides} = {
        "Some::Package::Here" => { file => undef },
      };

      return $_[0];
    }
  });

Doing this involved disassembling and duplicating a bit of CPAN::Meta, but it was worth it. All manner of stupid garbage can now be stuffed into fake dists' metadata. In fact, the munger can decide it's going to act like 2009-era RJBS by putting JSON into the META.yml file, it can use YAML::Syck, it can return syntactically invaild JSON, or whatever you like. Do your worst!

The munger runs quite late in the metadata generation process. Especially noteworthy, it runs after the metadata structure has been generated for a specific version. You might just want to provide a little bit of extra metadata, like an x_favorite_treat to be handled like any other bit of metadata, with CPAN::Meta::Merge. Easy enough:

  $dist = Module::Faker::Dist->from_struct({
    name    => 'DateTime-Breakfast',
    version => '1.0',
    cpan_author => 'BURGERS',
    more_metadata => { x_favorite_treat => 'cruffins' },
  });

It's certainly less work to write than a munging subroutine.

I added support for different "style" packages. What does that mean? Well, given this code:

  my $dist = Module::Faker::Dist->from_struct({
    cpan_author => 'MRXII',
    name        => 'Cube-Solver',
    version     => 'v3.3.3',
    packages    => [
      'Cube-Solver-Rubik' => { in_file => 'Solver.pm', style => 'legacy' },
      'Cube-Solver-GAN'   => { in_file => 'Solver.pm', style => 'statement' },
      'Cube-Solver-LHR'   => { in_file => 'Solver.pm', style => 'block' },
    ],
  });

You get this result:

  ~/code/Module-Faker$ cat Cube-Solver-v3.3.3/Solver.pm

  =head1 NAME

  Cube-Solver-Rubik - a cool package

  =cut

  package Cube-Solver-Rubik;
  our $VERSION = 'v3.3.3';

  package Cube-Solver-GAN v3.3.3;

  package Cube-Solver-LHR v3.3.3 {

    # Your code here

  }

  1

I actually haven't documented this feature, yet, because I'm not happy enough with it. For example, you can't get an assignment to $VERSION inside a package block. I'd like to make this all a bit more flexible. Sometime I half think that it would be cool to merge Module::Faker's behavior into Dist::Zilla's dist minting features, but I think this would be more trouble than it's worth. Probably.

I didn't document that, but I did document quite a lot of Module::Faker::Dist, which was previously entirely undocumented. Great!