A Foolish Manifesto

fREWdiculous!

On Moose and Speed

Today the question was asked: “To Moose or Not to Moose?” The article is fairly well written, but the comments strike me as not exactly educated. Here is the main one this post responds to:

I’d try Mouse too. Unless you’re doing something funky I’d be surprised if it’s more than a 1 letter change to your source code.

First off, here is a quote from the POD:

Moose is wonderful. Use Moose instead of Mouse.

The author recommends against using Mouse. That’s a big deal to me. Also, enjoy the following quote:

The original author of this module has mostly stepped down from maintaining Mouse. See http://www.nntp.perl.org/group/perl.moose/2009/04/msg653.html. If you would like to help maintain this module, please get in touch with us.

He’s also given up on it. Moral of the story: don’t use it.

Now for the good news!

Today I started working on mst’s plan for MX::Antlers, which is a way to use the actual Moose, with the speed of Mouse, without persistence or anything like that. Great for CGI and whatnot.

Now I’m a little fuzzy on the implementation, but if I understand correctly this will “compile” Moose into a single file. It will not include Class::MOP, so you won’t be able to use ->meta, but basic modules generally don’t need it, so no big deal really. What I am working on is updating the existing Moose test suite to disable the tests for ->meta. My current plan is to use an environment variable, but whatever I do, it will be wrapped in a function so that we can switch to some other mechanism if we need to.
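As a rough sketch of what such a switch could look like: the variable name MX_ANTLERS_NO_META, the helper meta_tests_enabled, and the stand-in class My::Point are all made up for illustration, and the real plan may well differ. The point is only that the tests ask a function, never the environment directly.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical switch: the post only says "an environment variable",
# so the name MX_ANTLERS_NO_META is an assumption made for this sketch.
sub meta_tests_enabled {
    return !$ENV{MX_ANTLERS_NO_META};
}

{
    # Stand-in class so the sketch runs without Moose installed.
    package My::Point;
    sub new  { my ($class, %args) = @_; return bless {%args}, $class }
    sub x    { return $_[0]{x} }
    sub meta { return 'My::Point::Meta' }   # placeholder for a real metaclass
}

my $point = My::Point->new(x => 3);
is($point->x, 3, 'accessor works with or without a metaclass');

SKIP: {
    # Tests that need ->meta are skipped when the switch is set.
    skip '->meta is not available in the compiled build', 1
        unless meta_tests_enabled();
    ok(My::Point->meta, 'metaclass is reachable');
}

done_testing;
```

Because every test goes through meta_tests_enabled, swapping the environment variable for some other check later means touching one function.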

So! Get excited! Depending on the code we may be able to abstract it to apply to other heavy frameworks (Catalyst?) to make them sufficiently fast as well. Once I have some basic stuff in the public repo (hopefully a couple before Friday) I’ll put up a post or two explaining how to get the work done, and then we can parallelize the work. Who’s with me?

Speed, OO, Black Magic, and YAGNI + RTFM

    At work we have a certain customer who has a database with something like 250 report tables. They are generated and maintained purely in code and if you ever touch one manually it’s for a one-off script or something. Anyway, we recently started using DBIx::Class at work and part of that meant accessing those report tables with DBIC.

    The first step was to use DBIx::Class::Schema::Loader, which looks at the table structure and generates a bunch of Perl files. Then we just use DBIC as normal. Unfortunately this is in a CGI environment, without mod_perl, FastCGI, or any of that stuff. That means every request is not only loading all 250 files (each about 25 KB in size) but also parsing them. Just to be clear, we have a 15 second startup time. Have fun telling your customer that that’s better in an AJAX context.

    So that was just Not Okay. I asked in #dbix-class and robkinyon suggested that I make a YAML file that would represent all of the tables. He couldn’t give me code and it was Friday, but I did get my code to add columns on the fly, so it couldn’t be much harder to go from there, could it?

    Of course it could! It always will in such a context. So I asked again in #dbix-class: what would be the best way to generate in-memory classes from a single data structure? castaway recommended subclassing DBIx::Class::Schema::Loader to do what I wanted. That took a few hours to get working, including figuring out how everything fit together. It was really pretty exciting because it was a Good Way to do what I wanted. Too bad there are some Schema::Loader implementation issues.

    Turns out that after making our full data structure it took longer to load the classes into memory than to leave them on the hard drive. I should have realized this would be the case, but for some reason I blocked it out: S::L works by writing temporary files and having perl include them, so really we were reading just as much data and writing it as well. At this point I had spent about 10 hours total on this project and it was absurdly slow. My boss was not very happy. The irony was that I had used the initial success of the S::L subclass as leverage in a certain bargain, which I hope to post about soon.

    I spoke with ilmari, the person who wrote S::L, and he told me how to make S::L do everything in memory, but I couldn’t get it to work and my boss (quite reasonably) was breathing down my neck.

    So pure, unadulterated Black Magic it was. I would write all the code as a string and then include that with strange require tricks. I can’t take credit for this really, as I got a lot of help from people on StackOverflow. Anyway, here is how that could be done:

    #!/usr/bin/perl

    use strict;
    use warnings;
    use feature ':5.10';

    my $data_struct = [{
          table => 'Foo',
          columns => [qw{foobar foobaz}],
       },{
          table => 'Rpt1',
          columns => [1..20],
       }];

    $data_struct = [map { { table => "EPMS::Schema::Result::".$_->{table}, columns => $_->{columns} } } @{$data_struct} ];

    my $tables = [ map { $_->{table} } @{$data_struct} ];
    my $columns = { map { $_->{table} => $_->{columns} } @{$data_struct}};

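    foreach my $class (@{$tables}) {
        no strict 'refs';
        # Install an INC method in the class itself; an object of that class
        # placed in @INC is then consulted by require as a hook.
        *{ "${class}::INC" } = sub {
            my ($self, $req) = @_;
            return unless $req eq $class;
            # Build the source of the class as a string.
            my $data = qq!
                package $class;
                use feature ':5.10';
                sub foo {
                   my \$self = shift;
                   say "\$self:".\$self->columns;
                }
                \$columns = [!.join(',', map { qq['$_'] } @{ $columns->{$class} } ).qq!];
                sub columns {
                   my \$self = shift;
                   return join ',', \@{\$columns};
                }
                1;
            !;
            # Hand require an in-memory filehandle to read the source from.
            open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
            return $fh;
        };
       my $foo = bless { }, $class;
       unshift @INC, $foo;

       # require with a string (not a bareword) passes the name through as-is,
       # which is why the hook compares $req against $class directly.
       require $class;
    }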

    That works, but it was still actually pretty slow, surprisingly.
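The trick in that listing rests on require's @INC hooks. Stripped to its core, the mechanism looks roughly like the sketch below. This is a simplified variant, not the code we ran: it uses a single code reference instead of one blessed object per class, and the module name My::InMemory::Foo is made up for illustration.

```perl
use strict;
use warnings;

# Source code kept in memory, keyed by the path that require will ask for.
# My/InMemory/Foo.pm is a hypothetical module name used only in this sketch.
my %source_for = (
    'My/InMemory/Foo.pm' => join("\n",
        'package My::InMemory::Foo;',
        'sub columns { return qw(foobar foobaz) }',
        '1;',
        ''),
);

# A code reference in @INC is called as $hook->($hook, $path) by require.
unshift @INC, sub {
    my ($self, $path) = @_;
    return unless exists $source_for{$path};   # not ours: let require keep searching
    open my $fh, '<', \$source_for{$path}
        or die "Cannot open in-memory file: $!";
    return $fh;                                # require compiles what it reads from here
};

require My::InMemory::Foo;
print join(',', My::InMemory::Foo->columns), "\n";   # prints "foobar,foobaz"
```

Note that perl still has to parse and compile every byte of that source, which fits the observation that moving the code from disk to memory did not buy much.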

    I also tried concatenating all of the files into a single file and it was still more or less just as slow.

    So finally I broke down and did the unthinkable: I RTFM’d on DBIx::Class::Schema to see if there were any clues. The clue that I got out of it was the following bit:

    register_class

    You will only need this method if you have your Result classes in files which are not named after the packages (or all in the same file). You may also need it to register classes at runtime.

    So what I could do is generate all the classes with code, but really simply, without all the column metadata since the DB is the single point of truth in this context, and then load the ones we’d need on the fly!
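A generation step along those lines might look like the sketch below. Treat the details as assumptions: the body of the generated class (a bare DBIx::Class::Core subclass with only a table name), the table naming scheme rpt1, rpt2, and so on, and the helper names render_class and write_report_classes are guesses for illustration, not what we actually shipped. The script writes into a temporary directory so it can be run safely.

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Spec;
use File::Temp qw(tempdir);

my $namespace = 'EPMS::Schema::NonDefaultResult';

# Assumed shape of a stripped-down result class; the real template may
# declare more (for example column names) than this sketch does.
sub render_class {
    my ($class, $table) = @_;
    return join "\n",
        "package $class;",
        'use strict;',
        'use warnings;',
        "use base 'DBIx::Class::Core';",
        "__PACKAGE__->table('$table');",
        '1;',
        '';
}

# Write one .pm file per report number under $lib_dir; return the file paths.
sub write_report_classes {
    my ($lib_dir, @report_nums) = @_;
    my $dir = File::Spec->catdir($lib_dir, split /::/, $namespace);
    make_path($dir);
    my @written;
    for my $num (@report_nums) {
        my $file = File::Spec->catfile($dir, "Rpt$num.pm");
        open my $fh, '>', $file or die "Cannot write $file: $!";
        print {$fh} render_class("${namespace}::Rpt$num", "rpt$num");
        close $fh or die "Cannot close $file: $!";
        push @written, $file;
    }
    return @written;
}

my $lib   = tempdir(CLEANUP => 1);
my @files = write_report_classes($lib, 1 .. 3);
print "$_\n" for @files;
```

Since the files are tiny and only the requested report is ever required, the per-request cost no longer grows with the number of report tables.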

    It was easy as pie. Use a template to generate the classes and write them to files. (We used the namespace EPMS::Schema::NonDefaultResult, so that it’s clear that these are result classes, but ones not loaded by the schema’s load_namespaces method.) Then I just added a method to our Schema that would do the following (from memory):

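    sub load_report {
       my ($self, $report_num) = @_;
       my $class = "EPMS::Schema::NonDefaultResult::Rpt$report_num";
       # String eval so that require treats $class as a package name, not a path.
       eval "require $class; 1" or die "Could not load $class: $@";
       # Make the class available as $schema->resultset("Rpt$report_num").
       $self->register_class("Rpt$report_num", $class);
    }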

    And that was basically it. I also wrote a little bit of code to short circuit if the report is already loaded. Anyway, it works reasonably quickly and isn’t too ghetto! So the moral of the story is probably to RTFM before you try crazy stuff.
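The short circuit can be as small as one line that checks whether the source is already registered. Here is a sketch, again from memory and under assumptions: My::FakeSchema is a stand-in that imitates only the sources and register_class methods of DBIx::Class::Schema so the example runs without DBIC, and the My::Report namespace is made up.

```perl
use strict;
use warnings;

{
    # Stand-in for the real schema so the sketch runs without DBIx::Class.
    # It imitates two DBIx::Class::Schema methods: sources and register_class.
    package My::FakeSchema;

    sub new     { my ($class) = @_; return bless { registered => {} }, $class }
    sub sources { my ($self) = @_; return keys %{ $self->{registered} } }

    sub register_class {
        my ($self, $source_name, $class) = @_;
        $self->{registered}{$source_name} = $class;
        return;
    }

    # Returns 1 if the report had to be loaded, 0 if it was already there.
    sub load_report {
        my ($self, $report_num) = @_;
        my $source_name = "Rpt$report_num";
        my $class       = "My::Report::$source_name";   # hypothetical namespace

        # Short circuit: skip require and registration for known reports.
        return 0 if grep { $_ eq $source_name } $self->sources;

        eval "require $class; 1" or die "Could not load $class: $@";
        $self->register_class($source_name, $class);
        return 1;
    }
}

# Pretend the generated file for report 7 has already been found on disk.
$INC{'My/Report/Rpt7.pm'} = __FILE__;

my $schema = My::FakeSchema->new;
print $schema->load_report(7) ? "loaded\n" : "skipped\n";   # prints "loaded"
print $schema->load_report(7) ? "loaded\n" : "skipped\n";   # prints "skipped"
```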
