A Foolish Manifesto

fREWdiculous!

Archive for the ‘perl’ Category

In the spirit of one of my other posts I’ve decided to chronicle my path with at least a couple event loops.

More than eighteen months ago I documented my decision to start using an event loop as it would handle things I may not have considered, the example mentioned specifically in that post being exceptions. Things went well! I used the code I documented in that post for a long time with no issues until recently. It turns out that the event loop I was using didn’t actually handle exceptions at all, thus completely nullifying my reason to use it.

So I looked elsewhere. I looked at the grandfather of event loops, POE. I like a lot of the components that have been written on top of POE, but POE itself is frustratingly low level. That’s a topic for another post though (yes I looked at Reflex.)

After my last post and speaking with Rocco Caputo, auther of our venerable POE, I came up with the following runner role:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
package Lynx::SMS::DoesRun;

use Moose::Role;
use POE;

# this merely uses our logger etc
with 'Lynx::SMS::HandlesDieForPOE';

requires 'single_run';

has period => (
   is => 'ro',
   required => 1,
);

has schema => (
   is => 'ro',
);

sub run {
   my $self = shift;

   POE::Session->create(
      inline_states => {
         _start => sub {
            $_[KERNEL]->sig( DIE => 'sig_DIE' );
            $_[KERNEL]->yield('loop');
         },
         sig_DIE => \&die_handler,
         loop => sub {
            $_[KERNEL]->delay( loop => $self->period );
            $self->single_run;
         },
      },
   );

   POE::Kernel->run;
}

no Moose::Role;

1;

This works fine. It’s (to me) a little ugly, but I imagine that I’d get used to it if I were to write much more POE. But then Rocco pointed out that maybe I’m just wasting my time with event loops for this use case. Ultimately using POE as a glorified Try::Tiny is stupid and really not even the goal. So finally I’ve ended up just a few steps beyond where I started:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
package Lynx::SMS::DoesRun;

use Moose::Role;
use Try::Tiny;
use Log::Contextual qw(:log :dlog);

requires 'single_run';

has period => (
   is => 'ro',
   required => 1,
);

has schema => (
   is => 'ro',
);

sub run {
   my $self = shift;

   while (1) {
      try {
         $self->single_run;
      } catch {
         my $error = $_;
         log_error { $error }
      };
      sleep($self->period)
   }
}

no Moose::Role;

1;

The observant reader will notice that despite me mentioning the above use case, which is really the only important one for me given that our actual server will run all of our services in separate processes, there is still the benefit of Event Loops mentioned in the first post for development purposes (starting all services in a single program.) I have indeed converted that to POE, but that probably doesn’t matter. I run my unified service script maybe once or twice a year at this point. Here it is if anyone is interested:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
package Lynx::SMS::Runner;

use Moose;
use POE;

with 'Lynx::SMS::HandlesDieForPOE';

has tasks => (
   is => 'ro',
   default => sub { [] },
);

sub run {
   my $self = shift;

   POE::Session->create(
      inline_states => {
         _start => sub {
            $_[KERNEL]->sig( DIE => 'sig_DIE' );
            $self->create_children_sessions,
         },
         sig_DIE => \&die_handler,
      },
   );

   POE::Kernel->run;
}

sub create_children_sessions {
   my $self = shift;
   my $x = 0;
   my @tasks = @{$self->tasks};
   for my $task (@tasks) {
      POE::Session->create(
         inline_states => {
            _start => sub {
               $_[KERNEL]->delay(loop => ($x++ / @tasks ));
            },
            loop => sub {
               $_[KERNEL]->delay( loop => $task->period );
               $task->single_run;
            },
         },
      );
   }
}

no Moose;

__PACKAGE__->meta->make_immutable;

1;

I look forward to using POE for actual heavy-lifting in another one of our projects, and will post about the experience when I get there.

  • 0 Comments
  • Filed under: perl
  • Perl Event Loop

    I have some extremely basic code using AnyEvent but I recently found out that I was doing it wrong. That is, the entire reason I am using an event loop is to catch errors, log them, and keep going. That’s one of the great benefits that Catalyst gives me; I override one thing and I get universal error logging. The problem is that AnyEvent specifically does not handle this use case.

    I have a working solution, but as I am planning on rewriting our services in evented code this prohibition makes me really worried. The problem is that you can’t just know your code won’t die. Exceptions happen and as a developer of a language that’s not Java or C# I don’t know where they come from. My current solution is ok, but I don’t think it’s really viable long term. Here’s my current code:

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    #!/usr/bin/env perl

    use strict;
    use warnings;

    use AnyEvent;
    use Try::Tiny;

    sub event {
       print "looped\n";
       die "lol" if rand() < .5;
    }

    sub NEVER_DIE {
       my $code = shift;
       return sub {
          try \&$code, catch { warn $_ } # <-- this should be logging, you get the idea
       }
    }

    my $cv = AE::cv;
    my $w = AE::timer 0, 1, NEVER_DIE(\&event);
    $cv->recv;

    This works for simple cases, but if I chose to go down this route in the long term I’d have to wrap every single code ref in NEVER_DIE, which is pretty lame.

    I looked at POE as it may support my use case better but as far as I can tell it’s support is WORSE. Here’s what I came up with:

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    #!/usr/bin/perl

    use strict;
    use warnings;

    use POE;
    use Try::Tiny;

    sub handler_start {
      my ($kernel, $heap, $session) = @_[KERNEL, HEAP, SESSION];
      $kernel->yield('event');
    }

    sub NEVER_DIE {
       my $code = shift;
       return sub {
          try \&$code, catch { warn $_ } # <-- this should be logging, you get the idea
       }
    }

    sub event {
       print "looped\n";
       die "lol" if rand() < .5;
       $_[KERNEL]->delay_add('event', 1);
    }

    POE::Session->create(
     inline_states => {
       _start    => \&handler_start,
       event     => NEVER_DIE(\&event),
       _stop     => sub{},
     }
    );

    POE::Kernel->run();
    exit;

    So I still have to use NEVER_DIE, so that’s a lose, and worse, if event dies before the call to delay_add we end anyway. Sure, I could put delay_add at the beginning of event, but that brings me to another thing that really bothers me about the “POE Way” (my own terminology, I may just not be getting it), for my AnyEvent code I can add a bunch of things and they don’t have to know about each other. The loop handles calling the events. With POE it seems like I have to manually tell it “call this, now call this.” That seems to defeat the entire purpose! What am I missing here?

    If anyone knows an event loop I should consider (MUST RUN WELL ON WINDOWS) or maybe some setting in POE and some kind of POE timer thing, or some way of safely overriding how AE calls it’s events, please, comment and let me know.

  • 3 Comments
  • Filed under: perl
  • Try Out Color Coded SQL

    Thanks to arcanez, my color coding SQL Logging has been merged into DBIC’s master!

    That means you can easily try out the new color coding! All you need to do to try it out is clone our master from git:

    1
    git clone git://git.shadowcat.co.uk/dbsrgits/DBIx-Class.git

    Make sure you install any new deps. The main one will be SQL::Abstract 1.68.

    1
    cpanm --installdeps .

    And then use that as your lib directory when you run your server or whatever:

    1
    perl -I ~/DBIx-Class/lib scripts/foo_server.pl -rd

    Now, you won’t notice a difference till you set the DBIC_TRACE_PROFILE variable. It sets the color profile to use. If you are on Linux and install the ANSIColor package, you probably want to set it to “console”. If you are in win32 or do no want to install ANSIColor, set it to “console_monochrome”. Both profiles fill in placeholders for you, for excellent readability, so that’s extremely helpful.

    If you would like to make a nicer colorscheme, or more importantly want to use modern 256 color consoles, feel free! The documentation for that is available at SQL::Abstract::Tree. The best way to define one of those is to make a json file (I do ~/dbic.json) and populate it with the profile information you like, and then set DBIC_TRACE_PROFILE to the full path of the file. That way you can experiment with various profiles and when you think you have one that’s worth sharing, send it to me and I’ll probably merge it in!

    Anyway, we hope to cut a new release in a week or two, with lots of other great new stuff, so feel free to wait if you’d prefer.

  • 1 Comment
  • Filed under: perl
  • CPAN Mashup?

    One of the common issues I hear about CPAN is that it’s so sprawling that people do not know which modules to use and which not to use. Hopefully part of that issue will be solved by the Enlightened Perl Core, but that will only go so far. Recently there were a couple posts regarding this issue. (Note: They are in reference to a post I made and they are from the same guy.) I even recently had a discussion regarding this with my boss recently because we needed some barcode generation code. (We ended up using Barcode::Code128 but we spent a lot of time trying to get GD::Barcode to do what we wanted.) Furthermore I chatted with the EPCore guys regarding this and they all helped me think through a lot of these issues; I have a muddled mind :-)

    I think a solution to this problem is feasible. I imagine a web service that will help recommend various packages for given tasks. I have the following (pie in the sky) goals in mind:

    • Automated and Objective (as much as possible)
    • Easy to Use
    • Fast
    • Configurable scoring (for people who don’t like the default metrics)

    Here are some possible sources of data to make this all work:

    CPAN Testers is obvious. It has massive amounts of data and it can at least tell you if a module is good by it’s own measure. It might be worthwhile to look into some kind of scaling based on tests (configurable of course.) The idea there is that if a module has never failed because it has no tests that shouldn’t count.

    CPAN Deps isn’t even completed. I’ve only heard this name dropped, but the idea is clear. With it you could find out what modules are effectively core in that lots of people depend on them. You could use this in a PageRank style way in that modules that have a high score help add to the score of modules they depend on.

    The Github watches link that I posted is where I originally got the idea for this. I’m not really sold on it, but mst liked it so much I figured I’d keep it in the list; I wish I could give you a link to the actual conversation. He hated the idea of using “failhub” :-) I do like the idea, but I am certainly not as smart or motivated as mst.

    And last but certainly not least, CPAN Ratings. CPAN Ratings is an excellent idea, but it needs some love. Part of that has to do with it’s actual implementation (at the very least it’s ugly,) but the real issue is the use of it. More people need to use it. I don’t know how to do that other than to use it myself. I think it might be good if, after using a module for at least an iteration, I rated it. If one were to rate a module too soon the results could become inflated. And as a side-note, I personally think we should use OpenID instead of BitCard, but it’s not worth changing a bunch of stuff just for that.

    And then I was thinking that we could use a combination of module name searching, tags added to META.yml, and tags added manually. So DBIx::Class would theoretically add the ORM tag (and others possibly) to their META.yml, and then someone would manually add the tag to Class::DBI. Then when people search for ORM they would at least find those two. They would then get a score based on the previous five metrics. I would say have anything with a score beneath a certain number not even displayed, but have a link that would allow the display; and maybe a user option that would permanently display hidden items.

    I think this is something that would certainly be worth attempting. It wouldn’t be easy, and the stuff I’ve said above is certainly riddled with errors, but that shouldn’t stop us. What do you guys think? Other ideas for data sources? Implementation ideas? Tuits?

  • 5 Comments
  • Filed under: perl
  • Web Comic Downloaders

    Since the beginning of my serious webcomic journey with xkcd, I think that was four years ago, I’ve been writing little scripts to help me get started. The first type of script is to grab integer-based, monotonically increasing files. Very easy. Done in Ruby.

    1
    2
    3
    4
    5
    6
    7
    #!/usr/bin/ruby -w

    Fromat = "http://foobar.com/comics/%08d.gif"
    1.upto(986) do |i|
      `wget #{sprintf(Fromat, i)}`
      sleep 1
    end

    The next harder are the ones that are based on the date of publication. Usually though, they will be published Monday-Wed-Fri or something like that, so you can just increase per day and then check if it’s the correct weekday. See more Ruby.

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    #!/usr/bin/ruby -w

    Day = 60 * 60 * 24

    Fromat = "http://www.foobar.com/comics/st%Y%m%d.gif"

    t = Time.local(2005, 2, 5)

    MWF = [1,3,5]

    until t == Time.local(2007, 7, 9)
      if MWF.include? t.wday
        `wget #{t.strftime(Fromat)}`
        sleep 3
      end

      t += Day
    end

    And then lastly, and hardest of all, are arbitrary files that can only be ascertained by clicking links. Perl + CPAN to the rescue!!!

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    #!perl
    use strict;
    use warnings;
    use feature ':5.10';

    use WWW::Mechanize;
    my $mech = WWW::Mechanize->new( autocheck => 1 );

    sub process_page {
       my @images = $mech->find_all_images(
          url_abs_regex => qr{http://www\.foobar\.com/memberimages/.*\.jpg}i
       );
       foreach (@images) {
          my $url = $_->url;

          if ($url !~ qr/banner/i) {
             say "downloading $url";
             qx{wget $url};
          }
       }
    }

    $mech->get( 'http://www.foobar.com/foo/bar/series.php?view=single&ID=72709' );
    process_page;
    while (
       $mech->follow_link(
          # third link on page matching regex
          n             => 3,
          url_abs_regex =>
             qr{http://www\.webcomicsnation\.com/dmeconis/familyman/series\.php\?view=single&ID=\d+}i
       )
    ) {
       sleep 1;
       process_page;
    }

    This last one should be checked on every now and then as it is easy for it to get stuck in an infinite loop on the last couple comics.

    Anyway, enjoy! This set of scripts should take care of all of your webcomic scraping needs :-)

    Note: these are not to avoid ads, but to speed up the initial reading process as speed is an issue when reading 400 or more strips.

  • 1 Comment
  • Filed under: perl, Ruby
  • Script to Rename MP3′s

    I recently got a new car stereo due to the other one being stolen. I am almost entirely happy with the model that I ended up purchasing, but one thing that it does, which is really obnoxious, is that it doesn’t sort the files correctly unless the track number is early on in the file name. Even if all tracks are “FooBarBaz 01 – name.mp3″ it seems to ignore the number unless it’s the very beginning of the file name. Anyway, easy fix:

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    #!perl
    use strict;
    use warnings;
    use feature ':5.10';

    use Music::Tag;
    use File::Find::Rule;
    use File::Basename "fileparse";
    use File::Copy "move";
    use File::Spec;
    my $directory = shift || '.';

    my @songs
       = File::Find::Rule->file()->name( '*.mp3' )
       ->in( $directory );

    foreach my $song (@songs) {
       my $info = Music::Tag->new($song);
       $info->get_tag;
       my $track = $info->track;
       my $title = $info->title;
       my (undef, $dir, $suffix) =
          fileparse($song, qr/\.[^.]*/);
       $info->close;
       if ($track and $title) {
          my $newfilename = File::Spec->catfile(
             $dir,
             sprintf "%02d %s%s", $track, $title, $suffix
          );
          if ($song ne $newfilename) {
             say "renaming $song to $newfilename";
             move $song, $newfilename;
          }
       }
    }

    It doesn’t really deal with illegal characters, it just doesn’t rename those files. Eventually I’ll get around to doing that. Anyway, just figured someone might be interested/want to copy paste this.

  • 3 Comments
  • Filed under: perl
  • Don’t Repeat Yourself: JSON

    With DBIx::Class we typically have a TO_JSON method which returns a hashref of the data you want in your json. Here’s an example:

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    sub TO_JSON {
       my $self = shift;
       return {
          id => $self->id,
          name => $self->name,
          comments => $self->comments,
          email => $self->email,
          job => $self->job,
          ok => $self->ok,
          i_cant => $self->i_cant,
          think_of => $self->think_ok,
          anymore => $self->anymore,
       };
    }

    Here’s the shorter version mst inspired me to write:

    1
    2
    3
    4
    5
    6
    7
    8
    sub TO_JSON {
       my $self = shift;
       return {
          map { $_ => self->$_ }
             qw{ id name comments email job
                ok i_cant think_of anymore },
       };
    }

    Anyway, not very complex, but still awesome.

  • 2 Comments
  • Filed under: perl
  • Perl 5 to Perl 6 Rewrite

    My coworker Wes asked me if there could be a nice refactor for the following function which checks CAS Numbers to ensure their validity. After struggling for 30 minutes I gave up trying to make it a little bit nicer with reduce.

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    sub cas_old {
      my $cas = shift;
      if ($cas =~ /\d{1,8}-\d\d-\d/) {
        my @ary = grep { $_ ne '-' } split(//, $cas);
        my $check = pop @ary;
        my $count = @ary;
        my $sum;
        for (@ary){
          $sum += $_ * $count--;
        }
        return $sum % 10 == $check;
      }
      return;
    }

    Let’s take a look at this and figure it out. The crunchy bit is the for loop, so I’ll go through that. Basically we are summing each item times a weight that is inversely proportional to it’s location in the list. Or to be more explicit, let’s do an example on the board (7732-18-5.) 5 is the check digit.

    $_ $count $_ * $count $sum
    7 6 42 42
    7 5 35 77
    3 4 12 89
    2 3 6 95
    1 2 2 97
    8 1 8 105

    So basically we are making a special summation. The thing that’s unusual is that we have a decrementing counter along with it. If I had the control structure which I am about to show you in my mind already the solution might have jumped out sooner.

    So I asked about it in #perl6 and it turns out there is a very nice Perl 6 version. It takes advantage of the mystical hyperoperator (>>infix op<<); that is, it takes two lists and performs an operation on each element together. Think SIMD. It also uses reduce ([infix op]) which I have mentioned before. Check it out!

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    sub cas(Str $cas) {
        if $cas ~~ /(\d ** 1..8)\-(\d\d)\-(\d)/ {
            my @digits = $0~$1.split '';
            my $check = $2;
            return ([+] @digits.reverse
               >>*<<
               (1..@digits)) % 10 == $check;
        }
        return Bool::False;
    }

    This does the same thing as above. Or to put it in English, we take our digits, reverse them, and then multiply each digit (hyperoperator, >>*<<) by the respective integer in the other list, that is, 1 to the size of the list. We then sum (reduce, [+]) that new list, and get the modulus 10 of it.

    Very elegant, no?

    Update: Turns out there is also a very elegant version in p5, according to mst. Check it out!

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    sub cas_old {
      use List::Util 'sum';
      my $cas = shift;
      if ($cas =~ /(\d{1,8})-(\d\d)-(\d)/) {
        my @digits = split(//, $1.$2);
        my $count = @digits;
        my $check = $3;
        return (sum map $_ * $digits[-$_],
           1 .. $count) % 10 == $check;
      }
      return;
    }

    It’s very similar to the p6 version, just using fewer generalized operators, so you should be able to follow it fairly well.

  • 0 Comments
  • Filed under: perl
  • Dallas.p6m: May 2009

    We had the second Dallas.p6m on May 12, 2009. Along with me there were two of my coworkers, s1n, Graham Barr, and Patrick Michaud. We discussed a lot of things. One of which was the difference between subs and methods in Perl6. And the fact that you can’t imply self. This should explain it:

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    class A {

       sub foo {
          say 'foo';
       }

       method bar($o:) {

          # much to s1n's chagrin, you can't
          # have baz() imply self.baz.  These
          # are his options

          say 'bar';
          self.baz;
          $o.baz;

          # great for when you have a lot
          # of object methods close by

          given self { .baz; }

          # this is the same as above but it's
          # not scoped.  Good for short methods

          $_ = self;
          .baz;
         
          # not really recommended as it
          # doesn't seem to be for anything
          # but attributes.
          $.baz;
       }

       method baz {
          say 'baz';
       }
    }

    my $a = A.new;
    A::foo;
    #A::bar; # dies

    #$a.foo; # dies
    $a.bar;
    A.bar;
    # note: there may be a distinction between
    # class and instance methods, but for now
    # you use the same method for both

    I also asked Patrick if he thought that Perl 5 in Perl 6 would really happen and if so how. He said it would happen, but probably not soon. There are really three options. The first is to embed Perl 5 in Parrot. This is really the “best” option as it would have 100% compatibility (except weird XS stuff,) and I think Patrick said that it had been prototyped, so that’s encouraging. The next option would be to reimplement most of Perl 5 in Perl 6. This would never get close to 100% but it would still be an option. The last option would be to have major parts of CPAN reimplemented in Perl 6, thus making compatibility far less important. Important CPAN modules would be DBI, a good templating system, a web framework (in progress), and some form of GUI toolkit.

    Somehow we got into a discussion about Mojo, the framework Graham uses at $work. It is modeled after Rails and is supposed to be simple to port to Perl 6. The most important thing about it, as far as I can tell, is that it has no dependencies. Graham made it sound like a lightweight framework, but I guess he just meant the no deps thing. CGI-Application (what I use) totals to 7k lines. Mojo, on the other hand, totals to 50k. Not exactly lightweight, but low dependencies is an interesting goal.

    And then two thirds of us are going to YAPC::NA, so we talked about that some. Very exciting things coming up!

  • 0 Comments
  • Filed under: perl
  • What I want from the Perl 5 support policy

    This is in response to chromatic’s post Writing Perl 5′s Support Policy

    I want to be able to use the support policy as a reason to convince customers with lots of Perl installs that they need to update. A big part of this means an easy upgrade.

    Probably most of the people using Perl 5 are in Unix. That makes it easier for you folks. On Windows installing Perl is no simple task, either ActivePerl or Strawberry Perl.

    For example, at $work we use Apache/mod_perl. (I don’t wanna hear your “FastCGI! FastCGI! FastCGI!”, none of you people have actually helped me so far!) Let’s say I want to use the shiny new Perl 5.11. I install it. Awesome. Wait! mod_perl doesn’t work! Ok, reinstall that. Oh wait! All of my XS/compiled modules don’t work! etc etc. I understand the fact that Windows is a second class citizen here, but in Unix you probably have the same issue. You just made some kind of package to install it all at once or something. That’s great, but does it help me? Is updating really supported?

    On a side note a point that my boss made when we discussed this issue is the fact that the community may not support Perl 5.8 in the future, but if companies will pay for it, ActiveState will probably support it. This is good for the customers and helps the community, so I’d say that’s really a win-win.

    Anyway, those are my thoughts. Take ‘em or leave ‘em.

  • 3 Comments
  • Filed under: perl