fREWdiculous!
3 Jul
Some of you probably know that I have some opinions, thoughts, and ideas. I actually started this blog because I wanted to write my own (can you guess what?) Manifesto. I chose to write it as a blog because I tend to change my mind. Ask some of my friends and family. They have all observed that I was going to be a math teacher, a psychologist, a biological engineer, a doctor, and a writer. (Take note: I am none of those things.)
I started programming in earnest about 10 years ago, when I purchased Programming Perl. I had done some basic, but I knew that real programmers used perl. I knew I would never use any other language. 6 years later I turned my back on perl when a professor introduced me to ruby. While in ruby-land I learned functional programming, what MVC was, what an ORM is, and the beauty of syntax (I still dig 5.times {…}). I knew that Rails was the One True way to program websites and that Prototype and Scriptaculous were the only way to program javascript.
Then 4 years later someone offered to pay me to write perl. I came back somewhat grudgingly, and I came extremely close to trying to write a certain project with rails. Fortunately (for my current dogmatism) my boss convinced me to stick with perl. Somewhere along the line I learned that perl 6 is truly being developed. I helped some and had some fun. I read the book currently titled The Passionate Programmer. After reading it I decided to start seriously looking into switching from IIS to Apache.
After setting up Apache on my personal computer so I could have a useful error log I started researching ORMs. I found that The One True ORM of any given language is DBIx::Class. I will never use another ORM as long as I live. I have posted about it a few times now. I’ll leave that at that.
Larry Wall says that the three programming virtues are laziness, impatience, and hubris. I agree with his conclusions. The first two often lead to code reuse. Code reuse is an excellent goal. Code reuse is what keeps my current codebase nimble and exciting to work on.
Part of code reuse means using libraries to help you get your job done. Did I mention DBIx::Class? Yeah, it helps me get my job done. Now, when I first started getting paid to code I was told that in a professional context, we don’t waste our time reinventing the wheel. Agreed! Let us not reinvent the wheel.
So instead of reinventing the wheel (or purchasing a library that does the job for us), we’ll find some Open Source library that does (or almost does) the job for us. Before I go further I’d like to make a few points about Open Source software. (Also, let me remind you that I am Holden Caulfield right now so I may be lying, on purpose or accident.)
I do not use Open Source software because I desire or need freedom. My political friends tell me that I am spoiled for saying that freedom is not the highest virtue and that I would not be willing to die for freedom. There are other virtues that I (hope) would be willing to die for, but that’s another chapter.
I do not use Open Source software because I am poor. I purchase indie video games because they are awesome works of art and they are not cheap. I donate to Open Source and otherwise free software that I regularly use because I am glad to pay for the excellent work that someone will do to make my job/life easier. I think that it is fine to ransom features as an Open Source programmer.
I do not use Open Source because I am a communist. There is no reason that a programmer should give you his time and effort for free. Let me redact that statement: there is no reason that a person should give you his time and effort for free. If you view me as a carbon offset to the earth and that everything I do should be given to the poor, that’s fine. We are all wrong sometimes. Let me be clear: I love Ayn Rand as a philosophess and I agree with her unconditionally.
I use Open Source software because I am a programmer. Jeff Atwood says that “If it’s a core business function, write that code yourself, no matter what.” I agree, Jeff! The problem comes when you purchase an over-the-counter library, it suits your fancy perfectly, and then six months later, as always happens, the customer wants more. The library no longer works for you, so you either pay the Closed Source vendor to implement the features you need, or find another library and port all of your code to that.
This is what happens to me: I use an Open Source library that does what I need. When I eventually outgrow it or it doesn’t meet a specific need, I either whine enough to get someone to add the feature I need, or I figure out how to add it myself. I’m not even a very good programmer; I just really like to program.
Let me put it another way: do you have any friends who really like to work on their car? Do they buy the brand new drive-by-wire automatic Toyota that is more black box than car?
This post is pushing up against the thousand word mark and we certainly wouldn’t want to go there, so I’ll repeat myself one more time: I am a programmer. I will continue to use Open Source software because I love to program and because I don’t want any Golden Handcuffs.
So programmers, ask and it will be given to you; seek and you will find; untar and the code will be opened to you. Suits: feel free to purchase black boxes as Golden Handcuffs. Thank you and have a nice weekend.
1 Jun
Since the beginning of my serious webcomic journey with xkcd, I think that was four years ago, I’ve been writing little scripts to help me get started. The first type of script is to grab integer-based, monotonically increasing files. Very easy. Done in Ruby.
```ruby
#!/usr/bin/ruby -w

Format = "http://foobar.com/comics/%08d.gif"

1.upto(986) do |i|
  `wget #{sprintf(Format, i)}`
  sleep 1
end
```
The next harder are the ones that are based on the date of publication. Usually though, they will be published Monday-Wed-Fri or something like that, so you can just increase per day and then check if it’s the correct weekday. See more Ruby.
```ruby
#!/usr/bin/ruby -w

Day = 60 * 60 * 24
Format = "http://www.foobar.com/comics/st%Y%m%d.gif"

t = Time.local(2005, 2, 5)
MWF = [1,3,5]

# >= rather than ==: adding 86400 seconds can drift by an hour across a
# daylight saving change, so t may never equal midnight of the end date.
until t >= Time.local(2007, 7, 9)
  if MWF.include? t.wday
    `wget #{t.strftime(Format)}`
    sleep 3
  end
  t += Day
end
```
And then lastly, and hardest of all, are arbitrary files that can only be ascertained by clicking links. Perl + CPAN to the rescue!!!
```perl
#!perl
use strict;
use warnings;
use feature ':5.10';

use WWW::Mechanize;
my $mech = WWW::Mechanize->new( autocheck => 1 );

sub process_page {
    my @images = $mech->find_all_images(
        url_abs_regex => qr{http://www\.foobar\.com/memberimages/.*\.jpg}i
    );
    foreach (@images) {
        my $url = $_->url;
        if ($url !~ qr/banner/i) {
            say "downloading $url";
            qx{wget $url};
        }
    }
}

$mech->get( 'http://www.foobar.com/foo/bar/series.php?view=single&ID=72709' );
process_page;

while (
    $mech->follow_link(
        # third link on page matching regex
        n => 3,
        url_abs_regex => qr{http://www\.webcomicsnation\.com/dmeconis/familyman/series\.php\?view=single&ID=\d+}i
    )
) {
    sleep 1;
    process_page;
}
```
This last one should be checked on every now and then as it is easy for it to get stuck in an infinite loop on the last couple comics.
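One way to avoid babysitting the loop is to remember which pages have already been visited and stop as soon as one repeats. The following is only a sketch of that idea in Ruby, not part of the script above: `crawl`, `next_page`, and the little `links` table are made-up names for illustration, and the hash stands in for a real site so the example runs without a network.

```ruby
require "set"

# Follow "next" links until there is no next page or a page repeats.
# `next_page` is a hypothetical callable: it takes a URL and returns the
# URL of the following comic, or nil when there is none.
def crawl(start_url, next_page, max_pages = 1000)
  seen = Set.new
  url = start_url
  while url && !seen.include?(url) && seen.size < max_pages
    seen << url
    yield url if block_given?   # e.g. download the images on this page
    url = next_page.call(url)
  end
  seen.to_a
end

# Simulated site whose last page links back to itself.
links = { "/1" => "/2", "/2" => "/3", "/3" => "/3" }
visited = crawl("/1", ->(u) { links[u] })
puts visited.inspect   # prints ["/1", "/2", "/3"]
```

The `max_pages` cap is a second safety net for sites that generate endless distinct URLs; the default of 1000 is arbitrary.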
Anyway, enjoy! This set of scripts should take care of all of your webcomic scraping needs.
Note: these are not to avoid ads, but to speed up the initial reading process as speed is an issue when reading 400 or more strips.
5 May
I’ve mentioned this in at least one previous post before, but it bears repeating.
First off, here’s some context for the varied information I am about to throw out at you, dear reader. I keep in touch with both the Ruby and Rails worlds because I think they have some really good ideas. Recently there was a conference session about CouchDB. I read the slides and I was impressed. CouchDB is cool stuff! The method of presenting the information was a little weird, but I didn’t think he went too far. Little did I know that he actually showed a lot more in the actual session. (read: actual porn.) It was a big deal and a lot of people have strong feelings about what went down.
Update: See Giles’ comment; I can’t hear sarcasm on the internet; mea culpa.
I’ve mentioned Giles Bowkett before. He’s an amazing coder and often he’s a good guy, but sometimes he just has some strange ideas. Or at least that’s how I feel. Here’s a highlight from the previous link:
DHH is a god. He’s not just another programmer, whose code we should analyze, learn from, or improve. He is our leader. If DHH likes REST, we all like REST. If DHH refuses to apologize for anything ever, we all refuse to apologize for anything ever. This is the Rails Way and we must honor it. In fact, I would go to say that if we see DHH refusing to apologize for being rude when he’s right, we should go a step further and refuse to apologize for being a dick when we’re wrong.
After all, if you criticize somebody important in the Rails world, you don’t get an apology from them. You get banned from conferences. You should know who’s important, and you should kiss their ass, no matter what kind of scum they might be. If you think things should be different, then you should fuck off and go write Python like Zed Shaw.
That’s an interesting thought. I can understand loyalty. In the past year I’ve developed a lot of loyalty to the Perl world, and even specific people. I would say that I have a similar amount of loyalty to Larry Wall. But I would also say that I always have reservations to my loyalty to people. People are never perfect and I will never blindly listen to what they say. Especially when it comes to non-technical issues like ethics.
Here are a couple points of view that certain women have come to (I found these on _why’s blog).
Audrey Eschright: Ruby (and Rails in particular) loves the rock star image. You see it in job posts, how people talk about their work, and the way Rubyists rant on their blogs. It’s macho, it can be offputting to both genders, and it makes it easy in this kind of situation to say, “what’s your problem? I’m just busy being awesome”. It’s also a significant barrier to adoption for people who aren’t already a part of this culture, and don’t find it appealing.
Victoria Wang: DHH’s attitude seems to say that the more we lower ourselves to the most base level of marketing scum in the name of entertainment, the better, even if at the end of the day there are no more women, or anyone worth knowing, in the room. It kind of makes me want to never touch Rails code again.
But the Rails world isn’t always this way, and I am confused that certain people have come to the conclusion they have. Remember Giles’ thoughts before, about following DHH to the end of the earth, or at least to sexism? Giles recently had a post about how he wanted rails to be a welcoming community to homosexuals. How can you be open to people of different sexual orientations, but not to people of a different sex?
But then of course they have people like raganwald. If I could magically become someone, it would be raganwald + _why. Brilliant, hilarious, and good. Raganwald posted a few days ago with some thoughts that I think apply to the situation going down right now. I highly recommend reading it. But if you can’t I’ll apply it for you: the Rails community is not bad. Their various coders are quite smart and often good, but these recent events are not ok, and defending what happened is even worse.
And then on the other hand we have the Perl world. We certainly aren’t perfect. I think that often Perl programmers are stuck in the past and not willing to change things for the better. I can feel that changing though…
Recently Ovid has been posting a lot about Roles and apparently there was recently quite the debacle on use.perl.org. I didn’t read it because reading comments on use.perl.org is just too painful. But what I am getting at is that things were said that hurt other people. But unlike the current Rails situation, Ovid apologized.
You don’t have to have a huge ego to be awesome.
9 Feb
I watched this keynote from frozen perl this weekend and it was pretty great. There are plenty of things to take from this presentation, but the thing I want to mention comes from slides 66-77. Consider that mandatory reading to understand this blog post.
Now read this, this, and almost any of these.
Caveat Lector: All of those links may be outliers. I am certainly not reading a statistically valid sample of The Webternet; so maybe just consider this some random observations from this random dude.
Also: All of the above linked people are smarter and more motivated than I am. I would not criticize their intelligence in the least and their technical skills have my respect for sure. What have I done? A job? Helped move some books to people that needed them at school? Some stuff at a hospital? Compared to Rails, Mongrel, and Archeo- whatever. Compared to those guys I am nothing. Keep that in mind.
I just think it is important that in the perl world we have keynotes where people say: “Be Excellent to Each Other.” People say that perl is dead. I disagree. You should see how many blog posts there are out there saying, “no, perl is not dead!” But I would much rather perl be dead than perl be a bunch of jerks.
Basically this boils down to the difference between me and those other programmers. I think that our priorities should be:
Even if you aren’t a Christian I still think that exactly that priority list matters: Code is useless without users. Maybe they think they are exceptions because they are the users. Fine. I just know that if I see Giles, Zed, or DHH in real life I will be like, “Wow, what a great coder! Probably don’t want to actually hang out with them for more than 10 minutes though…”
Maybe I’m wrong though. Maybe the internet is just not a place for being nice. I just know that I am going to try to do my best to “Be Excellent to Each Other.”
27 Jan
First off let me say that I love ruby. Ruby more or less taught me functional programming, which I love. But I do think that perl6 (which you may think is vaporware) is better. I only post about features which I can use right now in rakudo. With that said we shall move onward.
Update: the rest of this post, although still correct, is flawed. See comments for the Correct Ruby solution
Fjord asked me about how to iterate over two lists at the same time in perl6. I have only had to do this a couple of times and I usually just end up doing a ghetto c-style for loop. In perl6 there is a better way. Check it!
Perl6:
```perl6
my @a = 1,2,3;
my @b = 4,5,6;
for @a Z @b { say "$^a $^b" }
```
prints:
```
1 4
2 5
3 6
```
You may think, “fREW, ruby can do this and it does it exactly the same, if not better!”
Ruby1.8:
```ruby
a = [1,2,3]
b = [4,5,6]
a.zip(b).each { |x| puts "#{x[0]} #{x[1]}" }
```
Do you notice the subtle difference? In ruby we get [[1,4],[2,5],[3,6]] vs perl’s (1,4,2,5,3,6).
That may not seem like a big deal, but what if you want to iterate over three lists? Here’s perl6:
```perl6
my @a = 1,2,3;
my @b = 4,5,6;
my @c = 7,8,9;
for @a Z @b Z @c { say "$^a $^b $^c" }
```
prints:
```
1 4 7
2 5 8
3 6 9
```
and Ruby:
```ruby
a = [1,2,3]
b = [4,5,6]
c = [7,8,9]
a.zip(b).zip(c).each { |x|
  puts "#{x[0][0]} #{x[0][1]} #{x[1]}"
}
```
That’s a drag! Anyway, I once read that there is a way to do this nicely in ruby, but I never could figure it out. I’d say the perl6 solution here is much nicer. Can someone prove me wrong?
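For what it's worth, `Array#zip` accepts more than one array, which gives flat triples instead of nested pairs. I can't say whether this is the exact solution the commenters gave (the comments aren't reproduced here), so treat it as a sketch; `zip_lines` is just a helper name made up for the example.

```ruby
# Zip three arrays in one call; each element is a flat [x, y, z] triple,
# which the block destructures directly.
def zip_lines(a, b, c)
  a.zip(b, c).map { |x, y, z| "#{x} #{y} #{z}" }
end

puts zip_lines([1, 2, 3], [4, 5, 6], [7, 8, 9])
# prints:
# 1 4 7
# 2 5 8
# 3 6 9
```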
20 Jan
Ruby:
```ruby
sum = (1..10).reduce {|x,y| x+y}
```
or maybe
```ruby
sum = (1..10).reduce(:+)
```
Perl6:
```perl6
my $sum = [+] 1..10;
```
That has got to be some of the sexiest perl syntax ever!
10 Jan
Ok so that may be a sensational title, but really the point is this: Rails people talk a lot, perl people just get stuff done. I am ok with getting stuff done, but I don’t know how perl people do it because they don’t talk about it as much.
Anyway, with that in mind my company (MTSI) is starting a new project next week. I get to be a big part of the planning and I am pretty excited. Normally our code is just perl scripts that use SQL and string interpolation or template toolkit. The use of TT is a big, fairly recent step forward. I recently turned a utilities file into a full on module, so that’s good too.
But really we aren’t where we should be. The state of the art with web applications has moved forward significantly in the past few years (and I think a lot of that is because of some smart people that use rails,) and there is no reason that we cannot use this knowledge in perl.
I originally looked at Catalyst, but decided against it because my boss thought it was a pretty big commitment for something none of us have experience with. So I decided to look at CGI::Application (which we used for TOME.)
Before I get into that I just want to say that we have decided on DBIx::Class as an ORM. I looked at Rose::DB::Object but DBIx::Class just seems to have more polish and support. Plus they support SQL Server which we use (no comment.) DBIx::Class is fairly easy to use and next time I’m at work I’ll post a snippet of how to do various things we want to do.
The main reasons I went with CGI::Application are these:
The biggest issue with CGI::Application that I initially had was this: how can I have multiple controllers? In TOME we only had one controller but I think we should have had at least two, maybe three. Anyway, after some research I found this: Re: Re: Re: Why CGI::Application?. Basically he does what I thought that you are supposed to do, except with some excellent OO goodness.
I was thinking that you would just have like, 5 CGI::Applications and those would be the controllers. Well, instead of that you have 5 CGI::Applications that subclass a main one which has basic functions (logging in etc) that all the other ones need. If a controller gets too big you either split it into a couple or you subclass it for a couple related controllers.
Hopefully it goes as well in practice as it does in my mind.
9 Jan
So recently I was asking if andand exists in perl (here and here) and someone implemented it! How awesome is that? See it here.
Anyway, so I looked at the code and figured, “Well heck, if it’s that easy, I should do this for map and join on arrays!”
It was already done! The autobox::Core module does it already! You have to use more javascript-y syntax instead of regular perl-ish, but I think it makes things more clear anyway.
Example:
1 2 3 4 5 6 7 |
To be perfectly clear, you would probably think of the first one as: we are joining the results of the map that multiplies each item by two and the second one as: multiply each item by two and then join them with a comma.
Anyway, I am *so* stoked to use this at work.
26 Dec
Exciting! It was apparently put up yesterday, on Christmas. What a cool gift right? I looked through the changes maintained by Mauricio and here are /my/ favorites.
*New literal hash syntax [Ruby2]*
```ruby
{a: "foo"} # => {:a=>"foo"}
```
*.() and calling Procs without #call/#[] [EXPERIMENTAL]*
You can now do:
```ruby
a = lambda{|*b| b}
a.(1,2) # => [1, 2]
```
*Multiple splats allowed*
1.9 allows multiple splat operators when calling a method:
```ruby
def foo(*a)
  a
end

foo(1, *[2,3], 4, *[5,6]) # => [1, 2, 3, 4, 5, 6]
```
*Mandatory arguments after optional arguments allowed*
```ruby
def m(a, b=nil, *c, d)
  [a,b,c,d]
end
m(1,2) # => [1, nil, [], 2]
```
*Object#tap*
Passes the object to the block and returns it (meant to be used for call chaining).
```ruby
"F".tap{|x| x.upcase!}[0] # => "F"
# Note that "F".upcase![0] would fail since upcase! would return nil in this
# case.
```
*Module#attr is an alias of attr_reader*
Use
```ruby
attr :foo=
```
to create a read/write accessor. (RCR#331)
*Enumerable#cycle*
Calls the given block for each element of the enumerable in a never-ending cycle:
```ruby
a = ["a", "b", "c"]
a.cycle {|x| puts x } # print, a, b, c, a, b, c,.. forever.
```
*Enumerable#group_by*
Groups the values in the enumerable according to the value returned by the block:
```ruby
(1..10).group_by{|x| x % 3} # => {0=>[3, 6, 9], 1=>[1, 4, 7, 10], 2=>[2, 5, 8]}
```
*Enumerable#drop*
Without a block, returns an array with all but the first n elements from the enumeration. Otherwise drops elements while the block returns true (and returns all the elements after it returns a false value):
```ruby
a = [1, 2, 3, 4, 5]
a.drop(3) # => [4, 5]
a.drop {|i| i < 3 } # => [3, 4, 5]
```
*Enumerable#inject (#reduce) without a block*
If no block is given, the first argument to #inject is the name of a two-argument method that will be called; the optional second argument is the initial value:
```ruby
[RUBY_VERSION, RUBY_RELEASE_DATE] # => ["1.9.0", "2007-08-03"]
(1..10).reduce(:+) # => 55
```
*Enumerable#count*
It could be defined in Ruby as
```ruby
def count(*a)
  inject(0) do |c, e|
    if a.size == 1 # suspect, but this is how it works
      (a[0] == e) ? c + 1 : c
    else
      yield(e) ? c + 1 : c
    end
  end
end
```
Therefore
```ruby
["bar", 1, "foo", 2].count(1) # => 1
["bar", 1, "foo", 2].count{|x| x.to_i != 0} # => 2
```
*Array#nitems*
It is equivalent to selecting the elements that satisfy a condition and obtaining the size of the resulting array:
```ruby
%w[1 2 3 4 5 6].nitems{|x| x.to_i > 3} # => 3
```
*Block argument to Array#index, Array#rindex [Ruby2]*
They can now take a block to make them work like #select.
```ruby
['a','b','c'].index{|e| e == 'b'} # => 1
['a','b','c'].index{|e| e == 'c'} # => 2
['a','a','a'].rindex{|e| e == 'a'} # => 2
['a','a','a'].index{|e| e == 'b'} # => nil
```
*Array#combination*
```ruby
ary.combination(n){|c| ...}
```
yields all the combinations of length n of the elements in the array to the given block. If no block is passed, it returns an enumerator instead. The order of the combinations is unspecified.
```ruby
a = [1, 2, 3, 4]
a.combination(1).to_a #=> [[1],[2],[3],[4]]
a.combination(2).to_a #=> [[1,2],[1,3],[1,4],[2,3],[2,4],[3,4]]
a.combination(3).to_a #=> [[1,2,3],[1,2,4],[1,3,4],[2,3,4]]
a.combination(4).to_a #=> [[1,2,3,4]]
a.combination(0).to_a #=> [[]]: one combination of length 0
a.combination(5).to_a #=> [] : no combinations of length 5
```
*Array#permutation*
Operates like #combination, but with permutations of length n.
```ruby
a = [1, 2, 3]
a.permutation(1).to_a #=> [[1],[2],[3]]
a.permutation(2).to_a #=> [[1,2],[1,3],[2,1],[2,3],[3,1],[3,2]]
a.permutation(3).to_a #=> [[1,2,3],[1,3,2],[2,1,3],[2,3,1],[3,1,2],[3,2,1]]
a.permutation(0).to_a #=> [[]]: one permutation of length 0
a.permutation(4).to_a #=> [] : no permutations of length 4
```
*Array#pop, Array#shift*
They can take an argument to specify how many objects to return:
```ruby
%w[a b c d].pop(2) # => ["c", "d"]
```
*Hash preserves order!*
```ruby
RUBY_VERSION # => "1.9.0"
h={:a=>1, :b=>2, :c=>3, :d=>4} # => {:a=>1, :b=>2, :c=>3, :d=>4}
h[:e]=5
h # => {:a=>1, :b=>2, :c=>3, :d=>4, :e=>5}
h.keys # => [:a, :b, :c, :d, :e]
h.values # => [1, 2, 3, 4, 5]
h.to_a # => [[:a, 1], [:b, 2], [:c, 3], [:d, 4], [:e, 5]]
```
vs.
```ruby
RUBY_VERSION # => "1.8.6"
h={:a=>1, :b=>2, :c=>3, :d=>4} # => {:a=>1, :b=>2, :c=>3, :d=>4}
h[:e]=5
h # => {:e=>5, :a=>1, :b=>2, :c=>3, :d=>4}
h.keys # => [:e, :a, :b, :c, :d]
h.values # => [5, 1, 2, 3, 4]
h.to_a # => [[:e, 5], [:a, 1], [:b, 2], [:c, 3], [:d, 4]]
```
*Numeric#upto, #downto, #times, #step*
These methods return an enumerator if no block is given:
```ruby
a = 10.times
a.inject{|s,x| s+x } # => 45
a = []
b = 10.downto(5)
b.each{|x| a << x}
a # => [10, 9, 8, 7, 6, 5]
```
*Range#cover?*
```ruby
range.cover?(value)
```
compares value to the begin and end values of the range, returning true if it is comprised between them, honoring #exclude_end?.
```ruby
("a".."z").cover?("c") # => true
("a".."z").cover?("5") # => false
```
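A quick sketch of how this differs from `include?` on a string range, since the two are easy to confuse. The value `"cc"` is my own example: it sorts between `"a"` and `"z"`, so `cover?` accepts it, but it is not one of the 26 elements the range actually yields.

```ruby
letters = ("a".."z")

# cover? only compares against the endpoints: "a" <= "cc" <= "z".
puts letters.cover?("cc")    # prints true

# include? asks whether the value is one of the elements of the range.
puts letters.include?("cc")  # prints false
```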
*Limit input in IO#gets, IO#readline, IO#readlines, IO#each_line, IO#lines, IO.foreach, IO.readlines, StringIO#gets, StringIO#readline, StringIO#each, StringIO#readlines*
These methods accept an optional integer argument to specify the maximum amount of data to be read. The limit is specified either as the (optional) second argument, or by passing a single integer argument (i.e. the first argument is interpreted as the limit if it’s an integer, as a line separator otherwise).
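To make the two call forms concrete, here is a small sketch using `StringIO` so it runs without a file on disk. The sample text is made up for the example.

```ruby
require "stringio"

io = StringIO.new("first line\nsecond line\n")

# A single integer argument is treated as the limit.
puts io.gets(5).inspect         # prints "first"

# Separator plus limit: stop at the newline or after 3 bytes,
# whichever comes first.
puts io.gets("\n", 3).inspect   # prints " li"

# With no limit, the rest of the current line is returned.
puts io.gets.inspect            # prints "ne\n"
```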
*IO#ungetc, StringIO#ungetc*
Allows pushing back an arbitrarily large character.
*Seven predicate methods were added for the weekdays:*
```ruby
Time.now # => Thu Nov 03 18:58:25 CET 2005
Time.now.sunday? # => false
```
6 Jul
Time saving tips and tricks!
This first tip is something that I use almost daily. Do you ever want to change a filename to something that is similar to the original name? For instance, maybe you just want to change/add/remove the extension? Well, if you are using a reasonable shell you can do the following:
```bash
# Add .txt to the filename
cp textfiel{,.txt}
# change el to le
cp textfi{el,le}.txt
# remove extension
cp textfile{.txt,}
```
Or how about this; fairly often I will be programming and I will be adding a predefined string to the end of another string a bunch of times, except for the last time. The idea is to put the predefined string between some other things. This is pretty regular if you are generating HTML or SQL. Well, instead of doing the following:
```ruby
output = ""
some_array.each_with_index do |item,index|
  output += item
  output += " AND " unless index == some_array.length - 1
end
```
you can do:
```ruby
output = some_array.join(" AND ")
```
Another thing that I find myself doing often is the following:
```ruby
output=""
some_array.each_with_index do |item,index|
  output += "?"
  output += "," unless index==some_array.length-1
end
```
That will generate the question marks for an SQL statement. Again, that’s a little messy and there is a cleaner way to do it.
```ruby
output = some_array.map{"?"}.join(",")
```
Much better! It’s much shorter and should be easier to understand for other Ruby programmers.
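As a rough illustration of where the placeholder trick ends up, here is a sketch that builds an `IN (...)` clause. The `in_clause` helper and the `id` column are hypothetical names for the example, not from any particular database library.

```ruby
# Build "column IN (?,?,...)" with one placeholder per value.
# The values themselves are bound separately by whatever database
# layer executes the statement.
def in_clause(column, values)
  placeholders = values.map { "?" }.join(",")
  "#{column} IN (#{placeholders})"
end

puts in_clause("id", [10, 20, 30])   # prints id IN (?,?,?)
```

Note that an empty array produces `id IN ()`, which most databases reject, so a real caller would want to handle that case before building the query.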
It’s good to put things like this into practice, because it will make your code more readable and easier to maintain. Generally, in my manifesto, fewer lines of code (comments and whitespace don’t count) are better. Of course, in a language like Ruby this can create performance problems; it’s a balance between what works for you as the programmer and what works for the user. If the speed is really an issue, change the code. Otherwise, save your skull!
If you have any tips for regular things like this, let me know. I need to know stuff like this just as much as anyone else.