fREWdiculous!
26 Aug
At work I tend to play an…Evangelical role? I tend to experiment with various technologies, get sold on them, and then sell them to coworkers. Examples: Apache, DBIx::Class, CGIApp, and lately Catalyst. So I typically find various ways that the new tool helps make my job easier and tell people about that. After they believe me, I then educate them about various nuances and whatnot of the tool. Eventually this will happen with git, when it doesn’t suck so much with Windows.
So recently one of my coworkers asked me why he should use an ORM. I had thought I’d mostly fought that battle, but he wasn’t sold (he is now by far
). Anyway, here is my answer, open to the world.
In general this isn’t a huge benefit. SQL is pretty simple and remembering it’s syntax isn’t so bad. But when you want to do something in like paging in SQL Server is when an ORM really starts to shine. In general the ORM makes tasks that you want to do with SQL all the time nice and simple. For example, since we use Ext at work for most grids, users expect to be able to sort by all columns, have pagination, etc. That’s entirely abstracted away. I rarely think about those pedestrian things now
This is where a good ORM really shines. Instead of trying to remember seemingly transient relationships, like how the Shop table joins with the Orders table, we can document that by writing code using our ORM. After that the relationship is there forever. It’s an entirely new level of code reuse, if you are used to just vanilla SQL, even if you are reusing it with functions.
This is actually a lot more subtle in my mind. When I first started using a ORM for real (DBIx::Class) I kept looking for DBIC ways to do various things. Typically the answer was: “override insert” or “override update.” As a noob this can be pretty intimidating, but it really gives great amounts of flexibility. At some point I’ll do a post on OOP revelations I’ve had (interestingly, mostly I get those from hacking on the code of my ORM of choice
,) but for now I’ll just leave it at that.
What are the reasons that you use an ORM?
update: as Stevan notes, I really shouldn’t say all in the final bullet point above. It’s more subtle than that. When I say OOP I don’t actually mean the classes that the ORM represents inherit from each other. I just meant that if I want to do some extra stuff for one class when I store/retrieve it I can localize those changes.
7 Jul
So this is probably old hat to those people who are already big on architecture or know a lot about design patterns, but I thought it was a pretty clever implementation of data security. Anyway, first I’ll start off with how I actually did it, and then maybe talk about it in the abstract.
So here’s the idea, I have a user, and that user should only be able to view a certain set of messages. The messages are linked to groups which the users are linked to. So users have groups, and then groups have messages. So to display the messages we do something like this:
1 2 | my $to_display = $user->groups->related_resultset('messages'); |
And then you can use that kind of code to limit other things which would more easily cause security issues:
1 2 3 | my $message = $user->groups->related_resultset('messages') ->find($id); |
The fact that DBIC allows you to chain your searches is really what allows this kind of thing to happen. Of course, it could be emulated with most data structure based ORM’s by modifying the data structure that gets passed to the search or find method.
(I am pretty sure that you could do this just as easily with DBIx::Class::Schema::RestrictWithObject, but chaining off of user makes a lot of sense to me, so for now that’s how I’ll pull that off.)
Now before we get into a more general discussion I’d like to point out that because of DBIC’s implementation (and possible emulation of it already previously mentioned) this shouldn’t really be too much of a performance hit. Of course, the more related_resultset based chaining you do the more tables you are joining into the query, and that’s where you will start seeing performance issues.
Ok, so the general approach:
It seems to me that it wouldn’t be too hard to make a Highlander (Singleton) that would basically have methods for all of your ResultSet’s (or tables in SQL-talk.) It would contain any user credentials that are needed to get at any data. The idea would be to have it throw an exception if you were to try to instantiate it without all of the data needed to do your security stuff. Really that’s just good OO; any instantiated object should be complete.
Now I have to point out that this really isn’t a complete solution. My friend Fjord works on Birdstack and they need to support the hiding of specific columns, of specific rows, depending on a number of criteria. It’s possible that he could do this for birdstack, but that would end up making each optional column a join table, which would be slow and cumbersome. I don’t remember how he solved the issue, but I imagine that the best way to pull it off would be with a Highlander class that filtered each Result (row) coming from each ResultSet. I guess it would need to return specialize read-only classes or something.
One way or another, I think that no matter what, this fine grained control of public vs private data is going to be hard to manage and slow in a regular RDBMS. An object database might be able to handle it better, but I haven’t really thought much in that vein yet.
So with that I say to you peace be within your walls and security within your towers and racks!
18 Jun
Recently (6 monthsish ago) I decided on an ORM to use at $work. It was pretty hard to make a decision because I’d never really used an ORM for a significant amount of time. Now that I am pretty confident with my chosen ORM I feel like I can make a more informed comparison.
I’m going to skip over the basics of declaring classes themselves. Often when researching ORM’s this is the main thing that people look at. Unfortunately it’s (in my opinion) not that important. As long as everything you want to do is supported, the base model class should just stay out of your way. Recently there have been complaints about the aesthetic appeal of things like DBIC. I prefer to look at conceptual beauty rather than syntactic.
I’d rather focus on major differences in underlying structure, and more importantly, how searches work. I do a lot more searches than I do anything else, and I’d bet that’s the same for you, after you do more than just the basic structure of your app.
So without further ado, the contenders.
26 releases in 5 years. The most recent is from 2007.
2 authors, 33 credited.
The oldest of the discussed ORMs.
Searches are extremely simple, being limited to == and LIKE queries. Here’s an example:
1 2 3 4 5 | # == @cds = Music::CD->search(title => "Greatest Hits", year => 1990); # LIKE @cds = Music::CD->search_like(title => 'Hits%', artist => 'Various%'); |
You can do more complex things, but it’s really just writing SQL and giving that SQL search a name. A SQL dictionary approach you might say. Also note that the above returns an entire array of results, which is Not Great. You can use an iterator though for performance….but it still pulls it all into memory; it just doesn’t instantiate the objects right away, which is a little better, but still Bad.
A cool feature CDBI has is triggers for the lifecycle of the object. The triggers listed in the docs are:
CDBI also has built in constraints, so you can do validation in your model.
Both of these can be done with regular OO in all of the other ORM’s, but having a predefined naming scheme for things like this helps people to quickly learn what’s going on.
70 releases in 3 years. The most recent is two months ago.
1 author, 11 credited.
Written with speed in mind. Be aware that because of these manual optimizations the code is harder to maintain. I was told this by one of the contributors to the project. But you do get very good speed (supposedly, I haven’t done any tests myself) because of it.
Here’s a basic Rose::DB::Object search. It returns an arrayref, which isn’t optimal, but you can get an iterator just as easily.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | $products = Product::Manager->get_products( query => [ name => { like => '%Hat' }, id => { ge => 7 }, or => [ price => 15.00, price => { lt => 10.00 }, ], ], sort_by => 'name', limit => 10, offset => 50); |
From my perusing of the docs it seems that Rose basically has all of the perl data-structure based searches that one would hope for in an ORM that abstracts away most SQL.
62 releases in almost four years. Most recent being days old.
One “author,” 69 credited.
It is very much made for the convenience of the programmer. The .09 series will be Moose-based. See slides for proof.
Here’s an example of a relatively complex search with DBIx::Class.
1 2 3 4 5 6 7 8 9 10 11 | my $results = $schema->resultset('Artist')->search({ first_name => 'frew', last_name => [ # arrayrefs mean or by default { -like => 'schmi%' }, { -like => 'stat%' }, ] },{ page => 2, rows => 25, order_by => { -desc => [qw/last_name first_name/] } }); |
What I think is really great about DBIx::Class is the fact that you can chain searches. I really dig this feature. It lets me do things like this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 | $rs = $schema->resultset('Artist')->search({ first_name => $self->query->param('name'), 'friend.height' => 6*12+1, },{ join => 'friend' }); # imagine this is in another method (because it is) $rs = $rs->search(undef, { page => $self->query->param('page'), rows => $self->query->param('rows'), }); # imagine this is in another method (because it *also* is) $rs = $rs->search(undef, { order_by => { q{-}.$self->query->param('direction') => $self->query->param('sort') } }); # and maybe some permissions stuff $rs = $rs->search({ current_user => $self->user }); |
And most importantly, this is a single SQL query, not four.
I also need to mention that the DBIx::Class people have really helped me help them, which is not just a good feeling, I use the features I’ve added in production code. I can say for certain that working on their codebase has made me a better OO programmer.
22 releases in just over a year.
One author.
This is the newest of the four ORM’s reviewed here. Cut it some slack if it doesn’t have all the features that the other ones have.
Fey::ORM is made for people who are alright with actually writing SQL. It’s very … OO-y. For example:
1 2 3 4 | $select->select( $message_t, $user_t ) ->from( $message_t, $user_t ) ->where( $message_t->column('message_date'), '>=', DateTime->today()->subtract( days => 7 )->strftime( '%Y-%m-%d' ) ); |
What I think is really cool about Fey::ORM is that it has a standard method for creating relationships other than the usual has_one/has_many/many_to_many. Who wouldn’t want to be able to make relationships based on something other than equality? (Seriously though, I’d be able to use that at $work.)
So that’s what I gathered from a couple of hours of reading docs on CPAN and a few months of DBIC usage. I really like DBIx::Class, but I can see why people would choose some of the other ORMs. It does seem to me that the only reason to use Class::DBI is that you already have a giant codebase written on top of it. But if that’s the case, you could just use DBIx::Class’s compatibility layer…
Anyway, hope this was helpful for someone!
21 May
This is a blogish version of a message I posted to the DBIC Mailing list recently.
First off, this is my table structure:
User has many Roles (Role belongs to User)
Role has many Permissions (Permission belongs to Role)
Permissions has many Screens (Screens has many Permissions)
Screens belongs to Section (Section has many Screens)
So I thought I could do this:
1 2 3 4 5 | my @sections = $user->roles ->related_resultset('permissions') ->related_resultset('screens') ->related_resultset('section') ->all; |
But related_resultset doesn’t work with many_to_many because it’s not a “real” relation (I’d like to hear about why that is at some point.)
The following is close to what I wanted
1 2 3 4 5 6 7 | my @sections = $user->roles ->related_resultset('role_permissions') ->related_resultset('permission') ->related_resultset('permission_screens') ->related_resultset('screen') ->related_resultset('section') ->all; |
But it turns out it returns a section per role, which often means duplicates.
So I figured I could do a distinct, so I finally tried this:
1 2 3 4 5 6 7 | my @sections = $user->roles ->related_resultset('role_permissions') ->related_resultset('permission') ->related_resultset('permission_screens') ->related_resultset('screen') ->related_resultset('section') ->search(undef, { distinct => 1 }); |
And it worked! How cool is that?
I actually later on ended up only getting the screens and then getting the sections based on that, otherwise we got false positives on the sections. Anyway, now we have a nice roles/permission based tree getting built on our app for the navigation.
And this next little trick could be an entire post in itself, but my Draft queue is getting pretty huge, so I’ll just include it here:
1 2 3 4 5 6 7 | $cd_rs->search({ artist_id => { in => $artists_rs->search({ name => { like => '%beat%'}, })->get_column( 'id' )->as_query }, }); |
So basically what this does is a subselect. DBIC is very much strives to be consistent throughout, which brings us the amazing new as_query method. This turns the given resultset into a data structure, which can then be passed to other resultsets searchs to create subselects. The above search will find all of the CD’s by artists with the string ‘beat’ somewhere in their names.
Anyway, hope you enjoyed this post. My brother is getting married in a week and my sister is graduating highschool on Tuesday. I say this because I doubt I will be able to post much next week. So worst case scenario I will post again on the first of June.