Article written

One query show 0

Feb3

At $work we specialize in performance audits on systems of every size, from start-up sites hacked together overnight, to a ginormous applications built by world-recognized brand companies. I’ve seen a lot of interesting (and sometimes very unique) performance issues in every level of the stack: code (front-end and back-end), architecture, databases, (sometimes all of the above), but there is one particular, very “Performance 101″, issue that I see all the time. And frankly – it bugs the hell out of me. In fact, the problem is so common sense that writing about it seems a little embarrassing, yet the frequency of seeing the same issue is proving otherwise.

Consider the page that displays a list of items with some properties. Now consider this code. See anything wrong with it?

# select a list of items
$sth = $dbh->prepare("select id from items where ....");
$sth->execute();
foreach ($hreh = $sth->fetchtow_hashref()) {
  # create an array of item objects with all the properties
  push @items, Module::Item->new($href->{id});
}

foreach (@items) {
  # render item list
  print "ID: ".$_->id."\n";
  print "Properties: ".join(",",@{$_->properties})."\n";
  ....
}

If you didn’t spot the problem right away – I may have to borrow Theo’s stick and come over for a visit.

It’s the classic n+1 problem. The application is hitting a database to get the properties n(number of items) times, plus the hit from the initial query to get a list of items. Why?! There is absolutely no reason to do that. Ever. Yet, I see the same issue over and over again in variety of applications, running on different systems, written in different languages. Granted, one of the biggest offenders are the apps using a framework or an ORM (example above actually assumes Module::Item to know how to retrieve all the object data needed, including going to the database to get it). Maybe the excuse is that the developer doesn’t know what happens under the hood of ORM, although if you don’t know the benefits and limitations of a tool – you shouldn’t be using it. But I’ve also seen an alarming number of applications not using an ORM that have the same issue, and it’s even more obvious since you yourself are defining all the actions.

# select a list of items
$sth = $dbh->prepare("select id from items where ...");
$sth->execute();
foreach ($hreh = $sth->fetchtow_hashref()) {
  # create an array of items with all the properties
  $sth_properties = $dbh->prepare("select * from item_properties
                                   where item_id = ?");
  $sth_properties->execute($href->{id});
  $item->{properties} = $sth_properties->fetchrow_arrayref();
  $item->{id} = $href->{id};
  push @items, $item;
}

foreach (@items) {
  # render item list
  print "ID: ".$_->{id}."\n";
  print "Properties: ".join(",",@{$_->{properties}})."\n";
  ....
}

If nothing else – having a $dbh->prepare("SELECT ....") and $sth->execute inside a loop should raise a whole lot of red flags right away (this approach has more than one issue, but let’s concentrate on the problem at hand). There are different variations of the extent of, or lack of, ORM usage (as example, instead of initial SELECT you may use something like Module::Item->get_all) but it doesn’t change the sequence of logic, and subsequently, the performance.

I hope by this point of my rant everyone realized the “right” way of coding this bit of functionality, but just in case – here’s a code that produces exactly the same result with only 1 database query.

# select a list of items WITH properties
$sth = $dbh->prepare("select p.* from items i, item_properties p
                              where i.id = p.id
                              and i. .....");
$sth->execute();
foreach ($hreh = $sth->fetchtow_hashref()) {
  # create an array of item objects with all the properties
  # without going back to the database
  push @items, bless $href, 'Module::Item';
}

foreach (@items) {
  # render item list
  print "ID: ".$_->id."\n";
  print "Properties: ".join(",",@{$_->properties})."\n";
  ....
}

So pretty please, with sugar on top, don’t create performance problems where there should be none!

subscribe to comments RSS

There are no comments for this post

Please, feel free to post your own comment

* these are required fields