Database Interaction Optimization

Last time I covered a basic technique - eager loading - that when used correctly can speed your database access a lot. Now I’ll cover some smaller but also important details relative to database access and interaction in Ruby on Rails that will also help accelerating your database operations.

In some cases you might have to consider sacrificing you code legibility a bit in order to speed things up. ActiveRecord provides us developers a lot of high-level dynamic methods to interact with the database. Unfortunately some of these are not very efficient.

I’ll use the previous post’s example code for experimental purposes. It consists on a simple blog with posts and comments, for those of you who didn’t read it.

Use Transactions

Transactions are good for your application and they’re fast. Every time you create or update a record ActiveRecord wraps that operation in a transaction. This means that many sequential creations will be getting sent in individual transactions, slowing things down. Let’s test this method:

def index
  (1..100).each do |i|
    post = Post.create(:content => "Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum."+i.to_s)

    (1..20).each do |j|
      Comment.create(:content => "Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s"+i.to_s+j.to_s, :post => post)
    end
  end
end

We’ll measure the time it takes to process using Rails standard development information.

Completed in 76017ms (View: 16, DB: 65555)

Quite slow. 76 seconds to process the request from which 66 were spent in database operations - this is what we’re interested in. Let’s try the same operation but wrapping everything up in a transaction, like this:

ActiveRecord::Base.transaction do
  def index
    (1..100).each do |i|
      post = Post.create(:content => "Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum."+i.to_s)

      (1..20).each do |j|
        Comment.create(:content => "Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s"+i.to_s+j.to_s, :post => post)
      end
    end
  end
end

And test it again. The output is:

Completed in 5223ms (View: 48, DB: 476) | 200 OK [http://localhost/posts]

We’re down to 5 seconds of which only 0.5 were spent in database operations. Awesome decrease taking 0.007% of the time. It’s a huge improvement but it’s associated with this specific case, inserting 2000 records at the same time. In real life applications we usually don’t have that many insertions or updates sequentially, so the benefit isn’t that big. Let’s try for a much smaller set - 1 post and 2 comments. Without wrapping it up on a transaction, we have:

Completed in 205ms (View: 3, DB: 176) | 200 OK [http://localhost/posts]

Wrapping all 3 operations in a transaction and trying again, we have:

Completed in 173ms (View: 29, DB: 137) | 200 OK [http://localhost/posts]

Less 39 milliseconds spent on database operation, taking only 78% of the time.

Transactions can be an excellent performance catalyst in some particular situations, namely when you insert and/or update more than one record sequentially. This is obviously related to the way ActiveRecord behaves on database operations, creating a transaction for each one of them. This can be a waste of performance when you repeatedly perform these operations - wrap yourself the operations inside a transaction so that ActiveRecord goes easier on the database.

Avoid find_by_name

Even though find_by_name is awesome and improves code legibility a great deal, it’s actually a bit slow. Picking up our blog example in the last post, let’s run some tests and analyse their results.

# find_by_created_at
Comment.find_by_created_at("2009-11-06 18:25:48")
Completed in 197ms (View: 1, DB: 19) | 200 OK [http://localhost/posts]

# normal find with conditions
Comment.find(:first, :conditions => ["created_at = ?", "2009-11-06 18:25:48"])
Completed in 62ms (View: 1, DB: 7) | 200 OK [http://localhost/posts]

# find using sql
Completed in 30ms (View: 14, DB: 6) | 200 OK [http://localhost/posts]
Comment.find_by_sql("SELECT * FROM comments WHERE (created_at = '2009-11-06 18:25:48') LIMIT 1")

Since we’re talking about really small values, I had to make a few tests for each clause to find a stable result. There they are and we can extract some conclusions out of it:

Rule of thumb: In conditions, when you’re using integers, don’t wrap them around “” or ”. It’s heavier than using the integer as is since Rails has to deal with type detection.

Hacking your database configuration

If you’re using an InnoDB-based database, like mysql, you might want to hack into it’s configuration. There are 3 main key points:

By default these values are a bit lower than you’d want on a high-demand production box. While your database gets faster as innodb_buffer_pool_size grows (it’s no use making it larger than the database itself), there are some issues with sort_buffer_size.

The best values for these variables depend on the box and the characteristics of your rails project. You should test and benchmark your queries to find what values suit your server/application best. Do not extract conclusions locally, this kind of test should be ran on the destination server using the application itself.

Posted 2 years ago • Comments
blog comments powered by Disqus
Page 1 of 1