Ruby Summer of Code — Benchmarking CI

Back in April I was anxious to jump into the Google Summer of Code. I was eagerly scanning all Ruby-related proposals and waiting for Google to announce which ones were approved. As a Ruby guy, I was very disappointed by the fact that not a single Ruby-related proposal was accepted.

Despite this initial setback, some awesomely clever people had a great idea: follow in Google’s footsteps, but focus on our growing Ruby world. And so the Ruby Summer of Code program was born.

Quoting from RubySOC’s page:

To continue Google’s great tradition of sponsoring open source development via summer student interns, Ruby companies, organizations and community members are getting together to fund Ruby Summer of Code. The project will work much the same way Google Summer of Code does—mentors and student interns, with mentors voting on which student projects get slots.

If you’ve been following my blog, you know what my MSc was about (I’ll blog about that later) and probably guessed what I wanted to work on during RubySOC: Rails’ performance. I promptly reached out to Yehuda Katz to discuss some ideas. Carl Lerche joined us and soon the IRC channel was flooded with ideas. It would be great to improve form helpers. It would be even better to improve Active Record. Refactoring link helpers and the InstanceTag system would also have a significant impact. Oh, and did I mention parallel partial rendering? That would be insanely cool as well.

Benchmarking CI

All these ideas lost their shininess when another came up. Who are we to define what should be improved? Instead, we should have a way of determining what is fast and what is slow on real world applications.

“Benchmarking CI” was promptly born. I was eager to start, and both Yehuda and Carl liked the idea and helped shape it. The project was outlined, proposed and… accepted. Its summary says it all (for a lengthy version, check this out):

This project consists in building an official full-stack benchmarking suite for Ruby on Rails. Each commit will automatically trigger a process where a remote machine starts a new server, runs the tests and reports back the results. As time goes by, it will be possible to watch the evolution of the framework’s performance and developers will be able to keep track of the impact their changes have. In its most basic form, it will bring a kind of performance-oriented continuous integration for the Ruby on Rails framework.

Rails 3 was nearing a stable release and it would be great to have this kind of CI available as soon as possible.

Starting off

I was targeting Rails 3 and, while it’s good to work on the cutting edge, it can also be a curse: very few applications supported this version yet. Despite this difficulty, I needed something to work with.

At first, I thought about using an artificial application. Yehuda and Jeremy Kemper soon stepped in and helped me realize how nonsensical that was: I was building a reliable platform for real-world applications, so measuring Rails’ performance with an artificial application would defeat that purpose. I needed something real.

If I needed to port something from Rails 2 to Rails 3, it had to be big. Something a lot of people used. One of Rails’ most popular applications. Already guessed? You’re right: Redmine. The time to code had arrived: I was taking a fairly big application and making it compatible with Rails 3.

A few weeks later it was done. Redmine was mostly compatible with Rails 3 RC 1.

Profiling

Profiling time. I needed to be able to measure performance accurately on the most recent versions of Ruby and Rails. For this, three things had to be targeted: Ruby, Rails, and ruby-prof.

Remember railsbench? It was awesome back in Ruby 1.8. I needed something similar for Ruby 1.9. This way, the gcdata patches for YARV were created. Rails was also changed to accommodate these patches. After applying those patches, one could reliably benchmark and profile Rails applications on Ruby 1.9.
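The gcdata patches themselves are not reproduced here. As a rough illustration of the kind of measurement they are meant to make reliable, here is a minimal sketch that uses only stock Ruby (Benchmark.realtime and GC.count). The measure helper and the workload are hypothetical and have nothing to do with the actual patches.

```ruby
require 'benchmark'

# Hypothetical helper: runs the block once and reports wall-clock time
# plus how many garbage collections happened while it ran.
def measure
  GC.start                      # start from a freshly collected heap
  gc_before = GC.count
  seconds = Benchmark.realtime { yield }
  { :seconds => seconds, :gc_runs => GC.count - gc_before }
end

# Made-up workload that allocates a lot of short strings.
stats = measure { 50_000.times.map { |i| "row #{i}" } }
puts "took %.4fs with %d GC run(s)" % [stats[:seconds], stats[:gc_runs]]
```

Numbers from a helper like this vary from run to run, which is exactly why more detailed GC data inside the interpreter is useful.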

Ruby-prof was also enhanced—support for Ruby 1.9 was improved and two additions were made: an awesome HTML hierarchical printer created by Stefan Kaes was added, as well as a YAML-based printer for automated processing.

Ad-hoc test applications and dummy

One of the main purposes of the benchmarking CI was the ability to add and remove test applications on the fly, completely effortlessly. For this to happen, a couple things were needed:

Dummy was born. The whole package includes a Ruby library and 3 Rails generators:

Inspired by faker, dummy is a dummy data generator. Quoting its description:

Dummy can generate a lot of dummy data from company names to postal codes. While it allows you to specifically request a type of information, it can also try to determine what you’re looking for given a couple of parameters.
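To give an idea of what "determining what you're looking for" could mean, here is a minimal sketch that picks a generator based on a field name. This is not dummy's actual API: GENERATORS, guess_value and the sample values are all hypothetical.

```ruby
# Hypothetical generators, each paired with a pattern matched against the field name.
GENERATORS = [
  [/email/,      lambda { "user#{rand(1000)}@example.com" }],
  [/zip|postal/, lambda { format("%05d", rand(100_000)) }],
  [/company/,    lambda { ["Acme", "Initech", "Globex"].sample + " Inc." }]
]

# Returns a value from the first generator whose pattern matches the field
# name, or a generic placeholder when nothing matches.
def guess_value(field_name)
  _, generator = GENERATORS.find { |pattern, _| field_name.to_s =~ pattern }
  generator ? generator.call : "lorem ipsum"
end

puts guess_value(:postal_code)
puts guess_value(:company_name)
puts guess_value(:title)
```

The real library is far more thorough; check its github page for what it actually supports.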

The rest of the package (dummy_*) are Rails generators which use dummy to generate test data, routes and performance tests for Rails applications automatically. Have a look at their github pages for in-depth information on how to use them.

Automation and visualization

Everything was ready—a renowned application running on Rails 3, YARV and Rails’s profiling tools enhanced and communicating, ruby-prof improved with suitable printers and test data/routes/performance tests being automatically generated.

What’s next? Building an application which manages test applications, triggers performance benchmarks, stores and analyzes the results, presents them in a meaningful way and notifies the responsible developer(s) of any performance regressions. All of this by harnessing the previously improved/developed tools.

That was done as well. While mostly complete, it is not ready for prime time just yet, which leads me to the next section.
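The regression check at the heart of such an application can be sketched in a few lines. This is only an illustration, not the project's code: the regressions helper, the 10% threshold and the timings are made up.

```ruby
# Returns the names of benchmarks whose time grew by more than `threshold`
# (a fraction, so 0.10 means 10%) between two runs.
def regressions(baseline, current, threshold = 0.10)
  current.select do |name, seconds|
    previous = baseline[name]
    previous && seconds > previous * (1 + threshold)
  end.keys
end

# Made-up timings in seconds, keyed by a hypothetical benchmark name.
baseline = { "issues#index" => 0.120, "projects#show" => 0.080 }
current  = { "issues#index" => 0.150, "projects#show" => 0.082 }

p regressions(baseline, current)   # ["issues#index"]
```

A real system would compare against several previous runs rather than a single one, since individual timings are noisy.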

What’s missing

For this project to be fully complete, a few things still need to be done:

Other tweaks that could be handy but aren’t critical for a final release:

What happens now

The Ruby Summer of Code was amazing. I was given the opportunity to work on a subject I love alongside very clever folks from Ruby land. I enjoyed every tiny bit of it and only regret having to finish my MSc thesis during the program, which took some precious time I had to find elsewhere.

Sadly, RubySOC is over. My next goal is to bridge the gaps I enumerated above. After that, I’m aiming at tweaking this even further to be suitable for everyone, not just Rails itself, so that you can have a benchmarking CI in your Rails application. I also want you to be able to benchmark it in Rubinius and JRuby, not just MRI and YARV. It’d be a nice Christmas gift, wouldn’t it? Well, I’m hoping to release it sooner than that.

Before ending this long post, I’d like to make some brief acknowledgements. First of all, to RubySOC’s sponsors: you all made this possible. To the entire Engine Yard team behind this, especially Leah Silber, for coordinating all of this in your free time. To the Ruby community for being so helpful and motivating. Last but not least, to my awesome mentor, Yehuda Katz, one of the craziest guys I know (at least through IM), for guiding me and making it fun throughout the whole project.

My “giving back” to the Ruby community does not end here. What can I say? I loved this experience and I want more of it. Keep an eye on my github page for up-to-date news on this subject!

EDIT: As cleverly pointed out by Myron Marston in the comments, “dummy_routes” is not the best name to describe the gem. It was changed to dummy_urls to improve the name’s meaning and generate less confusion about what it does. 

Posted 1 year ago • Comments

Porting an application to Ruby 1.9

We recently ported our application to Ruby 1.9, here at escolinhas. After seeing the benefits of using Ruby 1.9.1 instead of 1.8.7, we just couldn’t resist (check before, after or just read the entire blog post).

We had to sort a few things out in order to make our application 100% compatible with Ruby 1.9. I’ll cover the issues that most people will probably have to face.

Encoding issues in Rails

Ruby 1.9 has a much more powerful encoding engine. Unfortunately, the developer needs to put in some extra effort to cope with it. Check this great blog post for an in-depth analysis. I’ll assume that you are using UTF-8 on your project. If you aren’t, you really should.

First of all, here’s some code that you should place inside your config/initializers:

# coding: UTF-8

# TODO: Most of these issues are not present in Rails 3. Remove this when updating.

# Force mysql rows to be UTF-8 (see rails.lighthouseapp.com/projects/8994/tickets/2476)
require 'mysql'
 
class Mysql::Result
  def encode(value, encoding = "utf-8")
    String === value ? value.force_encoding(encoding) : value
  end
  
  def each_utf8(&block)
    each_orig do |row|
      yield row.map {|col| encode(col) }
    end
  end
  alias each_orig each
  alias each each_utf8
 
  def each_hash_utf8(&block)
    each_hash_orig do |row|
      row.each {|k, v| row[k] = encode(v) }
      yield(row)
    end
  end
  alias each_hash_orig each_hash
  alias each_hash each_hash_utf8
end

# fix template rendering
module ActionView
  # NOTE: The template that this mixin is being included into is frozen
  # so you cannot set or modify any instance variables
  module Renderable #:nodoc:
    extend ActiveSupport::Memoizable


    private

    def compile!(render_symbol, local_assigns)
      locals_code = local_assigns.keys.map { |key| "#{key} = local_assigns[:#{key}];" }.join

      source = <<-end_src
        def #{render_symbol}(local_assigns)
          old_output_buffer = output_buffer;#{locals_code};#{compiled_source}
        ensure
          self.output_buffer = old_output_buffer
        end
      end_src

      # Workaround for erb
      source.force_encoding('utf-8') if '1.9'.respond_to?(:force_encoding)

      begin
        ActionView::Base::CompiledTemplates.module_eval(source, filename, 0)
      rescue Errno::ENOENT => e
        raise e # Missing template file, re-raise for Base to rescue
      rescue Exception => e # errors from template code
        if logger = defined?(ActionController) && Base.logger
          logger.debug "ERROR: compiling #{render_symbol} RAISED #{e}"
          logger.debug "Function body: #{source}"
          logger.debug "Backtrace: #{e.backtrace.join("\n")}"
        end

        raise ActionView::TemplateError.new(self, {}, e)
      end
    end

  end
end

# the previous fix causes issues in uploaded files encoding, fixed here
module ActionController
  class Request
    private

      # Convert nested Hashs to HashWithIndifferentAccess and replace
      # file upload hashs with UploadedFile objects
      def normalize_parameters(value)
        case value
        when Hash
          if value.has_key?(:tempfile)
            upload = value[:tempfile]
            upload.extend(UploadedFile)
            upload.original_path = value[:filename]
            upload.content_type = value[:type]
            upload
          else
            h = {}
            value.each { |k, v| h[k] = normalize_parameters(v) }
            h.with_indifferent_access
          end
        when Array
          value.map { |e| normalize_parameters(e) }
        else
          value.force_encoding(Encoding::UTF_8) if value.respond_to?(:force_encoding)
          value
        end
      end
  end
end

So, what are we doing?

Overriding parts of Rails like this is not a good idea in general. You shouldn’t need it on Rails 2.3.6 (from what I’ve heard) or on Rails 3, but for the present, while most of us are still on Rails 2.3.5, this is the way to go.
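All three patches lean on String#force_encoding, so it is worth seeing what that call does on its own. A minimal sketch with a made-up sample string:

```ruby
# The UTF-8 bytes of "cafe" with an acute accent on the e, tagged as binary
# (ASCII-8BIT). This imitates data handed over by a driver that knows
# nothing about encodings.
raw = "caf\xC3\xA9".dup.force_encoding(Encoding::ASCII_8BIT)

puts raw.length            # 5, because every byte counts as one character

# force_encoding only changes the encoding tag; the bytes stay untouched.
text = raw.dup.force_encoding(Encoding::UTF_8)

puts text.length           # 4, the two-byte sequence is now one character
puts text.valid_encoding?  # true
puts raw.bytes == text.bytes   # true
```

Since no bytes are converted, this is only safe when the data really is UTF-8 already, which is the assumption made throughout this post.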

Setting the proper encoding in Ruby source files

Did you notice the following header on the initializer above?

# coding: UTF-8

I hope you did, since you’ll be seeing it a lot unless your Ruby files contain no non-ASCII characters at all. If they do, you’ll need to specify the file’s encoding. In many countries (like Portugal, Germany or China), the written language uses such characters all the time. To sort this out across our entire project without going mad, I created a simple task which handles this for us:

desc "Manage the encoding header of Ruby files"
task :check_encoding_headers => :environment do
  files = Array.new
  ["*.rb", "*.rake"].each do |extension|
    files.concat(Dir[ File.join(Dir.getwd.split(/\\/), "**", extension) ])
  end

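  files.each do |file|
    content = File.read(file)
    # Skip files that already start with the canonical header
    next if content[0..16] == "# coding: UTF-8\n\n"

    # Strip any other spelling of the header (the asterisks are escaped so
    # that the Emacs-style variant is matched literally)
    ["\n\n", "\n"].each do |file_end|
      content = content.gsub(/(# encoding: UTF-8#{file_end})|(# coding: UTF-8#{file_end})|(# -\*- coding: UTF-8 -\*-#{file_end})/i, "")
    end

    # Write the canonical header followed by the remaining content
    File.open(file, "w") do |new_file|
      new_file.write("# coding: UTF-8\n\n" + content)
    end
  end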
end

We run it once in a while, keeping our files clean and making sure that magical header is there. Not to worry if someone adds the header manually - the task is supposed to handle it, if necessary.
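The core of the task can also be written as a small function that is easy to try out without touching any files. This is a sketch rather than the code we run: with_encoding_header, HEADER and OLD_HEADER are hypothetical names, and unlike the task it only looks for an old header at the very start of the content and strips every newline that follows it.

```ruby
HEADER = "# coding: UTF-8\n\n"

# The three header spellings handled by the task, anchored to the start of
# the content and followed by one or more newlines.
OLD_HEADER = /\A# (?:en)?coding: UTF-8\n+|\A# -\*- coding: UTF-8 -\*-\n+/i

# Returns the content with exactly one canonical header at the top.
def with_encoding_header(content)
  return content if content.start_with?(HEADER)
  HEADER + content.sub(OLD_HEADER, "")
end

puts with_encoding_header("# -*- coding: utf-8 -*-\nputs 'hello'\n")
```

Running the function on its own output returns the same text, so applying it repeatedly is harmless.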

Minor issues

A few other minor issues also came up. I’ll just list some of them:

Final thoughts

As you’ve seen, most of the issues you’ll encounter when porting an application to Ruby 1.9 are encoding-related. We probably wouldn’t have these issues if UTF-8 were the default encoding for Ruby, but for now source files default to US-ASCII and strings coming from many libraries are tagged as ASCII-8BIT.

Luckily, the Rails guys are trying to overcome these issues, along with the Ruby team. It’s nice to see a whole community working together.

Posted 2 years ago • Comments

Ruby Web Servers Benchmark

Feeling like my hardware wasn’t being used to the full extent of its capabilities, I decided to benchmark the most successful web servers from my previous trials in a different environment. This time, however, I’d use more workers, to see if there are significant performance gains and watch memory consumption.

Different versions of Ruby were also taken into account. As you’ve probably guessed, I’m benchmarking the application in Ruby 1.8 and Ruby 1.9.

Setup

The setup was pretty simple. Nginx (0.7.64) was used in all the tests. The remaining components’ versions were:

Nginx was acting as a proxy balancer and main server for Thin/Unicorn and Passenger, respectively. At first, 30 workers were used. In the following round, the number of workers was increased to 60.

The Ruby versions used were 1.8.7 (patchlevel 249) and 1.9.1 (patchlevel 376), compiled from source with the same flags: -O2 -march=nocona -pipe.

Test

An awesome tool called autobench was used for this benchmark. While being a great tool, ab lacks some of the features found in httperf. Autobench, based on httperf, makes it possible to perform more complex benchmarks and obtain valuable results.

This specific test aimed at discovering how many requests per second the web server could handle for each web page. Autobench would calibrate httperf to try to get more and more juice out of the application until it reached a bottleneck. After that it would try to stabilize the number of requests per second at the highest level the system could handle.

Memory consumption of all the involved components, in each test, was also recorded using the information in /proc/{pid}/status.
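Reading that information from Ruby takes only a few lines. The sketch below assumes the VmRSS field (resident memory) is the one of interest, which the text above does not specify; rss_in_mb is a hypothetical helper and the sample text is illustrative, not taken from the benchmark machine.

```ruby
# Extracts the resident set size (VmRSS) in MB from the text of a
# /proc/<pid>/status file; returns nil when the field is absent.
def rss_in_mb(status_text)
  match = status_text.match(/^VmRSS:\s+(\d+)\s+kB/)
  match && match[1].to_i / 1024.0
end

# Illustrative excerpt of a status file.
sample = "Name:\truby\nVmPeak:\t  210000 kB\nVmRSS:\t   98304 kB\n"
puts rss_in_mb(sample)   # 96.0
```

On Linux, the current process can be inspected with rss_in_mb(File.read("/proc/#{Process.pid}/status")).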

For this ride, 3 pages of escolinhas were used: the most visited one, the heaviest one and the lightest one. I’ll let you figure out which one is which.

An important side note is that a request needs to be completed in less than 30 seconds to be considered valid. If the reply only comes 32 seconds after, it is considered a failed request.
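Expressed as code, the rule looks like this. It is a minimal sketch: the classify helper and the reply times are made up, and only the 30-second limit comes from the setup described above.

```ruby
TIMEOUT = 30.0  # seconds

# Splits reply times (in seconds) into valid and failed requests.
def classify(reply_times)
  valid, failed = reply_times.partition { |seconds| seconds < TIMEOUT }
  { :valid => valid.size, :failed => failed.size }
end

# Three replies arrive in time, the 32-second one counts as failed.
p classify([0.4, 12.0, 29.9, 32.0])
```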

Configuration

Each setup had a similar configuration. The important sections were as follows:

When you see [30|60], it is obviously related to the varying number of workers. Nginx had a pretty standard configuration for all the tests.

All tests were run on Gentoo Linux, with a tweaked sysctl to allow a higher throughput.

Results

The results for Ruby 1.8.7 are shown first, followed by those for Ruby 1.9.1. Finally, a brief analysis of memory usage is presented.

Ruby 1.8.7

Starting with Ruby 1.8.7, the results were as follows.

Page 1 - Ruby 1.8

As you’ve probably guessed, this is the heaviest page. Autobench was unable to find a stable point on this page: it’s simply too heavy to be requested continuously. Anyway, all web servers behaved similarly, being able to dispatch 2.5 requests in the first iteration but completely suffocating after that.

Page 2 - Ruby 1.8

This time each setup was able to consistently serve the web page, handling 10~12 requests per second. Although each web server’s performance is quite similar, we can see that Unicorn (with 60 workers) tends to take the lead.

Page 3 - Ruby 1.8

Again, all setups perform similarly. Unicorn (with 60 workers) seems to also take the lead on this one, although by an insignificant margin.

Ruby 1.9.1

After these tests, the configuration was changed to use Ruby 1.9.1. Let’s see how it stacks up.

Page 1 - Ruby 1.9

Wow. I mean - WOW. Switching the Ruby version increased the number of responding cycles to 15~16. The average handled requests per second also had a huge boost. I already knew that Ruby 1.9 is considerably more efficient than Ruby 1.8, but we’re talking about a 15x increase in successful iterations and a 2x increase in requests per second!

Yes, there are exceptions:

Passenger with 60 workers seems to take the lead on this one. Unicorn also behaved quite well, being stable all along.

Page 2 - Ruby 1.9

The results were pretty similar to our previous tests with Ruby 1.8.7, probably because the page is quite light. Ruby code is not a bottleneck here, as we can clearly see. Unicorn (both 30 and 60 workers) seems to be on top in most iterations.

Page 3 - Ruby 1.9

The results were, again, very similar. The reason is probably the same I’ve stated above. Passenger acted weirdly throughout the benchmark. With 30 workers, things went normally until the 10th iteration, where it started failing and acting weird. With 60 workers, it acted weirdly all along.

Since these strange requests took a surreal 0.1 seconds to complete, I’m disqualifying Passenger here as something clearly went wrong. I’ve repeated these tests but the same results came out. I have not tried to find the true cause of this since, as we’ve seen, it won’t make much difference.

Memory

Here is the memory consumption in MB.

Autobench memory usage

A few details regarding memory usage:

Conclusions

After an exhaustive analysis of web server performance, scalability and memory usage, I can only state one fact:

The differences are very small, probably not noticeable and not really important to most people. One exception: Ruby 1.9. Start upgrading your applications, folks!

If you disagree with me, have another look at both charts regarding page 1 (with the different Ruby versions). Yes, real stuff there.

Diving into more detail, we can see that Unicorn with 60 workers generally yields better performance and scalability, although using a bit more memory than Thin.

We can also verify that the difference between 30 and 60 workers is completely insignificant - the database is the major bottleneck here. Maybe with efficient caching solutions (I’m looking at you, memcached!) the results could be a bit different. The native caching mechanisms of MySQL don’t seem to be highly effective.

You can still compare 30/60 workers with only 4 workers, by having a look at my previous benchmarks:

Interesting, huh? Happy Easter!

Posted 2 years ago • Comments

Passenger Benchmark (on Apache/Nginx)

For the last few days I’ve been putting Thin and Unicorn behind Apache, Nginx and Cherokee acting as proxies. The day has finally come for the almighty Passenger to join the dance with the rest of its partners.

This benchmark is focused on Passenger’s performance on Apache and Nginx and, of course, an analysis is made on the achieved results.

Setup and test

The setup is pretty similar to the previous benchmarks. Apache, Nginx and Passenger versions are as follows:

The Ruby interpreter used was MRI, version 1.8.7.2010.01. This is done in order to make these benchmarks fair, since MRI was also used in Thin’s/Unicorn’s benchmarks and in the future all these benchmarks will come together.

As for the test itself, the environment used was the same as in the previous benchmarks - 3 pages (one light, two heavy) and 5 request/concurrency variants (ranging from 50/1 to 2500/500). Apache ab was used to perform the benchmark.

Configuration

Following the same line of fairness mentioned before, only 4 instances of Rails were used, although I noticed that Apache spawned a lot of Rails processes despite being instructed otherwise. The important sections of each web server configuration are as follows:

Some of you may notice the StartServers 20 line, in contrast to Nginx’s worker_processes 1;. I actually needed to give Apache a more generous configuration than Nginx to achieve a similar degree of scalability. On an equal footing, Apache wasn’t able to complete most of the tests without errors or request time-outs.

Results

Passenger benchmark results

Conclusions

When considering the tests that both web servers were able to successfully complete, Apache won by a slight margin of ~2%, while using 33% more memory.

Apache was, though, unable to cope with some of the tests. The 500/50 benchmark on the heavy page produced a few request time-outs for this web server, meaning it failed the test.

The low memory usage and scalability of Nginx are strong arguments when choosing the web server to use with Passenger. In the scope of my research and this blog, it’s an obvious choice.

Posted 2 years ago • Comments

Thin vs. Unicorn Performance Benchmark

Thin, as I’ve mentioned in previous posts, is a very fast Ruby web server. Unicorn came a bit after and had some buzz associated with its arrival. Recently Twitter adopted it, so it’s a worthy opponent to the already established Thin web server.

Setup, test and configuration

Thin’s version and configuration were already stated in my previous 2 posts. As for Unicorn, you can find its configuration here. As you can see, it is pretty similar to Thin’s. This is very important to make this benchmark reasonably reliable.

Once again, the proxy server used was Nginx, with the previously shown configurations. The test pages were also the same, just like Nginx’s version. The Unicorn version used was 0.97.0.

Results

Thin vs. Unicorn results

Conclusions

Unicorn’s performance is very similar to Thin’s. Thin was ~2% faster (in total) while using less memory (around 38MB of difference) so it’s still a bit better than Unicorn for the considered platform. These 2%/38MB should not, however, be a performance bottleneck for any system. There are many more important things to optimize before coming to these little details.

Posted 2 years ago • Comments