One of Rails’ fundamental performance weaknesses is related to Ruby’s garbage collector. Some significant performance gains come at a large development cost, and sometimes it’s not worth spending time researching, applying and testing these modifications. The basic principle is: start simple and bear performance in mind from the very beginning. This reduces the time spent on optimizations, which makes your application cheaper to develop.
This is where Ruby’s garbage collector comes in: with its default settings it’s simply not very efficient for Ruby on Rails. You can spend a few minutes fine-tuning its behavior and gain a significant performance boost.
Ruby’s garbage collector is a bit heavy and locks your application for a moment while it runs. It is also non-generational, so every run has to examine all live objects. It’s imperative to make sure that it doesn’t run too often and that each run doesn’t take too long.
There are some configuration parameters that control when the garbage collector is triggered and how it behaves. Ruby Enterprise Edition lets you configure some of these parameters, overriding Ruby’s default settings, which are not optimized for Rails.
GC environment variables
There are a few important environment variables which affect the performance of Ruby’s garbage collector. I’ll review them and discuss a few values. Remember: each application has its own optimal configuration. You should run a few tests and see what suits you best.
- RUBY_HEAP_MIN_SLOTS: The initial number of heap slots. It also represents the minimum number of slots, at all times (default: 10000)
- RUBY_HEAP_SLOTS_INCREMENT: The number of new slots to allocate when all initial slots are used (default: 10000)
- RUBY_HEAP_SLOTS_GROWTH_FACTOR: The next time Ruby needs new heap slots, it multiplies the previous increment by this value (default: 1.8, meaning it will allocate 18000 new slots if the default settings are in use)
- RUBY_GC_MALLOC_LIMIT: The number of C data structures that can be allocated before triggering the garbage collector. This one is very important: because Rails allocates and deallocates a lot of data, the default value makes the GC run while there are still empty heap slots (default: 8000000)
- RUBY_HEAP_FREE_MIN: The number of free slots that should be present after the GC finishes running. If there are fewer free slots than that, Ruby allocates new ones according to the RUBY_HEAP_SLOTS_INCREMENT and RUBY_HEAP_SLOTS_GROWTH_FACTOR parameters (default: 4096)
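To make the interplay between RUBY_HEAP_SLOTS_INCREMENT and RUBY_HEAP_SLOTS_GROWTH_FACTOR more concrete, here is a minimal sketch that simulates the growth described above. It is a simplified model based only on the descriptions in this list, not the interpreter’s actual allocator, and heap_growth is a hypothetical helper name.

```ruby
# Simplified model of heap growth as described in the list above.
# This is NOT the interpreter's allocator; heap_growth is a hypothetical helper.
def heap_growth(min_slots: 10_000, increment: 10_000, growth_factor: 1.8, expansions: 4)
  total = min_slots            # the heap starts at the minimum size
  next_increment = increment   # the first expansion uses the plain increment
  sizes = [total]
  expansions.times do
    total += next_increment
    sizes << total
    # every later expansion multiplies the previous increment by the factor
    next_increment = (next_increment * growth_factor).round
  end
  sizes
end

p heap_growth
# → [10000, 20000, 38000, 70400, 128720]
```

With a growth factor of 1 the increment never changes, so under this model the heap grows linearly instead of exponentially.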
Rails uses a lot of memory and frequently creates and destroys huge objects. It’s simply not typical Ruby code, so the default settings are not appropriate. Again, the optimal values depend on the application itself.
Let’s have a look at 37signals’ settings:
RUBY_HEAP_MIN_SLOTS=600000
RUBY_GC_MALLOC_LIMIT=59000000
RUBY_HEAP_FREE_MIN=100000
We can see that the number of initial (and minimum) slots is 60 times bigger than the default, confirming my previous statement that Rails needs a lot more memory than traditional Ruby applications. The number of C structures allowed before the GC is triggered is also about 7 times bigger. Finally, 37signals’ settings demand that the number of free slots after the GC runs must be 100000, about 24 times more than the default. This last setting is quite normal, since the initial heap is much larger with these settings.
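The ratios quoted above can be checked with a few lines of Ruby. This is only a sketch of the arithmetic: the values are the defaults and the 37signals settings listed earlier, while DEFAULTS, TUNED and ratios are names made up for the example.

```ruby
# Defaults and 37signals' values as quoted in the text above.
DEFAULTS = {
  "RUBY_HEAP_MIN_SLOTS"  => 10_000,
  "RUBY_GC_MALLOC_LIMIT" => 8_000_000,
  "RUBY_HEAP_FREE_MIN"   => 4_096
}
TUNED = {
  "RUBY_HEAP_MIN_SLOTS"  => 600_000,
  "RUBY_GC_MALLOC_LIMIT" => 59_000_000,
  "RUBY_HEAP_FREE_MIN"   => 100_000
}

# Hypothetical helper: how many times bigger each tuned value is than its default.
def ratios(tuned, defaults)
  tuned.each_with_object({}) do |(name, value), out|
    out[name] = (value.to_f / defaults.fetch(name)).round(1)
  end
end

ratios(TUNED, DEFAULTS).each { |name, ratio| puts "#{name}: #{ratio}x the default" }
```

Running the same comparison against your own settings is a quick sanity check before you start testing them.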
Let’s take a look at Twitter’s settings now:
RUBY_HEAP_MIN_SLOTS=500000
RUBY_HEAP_SLOTS_INCREMENT=250000
RUBY_HEAP_SLOTS_GROWTH_FACTOR=1
RUBY_GC_MALLOC_LIMIT=50000000
The number of initial slots is much bigger than the default, just like in our previous example. The number of C structures allowed is also bigger, almost as big as in 37signals’ settings. We also see a couple of important changes: the heap grows with a factor of 1, which means its growth is not exponential like the default. The heap slots increment is also much bigger than the default.
Evan, one of the engineers responsible for this investigation at Twitter, says that these settings, along with a few patches, bring a 20% to 40% speed increase. We’ll get to those patches in the next post.
Testing your application
In order to test your application’s behavior with the changed settings, you can use Sylvain Joyeux’s patch, which lets you know exactly when the GC runs, how long it takes and how much CPU it consumes. You should also run top to keep track of the total memory used by Rails during testing.
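If you can’t apply the patch, a rough alternative is to count GC runs from inside Ruby. Note that this is not the patch mentioned above: the sketch below relies on GC.count and GC::Profiler, which are built into Ruby 1.9 and later and are not available on Ruby 1.8 or Ruby Enterprise Edition. measure_gc is a hypothetical helper name.

```ruby
# Rough GC measurement using Ruby's built-in tools (Ruby 1.9+).
# This is NOT Sylvain Joyeux's patch; measure_gc is a hypothetical helper.
def measure_gc
  GC::Profiler.enable
  GC::Profiler.clear                # drop any previously recorded profile data
  runs_before = GC.count            # GC runs since the process started
  yield
  {
    gc_runs: GC.count - runs_before,        # GC runs triggered by the block
    gc_time: GC::Profiler.total_time        # seconds spent in GC while profiling
  }
ensure
  GC::Profiler.disable
end

# Example workload: allocate many short-lived strings.
result = measure_gc { 200_000.times.map { |i| "object #{i}" } }
puts "GC ran #{result[:gc_runs]} times, taking #{result[:gc_time]} seconds"
```

Wrapping a representative request or test run in such a block gives you numbers to compare before and after changing the settings.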
Change your settings and observe the results. When you find your ideal settings, test them a bit further. Be careful, since some settings can make Rails’ memory usage explode, which you don’t want in most scenarios.
There is an excellent real-world example in Evan’s article, where he also addresses the issue we’ve been discussing. I was shocked to learn that the GC runs every 2 requests with the default settings. Take a look at his example - his modifications have an astonishing impact.
Making this work
To apply these modifications permanently you’ll need to create a small script that launches ruby with the settings you’ve defined.
#!/bin/bash
export RUBY_HEAP_MIN_SLOTS=1250000
export RUBY_HEAP_SLOTS_INCREMENT=100000
export RUBY_HEAP_SLOTS_GROWTH_FACTOR=1
export RUBY_GC_MALLOC_LIMIT=30000000
export RUBY_HEAP_FREE_MIN=12500
exec "/path/to/ruby" "$@"
You should give this script (called ruby-for-rails here) execute permissions, using chmod a+x ruby-for-rails. After doing this, you need to configure your web server to use your script instead of calling ruby directly. On Apache and nginx with Passenger you’ll just have to change their configuration (the PassengerRuby and passenger_ruby options, respectively). For other setups, do a little research on how to change this.
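To check that your application is really being launched through the script, you can print the variables as seen by the running Ruby process, for example from a console or a temporary initializer. This is a minimal sketch: GC_VARS and gc_settings are names made up for the example, and it only shows that the variables reached the process, not that the interpreter honors them.

```ruby
# The five variables discussed in this post.
GC_VARS = %w[
  RUBY_HEAP_MIN_SLOTS
  RUBY_HEAP_SLOTS_INCREMENT
  RUBY_HEAP_SLOTS_GROWTH_FACTOR
  RUBY_GC_MALLOC_LIMIT
  RUBY_HEAP_FREE_MIN
]

# Hypothetical helper: returns [name, value] pairs; value is nil when unset.
def gc_settings(env = ENV)
  GC_VARS.map { |name| [name, env[name]] }
end

gc_settings.each do |name, value|
  puts "#{name}=#{value || '(not set)'}"
end
```

If every variable shows up as "(not set)", the web server is most likely still calling ruby directly.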