Lazy Routing deserialization

Hello symfony community. This is my first post on this blog, so I will briefly introduce myself. My name is Olivier Poitrey, and I am the co-founder and CTO of Dailymotion. As Fabien mentioned in a previous post on this blog, we've recently switched part of our in-house framework to symfony 2.0. I'm proud to contribute to this excellent framework, and I hope my company and I will be able to help the project as much as it has helped us.

In the context of this economic crisis, you may not feel like buying new hardware, and would rather pay attention to what is slowing down your existing servers. At least, that's the way we fix things here at Dailymotion, and I guess we're not alone in spending days and sometimes nights playing with xdebug/kcachegrind or the excellent XHProf PECL extension recently released by Facebook.

During one of these optimization sessions, I discovered that in addition to our own mess, every page handled by symfony's routing was spending a constant and disproportionate amount of CPU time on initialization. Since the new, greatly appreciated routing system was introduced in symfony 1.2, every route has been an object. Once the routing configuration is parsed and the regexps are compiled, the whole routing map is serialized in a cache entry for subsequent requests.

In these subsequent requests, the cached routing map is deserialized, and all sfRoute objects are deserialized at the same time. Unfortunately, if the route matching the current request is among the first ones, most of those deserialized objects won't be used at all. With a routing configuration file containing hundreds of routes, it is a waste of resources that is hard to accept.

The obvious solution would be to deserialize sfRoute objects only as they are needed, and that's exactly what we have done to fix this performance issue. Instead of serializing the route-map with sfRoute objects in a single pass, they are serialized twice: first individually, and then together with the route map. This way, when the route map is deserialized, the sfRoute objects are kept serialized. They are then deserialized on-the-fly as required.
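
The two-pass serialization described above can be sketched in a few lines of plain PHP. This is only an illustration of the idea, not the actual sfPatternRouting code: the DemoRoute and LazyRouteMap classes and the countLoaded() helper are hypothetical names made up for this example.

```php
<?php
// Hypothetical stand-in for sfRoute; not symfony's actual class.
class DemoRoute
{
  public $pattern;

  public function __construct($pattern)
  {
    $this->pattern = $pattern;
  }
}

// Hypothetical route map: each route is kept as a serialized string
// and is only unserialized the first time it is requested.
class LazyRouteMap
{
  private $routes = array();

  public function add($name, DemoRoute $route)
  {
    // First pass: serialize each route individually.
    $this->routes[$name] = serialize($route);
  }

  public function get($name)
  {
    // Deserialize on the fly, and keep the object for later calls.
    if (is_string($this->routes[$name]))
    {
      $this->routes[$name] = unserialize($this->routes[$name]);
    }

    return $this->routes[$name];
  }

  // Number of routes that have actually been turned back into objects.
  public function countLoaded()
  {
    $loaded = 0;
    foreach ($this->routes as $route)
    {
      if (is_object($route))
      {
        $loaded++;
      }
    }

    return $loaded;
  }
}

$map = new LazyRouteMap();
for ($i = 1; $i <= 200; $i++)
{
  $map->add('route' . $i, new DemoRoute('/route' . $i));
}

// Second pass: serialize the whole map, as a cache entry would store it.
$cached = serialize($map);

// On a later request, the routes inside the map are still plain strings.
$restored = unserialize($cached);
$route = $restored->get('route1');

echo $route->pattern, "\n";          // /route1
echo $restored->countLoaded(), "\n"; // 1
```

Out of the 200 routes in the restored map, only the one that was requested has been turned back into an object.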

The Benchmarks

To emphasize the improvement, I wrote a script that generates 200 very simple routes and measures the time spent parsing a URL that matches the first, the middle, and the last route.

require dirname(__FILE__) . '/lib/autoload/sfCoreAutoload.class.php';
sfCoreAutoload::register();
 
function generateRoutes(sfEvent $event)
{
  $routing = $event->getSubject();
  for ($i = 1; $i <= 200; $i++)
  {
    $routing->connect('route' . $i, new sfRoute('/route' . $i));
  }
}
 
$dispatcher = new sfEventDispatcher();
$dispatcher->connect('routing.load_configuration', 'generateRoutes');
 
$cache = new sfFileCache(array('cache_dir' => '/tmp/routing'));
 
foreach (array(false, true) as $lazy)
{
  echo $lazy ? "Lazy mode:\n" : "Normal mode:\n";
 
  $options = array('lazy_routes_deserialize' => $lazy);
 
  $cache->clean();
  // Instantiate the routing a first time to cache the compiled routes
  $routing = new sfPatternRouting($dispatcher, $cache, $options);
  // ... and a second time to feed buffer caches
  $routing = new sfPatternRouting($dispatcher, $cache, $options);
 
  foreach (array(1, 100, 200) as $routeNum)
  {
    printf('Route #%\'.-3d: ', $routeNum);
 
    $t = microtime(true);
    $routing = new sfPatternRouting($dispatcher, $cache, $options);
    $routing->parse('/route' . $routeNum);
    $routing->shutdown();
    printf('uncached match: %5.2fms, ', (microtime(true) - $t) * 1000);
 
    $t = microtime(true);
    $routing = new sfPatternRouting($dispatcher, $cache, $options);
    $routing->parse('/route' . $routeNum);
    $routing->shutdown();
    printf('cached match: %5.2fms', (microtime(true) - $t) * 1000);
 
    echo "\n";
  }
}

Results:

Normal mode:
Route #1..: uncached match: 11.00ms, cached match: 10.31ms
Route #100: uncached match: 16.67ms, cached match: 11.74ms
Route #200: uncached match: 14.92ms, cached match: 10.37ms

Lazy mode:
Route #1..: uncached match:  2.63ms, cached match:  1.90ms
Route #100: uncached match: 12.09ms, cached match:  1.44ms
Route #200: uncached match: 18.71ms, cached match:  1.48ms

As you can see, matching the first, the middle, or the last route without lazy deserialization doesn't make much of a difference. As the most expensive task is to deserialize every sfRoute object unconditionally, the cost of matching the route itself isn't noticeable. Even worse, for the same reason, the lookup cache doesn't save us much time.

With lazy deserialization, it's far better when the very first route matches, but the farther down the matched route is, the worse the performance gets compared to the normal mode. This is because calling unserialize() many times with a short string is slower than calling it once with a large one. So, it is important to put the most used routes at the top of the routing configuration file, as their position now has an even greater impact.
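
If you want to check the claim about unserialize() on your own setup, here is a rough micro-benchmark sketch. It uses plain arrays instead of sfRoute objects, the array keys and the iteration count are arbitrary choices for this example, and the timings will vary with your PHP version and hardware, so no reference numbers are given here.

```php
<?php
// Build 200 small array structures (hypothetical stand-ins for routes).
$items = array();
for ($i = 1; $i <= 200; $i++)
{
  $items['route' . $i] = array(
    'pattern'      => '/route' . $i,
    'defaults'     => array(),
    'requirements' => array(),
  );
}

// One large string versus 200 short ones.
$whole  = serialize($items);
$pieces = array_map('serialize', $items);

$iterations = 1000;

// A single unserialize() call on the large string.
$t = microtime(true);
for ($n = 0; $n < $iterations; $n++)
{
  $all = unserialize($whole);
}
$singleCall = (microtime(true) - $t) * 1000;

// 200 unserialize() calls on the short strings.
$t = microtime(true);
for ($n = 0; $n < $iterations; $n++)
{
  $all = array();
  foreach ($pieces as $name => $piece)
  {
    $all[$name] = unserialize($piece);
  }
}
$manyCalls = (microtime(true) - $t) * 1000;

printf("one large unserialize():  %8.2fms\n", $singleCall);
printf("200 small unserialize():  %8.2fms\n", $manyCalls);
```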

But as you may have noticed, the main improvement in lazy mode concerns the lookup cache. Unlike before, cached lookups bring a real benefit, even for the very last route. This is because only one sfRoute is now deserialized on a cache hit! And chances are that most of your requests will end up in this lookup cache.
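
To make the cache-hit case concrete, here is a simplified sketch that counts unserialize() calls. It does not reproduce the real sfPatternRouting lookup cache: the findRoute() function, the shape of $lookupCache, and the exact string comparison used for matching are all assumptions made for this example.

```php
<?php
// Serialized routes keyed by name (stand-in for the cached route map).
$serializedRoutes = array();
for ($i = 1; $i <= 200; $i++)
{
  $serializedRoutes['route' . $i] = serialize(array('pattern' => '/route' . $i));
}

// Hypothetical lookup cache: URL => name of the route that matched before.
$lookupCache = array('/route200' => 'route200');

$unserializeCalls = 0;

function findRoute($url, array &$routes, array &$lookupCache, &$unserializeCalls)
{
  if (isset($lookupCache[$url]))
  {
    // Cache hit: only the matching route is deserialized.
    $unserializeCalls++;

    return unserialize($routes[$lookupCache[$url]]);
  }

  // Cache miss: deserialize the routes in order until one matches.
  foreach ($routes as $name => $serialized)
  {
    $unserializeCalls++;
    $route = unserialize($serialized);
    if ($route['pattern'] === $url)
    {
      $lookupCache[$url] = $name;

      return $route;
    }
  }

  return null;
}

// Hit on the very last route: a single unserialize() call.
findRoute('/route200', $serializedRoutes, $lookupCache, $unserializeCalls);
echo $unserializeCalls, "\n"; // 1

// Miss on route #150: 150 more calls before the match is found.
findRoute('/route150', $serializedRoutes, $lookupCache, $unserializeCalls);
echo $unserializeCalls, "\n"; // 151
```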

Configuration

The new option is off by default. If you want to give it a try with your application, edit your factories.yml and add lazy_routes_deserialize: true to the routing factory:

all:
  routing:
    class: sfPatternRouting
    param:
      generate_shortest_url:            true
      extra_parameters_as_query_string: true
      lazy_routes_deserialize:          true

Conclusion

This new lazy deserialization option can save you a lot of CPU power if your routing configuration is big. As we have seen, it can help or harm performance depending on the circumstances, so make sure you put your most used routes at the top of the configuration whenever possible before activating it. Unfortunately, that can be difficult for some setups, for instance when the most used route is a catch-all and thus can't be moved to the top. To fix this problem, I have another improvement awaiting inclusion in the symfony 1.3 branch, which I will probably talk about in a future post on this blog.

Stay tuned and let us know if this new option saved you some CPUs, because it sure did for us!

Comments

I have a project with a great many routes that is shortly going to be upgraded from its current 1.0 state to 1.2. Once that is done, I can test this out.
Excellent first post!
Very nice work! This kind of technical improvement is very much appreciated :)
What a great couple of days for symfony performance! Thank you for your contribution to the framework, and welcome!
Great find! Welcome to the team Olivier.
Thanks for sharing this. I'll check this out asap.
I guess one of the problems with using objects here is that you cannot benefit from ultra-fast deserialization out of APC's user data store. That is always something to consider when moving data structures from arrays to objects.

Speaking of APC's user cache: I always wonder why so few people have tried to load such static config data into Apache itself. I am sure someone could create an Apache module that would load such configs at Apache startup and store everything in ZVALs that could then be loaded into PHP with zero effort. I guess the issue is that this data would need to be read-only, and the only thing we have for that at the moment is constants, which do not support arrays. There was once a discussion about making it possible to mark certain variables read-only (like a db query result).

At any rate, isn't the better "caching" approach to generate the code into a file as plain PHP, so that it can be cached by a bytecode cache? Though here again objects are less convenient than arrays imho.

So I guess the point through most of my comment is use arrays for data structures, which are then passed through a single object instance as needed.
is this improvement available at symfony 1.2 trunk?
@malas
it seems so http://trac.symfony-project.org/changeset/16949
Hello, Olivier
I made a simple and very handy tool for myself while I was playing with speeding up symfony/CI. Here it is - http://code.google.com/p/xdebugtoolkit/
It always generates trees instead of messy cyclic call graphs, so I find it more informative.

It would be great if you could check it out and leave some feedback, or perhaps it will be useful to you as is.
Lukas: Generating plain PHP code files was my initial approach, and it's true: with an opcode cache it's faster than this final solution. Unfortunately, with such an approach you can't use the provided sfCache, and you need to add some option parameters to configure where to cache those generated files. That's why we finally opted for this less performant but more generic solution.
Alexey Kupershtokh: Nice tool! I'll definitely test it, thanks.
