Hello symfony community. This is my first post on this blog, so let me briefly introduce myself. My name is Olivier Poitrey, and I am the co-founder and CTO of Dailymotion. As Fabien mentioned in a previous post on this blog, we've recently switched part of our in-house framework to symfony. I'm proud to contribute to this excellent framework, and I hope my company and I will be able to help the project as much as it has helped us.
In the context of this economic crisis, you may not feel like buying new hardware, but instead pay attention to what is slowing down your existing servers. At least, that's the way we fix things here at Dailymotion, and I guess we're not alone in spending days and sometimes nights playing with xdebug/kcachegrind or the excellent XHProf PECL extension recently released by Facebook.
During one of these optimization sessions, I discovered that, in addition to our own mess, every page handled by symfony's routing was spending a constant and disproportionate amount of CPU time on initialization. Since the greatly appreciated new routing system introduced in symfony 1.2, every route is an object. Once the routing configuration is parsed and the regexps are compiled, the whole routing map is serialized into a cache entry for subsequent requests.
On these subsequent requests, the cached routing map is deserialized, and all sfRoute objects are deserialized along with it. Unfortunately, if the route matching the current request is among the first ones, most of those deserialized objects won't be used at all. With a routing configuration file containing hundreds of routes, this is a waste of resources that is hard to accept.
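To make the problem concrete, here is a minimal sketch of that eager behaviour. It is not the actual sfPatternRouting code; the DummyRoute class and the in-memory "cache entry" are stand-ins for sfRoute and the real cache backend:

// Minimal sketch of the eager behaviour, not the actual sfPatternRouting
// code: the whole route map is stored as one serialized string, so every
// request rebuilds every route object, even when only the first one is needed.
class DummyRoute
{
  public $pattern;

  public function __construct($pattern)
  {
    $this->pattern = $pattern;
  }

  public function matches($url)
  {
    return $url === $this->pattern;
  }
}

// Warm the cache: serialize the full map in a single pass
$routes = array();
for ($i = 1; $i <= 200; $i++)
{
  $routes['route' . $i] = new DummyRoute('/route' . $i);
}
$cacheEntry = serialize($routes);

// On a subsequent request: all 200 objects are rebuilt before matching starts
$routes = unserialize($cacheEntry);
foreach ($routes as $name => $route)
{
  if ($route->matches('/route1'))
  {
    echo "matched $name\n"; // 199 deserialized routes were never used
    break;
  }
}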
The obvious solution is to deserialize sfRoute objects only as they are needed, and that's exactly what we have done to fix this performance issue. Instead of serializing the route map with its sfRoute objects in a single pass, we serialize twice: each route individually first, and then the route map itself. This way, when the route map is deserialized, the sfRoute objects are kept in their serialized form and are only deserialized on the fly, as required.
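Continuing the sketch from above (again, a simplified illustration rather than the real symfony code), the lazy variant serializes each route individually before serializing the map, and only unserializes a route when the matching loop actually reaches it:

// Simplified illustration of the lazy approach, not the real symfony code:
// each route is serialized individually, then the map of serialized strings
// is serialized as a whole.
class DummyRoute
{
  public $pattern;

  public function __construct($pattern)
  {
    $this->pattern = $pattern;
  }

  public function matches($url)
  {
    return $url === $this->pattern;
  }
}

// Warm the cache: two serialization passes
$serializedRoutes = array();
for ($i = 1; $i <= 200; $i++)
{
  $serializedRoutes['route' . $i] = serialize(new DummyRoute('/route' . $i));
}
$cacheEntry = serialize($serializedRoutes);

// On a subsequent request: deserializing the map is cheap, it only holds strings
$serializedRoutes = unserialize($cacheEntry);
foreach ($serializedRoutes as $name => $serializedRoute)
{
  $route = unserialize($serializedRoute); // one object at a time, on demand
  if ($route->matches('/route1'))
  {
    echo "matched $name\n"; // only a single route was deserialized
    break;
  }
}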
The Benchmarks
To emphasize the improvement, I wrote a script that generates 200 very simple routes and measures the time spent parsing a URL matching the first, the middle, and the last route.
require dirname(__FILE__) . '/lib/autoload/sfCoreAutoload.class.php';
sfCoreAutoload::register();

function generateRoutes(sfEvent $event)
{
  $routing = $event->getSubject();
  for ($i = 1; $i <= 200; $i++)
  {
    $routing->connect('route' . $i, new sfRoute('/route' . $i));
  }
}

$dispatcher = new sfEventDispatcher();
$dispatcher->connect('routing.load_configuration', 'generateRoutes');

$cache = new sfFileCache(array('cache_dir' => '/tmp/routing'));

foreach (array(false, true) as $lazy)
{
  echo $lazy ? "Lazy mode:\n" : "Normal mode:\n";

  $options = array('lazy_routes_deserialize' => $lazy);

  $cache->clean();

  // Instantiate the routing a first time to cache the compiled routes
  $routing = new sfPatternRouting($dispatcher, $cache, $options);

  // ... and a second time to feed buffer caches
  $routing = new sfPatternRouting($dispatcher, $cache, $options);

  foreach (array(1, 100, 200) as $routeNum)
  {
    printf('Route #%\'.-3d: ', $routeNum);

    $t = microtime(true);
    $routing = new sfPatternRouting($dispatcher, $cache, $options);
    $routing->parse('/route' . $routeNum);
    $routing->shutdown();
    printf('uncached match: %5.2fms, ', (microtime(true) - $t) * 1000);

    $t = microtime(true);
    $routing = new sfPatternRouting($dispatcher, $cache, $options);
    $routing->parse('/route' . $routeNum);
    $routing->shutdown();
    printf('cached match: %5.2fms', (microtime(true) - $t) * 1000);

    echo "\n";
  }
}
Results:
Normal mode:
Route #1..: uncached match: 11.00ms, cached match: 10.31ms
Route #100: uncached match: 16.67ms, cached match: 11.74ms
Route #200: uncached match: 14.92ms, cached match: 10.37ms
Lazy mode:
Route #1..: uncached match: 2.63ms, cached match: 1.90ms
Route #100: uncached match: 12.09ms, cached match: 1.44ms
Route #200: uncached match: 18.71ms, cached match: 1.48ms
As you can see, without lazy deserialization, matching the first, the middle, or the last route doesn't make much of a difference. Since the most expensive task is deserializing every sfRoute object unconditionally, the cost of matching the route itself isn't noticeable. Even worse, for the same reason, the lookup cache doesn't save us much time.
With lazy deserialization, matching the very first route is far faster, but the farther down the matched route is, the worse the performance gets compared to the normal mode. This is because calling unserialize() many times with short strings is slower than calling it once with a large one. So it is important to put the most used routes at the top of the routing configuration file, as their position now has an even greater impact.
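The unserialize() overhead is easy to reproduce outside of symfony. Here is a rough micro-benchmark sketch; the absolute numbers depend on your machine and PHP version, but the trend should match the results above:

// Rough micro-benchmark: one unserialize() call on a big payload vs. many
// unserialize() calls on small payloads carrying the same data.
$objects = array();
for ($i = 1; $i <= 200; $i++)
{
  $objects[$i] = new stdClass();
  $objects[$i]->pattern = '/route' . $i;
}

$big   = serialize($objects);              // one large string
$small = array_map('serialize', $objects); // 200 small strings

$t = microtime(true);
for ($run = 0; $run < 1000; $run++)
{
  unserialize($big);
}
printf("one big unserialize():    %6.2fms\n", (microtime(true) - $t) * 1000);

$t = microtime(true);
for ($run = 0; $run < 1000; $run++)
{
  foreach ($small as $s)
  {
    unserialize($s);
  }
}
printf("many small unserialize(): %6.2fms\n", (microtime(true) - $t) * 1000);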
But as you may have noticed, the main improvement of the lazy mode concerns the lookup cache. Unlike before, lookups now have a real benefit, even on the very last route. This is because only one sfRoute object is now deserialized on cache hits! And chances are that most of your requests will end up in this lookup cache.
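To illustrate why a lookup-cache hit now costs only one unserialize(), here is a schematic sketch. The matchUrl() function and the way the cache is keyed are made up for the example; they are not the real sfPatternRouting internals:

// Schematic sketch of the lookup-cache effect, not the real sfPatternRouting
// internals: once a URL has been matched, the name of the winning route is
// remembered, so the next request for that URL rebuilds a single route only.
$lookupCache      = array(); // url => name of the matching route
$serializedRoutes = array(   // as produced by the lazy mode above
  'homepage' => serialize((object) array('pattern' => '/')),
  'article'  => serialize((object) array('pattern' => '/article')),
);

function matchUrl($url, array $serializedRoutes, array &$lookupCache)
{
  if (isset($lookupCache[$url]))
  {
    // Cache hit: exactly one unserialize(), whatever the route position
    return unserialize($serializedRoutes[$lookupCache[$url]]);
  }

  // Cache miss: walk the map, deserializing on the fly as before
  foreach ($serializedRoutes as $name => $serializedRoute)
  {
    $route = unserialize($serializedRoute);
    if ($route->pattern === $url) // schematic match check
    {
      $lookupCache[$url] = $name;
      return $route;
    }
  }

  return false;
}

matchUrl('/article', $serializedRoutes, $lookupCache); // miss: walks the map
matchUrl('/article', $serializedRoutes, $lookupCache); // hit: one unserialize() only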
Configuration
The new option is off
by default. If you want to give it a try with your
application, edit your factories.yml
and add lazy_routes_deserialize: true
to the routing factory:
all:
  routing:
    class: sfPatternRouting
    param:
      generate_shortest_url:            true
      extra_parameters_as_query_string: true
      lazy_routes_deserialize:          true
Conclusion
This new lazy deserialization option can save you a lot of CPU power if your routing configuration is big. As we have seen, it can help or harm performance depending on the circumstances, so make sure you always put your most used routes at the top of the configuration before activating it, whenever possible. Unfortunately, that can be difficult for some setups, for instance when the most used route is a catch-all and thus can't be moved to the top. To fix this problem, I have another improvement waiting for inclusion in the symfony 1.3 branch, which I will probably talk about in a future post on this blog.
Stay tuned and let us know if this new option saved you some CPUs, because it sure did for us!
I have a project with a great number of routes that is shortly going to be upgraded from its current 1.0 state to a 1.2 project; once that's done, I can test this out. Excellent first post!
Very nice work! This kind of technical improvement is very much appreciated :)
What a great couple of days for symfony performance! Thank you for your contribution to the framework, and welcome!
Great find! Welcome to the team Olivier.
Thanks for sharing this. I'll check this out asap.
I guess one of the problems with using objects here is that you cannot benefit from ultra-fast deserialization out of APC's user data store. That is always something to consider when moving data structures from arrays to objects.
Speaking of APC's user cache, I always wonder why so few people have tried to load such static config data into Apache itself. I am sure someone could create an Apache mod that would load such configs at Apache startup and store everything in ZVALs that could then be loaded into PHP with zero effort. I guess the issue is that this stuff would need to be read-only, and the only thing we have for that atm are constants, which do not support arrays. There was once a discussion about making it possible to mark certain variables read-only (like a db query result).
At any rate, isn't it the better "caching" approach to generate the code into a file as plain PHP code, so that it can be cached by a bytecode cache? Though here again, objects are less convenient than arrays imho.
So I guess the point throughout most of my comment is: use arrays for data structures, which are then passed through a single object instance as needed.
Is this improvement available in the symfony 1.2 trunk?
@malas it seems so http://trac.symfony-project.org/changeset/16949
Hello Olivier, I made a simple and very handy tool for myself while we were playing with speeding up symfony/CI. Here it is: http://code.google.com/p/xdebugtoolkit/ It always generates trees instead of messy cyclic call graphs, so I find it more informative.
It would be great if you could check this out and leave some feedback, or perhaps it will be useful for you as is.
Lukas: Generating plain PHP code files was my initial approach, and it's true: with an opcode cache it's faster than this final solution. Unfortunately, with such an approach you can't use the provided sfCache, and you need to add some option parameters to configure where to cache those generated files. That's why we finally opted for this less performant but more generic solution.
Alexey Kupershtokh: Nice tool! I'll definitely test it, thanks.