Push it to the limits - Symfony2 for High Performance needs
August 4, 2014 • Published by Javier Eguiluz
This Case Study is a guest post written by Antoni Orfin, Co-Founder and Software Architect at Octivi. Want your company featured on the official Symfony blog? Send a proposal or case study to fabien.potencier@sensiolabs.com
For most people, using full-stack frameworks equals slowing down websites. At Octivi, we think that it depends on correctly choosing the right tools for specific projects.
When we were asked to optimize a website for one of our clients, we analyzed their setup from the ground up. The result: migrate them toward Service Oriented Architecture and extract their core-business system as a separate service.
In this Case Study, we'll reveal some architecture details of a Symfony2 application that handles 1 billion requests a week. We'll show you the big picture of the project, then focus on the features we really like in Symfony2. Don't worry; we'll also talk about the things that we don't really use.
Key numbers to give you an overview of the described platform's scale:
- The application handles 1,000,000,000 (1 billion) requests every week.
- A Symfony2 instance handles 700 req/s with an average response time of 30 milliseconds.
- Varnish handles more than 12,000 req/s (achieved during a stress test).
- 160,000,000 records are stored in Redis (our primary data store!).
- 300,000,000 records are stored in MySQL.
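For a sense of scale, the weekly figure can be converted into a sustained average rate. The snippet below is only a back-of-the-envelope check; it says nothing about peak traffic, which the numbers above don't state.

```php
<?php
// Convert the weekly request count quoted above into an average per-second rate.
$requestsPerWeek = 1000000000;
$secondsPerWeek  = 7 * 24 * 60 * 60; // 604,800 seconds in a week

$averagePerSecond = $requestsPerWeek / $secondsPerWeek;

printf("Average load: %.0f req/s\n", $averagePerSecond); // prints "Average load: 1653 req/s"
```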
Business Requirements
Our system development had to fulfill both of the following requirements:
- Reliability - the platform must ensure High Availability, which means downtime has to be kept to a minimum so the business stays fully operational.
- Performance - as the previous system had some performance issues, this new one had to solve them and also be prepared for easy scaling-out to handle growth in customer numbers.
Choosing the right tools
Some may wonder why we chose Symfony2 over another technology, e.g. Node.js, which also works well for building fast APIs.
It was a business choice. Since the Client's platform is built with PHP, they've got a big team of great PHP developers. Building a new system with Node.js would lead to maintenance difficulties and would require hiring Node.js developers just to work on the one subsystem. Choosing Symfony2 led to lower IT related costs and made the platform more maintainable.
However, we first had to build a proof of concept and confirm that it reached the required performance level. The performance tests confirmed that the planned setup was efficient enough.
Application architecture
Request flow through SOA (Service Oriented Architecture)
As we've said, the platform is structured into separate web services that expose REST APIs, and the Symfony2 application is one of those services.
Let's talk about three layers: Frontend >> Webservice >> Data Storage.
- Frontend - when the end-client visits a frontend-type website, it communicates internally with a specific Web service (as described in this article).
- Web service - web services handle large amounts of the business logic. Functionality isn't spread across many sites; everything is kept in a specific service. It's a well-known way of applying the separation of concerns principle in software architecture.
- Data Storage - each service uses its own data stores. Frontend websites mustn't connect directly to a database handled by a web service, as that would bypass the service's control over the platform logic.
Application's server architecture
To maintain high performance and high availability, the Symfony2 service runs on several application servers in a redundant configuration.
- Requests are handled by a HAProxy load balancer which distributes them to Varnish reverse proxies.
- We use Varnish only as a cache layer, not as another load balancer. There is a separate Varnish instance on each of the application servers, so we don't end up with a SPOF (Single Point Of Failure). This lowers the cache hit ratio, but here we favor availability over performance (which is still not a problem). A single Varnish instance can handle up to 12,000 requests per second.
- The last piece of the puzzle is an Apache2 server with the Symfony2-based application. We're using PHP 5.4 running as PHP-FPM with APC enabled. That layer handles 700 req/s.
Storing data for High-Performance
Of course, every web-application uses some kind of data-storage. We've managed a nice setup here, too.
- Redis - an extremely fast, in-memory data store. Think of Memcache on steroids: with a persistence mechanism, HA setup and so on. We store 160,000,000 records in it, 98% of them persistent, i.e. they won't disappear after a server outage. Key space hits average 15,000 hits/s. Symfony2 is integrated with it through SncRedisBundle and the Predis library.
- MySQL - the best-known relational database. We mostly use it as a last (third) cache layer, for non-expiring resources, and connect to it through Doctrine DBAL. Our MySQL stores 300,000,000 records - that's 400GB of pure data.
So, as you can see, some unusual things are happening here - Redis is our primary data store and MySQL is our last cache layer.
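The article doesn't describe how reads move between the two stores. One common pattern, shown here purely as an assumption, is a tiered lookup: try the fast layer first and fall back to the slower one. In this sketch plain arrays stand in for Redis and MySQL, and the class name `TieredStore` is hypothetical.

```php
<?php
/**
 * Hypothetical tiered lookup (an assumption, not the platform's actual code).
 * Plain arrays stand in for Redis (fast layer) and MySQL (slow layer).
 */
final class TieredStore
{
    private $fast;
    private $slow;

    public function __construct(array $fast, array $slow)
    {
        $this->fast = $fast;
        $this->slow = $slow;
    }

    public function get(string $key): ?string
    {
        if (array_key_exists($key, $this->fast)) {
            return $this->fast[$key];
        }
        if (array_key_exists($key, $this->slow)) {
            // Promote the value so the next read is served from the fast layer.
            $this->fast[$key] = $this->slow[$key];

            return $this->fast[$key];
        }

        return null; // not found in any layer
    }

    public function isInFastLayer(string $key): bool
    {
        return array_key_exists($key, $this->fast);
    }
}

$store = new TieredStore(['a' => '1'], ['b' => '2']);
$value = $store->get('b'); // served from the slow layer, then promoted
```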
Things we liked the most
Clear project structure - Bundles
Symfony2 doesn't impose a full structure on your project; it's actually quite flexible. Basically, you can structure your code into Bundles, which hold Symfony-related code, and more generic Components, which handle common tasks and aren't necessarily tied to the Symfony2 ecosystem.
Following this concept, we're mostly using bundles to divide our Project into logically-connected parts. We barely modified the Symfony2 Standard structure, so our code base is very easy to understand for developers with Symfony2 experience.
Extending codebase - EventDispatcher Component
Do you need to change the response format in all your controllers? It's easy: just add a new ResponseFormatListener and listen for the kernel.response event.
We have seen so much spaghetti code in so many projects that we really liked the popularization of the Events concept in Symfony. It's nothing new among software patterns, just the old Observer Pattern, but legacy PHP frameworks rarely used it.
In addition to the built-in Symfony2 events, we've also added custom ones. With Event Listeners we can keep the code clean: methods dispatch their specific events, and new parts of the code can hook into the existing parts without changing them.
Because performance is crucial in this project, we evaluated the performance impact. The nice thing is that this mechanism comes with hardly any overhead. Internally it's a simple array that stores the event listeners' instances. Virtually worry-free!
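To make the "simple array of listeners" idea concrete, here is a minimal stand-in written in plain PHP. It is not Symfony's EventDispatcher; the class `TinyDispatcher` and the envelope format are hypothetical, and the event payload is a plain array rather than an event object.

```php
<?php
/**
 * Minimal stand-in for an event dispatcher (not Symfony's implementation).
 * Listeners are kept in a plain array keyed by event name and priority.
 */
final class TinyDispatcher
{
    /** @var array<string, array<int, callable[]>> */
    private $listeners = [];

    public function addListener(string $eventName, callable $listener, int $priority = 0): void
    {
        $this->listeners[$eventName][$priority][] = $listener;
    }

    public function dispatch(string $eventName, array $payload): array
    {
        if (empty($this->listeners[$eventName])) {
            return $payload; // no listeners registered: nothing to do
        }

        $byPriority = $this->listeners[$eventName];
        krsort($byPriority); // higher priority runs first

        foreach ($byPriority as $listeners) {
            foreach ($listeners as $listener) {
                $payload = $listener($payload);
            }
        }

        return $payload;
    }
}

$dispatcher = new TinyDispatcher();

// Hypothetical listener that wraps every response body in a JSON envelope.
$dispatcher->addListener('kernel.response', function (array $response): array {
    $response['body'] = json_encode(['data' => $response['body']]);

    return $response;
});

$response = $dispatcher->dispatch('kernel.response', ['body' => ['id' => 1]]);
echo $response['body'], "\n"; // prints {"data":{"id":1}}
```

Dispatching boils down to an array lookup and a loop over callables, which is why the cost stays small even with many events.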
Retrieving requests' data - OptionsResolver Component
While designing this application, we also looked for the most efficient way of retrieving and validating data from the request content. We needed to smoothly transform the request's data into a DTO (Data Transfer Object). That way we don't get stuck with hard-to-maintain associative arrays, and we stick to the structure of Request classes.
In the end we pass parameters in the query string, but first we considered several Symfony2 mechanisms...
You could use the Form Component: build the request structure in a Form Type and pass the Request to it. It's nice and rich in features, but it comes with a huge overhead. We also didn't need or want advanced options like nested fields.
Another way would be to pass the data in the request's content as a JSON structure, deserialize it with a serializer (like the great JMSSerializer or even the Symfony Serializer Component) and validate the resulting DTO Request objects. That still leaves two potential performance bottlenecks: deserializing, and validating with the Validator Component.
We didn't need any advanced validation (just checking required options and some basic format validation), and the request format is designed to be simple, so we chose... the OptionsResolver Component. It's the same one you use when defining options for your forms. We pass it the GET array and receive a nicely validated and structured model object in return (it's an array to DTO normalization).
One nice thing about handling validation this way: the exceptions come with verbose messages, so they're ideal for debugging and for presenting readable errors on API endpoints.
Example of handling a request
    <?php

    namespace Octivi\WebserviceBundle\Controller;

    use Octivi\WebserviceBundle\Request\GetBarsRequest;
    use Sensio\Bundle\FrameworkExtraBundle\Configuration\Route;

    /**
     * @Route("/foo")
     */
    class FooController extends BaseController
    {
        /**
         * Sample requests:
         * - OK: api.com/foo/bars?type=simple&offset=10
         * - OK: api.com/foo/bars?type=simple
         * - Wrong: api.com/foo/bars?offset=10
         * - Wrong: api.com/foo/bars?limit=many
         *
         * @Route("/bars")
         */
        public function barsAction()
        {
            // Build the DTO from the query string; invalid input throws an exception.
            $request = new GetBarsRequest($this->getRequest()->query->all());
            $results = $this->get('main.foo')->getBars($request);

            return $results;
        }
    }

    <?php

    namespace Octivi\WebserviceBundle\Request;

    use Symfony\Component\OptionsResolver\OptionsResolverInterface;

    class GetBarsRequest extends AbstractRequest
    {
        protected $type;
        protected $limit;
        protected $offset;

        public function __construct(array $options = array())
        {
            parent::__construct($options);
        }

        protected function setDefaultOptions(OptionsResolverInterface $resolver)
        {
            parent::setDefaultOptions($resolver);

            // "type" must always be present in the query string.
            $resolver->setRequired(array(
                'type',
            ));

            $resolver->setOptional(array(
                'limit',
                'offset'
            ));

            // Query string values arrive as strings, hence "numeric" rather than "int".
            $resolver->setAllowedTypes(array(
                'limit' => array('numeric'),
                'offset' => array('numeric')
            ));

            $resolver->setDefaults(array(
                'limit' => 10,
                'offset' => 0
            ));

            $resolver->setAllowedValues(array(
                'type' => array('simple', 'extended'),
            ));
        }

        // ...
    }
Pretty straightforward, isn't it? And we've achieved everything we wanted:
- Basic validation:
  - Setting required fields
  - Optional parameters
  - Handling data types (numeric) and allowed values
- Default parameters
- Nice looking DTO representation of a request
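The resolve step can be illustrated without the framework. The function below is a simplified stand-in, not the OptionsResolver API: `resolveOptions`, the `$rules` layout and the exception messages are all hypothetical, chosen only to mirror the rules configured in GetBarsRequest above.

```php
<?php
/**
 * Simplified stand-in for the resolve step (not the OptionsResolver API).
 * Rejects unknown keys, checks required keys, applies defaults, then checks
 * numeric types and allowed values.
 */
function resolveOptions(array $input, array $rules): array
{
    $known = array_merge($rules['required'], array_keys($rules['defaults']));
    foreach (array_keys($input) as $name) {
        if (!in_array($name, $known, true)) {
            throw new InvalidArgumentException(sprintf('The option "%s" does not exist.', $name));
        }
    }
    foreach ($rules['required'] as $name) {
        if (!array_key_exists($name, $input)) {
            throw new InvalidArgumentException(sprintf('The required option "%s" is missing.', $name));
        }
    }

    $resolved = $input + $rules['defaults']; // input wins, defaults fill the gaps

    foreach ($rules['numeric'] as $name) {
        if (!is_numeric($resolved[$name])) {
            throw new InvalidArgumentException(sprintf('The option "%s" must be numeric.', $name));
        }
    }
    foreach ($rules['allowed'] as $name => $values) {
        if (!in_array($resolved[$name], $values, true)) {
            throw new InvalidArgumentException(sprintf('The option "%s" has an invalid value.', $name));
        }
    }

    return $resolved;
}

$rules = [
    'required' => ['type'],
    'defaults' => ['limit' => 10, 'offset' => 0],
    'numeric'  => ['limit', 'offset'],
    'allowed'  => ['type' => ['simple', 'extended']],
];

// Mirrors "api.com/foo/bars?type=simple&offset=10": limit falls back to its default.
$ok = resolveOptions(['type' => 'simple', 'offset' => '10'], $rules);

// Mirrors "api.com/foo/bars?offset=10": the required "type" is missing.
try {
    resolveOptions(['offset' => '10'], $rules);
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), "\n"; // prints: The required option "type" is missing.
}
```

Because the whole step is a handful of array checks, it stays cheap compared with a full form or serializer-plus-validator pipeline.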
Keeping configuration in-code - Annotations
Yes, we're using annotations in a high-performance application. Does that sound weird? Maybe, but we love this mechanism!
Annotations are just PHPDoc-like text. They're parsed during the cache warm-up and transformed into plain PHP files. As a matter of fact, it doesn't matter whether you use XML, YAML or annotations, because they all end up as plain PHP files.
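The "parse once, read plain PHP afterwards" idea can be sketched in a few lines. This is not Symfony's annotation loader: `DemoController`, `collectRoutes` and the cache file layout are hypothetical, and a regular expression stands in for a real annotation parser.

```php
<?php
/**
 * Sketch of "parse once, cache as plain PHP" (not Symfony's loader).
 * A docblock tag is read via reflection and the result is exported as PHP code.
 */
class DemoController
{
    /**
     * @Route("/bars")
     */
    public function barsAction()
    {
    }

    /**
     * @Route("/bazs")
     */
    public function bazsAction()
    {
    }
}

function collectRoutes(string $class): array
{
    $routes = [];
    foreach ((new ReflectionClass($class))->getMethods() as $method) {
        $doc = $method->getDocComment();
        if ($doc !== false && preg_match('/@Route\("([^"]+)"\)/', $doc, $m)) {
            $routes[$m[1]] = $class . '::' . $method->getName();
        }
    }

    return $routes;
}

// "Warm-up": the expensive parsing happens once and is dumped as plain PHP.
$cacheFile = tempnam(sys_get_temp_dir(), 'routes');
file_put_contents($cacheFile, '<?php return ' . var_export(collectRoutes('DemoController'), true) . ';');

// "Runtime": later requests only include the generated file, no parsing involved.
$routes = require $cacheFile;
unlink($cacheFile);
```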
We're using annotations as much as possible:
- Routing - as shown in the barsAction above, we declare our routes via @Route annotations. We find it cleaner to keep such declarations within the controllers' code than to split them into separate YML/XML files. That way, developers don't have to jump from file to file searching for routes. Keep in mind that loading of these annotations is declared in the global app/config/routing.yml file.
- Services - the next thing you can place in-code. Using JMSDiExtraBundle, we don't have to maintain YAML files with Service Container declarations. Everything ends up in the service classes. Still, DI configuration for external classes is done via XML files.
Example event listener configured with Annotations:
    <?php

    namespace Octivi\WebserviceBundle\EventListener;

    use Doctrine\ORM\EntityManager;
    use JMS\DiExtraBundle\Annotation\Inject;
    use JMS\DiExtraBundle\Annotation\InjectParams;
    use JMS\DiExtraBundle\Annotation\Observe;
    use JMS\DiExtraBundle\Annotation\Service;
    use Symfony\Component\HttpKernel\Event\FilterResponseEvent;
    use Symfony\Component\Security\Core\SecurityContext;

    /**
     * @Service
     */
    class FormatListener
    {
        private $em;
        private $security;

        /**
         * Constructor uses JMSDiExtraBundle for dependency injection.
         *
         * @InjectParams({
         *     "em" = @Inject("doctrine.orm.entity_manager"),
         *     "security" = @Inject("security.context")
         * })
         */
        public function __construct(EntityManager $em, SecurityContext $security)
        {
            $this->em = $em;
            $this->security = $security;
        }

        /**
         * @Observe("kernel.response", priority = 0)
         */
        public function onKernelResponse(FilterResponseEvent $event)
        {
            // $this->em->...;
        }
    }
Making rich CLI commands - Console Component
Another Symfony component we rely on is the Console Component. It's the one we use the most, since most of the new features for the application come as CLI commands.
In previous-gen PHP frameworks or other popular PHP-based software (e.g. Wordpress, Magento), making CLI commands gave us a serious headache. There weren't any standardized components, or they were so lacking in features that everyone had to roll their own solution.
In Symfony2, we found a cool framework for making CLI commands. We can set command names, required options and arguments. We've found it good practice to write accurate descriptions for commands: that makes them self-documenting, so there's no need to repeat them in separate documentation. It's especially great when you're developing with Agile methods. When new functionality comes fast, separate paper documentation quickly goes out of date.
We're using the Console Component to create administrative tools and even long-running processes. The longest one took about 6 days. A nice proof of the lack of memory leaks in the Symfony ecosystem!
    $ php app/console octivi:test-command --help
    Usage:
     octivi:test-command [-l|--limit[="..."]] [-o|--offset[="..."]] table

    Arguments:
     table          Database table to process

    Options:
     --limit (-l)   Limit per SQL query. (default: 10)
     --offset (-o)  Offset for the first statement (default: 0)
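The --limit and --offset options above suggest that such commands walk through a table page by page. A minimal sketch of that loop, assuming a hypothetical `readInBatches` helper and an in-memory array standing in for the database table, could look like this:

```php
<?php
/**
 * Hypothetical batch reader for a long-running command. Rows are fetched in
 * pages of $limit starting at $offset, so only one page is held in memory.
 * $fetchPage stands in for a "SELECT ... LIMIT ? OFFSET ?" query.
 */
function readInBatches(callable $fetchPage, int $limit = 10, int $offset = 0): Generator
{
    while (true) {
        $rows = $fetchPage($limit, $offset);
        if (count($rows) === 0) {
            return; // table exhausted
        }
        foreach ($rows as $row) {
            yield $row;
        }
        $offset += count($rows);
    }
}

// In-memory stand-in for a database table with 25 rows.
$table = range(1, 25);
$fetchPage = function (int $limit, int $offset) use ($table): array {
    return array_slice($table, $offset, $limit);
};

$processed = 0;
foreach (readInBatches($fetchPage, 10, 5) as $row) {
    $processed++; // real work on $row would go here
}
echo $processed, " rows processed\n"; // prints "20 rows processed"
```

Keeping only one page in memory at a time is one way a process can run for days without its memory footprint growing.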
Keeping an eye on the application - Profiler, Stopwatch & Monolog trio
To keep potential performance leaks under control, we rely heavily on HttpKernel's Profiler together with the Stopwatch Component. They make it easier to spot application methods with unexpectedly long execution times as well as inefficient database queries (and here we mean both MySQL and Redis).
When maintaining a bigger project, especially in an SOA environment, verbose logging is a must. Otherwise, you won't be able to efficiently track down problems with, e.g., API method usage or connections to 3rd party web services.
For that, we use a richly configured Monolog (all from app/config/config_prod.yml!). We log all unexpected things to the standard prod.log and leave more verbose logs to specific channels or different files.
We've worked out some nice logging patterns with that config. Since we don't use the Fingers Crossed handler, we always add as much context as we can to each log line. That way, calls to external web services look like:
    $response = null; // stays null if the request itself fails
    $exception = null;

    try {
        $response = $this->request($request);
    } catch (\Exception $e) {
        $exception = $e;
    }

    if (null !== $exception) {
        // Log the full context before re-throwing, so the failure can be traced later.
        $this->logger->error('Error response from XXX', array(
            'request' => $request,
            'response' => $response,
            'exception' => (string) $exception,
        ));

        throw $exception;
    } else {
        $this->logger->debug('Success response from XXX', array(
            'request' => $request,
            'response' => $response,
        ));
    }
Features we don't use
Doctrine ORM
We use MySQL mostly as a cache layer, storing serialized BLOBs in it. We don't use Doctrine ORM, as it would affect performance unnecessarily. Instead, we just use Doctrine DBAL to retrieve associative arrays, and then a Model layer handles deserialization/normalization of the records.
I must add that it's not so much about the volume of stored data as about the performance requirements for retrieving a single record. ORM adds a big overhead while hydrating objects (deserialization). If you accept that, you can use Doctrine ORM even if you have 200+ million records. Per-record ORM performance will be the same as if you had 1,000.
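As an illustration of that Model layer, the sketch below turns one associative-array row into a model object. Everything specific in it is an assumption: the class `Bar`, the column names `id` and `payload`, and the use of JSON as the BLOB encoding (the article doesn't say which serialization format is used).

```php
<?php
/**
 * Hypothetical model built from a DBAL-style associative array.
 * Column names and the JSON encoding of the BLOB are assumptions.
 */
final class Bar
{
    public $id;
    public $type;
    public $attributes;

    public static function fromRow(array $row): self
    {
        $payload = json_decode($row['payload'], true);
        if (!is_array($payload)) {
            throw new UnexpectedValueException(sprintf('Row %s has a corrupt payload.', $row['id']));
        }

        $bar = new self();
        $bar->id = (int) $row['id']; // database drivers typically return strings
        $bar->type = $payload['type'] ?? null;
        $bar->attributes = $payload['attributes'] ?? [];

        return $bar;
    }
}

// Shape of a row as an associative-array fetch would return it.
$row = ['id' => '42', 'payload' => '{"type":"extended","attributes":{"color":"red"}}'];
$bar = Bar::fromRow($row);
```

Compared with ORM hydration, this is a single decode call plus a few assignments per record.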
Twig
The application returns responses mostly in JSON format. Therefore we don't use Twig in the standard request flow. However, we do keep it for rendering some administrative dashboards.
Summary
As this case study shows, Symfony2 can be used for a variety of projects. You just have to be smart about combining the components provided by Symfony to design a system that meets all your requirements. The final result will be a simple, easy-to-maintain and efficient setup.
Who's behind the application?
The platform was designed by the Poland-based software house Octivi. We specialize in building scalable architectures focused on high performance and availability. We'd also like to acknowledge the cooperation we received from the client's IT department.
Related articles
- Handling 1 Billion requests a week with Symfony2 - our overview of the whole application.
- Building A Growing Startup Architecture Using HAProxy, PHP, Redis And MySQL To Handle 1 Billion Requests A Week - details of the Hardware Architecture.
Comments are closed.
To ensure that comments stay relevant, they are closed for old posts.
This is gorgeous! I especially like the way request's parameters are retrieved
OptionsResolver as validator for requests... Very interesting.
Can you give details on the servers hosting this? Thanks!
Nicolas, according to the original post published at Octivi blog, the application server hardware is "Xeon E5-1620@3.60GHz, 64GB RAM, SATA".
Hi! Exactly, Symfony2 runs on 3.6GHz Xeon with 64 GB RAM. Redis and MySQL servers run on different configuration (much more RAM on them and SSD HDDs ;) ).
Very interesting post. There's no mention to authentication & authorisation for handling those APIs though. Is there any, or everything's public? And versioning? Thanks for sharing
Very nice article. I definitely have to try using OptionsResolver in filter query apis too.
Nice article. the OptionsResolver way is a nice alternative to https://github.com/FriendsOfSymfony/FOSRestBundle/blob/master/Resources/doc/3-listener-support.md#param-fetcher-listener
"Following this concept, we're mostly using bundles to divide our Project into logically-connected parts." can you give a example. How do you seperate frontend, api and amdmin stuff. All in one for example UserBundle or sperate bundles UserBundle, UserApiBundle, UserAdminBundle so that you can activate on a backend admin server only the admin bundles?
This sentence seems wrong "Send us a proposal or case study to fabien.potencier@sensiolabs.com"
Should be either: "Send us a proposal or case study at fabien.potencier@sensiolabs.com" or "Send a proposal or case study to fabien.potencier@sensiolabs.com".
Great article, interesting how you use OptionsResolver for validating a request :)
@Alex Salguero
@Gordon Franke
As the whole application is in fact an API, we don't have bundles like "UserApiBundle"; we simply use "UserBundle". We have a "MainBundle" to which "ModuleFooBundle", "ModuleBarBundle" etc. connect dynamically.
@All above Thanks for warm feedback! Stay tuned for the next guest post (but on more "scalable" website ;) ). And glad you liked the idea of using OptionsResolver - internally it's a simple class, without much complex logic - ideal for performance needs.
Regarding the request handling... Would it make a difference in performance to define the controller as a service and inject 'main.foo' instead of locating it with $this->get()?
$this->get('foo') == $container['foo'] - complexity is ~O(1) so this is in a performance way :-)
Yep, relying on the OptionResolver component for validating requests is clever! Thanks for the trick.
Could you explain more about "Requests are handled by a HAProxy load balancer which distributes them to Varnish reverse proxies" as for example which tools was used for HAProxy load balancer and some tips on configuration if it's possible?
We are using the same scheme as described in this blog post, so I can answer it. HAProxy is a fast and reliable load balancer and proxy, just like nginx. It has its own HTTP heartbeats and TCP pings, so it's possible to specify multiple backends (one main and one for emergencies) and HAProxy will automatically switch between them in case of an emergency or unavailability.
This solution is typical for implementing fault-tolerant applications (just google for details).
You mentioned about JMSDiExtraBundle, it is great improvement when services are in area of bundle's jurisdiction. What about services located at component level? Then, do You use yaml/xml? Thanks for hinting OptionResolver as validator.
Thanks a lot for this interesting article.
I have one question:
Do you really store data in both databases? That sounds a bit redundant for me. Or do you store something in redis and something in mysql?