Performance optimization and scaling strategy for your Magento 2 shop

Vincent Teyssier
Jan 13, 2020

Whether you are on a tight budget or running a booming online store, you will have to follow an optimization strategy in order to minimize cost and optimize performance, i.e. provide a good customer experience in terms of latency at a cost you can bear. In this article we will review performance optimization for low-memory environments, how to change these settings when you increase the memory available, and then how to use auto-scaling and high availability to absorb virtually any visitor load reliably.

Low-resource environment optimization

Though the recommended VM for a basic store is a t2.medium (2 vCPUs, 4Gb RAM), it can actually run on a t2.small (1 vCPU, 2Gb RAM). But be aware that you will be very borderline on such a small instance. Both CPU and RAM will be limiting factors, but let's look mostly at RAM, since there is very little you can do to limit CPU consumption.

So first of all, let's assume that you run the full stack on your VM (database, application, caching). The following will consume RAM (see the snippet after the list for where some of these limits are set):

  • OS: about 300Mb
  • MySQL: around 1Gb
  • PHP: depends on your PHP memory limit, but it can go up to 512Mb during CRON runs, up to 756Mb when deploying and up to 2Gb when installing some plugins
  • Zend OPCache: 128Mb minimum (512Mb recommended)
  • PHP-FPM processes: about 100Mb per child (let’s count 110 to be safe)
  • Redis: 100Mb
  • Varnish: 256Mb
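
Most of these figures are limits you configure yourself rather than fixed costs. As an illustration, the PHP and OPCache limits above usually live in php.ini (the path below is an example and varies per distribution and PHP version):

; /etc/php/7.2/fpm/php.ini (example path)
memory_limit = 756M                ; per-process ceiling, matching the deploy figure above
opcache.memory_consumption = 128   ; Mb reserved for the opcode cache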

So when you have only 2Gb available, that's going to be tight. You will basically not be able to install some plugins (Mailchimp needs 2Gb, for example), and if you want to be safe while your CRON jobs are running, you must forget about Varnish.

So while 2Gb is doable, you will definitely experience high latency when even a few users browse your site, and sometimes a full crash of the instance.

I’ve just started a fresh Magento install with MariaDB, Redis and Varnish running on a t2.small EC2 instance, and checked memory usage with htop.

We can see that our baseline is a consumption of about 700Mb, but remember that since we have no activity yet, there is no content cached and the DB indexes and tables held in memory are smaller than they should be. I can see that the CRON running every minute loads the RAM up to 1.1Gb. This is taken by DB and PHP processes, since the CRON runs PHP scripts that trigger various actions in the database (reindexing, catalog updates, etc.). We will use this figure as our new baseline.

We do not want to sacrifice speed, therefore we will keep Varnish and Redis running, but we can definitely lower the allocated memory (if your pages are not heavy, 128Mb for Varnish should do the trick). That leaves us with around 750Mb of RAM.
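
As a sketch (service file locations and default flags differ between distributions and Varnish versions), the Varnish cache size is set on the varnishd command line and Redis can be capped in redis.conf:

# systemctl edit varnish : override the storage size passed to varnishd
# (simplified; keep the other flags from your distribution's unit file)
[Service]
ExecStart=
ExecStart=/usr/sbin/varnishd -a :6081 -f /etc/varnish/default.vcl -s malloc,128m

# /etc/redis/redis.conf : cap Redis so it cannot grow past its budget
maxmemory 100mb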

You have 2 ways to serve your website, Apache or Nginx. Nginx is known for handling concurrent requests better than Apache and for consuming fewer resources, therefore we will use only Nginx.

Nginx hands PHP requests to PHP-FPM for processing, so most of our remaining memory will be used for this purpose. We will leave the Zend OPCache at its defaults since we have no extra memory to play with its settings.

PHP-FPM has 2 main settings that we want to look at: pm and pm.max_children.

pm stands for process manager. You can set it to static, dynamic, or ondemand.

Since we want to use memory only when needed, in order to leave free resources for whatever will pop up, we definitely want to go for ondemand, as it kills idle PHP processes after a timeout and therefore frees memory.

pm = ondemand

Then pm.max_children determines how many child processes you allow FPM to spawn. I tend to consider that 1 child will eat up to 110Mb of memory, therefore, to be on the safe side given the memory we have left, I would set it to 5 (yes, I know it sounds small, but you have to think about reliability. If you set it higher you will experience memory overflow during CRON runs and ultimately crash your VM at least once a day). By monitoring Google Analytics and looking at the child processes, we can see that one FPM child can handle about 6–7 unique visitors per minute. So your website can handle 30–35 users per minute (very rough estimate :) )
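
Putting the two settings together, a minimal pool configuration for this 2Gb scenario could look like the following (file path and exact values are illustrative, not prescriptions). With roughly 6–7 visitors per minute per child, 5 children is where the 30–35 visitors per minute estimate comes from.

; /etc/php/7.2/fpm/pool.d/www.conf (illustrative values for a 2Gb instance)
pm = ondemand
pm.max_children = 5            ; worst case 5 x ~110Mb = ~550Mb for FPM
pm.process_idle_timeout = 10s  ; idle children are killed after this delay, freeing memory
pm.max_requests = 500          ; recycle children periodically to limit memory leaks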

I am using JMeter to simulate some HTTP traffic to my instance from my laptop. After letting it run for a few minutes I can see that, though the RAM reaches 1.9Gb, the instance absorbs the load as intended, even when CRON jobs are running at the same time.
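
If you want to reproduce this kind of test, JMeter can also run headless from the command line; store-load.jmx below is a hypothetical test plan you would have built in the GUI first:

jmeter -n -t store-load.jmx -l results.jtl -e -o report/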

Settings and strategies to look at when you increase your server resources

The very first thing that will improve the performance of your site, and at the same time lower the burden on your instance, is to serve static and media files from a CDN. This comes at a small cost and provides very high value.
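
One way to do this in Magento 2, assuming a hypothetical cdn.example.com that fronts your pub/static and pub/media folders, is to point the static and media base URLs at the CDN (config:set is available from Magento 2.2; on older versions set the same values in the admin):

bin/magento config:set web/unsecure/base_static_url https://cdn.example.com/static/
bin/magento config:set web/unsecure/base_media_url https://cdn.example.com/media/
bin/magento config:set web/secure/base_static_url https://cdn.example.com/static/
bin/magento config:set web/secure/base_media_url https://cdn.example.com/media/
bin/magento cache:flush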

When more resources become available, you will want to scale your ability to handle concurrent users and to serve cached content, so I would look at the following:

  • increasing pm.max_children: to handle concurrent connections you definitely need more than 5 max children. If you have a 4Gb machine I would raise it to 15, on 8Gb to 50 … let it run and fine-tune over time (see the sketch after this list).
  • moving to a static process manager: this will keep all children alive instead of terminating them after the timeout. The advantage is that when traffic increases, your children are ready to accept the load; there is no need to spawn new ones and incur the overhead of doing so.
  • increasing Varnish cache memory: go back from 128Mb to the recommended 256Mb cache
  • Zend OPCache: make sure opcode caching is enabled and give it 128Mb of memory to start with
  • serve static and media folders from a CDN
  • decouple your database from your application instance
  • use a dedicated Elasticsearch instance to handle searches if you have a huge catalog
  • use a dedicated redis instance for session caching
  • use a dedicated Varnish instance for full page caching
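
As a sketch for a 4Gb machine, the process manager and OPCache items above could translate into something like this (starting points to fine-tune, not recommendations):

; pool.d/www.conf
pm = static
pm.max_children = 15

; php.ini
opcache.enable = 1
opcache.memory_consumption = 128
opcache.max_accelerated_files = 60000  ; a commonly used value for large codebases like Magento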

The key to handling concurrent requests is definitely the number of children allowed. You can check the logs (php-fpm.log) to see whether you often get the following warning:

[11-Oct-2019 10:05:24] WARNING: [pool www] server reached pm.max_children setting (35), consider raising it

If that’s the case, it means you have reached the maximum number of children allowed and therefore have more concurrent users on your website than you can handle, so you definitely have to either scale up your machine or raise the max_children setting.

Conversely, if you monitor your RAM usage carefully, preferably over 24–48 hours, and see that you still have some margin, then you should probably increase max_children.
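
A quick way to check that margin is to look at the resident memory of the FPM children themselves (the process name may be php-fpm, php-fpm7.2, etc. depending on your version):

ps --no-headers -o rss -C php-fpm7.2 | awk '{sum+=$1; n++} END {printf "%d children, ~%.0f Mb each on average\n", n, sum/n/1024}'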

Finally, benchmarks I found online show that static process management gives better response times than ondemand or dynamic.

I also suggest reading the following article, which provides good tips on optimizing Nginx itself, especially regarding browser caching settings:

https://www.if-not-true-then-false.com/2011/nginx-and-php-fpm-configuration-and-optimizing-tips-and-tricks/
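
For reference, the browser caching part usually boils down to long expiry headers and gzip for static assets, along these lines (simplified; the nginx.conf.sample shipped with Magento already contains similar rules for pub/static and pub/media):

# inside the server {} block of your vhost
gzip on;
gzip_types text/css application/javascript application/json image/svg+xml;

location ~* \.(css|js|jpe?g|png|gif|svg|woff2?)$ {
    expires 30d;
    add_header Cache-Control "public";
}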

Auto-scaling and High Availability strategy

When your site has grown big enough, but you have fairly idle periods or spikes of activity from time to time, it becomes more cost-effective to use auto-scaling. Also, at that point, every minute your site is down means lost revenue, so building a resilient architecture becomes crucial to your business. AWS publishes a reference architecture for Magento deployments that ensures performance, high availability and scaling.

When you create an instance, you use a disk image and potentially a startup script. It is very important to understand that auto-scaling assumes that your instance is stateless, i.e. it holds no state tied to a global context, only to a local one. That means a prerequisite is to decouple your DB from your compute instance, since the data contained in your DB is global. The rest of your stack can live locally. Once your DB lives in a managed environment (Amazon RDS, Aurora) or simply on a separate EC2 instance, you can place your application instance in an auto-scaling group.
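
Repointing Magento at the external database is then mostly a matter of updating env.php, which can be done with setup:config:set (hostname and credentials below are placeholders):

bin/magento setup:config:set --db-host=db.internal.example.com --db-name=magento --db-user=magento --db-password='********'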

You will find plenty of tutorials on how to configure auto-scaling, so I won’t go into detail here, but the principle is to define a template of the instance you want to use, i.e. a disk image of your application, which will be used when new instances are created to absorb the load. I suggest that the rule used to create new VMs is based on RAM consumption instead of CPU usage. I also strongly suggest decoupling Redis onto its own VM if you do not use a load balancer with sticky sessions.
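
Note that EC2 does not expose memory usage as a CloudWatch metric out of the box, so scaling on RAM means publishing it yourself, for instance with the CloudWatch agent (minimal config sketch). You can then base your scaling policy on mem_used_percent aggregated per auto-scaling group:

{
  "metrics": {
    "append_dimensions": { "AutoScalingGroupName": "${aws:AutoScalingGroupName}" },
    "metrics_collected": {
      "mem": { "measurement": ["mem_used_percent"] }
    }
  }
}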

You also want to have your DB and Redis servers in the same placement group as your application server in order to minimize latency between them.

The next thing you want to look at after autoscaling is high availability. To achieve that you need to consider the following:

  • Database replication in a different availability zone
  • Redis session caching replication in a different availability zone
  • Load balancer directing traffic to 2 or more different availability zones (2 different subnets)

This will ensure that if your DB goes down, a replica is ready to take over, keeping downtime low (hot backup). The same goes for your session server: sessions will be replicated, ensuring a smooth transition in case of failure of the main Redis server. And finally, a hot standby of your auto-scaling group is ready to take over if a whole availability zone goes down (auto-scaling should handle single-instance failures). Your load balancer is configured with health checks which will automatically redirect the traffic to the hot backup in case of failure of the application instance.
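
Magento 2 ships a pub/health_check.php endpoint that verifies the database and cache backends are reachable, which makes a sensible target for those health checks (the target group ARN below is a placeholder):

aws elbv2 modify-target-group \
  --target-group-arn arn:aws:elasticloadbalancing:<region>:<account>:targetgroup/magento/<id> \
  --health-check-path /health_check.php \
  --healthy-threshold-count 2 \
  --unhealthy-threshold-count 3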

I have seen architectures in which the Varnish servers are decoupled from the application servers. If you are to do that, make sure that your application instances and Varnish instances are in the same placement group in order to limit latency when Varnish needs to fetch content from the application servers. If you are using the free Varnish version, don’t forget that the cache will likely start empty, meaning that the first users to hit a newly launched instance will not benefit much from it.
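
If Varnish runs on its own instances, Magento also needs to know where to send cache purge requests; one way to register the hosts (hostnames here are placeholders) is:

bin/magento setup:config:set --http-cache-hosts=varnish1.internal:6081,varnish2.internal:6081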

Conclusion

Small environments are challenging, but that doesn’t mean they are hopeless. With the right settings it is viable to run Magento on a small instance. We looked at which settings help you absorb more users and improve performance.

There are countless other small things to look at, and I can’t list them all here or even pretend to know them all. If you are really hardcore you can even look at recompiling PHP yourself to get rid of useless modules. But I believe the scaling tips provided here give you a strong enough base to understand where to focus your efforts and build a scaling strategy. After all, once you scale past a certain point, you’ll very likely not be tweaking it yourself but will have a team for that :) at least that’s what I wish for you!

Need help with your Magento installation, performance or strategy? I also provide consulting and implementation services. Contact me by email: v.teyssier@gmail.com
