In this chapter, we will discuss the following topics:
• Picking a web stack 挑选一个web服务
• Hosting approaches 托管方法
• Deployment tools 发布工具
• Monitoring 监控
• Performance tips 性能建议
So, you have developed and tested a fully functional web application in Django. Deploying this application can involve a diverse set of activities from choosing your hosting provider to performing installations. Even more challenging could be the tasks of maintaining a production site working without interruptions and handling unexpected bursts in traffic.
The discipline of system administration is vast. Hence, this chapter will cover a lot of ground. However, given the limited space, we will attempt to familiarize you with the various aspects of building a production environment.
Although, most of us intuitively understand what a production environment is, it is worthwhile to clarify what it really means. A production environment is simply one where end users use your application. It should be available, resilient, secure, responsive, and must have abundant capacity for current (and future) needs.
Unlike a development environment, the chance of real business damage due to any issues in a production environment is high. Hence, before moving to production, the code is moved to various testing and acceptance environments in order to get rid of as many bugs as possible. For easy traceability, every change made to the production environment must be tracked, documented, and made accessible to everyone in the team.
As an upshot, there must be no development performed directly on the production environment. In fact, there is no need to install development tools, such as a compiler or debugger in production. The presence of any additional software increases the attack surface of your site and could pose a security risk.
Most web applications are deployed on sites with extremely low downtime, say, large data centers running 24/7/365. By designing for failure, even if an internal component fails, there is enough redundancy to prevent the entire system crashing. This concept of avoiding a single point of failure (SPOF) can be applied at every level—hardware or software.
Hence, it is crucial which collection of software you choose to run in your production environment.
So far, we have not discussed the stack on which your application will be running on. Even though we are talking about it at the very end, it is best not to postpone such decisions to the later stages of the application lifecycle. Ideally, your development environment must be as close as possible to the production environment to avoid the "but it works on my machine" argument.
By a web stack, we refer to the set of technologies that are used to build a web application. It is usually depicted as a series of components, such as OS, database, and web server, all piled on top of one another. Hence, it is referred to as a stack.
We will mainly focus on open source solutions here because they are widely used. However, various commercial applications can also be used if they are more suited to your needs.
A production Django web stack is built using several kinds of application (or layers, depending on your terminology). While constructing your web stack, some of the choices you might need to make are as follows:
- Which and distribution? For exaple Debian, Red Hat, or penBD.
- Which WI server? For exaple unicorn, uWI.
- Which web server? For exaple pache, ginx.
- Which database? For exaple PostgreL, MyL, or Redis.
- Which caching syste? For exaple Mecached, Redis.
- Which process control and onitoring syste? For exaple pstart, Systemd, or Supervisord.
- How to store static edia? For exaple aon , loudFront.
There could be several more, and these choices are not mutually exclusive either. Some use several of these applications in tandem. For example, username availability might be looked up on Redis, while the primary database might be PostgreSQL.
There is no 'one sie fits all' answer when it coes to selecting your stack. Different components have different strengths and weaknesses. Choose them only after careful consideration and testing. For instance, you might have heard that Nginx is a popular choice for a web server, but you might actually need Apache's rich ecosystem of modules or options.
Sometimes, the selection of the stack is based on various non-technical reasons. Your organiation ight have standardied on a particular operating syste, say, Debian for all its servers. Or your cloud hosting provider might support only a limited set of stacks.
Hence, how you choose to host your Django application is one of the key factors in determining your production setup.
When it comes to hosting, you need to make sure whether to go for a hosting platform such as Heroku or not. If you do not know much about managing a server or do not have anyone with that knowledge in your team, then a hosting platform is a convenient option.
Platform as a service A Platform as a ervice Paa is defined as a cloud service where the solution stack is already provided and managed for you. Popular platforms for Django hosting include Heroku, PythonAnywhere, and Google App Engine.
In most cases, deploying a Django application should be as simple as selecting the services or coponents of your stack and pushing out your source code. You do not have to perform any system administration or setup yourself. The platform is entirely managed.
Like most cloud services, the infrastructure can also scale on demand. If you need an additional database server or more RAM on a server, it can be easily provisioned from a web interface or the command line. The pricing is primarily based on your usage.
The bottom line with such hosting platforms is that they are very easy to set up and ideal for smaller projects. They tend to be more expensive as your user base grows.
Another downside is that your application might get tied to a platform or becoe difficult to port. For instance, oogle pp ngine is used to support only a non-relational database, which means you need to use django-nonrel, a fork of Django. This limitation is now somewhat mitigated with Google Cloud SQL.
A virtual private server (VPS) is a virtual machine hosted in a shared environment. From the developer's perspective, it would seem like a dedicated machine (hence, the word private preloaded with an operating syste. You will need to install and set up the entire stack yourself, though many VPS providers such as WebFaction and DigitalOcean offer easier Django setups.
If you are a beginner and can spare some time, I highly recommend this approach. You would be given root access, and you can build the entire stack yourself. You will not only understand how various pieces of the stack come together but also have full control in finetuning each individual coponent.
Compared to a PaaS, a VPS might work out to be more value for money, especially for hightraffic sites. You ight be able to run several sites fro the same server as well.
Other hosting approaches Even though hosting on a platform or VPS are by far the two most popular hosting options, there are plenty of other options. If you are interested in maximizing performance, you can opt for a bare metal server with colocation from providers, such as Rackspace.
On the lighter end of the hosting spectrum, you can save the cost by hosting multiple applications within Docker containers. Docker is a tool to package your application and dependencies in a virtual container. Compared to traditional virtual machines, a Docker container starts up faster and has minimal overheads (since there is no bundled operating system or hypervisor).
Docker is ideal for hosting micro services-based applications. It is becoming as ubiquitous as virtualization with almost every PaaS and VPS provider supporting them. It is also a great development platform since Docker containers encapsulate the entire application state and can be directly deployed to production.
Once you have zeroed in on your hosting solution, there could be several steps in your deployment process, from running regression tests to spawning background services.
The key to a successful deployment process is automation. Since deploying applications involve a series of welldefined steps, it can be rightly approached as a programming problem. Once you have an automated deployment in place, you do not have to worry about deployments for fear of missing a step.
In fact, deployments should be painless and as frequent as required. For example, the Facebook team can release code to production up to twice a day. Considering Facebook's enormous user base and code base, this is an impressive feat, yet, it becoes necessary as eergency bug fixes and patches need to be deployed as soon as possible.
A good deployment process is also idempotent. In other words, even if you accidentally run the deployment tool twice, the actions should not be executed twice (or rather it should leave it in the same state).
Let's take a look at some of the popular tools for deploying Django applications.
Fabric is favored among Python web developers for its simplicity and ease of use. It expects a file naed fabfile.py that defines all the actions for deployent or otherwise) in your project. Each of these actions can be a local or remote shell command. The remote host is connected via SSH.
The key strength of Fabric is its ability to run commands on a set of remote hosts. For instance, you can define a web group that contains the hostnames of all web servers in production. You can run a Fabric action only against these web servers by specifying the web group name on the command line.
To illustrate the tasks involved in deploying a site using Fabric, let's take a look at a typical deployment scenario.
Imagine that you have a medium-sized web application deployed on a single web server. Git has been chosen as the version control and collaboration tool. A central repository that is shared with all users has been created in the form of a bare Git tree.
Let's assume that your production server has been fully set up. When you run your Fabric deployment command, say, fab deploy, the following scripted sequence of actions take place:
- Run all tests locally.
- Commit all local changes to Git.
- Push to a remote central Git repository.
- Resolve erge conicts, if any.
- ollect the static files , iages.
- opy the static files to the static file server.
- At remote host, pull changes from a central Git repository.
- At remote host, run (database) migrations.
- At remote host, touch app.wsgi to restart WSGI server.
The entire process is automatic and should be completed in a few seconds. By default, if any step fails, then the deployment gets aborted. Though not explicitly mentioned, there would be checks to ensure that the process is idempotent.
Note that Fabric is not yet compatible with Python 3, though the developers are in the process of porting it. In the meantime, you can run Fabric in a Python 2.x virtual environment or check out similar tools, such as PyInvoke.
Managing multiple servers in different states can be hard with Fabric. onfiguration management tools such as Chef, Puppet, or Ansible try to bring a server to a certain desired state.
nlike Fabric, which requires the deployent process to be specified in an iperative anner, these configurationanageent tools are declarative. You just need to define the final state you want the server to be in, and it will figure out how to get there.
For example, if you want to ensure that the Nginx service is running at startup on all your web servers, then you need to define a server state having the ginx service both running and starting on boot. On the other hand, with Fabric, you need to specify the exact steps to install and configure ginx to reach such a state.
ne of the ost iportant advantages of configurationanageent tools is that they are idepotent by default. Your servers can go fro an unknown state to a known state, resulting in easier server configuration anageent and reliable deployent. ong configurationanageent tools, hef and Puppet enjoy wide popularity since they were one of the earliest tools in this category. However, their roots in Ruby can make them look a bit unfamiliar to the Python programmer. For such folks, we have Salt and Ansible as excellent alternatives.
onfigurationanageent tools have a considerable learning curve copared to simpler tools, such as Fabric. However, they are essential tools for creating reliable production environments and are certainly worth learning.
Even a medium-sized website can be extremely complex. Django might be one of the hundreds of applications and services running and interacting with each other. In the same way that the heart beat and other vital signs can be constantly monitored to assess the health of the human body, so are various metrics collected, analyzed, and presented in most production systems.
While logging keeps track of various events, such as arrival of a web request or an exception, monitoring usually refers to collecting key information periodically, such as memory utilization or network latency. However, differences get blurred at application level, such as, while monitoring database query performance, which might very well be collected from logs.
Monitoring also helps with the early detection of problems. Unusual patterns, such as spikes or a gradually increasing load, can be signs of bigger underlying problems, such as a memory leak. A good monitoring system can alert site owners of problems before they happen. Monitoring tools usually need a backend service (sometimes called agents) to collect the statistics, and a frontend service to display dashboards or generate reports. Popular data collection backends include StatsD and Monit. This data can be passed to frontend tools, such as Graphite.
There are several hosted monitoring tools, such as New Relic and Status.io, which are easier to set up and use.
Measuring performance is another important role of monitoring. As we will soon see, any proposed optimization must be carefully measured and monitored before getting implemented.
Performance is a feature. Studies show how slow sites have an adverse effect on users, and therefore, revenue. For instance, tests at Amazon in 2007 revealed that for every 100 ms increase in load time of amazon.com, the sales decreased by 1 percent.
Reassuringly, several high-performance web applications such as Disqus and Instagram have been built on Django. At Disqus, in 2013, they could handle 1.5 million concurrently connected users, 45,000 new connections per second, 165,000 messages/second, with less than 0.2 seconds latency end-to-end.
The key to iproving perforance is finding where the bottlenecks are. Rather than relying on guesswork, it is always recoended that you easure and profile your application to identify these performance bottlenecks. As Lord Kelvin would say:
If you can't measure it, you can't improve it.
In most web applications, the bottlenecks are likely to be at the browser or the database end rather than within Django. However, to the user, the entire application needs to be responsive.
Let's take a look at some of the ways to improve the performance of a Django application. Due to widely differing techniques, the tips are split into two parts: frontend and backend.
Django programmers might quickly overlook frontend performance because it deals with understanding how the client-side, usually a browser, works. However, to quote Steve Souders' study of Alexa-ranked top 10 websites:
80-90% of the end-user response time is spent on the frontend. Start there.
A good starting point for frontend optimization would be to check your site with oogle Page peed or Yahoo Ylow coonly used as browser plugins. These tools will rate your site and recommend various best practices, such as minimizing the number of HTTP requests or gzipping the content.
Even then, Django can help you improve frontend performance in a number of ways:
• Cache infinitely with CachedStaticFilesStorage: The fastest way to load static assets is to leverage the browser cache. By setting a long caching time, you can avoid re-downloading the same asset again and again. However, the challenge is to know when not to use the cache when the content changes. CachedStaticFilesStorage solves this elegantly by appending the asset's MD5 hash to its filename. This way, you can extend the TTL of the cache for these files infinitely. To use this, set the STATICFILES_STORAGE to CachedStaticFilesStorage or, if you have a custom storage, inherit from CachedFilesMixin. Also, it is best to configure your caches to use the local memory cache backend to perform the static filename to its hashed name lookup. • Use a static asset manager: An asset manager can preprocess your static assets to minify, compress, or concatenate them, thereby reducing their size and minimizing requests. It can also preprocess them enabling you to write them in other languages, such as CoffeeScript and Syntactically awesome stylesheets (Sass). There are several Django packages that offer static asset management such as django-pipeline or webassets.
The scope of backend performance improvements covers your entire server-side web stack, including database queries, template rendering, caching, and background jobs. You will want to extract the highest perforance fro the, since it is entirely within your control.
For quick and easy profiling needs, django-debug-toolbar is quite handy. We can also use Python profiling tools, such as the hotshot module for detailed analysis. In Django, you can use one of the several profiling iddleware snippets to display the output of hotshot in the browser.
recent liveprofiling solution is django-silk. It stores all the requests and responses in the configured database, allowing aggregated analysis over an entire user session, say, to find the worstperforing views. It can also profile any piece of Python code by adding a decorator.
As before, we will take a look at some of the ways to improve backend performance. However, considering they are vast topics in themselves, they have been grouped into sections. Many of these have already been covered in the previous chapters but have been summarized here for easy reference.
As the documentation suggests, you should enable the cached template loader in production. This avoids the overhead of reparsing and recompiling the templates each tie it needs to be rendered. The cached teplate is copiled the first tie it is needed and then stored in memory. Subsequent requests for the same template are served from memory.
If you find that another teplating language such as inja renders your page significantly faster, then it is quite easy to replace the builtin Django teplate language. There are several libraries that can integrate Django and Jinja2, such as django-jinja. Django 1.8 is expected to support multiple templating engines out of the box.
Sometimes, the Django RM can generate inefficient L code. There are several optimization patterns to improve this:
• Reduce database hits with select_related: If you are using a OneToOneField or a Foreign Key relationship, in forward direction, for a large number of objects, then select_related() can perform a SQL join and reduce the number of database hits. • Reduce database hits with prefetch_related: For accessing a ManyToManyField method or, a Foreign Key relation, in reverse direction, or a Foreign Key relation in a large number of objects, consider using prefetch_related to reduce the number of database hits. • Fetch only needed fields with values or values_list You can save tie and memory usage by limiting queries to return only the needed fields and skip model instantiation using values() or values_list(). • Denormalize models: Selective denormalization improves performance by reducing joins at the cost of data consistency. It can also be used for precomputing values, such as the sum of fields or the active status report into an extra column. Compared to using annotated values in queries, denormalized fields are often simpler and faster. • Add an Index: If a non-primary key gets searched a lot in your queries, consider setting that field's db_index to True in your model definition. • Create, update, and delete multiple rows at once: Multiple objects can be operated upon in a single database query with the bulk_create(), update(), and delete() methods. However, they come with several important caveats such as skipping the save() method on that model. So, read the documentation carefully before using them.
As a last resort, you can always finetune the raw L stateents using proven database performance expertise. However, maintaining the SQL code can be painful over time.
Any computation that takes time can take advantage of caching and return precomputed results faster. However, the problem is stale data or, often, quoted as one of the hardest things in computer science, cache invalidation. This is coonly spotted when, despite refreshing the page, a YouTube video's view count doesn't change.
Django has a exible cache system that allows you to cache anything from a template fragment to an entire site. It allows a variety of pluggable backends such as filebased or databased backed storage.
Most production systems use a memory-based caching system such as Redis or Memcached. This is purely because volatile memory is many orders of magnitude faster than disk-based storage.
Such cache stores are ideal for storing frequently used but ephemeral data, like user sessions.
By default, Django stores its user session in the database. This usually gets retrieved for every request. To improve performance, the session data can be stored in memory by changing the SESSION_ENGINE setting. For instance, add the following in settings.py to store the session data in your cache:
SESSION_ENGINE = "django.contrib.sessions.backends.cache"
Since some cache storages can evict stale data leading to the loss of session data, it is preferable to use Redis or Memcached as the session store, with memory limits high enough to support the maximum number of active user sessions.
For basic caching strategies, it might be easier to use a caching framework. Two popular ones are django-cache-machine and django-cachalot. They can handle common scenarios, such as automatically caching results of queries to avoid database hits every time you perform a read.
The simplest of these is Django-cachalot, a successor of Johnny Cache. It requires very little configuration. It is ideal for sites that have multiple reads and infrequent writes (that is, the vast majority of applications), it caches all Django ORM read queries in a consistent manner.
nce your site starts getting heavy traffic, you will need to start exploring several caching strategies throughout your stack. Using Varnish, a caching server that sits between your users and Django, many of your requests might not even hit the Django server.
Varnish can make pages load extremely fast (sometimes, hundreds of times faster than normal). However, if used improperly, it might serve static pages to your users. arnish can be easily configured to recognie dynaic pages or dynaic parts of a page such as a shopping cart.
Russian doll caching, popular in the Rails community, is an interesting template cache-invalidation pattern. Imagine a user's timeline page with a series of posts each containing a nested list of comments. In fact, the entire page can be considered as several nested lists of content. At each level, the rendered template fragment gets cached.
So, if a new comment gets added to a post, only the associated post and timeline caches get invalidated. otice that we first invalidate the cache content directly outside the changed content and move progressively until at the outermost content. The dependencies between models need to be tracked for this pattern to work.
Another common caching pattern is to cache forever. Even after the content changes, the user might get served stale data from the cache. However, an asynchronous job, such as, a elery job, also gets triggered to update the cache. You can also periodically warm the cache at a certain interval to refresh the content. ssentially, a successful caching strategy identifies the static and dynaic parts of a site. For any sites, the dynaic parts are the userspecific data when you are logged in. If this is separated from the generally available public content, then implementing caching becomes easier.
Don't treat caching as integral to the working of your site. The site must fall back to a slower but working state even if the caching system breaks down.
Cranos It was six in the morning and the S.H.I.M. building was surrounded by a grey fog. Somewhere inside, a small conference room had been designated the "War Room." For the last three hours, the SuperBook team had been holed up here diligently executing their pre-go-live plan.
More than 30 users had logged on the IRC chat room #superbookgolive from various parts of the world. The chat log was projected on a giant whiteboard. When the last item was struck off, Evan glanced at Steve. Then, he pressed a key triggering the deployment process.
The room fell silent as the script output kept scrolling off the wall. One error, Steve thought—just one error can potentially set them back by hours. Several seconds later, the command prompt reappeared. It was live! The team erupted in joy. Leaping from their chairs they gave highfives to each other. oe were crying tears of happiness. fter weeks of uncertainty and hard work, it all seemed surreal.
However, the celebrations were short-lived. A loud explosion from above shook the entire building. Steve knew the second breach had begun. He shouted to Evan, "Don't turn on the beacon until you get my message," and sprinted out of the room.
As Steve hurried up the stairway to the rooftop, he heard the sound of footsteps above hi. It was Mada . he opened the door and ung herself in. He could hear her screaming "No!" and a deafening blast shortly after that. By the time he reached the rooftop, he saw Madam O sitting with her back against the wall. She clutched her left arm and was wincing in pain. Steve slowly peered around the wall. At a distance, a tall bald man seemed to be working on something with the help of two robots.
"He looks like...." Steve broke off, unsure of himself. Yes, it is Hart. Rather I should say he is ranos now. What? Yes, a split personality. onster that laid hidden in Hart's ind for years. I tried to help him control it. Many years back, I thought I had stopped it from ever coming back. However, all this stress took a toll on him. Poor thing, if only I could get near him."
Poor thing indeed—he nearly tried to kill her. Steve took out his mobile and sent out a message to turn on the beacon. He had to improvise. With his hands high in the air and fingers crossed, he stepped out. The two robots immediately aimed directly at him. Cranos motioned them to stop. Well, who do we have here? Mr. uperBook hiself. Did I crash into your launch party, teve?
"It was our launch, Hart."
"Don't call me that," growled Cranos. "That guy was a fool. He wrote the Sentinel code but he never understood its potential. I mean, just look at what Sentinels can do—unravel every cryptographic algorithm known to an. What happens when it enters an intergalactic network?
The hint was not lost on teve. uperBook? he asked slowly.
Cranos let out a malicious grin. Behind him, the robots were busy wiring into S.H.I.M.'s core network. "While your SuperBook users will be busy playing SuperVille, the tentacles of Sentinel will spread into new unsuspecting worlds. Critical systems of every intelligent species will be sabotaged. The Supers will have to bow to a new intergalactic supervillain—Cranos." As Cranos was delivering this extended monologue, Steve noticed a movement in the corner of his eyes. It was Acorn, the super-intelligent squirrel, scurrying along the right edge of the rooftop. He also spotted Hexa hovering strategically on the other side. He nodded at them.
Hexa levitated a garbage bin and ung it towards the robots. corn distracted them with high-pitched whistles. "Kill them all!" Cranos said irritably. s he turned to watch his intruders, teve fished out his phone, dialed into FaceTime and held it towards Cranos.
"Say hello to your old friend, Cranos," said Steve. Cranos turned to face the phone and the screen revealed Madam O's face. With a smile, she muttered under her breath, "Taradiddle Bumfuzzle!" The expression on Cranos' face changed instantly. The seething anger disappeared. He now looked like a man they had once known.
>What happened? asked Hart confused.
"We thought we had lost you," said Madam O over the phone. "I had to use hypnotic trigger words to bring you back." Hart took a moment to survey the scene around him. Then, he slowly smiled and nodded at her.
One Year Later
Who would have guessed Acorn would turn into an intergalactic singing sensation in less than a year? His latest albu corn Unplugged" debuted at the top of Billboard's Top 20 chart. He had thrown a grand party in his new white mansion overlooking a lake. The guest list included superheroes, pop stars, actors, and celebrities of all sorts.
"So, there was a singer in you after all," said Captain Obvious holding a martini.
"I guess there was," replied Acorn. He looked dazzling in a golden tuxedo with all sorts of bling-bling.
>teve appeared with Hexa in towwho looked ravishing in a owing silver gown. "Hey Steve, Hexa.... It has been a while. Is SuperBook still keeping you late at work, teve?
"Not so much these days. Knock on wood," replied Hexa with a smile.
>h, you guys did a fantastic job. I owe a lot to uperBook. My first single, 'Warning: Contains Nuts', was a huge hit in the Tucana galaxy. They watched the video on SuperBook more than a billion times!"
"I am sure every other superhero has a good thing to say about SuperBook too. Take Blitz. His AskMeAnything interview won back the hearts of his fans. They were thinking that he was on experimental drugs all this time. It was only when he revealed that his father was Hurricane that his powers made sense." By the way, how is Hart doing these days?
"Much better," said Steve. "He got professional help. The sentinels were handed back to S.H.I.M. They are developing a new quantum cryptographic algorithm that will be much more secure."
"So, I guess we are safe until the next supervillain shows up," said Captain Obvious hesitantly.
"Hey, at least the beacon works," said Steve, and the crowd burst into laughter.
In this final chapter, we looked at various approaches to ake your Django application stable, reliable, and fast. In other words, to make it production-ready. While system administration might be an entire discipline in itself, a fair knowledge of the web stack is essential. We explored several hosting options, including PaaS and VPS.
We also looked at several automated deployment tools and a typical deployment scenario. Finally, we covered several techniques to improve frontend and backend performance.
The ost iportant ilestone of a website is finishing and taking it to production. However, it is by no means the end of your development journey. There will be new features, alterations, and rewrites.
very tie you revisit the code, use the opportunity to take a step back and find a cleaner design, identify a hidden pattern, or think of a better implementation. Other developers, or sometimes your future self, will thank you for it.