Stateless vs Stateful Containers: What's the Difference and Why Does It Matter?
The stateless vs. stateful container debate may seem endless and more than a little bewildering to anyone who has not dealt with stateful container deployment at close range. What are the definitions of statelessness and statefulness? Why would one type of container be better (or not better) than the other? In this post, we'll take a look at those questions, and try to get a clearer picture of what's at stake with stateless and stateful containers.
Stateless and Stateful Applications
What makes an application stateless or stateful?
A stateless application is one that neither saves information about its state when it runs nor reads any such information the next time it runs.
"State" in this case can refer to any changeable condition, including the results of internal operations, interactions with other applications or services, user-set preferences, environment variables, the contents of memory or temporary storage, or files opened, read from, or written to. If the calculator app on your phone always starts with zero in the display, nothing in the registers, and no history of past calculations, it's stateless.
A stateful application, on the other hand, can remember at least some things about its state each time that it runs. The actual state data that it stores may depend on the application and on the conditions under which it operates. Most applications that we encounter on a day-to-day basis are at least somewhat stateful. They store our preferences, keep track of window size and location, and remember what files they have opened recently.
Storage Required!
A key point to keep in mind is that statefulness requires persistent storage. An application can only be stateful if it has somewhere to store information about its state, and if that information will be available for it to read later.
For an application running on a typical desktop system, that generally isn't a problem. It can usually store state data in a temp file, a database, or the system registry. Network- and Internet-based applications may be able to store state data on individual users' systems (for example, in the form of cookies), or on the server. As long as there is some kind of persistent storage, it is possible for a stateful application to save state data.
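To make the distinction concrete, here is a minimal Python sketch of the calculator example. The names (stateless_add, StatefulCalculator, calc_state.json) are hypothetical and chosen only for illustration. The stateless function depends on nothing but its arguments, while the stateful class reloads its history from a file each time it starts, which only works because that file persists between runs.

```python
import json
import tempfile
from pathlib import Path


def stateless_add(a, b):
    """Stateless: the result depends only on the arguments passed in."""
    return a + b


class StatefulCalculator:
    """Stateful: keeps a history of results in a file between runs.

    The JSON file layout is an assumption made for this sketch.
    """

    def __init__(self, state_file):
        self.state_file = Path(state_file)
        # Reload whatever a previous run left behind, if anything.
        if self.state_file.exists():
            self.history = json.loads(self.state_file.read_text())
        else:
            self.history = []

    def add(self, a, b):
        result = a + b
        self.history.append(result)
        # Persist the state so that the next run can read it.
        self.state_file.write_text(json.dumps(self.history))
        return result


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        state_file = Path(tmp) / "calc_state.json"
        first_run = StatefulCalculator(state_file)
        first_run.add(2, 3)
        # A new instance stands in for the application starting again.
        second_run = StatefulCalculator(state_file)
        print(second_run.history)  # includes the result from the first run
```

If the file is deleted between runs (as it would be in a container with no persistent storage), the second instance starts with an empty history, and the application is effectively stateless again.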
Can Containers be Stateful?
But what about containers? The ideal container, after all, pops up out of nowhere, does its job, and disappears. If it performs any operations involving data coming from/going to somewhere else, it is given the data by another process or service, and in turn hands the result off to some other process. Where could it store any information about its state?
As originally conceived, containers couldn't save state information. There was no provision for persistent storage, and without it, statefulness wasn't possible. Containers were supposed to perform only operations that did not require statefulness, leaving such things as persistent storage and saved state data to other parts of the system. Advocates of purely stateless containers maintain that this is still the best and cleanest approach, and that attempts to bring statefulness to container deployment are merely evidence of obsolete ways of thinking.
Stateless and Simple
Perhaps the strongest argument in favor of stateless container deployment is that it is simple. If all containers follow the stateless ideal (popping in and out of existence as needed for a particular task, and doing their job without leaving a trace), the only persistent state data will be that which is stored and used by the host operating system. Developers don't need to worry about where to save container state data, or how to make containers interact with persistent storage.
Reasons for Being Stateful
As containers have come into wider use, however (particularly in enterprise environments), the limits to pure container statelessness have become all too apparent. Many of the applications now being deployed in containers were not written from scratch with containerization in mind; they are existing applications (often in the middle or early stages of their lifecycle, rather than being "legacy") which may have been refactored for containers, or simply containerized wholesale. These applications are typically stateful, and they are likely to rely heavily on state data.
Making such an application stateless may require a complete redesign on the level of fundamental architecture, even beyond that required for refactoring. And depending on the nature and purpose of the application, even designed-from-scratch container-based software may lend itself more naturally to statefulness than statelessness. If making a containerized application stateless requires awkward workarounds, then it is hard to argue that statelessness makes it more "pure."
Bringing Statefulness to Containers
How can a container be stateful, if it doesn't have persistent storage? There are now several well-established vendors that do provide persistent storage for containers, including databases for storing container state information.
Vendors and open-source projects such as Docker, Kubernetes, Flocker, and Mesosphere provide ways of managing both stateless and stateful containers using persistently stored data. Most of the key vendors in the container industry appear to see statefulness as a major part of the container landscape, and one that is here to stay, rather than being a vestige of pre-container development style. For most developers, the question is not whether to use stateful containers, but when they should be used.
Stateless or Stateful?
When should you use stateful containers, and when are stateless containers better? Not surprisingly, the answer depends to a large extent on the kind of software that you are deploying, and what it needs to do. Does it need to save information about its state, or could it achieve the same results if it were stateless?
For applications which were designed (or have been refactored) for containers, you can usually ask this question at the microservice level. It may turn out that only a handful of containers actually need to store state data, allowing the rest to be run statelessly.
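One way to picture this split is the following Python sketch, in which a handler keeps nothing between calls and all state lives in a single store. Everything here (StateStore, InMemoryStore, record_visit) is a hypothetical name invented for illustration, and the in-memory dictionary merely stands in for whatever database or volume a real stateful service would use.

```python
from typing import Protocol


class StateStore(Protocol):
    """Anything that can persist a counter under a key (hypothetical interface)."""

    def get(self, key: str) -> int: ...

    def put(self, key: str, value: int) -> None: ...


class InMemoryStore:
    """Stand-in for the one stateful service in the system.

    A real deployment would back this with persistent storage;
    a dict is used here only to keep the sketch self-contained.
    """

    def __init__(self) -> None:
        self._data = {}

    def get(self, key: str) -> int:
        return self._data.get(key, 0)

    def put(self, key: str, value: int) -> None:
        self._data[key] = value


def record_visit(user: str, store: StateStore) -> int:
    """Stateless handler: it holds no data of its own between calls."""
    count = store.get(user) + 1
    store.put(user, count)
    return count


if __name__ == "__main__":
    store = InMemoryStore()
    record_visit("ana", store)
    print(record_visit("ana", store))
```

Because record_visit holds no data itself, any number of identical containers could run it interchangeably, and only the container hosting the store would need persistent storage.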
More Work, But Less Awkward
As we said earlier, the advantage of statelessness is that it is simple. Statefulness, on the other hand, does require at least some overhead: persistent storage and, in most cases, a state management system. This means more software to install, manage, and configure, and more programming time to connect to it via API.
If, however, you find yourself faced with a choice between this kind of overhead or a series of clumsy workarounds in order to remain stateless, you are probably better off accepting the overhead and including stateful containers.
Different Flavours of Statefulness
It is also important to be aware of the different kinds of statefulness, and the ways that they can be handled. Session-based state data, by its nature, needs to be maintained and read at the container level. Environment-based state data (such as IP address, database access, and cluster configuration) can typically be handled at the host level. It may be necessary to store other kinds of state data using an independent file system which can remain available if the host shuts down.
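As a rough illustration of the first two kinds, the Python sketch below reads environment-based state from environment variables supplied from outside the container, and keeps session-based state in an object that the container itself owns. The variable names (APP_DATABASE_URL, APP_CLUSTER_NAME), their defaults, and the classes are all assumptions made for this example, not a convention of any particular tool.

```python
import os
from dataclasses import dataclass, field
from typing import Mapping


@dataclass(frozen=True)
class EnvironmentState:
    """Environment-based state: supplied from outside, read once at startup."""

    database_url: str
    cluster_name: str


def load_environment_state(env: Mapping[str, str] = os.environ) -> EnvironmentState:
    # Variable names and fallback values are hypothetical.
    return EnvironmentState(
        database_url=env.get("APP_DATABASE_URL", "sqlite:///:memory:"),
        cluster_name=env.get("APP_CLUSTER_NAME", "local"),
    )


@dataclass
class SessionState:
    """Session-based state: created and updated inside the container."""

    session_id: str
    cart: list = field(default_factory=list)


if __name__ == "__main__":
    env_state = load_environment_state()
    session = SessionState(session_id="example-session")
    session.cart.append("book")
    print(env_state.cluster_name, session.cart)
```

The environment state is immutable here because the container only consumes it, whereas the session state changes as the container does its work and is lost with the container unless it is written somewhere persistent.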
Stateless vs. stateful? There's no one right answer, except perhaps this: It depends on what's best for your application, and whatever choice you make, there is now a range of tools and services to make it work.