Digital Anarchy: Abolishing State

Think about the high jump in track-and-field competitions. You run up, almost parallel to the bar, and jump, pushing off your outside leg. In the air, you turn so that you are traveling headfirst toward the bar but facing away from it. You get your head over the bar, then arch your back as your head starts to fall, keeping your hips traveling upward. As your hips fall, your knees stay over the bar, and as your knees start to drop, you flick your feet over it.

This method is called the Fosbury Flop, and it took the track-and-field world by storm after Dick Fosbury used that method to win gold at the 1968 Olympic Games.

Up until that point, high jumpers scissored their legs over the bar or used other feet-first techniques. These predominated because the high jump originated before quality deep landing pads existed, so techniques were designed to let the jumper land on their feet. Once better landing surfaces were available, jumpers could use different, and revolutionary, techniques.

As with anything revolutionary, many a pearl was clutched by high-jump traditionalists, and I’m sure those deep cushions came in handy for the “but that’s not the way we’ve always done it” crowd to swoon onto.

In many ways, technologists have been jumping over ever-higher bars of performance and user experience as we traverse technological breakthrough after technological breakthrough. Going back to the Dawn of the Computer Age (DotCA), computers were very large, expensive, delicate and hard to use. As such, they were a shared resource and were kept locked away in dedicated rooms that required immense amounts of space.

As networking capabilities developed, users could be in a different location from the more powerful computers (henceforth known as servers) and interface with them through terminal clients, which were very light-duty computers. This client-server pattern became the standard and remains the predominant model today.

Even our fancy multitiered, load-balanced, Kubernetes-based cloud native application architecture is still based on this premise: that large, powerful groups of computers host the code and data and run the complex digital alchemy. All for what seems like just pictures of our cats.

However, something has happened since the DotCA to call the necessity of this basic architecture into question: Computers got cheaper, more powerful and more portable. Adobe published a comparison of a 1980s Cray supercomputer to an iPhone 12, illustrating how much power we now hold in the palms of our hands. Home computers, laptops, phones, smart devices, cameras, cars and even home appliances all contain powerful computers with gigabytes, or even terabytes, of storage, and they are connected to the internet.

The idea that compute needs to live in data centers is long gone, and we are seeing that in application development for these platforms, where the computing power of these clients provides a soft-landing place for code. The adoption of edge computing and fast cellular networks decentralizes compute even more, and that trend doesn’t look like it’s going to change.

Where this decentralization runs off the rails and where we are abruptly slammed back to 1965 is when we deal with state and other data storage.

Keeping state in applications requires a data store of some sort. Since the DotCA, state for networked apps has primarily been kept on servers. That said, once client systems could have hard drives or removable media, you could keep state for some applications on the client.

Many of us who are longer in the tooth remember needing a specific floppy disk with our data that we could carry to various computers in order to run our applications. It was somewhat convenient: you only needed access to the application itself, and you kept your own data with you.

In modern times, client computers don’t have magnetic or optical removable storage, and for pretty much every practical purpose, they don’t need them. You can keep documents, photos, videos, cat memes, PDFs of your recipes and your risky selfies in cloud-based storage. They are accessible anywhere and, in theory, only accessible by the owner of those artifacts.

It’s very common to consolidate this storage into a single provider for ease of use and interoperability between devices within an ecosystem. You can access your files using biometric authentication from your specific authorized devices, and often with multifactor authentication. For example, while it’s not impossible to hack an iPhone to get someone’s files in iCloud, it’s not trivial, either.

For modern web and cloud native applications, this is far from the case. Each application has its own data storage to keep its own state, in the form of flat files in object stores, relational and nonrelational databases, caches and so on. While the backend services are standard (MySQL, PostgreSQL, Redis, Elasticsearch, Oracle, etc.), the structures and schemas for that data are usually unique to whatever application they serve.

Another thing to consider is how much of that data is redundant between the various applications. How many websites and applications have your email address, your name, your birthday, credit card information, your address and so on? Every application you fill that data out for is yet another copy of the same data in a different location.

This redundancy comes with increased risk: It's yet another opportunity for that data to be compromised. Currently, the grand total of digital storage for all humankind is somewhere in the 200 exabyte range, and while most of it is probably just memes stored in S3, think about how much of that data is redundant.

Another thing to consider is what is done with that data once it is written. Does it need to live on in perpetuity for compliance? Is it ever deleted? What happens to it when one company is acquired by another? Is it used to train artificial intelligence models, as you agreed to in all those TOS/EULAs that you didn't read? GitHub Copilot and OpenAI are already in court over emitting copyrighted code, and AI-powered illustration tools are being called out by artists for similar concerns. How much control do we have over the data we provide to these services so they can keep their own state?

So, with client-side compute and storage capabilities, and with edge computing decentralizing data, serving as our deep cushions, what if we did something revolutionary and abandoned the notion that applications need to keep server-side state?

Let’s imagine an architecture where a person has their own state store. It can live on a local device, in a single cloud-based store or even on an intentionally ephemeral storage platform. Applications are built to run either natively on a device or in the browser, and they access client-side data stores to preserve state. These applications use a universal architecture and schema, so they can read the existing data for state and write only what they need for their specific purposes.
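To make that a little more concrete, here is a minimal sketch in TypeScript of what the contract for such a client-owned state store might look like. Everything in it (the StateStore interface, the "profile" namespace, the field names, the example app ID) is hypothetical and purely illustrative: applications read shared, commonly schemed data and write only their own namespaced records.

```typescript
// Hypothetical contract for a client-owned state store.
// "Universal" records (e.g., a profile) follow a shared schema that any
// permitted app can read; app-specific records are namespaced per app.

interface UniversalProfile {
  name: string;
  email: string;
  birthday: string; // ISO 8601 date
  address?: string;
}

interface StateStore {
  // Read a shared record the client has agreed to expose to this app.
  readShared(namespace: "profile"): Promise<UniversalProfile | null>;

  // Read and write records that belong only to this application.
  readOwn<T>(appId: string, key: string): Promise<T | null>;
  writeOwn<T>(appId: string, key: string, value: T): Promise<void>;
}

// Example: an app bootstraps its state from the client's store instead of
// asking the user to re-enter data it already keeps elsewhere.
async function bootstrap(store: StateStore): Promise<void> {
  const profile = await store.readShared("profile");
  const prefs = await store.readOwn<{ theme: string }>("com.example.todo", "prefs");
  console.log(`Hello ${profile?.name ?? "there"}, theme: ${prefs?.theme ?? "default"}`);
}
```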

These client data stores would allow apps to read only the data the client permits. The stores could be encrypted at the document, table or key level so that no two applications could read each other’s data, and the client could revoke encryption keys at will to cut off an application’s access entirely.
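As a rough illustration of that per-application encryption idea, here is a sketch using the browser's standard Web Crypto API (crypto.subtle) with one AES-GCM key per application. The in-memory key registry and the grant/revoke functions are assumptions made up for this example; real key management, escrow and recovery would be considerably more involved.

```typescript
// Sketch: each application gets its own AES-GCM key. The client-side store
// only ever holds ciphertext, so revoking an app's key makes its data
// unreadable to that app from then on.

const appKeys = new Map<string, CryptoKey>(); // hypothetical in-memory key registry

async function grantKey(appId: string): Promise<void> {
  const key = await crypto.subtle.generateKey(
    { name: "AES-GCM", length: 256 },
    false, // non-extractable: the app can use the key, not export it
    ["encrypt", "decrypt"]
  );
  appKeys.set(appId, key);
}

function revokeKey(appId: string): void {
  appKeys.delete(appId); // the app can no longer decrypt anything
}

async function writeRecord(appId: string, value: string) {
  const key = appKeys.get(appId);
  if (!key) throw new Error(`no key granted to ${appId}`);
  const iv = crypto.getRandomValues(new Uint8Array(12));
  const ciphertext = await crypto.subtle.encrypt(
    { name: "AES-GCM", iv },
    key,
    new TextEncoder().encode(value)
  );
  return { iv, ciphertext }; // this is what actually lands in the store
}

async function readRecord(
  appId: string,
  record: { iv: Uint8Array; ciphertext: ArrayBuffer }
): Promise<string> {
  const key = appKeys.get(appId);
  if (!key) throw new Error(`key revoked or never granted for ${appId}`);
  const plaintext = await crypto.subtle.decrypt(
    { name: "AES-GCM", iv: record.iv },
    key,
    record.ciphertext
  );
  return new TextDecoder().decode(plaintext);
}
```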

Such an architecture would revolutionize application development and deployment practices, since only static code would need to be served and any dynamic execution would happen on the client. That aspect isn’t new, but these applications wouldn’t phone home to their own servers and object stores to read and record transactions; they would access the client’s data stores.
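A deployment in this model is mostly static assets plus client-side logic along these lines. The sketch below is hypothetical and assumes a browser environment, using localStorage purely as a stand-in for the richer client-owned store described above: instead of POSTing a transaction to its own backend, the app appends it to a log the user controls.

```typescript
// Sketch: the app is served as static files; state changes stay on the client.
// localStorage stands in here for a richer, client-owned state store.

interface Transaction {
  id: string;
  description: string;
  timestamp: string;
}

function recordTransaction(tx: Transaction): void {
  // Traditional cloud native version (not used here):
  //   fetch("https://api.example.com/transactions", { method: "POST", body: JSON.stringify(tx) });

  // Client-state version: append to a log the user owns and controls.
  const log: Transaction[] = JSON.parse(localStorage.getItem("tx-log") ?? "[]");
  log.push(tx);
  localStorage.setItem("tx-log", JSON.stringify(log));
}

recordTransaction({
  id: crypto.randomUUID(),
  description: "example purchase",
  timestamp: new Date().toISOString(),
});
```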

Obviously there are some use cases where this may not be the best idea (financial institutions and business intelligence tools come to mind), and in some cases the way you would design these applications would be dramatically different. But think of the possibilities: Clients have control over their data, in a single place. Access to that data can use biometric multifactor authentication. You won’t have multiple copies of the same information living in various big noisy rooms around the globe unless you choose to.

You don’t need large, powerful servers doing so much compute all the time, because it’s done on the client side, so you have smaller, more efficient data centers. Just imagine the difference in power consumption if you’re not burning old plants to power computers holding the 28,175th copy of your answer to the question, “What street did you grow up on?” Infrastructure deployment looks dramatically different since state storage is not a driving concern. Even more server-side compute load can be borne at the edge and, if you’re good, can even become fairly compute-architecture agnostic.

Over the coming weeks and months, I’ll dig more into some of the details and aspects of what this future would look like, but I would like to challenge you, dear reader, to do the same. We have developed the deep cushions, and now it’s time to do something revolutionary in the way we get over higher bars for performance, security, data protection and sustainability in 2023 and beyond.