Reactive Systems: Designing Systems for Multi-Dimensional Change
How to design socio-technical systems that adapt to changes in multiple dimensions, like functionality or technology.
I follow the literature and discussions about reactive systems and appreciate what's happening in this area. Building distributed, data-intensive systems is very complex, and publications like the Reactive Manifesto and the Reactive Principles provide fantastic resources to help you manage this complexity. However, I think most discussions in the reactive systems community focus on reliability and scalability, i.e., building systems that are reactive to changes in load or infrastructure failure. This is crucial to building reactive systems, but I believe what's missing in the conversation is the inclusion of a broader set of change drivers.
This article aims to define a reactive system as a socio-technical system (involving both the technical system and the individuals and teams maintaining it) that can sense changes along multiple dimensions and transform accordingly, either immediately or over time. Not all dimensions are likely to be relevant for every system, so it is the responsibility of developers and architects to identify the most important ones and design the system accordingly.
In this article, I will discuss several dimensions that I have been brainstorming over the last few days. I will also lay out broadly what properties a system should have to be adaptive to each dimension.
Functionality
Functionality is the dimension where change occurs most frequently. That’s because the main purpose of software systems is to deliver value to users by providing functionality. Demands on functionality change often, typically driven by new usage ideas and external factors such as regulations. Additionally, users start using the software differently, often subtly, as the context in which it is embedded changes.
Developers should be able to implement changes to functionality without much ceremony. Once central roles, groups, or committees need to acknowledge work, break it down for teams, and coordinate implementation across many parts of the system, the system loses certain reactive properties. Instead of a single cell in the system reacting to a particular change, adaptations send a shock across the entire system. This leads to complex agreements and time dependencies. Teams then often must build solutions together and deploy all changes simultaneously.
Use Domain-driven Design, especially bounded contexts, to decouple modules. This will assign each module a clear boundary of functionality. In addition, as I have written in “A Case Against ‘One Model to Rule Them All’”, you should consider the responsibilities of development teams as well:
One very important thing is that a bounded context must be small enough to be clearly assignable to a development team. Don't spread the responsibility of bounded contexts over multiple teams; this only leads to unclear responsibility and makes it harder to establish a ubiquitous language within the bounded context because more people are involved. If you have a bounded context too large for a single team, consider splitting it.
Check out this post for more information on bounded contexts. Extending that idea to include the autonomy of teams, Team Topologies introduces the concept of stream-aligned teams:
[…] the team is empowered to build and deliver customer or user value as quickly, safely, and independently as possible, without requiring hand-offs to other teams to perform parts of the work.
As you’ll discover in the rest of the post, I believe team autonomy serves as one of the main pillars of managing all dimensions of change.
Reactive Property
A system that is reactive in this dimension allows teams to implement most functionality as independently as possible and with little coordination. If coordination is needed, it is explicit and takes place over well-defined social and technical interfaces.
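As a minimal sketch of what an explicit technical interface between two bounded contexts could look like, assume two hypothetical contexts, ordering and shipping, owned by different teams. All names here (OrderPlaced, OrderingContext, ShippingContext) are illustrative assumptions and not taken from any particular system. The only thing the two teams have to agree on is the published event; everything else stays internal to each context.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


# Published contract (hypothetical): the only shared agreement between the teams.
@dataclass(frozen=True)
class OrderPlaced:
    order_id: str
    delivery_address: str


class OrderingContext:
    """Owned by one team; its internal model is free to change."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[OrderPlaced], None]] = []

    def subscribe(self, handler: Callable[[OrderPlaced], None]) -> None:
        # Other contexts register interest through this explicit interface.
        self._subscribers.append(handler)

    def place_order(self, order_id: str, customer_email: str, delivery_address: str) -> None:
        # customer_email is an ordering-internal detail and never leaves this context.
        event = OrderPlaced(order_id=order_id, delivery_address=delivery_address)
        for handler in self._subscribers:
            handler(event)


class ShippingContext:
    """Owned by another team; it only depends on the OrderPlaced contract."""

    def __init__(self) -> None:
        self.pending_shipments: Dict[str, str] = {}

    def on_order_placed(self, event: OrderPlaced) -> None:
        self.pending_shipments[event.order_id] = event.delivery_address


ordering = OrderingContext()
shipping = ShippingContext()
ordering.subscribe(shipping.on_order_placed)
ordering.place_order("o-1", "jane@example.com", "1 Main Street")
```

In this sketch, the ordering team can rework how it stores customers or validates orders without coordinating with the shipping team, as long as the OrderPlaced contract stays stable. In a real system the in-process callback would typically be replaced by a network call or a message broker.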
Usage and Load
Load patterns are inherently difficult to predict. Imagine your task is to predict how much hardware you need to support a given workload. You are almost guaranteed to guess too high or too low and end up over- or under-provisioning hardware. Some solutions, especially in cloud computing, counteract this problem by providing ever smaller units of scale, down to newer technologies that can scale individual functions (Function as a Service). In combination with self-service APIs, these infrastructures can be spun up and torn down at the touch of a button. This enables autoscaling, which scales the infrastructure based on demand, for example the current load the system is experiencing.
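The core decision behind autoscaling can be sketched in a few lines. The function below is a simplified illustration, not the algorithm of any specific cloud provider: the name desired_replicas, the target load per replica, and the minimum and maximum bounds are all assumptions made for this example. Real autoscalers usually add smoothing and cooldown periods on top of such a rule.

```python
import math


def desired_replicas(
    current_load: float,
    target_load_per_replica: float,
    min_replicas: int = 1,
    max_replicas: int = 20,
) -> int:
    """Return how many replicas are needed so that each one stays at or below its target load."""
    if target_load_per_replica <= 0:
        raise ValueError("target_load_per_replica must be positive")
    # Round up: a fractional replica still requires a whole instance.
    needed = math.ceil(current_load / target_load_per_replica)
    # Clamp to the configured bounds to avoid scaling to zero or without limit.
    return max(min_replicas, min(max_replicas, needed))


# Example: 450 requests per second with a target of 100 per replica needs 5 replicas.
replicas = desired_replicas(current_load=450, target_load_per_replica=100)
```

The smaller the unit that one replica represents, the closer the provisioned capacity can follow the actual load.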
Besides infrastructure, you need to consider other aspects as well to support increasing and decreasing load patterns. Applications deployed as a single process (deployment monoliths) are harder to scale horizontally due to longer start times, resource intensity, and reliance on session state.
Meanwhile, splitting applications helps them become reactive to changes in load. Two things are important here. First, you can split along functionality and arrive at smaller deployables, which then form the entire system by integrating over network communication (e.g., microservices). Second, you can partition the input range among deployables. Narrower responsibilities (e.g., handling only customers with last names beginning with A-C) reduce the load on individual nodes and make smaller scaling increments possible. Combining both strategies leads to the best outcomes in terms of elasticity, i.e., the ability to scale up and down, on demand, and in small increments.
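Partitioning the input range can be illustrated with a small routing function. This is a sketch under stated assumptions: the partition boundaries and node names below are hypothetical, and only the A-C range comes from the example above. It handles ASCII last names only; a real system would need a strategy for other initials.

```python
from bisect import bisect_right

# Hypothetical layout: each partition starts at the given initial and ends before the next one.
PARTITION_STARTS = ["A", "D", "H", "M", "S"]
PARTITION_NODES = [
    "customers-a-c",
    "customers-d-g",
    "customers-h-l",
    "customers-m-r",
    "customers-s-z",
]


def node_for(last_name: str) -> str:
    """Return the node responsible for customers with the given last name."""
    initial = last_name.strip()[:1].upper()
    if not ("A" <= initial <= "Z"):
        # Empty names and non-ASCII initials are out of scope for this sketch.
        raise ValueError(f"cannot route last name: {last_name!r}")
    # Find the last partition whose start letter is not greater than the initial.
    index = bisect_right(PARTITION_STARTS, initial) - 1
    return PARTITION_NODES[index]
```

Splitting a hot partition then only means adding one more boundary and one more node, which is what makes small scaling increments possible.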
Reactive Property
A system that is reactive in this dimension can scale up or down in small increments, without manual intervention, to support increasing and decreasing load patterns.
Technological Foundation
Innovation in cross-cutting technical foundations includes not only frameworks, programming languages, or libraries but also patterns, cloud services, and hardware specifications. There is something new or different seemingly every day. How do you cope with this, especially in large applications?
Smaller deployables help here since they are more reactive to changes in their technological foundation. Teams maintaining individual services can experiment with and adopt new concepts without necessarily affecting the entire application. While a change in programming language or underlying frameworks probably requires rework in larger parts of monolithic applications, individual services communicating over a network can completely differ in their implementation or technology. This means experimenting with technological innovations in a single, isolated service is much easier and less risky.
Reactive Property
A system that is reactive in this dimension can absorb technological innovations by experimenting with them within a subset of the system and then, if the innovations prove beneficial, gradually adopt them across the entire system.
Methodology
Similar to technological foundations, changes to methodology can be more easily adopted by autonomous cells within the system rather than by the entire system at the same time. In this context, these autonomous cells are teams.
Innovation in methodology is related to ways of working and communication. This includes changing team processes (like switching from Scrum to Kanban) or introducing new ways of working, such as conducting regular internal team reviews. Autonomous teams can decide to try out methodologies and adopt them if they prove successful.
Teams know best what works in their day-to-day communication and collaboration, so it is crucial that they be allowed to decide autonomously how to facilitate that. Organizations should not prevent variation in methodology but support cross-team communication so that teams can share their experiences and inspire ideas among other teams.
Reactive Property
A system that is reactive in this dimension allows multiple development teams to use different methodologies. This includes the team's internal workflows as well as communication principles and patterns used when interacting with other teams.
Conclusion
This list of change dimensions is not complete. There are nuances to the dimensions mentioned above, as well as completely different dimensions that I did not mention but which may play a role, often only within specific contexts. What’s important is that you are aware of change dimensions in your particular context and design a system that is reactive to those.
Further Reading
If you want to learn more about microservices (as mentioned in the section “Usage and Load”), I recommend Building Microservices by Sam Newman.