Connecting Modules Across Teams: Selecting Integration Mechanisms Based on Team Distance

Sep 23, 2024

How teams are organized and how they communicate with each other is intertwined with the structure of the software system, i.e., its modules and how they integrate. Aligning teams with value streams helps them optimize for fast flow and strengthens their sense of responsibility. As I wrote in “The Homomorphic Force: When Software and Team Structures Reinforce Each Other”:

Instead, teams should develop a product (or part of a product) for which they are fully responsible. Let them cover the entire value stream, including determining and prioritizing what features to build, develop, deploy, and even gather and reincorporate customer feedback.

Nevertheless, teams must integrate individual product parts (or value streams). They need to communicate and integrate their modules to form the product as a whole. This gets especially interesting when different teams maintain those parts. Then, the organizational distance between the teams plays a significant role in which patterns and mechanisms they should use to integrate their code. The following figure shows multiple "distances" into which team communication usually falls:

These are the layers at which communication takes place.

Communication across contexts (e.g., products, product lines, businesses): Think of communication across your context boundary, like with other companies or products.
Communication across value streams: Both teams are still within the same context or product but spread across different value streams. Think of integration across modules like checkout and refund in an online shop.
Communication within a value stream: Teams within the same value stream usually communicate often. Integration is also more common and generally tighter than across value streams.
Communication within the same team: This is regular communication between two or more members within the same team.

Semantic Versioning

Changes in interfaces always affect their consumers. It is, therefore, a good idea to use versioning, especially when introducing breaking changes. Not all changes break consumers, so you need a way to distinguish updates that introduce breaking changes from updates that do not. Semantic versioning is a way to achieve that. You may have seen version strings with three numbers separated by two dots, like 2.3.7. You increment the left number (major) when introducing incompatible API changes, the middle number (minor) when you add functionality but preserve backward compatibility, and the right number (patch) when you make backward-compatible bug fixes.
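As a minimal sketch of these rules, the following Python snippet parses a version string and bumps it depending on the kind of change. The names `Version`, `bump`, and `is_safe_upgrade` are hypothetical and only serve as illustration. Resetting the lower numbers to zero on a bump follows the semantic versioning specification; the snippet ignores pre-release and build suffixes.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Version:
    """A MAJOR.MINOR.PATCH version (hypothetical helper, no pre-release tags)."""

    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "Version":
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)

    def bump(self, change: str) -> "Version":
        # "breaking": incompatible API change, increments the left number.
        if change == "breaking":
            return Version(self.major + 1, 0, 0)
        # "feature": backward-compatible addition, increments the middle number.
        if change == "feature":
            return Version(self.major, self.minor + 1, 0)
        # "fix": backward-compatible bug fix, increments the right number.
        if change == "fix":
            return Version(self.major, self.minor, self.patch + 1)
        raise ValueError(f"unknown change type: {change}")

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


def is_safe_upgrade(current: Version, candidate: Version) -> bool:
    """True if the candidate is not older and keeps the same major number."""
    return candidate.major == current.major and candidate >= current
```

A consumer on 2.3.7 could use a check like `is_safe_upgrade` to decide whether a new release can be adopted without reading the changelog first.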

Semantic versioning is a good way to formally describe and document API changes, especially over longer organizational distances. Use it for situations where communication does not occur regularly (and should not happen regularly), like across contexts and value streams.

Consumer-driven Contract Testing

A contract test checks adherence to a contract by defining input parameters and asserting expected output parameters. Consumer-driven contract testing is a discipline in which teams that provide an interface gather contract tests from their consumers. Each consumer defines tests that describe how they are using the interface. This provides value for the team that offers the interface. After making changes, they can execute all gathered consumer contract tests and, therefore, have higher confidence that the changes do not break consumers.
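The idea can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the provider function `get_order`, the two consumers (named after the checkout and refund modules mentioned earlier), and the response fields. Real setups typically rely on dedicated contract testing tools rather than a hand-rolled runner like this one.

```python
from typing import Callable, Dict, List

Provider = Callable[[str], dict]
ContractTest = Callable[[Provider], None]


def get_order(order_id: str) -> dict:
    """Hypothetical interface offered by the providing team."""
    return {"id": order_id, "status": "paid", "total_cents": 1999, "currency": "EUR"}


def checkout_contract(provider: Provider) -> None:
    """Written by the checkout team: they rely on the id and the total."""
    response = provider("order-1")
    assert response["id"] == "order-1"
    assert isinstance(response["total_cents"], int)


def refund_contract(provider: Provider) -> None:
    """Written by the refund team: they only rely on the status."""
    response = provider("order-1")
    assert response["status"] in {"paid", "refunded"}


# The providing team gathers the contract tests of all known consumers.
CONSUMER_CONTRACTS: Dict[str, ContractTest] = {
    "checkout": checkout_contract,
    "refund": refund_contract,
}


def run_contracts(provider: Provider, contracts: Dict[str, ContractTest]) -> List[str]:
    """Run every consumer contract and return the consumers that would break."""
    broken = []
    for consumer, contract in contracts.items():
        try:
            contract(provider)
        except (AssertionError, KeyError):
            broken.append(consumer)
    return broken
```

If the providing team renamed `total_cents`, running `run_contracts` before the release would report that the checkout consumer breaks while the refund consumer is unaffected.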

You can definitely use contract testing within and across value streams. Chances are that you have a manageable number of consumers. However, at the level of communication between contexts, you might have too many consumers to make this feasible. Imagine publishing a public API for consumers outside of your business and gathering contract tests from all of them. This would not be manageable. On the other hand, contract testing is probably too much overhead for communication within a team.

Event-Driven Communication

Event-driven communication is a pattern that uses events as messages to communicate state changes to other services, often combined with a publish/subscribe mechanism. Imagine a service publishing bank transactions (incoming and outgoing) to an event log. Another service can take those events and calculate the account’s balance only based on those events.
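The bank example might look roughly like the following Python sketch. The names `TransactionRecorded`, `EventLog`, and `BalanceProjection` are hypothetical, and the in-memory, synchronous log is a simplifying assumption: a real system would usually put a message broker or a persistent log between the services.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class TransactionRecorded:
    """Event describing one bank transaction (hypothetical shape)."""

    account_id: str
    amount_cents: int  # positive = incoming, negative = outgoing


Handler = Callable[[TransactionRecorded], None]


class EventLog:
    """In-memory stand-in for an event log with publish/subscribe."""

    def __init__(self) -> None:
        self._events: List[TransactionRecorded] = []
        self._subscribers: List[Handler] = []

    def subscribe(self, handler: Handler) -> None:
        self._subscribers.append(handler)

    def publish(self, event: TransactionRecorded) -> None:
        # Store the event, then notify every subscriber.
        self._events.append(event)
        for handler in self._subscribers:
            handler(event)


class BalanceProjection:
    """Second service: derives balances from events only."""

    def __init__(self) -> None:
        self._balances: Dict[str, int] = {}

    def on_transaction(self, event: TransactionRecorded) -> None:
        current = self._balances.get(event.account_id, 0)
        self._balances[event.account_id] = current + event.amount_cents

    def balance(self, account_id: str) -> int:
        return self._balances.get(account_id, 0)


log = EventLog()
balances = BalanceProjection()
log.subscribe(balances.on_transaction)

log.publish(TransactionRecorded("acc-1", 10_000))  # incoming
log.publish(TransactionRecorded("acc-1", -2_500))  # outgoing
```

The publishing service never calls the balance service directly. It only knows the event log, which is what keeps the two services decoupled.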

Event-driven communication works especially well between value streams, where teams can communicate asynchronously through events. It decouples services because it reduces request-response and point-to-point integration. However, it is hard to implement across organizations and is often overkill for communication within a team. Teams within a value stream may use events but can also use direct point-to-point communication or other integration patterns.

Tolerant Reader

The tolerant reader pattern means that interfaces should be as permissive as possible in accepting input. For example, a web browser does not show you an error when a website fails to provide all closing HTML tags but tries to render the website anyway. Your service should assume the minimum about your input. If your interface takes JSON, for example, your service should only consider the attributes it needs and ignore the rest.
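For the JSON case, a tolerant reader could look like this Python sketch. The `ShippingAddress` type and its two fields are assumptions chosen for illustration. The reader picks the attributes it needs and never inspects anything else in the payload.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ShippingAddress:
    """The only attributes this (hypothetical) service actually needs."""

    street: str
    city: str


def read_shipping_address(payload: str) -> ShippingAddress:
    """Extract the needed attributes and ignore everything else."""
    data = json.loads(payload)
    # Unknown attributes are never looked at, so producers can add new
    # ones without breaking this reader. A missing required attribute
    # still fails loudly with a KeyError.
    return ShippingAddress(street=data["street"], city=data["city"])


payload = json.dumps(
    {
        "street": "Main Street 1",
        "city": "Springfield",
        "loyalty_tier": "gold",  # added later by the producer, ignored here
        "geo": {"lat": 0.0, "lon": 0.0},  # also ignored
    }
)
address = read_shipping_address(payload)
```

A strict reader that rejected unknown attributes would have broken as soon as the producer added `loyalty_tier`.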

This is a pattern you should follow in many situations. However, I would be cautious about overusing it within a team, because the tolerant reader pattern sometimes undermines the correctness guarantees you get from static typing. Imagine you accept a Data Transfer Object (DTO) over an interface. Now, you want to add an attribute. The tolerant reader pattern tells you to make that attribute optional (e.g., by using the ? operator in TypeScript or C#) because you do not want to break your consumers by forcing them to submit it. However, if you overuse this, you lose guarantees from your compiler: many of your attributes are optional and therefore harder to handle, because you need to implement fallback mechanisms (what if the attribute is not set?). Within a team, you usually have short communication channels and the possibility to refactor your codebase. So, introducing a new mandatory attribute would not cross any organizational boundaries and would not require much cross-team communication. You could talk to a colleague and change the DTO’s contract by adding a mandatory field. Maybe you can also use your IDE’s refactoring tools to implement this change efficiently across your codebase.
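The cost of optional attributes can be shown with a small Python sketch, where `Optional` type hints stand in for the ? operator of TypeScript or C#. The `CustomerDto` type, its fields, and the fallback language are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CustomerDto:
    name: str
    # Added later and kept optional so that existing consumers keep working.
    preferred_language: Optional[str] = None


def greeting(customer: CustomerDto) -> str:
    # Because the attribute may be missing, every reader needs a fallback.
    language = customer.preferred_language or "en"
    greetings = {"en": "Hello", "de": "Hallo"}
    return f"{greetings.get(language, 'Hello')}, {customer.name}"
```

With a mandatory `preferred_language: str`, the fallback line would disappear and a type checker could flag every call site that forgets the attribute. That is the guarantee you trade away by making it optional.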

API Gateway

An API Gateway is a layer between API providers and consumers, where providers publish their API. It takes on responsibilities like routing or authentication, encapsulating them away from the service itself.
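To make these responsibilities concrete, here is a deliberately small in-process sketch in Python. The class `ApiGateway`, the token check, and the two services are assumptions for illustration. A real gateway is usually a separate infrastructure component and not application code.

```python
from typing import Callable, Dict, Iterable, Tuple

# A service handler receives the remaining path and returns a response body.
Handler = Callable[[str], str]


class ApiGateway:
    """Hypothetical gateway that handles authentication and routing."""

    def __init__(self, valid_tokens: Iterable[str]) -> None:
        self._valid_tokens = set(valid_tokens)
        self._routes: Dict[str, Handler] = {}

    def register(self, prefix: str, handler: Handler) -> None:
        """Providers publish their API under a path prefix."""
        self._routes[prefix] = handler

    def handle(self, path: str, token: str) -> Tuple[int, str]:
        # Authentication happens here, so services do not implement it.
        if token not in self._valid_tokens:
            return 401, "unauthorized"
        # Routing: the longest matching prefix wins.
        for prefix in sorted(self._routes, key=len, reverse=True):
            if path.startswith(prefix):
                return 200, self._routes[prefix](path[len(prefix):])
        return 404, "not found"


def checkout_service(rest: str) -> str:
    return f"checkout handled {rest}"


def refund_service(rest: str) -> str:
    return f"refund handled {rest}"


gateway = ApiGateway(valid_tokens=["secret-token"])
gateway.register("/checkout", checkout_service)
gateway.register("/refund", refund_service)
```

Neither service contains any authentication or routing logic. Consumers only ever talk to `gateway.handle`, which is the encapsulation the pattern is about.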

Use an API Gateway for external communication. It is always a good idea to use one if you communicate across boundaries like organizations and value streams. It is probably overkill for intra-team integration, unless you want to integrate vertically (e.g. frontend and backend).

Conversational Change

This pattern describes code changes that are agreed upon informally and implemented directly. For example, you might go to a colleague and ask for changes to the data structure they committed yesterday. You both agree on improvements and go ahead and implement them, maybe directly through pair programming.

Use conversational change mainly within a team because communication paths are very short. I would not recommend it for anything else since it leads to highly interdependent code over time. As long as you want to keep modules loosely integrated, refrain from using conversational change too much across module boundaries and use more formal integration patterns.

Conclusion

Choosing integration patterns and mechanisms depends on how distant the communication channels between teams are. The above categorization helps you compare different approaches and choose a fitting one depending on the context and the teams integrating their modules.

Shaping Shifts
