Outbox

In the previous post we have shown how consistent messaging can be implemented by storing point-in-time state snapshots and using these snapshots for publishing outgoing messages. We discussed some pros and cons of this approach. This time we will focus on the alternative approach which is based on storing the outgoing messages before they are dispatched. Consistent messaging requires the ability to ensure exactly same side effects (in form of the outgoing messages) each time a copy of a given incoming message is processed.

State-based consistent messaging

In the previous posts, we described consistent messaging and justified its usefulness in building robust distributed systems. Enough with the theory, it’s high time to show some code! First comes the state-based approach. Context State-based consistent messaging comes with two requirements: “Point-in-time” state availability - it’s possible to restore any past version of the state. Deterministic message handling logic - for a set state and input message every handler execution gives the same result.

Distributed business processes

In the previous post we explained what a messaging infrastructure is. We showed that it is necessarily a distributed thing with parts running in different processes. We’ve seen the trade-offs involved in building the messaging infrastructure and what conditions must be met to guarantee consistent message processing on top of such infrastructure. This time we will explain why it is reasonable to expect that most line-of-business systems require a messaging infrastructure.

Messaging infrastructure

The messaging infrastructure is all the components required to exchange messages between parts of the business logic code. There are two parts to the messaging infrastructure. One is the message broker that manages the message queues (and possibly topics). The other equally important part consists of all the libraries running in the same process as the application logic, that expose the API for processing messages. Whenever an application wants to send a message, it calls the in-process part of the messaging infrastructure which, in turn, communicates with the out-of-process part.

Notes on 2PC

If there’s a distributed protocol every software engineer knows it’s Two-Phase Commit also know as 2PC. Although, in use for several decades1, it’s been in steady decline mainly due to lack of support in cloud environments. For quite some time it was a de-facto standard for building enterprise distributed systems. That said, with the cloud becoming the default deployment model, designers need to learn how to build reliable systems without it.

Consistent state

In the previous post we talked about exactly-once processing looking at the endpoint from the outside. Here we will re-focus on an individual endpoint and see what exactly-once means for an endpoint’s state. It’s not about the execution history Exactly-once spawned some heated debates in the past1 so let’s make sure we make it clear what it means in our context - or more importantly what it doesn’t. Here we talk about exactly-once processing not delivery, the two being quite different things.

Sync-async boundary

If our experience in the IT industry has taught us anything it would be that drawing boundaries is the most important part of the design process. Boundaries are essential for understanding and communicating the design. The sync-async is an example of a boundary that is useful when designing distributed systems. Purely sync The most popular technology for building synchronous systems is HTTP. Such systems often consist of multiple layers of HTTP endpoints.

Consistent messaging

Modern messaging infrastructures offer delivery guarantees that make it non-trivial to build distributed systems. Robust solutions require a good understanding of what can and can’t happen and how that affects business level behavior. This post looks at the main challenges from the system consistency perspective and sketches possible solutions. A system We will assume that systems in focus consist of endpoints, each owning a distinct piece of state. Every endpoint processes input messages, modifying internal state and producing output messages.