Improving microservices reliability - part 2: Outbox Pattern
Welcome back to the second part of the series. Today we’ll talk about the Outbox Pattern.
Just to recap, last time we discussed how the Two-Phase Commit (2PC) technique can help us with distributed transactions. However, it may lead to unwanted side effects and performance issues.
So is there any other approach we could take? Personally I’m a great fan of persisting the state as much as possible, be it the full Domain Entity or a stream of Events.
And here comes the Outbox Pattern!
Let’s go back to our eCommerce example. We want to save an order and roughly at the same time send an email to the customer. I said “roughly” because we don’t really need these operations to occur at the same time. Moreover, there might even be other actions but let’s stick with one for now.
The problem is that, since by definition every microservice has its own persistence mechanism, we can’t simply wrap the database write and the message dispatch in a single local transaction.
The Order might be saved but messages might not be dispatched due to a network issue. Or we might get messages but no order stored in the db because we ran out of space. Whatever.
So what do we do?
With 2PC we use a Coordinator and a bunch of messages to ensure the flow is correctly executed.
With the Outbox instead the flow is much simpler:
- the Order service receives the command to store the new Order
- a local transaction is opened
- the Order is persisted
- an “order saved” event is serialized and stored into a generic Outbox table (or collection or whatever you’re using, doesn’t matter)
- the local transaction gets committed
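The steps above can be sketched roughly as follows. This is a minimal illustration using Python’s built-in `sqlite3` module, not a production implementation: the `orders` and `outbox` table layouts, the column names and the `save_order` function are all assumptions made up for this example.

```python
import json
import sqlite3
import uuid


def create_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical tables: one for the domain data, one generic Outbox.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "id TEXT PRIMARY KEY, customer_email TEXT NOT NULL, total REAL NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS outbox ("
        "id TEXT PRIMARY KEY, type TEXT NOT NULL, "
        "payload TEXT NOT NULL, processed_at TEXT)"
    )


def save_order(conn: sqlite3.Connection, customer_email: str, total: float) -> str:
    order_id = str(uuid.uuid4())
    event = {"order_id": order_id, "customer_email": customer_email, "total": total}
    # "with conn" commits if the block succeeds and rolls back if it raises,
    # so the order and its event are stored together or not at all.
    with conn:
        conn.execute(
            "INSERT INTO orders (id, customer_email, total) VALUES (?, ?, ?)",
            (order_id, customer_email, total),
        )
        conn.execute(
            "INSERT INTO outbox (id, type, payload) VALUES (?, ?, ?)",
            (str(uuid.uuid4()), "order_saved", json.dumps(event)),
        )
    return order_id
```

Note that nothing is sent to a broker here. The only thing the Order service has to get right is a plain local transaction.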
At this point we’ve ensured that our local state is persisted so any potential subsequent query should be able to return fresh data (assuming caching is not an issue).
Now all we have to do is inform our subscribers, and we can do this by using an offline worker: at regular intervals it will fetch a batch of unprocessed records from the Outbox, publish them as messages on a message broker like RabbitMQ or Kafka, and mark them as processed so they are not picked up again.
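One pass of such a worker might look like the sketch below. It assumes a hypothetical `outbox` table with a nullable `processed_at` column, and the `publish` callable stands in for whatever broker client you use (no real RabbitMQ or Kafka API is shown here).

```python
import json
import sqlite3
from datetime import datetime, timezone
from typing import Callable

# Hypothetical Outbox layout, assumed for this example only.
OUTBOX_SCHEMA = (
    "CREATE TABLE IF NOT EXISTS outbox ("
    "id TEXT PRIMARY KEY, type TEXT NOT NULL, "
    "payload TEXT NOT NULL, processed_at TEXT)"
)


def publish_pending(
    conn: sqlite3.Connection,
    publish: Callable[[str, str, dict], None],
    batch_size: int = 50,
) -> int:
    """Publish one batch of unprocessed Outbox records, oldest first."""
    rows = conn.execute(
        "SELECT id, type, payload FROM outbox "
        "WHERE processed_at IS NULL ORDER BY rowid LIMIT ?",
        (batch_size,),
    ).fetchall()
    published = 0
    for message_id, message_type, payload in rows:
        # If publish raises, the record stays unprocessed and is retried
        # on the next pass.
        publish(message_id, message_type, json.loads(payload))
        # Mark the record only after a successful publish. A crash between
        # the two steps means the message will be published again later.
        with conn:
            conn.execute(
                "UPDATE outbox SET processed_at = ? WHERE id = ?",
                (datetime.now(timezone.utc).isoformat(), message_id),
            )
        published += 1
    return published
```

A scheduler or a simple loop with a sleep would call `publish_pending` at regular intervals. Publishing first and marking afterwards is a deliberate choice: a failure can cause a duplicate message, but never a lost one.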
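This pattern ensures that each message is delivered at least once. What does this mean? That we get guaranteed delivery, but the same message may arrive more than once, for example if the worker crashes right after publishing and before marking the record as processed. This also means that we have to be extremely careful to ensure that our message handlers are idempotent.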
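One common way to get idempotent handling is to have the consumer remember the ids of the messages it has already handled. The sketch below assumes every message carries a unique id; the `processed_messages` table and the `handle_once` helper are hypothetical names used only for illustration.

```python
import sqlite3
from typing import Callable


def create_dedup_table(conn: sqlite3.Connection) -> None:
    # Hypothetical table holding the ids of messages already handled.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_messages (message_id TEXT PRIMARY KEY)"
    )


def handle_once(
    conn: sqlite3.Connection, message_id: str, handler: Callable[[], None]
) -> bool:
    """Run handler only the first time message_id is seen.

    Returns True if the handler ran, False for a duplicate delivery.
    """
    with conn:
        cursor = conn.execute(
            "INSERT OR IGNORE INTO processed_messages (message_id) VALUES (?)",
            (message_id,),
        )
        if cursor.rowcount == 0:
            # The id was already recorded, so this is a duplicate: skip it.
            return False
        # If the handler raises, the transaction rolls back, the id is
        # forgotten and a later redelivery gets another chance.
        handler()
    return True
```

Keep in mind that the rollback only protects what the handler writes to this same database. A side effect such as actually sending the email cannot be undone, so a failure right after sending could still lead to a second email on redelivery.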
Of course, we don’t want to reinvent the wheel, so an option could be using a third-party tool like NServiceBus, which can help us handle Sagas and complex scenarios while hiding all the noise of the boilerplate code.
That’s all for today. Next time we’ll see the pattern in action in a small C# .NET Core application.