The Monolith: Simple and Safe
In a classic monolithic application, a single database holds everything. Transaction state is much easier to manage and trace because it all lives in one system.
Example: Streaming Service Content Deal
Imagine we're building a streaming service. We've had success with indie content, but now we're big enough to work with major studios. We just signed a contract with Disney to stream the Star Wars trilogy for $50 million.
Our application handles legal contracts, financial transactions, and content management, all within a single monolithic system sharing one database.
graph TD
A[Monolith Application] --> B[(Shared Database)]
B --> C[Contracts Table]
B --> D[Payments Table]
B --> E[Movies Table]
style A fill:#4CAF50
style B fill:#2196F3
The Transaction
When we activate the Disney contract, we need to:
- Record the payment of $50 million
- Deduct the amount from our account balance
- Make the Star Wars movies visible to our users
- Mark the contract as active
In a monolith with a single database, this is straightforward:
-- Single atomic transaction
BEGIN TRANSACTION;
-- Record the payment
INSERT INTO payments (contract_id, amount, status, created_at)
VALUES (1001, 50000000, 'PAID', NOW());
-- Update our account balance
UPDATE accounts
SET balance = balance - 50000000
WHERE account_id = 1;
-- Make the movies visible
UPDATE movies
SET is_visible = true, available_date = NOW()
WHERE contract_id = 1001;
-- Update contract status
UPDATE contracts
SET status = 'ACTIVE', activated_at = NOW()
WHERE id = 1001;
COMMIT;
Automatic Rollback on Failure
In our monolithic system, if the server crashes mid-update, the transaction never commits, and the database discards every uncommitted change when it recovers. It's all or nothing.
BEGIN TRANSACTION;
INSERT INTO payments (...); -- ✅ Executed
UPDATE accounts SET balance = ...; -- ✅ Executed
UPDATE movies SET is_visible = ...; -- ✅ Executed
-- 💥 SERVER CRASHES HERE
UPDATE contracts SET status = ...; -- ❌ Never executed
COMMIT; -- ❌ Never reached
Result:
- Payment record: ❌ Not inserted
- Account balance: ❌ Unchanged
- Movies: ❌ Still invisible
- Contract: ❌ Still pending
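The all-or-nothing behaviour can be imitated in a few lines of JavaScript. This is only a sketch: `state` is a hypothetical in-memory stand-in for the database, and `runInTransaction` is an illustrative helper, not a real driver API. A real database typically achieves the same effect with a transaction log rather than by copying data.

```javascript
// Hypothetical in-memory "database"; all names are illustrative.
let state = {
  payments: [],
  balance: 200000000,
  moviesVisible: false,
  contractStatus: 'PENDING',
};

// Apply `work` to a private copy and swap it in only if every step succeeds.
function runInTransaction(work) {
  const draft = JSON.parse(JSON.stringify(state)); // working copy
  try {
    work(draft);
    state = draft; // "COMMIT": all changes become visible at once
    return true;
  } catch (err) {
    return false; // "ROLLBACK": the draft is discarded, state is untouched
  }
}

const committed = runInTransaction((db) => {
  db.payments.push({ contractId: 1001, amount: 50000000, status: 'PAID' });
  db.balance -= 50000000;
  db.moviesVisible = true;
  throw new Error('simulated crash before the contract update');
});

console.log(committed, state.balance, state.moviesVisible, state.contractStatus);
```

Because the simulated crash happens before the swap, none of the three completed steps is visible afterwards.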
The Problem with Distributed Systems
But let's say our financial system is handled by a third party. Now we have a distributed system where payment processing happens in a separate service with its own database. A single database transaction can no longer cover both steps.
-- This doesn't work across different services!
BEGIN TRANSACTION;
-- Call external payment API (different service/database)
-- Pseudocode: an HTTP call is not SQL and cannot join this transaction
POST https://payment-service.com/api/charge
-- Update our database
UPDATE movies SET is_visible = true WHERE contract_id = 1001;
COMMIT;
What happens if:
- The payment succeeds but our server crashes before updating the movie visibility?
- Our database update succeeds but the payment API is down?
- The call to the payment API times out, but the payment was actually processed?
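The first of these failure cases is easy to reproduce. The sketch below assumes two hypothetical stand-ins, `paymentProvider` for the remote service and `localDb` for our own database; the function names are illustrative as well.

```javascript
// Hypothetical stand-ins: a remote payment provider and our local database.
const paymentProvider = { charged: false };
const localDb = { moviesVisible: false };

function chargePayment() {
  paymentProvider.charged = true; // the remote side commits on its own
}

function showMovies() {
  // The crash happens before the local flag is ever set.
  throw new Error('simulated crash before the local update');
}

function activateContract() {
  chargePayment(); // step 1 succeeds remotely
  showMovies();    // step 2 never completes
}

try {
  activateContract();
} catch (err) {
  // No shared transaction exists, so nothing undoes the charge.
}

console.log(paymentProvider.charged, localDb.moviesVisible);
```

The two systems now disagree: the money has moved, but the movies are still hidden.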
This is where Event-Driven Architecture comes in.
Event-Driven Architecture (EDA)
Event-Driven Architecture is a design pattern where services communicate by producing and consuming events rather than making direct API calls. An event is an immutable record of something that happened in the system.
Key Components
- Event Producers: Services that publish facts (e.g., VideoUploaded).
- Event Bus/Broker: Middleware like Kafka that routes events.
- Event Consumers: Services that react to events (e.g., a "Search Service" updating its index).
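The three roles can be sketched with a tiny in-memory bus. The `EventBus` class, the `searchIndex` array, and the payload fields are assumptions made for illustration; a real broker such as Kafka adds durability, partitioning, and delivery across processes.

```javascript
// Minimal in-memory event bus (illustrative only).
class EventBus {
  constructor() {
    this.handlers = new Map(); // event type -> array of consumer callbacks
  }

  subscribe(type, handler) {
    const list = this.handlers.get(type) || [];
    list.push(handler);
    this.handlers.set(type, list);
  }

  publish(event) {
    // Shallow-freeze a copy so consumers cannot reassign the event's fields.
    const frozen = Object.freeze({ ...event });
    for (const handler of this.handlers.get(event.type) || []) {
      handler(frozen);
    }
  }
}

const bus = new EventBus();
const searchIndex = [];

// Consumer: a hypothetical search service that indexes uploaded videos.
bus.subscribe('VideoUploaded', (event) => searchIndex.push(event.payload.title));

// Producer: publishes the fact without knowing who consumes it.
bus.publish({ type: 'VideoUploaded', payload: { videoId: 1, title: 'A New Hope' } });

console.log(searchIndex);
```

The producer never references the search service, which is the loose coupling the pattern is after.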
Choreography vs Orchestration
Choreography (Decentralized)
Each service knows what to do when it sees specific events. No central coordinator.
graph LR
A[Order Service] -->|OrderPlaced| B[Event Bus]
B -->|OrderPlaced| C[Payment Service]
B -->|OrderPlaced| D[Inventory Service]
C -->|PaymentReceived| B
B -->|PaymentReceived| E[Shipping Service]
Pros: Loose coupling, resilient, scalable
Cons: Hard to understand flow, difficult to debug
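A rough sketch of the choreographed flow follows. The `subscribe` and `publish` helpers form a synchronous stand-in for a broker, and the service behaviour is reduced to one line each; all of these are assumptions for illustration.

```javascript
// Tiny synchronous bus (illustrative stand-in for a real broker).
const subscribers = {};
const log = [];

function subscribe(type, handler) {
  (subscribers[type] = subscribers[type] || []).push(handler);
}

function publish(type, payload) {
  log.push(type);
  (subscribers[type] || []).forEach((handler) => handler(payload));
}

// Each service only declares which events it reacts to.
const reserved = [];
const shipped = [];
subscribe('OrderPlaced', (order) => publish('PaymentReceived', order)); // Payment Service
subscribe('OrderPlaced', (order) => reserved.push(order.orderId));      // Inventory Service
subscribe('PaymentReceived', (order) => shipped.push(order.orderId));   // Shipping Service

// The Order Service publishes the initial fact; nobody coordinates the rest.
publish('OrderPlaced', { orderId: 42 });

console.log(log, reserved, shipped);
```

No single function describes the whole workflow here; it only emerges from the subscriptions, which is why choreographed flows can be hard to follow.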
Orchestration (Centralized)
A coordinator service manages the workflow and tells each service what to do.
graph TD
A[Order Orchestrator] -->|Process Payment| B[Payment Service]
B -->|Success| A
A -->|Reserve Inventory| C[Inventory Service]
C -->|Success| A
A -->|Schedule Shipping| D[Shipping Service]
Pros: Clear workflow, easier to understand and debug
Cons: The orchestrator becomes a single point of failure, and services are more tightly coupled to it
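For comparison, an orchestrated version of the same flow might look like the sketch below. The three service objects and their methods are hypothetical; in a real system each call would be a network request or a command message.

```javascript
// Hypothetical services; each returns true on success.
const calls = [];
const paymentService = {
  processPayment: (order) => { calls.push('payment'); return order.amount > 0; },
};
const inventoryService = {
  reserve: (order) => { calls.push('inventory'); return true; },
};
const shippingService = {
  schedule: (order) => { calls.push('shipping'); return true; },
};

// The orchestrator owns the workflow: it fixes the order of steps and stops on failure.
function processOrder(order) {
  if (!paymentService.processPayment(order)) return 'PAYMENT_FAILED';
  if (!inventoryService.reserve(order)) return 'OUT_OF_STOCK';
  if (!shippingService.schedule(order)) return 'SHIPPING_FAILED';
  return 'COMPLETED';
}

const result = processOrder({ orderId: 42, amount: 100 });
console.log(result, calls);
```

The whole workflow is readable in one function, at the cost of that function knowing about every service.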
Eventual Consistency
In event-driven systems, we accept eventual consistency instead of immediate consistency. The system will eventually reach a consistent state, but there may be a delay.
Example: When you place an order on Amazon:
- You immediately see "Order placed!" (order service updated)
- A few seconds later: "Payment successful" (payment service processed)
- A minute later: "Preparing for shipment" (warehouse service notified)
Each step happens asynchronously via events.
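The delay between steps can be made visible with a plain queue. In this sketch, `queue`, `orderView`, and the `PaymentSucceeded` event name are assumptions; each call to `processNextEvent` stands for a consumer getting around to its work some time later.

```javascript
// Events wait in a queue until a consumer gets to them.
const queue = [];
const orderView = { status: null }; // what the customer sees

function placeOrder(orderId) {
  orderView.status = 'Order placed!';           // updated immediately
  queue.push({ type: 'OrderPlaced', orderId }); // payment happens later
}

// Consumers run on their own schedule; here one event is handled per call.
function processNextEvent() {
  const event = queue.shift();
  if (!event) return;
  if (event.type === 'OrderPlaced') {
    orderView.status = 'Payment successful';
    queue.push({ type: 'PaymentSucceeded', orderId: event.orderId });
  } else if (event.type === 'PaymentSucceeded') {
    orderView.status = 'Preparing for shipment';
  }
}

placeOrder(7);
const seen = [orderView.status];
processNextEvent();
seen.push(orderView.status);
processNextEvent();
seen.push(orderView.status);

console.log(seen);
```

Between calls the view is stale but valid, and it converges once the queue is drained.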
The Saga Pattern
A saga is a sequence of local transactions coordinated either through events (choreography) or by a central orchestrator. Each local transaction updates its own service's database and publishes an event. If a step fails, compensating transactions undo the changes made by the steps that already completed.
Example: E-commerce Order Saga
Order Placed
→ Reserve Inventory (SUCCESS)
→ Process Payment (FAILURE)
→ Release Inventory (COMPENSATE)
→ Cancel Order
Forward Flow:
- OrderPlaced → Reserve inventory
- InventoryReserved → Process payment
- PaymentProcessed → Ship order
Compensation Flow (if payment fails):
- PaymentFailed → Release inventory
- InventoryReleased → Cancel order
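The compensation flow can be sketched as a list of steps, each paired with the action that undoes it. This is an orchestrated variant written under simplifying assumptions: `inventory`, `order`, `steps`, and `runSaga` are illustrative names, and the payment failure is simulated.

```javascript
const inventory = { reserved: false };
const order = { status: 'PLACED' };

// Each step pairs a local transaction with the compensation that undoes it.
const steps = [
  {
    name: 'reserveInventory',
    run: () => { inventory.reserved = true; },
    compensate: () => { inventory.reserved = false; },
  },
  {
    name: 'processPayment',
    run: () => { throw new Error('card declined'); }, // simulated failure
    compensate: () => {},
  },
];

function runSaga(sagaSteps) {
  const completed = [];
  for (const step of sagaSteps) {
    try {
      step.run();
      completed.push(step);
    } catch (err) {
      // Undo the completed steps in reverse order, then cancel the order.
      for (const done of completed.reverse()) done.compensate();
      order.status = 'CANCELLED';
      return false;
    }
  }
  order.status = 'CONFIRMED';
  return true;
}

const succeeded = runSaga(steps);
console.log(succeeded, inventory.reserved, order.status);
```

Unlike a database rollback, the reservation was briefly visible to other readers before it was released, so compensations have to be designed as ordinary business operations.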
Outbox Pattern
To ensure events are published reliably, use the outbox pattern:
- In the same database transaction that changes business data, insert the event into an "outbox" table
- A background worker polls the outbox table
- The worker publishes events to the event bus
- Once published, mark the event as sent
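This gives at-least-once delivery: the event is committed atomically with the business data, so it survives an application crash, and the worker keeps retrying until the event is marked as sent. If the worker crashes after publishing but before marking the event, the event is published again, so consumers must tolerate duplicates.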
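-- Single atomic transaction
BEGIN TRANSACTION;
UPDATE orders SET status = 'CONFIRMED' WHERE id = 123;
INSERT INTO outbox_events (event_type, payload, status)
VALUES ('OrderConfirmed', '{"orderId": 123}', 'PENDING');
COMMIT;
-- Background worker: fetch unsent events, oldest first
-- (assumes outbox_events has an auto-incrementing id column)
SELECT * FROM outbox_events WHERE status = 'PENDING' ORDER BY id;
-- Publish each event to the event bus
-- Mark only the published event as sent (:event_id is a placeholder for its id)
UPDATE outbox_events SET status = 'SENT' WHERE id = :event_id;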
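The worker loop might look roughly like the following. It is a sketch under stated assumptions: the `outbox` array stands in for the outbox table, `publishToBus` for a broker client, and `failNextPublish` simulates one broker outage.

```javascript
// In-memory stand-ins for the outbox table and the broker.
const outbox = [
  { id: 1, eventType: 'OrderConfirmed', payload: { orderId: 123 }, status: 'PENDING' },
  { id: 2, eventType: 'OrderConfirmed', payload: { orderId: 124 }, status: 'PENDING' },
];
const published = [];
let failNextPublish = true; // simulate one broker outage

function publishToBus(row) {
  if (failNextPublish) {
    failNextPublish = false;
    throw new Error('broker unavailable');
  }
  published.push(row.id);
}

// One polling pass: publish each pending row, mark it sent only after success.
function pollOutbox() {
  for (const row of outbox.filter((r) => r.status === 'PENDING')) {
    try {
      publishToBus(row);
      row.status = 'SENT';
    } catch (err) {
      // Leave the row PENDING so the next pass retries it.
    }
  }
}

pollOutbox(); // row 1 fails, row 2 is published
pollOutbox(); // row 1 is retried and published
console.log(published, outbox.map((r) => r.status));
```

Note that the retry delivers event 1 after event 2, so a simple worker like this one does not preserve ordering on its own.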
Best Practices
1. Idempotency
Events may be delivered more than once. Ensure consumers can safely process the same event multiple times.
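// Use the payment ID as an idempotency key
// (`db` stands for the application's data-access layer)
async function processPayment(event) {
  const { paymentId } = event.payload;
  // Check if already processed
  const existing = await db.findOne('payments', { id: paymentId });
  if (existing) {
    console.log('Already processed, skipping');
    return;
  }
  // Process payment and store it under paymentId so a redelivery is detected.
  // A unique constraint on the ID closes the gap between the check and the write.
}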
2. Event Versioning
Events are contracts. Version them to handle schema changes.
{
  eventType: 'OrderPlaced',
  version: 2,
  payload: {
    orderId: 123,
    userId: 456,
    // New field in v2
    couponCode: 'SAVE20'
  }
}
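One common way to cope with several versions is to upgrade old events to the newest shape before handling them. The sketch below assumes that v1 events simply lacked `couponCode` and that `null` is an acceptable default; `upcastOrderPlaced` is an illustrative name.

```javascript
// Upgrade a v1 OrderPlaced event to the v2 shape so consumers handle one version.
function upcastOrderPlaced(event) {
  if (event.version === 2) return event;
  if (event.version === 1) {
    return {
      ...event,
      version: 2,
      // Assumption: v1 had no coupon field, so default it to null.
      payload: { ...event.payload, couponCode: null },
    };
  }
  throw new Error(`Unsupported OrderPlaced version: ${event.version}`);
}

const v1Event = {
  eventType: 'OrderPlaced',
  version: 1,
  payload: { orderId: 123, userId: 456 },
};
const upgraded = upcastOrderPlaced(v1Event);

console.log(upgraded.version, upgraded.payload.couponCode);
```

The original event is left untouched, which matters if the stored event is meant to stay an immutable record.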
3. Dead Letter Queues
When event processing fails repeatedly, move the event to a dead letter queue for manual inspection.
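A minimal sketch of that retry-then-park behaviour is shown below. The retry limit of 3, the `deadLetterQueue` array, and the `handleWithRetry` helper are assumptions; real brokers usually offer dead-lettering as a configuration option and add a backoff between attempts.

```javascript
const MAX_ATTEMPTS = 3; // assumed retry limit
const deadLetterQueue = [];
const attemptsSeen = [];

function handleWithRetry(event, handler) {
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    try {
      handler(event);
      return true;
    } catch (err) {
      attemptsSeen.push(attempt);
      if (attempt === MAX_ATTEMPTS) {
        // Park the event with the last error for manual inspection.
        deadLetterQueue.push({ event, error: err.message, attempts: attempt });
      }
    }
  }
  return false;
}

const poisonEvent = { type: 'OrderPlaced', payload: { orderId: null } };
const handled = handleWithRetry(poisonEvent, (event) => {
  if (event.payload.orderId === null) throw new Error('missing orderId');
});

console.log(handled, deadLetterQueue.length, deadLetterQueue[0].error);
```

Parking the failing event keeps it from blocking the events queued behind it.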
4. Event Schemas
Define and validate event schemas (e.g., using JSON Schema, Avro, Protobuf).
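In practice a JSON Schema, Avro, or Protobuf library would do the checking. The hand-rolled validator below is only a stand-in to show where validation sits; `orderPlacedSchema` and `validate` are hypothetical names, and the schema format (field name mapped to a `typeof` result) is invented for this sketch.

```javascript
// Hypothetical schema: field name -> expected typeof result.
const orderPlacedSchema = { orderId: 'number', userId: 'number' };

// Returns a list of problems; an empty list means the payload is valid.
function validate(payload, schema) {
  const errors = [];
  for (const [field, expectedType] of Object.entries(schema)) {
    if (!(field in payload)) {
      errors.push(`missing field: ${field}`);
    } else if (typeof payload[field] !== expectedType) {
      errors.push(`${field} should be a ${expectedType}`);
    }
  }
  return errors;
}

const ok = validate({ orderId: 123, userId: 456 }, orderPlacedSchema);
const bad = validate({ orderId: '123' }, orderPlacedSchema);

console.log(ok, bad);
```

Validating on both the producer and the consumer side catches a malformed event before it spreads through the system.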
5. Observability
- Assign correlation IDs to track events across services
- Log all event publications and consumptions
- Monitor event lag and processing times
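Correlation IDs work only if every derived event inherits the ID of the event that caused it. A sketch, assuming a `correlationId` field on each event and illustrative `logEvent` and `deriveEvent` helpers:

```javascript
const logLines = [];

// Every log line carries the correlation ID so one request can be traced end to end.
function logEvent(service, action, event) {
  logLines.push(`${event.correlationId} ${service} ${action} ${event.type}`);
}

// A follow-up event inherits the correlation ID of the event that caused it.
function deriveEvent(cause, type, payload) {
  return { type, payload, correlationId: cause.correlationId };
}

const orderPlaced = {
  type: 'OrderPlaced',
  payload: { orderId: 42 },
  correlationId: 'req-001',
};
logEvent('order-service', 'published', orderPlaced);

logEvent('payment-service', 'consumed', orderPlaced);
const paymentReceived = deriveEvent(orderPlaced, 'PaymentReceived', { orderId: 42 });
logEvent('payment-service', 'published', paymentReceived);

console.log(logLines.join('\n'));
```

Filtering the logs of all services by `req-001` then reconstructs the full path of a single order.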
Trade-offs
Advantages:
- Loose coupling between services
- Natural fit for microservices
- Scalability and resilience
- Audit trail (event sourcing)
Challenges:
- Complexity in debugging distributed flows
- Eventual consistency requires careful design
- Need robust event infrastructure
- Duplicate event handling (idempotency)
- Ordering guarantees can be difficult