Microservices architecture is a design approach that divides a large application into small, independently deployable services. Adopted by companies such as Netflix, Amazon, and Uber, it has become a mainstream approach in modern cloud-native development. This article walks through microservices systematically, from basic concepts to implementation patterns.
Comparison of Monolith and Microservices
Monolithic Architecture
flowchart TB
subgraph Monolith["Monolithic Architecture"]
subgraph App["Single Application"]
User["User Mgmt"]
Product["Product Mgmt"]
Order["Order Mgmt"]
Payment["Payment"]
end
DB["Shared Database"]
App --> DB
end
Microservices Architecture
flowchart TB
subgraph Microservices["Microservices Architecture"]
subgraph U["User Service"]
US["User Service"]
UDB["Users DB"]
US --> UDB
end
subgraph P["Product Service"]
PS["Product Service"]
PDB["Products DB"]
PS --> PDB
end
subgraph O["Order Service"]
OS["Order Service"]
ODB["Orders DB"]
OS --> ODB
end
subgraph Pa["Payment Service"]
PaS["Payment Service"]
PaDB["Payments DB"]
PaS --> PaDB
end
end
Gateway["API Gateway / Service Mesh"] --> U
Gateway --> P
Gateway --> O
Gateway --> Pa
Comparison Table
| Aspect | Monolith | Microservices |
|---|---|---|
| Deployment | All together | Independent per service |
| Scaling | Scale entire system | Scale only needed services |
| Tech stack | Must be unified | Can choose per service |
| Fault impact | Spreads to entire system | Limited to affected service |
| Development team | Everyone must understand the whole codebase | Dedicated teams per service |
| Complexity | Concentrated in code | Distributed in infrastructure/ops |
Microservices Design Principles
1. Single Responsibility Principle
Each service focuses on one business function.
// Bad example: One service with multiple responsibilities
class UserOrderService {
createUser() { /* ... */ }
updateUser() { /* ... */ }
createOrder() { /* ... */ } // Different domain
processPayment() { /* ... */ } // Different domain
}
// Good example: Services separated by responsibility
// user-service
class UserService {
createUser() { /* ... */ }
updateUser() { /* ... */ }
getUserById() { /* ... */ }
}
// order-service
class OrderService {
createOrder() { /* ... */ }
getOrdersByUser() { /* ... */ }
}
2. Data Independence (Database per Service)
Each service has its own data store and does not directly access other services’ databases.
| Pattern | Description |
|---|---|
| ❌ Anti-pattern | Services A and B directly access a shared database |
| ✅ Recommended | Each service owns its database, communicates via API |
flowchart TB
subgraph Good["✅ Recommended Pattern"]
SA2["Service A"] --> DBA["DB A"]
SB2["Service B"] --> DBB["DB B"]
SA2 <-->|"Via API"| SB2
end
3. Loose Coupling
Minimize dependencies between services and communicate only through interfaces.
// Example of order-service communicating with user-service
interface UserClient {
getUserById(userId: string): Promise<User>;
}
class OrderService {
constructor(
private userClient: UserClient,
private orderRepository: OrderRepository, // assumed repository abstraction
) {}
async createOrder(userId: string, items: OrderItem[]): Promise<Order> {
// Communicate with other service through interface
const user = await this.userClient.getUserById(userId);
if (!user.isActive) {
throw new Error('User is not active');
}
return this.orderRepository.create({
userId,
items,
createdAt: new Date(),
});
}
}
4. High Cohesion
Group related functionality within the same service.
Boundary Definition through Domain-Driven Design (DDD):
| User Context | Order Context | Inventory Context |
|---|---|---|
| User | Order | Product |
| Profile | OrderItem | Stock |
| Address | Payment | Warehouse |
| Authentication | Shipping | Supplier |
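As a rough illustration of the table above, the sketch below keeps a separate model per context and crosses the boundary by ID only. The names `InventoryProduct`, `toOrderItem`, and `unitPriceAtOrder` are hypothetical and chosen for this example; they are not part of any framework.

```typescript
// Inventory context: owns the full product and stock model.
interface InventoryProduct {
  id: string;
  name: string;
  unitPrice: number;
  stock: number;
  warehouseId: string;
}

// Order context: keeps only what an order needs and refers to the
// product by ID instead of embedding the Inventory model.
interface OrderItem {
  productId: string;
  quantity: number;
  unitPriceAtOrder: number; // price snapshot taken when the order is placed
}

// Translation at the context boundary (hypothetical helper).
function toOrderItem(product: InventoryProduct, quantity: number): OrderItem {
  if (!Number.isInteger(quantity) || quantity <= 0) {
    throw new RangeError('quantity must be a positive integer');
  }
  return {
    productId: product.id,
    quantity,
    unitPriceAtOrder: product.unitPrice,
  };
}

const item = toOrderItem(
  { id: 'p-1', name: 'Keyboard', unitPrice: 4500, stock: 12, warehouseId: 'w-1' },
  2,
);
console.log(item);
```

Because the Order context never sees `stock` or `warehouseId`, the Inventory model can change without forcing a change in order-service.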
Inter-Service Communication Patterns
1. Synchronous Communication (REST / gRPC)
Used when an immediate response is required.
// Synchronous communication via REST API
class ProductClient {
private baseUrl = 'http://product-service:8080';
async getProduct(productId: string): Promise<Product> {
const response = await fetch(`${this.baseUrl}/products/${productId}`, {
headers: {
'Accept': 'application/json', // GET has no body, so declare the expected response type
'X-Request-ID': generateRequestId(), // For tracing
},
signal: AbortSignal.timeout(5000), // Timeout setting
});
if (!response.ok) {
throw new ProductServiceError(response.status);
}
return response.json();
}
}
// Synchronous communication via gRPC (protocol buffers)
syntax = "proto3";
service ProductService {
rpc GetProduct(GetProductRequest) returns (Product);
rpc ListProducts(ListProductsRequest) returns (stream Product);
}
message GetProductRequest {
string product_id = 1;
}
message Product {
string id = 1;
string name = 2;
int32 price = 3;
int32 stock = 4;
}
2. Asynchronous Communication (Message Queue)
Used when eventual consistency is acceptable or when producers and consumers need to be decoupled from each other.
// Event-driven architecture
interface OrderCreatedEvent {
eventType: 'ORDER_CREATED';
orderId: string;
userId: string;
items: OrderItem[];
totalAmount: number;
timestamp: Date;
}
// order-service: Publishing events
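class OrderService {
constructor(
private orderRepository: OrderRepository, // assumed repository abstraction
private eventBus: EventBus, // assumed message-bus client
) {}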
async createOrder(order: CreateOrderDto): Promise<Order> {
const created = await this.orderRepository.create(order);
// Publish event (other services subscribe)
await this.eventBus.publish<OrderCreatedEvent>({
eventType: 'ORDER_CREATED',
orderId: created.id,
userId: created.userId,
items: created.items,
totalAmount: created.totalAmount,
timestamp: new Date(),
});
return created;
}
}
// inventory-service: Subscribing to events
class InventoryEventHandler {
constructor(private inventoryService: InventoryService) {} // assumed service type
@Subscribe('ORDER_CREATED')
async handleOrderCreated(event: OrderCreatedEvent): Promise<void> {
for (const item of event.items) {
await this.inventoryService.decrementStock(item.productId, item.quantity);
}
}
}
Communication Pattern Selection Criteria
| Pattern | Use Case | Features |
|---|---|---|
| REST | CRUD operations, simple APIs | Widely adopted, easy to debug |
| gRPC | High performance needed, type safety important | Fast, streaming support |
| Message Queue | Async processing, event-driven | Loose coupling, scalability |
| GraphQL | Client-driven data fetching | Flexible queries, reduces over-fetching |
API Gateway Pattern
Provides a single entry point between clients and microservices.
flowchart TB
Client --> Gateway["API Gateway<br/>- Authentication/Authorization<br/>- Rate limiting<br/>- Request routing<br/>- Response aggregation<br/>- Protocol translation"]
Gateway --> User["User Service"]
Gateway --> Order["Order Service"]
Gateway --> Product["Product Service"]
// API Gateway routing configuration example (Express + http-proxy-middleware)
import express from 'express';
import { createProxyMiddleware } from 'http-proxy-middleware';
const app = express();
// Authentication middleware
app.use(authMiddleware);
// Routing per service
app.use('/api/users', createProxyMiddleware({
target: 'http://user-service:8080',
changeOrigin: true,
pathRewrite: { '^/api/users': '' },
}));
app.use('/api/orders', createProxyMiddleware({
target: 'http://order-service:8080',
changeOrigin: true,
pathRewrite: { '^/api/orders': '' },
}));
app.use('/api/products', createProxyMiddleware({
target: 'http://product-service:8080',
changeOrigin: true,
pathRewrite: { '^/api/products': '' },
}));
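The routing example covers request routing. Response aggregation, also listed in the gateway's responsibilities, could look roughly like the sketch below. It assumes the downstream calls are injected as plain async functions; the `Upstreams` interface, `getUserDashboard`, and the stub data are hypothetical names for illustration only.

```typescript
interface UserSummary { id: string; name: string }
interface OrderSummary { id: string; total: number }

// Downstream calls are injected so the aggregation logic can run without a network.
interface Upstreams {
  getUser(userId: string): Promise<UserSummary>;
  listOrders(userId: string): Promise<OrderSummary[]>;
}

interface UserDashboard {
  user: UserSummary;
  orders: OrderSummary[];
  orderCount: number;
}

// Fan out to both services in parallel and merge the results into one response.
async function getUserDashboard(userId: string, upstreams: Upstreams): Promise<UserDashboard> {
  const [user, orders] = await Promise.all([
    upstreams.getUser(userId),
    upstreams.listOrders(userId),
  ]);
  return { user, orders, orderCount: orders.length };
}

// Stub upstreams, standing in for calls to user-service and order-service.
const stubUpstreams: Upstreams = {
  getUser: async (id) => ({ id, name: 'Alice' }),
  listOrders: async () => [
    { id: 'o-1', total: 1200 },
    { id: 'o-2', total: 800 },
  ],
};

getUserDashboard('u-1', stubUpstreams).then((view) => console.log(view));
```

With `Promise.all`, a failure in either upstream fails the whole response. A gateway that should return partial data would need a different combinator, such as `Promise.allSettled`, plus a policy for missing fields.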
Fault Tolerance Patterns
1. Circuit Breaker
Temporarily blocks calls to failing services.
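// Thrown when a call is rejected because the circuit is open
class CircuitBreakerOpenError extends Error {}
enum CircuitState {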
CLOSED, // Normal operation
OPEN, // Blocked
HALF_OPEN, // Recovery check
}
class CircuitBreaker {
private state = CircuitState.CLOSED;
private failureCount = 0;
private lastFailureTime: Date | null = null;
private readonly failureThreshold = 5;
private readonly resetTimeout = 30000; // 30 seconds
async call<T>(fn: () => Promise<T>): Promise<T> {
if (this.state === CircuitState.OPEN) {
if (this.shouldAttemptReset()) {
this.state = CircuitState.HALF_OPEN;
} else {
throw new CircuitBreakerOpenError();
}
}
try {
const result = await fn();
this.onSuccess();
return result;
} catch (error) {
this.onFailure();
throw error;
}
}
private onSuccess(): void {
this.failureCount = 0;
this.state = CircuitState.CLOSED;
}
private onFailure(): void {
this.failureCount++;
this.lastFailureTime = new Date();
if (this.failureCount >= this.failureThreshold) {
this.state = CircuitState.OPEN;
}
}
private shouldAttemptReset(): boolean {
return Date.now() - (this.lastFailureTime?.getTime() ?? 0) > this.resetTimeout;
}
}
2. Retry Pattern
Retries with exponential backoff for transient failures.
async function withRetry<T>(
fn: () => Promise<T>,
options: {
maxRetries: number;
baseDelay: number;
maxDelay: number;
}
): Promise<T> {
let lastError: Error;
for (let attempt = 0; attempt <= options.maxRetries; attempt++) {
try {
return await fn();
} catch (error) {
lastError = error as Error;
if (attempt === options.maxRetries) break;
// Exponential backoff + jitter
const delay = Math.min(
options.baseDelay * Math.pow(2, attempt) + Math.random() * 1000,
options.maxDelay
);
await new Promise(resolve => setTimeout(resolve, delay));
}
}
throw lastError!;
}
// Usage example
const user = await withRetry(
() => userClient.getUserById(userId),
{ maxRetries: 3, baseDelay: 1000, maxDelay: 10000 }
);
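To make the schedule concrete, the sketch below computes only the deterministic part of the backoff used in `withRetry`, leaving jitter out on purpose. `backoffDelays` is a hypothetical helper written for this illustration.

```typescript
// Deterministic part of the backoff schedule: baseDelay * 2^attempt, capped at maxDelay.
// One delay is produced per retry, so maxRetries retries yield maxRetries delays.
function backoffDelays(maxRetries: number, baseDelay: number, maxDelay: number): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    delays.push(Math.min(baseDelay * Math.pow(2, attempt), maxDelay));
  }
  return delays;
}

console.log(backoffDelays(3, 1000, 10000)); // [1000, 2000, 4000]
console.log(backoffDelays(6, 1000, 10000)); // [1000, 2000, 4000, 8000, 10000, 10000]
```

With the options from the usage example, the three retries wait about 1, 2, and 4 seconds. In `withRetry` itself, up to 1000 ms of random jitter is added before the cap is applied, so each actual wait is slightly longer but never exceeds `maxDelay`.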
3. Bulkhead Pattern
Isolates resources to limit the impact of failures.
// Thread pool / connection pool isolation
const pools = {
userService: new ConnectionPool({ maxConnections: 10 }),
orderService: new ConnectionPool({ maxConnections: 20 }),
paymentService: new ConnectionPool({ maxConnections: 5 }),
};
// Even if one service is overloaded, others are not affected
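`ConnectionPool` above is a placeholder. One possible way to get the same isolation without a pool library is a small in-process concurrency limiter per dependency. The `Bulkhead` and `BulkheadFullError` classes below are hypothetical names, and this sketch rejects excess calls immediately rather than queueing them.

```typescript
// Raised when the compartment for a dependency is already full.
class BulkheadFullError extends Error {
  constructor(bulkheadName: string) {
    super(`Bulkhead "${bulkheadName}" is full`);
    this.name = 'BulkheadFullError';
  }
}

// Fail-fast bulkhead: at most `maxConcurrent` calls in flight per dependency.
class Bulkhead {
  private inFlight = 0;

  constructor(
    private readonly label: string,
    private readonly maxConcurrent: number,
  ) {}

  // Reserve a slot; returns false when the compartment is full.
  tryAcquire(): boolean {
    if (this.inFlight >= this.maxConcurrent) return false;
    this.inFlight++;
    return true;
  }

  // Give a slot back; never drops below zero.
  release(): void {
    if (this.inFlight > 0) this.inFlight--;
  }

  // Run `fn` inside the compartment, releasing the slot on success or failure.
  async run<T>(fn: () => Promise<T>): Promise<T> {
    if (!this.tryAcquire()) throw new BulkheadFullError(this.label);
    try {
      return await fn();
    } finally {
      this.release();
    }
  }
}

// One compartment per downstream service, mirroring the pool sizes above.
const bulkheads = {
  userService: new Bulkhead('userService', 10),
  orderService: new Bulkhead('orderService', 20),
  paymentService: new Bulkhead('paymentService', 5),
};
```

A call such as `bulkheads.paymentService.run(() => paymentClient.charge(order))` (with `paymentClient` assumed) would then be rejected once five payment calls are in flight, while calls to the other services keep their own capacity.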
Distributed Transactions
Saga Pattern
Implements transactions spanning multiple services as a series of local transactions.
sequenceDiagram
participant O as Order Service
participant I as Inventory Service
participant P as Payment Service
Note over O,P: Normal Flow
O->>O: 1. Create Order
O->>I: 2. Reserve Stock
I->>P: 3. Process Payment
P-->>O: 4. All Success
Note over O,P: Compensating Flow (Payment failure)
P->>P: Payment Failed
P->>I: Rollback Stock
I->>O: Cancel Order
// Saga Orchestrator
class OrderSaga {
async execute(orderData: CreateOrderData): Promise<Order> {
const sagaLog: SagaStep[] = [];
try {
// Step 1: Create order
const order = await this.orderService.create(orderData);
sagaLog.push({ service: 'order', action: 'create', data: order });
// Step 2: Reserve inventory
await this.inventoryService.reserve(order.items);
sagaLog.push({ service: 'inventory', action: 'reserve', data: order.items });
// Step 3: Process payment
await this.paymentService.process(order.id, order.totalAmount);
sagaLog.push({ service: 'payment', action: 'process', data: order.id });
// Step 4: Confirm order
await this.orderService.confirm(order.id);
return order;
} catch (error) {
// Compensating transactions (execute in reverse order)
await this.compensate(sagaLog);
throw error;
}
}
private async compensate(sagaLog: SagaStep[]): Promise<void> {
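// Copy before reversing so the caller's log is not mutated
for (const step of [...sagaLog].reverse()) {
switch (step.service) {
case 'payment':
// Reached only if a step after payment (order confirmation) fails;
// refund() is an assumed method on the payment service
await this.paymentService.refund(step.data);
break;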
case 'inventory':
await this.inventoryService.release(step.data);
break;
case 'order':
await this.orderService.cancel(step.data.id);
break;
}
}
}
}
Observability
The Three Pillars
The Three Pillars of Observability:
| Pillar | Purpose | Tools |
|---|---|---|
| Logs | Application event recording | ELK Stack, Loki |
| Metrics | System state as numbers | Prometheus, Grafana |
| Traces | Request flow tracking | Jaeger, Zipkin |
Distributed Tracing
// Distributed tracing with OpenTelemetry
import { trace, SpanKind, SpanStatusCode } from '@opentelemetry/api';
const tracer = trace.getTracer('order-service');
async function createOrder(req: Request): Promise<Order> {
return tracer.startActiveSpan(
'createOrder',
{ kind: SpanKind.SERVER },
async (span) => {
try {
span.setAttribute('user.id', req.userId);
// Child span: User validation
const user = await tracer.startActiveSpan('validateUser', async (childSpan) => {
try {
return await userClient.getUser(req.userId);
} finally {
childSpan.end(); // end the span even if the call throws
}
});
// Child span: Save order
const order = await tracer.startActiveSpan('saveOrder', async (childSpan) => {
try {
const result = await orderRepository.save(req.orderData);
childSpan.setAttribute('order.id', result.id);
return result;
} finally {
childSpan.end(); // end the span even if saving fails
}
});
span.setStatus({ code: SpanStatusCode.OK });
return order;
} catch (error) {
span.setStatus({
code: SpanStatusCode.ERROR,
message: error instanceof Error ? error.message : String(error),
});
throw error;
} finally {
span.end();
}
}
);
}
Criteria for Adopting Microservices
When to Adopt
- Team of 10+ people needing independent development
- Different parts have different scaling requirements
- Technology stack diversity is required
- Fault isolation is important
When to Avoid
- Small team (around 3-5 people)
- Early stage with insufficient domain understanding
- Lack of operational capability (monitoring, CI/CD)
- Simple CRUD applications
Important: The “start with a monolith and split as needed” approach is recommended in many cases.
Summary
Microservices architecture can bring significant benefits when implemented properly, but it also comes with complexity.
Design Principles
- Single Responsibility: 1 service = 1 business function
- Data Independence: Separate DB per service
- Loose Coupling: Communication through interfaces
- High Cohesion: Related functions within the same service
Essential Patterns
- API Gateway: Single entry point
- Circuit Breaker: Prevent fault propagation
- Saga: Distributed transaction management
- Distributed Tracing: Request tracking
The key to successful microservices is proper boundary definition and building a robust operational foundation.