Rolling deployment
A rolling deployment updates a software application gradually, replacing instances running the old version with instances running the new version in increments rather than all at once. During the deployment, both versions run simultaneously, and traffic shifts progressively until every instance runs the new version. This approach reduces risk and enables zero-downtime updates.
Why it matters
Traditional "big bang" deployments take the entire system offline, update everything, and bring it back up. This creates downtime, and if something goes wrong, all users are affected immediately.
Rolling deployments avoid downtime by keeping old-version instances in service until their replacements are confirmed healthy. Users experience continuous service throughout the deployment. If problems emerge, only a fraction of users are affected, and the deployment can stop or roll back before damage spreads.
This gradual approach also reduces the blast radius of bugs. If the new version has a critical flaw, it can be detected while that version serves a small percentage of traffic rather than after all users are affected. The combination of zero downtime and reduced risk makes rolling deployments standard practice for production systems.
How rolling deployments work
The basic mechanics involve:
Instance pool. The application runs on multiple instances (containers, servers, pods). A load balancer distributes traffic across them.
Sequential updates. One or more instances are taken out of the pool, updated to the new version, and returned to service.
Health verification. Before proceeding, the updated instances are verified to be healthy and serving traffic correctly.
Gradual progression. The process repeats until all instances run the new version.
Traffic management. The load balancer continues routing to healthy instances throughout, whether old or new version.
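The steps above can be sketched as a small simulation. This is a minimal illustration, not how any particular platform implements rolling updates: the `Instance` class, the `rolling_update` function, and the `is_healthy` callback are hypothetical names, and the "upgrade" is reduced to changing a version string.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Instance:
    name: str
    version: str
    in_pool: bool = True  # whether the load balancer routes traffic to it


def rolling_update(
    instances: List[Instance],
    new_version: str,
    batch_size: int,
    is_healthy: Callable[[Instance], bool],
) -> bool:
    """Update instances batch by batch; return False if the rollout halts."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(instances), batch_size):
        batch = instances[start:start + batch_size]
        for inst in batch:
            inst.in_pool = False        # take out of the load balancer pool
            inst.version = new_version  # stand-in for the real upgrade step
        if not all(is_healthy(inst) for inst in batch):
            return False                # halt; this batch stays out of the pool
        for inst in batch:
            inst.in_pool = True         # return to service
    return True
```

Instances that have not yet been reached keep serving the old version, which is why a halted rollout leaves the service available.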
Rolling deployment parameters
Several settings control rolling deployment behavior:
Batch size. How many instances update simultaneously. Smaller batches are safer but slower. "1 at a time" maximizes safety; "50% at a time" is faster.
Health check criteria. What must be true for an instance to be considered healthy? HTTP response codes, latency thresholds, custom checks.
Wait period. How long to observe new instances before proceeding. Longer waits catch problems; shorter waits speed deployment.
Failure threshold. How many instances can fail before halting the deployment? Zero tolerance is strictest; some tolerance accommodates flaky health checks and transient failures.
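To make the tradeoff between these settings concrete, the sketch below models them as a small configuration object. The names `RolloutConfig`, `batch_count`, `minimum_duration`, and `should_halt` are assumptions made for illustration and do not correspond to any specific platform's API.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RolloutConfig:
    batch_size: int      # instances updated at once
    wait_seconds: float  # observation time after each batch
    max_failures: int    # failed instances tolerated before halting


def batch_count(total_instances: int, cfg: RolloutConfig) -> int:
    """Number of batches needed to cover every instance."""
    return math.ceil(total_instances / cfg.batch_size)


def minimum_duration(total_instances: int, cfg: RolloutConfig) -> float:
    """Lower bound on rollout time: observation waits only, ignoring upgrade time."""
    return batch_count(total_instances, cfg) * cfg.wait_seconds


def should_halt(failed_instances: int, cfg: RolloutConfig) -> bool:
    """Halt once failures exceed the tolerated count."""
    return failed_instances > cfg.max_failures
```

With 10 instances and a 60-second wait, updating one at a time spends at least 10 minutes observing, while updating 5 at a time spends at least 2.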
Rolling deployment challenges
Several factors complicate rolling deployments:
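Version compatibility. During deployment, old and new versions serve traffic simultaneously. They must be compatible with each other: the new version has to honor the API contracts, data formats, and database schema the old version relies on, and the old version has to tolerate anything the new version writes.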
Session affinity. If users are sticky to specific instances, they might experience version inconsistency during deployment. Stateless designs avoid this.
Database migrations. Schema changes must work with both versions. This typically means deploying migrations separately from code.
Long-running processes. Requests or jobs in progress when an instance updates may fail. Graceful draining handles in-flight work.
Stateful applications. Applications with local state (caches, files) require special handling during instance replacement.
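One common way to handle version compatibility is to read both the old and new data shapes while a rollout is in progress. The sketch below assumes a hypothetical schema change in which old instances write a `name` field and new instances write `full_name`; the field and function names are invented for illustration.

```python
from typing import Any, Dict


def read_display_name(payload: Dict[str, Any]) -> str:
    """Accept both the old and the new payload shape during a rollout."""
    if "full_name" in payload:  # written by new-version instances
        return payload["full_name"]
    if "name" in payload:       # written by old-version instances
        return payload["name"]
    raise KeyError("payload has neither 'full_name' nor 'name'")


def write_payload(display_name: str) -> Dict[str, Any]:
    """Write both fields until no old-version readers remain."""
    return {"name": display_name, "full_name": display_name}
```

Once every instance runs the new version, a later release can drop the old field.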
Rolling vs. other deployment strategies
Rolling deployments are one of several approaches:
| Strategy | Description | Tradeoffs |
|---|---|---|
| Rolling | Gradual instance replacement | Zero downtime, moderate complexity |
| Blue-Green | Two full environments, instant switch | Simple rollback, double resources |
| Canary | Small subset first, then full | Maximum control, more orchestration |
| Big Bang | All at once | Simple, but downtime and risk |
Rolling deployments balance simplicity and safety. Blue-green offers cleaner separation but higher cost. Canary provides finer control but more complexity. Choose based on your risk tolerance, infrastructure, and operational capability.
Implementing rolling deployments
Modern platforms provide rolling deployment capabilities:
Kubernetes supports rolling updates natively with configurable parameters for max surge, max unavailable, and health checks.
AWS ECS offers rolling updates with deployment circuit breakers.
Cloud load balancers (ALB, GCP Load Balancing) support gradual traffic shifting.
Container orchestrators generally include rolling update capabilities as standard features.
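Configuration typically involves setting the batch size (expressed in Kubernetes as max surge and max unavailable), the health checks that gate each step, and the wait period between batches.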
Best practices
Several practices improve rolling deployment success:
Implement proper health checks. Shallow checks (is the port open?) miss problems that deeper checks (can we serve a real request?) catch.
Enable graceful shutdown. Instances should complete in-progress requests before terminating. Abrupt termination causes errors.
Design for backward compatibility. Both versions will run simultaneously. Plan for this explicitly.
Automate completely. Manual rolling deployments are error-prone. Automation ensures consistency.
Monitor actively. Watch error rates, latency, and business metrics during deployment. Pause if problems emerge.
Practice rollbacks. Ensure you can roll back quickly. Test rollback procedures before you need them.
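As an illustration of graceful shutdown, the sketch below tracks in-flight requests so an instance can stop accepting new work and wait for existing work to finish. `DrainingWorker` is a hypothetical class; a real service would typically call `shutdown` from its termination-signal handler, which is assumed here and not shown.

```python
import threading


class DrainingWorker:
    """Tracks in-flight requests so shutdown can wait for them to finish."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._in_flight = 0
        self._accepting = True

    def try_begin(self) -> bool:
        """Register a new request; refuse it if shutdown has started."""
        with self._cond:
            if not self._accepting:
                return False
            self._in_flight += 1
            return True

    def end(self) -> None:
        """Mark a request as finished and wake any waiting shutdown."""
        with self._cond:
            self._in_flight -= 1
            self._cond.notify_all()

    def shutdown(self, timeout: float) -> bool:
        """Stop accepting work, then wait up to `timeout` seconds to drain.

        Returns True if all in-flight requests finished in time.
        """
        with self._cond:
            self._accepting = False
            return self._cond.wait_for(lambda: self._in_flight == 0, timeout=timeout)
```

The timeout matters: without it, a single stuck request could block the rollout indefinitely.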
Observability during rolling deployments
Visibility matters during the transition:
Version tagging. Tag metrics and logs by version. See how the new version behaves compared to old.
Error rate monitoring. Watch for spikes as new instances take traffic. Compare error rates between versions.
Performance comparison. Is the new version faster or slower? Detect regressions before completing deployment.
Business metrics. Conversion rates, successful transactions, and other business outcomes shouldn't degrade during deployment.
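A version-tagged comparison might look like the following sketch. It assumes request records are available as `(version_tag, http_status)` pairs and treats any 5xx status as an error; the function names and the `max_increase` threshold are illustrative choices, not recommendations.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def error_rate_by_version(requests: Iterable[Tuple[str, int]]) -> Dict[str, float]:
    """Compute the fraction of 5xx responses for each version tag."""
    totals: Dict[str, int] = defaultdict(int)
    errors: Dict[str, int] = defaultdict(int)
    for version, status in requests:
        totals[version] += 1
        if status >= 500:
            errors[version] += 1
    return {version: errors[version] / totals[version] for version in totals}


def should_pause(
    rates: Dict[str, float], old: str, new: str, max_increase: float = 0.01
) -> bool:
    """Pause when the new version's error rate exceeds the old one's by more than max_increase."""
    return rates.get(new, 0.0) - rates.get(old, 0.0) > max_increase
```

Early in a rollout the new version has served few requests, so its rate is noisy; a production check would also want a minimum sample size before acting.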
Tools like Klero help ensure that what you're deploying addresses real user needs. When rolling deployments deliver features customers actually want, the investment in safe deployment practices pays off in user value rather than just risk mitigation.

