Dark launch
A dark launch deploys new features to production without exposing them to users. The code runs, processes real data, and handles actual traffic, but users keep seeing the existing experience while the new functionality operates invisibly in the background. This approach lets teams validate performance, catch bugs, and build confidence before any user notices that something has changed.
Why it matters
Launching features is risky. Testing environments never perfectly replicate production's scale, data complexity, and usage patterns. Features that work flawlessly in staging can fail spectacularly when millions of users hit them simultaneously. Dark launching reduces this uncertainty by exercising the new code under real production conditions before users experience the change. It does not remove the risk entirely, because users have not yet interacted with the feature itself.
The technique originated at companies operating at massive scale, where even brief outages affect millions of users and cost significant revenue. Facebook famously dark-launched Chat in 2008: before the feature was visible, Facebook pages connected to the chat servers, queried presence information, and simulated message sends without drawing any UI. By the time users saw Chat, the backend had already been exercised under real production load.
How dark launching works
A typical dark launch follows several phases:
Deploy invisibly. The new code ships to production servers but executes behind feature flags or traffic routing that prevents user-visible changes. Users continue seeing the current experience.
Process real data. The new system receives copies of real requests or writes data to shadow databases. It performs all the work it would in a full launch, just without users seeing the output.
Compare results. The new system's outputs are compared against the existing system. Discrepancies reveal bugs before they impact users. Performance metrics show whether the new code can handle production load.
Monitor extensively. Teams watch error rates, latency, resource consumption, and other indicators. Production conditions often reveal issues invisible in testing: unexpected data formats, edge cases, and scale problems.
Enable gradually. Once confident, teams incrementally expose the feature to small user segments, monitoring closely before expanding.
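The first four phases above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: `handle`, `run_shadow`, `old_price`, and `new_price` are hypothetical names, the "request" is a plain dict, and mismatches are collected in a list where a real system would emit metrics. The user always receives the old system's result; the new code runs on a copy of the request in a small background pool, and any discrepancy or crash is recorded rather than surfaced.

```python
import logging
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable

logger = logging.getLogger("dark_launch")

# A small dedicated pool bounds how much shadow work runs concurrently.
shadow_pool = ThreadPoolExecutor(max_workers=4)

Handler = Callable[[dict], Any]


def run_shadow(request: dict, live_result: Any, new_impl: Handler,
               mismatches: list) -> None:
    """Run the candidate code and record any difference from the live result."""
    try:
        shadow_result = new_impl(request)
    except Exception as exc:
        # A crash in the dark path is a finding, never a user-facing error.
        logger.warning("shadow path failed: %r", exc)
        mismatches.append((request, live_result, repr(exc)))
        return
    if shadow_result != live_result:
        mismatches.append((request, live_result, shadow_result))


def handle(request: dict, old_impl: Handler, new_impl: Handler,
           mismatches: list, dark_enabled: bool = True) -> Any:
    """Serve from the old path; optionally exercise the new path invisibly."""
    live_result = old_impl(request)
    if dark_enabled:
        # Shallow copy so the candidate cannot alter the live request's keys.
        shadow_pool.submit(run_shadow, dict(request), live_result,
                           new_impl, mismatches)
    return live_result  # users only ever see the old system's output


# Hypothetical old and new implementations used for illustration.
def old_price(request: dict) -> int:
    return request["qty"] * request["price"]


def new_price(request: dict) -> int:
    total = request["qty"] * request["price"]
    # Deliberate discrepancy: the candidate applies a discount the old code lacks.
    return total - 5 if request["qty"] >= 10 else total
```

Calling `handle({"qty": 10, "price": 5}, old_price, new_price, mismatches)` returns the old system's total to the caller, while the differing result from `new_price` ends up in `mismatches` for later review. Running the shadow work off the request thread keeps the dark path from adding latency to live responses.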
Dark launch vs. related practices
| Practice | Visibility | Traffic | Use Case |
|---|---|---|---|
| Dark launch | Hidden | Real production traffic | Validating at scale |
| Feature flag | Controlled | Real, but targeted | Gradual rollout |
| Canary release | Visible to some | Small percentage | Risk mitigation |
| Blue-green | Visible (instant switch) | All or nothing | Zero-downtime deployment |
| Shadow testing | Hidden | Duplicated traffic | Performance comparison |
Dark launching specifically emphasizes invisible operation with real traffic. Feature flags might expose features to beta users; canary releases are visible to the users receiving them; blue-green deployments switch all traffic at once. Dark launching keeps everything hidden while testing against production conditions.
When to dark launch
Dark launching adds complexity, so it's not appropriate for every feature. It's most valuable when:
Scale is uncertain. Features that work at 1,000 requests per minute might fail at 1,000,000. Dark launching validates that the new code handles real load.
Data complexity is high. Production data is messy in ways test data isn't. Real user inputs, edge cases, and data combinations reveal issues synthetic testing misses.
Downtime is costly. When failures significantly impact users or revenue, the overhead of dark launching is worthwhile for the confidence it provides.
The change is backend-only. Dark launching works best for changes users can't see directly: infrastructure, databases, API backends, recommendation algorithms. User-facing changes are harder to test invisibly.
Reversibility is critical. Dark-launched features can be disabled instantly if problems appear, without users ever knowing something changed.
Implementing dark launches
Successful dark launches rely on a few pieces of infrastructure and supporting practices:
Feature flags control whether new code executes visibly. They enable instant rollback and gradual exposure as confidence builds.
Shadow traffic systems duplicate requests to new infrastructure without affecting responses users receive. This lets the new system process real requests while the old system handles actual responses.
Comparison tooling automatically validates that new systems produce correct results. For deterministic operations, automated comparison catches regressions. For non-deterministic results (like recommendations), statistical analysis validates behavior.
Monitoring and alerting must distinguish between the dark-launched code path and the live one. Both run in production, so teams need metrics tagged by path to see how the new code performs without conflating its numbers with those of the system serving users.
Kill switches enable instant disabling if problems appear. Dark launches should be trivially reversible.
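The feature flag and kill switch described above can be combined in one small decision function. The sketch below is an assumption-laden illustration: `DarkFlag`, `bucket`, and `decide` are hypothetical names, and the flag is an in-memory object where a real deployment would read it from a flag service or configuration store. Each user is hashed into a stable bucket from 0 to 99, so the same user gets the same decision on every request and raising a percentage only ever adds users.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class DarkFlag:
    name: str
    kill_switch: bool = False  # True turns the new code off for everyone
    shadow_percent: int = 0    # share of users whose requests run the new code invisibly
    visible_percent: int = 0   # share of users who actually see the new output


def bucket(flag_name: str, user_id: str) -> int:
    """Map a user to a stable bucket in [0, 100) for this flag."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).digest()
    return int.from_bytes(digest[:4], "big") % 100


def decide(flag: DarkFlag, user_id: str) -> str:
    """Return 'off', 'shadow', or 'visible' for this user."""
    if flag.kill_switch:
        return "off"  # checked first so disabling is a single flag change
    b = bucket(flag.name, user_id)
    if b < flag.visible_percent:
        return "visible"
    if b < flag.shadow_percent:
        return "shadow"
    return "off"


# Dark phase: everyone runs the new code invisibly, nobody sees its output.
dark = DarkFlag("new-search", shadow_percent=100, visible_percent=0)
# Gradual exposure: roughly 5% of users see the output, the rest stay dark.
ramp = DarkFlag("new-search", shadow_percent=100, visible_percent=5)
```

Including the flag name in the hash is a design choice: it keeps the same users from always landing in the earliest bucket of every rollout.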
Common pitfalls
Incomplete coverage means the dark launch exercises only some code paths while missing others. The visible launch then exposes the untested paths, defeating the purpose.
Write side effects can't simply be duplicated. If dark-launched code would send emails, charge credit cards, or modify external systems, those side effects must be mocked or carefully isolated.
Resource contention occurs when shadow traffic competes with production for database connections, API rate limits, or server capacity. Dark launches consume resources even when invisible.
Extended dark periods let dark-launched code drift from the rest of the codebase. The longer code runs invisibly, the more the surrounding system changes around it and the more the validated version diverges from what eventually launches, potentially introducing new issues.
Overconfidence from success is dangerous. A successful dark launch doesn't guarantee a successful visible launch. User behavior changes when they can see features, revealing issues invisible to backend-only testing.
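The write side-effect pitfall above is commonly handled by passing the effectful dependency in, so the dark path can be given a stand-in that records what it would have done. The following is a minimal sketch under assumptions: `LiveSender`, `RecordingSender`, `signup_v1`, `signup_v2`, and `handle_signup` are hypothetical names, and `LiveSender` merely appends to a list where a real sender would contact a mail server.

```python
from dataclasses import dataclass, field
from typing import Protocol


class EmailSender(Protocol):
    def send(self, to: str, subject: str) -> None: ...


@dataclass
class LiveSender:
    """Stand-in for the real sender; production code would deliver the email."""
    delivered: list = field(default_factory=list)

    def send(self, to: str, subject: str) -> None:
        self.delivered.append((to, subject))


@dataclass
class RecordingSender:
    """Given to the dark path: records intended emails, delivers nothing."""
    recorded: list = field(default_factory=list)

    def send(self, to: str, subject: str) -> None:
        self.recorded.append((to, subject))


def signup_v1(email: str, sender: EmailSender) -> str:
    sender.send(email, "Welcome")
    return f"created:{email}"


def signup_v2(email: str, sender: EmailSender) -> str:
    # Candidate implementation with a different welcome subject.
    sender.send(email, "Welcome aboard")
    return f"created:{email}"


def handle_signup(email: str, live_sender: EmailSender):
    """Serve from v1 with real effects; run v2 dark with effects recorded only."""
    result = signup_v1(email, live_sender)
    recorder = RecordingSender()
    try:
        signup_v2(email, recorder)
    except Exception:
        pass  # a dark-path failure must never reach the user
    return result, recorder.recorded
```

The user receives exactly one email, from the live path, while the recorded list shows what the new code intended to send and can be compared against the live behavior. The same pattern applies to payments or calls to external systems.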
Dark launching represents a mature approach to deployment risk management. For teams operating at scale where failures carry significant cost, the investment in dark launch infrastructure pays dividends through fewer incidents and higher confidence in releases.

