← Back to portfolio 2025-09-15

Cloud Migration Is Not Lift-and-Shift: Lessons from Years of Practice

Cloud MigrationAWSOCIArchitectureLessons

After years of migrating systems between clouds, the biggest misconception is that migration is about moving workloads. It is not. It is about rebuilding operational practices for a new platform.

The Three Phases Nobody Plans For

Phase 1: The infrastructure works, but monitoring does not. Services deploy successfully on the new cloud. Pods are running, databases are connected. But Grafana dashboards show nothing because the metric names changed, the log format is different, and alert rules reference source-cloud-specific dimensions. The team is flying blind for weeks.

Phase 2: The developers do not trust it. The infrastructure team certified the new platform. But developers still deploy to the old one because they know how to debug it, they know where the logs are, and their muscle memory is tuned to the old CLI commands. Trust takes months of working alongside teams, pair-debugging issues, and showing that the new platform is equally observable.

Phase 3: The old platform will not die. Months after "migration complete," there are still services on the old platform. One has a dependency on a cloud-specific feature. One belongs to a team that is perpetually busy. One is "scheduled for next quarter." These stragglers create ongoing cost and operational overhead.

What Actually Works

Migrate observability first. Before moving any workload, set up monitoring, logging, and alerting on the new platform. When developers can see their metrics on the new platform before their code runs there, trust builds faster.

Migrate the on-call engineer, not just the service. The engineer who will be paged at 3 AM for this service on the new platform should be involved in the migration. If they are not, the first incident will be a disaster.

Set a kill date for the old platform. Not a migration date but a kill date. The date when the old platform is turned off regardless of what is still running on it. This creates urgency for the stragglers.