K8s migration framework

MG

Kubernetes workload migration framework.

Developed a comprehensive framework for Kubernetes workload migrations, with an example workflow for migrations between node types in a cluster. This allows the client to build multiple workflows to cover different scenarios including cross-cluster migrations.

Highlights

  • simple but powerful design
  • non-disruptive migration of highly available databases
  • backups to allow rollbacks to any stage
  • end-to-end test and load-test support for testing migrations

Feature Details

Simple, but Powerful Design

The design follows best practices such as separation of concerns and the SOLID principles, which make the components easy to extend, test, and reuse.

At a high level, the framework is composed of a controller that executes the workflows, subject-specific reusable steps, and strategies that define concrete, outcome-oriented workflows.

The system comes with a number of predefined steps for workload migrations on Kubernetes, such as running preflight checks, backing up resources and data from volumes and snapshots, suspending/resuming processing of resources, stashing/restoring resources, and modifying resources. Implementing custom steps is simply a matter of adding classes that implement a simple interface.
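As a rough illustration of what such a step interface might look like (the names `Step`, `run`, and `PreflightCheck` are hypothetical, not the framework's actual API), a custom step could be as small as this:

```python
from abc import ABC, abstractmethod


class Step(ABC):
    """A single reusable unit of work in a migration workflow.

    Illustrative sketch only; the framework's real interface may differ.
    """

    @abstractmethod
    def run(self, context: dict) -> None:
        """Execute the step against a shared workflow context."""


class PreflightCheck(Step):
    """Example custom step: verify the target node pool exists."""

    def __init__(self, required_node_pool: str):
        self.required_node_pool = required_node_pool

    def run(self, context: dict) -> None:
        node_pools = context.get("node_pools", [])
        if self.required_node_pool not in node_pools:
            raise RuntimeError(
                f"target node pool {self.required_node_pool!r} not found"
            )
        context["preflight_ok"] = True
```

A shared context dictionary is one simple way for steps to pass results to later steps without coupling them to each other.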

This makes it easy to implement various workflows by combining appropriate subjects and steps in a new strategy class.
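A strategy can then be little more than an ordered list of steps that the controller executes. This sketch uses plain callables instead of step classes to stay short; the step names are invented for illustration:

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class NodeMigrationStrategy:
    """Hypothetical strategy: an ordered sequence of steps to run."""

    steps: List[Callable[[dict], None]] = field(default_factory=list)

    def execute(self, context: dict) -> dict:
        # The controller runs each step in order, passing along a
        # shared context the steps read from and write to.
        for step in self.steps:
            step(context)
        return context


# Illustrative steps that just record what they did.
def suspend_workload(ctx: dict) -> None:
    ctx.setdefault("log", []).append("suspended")


def move_to_new_nodes(ctx: dict) -> None:
    ctx.setdefault("log", []).append("moved")


def resume_workload(ctx: dict) -> None:
    ctx.setdefault("log", []).append("resumed")
```

Composing a new workflow is then just instantiating a strategy with a different sequence of steps.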

Non-disruptive Migrations

The system of subjects and steps is flexible enough to allow, for example, suspending a StatefulSet and then processing each of its replicas one at a time. This made it possible to implement workflows that migrate highly available databases such as PostgreSQL, MongoDB, and MariaDB non-disruptively, with the service remaining available throughout the migration.
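The replica-at-a-time idea can be sketched as a loop that migrates one replica, waits for it to become healthy again, and only then moves on, so at most one replica is ever down at a time. The `migrate` and `is_healthy` callables here are placeholders, not framework APIs:

```python
def migrate_replicas_one_at_a_time(replicas, migrate, is_healthy):
    """Sketch of a non-disruptive, replica-by-replica migration.

    Only one replica is ever out of service at a time, so a highly
    available database keeps serving traffic throughout.
    """
    migrated = []
    for replica in replicas:
        migrate(replica)  # e.g. reschedule the pod onto a new node
        if not is_healthy(replica):
            raise RuntimeError(f"replica {replica} unhealthy after migration")
        migrated.append(replica)  # proceed only once the replica rejoined
    return migrated
```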

Backups to Allow Rollbacks

The framework includes a set of utility classes that let the migration steps record and restore the state of any resources they modify in the Kubernetes cluster. This makes it possible to roll back the modifications whenever something does not go according to plan.
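The core of such a record-and-restore utility can be sketched as follows. `StateRecorder` is an invented name, and a plain dictionary stands in for the cluster; the real framework would snapshot actual Kubernetes resources:

```python
import copy


class StateRecorder:
    """Hypothetical utility that snapshots resources before modification
    so a failed workflow can be rolled back stage by stage."""

    def __init__(self):
        self._snapshots = []  # (name, deep copy of resource), in order

    def record(self, name: str, resource: dict) -> None:
        # Deep-copy so later mutations don't corrupt the snapshot.
        self._snapshots.append((name, copy.deepcopy(resource)))

    def rollback(self, store: dict) -> None:
        # Restore in reverse order so the earliest snapshot of each
        # resource is what ends up in place.
        for name, snapshot in reversed(self._snapshots):
            store[name] = snapshot
```

Because every stage records state before modifying it, a rollback can unwind to any earlier point of the workflow.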

End-to-end and Load Test Support

A separate tool supports three modes of operation: populating data stores with random data and recording hashes of the stored data, running as a service in load-test mode where it continuously reads or writes those records, and a validation mode where it compares the data store contents with the recorded hashes and reports any inconsistencies.

This allows testing the migration workflows while another service is actively using the data stores, which is particularly useful for developing workflows that migrate highly available databases or data stores.
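The populate and validate modes can be sketched with standard-library hashing against an in-memory dictionary standing in for a real data store; function names and record layout are illustrative:

```python
import hashlib
import random
import string


def populate(store: dict, n: int, seed: int = 0) -> dict:
    """Fill `store` with random records; return key -> SHA-256 hash."""
    rng = random.Random(seed)
    recorded = {}
    for i in range(n):
        key = f"record-{i}"
        value = "".join(rng.choices(string.ascii_letters, k=32))
        store[key] = value
        recorded[key] = hashlib.sha256(value.encode()).hexdigest()
    return recorded


def validate(store: dict, recorded: dict) -> list:
    """Compare store contents against recorded hashes; return the keys
    whose current contents no longer match."""
    return [
        key for key, digest in recorded.items()
        if hashlib.sha256(store.get(key, "").encode()).hexdigest() != digest
    ]
```

After a migration, an empty mismatch list from `validate` is evidence that no records were lost or corrupted along the way.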

The tool abstracts data store interfaces using the adapter pattern and supports multiple backends like MinIO, MongoDB, MySQL/MariaDB, PostgreSQL, and Redis.
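The adapter pattern here means one common interface with a thin subclass per backend. A minimal sketch, with invented names and an in-memory stand-in so it runs without a server:

```python
from abc import ABC, abstractmethod
from typing import Optional


class DataStoreAdapter(ABC):
    """Hypothetical common interface; in practice one subclass would
    wrap each backend (MinIO, MongoDB, MySQL/MariaDB, PostgreSQL,
    Redis, ...)."""

    @abstractmethod
    def put(self, key: str, value: str) -> None: ...

    @abstractmethod
    def get(self, key: str) -> Optional[str]: ...


class InMemoryAdapter(DataStoreAdapter):
    """Stand-in backend used here so the sketch needs no server."""

    def __init__(self):
        self._data = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)
```

The populate, load-test, and validation modes can then be written once against `DataStoreAdapter` and reused unchanged across every backend.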

Skills & Technologies