A tale of a Cassandra datastore migration

Verma Shailendra
4 min read · Jan 2, 2020
Live Cassandra Datastore migration from AWS to Azure

Considering efficiency and economies of scale, most organisations are building cloud-based solutions for external as well as internal business applications, data, and compute infrastructure. There are various cloud service providers (CSPs) in the market, e.g. Amazon, Cisco, Google, IBM, Oracle, and SAP. Among them, the three major and most widely used CSPs are Amazon's AWS, Microsoft's Azure, and Google's GCP.

Many organisations are migrating their existing applications and data into the cloud to benefit from elasticity, scaling, redundancy, flexibility, and the pay-as-you-go model. Beyond that, organisations often decide to migrate their data and infrastructure from one CSP to another to get the most effective IT environment possible, based on factors such as cost, performance, and security.

Recently we also migrated from AWS to Microsoft Azure, mainly due to an organisational decision driven by cost and offerings. Migrating from one cloud provider to another can be a mammoth task if applications are not designed with this requirement in mind. Fortunately, this was not the case for us: in my organisation we use a common library that contains most of the cloud-native dependencies along with an abstraction layer on top of them (such as S3 or Azure Blob integration), so application migration was not very challenging to deal with.

Unfortunately, the same is not true for datastore and managed-services migration. One of my applications uses Cassandra, and I was tasked with migrating it. While choosing a data and datastore migration strategy you need to consider many things: whether downtime is required, and if so, how to minimise the cutover duration; if not, how to perform a live migration. I will share my experience with the Cassandra migration below.

The overall strategy was to create an identical cluster in the Azure environment, with the same number of nodes and the same Cassandra version, and migrate all the data up front; at cutover time we then only had to migrate the incremental data, keeping the downtime low.

Steps to perform in the Existing Cluster

1.) Take a full snapshot of the cluster by running the nodetool command on each node:
nodetool snapshot -t full_snapshot_name

You can use pssh to run the command in parallel on all nodes (there is no need to flush memtables separately, as this command takes care of it). For each keyspace, a separate snapshot folder named after the snapshot tag is created inside the corresponding data directory.
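For example, with pssh and a hosts.txt file listing one node address per line (the file name is an assumption for illustration):

pssh -h hosts.txt -i "nodetool snapshot -t full_snapshot_name"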

2.) Copy the snapshot folders for each keyspace to cloud storage, such as S3 or Azure Blob Storage.
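Here is a hedged sketch for one table using the AWS CLI; the data path, bucket, keyspace, and table names are placeholders, and you would repeat (or script) this for every keyspace/table on every node:

# run on each node; repeat per keyspace/table
aws s3 sync /var/lib/cassandra/data/my_keyspace/my_table-*/snapshots/full_snapshot_name s3://my-backup-bucket/$(hostname)/my_keyspace/my_table/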

3.) Dump the list of tokens handled by each node using the command below:

nodetool ring | grep ip_address_of_node | awk '{print $NF ","}' | xargs
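A small sketch that saves every node's token list into its own file, assuming the same hosts.txt used earlier:

mkdir -p tokens
while read -r node; do
  # nodetool ring shows the whole ring; filter the lines for this node and keep the token (last field)
  nodetool ring | grep "$node" | awk '{print $NF ","}' | xargs > "tokens/${node}.txt"
done < hosts.txt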

4.) Set incremental_backups: true in cassandra.yaml on each node and restart the cluster. (With this enabled, Cassandra takes an incremental backup of every SSTable it creates.)
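One hedged way to roll this out, assuming the config sits at /etc/cassandra/cassandra.yaml with the incremental_backups key already present, and that the nodes run Cassandra as a systemd service named cassandra (adjust paths and service name to your setup):

# flip the flag on every node (assumes the key exists uncommented in cassandra.yaml)
pssh -h hosts.txt "sudo sed -i 's/^incremental_backups:.*/incremental_backups: true/' /etc/cassandra/cassandra.yaml"
# rolling restart, one node at a time; adjust the wait to however long your nodes take to rejoin
while read -r node; do ssh "$node" "sudo systemctl restart cassandra"; sleep 120; done < hosts.txt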

Steps to perform in the New Cluster

1.) Shut down the cluster.

2.) Remove all data from the data folder, including the system keyspaces. (If the system keyspaces are not present, the cluster will initialise from the cassandra.yaml settings, including the token ranges.)
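A minimal wipe sketch for each new node, assuming the default package layout under /var/lib/cassandra (check data_file_directories, commitlog_directory, and saved_caches_directory in your cassandra.yaml):

sudo systemctl stop cassandra                                   # the node must be down
sudo rm -rf /var/lib/cassandra/data/*                           # table data, including system keyspaces
sudo rm -rf /var/lib/cassandra/commitlog/* /var/lib/cassandra/saved_caches/*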

3.) Change the initial_token value in cassandra.yaml on each node to the list of tokens dumped in the step above. (This maps corresponding nodes one-to-one from the old cluster to the new cluster.)
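For illustration, the relevant cassandra.yaml excerpt on a new node could look like the following; the token values are placeholders, so paste the comma-separated list dumped from the matching old node and make sure num_tokens equals the number of tokens in that list:

# cassandra.yaml excerpt on the new node mapped to old node A (placeholder tokens)
num_tokens: 3
initial_token: -9211685311752564814,-3074457345618258603,3074457345618258601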

4.) Create the schema for all required keyspaces and tables.
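One hedged way to carry the schema over, assuming cqlsh connectivity to both clusters (old_node_ip and new_node_ip are placeholders):

cqlsh old_node_ip -e "DESCRIBE SCHEMA" > schema.cql    # dump the schema from the old cluster
cqlsh new_node_ip -f schema.cql                        # replay it on the new cluster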

5.) Copy the snapshot data from cloud storage for each keyspace into the corresponding data folder on each mapped node. (The node-to-node mapping was done in the step above: if you copied the token values of node A in the old cluster to node X in the new cluster, then copy the snapshot data of node A to node X.)
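A restore sketch for one table on the mapped node, with placeholder bucket, keyspace, and table names; note the files go straight into the table's live data directory (not into a snapshots subfolder), and the chown assumes Cassandra runs as the cassandra user:

# on new node X, which maps to old node A
aws s3 sync s3://my-backup-bucket/node-A/my_keyspace/my_table/ /tmp/restore/my_keyspace/my_table/
sudo cp /tmp/restore/my_keyspace/my_table/* /var/lib/cassandra/data/my_keyspace/my_table-*/
sudo chown -R cassandra:cassandra /var/lib/cassandra/data/my_keyspace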

6.) Now just start the cluster and run the nodetool repair command on any one node. (There is no need to run the repair command on each node.)

nodetool repair

Once the full snapshot is applied using the above steps, it is time to wait for the cutover window and migrate only the incremental data. At cutover time, make sure the application is not writing data to either cluster.

Steps for migrating incremental data

1.) Run the flush command on each node in the existing cluster. (When incremental_backups is enabled on a Cassandra node, whenever a memtable is flushed to an SSTable, a backup hard link is automatically created at the backup location.)
 nodetool flush

2.) Copy the incremental backups (the backups folder under each table's data directory) to the corresponding nodes in the new cluster.
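A sketch for one table, again with placeholder bucket, keyspace, and table names, pushing the backups folder from old node A and pulling it straight into the table's data directory on its mapped node X:

# on old node A
aws s3 sync /var/lib/cassandra/data/my_keyspace/my_table-*/backups/ s3://my-backup-bucket/incremental/node-A/my_keyspace/my_table/
# on new node X
aws s3 sync s3://my-backup-bucket/incremental/node-A/my_keyspace/my_table/ /var/lib/cassandra/data/my_keyspace/my_table-*/
sudo chown -R cassandra:cassandra /var/lib/cassandra/data/my_keyspace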

3.) Run refresh for each keyspace and table. (This does not need a restart of the cluster.)

 nodetool refresh keyspace_name table_name

The new cluster is now ready for reads and writes from your application. With the above strategy I was able to migrate a 6-node Cassandra cluster holding around 2.6 TB of data with just 12 minutes of downtime.
