Because of CockroachDB's multi-active availability design, you can perform a "rolling upgrade" of your CockroachDB cluster. This means that you can upgrade nodes one at a time without interrupting the cluster's overall health and operations.
This page shows you how to upgrade to the latest v2.1 release (v2.1.11) from v2.0.x, or from any patch release in the v2.1.x series. To upgrade within the v2.0.x series, see the v2.0 version of this page.
Step 1. Verify that you can upgrade
If you are upgrading from v2.0.5 or later to v2.1, you do not have to go through intermediate releases; continue to step 2. However, if you are upgrading from v2.0.4 or earlier to v2.1, complete the following steps first:
Upgrade to v2.0.5 or any later patch release in the v2.0.x series. If upgrading from v1.1.x, be sure to complete all the steps, including the finalization step.
Return to this page and perform a second rolling upgrade to v2.1.
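The rule in this step can be sketched as a small shell check. This is only an illustration: upgrade_path is a hypothetical helper name, and it assumes version strings of the form v2.0.5 or 2.0.5.

```shell
# upgrade_path is a hypothetical helper, not part of CockroachDB.
# Given the version a cluster currently runs, it prints:
#   direct        a single rolling upgrade to v2.1 is enough
#   intermediate  upgrade to v2.0.5 or later first, then to v2.1
upgrade_path() {
    version="${1#v}"                # strip a leading "v" if present
    major="${version%%.*}"
    rest="${version#*.}"
    minor="${rest%%.*}"
    patch="${rest#*.}"
    patch="${patch%%[!0-9]*}"       # drop suffixes such as "-rc.1"
    patch="${patch:-0}"

    if [ "$major" -eq 2 ] && [ "$minor" -eq 1 ]; then
        echo "direct"               # already in the v2.1.x series
    elif [ "$major" -eq 2 ] && [ "$minor" -eq 0 ] && [ "$patch" -ge 5 ]; then
        echo "direct"               # v2.0.5 or later
    else
        echo "intermediate"         # v2.0.4 or earlier
    fi
}
```

For example, upgrade_path v2.0.4 prints intermediate, while upgrade_path v2.0.5 prints direct.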
Step 2. Prepare to upgrade
Before starting the upgrade, complete the following steps.
Make sure your cluster is behind a load balancer, or your clients are configured to talk to multiple nodes. If your application communicates with a single node, stopping that node to upgrade its CockroachDB binary will cause your application to fail.
Verify the overall health of your cluster using the Admin UI. On the Cluster Overview:
- Under Node Status, make sure all nodes that should be live are listed as such. If any nodes are unexpectedly listed as suspect or dead, identify why the nodes are offline and either restart them or decommission them before beginning your upgrade. If there are dead and non-decommissioned nodes in your cluster, it will not be possible to finalize the upgrade (either automatically or manually).
- Under Replication Status, make sure there are 0 under-replicated and unavailable ranges. Otherwise, performing a rolling upgrade increases the risk that ranges will lose a majority of their replicas and cause cluster unavailability. Therefore, it's important to identify and resolve the cause of range under-replication and/or unavailability before beginning your upgrade.
- In the Node List:
  - Make sure all nodes are on the same version. If any nodes are behind, upgrade them to the cluster's current version first, and then start this process over.
  - Make sure capacity and memory usage are reasonable for each node. Nodes must be able to tolerate some increase in case the new version uses more resources for your workload. Also go to Metrics > Dashboard: Hardware and make sure CPU percent is reasonable across the cluster. If there's not enough headroom on any of these metrics, consider adding nodes to your cluster before beginning your upgrade.
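The "same version" check can also be scripted rather than read off the Admin UI. The following is a minimal sketch, not an official tool: same_build is a hypothetical helper, and it assumes it is fed CSV with a header row that contains a column named build (as the output of cockroach node status --format=csv is assumed to have). It splits on commas naively, so it would not handle quoted fields that contain commas.

```shell
# same_build is a hypothetical helper. It reads CSV on stdin and succeeds
# only if every data row has the same value in the "build" column.
same_build() {
    awk -F',' '
        # Header row: find the position of the "build" column.
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "build") col = i; next }
        # Data rows: remember each distinct build value.
        col && $col != "" { seen[$col] = 1 }
        END {
            n = 0
            for (b in seen) n++
            exit (n == 1 ? 0 : 1)   # exactly one distinct version
        }
    '
}
```

One possible use, assuming an insecure cluster: cockroach node status --format=csv --insecure --host=<address of any node> | same_build || echo "nodes are on different versions".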
Capture the cluster's current state by running the cockroach debug zip command against any node in the cluster. If the upgrade does not go according to plan, the captured details will help you and Cockroach Labs troubleshoot issues.
Back up the cluster. If the upgrade does not go according to plan, you can use the data to restore your cluster to its previous state.
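The capture and backup steps could be wrapped in one helper, sketched below. Several things here are assumptions: capture_state is a hypothetical name, the --insecure flag assumes an insecure cluster (a secure cluster would pass --certs-dir instead), and cockroach dump is used as one possible way to back up a single database; an enterprise cluster might run a BACKUP statement instead.

```shell
# capture_state is a hypothetical helper, shown for an insecure cluster.
#   $1  address of any node in the cluster
#   $2  name of the database to dump
#   $3  directory that receives debug.zip and the SQL dump
capture_state() {
    host="$1"
    database="$2"
    outdir="$3"
    mkdir -p "$outdir" || return 1
    # Collect logs and cluster details for troubleshooting.
    cockroach debug zip "$outdir/debug.zip" --insecure --host="$host" || return 1
    # Write the database schema and data as SQL statements.
    cockroach dump "$database" --insecure --host="$host" > "$outdir/$database.sql"
}
```

Keep the resulting files somewhere outside the nodes being upgraded so they remain available if a node fails.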
Step 3. Decide how the upgrade will be finalized
This step is relevant only when upgrading from v2.0.x to v2.1. For upgrades within the v2.1.x series, skip this step.
By default, after all nodes are running the new version, the upgrade process will be auto-finalized. This will enable certain performance improvements and bug fixes introduced in v2.1. After finalization, however, it will no longer be possible to perform a downgrade to v2.0. In the event of a catastrophic failure or corruption, the only option will be to start a new cluster using the old binary and then restore from one of the backups created prior to performing the upgrade.
We recommend disabling auto-finalization so you can monitor the stability and performance of the upgraded cluster before finalizing the upgrade, but note that you will need to follow all of the subsequent directions, including the manual finalization in step 5:
Upgrade to v2.0, if you haven't already. The cluster.preserve_downgrade_option setting mentioned below is available only as of v2.0.3.
Start the cockroach sql shell against any node in the cluster.
Set the cluster.preserve_downgrade_option cluster setting:
> SET CLUSTER SETTING cluster.preserve_downgrade_option = '2.0';
It is only possible to set this setting to the current cluster version.
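If you script the upgrade, the same statement can be sent without opening an interactive shell. This is a minimal sketch: preserve_downgrade is a hypothetical helper, and the --insecure flag assumes an insecure cluster.

```shell
# preserve_downgrade is a hypothetical helper, shown for an insecure cluster.
#   $1  address of any node in the cluster
#   $2  current cluster version, for example 2.0
preserve_downgrade() {
    host="$1"
    version="$2"
    # -e runs the statement and exits instead of starting the SQL shell.
    cockroach sql --insecure --host="$host" \
        -e "SET CLUSTER SETTING cluster.preserve_downgrade_option = '$version';"
}
```

For a v2.0.x cluster this would be called as preserve_downgrade <address of any node> 2.0.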
Step 4. Perform the rolling upgrade
For each node in your cluster, complete the following steps.
We recommend creating scripts to perform these steps instead of performing them manually.
Upgrade only one node at a time, and wait at least one minute after a node rejoins the cluster to upgrade the next node. Simultaneously upgrading more than one node increases the risk that ranges will lose a majority of their replicas and cause cluster unavailability.
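The one-node-at-a-time rule can be sketched as a driver loop. Both names are hypothetical: rolling_upgrade is the loop, and upgrade_node is a function you would define yourself to carry out the per-node steps below, returning only after the node has rejoined the cluster.

```shell
# rolling_upgrade and upgrade_node are hypothetical names.
# upgrade_node must be defined separately; it receives one node address and
# should return non-zero if the upgrade of that node failed.
rolling_upgrade() {
    wait_seconds="${WAIT_SECONDS:-60}"   # at least one minute between nodes
    for node in "$@"; do
        echo "upgrading $node"
        if ! upgrade_node "$node"; then
            # Stop immediately so a second node is never taken down
            # while the first is still unhealthy.
            echo "upgrade failed on $node, stopping" >&2
            return 1
        fi
        sleep "$wait_seconds"
    done
}
```

The loop is strictly sequential and aborts on the first failure, which keeps at most one node offline at any time.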
Connect to the node.
Terminate the cockroach process.
Without a process manager like systemd, use this command:
$ pkill cockroach
If you are using systemd as the process manager, use this command to stop a node without systemd restarting it:
$ systemctl stop <systemd config filename>
Then verify that the process has stopped:
$ ps aux | grep cockroach
Alternately, you can check the node's logs for the message server drained and shutdown completed.
Download and install the CockroachDB binary you want to use:
$ curl -O https://binaries.cockroachdb.com/cockroach-v2.1.11.darwin-10.9-amd64.tgz
$ tar -xzf cockroach-v2.1.11.darwin-10.9-amd64.tgz
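For a scripted upgrade, the "terminate, then verify" step can be combined into one helper that waits for the process to exit. This is a sketch for nodes that are not managed by systemd; stop_cockroach is a hypothetical name, and it uses pgrep in place of ps aux | grep so that the check does not match itself.

```shell
# stop_cockroach is a hypothetical helper for nodes not managed by systemd.
#   $1  optional number of seconds to wait for shutdown (default 60)
stop_cockroach() {
    timeout="${1:-60}"
    pkill -x cockroach                  # pkill sends SIGTERM by default
    # Poll until no process named exactly "cockroach" remains.
    while pgrep -x cockroach >/dev/null; do
        if [ "$timeout" -le 0 ]; then
            echo "cockroach is still running" >&2
            return 1
        fi
        sleep 1
        timeout=$((timeout - 1))
    done
}
```

Calling stop_cockroach 120 returns successfully once the process is gone, or fails after roughly two minutes so the script can stop before touching the binary.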