Backup/restore still requires manual steps and has not yet been confirmed to work as expected for Bitbucket, Confluence, or Jira. It is your responsibility to judge whether you can risk losing your running cluster or whether a manual backup is advised at this point!
Backup an Atlassian Data Center cluster as follows:
This ensures a consistent filesystem backup based on defensive best practices – there may be options to achieve a consistent snapshot with the cluster still running, but these have not been explored yet.
Backup the filesystem manually via one of the following options:
Sync vs archive
Both examples use a sync approach as a starting point; it is of course also possible, and probably desirable, to devise a more robust archive-based scheme for long-term storage.
Either create an applicable backup snapshot of the cluster's shared data directory mounted under '/media/atl/...' and move it off the instance for safekeeping and long-term storage, for example via the 'aws s3 sync' AWS CLI command:
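A minimal sketch of this option — the bucket name is a placeholder, and the shared home path is truncated in the text above, so substitute your cluster's actual path:

```shell
#!/usr/bin/env bash
set -u

# Assumptions: adjust SHARED_HOME to your cluster's actual shared home
# (the path is truncated above) and BACKUP_BUCKET to a bucket you own.
SHARED_HOME="${SHARED_HOME:-/media/atl/...}"
BACKUP_BUCKET="${BACKUP_BUCKET:-s3://my-dc-backups}"
STAMP="$(date -u +%Y-%m-%dT%H-%M-%SZ)"
TARGET="${BACKUP_BUCKET}/shared-home/${STAMP}/"

if command -v aws >/dev/null 2>&1 && [ -d "${SHARED_HOME}" ]; then
  # --delete mirrors the source exactly; drop it to retain removed files.
  aws s3 sync --delete "${SHARED_HOME}" "${TARGET}"
else
  echo "Skipping sync: aws CLI or ${SHARED_HOME} not available"
fi
```

Using a timestamped prefix per run keeps prior snapshots addressable individually, which also eases the move to an archive-based scheme later.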
Or mount an existing EFS filesystem as the backup target directory (see instructions above) and create an ad hoc backup snapshot of the cluster's shared data directory mounted under '/media/atl/...', for example via the rsync command:
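A minimal sketch of this option — the backup mount point is an assumption, and the shared home path is truncated in the text above, so substitute your cluster's actual paths:

```shell
#!/usr/bin/env bash
set -u

# Assumptions: SHARED_HOME is the cluster's shared home (path truncated
# above), BACKUP_DIR is the mounted EFS backup filesystem.
SHARED_HOME="${SHARED_HOME:-/media/atl/...}"
BACKUP_DIR="${BACKUP_DIR:-/media/backup}"
SNAPSHOT_DIR="${BACKUP_DIR}/shared-home-$(date -u +%Y-%m-%dT%H-%M-%SZ)"

if command -v rsync >/dev/null 2>&1 && [ -d "${SHARED_HOME}" ]; then
  mkdir -p "${SNAPSHOT_DIR}"
  # -a preserves permissions/ownership/timestamps, -H hard links,
  # -A ACLs, -X extended attributes; the trailing slash copies contents.
  rsync -aHAX "${SHARED_HOME}/" "${SNAPSHOT_DIR}/"
else
  echo "Skipping rsync: rsync or ${SHARED_HOME} not available"
fi
```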
(Optional) Delete the CloudFormation stack if you want to maximize cost savings and do not mind the slightly longer MTTR for creating a new stack rather than activating an existing one.
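Deleting and waiting on the stack could be sketched as follows — the stack name is a placeholder, and the destructive call is gated behind an explicit confirmation variable:

```shell
#!/usr/bin/env bash
set -u

# Assumption: substitute your actual stack name for STACK_NAME.
STACK_NAME="${STACK_NAME:-my-dc-stack}"

if [ "${CONFIRM_DELETE:-no}" = "yes" ] && command -v aws >/dev/null 2>&1; then
  aws cloudformation delete-stack --stack-name "${STACK_NAME}"
  # Block until the deletion has finished (or failed).
  aws cloudformation wait stack-delete-complete --stack-name "${STACK_NAME}"
else
  echo "Dry run: set CONFIRM_DELETE=yes to delete stack ${STACK_NAME}"
fi
```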
Only proceed with this once you have verified that you can restore the backup to a new cluster as outlined below!
Verify that the backup steps work as desired
Craft applicable Bash script to automate backup
This seemingly cannot work with the official templates at present, because the CloudFormation stack has to be started with 1 instance before there is a chance to restore the filesystem, which in turn messes up the restored RDS snapshot. However, it seemingly works fine when using the custom 'cold standby' mode of our modified templates to ensure that the cluster is started with 0 nodes, see instructions below.
Restore an Atlassian Data Center cluster backup to a new CloudFormation stack via a bastion host as follows:
Provision an Atlassian Data Center stack based on the 'cost-effective' Utoolity AWS Quick Start forks as usual, as outlined in TBD, and ensure that the following parameters are set correctly:
'Database snapshot ID to restore' => provide the name of the desired RDS snapshot from the backup above
'Stack standby mode' => select 'cold' – this addresses the aforementioned catch-22 by creating the new cluster with 0 nodes initially, providing time for the manual filesystem restore
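Provisioning with these parameters could also be scripted — note that the stack name, template URL, and parameter keys below are illustrative placeholders, not the template's actual keys; look those up in the template you use:

```shell
#!/usr/bin/env bash
set -u

# Assumptions: all names and parameter keys below are placeholders.
STACK_NAME="${STACK_NAME:-my-restored-dc-stack}"
TEMPLATE_URL="${TEMPLATE_URL:-https://example.com/quickstart-template.yaml}"
DB_SNAPSHOT="${DB_SNAPSHOT:-my-rds-snapshot}"

if [ "${CONFIRM_CREATE:-no}" = "yes" ] && command -v aws >/dev/null 2>&1; then
  aws cloudformation create-stack \
    --stack-name "${STACK_NAME}" \
    --template-url "${TEMPLATE_URL}" \
    --capabilities CAPABILITY_IAM \
    --parameters \
      ParameterKey=DBSnapshotId,ParameterValue="${DB_SNAPSHOT}" \
      ParameterKey=StandbyMode,ParameterValue=cold
else
  echo "Dry run: set CONFIRM_CREATE=yes to create stack ${STACK_NAME}"
fi
```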
Or mount an existing EFS filesystem as the backup source directory (see instructions above) and restore the backup snapshot into the new cluster's shared data directory mounted under '/media/atl/...', for example via the rsync command:
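A minimal sketch of the restore direction — the backup snapshot path is a placeholder, and the shared home path is truncated in the text above, so substitute the actual paths:

```shell
#!/usr/bin/env bash
set -u

# Assumptions: BACKUP_DIR is the mounted EFS backup snapshot to restore
# from, SHARED_HOME the new cluster's shared home (path truncated above).
BACKUP_DIR="${BACKUP_DIR:-/media/backup/shared-home-snapshot}"
SHARED_HOME="${SHARED_HOME:-/media/atl/...}"

if command -v rsync >/dev/null 2>&1 && [ -d "${BACKUP_DIR}" ]; then
  # Same flags as the backup direction, with source and target swapped.
  rsync -aHAX "${BACKUP_DIR}/" "${SHARED_HOME}/"
else
  echo "Skipping restore: rsync or ${BACKUP_DIR} not available"
fi
```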
Given that a pristine installation requires starting with a single node, it might be appropriate to restore a new cluster with only one node at first as well, though it seems to work fine with multiple nodes too (after all, the restored cluster had already gone through all the configuration steps before being backed up).
(Optional) Scale out the cluster to the desired number of nodes
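Scaling out could be sketched as follows — the Auto Scaling group name is an assumption (find the group created by the stack, e.g. in the CloudFormation resources view), and the call is gated behind a confirmation variable:

```shell
#!/usr/bin/env bash
set -u

# Assumptions: ASG_NAME and DESIRED are placeholders for your environment.
ASG_NAME="${ASG_NAME:-my-dc-stack-ClusterNodeGroup}"
DESIRED="${DESIRED:-2}"

if [ "${CONFIRM_SCALE:-no}" = "yes" ] && command -v aws >/dev/null 2>&1; then
  aws autoscaling set-desired-capacity \
    --auto-scaling-group-name "${ASG_NAME}" \
    --desired-capacity "${DESIRED}"
else
  echo "Dry run: set CONFIRM_SCALE=yes to scale ${ASG_NAME} to ${DESIRED}"
fi
```

Alternatively, adjust the stack's node count parameters via a stack update so that subsequent stack operations do not revert the change.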
Perform applicable post-restore operations, for example a reindex
Reindexing implies a notable time sink for large data sets, which is why Jira, for example, conceptually supports index backup/restore, and the Bitbucket template already supports Elasticsearch backup/restore out of the box – accordingly, index backup/restore should probably be added to the Jira and Confluence templates as well, if possible.