rear/SOURCES/rear-bz1743303-rubrik.patch

2020-11-03 12:05:47 +00:00
diff --git a/.gitignore b/.gitignore
index 5e3dc940..a644c865 100644
--- a/.gitignore
+++ b/.gitignore
@@ -4,3 +4,6 @@
build-stamp
/var
/etc/rear/site.conf
+.DS_Store
+.vscode
+servers
diff --git a/doc/user-guide/16-Rubrik-CDM.adoc b/doc/user-guide/16-Rubrik-CDM.adoc
new file mode 100644
index 00000000..41f37d20
--- /dev/null
+++ b/doc/user-guide/16-Rubrik-CDM.adoc
@@ -0,0 +1,106 @@
+= Documentation for the Rubrik Cloud Data Management (CDM) Backup and Restore Method
+
+== Summary
+
+The Rubrik CDM backup and restore method for ReaR allows Rubrik CDM to perform bare metal recovery of Linux systems that are supported by ReaR. It does this by including the installed Rubrik CDM RBS agent files in the ISO that is created by `rear mkrescue` via a pre-script in the fileset. The ISO is left in place under `/var/lib/rear/output/rear-<hostname>.iso` by default. During the fileset backup Rubrik will back up the main operating system files as well as the ReaR ISO file.
+
+Bare metal recovery is performed by first restoring the ReaR ISO file from Rubrik CDM to an alternate host. Next, the host being restored is booted from the ISO via CD/DVD, USB, vSphere Datastore ISO, etc. Once booted, running `rear recover` will prepare the host for restore and start the Rubrik CDM RBS agent. If the host has a new IP address, the new RBS agent will need to be registered with the Rubrik cluster. Registration is not necessary if the recovery host is reusing the same IP address as the original. All of the files for the host are then recovered from Rubrik CDM to the recovery host's `/mnt/local` directory by the user. Once complete, the user exits ReaR and reboots the host.
+
+== Configuration
+
+1. Install and configure ReaR in accordance with:
+- Red Hat
+ * https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/deployment_guide/ch-relax-and-recover_rear
+- Ubuntu
+ * http://manpages.ubuntu.com/manpages/disco/en/man8/rear.8.html
+- SUSE
+ * https://en.opensuse.org/SDB:Disaster_Recovery
+ * https://documentation.suse.com/sle-ha/15-SP1/html/SLE-HA-all/cha-ha-rear.html
+- Generic
+ * https://github.com/rear/rear
+
+ NOTE: Ignore any instructions to configure external storage like NFS, CIFS/SMB or FTP. Also ignore any instructions to configure a specific backup method. This will be taken care of in the next steps.
+
+ NOTE: Ignore any instructions to schedule ReaR to run via the host-based scheduler (cron). Rubrik CDM will run ReaR via a pre-script in the fileset. If this is not preferred, ReaR can be scheduled on the host; however, the ISOs created may not be in sync with the backups.
+
+ NOTE: If installing a pre-release or development version for which there is no installer, copy the repo to the host being protected. Then run `make install` from the root directory of the repo.
+
+1. Install the Rubrik CDM RBS agent as directed by the Rubrik documentation.
+1. Edit `/etc/rear/local.conf` and enter:
+
+ OUTPUT=ISO
+ BACKUP=CDM
+
+1. Test `ReaR` by running `rear -v mkrescue`
+1. Configure a fileset backup of the host and add `/usr/sbin/rear mkrescue` as a pre-script.
+1. ISOs will be saved as `/var/lib/rear/output/*.iso`
+
+== Recovery
+
+1. Recover `/var/lib/rear/output/rear-<hostname>.iso` of the host to be restored from Rubrik CDM.
+1. Boot the recovery machine using the recovered ISO.
+
+ NOTE: The recovered system will use the same networking as the original machine. Verify that no IP conflicts will occur.
+
+ NOTE: If the same static IP address would be used and the original machine is still running, the address will need to be changed.
+
+1. Verify that the firewall is down on the recovery host.
+1. Run `rear recover`.
+1. Answer the inline questions until the `rear>` prompt appears.
+1. Run `ps -eaf` and verify that `backup_agent_main` and `bootstrap_agent_main` are running.
+1. Get the IP address of the system using `ip addr`.
+1. Register the new IP with the Rubrik appliance (if needed).
+1. Perform a redirected export of `/` to `/mnt/local`.
+1. Reboot.
+1. Recover other file systems as needed.
+
+ NOTE: The Rubrik RBS agent will now connect as the original machine. The host may need to be reinstalled and re-registered if the original machine is still running.
+
+== Known Issues
+
+* Recovery via IPv6 is not yet supported.
+* Automatic recovery from a replica CDM cluster is not supported.
+* CDM may take some time to recognize that the IP address has moved from one system to another. When restoring using the same IP, give CDM up to 10 minutes to recognize that the agent is running on another machine. This usually comes up during testing when the original machine is shut down but not being restored to.
+* Recovery from a replica CDM cluster is only supported with CDM v4.2.1 and higher.
+* Care must be taken with SUSE systems on DHCP. They tend to request the same IP as the original host. If this is not the desired behavior the system will have to be adjusted after booting from the ReaR ISO.
+* If multiple restores are performed using the same temporary IP, the temporary IP must first be deleted from Servers & Apps -> Linux and Unix Servers and re-added upon each reuse.
+* ReaR's `ldd` check of other binaries or libraries may result in libraries not being found. This can generally be fixed by adding the path to those libraries to the `LD_LIBRARY_PATH` variable in `/etc/rear/local.conf`. Do this by adding the following line in `/etc/rear/local.conf`:
++
+ export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:<path>"
++
+To make CentOS v7.7 work the following line was needed:
++
+ export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/lib64/bind9-export"
++
+To make CentOS v8.0 work the following line was needed:
++
+ export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/lib64/bind9-export:/usr/lib64/eog:/usr/lib64/python3.6/site-packages:/usr/lib64/samba:/usr/lib64/firefox"
+
+== Troubleshooting
+
+* Verify that ReaR will recover your system without using the CDM backup and restore method. Most errors are due to configuration with ReaR itself and not Rubrik CDM. Use the default ReaR backup and restore method to test with.
+* Follow the OS specific configuration guides as mentioned at the beginning of this document.
+
+== Test Matrix
+
+.Test Matrix
+[%header,format=csv]
+|===
+Operating System,DHCP,Static IP,Virtual,Physical,LVM Root Disk,Plain Root Disk,EXT3,EXT4,XFS,BTRFS,Original Cluster,Replication Cluster
+CentOS 7.3,,Pass,Pass,,Pass,,,,Pass,,Pass,
+CentOS 7.6,Pass,,Pass,,Pass,,,,Pass,,Pass,
+CentOS 7.7,Pass,,Pass,Pass,Pass,,,,Pass,,Pass,
+CentOS 8.0,Pass,,Pass,,Pass,,,,Pass,,Pass,
+CentOS 5.11,,,,,,,,,,,,
+CentOS 6.10,,,,,,,,,,,,
+RHEL 7.6,Pass,,Pass,,Pass,,,,,,,
+RHEL 7.4,,,,,,,,,,,,
+RHEL 6.10,,,,,,,,,,,,
+SUSE 11 SP4,,,,,,,,,,,,
+SUSE 12 SP4,Pass (uses same IP as original),,Pass,,,,,,,Pass,Pass,
+Ubuntu 14.04 LTS,,,,,,,,,,,,
+Ubuntu 16.04 LTS,Pass,,,,Pass,,,Pass,,,Pass,
+Ubuntu 17.04 LTS,,,,,,,,,,,,
+|===
+
+* Empty cells indicate that no tests were run.
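Editor's note on the `LD_LIBRARY_PATH` workaround under Known Issues in the user guide above: when `LD_LIBRARY_PATH` is empty, `"$LD_LIBRARY_PATH:<path>"` expands to `:<path>`, and the dynamic linker treats an empty entry as the current working directory. The following is a minimal sketch of one way to avoid that, assuming a POSIX shell. The helper name `append_path` is hypothetical and is not part of ReaR or of this patch.

```shell
#!/bin/sh
# Hypothetical helper (not part of ReaR or this patch): append a directory
# to a colon-separated path list without creating an empty entry.
append_path() {
    # $1 = current list (may be empty), $2 = directory to append.
    # "${1:+$1:}" expands to "$1:" only when $1 is set and non-empty.
    printf '%s\n' "${1:+$1:}$2"
}

# With an empty list there is no leading colon:
append_path "" /usr/lib64/bind9-export
# With an existing list the new directory is appended after a colon:
append_path /usr/lib64/samba /usr/lib64/bind9-export
```

In `/etc/rear/local.conf` the same expansion could be written inline as `export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}<path>"`.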
diff --git a/usr/share/rear/conf/default.conf b/usr/share/rear/conf/default.conf
index 0f0d0675..56967132 100644
--- a/usr/share/rear/conf/default.conf
+++ b/usr/share/rear/conf/default.conf
@@ -1334,6 +1334,17 @@ LANG_RECOVER=C
# low-quality master encryption key. For details, see the cryptsetup(8) manual page.
LUKS_CRYPTSETUP_OPTIONS="--iter-time 2000 --use-random"
+##
+# BACKUP=CDM (Rubrik CDM; Cloud Data Management)
+##
+# ReaR support for Rubrik Cloud Data Management (CDM).
+# ReaR will copy the Rubrik RBS agent and required OS binaries to its ISO for inclusion on boot.
+# ReaR will start the Rubrik RBS agent when 'rear recover' is run.
+COPY_AS_IS_CDM=( /etc/rubrik /usr/bin/rubrik /var/log/rubrik /etc/pki /usr/lib64 )
+COPY_AS_IS_EXCLUDE_CDM=( /var/log/rubrik/* )
+PROGS_CDM=( /usr/bin/rubrik/backup_agent_main /usr/bin/rubrik/bootstrap_agent_main openssl uuidgen )
+
+
##
# BACKUP=FDRUPSTREAM stuff
##
diff --git a/usr/share/rear/prep/CDM/default/400_prep_cdm.sh b/usr/share/rear/prep/CDM/default/400_prep_cdm.sh
new file mode 100644
index 00000000..d3fd11b7
--- /dev/null
+++ b/usr/share/rear/prep/CDM/default/400_prep_cdm.sh
@@ -0,0 +1,7 @@
+#
+# prepare stuff for CDM
+#
+
+COPY_AS_IS=( "${COPY_AS_IS[@]}" "${COPY_AS_IS_CDM[@]}" )
+COPY_AS_IS_EXCLUDE=( "${COPY_AS_IS_EXCLUDE[@]}" "${COPY_AS_IS_EXCLUDE_CDM[@]}" )
+PROGS=( "${PROGS[@]}" "${PROGS_CDM[@]}" fmt )
diff --git a/usr/share/rear/prep/CDM/default/450_check_cdm_client.sh b/usr/share/rear/prep/CDM/default/450_check_cdm_client.sh
new file mode 100644
index 00000000..637fac5f
--- /dev/null
+++ b/usr/share/rear/prep/CDM/default/450_check_cdm_client.sh
@@ -0,0 +1,13 @@
+# 450_check_cdm_client.sh
+#
+# This script checks if a Rubrik CDM client is installed and running
+#
+
+Log "Backup method is Rubrik (CDM): check backup_agent_main"
+if [ ! -x /usr/bin/rubrik/backup_agent_main ]; then
+ Error "Please install Rubrik (CDM) RBS client software."
+fi
+
+ps ax | grep -v grep | grep backup_agent_main
+StopIfError "Rubrik (CDM) RBS backup_agent_main was not running on this client."
+
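Editor's note on the running-agent check in 450_check_cdm_client.sh above: the script filters `ps ax` with `grep -v grep` so that the `grep` process does not match itself. The following is a minimal sketch of an alternative that uses a bracketed first character for the same purpose, assuming a POSIX shell. The function name `process_listed` is hypothetical and is not a ReaR function.

```shell
#!/bin/sh
# Hypothetical helper (not part of ReaR or this patch): succeed if the
# process listing on stdin mentions the given process name.
process_listed() {
    name=$1
    first=${name%"${name#?}"}   # first character of the name
    rest=${name#?}              # remainder of the name
    # The pattern "[b]ackup_agent_main" matches "backup_agent_main" but not
    # its own text, so grep's command line in the listing is not a match.
    grep -q -- "[$first]$rest"
}

if ps ax | process_listed backup_agent_main; then
    echo "backup_agent_main is running"
else
    echo "backup_agent_main is not running"
fi
```

Taking the listing on stdin keeps the matching logic separate from `ps`, so it can be tried on saved output.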
diff --git a/usr/share/rear/restore/CDM/default/400_restore_with_cdm.sh b/usr/share/rear/restore/CDM/default/400_restore_with_cdm.sh
new file mode 100644
index 00000000..bc4811c4
--- /dev/null
+++ b/usr/share/rear/restore/CDM/default/400_restore_with_cdm.sh
@@ -0,0 +1,19 @@
+# 400_restore_with_cdm.sh
+#
+#
+
+LogPrint "Please start the restore process on the Rubrik (CDM) cluster."
+
+if is_true "$CDM_NEW_AGENT_UUID" ; then
+ LogPrint ""
+ LogPrint "Register the appropriate IP address from this list with Rubrik (CDM):"
+ LogPrint "$( ip addr | grep inet | cut -d / -f 1 | grep -v 127.0.0.1 | grep -v ::1 )"
+ LogPrint ""
+fi
+LogPrint "Make sure all required data is restored to $TARGET_FS_ROOT ."
+LogPrint ""
+LogPrint "Next type 'exit' to continue the recovery."
+LogPrint "Info: You can check the recovery process e.g. with the command 'df'."
+LogPrint ""
+
+rear_shell "Has the restore been completed and are you ready to continue the recovery?"
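Editor's note on the address list printed by 400_restore_with_cdm.sh above: the script builds it with a `grep`/`cut` pipeline over `ip addr`. The following is a minimal sketch of comparable filtering written as a stdin filter so it can be tried on sample text. Unlike the pipeline in the patch, it prints only the address field. The function name `list_candidate_ips` and the sample input are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of ReaR or this patch): read `ip addr` style
# output on stdin and print non-loopback addresses without the prefix length.
list_candidate_ips() {
    # Select lines whose first field is inet or inet6 and print the address
    # field, strip the "/prefix" part, then drop loopback addresses.
    awk '$1 == "inet" || $1 == "inet6" { print $2 }' \
        | cut -d / -f 1 \
        | grep -v -e '^127\.' -e '^::1$'
}

# Made-up sample input using documentation address ranges.
sample='1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
    inet 192.0.2.10/24 brd 192.0.2.255 scope global eth0
    inet6 fe80::1/64 scope link'

printf '%s\n' "$sample" | list_candidate_ips
```

On a live system the function would be fed from `ip addr` instead of the sample text.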
diff --git a/usr/share/rear/verify/CDM/default/410_use_replica_cdm_cluster_cert.sh b/usr/share/rear/verify/CDM/default/410_use_replica_cdm_cluster_cert.sh
new file mode 100644
index 00000000..518387e3
--- /dev/null
+++ b/usr/share/rear/verify/CDM/default/410_use_replica_cdm_cluster_cert.sh
@@ -0,0 +1,88 @@
+# 410_use_replica_cdm_cluster_cert.sh
+# If restoring from a replica Rubrik (CDM) cluster use its cert for RBS.
+
+LogPrint "If restoring from a replica Rubrik (CDM) cluster its cert will be downloaded and used for RBS"
+
+CDM_RBA_DIR=/etc/rubrik
+CDM_KEYS_DIR=${CDM_RBA_DIR}/keys
+
+local prompt="Is the data being restored from the original CDM Cluster?"
+local input_value=""
+local wilful_input=""
+while true ; do
+ # Find out if the restore is being done from the original CDM cluster or a Replica
+ # the default (i.e. the automated response after the timeout) should be 'no':
+ input_value="$( UserInput -I CDM_REPLICA_CLUSTER -p "$prompt" -D 'no' )" && wilful_input="yes" || wilful_input="no"
+ if is_false "$input_value" ; then
+ if is_true "$wilful_input" ; then
+ LogPrint "User confirmed the data is not being restored from the original CDM Cluster"
+ else
+ LogPrint "Assuming the data is not being restored from the original CDM Cluster"
+ fi
+ break
+ fi
+ if is_true "$input_value" ; then
+ LogPrint "User confirmed the data is being restored from the original CDM Cluster"
+ return 0
+ fi
+done
+
+LogPrint "Downloading cert from replica CDM cluster"
+# The name of the tar file that is being downloaded has changed in Rubrik CDM v5.1.
+# Before Rubrik CDM v5.1 it was rubrik-agent-sunos5.10.sparc.tar.gz
+# since Rubrik CDM v5.1 it is rubrik-agent-solaris.sparc.tar.gz
+# cf. https://github.com/rear/rear/issues/2441
+CDM_SUNOS_TAR=rubrik-agent-sunos5.10.sparc.tar.gz
+CDM_SOLARIS_TAR=rubrik-agent-solaris.sparc.tar.gz
+pushd $TMPDIR
+while true ; do
+ prompt="Enter one of the IP addresses for the replica CDM cluster (or 'no' to cancel)"
+ CDM_CLUSTER_IP="$( UserInput -I CDM_CLUSTER_IP -r -t 0 -p "$prompt" )"
+ test "$CDM_CLUSTER_IP" || continue
+ if is_false "$CDM_CLUSTER_IP" ; then
+ LogPrint "User canceled downloading cert from replica CDM cluster (data restore may fail now)"
+ popd
+ return 0
+ fi
+ # When curl fails for all files continue with an empty CDM_TAR_FILE to denote that nothing was downloaded:
+ for CDM_TAR_FILE in $CDM_SOLARIS_TAR $CDM_SUNOS_TAR '' ; do
+ test $CDM_TAR_FILE || continue
+ curl $v -fskLOJ https://${CDM_CLUSTER_IP}/connector/${CDM_TAR_FILE} && break
+ done
+ if ! test -s "$CDM_TAR_FILE" ; then
+ LogPrintError "Could not download Rubrik agent from https://${CDM_CLUSTER_IP}/connector/${CDM_SOLARIS_TAR} or https://${CDM_CLUSTER_IP}/connector/${CDM_SUNOS_TAR}"
+ while true ; do
+ prompt="Enter URL to download Rubrik agent tar archive (or 'no' to cancel)"
+ CDM_AGENT_URL="$( UserInput -I CDM_AGENT_URL -r -t 0 -p "$prompt" )"
+ test "$CDM_AGENT_URL" || continue
+ if is_false "$CDM_AGENT_URL" ; then
+ LogPrint "User canceled downloading Rubrik agent (data restore may fail now)"
+ popd
+ return 0
+ fi
+ curl $v -fskLOJ $CDM_AGENT_URL && break
+ LogPrintError "Could not download Rubrik agent from $CDM_AGENT_URL"
+ done
+ CDM_TAR_FILE=$( basename "$CDM_AGENT_URL" )
+ fi
+ if ! tar $v -xzf $CDM_TAR_FILE ; then
+ LogPrintError "Could not extract Rubrik agent (failed to 'tar -xzf $CDM_TAR_FILE')"
+ continue
+ fi
+ CDM_CERT_FILE=$(find ./ -name "rubrik.crt")
+ mv $v ${CDM_KEYS_DIR}/rubrik.crt ${CDM_KEYS_DIR}/rubrik.crt.orig
+ if ! cp $v $CDM_CERT_FILE $CDM_KEYS_DIR ; then
+ LogPrintError "Could not copy replica CDM cluster certificate"
+ continue
+ fi
+ chmod $v 600 ${CDM_KEYS_DIR}/rubrik.crt
+ mv $v ${CDM_KEYS_DIR}/agent.crt ${CDM_KEYS_DIR}/agent.crt.orig
+ mv $v ${CDM_KEYS_DIR}/agent.pem ${CDM_KEYS_DIR}/agent.pem.orig
+ # TODO: Actually do something if /etc/rubrik/rba-keygen.sh failed.
+ # Is /etc/rubrik/rba-keygen.sh perhaps only optional?
+ # cf. https://github.com/rear/rear/pull/2445#discussion_r448217873
+ /etc/rubrik/rba-keygen.sh || LogPrintError "/etc/rubrik/rba-keygen.sh failed (data restore may also fail)"
+ break
+done
+popd
+LogPrint "Replica Rubrik (CDM) cluster certificate installed"
diff --git a/usr/share/rear/verify/CDM/default/430_gen_rbs_uuid_for_cdm.sh b/usr/share/rear/verify/CDM/default/430_gen_rbs_uuid_for_cdm.sh
new file mode 100644
index 00000000..5e99b79c
--- /dev/null
+++ b/usr/share/rear/verify/CDM/default/430_gen_rbs_uuid_for_cdm.sh
@@ -0,0 +1,29 @@
+# 430_gen_rbs_uuid_for_cdm.sh
+# Reset the UUID used by RBS if the IP address has changed
+
+CDM_RBA_DIR=/etc/rubrik
+CDM_AGENT_UUID=${CDM_RBA_DIR}/conf/uuid
+
+# When USER_INPUT_CDM_SAME_AGENT_UUID has any 'true' value be liberal in what you accept and assume 'y' was actually meant:
+LogPrint ""
+LogPrint "Found the following IP addresses on this system:"
+LogPrint "$( ip addr | grep inet | cut -d / -f 1 | grep -v 127.0.0.1 | grep -v ::1 )"
+LogPrint ""
+is_true "$USER_INPUT_CDM_SAME_AGENT_UUID" && USER_INPUT_CDM_SAME_AGENT_UUID="y"
+while true ; do
+ # Find out if the IP address has changed from the original. If so, generate a new UUID.
+ # The default (i.e. the automated response after the timeout) is 'y':
+ answer="$( UserInput -I CDM_SAME_AGENT_UUID -p "Does this client have the same IP address as the original? (y/n)" -D 'y' -t 300 )"
+ is_true "$answer" && return 0
+ if is_false "$answer" ; then
+ break
+ fi
+ UserOutput "Please answer 'y' or 'n'"
+done
+
+mv $v ${CDM_AGENT_UUID} ${CDM_AGENT_UUID}.old
+/usr/bin/uuidgen | tee -a ${CDM_AGENT_UUID} >&2
+StopIfError "Unable to generate new UUID"
+
+CDM_NEW_AGENT_UUID="true"
+LogPrint "Rubrik (CDM) RBS agent now has new UUID."
diff --git a/usr/share/rear/verify/CDM/default/450_start_cdm_rbs.sh b/usr/share/rear/verify/CDM/default/450_start_cdm_rbs.sh
new file mode 100644
index 00000000..571da1da
--- /dev/null
+++ b/usr/share/rear/verify/CDM/default/450_start_cdm_rbs.sh
@@ -0,0 +1,17 @@
+# 450_start_cdm_rbs.sh
+# Start the Rubrik (CDM) RBS Agent
+
+RBA_DIR=/etc/rubrik
+RBA_BIN_DIR=/usr/bin/rubrik
+
+BOOTSTRAP_DAEMON_OPTS="$( < ${RBA_DIR}/conf/bootstrap_flags.conf )"
+AGENT_DAEMON_OPTS="$( < ${RBA_DIR}/conf/agent_flags.conf )"
+BOOTSTRAP_DAEMON=$RBA_BIN_DIR/bootstrap_agent_main
+AGENT_DAEMON=$RBA_BIN_DIR/backup_agent_main
+
+$BOOTSTRAP_DAEMON $BOOTSTRAP_DAEMON_OPTS
+StopIfError "Unable to start RBS Bootstrap service"
+$AGENT_DAEMON $AGENT_DAEMON_OPTS
+StopIfError "Unable to start RBS Agent service"
+
+LogPrint "Rubrik (CDM) RBS agent started."