Upgrade OS on Ceph Server
At work, our ceph cluster is managed by the cephadm utility, which means that all ceph operations are launched within containers managed by podman. Our original batch of storage servers was running CentOS 7, which as of late 2023 is only a few months from reaching its official end-of-life date. The hardware is still good, so I just wanted to update operating systems while keeping ceph data volumes intact. I could find no official documentation for upgrading the OS out from under cephadm, so I forged my own way.
Everyone’s clusters and infrastructure are a bit different, so I’m only going to outline the process, avoiding any attempt to guess all the obstacles that might be present in a different environment.
The servers to be upgraded each had:
- CentOS 7
- 2 x 600GB SSDs, configured in RAID1 for the operating system
- 12 x 10TB HDDs, reserved for ceph data volumes
The goal was to leave the HDDs untouched, while clearing the SSDs and installing RHEL 9 instead.
Identifying the cephadm files
The trickiest part, for me, was identifying all the files cephadm installed and used to launch and maintain the daemons on each host. In our cluster, these servers were only hosting object-storage daemons (OSDs), but I don’t think the process would have differed if they had been running monitors, managers, etc.
Our cluster’s FSID begins with 834e1fa5-, so I used that with a shell wildcard as a shorthand to specify the targeted files.
Here is the list of files I decided needed saving:
- /etc/ceph/ (optional)
- /etc/logrotate.d/ceph-834e1fa5-*
- /etc/systemd/system/ceph-834e1fa5-*
- /etc/systemd/system/ceph.target
- /etc/systemd/system/ceph.target.wants/ceph-834e1fa5-*
- /etc/systemd/system/multi-user.target.wants/ceph-834e1fa5-*.target
- /etc/systemd/system/multi-user.target.wants/ceph.target
- /var/lib/ceph/834e1fa5-*
- /var/log/ceph/ (optional)
Note that our systems use the systemd multi-user.target as the default boot target. You’ll need to adjust your backups if you use a different one. Also note that some environments might not need /etc/ceph (not used by the containers) or /var/log/ceph (just log files), but I grabbed them just in case.
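If you want to cross-check that list against what cephadm itself thinks it is running on the host, cephadm can enumerate the daemons it manages there; the daemon names and FSID in its output should match the directories you are about to archive. A quick sanity check might look like:
cephadm ls
ls -d /var/lib/ceph/834e1fa5-*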
The procedure, step by step
The overall procedure is moderately involved. It worked for us, but I have no great insights about rolling back the process if something were to fail spectacularly.
- Archive files listed above.
- Save list of LVM physical volumes, volume groups, and logical volumes for reference.
- Take note of all information (shell, home directory, UID, GID) for ceph user.
I scripted all of the above steps:
cd /root
(getent group ceph; getent passwd ceph) > /root/ceph-user.txt
(pvs; vgs; lvs -o lv_tags) > /root/lvm-config.txt
pushd /
tar -cWv -f /root/ceph-$(hostname -s).backup.tar \
./etc/ceph/*.keyring \
./etc/logrotate.d/ceph-834e1fa5-* \
./etc/systemd/system/ceph-834e1fa5-* \
./etc/systemd/system/ceph.target \
./etc/systemd/system/ceph.target.wants/ceph-834e1fa5-* \
./etc/systemd/system/multi-user.target.wants/ceph-834e1fa5-*.target \
./etc/systemd/system/multi-user.target.wants/ceph.target \
./root/ceph-user.txt \
./root/lvm-config.txt \
./var/lib/ceph/834e1fa5-* \
./var/log/ceph
popd
- Save everything from the steps above to remote storage so it can be retrieved after the new OS is installed.
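How the archive gets off the host is up to you; as a minimal sketch, assuming a reachable machine named backup-host with space under /srv/backups (both names are hypothetical). The ceph-user.txt and lvm-config.txt files are already inside the tar, so the one archive is enough.
scp /root/ceph-$(hostname -s).backup.tar backup-host:/srv/backups/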
- On your ceph administrative host, quiesce certain cluster disk ops as for normal node maintenance.
ceph osd set noout
ceph osd set noscrub
ceph osd set nodeep-scrub
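It’s worth confirming the flags actually took before rebuilding anything; something like the following should show noout, noscrub, and nodeep-scrub as set:
ceph osd dump | grep flags
ceph health detail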
- Install new operating system, taking care to leave all ceph volumes intact.
We use kickstart for RHEL installations, so we specified the disks that were to be used by the operating system, e.g.,
clearpart --drives=disk/by-id/scsi-35000cca13d5ac884,disk/by-id/scsi-35000cca13d5ac240 --all
zerombr
bootloader --location=mbr
ignoredisk --only-use=disk/by-id/scsi-35000cca13d5ac884,disk/by-id/scsi-35000cca13d5ac240
# raid partitions
part raid.20 --ondisk=/dev/disk/by-id/scsi-35000cca13d5ac884 --size=2048 --asprimary
part raid.21 --ondisk=/dev/disk/by-id/scsi-35000cca13d5ac240 --size=2048 --asprimary
part raid.30 --ondisk=/dev/disk/by-id/scsi-35000cca13d5ac884 --size=8192
part raid.31 --ondisk=/dev/disk/by-id/scsi-35000cca13d5ac240 --size=8192
part raid.40 --ondisk=/dev/disk/by-id/scsi-35000cca13d5ac884 --size=1 --grow
part raid.41 --ondisk=/dev/disk/by-id/scsi-35000cca13d5ac240 --size=1 --grow
# raid groups
raid /boot --fstype="xfs" --level=RAID1 --device=boot raid.20 raid.21
raid swap --level=RAID1 --device=swap raid.30 raid.31
raid / --fstype="xfs" --level=RAID1 --device=root raid.40 raid.41
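The by-id paths above are specific to our hardware. One way to work out the equivalents for the OS SSDs on another server, before writing its kickstart, is to list the disks and their stable identifiers, e.g.:
lsblk -d -o NAME,SIZE,MODEL,SERIAL
ls -l /dev/disk/by-id/ | grep -v part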
- After initial reboot, update to current packages if necessary.
- Install and enable EPEL repository (for Red Hat systems).
- Ensure ceph user and group exist as before.
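If the ceph packages have already created a ceph user, just confirm the UID and GID match what the old system used so the restored files line up; otherwise create them by hand. A minimal sketch, using the placeholder value 167 in place of whatever your saved ceph-user.txt actually records:
# use the UID/GID recorded in /root/ceph-user.txt, not these placeholders
groupadd -r -g 167 ceph
useradd -r -u 167 -g ceph -s /sbin/nologin -d /var/lib/ceph -c "Ceph daemons" ceph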
- Install the correct version of the cephadm utility.
- Execute cephadm prepare-host to make sure all software necessary for cephadm operations is installed.
- Execute cephadm shell -- ceph --version to ensure the ceph container image is installed on the host.
- Add the ceph ssh key to a new /root/.ssh/authorized_keys file.
- Verify the system sees the LVM volumes from before the upgrade.
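For the LVM check, one approach (assuming lvm-config.txt has been pulled back onto the host, and with lvm-config-after.txt as an example name for the fresh listing) is to capture the same output again and compare it against the pre-upgrade snapshot:
(pvs; vgs; lvs -o lv_tags) > /root/lvm-config-after.txt
diff /root/lvm-config.txt /root/lvm-config-after.txt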
- Restore the archived ceph files into /etc and /var.
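A minimal sketch of the restore, assuming the archive is back at /root: the tar is extracted from / so the archive’s relative paths land where they came from, and a daemon-reload makes systemd pick up the restored unit files and wants symlinks.
cd /
tar -xpv -f /root/ceph-$(hostname -s).backup.tar
systemctl daemon-reload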
- Reboot the host; the various ceph processes should start automatically, though it may be a minute or two before they’re up and running.
- On your ceph administrative host, verify that OSDs, mons, etc. on the upgraded host are returning to the cluster. I use ceph -s and ceph osd tree to check on things.
- Restart disk ops to let any dust settle. For us it was just a few disk scrubs.
ceph osd unset noout
ceph osd unset noscrub
ceph osd unset nodeep-scrub
- Repeat process on next host.