Decide Ceph failure domain

Currently, Ceph storage pools have their failure domain set to "host". This means that the replicas of each block are written to disks in different machines, but those machines can be in the same building or even the same rack.

The problem is that if we lose a rack or a whole building at once, any data whose replicas all sit in that rack or building will become unavailable.

We can set the failure domain to "rack" or "zone" (a zone being a building). However, this will limit how much storage we can use, since most of our storage is in RCDC (and most of that is in the hsrn-ceph1 box: 1225.3 TB).

zone	hdd (TB)	ssd (TB)
rcdc	1635.8	9.7
370j	101.9	34.9
7e12th	67.3	0
wwh	52.8	0
12wvpl	17.5	9.7
2mt	17.5	9.7
60fifthave	17.5	9.7

Replicating 3 times across zones would give us only 130 TB of usable storage. Erasure coding would give a little more (2 + 2 -> 146 TB), but still far less than the total capacity.
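To see why the zone failure domain costs so much capacity, here is a rough sketch in Python. It is not how the figures above were obtained. It computes an idealized upper bound, assuming perfectly balanced placement, no full-ratio headroom, and the HDD numbers from the table taken as raw TB. The function name `max_per_shard` is made up for this illustration. The idea: each zone can hold at most one shard (replica or erasure-coded chunk) of any object, so a zone larger than its fair share is capped and the remaining shards must fit in the other zones.

```python
def max_per_shard(capacities, shards):
    """Idealized largest volume (TB) that each of `shards` shards can reach
    when every shard of an object must land in a different zone."""
    remaining = sorted(capacities, reverse=True)
    k = shards
    while remaining and k > 0:
        fair_share = sum(remaining) / k
        if remaining[0] <= fair_share:
            # No zone is bigger than its share: space is split evenly.
            return fair_share
        # The biggest zone can only hold one shard of everything, so it is
        # capped; the other k - 1 shards must fit in the remaining zones.
        remaining.pop(0)
        k -= 1
    return 0.0  # fewer usable zones than shards


# HDD capacity per zone, copied from the table above (assumed raw TB).
hdd = {
    "rcdc": 1635.8, "370j": 101.9, "7e12th": 67.3, "wwh": 52.8,
    "12wvpl": 17.5, "2mt": 17.5, "60fifthave": 17.5,
}

# 3 replicas: usable data equals the size of one replica set.
replica3 = max_per_shard(hdd.values(), 3)
# Erasure coding 2 + 2: 4 chunks per object, 2 of which carry data.
ec_2_2 = 2 * max_per_shard(hdd.values(), 4)

print(f"3x replication across zones: {replica3:.1f} TB (upper bound)")
print(f"EC 2+2 across zones:         {ec_2_2:.1f} TB (upper bound)")
```

Under these assumptions the bounds come out at about 137 TB and 173 TB. The estimates quoted above are lower, so they evidently rest on different (more conservative) assumptions. Either way the conclusion is the same: almost all of RCDC's capacity goes unused because the other zones are so small.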

Other options would be to split the RCDC zone into several zones, or to keep the current "host" failure domain.
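As a rough illustration of what splitting RCDC could buy, the Python sketch below compares the current zones with one hypothetical split. The split itself is an assumption made only for this example: hsrn-ceph1 becomes its own zone, its 1225.3 TB is assumed to be all HDD, and the rest of RCDC (1635.8 - 1225.3 = 410.5 TB) stays as a second zone. The helper `usable_tb` is likewise hypothetical. It returns an idealized upper bound (balanced placement, no full-ratio headroom), found by bisection on the condition that a zone can hold at most one replica of each object.

```python
def usable_tb(zone_caps, replicas):
    """Idealized usable TB: the largest U such that `replicas` copies of U
    fit when each zone holds at most one copy, i.e. at most min(cap, U)."""
    caps = list(zone_caps)
    lo, hi = 0.0, sum(caps) / replicas
    for _ in range(100):  # bisection; far more iterations than needed
        mid = (lo + hi) / 2
        if sum(min(c, mid) for c in caps) >= replicas * mid:
            lo = mid  # feasible, try more data
        else:
            hi = mid  # infeasible, try less
    return lo


# HDD capacity per zone in TB, from the table in this issue.
small_zones = [101.9, 67.3, 52.8, 17.5, 17.5, 17.5]
current = [1635.8] + small_zones

# HYPOTHETICAL split of RCDC: hsrn-ceph1 alone (assumed all HDD) + the rest.
split = [1225.3, 1635.8 - 1225.3] + small_zones

current_3x = usable_tb(current, 3)
split_3x = usable_tb(split, 3)

print(f"3x replication, current zones: {current_3x:.1f} TB (upper bound)")
print(f"3x replication, RCDC split:    {split_3x:.1f} TB (upper bound)")
```

Under these assumptions the bound roughly doubles, from about 137 TB to about 274 TB. The cost is that two of the three replicas would usually sit in the same building, so this protects against losing part of RCDC, not all of it.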

Edited May 31, 2023 by Remi Rampin