2. Why move to Bluestore?
● Supportability
● Lower latency
● Higher throughput
Read more @ https://ceph.com/community/new-luminous-bluestore/
8. Migration process for each Storage node
Drain
Drain data from all OSDs on the desired storage node: find the numerical range of OSDs (684 to 719) and change their osd crush weight to 0.
Convert
Convert the OSDs on the desired storage node from Filestore to Bluestore (more detail in the next few slides).
Fill
Refill the OSDs on the desired storage node: using the same range of OSDs from the Drain step, change their osd crush weight back to the appropriate disk size.
9. Draining
for i in $(seq 648 683); do ceph osd crush reweight osd.$i 0; done
● for loop to drain a server's worth of OSDs
● ~24 hours per server
● 1-2 servers draining at a time
● Multi-rack draining
● Wait for ‘ceph health ok’
● Tunables (bumped at runtime, as sketched below)
osd recovery max active 3 -> 4
osd max backfills 1 -> 16
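A minimal sketch of bumping those tunables with injectargs and then waiting for the cluster to settle; the exact workflow is an assumption here, only the values come from the slide.
# bump recovery tunables on all OSDs at runtime (injectargs changes are not persisted across restarts)
ceph tell osd.\* injectargs '--osd-recovery-max-active 4 --osd-max-backfills 16'
# crude wait for 'ceph health ok' before starting on the next server
while ! ceph health | grep -q HEALTH_OK; do sleep 60; done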
10. Draining
144TB server case study: the majority drained within 3 hours, with a long tail of 28 hours to complete.
12. Converting to Bluestore
Migrate bluestore script @ https://github.com/CancerCollaboratory/infrastructure
1. Stop the OSD process (systemctl stop ceph-osd@501.service)
2. Unmount the OSD (umount /dev/sdr1)
3. Zap the disk (ceph-disk zap /dev/sdr)
4. Mark the OSD as destroyed (ceph osd destroy 501 --yes-i-really-mean-it)
5. Prepare the disk as Bluestore (ceph-disk prepare --bluestore /dev/sdr --osd-id 501)
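Strung together for a single OSD, the five steps look roughly like the sketch below. This is only an illustrative outline using the id and device from the slide; the migrate-bluestore script in the repository linked above is the authoritative version.
OSD_ID=501                                                # example OSD id from the slide
DEV=/dev/sdr                                              # example device from the slide
systemctl stop ceph-osd@${OSD_ID}.service                 # 1. stop the OSD daemon
umount ${DEV}1                                            # 2. unmount the Filestore data partition
ceph-disk zap ${DEV}                                      # 3. wipe the partition table and contents
ceph osd destroy ${OSD_ID} --yes-i-really-mean-it         # 4. mark the OSD destroyed, keeping its id
ceph-disk prepare --bluestore ${DEV} --osd-id ${OSD_ID}   # 5. re-create it as a Bluestore OSD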
13. Filling
for i in $(seq 648 683); do ceph osd crush reweight osd.$i 3.640; done
● for loop to fill a server's worth of OSDs (weight derivation sketched below)
● ~24 hours per server
● 1-2 servers filling at a time
● Multi-rack filling
● Wait for ‘ceph health ok’
● Monitoring caveat (see slide 16)
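The crush weight is conventionally the disk's capacity in TiB; 3.640 corresponds to a 4 TB spindle, which is an assumption about the hardware here. A minimal sketch of deriving the weight from the block device instead of hard-coding it:
DEV=/dev/sdr                                              # example device; an assumption
SIZE_BYTES=$(blockdev --getsize64 ${DEV})                 # raw disk size in bytes
WEIGHT=$(echo "scale=3; ${SIZE_BYTES} / (1024^4)" | bc)   # convert bytes to TiB
for i in $(seq 648 683); do ceph osd crush reweight osd.$i ${WEIGHT}; done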
16. Filling
Monitoring caveat
Zabbix graphs were built from zabbix-agent XFS disk usage, which goes away once the OSDs are Bluestore (the data lives on the raw device, not on an XFS filesystem)
Replaced with Grafana w/ Graphite and ceph-mgr
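With no mounted filesystem for zabbix-agent to report on, per-OSD usage has to come from Ceph itself; for example (a suggestion, not necessarily the exact queries used here):
ceph osd df tree    # per-OSD and per-host capacity and usage straight from the cluster
ceph df             # cluster- and pool-level usage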
18. How long did it take?
Start: end of July
Finish: early September
+480TB of data uploaded during this time by researchers
+1PB of capacity added during migration (new nodes)
188TB of data served from the object store
20. Issues
● Increased rate of drive failures
○ 4 failures within a week at the end of the migration
● Ceph monitor store (store.db) growing to ~15GB (one mitigation sketched below)
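A common mitigation for monitor store growth during long rebalances, assumed here rather than taken from the talk, is to compact each monitor's store once the cluster is back to HEALTH_OK and old osdmaps can be trimmed:
ceph tell mon.node01 compact    # 'node01' is a placeholder monitor name; repeat for each mon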
21. Funding for the Ontario Institute for Cancer Research
is provided by the Government of Ontario