Petabyte Scale Out Rocks
CEPH as replacement for Openstack's Swift & Co.
Udo Seidel
22JAN2014
Agenda
● Background
● Architecture of CEPH
● CEPH Storage
● Openstack
● Dream team: CEPH and Openstack
● Summary
22JAN2014
Background
22JAN2014
CEPH – what?
● So-called parallel distributed cluster file system
● Started as part of PhD studies at UCSC
● Public announcement in 2006 at 7th OSDI
● File system shipped with Linux kernel since 2.6.34
● Name derived from pet octopus - Cephalopods
22JAN2014
Shared file systems – short intro
● Multiple servers access the same data
● Different approaches
● Network based, e.g. NFS, CIFS
● Clustered
– Shared disk, e.g. CXFS, CFS, GFS(2), OCFS2
– Distributed parallel, e.g. Lustre, GlusterFS and CEPHFS
22JAN2014
Architecture of CEPH
22JAN2014
CEPH and storage
● Distributed file system => distributed storage
● Does not use traditional disks or RAID arrays
● Does use so-called OSD’s
– Object based Storage Devices
– Intelligent disks
22JAN2014
Object Based Storage I
● Objects of quite general nature
● Files
● Partitions
● ID for each storage object
● Separation of metadata operations from file data storage
● HA not covered at all
● Object based Storage Devices
22JAN2014
Object Based Storage II
● OSD software implementation
● Usually an additional layer between computer and storage
● Presents an object-based file system to the computer
● Uses a “normal” file system to store the data on the storage
● Delivered as part of CEPH
● File systems: LUSTRE, EXOFS
22JAN2014
CEPH – the full architecture I
● 4 components
● Object based Storage Devices
– Any computer
– Form a cluster (redundancy and load balancing)
● Meta Data Servers
– Any computer
– Form a cluster (redundancy and load balancing)
● Cluster Monitors
– Any computer
● Clients ;-)
22JAN2014
CEPH – the full architecture II
22JAN2014
CEPH and Storage
22JAN2014
CEPH and OSD
● User land implementation
● Any computer can act as OSD
● Uses BTRFS as native file system
● Since 2009
● Before that: the self-developed EBOFS
● Provides functions of OSD-2 standard
– Copy-on-write
– snapshots
● No redundancy on disk or even computer level
22JAN2014
CEPH and OSD – file systems
● BTRFS preferred
● Non-default configuration for mkfs
● XFS and EXT4 possible
● XATTR (size) is key -> EXT4 less recommended
● XFS recommended for Production
22JAN2014
OSD failure approach
● Any OSD expected to fail
● New OSD dynamically added/integrated
● Data distributed and replicated
● Redistribution of data after a change in the OSD landscape
22JAN2014
Data distribution
● File striped
● File pieces mapped to object IDs
● Assignment of a so-called placement group to each object ID
● Via hash function
● Placement group (PG): logical container of storage objects
● Calculation of list of OSD’s out of PG
● CRUSH algorithm
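The mapping chain above can be illustrated with a small, purely illustrative Python sketch; this is not Ceph's actual code, and the stripe size, PG count and hash function are made-up placeholders. A file is striped into objects, and each object ID is hashed onto a placement group; CRUSH then turns the PG into an OSD list (next slide).

```python
# Illustrative sketch of file -> objects -> placement group (not Ceph's real code).
import hashlib

PG_NUM = 64                      # hypothetical number of PGs in the pool
STRIPE = 4 * 1024 * 1024         # hypothetical stripe size: 4 MiB

def object_ids(file_name, file_size, stripe=STRIPE):
    """Stripe a file into fixed-size pieces and derive one object ID per piece."""
    pieces = max(1, -(-file_size // stripe))      # ceiling division
    return ["%s.%08d" % (file_name, i) for i in range(pieces)]

def placement_group(object_id, pg_num=PG_NUM):
    """Hash an object ID onto a placement group (Ceph uses its own hash function)."""
    digest = hashlib.md5(object_id.encode()).hexdigest()
    return int(digest, 16) % pg_num

for oid in object_ids("vm-image.raw", 10 * 1024 * 1024):
    print(oid, "-> PG", placement_group(oid))
```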
22JAN2014
CRUSH I
● Controlled Replication Under Scalable Hashing
● Considers several pieces of information
● Cluster setup/design
● Actual cluster landscape/map
● Placement rules
● Pseudo random -> quasi statistical distribution
● Cannot cope with hot spots
● Clients, MDS and OSD can calculate object location
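As a rough intuition only: real CRUSH walks a hierarchy of buckets and applies the placement rules from the cluster map, but the "pseudo-random, deterministic and weight-aware" idea can be sketched like this. OSD names and weights are hypothetical, and the straw scaling is simplified.

```python
# Toy CRUSH-like selection: deterministic, weight-aware, no central lookup table.
# Real CRUSH uses bucket hierarchies, placement rules and failure domains.
import hashlib

OSD_WEIGHTS = {"osd.0": 1.0, "osd.1": 1.0, "osd.2": 2.0, "osd.3": 1.0}  # hypothetical

def _straw(pg, attempt, osd, weight):
    # Longer "straw" for heavier OSDs; the hash makes it deterministic per (PG, attempt).
    h = int(hashlib.md5(("%d/%d/%s" % (pg, attempt, osd)).encode()).hexdigest(), 16)
    return (h / float(2 ** 128)) * weight

def osds_for_pg(pg, replicas=3, weights=OSD_WEIGHTS):
    chosen = []
    for attempt in range(replicas):
        remaining = {o: w for o, w in weights.items() if o not in chosen}
        chosen.append(max(remaining, key=lambda o: _straw(pg, attempt, o, remaining[o])))
    return chosen

# Every client, MDS or OSD computes the same answer from the same cluster map.
print(osds_for_pg(pg=17))
```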
22JAN2014
CRUSH II
22JAN2014
Data replication
● N-way replication
● N OSD’s per placement group
● OSD’s in different failure domains
● First non-failed OSD in PG -> primary
● Read and write to primary only
● Writes forwarded by primary to replica OSD’s
● Final write commit after all writes on replica OSD
● Replication traffic within OSD network
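A minimal sketch of the primary-copy write path described above (again, not Ceph code): the client talks only to the primary OSD of the PG, the primary forwards the write to the replicas over the OSD network, and the commit is reported only after every replica has written.

```python
# Toy model of N-way, primary-copy replication within one placement group.
class OSD:
    def __init__(self, name):
        self.name = name
        self.objects = {}

    def store(self, oid, data):
        self.objects[oid] = data

def replicated_write(pg_osds, oid, data):
    primary, replicas = pg_osds[0], pg_osds[1:]   # first non-failed OSD acts as primary
    primary.store(oid, data)                      # client writes to the primary only
    for replica in replicas:
        replica.store(oid, data)                  # primary forwards to the replica OSDs
    return "commit"                               # ack only after all replicas have written

pg = [OSD("osd.%d" % i) for i in (4, 9, 17)]      # hypothetical 3-way PG
print(replicated_write(pg, "vm-image.raw.00000002", b"payload"))
```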
22JAN2014
CEPH cluster monitors
● Status information of CEPH components critical
● First contact point for new clients
● Monitors track changes of the cluster landscape
● Update cluster map
● Propagate information to OSD’s
22JAN2014
CEPH cluster map I
● Objects: computers and containers
● Container: bucket for computers or containers
● Each object has ID and weight
● Maps physical conditions
● rack location
● fire cells
22JAN2014
CEPH cluster map II
● Reflects data rules
● Number of copies
● Placement of copies
● Updated version sent to OSD’s
● OSD’s distribute cluster map within OSD cluster
● OSD re-calculates PG membership via CRUSH
– data responsibilities
– Order: primary or replica
● New I/O accepted after information synch
22JAN2014
CEPH - RADOS
● Reliable Autonomic Distributed Object Storage
● Direct access to OSD cluster via librados
● Support: C, C++, Java, Python, Ruby, PHP
● Drop/skip of POSIX layer (CEPHFS) on top
● Visible to all CEPH cluster members => shared storage
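With the Python binding of librados, talking to the OSD cluster directly looks roughly like this. A minimal sketch: the pool name 'data' and the conffile path are assumptions, and the cluster must already be reachable.

```python
# Minimal librados example using the Python binding (python-rados).
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')   # assumed config path
cluster.connect()

ioctx = cluster.open_ioctx('data')                  # 'data' is an assumed pool name
ioctx.write_full('hello-object', b'Hello RADOS')    # store an object
print(ioctx.read('hello-object'))                   # read it back

ioctx.close()
cluster.shutdown()
```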
22JAN2014
CEPH Block Device
● Aka RADOS block device (RBD)
● Upstream since kernel 2.6.37
● RADOS storage exposed as block device
● /dev/rbd
● qemu/KVM storage driver via librados/librbd
● Alternative to
● shared SAN/iSCSI for HA environments
● Storage HA solutions for qemu/KVM
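Besides the in-kernel /dev/rbd driver, RBD images can also be managed through librbd. A minimal sketch with the Python binding; the pool name 'rbd' and the image name are assumptions.

```python
# Create and inspect an RBD image via librbd's Python binding (python-rbd).
import rados
import rbd

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('rbd')                  # assumed pool for block devices

rbd.RBD().create(ioctx, 'vm-disk-01', 10 * 1024 ** 3)   # 10 GiB thin-provisioned image
image = rbd.Image(ioctx, 'vm-disk-01')
print('size:', image.size())                       # qemu/KVM can attach this via librbd
image.close()

ioctx.close()
cluster.shutdown()
```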
22JAN2014
The RADOS picture
22JAN2014
CEPH Object Gateway
● Aka RADOS Gateway (RGW)
● RESTful API
● Amazon S3 -> s3 tools work
● SWIFT API's
● Proxy HTTP to RADOS
● Tested with Apache and lighttpd
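Because RGW speaks the Amazon S3 dialect, standard S3 client libraries work against it. A minimal sketch using the classic boto library; the gateway host name and the credentials are placeholders.

```python
# Talk to the CEPH Object Gateway through its S3-compatible API using boto.
import boto
import boto.s3.connection

conn = boto.connect_s3(
    aws_access_key_id='ACCESS_KEY',               # placeholder credentials
    aws_secret_access_key='SECRET_KEY',
    host='rgw.example.com',                       # hypothetical gateway host
    is_secure=False,                              # plain HTTP in this sketch
    calling_format=boto.s3.connection.OrdinaryCallingFormat(),
)

bucket = conn.create_bucket('demo-bucket')
key = bucket.new_key('hello.txt')
key.set_contents_from_string('Hello RGW')
print([k.name for k in bucket.list()])
```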
22JAN2014
CEPH Take Aways
● Scalable
● Flexible configuration
● No SPOF, yet built on commodity hardware
● Accessible via different interfaces
22JAN2014
Openstack
22JAN2014
What?
● Infrastructure as a Service (IaaS)
● 'Open source' version of AWS
● New versions every 6 months
● Current release: Havana
● Previous release: Grizzly
● Managed by Openstack Foundation
22JAN2014
Openstack architecture
22JAN2014
Openstack Components
● Keystone – identity
● Glance – image
● Nova – compute
● Cinder – block
● Swift – object
● Neutron – network
● Horizon – dashboard
22JAN2014
About Glance
● There since almost the beginning
● Image store
● Server
● Disk
● Several formats
● raw
● qcow2
● ...
● Shared/non-shared
22JAN2014
About Cinder
● Kind of new
● Part of Nova before
● Separate since Folsom
● Block storage
● For compute instances
● Managed by Openstack users
● Several storage back-ends possible
22JAN2014
About Swift
● Since the beginning
● Replacement for Amazon S3 -> cloud storage
● Scalable
● Redundant
● Object store
22JAN2014
Openstack Object Store
● Proxy
● Object
● Container
● Account
● Auth
22JAN2014
Dream Team: CEPH and Openstack
22JAN2014
Why CEPH in the first place?
● Full blown storage solution
● Support
● Operational model
● Cloud'ish
● Separation of duties
● One solution for different storage needs
22JAN2014
Integration
● Full integration with RADOS
● Since Folsom
● Two parts
● Authentication
● Technical access
● Both parties must be aware
● Independent for each of the storage components
● Different RADOS pools (see the sketch below)
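One RADOS pool per consumer is the usual layout; for example, dedicated pools for Glance images and Cinder volumes can be created programmatically. A sketch with python-rados; the pool names follow the common convention but are not mandated.

```python
# Create dedicated RADOS pools for the OpenStack services (sketch, python-rados).
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()

for pool in ('images', 'volumes'):        # commonly used names for Glance and Cinder
    if not cluster.pool_exists(pool):
        cluster.create_pool(pool)
        print('created pool', pool)

cluster.shutdown()
```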
22JAN2014
Authentication
● CEPH part
● Key rings
● Configuration
● For Cinder and Glance
● Openstack part
● Keystone
– Only for Swift
– Needs RGW
22JAN2014
Access to RADOS/RBD I
● Via API/libraries => no CEPHFS needed
● Easy for Glance/Cinder
● CEPH keyring configuration
● Update of ceph.conf
● Update of Glance/Cinder API configuration
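On the CEPH side this boils down to giving Glance and Cinder their own cephx users and keyrings. From the client's perspective, connecting with such a dedicated identity looks roughly like this; the user name, keyring path and pool name are assumptions, and the matching client section must exist in ceph.conf.

```python
# Connect to RADOS as a dedicated cephx user (e.g. the one used by Glance).
import rados

cluster = rados.Rados(
    conffile='/etc/ceph/ceph.conf',
    rados_id='glance',                                          # assumed cephx user name
    conf={'keyring': '/etc/ceph/ceph.client.glance.keyring'},   # assumed keyring path
)
cluster.connect()

ioctx = cluster.open_ioctx('images')      # the pool this user is allowed to use
print('connected as client.glance, pool "images" opened')
ioctx.close()
cluster.shutdown()
```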
22JAN2014
Access to RADOS/RBD II
● Swift => more work needed
● CEPHFS => again: not needed
● CEPH Object Gateway
– Web server
– RGW software
– Keystone certificates
– No support for v2 Swift authentication
● Keystone authentication
– Endpoint configuration -> RGW
22JAN2014
Integration the full picture
[Diagram: Keystone and Swift use the CEPH Object Gateway; Glance, Cinder and Nova (via qemu/kvm) use the CEPH Block Device; both sit on top of RADOS]
22JAN2014
Integration pitfalls
● CEPH versions not in sync
● Authentication
● CEPH Object Gateway setup
22JAN2014
Why CEPH - reviewed
● Previous arguments still valid :-)
● Modular usage
● High integration
● Built-in HA
● No need for POSIX compatible interface
● Works even with other IaaS implementations
22JAN2014
Summary
22JAN2014
Take Aways
● Sophisticated storage engine
● Mature OSS distributed storage solution
● Other use cases
● Can be used elsewhere
● Full integration in Openstack
22JAN2014
References
● http://ceph.com
● http://www.openstack.org
22JAN2014
Thank you!
22JAN2014
Petabyte Scale Out Rocks
CEPH as replacement for Openstack's Swift & Co.
Udo Seidel