Ceph: one decade in 
Sage Weil
RESEARCH
RESEARCH INCUBATION
RESEARCH INCUBATION INKTANK
Research beginnings 
RESEARCH
UCSC research grant 
● “Petascale object storage” 
● DOE: LANL, LLNL, Sandia 
● Scalability, reliability, performance 
● HPC file system workloads 
● Scalable metadata management 
● First line of Ceph code 
● Summer internship at LLNL 
● High security national lab environment 
● Could write anything, as long as it was OSS
The rest of Ceph 
● RADOS – distributed object storage cluster (2005) 
● EBOFS – local object storage (2004/2006) 
● CRUSH – hashing for the real world (2005) 
● Paxos monitors – cluster consensus (2006) 
→ emphasis on consistent, reliable storage 
→ scale by pushing intelligence to the edges 
→ a different but compelling architecture
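
The key idea behind CRUSH is that clients compute placement with a deterministic hash over the cluster map instead of consulting a central lookup table. The sketch below is not CRUSH itself (real CRUSH also handles device weights, bucket types, and failure-domain rules); it is a minimal rendezvous-hashing stand-in in Python, with made-up OSD and object names, just to show how every client can independently agree on where an object lives.

```python
import hashlib

def placement(object_name, osds, replicas=3):
    """Toy stand-in for CRUSH: rank OSDs by a per-(object, OSD) hash
    and take the top `replicas`. Every client computes the same answer
    with no lookup table. Real CRUSH additionally honors weights and
    failure domains (host, rack, row, ...)."""
    def score(osd):
        digest = hashlib.sha1(f"{object_name}:{osd}".encode()).hexdigest()
        return int(digest, 16)
    return sorted(osds, key=score, reverse=True)[:replicas]

osds = [f"osd.{i}" for i in range(12)]
print(placement("rbd_data.1234.0000000000000001", osds))
```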
Industry black hole 
● Many large storage vendors 
● Proprietary solutions that don't scale well 
● Few open source alternatives (2006) 
● Very limited scale, or 
● Limited community and architecture (Lustre) 
● No enterprise feature sets (snapshots, quotas) 
● PhD grads all built interesting systems... 
● ...and then went to work for NetApp, DDN, EMC, Veritas. 
● They want you, not your project
A different path? 
● Change the storage world with open source 
● Do what Linux did to Solaris, Irix, Ultrix, etc. 
● License 
● LGPL: share changes, okay to link to proprietary code 
● Avoid unfriendly practices 
● Dual licensing 
● Copyright assignment 
● Platform 
● Remember sourceforge.net?
Incubation 
RESEARCH INCUBATION
DreamHost! 
● Move back to LA, continue hacking 
● Hired a few developers 
● Pure development 
● No deliverables
Ambitious feature set 
● Native Linux kernel client (2007-) 
● Per-directory snapshots (2008) 
● Recursive accounting (2008) 
● Object classes (2009) 
● librados (2009) 
● radosgw (2009) 
● strong authentication (2009) 
● RBD: rados block device (2010)
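
librados exposes the RADOS object store directly to applications, and radosgw and RBD are built on top of it. A minimal sketch using the python-rados bindings, assuming a reachable cluster, a readable /etc/ceph/ceph.conf, and an existing pool; the pool and object names here are placeholders:

```python
import rados

# Connect using the local cluster configuration and keyring.
cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()

try:
    # 'data' is an assumed pool name; substitute any existing pool.
    ioctx = cluster.open_ioctx('data')
    try:
        # RADOS objects are flat name -> bytes, with xattrs and
        # object classes layered on top.
        ioctx.write_full('hello_object', b'hello ceph')
        print(ioctx.read('hello_object'))
    finally:
        ioctx.close()
finally:
    cluster.shutdown()
```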
The kernel client 
● ceph-fuse was limited, not very fast 
● Build native Linux kernel implementation 
● Began attending Linux file system developer events 
(LSF) 
● Early words of encouragement from ex-Lustre dev 
● Engage Linux fs developer community as peer 
● Initial merge attempts rejected by Linus 
● Not sufficient evidence of user demand 
● A few fans and would-be users chimed in... 
● Eventually merged for v2.6.34 (early 2010)
Part of a larger ecosystem 
● Ceph need not solve all problems as monolithic 
stack 
● Replaced ebofs object file system with btrfs 
● Same design goals; avoid reinventing the wheel 
● Robust, supported, well-optimized 
● Kernel-level cache management 
● Copy-on-write, checksumming, other goodness 
● Contributed some early functionality 
● Cloning files 
● Async snapshots
Budding community 
● #ceph on irc.oftc.net, ceph-devel@vger.kernel.org 
● Many interested users 
● A few developers 
● Many fans 
● Too unstable for any real deployments 
● Still mostly focused on right architecture and 
technical solutions
Road to product 
● DreamHost decides to build an S3-compatible 
object storage service with Ceph 
● Stability 
● Focus on core RADOS, RBD, radosgw 
● Paying back some technical debt 
● Build testing automation 
● Code review! 
● Expand engineering team
The reality 
● Growing incoming commercial interest 
● Early attempts from organizations large and small 
● Difficult to engage with a web hosting company 
● No means to support commercial deployments 
● Project needed a company to back it 
● Fund the engineering effort 
● Build and test a product 
● Support users 
● Orchestrated a spin-out of DreamHost in 2012
Inktank 
RESEARCH INCUBATION INKTANK
Do it right 
● How do we build a strong open source company? 
● How do we build a strong open source community? 
● Models? 
● Red Hat, SUSE, Cloudera, MySQL, Canonical, … 
● Initial funding from DreamHost, Mark 
Shuttleworth
Goals 
● A stable Ceph release for production deployment 
● DreamObjects 
● Lay foundation for widespread adoption 
● Platform support (Ubuntu, Red Hat, SUSE) 
● Documentation 
● Build and test infrastructure 
● Build a sales and support organization 
● Expand engineering organization
Branding 
● Early decision to engage professional agency 
● Terms like 
● “Brand core” 
● “Design system” 
● Company vs Project 
● Inktank != Ceph 
● Establish a healthy relationship with the community 
● Aspirational messaging: The Future of Storage
Slick graphics 
● broken PowerPoint template
Traction 
● Too many production deployments to count 
● We don't know about most of them! 
● Too many customers (for me) to count 
● Growing partner list 
● Lots of buzz 
● OpenStack
Quality 
● Increased adoption means increased demands on 
robust testing 
● Across multiple platforms 
● Include platforms we don't use 
● Upgrades 
● Rolling upgrades 
● Inter-version compatibility
Developer community 
● Significant external contributors 
● First-class feature contributions from outside contributors 
● Non-Inktank participants in daily stand-ups 
● External access to build/test lab infrastructure 
● Common toolset 
● GitHub 
● Email (kernel.org) 
● IRC (oftc.net) 
● Linux distros
CDS: Ceph Developer Summit 
● Community process for building project roadmap 
● 100% online 
● Google hangouts 
● Wikis 
● Etherpad 
● First was in Spring 2013, sixth is coming up 
● (Continuing to) indoctrinate our own developers in an open development model
And then... 
s/Red Hat of Storage/Storage of Red Hat/
Calamari 
● Inktank strategy was to package Ceph for the 
Enterprise 
● Inktank Ceph Enterprise (ICE) 
● Ceph: a hardened, tested, validated version 
● Calamari: management layer and GUI (proprietary!) 
● Enterprise integrations: SNMP, Hyper-V, VMware 
● Support SLAs 
● Red Hat model is pure open source 
● Open sourced Calamari
The Present 
Tiering 
● Client-side caches are great, but only buy so much. 
● Can we separate hot and cold data onto different 
storage devices? 
● Cache pools: promote hot objects from an existing pool 
into a fast (e.g., FusionIO) pool 
● Cold pools: demote cold data to a slow, archival pool (e.g., erasure coded; not yet implemented) 
● Very cold pools (efficient erasure coding, compression, OSD spin-down to save power), or tape/public cloud 
● How do you identify what is hot and cold? 
● Common in enterprise solutions; not found in open 
source scale-out systems 
→ cache pools new in Firefly, better in Giant
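
Ceph's cache pools track recency on the OSDs with HitSets (bloom filters). The sketch below is not that implementation; it is only a toy illustration of the question the slide raises: count recent accesses per object, promote what crosses a threshold, and let everything else stay cold.

```python
import time
from collections import defaultdict

class HotColdTracker:
    """Toy hot/cold classifier: count hits per object in a sliding
    time window. Not Ceph's HitSet mechanism, just the idea."""
    def __init__(self, window=60.0, hot_threshold=3):
        self.window = window
        self.hot_threshold = hot_threshold
        self.hits = defaultdict(list)   # object -> access timestamps

    def record_read(self, obj):
        now = time.time()
        recent = [t for t in self.hits[obj] if now - t < self.window]
        recent.append(now)
        self.hits[obj] = recent

    def is_hot(self, obj):
        now = time.time()
        recent = [t for t in self.hits[obj] if now - t < self.window]
        return len(recent) >= self.hot_threshold

tracker = HotColdTracker()
for _ in range(4):
    tracker.record_read("obj-A")   # read repeatedly: hot, candidate for promotion
tracker.record_read("obj-B")       # touched once: stays in the cold pool
print(tracker.is_hot("obj-A"), tracker.is_hot("obj-B"))
```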
Erasure coding 
● Replication for redundancy is flexible and fast 
● For larger clusters, it can be expensive 
● We can trade recovery performance for storage overhead 
● Erasure coded data is hard to modify, but ideal for cold or read-only objects 
● Cold storage tiering 
● Will be used directly by radosgw 

                  Storage overhead   Repair traffic   MTTDL (days)
  3x replication  3x                 1x               2.3 E10
  RS (10, 4)      1.4x               10x              3.3 E13
  LRC (10, 6, 5)  1.6x               5x               1.2 E15
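
The storage and repair columns follow from simple arithmetic: a k+m code stores (k+m)/k bytes per byte of user data, and repairing one lost chunk reads roughly k surviving chunks (or only the local group for LRC). A back-of-the-envelope check in Python, using the table's parameters:

```python
def overhead(k, m):
    """Raw bytes stored per byte of user data for a k+m code."""
    return (k + m) / k

# 3x replication: three full copies; repair reads a single surviving copy.
print("3x replication:", 3.0, "x storage, 1x repair traffic")

# Reed-Solomon (10, 4): 10 data + 4 coding chunks; rebuilding one lost
# chunk reads the 10 surviving chunks.
print("RS(10, 4):     ", overhead(10, 4), "x storage, 10x repair traffic")

# LRC (10, 6, 5): extra local parity raises overhead to 1.6x, but a
# single-chunk repair only reads its local group of ~5 chunks.
print("LRC(10, 6, 5): ", overhead(10, 6), "x storage, ~5x repair traffic")
```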
Erasure coding (cont'd) 
● In Firefly 
● LRC in Giant 
● Intel ISA-L (optimized library) in Giant, maybe 
backported to Firefly 
● Talk of ARM-optimized (NEON) jerasure
Async Replication in RADOS 
● Clinic project with Harvey Mudd College 
● Group of students working on real world project 
● Reason about bounds on clock drift so we can 
achieve point-in-time consistency across a 
distributed set of nodes
CephFS 
● Dogfooding for internal QA infrastructure 
● Learning lots 
● Many rough edges, but working quite well! 
● We want to hear from you!
The Future 
CephFS 
→ This is where it all started – let's get there 
● Today 
● QA coverage and bug squashing continues 
● NFS and CIFS now largely complete and robust 
● Multi-MDS stability continues to improve 
● Need 
● QA investment 
● Snapshot work 
● Amazing community effort
The larger ecosystem
Storage backends 
● Backends are pluggable 
● Recent work to use rocksdb everywhere leveldb 
can be used (mon/osd); can easily plug in other 
key/value store libraries 
● Other possibilities include LMDB or NVMKV (from Fusion-io) 
● Prototype kinetic backend 
● Alternative OSD backends 
● KeyValueStore – put all data in a k/v db (Haomai @ 
UnitedStack) 
● KeyFileStore initial plans (2nd gen?) 
● Some partners looking at backends tuned to their 
hardware
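
The appeal of a key/value backend is that object data and metadata map naturally onto an ordered k/v store. The sketch below is only a conceptual illustration of that mapping, not the actual KeyValueStore code: an in-memory dict stands in for leveldb/rocksdb, and each object is striped into fixed-size value chunks under prefixed keys.

```python
class ToyKeyValueStore:
    """Conceptual sketch of a k/v-backed object store: stripe each
    object into fixed-size chunks keyed by (object, chunk index).
    A real backend would sit on leveldb/rocksdb/LMDB with transactions."""
    def __init__(self, stripe=4096):
        self.kv = {}          # stand-in for the k/v library
        self.stripe = stripe

    def write(self, obj, data):
        for off in range(0, max(len(data), 1), self.stripe):
            self.kv[f"data/{obj}/{off // self.stripe:08d}"] = data[off:off + self.stripe]
        self.kv[f"meta/{obj}/size"] = str(len(data)).encode()

    def read(self, obj):
        size = int(self.kv[f"meta/{obj}/size"])
        chunks = [self.kv[f"data/{obj}/{off // self.stripe:08d}"]
                  for off in range(0, max(size, 1), self.stripe)]
        return b"".join(chunks)[:size]

store = ToyKeyValueStore()
store.write("rbd_data.1", b"x" * 10000)
assert store.read("rbd_data.1") == b"x" * 10000
```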
Governance 
How do we strengthen the project community? 
● Recognize project leads 
● RBD, RGW, RADOS, CephFS, Calamari, etc. 
● Formalize processes around CDS, community 
roadmap 
● Formal foundation? 
● Community build and test lab infrastructure 
● Build and test for broad range of OSs, distros, hardware
Technical roadmap 
● How do we reach new use-cases and users? 
● How do we better satisfy existing users? 
● How do we ensure Ceph can succeed in enough 
markets for business investment to thrive? 
● Enough breadth to expand and grow the 
community 
● Enough focus to do well
Performance 
● Lots of work with partners to improve performance 
● High-end flash back ends. Optimize hot paths to 
limit CPU usage, drive up IOPS 
● Improve threading, fine-grained locks 
● Low-power processors. Run well on small ARM 
devices (including those new-fangled ethernet 
drives)
Ethernet Drives 
● Multiple vendors are building 'ethernet drives' 
● Normal hard drives w/ small ARM host on board 
● Could run OSD natively on the drive, completely 
remove the “host” from the deployment 
● Many different implementations, some vendors need help 
w/ open architecture and ecosystem concepts 
● Current devices are hard disks; no reason they 
couldn't also be flash-based, or hybrid 
● This is exactly what we were thinking when Ceph 
was originally designed!
Big data? 
● Hadoop map/reduce built on a weak “file system” 
model 
● s/HDFS/CephFS/ 
● It is easier to move computation than data 
● RADOS “classes” allow computation to be moved to 
storage nodes 
● New “methods” (beyond read/write) for your “objects” 
● Simple sandbox, very extensible 
● Still looking for killer use-case
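
Object classes let a small method run on the OSD next to the object's bytes, so only the result crosses the network. The sketch below is a conceptual model of that dispatch only, not the librados API: real classes are C/C++ plugins loaded by the OSD and invoked through librados exec calls, and all names here are made up.

```python
import json

# Toy "OSD": stored objects plus a registry of named class methods.
objects = {"wordcount.obj": b"the quick brown fox jumps over the lazy dog"}
classes = {}

def register(cls, method):
    """Register a server-side method under class.method."""
    def wrap(fn):
        classes[(cls, method)] = fn
        return fn
    return wrap

@register("demo", "word_count")
def word_count(data, _input):
    # Runs next to the data; only the small JSON result is returned.
    return json.dumps({"words": len(data.split())}).encode()

def execute(oid, cls, method, payload=b""):
    """Toy stand-in for invoking an object class method on the OSD."""
    return classes[(cls, method)](objects[oid], payload)

print(execute("wordcount.obj", "demo", "word_count"))
```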
Archival storage 
● Erasure coding makes storage cost compelling 
● Especially when combined with ethernet drives 
● Already do background scrubbing, repair 
● Data integrity 
● Yes: over the wire 
● Yes: at rest (...for erasure coded content) 
● Yes: background scrubbing 
● Still need end-to-end verification
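
Until end-to-end verification exists in the stack itself, a client can approximate it: checksum on write, store the digest alongside the object, and re-verify on read. A hedged sketch of that pattern with the python-rados bindings; the pool and object names are placeholders and a reachable cluster is assumed.

```python
import hashlib
import rados

def write_verified(ioctx, name, data):
    """Store the object plus a client-side SHA-256 digest as an xattr."""
    ioctx.write_full(name, data)
    ioctx.set_xattr(name, 'client-sha256',
                    hashlib.sha256(data).hexdigest().encode())

def read_verified(ioctx, name, length=8 << 20):
    """Read the object back and fail loudly if the digest no longer matches."""
    data = ioctx.read(name, length)
    expected = ioctx.get_xattr(name, 'client-sha256').decode()
    if hashlib.sha256(data).hexdigest() != expected:
        raise IOError(f"checksum mismatch for {name}")
    return data

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('data')   # assumed pool name
write_verified(ioctx, 'archive.chunk.0', b'important bytes')
print(read_verified(ioctx, 'archive.chunk.0'))
ioctx.close()
cluster.shutdown()
```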
The enterprise 
How do we pay for all our toys? 
● Support legacy and transitional interfaces 
● iSCSI, NFS, pNFS, CIFS 
● VMware, Hyper-V 
● Identify the beachhead use-cases 
● Only takes one use-case to get in the door 
● Single platform – shared storage resource 
● Bottom-up: earn respect of engineers and admins 
● Top-down: strong brand and compelling product
Why we can beat the old guard 
● Strong foundational architecture 
● It is hard to compete with free and open source 
software 
● Unbeatable value proposition 
● Ultimately a more efficient development model 
● It is hard to manufacture community 
● Native protocols, Linux kernel support 
● Unencumbered by legacy protocols like NFS 
● Move beyond traditional client/server model 
● Ongoing paradigm shift 
● Software-defined infrastructure, data center
Thanks!
