Ceph Research at UCSC
Noah Watkins
● Graduate student
● UC Santa Cruz
● Data management, file systems, HPC, QoS
● Ceph as a prototyping platform
Storage research at UCSC
● Data management
● Storage systems
● High-performance computing
● Quality of service
● Real-time systems
Storage research at UCSC
● SIRIUS Project (DOE)
● Programmable storage (CROSS, NSF)
● Declarative storage (NSF)
● IRIS-HEP (NSF)
SIRIUS: Science-driven Data Management for Multi-tiered Storage (ORNL, Sandia, Brown, Rutgers, UCSC)
● Storage challenges for exascale systems
● (1) Heterogeneity
○ Where should data be stored?
● (2) Predictable performance
○ Millions of processes performing I/O
● Many challenges...
DOE SSIO, “Science-Driven Data Management for Multi-Tiered Storage” with ORNL and Sandia, award DE-SC0016074
Malacology
A Programmable Storage System
[Sevilla et al. EuroSys '17]
Michael A. Sevilla, Noah Watkins, Ivo Jimenez,
Peter Alvaro, Shel Finkelstein, Jeff LeFevre, Carlos Maltzahn
University of California, Santa Cruz
Malacology: programmable interface research platform
[Figure labels: target storage interface (goal); internal system services (building blocks); composed, generic service glue layer]
Malacology: A Programmable Storage System, M. Sevilla, N. Watkins, I. Jimenez, P. Alvaro, S. Finkelstein, J. LeFevre, C. Maltzahn, EuroSys '17
Mantle: A Programmable Metadata Load Balancer for the Ceph File System, M. Sevilla, N. Watkins, C. Maltzahn, I. Nassi, S. Brandt, et al., SC '15
CORFU: A Shared Log Design for Flash Clusters, M. Balakrishnan, D. Malkhi, V. Prabhakaran, T. Wobber, M. Wei, et al., NSDI '12
How to grow a database: scale-up
[Figure labels: database node (CPU, RAM); database storage; network / bus; query Q]
https://aws.amazon.com/ec2/instance-types/
Skyhook project
● Elastic database system
● Lead: Jeff LeFevre
● Active CROSS incubator
Skyhook: exploit storage resources
[Figure, two panels: "Single-node architecture" (database node with CPU and RAM, database storage, network / bus, query Q) and "Skyhook architecture" (database node with CPU and RAM, programmable storage, network, queries Q)]
Skyhook: aligns data with storage interfaces
[Figure labels: database node (RAM, CPU) with foreign data wrappers; DB-specific data interface; Ceph cluster of Ceph OSDs, each with RAM, CPU, and storage+index; table with columns C1, C2, C3 partitioned into shards { object.i }; queries Q]
App-specific interface:
* Indexing
* Projection
* Filtering
* Aggregation
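The partitioning step on this slide (a table split into shards stored as `{ object.i }`) can be illustrated with a small sketch. This is not Skyhook code: the row layout, the `partition_rows` helper, and the fixed rows-per-object policy are assumptions made only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical row layout: (orderkey, linenumber, extendedprice).
Row = Tuple[int, int, float]


def partition_rows(rows: List[Row], rows_per_object: int) -> Dict[str, List[Row]]:
    """Split a table into fixed-size shards named object.0, object.1, ..."""
    if rows_per_object <= 0:
        raise ValueError("rows_per_object must be positive")
    objects: Dict[str, List[Row]] = {}
    for start in range(0, len(rows), rows_per_object):
        name = f"object.{start // rows_per_object}"
        objects[name] = rows[start:start + rows_per_object]
    return objects


# Toy table of 10 rows split into shards of at most 4 rows each.
table = [(k, 1, 100.0 * k) for k in range(10)]
shards = partition_rows(table, rows_per_object=4)
```

Each shard would then be written to its own storage object, so that indexing, projection, filtering, and aggregation can run per object.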
Skyhook experiments with programmable storage
● Real-world dataset
○ TPC lineitem table
○ 1 billion rows
○ 140 GB
● Storage in Ceph objects
○ Table divided into ~10,000 14 MB objects
■ Optimize for workload (e.g. 4MB)
○ Each object contains a dedicated index
■ Index stored in omap (RocksDB)
● Storage hardware (thanks CloudLab!)
○ Modern 20 core Intel
○ 128 GB DRAM, 500 GB SSD
○ 10 Gb/s Ethernet
○ 1 -- 16 Ceph nodes
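The dataset figures above are consistent with each other, which a few lines of arithmetic can confirm. The sketch assumes decimal units (1 GB = 10^9 bytes) and treats "~10,000" as exactly 10,000; both are assumptions, not statements from the slides.

```python
TABLE_BYTES = 140 * 10**9   # 140 GB, assuming decimal gigabytes
NUM_OBJECTS = 10_000        # "~10,000" taken as exact for this estimate
NUM_ROWS = 10**9            # 1 billion rows

object_mb = TABLE_BYTES / NUM_OBJECTS / 10**6    # average object size in MB
rows_per_object = NUM_ROWS // NUM_OBJECTS        # average rows per object
bytes_per_row = TABLE_BYTES / NUM_ROWS           # average row size in bytes
objects_at_4mb = TABLE_BYTES // (4 * 10**6)      # object count if tuned to 4 MB

print(object_mb, rows_per_object, bytes_per_row, objects_at_4mb)
```

Under these assumptions each object holds about 100,000 rows, and retuning to 4 MB objects would raise the object count to roughly 35,000.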
[Figure labels: database node (CPU, RAM); network; programmable storage (database-specific data interface); queries Q]
Benchmark queries evaluated
Qa: Range query with 10% selectivity:
SELECT * FROM lineitem WHERE extendedprice > 71000.0
Qb: Point query (unique row) issued with and without index:
SELECT extendedprice
FROM lineitem
WHERE orderkey=5 AND linenumber=3
Qc: Regex query with 10% selectivity (CPU intensive):
SELECT * FROM lineitem WHERE comment iLIKE '%uriously%'
Range query performance (10% selectivity)
Improved I/O performance
● Local I/O bandwidth
● Local CPU resources
● Reduced network traffic
● CPU parallelism
[Figure: chart comparing "Client-side processing" with "Server-side processing" (axis note "Lower="); diagram labels: database node (CPU, RAM), database storage, network]
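The "reduced network traffic" point can be made concrete with a toy model. The sketch below is an assumption-laden illustration, not Skyhook's implementation: objects are plain Python lists, the "network" is just a count of rows returned to the database node, and the data values are invented.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, float]
Predicate = Callable[[Row], bool]


def client_side(objects: List[List[Row]], pred: Predicate) -> Tuple[List[Row], int]:
    """Ship every row to the database node, then filter there."""
    shipped = [row for obj in objects for row in obj]
    return [row for row in shipped if pred(row)], len(shipped)


def server_side(objects: List[List[Row]], pred: Predicate) -> Tuple[List[Row], int]:
    """Filter next to the data in each object; ship only matching rows."""
    result: List[Row] = []
    for obj in objects:
        result.extend(row for row in obj if pred(row))
    return result, len(result)


# Toy data: 4 objects of 25 rows each, prices 0, 1000, ..., 99000.
objects = [
    [{"extendedprice": 1000.0 * i} for i in range(start, start + 25)]
    for start in range(0, 100, 25)
]


def over_threshold(row: Row) -> bool:
    # Same predicate shape as query Qa.
    return row["extendedprice"] > 71000.0


client_rows, client_shipped = client_side(objects, over_threshold)
server_rows, server_shipped = server_side(objects, over_threshold)
```

Both paths return the same rows; only the number of rows crossing the network differs. In a real cluster the per-object filters would also run in parallel on the storage nodes' CPUs.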
Point query performance (find unique row)
● Local I/O bandwidth
● Local CPU resources
● Reduced network traffic
● CPU parallelism
● 10,000 index lookups!
● 1 billion rows
[Figure: chart comparing "Client-side processing", "Server-side processing", and "Server-side processing with index acceleration" (axis note "Lower="); diagram labels: database node (CPU, RAM), database storage, network]
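Why a point query becomes "10,000 index lookups" rather than a scan of 1 billion rows can be sketched as follows. The `IndexedObject` class is hypothetical: a Python dict stands in for the per-object index that the slides say is stored in omap, and no Ceph API is used.

```python
from typing import Dict, List, Optional, Tuple

Row = Tuple[int, int, float]  # (orderkey, linenumber, extendedprice)
Key = Tuple[int, int]


class IndexedObject:
    """One table shard plus a dedicated index (a dict stands in for omap)."""

    def __init__(self, rows: List[Row]) -> None:
        self.rows = list(rows)
        self.index: Dict[Key, int] = {
            (r[0], r[1]): pos for pos, r in enumerate(self.rows)
        }

    def scan(self, key: Key) -> Tuple[Optional[Row], int]:
        """Full scan: examines every row in the object."""
        hits = [r for r in self.rows if (r[0], r[1]) == key]
        return (hits[0] if hits else None), len(self.rows)

    def lookup(self, key: Key) -> Tuple[Optional[Row], int]:
        """Index lookup: one probe per object."""
        pos = self.index.get(key)
        return (self.rows[pos] if pos is not None else None), 1


def point_query(objects: List[IndexedObject], key: Key,
                use_index: bool) -> Tuple[Optional[Row], int]:
    """Ask every object for the key; return the row and total work done."""
    found, work = None, 0
    for obj in objects:
        row, cost = obj.lookup(key) if use_index else obj.scan(key)
        work += cost
        found = found or row
    return found, work


# Toy data: 5 objects of 20 rows each; query shaped like Qb.
objs = [IndexedObject([(k, 3, float(k)) for k in range(s, s + 20)])
        for s in range(0, 100, 20)]
row_scan, scan_work = point_query(objs, (5, 3), use_index=False)
row_idx, index_work = point_query(objs, (5, 3), use_index=True)
```

With the index, the work is one probe per object (5 here, 10,000 in the experiment) instead of one comparison per row.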
zlog: distributed shared-log for
software-defined storage
zlog: implementation of CORFU on Ceph
[Figure: shared log positions 1, 2, 3, ..., 14, ...]
● Extend the benefits of software-defined storage to log abstraction
○ Transparently select storage media and physical design
○ Take advantage of performance upgrades and new features
○ Offload critical components such as replication and erasure-coding
[Figure labels: LevelDB, RocksDB, WiredTiger; librados; osd, osd, osd]
CORFU protocol enforced by a custom, transactional storage interface.
Balakrishnan et al., “CORFU: A Shared Log Design for Flash Clusters”, NSDI '12
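The core idea of a CORFU-style shared log is that a sequencer hands out log positions and the storage layer enforces write-once semantics for each position. The sketch below is a single-process illustration of that idea under stated assumptions; it is not zlog's API, and the class names are hypothetical. It omits striping, replication, hole filling, and epoch handling.

```python
from typing import Dict


class Sequencer:
    """Hands out the next free log position (starting at 1, as in the figure)."""

    def __init__(self) -> None:
        self._next = 1

    def next_position(self) -> int:
        pos = self._next
        self._next += 1
        return pos


class WriteOnceStore:
    """Stand-in for the storage interface: each position is written at most once."""

    def __init__(self) -> None:
        self._entries: Dict[int, bytes] = {}

    def write(self, pos: int, data: bytes) -> None:
        if pos in self._entries:
            raise ValueError(f"position {pos} already written")
        self._entries[pos] = data

    def read(self, pos: int) -> bytes:
        return self._entries[pos]  # raises KeyError if never written


class SharedLog:
    """Client-side append: get a position, then write it exactly once."""

    def __init__(self, sequencer: Sequencer, store: WriteOnceStore) -> None:
        self.sequencer = sequencer
        self.store = store

    def append(self, data: bytes) -> int:
        pos = self.sequencer.next_position()
        self.store.write(pos, data)
        return pos

    def read(self, pos: int) -> bytes:
        return self.store.read(pos)


store = WriteOnceStore()
log = SharedLog(Sequencer(), store)
first = log.append(b"a")
second = log.append(b"b")
```

Because the store rejects a second write to the same position, correctness does not depend on clients or the sequencer behaving perfectly; the storage interface is what enforces the protocol.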
[Figure labels: MDS cluster; rebalance; Mantle API; admin]
The state of programmable storage is messy
● Large design space
○ High cost of searching this space
● Costs are difficult to predict
○ A simple upgrade can change the calculus!
● Much harder than what we have presented
○ > 500 tunables/settings in Ceph
■ Not counting dependencies
○ Runs on a wide variety of hardware
● No hope of migrating to a new system
○ There are no standards!
More programmability work:
“Data Center Scale Programmable Storage” with Dirk Grunwald (CU Boulder),
NSF #1705021
DeclStore: Layering is for the Faint of Heart
Noah Watkins, Michael Sevilla, Ivo Jimenez, Kathryn
Dahlgren, Peter Alvaro, Shel Finkelstein, Carlos Maltzahn
[HotStorage, July 2017]
Declarative storage
● Automate parts of this process
○ Searching the design space
○ Generating implementations
[Figure labels: query optimization & plan generation; cost model]
● Express interfaces declaratively
○ Eliminate need for storage system expertise
○ High-level abstractions across services / systems
● Prototyping with the language
○ Formal underpinning, demonstrated across domains
○ Can express all of the CORFU semantics
“Declarative Programmable Storage”,
with Peter Alvaro, NSF #1764102
- 2018
IRIS-HEP
● $25 million NSF-funded 5-year project with Princeton
● Institute for Research and Innovation (IRIS)
● High Energy Physics (HEP)
● Exascale storage and analysis challenges

Ceph Research at UCSC

  • 1. Ceph Research at UCSC Noah Watkins
  • 2. 2 ● Graduate student ● UC Santa Cruz ● Data management, file systems, HPC, QoS ● Ceph as a prototyping platform Storage research at UCSC ● Data management ● Storage systems ● High-performance computing ● Quality of service ● Real-time systems
  • 4. 4 ● Graduate student ● UC Santa Cruz ● Data management, file systems, HPC, QoS ● Ceph as a prototyping platform Storage research at UCSC ● SIRIUS Project (DOE) ● Programmable storage (CROSS, NSF) ● Declarative storage (NSF) ● IRIS-HEP (NSF)
  • 5. SIRIUS: Science-driven Data Management for Multi-tiered Storage (ORNL, Sandia, Brown, Rutgers, UCSC) ● Storage challenges for exascale systems ● (1) Heterogeneity ○ Where should data be stored? ● (2) Predictable performance ○ Millions of processes performing I/O ● Many challenges... DOE SSIO, “Science-Driven Data Management for Multi-Tiered Storage” with ORNL and Sandia, award DE-SC0016074
  • 6. Malacology: A Programmable Storage System [Sevilla et al., EuroSys '17]
       Michael A. Sevilla, Noah Watkins, Ivo Jimenez, Peter Alvaro, Shel Finkelstein, Jeff LeFevre, Carlos Maltzahn
       University of California, Santa Cruz
  • 7. Malacology: programmable interface research platform
       Diagram labels: target storage interface (goal); internal system services (building blocks); composed, generic service glue layer
       References:
       ● Malacology: A Programmable Storage System, M. Sevilla, N. Watkins, I. Jimenez, P. Alvaro, S. Finkelstein, J. LeFevre, C. Maltzahn, EuroSys '17
       ● Mantle: A Programmable Metadata Load Balancer for the Ceph File System, M. Sevilla, N. Watkins, C. Maltzahn, I. Nassi, S. Brandt, et al., SC '15
       ● CORFU: A Shared Log Design for Flash Clusters, M. Balakrishnan, D. Malkhi, V. Prabhakaran, T. Wobber, M. Wei, et al., NSDI '12
  • 8. How to grow a database: scale-up
       Diagram: a single database node (CPU, RAM) runs the query (Q) and reaches database storage over a network or bus.
       https://aws.amazon.com/ec2/instance-types/
       Skyhook project
       ● Elastic database system
       ● Lead: Jeff LeFevre
       ● Active CROSS incubator
  • 9. Skyhook: exploit storage resources
       Diagram, single-node architecture: the database node (CPU, RAM) runs the query (Q) against database storage over a network or bus.
       Diagram, Skyhook architecture: the database node (CPU, RAM) sends queries (Q) over the network to programmable storage, where they run on the storage nodes.
       Skyhook project
       ● Elastic database system
       ● Lead: Jeff LeFevre
       ● Active CROSS incubator
  • 10. Skyhook: aligns data with storage interfaces
       Diagram, database node: RAM, CPU, and foreign data wrappers; queries (Q) pass through a DB-specific data interface.
       Diagram, Ceph cluster: seven Ceph OSDs, each with RAM, CPU, and Storage+Index.
       Diagram, partitioning: a table with columns C1, C2, C3 is split into table shards, each stored as an object ({ object.i }).
       App-specific interface:
       * Indexing
       * Projection
       * Filtering
       * Aggregation
  • 11. Skyhook experiments with programmable storage
       ● Real-world dataset
         ○ TPC lineitem table
         ○ 1 billion rows
         ○ 140 GB
       ● Storage in Ceph objects
         ○ Table divided into ~10,000 objects of 14 MB each
           ■ Optimize for workload (e.g., 4 MB)
         ○ Each object contains a dedicated index
           ■ Index stored in omap (RocksDB)
       ● Storage hardware (thanks CloudLab!)
         ○ Modern 20-core Intel
         ○ 128 GB DRAM, 500 GB SSD
         ○ 10 GB/s Ethernet
         ○ 1 to 16 Ceph nodes
       Diagram: the database node (CPU, RAM) sends queries (Q) over the network to programmable storage through a database-specific data interface.
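The object layout on the slide above can be checked with a little arithmetic. This is only a back-of-the-envelope sketch: the function name `layout` is hypothetical, and it assumes decimal units (1 GB = 10^9 bytes, 1 MB = 10^6 bytes) and rows spread evenly across objects, neither of which the slide states.

```python
# Layout arithmetic from the slide's figures.
# Assumptions (not from the slides): decimal units and evenly sized rows.
TOTAL_ROWS = 1_000_000_000      # 1 billion rows
TABLE_BYTES = 140 * 10**9       # 140 GB
OBJECT_BYTES = 14 * 10**6       # 14 MB per object


def layout(total_rows, table_bytes, object_bytes):
    """Return (number of objects, rows per object, average bytes per row)."""
    num_objects = table_bytes // object_bytes
    rows_per_object = total_rows // num_objects
    bytes_per_row = table_bytes / total_rows
    return num_objects, rows_per_object, bytes_per_row


if __name__ == "__main__":
    print(layout(TOTAL_ROWS, TABLE_BYTES, OBJECT_BYTES))
    # The slide's alternative object size of 4 MB yields more, smaller objects.
    print(layout(TOTAL_ROWS, TABLE_BYTES, 4 * 10**6))
```

Under these assumptions, 14 MB objects give the roughly 10,000 objects quoted on the slide, while a smaller object size trades more objects (and more per-object indexes) for finer-grained parallelism.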
  • 12. Benchmark queries evaluated
       Qa: Range query with 10% selectivity:
           SELECT * FROM lineitem WHERE extendedprice > 71000.0
       Qb: Point query (unique row) issued with and without index:
           SELECT extendedprice FROM lineitem WHERE orderkey=5 AND linenumber=3
       Qc: Regex query with 10% selectivity (CPU intensive):
           SELECT * FROM lineitem WHERE comment ILIKE '%uriously%'
  • 13. Range query performance (10% selectivity)
       Improved I/O performance
       ● Local I/O bandwidth
       ● Local CPU resources
       ● Reduced network traffic
       ● CPU parallelism
       Plot legend: client-side processing, server-side processing
       Diagram: database node (CPU, RAM) connected to database storage over the network
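The "reduced network traffic" point can be illustrated with a toy model of client-side versus server-side filtering. This is a sketch and not Skyhook code: objects are plain Python lists, "the network" is a row counter, the function names are hypothetical, and the synthetic prices and the 90,000 threshold are assumptions chosen to give roughly 10% selectivity rather than values from the TPC data.

```python
import random


def make_objects(num_objects, rows_per_object, seed=0):
    """Build synthetic 'objects', each a list of rows with a random price."""
    rng = random.Random(seed)
    return [
        [
            {"orderkey": o * rows_per_object + r,
             "extendedprice": rng.uniform(0.0, 100_000.0)}
            for r in range(rows_per_object)
        ]
        for o in range(num_objects)
    ]


def client_side(objects, predicate):
    """Ship every row to the client, then filter there."""
    rows_moved, result = 0, []
    for obj in objects:
        rows_moved += len(obj)                    # whole object crosses the network
        result.extend(row for row in obj if predicate(row))
    return result, rows_moved


def server_side(objects, predicate):
    """Filter next to the data, then ship only the matching rows."""
    rows_moved, result = 0, []
    for obj in objects:
        matches = [row for row in obj if predicate(row)]  # runs on the storage node
        rows_moved += len(matches)                # only matches cross the network
        result.extend(matches)
    return result, rows_moved


if __name__ == "__main__":
    objs = make_objects(num_objects=8, rows_per_object=1000)
    pred = lambda row: row["extendedprice"] > 90_000.0
    print("client-side rows moved:", client_side(objs, pred)[1])
    print("server-side rows moved:", server_side(objs, pred)[1])
```

Both functions return the same rows; only the amount of data that has to cross the network differs, and in the server-side case each object's filter could also run in parallel on its own storage node.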
  • 14. Point query performance (find unique row)
       ● Local I/O bandwidth
       ● Local CPU resources
       ● Reduced network traffic
       ● CPU parallelism
       ● 10,000 index lookups!
       ● 1 billion rows
       Plot legend: client-side processing, server-side processing, server-side processing with index acceleration
       Diagram: database node (CPU, RAM) connected to database storage over the network
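The contrast between scanning 1 billion rows and performing 10,000 index lookups can be sketched with one small index per object. This is an illustration only: the slides say each object's index lives in omap (RocksDB), whereas here a Python dict stands in for it, and the helper names and the synthetic `(orderkey, linenumber)` layout are assumptions.

```python
def make_objects(num_objects, rows_per_object, lines_per_order=4):
    """Synthetic lineitem-like rows, split evenly across objects."""
    objects = []
    for o in range(num_objects):
        obj = []
        for r in range(rows_per_object):
            i = o * rows_per_object + r
            obj.append({
                "orderkey": i // lines_per_order,
                "linenumber": i % lines_per_order + 1,
                "extendedprice": 100.0 + i,
            })
        objects.append(obj)
    return objects


def build_index(obj):
    """Per-object index: (orderkey, linenumber) -> row position in the object."""
    return {(row["orderkey"], row["linenumber"]): pos
            for pos, row in enumerate(obj)}


def point_query_scan(objects, key):
    """Without an index, every row in every object is examined."""
    rows_examined, hits = 0, []
    for obj in objects:
        for row in obj:
            rows_examined += 1
            if (row["orderkey"], row["linenumber"]) == key:
                hits.append(row["extendedprice"])
    return hits, rows_examined


def point_query_indexed(objects, indexes, key):
    """With per-object indexes, the cost is one lookup per object."""
    lookups, hits = 0, []
    for obj, index in zip(objects, indexes):
        lookups += 1
        pos = index.get(key)
        if pos is not None:
            hits.append(obj[pos]["extendedprice"])
    return hits, lookups


if __name__ == "__main__":
    objs = make_objects(num_objects=10, rows_per_object=100)
    idxs = [build_index(obj) for obj in objs]
    print(point_query_scan(objs, (5, 3)))
    print(point_query_indexed(objs, idxs, (5, 3)))
```

The work drops from one comparison per row to one lookup per object, which mirrors the slide's contrast between 1 billion rows and 10,000 index lookups.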
  • 15. zlog: distributed shared-log for software-defined storage
  • 16. zlog: implementation of CORFU on Ceph
       Figure: log positions 1, 2, 3, ..., 14, ...
       ● Extend the benefits of software-defined storage to log abstraction
         ○ Transparently select storage media and physical design
         ○ Take advantage of performance upgrades and new features
         ○ Offload critical components such as replication and erasure-coding
       Diagram labels: LevelDB, RocksDB, WiredTiger, librados, osd (×3)
       CORFU protocol enforced by custom, transactional storage interface.
       Balakrishnan et al., "CORFU: A Shared Log Design for Flash Clusters", NSDI '12
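A minimal in-memory sketch of the CORFU idea may help here: a sequencer hands out log positions, and each position is written at most once to a storage object. This is not zlog's implementation. The class names are hypothetical, the round-robin mapping of positions to objects is an assumption, and the write-once check in `WriteOnceObject.write` merely stands in for the custom storage interface that the slide says enforces the protocol inside Ceph.

```python
class WriteOnceObject:
    """Stand-in for a storage object that enforces write-once log entries."""

    def __init__(self):
        self.entries = {}

    def write(self, position, data):
        if position in self.entries:
            raise ValueError(f"position {position} already written")
        self.entries[position] = data

    def read(self, position):
        return self.entries[position]   # KeyError if the position is unwritten


class Sequencer:
    """Hands out the next free log position."""

    def __init__(self):
        self.tail = 0

    def next_position(self):
        position = self.tail
        self.tail += 1
        return position


class SharedLog:
    """Toy shared log striped across a fixed set of objects (assumed mapping)."""

    def __init__(self, num_objects):
        self.objects = [WriteOnceObject() for _ in range(num_objects)]
        self.sequencer = Sequencer()

    def _object_for(self, position):
        # Assumption: simple round-robin striping of positions over objects.
        return self.objects[position % len(self.objects)]

    def append(self, data):
        position = self.sequencer.next_position()
        self._object_for(position).write(position, data)
        return position

    def read(self, position):
        return self._object_for(position).read(position)


if __name__ == "__main__":
    log = SharedLog(num_objects=3)
    for payload in ("a", "b", "c", "d"):
        print(log.append(payload))
    print(log.read(3))
```

Because the write-once rule lives in the storage object rather than in the client, a client that races for a position it does not own is rejected at the storage layer, which is the property the slide attributes to the transactional storage interface.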
  • 19. The state of programmable storage is messy
       ● Large design space
         ○ High cost of searching this space
       ● Costs are difficult to predict
         ○ Simple upgrades can change the calculus!
       ● Much harder than what we have presented
         ○ > 500 tunables/settings in Ceph
           ■ Not counting dependencies
         ○ Runs on a wide variety of hardware
       ● No hope of migrating to a new system
         ○ There are no standards!
       More programmability work: "Data Center Scale Programmable Storage" with Dirk Grunwald (CU Boulder), NSF #1705021
  • 20. DeclStore: Layering is for the Faint of Heart [HotStorage, July 2017]
       Noah Watkins, Michael Sevilla, Ivo Jimenez, Kathryn Dahlgren, Peter Alvaro, Shel Finkelstein, Carlos Maltzahn
  • 21. Declarative storage
       ● Automate parts of this process
         ○ Searching the design space
         ○ Generating implementations
       Diagram labels: query optimization & plan generation; cost model
       ● Express interfaces declaratively
         ○ Eliminate need for storage system expertise
         ○ High-level abstractions across services / systems
       ● Prototyping with the language
         ○ Formal underpinning, demonstrated across domains
         ○ Can express all of the CORFU semantics
       "Declarative Programmable Storage", with Peter Alvaro, NSF #1764102 - 2018
  • 22. IRIS-HEP
       ● $25 million NSF-funded 5-year project with Princeton
       ● Institute for Research and Innovation (IRIS)
       ● High Energy Physics (HEP)
       ● Exascale storage and analysis challenges