SlideShare una empresa de Scribd logo
1 de 51
1 
Milano, 28–29 Novembre 2014 
Gluster FS 
Un File System distribuito per i Big 
Data di oggi e di domani 
Roberto Franchini 
@CELI_NLP 
CELI 2014
Company 
CELI 2014 2
WwhhooamAi(m1) I 
15 years of experience, proud to be a programmer 
Writes software for information extraction, nlp, opinion mining 
(@scale ), and a lot of other buzzwords 
Implements scalable architectures 
Plays with servers 
Member of the JUG-Torino coordination team 
CELI 2014 3
The problem 
Identify a distributed and scalable 
file system 
for today's and tomorrow's 
Big Data 
CELI 2014 4
Once upon a time 
Once upon a time 
2008: One nfs share 
1,5TB ought to be enough for anybody 
2010: Herd of shares 
(1,5TB x N) ought to be enough for anybody 
Nobody couldn’t stop the data flood 
It was the time for something new 
CELI 2014 5
Requirements 
1. Can be enlarged on demand 
2. No dedicated HW 
3. OS is preferred and trusted 
4. No specialized API 
5. No specialized Kernel 
6. POSIX compliance 
7. Zilions of big and small files 
8. No NAS or SAN (€€€€€) 
CELI 2014 6
Clustered Scale-out General Purpose Storage 
Platform 
−POSIX-y Distributed File System 
−...and so much more 
Built on commodity systems 
−x86_64 Linux ++ 
−POSIX filesystems underneath (XFS, 
EXT4) 
No central metadata Server (NO SPOF) 
Modular architecture for scale and functionality 
CELI 2014 7
Common 
use cases 
• Large Scale File Server 
• Media / Content Distribution Network 
(CDN) 
• Backup / Archive / Disaster Recovery 
(DR) 
• High Performance Computing (HPC) 
• Infrastructure as a Service (IaaS) storage 
layer 
• Database offload (blobs) 
• Unified Object Store + File Access 
CELI 2014 8
Features 
ACL and Quota support 
Fault-tolerance 
Peer to peer 
Self-healing 
Fast setup up 
Enlarge on demand 
Shrink on demand 
Snapshot 
On premise phisical or virtual 
On cloud 
Features 
CELI 2014 9
Architecture 
CELI 2014 10
Architecture 
Architecture 
Peer / Node 
−cluster servers (glusterfs server) 
−Runs the gluster daemons and participates in volumes 
Brick 
−A filesystem mountpoint on servers 
−A unit of storage used as a capacity building block 
CELI 2014 11
BBrricickks so no an n aod neode 
CELI 2014 12
Architecture 
Architecture 
Translator 
−Logic between bricks or subvolume that generate a 
subvolume with certain characteristic 
−distribute, replica, stripe are special translators to 
generate simil-RAID configuration 
−perfomance translators 
Volume 
−Bricks combined and passed through translators 
−Ultimately, what's presented to the end user 
CELI 2014 13
VVoolulumme e 
CELI 2014 14
Volume types 
CELI 2014 15
Distributed 
Distributed 
The default configuration 
Files “evenly” spread across bricks 
Similar to file-level RAID 0 
Server/Disk failure could be catastrophic 
CELI 2014 16
DDisistrtirbiubtuedted 
CELI 2014 17
Replicated 
Files written synchronously to replica peers 
Files read synchronously, 
but ultimately serviced by the first responder 
Similar to file-level RAID 1 
Replicated 
CELI 2014 18
RReepplicliacteadted 
CELI 2014 19
Distributed + replicated 
Distributed + replicated 
Distribued + replicated 
Similar to file-level RAID 10 
Most used layout 
CELI 2014 20
DDisistrtirbiubtuedte redp l+ic arteedplicated 
CELI 2014 21
Striped 
Individual files split among bricks (sparse files) 
Similar to block-level RAID 0 
Limited Use Cases 
HPC Pre/Post Processing 
File size exceeds brick size 
Striped 
CELI 2014 22
SSttrripipeded 
CELI 2014 23
Moving parts 
CELI 2014 24
Components 
Components 
glusterd 
Management daemon 
One instance on each GlusterFS server 
Interfaced through gluster CLI 
glusterfsd 
GlusterFS brick daemon 
One process for each brick on each server 
Managed by glusterd 
CELI 2014 25
Components 
Components 
glusterfs 
Volume service daemon 
One process for each volume service 
NFS server, FUSE client, Self-Heal, Quota, ... 
mount.glusterfs 
FUSE native client mount extension 
gluster 
Gluster Console Manager (CLI) 
CELI 2014 26
Clients 
CELI 2014 27
Clients: native 
Clients: native 
• FUSE kernel module allows the filesystem to be built and 
operated entirely in userspace 
• Specify mount to any GlusterFS server 
• Native Client fetches volfile from mount server, then 
communicates directly with all nodes to access data 
• Recommended for high concurrency and high write 
performance 
• Load is inherently balanced across distributed volumes 
CELI 2014 28
Clients: NFS 
Clients:NFS 
• Standard NFS v3 clients 
• Standard automounter is supported 
• Mount to any server, or use a load balancer 
• GlusterFS NFS server includes Network Lock Manager 
(NLM) to synchronize locks across clients 
• Better performance for reading many small files from a 
single client 
• Load balancing must be managed externally 
CELI 2014 29
Clients: libgfapi 
Clients: libgfapi 
• Introduced with GlusterFS 3.4 
• User-space library for accessing data in GlusterFS 
• Filesystem-like API 
• Runs in application process 
• no FUSE, no copies, no context switches 
• ...but same volfiles, translators, etc. 
CELI 2014 30
Clients: SMB/CIFS 
Clients: SMB/CIFS 
• In GlusterFS 3.4 – Samba + libgfapi 
• No need for local native client mount & re-export 
• Significant performance improvements with FUSE 
removed from the equation 
• Must be setup on each server you wish to connect to via 
CIFS 
• CTDB is required for Samba clustering 
CELI 2014 31
Clients: HDFS 
Clients: HDFS 
• Access data within and outside of Hadoop 
• No HDFS name node single point of failure / bottleneck 
• Seamless replacement for HDFS 
• Scales with the massive growth of big data 
CELI 2014 32
Scalability 
CELI 2014 33
Under the hood 
Under the hood 
• Elastic Hash Algorithm 
• No central metadata 
• No Performance Bottleneck 
• Eliminates risk scenarios 
• Location hashed intelligently on filename 
• Unique identifiers (GFID), similar to md5sum 
CELI 2014 34
Scalability 
Scalability 
3TB 
3TB 
3TB 
3TB 
3TB 
3TB 
3TB 
3TB 
3TB 
3TB 
Gluster Server 
3TB 
3TB 
3TB 
3TB 
3TB 
3TB 
3TB 
3TB 
3TB 
3TB 
Gluster Server 
3TB 
3TB 
3TB 
3TB 
3TB 
3TB 
3TB 
3TB 
3TB 
3TB 
Gluster Server 
Scale out performance and availability 
Scale out capacitry 
CELI 2014 35
Scalability 
Scalability 
Add disks to servers to increase storage size 
Add servers to increase bandwidth and storage size 
Add servers to increase availability (replica factor) 
CELI 2014 36
What we do with glusterFS 
CELI 2014 37
What we do with GFS 
What we do with GFS 
Daily production of more than 10GB of Lucene inverted 
indexes stored on glusterFS 
more than 200GB/month 
Search stored indexes to extract different sets of 
documents for every customers 
YES: we open indexes directly on storage 
(it's POSIX!!!) 
CELI 2014 38
2010: first installation 
2010: first installation 
Version 3.0.x 
8 (not dedicated) servers 
Distributed replicated 
No bound on brick size (!!!!) 
Ca 4TB avaliable 
NOTE: stuck to 3.0.x until 2012 due to problems on 3.1 and 
3.2 series, then RH acquired gluster (RH Storage) 
CELI 2014 39
2012: (little) cluster 
2012: (little) cluster 
New installation, version 3.3.2 
4TB available on 8 servers (DELL c5000) 
still not dedicated 
1 brick per server limited to 1TB 
2TB-raid 1 on each server 
Still in production 
CELI 2014 40
2012: enlarge 
2012: enlarge 
New installation, upgrade to 3.3.x 
6TB available on 12 servers (still not dedicated) 
Enlarged to 9TB on 18 servers 
Bricks size bounded AND unbounded 
CELI 2014 41
2013: fail 
2013: fail 
18 not dedicated servers: too much 
18 bricks of different sizes 
2 big down due to bricks out of space 
Didn’t restart after a move 
but… 
All data were recovered 
(files are scattered on bricks, read from them!) 
CELI 2014 42
2014: consolidate 
2014: consolidate 
2 dedicated servers 
12 x 3TB SAS raid6 
4 bricks per server 
28 TB available 
distributed replicated 
4x1Gb bonded NIC 
ca 40 clients (FUSE) (other 
servers) 
CELI 2014 43
Consolidate 
brick 1 
brick 2 
brick 3 
brick 4 
Gluster Server 1 
brick 1 
brick 2 
brick 3 
brick 4 
Gluster Server 2 
Consolidate 
CELI 2014 44
Scale up 
brick 11 
brick 12 
brick 13 
brick 31 
Gluster Server 1 
brick 21 
brick 22 
brick 32 
brick 24 
Gluster Server 2 
brick 31 
brick 32 
brick 23 
brick 14 
Gluster Server 3 
Scale up 
CELI 2014 45
Do 
Dedicated server (phisical or virtual) 
RAID 6 or RAID 10 (with small files) 
Multiple bricks of same size 
Plan to scale 
Do 
CELI 2014 46
Do not 
Multi purpose server 
Bricks of different size 
Very small files 
Write to bricks 
Do not 
CELI 2014 47
Some raw tests 
Some raw tests 
read 
Total transferred file size: 23.10G bytes 
43.46M bytes/sec 
write 
Total transferred file size: 23.10G bytes 
38.53M bytes/sec 
CELI 2014 48
NOTE: ran in production under heavy load, no clean 
test room 
Raw tests 
CELI 2014 49
Resources 
Resources 
• http://www.gluster.org/ 
• https://access.redhat.com/documentation/en- 
US/Red_Hat_Storage/ 
• https://github.com/gluster 
• http://www.redhat.com/products/storage-server/ 
• http://joejulian.name/blog/category/glusterfs/ 
• http://jread.us/2013/06/one-petabyte-red-hat-storage-and- 
glusterfs-project-overview/ 
CELI 2014 50
franchini@celi.it 
Grazie per 
www.celi.it 
l’attenzione! @robfrankie 
Torino | Milano| Trento 
CELI 2014 51

Más contenido relacionado

La actualidad más candente

Ceph Day London 2014 - The current state of CephFS development
Ceph Day London 2014 - The current state of CephFS development Ceph Day London 2014 - The current state of CephFS development
Ceph Day London 2014 - The current state of CephFS development
Ceph Community
 
NVMe over Fabrics and Composable Infrastructure - What Do They Mean for Softw...
NVMe over Fabrics and Composable Infrastructure - What Do They Mean for Softw...NVMe over Fabrics and Composable Infrastructure - What Do They Mean for Softw...
NVMe over Fabrics and Composable Infrastructure - What Do They Mean for Softw...
Ceph Community
 

La actualidad más candente (19)

2021.02 new in Ceph Pacific Dashboard
2021.02 new in Ceph Pacific Dashboard2021.02 new in Ceph Pacific Dashboard
2021.02 new in Ceph Pacific Dashboard
 
GlusterFs: a scalable file system for today's and tomorrow's big data
GlusterFs: a scalable file system for today's and tomorrow's big dataGlusterFs: a scalable file system for today's and tomorrow's big data
GlusterFs: a scalable file system for today's and tomorrow's big data
 
20160401 Gluster-roadmap
20160401 Gluster-roadmap20160401 Gluster-roadmap
20160401 Gluster-roadmap
 
Gluster technical overview
Gluster technical overviewGluster technical overview
Gluster technical overview
 
Intel QLC: Cost-effective Ceph on NVMe
Intel QLC: Cost-effective Ceph on NVMeIntel QLC: Cost-effective Ceph on NVMe
Intel QLC: Cost-effective Ceph on NVMe
 
CEPH DAY BERLIN - CEPH MANAGEMENT THE EASY AND RELIABLE WAY
CEPH DAY BERLIN - CEPH MANAGEMENT THE EASY AND RELIABLE WAYCEPH DAY BERLIN - CEPH MANAGEMENT THE EASY AND RELIABLE WAY
CEPH DAY BERLIN - CEPH MANAGEMENT THE EASY AND RELIABLE WAY
 
Iocg Whats New In V Sphere
Iocg Whats New In V SphereIocg Whats New In V Sphere
Iocg Whats New In V Sphere
 
2015 open storage workshop ceph software defined storage
2015 open storage workshop   ceph software defined storage2015 open storage workshop   ceph software defined storage
2015 open storage workshop ceph software defined storage
 
Persistent Storage with Containers with Kubernetes & OpenShift
Persistent Storage with Containers with Kubernetes & OpenShiftPersistent Storage with Containers with Kubernetes & OpenShift
Persistent Storage with Containers with Kubernetes & OpenShift
 
Red Hat Gluster Storage, Container Storage and CephFS Plans
Red Hat Gluster Storage, Container Storage and CephFS PlansRed Hat Gluster Storage, Container Storage and CephFS Plans
Red Hat Gluster Storage, Container Storage and CephFS Plans
 
The Future of GlusterFS and Gluster.org
The Future of GlusterFS and Gluster.orgThe Future of GlusterFS and Gluster.org
The Future of GlusterFS and Gluster.org
 
Red Hat Ceph Storage Roadmap: January 2016
Red Hat Ceph Storage Roadmap: January 2016Red Hat Ceph Storage Roadmap: January 2016
Red Hat Ceph Storage Roadmap: January 2016
 
Rethinking the OS
Rethinking the OSRethinking the OS
Rethinking the OS
 
Developing apps and_integrating_with_gluster_fs_-_libgfapi
Developing apps and_integrating_with_gluster_fs_-_libgfapiDeveloping apps and_integrating_with_gluster_fs_-_libgfapi
Developing apps and_integrating_with_gluster_fs_-_libgfapi
 
Ceph Day London 2014 - The current state of CephFS development
Ceph Day London 2014 - The current state of CephFS development Ceph Day London 2014 - The current state of CephFS development
Ceph Day London 2014 - The current state of CephFS development
 
Accessing gluster ufo_-_eco_willson
Accessing gluster ufo_-_eco_willsonAccessing gluster ufo_-_eco_willson
Accessing gluster ufo_-_eco_willson
 
Gluster intro-tdose
Gluster intro-tdoseGluster intro-tdose
Gluster intro-tdose
 
NVMe over Fabrics and Composable Infrastructure - What Do They Mean for Softw...
NVMe over Fabrics and Composable Infrastructure - What Do They Mean for Softw...NVMe over Fabrics and Composable Infrastructure - What Do They Mean for Softw...
NVMe over Fabrics and Composable Infrastructure - What Do They Mean for Softw...
 
Integrating gluster fs,_qemu_and_ovirt-vijay_bellur-linuxcon_eu_2013
Integrating gluster fs,_qemu_and_ovirt-vijay_bellur-linuxcon_eu_2013Integrating gluster fs,_qemu_and_ovirt-vijay_bellur-linuxcon_eu_2013
Integrating gluster fs,_qemu_and_ovirt-vijay_bellur-linuxcon_eu_2013
 

Destacado

Storm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computationStorm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computation
nathanmarz
 
Realtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and HadoopRealtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and Hadoop
DataWorks Summit
 

Destacado (15)

Real-time discovery e sentiment analysis su Twitter: Blogmeter Now
Real-time discovery e sentiment analysis su Twitter: Blogmeter NowReal-time discovery e sentiment analysis su Twitter: Blogmeter Now
Real-time discovery e sentiment analysis su Twitter: Blogmeter Now
 
PHP Backends for Real-Time User Interaction using Apache Storm.
PHP Backends for Real-Time User Interaction using Apache Storm.PHP Backends for Real-Time User Interaction using Apache Storm.
PHP Backends for Real-Time User Interaction using Apache Storm.
 
More Than Websites: PHP And The Firehose @DataSift (2013)
More Than Websites: PHP And The Firehose @DataSift (2013)More Than Websites: PHP And The Firehose @DataSift (2013)
More Than Websites: PHP And The Firehose @DataSift (2013)
 
Nginx for Fun & Performance - Philipp Krenn - Codemotion Rome 2015
Nginx for Fun & Performance - Philipp Krenn - Codemotion Rome 2015Nginx for Fun & Performance - Philipp Krenn - Codemotion Rome 2015
Nginx for Fun & Performance - Philipp Krenn - Codemotion Rome 2015
 
Lambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big dataLambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big data
 
Apache HBase 1.0 Release
Apache HBase 1.0 ReleaseApache HBase 1.0 Release
Apache HBase 1.0 Release
 
Resource Aware Scheduling in Apache Storm
Resource Aware Scheduling in Apache StormResource Aware Scheduling in Apache Storm
Resource Aware Scheduling in Apache Storm
 
Scaling Apache Storm - Strata + Hadoop World 2014
Scaling Apache Storm - Strata + Hadoop World 2014Scaling Apache Storm - Strata + Hadoop World 2014
Scaling Apache Storm - Strata + Hadoop World 2014
 
Storm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computationStorm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computation
 
Realtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and HadoopRealtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and Hadoop
 
Yahoo compares Storm and Spark
Yahoo compares Storm and SparkYahoo compares Storm and Spark
Yahoo compares Storm and Spark
 
Apache Storm 0.9 basic training - Verisign
Apache Storm 0.9 basic training - VerisignApache Storm 0.9 basic training - Verisign
Apache Storm 0.9 basic training - Verisign
 
Hadoop Summit Europe 2014: Apache Storm Architecture
Hadoop Summit Europe 2014: Apache Storm ArchitectureHadoop Summit Europe 2014: Apache Storm Architecture
Hadoop Summit Europe 2014: Apache Storm Architecture
 
Icons and Stencils for Hadoop
Icons and Stencils for HadoopIcons and Stencils for Hadoop
Icons and Stencils for Hadoop
 
Design in Tech Report 2017
Design in Tech Report 2017Design in Tech Report 2017
Design in Tech Report 2017
 

Similar a Celi @Codemotion 2014 - Roberto Franchini GlusterFS

GlusterFS Architecture - June 30, 2011 Meetup
GlusterFS Architecture - June 30, 2011 MeetupGlusterFS Architecture - June 30, 2011 Meetup
GlusterFS Architecture - June 30, 2011 Meetup
GlusterFS
 

Similar a Celi @Codemotion 2014 - Roberto Franchini GlusterFS (20)

Gluster FS a filesistem for Big Data | Roberto Franchini - Codemotion Rome 2015
Gluster FS  a filesistem for Big Data | Roberto Franchini - Codemotion Rome 2015Gluster FS  a filesistem for Big Data | Roberto Franchini - Codemotion Rome 2015
Gluster FS a filesistem for Big Data | Roberto Franchini - Codemotion Rome 2015
 
GlusterFS : un file system open source per i big data di oggi e domani - Robe...
GlusterFS : un file system open source per i big data di oggi e domani - Robe...GlusterFS : un file system open source per i big data di oggi e domani - Robe...
GlusterFS : un file system open source per i big data di oggi e domani - Robe...
 
Gluster intro-tdose
Gluster intro-tdoseGluster intro-tdose
Gluster intro-tdose
 
20160401 guster-roadmap
20160401 guster-roadmap20160401 guster-roadmap
20160401 guster-roadmap
 
20160401 guster-roadmap
20160401 guster-roadmap20160401 guster-roadmap
20160401 guster-roadmap
 
DevOps Fest 2020. Даніель Яворович. Data pipelines: building an efficient ins...
DevOps Fest 2020. Даніель Яворович. Data pipelines: building an efficient ins...DevOps Fest 2020. Даніель Яворович. Data pipelines: building an efficient ins...
DevOps Fest 2020. Даніель Яворович. Data pipelines: building an efficient ins...
 
GlusterFs Architecture & Roadmap - LinuxCon EU 2013
GlusterFs Architecture & Roadmap - LinuxCon EU 2013GlusterFs Architecture & Roadmap - LinuxCon EU 2013
GlusterFs Architecture & Roadmap - LinuxCon EU 2013
 
Gluster fs architecture_&_roadmap-vijay_bellur-linuxcon_eu_2013
Gluster fs architecture_&_roadmap-vijay_bellur-linuxcon_eu_2013Gluster fs architecture_&_roadmap-vijay_bellur-linuxcon_eu_2013
Gluster fs architecture_&_roadmap-vijay_bellur-linuxcon_eu_2013
 
GlusterFS And Big Data
GlusterFS And Big DataGlusterFS And Big Data
GlusterFS And Big Data
 
Make room for more virtual desktops with fast storage
Make room for more virtual desktops with fast storageMake room for more virtual desktops with fast storage
Make room for more virtual desktops with fast storage
 
GlusterFS Architecture - June 30, 2011 Meetup
GlusterFS Architecture - June 30, 2011 MeetupGlusterFS Architecture - June 30, 2011 Meetup
GlusterFS Architecture - June 30, 2011 Meetup
 
Lisa 2015-gluster fs-introduction
Lisa 2015-gluster fs-introductionLisa 2015-gluster fs-introduction
Lisa 2015-gluster fs-introduction
 
VMworld 2015: The Future of Software- Defined Storage- What Does it Look Like...
VMworld 2015: The Future of Software- Defined Storage- What Does it Look Like...VMworld 2015: The Future of Software- Defined Storage- What Does it Look Like...
VMworld 2015: The Future of Software- Defined Storage- What Does it Look Like...
 
GlusterFS Presentation FOSSCOMM2013 HUA, Athens, GR
GlusterFS Presentation FOSSCOMM2013 HUA, Athens, GRGlusterFS Presentation FOSSCOMM2013 HUA, Athens, GR
GlusterFS Presentation FOSSCOMM2013 HUA, Athens, GR
 
Gluster fs hadoop_fifth-elephant
Gluster fs hadoop_fifth-elephantGluster fs hadoop_fifth-elephant
Gluster fs hadoop_fifth-elephant
 
Architecture of a Next-Generation Parallel File System
Architecture of a Next-Generation Parallel File System	Architecture of a Next-Generation Parallel File System
Architecture of a Next-Generation Parallel File System
 
[KubeCon NA 2020] containerd: Rootless Containers 2020
[KubeCon NA 2020] containerd: Rootless Containers 2020[KubeCon NA 2020] containerd: Rootless Containers 2020
[KubeCon NA 2020] containerd: Rootless Containers 2020
 
HPC DAY 2017 | HPE Storage and Data Management for Big Data
HPC DAY 2017 | HPE Storage and Data Management for Big DataHPC DAY 2017 | HPE Storage and Data Management for Big Data
HPC DAY 2017 | HPE Storage and Data Management for Big Data
 
Performance characterization in large distributed file system with gluster fs
Performance characterization in large distributed file system with gluster fsPerformance characterization in large distributed file system with gluster fs
Performance characterization in large distributed file system with gluster fs
 
Spectrum Scale Unified File and Object with WAN Caching
Spectrum Scale Unified File and Object with WAN CachingSpectrum Scale Unified File and Object with WAN Caching
Spectrum Scale Unified File and Object with WAN Caching
 

Más de CELI

Matteo Casu Exploring The Betrothed Lovers Hamburg
Matteo Casu Exploring The Betrothed Lovers HamburgMatteo Casu Exploring The Betrothed Lovers Hamburg
Matteo Casu Exploring The Betrothed Lovers Hamburg
CELI
 
Celi_Di Tomaso presentazione futurodigitale_csipiemonte
Celi_Di Tomaso presentazione futurodigitale_csipiemonteCeli_Di Tomaso presentazione futurodigitale_csipiemonte
Celi_Di Tomaso presentazione futurodigitale_csipiemonte
CELI
 
Forum Tal 2014: Celi company presentation
Forum Tal 2014: Celi company presentationForum Tal 2014: Celi company presentation
Forum Tal 2014: Celi company presentation
CELI
 

Más de CELI (13)

Celi 2017 presentazione_breve
Celi 2017 presentazione_breveCeli 2017 presentazione_breve
Celi 2017 presentazione_breve
 
Libri, biblioteche e scuole digitali. Seminario di Digital Humanities 2015
Libri, biblioteche e scuole digitali. Seminario di Digital Humanities 2015Libri, biblioteche e scuole digitali. Seminario di Digital Humanities 2015
Libri, biblioteche e scuole digitali. Seminario di Digital Humanities 2015
 
Ricerca semantica: annotazioni manuali e automatiche per l'Archivio storico...
Ricerca semantica:  annotazioni manuali e automatiche  per l'Archivio storico...Ricerca semantica:  annotazioni manuali e automatiche  per l'Archivio storico...
Ricerca semantica: annotazioni manuali e automatiche per l'Archivio storico...
 
Celi @Clic2014: Geometric and Statistical Analysis of Topic and Emotions in C...
Celi @Clic2014: Geometric and Statistical Analysis of Topic and Emotions in C...Celi @Clic2014: Geometric and Statistical Analysis of Topic and Emotions in C...
Celi @Clic2014: Geometric and Statistical Analysis of Topic and Emotions in C...
 
Celi @Clic2014: OCR Errors & Named Entity Recognition in La Stampa Historical...
Celi @Clic2014: OCR Errors & Named Entity Recognition in La Stampa Historical...Celi @Clic2014: OCR Errors & Named Entity Recognition in La Stampa Historical...
Celi @Clic2014: OCR Errors & Named Entity Recognition in La Stampa Historical...
 
Celi presentazione @clic2014
Celi presentazione @clic2014 Celi presentazione @clic2014
Celi presentazione @clic2014
 
Cross library @Internet Festival 2014
Cross library @Internet Festival 2014Cross library @Internet Festival 2014
Cross library @Internet Festival 2014
 
Celi @TOSM Pitch Day Smart Enterprise
Celi @TOSM Pitch Day Smart EnterpriseCeli @TOSM Pitch Day Smart Enterprise
Celi @TOSM Pitch Day Smart Enterprise
 
Matteo Casu Exploring The Betrothed Lovers Hamburg
Matteo Casu Exploring The Betrothed Lovers HamburgMatteo Casu Exploring The Betrothed Lovers Hamburg
Matteo Casu Exploring The Betrothed Lovers Hamburg
 
Linked Open Data di Vittorio Di Tomaso
Linked Open Data di Vittorio Di TomasoLinked Open Data di Vittorio Di Tomaso
Linked Open Data di Vittorio Di Tomaso
 
Celi_Di Tomaso presentazione futurodigitale_csipiemonte
Celi_Di Tomaso presentazione futurodigitale_csipiemonteCeli_Di Tomaso presentazione futurodigitale_csipiemonte
Celi_Di Tomaso presentazione futurodigitale_csipiemonte
 
Forum Tal 2014: Celi company presentation
Forum Tal 2014: Celi company presentationForum Tal 2014: Celi company presentation
Forum Tal 2014: Celi company presentation
 
Exploring the "Betrothed Lovers" and other literary works
Exploring the "Betrothed Lovers" and other literary works Exploring the "Betrothed Lovers" and other literary works
Exploring the "Betrothed Lovers" and other literary works
 

Último

Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
AroojKhan71
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
amitlee9823
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
Lars Albertsson
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
shivangimorya083
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
amitlee9823
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
amitlee9823
 

Último (20)

Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptx
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 

Celi @Codemotion 2014 - Roberto Franchini GlusterFS

  • 1. 1 Milano, 28–29 Novembre 2014 Gluster FS Un File System distribuito per i Big Data di oggi e di domani Roberto Franchini @CELI_NLP CELI 2014
  • 3. WwhhooamAi(m1) I 15 years of experience, proud to be a programmer Writes software for information extraction, nlp, opinion mining (@scale ), and a lot of other buzzwords Implements scalable architectures Plays with servers Member of the JUG-Torino coordination team CELI 2014 3
  • 4. The problem Identify a distributed and scalable file system for today's and tomorrow's Big Data CELI 2014 4
  • 5. Once upon a time Once upon a time 2008: One nfs share 1,5TB ought to be enough for anybody 2010: Herd of shares (1,5TB x N) ought to be enough for anybody Nobody couldn’t stop the data flood It was the time for something new CELI 2014 5
  • 6. Requirements 1. Can be enlarged on demand 2. No dedicated HW 3. OS is preferred and trusted 4. No specialized API 5. No specialized Kernel 6. POSIX compliance 7. Zilions of big and small files 8. No NAS or SAN (€€€€€) CELI 2014 6
  • 7. Clustered Scale-out General Purpose Storage Platform −POSIX-y Distributed File System −...and so much more Built on commodity systems −x86_64 Linux ++ −POSIX filesystems underneath (XFS, EXT4) No central metadata Server (NO SPOF) Modular architecture for scale and functionality CELI 2014 7
  • 8. Common use cases • Large Scale File Server • Media / Content Distribution Network (CDN) • Backup / Archive / Disaster Recovery (DR) • High Performance Computing (HPC) • Infrastructure as a Service (IaaS) storage layer • Database offload (blobs) • Unified Object Store + File Access CELI 2014 8
  • 9. Features ACL and Quota support Fault-tolerance Peer to peer Self-healing Fast setup up Enlarge on demand Shrink on demand Snapshot On premise phisical or virtual On cloud Features CELI 2014 9
  • 11. Architecture Architecture Peer / Node −cluster servers (glusterfs server) −Runs the gluster daemons and participates in volumes Brick −A filesystem mountpoint on servers −A unit of storage used as a capacity building block CELI 2014 11
  • 12. BBrricickks so no an n aod neode CELI 2014 12
  • 13. Architecture Architecture Translator −Logic between bricks or subvolume that generate a subvolume with certain characteristic −distribute, replica, stripe are special translators to generate simil-RAID configuration −perfomance translators Volume −Bricks combined and passed through translators −Ultimately, what's presented to the end user CELI 2014 13
  • 15. Volume types CELI 2014 15
  • 16. Distributed Distributed The default configuration Files “evenly” spread across bricks Similar to file-level RAID 0 Server/Disk failure could be catastrophic CELI 2014 16
  • 18. Replicated Files written synchronously to replica peers Files read synchronously, but ultimately serviced by the first responder Similar to file-level RAID 1 Replicated CELI 2014 18
  • 20. Distributed + replicated Distributed + replicated Distribued + replicated Similar to file-level RAID 10 Most used layout CELI 2014 20
  • 21. DDisistrtirbiubtuedte redp l+ic arteedplicated CELI 2014 21
  • 22. Striped Individual files split among bricks (sparse files) Similar to block-level RAID 0 Limited Use Cases HPC Pre/Post Processing File size exceeds brick size Striped CELI 2014 22
  • 24. Moving parts CELI 2014 24
  • 25. Components Components glusterd Management daemon One instance on each GlusterFS server Interfaced through gluster CLI glusterfsd GlusterFS brick daemon One process for each brick on each server Managed by glusterd CELI 2014 25
  • 26. Components Components glusterfs Volume service daemon One process for each volume service NFS server, FUSE client, Self-Heal, Quota, ... mount.glusterfs FUSE native client mount extension gluster Gluster Console Manager (CLI) CELI 2014 26
  • 28. Clients: native Clients: native • FUSE kernel module allows the filesystem to be built and operated entirely in userspace • Specify mount to any GlusterFS server • Native Client fetches volfile from mount server, then communicates directly with all nodes to access data • Recommended for high concurrency and high write performance • Load is inherently balanced across distributed volumes CELI 2014 28
  • 29. Clients: NFS Clients:NFS • Standard NFS v3 clients • Standard automounter is supported • Mount to any server, or use a load balancer • GlusterFS NFS server includes Network Lock Manager (NLM) to synchronize locks across clients • Better performance for reading many small files from a single client • Load balancing must be managed externally CELI 2014 29
  • 30. Clients: libgfapi Clients: libgfapi • Introduced with GlusterFS 3.4 • User-space library for accessing data in GlusterFS • Filesystem-like API • Runs in application process • no FUSE, no copies, no context switches • ...but same volfiles, translators, etc. CELI 2014 30
  • 31. Clients: SMB/CIFS Clients: SMB/CIFS • In GlusterFS 3.4 – Samba + libgfapi • No need for local native client mount & re-export • Significant performance improvements with FUSE removed from the equation • Must be setup on each server you wish to connect to via CIFS • CTDB is required for Samba clustering CELI 2014 31
  • 32. Clients: HDFS Clients: HDFS • Access data within and outside of Hadoop • No HDFS name node single point of failure / bottleneck • Seamless replacement for HDFS • Scales with the massive growth of big data CELI 2014 32
  • 34. Under the hood Under the hood • Elastic Hash Algorithm • No central metadata • No Performance Bottleneck • Eliminates risk scenarios • Location hashed intelligently on filename • Unique identifiers (GFID), similar to md5sum CELI 2014 34
  • 35. Scalability Scalability 3TB 3TB 3TB 3TB 3TB 3TB 3TB 3TB 3TB 3TB Gluster Server 3TB 3TB 3TB 3TB 3TB 3TB 3TB 3TB 3TB 3TB Gluster Server 3TB 3TB 3TB 3TB 3TB 3TB 3TB 3TB 3TB 3TB Gluster Server Scale out performance and availability Scale out capacitry CELI 2014 35
  • 36. Scalability Scalability Add disks to servers to increase storage size Add servers to increase bandwidth and storage size Add servers to increase availability (replica factor) CELI 2014 36
  • 37. What we do with glusterFS CELI 2014 37
  • 38. What we do with GFS What we do with GFS Daily production of more than 10GB of Lucene inverted indexes stored on glusterFS more than 200GB/month Search stored indexes to extract different sets of documents for every customers YES: we open indexes directly on storage (it's POSIX!!!) CELI 2014 38
  • 39. 2010: first installation 2010: first installation Version 3.0.x 8 (not dedicated) servers Distributed replicated No bound on brick size (!!!!) Ca 4TB avaliable NOTE: stuck to 3.0.x until 2012 due to problems on 3.1 and 3.2 series, then RH acquired gluster (RH Storage) CELI 2014 39
  • 40. 2012: (little) cluster 2012: (little) cluster New installation, version 3.3.2 4TB available on 8 servers (DELL c5000) still not dedicated 1 brick per server limited to 1TB 2TB-raid 1 on each server Still in production CELI 2014 40
  • 41. 2012: enlarge 2012: enlarge New installation, upgrade to 3.3.x 6TB available on 12 servers (still not dedicated) Enlarged to 9TB on 18 servers Bricks size bounded AND unbounded CELI 2014 41
  • 42. 2013: fail 2013: fail 18 not dedicated servers: too much 18 bricks of different sizes 2 big down due to bricks out of space Didn’t restart after a move but… All data were recovered (files are scattered on bricks, read from them!) CELI 2014 42
  • 43. 2014: consolidate 2014: consolidate 2 dedicated servers 12 x 3TB SAS raid6 4 bricks per server 28 TB available distributed replicated 4x1Gb bonded NIC ca 40 clients (FUSE) (other servers) CELI 2014 43
  • 44. Consolidate brick 1 brick 2 brick 3 brick 4 Gluster Server 1 brick 1 brick 2 brick 3 brick 4 Gluster Server 2 Consolidate CELI 2014 44
  • 45. Scale up brick 11 brick 12 brick 13 brick 31 Gluster Server 1 brick 21 brick 22 brick 32 brick 24 Gluster Server 2 brick 31 brick 32 brick 23 brick 14 Gluster Server 3 Scale up CELI 2014 45
  • 46. Do Dedicated server (phisical or virtual) RAID 6 or RAID 10 (with small files) Multiple bricks of same size Plan to scale Do CELI 2014 46
  • 47. Do not Multi purpose server Bricks of different size Very small files Write to bricks Do not CELI 2014 47
  • 48. Some raw tests Some raw tests read Total transferred file size: 23.10G bytes 43.46M bytes/sec write Total transferred file size: 23.10G bytes 38.53M bytes/sec CELI 2014 48
  • 49. NOTE: ran in production under heavy load, no clean test room Raw tests CELI 2014 49
  • 50. Resources Resources • http://www.gluster.org/ • https://access.redhat.com/documentation/en- US/Red_Hat_Storage/ • https://github.com/gluster • http://www.redhat.com/products/storage-server/ • http://joejulian.name/blog/category/glusterfs/ • http://jread.us/2013/06/one-petabyte-red-hat-storage-and- glusterfs-project-overview/ CELI 2014 50
  • 51. franchini@celi.it Grazie per www.celi.it l’attenzione! @robfrankie Torino | Milano| Trento CELI 2014 51