SlideShare una empresa de Scribd logo
1 de 17
Descargar para leer sin conexión
Long-­‐Term	
  Storage	
  
Panel	
  Session	
  
Erik	
  Riedel,	
  EMC	
  
Library	
  of	
  Congress	
  Workshop	
  
September	
  2012	
  
top	
  picture	
  “Once	
  Blue”	
  by	
  Jesse	
  Wagstaff	
  via	
  flickr/cc	
  	
  
right	
  picture	
  by	
  AusNn	
  Marshall	
  via	
  flickr/cc	
  
revision	
  3	
  
Parameters	
  
•  Non-­‐compressible	
  data	
  
•  Long-­‐term	
  storage	
  
•  Very	
  high	
  reliability	
  
•  Request	
  rate	
  of	
  10%	
  per	
  year	
  
•  5,	
  20,	
  50	
  PB	
  in	
  2012,	
  2015,	
  2018	
  
Density	
  
2012	
   Disks	
  (raw)	
  @	
  3TB	
   Disks	
  (protected)	
   Racks	
  @	
  480	
  disks	
  
5	
  PB	
   1,700	
  disks	
   2,700	
  disks	
   6	
  racks	
  
20	
  PB	
   6,700	
  disks	
   11,000	
  disks	
   23	
  racks	
  
50	
  PB	
   17,000	
  disks	
   27,000	
  disks	
   56	
  racks	
  
Density	
  
2012	
   Disks	
  (raw)	
  @	
  3TB	
   Disks	
  (protected)	
   Racks	
  @	
  480	
  disks	
  
5	
  PB	
   1,700	
  disks	
   2,700	
  disks	
   6	
  racks	
  
20	
  PB	
   6,700	
  disks	
   11,000	
  disks	
   23	
  racks	
  
50	
  PB	
   17,000	
  disks	
   27,000	
  disks	
   56	
  racks	
  
2015	
   Disks	
  (raw)	
  @	
  6TB	
   Disks	
  (protected)	
   Racks	
  @	
  600	
  disks	
  
5	
  PB	
   830	
  disks	
   1,300	
  disks	
   3	
  racks	
  
20	
  PB	
   3,300	
  disks	
   5,300	
  disks	
   9	
  racks	
  
50	
  PB	
   8,300	
  disks	
   13,000	
  disks	
   23	
  racks	
  
Density	
  
2012	
   Disks	
  (raw)	
  @	
  3TB	
   Disks	
  (protected)	
   Racks	
  @	
  480	
  disks	
  
5	
  PB	
   1,700	
  disks	
   2,700	
  disks	
   6	
  racks	
  
20	
  PB	
   6,700	
  disks	
   11,000	
  disks	
   23	
  racks	
  
50	
  PB	
   17,000	
  disks	
   27,000	
  disks	
   56	
  racks	
  
2015	
   Disks	
  (raw)	
  @	
  6TB	
   Disks	
  (protected)	
   Racks	
  @	
  600	
  disks	
  
5	
  PB	
   830	
  disks	
   1,300	
  disks	
   3	
  racks	
  
20	
  PB	
   3,300	
  disks	
   5,300	
  disks	
   9	
  racks	
  
50	
  PB	
   8,300	
  disks	
   13,000	
  disks	
   23	
  racks	
  
2018	
   Disks	
  (raw)	
  @	
  10TB	
   Disks	
  (protected)	
   Racks	
  @	
  600	
  disks	
  
5	
  PB	
   500	
  disks	
   800	
  disks	
   2	
  racks	
  
20	
  PB	
   2,000	
  disks	
   3,200	
  disks	
   6	
  racks	
  
50	
  PB	
   5,000	
  disks	
   8,000	
  disks	
   14	
  racks	
  
Performance	
  
2012	
   10%/yr	
   Disks	
   Disk	
  BW	
   Racks	
   Bandwidth	
   Actual	
  BW	
   Days-­‐to-­‐fill	
  
5	
  PB	
   16	
  MB/s	
   2,700	
   200	
  GB/s	
   6	
  	
   30	
  GB/s	
   3	
  GB/s	
   19	
  
20	
  PB	
   63	
  MB/s	
   11,000	
   1.1	
  TB/s	
   23	
   115	
  GB/s	
   11	
  GB/s	
   20	
  
50	
  PB	
   159	
  MB/s	
   27,000	
   2.7	
  TB/s	
   56	
   280	
  GB/s	
   28	
  GB/s	
   21	
  
Performance	
  
2012	
   10%/yr	
   Disks	
   Disk	
  BW	
   Racks	
   Bandwidth	
   Actual	
  BW	
   Days-­‐to-­‐fill	
  
5	
  PB	
   16	
  MB/s	
   2,700	
   200	
  GB/s	
   6	
  	
   30	
  GB/s	
   3	
  GB/s	
   19	
  
20	
  PB	
   63	
  MB/s	
   11,000	
   1.1	
  TB/s	
   23	
   115	
  GB/s	
   11	
  GB/s	
   20	
  
50	
  PB	
   159	
  MB/s	
   27,000	
   2.7	
  TB/s	
   56	
   280	
  GB/s	
   28	
  GB/s	
   21	
  
2012	
   10%/2day	
   Disks	
   Disk	
  BW	
   Racks	
   Bandwidth	
   Actual	
  BW	
   Days-­‐to-­‐fill	
  
5	
  PB	
   2.9	
  GB/s	
   2,700	
   200	
  GB/s	
   6	
  	
   30	
  GB/s	
   3	
  GB/s	
   19	
  
20	
  PB	
   11	
  GB/s	
   11,000	
   1.1	
  TB/s	
   23	
   115	
  GB/s	
   11	
  GB/s	
   20	
  
50	
  PB	
   29	
  GB/s	
   27,000	
   2.7	
  TB/s	
   56	
   280	
  GB/s	
   28	
  GB/s	
   21	
  
Performance	
  
2012	
   10%/yr	
   Disks	
   Disk	
  BW	
   Racks	
   Bandwidth	
   Actual	
  BW	
   Days-­‐to-­‐fill	
  
5	
  PB	
   16	
  MB/s	
   2,700	
   200	
  GB/s	
   6	
  	
   30	
  GB/s	
   3	
  GB/s	
   19	
  
20	
  PB	
   63	
  MB/s	
   11,000	
   1.1	
  TB/s	
   23	
   115	
  GB/s	
   11	
  GB/s	
   20	
  
50	
  PB	
   159	
  MB/s	
   27,000	
   2.7	
  TB/s	
   56	
   280	
  GB/s	
   28	
  GB/s	
   21	
  
2018	
   10%/2day	
   Disks	
   Disk	
  BW	
   Racks	
   Bandwidth	
   Actual	
  BW	
   Days-­‐to-­‐fill	
  
5	
  PB	
   2.9	
  GB/s	
   800	
   80	
  GB/s	
   2	
  	
   10	
  GB/s	
   3.3	
  GB/s	
   17	
  
20	
  PB	
   11	
  GB/s	
   3,200	
   320	
  GB/s	
   6	
   30	
  GB/s	
   10	
  GB/s	
   23	
  
50	
  PB	
   29	
  GB/s	
   8,000	
   800	
  GB/s	
   14	
   70	
  GB/s	
   23	
  GB/s	
   25	
  
Cost	
  
2012	
   10%yr	
   Disks	
   Disk	
  BW	
   Racks	
   Bandwidth	
   Actual	
   Days-­‐to-­‐fill	
  
5	
  PB	
   16	
  MB/s	
   2,700	
   200	
  GB/s	
   6	
  	
   30	
  GB/s	
   3	
  GB/s	
   19	
  
20	
  PB	
   63	
  MB/s	
   11,000	
   1.1	
  TB/s	
   23	
   115	
  GB/s	
   11	
  GB/s	
   20	
  
50	
  PB	
   159	
  MB/s	
   27,000	
   2.7	
  TB/s	
   56	
   280	
  GB/s	
   28	
  GB/s	
   21	
  
Cost	
  
2012	
   10%yr	
   Disks	
   Disk	
  BW	
   Racks	
   Bandwidth	
   Actual	
   Days-­‐to-­‐fill	
  
5	
  PB	
   16	
  MB/s	
   2,700	
   200	
  GB/s	
   6	
  	
   30	
  GB/s	
   3	
  GB/s	
   19	
  
20	
  PB	
   63	
  MB/s	
   11,000	
   1.1	
  TB/s	
   23	
   115	
  GB/s	
   11	
  GB/s	
   20	
  
50	
  PB	
   159	
  MB/s	
   27,000	
   2.7	
  TB/s	
   56	
   280	
  GB/s	
   28	
  GB/s	
   21	
  
2012	
   $/month	
  @	
  $0.01/GB	
  
5	
  PB	
   $50,000/month	
  
20	
  PB	
   $200,000/month	
  
50	
  PB	
   $500,000/month	
  
Cost	
  if	
  using	
  e.g.	
  “cold”	
  public	
  cloud	
  storage	
  
Cost	
  
2012	
   10%yr	
   Disks	
   Disk	
  BW	
   Racks	
   Bandwidth	
   Actual	
   Days-­‐to-­‐fill	
  
5	
  PB	
   16	
  MB/s	
   2,700	
   200	
  GB/s	
   6	
  	
   30	
  GB/s	
   3	
  GB/s	
   19	
  
20	
  PB	
   63	
  MB/s	
   11,000	
   1.1	
  TB/s	
   23	
   115	
  GB/s	
   11	
  GB/s	
   20	
  
50	
  PB	
   159	
  MB/s	
   27,000	
   2.7	
  TB/s	
   56	
   280	
  GB/s	
   28	
  GB/s	
   21	
  
2012	
   sqN/person	
   $/sqN	
   $/month	
  
20	
  employees	
   90	
   $48	
  	
   $86,000/month	
   Washington,	
  DC	
  
80	
  employees	
   75	
   $48	
   $288,000/month	
   Washington,	
  DC	
  
200	
  employees	
   75	
   $24	
   $360,000/month	
   Minneapolis,	
  MN	
  
2012	
   $/month	
  @	
  $0.01/GB	
  
5	
  PB	
   $50,000/month	
  
20	
  PB	
   $200,000/month	
  
50	
  PB	
   $500,000/month	
  
Cost	
  if	
  using	
  e.g.	
  “cold”	
  public	
  cloud	
  storage	
  
For	
  comparison,	
  the	
  cost	
  to	
  “store”	
  
20	
  librarians	
  or	
  data	
  scienNsts	
  
AssumpNons	
  
•  Data	
  protecNon	
  in	
  a	
  single	
  data	
  center,	
  using	
  an	
  erasure-­‐coding	
  
scheme	
  at	
  1.6x	
  overhead	
  
•  480	
  drive	
  racks	
  in	
  2012	
  (40U)	
  
•  600	
  drive	
  racks	
  in	
  2015	
  and	
  2018	
  (50+U)	
  
•  10%/year	
  access	
  assumes	
  10%	
  of	
  total	
  data	
  is	
  accessed	
  in	
  even	
  
distribuNon	
  over	
  365	
  days/year,	
  24	
  hours/day	
  –	
  opNmisNc	
  
•  10%/2day	
  access	
  assumes	
  10%	
  of	
  data	
  is	
  accessed	
  on	
  only	
  2	
  days	
  
per	
  year	
  (say	
  Thanksgiving	
  and	
  Xmas)	
  –	
  very	
  bursty	
  
•  Bandwidth	
  is	
  theoreNcal	
  bandwidth	
  at	
  40	
  Gb/s	
  per	
  rack	
  (4x	
  10	
  GbE)	
  
•  Actual	
  bandwidth	
  is	
  1/10	
  of	
  theoreNcal	
  maximum	
  for	
  2012	
  and	
  
2015;	
  up	
  to	
  1/3	
  theoreNcal	
  max	
  for	
  2018	
  (sohware	
  improvements)	
  
•  sqh	
  per	
  person	
  and	
  $/sqh	
  references	
  
hip://www.inc.com/news/arNcles/2010/10/washington-­‐dc-­‐rents-­‐top-­‐those-­‐in-­‐nyc.html	
  
hip://newsfeed.Nme.com/2011/02/08/youre-­‐not-­‐imagining-­‐it-­‐your-­‐cubicle-­‐is-­‐gekng-­‐smaller/	
  
References	
  
•  Why	
  access	
  to	
  data	
  maiers,	
  not	
  just	
  “dark	
  storage”,	
  
but	
  wide	
  access	
  to	
  electronic	
  data:	
  
–  The	
  Internet	
  Archive	
  
–  hip://archive.org/about/	
  
–  History	
  of	
  the	
  Internet,	
  sNll	
  online	
  aher	
  20	
  years	
  
–  hip://www.cs.cmu.edu/~riedel/library/birthday.html	
  
	
  (from	
  April	
  2003,	
  LoC	
  workshop	
  on	
  Digital	
  PreservaNon)	
  
•  What	
  about	
  Flash?	
  
–  Death	
  of	
  Disks	
  (has	
  been	
  widely	
  exaggerated)	
  
–  hip://www.cs.cmu.edu/~riedel/#HECFSIO2011	
  
–  How	
  to	
  Build	
  Big	
  Storage	
  as	
  a	
  Cloud	
  
–  hip://storageconference.org/2012/PresentaNons/R00.Keynote.pdf	
  
Backup	
  
What	
  About	
  Tape?	
  
pictures	
  by	
  Gill	
  Wildman	
  via	
  flickr/cc	
  
What	
  About	
  Tape?	
  
•  Tapes	
  are	
  not	
  a	
  commodity	
  technology	
  
•  2011	
  total	
  worldwide	
  market	
  for	
  tape	
  cartridges	
  
is	
  about	
  8m	
  units	
  (just	
  under	
  $1b	
  annual	
  
revenue)	
  
•  Compare	
  to	
  the	
  HDD	
  business	
  at	
  650m	
  units	
  in	
  
2010	
  (close	
  to	
  $40b	
  annual	
  revenue)	
  
•  80	
  disk	
  drives	
  are	
  manufactured	
  for	
  each	
  tape	
  
cartridge;	
  robots	
  are	
  complicated	
  
•  Fits	
  parNcular	
  applicaNon	
  segments	
  very	
  well,	
  but	
  
is	
  not	
  a	
  general-­‐purpose	
  soluNon	
  
hip://www.storagenewsleier.com/news/tapes/sccg-­‐ww-­‐tape-­‐market-­‐lto-­‐1q11	
  
hip://techreport.com/discussions.x/20890	
  
David	
  Anderson,	
  James	
  Dykes,	
  Erik	
  Riedel	
  “SCSI	
  vs.	
  ATA	
  -­‐	
  More	
  than	
  
an	
  interface”	
  2nd	
  Conference	
  on	
  File	
  and	
  Storage	
  Technology	
  (FAST).	
  
San	
  Francisco,	
  CA.	
  April	
  2003.	
  www.cs.cmu.edu/~riedel/#SCSIvsATA	
  

Más contenido relacionado

Similar a Long-Term Storage - Panel Session @ Library of Congress Workshop

Storage: Alternate Futures
Storage: Alternate FuturesStorage: Alternate Futures
Storage: Alternate Futures小新 制造
 
Optimizing Your WAN Bandwidth Has Immediate ROI
Optimizing Your WAN Bandwidth Has Immediate ROIOptimizing Your WAN Bandwidth Has Immediate ROI
Optimizing Your WAN Bandwidth Has Immediate ROISigniant
 
Deep Dive on Amazon Elastic Block Store
Deep Dive on Amazon Elastic Block StoreDeep Dive on Amazon Elastic Block Store
Deep Dive on Amazon Elastic Block StoreAmazon Web Services
 
Deep Dive on Amazon Elastic Block Store
Deep Dive on Amazon Elastic Block StoreDeep Dive on Amazon Elastic Block Store
Deep Dive on Amazon Elastic Block StoreAmazon Web Services
 
Deep Dive on Amazon Elastic Block Store
Deep Dive on Amazon Elastic Block StoreDeep Dive on Amazon Elastic Block Store
Deep Dive on Amazon Elastic Block StoreAmazon Web Services
 
Accelerating forensic and incident response workflow: the case for a new stan...
Accelerating forensic and incident response workflow: the case for a new stan...Accelerating forensic and incident response workflow: the case for a new stan...
Accelerating forensic and incident response workflow: the case for a new stan...Bradley Schatz
 
Cloud Storage Comparison: AWS vs Azure vs Google vs IBM
Cloud Storage Comparison: AWS vs Azure vs Google vs IBMCloud Storage Comparison: AWS vs Azure vs Google vs IBM
Cloud Storage Comparison: AWS vs Azure vs Google vs IBMRightScale
 
AWS Summit Seoul 2015 - EBS 성능 향상 및 EC2 비용 최적화 기법
AWS Summit Seoul 2015 - EBS 성능 향상 및 EC2 비용 최적화 기법AWS Summit Seoul 2015 - EBS 성능 향상 및 EC2 비용 최적화 기법
AWS Summit Seoul 2015 - EBS 성능 향상 및 EC2 비용 최적화 기법Amazon Web Services Korea
 
Chapter 8a: PowerPoint Presentation for External Hard Drives
Chapter 8a: PowerPoint Presentation for External Hard DrivesChapter 8a: PowerPoint Presentation for External Hard Drives
Chapter 8a: PowerPoint Presentation for External Hard DrivesMichaelHernandez217
 
Implementation of Dense Storage Utilizing HDDs with SSDs and PCIe Flash Acc...
Implementation of Dense Storage Utilizing  HDDs with SSDs and PCIe Flash  Acc...Implementation of Dense Storage Utilizing  HDDs with SSDs and PCIe Flash  Acc...
Implementation of Dense Storage Utilizing HDDs with SSDs and PCIe Flash Acc...Red_Hat_Storage
 
History of data storage: Infographic
History of data storage: InfographicHistory of data storage: Infographic
History of data storage: InfographicWebFX
 
Cis1 202d-ch8b-project-valenzuela-zibouche
Cis1 202d-ch8b-project-valenzuela-zibouche  Cis1 202d-ch8b-project-valenzuela-zibouche
Cis1 202d-ch8b-project-valenzuela-zibouche MacarenaValenzuela14
 
Blu ray disc by gautam
Blu ray disc by gautamBlu ray disc by gautam
Blu ray disc by gautamGAUTAM
 
Presentation on Blu ray disc by gautam
Presentation on Blu ray disc by gautamPresentation on Blu ray disc by gautam
Presentation on Blu ray disc by gautamGAUTAM
 

Similar a Long-Term Storage - Panel Session @ Library of Congress Workshop (20)

Storage devices
Storage devicesStorage devices
Storage devices
 
Storage devices
Storage devicesStorage devices
Storage devices
 
Storage: Alternate Futures
Storage: Alternate FuturesStorage: Alternate Futures
Storage: Alternate Futures
 
Optimizing Your WAN Bandwidth Has Immediate ROI
Optimizing Your WAN Bandwidth Has Immediate ROIOptimizing Your WAN Bandwidth Has Immediate ROI
Optimizing Your WAN Bandwidth Has Immediate ROI
 
Deep Dive on Amazon Elastic Block Store
Deep Dive on Amazon Elastic Block StoreDeep Dive on Amazon Elastic Block Store
Deep Dive on Amazon Elastic Block Store
 
Deep Dive on Amazon Elastic Block Store
Deep Dive on Amazon Elastic Block StoreDeep Dive on Amazon Elastic Block Store
Deep Dive on Amazon Elastic Block Store
 
Deep Dive on Amazon Elastic Block Store
Deep Dive on Amazon Elastic Block StoreDeep Dive on Amazon Elastic Block Store
Deep Dive on Amazon Elastic Block Store
 
Accelerating forensic and incident response workflow: the case for a new stan...
Accelerating forensic and incident response workflow: the case for a new stan...Accelerating forensic and incident response workflow: the case for a new stan...
Accelerating forensic and incident response workflow: the case for a new stan...
 
Cloud Storage Comparison: AWS vs Azure vs Google vs IBM
Cloud Storage Comparison: AWS vs Azure vs Google vs IBMCloud Storage Comparison: AWS vs Azure vs Google vs IBM
Cloud Storage Comparison: AWS vs Azure vs Google vs IBM
 
2879 771435
2879 7714352879 771435
2879 771435
 
Bandwidthreport
BandwidthreportBandwidthreport
Bandwidthreport
 
AWS Summit Seoul 2015 - EBS 성능 향상 및 EC2 비용 최적화 기법
AWS Summit Seoul 2015 - EBS 성능 향상 및 EC2 비용 최적화 기법AWS Summit Seoul 2015 - EBS 성능 향상 및 EC2 비용 최적화 기법
AWS Summit Seoul 2015 - EBS 성능 향상 및 EC2 비용 최적화 기법
 
Chapter 8a: PowerPoint Presentation for External Hard Drives
Chapter 8a: PowerPoint Presentation for External Hard DrivesChapter 8a: PowerPoint Presentation for External Hard Drives
Chapter 8a: PowerPoint Presentation for External Hard Drives
 
Implementation of Dense Storage Utilizing HDDs with SSDs and PCIe Flash Acc...
Implementation of Dense Storage Utilizing  HDDs with SSDs and PCIe Flash  Acc...Implementation of Dense Storage Utilizing  HDDs with SSDs and PCIe Flash  Acc...
Implementation of Dense Storage Utilizing HDDs with SSDs and PCIe Flash Acc...
 
History of data storage: Infographic
History of data storage: InfographicHistory of data storage: Infographic
History of data storage: Infographic
 
Blue Ray Disc
Blue Ray DiscBlue Ray Disc
Blue Ray Disc
 
10tb hard drive
10tb hard drive10tb hard drive
10tb hard drive
 
Cis1 202d-ch8b-project-valenzuela-zibouche
Cis1 202d-ch8b-project-valenzuela-zibouche  Cis1 202d-ch8b-project-valenzuela-zibouche
Cis1 202d-ch8b-project-valenzuela-zibouche
 
Blu ray disc by gautam
Blu ray disc by gautamBlu ray disc by gautam
Blu ray disc by gautam
 
Presentation on Blu ray disc by gautam
Presentation on Blu ray disc by gautamPresentation on Blu ray disc by gautam
Presentation on Blu ray disc by gautam
 

Último

Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 

Último (20)

Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 

Long-Term Storage - Panel Session @ Library of Congress Workshop

  • 1. Long-­‐Term  Storage   Panel  Session   Erik  Riedel,  EMC   Library  of  Congress  Workshop   September  2012   top  picture  “Once  Blue”  by  Jesse  Wagstaff  via  flickr/cc     right  picture  by  AusNn  Marshall  via  flickr/cc   revision  3  
  • 2. Parameters   •  Non-­‐compressible  data   •  Long-­‐term  storage   •  Very  high  reliability   •  Request  rate  of  10%  per  year   •  5,  20,  50  PB  in  2012,  2015,  2018  
  • 3. Density   2012   Disks  (raw)  @  3TB   Disks  (protected)   Racks  @  480  disks   5  PB   1,700  disks   2,700  disks   6  racks   20  PB   6,700  disks   11,000  disks   23  racks   50  PB   17,000  disks   27,000  disks   56  racks  
  • 4. Density   2012   Disks  (raw)  @  3TB   Disks  (protected)   Racks  @  480  disks   5  PB   1,700  disks   2,700  disks   6  racks   20  PB   6,700  disks   11,000  disks   23  racks   50  PB   17,000  disks   27,000  disks   56  racks   2015   Disks  (raw)  @  6TB   Disks  (protected)   Racks  @  600  disks   5  PB   830  disks   1,300  disks   3  racks   20  PB   3,300  disks   5,300  disks   9  racks   50  PB   8,300  disks   13,000  disks   23  racks  
  • 5. Density   2012   Disks  (raw)  @  3TB   Disks  (protected)   Racks  @  480  disks   5  PB   1,700  disks   2,700  disks   6  racks   20  PB   6,700  disks   11,000  disks   23  racks   50  PB   17,000  disks   27,000  disks   56  racks   2015   Disks  (raw)  @  6TB   Disks  (protected)   Racks  @  600  disks   5  PB   830  disks   1,300  disks   3  racks   20  PB   3,300  disks   5,300  disks   9  racks   50  PB   8,300  disks   13,000  disks   23  racks   2018   Disks  (raw)  @  10TB   Disks  (protected)   Racks  @  600  disks   5  PB   500  disks   800  disks   2  racks   20  PB   2,000  disks   3,200  disks   6  racks   50  PB   5,000  disks   8,000  disks   14  racks  
  • 6. Performance   2012   10%/yr   Disks   Disk  BW   Racks   Bandwidth   Actual  BW   Days-­‐to-­‐fill   5  PB   16  MB/s   2,700   200  GB/s   6     30  GB/s   3  GB/s   19   20  PB   63  MB/s   11,000   1.1  TB/s   23   115  GB/s   11  GB/s   20   50  PB   159  MB/s   27,000   2.7  TB/s   56   280  GB/s   28  GB/s   21  
  • 7. Performance   2012   10%/yr   Disks   Disk  BW   Racks   Bandwidth   Actual  BW   Days-­‐to-­‐fill   5  PB   16  MB/s   2,700   200  GB/s   6     30  GB/s   3  GB/s   19   20  PB   63  MB/s   11,000   1.1  TB/s   23   115  GB/s   11  GB/s   20   50  PB   159  MB/s   27,000   2.7  TB/s   56   280  GB/s   28  GB/s   21   2012   10%/2day   Disks   Disk  BW   Racks   Bandwidth   Actual  BW   Days-­‐to-­‐fill   5  PB   2.9  GB/s   2,700   200  GB/s   6     30  GB/s   3  GB/s   19   20  PB   11  GB/s   11,000   1.1  TB/s   23   115  GB/s   11  GB/s   20   50  PB   29  GB/s   27,000   2.7  TB/s   56   280  GB/s   28  GB/s   21  
  • 8. Performance   2012   10%/yr   Disks   Disk  BW   Racks   Bandwidth   Actual  BW   Days-­‐to-­‐fill   5  PB   16  MB/s   2,700   200  GB/s   6     30  GB/s   3  GB/s   19   20  PB   63  MB/s   11,000   1.1  TB/s   23   115  GB/s   11  GB/s   20   50  PB   159  MB/s   27,000   2.7  TB/s   56   280  GB/s   28  GB/s   21   2018   10%/2day   Disks   Disk  BW   Racks   Bandwidth   Actual  BW   Days-­‐to-­‐fill   5  PB   2.9  GB/s   800   80  GB/s   2     10  GB/s   3.3  GB/s   17   20  PB   11  GB/s   3,200   320  GB/s   6   30  GB/s   10  GB/s   23   50  PB   29  GB/s   8,000   800  GB/s   14   70  GB/s   23  GB/s   25  
  • 9. Cost   2012   10%yr   Disks   Disk  BW   Racks   Bandwidth   Actual   Days-­‐to-­‐fill   5  PB   16  MB/s   2,700   200  GB/s   6     30  GB/s   3  GB/s   19   20  PB   63  MB/s   11,000   1.1  TB/s   23   115  GB/s   11  GB/s   20   50  PB   159  MB/s   27,000   2.7  TB/s   56   280  GB/s   28  GB/s   21  
  • 10. Cost   2012   10%yr   Disks   Disk  BW   Racks   Bandwidth   Actual   Days-­‐to-­‐fill   5  PB   16  MB/s   2,700   200  GB/s   6     30  GB/s   3  GB/s   19   20  PB   63  MB/s   11,000   1.1  TB/s   23   115  GB/s   11  GB/s   20   50  PB   159  MB/s   27,000   2.7  TB/s   56   280  GB/s   28  GB/s   21   2012   $/month  @  $0.01/GB   5  PB   $50,000/month   20  PB   $200,000/month   50  PB   $500,000/month   Cost  if  using  e.g.  “cold”  public  cloud  storage  
  • 11. Cost   2012   10%yr   Disks   Disk  BW   Racks   Bandwidth   Actual   Days-­‐to-­‐fill   5  PB   16  MB/s   2,700   200  GB/s   6     30  GB/s   3  GB/s   19   20  PB   63  MB/s   11,000   1.1  TB/s   23   115  GB/s   11  GB/s   20   50  PB   159  MB/s   27,000   2.7  TB/s   56   280  GB/s   28  GB/s   21   2012   sqN/person   $/sqN   $/month   20  employees   90   $48     $86,000/month   Washington,  DC   80  employees   75   $48   $288,000/month   Washington,  DC   200  employees   75   $24   $360,000/month   Minneapolis,  MN   2012   $/month  @  $0.01/GB   5  PB   $50,000/month   20  PB   $200,000/month   50  PB   $500,000/month   Cost  if  using  e.g.  “cold”  public  cloud  storage   For  comparison,  the  cost  to  “store”   20  librarians  or  data  scienNsts  
  • 12. AssumpNons   •  Data  protecNon  in  a  single  data  center,  using  an  erasure-­‐coding   scheme  at  1.6x  overhead   •  480  drive  racks  in  2012  (40U)   •  600  drive  racks  in  2015  and  2018  (50+U)   •  10%/year  access  assumes  10%  of  total  data  is  accessed  in  even   distribuNon  over  365  days/year,  24  hours/day  –  opNmisNc   •  10%/2day  access  assumes  10%  of  data  is  accessed  on  only  2  days   per  year  (say  Thanksgiving  and  Xmas)  –  very  bursty   •  Bandwidth  is  theoreNcal  bandwidth  at  40  Gb/s  per  rack  (4x  10  GbE)   •  Actual  bandwidth  is  1/10  of  theoreNcal  maximum  for  2012  and   2015;  up  to  1/3  theoreNcal  max  for  2018  (sohware  improvements)   •  sqh  per  person  and  $/sqh  references   hip://www.inc.com/news/arNcles/2010/10/washington-­‐dc-­‐rents-­‐top-­‐those-­‐in-­‐nyc.html   hip://newsfeed.Nme.com/2011/02/08/youre-­‐not-­‐imagining-­‐it-­‐your-­‐cubicle-­‐is-­‐gekng-­‐smaller/  
  • 13. References   •  Why  access  to  data  maiers,  not  just  “dark  storage”,   but  wide  access  to  electronic  data:   –  The  Internet  Archive   –  hip://archive.org/about/   –  History  of  the  Internet,  sNll  online  aher  20  years   –  hip://www.cs.cmu.edu/~riedel/library/birthday.html    (from  April  2003,  LoC  workshop  on  Digital  PreservaNon)   •  What  about  Flash?   –  Death  of  Disks  (has  been  widely  exaggerated)   –  hip://www.cs.cmu.edu/~riedel/#HECFSIO2011   –  How  to  Build  Big  Storage  as  a  Cloud   –  hip://storageconference.org/2012/PresentaNons/R00.Keynote.pdf  
  • 15. What  About  Tape?   pictures  by  Gill  Wildman  via  flickr/cc  
  • 16. What  About  Tape?   •  Tapes  are  not  a  commodity  technology   •  2011  total  worldwide  market  for  tape  cartridges   is  about  8m  units  (just  under  $1b  annual   revenue)   •  Compare  to  the  HDD  business  at  650m  units  in   2010  (close  to  $40b  annual  revenue)   •  80  disk  drives  are  manufactured  for  each  tape   cartridge;  robots  are  complicated   •  Fits  parNcular  applicaNon  segments  very  well,  but   is  not  a  general-­‐purpose  soluNon   hip://www.storagenewsleier.com/news/tapes/sccg-­‐ww-­‐tape-­‐market-­‐lto-­‐1q11   hip://techreport.com/discussions.x/20890  
  • 17. David  Anderson,  James  Dykes,  Erik  Riedel  “SCSI  vs.  ATA  -­‐  More  than   an  interface”  2nd  Conference  on  File  and  Storage  Technology  (FAST).   San  Francisco,  CA.  April  2003.  www.cs.cmu.edu/~riedel/#SCSIvsATA