An Overview of Cloud Computing: My Other Computer is a Data Center
Robert Grossman, Open Cloud Consortium
January 7, 2010
Part 1. What is a Cloud?
What is a Cloud? Software as a Service (SaaS)
What Else is a Cloud? Platform as a Service (PaaS)
Is Anything Else a Cloud? Infrastructure as a Service (IaaS)
Are There Other Types of Clouds? Large Data Cloud Services (e.g., ad targeting)
What is Virtualization?
Idea Dates Back to the 1960s
Virtualization was first widely deployed with IBM VM/370.
[Diagram: applications running on CMS, CMS, and MVS guests, hosted by IBM VM/370 on an IBM mainframe.]
Native (full) virtualization. Example: VMware ESX.
What Do You Optimize? Goal: Minimize latency and control heat. Goal: Maximize data (with matching compute) and control cost.
Scale Is New
Elastic, Usage-Based Pricing Is New
1 computer in a rack for 120 hours costs the same as 120 computers in three racks for 1 hour.
Clouds can be used to manage surges in computing needs.
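The pricing claim above is simple arithmetic over machine-hours. A minimal sketch follows, assuming a flat per-machine-hour rate; the rate of 10 cents is a hypothetical value chosen for illustration and does not come from the slides.

```python
# Hypothetical flat rate in cents per machine-hour (not from the slides).
RATE_CENTS_PER_MACHINE_HOUR = 10


def usage_cost_cents(machines: int, hours: int,
                     rate: int = RATE_CENTS_PER_MACHINE_HOUR) -> int:
    """Cost under pure usage-based pricing: pay only for machine-hours used."""
    return machines * hours * rate


# One computer for 120 hours versus 120 computers for one hour.
one_for_long = usage_cost_cents(machines=1, hours=120)
many_for_short = usage_cost_cents(machines=120, hours=1)

print(one_for_long, many_for_short, one_for_long == many_for_short)
```

Under this assumption the two bills are identical, so the only difference is how long you wait for the result, which is why elastic capacity suits surges.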
What Resource is Managed?
Supercomputer Center Model: manage cycles
- Scarce processors wait for data.
- Wait for an opening in the queue.
- Scatter the data to the processors and gather the results.
Data Center Model: manage data
- Persistent data waits for queries.
- Computation is done locally.
- Results are returned.
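The contrast between the two models can be sketched in a few lines. This is only an illustration under stated assumptions: nodes are simulated with in-memory lists, and the function names `supercomputer_model` and `data_center_model` are hypothetical, not part of any system named in the slides.

```python
from typing import Callable, Dict, List


def supercomputer_model(central_data: List[int], n_procs: int,
                        fn: Callable[[List[int]], int]) -> int:
    """Scatter centrally stored data to processors, then gather results."""
    chunks = [central_data[i::n_procs] for i in range(n_procs)]  # scatter
    partials = [fn(chunk) for chunk in chunks]                   # compute
    return sum(partials)                                         # gather


def data_center_model(node_storage: Dict[str, List[int]],
                      fn: Callable[[List[int]], int]) -> int:
    """Data already persists on each node; only fn and small results move."""
    partials = {node: fn(records) for node, records in node_storage.items()}
    return sum(partials.values())


data = list(range(100))
stored = {"node-a": data[:50], "node-b": data[50:]}

print(supercomputer_model(data, n_procs=4, fn=sum))
print(data_center_model(stored, fn=sum))
```

Both calls compute the same total. The difference lies in what travels over the network: whole data slices in the first case, only the function and its small results in the second.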
Part 2. Data Centers as the Unit of Computing
Cloud computing is at the top of the Gartner hype cycle.
“Cloud computing has become the center of investment and innovation.” Nicholas Carr, 2009 IDC Directions
[Figure: timeline with markers 1609 (30x), 1670 (250x), 1976 (10x-100x), and 2004 (10x-100x); eras labeled experimental science, simulation science, and data science.]
Requirements for Clouds
Transition Taking Place
- A handful of players are building multiple data centers a year and improving with each one. This includes Google, Microsoft, Yahoo, …
- A data center today costs $200M to $400M or more.
- The Berkeley RAD Report points out an analogy with the semiconductor industry: companies stopped building their own fabs and started leasing fabs from others as the cost of a fab approached $1B.
Which is the Operating System?
[Diagram: on a workstation, a hypervisor hosts VM 1 through VM 5; in a data center, a data center operating system hosts VM 1 through VM 50,000.]
How Do You Program a Data Center?
Some Programming Models for Data Centers
Operations over a data center of disks:
- MapReduce (“string-based”)
- User-Defined Functions (UDFs) over data center
- SQL and Quasi-SQL over data center
- Data analysis / statistics over data center
Operations over a data center of memory:
- Grep over distributed memory
- UDFs over distributed memory
- SQL and Quasi-SQL over distributed memory
- Data analysis / statistics over distributed memory
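To make the first model in the list concrete, here is a minimal single-process sketch of the MapReduce pattern (map, group by key, reduce) applied to word counting. It illustrates the programming model only. It does not use the Hadoop or Google APIs, and the names `map_fn`, `reduce_fn`, and `map_reduce` are chosen for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_fn(record: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for each word in a text record."""
    for word in record.split():
        yield word.lower(), 1


def reduce_fn(key: str, values: Iterable[int]) -> Tuple[str, int]:
    """Reduce step: combine all values that share a key."""
    return key, sum(values)


def map_reduce(records: Iterable[str]) -> Dict[str, int]:
    """Run map, group intermediate pairs by key (the shuffle), then reduce."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)
    return dict(reduce_fn(key, values) for key, values in groups.items())


counts = map_reduce(["cloud data center", "data cloud", "data"])
print(counts)
```

In a real deployment the map and reduce calls run on many nodes and the grouping step moves data between them; here everything runs in one process so the data flow is easy to follow.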
Part 3. Open Cloud Consortium
- U.S. 501(c)(3) not-for-profit corporation.
- Supports the development of standards and interoperability frameworks.
- Supports reference implementations for cloud computing.
- Manages testbeds: Open Cloud Testbed, Intercloud Testbed, Open Science Data Cloud.
- Develops benchmarks.
www.opencloudconsortium.org
OCC Members
Companies: Aerospace, Booz Allen Hamilton, Cisco, InfoBlox, Open Data Group, Raytheon, Yahoo
Universities: CalIT2, Johns Hopkins, Northwestern, University of Illinois at Chicago, University of Chicago
Government agencies: NASA
Organizations: Sector Project
Open Cloud Testbed
Phase 2: 9 racks, 250+ nodes, 1000+ cores, 10+ Gb/s
Networks: C-Wave, CENIC, Dragon, MREN
Software: Sector/Sphere, Thrift, KVM VMs, Eucalyptus VMs
Intercloud Testbed
- Cloud Compute Services
- Data & Storage as a Service: Large Data Cloud Interoperability Framework; SNIA Cloud Data Management Interface (CDMI)
- Dynamic infrastructure service linking IaaS and DaaS (working with the Infrastructure 2.0 Working Group)
- IaaS layers: Virtual Data Centers (VDC), Virtual Networks (VN), Virtual Machines (VM), Physical Resources
- Dynamic infrastructure service naming and linking entities in the IaaS layers (working with the Infrastructure 2.0 Working Group)
- Open Cloud Computing Interface (OCCI), Open Virtualization Format (OVF)
Open Science Data Cloud
Planning to work with 5 international partners (all connected with 10 Gbps networks).
[Diagram labels: sky cloud, biocloud]
MalStone (OCC-Developed Benchmark)
Sector/Sphere 1.20 and Hadoop 0.18.3 with no replication, run on Phase 1 of the Open Cloud Testbed in a single rack. The data consisted of 500 million 100-byte records per node on 20 nodes.
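As a quick check of the data volume described above, the following sketch multiplies out the stated figures. It assumes decimal units (1 GB = 10^9 bytes, 1 TB = 10^12 bytes), which the slide does not specify.

```python
NODES = 20
RECORDS_PER_NODE = 500_000_000
BYTES_PER_RECORD = 100

# Assumption: decimal units, since the slide does not say which it uses.
GB = 10**9
TB = 10**12

total_records = NODES * RECORDS_PER_NODE
bytes_per_node = RECORDS_PER_NODE * BYTES_PER_RECORD
total_bytes = NODES * bytes_per_node

print(f"records in total: {total_records:,}")
print(f"data per node:    {bytes_per_node / GB:g} GB")
print(f"data in total:    {total_bytes / TB:g} TB")
```

Under that assumption the benchmark run covers 10 billion records: 50 GB per node and 1 TB across the rack.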
Some Lessons Learned (So Far)
- Python over the Hadoop Distributed File System is surprisingly powerful.
- Tuning Hadoop can be a large (unacknowledged) cost.
- The performance of a cloud computation can be significantly impacted by just 1 or 2 nodes that are a bit slower.
- Wide area clouds can be practical in some cases.
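The point about slow nodes can be illustrated with a toy model. The sketch below assumes the work is split evenly and statically across nodes and that the job finishes only when the slowest node finishes. It does not model speculative execution or dynamic rebalancing, which real systems may use, and the numbers are made up for illustration.

```python
from typing import List


def job_time(work_units: float, node_speeds: List[float]) -> float:
    """Completion time when each node gets an equal, fixed share of the work.

    node_speeds are in work units per second; the job ends when the
    slowest node finishes its share.
    """
    share = work_units / len(node_speeds)
    return max(share / speed for speed in node_speeds)


# Hypothetical cluster: 20 nodes, one of them running at half speed.
uniform = job_time(1000, [1.0] * 20)
with_straggler = job_time(1000, [1.0] * 19 + [0.5])

print(uniform, with_straggler, with_straggler / uniform)
```

In this model a single node at half speed doubles the running time of the whole job, even though the cluster has lost only 2.5% of its aggregate capacity.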
Part 4. Sector
http://sector.sourceforge.net
Sector Overview
- Sector is fast, as measured by MalStone and Terasort.
- Sector is easy to program: it supports UDFs, MapReduce, and Python over streams.
- Sector does not require extensive tuning.
- Sector is secure: a HIPAA-compliant Sector cloud is being set up.
- Sector is reliable: Sector v1.24 supports multiple master node servers.
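The idea of applying a user-defined function (UDF) over a stream of data can be sketched generically. The code below is not Sphere's API, which the slides do not show: `apply_udf` and `max_record_length` are hypothetical names, and the data segments are simulated with in-memory lists.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def apply_udf(segments: Iterable[List[T]],
              udf: Callable[[List[T]], R]) -> List[R]:
    """Apply an arbitrary user-defined function to each segment independently.

    Segments do not depend on one another, so a distributed system
    could process each one on the node that stores it.
    """
    return [udf(segment) for segment in segments]


def max_record_length(segment: List[str]) -> int:
    """Example UDF: length of the longest record in a segment (0 if empty)."""
    return max((len(record) for record in segment), default=0)


segments = [["a", "bbb"], ["cc"], []]
result = apply_udf(segments, max_record_length)
print(result)
```

Unlike the MapReduce pattern, the function here is not restricted to a map step paired with a reduce step; any per-segment computation fits.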
Google’s Large Data Cloud (Google’s Stack)
Applications
Compute Services: Google’s MapReduce
Data Services: Google’s BigTable
Storage Services: Google File System (GFS)
Hadoop’s Large Data Cloud (Hadoop’s Stack)
Applications
Compute Services: Hadoop’s MapReduce
Data Services
Storage Services: Hadoop Distributed File System (HDFS)
Sector’s Large Data Cloud (Sector’s Stack)
Applications
Compute Services: Sphere’s UDFs
Data Services
Storage Services: Sector’s Distributed File System (SDFS)
Routing & Transport Services: UDP-based Data Transport Protocol (UDT)

Biomedical Clusters, Clouds and Commons - DePaul Colloquium Oct 24, 2014Biomedical Clusters, Clouds and Commons - DePaul Colloquium Oct 24, 2014
Biomedical Clusters, Clouds and Commons - DePaul Colloquium Oct 24, 2014
 
Big Data, The Community and The Commons (May 12, 2014)
Big Data, The Community and The Commons (May 12, 2014)Big Data, The Community and The Commons (May 12, 2014)
Big Data, The Community and The Commons (May 12, 2014)
 
What Are Science Clouds?
What Are Science Clouds?What Are Science Clouds?
What Are Science Clouds?
 
Adversarial Analytics - 2013 Strata & Hadoop World Talk
Adversarial Analytics - 2013 Strata & Hadoop World TalkAdversarial Analytics - 2013 Strata & Hadoop World Talk
Adversarial Analytics - 2013 Strata & Hadoop World Talk
 


My Other Computer is a Data Center (2010 v21)

  • 1. An Overview of Cloud Computing: My Other Computer is a Data Center. Robert Grossman, Open Cloud Consortium, January 7, 2010
  • 2. Part 1. What is a Cloud?
  • 3. What is a Cloud? Software as a Service (SaaS)
  • 4. What Else is a Cloud? Platform as a Service (PaaS)
  • 5. Is Anything Else a Cloud? Infrastructure as a Service (IaaS)
  • 6. Are There Other Types of Clouds? Large Data Cloud Services (example: ad targeting)
  • 8. Idea Dates Back to the 1960s. Virtualization was first widely deployed with IBM VM/370 (diagram: apps running on CMS and MVS guests over IBM VM/370 on an IBM mainframe). Native (full) virtualization; example: VMware ESX.
  • 9. What Do You Optimize? Goal: Minimize latency and control heat. Goal: Maximize data (with matching compute) and control cost.
  • 10. Scale is new
  • 14. What Resource is Managed? Supercomputer Center Model (manage cycles): scarce processors wait for data; wait for an opening in the queue; scatter the data to the processors and gather the results. Data Center Model (manage data): persistent data waits for queries; computation done locally; results returned.
  • 15. Part 2. Data Centers as the Unit of Computing. Cloud computing is at the top of the Gartner hype cycle. “Cloud computing has become the center of investment and innovation.” (Nicholas Carr, 2009 IDC Directions)
  • 16. Timeline figure: 2004 (10x-100x); 1976 (10x-100x); data science; 1670 (250x); simulation science; 1609 (30x); experimental science.
  • 18. Transition Taking Place. A handful of players are building multiple data centers a year and improving with each one. This includes Google, Microsoft, Yahoo, … A data center today costs $200 M – $400+ M. The Berkeley RAD Report points out an analogy with the semiconductor industry, where companies stopped building their own fabs and started leasing fabs from others as the cost of a fab approached $1B.
  • 19. Which is the Operating System? Workstation: a hypervisor runs VM 1 … VM 5. Data center: a data center operating system runs VM 1 … VM 50,000.
  • 20. How Do You Program A Data Center?
  • 21. Some Programming Models for Data Centers. Operations over a data center of disks: MapReduce (“string-based”); User-Defined Functions (UDFs) over data center; SQL and Quasi-SQL over data center; data analysis / statistics over data center. Operations over a data center of memory: grep over distributed memory; UDFs over distributed memory; SQL and Quasi-SQL over distributed memory; data analysis / statistics over distributed memory.
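The MapReduce model named on slide 21 can be sketched in a few lines of plain Python. This is only a single-process illustration of the map, shuffle, and reduce phases, not Hadoop's or Sector's API; every function name here (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_mapper`, `count_reducer`, `run_mapreduce`) is hypothetical and chosen for the example.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to each record and collect the emitted (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs


def shuffle_phase(pairs):
    """Group values by key, as the shuffle step would do across nodes."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Apply the reducer once per key to that key's list of values."""
    return {key: reducer(key, values) for key, values in groups.items()}


def run_mapreduce(records, mapper, reducer):
    """Chain the three phases in a single process (illustrative only)."""
    return reduce_phase(shuffle_phase(map_phase(records, mapper)), reducer)


def word_mapper(line):
    """String-based mapper: emit (word, 1) for every word in a line of text."""
    return [(word, 1) for word in line.split()]


def count_reducer(key, values):
    """Sum the counts emitted for one word."""
    return sum(values)


lines = ["cloud data center", "data center data"]
counts = run_mapreduce(lines, word_mapper, count_reducer)
print(counts)
```

In a real data center the records would live on many disks and each phase would run in parallel near the data; the sketch only shows the shape of the programming model.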
  • 22. Part 3. Open Cloud Consortium
  • 23. U.S. 501(c)(3) not-for-profit corporation. Supports the development of standards and interoperability frameworks. Supports reference implementations for cloud computing. Manages testbeds: Open Cloud Testbed, Intercloud Testbed, Open Science Data Cloud. Develops benchmarks. www.opencloudconsortium.org
  • 24. OCC Members. Companies: Aerospace, Booz Allen Hamilton, Cisco, InfoBlox, Open Data Group, Raytheon, Yahoo. Universities: CalIT2, Johns Hopkins, Northwestern, University of Illinois at Chicago, University of Chicago. Government agencies: NASA. Organizations: Sector Project.
  • 33. Open Science Data Cloud (diagram labels: sky cloud, biocloud). Planning to work with 5 international partners (all connected with 10 Gbps networks).
  • 34. MalStone (OCC-Developed Benchmark). Sector/Sphere 1.20 and Hadoop 0.18.3 with no replication, run on Phase 1 of the Open Cloud Testbed in a single rack. Data consisted of 20 nodes with 500 million 100-byte records per node (about 1 TB in total).
  • 35. Some Lessons Learned (So Far). Python over the Hadoop Distributed File System is surprisingly powerful. Tuning Hadoop can be a large (unacknowledged) cost. The performance of a cloud computation can be significantly impacted by just 1 or 2 nodes that are a bit slower. Wide area clouds can be practical in some cases.
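The lesson on slide 35 about one or two slightly slower nodes can be illustrated with a toy model. The sketch below assumes the work is split evenly and statically across nodes, with no speculative re-execution, so the job finishes only when the slowest node does; the function names and the numbers (20 nodes, 100 seconds, 25% slower) are illustrative assumptions, not measurements from the talk.

```python
def job_time(node_times):
    """A statically partitioned job finishes when its slowest node finishes."""
    return max(node_times)


def simulate(num_nodes, base_time, num_slow, slow_factor):
    """Return (slowdown, utilization) when num_slow nodes run slow_factor times slower.

    Assumption: every node gets an equal share of work that takes base_time
    seconds on a healthy node.
    """
    times = [base_time] * num_nodes
    for i in range(num_slow):
        times[i] = base_time * slow_factor
    finish = job_time(times)
    slowdown = finish / base_time
    # Fraction of reserved node-seconds spent doing work rather than idling.
    utilization = sum(times) / (num_nodes * finish)
    return slowdown, utilization


slowdown, utilization = simulate(num_nodes=20, base_time=100.0,
                                 num_slow=2, slow_factor=1.25)
print(f"slowdown: {slowdown:.2f}x, utilization: {utilization:.0%}")
```

Under these assumptions the whole job slows down by the same factor as its slowest node, while the healthy nodes sit idle waiting for it to finish.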
  • 36. Part 4. Sector. http://sector.sourceforge.net
  • 37. Sector Overview. Sector is fast: as measured by MalStone & Terasort. Sector is easy to program: supports UDFs, MapReduce & Python over streams. Sector does not require extensive tuning. Sector is secure: a HIPAA compliant Sector cloud is being set up. Sector is reliable: Sector v1.24 supports multiple master node servers.
  • 38. Google’s Large Data Cloud (Google’s Stack). Applications; Compute Services: Google’s MapReduce; Data Services: Google’s BigTable; Storage Services: Google File System (GFS).
  • 39. Hadoop’s Large Data Cloud (Hadoop’s Stack). Applications; Compute Services: Hadoop’s MapReduce; Data Services; Storage Services: Hadoop Distributed File System (HDFS).
  • 40. Sector’s Large Data Cloud (Sector’s Stack). Applications; Compute Services: Sphere’s UDFs; Data Services; Storage Services: Sector’s Distributed File System (SDFS); Routing & Transport Services: UDP-based Data Transport Protocol (UDT).
  • 41. Generalization: Apply User Defined Functions (UDF) to Files in Storage Cloud (diagram: the map/shuffle and reduce stages each correspond to a UDF).
  • 42. Hadoop vs Sector. Source: Gu and Grossman, Sector and Sphere, Phil. Trans. Royal Society A, 2009.
  • 43. Terasort - Sector vs Hadoop Performance. Sector/Sphere 1.24a and Hadoop 0.20.1 with no replication, run on Phase 2 of the Open Cloud Testbed with co-located racks.
  • 44. Sector Applications. Distributing the 15 TB Sloan Digital Sky Survey to astronomers around the world (joint with JHU, 2005). Managing and analyzing high throughput sequence data (Cistrack, University of Chicago, 2007). Detecting emergent behavior in distributed network data (Angle, won SC 07 Analytics Challenge). Image processing for high throughput sequencing. Wide area clouds (won SC 09 BWC with 100 Gbps wide area computation). New ensemble-based algorithms for trees. Graph processing.
  • 45. Cistrack Web Portal & Widgets; Cistrack Elastic Cloud Services; Cistrack Database; Analysis Pipelines & Re-analysis Services; Cistrack Large Data Cloud Services; Ingestion Services.
  • 46. Thank you. For more information, please see blog.rgrossman.com
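The generalization on slide 41, applying user-defined functions to files in a storage cloud, can be sketched as two UDF passes: one that plays the role of map/shuffle and one that plays the role of reduce. This is a hypothetical in-memory illustration, not the Sphere API; `apply_udf`, `partition_udf`, `sort_udf`, and the `SPLIT` threshold are all names and values invented for the example, and Python dictionaries stand in for files spread across storage nodes.

```python
from collections import defaultdict

SPLIT = 50  # assumed range boundary between the two output buckets


def apply_udf(segments, udf):
    """Run udf on each segment independently and gather its output by bucket id.

    segments maps a segment id to its list of records; udf returns a dict
    mapping an output bucket id to the records destined for that bucket.
    """
    buckets = defaultdict(list)
    for segment_id, records in sorted(segments.items()):
        for bucket_id, out in udf(segment_id, records).items():
            buckets[bucket_id].extend(out)
    return dict(buckets)


def partition_udf(segment_id, records):
    """Stage 1 (map/shuffle role): route each record to a bucket by value range."""
    out = defaultdict(list)
    for value in records:
        out[0 if value < SPLIT else 1].append(value)
    return out


def sort_udf(segment_id, records):
    """Stage 2 (reduce role): sort the records within one bucket in place of a reducer."""
    return {segment_id: sorted(records)}


files = {0: [70, 10, 55], 1: [5, 90, 30]}   # two input segments
shuffled = apply_udf(files, partition_udf)   # records regrouped by range
result = apply_udf(shuffled, sort_udf)       # each bucket sorted locally
print(result)
```

Because the buckets are range-partitioned, reading them in bucket order yields a globally sorted sequence, which is the same two-stage shape a Terasort-style job takes.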