IDEAS for thought
SHPC lunch and learn
JULY 25, 2013
John D. Almon
• Full stack software engineer
• Implemented RTM on GPU using MPI
• Implemented cloud-based WEM using SOA
• Terabyte scale database design and data warehousing
• Architected hybrid web interpretation and processing system
• C++, Java, MPI, C, Oracle PL/SQL, HTML, web-based systems, XML
• Managed software team
• Currently serves as CEO of Advanced Seismic Technologies
Hardware
Small HPC setup - Guess what company
• Fiber optic to every desktop using HPC grid
• 400 Terabytes of Storage
• 300 x 10 GbE ports
• 1500 x 1 GbE ports
• Desktop workstations automatically added to HPC grid after hours
• 5,000 AMD processors + 3,000 desktop processors at night
Monsters University
• 100 Million CPU hours
• 5.5 million individual hairs
• 127 simulated garments
• Global illumination ray tracing
Key point #1
Perhaps we can learn new techniques from
other industries that operate at scale
Software
Bimodal Distribution of Developers
This shapes Architecture and Design Innovation
Loosely coupled code
Fast hardware
Open source
Closely coupled code
Slow hardware
More optimization
Geoscience Gap
Massive hardware changes
Better compilers and cheaper hardware have
changed everything about software development
• No more Fortran ( sort of )
• Object oriented approach
• Teenage internet billionaires
Software access patterns affect memory
speed ( affected by data and users )
Word Size Affects
Memory Bandwidth
Temporal Locality &
Spatial Locality
Can affect bandwidth
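To make the spatial-locality point concrete, here is a minimal C sketch. It is not from the slides: the array size `N` and the function names `fill_grid`, `sum_row_major`, and `sum_col_major` are assumptions chosen for illustration. Both traversals compute the same sum; the first touches adjacent addresses, while the second jumps a full row per access, which on typical cached hardware tends to be slower once the array outgrows the cache.

```c
#include <assert.h>
#include <stddef.h>

#define N 512   /* illustrative size, not from the slides */

static double grid[N][N];

/* Fill the grid with a known pattern so the sums can be checked. */
void fill_grid(void)
{
    for (size_t r = 0; r < N; r++)
        for (size_t c = 0; c < N; c++)
            grid[r][c] = (double)((r + c) % 7);
}

/* Stride-1 traversal of the top-left n x n block:
 * consecutive accesses touch adjacent addresses. */
double sum_row_major(size_t n)
{
    assert(n <= N);
    double total = 0.0;
    for (size_t r = 0; r < n; r++)
        for (size_t c = 0; c < n; c++)
            total += grid[r][c];
    return total;
}

/* Stride-N traversal of the same block:
 * each access lands N doubles away from the previous one. */
double sum_col_major(size_t n)
{
    assert(n <= N);
    double total = 0.0;
    for (size_t c = 0; c < n; c++)
        for (size_t r = 0; r < n; r++)
            total += grid[r][c];
    return total;
}
```

Timing the two functions on the full `N` x `N` block is one simple way to see the effect on a given machine; the results depend on cache sizes and compiler flags.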
Memory Mountain software code
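/* Iterate over first "elems" elements of array "data" with stride of
 * "stride". */
#define MAXELEMS (1 << 20)  /* array size is an assumption; the slide omits it */
double data[MAXELEMS];      /* array being traversed (declaration not shown on the slide) */

void test(int elems, int stride)
{
    int i;
    double result = 0.0;
    volatile double sink;

    for (i = 0; i < elems; i += stride)
        result += data[i];
    sink = result; /* So compiler doesn't optimize away the loop */
}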
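A memory mountain is built by timing a strided loop like this one over many (size, stride) pairs. The sketch below shows one hedged way to measure a single point using the standard `clock()` call. The names `strided_sum` and `read_throughput_mbs` and the `repeats` parameter are assumptions for illustration, not part of the original benchmark.

```c
#include <assert.h>
#include <time.h>

/* Sum every `stride`-th element of the first `elems` entries. */
double strided_sum(const double *data, int elems, int stride)
{
    assert(stride > 0);
    double result = 0.0;
    for (int i = 0; i < elems; i += stride)
        result += data[i];
    return result;
}

/* Rough read throughput in MB/s for one (elems, stride) point.
 * Returns 0.0 if the run was too short for clock() to resolve. */
double read_throughput_mbs(const double *data, int elems, int stride,
                           int repeats)
{
    volatile double sink = 0.0;  /* keeps the loop from being optimized away */
    clock_t start = clock();
    for (int r = 0; r < repeats; r++)
        sink += strided_sum(data, elems, stride);
    clock_t stop = clock();
    (void)sink;

    double seconds = (double)(stop - start) / CLOCKS_PER_SEC;
    if (seconds <= 0.0)
        return 0.0;

    /* Elements actually touched per pass: ceil(elems / stride). */
    double touched = (double)((elems + stride - 1) / stride);
    return touched * sizeof(double) * repeats / seconds / 1.0e6;
}
```

Sweeping `elems` across the cache sizes and `stride` from 1 upward, then plotting the throughput surface, gives the familiar mountain shape. `clock()` is coarse, so small working sets need a large `repeats` value.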
Everything is a cache ( memory hierarchy )
• Register, ~2ns
• Primary cache, ~4-5ns
• Secondary cache, ~30ns
• Main memory, ~22ns
• Magnetic Disk, ~3ms
• SSD, ~100µs
• File server on Gigabit ethernet
• Cloud
Bottleneck is the
memory bus
Bottleneck is the
network
New Paradigm for Optimization of Compute
at Cluster / Cloud level
• Pre sorting / caching of data for maximum
throughput
• Heuristic analysis at the application level
• Optimization of hardware resources determined by
the application
• Hardware switching based on access patterns of
application and user
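As one small illustration of pre-sorting for throughput, the sketch below reorders a batch of read requests so that each file is read front to back, which tends to turn scattered reads into sequential ones. The `read_request` layout and the function names are hypothetical; the slides do not describe a specific implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical read request: which file, and which byte range in it. */
typedef struct {
    int  file_id;
    long offset;
    long length;
} read_request;

/* Order by file first, then by offset within the file. */
static int compare_requests(const void *a, const void *b)
{
    const read_request *x = a;
    const read_request *y = b;
    if (x->file_id != y->file_id)
        return (x->file_id > y->file_id) - (x->file_id < y->file_id);
    return (x->offset > y->offset) - (x->offset < y->offset);
}

/* Reorder a batch so each file is read front to back. */
void presort_requests(read_request *batch, size_t count)
{
    assert(batch != NULL || count == 0);
    if (count < 2)
        return;
    qsort(batch, count, sizeof *batch, compare_requests);
}

/* Count adjacent requests that continue exactly where the previous
 * one ended in the same file: a crude measure of sequential access. */
size_t count_sequential_pairs(const read_request *batch, size_t count)
{
    size_t pairs = 0;
    for (size_t i = 1; i < count; i++) {
        const read_request *prev = &batch[i - 1];
        const read_request *next = &batch[i];
        if (prev->file_id == next->file_id &&
            prev->offset + prev->length == next->offset)
            pairs++;
    }
    return pairs;
}
```

An application layer that sees the whole batch can make this kind of decision; the storage system, which only sees one request at a time, cannot.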
All developers are:
(artists | engineers | brilliant | clueless )
• There is no one right way to build a piece of software
• Heterogeneous development staff builds heterogeneous
solutions
• What about UI / UX ( User Interface / User Experience )
• Business workflows should drive UI / UX
• Steve Jobs was tyrannical about every detail fitting into his one
overarching product vision
Who are we ?
No sacred cows
Key point #2
Software developers shape the choice of
architecture and available tools
2 Companies with really “Big Data”
• $50 Billion in revenue
• 30,000 + employees
• Optimization throughout entire stack
• Google File System, operating system, Chrome
• 2,000,000 servers
• Free food to keep their developers working long
hours
Google
• Pluto switch
Google tools
• Google Hangouts - collaboration
• Google Maps
• Google Compute Engine
• Google BigQuery
• $1 Billion data center in Iowa
• 450,000 servers
• API first development strategy
• Supports connectivity from multiple interfaces using
“RESTful” applications
• Compete with UI / UX
• Creates user lock-in through iterative conditioning
Iterative conditioning
• Workflows are hard to learn
• You should need software training to learn how to use software
• Software fatigue
• Switching cost
• Adoption rates
• Advanced features
• Tracking all of this and dynamic menus and configuration
Facebook tools and contributions
• Apache Cassandra ( Big data database, linear
scalability )
• Apache Thrift ( cross language services )
Architecture choices provide insight … still have to
implement for specifics of Oil and Gas
Open Source Licensing
• MIT X11 License – nearly ANY use permissible ( keep the copyright notice )
• BSD – very similar to MIT X11
• GPL – linking generally puts the combined work under the GPL when distributed
• LGPL – linking allowed
• Appliances – ethical versus legal
Must read the fine print before using, but can save a very large amount
of time by using these frameworks and implementations where
possible
Key point #3
Internet companies have innovation at scale
Using REST architecture to go FAST
Representational State Transfer
• 6 constraints
• Client Server – clients are not concerned with data storage
• Stateless – server does not store client context
• Cacheable – client stores responses
• Layered system – client does not know if it is at end server or intermediary
• Optional code on demand – client downloads code and runs
• Uniform interface – decouples interface and allows each part to evolve
independently
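Two of these constraints, stateless and cacheable, can be sketched in a few lines of C. This is an illustration under stated assumptions, not a real server: the resource path `/volumes/42`, its version tag `v7`, and the names `rest_request`, `rest_response`, and `handle_request` are all hypothetical. The handler reads only what arrives in the request (no per-client session), and it answers 304 Not Modified when the client's cached version tag is still current.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Everything the server needs arrives in the request itself (stateless). */
typedef struct {
    const char *method;         /* e.g. "GET" */
    const char *path;           /* e.g. "/volumes/42" (hypothetical) */
    const char *if_none_match;  /* version tag the client cached, or NULL */
} rest_request;

typedef struct {
    int         status;  /* HTTP status code */
    const char *etag;    /* current version tag of the resource, or NULL */
    const char *body;    /* representation, or NULL when none is sent */
} rest_response;

/* One hard-coded resource stands in for the database and file servers. */
static const char *RESOURCE_PATH = "/volumes/42";
static const char *RESOURCE_ETAG = "v7";
static const char *RESOURCE_BODY = "{\"id\":42,\"name\":\"demo volume\"}";

rest_response handle_request(const rest_request *req)
{
    assert(req != NULL && req->method != NULL && req->path != NULL);
    rest_response resp = { 404, NULL, NULL };

    if (strcmp(req->method, "GET") != 0) {
        resp.status = 405;           /* Method Not Allowed */
        return resp;
    }
    if (strcmp(req->path, RESOURCE_PATH) != 0)
        return resp;                 /* 404 Not Found */

    resp.etag = RESOURCE_ETAG;
    if (req->if_none_match != NULL &&
        strcmp(req->if_none_match, RESOURCE_ETAG) == 0) {
        resp.status = 304;           /* Not Modified: client cache is valid */
        return resp;
    }
    resp.status = 200;               /* OK: send the full representation */
    resp.body = RESOURCE_BODY;
    return resp;
}
```

Because the handler keeps no client context, any server behind a load balancer or intermediary could answer the same request, which is what makes the layered-system constraint workable.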
Simplified REST
Web Browser Web Server
Database
File Servers
Presentation Layer
can’t handle
Geoscience or
local compute
Web server has the
majority of control
Compute Engine
REST API
REST with Mashup
Web Browser Web Server 1
Database
File Servers
Presentation Layer
can mashup data
from 2 separate
sources
Compute Engine
Web Server 2
REST API
REST with new application layer
Form window Application
Database
File Servers
Compute Engine
Web Server 2
REST API
OpenGL Window
Web Browser
Internet architecture / legacy style code
• REST Architecture for NON – INTERNET
applications
• Can keep inside corporate networks
• Distributed systems architecture
• Predominant web API design model
• Allows for distributed development team
• Separate data model from view model
• But allows for computation on either side
Software Demo
Client Server
• FINALLY !! Interactive HPC apps made easy
• Our tabs are the client's connection to the application
layer via a “REST”-style API
• Application layer provides caching and file system
access
• Application layer provides access to heterogeneous
compute
Stateless
• Each tab does not know about other tabs
• This lets developers from different teams and
disciplines start quickly and work
independently
• Application layer provides synchronization states
• Application layer provides for off-workstation
transferability ( work from iPad on the Beach )
Cacheable
• Heuristic data sorting and precaching based on user /
algorithm needs
• Allows for compute distribution without presentation layer
needing to know
• Allows for disparate file systems
• Abstracts data location from user
• Communicate with HPC grid in more advanced manner
Layered System
• Allows for use of 3rd party plugins
• Allows EVERY application to connect to the HPC grid
• Graphics as plugins
• Workflows as plugins - dynamic workflow
• No menu on Amazon
• Optimize each layer independently
Code on demand
• Safer since security is controlled by application layer
• Sandbox each user and only give access with additional security
credentials
• Can download and run legacy code through P/Invoke
• DLL injection
Uniform Interface
• HTML for cross platform consistency
• User adoption and ease of use
• Internet style decoupling of functionality from
graphics creates a better user experience and more
intuitive style workflow
• Most graphic designers do NOT know C++
• Geoscientists won’t always agree on color scheme,
styles, icons
Most important benefits
• More flexibility means rapid application development and easier
maintenance
• Presentation layer needs change as business requirements
change over time
• Hooking into outside tools that have REST APIs
• Data
• Social
• Compute engines
• Mash ups
Key point #4
A REST architecture enables scalability,
extensible development, and mashup of
tools and ideas created for the Internet
Interesting Technologies for Big Data
Google BigQuery
• Underlying technology is called Dremel
• Uses the Google File System as the abstraction for the database
• Dremel can even execute a complex regular expression text matching on a huge
logging table that consists of about 35 billion rows and 20TB, in merely tens of
seconds
Cassandra
• Cassandra provides a structured key-value store with tunable
consistency.
• Keys map to multiple values, which are grouped into column families.
The column families are fixed when a Cassandra database is created,
but columns can be added to a family at any time.
• Furthermore, columns are added only to specified keys, so different
keys can have different numbers of columns in any given family.
• The values from a column family for each key are stored together.
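The data model described above can be pictured with a toy in-memory structure. The sketch below is only an illustration of the idea that different keys in one column family can carry different columns. It is not Cassandra's API or storage format, and every name and limit in it (`column_family`, `put_column`, `MAX_COLUMNS`, and so on) is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_COLUMNS 8   /* illustrative limits, not Cassandra's */
#define MAX_ROWS    4

typedef struct {
    const char *name;
    const char *value;
} column;

typedef struct {
    const char *key;
    column      columns[MAX_COLUMNS];
    int         column_count;   /* may differ from key to key */
} row;

typedef struct {
    const char *family_name;    /* fixed when the store is created */
    row         rows[MAX_ROWS];
    int         row_count;
} column_family;

/* Strings are stored by pointer, so they must outlive the family. */
void family_init(column_family *cf, const char *family_name)
{
    assert(cf != NULL && family_name != NULL);
    memset(cf, 0, sizeof *cf);
    cf->family_name = family_name;
}

/* Return the row stored under `key`, or NULL if the key is unknown. */
static row *find_row(column_family *cf, const char *key)
{
    for (int i = 0; i < cf->row_count; i++)
        if (strcmp(cf->rows[i].key, key) == 0)
            return &cf->rows[i];
    return NULL;
}

/* Set one column on one key; other keys are untouched.
 * Returns 0 on success, -1 when the toy store is out of room. */
int put_column(column_family *cf, const char *key,
               const char *name, const char *value)
{
    row *r = find_row(cf, key);
    if (r == NULL) {
        if (cf->row_count == MAX_ROWS)
            return -1;
        r = &cf->rows[cf->row_count++];
        r->key = key;
        r->column_count = 0;
    }
    for (int i = 0; i < r->column_count; i++) {
        if (strcmp(r->columns[i].name, name) == 0) {
            r->columns[i].value = value;   /* overwrite existing column */
            return 0;
        }
    }
    if (r->column_count == MAX_COLUMNS)
        return -1;
    r->columns[r->column_count].name = name;
    r->columns[r->column_count].value = value;
    r->column_count++;
    return 0;
}

/* Look up a column value; NULL if that key has no such column. */
const char *get_column(column_family *cf, const char *key, const char *name)
{
    row *r = find_row(cf, key);
    if (r == NULL)
        return NULL;
    for (int i = 0; i < r->column_count; i++)
        if (strcmp(r->columns[i].name, name) == 0)
            return r->columns[i].value;
    return NULL;
}
```

Keeping all columns of a key inside one `row` mirrors the last bullet: the values from a column family for each key sit together.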
Palantir
• Does work for government agencies
• High security layer that sits on top of disparate data sources
• The Palantir Stack Layer
• Brings together structured and unstructured data
• Serves as foundation for applications using the data API
• Search and discovery layer
• Granular multi layered security model
• Revisioning database and original source tracking
• Collaboration and data editing
Ayasdi
• Topological data analysis using machine learning
• Can cross analyze multiple data
sources
• Query free approach
Zoom Data
• Automated connectivity to third party sources
• Visualization studio
• Interactive visualizations
WebGL ( Open GL in web browser )
• Could be used for presentation layer in mobile device
http://demos.vicomtech.org/x3dom/test/functional/volrenShaderBoundaryEnh.xhtml
http://ourbricks.com/viewer/178d62ac29aa44459a6d57ce474fa6b6
Key point #5
Connect to these and other tools using REST
Questions ?
john@advancedseismic.com
832.544.7305

Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 

HPC lunch and learn

  • 1. IDEAS for thought SHPC lunch and learn JULY 25, 2013
  • 2. John D. Almon • Full stack software engineer • Implemented RTM on GPU using MPI • Implemented cloud-based WEM using SOA • Terabyte-scale database design and data warehousing • Architected hybrid web interpretation and processing system • C++, Java, MPI, C, Oracle PL/SQL, HTML, web-based systems, XML • Managed software team • Currently serves as CEO of Advanced Seismic Technologies
  • 4. Small HPC setup - Guess what company • Fiber optic to every desktop using HPC grid • 400 terabytes of storage • 300 x 10 GbE ports • 1500 x 1 GbE ports • Desktop workstations automatically added to HPC grid after hours • 5,000 AMD processors + 3,000 desktop processors at night
  • 5.
  • 6.
  • 7. Monsters University • 100 Million CPU hours • 5.5 million individual hairs • 127 simulated garments • Global illumination ray tracing
  • 8. Key point #1 Perhaps we can learn new techniques from other industries that operate at scale
  • 10. Bimodal Distribution of Developers This shapes Architecture and Design Innovation Loosely coupled code Fast hardware Open source Closely coupled code Slow hardware More optimization Geoscience Gap Massive hardware changes
  • 11. Better compilers and cheaper hardware have changed everything about software development • No more Fortran ( sort of ) • Object-oriented approach • Teenage internet billionaires
  • 12. Software access patterns affect memory speed ( affected by data and users ) Word Size Affects Memory Bandwidth Temporal Locality & Spatial Locality Can affect bandwidth
  • 13. Memory Mountain software code /* Iterate over first "elems" elements of array "data" with stride of * "stride". */ void test(int elems, int stride) { int i; double result = 0.0; volatile double sink; for (i = 0; i < elems; i += stride) result += data[i]; sink = result; /* So compiler doesn't optimize away the loop */ }
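The slide's loop depends on a global `data` array that is not shown. A minimal self-contained sketch follows, assuming a 1M-element array of doubles and using `clock()` for timing. The names `MAX_ELEMS`, `init_data`, `sum_strided`, and `throughput_mb_per_s` are illustrative and not from the original deck.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define MAX_ELEMS (1 << 20)    /* assumed ceiling: 1M doubles (8 MB) */

static double data[MAX_ELEMS]; /* the global array the slide's test() reads */

/* Fill the array with ones so the expected sum is easy to check. */
void init_data(void)
{
    for (size_t i = 0; i < MAX_ELEMS; i++)
        data[i] = 1.0;
}

/* Same loop as the slide, but returning the sum instead of writing a sink. */
double sum_strided(int elems, int stride)
{
    assert(elems >= 0 && elems <= MAX_ELEMS && stride > 0);
    double result = 0.0;
    for (int i = 0; i < elems; i += stride)
        result += data[i];
    return result;
}

/* Approximate read throughput in MB/s for one (elems, stride) point.
 * clock() is coarse, so this returns 0.0 when the run is too short to time. */
double throughput_mb_per_s(int elems, int stride, int repeats)
{
    volatile double sink = 0.0;  /* keeps the compiler from dropping the loop */
    sum_strided(elems, stride);  /* warm up the cache */
    clock_t start = clock();
    for (int r = 0; r < repeats; r++)
        sink += sum_strided(elems, stride);
    double secs = (double)(clock() - start) / CLOCKS_PER_SEC;
    (void)sink;
    double bytes = (double)repeats * ((elems + stride - 1) / stride) * sizeof(double);
    return secs > 0.0 ? bytes / secs / 1e6 : 0.0;
}
```

Calling `throughput_mb_per_s` over a grid of working-set sizes and strides gives the surface usually plotted as the memory mountain. Small working sets and small strides should land in the fastest caches, though the exact shape depends on the machine.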
  • 14. Everything is a cache ( memory hierarchy ) • Register, ~2ns • Primary cache, ~4-5ns • Secondary cache, ~30ns • Main memory, ~22ns • Magnetic Disk, ~3ms • SSD, ~100µs • File server on Gigabit Ethernet • Cloud • Bottleneck is the memory bus • Bottleneck is the network
  • 15. New Paradigm for Optimization of Compute at Cluster / Cloud level • Pre sorting / caching of data for maximum throughput • Heuristic analysis at the application level • Optimization of hardware resources determined by the application • Hardware switching based on access patterns of application and user
  • 16. All developers are: (artists | engineers | brilliant | clueless ) • There is no one right way to build a piece of software • Heterogeneous development staff builds heterogeneous solutions • What about UI / UX ( User Interface / User Experience ) • Business workflows should drive UI / UX • Steve Jobs was tyrannical about every detail fitting into his one overarching product vision
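One way to see why every tier acts as a cache for the one below it is to compute an expected access time. The sketch below uses a deliberately simple model: each request is served by exactly one tier, and the cost of checking the tiers it missed is ignored. The function name and the hit-rate inputs are assumptions for illustration, not figures from the slide.

```c
#include <assert.h>

/* Expected access time for a hierarchy of `levels` tiers.
 * latency_ns[i] : cost of an access served by tier i
 * hit_rate[i]   : fraction of requests reaching tier i that it serves;
 *                 the last tier is treated as always hitting. */
double expected_latency_ns(const double *latency_ns, const double *hit_rate, int levels)
{
    assert(levels > 0);
    double reach = 1.0;  /* probability that a request gets this far down */
    double total = 0.0;
    for (int i = 0; i < levels; i++) {
        double h = (i == levels - 1) ? 1.0 : hit_rate[i];
        assert(h >= 0.0 && h <= 1.0);
        total += reach * h * latency_ns[i];
        reach *= 1.0 - h;
    }
    return total;
}
```

With the slide's figures, a tier at ~30 ns that serves 99% of requests, backed by a ~3 ms disk for the remaining 1%, averages roughly 30 µs per access under this model, so the rare slow accesses dominate.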
  • 19.
  • 20. Key point #2 Software developers shape the choice of architecture and available tools
  • 21. 2 Companies with really “Big Data”
  • 22.
  • 23. • $50 Billion in revenue • 30,000 + employees • Optimization throughout entire stack • Google Filesystem, Operating System, CHROME • 2,000,000 servers • Free food to keep their developers working long hours
  • 25. Google tools • Google Hangouts - collaboration • Google Maps • Google Compute Engine • Google BigQuery
  • 26.
  • 27. • $1 Billion data center in Iowa • 450,000 servers • API-first development strategy • Supports multiple interface connectivity using “RESTful” applications • Compete with UI / UX • Creates user lock-in through iterative conditioning
  • 28. Iterative conditioning • Workflows are hard to learn • You should need software training to learn how to use software • Software fatigue • Switching cost • Adoption rates • Advanced features • Tracking all of this and dynamic menus and configuration
  • 29. Facebook tools and contributions • Apache Cassandra ( Big data database, linear scalability ) • Apache Thrift ( cross language services ) • Architecture choices provide insight … still have to implement for specifics of Oil and Gas
  • 30. Open Source Licensing • MIT X11 License – ANY use permissible • BSD – Identical to MIT X11 • GPL – no linking • LGPL – linking allowed • Appliances – ethical / versus legal • Read the fine print before using them, but these frameworks and implementations can save a very large amount of time where they apply
  • 31. Key point #3 Internet companies have innovation at scale
  • 33.
  • 34. Representational State Transfer • 6 constraints • Client Server – clients are not concerned with data storage • Stateless – server does not store client context • Cacheable – client stores responses • Layered system – client does not know if it is at end server or intermediary • Optional code on demand – client downloads code and runs • Uniform interface – decouples interface and allows each part to evolve independently
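The stateless constraint is the easiest of the six to show in code. In the rough sketch below, the handler's output depends only on the request it is given, so any server instance can answer any call. The `Request` and `Response` types, their fields, and the resource path are hypothetical and not part of any real framework.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical request: everything the server needs travels with the call. */
typedef struct {
    const char *method;    /* "GET", "PUT", ... */
    const char *resource;  /* e.g. "/volumes/42/slice" */
    int         user_id;   /* client identity, sent by the client every time */
    int         slice;     /* client-side view state, also sent every time */
} Request;

typedef struct {
    int  status;           /* HTTP-like status code */
    char body[128];
} Response;

/* Stateless: the result depends only on the request, never on earlier calls. */
Response handle(const Request *req)
{
    assert(req && req->method && req->resource);
    Response res = { 0, "" };
    if (strcmp(req->method, "GET") != 0) {
        res.status = 405;
        snprintf(res.body, sizeof res.body, "method not allowed");
        return res;
    }
    res.status = 200;
    snprintf(res.body, sizeof res.body, "user=%d resource=%s slice=%d",
             req->user_id, req->resource, req->slice);
    return res;
}
```

Because `handle` keeps no static or global state, repeating a request gives the same response regardless of what was called in between. This is also what makes responses safe to cache on the client.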
  • 36. Simplified REST Web Browser Web Server Database File Servers Presentation Layer can’t handle Geoscience or local compute Web server has the majority of control Compute Engine REST API
  • 37.
  • 38. REST with Mashup Web Browser Web Server 1 Database File Servers Presentation Layer can mashup data from 2 separate sources Compute Engine Web Server 2 REST API
  • 39. REST with new application layer Form window Application Database File Servers Compute Engine Web Server 2 REST API OpenGLWindow Web Browser
  • 40. Internet architecture / legacy style code • REST architecture for non-Internet applications • Can keep inside corporate networks • Distributed systems architecture • Predominant web API design model • Allows for distributed development team • Separate data model from view model • But allows for computation on either side
  • 42. Client Server • FINALLY !! Interactive HPC apps made easy • Our tabs are the client's connection to the application layer via a “REST” style API • Application layer provides caching and file system access • Application layer provides access to heterogeneous compute
  • 43. Stateless • Each tab does not know about other tabs • This lets developers from different teams and disciplines quickly work independently • Application layer provides synchronization states • Application layer provides for off-workstation transferability ( work from iPad on the Beach )
  • 44. Cacheable • Heuristic data sorting and precaching based on user / algorithm needs • Allows for compute distribution without presentation layer needing to know • Allows for disparate file systems • Abstracts data location from user • Communicate with HPC grid in more advanced manner
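Precaching based on an access heuristic can be sketched with a tiny block cache. The version below assumes the simplest possible heuristic, that users mostly read blocks sequentially, and a direct-mapped cache of eight slots. `BlockCache`, `cache_read`, and `load_block` are illustrative names, and `load_block` stands in for a slow fetch from a file server or the grid.

```c
#include <assert.h>

#define CACHE_SLOTS 8          /* assumed tiny cache, direct-mapped by block id */

typedef struct {
    long block[CACHE_SLOTS];   /* block id held in each slot, -1 if empty */
    long hits, misses;
} BlockCache;

void cache_init(BlockCache *c)
{
    for (int i = 0; i < CACHE_SLOTS; i++)
        c->block[i] = -1;
    c->hits = c->misses = 0;
}

/* Stand-in for a slow fetch from a file server or the HPC grid. */
static void load_block(BlockCache *c, long id)
{
    c->block[id % CACHE_SLOTS] = id;
}

/* Returns 1 on a hit, 0 on a miss. Either way the next block is preloaded,
 * because the heuristic assumes mostly sequential access. */
int cache_read(BlockCache *c, long id)
{
    assert(c && id >= 0);
    int hit = (c->block[id % CACHE_SLOTS] == id);
    if (hit) {
        c->hits++;
    } else {
        c->misses++;
        load_block(c, id);
    }
    load_block(c, id + 1);     /* precache what is likely to be asked for next */
    return hit;
}
```

The caller only ever sees `cache_read`, so where the data lives and how it was fetched stay hidden from the presentation layer. A real system would replace the sequential guess with whatever pattern the user or algorithm actually shows.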
  • 45. Layered System • Allows for use of 3rd party plugins • Allows EVERY application to connect to HPC grid • Graphics as plugins • Workflows as plugins - dynamic workflow • No menu on Amazon • Optimize each layer independently
  • 46. Code on demand • Safer since security is controlled by application layer • Sandbox each user and only give access with additional security credentials • Can download and run legacy code through P/Invoke • DLL injection
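A plugin layer in C usually comes down to a table of function pointers behind a stable lookup call. The sketch below is one minimal way to do that, not a description of the presenter's system. The `PluginFn` signature, the `Registry` type, and the two toy plugins are all assumptions.

```c
#include <assert.h>
#include <string.h>

#define MAX_PLUGINS 16

/* Hypothetical plugin signature: takes a sample value, returns a processed one. */
typedef double (*PluginFn)(double input);

typedef struct {
    const char *name;
    PluginFn    run;
} Plugin;

typedef struct {
    Plugin slots[MAX_PLUGINS];
    int    count;
} Registry;

/* Returns 0 on success, -1 if the registry is full. */
int registry_add(Registry *r, const char *name, PluginFn fn)
{
    assert(r && name && fn);
    if (r->count >= MAX_PLUGINS)
        return -1;
    r->slots[r->count].name = name;
    r->slots[r->count].run = fn;
    r->count++;
    return 0;
}

/* Looks a plugin up by name; returns NULL when nothing matches. */
PluginFn registry_find(const Registry *r, const char *name)
{
    assert(r && name);
    for (int i = 0; i < r->count; i++)
        if (strcmp(r->slots[i].name, name) == 0)
            return r->slots[i].run;
    return NULL;
}

/* Two toy plugins standing in for third-party code. */
double gain_x2(double v)   { return v * 2.0; }
double clip_unit(double v) { return v > 1.0 ? 1.0 : (v < -1.0 ? -1.0 : v); }
```

The layer above only knows plugin names, so a workflow can be assembled at run time from whatever has been registered, and each plugin can be optimized or replaced without touching its callers.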
  • 47. Uniform Interface • HTML for cross platform consistency • User adoption and ease of use • Internet style decoupling of functionality from graphics creates a better user experience and more intuitive style workflow • Most graphic designers do NOT know C++ • Geoscientists won’t always agree on color scheme, styles, icons
  • 48. Most important benefits • More flexibility means rapid application development and easier maintenance • Presentation layer needs change as business requirements change over time • Hooking into outside tools that have REST APIs • Data • Social • Compute engines • Mash ups
  • 49. Key point #4 A REST architecture enables scalability, extensible development, and mashup of tools and ideas created for the Internet
  • 51.
  • 52. Google BigQuery • Underlying technology is called Dremel • Uses Google File System as abstraction for database • Dremel can even execute a complex regular expression text matching on a huge logging table that consists of about 35 billion rows and 20TB, in merely tens of seconds
  • 53. Cassandra • Cassandra provides a structured key-value store with tunable consistency. • Keys map to multiple values, which are grouped into column families. The column families are fixed when a Cassandra database is created, but columns can be added to a family at any time. • Furthermore, columns are added only to specified keys, so different keys can have different numbers of columns in any given family. • The values from a column family for each key are stored together.
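The row model described on this slide, where each key carries its own set of columns within a family, can be pictured with a small in-memory structure. This is only a sketch of the data shape. It says nothing about how Cassandra stores or replicates data, and the `Row` and `Column` types and the sample keys are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define MAX_COLS 8

typedef struct { const char *name; const char *value; } Column;

/* One row of one column family: a key plus however many columns it has. */
typedef struct {
    const char *key;
    Column      cols[MAX_COLS];
    int         ncols;
} Row;

/* Adds or overwrites a column on this row only. Returns 0, or -1 if full. */
int row_put(Row *row, const char *name, const char *value)
{
    assert(row && name && value);
    for (int i = 0; i < row->ncols; i++) {
        if (strcmp(row->cols[i].name, name) == 0) {
            row->cols[i].value = value;
            return 0;
        }
    }
    if (row->ncols >= MAX_COLS)
        return -1;
    row->cols[row->ncols].name = name;
    row->cols[row->ncols].value = value;
    row->ncols++;
    return 0;
}

/* Returns the column's value, or NULL if this row has no such column. */
const char *row_get(const Row *row, const char *name)
{
    assert(row && name);
    for (int i = 0; i < row->ncols; i++)
        if (strcmp(row->cols[i].name, name) == 0)
            return row->cols[i].value;
    return NULL;
}
```

Adding an `operator` column to one well's row leaves every other row untouched, which is the "different keys can have different numbers of columns" property in miniature.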
  • 54. Palantir • Does work for government agencies • High security layer that sits on top of disparate data sources • The Palantir Stack Layer • Brings together structured and unstructured data • Serves as foundation for applications using the data API • Search and discovery layer • Granular multilayered security model • Revisioning database and original source tracking • Collaboration and data editing
  • 55. Ayasdi • Topological data analysis using machine learning • Can cross analyze multiple data sources • Query free approach
  • 56. Zoom Data • Automated connectivity to third party sources • Visualization studio • Interactive visualizations
  • 57. WebGL ( OpenGL in web browser ) • Could be used for presentation layer in mobile device http://demos.vicomtech.org/x3dom/test/functional/volrenShaderBoundaryEnh.xhtml http://ourbricks.com/viewer/178d62ac29aa44459a6d57ce474fa6b6
  • 58. Key point #5 Connect to these and other tools using REST