Computational Science and Engineering (CSE) @ Berkeley
The Emergence of Computation for Interdisciplinary Large Data
Inspired by Science · Bounded by our Imagination · Innovation through Technology · Create Social Impact
Masoud Nikravesh @ CITRIS and LBNL
CITRIS Director for CSE
Executive Director, DE-CSE @ Berkeley
http://cse.berkeley.edu/ http://cloud.citris-uc.org/
http://citris-uc.org/ http://www.lbl.gov/cs
Health, Freshwater, Food Security, Ecosystems, and Urban Metabolism
Outline of Talk
• Drivers for Change: Computing and Big Data
• Computational Science and Engineering
• State Leadership
• California: “The Golden State”
• The State New Economy Model
• “Sustainable California”: a return to “The Golden State”
Drivers for Change
• Continued exponential increase in computational power → simulation (computing) is becoming the third pillar of science, complementing theory (analytics and math) and experiment (applications)
[Diagram: CSE at the intersection of Applications, HPC-Cloud Computing, and Analytics/Math]
High performance computing (HPC), large-scale simulations, and scientific applications all play a central role in CSE.
• The HPC/cloud computing initiative and next-generation data center
• Extreme simulation, visual-data analytics, data-enabled scientific discovery
• Real-world complex applications (scientific, engineering, social, economic, policy) using future multi-core parallel computing (e.g., E-Informatics, Earthquake Early Warning, NextGenMaps, Genome Atlas, Genetic Facebook, Genomics Browser)
HPC petascale and exascale systems are an indispensable tool for exploring the frontiers of science and technology for social impact.
Revolution is Happening Now
• Chip density continues to increase ~2x every 2 years
• Clock speed does not
• Number of processor cores may double instead
• There is little or no more hidden parallelism (ILP) to be found
• Parallelism must be exposed to and managed by software
Source: Intel, Microsoft (Sutter) and Stanford (Olukotun, Hammond)
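The doubling rule above can be turned into a quick projection. The sketch below is illustrative only: the function name and the assumption of a clean two-year doubling period are introduced here, and real chips will not follow the rule exactly.

```python
def growth_factor(years, doubling_period_years=2.0):
    """Multiplicative growth after `years`, assuming one doubling per period."""
    return 2.0 ** (years / doubling_period_years)

# With clock speed flat, the extra transistors show up as more cores.
for years in (2, 4, 10):
    print(f"after {years:>2} years: ~{growth_factor(years):.0f}x density (or cores)")
```

Under this assumption a decade gives roughly a 32x increase, which software has to absorb as explicit parallelism rather than as faster single-thread execution.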
Computing Growth is Not Just an HPC Problem
[Figure: Microprocessor Performance “Expectation Gap” over Time (1985-2020 projected). Horizontal axis: Year of Introduction, 1985 to 2020. Vertical axis: performance, log scale from 10 to 1,000,000.]
New Processors Mean New Software
• Exascale will have chips with thousands of tiny processor cores, and a few large ones
• Architecture is an open question:
  - Sea of embedded cores with heavyweight “service” nodes
  - Lightweight cores are accelerators to CPUs
• Autotuning eases code generation for new architectures
[Figure: power split among interconnect, memory, and processors. Server processors: 130 Megawatts. Manycore processors: 75 Megawatts.]
Source: Kathy Yelick
New Memory and Network Technology to Lower Energy
• Memory is as important as processors in energy
• Latency is physics, bandwidth is money
• Software-managed memory or cache hybrids
• Autotuning has helped with that management
• Need to raise the level of autotuning to higher-level kernels
[Figure: power split among interconnect, memory, and processors. Usual memory + network: 75 Megawatts. New memory + network: 25 Megawatts.]
Source: Kathy Yelick
TOP500 Sites, June 2011
Today, HPC petascale (and soon exascale) systems are not just a tool of choice; they are becoming an indispensable tool at the frontiers of science and technology for social impact.
Petaflop with ~1M Cores in your PC by 2025?
[Figures: TOP10 Sites for June 2010, November 2010, and June 2011]
[Figure: TOP500 Sites, June 2011, performance trend. Annotations: 8-10 years; 6-8 years.]
Energy Cost Challenge for Computing Facilities
[Figure: facility power over time, 2005 to 2020, “usual scaling” vs. goal]
At ~$1M per MW, energy costs are substantial
• 1 petaflop in 2010 will use 3 MW
• 1 exaflop in 2018 possible in 200 MW with “usual” scaling
• 1 exaflop in 2018 at 20 MW is DOE target
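The figures above can be compared directly. This is a rough sketch: it assumes the ~$1M per MW rate is an annual cost (the slide does not say), and the helper names are illustrative.

```python
# Assumption: "$1M per MW" is read as dollars per megawatt per year.
COST_PER_MW_YEAR_USD = 1_000_000

def annual_energy_cost(megawatts, cost_per_mw_year=COST_PER_MW_YEAR_USD):
    """Yearly energy bill in US dollars for a facility drawing `megawatts`."""
    return megawatts * cost_per_mw_year

def gflops_per_watt(flops, megawatts):
    """Energy efficiency implied by a sustained rate and a power draw."""
    return (flops / 1e9) / (megawatts * 1e6)

# (label, flop/s, MW) taken from the slide
scenarios = [
    ("1 petaflop, 2010", 1e15, 3),
    ("1 exaflop, 2018, usual scaling", 1e18, 200),
    ("1 exaflop, 2018, DOE target", 1e18, 20),
]
costs = {}
for label, flops, mw in scenarios:
    costs[label] = annual_energy_cost(mw)
    print(f"{label}: ${costs[label] / 1e6:.0f}M per year, "
          f"{gflops_per_watt(flops, mw):.2f} Gflop/s per watt")
```

Read this way, the DOE target implies about 50 Gflop/s per watt, roughly 150 times the efficiency of the 2010 petaflop system.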
New Processor Designs are Needed to Save Energy
• Server processors have been designed for performance, not energy
• Graphics processors are 10-100x more efficient
• Embedded processors are 100-1000x more efficient (1.25 rather than 100 watts)
• Need manycore chips with thousands of cores
Cell phone processor (0.1 Watt, 4 Gflop/s)
Server processor (100 Watts, 50 Gflop/s)
Source: Kathy Yelick, HPC-SEG July 2011
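The two example processors can be compared with simple arithmetic. One plausible reading of the “1.25 rather than 100 watts” figure, which is an interpretation and not stated on the slide, is the power needed by enough cell phone processors to match one server processor. The sketch ignores memory and interconnect overheads.

```python
def gflops_per_watt(gflops, watts):
    """Peak throughput delivered per watt."""
    return gflops / watts

cell_phone = {"gflops": 4.0, "watts": 0.1}   # figures from the slide
server = {"gflops": 50.0, "watts": 100.0}    # figures from the slide

cell_eff = gflops_per_watt(**cell_phone)
server_eff = gflops_per_watt(**server)
ratio = cell_eff / server_eff

# How many cell phone processors match one server processor, and at what power?
chips_needed = server["gflops"] / cell_phone["gflops"]
power_needed_watts = chips_needed * cell_phone["watts"]

print(f"cell phone: {cell_eff:.1f} Gflop/s per watt")
print(f"server:     {server_eff:.1f} Gflop/s per watt")
print(f"efficiency ratio: {ratio:.0f}x")
print(f"{chips_needed} cell phone chips match one server at {power_needed_watts:.2f} W")
```

For these two data points the ratio comes out near 80x, close to the lower end of the 100-1000x range quoted for embedded processors.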
Motif/Dwarf: Common Computational Methods
(Red Hot → Blue Cool)
[Heat map: rows are the motifs listed below; columns are the application areas Embed, SPEC, DB, Games, ML, HPC, Health, Image, Speech, Music, Browser]
1 Finite State Mach.
2 Combinational
3 Graph Traversal
4 Structured Grid
5 Dense Matrix
6 Sparse Matrix
7 Spectral (FFT)
8 Dynamic Prog
9 N-Body
10 MapReduce
11 Backtrack/ B&B
12 Graphical Models
13 Unstructured Grid
What do commercial and CSE applications have in common?
Source: Jim Demmel, Berkeley Parlab
CPU, GPU, Hybrid, FPGA?
Source: Oliver Pell, HPC-SEG July 2011, Berkeley

Comparison of x86 multicores, GPU, and FPGA

Numbers
x86 multicores:
- Current generation: 4–6 cores/CPU x 2 CPUs/node = 8–12 cores/node
- Future generation: 16–20 cores/CPU x 4 CPUs/node = 64–80 cores/node
GPU:
- 512 cores/GPU (Nvidia)
- 1600 cores/GPU (AMD)
FPGA:
- No more cores but BRAM, look-up tables, flip-flops, etc.
- Clock frequency is on the order of hundreds of MHz
- Memory per card is on the order of tens of GB

What is the easy part?
x86 multicores:
- Well known and mature technology
- Well established development environments
- Parallelism between cores and nodes
GPU:
- Well known technology (for gaming purposes)
- It is also becoming reliable for HPC computation
FPGA:
- High performance-per-watt ratio

What is difficult to do?
x86 multicores:
- Linear speedup with increasing core numbers
GPU:
- CUDA: good tool but proprietary
- OpenCL: open technology but not yet standard and more complex to use
- Development tools (plus profiling, debugging, etc.) not yet fully available
FPGA:
- Non-standard development tools (VHDL is not for geophysicists… but we have MaxCompiler!)
- Data streaming technology is different from standard approaches (grid/matrix)

Main problems
x86 multicores:
- Slow memory access
- Legacy codes need to be re-engineered in order to get the best performance (e.g. SSE vectorization, cache blocking)
- Network connections have to be optimized for the architecture
GPU:
- Limited amount of memory (4–6 GB) per card
- Slow communication with the host CPU (due to PCI Express)
- Internal bandwidth is not always enough
FPGA:
- The technology is not yet standard for HPC
- Slow communication with the host CPU (due to PCI Express)

Source: Carlo Tomas, HPC-SEG, July 2011, Berkeley
A Likely Trajectory - Collision or Convergence?
[Figure: CPU and GPU trajectories plotted by parallelism (multi-threading, multi-core, many-core) and programmability (fixed function, partially programmable, fully programmable), meeting at a future processor by 2012?]
after Justin Rattner, Intel, ISC 2008
Drivers for Change
• Continued exponential increase in experimental, simulation, sensor, and social data → techniques and technology in data analysis, visualization, analytics, networking, and collaboration tools are becoming essential in all data-rich applications
[Diagram: Big Data linked to Model, Human, and Analytic Tools. Labels: Experts, Citizen Cyber Science, Crowdsourcing; First Principles, Hybrid Models; Google, IBM Watson, IBM Cognitive Model, Boeing 747 Simulation, Protein Folding, Amazon AI-Image.]
[Inset: Socio-Economic Modeling for Large-scale Quantitative Climate/Environmental Change Analysis. Data scale grows from tera to peta to exa as climate/environmental detail and socio-economic detail increase. En Informatics: Environment-Genetic.]
World Population: today ~6B, 2050 ~9B, 2100 ~10B
70% will live in cities by 2050
By 2020: 35 trillion gigabytes of data (the cyber-physical world is connected through billions to even trillions of sensors and devices)
Health, Freshwater, Food Security, Ecosystems, and Urban Metabolism
Why is BIG Data a Big Deal?
Size of Data:
• 2010: 1.2 million petabytes, or 1.2 zettabytes
• 2020: 35 trillion gigabytes, or 35 zettabytes (the cyber-physical world is connected through billions to even trillions of sensors and devices)
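The two data points above imply a growth rate that is easy to work out. The sketch below assumes decimal units (1 GB = 10^9 bytes, 1 ZB = 10^21 bytes) and steady compound growth between 2010 and 2020; the variable names are illustrative.

```python
import math

BYTES_PER_GB = 10**9
BYTES_PER_PB = 10**15
BYTES_PER_ZB = 10**21

data_2010_zb = 1.2e6 * BYTES_PER_PB / BYTES_PER_ZB   # 1.2 million petabytes
data_2020_zb = 35e12 * BYTES_PER_GB / BYTES_PER_ZB   # 35 trillion gigabytes

years = 2020 - 2010
# Compound annual growth rate that takes the 2010 figure to the 2020 figure.
annual_growth = (data_2020_zb / data_2010_zb) ** (1 / years) - 1
doubling_time_years = math.log(2) / math.log(1 + annual_growth)

print(f"2010: {data_2010_zb:.1f} ZB, 2020: {data_2020_zb:.1f} ZB")
print(f"implied growth: {annual_growth:.0%} per year")
print(f"implied doubling time: {doubling_time_years:.1f} years")
```

The implied rate is about 40% per year, so the data volume doubles roughly every two years, a pace similar to the chip-density doubling discussed earlier in the talk.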
Type of data:
• from homogeneous data to heterogeneous and multi-scale data
• from physical sensor data to socio-economic data
• from complete to incomplete, imprecise, and uncertain data
• from implementation on a single, simple hardware-software architecture to scalable, parallel, complex hardware-software architectures
Why is BIG Data a Big Deal?
Crisis: a doomsday forecast for data storage, transfer, communication, and security/privacy
Opportunities: an information gold mine
Needs: better, faster, cheaper, and scalable technologies for storage, manipulation, communication, and analysis
Why is BIG Data a Big Deal?
Challenge: Combine our current and to-be-developed advanced, scalable analytical tools with first-principles models and human capabilities at scale, with anticipatory capabilities, to discover unseen phenomena and insights, and to make and securely deliver the right decisions at the right time, based on incomplete, imprecise, and uncertain public/private data, while dealing with multiple and conflicting objectives and criteria.
Why BIG Data is a Big Deal?
Crowdsourcing
BIG Data and Citizen Cyber Science?
• Distributed thinking / human computing
• Physical participation coordinated via the Internet
What can be aggregated?
• Aggregate perception, knowledge, reasoning
• Visual pattern recognition
• Real-world knowledge
• 3D spatial manipulation
• Language skills
Where to get volunteers
• Tell a good story about your research
• Give recognition
• Make it a game
• Add a social dimension
AMP: Algorithms, Machines, People
[Diagram: Adaptive/Active Machine Learning and Analytics, Cloud Computing, and CrowdSourcing, applied to Massive and Diverse Data]
Source: M. Franklin
Cloud Initiative at Berkeley
~120 Faculty (CSE), ~120 Researchers (Cloud-HPC), 22 Departments
[Diagram labels: Data Structure, Analytics, Service Delivery]
CSE Cloud Computing Initiative
Cloud computing is being used by a broad array of Computational Science and Engineering faculty investigators, researchers, and graduate students, from social scientists and economists to astrophysicists and bioengineers. Our list of faculty includes experts from both computational science and engineering and the cloud and HPC community. It includes ~120 faculty and ~120 researchers/students from over 22 departments (http://cse.berkeley.edu and http://cloud.citris-uc.org/).
Cloud Initiative at Berkeley
[Diagram labels]
• Cloud Infrastructure: Supercomputer, Public Cloud, Private Cloud, Volunteer Computing, Mobile Cloud
• Applications (scientific, engineering, social, economic/business/finance, policy)
• Delivery of Services: Mobile Devices, Mobile Cloud
• Software and Appliances: Cluster Scheduling & Reliability, Network Research and Security
• Other labels: Streaming Data, Massive Data, Extreme Simulation, Large Scale Visualization, Machine Learning, Analytics, Intelligent Dynamic Maps, Early Warning, Social Networking, Second Life, Cyber Citizen, Personalized Services, Crowd Sourcing
Cloud Initiative at Berkeley
• Infrastructure: Cloud Cluster and Data Centers
• Delivery of Services: Mobile Cloud
• Applications
  - Scientific
  - Social
  - Economics/Business
• Software and Appliances
  - Cluster Scheduling & Reliability
  - Network Research and Security
Mobile devices, the mobile cloud, and cloud infrastructure will be the devices and tools of choice for delivery of services.
Cloud Computing Initiative
We will focus on three main areas:
• Machine Learning: Provide the general public with machine learning analytics tools and algorithms that run on cloud infrastructure.
• Streaming Data Analytics and Visualization: Analysis and visualization of large-scale real-time data sets such as traffic information, online news sources, economic data, and scientific data such as astrophysical and genomics data.
• Scientific Applications: Benchmarking and cataloging the suitability of cloud computing for science and engineering applications, including HPC applications.
BIG Data and Sensors/Cyber-Physical Infrastructure
[Diagram: feedback loop between the physical world and cyberspace. Labels by group:]
• Physical World: Water, Air, Energy, Earthquake
• Prototyping Devices and Sensors: Marvell Lab, μSensors, TinyOS, G/H
• Agencies: California Independent System (Cal ISO); Department of Water Resources; California Department of Health and Social Services and FCC
• Big Data Streams
• Cyberspace / Distributed Systems: Handhelds + Laptop/PC + Clusters (IBM/room 143) + Cloud
• Analytics: Algorithms, Machine Learning/A.I., Statistical Analysis, Social Computing
• Large-Scale Information Extraction: Knowledge, Insight
• Visualization, Analytics and Insight
• Delivery and Service: Back to Handhelds
• Feedback
BIG Data and Exa-Scale Computing
[Figure: Socio-Economic Modeling for Large-scale Quantitative Climate/Environmental Change Analysis. Data scale grows from tera to peta to exa as climate/environmental detail and socio-economic detail increase. En Informatics: Environment-Genetic.]
BIG Data and DNA Computing
[Figures courtesy of U.S. Department of Energy Human Genome Program, http://www.ornl.gov/hgmis]
BIG Data and Visualization: Scientific
BIG Data and Visualization
Computational Science
Nature, March 23, 2006
“An important development in sciences is occurring at the intersection of computer science and the sciences that has the potential to have a profound impact on science. It is a leap from the application of computing … to the integration of computer science concepts, tools, and theorems into the very fabric of science.” (Science 2020 Report, March 2006)
Nature of Work, Education and Future Society
“Creative Creators” or “Creative Servers”: do complex tasks, and enhance, refine, and reinvent (T. Friedman and M. Mandelbaum, “That Used to Be Us”).
20th Century vs. 21st Century
• Number of jobs: 1-2 jobs vs. 10-15 jobs
• Job requirement: mastery of one field (single deep expertise) vs. breadth plus depth in several fields (multiple deep expertise, broad knowledge)
Alternative sources of natural resources: Energy and Water
Technology: Nanotechnology, Quantum Computers, Genetics and Biometrics, and Robotics
Services: Online Education and Services on Demand
Resources: Sensors and Devices, Big Data, Computing Power, Social Network and Computing
Charles Fadel
Tm-shaped Individual, and not just T- or m-shaped
[Diagram labels: Single Expertise; Multiple Deep Expertise; Single Deep + Multiple Expertise Hybrid (CSE); Broad Knowledge]
21st century skills: problem-solving, critical thinking, entrepreneurship, and creativity
Computational Science and Engineering (CSE) @ Berkeley
What is CSE?
CSE is a rapidly growing multidisciplinary field that encompasses real-world complex applications (scientific, engineering, social, economic, policy), computational mathematics, and computer science and engineering. High performance computing (HPC), large-scale simulations and modeling (physical, biological, economic, social, and policy processes), and scientific applications all play a central role in CSE.
What is CSE?
Simulation of complex problems is sometimes the only feasible way to make progress if the theory is intractable and experiments are too difficult, too expensive, too dangerous, or too slow.
Through modeling and simulation of multiscale systems of systems, and through scientific discovery from large-scale heterogeneous data, CSE aims to advance solutions for a wide range of problems in the areas of nanoscience and nanotechnology, energy, climate change, engineering design, neuroscience, cognitive computing and intelligent systems, plasma physics, transportation, bioinformatics and computational biology, earthquake engineering, geophysical modeling, astrophysics, materials science, national defense, information technology for health care, engineering better search engines, socio-economic-policy modeling, and other fields that are critical to scientific, economic, and social progress.
CSE: Vision
To support the work of scientists and engineers as they pursue complex simulation and modeling, as well as computation-, data-, and visualization-intensive research, to enhance scientific, technological, and economic leadership while improving our quality of life.
CSE: Mission
• Conduct world-leading research in applied mathematics and computer science to provide leadership in such areas as energy, environment, health information technology, climate, bioscience and neuroscience, and intelligent cyber-physical infrastructure, to name a few.
• Be at the forefront of the development and use of ultra-efficient, largest-scale computer systems, focusing on discoveries and solutions that link to the evolution of the commercial market for high-performance and cloud computing and services.
• Allow industry collaborators to gain experience with computational modeling and simulation and the effective use of HPC and cloud facilities, and to carry new expertise back to their institutions. This would enable industry partners to be “first to market” with important scientific and technological capabilities, breakthrough ideas, and new hardware and software.
• Educate the next generation of interdisciplinary students and industry leaders (the DE-CSE program and a new Professional Master Program (PMS), to be developed).
CSE: Berkeley and LBNL Partnership
[Diagram: CSE at the intersection of Applications, HPC-Cloud Computing, and Analytics/Math]
Computational Research Division
Areas: Applied Mathematics, Computer Science, Computational Science
• Methods and systems topics: HPC architecture, OS, and compilers; Performance & Autotuning; Visualization and Data Management; Cloud, grid & distributed computing; Mathematical Models; Adaptive Mesh Refinement; Linear Algebra; Libraries and Frameworks; Interface Methods
• Application topics: Nanoscience, Combustion, Climate, Cosmology & Astrophysics, Genomics, Energy & Environment
[Figure: performance plot for the NVIDIA C2050 (Fermi) covering the kernels RTM/wave eqn., SpMV, 7pt Stencil, 27pt Stencil, DGEMM, GTC/chargei, and GTC/pushi; log-scale axes with ticks from 1/32 to 32 and from 2 to 1024]
Source: LBNL & CSE
Computational Science and Engineering (CSE) @ Berkeley
Designated Emphasis (DE) in CSE Participants
~120 Faculty (CSE), ~120 Researchers (HPC-Cloud), ~22 Departments, ~33 Students and growing, ~60 Courses, more being developed
Designated Emphasis (DE) in CSE
• New “graduate minor” – approved, starting July 1, 2008
• Motivation
– Widespread need to train PhD students in large scale
simulation, or analysis of large data sets
– Opportunities for collaboration, across campus and at LBNL
• Graduate students participate by
– Getting accepted into existing department/program
– Taking CSE course requirements
– Qualifying examination with CSE component
– Need to sign up before quals!
– Thesis with CSE component
– Receive “PhD in X with a DE in CSE”
CSE Participating Departments
(# faculty by “primary affiliation”, # courses, # students)
• Astronomy (7, 3, 1)
• Bioengineering (3, 1, 0)
• Biostatistics (2, 0, 1)
• Chemical Engineering (6, 0, 0)
• Chemistry (8, 1, 0)
• Civil and Environmental Engineering (7, 8, 2)
• Earth and Planetary Science (6, 3, 4)
• EECS (19, 14, 4)
• IEOR (5, 5, 0)
• School of Information (1, 0, 0)
• Integrative Biology (1, 0, 0)
• Materials Science and Engineering (2, 1, 0)
• Mathematics (15, 4, 0)
• Mechanical Engineering (9, 6, 8)
• Neuroscience (7, 1, 4)
• Nuclear Engineering (2, 1, 3)
• Physics (1, 1, 0)
• Political Science (2, 0, 1)
• Statistics (5, 11, 0)
• New: Biostatistics (1), Public Health (0), Vision Science (1), Biophysics (1), Business School (1)
Course Structure
• 3 kinds of students and course requirements: Applications, CS, Math
• Each kind of student has 3 course requirements in the other two fields
  - Goal: enforce cross-disciplinary training
  - Example: Applications students take courses from EECS, Math, Statistics, IEOR
• We support new course development
  - 5 courses recently created/updated
Educating the Workforce of the Future
China & India: 300M skilled workers by 2025
Engineering Ph.D. median salary:
• India: $39,200
• China: $53,700
• Germany: $99,400
• US (CA): $125,200
Science and engineering graduates: US 420,000; EU 470,000; China 530,000; India 690,000; Japan 350,000
A McKinsey report concluded that only 10% of Chinese engineers and 25% of Indian engineers can compete in the global outsourcing arena.
Revised by: Nikravesh
Annualized Job Openings vs. Annual Degrees Granted (2008-2018)
CSE educates the next generation of interdisciplinary students and industry leaders.
CSE, revised by: Nikravesh
Degree Production vs. Job Openings
[Figure: bar chart of annual degrees granted (Ph.D., Master’s, Bachelor’s) and projected job openings for Engineering, Physical Sciences, Biological Sciences, and Computer Science; vertical axis from 20,000 to 160,000]
Sources: Adapted from a presentation by John Sargent, Senior Policy Analyst, Department of Commerce, at the CRA Computing Research Summit, February 23, 2004. Original sources listed as National Science Foundation/Division of Science Resources Statistics; degree data from Department of Education/National Center for Education Statistics: Integrated Postsecondary Education Data System Completions Survey; and NSF/SRS; Survey of Earned Doctorates; and Projected Annual Average Job Openings derived from Department of Commerce (Office of Technology Policy) analysis of Bureau of Labor Statistics 2002-2012 projections. See http://www.cra.org/govaffairs/content.php?cid=22.
CSE, revised by: Nikravesh
CSE Center Concepts: CDISC, ACCESS, Insight
CDISC Center Concept: Open Big Data Science
Computational Foundations and Driving Applications
[Diagram: Open Big Data Science, with labels Apps, Core, Libraries, Analytics, Machine Learning, Training & Education, and Outreach, on top of Devices and Computing Environment]
CDISC: Center Concept
Center for Data-Driven Scientific Computing
Our Center will develop a wide array of computational tools to tackle the challenges of data-intensive scientific research across multiple scientific disciplines.
These tools will encapsulate state-of-the-art machine learning and statistical modeling algorithms into broadly applicable, high-level interfaces that can be easily used by application scientists.
Our goal is to dramatically reduce the time needed to extract knowledge from the floods of data science is facing, thanks to workflows that permit exploratory and collaborative research to evolve into robustly reproducible outcomes.
Our development will be driven by a collection of scientific problems that share a common theme.
They all present major data-intensive challenges requiring significant algorithmic breakthroughs and represent key questions within their field, from rapid astronomical discovery of rare events to early warning systems for natural hazards such as earthquakes or tsunamis.
Moving beyond the traditional domain of scientific computing, we will tackle a collection of problems in social sciences and the digital humanities, pushing the boundaries of quantitative scholarship in these disciplines.
CDISC: Center for Data-Driven Scientific Computing
Center Concept
[Diagram: Data-Driven Scientific Computing, with labels Apps, Core, Libraries, Analytics, Machine Learning, Training & Education, and Outreach, on top of Devices and Computing Environment]
Center for Accelerating Environmental Synthesis and Solutions (ACCESS) & Environment Quality and Security
Center Concept
To enable synthesis: En Informatics (En = Environmental, Ecological, Epidemiological, Economic, Engineering, Equitable, Ethical, …)
Center for Accelerating Environmental Synthesis and Solutions (ACCESS)
ACCESS Focus
ACCESS will focus on five major domains critical for human welfare and environmental quality: freshwater, health, ecosystems, urban metabolism, and food security. It will create and implement a synthesis process that makes research tools and understanding rapidly accessible across disciplines and fosters new ways of thinking across disciplines about critical environmental problems.
Source: Inez Fung
Berkeley ACCESS Themes
• Ecosystem trajectories over the past million years and in the future (rate and nature) result principally from 8000 generations of human population growth and aspirations.
• Underlying ecosystem trajectories are the changing supply and demand of water and the need to harness energy to advance civilization.
• Urban metabolism: theoretical models of cities as complex socio-ecological systems with particular metabolic dynamics. Urban policy is increasingly critical to building a more sustainable future.
• The increasing ease of utilizing existing resources leads to their rapid and unsustainable depletion, with many resulting intolerable impacts, including those on
  - Human and animal health
  - Food security
Source: Inez Fung
Urban Metabolism
Conceptual Frameworks for Urban Metabolism: Theoretical models of cities as complex socio-ecological systems with particular metabolic dynamics include approaches based in political economy, sociology, urban ecology and biogeochemistry, and industrial ecology, many of which remain disconnected from each other. In addition, because the inputs to urban life are globalized, the geography of consumption and production networks must be integrated into conceptual frameworks.
Data Integration: A rapidly expanding volume of geospatial data on urban stocks and flows (about people, animals, vegetation, consumer products, energy, waste, etc.) is available for synthesis and for building models of the complex metabolic cycles of cities.
Policy and Activism: Urban policy is increasingly critical to building a more sustainable future, but policy interventions and activist campaigns are piecemeal remedies rather than solutions based on an understanding of cities as complex socio-ecological systems.
Visualization and Decision-Support: Decision makers and stakeholders of many types need to visualize model results quickly and effectively. Generating sophisticated and insightful visualizations of urban systems is an emergent and critical field.
Source: Inez Fung
Insight Lab
[Diagram labels: Applications, Machine Learning, Massive Scale Data, Analytics and Visualization]
Insight Lab
Intensive Computing, Immersive Visualization and Human Interaction
Data and Visual-enabled Scientific Discovery and Insight Accelerator
(~120 CSE Faculty, ~120 HPC-Cloud Researchers, and 22 Departments)
• Strategic Projects / Shared Facilities, Resources, Expertise
• Technology: Streaming Data and Visual Analytics (Core Group*)
• Core Scientific Group*
• Shared Facilities: VisLab + Computing Infrastructures
• Delivery of Service: Mobile Devices, Internet, and Cloud
• Science/Applications (scientific, engineering, social, economic/business/finance):
  - ACCESS: E-informatics
  - Earthquake Early Warning
  - Next Generation Dynamic Maps
  - Genome Atlas, Genetic Facebook, Genomics Browser, bioinformatics, Immune System, …
  - Computational Bioscience, Neuroscience, Nanoscience, Astrophysics, …
• Educational, Research, and Social Impacts; IT-Enabled Disaster Resilience
* A core group of enabling computational scientists would stand at the heart of the center; they would both cross-pollinate expertise among projects and provide great leverage in winning large federally supported projects.
Earthquake early warning
400 seismic stations across California
Use existing seismic stations to
• detect the beginning of earthquakes
• estimate the location and magnitude
• predict damaging ground shaking
• issue a warning to those in harm’s way
Seconds to tens of seconds of warning, up to 1 minute
• people move to a safe zone (under a table)
• slow and stop trains (BART)
• isolate hazards (equipment, chemicals)
new science + modern communications
Source: Richard Allen
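The “seconds to tens of seconds” figure follows from the gap between the fast P-wave that triggers detection and the slower, damaging S-wave. The sketch below is a back-of-the-envelope model, not the system's actual algorithm: the wave speeds, the 10 km epicenter-to-station distance, and the 4 s processing delay are all assumed values chosen for illustration.

```python
# Assumed typical crustal wave speeds (illustrative, not from the slide).
P_WAVE_KM_S = 6.0   # first-arriving wave, used for detection
S_WAVE_KM_S = 3.5   # slower wave that carries the strong shaking

def warning_time_s(site_km, station_km=10.0, processing_s=4.0):
    """Seconds between the alert and the arrival of strong shaking at a site.

    site_km      -- distance from the epicenter to the site being warned
    station_km   -- distance from the epicenter to the nearest station (assumed)
    processing_s -- detection, estimation, and delivery delay (assumed)
    """
    alert_issued = station_km / P_WAVE_KM_S + processing_s
    shaking_arrives = site_km / S_WAVE_KM_S
    return max(0.0, shaking_arrives - alert_issued)

for distance in (5, 50, 100, 200):
    print(f"{distance:>3} km from epicenter: {warning_time_s(distance):.1f} s of warning")
```

Under these assumptions, sites very close to the epicenter get no warning at all, while a site 100 km away gets a little over 20 seconds, consistent with the range quoted on the slide.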
Opinion Space: Crowdsourcing Insights
• Scalability: N participants, N viewpoints
• Each viewpoint is n-dimensional
• Dimensionality reduction: 2D map of affinity/similarity
• Insight vs. agreement: nonlinear scoring
• N² peer-to-peer reviews
Source: Ken Goldberg and Alec Ross
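The 2D affinity map can be illustrated with principal component analysis (PCA), one common dimensionality-reduction technique; the slide does not name the method, so treat this choice as an assumption. The sketch uses only the standard library (power iteration with deflation), and the five-dimensional sample viewpoints are made up.

```python
import random

def mean_center(rows):
    """Subtract the per-dimension mean so the cloud is centered at the origin."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def top_eigenpair(matrix, iterations=500, seed=0):
    """Dominant eigenvalue and eigenvector via power iteration."""
    rng = random.Random(seed)
    d = len(matrix)
    v = [rng.random() for _ in range(d)]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
    return sum(v[i] * mv[i] for i in range(d)), v

def project_2d(rows):
    """Map each n-dimensional viewpoint to a point on a 2D plane."""
    centered = mean_center(rows)
    cov = covariance(centered)
    d = len(cov)
    lam1, v1 = top_eigenpair(cov)
    # Deflation: remove the first component, then find the next one.
    deflated = [[cov[i][j] - lam1 * v1[i] * v1[j] for j in range(d)]
                for i in range(d)]
    _, v2 = top_eigenpair(deflated, seed=1)
    return [(sum(r[j] * v1[j] for j in range(d)),
             sum(r[j] * v2[j] for j in range(d))) for r in centered]

# Hypothetical viewpoints: five participants, five opinion sliders in [0, 1].
views = [
    [0.9, 0.1, 0.8, 0.2, 0.5],
    [0.9, 0.1, 0.8, 0.2, 0.5],   # same opinions as the first participant
    [0.1, 0.9, 0.2, 0.7, 0.4],
    [0.5, 0.5, 0.9, 0.1, 0.9],
    [0.2, 0.8, 0.1, 0.9, 0.1],
]
points = project_2d(views)
for i, (x, y) in enumerate(points):
    print(f"participant {i}: ({x:+.3f}, {y:+.3f})")
```

Participants with identical opinions land on the same point, and participants with similar opinions land close together, which is what makes the 2D map readable as a map of affinity.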
NextGenMap: The Value of Multi-disciplinary Research
Invention, Societal-pull, Products, New Legislation
• Crowdsourcing + physical modeling + sensing + data assimilation
• Physical modeling-based live maps, which contain real-time assessments of the situation by integrating streaming data
[Figure labels: CISN ShakeMap]
Source: Alex Bayen
Berkeley Time-Series Center
• Real-time (machine-learned) classification of astronomical event data
• The data deluge requires abstracting the traditional roles of the scientist in discovery
• Working with real data now, towards a scalable framework for the Large Synoptic Survey Telescope (LSST) era
Research threads: new statistical analytics on sparse data; machine learning with noisy and spurious feature sets; cloud-based ML with massive databases
Source: Josh Bloom
StatNews: Analytics and Visualization of News Data
Innovative visualizations for a topic’s summary in news across time
• Real-time summaries of topics across many news sources
• Global image of the news landscape
• Interpretable results obtained via sparse machine learning techniques
• Massive data sets require cloud computing
[Figure: real-time image of news sources or topics]
Source: Laurent El Ghaoui
Lawrence Berkeley National Laboratory
Berkeley Lab’s Major Scientific Facilities
Complex Tools to Address Scientific Challenges
• Advanced Light Source
• Molecular Foundry
• National Center for Electron Microscopy
• National Energy Research Scientific Computing Center
• 88-Inch Cyclotron
• Joint Genome Institute
• Energy Sciences Network (ESnet)
Computational Research Division
Applied Mathematics Computer Science
Computational Science
HPC architecture,
OS, and compilers
512
256
128
64
32
16
8
4
2
1024
1/16 1 2 4 8 16321/8
1/4
1/2
1/32
RTM/wave eqn.
NVIDIA C2050 (Fermi)
SpMV
7pt Stencil
27pt Stencil
DGEMM
GTC/chargei
GTC/pushi
Performance
& Autotuning
Visualization
and Data
Management
Cloud, grid &
distributed
computing
Mathematical
Models
Adaptive Mesh
Refinement
Linear
Algebra
Libraries and
Frameworks
Interface
Methods
Nanoscience, Combustion, Climate, Cosmology & Astrophysics,
Genomics, Energy & Environment
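The kernel labels and power-of-two axis ticks on this slide are consistent with a roofline-style chart, which bounds a kernel's performance by the lesser of the compute peak and memory bandwidth times arithmetic intensity. Assuming that reading, a minimal sketch follows. The peak and bandwidth figures are vendor-quoted values for the Tesla C2050 brought in as assumptions (they are not on the slide), and the sample intensities are hypothetical.

```python
# Roofline-style estimate: attainable performance is capped either by the
# compute peak or by memory bandwidth times arithmetic intensity.

# Assumed vendor-quoted figures for the NVIDIA Tesla C2050 (not from the slide).
PEAK_GFLOPS = 515.0      # double-precision peak, Gflop/s
BANDWIDTH_GBS = 144.0    # memory bandwidth, GB/s


def attainable_gflops(intensity, peak=PEAK_GFLOPS, bandwidth=BANDWIDTH_GBS):
    """Upper bound on Gflop/s for a kernel doing `intensity` flops per byte."""
    return min(peak, intensity * bandwidth)


def ridge_point(peak=PEAK_GFLOPS, bandwidth=BANDWIDTH_GBS):
    """Arithmetic intensity above which a kernel becomes compute-bound."""
    return peak / bandwidth


# Hypothetical intensities picked from the plot's tick range; real kernels
# would need measured flop and byte counts.
for intensity in (1 / 8, 1.0, 8.0):
    bound = attainable_gflops(intensity)
    regime = "memory-bound" if intensity < ridge_point() else "compute-bound"
    print(f"{intensity:6.3f} flop/byte -> {bound:6.1f} Gflop/s ({regime})")
```

Under these assumptions, low-intensity kernels such as SpMV or stencils would sit on the bandwidth slope, while a kernel like DGEMM would approach the flat compute ceiling.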
78
National Energy Research Scientific Computing Facility
Department of Energy Office of Science
(unclassified) Facility
• 4000 users, 500 projects
• From 48 states; 65% from universities
• 1400 refereed publications per year
Systems designed for science
• 1.3 PF Hopper system (Cray XE6)
- 4th fastest computer in US, 8th in world
• 0.5 PF in Franklin (Cray XT4), Carver (IBM iDataplex) and other clusters
79
NERSC Systems
Large-Scale Computing Systems
Franklin (NERSC-5): Cray XT4
• 9,532 compute nodes; 38,128 cores
• ~25 Tflop/s on applications; 356 Tflop/s peak
Hopper (NERSC-6): Cray XE6
• 6,384 compute nodes, 153,216 cores
• 120 Tflop/s on applications; 1.3 Pflop/s peak
HPSS Archival Storage
• 40 PB capacity
• 4 Tape libraries
• 150 TB disk cache
NERSC Global Filesystem (NGF)
Uses IBM’s GPFS
• 1.5 PB capacity
• 5.5 GB/s of bandwidth
Clusters
140 Tflops total
Carver
• IBM iDataplex cluster
PDSF (HEP/NP)
• ~1K core cluster
Magellan Cloud testbed
• IBM iDataplex cluster
GenePool (JGI)
• ~5K core cluster
Analytics
Euclid (512 GB shared memory)
Dirac GPU testbed (48 nodes)
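The slide quotes both application and peak performance for Franklin and Hopper, so the fraction of peak that applications sustain can be worked out directly. A small sketch using only the slide's figures (with 1.3 Pflop/s written as 1300 Tflop/s):

```python
# Sustained application performance versus theoretical peak, in Tflop/s,
# as quoted on the NERSC Systems slide (1.3 Pflop/s = 1300 Tflop/s).
SYSTEMS = {
    "Franklin (Cray XT4)": (25.0, 356.0),
    "Hopper (Cray XE6)": (120.0, 1300.0),
}


def fraction_of_peak(sustained_tflops, peak_tflops):
    """Share of the theoretical peak that applications actually achieve."""
    return sustained_tflops / peak_tflops


for name, (sustained, peak) in SYSTEMS.items():
    print(f"{name}: {fraction_of_peak(sustained, peak):.1%} of peak")
```

Both systems sustain under a tenth of peak on applications (about 7% for Franklin and 9% for Hopper).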
80
The TOP10 of the TOP500
Rank | Site | Manufacturer | Computer | Country | Cores | Rmax [Pflops] | Power [MW]
1 | RIKEN Advanced Institute for Computational Science | Fujitsu | K Computer, SPARC64 VIIIfx 2.0GHz, Tofu Interconnect | Japan | 548,352 | 8.162 | 9.90
2 | National SuperComputer Center in Tianjin | NUDT | Tianhe-1A, NUDT TH MPP, Xeon 6C, NVidia, FT-1000 8C | China | 186,368 | 2.566 | 4.04
3 | Oak Ridge National Laboratory | Cray | Jaguar, Cray XT5, HC 2.6 GHz | USA | 224,162 | 1.759 | 6.95
4 | National Supercomputing Centre in Shenzhen | Dawning | Nebulae, TC3600 Blade, Intel X5650, NVidia Tesla C2050 GPU | China | 120,640 | 1.271 | 2.58
5 | GSIC, Tokyo Institute of Technology | NEC/HP | TSUBAME-2, HP ProLiant, Xeon 6C, NVidia, Linux/Windows | Japan | 73,278 | 1.192 | 1.40
6 | DOE/NNSA/LANL/SNL | Cray | Cielo, Cray XE6, 8C 2.4 GHz | USA | 142,272 | 1.110 | 3.98
7 | NASA/Ames Research Center/NAS | SGI | Pleiades, SGI Altix ICE 8200EX/8400EX | USA | 111,104 | 1.088 | 4.10
8 | DOE/SC/LBNL/NERSC | Cray | Hopper, Cray XE6, 6C 2.1 GHz | USA | 153,408 | 1.054 | 2.91
9 | Commissariat a l'Energie Atomique (CEA) | Bull | Tera 100, Bull bullx super-node S6010/S6030 | France | 138,368 | 1.050 | 4.59
10 | DOE/NNSA/LANL | IBM | Roadrunner, BladeCenter QS22/LS21 | USA | 122,400 | 1.042 | 2.34
81
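Because the TOP10 table lists both Rmax and power draw, energy efficiency can be derived from it. The sketch below transcribes the two columns and ranks the systems by Mflop/s per watt. The short system names are labels chosen for this example, and the last power value is read as 2.34 MW, which is an assumption about the extracted figure.

```python
# (system, Rmax in Pflop/s, power in MW), transcribed from the TOP10 table.
TOP10 = [
    ("K Computer", 8.162, 9.90),
    ("Tianhe-1A", 2.566, 4.04),
    ("Jaguar", 1.759, 6.95),
    ("Nebulae", 1.271, 2.58),
    ("TSUBAME-2", 1.192, 1.40),
    ("Cielo", 1.110, 3.98),
    ("Pleiades", 1.088, 4.10),
    ("Hopper", 1.054, 2.91),
    ("Tera 100", 1.050, 4.59),
    ("Roadrunner", 1.042, 2.34),  # last power value read as 2.34 MW (assumption)
]


def mflops_per_watt(rmax_pflops, power_mw):
    """1 Pflop/s = 1e9 Mflop/s and 1 MW = 1e6 W, so the ratio scales by 1e3."""
    return rmax_pflops / power_mw * 1e3


# Most energy-efficient system first.
ranked = sorted(TOP10, key=lambda row: mflops_per_watt(row[1], row[2]), reverse=True)
for name, rmax, power in ranked:
    print(f"{name:12s} {mflops_per_watt(rmax, power):7.1f} Mflop/s per W")
```

On these numbers the GPU-accelerated TSUBAME-2 and the K Computer lead in efficiency, which is the kind of gap that makes power a first-order concern on the path to exascale.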
Exascale: Who Needs It?
Fusion: Simulations
of plasma properties
to ITER scale model
Combustion:
complete predictive
engine simulation
Astronomy: origins
of the universe
Sequestration:
Understanding fluid
flow & chemistry
Materials: solar panels
to database of
materials-by-design.
Climate: Resolve
clouds (1km scale) &
model mitigations
Protein structures:
From Biofuels to
Alzheimer's
Every field needs more computing!
1) To quantify and reduce uncertainty in simulations
2) To analyze data from experiments and simulations
82
ESnet provides the critical network infrastructure that supports
the Department of Energy’s Office of Science missions.
• ESnet directly supports the research of some 15,000 scientists, postdocs
and graduate students at DOE laboratories, universities, other federal
agencies, and industry worldwide
• Science is increasingly collaborative and globally distributed
• ESnet provides the reliable connection, science-driven innovation and user
focus that enable scientists to collaborate, manage, and exchange data
The Energy Sciences Network
83
Prototype 100G Topology
Magellan
Magellan
Supporting Advanced Scientific Computing Research • Basic Energy Sciences •
Biological and Environmental Research • Fusion Energy Sciences • High Energy
Physics • Nuclear Physics
84
Outline of Talk
85
Drivers for Change: Computing and Big Data
Computational Science and Engineering
State Leadership
California – “The Golden State”
The State New Economy Model
“Sustainable California” –
a return to “The Golden State”
“Sustainable California” –
a Return to the Golden State
Health, Freshwater, Food Security, Ecosystems, and Urban Metabolism
86
“California”-
The Golden State
“Silicon Valley” –
The Golden High Tech Region
Top 10 Countries by GDP 2009 & 2010
2010
Overall Rank | Country or U.S. State | GDP (millions of USD)
— | World | 62,220,000
1 | United States | 14,620,000
2 | People's Republic of China | 5,879,100
3 | Japan | 5,391,000
4 | Germany | 3,306,000
5 | France | 2,555,000
6 | United Kingdom | 2,259,000
7 | Italy | 2,037,000
8 | Brazil | 2,024,000
(state) | California | 1,911,822
9 | Canada | 1,564,000
10 | Russia | 1,477,000
2009
Overall Rank | Country or U.S. State | GDP (millions of USD)
— | World | 58,133,309
1 | United States | 14,119,000
2 | Japan | 5,068,996
3 | People's Republic of China | 4,985,461
4 | Germany | 3,330,032
5 | France | 2,649,390
6 | United Kingdom | 2,174,530
7 | Italy | 2,112,780
(state) | California | 1,911,822
8 | Brazil | 1,573,409
9 | Spain | 1,460,250
10 | Canada | 1,336,068
Source: Wikipedia
87
Top publicly traded companies in California for 2011 (over 50), by revenue, with State and U.S. rankings
State Rank | Company | Fortune 500 rank | City | Revenues ($ millions)
1 | Chevron | 3 | San Ramon | 196,337.0
2 | Hewlett-Packard | 11 | Palo Alto | 126,033.0
3 | McKesson | 15 | San Francisco | 108,702.0
4 | Wells Fargo | 23 | San Francisco | 93,249.0
5 | Apple | 35 | Cupertino | 65,225.0
6 | Intel | 56 | Santa Clara | 43,623.0
7 | Safeway | 60 | Pleasanton | 41,050.0
8 | Cisco Systems | 62 | San Jose | 40,040.0
9 | Walt Disney | 65 | Burbank | 38,063.0
10 | Northrop Grumman | 72 | Los Angeles | 34,757.0
A total of $1,218,340.30 ($ millions)
Source: Fortune 500
88
“California” – The Golden State
California's economy would be the ninth largest in the world (eighth in 2010)
if U.S. states were compared with other countries.
• California is home to more than 50 Fortune 500 companies (2011), with
combined revenues of $1,218,340.30 million
• California is home not only to the largest high-technology companies
but also to the largest company in the world: Apple, with a market cap
of over $420B, ranked 1st, with Exxon ranked 2nd
• California is home to the leaders of the Internet, ICT, and supercomputing
• California is home to the largest and leading bioscience, life sciences,
and biomedicine sectors
• California is home to leading nano and sensor technology
• California is home to many leading universities and leading DoE
national labs in science and technology
The University of California is well known for developing and operating
academic research centers in cooperation with partners worldwide. UC
Berkeley has a proud reputation for solving problems of interest not only to the
state of California but to the people of the world.
89
California’s commitment to the Leadership in
Science and Technology (UCOP)
• In the last part of the 20th century, California created the high-tech
and biotechnology innovations that formed the backbone of today's
"New Economy." As we begin the 21st century, the state of
California, the University of California and hundreds of the state's
leading-edge businesses have joined together in an unprecedented
partnership to lay the foundation for the "next New Economy."
• The Governor Gray Davis Institutes for Science and Innovation –
now named for the former governor in recognition of his instrumental
role in their creation – include:
Source: UCOP
90
California’s commitment to the Leadership
in Science and Technology (UCOP)
• Taken together, these four institutes represent a billion-dollar,
multidisciplinary effort that focuses public/private resources and
expertise simultaneously on research areas critical to sustaining
California's economic growth and its competitiveness in the global
marketplace.
• The new ideas and technologies developed by researchers at the
institutes help expand our economy into new industries and markets
- and bring the benefits of innovation more quickly into the lives of
people everywhere. These institutes open the doors to new
understanding, new applications and new products through essential
research in biomedicine, bioengineering, nanosystems,
telecommunications and information technology.
Source: UCOP
91
Silicon Valley and Stanford University
“Stanford University, its affiliates, and graduates played a major
role in the development of California's electronics and high-tech
industry. From the 1890s, Stanford University's leaders saw its
mission as service to the West and shaped the school accordingly.
Regionalism helped align Stanford's interests with those of the
area's high-tech firms for the first fifty years of Silicon Valley's
development.”
“During the 1940s and 1950s, Frederick Terman, as Stanford's
dean of engineering and provost, encouraged faculty and
graduates to start their own companies. He is credited with
nurturing Hewlett-Packard, Varian Associates, and other high-tech
firms, until what would become Silicon Valley grew up around the
Stanford campus.”
Source: Wikipedia
92
Top Universities by Reputation 2012
93
Reputation Rank | Institution | Country / Region | Overall score | Reputation | Reputation for teaching
1 | Harvard University | United States | 100.0 | 100.0 | 100.0
2 | Massachusetts Institute of Technology | United States | 87.2 | 81.7 | 90.0
3 | University of Cambridge | United Kingdom | 80.7 | 82.7 | 79.7
4 | Stanford University | United States | 72.1 | 67.5 | 74.5
5 | University of California Berkeley | United States | 71.6 | 65.0 | 74.8
6 | University of Oxford | United Kingdom | 71.2
The World University Rankings 2011-2012
94
World Rank | Institution | Country / Region | Overall score | Teaching | International mix | Industry income | Research | Citations
1 | California Institute of Technology | United States | 94.8 | 95.7 | 56 | 97 | 98.2 | 99.9
2 | Harvard University | United States | 93.9 | 95.8 | 67.5 | 35.9 | 97.4 | 99.8
2 | Stanford University | United States | 93.9 | 94.8 | 57.2 | 63.8 | 98.9 | 99.8
4 | University of Oxford | United Kingdom | 93.6 | 89.5 | 91.9 | 62.1 | 96.6 | 97.9
5 | Princeton University | United States | 92.9 | 91.5 | 49.6 | 81 | 99.1 | 100
6 | University of Cambridge | United Kingdom | 92.4 | 90.5 | 85.3 | 55.5 | 94.2 | 97.3
7 | Massachusetts Institute of Technology | United States | 92.3 | 92.7 | 79.2 | 94.4 | 87.4 | 100
8 | Imperial College London | United Kingdom | 90.7 | 88.8 | 92.2 | 93.1 | 88.7 | 93.9
9 | University of Chicago | United States | 90.2 | 89.4 | 58.8 | Data withheld by THE | 90.8 | 99.4
10 | University of California Berkeley | United States | 89.8
List of U.S. States by Unemployment Rate
State or District, Unemployment rate (seasonally adjusted, %), Monthly percent change (= drop in unemployment)
Nevada 12.6 0.4%
California 11.1 0.2%
Rhode Island 10.8 0.3%
Mississippi 10.4 0.1%
District of Columbia 10.4 0.2%
North Carolina 9.9 0.1%
Florida 9.9 0.1%
Illinois 9.8 0.2%
Georgia 9.7 0.1%
South Carolina 9.5 0.4%
Michigan 9.3 0.5%
Kentucky 9.1 0.3%
Indiana 9.0 0.0%
New Jersey 9.0 0.1%
Oregon 8.9 0.2%
Arizona 8.7 0.0%
Tennessee 8.7 0.4%
Washington 8.5 0.2%
Idaho 8.4 0.1%
United States (mean) 8.3 0.2%
Connecticut 8.2 0.2%
Alabama 8.1 0.6%
Ohio 8.1 0.4%
New York 8.0 0.0%
Missouri 8.0 0.2%
Colorado 7.9 0.1%
West Virginia 7.9 0.0%
Texas 7.8 0.3%
Arkansas 7.7 0.2%
Pennsylvania 7.6 0.3%
Delaware 7.4 0.2%
Alaska 7.3 0.0%
Wisconsin 7.1 0.2%
Maine 7.0 0.0%
Massachusetts 6.8 0.2%
Louisiana 6.8 0.1%
Montana 6.8 0.3%
Maryland 6.7 0.2%
New Mexico 6.6 0.1%
Hawaii 6.6 0.1%
Kansas 6.3 0.2%
Virginia 6.2 0.0%
Oklahoma 6.1 0.0%
Utah 6.0 0.4%
Wyoming 5.8 0.0%
Minnesota 5.7 0.2%
Iowa 5.6 0.1%
Vermont 5.1 0.2%
New Hampshire 5.1 0.1%
South Dakota 4.2 0.1%
Nebraska 4.1 0.0%
North Dakota 3.3 0.1%
January 24, 2012 for December 2011
Source: Wikipedia
95
Outline of Talk
96
Drivers for Change: Computing and Big Data
Computational Science and Engineering
State Leadership
California – “The Golden State”
The State New Economy Model
“Sustainable California” –
a return to “The Golden State”
The State New Economy Index*
Methodology
The State New Economy Index uses 26 indicators. These
Indicators are divided into five categories. These categories
best capture what is new about the New Economy:
1) Knowledge Jobs (5)
2) Globalization (2)
3) Economic Dynamism (3.5)
4) Transformation to a Digital Economy (3)
5) Technological Innovation Capacity (5)
*Source: ITIF-Kauffman
97
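One way to read the category list above is as the coefficients of a weighted average. The sketch below assumes the parenthesized numbers (5, 2, 3.5, 3, 5) are category weights and that each category already has a score on a common 0-100 scale. The state names and scores are hypothetical, and this is not claimed to be the ITIF-Kauffman scoring procedure.

```python
# Category weights as read from the methodology slide (assumed interpretation).
CATEGORY_WEIGHTS = {
    "Knowledge Jobs": 5.0,
    "Globalization": 2.0,
    "Economic Dynamism": 3.5,
    "Transformation to a Digital Economy": 3.0,
    "Technological Innovation Capacity": 5.0,
}


def composite_score(category_scores, weights=CATEGORY_WEIGHTS):
    """Weighted average of per-category scores (all on the same scale)."""
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[name] * category_scores[name] for name in weights)
    return weighted_sum / total_weight


# Hypothetical states with made-up category scores on a 0-100 scale.
EXAMPLE_STATES = {
    "State A": dict(zip(CATEGORY_WEIGHTS, (80.0, 40.0, 60.0, 70.0, 90.0))),
    "State B": dict(zip(CATEGORY_WEIGHTS, (60.0, 90.0, 70.0, 50.0, 55.0))),
}

for state, scores in EXAMPLE_STATES.items():
    print(f"{state}: {composite_score(scores):.2f}")
```

With these weights, Knowledge Jobs and Innovation Capacity together carry 10 of the 18.5 total, so a state strong in those two categories (State A here) outranks one that leads only in Globalization.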
Top 10 US States ranked based on “The New Economy Index”
2010
1. Massachusetts (92.6)
2. Washington (77.5)
3. Maryland (76.9)
4. New Jersey (76.9)
5. Connecticut (76.6)
6. Delaware (75.0)
7. California (74.3)
8. Virginia (73.7)
9. Colorado (72.8)
10. New York (71.3)
2008
1. Massachusetts (97)
2. Washington (81.9)
3. Maryland (80)
4. Delaware (79.3)
5. New Jersey (77)
6. Connecticut (76.1)
7. Virginia (75.6)
8. California (75)
9. New York (74.4)
10. Colorado (70.4)
2007
1. Massachusetts (96.1)
2. New Jersey (86.4)
3. Maryland (85.0)
4. Washington (84.6)
5. California (82.9)
6. Connecticut (81.8)
7. Delaware (79.6)
8. Virginia (79.5)
9. Colorado (78.3)
10. New York (77.4)
2002
1. Massachusetts (90.0)
2. Washington (86.2)
3. California (85.5)
4. Colorado (84.3)
5. Maryland (75.6)
6. New Jersey (75.1)
7. Connecticut (74.2)
8. Virginia (72.1)
9. Delaware (70.5)
10. New York (69.3)
1999
1. Massachusetts (82.3)
2. California (74.3)
3. Colorado (72.3)
4. Washington (69.0)
5. Connecticut (64.9)
6. Utah (64.0)
7. New Hampshire (62.5)
8. New Jersey (60.9)
9. Delaware (59.9)
10. Arizona (59.2)
98
CSE-CITRIS Role: Innovation and Better Education
US States ranked based on “The New Economy Index”
and two new PCA ranking models!??
ITIF-Kauffman Ranking | 26 Attributes PCA (MNIK2012) | 5 Categories PCA (MNIK2012)
Massachusetts | Massachusetts | Massachusetts
Washington | Washington | New Jersey
Maryland | Connecticut | Connecticut
New Jersey | Maryland | Washington
Connecticut | New Jersey | Maryland
Delaware | Virginia | Delaware
California | California | California
Virginia | Colorado | Virginia
Colorado | Delaware | New York
New York | New Hampshire | Colorado
New Hampshire | Minnesota | New Hampshire
Utah | Utah | Minnesota
Minnesota | New York | Utah
Oregon | Oregon | Oregon
Illinois | Illinois | Illinois
Rhode Island | Michigan | Rhode Island
Michigan | Rhode Island | Texas
Texas | Pennsylvania | Michigan
Georgia | Texas | Georgia
Arizona | Vermont | Florida
Florida | Arizona | Pennsylvania
Pennsylvania | Georgia | Arizona
Vermont | North Carolina | Vermont
North Carolina | Ohio | North Carolina
Ohio | Idaho | Kansas
Kansas | Kansas | Ohio
Idaho | Wisconsin | Nevada
Maine | Florida | Maine
Wisconsin | Missouri | Idaho
Nevada | Nebraska | Wisconsin
Alaska | New Mexico | Alaska
New Mexico | Maine | Missouri
Missouri | Iowa | Nebraska
Nebraska | Alaska | Hawaii
Indiana | North Dakota | Indiana
Montana | Hawaii | Iowa
North Dakota | Indiana | North Dakota
Iowa | South Carolina | New Mexico
South Carolina | Nevada | Tennessee
Hawaii | South Dakota | South Carolina
Tennessee | Tennessee | Montana
Oklahoma | Montana | Louisiana
Kentucky | Oklahoma | Oklahoma
Louisiana | Wyoming | Kentucky
South Dakota | Alabama | South Dakota
Wyoming | Kentucky | Wyoming
Alabama | Louisiana | Alabama
Arkansas | Arkansas | Arkansas
West Virginia | West Virginia | West Virginia
Mississippi | Mississippi | Mississippi
99
CSE-CITRIS Role: Innovation and Better Education
KNOWLEDGE JOBS Weight
IT Professionals
Professional and Managerial Jobs
Workforce Education
Immigration of Knowledge Workers
U.S. Migration of Knowledge Workers
Manufacturing Value-Added
Traded-Services Employment
GLOBALIZATION
Export Focus on Manufacturing and Services
Foreign Direct Investment (FDI)
ECONOMIC DYNAMISM
Job Churning
Initial Public Offerings (IPOs)
Entrepreneurial Activity
Inventor Patents
Fastest-Growing Firms
The State New Economy Index*
DIGITAL ECONOMY
Online Population
Digital Government
Farms and Technology
Broadband
Health IT
INNOVATION CAPACITY
High-Tech Employment
Scientists and Engineers
Patents
Industry R&D
Non-industry R&D
Green Economy
Venture Capital
Ref.*: ITIF and Kauffman Foundation
100
Knowledge Jobs (5)
1 Massachusetts (17.39)
2 Connecticut (16.78)
3 Maryland (15.40)
4 Virginia (15.37)
5 Delaware (13.94)
6 Minnesota (13.94)
7 New Jersey (13.85)
8 Washington (13.80)
9 New York (13.66)
10 New Hampshire (12.96)
13 California (10.70)
Top 10 US States ranked based on “The New Economy Index”
Globalization (2)
1 Delaware (18.05)
2 Texas (16.39)
3 South Carolina (15.31)
4 New Jersey (14.73)
5 Connecticut (14.68)
6 Massachusetts (14.59)
7 Kentucky (14.24)
8 New York (14.21)
9 Washington (13.73)
10 North Carolina (13.61)
17 California (13.17)
Economic Dynamism (3.5)
1 Utah (14.94)
2 Colorado (13.74)
3 Georgia (13.38)
4 Massachusetts (13.30)
5 Florida (13.09)
6 Montana (12.87)
7 Arizona (12.64)
8 Nevada (12.56)
9 California (12.01)
10 Idaho (11.86)
Digital Economy (3)
1 Massachusetts (16.40)
2 Rhode Island (15.53)
3 New Jersey (15.13)
4 Maryland (14.29)
5 Connecticut (14.09)
6 California (14.07)
7 New York (14.03)
8 Oregon (13.58)
9 Washington (13.41)
10 Virginia (12.82)
Innovation Capacity (5)
1 Massachusetts (19.0)
2 Washington (17.5)
3 California (15.0)
4 Maryland (13.4)
5 Delaware (13.1)
6 Colorado (13.0)
7 New Hampshire (12.2)
8 New Jersey (12.2)
9 Virginia (12.0)
10 New Mexico (11.8)
CSE-CITRIS Role: Innovation and Better Education
CSE-CITRIS Role: Better Education
101
CSE-CITRIS Role: Better Education
CSE-CITRIS Role: Innovation and Better Education
[Figure: Projection of the cases on the factor-plane (1 x 3); cases with sum of cosine square >= 0.00; active cases are the 50 states plus a US point, labeled by postal abbreviation. Horizontal axis: Factor 1 (34.46%), ticks -8 to 6. Vertical axis: Factor 3 (10.00%), ticks -2 to 4. Legend: Top 25 States, Bottom 25 States.]
PCA Analysis of US States Ranking:
The New Economy Index (26 Indicators)
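The slides attribute the PCA rankings to "MNIK2012" without describing the procedure. A common approach, sketched here as an assumption rather than the author's method, is to standardize the indicators, take the first principal component, and rank states by their score on it. The state labels and indicator values below are hypothetical, and the component is found by plain power iteration to keep the sketch dependency-free.

```python
import math

# Hypothetical indicator values (rows: states, columns: indicators); not real data.
DATA = {
    "A": [9.0, 8.0, 9.0],
    "B": [7.0, 7.0, 6.0],
    "C": [5.0, 6.0, 5.0],
    "D": [4.0, 3.0, 4.0],
    "E": [2.0, 2.0, 3.0],
    "F": [1.0, 1.0, 1.0],
}


def standardize(rows):
    """Center each column to mean 0 and scale it to unit (population) variance."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
            for j in range(m)]
    return [[(r[j] - means[j]) / stds[j] for j in range(m)] for r in rows]


def leading_component(z, iterations=200):
    """Leading eigenvector of the correlation matrix and its variance share."""
    n, m = len(z), len(z[0])
    corr = [[sum(r[a] * r[b] for r in z) / n for b in range(m)] for a in range(m)]
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(iterations):  # power iteration
        w = [sum(corr[a][b] * v[b] for b in range(m)) for a in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    if sum(v) < 0:  # fix the sign so higher indicator values give higher scores
        v = [-x for x in v]
    eigenvalue = sum(v[a] * corr[a][b] * v[b] for a in range(m) for b in range(m))
    trace = sum(corr[a][a] for a in range(m))
    return v, eigenvalue / trace


def rank_states(data):
    """Rank states by their score on the first principal component."""
    names = list(data)
    z = standardize([data[name] for name in names])
    loadings, share = leading_component(z)
    scores = {name: sum(l * x for l, x in zip(loadings, row))
              for name, row in zip(names, z)}
    return sorted(names, key=scores.get, reverse=True), share


ranking, share = rank_states(DATA)
print(ranking, f"first component explains {share:.1%} of the variance")
```

The percentage on each axis label of the factor-plane figure (for example "Factor 1: 34.46%") is presumably this kind of variance share, computed on the 26 real indicators.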
102
Outline of Talk
103
Drivers for Change: Computing and Big Data
Computational Science and Engineering
State Leadership
California – “The Golden State”
The State New Economy Model
“Sustainable California” –
a return to “The Golden State”
CDISC
ACCESS
Insight
Increased climate/environmental detail
Increased socio-economic detail
Tera
Peta
Peta
Exa
Socio-Economic Modeling
for Large-scale Quantitative
Climate/Environmental
Change Analysis
En Informatics
Environment-Genetic
World Population: ~6B today, ~9B in 2050, ~10B in 2100
70% will live in cities by 2050
By 2020: 35 trillion gigabytes (35 zettabytes) of data (the cyber-physical world is connected through
billions to even trillions of sensors and devices)
Petaflop with ~1M Cores in your PC by 2025?
Health, Freshwater, Food Security, Ecosystems, and Urban Metabolism
104
“Sustainable California” –
a return to the Golden State
• building upon massive-scale datasets –
streaming and static (sensors/socio-economic)
• employing sophisticated analytics, with an
emphasis on modeling, simulation, and
crowdsourcing
• focusing on major domains critical for human
welfare and environmental quality (Environment
and Security): urban metabolism and smart
cities, food security, fresh water resources,
public health, natural disasters, energy
conservation, and ecosystems
• educating the next generation of
interdisciplinary students and industry leaders
A statewide initiative to create integrated
systems and advanced analytic tools
using advanced computational science
and engineering
105
California can improve the standard of living by applying predictive
simulation systems and integrated advanced analytic tools, built on advanced
computational science and engineering, to critical problems facing the state
How can California respond to a rapidly
changing environment, climate change,
socio-economic forces and
demographics?
• water resources, public health, natural
disasters, energy conservation,
environment and security
Predictive simulation and advanced
analytics can be used to
• understand the impacts of policy choices
• understand social and economic impacts
• create new technologies and industries
• find more efficient solutions to California’s
pressing infrastructure problems
TURING’s TEST
Turing: A computer can be said to be intelligent if its
answers are indistinguishable from the answers of a
human being
??
Computer
Health, Freshwater, Food, Energy, Environment Security, Ecosystems, and Urban Metabolism
106

Más contenido relacionado

La actualidad más candente

Big data, data science & fast data
Big data, data science & fast dataBig data, data science & fast data
Big data, data science & fast dataKunal Joshi
 
Big Data Story - From An Engineer's Perspective
Big Data Story - From An Engineer's PerspectiveBig Data Story - From An Engineer's Perspective
Big Data Story - From An Engineer's PerspectiveHien Luu
 
The Future Of Big Data
The Future Of Big DataThe Future Of Big Data
The Future Of Big DataMatthew Dennis
 
Big data with hadoop
Big data with hadoopBig data with hadoop
Big data with hadoopAnusha sweety
 
The rise of “Big Data” on cloud computing
The rise of “Big Data” on cloud computingThe rise of “Big Data” on cloud computing
The rise of “Big Data” on cloud computingMinhazul Arefin
 
Introduction to Big Data by Manouj Bongirr
Introduction to Big Data by Manouj BongirrIntroduction to Big Data by Manouj Bongirr
Introduction to Big Data by Manouj BongirrPranav Kulkarni
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)IJERD Editor
 
Addressing Big Data Challenges - The Hadoop Way
Addressing Big Data Challenges - The Hadoop WayAddressing Big Data Challenges - The Hadoop Way
Addressing Big Data Challenges - The Hadoop WayXoriant Corporation
 
High Availability HPC ~ Microservice Architectures for Supercomputing
High Availability HPC ~ Microservice Architectures for SupercomputingHigh Availability HPC ~ Microservice Architectures for Supercomputing
High Availability HPC ~ Microservice Architectures for Supercomputinginside-BigData.com
 
Dell_whitepaper[1]
Dell_whitepaper[1]Dell_whitepaper[1]
Dell_whitepaper[1]Jim Romeo
 
Big Data’s Big Impact on Businesses
Big Data’s Big Impact on BusinessesBig Data’s Big Impact on Businesses
Big Data’s Big Impact on BusinessesCRISIL Limited
 
The Evolution of Big Data Frameworks
The Evolution of Big Data FrameworksThe Evolution of Big Data Frameworks
The Evolution of Big Data FrameworkseXascale Infolab
 
Scientific Application Development and Early results on Summit
Scientific Application Development and Early results on SummitScientific Application Development and Early results on Summit
Scientific Application Development and Early results on SummitGanesan Narayanasamy
 
Toward a Global Research Platform for Big Data Analysis
Toward a Global Research Platform for Big Data AnalysisToward a Global Research Platform for Big Data Analysis
Toward a Global Research Platform for Big Data AnalysisLarry Smarr
 
High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...
High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...
High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...EUDAT
 

La actualidad más candente (20)

Big data, data science & fast data
Big data, data science & fast dataBig data, data science & fast data
Big data, data science & fast data
 
Bigdata analytics
Bigdata analyticsBigdata analytics
Bigdata analytics
 
Big Data Story - From An Engineer's Perspective
Big Data Story - From An Engineer's PerspectiveBig Data Story - From An Engineer's Perspective
Big Data Story - From An Engineer's Perspective
 
The Future Of Big Data
The Future Of Big DataThe Future Of Big Data
The Future Of Big Data
 
Big data with hadoop
Big data with hadoopBig data with hadoop
Big data with hadoop
 
The rise of “Big Data” on cloud computing
The rise of “Big Data” on cloud computingThe rise of “Big Data” on cloud computing
The rise of “Big Data” on cloud computing
 
Introduction to Big Data by Manouj Bongirr
Introduction to Big Data by Manouj BongirrIntroduction to Big Data by Manouj Bongirr
Introduction to Big Data by Manouj Bongirr
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)
 
Addressing Big Data Challenges - The Hadoop Way
Addressing Big Data Challenges - The Hadoop WayAddressing Big Data Challenges - The Hadoop Way
Addressing Big Data Challenges - The Hadoop Way
 
Big Data Analytics
Big Data AnalyticsBig Data Analytics
Big Data Analytics
 
High Availability HPC ~ Microservice Architectures for Supercomputing
High Availability HPC ~ Microservice Architectures for SupercomputingHigh Availability HPC ~ Microservice Architectures for Supercomputing
High Availability HPC ~ Microservice Architectures for Supercomputing
 
Dell_whitepaper[1]
Dell_whitepaper[1]Dell_whitepaper[1]
Dell_whitepaper[1]
 
TerraEchos Kairos on IBM PowerLinux servers
TerraEchos Kairos on IBM PowerLinux serversTerraEchos Kairos on IBM PowerLinux servers
TerraEchos Kairos on IBM PowerLinux servers
 
Big Data’s Big Impact on Businesses
Big Data’s Big Impact on BusinessesBig Data’s Big Impact on Businesses
Big Data’s Big Impact on Businesses
 
The Evolution of Big Data Frameworks
The Evolution of Big Data FrameworksThe Evolution of Big Data Frameworks
The Evolution of Big Data Frameworks
 
Scientific Application Development and Early results on Summit
Scientific Application Development and Early results on SummitScientific Application Development and Early results on Summit
Scientific Application Development and Early results on Summit
 
Motivation for big data
Motivation for big dataMotivation for big data
Motivation for big data
 
Big data storage
Big data storageBig data storage
Big data storage
 
Toward a Global Research Platform for Big Data Analysis
Toward a Global Research Platform for Big Data AnalysisToward a Global Research Platform for Big Data Analysis
Toward a Global Research Platform for Big Data Analysis
 
High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...
High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...
High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...
 

Destacado

MoorelandMonitor - Future of Payments - Wallets (April 2014)
MoorelandMonitor - Future of Payments - Wallets (April 2014)MoorelandMonitor - Future of Payments - Wallets (April 2014)
MoorelandMonitor - Future of Payments - Wallets (April 2014)Inga Barchan
 
LawGeex gives you back your TIME.
LawGeex gives you back your TIME.LawGeex gives you back your TIME.
LawGeex gives you back your TIME.Manson Ho
 
социальная работа с инвалидами лучинская
социальная работа с инвалидами лучинскаясоциальная работа с инвалидами лучинская
социальная работа с инвалидами лучинскаяluchinskaya
 
The edfu temple
The edfu templeThe edfu temple
The edfu templedobermanC
 
Standing waves
Standing wavesStanding waves
Standing wavesdegaa
 

Destacado (11)

Ramanathan A - CV
Ramanathan A - CVRamanathan A - CV
Ramanathan A - CV
 
MoorelandMonitor - Future of Payments - Wallets (April 2014)
MoorelandMonitor - Future of Payments - Wallets (April 2014)MoorelandMonitor - Future of Payments - Wallets (April 2014)
MoorelandMonitor - Future of Payments - Wallets (April 2014)
 
Comune di Savignano sul Rubicone | Progetto di riqualificazione di via della ...
Comune di Savignano sul Rubicone | Progetto di riqualificazione di via della ...Comune di Savignano sul Rubicone | Progetto di riqualificazione di via della ...
Comune di Savignano sul Rubicone | Progetto di riqualificazione di via della ...
 
LawGeex gives you back your TIME.
LawGeex gives you back your TIME.LawGeex gives you back your TIME.
LawGeex gives you back your TIME.
 
Vipra today july 2016
Vipra today july 2016Vipra today july 2016
Vipra today july 2016
 
Kmew presentation
Kmew presentationKmew presentation
Kmew presentation
 
SachinKumar_Imp
SachinKumar_ImpSachinKumar_Imp
SachinKumar_Imp
 
بحث الهند
بحث الهندبحث الهند
بحث الهند
 
социальная работа с инвалидами лучинская
социальная работа с инвалидами лучинскаясоциальная работа с инвалидами лучинская
социальная работа с инвалидами лучинская
 
The edfu temple
The edfu templeThe edfu temple
The edfu temple
 
Standing waves
Standing wavesStanding waves
Standing waves
 

Similar a CSE @ Berkeley Drives Computational Science with Big Data

Nikravesh big datafeb2013bt
Nikravesh big datafeb2013btNikravesh big datafeb2013bt
Nikravesh big datafeb2013btMasoud Nikravesh
 
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...inside-BigData.com
 
Graph Hardware Architecture - Enterprise graphs deserve great hardware!
Graph Hardware Architecture - Enterprise graphs deserve great hardware!Graph Hardware Architecture - Enterprise graphs deserve great hardware!
Graph Hardware Architecture - Enterprise graphs deserve great hardware!TigerGraph
 
Cloud Computing,雲端運算-中研院網格計畫主持人林誠謙
Cloud Computing,雲端運算-中研院網格計畫主持人林誠謙Cloud Computing,雲端運算-中研院網格計畫主持人林誠謙
Cloud Computing,雲端運算-中研院網格計畫主持人林誠謙Tracy Chen
 
CC LECTURE NOTES (1).pdf
CC LECTURE NOTES (1).pdfCC LECTURE NOTES (1).pdf
CC LECTURE NOTES (1).pdfHasanAfwaaz1
 
Dell High-Performance Computing solutions: Enable innovations, outperform exp...
Dell High-Performance Computing solutions: Enable innovations, outperform exp...Dell High-Performance Computing solutions: Enable innovations, outperform exp...
Dell High-Performance Computing solutions: Enable innovations, outperform exp...Dell World
 
Stories About Spark, HPC and Barcelona by Jordi Torres
Stories About Spark, HPC and Barcelona by Jordi TorresStories About Spark, HPC and Barcelona by Jordi Torres
Stories About Spark, HPC and Barcelona by Jordi TorresSpark Summit
 
High performance computing
High performance computingHigh performance computing
High performance computingGuy Tel-Zur
 
Real time machine learning proposers day v3
Real time machine learning proposers day v3Real time machine learning proposers day v3
Real time machine learning proposers day v3mustafa sarac
 
Big data high performance computing commenting
Big data   high performance computing commentingBig data   high performance computing commenting
Big data high performance computing commentingIntel IT Center
 
David Moss - Hartree Centre
David Moss - Hartree CentreDavid Moss - Hartree Centre
David Moss - Hartree CentreIBMInterconnect
 
New Technologies For The Sustainable Enterprise; keynote @Wharton
New Technologies For The Sustainable Enterprise; keynote @WhartonNew Technologies For The Sustainable Enterprise; keynote @Wharton
New Technologies For The Sustainable Enterprise; keynote @WhartonPaul Hofmann
 
Jax 2013 - Big Data and Personalised Medicine
Jax 2013 - Big Data and Personalised MedicineJax 2013 - Big Data and Personalised Medicine
Jax 2013 - Big Data and Personalised MedicineGaurav Kaul
 
Cloud Computing y Big Data, próxima frontera de la innovación
Cloud Computing y Big Data, próxima frontera de la innovaciónCloud Computing y Big Data, próxima frontera de la innovación
Cloud Computing y Big Data, próxima frontera de la innovaciónFundación Ramón Areces
 
High Performance Computing and Big Data: The coming wave
High Performance Computing and Big Data: The coming waveHigh Performance Computing and Big Data: The coming wave
High Performance Computing and Big Data: The coming waveIntel IT Center
 


CSE @ Berkeley Drives Computational Science with Big Data

  • 1. Computational Science and Engineering (CSE) @ Berkeley: The Emergence of Computation for Interdisciplinary Large Data. Inspired by Science, Bounded by our imagination, innovation through Technology, Create Social impact. Masoud Nikravesh @ CITRIS and LBNL; CITRIS Director for CSE; Executive Director, DE-CSE @ Berkeley. http://cse.berkeley.edu/ http://cloud.citris-uc.org/ http://citris-uc.org/ http://www.lbl.gov/cs Health, Freshwater, Food Security, Ecosystems, and Urban Metabolism
  • 2. Outline of Talk: Drivers for Change: Computing and Big Data; Computational Science and Engineering; State Leadership; California – “The Golden State”; The State New Economy Model; “Sustainable California” – a return to “The Golden State”
  • 3. Outline of Talk: Drivers for Change: Computing and Big Data; Computational Science and Engineering; State Leadership; California – “The Golden State”; The State New Economy Model; “Sustainable California” – a return to “The Golden State”
  • 4. Drivers for Change • Continued exponential increase in computational power → simulation (Computing) is becoming the third pillar of science, complementing theory (Analytics and Math) and experiment (Applications). [Diagram: CSE at the intersection of Applications, HPC-Cloud Computing, and Analytics/Math.] High performance computing (HPC), large-scale simulations, and scientific applications all play a central role in CSE. The HPC/cloud computing initiative and next-generation data center. Extreme simulation, visual-data analytics, data-enabled scientific discovery. Applications: real-world complex applications (scientific, engineering, social, economic, policy) using future multi-core parallel computing (e.g., E-Informatics, Earthquake Early Warning, NextGenMaps, Genome Atlas, Genetic Facebook, Genomics Browser). HPC-Petascale and Exascale systems are an indispensable tool for exploring the frontiers of science and technology for social impact.
  • 5. Revolution is Happening Now • Chip density continues to increase ~2x every 2 years • Clock speed does not • Number of processor cores may double instead • There is little or no hidden parallelism (ILP) left to be found • Parallelism must be exposed to and managed by software. Source: Intel, Microsoft (Sutter) and Stanford (Olukotun, Hammond)
  • 6. Computing Growth is Not Just an HPC Problem. [Figure: microprocessor performance “Expectation Gap” over time (1985-2020 projected); x-axis: year of introduction; y-axis: performance on a log scale from 10 to 1,000,000.]
  • 7. New Processors Mean New Software • Exascale will have chips with thousands of tiny processor cores, and a few large ones • Architecture is an open question: a sea of embedded cores with heavyweight “service” nodes, or lightweight cores as accelerators to CPUs • Autotuning eases code generation for new architectures. [Figure: power by component (interconnect, memory, processors) for server processors vs. manycore processors; labels: 130 Megawatts, 75 Megawatts.] Source: Kathy Yelick
  • 8. New Memory and Network Technology to Lower Energy • Memory is as important as processors in energy • Latency is physics, bandwidth is money • Software-managed memory or cache hybrids • Autotuning has helped with that management • Need to raise autotuning to higher-level kernels. [Figure: power by component (interconnect, memory, processors) for usual memory + network vs. new memory + network; labels: 25 Megawatts, 75 Megawatts.] Source: Kathy Yelick
  • 9. TOP500 Sites – June 2011. Today, HPC (Petascale and soon Exascale systems) is not just a tool of choice; it is becoming an indispensable tool at the frontiers of science and technology for social impact. Petaflop with ~1M Cores in your PC by 2025?
  • 10. TOP10 Sites - June 2010
  • 11. TOP10 Sites - November 2010
  • 12. TOP10 Sites – June 2011
  • 13. TOP500 Sites – June 2011. Today, HPC (Petascale and soon Exascale systems) is not just a tool of choice; it is becoming an indispensable tool at the frontiers of science and technology for social impact. Petaflop with ~1M Cores in your PC by 2025? [Chart annotations: 8-10 years; 6-8 years.]
  • 14. Energy Cost Challenge for Computing Facilities. At ~$1M per MW, energy costs are substantial • 1 petaflop in 2010 will use 3 MW • 1 exaflop in 2018 possible in 200 MW with “usual” scaling • 1 exaflop in 2018 at 20 MW is DOE target. [Chart: power vs. year, 2005-2020, “usual scaling” vs. “goal”.]
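The energy figures on slide 14 can be turned into a quick back-of-the-envelope calculation. The sketch below is illustrative only: it assumes the slide's "~$1M per MW" means roughly one million dollars per megawatt per year, and the function and constant names are hypothetical, not taken from the talk.

```python
# Figures quoted on slide 14. Reading "$1M per MW" as a yearly rate is an assumption.
COST_PER_MW_YEAR_USD = 1_000_000

# scenario name -> (sustained flop/s, facility power in MW)
SCENARIOS = {
    "1 petaflop, 2010": (1e15, 3.0),
    "1 exaflop, 2018, usual scaling": (1e18, 200.0),
    "1 exaflop, 2018, DOE target": (1e18, 20.0),
}


def annual_energy_cost_usd(megawatts: float) -> float:
    """Yearly energy bill under the assumed ~$1M per MW-year rate."""
    return megawatts * COST_PER_MW_YEAR_USD


def gflops_per_watt(flops: float, megawatts: float) -> float:
    """Energy efficiency expressed in Gflop/s per watt."""
    return (flops / 1e9) / (megawatts * 1e6)


if __name__ == "__main__":
    for name, (flops, mw) in SCENARIOS.items():
        print(f"{name}: ${annual_energy_cost_usd(mw) / 1e6:.0f}M per year, "
              f"{gflops_per_watt(flops, mw):.2f} Gflop/s per watt")
```

Under these assumptions, the 20 MW target corresponds to 50 Gflop/s per watt, about 150 times the efficiency implied by the 2010 petaflop figure (roughly 0.33 Gflop/s per watt).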
  • 15. New Processor Designs are Needed to Save Energy • Server processors have been designed for performance, not energy • Graphics processors are 10-100x more efficient • Embedded processors are 100-1000x more efficient (1.25 rather than 100 watts) • Need manycore chips with thousands of cores. Cell phone processor (0.1 Watt, 4 Gflop/s); server processor (100 Watts, 50 Gflop/s). Source: Kathy Yelick, HPC-SEG July 2011
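The two processor data points on slide 15 allow a rough efficiency comparison. The sketch below is a simplification under stated assumptions: it uses only the quoted peak figures, assumes perfect scaling, and counts processor power alone (no memory, interconnect, or cooling). All names are hypothetical.

```python
import math

# (power in watts, peak Gflop/s) as quoted on slide 15
CELL_PHONE = (0.1, 4.0)
SERVER = (100.0, 50.0)

EXAFLOP_IN_GFLOPS = 1e9  # 1 exaflop/s = 10^9 Gflop/s


def efficiency(chip: tuple) -> float:
    """Peak Gflop/s delivered per watt."""
    watts, gflops = chip
    return gflops / watts


def processor_power_mw(target_gflops: float, chip: tuple) -> float:
    """Megawatts drawn by enough chips to reach the target at peak rate.

    Assumes perfect scaling and ignores everything except the processors.
    """
    watts, gflops = chip
    chips_needed = math.ceil(target_gflops / gflops)
    return chips_needed * watts / 1e6


if __name__ == "__main__":
    ratio = efficiency(CELL_PHONE) / efficiency(SERVER)
    print(f"cell phone vs. server efficiency: about {ratio:.0f}x")
    for name, chip in (("cell phone", CELL_PHONE), ("server", SERVER)):
        print(f"exaflop from {name} processors: "
              f"{processor_power_mw(EXAFLOP_IN_GFLOPS, chip):.0f} MW")
```

With these inputs the cell phone processor comes out about 80 times more efficient per watt, and an exaflop of such processors would draw about 25 MW for the processors alone, against about 2,000 MW for the server processors.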
  • 16. Motif/Dwarf: Common Computational Methods (Red Hot → Blue Cool). What do commercial and CSE applications have in common? [Heat map. Columns (application areas): Embed, SPEC, DB, Games, ML, HPC, Health, Image, Speech, Music, Browser. Rows (motifs): 1 Finite State Machine; 2 Combinational; 3 Graph Traversal; 4 Structured Grid; 5 Dense Matrix; 6 Sparse Matrix; 7 Spectral (FFT); 8 Dynamic Programming; 9 N-Body; 10 MapReduce; 11 Backtrack/B&B; 12 Graphical Models; 13 Unstructured Grid.] Source: Jim Demmel, Berkeley Parlab
  • 17. CPU, GPU, Hybrid, FPGA? Source: Oliver Pell, HPC-SEG July 2011, Berkeley
  • 18. Comparison of x86 Multicores, GPU, and FPGA (table flattened in extraction; items are listed in their original order). Numbers: current generation, 4–6 cores/CPU x 2 CPUs/node = 8–12 cores/node; future generation, 16–20 cores/CPU x 4 CPUs/node = 64–80 cores/node; 512 cores/GPU (Nvidia); 1600 cores/GPU (AMD); no more cores but BRAM, Look Up Tables, FlipFlops, etc.; clock frequency is in the order of hundreds of MHz; memory per card is in the order of tens of GB. What is the easy part? Well known and mature technology; well established development environments; parallelism between cores and nodes; well known technology (for gaming purposes); it is also becoming reliable for HPC computation; high performance-per-watt ratio. What is difficult to do? Linear speedup with increasing core numbers; CUDA: good tool but proprietary; OpenCL: open technology but not yet standard and more complex to use; development tools (+ profiling, debugging, etc.) not yet fully available; non-standard development tools (VHDL is not for Geophysicists… but we have MaxCompiler!); data streaming technology is different from standard approaches (grid/matrix). Main problems: slow memory access; legacy codes need to be re-engineered in order to get the best performance (e.g. SSE vectorization, cache blocking); network connections have to be optimized for the architecture; limited amount of memory (4–6 GB) per card; slow communication with the host CPU (due to PCI Express); internal bandwidth is not always enough; the technology is not yet standard for HPC; slow communication with the host CPU (due to PCI Express). Source: Carlo Tomas, HPC-SEG, July 2011, Berkeley
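Slide 18 lists cores-per-node arithmetic and names "linear speedup with increasing core numbers" as the hard part. One standard way to see why, which is not taken from the slides, is Amdahl's law. The sketch below reproduces the slide's core counts and then applies Amdahl's law with a hypothetical 95% parallel fraction chosen purely for illustration.

```python
def cores_per_node(cores_per_cpu: int, cpus_per_node: int) -> int:
    """Total cores in a node, as in the slide's 'cores/CPU x CPUs/node'."""
    return cores_per_cpu * cpus_per_node


def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Amdahl's law: speedup when only part of the work can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Upper ends of the slide's current and future generation ranges.
    current = cores_per_node(6, 2)    # 12 cores/node
    future = cores_per_node(20, 4)    # 80 cores/node

    # Hypothetical workload: 95% of the runtime parallelizes (an assumption).
    p = 0.95
    for cores in (current, future):
        print(f"{cores} cores: speedup {amdahl_speedup(p, cores):.1f}x "
              f"(linear would be {cores}x)")
```

Under that assumed parallel fraction the speedup can never exceed 20x regardless of core count, which is one reason legacy codes need re-engineering to benefit from many-core nodes.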
  • 19. A Likely Trajectory - Collision or Convergence? [Figure: axes are programmability and parallelism. CPU stages: multi-threading, multi-core, many-core. GPU stages: fixed function, partially programmable, fully programmable. Both lead toward a future processor by 2012?] After Justin Rattner, Intel, ISC 2008
  • 20. Drivers for Change • Continued exponential increase in experimental, simulation, sensor, and social data → techniques and technology in data analysis, visualization, analytics, networking, and collaboration tools are becoming essential in all data-rich applications. [Diagram labels: Big Data Model; Human Experts (Citizen Cyber Science, Crowdsourcing); Analytic Tools; First Principles; Hybrid Models; Google; IBM-Watson; IBM-Cognitive Model; Boeing 747 Simulation; Protein Folding; Amazon; AI-Image. Axes: increased climate/environmental detail; increased socio-economic detail; scales: Tera, Peta, Peta, Exa. Socio-Economic Modeling for Large-scale Quantitative Climate/Environmental Change Analysis; En Informatics; Environment-Genetic.]
  • 21. World Population: today ~6B, 2050 ~9B, 2100 ~10B. 70% will live in cities by 2050. By 2020: 35 trillion Gigabytes of data (the Cyber-Physical world is connected through billions to even trillions of sensors and devices). Petaflop with ~1M Cores in your PC by 2025? Health, Freshwater, Food Security, Ecosystems, and Urban Metabolism
  • 22. Why BIG Data is a Big Deal? Size of data: • 2010: 1.2 million Petabytes, or 1.2 Zettabytes • 2020: 35 trillion Gigabytes (the Cyber-Physical World is connected through billions to even trillions of sensors and devices). Type of data: • from homogeneous data to heterogeneous and multi-scale data • from physical sensor data to socio-economic data • from complete to incomplete, imprecise, and uncertain data • from implementation on a single, simple hardware-software architecture to scalable, parallel, complex hardware-software architectures
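The data volumes on slide 22 mix units (Petabytes, Zettabytes, Gigabytes), so a short conversion makes them comparable. The sketch below assumes decimal (SI) prefixes; the implied growth rate is a derived illustration, not a figure from the talk, and the names are hypothetical.

```python
# Decimal (SI) byte units; the choice of SI over binary prefixes is an assumption.
GB = 10**9
PB = 10**15
ZB = 10**21

data_2010_bytes = 1_200_000 * PB      # "1.2 million Petabytes"
data_2020_bytes = 35 * 10**12 * GB    # "35 trillion Gigabytes"


def to_zettabytes(n_bytes: int) -> float:
    """Convert a byte count to Zettabytes."""
    return n_bytes / ZB


def annual_growth_rate(start: float, end: float, years: int) -> float:
    """Compound annual growth rate implied by two data points."""
    return (end / start) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    print(f"2010: {to_zettabytes(data_2010_bytes):.1f} ZB")
    print(f"2020: {to_zettabytes(data_2020_bytes):.1f} ZB")
    rate = annual_growth_rate(data_2010_bytes, data_2020_bytes, 10)
    print(f"implied growth: about {rate:.0%} per year")
```

In these units the 2020 projection is 35 Zettabytes, roughly 29 times the 2010 figure, which corresponds to compound growth of about 40% per year.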
  • 23. Why BIG Data is a Big Deal? Crisis: data storage/transfer/communication and security-privacy doomsday forecast. Opportunities: information gold mine. Needs: better, faster, cheaper, and scalable technologies for storage, manipulation, communication, and analysis
  • 24. Why BIG Data is a Big Deal? Challenge: combine our current and to-be-developed advanced, scalable analytical tools with first-principles models and human capabilities at scale, with anticipatory capabilities, to discover unseen phenomena and insights and to make and securely deliver the right decisions at the right time, based on incomplete, imprecise, and uncertain public/private data, while dealing with multiple and conflicting objectives and criteria.
  • 25. Why BIG Data is a Big Deal? Crowdsourcing. [Diagram labels: Big Data Model; Human Experts (Citizen Cyber Science, Crowdsourcing); Analytic Tools; First Principles; Hybrid Models; Google; IBM-Watson; IBM-Cognitive Model; Boeing 747 Simulation; Protein Folding; Amazon; AI-Image. Axes: increased climate/environmental detail; increased socio-economic detail; scales: Tera, Peta, Peta, Exa. Socio-Economic Modeling for Large-scale Quantitative Climate/Environmental Change Analysis; En Informatics; Environment-Genetic.]
  • 26. BIG Data and Citizen Cyber Science? Distributed thinking / human computing; physical participation coordinated via Internet. What can be aggregated? • Aggregate perception, knowledge, reasoning • Visual pattern recognition • Real-world knowledge • 3D spatial manipulation • Language skills. Where to get volunteers: • Tell a good story about your research • Give recognition • Make it a game • Add a social dimension
  • 27. BIG Data and Citizen Cyber Science?
  • 28. AMP: Algorithms, Machines, People. Adaptive/Active Machine Learning and Analytics; Cloud Computing; Crowdsourcing; Massive and Diverse Data. Source: M. Franklin
  • 29. Cloud Initiative at Berkeley: ~120 Faculty (CSE), ~120 Researchers (Cloud-HPC), 22 Departments. Data Structure; Analytics; Service Delivery
  • 30. CSE Cloud Computing Initiative. Cloud computing is being used by a broad array of Computational Science and Engineering faculty investigators, researchers, and graduate students, from social scientists and economists to astrophysicists and bioengineers. Our list of faculty includes experts from both computational science and engineering and the cloud and HPC community. It includes ~120 faculty and ~120 researchers/students from over 22 departments (http://cse.berkeley.edu and http://cloud.citris-uc.org/).
  • 31. Cloud Initiative at Berkeley: ~120 Faculty (CSE), ~120 Researchers (HPC-Cloud), 22 Departments. [Diagram labels: Cloud Infrastructure; Applications (scientific, engineering, social, economic/business/finance, policy); Delivery of Services; Mobile Devices; Mobile Cloud; Software and Appliances; Cluster Scheduling & Reliability; Network Research and Security; Supercomputer; Public Cloud; Private Cloud; Volunteering Computing; Mobile Cloud; Streaming Data; Massive Data; Extreme Simulation; Large Scale Visualization; Machine Learning; Analytics; Intelligent Dynamic Maps; Early Warning; Social Networking; Second Life; Cyber Citizen; Personalized Services; Crowd Sourcing.]
  • 32. Cloud Initiative at Berkeley: ~120 Faculty (CSE), ~120 Researchers (HPC-Cloud), 22 Departments • Infrastructure: Cloud Cluster and Data Centers • Delivery of Services: Mobile Cloud • Applications: Scientific; Social; Economics/Business • Software and Appliances • Cluster Scheduling & Reliability • Network Research and Security. Mobile devices, Mobile Cloud, and Cloud Infrastructure will be the devices/tools of choice for delivery of services.
  • 33. Cloud Computing Initiative. We will focus on three main areas: • Machine Learning: provide the general public with machine learning analytics tools and algorithms that run in cloud infrastructure. • Streaming Data Analytics and Visualization: analysis and visualization of large-scale real-time data sets such as traffic information, online news sources, economics data, and scientific data such as astrophysical and genomics data. • Scientific Applications: benchmarking and cataloging the suitability of cloud computing for science and engineering applications, including HPC applications.
  • 34. BIG Data and Sensors/Cyber-Physical Infrastructure. [Diagram labels: Physical World (Water, Air, Energy, Earthquake); Marvell Lab; μSensors; TinyOS; Prototyping Devices and Sensors; G/H; FEEDBACK; California Independent System (Cal ISO); Department of Water Resources; California Department of Health and Social Services and FCC; Cyberspace (Handhelds, Laptop/PC, Clusters, IBM/room143, Cloud); Analytics Algorithms (M/C Learning/A.I., Statistical Analysis, Social Comp); Knowledge; Insight; Large-Scale Information Extraction; Delivery and Service; Back to Handhelds; Distributed Systems; Visualization, Analytics and Insight; Big Data Streams.]
  • 35. BIG Data and Exa-Scale Computing. [Diagram: increased climate/environmental detail; increased socio-economic detail; scales: Tera, Peta, Peta, Exa. Socio-Economic Modeling for Large-scale Quantitative Climate/Environmental Change Analysis; En Informatics; Environment-Genetic.]
  • 36. BIG Data and DNA Computing. Courtesy of U.S. Department of Energy Human Genome Program, http://www.ornl.gov/hgmis
  • 37. BIG Data and DNA Computing
  • 38. BIG Data and DNA Computing
  • 39. BIG Data and Visualization – Scientific
  • 40. BIG Data and Visualization
  • 41. Outline of Talk: Drivers for Change: Computing and Big Data; Computational Science and Engineering; State Leadership; California – “The Golden State”; The State New Economy Model; “Sustainable California” – a return to “The Golden State”
  • 42. Computational Science. Nature, March 23, 2006. “An important development in sciences is occurring at the intersection of computer science and the sciences that has the potential to have a profound impact on science. It is a leap from the application of computing … to the integration of computer science concepts, tools, and theorems into the very fabric of science.” - Science 2020 Report, March 2006
  • 43. Nature of Work, Education and Future Society. “Creative Creators” or “Creative Servers”: do complex tasks, and enhance, refine, and reinvent (T. Friedman and M. Mandelbaum, “That Used to Be Us”). 20th Century vs. 21st Century: Number of Jobs: 1-2 vs. 10-15; Job Requirement: mastery of one field (single deep expertise) vs. breadth and depth in several fields (multiple deep expertise, broad knowledge). Alternative sources of natural resources: Energy and Water. Technology: Nano-technology, Quantum Computers, Genetic and Biometrics, and Robotics. Services: Online Education and Services on Demand. Resources: Sensors and Devices, Big Data, Computing Power, Social Network and Computing. Charles Fadel
  • 44. Tm-shaped Individual, and not just T- or m-shaped. [Diagram labels: T, m, Tm; Single Expertise; Multiple Deep Expertise; Single Deep + Multiple Expertise; Hybrid (CSE); Broad Knowledge.] 21st century skills: problem-solving, critical thinking, entrepreneurship, and creativity
  • 45. Computational Science and Engineering (CSE) @ Berkeley
  • 46. What is CSE? CSE is a rapidly growing multidisciplinary field that encompasses real-world complex applications (scientific, engineering, social, economic, policy), computational mathematics, and computer science and engineering. High performance computing (HPC), large-scale simulations and modeling (physical, biological, economic, social, and policy processes), and scientific applications all play a central role in CSE. Petaflop with ~1M Cores in your PC by 2025?
  • 47. What is CSE? Simulation of complex problems is sometimes the only feasible way to make progress if the theory is intractable and experiments are too difficult, too expensive, too dangerous, or too slow. Through modeling and simulation of multiscale systems of systems, and through scientific discovery from large-scale heterogeneous data, CSE aims to advance solutions for a wide range of problems in the areas of nanoscience and nanotechnology, energy, climate change, engineering design, neuroscience, cognitive computing and intelligent systems, plasma physics, transportation, bioinformatics and computational biology, earthquake engineering, geophysical modeling, astrophysics, materials science, national defense, information technology for health care, engineering better search engines, socio-economic-policy modeling, and other fields that are critical to scientific, economic, and social progress.
  • 48. CSE: Vision. To support the work of scientists and engineers as they pursue complex simulation/modeling, as well as computational-, data-, and visualization-intensive research, to enhance scientific, technological, and economic leadership while improving our quality of life. Inspired by Science, Bounded by our imagination, innovation through Technology, Create Social impact. Today, HPC (Petascale and soon Exascale systems) is not just a tool of choice; it is becoming an indispensable tool at the frontiers of science and technology for social impact.
  • 49. CSE: Mission • Conduct world-leading research in applied mathematics and computer science to provide leadership in such areas as energy, environment, health-information technology, climate, bioscience and neuroscience, and intelligent cyber-physical infrastructure, to name a few. • Be at the forefront of the development and use of ultra-efficient, largest-scale computer systems, focusing on discoveries and solutions that link to the evolution of the commercial market for high-performance and cloud computing and services. • Allow industry collaborators to gain experience with computational modeling/simulation and the effective use of HPC and Cloud facilities, and to carry new expertise back to their institutions. This would enable industry partners to be “first to market” with important scientific and technological capabilities, breakthrough ideas, and new hardware-software. • Educate the next generation of interdisciplinary students and industry leaders (DE-CSE program and a new Professional Master Program (PMS) to be developed). Inspired by Science, Bounded by our imagination, innovation through Technology, Create Social impact. Petaflop with ~1M Cores in your PC by 2025?
  • 50. CSE Berkeley and LBNL Partnership. High performance computing (HPC), large-scale simulations, and scientific applications all play a central role in CSE. [Diagram: CSE at the intersection of Applications, HPC-Cloud Computing, and Analytics/Math.] The HPC/cloud computing initiative and next-generation data center. Extreme simulation, visual-data analytics, data-enabled scientific discovery. Applications: real-world complex applications (scientific, engineering, social, economic, policy) using future multi-core parallel computing (e.g., E-Informatics, Earthquake Early Warning, NextGenMaps, Genome Atlas, Genetic Facebook, Genomics Browser). HPC-Petascale and Exascale systems are an indispensable tool for exploring the frontiers of science and technology for social impact.
  • 51. Computational Research Division: Applied Mathematics; Computer Science; Computational Science. Areas: HPC architecture, OS, and compilers; Performance & Autotuning; Visualization and Data Management; Cloud, grid & distributed computing; Mathematical Models; Adaptive Mesh Refinement; Linear Algebra; Libraries and Frameworks; Interface Methods; Nanoscience; Combustion; Climate; Cosmology & Astrophysics; Genomics; Energy & Environment. [Figure: performance plot for NVIDIA C2050 (Fermi) with kernels RTM/wave eqn., SpMV, 7pt Stencil, 27pt Stencil, DGEMM, GTC/chargei, GTC/pushi; axis ticks omitted.] Source: LBNL & CSE
  • 52. Computational Science and Engineering (CSE) @ Berkeley: Designated Emphasis (DE) in CSE. Participants: ~120 faculty (CSE), ~120 researchers (HPC-Cloud), ~22 departments, ~33 students and growing, ~60 courses with more being developed. http://cse.berkeley.edu/ http://cloud.citris-uc.org/ http://citris-uc.org/ http://www.lbl.gov/cs
  • 53. Designated Emphasis (DE) in CSE • New “graduate minor” – approved, starting July 1, 2008 • Motivation – Widespread need to train PhD students in large scale simulation, or analysis of large data sets – Opportunities for collaboration, across campus and at LBNL • Graduate students participate by – Getting accepted into existing department/program – Taking CSE course requirements – Qualifying examination with CSE component – Need to sign up before quals! – Thesis with CSE component – Receive “PhD in X with a DE in CSE” 53
  • 54. CSE Participating Departments (1/2) ( # faculty by “primary affiliation”, # courses, # Students ) •Astronomy (7,3,1) •Bioengineering (3,1,0) •Biostatistics (2,0,1) •Chemical Engineering (6,0,0) •Chemistry (8,1,0) •Civil and Environmental Engineering (7,8,2) •Earth and Planetary Science (6,3,4) •EECS (19,14,4) •IEOR (5,5,0) •School of Information (1,0,0) 54
  • 55. CSE Participating Departments (2/2) ( # faculty by “primary affiliation”, # courses, # Students ) •Integrative Biology (1,0,0) •Materials Science and Engineering (2,1,0) •Mathematics (15,4,0) •Mechanical Engineering (9,6,8) •Neuroscience (7,1,4) •Nuclear Engineering (2,1,3) •Physics (1,1,0) •Political Science (2,0,1) •Statistics (5,11,0) •New: Biostatistics (1), Public Health (0), Vision Science (1), Biophysics (1), Business School (1)
  • 56. Course Structure
      – 3 kinds of students, each with course requirements: Applications, CS, Math
      – Each kind of student has 3 course requirements in the other two fields
      – Goal: enforce cross-disciplinary training
      – Ex: Applications students take courses from EECS, Math, Statistics, IEOR
      – We support new course development: 5 courses recently created/updated
  • 57. Educating the Workforce of the Future. China & India: 300M skilled workers by 2025. Engineering Ph.D. median salary: India $39,200; China $53,700; Germany $99,400; US (CA) $125,200. Science and engineering graduates: US 420,000; EU 470,000; China 530,000; India 690,000; Japan 350,000. A McKinsey report concluded that only 10% of Chinese engineers and 25% of Indian engineers can compete in the global outsourcing arena. Revised by: Nikravesh
  • 58. Annualized Job Openings vs. Annual Degrees Granted (2008-2018). CSE educates the next generation of interdisciplinary students and industry leaders. Revised by: Nikravesh
  • 59. Degree Production vs. Job Openings. [Figure: degree production and projected job openings for Engineering, Physical Sciences, Biological Sciences, and Computer Science; series: Ph.D., Master’s, Bachelor’s, Projected job openings; vertical axis from 20,000 to 160,000.] Sources: Adapted from a presentation by John Sargent, Senior Policy Analyst, Department of Commerce, at the CRA Computing Research Summit, February 23, 2004. Original sources listed as National Science Foundation/Division of Science Resources Statistics; degree data from Department of Education/National Center for Education Statistics: Integrated Postsecondary Education Data System Completions Survey; and NSF/SRS; Survey of Earned Doctorates; and Projected Annual Average Job Openings derived from Department of Commerce (Office of Technology Policy) analysis of Bureau of Labor Statistics 2002-2012 projections. See http://www.cra.org/govaffairs/content.php?cid=22. CSE educates the next generation of interdisciplinary students and industry leaders. Revised by: Nikravesh
  • 61. Open Big Data Science: Computational Foundations and Driving Applications. CDISC – Center Concept. Diagram labels: Open Big Data Science; APPS; CORE LIBRARIES; ANALYTICS; MACHINE LEARNING; TRAINING & EDUCATION; OUTREACH; Devices and Computing Environment
  • 62. Our Center will develop a wide array of computational tools to tackle the challenges of data-intensive scientific research across multiple scientific disciplines. These tools will encapsulate state of the art machine learning and statistical modeling algorithms into broadly applicable, high-level interfaces that can be easily used by application scientists. Our goal is to dramatically reduce the time needed to extract knowledge from the floods of data science is facing, thanks to workflows that permit exploratory and collaborative research to evolve into robustly reproducible outcomes. CDISC: Center Concept Center for Data-Driven Scientific Computing 62
  • 63. Our development will be driven by a collection of scientific problems that share a common theme. They all present major data-intensive challenges requiring significant algorithmic breakthroughs and represent key questions within their field, from rapid astronomical discovery of rare events to early warning systems for natural hazards such as earthquakes or tsunamis. Moving beyond the traditional domain of scientific computing, we will tackle a collection of problems in social sciences and the digital humanities, pushing the boundaries of quantitative scholarship in these disciplines. CDISC: Center Concept Center for Data-Driven Scientific Computing 63
  • 64. CDISC: Center for Data-Driven Scientific Computing. Center Concept. Diagram labels: Data-Driven Scientific Computing; APPS; CORE LIBRARIES; ANALYTICS; MACHINE LEARNING; TRAINING & EDUCATION; OUTREACH; Devices and Computing Environment
  • 65. Center for Accelerating Environmental Synthesis and Solutions (ACCESS) & Environment Quality and Security. Center Concept: to enable synthesis, En Informatics (En = Environmental, Ecological, Epidemiological, Economic, Engineering, Equitable, Ethical, …). Health, Freshwater, Food Security, Ecosystems, and Urban Metabolism. World population: today ~6B, 2050 ~9B, 2100 ~10B; 70% will live in cities by 2050.
  • 66. ACCESS Focus ACCESS will focus on five major domains critical for human welfare and environmental quality: freshwater, health, ecosystems, urban metabolism, and food security; and will create and implement a synthesis process that makes research tools and understanding rapidly accessible across disciplines, and foster new ways of thinking across disciplines about critical environmental problems. Source: Inez Fung Center for Accelerating Environmental Synthesis and Solutions (ACCESS) 66
  • 67. Berkeley ACCESS Themes. Ecosystem trajectories over the past million years and in the future (rate and nature) result principally from 8000 generations of human population growth and aspirations. Underlying ecosystem trajectories are the changing supply and demand of water and the need to harness energy to advance civilization. Urban metabolism: theoretical models of cities as complex socio-ecological systems with particular metabolic dynamics. Urban policy is increasingly critical to building a more sustainable future. The increasing ease of utilizing existing resources leads to their rapid and unsustainable depletion, with many resulting intolerable impacts, including those on human and animal health and on food security. Source: Inez Fung. Center for Accelerating Environmental Synthesis and Solutions (ACCESS)
  • 68. Urban Metabolism. Conceptual Frameworks for Urban Metabolism: Theoretical models of cities as complex socio-ecological systems with particular metabolic dynamics include approaches based in political economy, sociology, urban ecology and biogeochemistry, and industrial ecology, many of which remain disconnected from each other. In addition, because the inputs to urban life are globalized, the geography of consumption and production networks must be integrated into conceptual frameworks. Data Integration: A rapidly expanding volume of geospatial data on urban stocks and flows (about people, animals, vegetation, consumer products, energy, waste, etc.) is available for synthesis and building models of the complex metabolic cycles of cities. Policy and Activism: Urban policy is increasingly critical to building a more sustainable future, but the policy interventions and activist campaigns are piecemeal remedies rather than solutions based on an understanding of cities as complex socio-ecological systems. Visualization and Decision-Support: Decision makers and stakeholders of many types need to visualize model results quickly and effectively. Generating sophisticated and insightful visualizations of urban systems is an emergent and critical field. Source: Inez Fung
  • 70. Strategic Projects / Shared Facilities, Resources, Expertise. Technology; Streaming Data and Visual Analytics; Core Group*; Core Scientific Group*; Shared Facilities; VisLab+; Computing Infrastructures; Delivery of Service; Mobile Devices, Internet, and Cloud. Science/Applications (scientific, engineering, social, economic/business/finance): ACCESS E-informatics; Earthquake Early Warning; Next Generation Dynamic Maps; Genome Atlas, Genetic Facebook, Genomics Browser, bioinformatics, Immune System, …; Computational Bioscience, Neuroscience, Nanoscience, Astrophysics, … *A core group of enabling computational scientists would stand at the heart of the center; they would both cross-pollinate expertise among projects and provide great leverage in winning large federally supported projects. Educational, Research, and Social Impacts; IT-Enabled Disaster Resilience. Insight Lab: Intensive Computing, Immersive Visualization and Human Interaction. Data and Visual-enabled Scientific Discovery and Insight Accelerator (~120 CSE Faculty, ~120 HPC-Cloud Researchers, and 22 Departments)
  • 71. Earthquake early warning. 400 seismic stations across California. Use existing seismic stations to • detect the beginning of earthquakes • estimate the location and magnitude • predict damaging ground shaking • issue a warning to those in harm's way. Seconds to tens of seconds of warning, up to 1 minute: • people move to safe zone (under table) • slow and stop trains (BART) • isolate hazards (equipment, chemicals). New science + modern communications. Allen Richard
  • 72. Opinion Space: Crowdsourcing Insights. Scalability: N participants, N viewpoints. Each viewpoint is n-dimensional. Dimensionality reduction: 2D map of affinity/similarity. Insight vs. agreement: nonlinear scoring. N² peer-to-peer reviews. Source: Ken Goldberg and Alec Ross
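The dimensionality-reduction step named on this slide can be illustrated with a minimal sketch. The slide does not say which method Opinion Space uses; the sketch assumes principal component analysis computed through an SVD, and the ratings matrix (100 participants, 5 statements) is hypothetical placeholder data.

```python
import numpy as np


def project_to_2d(viewpoints: np.ndarray) -> np.ndarray:
    """Project N viewpoints (rows, each n-dimensional) onto their top two principal axes."""
    # Center each dimension so the projection captures variation, not the mean.
    centered = viewpoints - viewpoints.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


# Hypothetical data: 100 participants rating 5 statements on a 0..1 scale.
rng = np.random.default_rng(0)
ratings = rng.uniform(0.0, 1.0, size=(100, 5))
points_2d = project_to_2d(ratings)  # one (x, y) position per participant
```

Participants with similar ratings land close together on the resulting 2D map, which is the affinity/similarity property the slide refers to.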
  • 73. CISN ShakeMap Crowdsourcing + physical modeling + sensing + data assimilation Physical modeling-based live maps, which contain real-time assessments of situation integrating streaming data Source: Alex Bayen NextGenMap: The Value of Multi-disciplinary Research: Invention, Societal-pull, Products, New Legislation 73
  • 74. Berkeley Time-Series Center
      – Real-time (machine-learned) classification of astronomical event data
      – The data deluge requires abstracting the traditional roles of the scientist in discovery
      – Working with real data now, towards a scalable framework for the Large Synoptic Survey Telescope (LSST) era
      – New statistical analytics on sparse data; machine learning with noisy & spurious feature sets; cloud-based ML with massive databases
      Source: Josh Bloom
  • 75. StatNews: Analytics and Visualization of News Data. Innovative visualizations for a topic’s summary in news across time.
      – Real-time summaries of topics across many news sources
      – Global image of news landscape
      – Interpretable results obtained via sparse machine learning techniques
      – Massive data sets require cloud computing
      Real-time image of news sources or topics. Source: Laurent El Ghaoui
  • 77. Berkeley Lab’s Major Scientific Facilities Complex Tools to Address Scientific Challenges Advanced Light Source Molecular Foundry National Center for Electron Microscopy National Energy Research Scientific Computing Center 88-Inch Cyclotron Joint Genome Institute Energy Sciences Network (ESnet) 77
  • 78. Computational Research Division: Applied Mathematics, Computer Science, Computational Science. Areas: HPC architecture, OS, and compilers; Performance & Autotuning; Visualization and Data Management; Cloud, grid & distributed computing; Mathematical Models; Adaptive Mesh Refinement; Linear Algebra; Libraries and Frameworks; Interface Methods; Nanoscience; Combustion; Climate; Cosmology & Astrophysics; Genomics; Energy & Environment. [Figure: performance plot for NVIDIA C2050 (Fermi) with kernels RTM/wave eqn., SpMV, 7pt Stencil, 27pt Stencil, DGEMM, GTC/chargei, GTC/pushi; axis tick values omitted.]
  • 79. National Energy Research Scientific Computing Facility. Department of Energy Office of Science (unclassified) Facility • 4000 users, 500 projects • From 48 states; 65% from universities • 1400 refereed publications per year. Systems designed for science • 1.3 PF Hopper system (Cray XE6): 4th fastest computer in US, 8th in world • 0.5 PF in Franklin (Cray XT4), Carver (IBM iDataplex) and other clusters
  • 80. NERSC Systems
      Large-Scale Computing Systems
      – Franklin (NERSC-5): Cray XT4; 9,532 compute nodes; 38,128 cores; ~25 Tflop/s on applications; 356 Tflop/s peak
      – Hopper (NERSC-6): Cray XE6; 6,384 compute nodes; 153,216 cores; 120 Tflop/s on applications; 1.3 Pflop/s peak
      HPSS Archival Storage
      – 40 PB capacity; 4 tape libraries; 150 TB disk cache
      NERSC Global Filesystem (NGF)
      – Uses IBM’s GPFS; 1.5 PB capacity; 5.5 GB/s of bandwidth
      Clusters (140 Tflops total)
      – Carver: IBM iDataplex cluster
      – PDSF (HEP/NP): ~1K core cluster
      – Magellan Cloud testbed: IBM iDataplex cluster
      – GenePool (JGI): ~5K core cluster
      Analytics
      – Euclid (512 GB shared memory)
      – Dirac GPU testbed (48 nodes)
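The Franklin and Hopper figures on this slide imply two ratios that are not stated directly: cores per compute node, and the fraction of peak performance reached on applications. The short sketch below only does arithmetic on the numbers quoted above; the dictionary layout and the function name `summarize` are illustrative choices, and the Franklin application figure is approximate in the source (~25 Tflop/s).

```python
# Figures copied from the slide; everything computed below is plain arithmetic on them.
systems = {
    "Franklin (Cray XT4)": {
        "nodes": 9_532, "cores": 38_128,
        "app_tflops": 25.0, "peak_tflops": 356.0,
    },
    "Hopper (Cray XE6)": {
        "nodes": 6_384, "cores": 153_216,
        "app_tflops": 120.0, "peak_tflops": 1_300.0,  # 1.3 Pflop/s peak
    },
}


def summarize(spec):
    """Return (cores per node, fraction of peak achieved on applications)."""
    cores_per_node = spec["cores"] / spec["nodes"]
    fraction_of_peak = spec["app_tflops"] / spec["peak_tflops"]
    return cores_per_node, fraction_of_peak


for name, spec in systems.items():
    cores_per_node, fraction = summarize(spec)
    print(f"{name}: {cores_per_node:.0f} cores/node, {fraction:.1%} of peak on applications")
# Franklin (Cray XT4): 4 cores/node, 7.0% of peak on applications
# Hopper (Cray XE6): 24 cores/node, 9.2% of peak on applications
```

The sixfold increase in cores per node between the two systems reflects the multi-core trend the talk describes.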
  • 81. The TOP10 of the TOP500
      Rank | Site | Manufacturer | Computer | Country | Cores | Rmax [Pflops] | Power [MW]
      1 | RIKEN Advanced Institute for Computational Science | Fujitsu | K Computer, SPARC64 VIIIfx 2.0GHz, Tofu Interconnect | Japan | 548,352 | 8.162 | 9.90
      2 | National SuperComputer Center in Tianjin | NUDT | Tianhe-1A, NUDT TH MPP, Xeon 6C, NVidia, FT-1000 8C | China | 186,368 | 2.566 | 4.04
      3 | Oak Ridge National Laboratory | Cray | Jaguar, Cray XT5, HC 2.6 GHz | USA | 224,162 | 1.759 | 6.95
      4 | National Supercomputing Centre in Shenzhen | Dawning | Nebulae, TC3600 Blade, Intel X5650, NVidia Tesla C2050 GPU | China | 120,640 | 1.271 | 2.58
      5 | GSIC, Tokyo Institute of Technology | NEC/HP | TSUBAME-2, HP ProLiant, Xeon 6C, NVidia, Linux/Windows | Japan | 73,278 | 1.192 | 1.40
      6 | DOE/NNSA/LANL/SNL | Cray | Cielo, Cray XE6, 8C 2.4 GHz | USA | 142,272 | 1.110 | 3.98
      7 | NASA/Ames Research Center/NAS | SGI | Pleiades, SGI Altix ICE 8200EX/8400EX | USA | 111,104 | 1.088 | 4.10
      8 | DOE/SC/LBNL/NERSC | Cray | Hopper, Cray XE6, 6C 2.1 GHz | USA | 153,408 | 1.054 | 2.91
      9 | Commissariat a l'Energie Atomique (CEA) | Bull | Tera 100, Bull bullx super-node S6010/S6030 | France | 138,368 | 1.050 | 4.59
      10 | DOE/NNSA/LANL | IBM | Roadrunner, BladeCenter QS22/LS21 | USA | 122,400 | 1.042 | 2.34
  • 82. Exascale: Who Needs It? Fusion: Simulations of plasma properties to ITER scale model Combustion: complete predictive engine simulation Astronomy: origins of the universe Sequestration: Understanding fluid flow & chemistry Materials: solar panels to database of materials-by-design. Climate: Resolve clouds (1km scale) & model mitigations Protein structures: From Biofuels to Alzheimers Every field needs more computing! 1) To quantify and reduce uncertainty in simulations 2) Analyze data from experiments and simulations 82
  • 83. ESnet provides the critical network infrastructure that supports the Department of Energy’s Office of Science missions. • ESnet directly supports the research of some 15,000 scientists, postdocs and graduate students at DOE laboratories, universities, other federal agencies, and industry worldwide • Science is increasingly collaborative and globally distributed • ESnet provides the reliable connection, science-driven innovation and user focus that enables scientists to collaborate, manage, and exchange data The Energy Sciences Network 83
  • 84. Prototype 100G Topology Magellan Magellan Supporting Advanced Scientific Computing Research • Basic Energy Sciences • Biological and Environmental Research • Fusion Energy Sciences • High Energy Physics • Nuclear Physics 84
  • 85. Outline of Talk: Drivers for Change: Computing and Big Data; Computational Science and Engineering; State Leadership; California – “The Golden State”; The State New Economy Model; “Sustainable California” – a return to “The Golden State”
  • 86. “Sustainable California” – a Return to the Golden State Health, Freshwater, Food Security, Ecosystems, and Urban Metabolism 86 “California”- The Golden State “Silicon Valley” – The Golden High Tech Region
  • 87. Top 10 Countries by GDP, 2009 & 2010 (GDP in millions of USD). Source: Wikipedia
      2010
      Overall Rank | Country or U.S. State | GDP (millions of USD)
      - | World | 62,220,000
      1 | United States | 14,620,000
      2 | People's Republic of China | 5,879,100
      3 | Japan | 5,391,000
      4 | Germany | 3,306,000
      5 | France | 2,555,000
      6 | United Kingdom | 2,259,000
      7 | Italy | 2,037,000
      8 | Brazil | 2,024,000
      - | California | 1,911,822
      9 | Canada | 1,564,000
      10 | Russia | 1,477,000
      2009
      Overall Rank | Country or U.S. State | GDP (millions of USD)
      - | World | 58,133,309
      1 | United States | 14,119,000
      2 | Japan | 5,068,996
      3 | People's Republic of China | 4,985,461
      4 | Germany | 3,330,032
      5 | France | 2,649,390
      6 | United Kingdom | 2,174,530
      7 | Italy | 2,112,780
      - | California | 1,911,822
      8 | Brazil | 1,573,409
      9 | Spain | 1,460,250
      10 | Canada | 1,336,068
  • 88. Top publicly traded companies in California for 2011 (over 50) according to revenues, with State and U.S. rankings. A total of $1,218,340.30 ($ millions). Source: Fortune 500
      State Rank | Company | Fortune 500 rank | City | Revenues ($ millions)
      1 | Chevron | 3 | San Ramon | 196,337.0
      2 | Hewlett-Packard | 11 | Palo Alto | 126,033.0
      3 | McKesson | 15 | San Francisco | 108,702.0
      4 | Wells Fargo | 23 | San Francisco | 93,249.0
      5 | Apple | 35 | Cupertino | 65,225.0
      6 | Intel | 56 | Santa Clara | 43,623.0
      7 | Safeway | 60 | Pleasanton | 41,050.0
      8 | Cisco Systems | 62 | San Jose | 40,040.0
      9 | Walt Disney | 65 | Burbank | 38,063.0
      10 | Northrop Grumman | 72 | Los Angeles | 34,757.0
  • 89. “California” – The Golden State. California's economy is the ninth (eighth in 2010) largest in the world when the states of the U.S. are compared with other countries.
      • California is home to more than 50 Fortune 500 companies (2011), with a total of $1,218,340.30 million in revenue.
      • California is home not only to the largest high-technology companies but also to the largest company in the world: Apple, with a market cap of over $420B, ranks 1st, with Exxon 2nd.
      • California is home to the leaders in the Internet, ICT, and supercomputers.
      • California is home to the largest and leading bioscience, life sciences, and biomedicine sectors.
      • California is home to leading nano and sensor technology.
      • California is home to many leading universities and a leading DOE national laboratory in science and technology.
      The University of California is well known for developing and operating academic research centers in cooperation with partners worldwide. UC Berkeley has a proud reputation for solving problems of interest not only to the state of California but to the people of the world.
  • 90. California’s commitment to the Leadership in Science and Technology (UCOP)
      – In the last part of the 20th century, California created the high-tech and biotechnology innovations that formed the backbone of today's “New Economy.” As we begin the 21st century, the state of California, the University of California and hundreds of the state's leading-edge businesses have joined together in an unprecedented partnership to lay the foundation for the “next New Economy.”
      – The Governor Gray Davis Institutes for Science and Innovation, now named for the former governor in recognition of his instrumental role in their creation, include:
      Source: UCOP
  • 91. California’s commitment to the Leadership in Science and Technology (UCOP)
      – Taken together, these four institutes represent a billion-dollar, multidisciplinary effort that focuses public/private resources and expertise simultaneously on research areas critical to sustaining California's economic growth and its competitiveness in the global marketplace.
      – The new ideas and technologies developed by researchers at the institutes help expand our economy into new industries and markets, and bring the benefits of innovation more quickly into the lives of people everywhere. These institutes open the doors to new understanding, new applications and new products through essential research in biomedicine, bioengineering, nanosystems, telecommunications and information technology.
      Source: UCOP
  • 92. Silicon Valley and Stanford University. “Stanford University, its affiliates, and graduates played a major role in the development of California's electronics and high-tech industry. From the 1890s, Stanford University's leaders saw its mission as service to the West and shaped the school accordingly. Regionalism helped align Stanford's interests with those of the area's high-tech firms for the first fifty years of Silicon Valley's development.” “During the 1940s and 1950s, Frederick Terman, as Stanford's dean of engineering and provost, encouraged faculty and graduates to start their own companies. He is credited with nurturing Hewlett-Packard, Varian Associates, and other high-tech firms, until what would become Silicon Valley grew up around the Stanford campus.” Source: Wikipedia
  • 93. Top Universities by Reputation 2012
      Reputation Rank | Institution | Country / Region | Overall score | Reputation | Reputation for teaching
      1 | Harvard University | United States | 100.0 | 100.0 | 100.0
      2 | Massachusetts Institute of Technology | United States | 87.2 | 81.7 | 90.0
      3 | University of Cambridge | United Kingdom | 80.7 | 82.7 | 79.7
      4 | Stanford University | United States | 72.1 | 67.5 | 74.5
      5 | University of California Berkeley | United States | 71.6 | 65.0 | 74.8
      6 | University of Oxford | United Kingdom | 71.2
  • 94. The World University Rankings 2011-2012
      World Rank | Institution | Country / Region | Overall score | Teaching | International mix | Industry income | Research | Citations
      1 | California Institute of Technology | United States | 94.8 | 95.7 | 56 | 97 | 98.2 | 99.9
      2 | Harvard University | United States | 93.9 | 95.8 | 67.5 | 35.9 | 97.4 | 99.8
      2 | Stanford University | United States | 93.9 | 94.8 | 57.2 | 63.8 | 98.9 | 99.8
      4 | University of Oxford | United Kingdom | 93.6 | 89.5 | 91.9 | 62.1 | 96.6 | 97.9
      5 | Princeton University | United States | 92.9 | 91.5 | 49.6 | 81 | 99.1 | 100
      6 | University of Cambridge | United Kingdom | 92.4 | 90.5 | 85.3 | 55.5 | 94.2 | 97.3
      7 | Massachusetts Institute of Technology | United States | 92.3 | 92.7 | 79.2 | 94.4 | 87.4 | 100
      8 | Imperial College London | United Kingdom | 90.7 | 88.8 | 92.2 | 93.1 | 88.7 | 93.9
      9 | University of Chicago | United States | 90.2 | 89.4 | 58.8 | Data withheld by THE | 90.8 | 99.4
      10 | University of California Berkeley | United States | 89.8
  • 95. List of U.S. States by Unemployment Rate. January 24, 2012, for December 2011. Source: Wikipedia
      State or District | Unemployment rate (seasonally adjusted) | Monthly percent change (= drop in unemployment)
      Nevada | 12.6 | 0.4%
      California | 11.1 | 0.2%
      Rhode Island | 10.8 | 0.3%
      Mississippi | 10.4 | 0.1%
      District of Columbia | 10.4 | 0.2%
      North Carolina | 9.9 | 0.1%
      Florida | 9.9 | 0.1%
      Illinois | 9.8 | 0.2%
      Georgia | 9.7 | 0.1%
      South Carolina | 9.5 | 0.4%
      Michigan | 9.3 | 0.5%
      Kentucky | 9.1 | 0.3%
      Indiana | 9.0 | 0.0%
      New Jersey | 9.0 | 0.1%
      Oregon | 8.9 | 0.2%
      Arizona | 8.7 | 0.0%
      Tennessee | 8.7 | 0.4%
      Washington | 8.5 | 0.2%
      Idaho | 8.4 | 0.1%
      United States (mean) | 8.3 | 0.2%
      Connecticut | 8.2 | 0.2%
      Alabama | 8.1 | 0.6%
      Ohio | 8.1 | 0.4%
      New York | 8.0 | 0.0%
      Missouri | 8.0 | 0.2%
      Colorado | 7.9 | 0.1%
      West Virginia | 7.9 | 0.0%
      Texas | 7.8 | 0.3%
      Arkansas | 7.7 | 0.2%
      Pennsylvania | 7.6 | 0.3%
      Delaware | 7.4 | 0.2%
      Alaska | 7.3 | 0.0%
      Wisconsin | 7.1 | 0.2%
      Maine | 7.0 | 0.0%
      Massachusetts | 6.8 | 0.2%
      Louisiana | 6.8 | 0.1%
      Montana | 6.8 | 0.3%
      Maryland | 6.7 | 0.2%
      New Mexico | 6.6 | 0.1%
      Hawaii | 6.6 | 0.1%
      Kansas | 6.3 | 0.2%
      Virginia | 6.2 | 0.0%
      Oklahoma | 6.1 | 0.0%
      Utah | 6.0 | 0.4%
      Wyoming | 5.8 | 0.0%
      Minnesota | 5.7 | 0.2%
      Iowa | 5.6 | 0.1%
      Vermont | 5.1 | 0.2%
      New Hampshire | 5.1 | 0.1%
      South Dakota | 4.2 | 0.1%
      Nebraska | 4.1 | 0.0%
      North Dakota | 3.3 | 0.1%
  • 96. Outline of Talk: Drivers for Change: Computing and Big Data; Computational Science and Engineering; State Leadership; California – “The Golden State”; The State New Economy Model; “Sustainable California” – a return to “The Golden State”
  • 97. The State New Economy Index* Methodology. The State New Economy Index uses 26 indicators, divided into five categories that best capture what is new about the New Economy: 1) Knowledge Jobs (5) 2) Globalization (2) 3) Economic Dynamism (3.5) 4) Transformation to a Digital Economy (3) 5) Technological Innovation Capacity (5). *Source: ITIF-Kauffman
  • 98. Top 10 US States ranked based on “The New Economy Index”
      2010: 1. Massachusetts (92.6); 2. Washington (77.5); 3. Maryland (76.9); 4. New Jersey (76.9); 5. Connecticut (76.6); 6. Delaware (75.0); 7. California (74.3); 8. Virginia (73.7); 9. Colorado (72.8); 10. New York (71.3)
      2008: 1. Massachusetts (97); 2. Washington (81.9); 3. Maryland (80); 4. Delaware (79.3); 5. New Jersey (77); 6. Connecticut (76.1); 7. Virginia (75.6); 8. California (75); 9. New York (74.4); 10. Colorado (70.4)
      2007: 1. Massachusetts (96.1); 2. New Jersey (86.4); 3. Maryland (85.0); 4. Washington (84.6); 5. California (82.9); 6. Connecticut (81.8); 7. Delaware (79.6); 8. Virginia (79.5); 9. Colorado (78.3); 10. New York (77.4)
      2002: 1. Massachusetts (90.0); 2. Washington (86.2); 3. California (85.5); 4. Colorado (84.3); 5. Maryland (75.6); 6. New Jersey (75.1); 7. Connecticut (74.2); 8. Virginia (72.1); 9. Delaware (70.5); 10. New York (69.3)
      1999: 1. Massachusetts (82.3); 2. California (74.3); 3. Colorado (72.3); 4. Washington (69.0); 5. Connecticut (64.9); 6. Utah (64.0); 7. New Hampshire (62.5); 8. New Jersey (60.9); 9. Delaware (59.9); 10. Arizona (59.2)
      CSE-CITRIS Role: Innovation and Better Education
  • 99. US States ranked based on “The New Economy Index” and two new PCA ranking models. CSE-CITRIS Role: Innovation and Better Education
      Position | ITIF-Kauffman Ranking | 26 Attributes PCA (MNIK2012) | 5 Categories PCA (MNIK2012)
      1 | Massachusetts | Massachusetts | Massachusetts
      2 | Washington | Washington | New Jersey
      3 | Maryland | Connecticut | Connecticut
      4 | New Jersey | Maryland | Washington
      5 | Connecticut | New Jersey | Maryland
      6 | Delaware | Virginia | Delaware
      7 | California | California | California
      8 | Virginia | Colorado | Virginia
      9 | Colorado | Delaware | New York
      10 | New York | New Hampshire | Colorado
      11 | New Hampshire | Minnesota | New Hampshire
      12 | Utah | Utah | Minnesota
      13 | Minnesota | New York | Utah
      14 | Oregon | Oregon | Oregon
      15 | Illinois | Illinois | Illinois
      16 | Rhode Island | Michigan | Rhode Island
      17 | Michigan | Rhode Island | Texas
      18 | Texas | Pennsylvania | Michigan
      19 | Georgia | Texas | Georgia
      20 | Arizona | Vermont | Florida
      21 | Florida | Arizona | Pennsylvania
      22 | Pennsylvania | Georgia | Arizona
      23 | Vermont | North Carolina | Vermont
      24 | North Carolina | Ohio | North Carolina
      25 | Ohio | Idaho | Kansas
      26 | Kansas | Kansas | Ohio
      27 | Idaho | Wisconsin | Nevada
      28 | Maine | Florida | Maine
      29 | Wisconsin | Missouri | Idaho
      30 | Nevada | Nebraska | Wisconsin
      31 | Alaska | New Mexico | Alaska
      32 | New Mexico | Maine | Missouri
      33 | Missouri | Iowa | Nebraska
      34 | Nebraska | Alaska | Hawaii
      35 | Indiana | North Dakota | Indiana
      36 | Montana | Hawaii | Iowa
      37 | North Dakota | Indiana | North Dakota
      38 | Iowa | South Carolina | New Mexico
      39 | South Carolina | Nevada | Tennessee
      40 | Hawaii | South Dakota | South Carolina
      41 | Tennessee | Tennessee | Montana
      42 | Oklahoma | Montana | Louisiana
      43 | Kentucky | Oklahoma | Oklahoma
      44 | Louisiana | Wyoming | Kentucky
      45 | South Dakota | Alabama | South Dakota
      46 | Wyoming | Kentucky | Wyoming
      47 | Alabama | Louisiana | Alabama
      48 | Arkansas | Arkansas | Arkansas
      49 | West Virginia | West Virginia | West Virginia
      50 | Mississippi | Mississippi | Mississippi
  • 100. The State New Economy Index*: indicators by category
      – KNOWLEDGE JOBS: IT Professionals; Professional and Managerial Jobs; Workforce Education; Immigration of Knowledge Workers; U.S. Migration of Knowledge Workers; Manufacturing Value-Added; Traded-Services Employment
      – GLOBALIZATION: Export Focus on Manufacturing and Services; Foreign Direct Investment (FDI)
      – ECONOMIC DYNAMISM: Job Churning; Initial Public Offerings (IPOs); Entrepreneurial Activity; Inventor Patents; Fastest-Growing Firms
      – DIGITAL ECONOMY: Online Population; Digital Government; Farms and Technology; Broadband; Health IT
      – INNOVATION CAPACITY: High-Tech Employment; Scientists and Engineers; Patents; Industry R&D; Non-industry R&D; Green Economy; Venture Capital
      *Ref.: ITIF and Kauffman Foundation
  • 101. Top 10 US States ranked based on “The New Economy Index”, by category
      – Knowledge Jobs (5): 1 Massachusetts (17.39); 2 Connecticut (16.78); 3 Maryland (15.40); 4 Virginia (15.37); 5 Delaware (13.94); 6 Minnesota (13.94); 7 New Jersey (13.85); 8 Washington (13.80); 9 New York (13.66); 10 New Hampshire (12.96); 13 California (10.70)
      – Globalization (2): 1 Delaware (18.05); 2 Texas (16.39); 3 South Carolina (15.31); 4 New Jersey (14.73); 5 Connecticut (14.68); 6 Massachusetts (14.59); 7 Kentucky (14.24); 8 New York (14.21); 9 Washington (13.73); 10 North Carolina (13.61); 17 California (13.17)
      – Economic Dynamism (3.5): 1 Utah (14.94); 2 Colorado (13.74); 3 Georgia (13.38); 4 Massachusetts (13.30); 5 Florida (13.09); 6 Montana (12.87); 7 Arizona (12.64); 8 Nevada (12.56); 9 California (12.01); 10 Idaho (11.86)
      – Digital Economy (3): 1 Massachusetts (16.40); 2 Rhode Island (15.53); 3 New Jersey (15.13); 4 Maryland (14.29); 5 Connecticut (14.09); 6 California (14.07); 7 New York (14.03); 8 Oregon (13.58); 9 Washington (13.41); 10 Virginia (12.82)
      – Innovation Capacity (5): 1 Massachusetts (19.0); 2 Washington (17.5); 3 California (15.0); 4 Maryland (13.4); 5 Delaware (13.1); 6 Colorado (13.0); 7 New Hampshire (12.2); 8 New Jersey (12.2); 9 Virginia (12.0); 10 New Mexico (11.8)
      CSE-CITRIS Role: Innovation and Better Education
  • 102. PCA Analysis of US States Ranking: The New Economy Index (26 Indicators). [Figure: projection of the cases on the factor plane (1 x 3); cases with sum of cosine square >= 0.00. Horizontal axis: Factor 1 (34.46%); vertical axis: Factor 3 (10.00%). Points are labeled with state abbreviations plus US; groups marked: Top 25 States, Bottom 25 States.]
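The factor percentages on this slide (34.46% for Factor 1, 10.00% for Factor 3) are explained-variance shares from a principal component analysis of the 26 indicators. The slides do not give the indicator values or the exact procedure behind the MNIK2012 rankings, so the following is only a sketch under stated assumptions: indicators are standardized, PCA is computed through an SVD, and states are ordered by their score on the first factor. The data matrix is random placeholder data and `pca_scores` is a hypothetical helper name.

```python
import numpy as np


def pca_scores(indicators: np.ndarray):
    """Return (factor scores, explained-variance shares) for standardized indicators."""
    # Standardize each indicator so that units do not dominate the factors.
    z = (indicators - indicators.mean(axis=0)) / indicators.std(axis=0)
    # Singular values squared are proportional to the variance along each factor.
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    scores = z @ vt.T  # column k holds every case's score on factor k+1
    return scores, explained


# Placeholder data (an assumption): 50 states x 26 indicators.
rng = np.random.default_rng(42)
data = rng.normal(size=(50, 26))
scores, explained = pca_scores(data)

# Order states by their first-factor score, highest first. The sign of a
# principal component is arbitrary, so a real analysis would orient it first.
order = np.argsort(-scores[:, 0])
```

Plotting `scores[:, 0]` against `scores[:, 2]` would give a factor-plane view analogous to the one on the slide, with `explained[0]` and `explained[2]` as the axis percentages.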
  • 103. Outline of Talk: Drivers for Change: Computing and Big Data; Computational Science and Engineering; State Leadership; California – “The Golden State”; The State New Economy Model; “Sustainable California” – a return to “The Golden State”
  • 104. CDISC, ACCESS, Insight. Socio-Economic Modeling for Large-scale Quantitative Climate/Environmental Change Analysis. [Figure axes: increased climate/environmental detail; increased socio-economic detail; scale labels: Tera, Peta, Peta, Exa.] En Informatics; Environment-Genetic. World population: today ~6B, 2050 ~9B, 2100 ~10B; 70% will live in cities by 2050. By 2020: 35 trillion gigabytes of data (the cyber-physical world is connected through billions to even trillions of sensors and devices). Petaflop with ~1M cores in your PC by 2025? Health, Freshwater, Food Security, Ecosystems, and Urban Metabolism
  • 105. “Sustainable California” – a return to the Golden State. A statewide initiative to create integrated systems and advanced analytic tools using advanced computational science and engineering:
      – building upon massive-scale datasets, streaming and static (sensors/social-economic)
      – employing sophisticated analytics, with an emphasis on modeling, simulation, and crowdsourcing
      – focusing on major domains critical for human welfare and environmental quality (Environment and Security): urban metabolism and smart cities, food security, fresh water resources, public health, natural disasters, energy conservation, and ecosystems
      – educating the next generation of interdisciplinary students and industry leaders
  • 106. California can improve the standard of living by applying predictive simulation systems and integrated advanced analytic tools, built on advanced computational science and engineering, to critical problems facing the state. How can California respond to a rapidly changing environment, climate change, socio-economic forces and demographics?
      – water resources, public health, natural disasters, energy conservation, environment and security
      Predictive simulation and advanced analytics can be used to
      – understand the impacts of policy choices
      – understand social and economic impacts
      – create new technologies and industries
      – find more efficient solutions to California’s pressing infrastructure problems
      TURING’s TEST. Turing: a computer can be said to be intelligent if its answers are indistinguishable from the answers of a human being. Health, Freshwater, Food, Energy, Environment Security, Ecosystems, and Urban Metabolism