Data Science in 2016:
Moving Up
2015-10-15 • Madrid • http://bigdataspain.org/
Paco Nathan, @pacoid

O’Reilly Media
• general patterns
• trends and analysis: the discipline, the jobs
• some good examples: moving up into use cases
• glimpses ahead: an emerging context
• a proposed theme
Data Science 2016: Moving Up
Design Patterns
Methodology for cloud-computing architecture (2008-06-29)
http://ceteri.blogspot.com/2008/06/methodology-for-cloud-computing.html
[Diagram: “some cloud” with cluster scheduler, data pipes, containers, analytics, search/index, elastic compute, elastic storage]
Design Patterns
DataStax: $189.7M
Confluent: $30.9M
Databricks: $47M
Jupyter: $6M
Elastic: $104M
Docker: $162M
Mesosphere: $48.75M
Design Patterns: Issues
• integration could be better
• that implies sharing markets
• VCs in Silicon Valley dislike that
• customers need integration
Design Patterns: Where?
• that playing field becomes overly crowded, soon…
• what happens at that point?
• so much emphasis on plumbing: “data engineering”
• not enough on domain expertise, which trumps all
Much activity in Big Data seems awkwardly focused at the
bottom of the tech stack: infrastructure, not domain
However, that may be changing…
Design Patterns: Opinion
Interesting Trends
There are many possible trends to discuss, but let’s concentrate on four of these going into 2016:
• leveraging multicore and large memory spaces
• generalized libraries for frequently repeated work
• workflows blend the best of people and computing
• framework for a big leap ahead, not just incremental
Original definitions for what became relational
databases had less to do with dedicated SQL
products and more in common with something like Spark SQL
Interesting Trend #1: Contemporary Hardware
A relational model of data for large shared data banks
Edgar Codd
Communications of the ACM (1970)
dl.acm.org/citation.cfm?id=362685
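One way to read the point about the relational model: it is a small algebra over sets of tuples that any engine can implement, not something tied to a dedicated SQL product. The following is a minimal sketch in plain Python. The data and the helper names (`restrict`, `project`, `natural_join`) are hypothetical, and this is not Spark SQL's API.

```python
# Illustrative sketch only: relations as lists of dicts, with three
# operations in the spirit of the relational model.
# The table contents and helper names are hypothetical.

def restrict(relation, predicate):
    """Keep only the rows (tuples) that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def project(relation, attributes):
    """Keep only the named attributes, dropping duplicate rows."""
    seen = []
    for row in relation:
        reduced = {name: row[name] for name in attributes}
        if reduced not in seen:
            seen.append(reduced)
    return seen

def natural_join(left, right):
    """Combine rows that agree on every attribute the two relations share."""
    joined = []
    for l_row in left:
        for r_row in right:
            shared = set(l_row) & set(r_row)
            if all(l_row[name] == r_row[name] for name in shared):
                joined.append({**l_row, **r_row})
    return joined

talks = [
    {"speaker": "A", "city": "Madrid", "year": 2015},
    {"speaker": "B", "city": "Madrid", "year": 2014},
    {"speaker": "C", "city": "Seattle", "year": 2015},
]
venues = [
    {"city": "Madrid", "country": "Spain"},
    {"city": "Seattle", "country": "USA"},
]

# Compose the operations: 2015 talks, joined to venues, reduced to two columns.
recent = restrict(talks, lambda row: row["year"] == 2015)
result = project(natural_join(recent, venues), ["speaker", "country"])
```

Because each operation takes relations in and returns a relation, they compose freely, which is the property a query optimizer relies on when it rewrites a logical plan.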
[Diagram, from Databricks: “Unified API, One Engine, Automatically Optimized”. Language frontend (Python, Java/Scala, R, SQL, …); DataFrame; Logical Plan; Tungsten backend (LLVM, JVM, GPU, NVRAM, …)]
Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal
Josh Rosen
spark-summit.org/2015/events/deep-dive-into-project-tungsten-bringing-spark-closer-to-bare-metal/
Physical Execution:
CPU Efficient Data Structures
Keep data closer to CPU cache
from Databricks
Interesting Trend #2: Generalized Libraries
Tensors are a good way to handle time-series, geo-spatially distributed, linked data with lots of N-dimensional attributes
In other words, nearly a general case for handling
much of the data that we’re likely to encounter
That’s better than attempting to shoehorn data
into matrix representation, then writing lots of
custom code to support it
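To make the contrast with matrix “shoehorning” concrete, here is a minimal sketch of a sparse 3-way tensor indexed by (time, location, attribute), using only the Python standard library. The readings and the helper names `fiber` and `unfold` are made up for illustration, and a real tensor library would look different.

```python
# Illustrative sketch only: a sparse 3-way tensor indexed by
# (time, location, attribute). All readings below are made up.
from collections import defaultdict

# Each key holds one coordinate per mode; the value is the observation.
tensor = {
    ("2015-10-14", "Madrid", "temp_c"): 21.0,
    ("2015-10-14", "Madrid", "humidity"): 0.55,
    ("2015-10-15", "Madrid", "temp_c"): 19.0,
    ("2015-10-15", "Sevilla", "temp_c"): 25.0,
}

def fiber(tensor, mode, fixed):
    """Fix every mode except `mode` and return the values along it.

    `fixed` maps a mode index to the coordinate required in that mode.
    """
    out = {}
    for key, value in tensor.items():
        if all(key[m] == coord for m, coord in fixed.items()):
            out[key[mode]] = value
    return out

def unfold(tensor, mode):
    """Flatten the tensor into a matrix whose rows follow `mode`.

    The remaining modes are fused into one compound column key, which is
    the step where the separate axes stop being addressable on their own.
    """
    matrix = defaultdict(dict)
    for key, value in tensor.items():
        column = tuple(c for m, c in enumerate(key) if m != mode)
        matrix[key[mode]][column] = value
    return dict(matrix)

# Time series of temperature in Madrid: vary mode 0, fix modes 1 and 2.
madrid_temps = fiber(tensor, mode=0, fixed={1: "Madrid", 2: "temp_c"})

# Matrix view by location: columns become (time, attribute) pairs.
by_location = unfold(tensor, mode=1)
```

In the tensor form, a time series or a per-attribute slice is one call to `fiber`. In the unfolded matrix, the same question needs custom code to pick apart the compound column keys.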
Tensor factorization may be problematic, but
probabilistic solutions seem to provide relatively
general case solutions:
The Tensor Renaissance in Data Science
Anima Anandkumar @UC Irvine
radar.oreilly.com/2015/05/the-tensor-renaissance-in-data-science.html
Spacey Random Walks and Higher Order Markov Chains
David Gleich @Purdue
slideshare.net/dgleich/spacey-random-walks-and-higher-order-markov-chains
Interesting Trend #3: Leveraging Workflows
[Workflow diagram, circa 2010, spanning representation, optimization, evaluation. Stages: ETL into cluster/cloud; Data Prep; Features; Explore; Unsupervised Learning; train set, test set; Learners, Parameters; models; Evaluate; Optimize; Scoring; production data; visualize, reporting. Also labeled: data pipelines, use cases, actionable results, decisions, feedback, foo algorithms, bar developers]
APIs, algorithms, developer-centric template thinking – these only go so far; the overall context is a workflow…
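The machine-side stages of such a workflow can be sketched as a chain of small functions. This is a toy under stated assumptions: the records, the one-feature threshold “learner”, and every function name are hypothetical, and the human-facing stages (explore, decisions, feedback) are not modeled.

```python
# Illustrative sketch only: the workflow stages as plain functions.
# The data, the one-feature "learner", and all names are hypothetical.

def prepare(raw):
    """Data prep: drop records with missing values."""
    return [r for r in raw if r["hours"] is not None]

def featurize(records):
    """Features: turn each record into a (feature, label) pair."""
    return [(r["hours"], r["passed"]) for r in records]

def split(examples, test_every=3):
    """Hold out every third example (indices 0, 3, ...) as the test set."""
    train = [e for i, e in enumerate(examples) if i % test_every != 0]
    test = [e for i, e in enumerate(examples) if i % test_every == 0]
    return train, test

def accuracy(threshold, examples):
    """Evaluate: fraction of examples where (x >= threshold) matches the label."""
    hits = sum((x >= threshold) == label for x, label in examples)
    return hits / len(examples)

def train(examples):
    """Learner plus optimize: pick the threshold with the best training accuracy."""
    candidates = sorted({x for x, _ in examples})
    return max(candidates, key=lambda t: accuracy(t, examples))

def score(threshold, hours):
    """Scoring: apply the trained model to a production record."""
    return hours >= threshold

raw = [
    {"hours": 1, "passed": False},
    {"hours": 2, "passed": False},
    {"hours": None, "passed": True},
    {"hours": 3, "passed": False},
    {"hours": 7, "passed": True},
    {"hours": 6, "passed": True},
    {"hours": 8, "passed": True},
]

train_set, test_set = split(featurize(prepare(raw)))
model = train(train_set)
test_accuracy = accuracy(model, test_set)
decision = score(model, hours=7)
```

Each function maps to one box in the diagram. What the code cannot show is the part the slide emphasizes: someone still has to decide what counts as a use case and act on the result.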
look beyond an API, beyond a
code repo … think of people
and machines working together
Chris Ré, @Stanford
https://www.macfound.org/fellows/943/
Drugs, DNA, and Dinosaurs: Building High Quality Knowledge Bases with DeepDive
Strata CA (2015)
The Thorn in the Side of Big Data: too few artists
Strata CA (2014)
Interesting Trend #4: A Leap Ahead
cognitive computing “flywheel”:
probabilistic reasoning about complex
data and predictions together
Data Scientists
William Cleveland
“Data Science: an Action Plan for Expanding the Technical Areas of the Field of Statistics,”
International Statistical Review (2001), 69, 21-26
http://www.stat.purdue.edu/~wsc/papers/datascience.pdf
Leo Breiman
“Statistical modeling: the two cultures”,
Statistical Science (2001), 16:199-231
http://projecteuclid.org/euclid.ss/1009213726
…also good to mention John Tukey
Data Scientists: Primary Sources
Data Scientists: Five Years of Strata Conference
One 2015 report (RJMetrics) tallied a minimum of 11,400 data scientists worldwide by scraping LinkedIn
So many suddenly, really? Perhaps that’s doubtful…
Comparing surveys: O’Reilly Media conducts salary surveys for data scientists, which also explore the tools used
2013 – tools, trends, not all data is “Big”, coding scripts!
2014 – correlation of tools and skills, rapid evolution
2015 – divide blurring between open source and proprietary
Data Scientists: Everywhere, all the time?
http://radar.oreilly.com/2015/09/2015-data-science-salary-survey.html
John King, Roger Magoulas
Data Scientists: 2015 Survey
Moving Up
Enlitic http://www.enlitic.com/
deep learning to assist doctors treating cancer
Moving Up: Medicine
“Whatever the models might discover or predict, Howard
isn’t suggesting they’ll do away with a doctor’s judgment.
Rather, artificially intelligent computers could provide strong,
unbiased second opinions, or perhaps lead a doctor down
a path of investigation she otherwise wouldn’t have considered.”
With Enlitic, a veteran data scientist plans to fight disease using deep learning
GigaOM (2014-08-22)
https://gigaom.com/2014/08/22/with-enlitic-a-veteran-data-scientist-plans-to-fight-disease-using-deep-learning/
Moving Up: Political Platform
http://www.predikon.ch/en/voting-patterns/residents
Mining Democracy
Matthias Grossglauser @EPFL
ICT Labs (2015)
http://ictlabs-summer-school.sics.se/slides/mining%20democracy.pdf
What if a political candidate could cluster political
positions in a multi-dimensional data space, to
optimize for being recommended to voters?
http://www.predikon.ch/en/voting-patterns/residents
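The “what if” above can be made concrete with a generic clustering sketch. This assumes voters are points in a two-dimensional position space and uses plain k-means (Lloyd's algorithm) with fixed starting centroids. The data is invented, and nothing here is taken from the Predikon site or the cited slides.

```python
# Illustrative sketch only: k-means over made-up position vectors.
# Each voter is a point in [0, 1]^2, e.g. stance on two ballot issues.

def distance_sq(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm with caller-supplied starting centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: distance_sq(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

voters = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15),
          (0.8, 0.9), (0.9, 0.8), (0.85, 0.85)]

centroids, clusters = kmeans(voters, centroids=[(0.0, 0.0), (1.0, 1.0)])

# A candidate "optimizing for recommendation" might adopt the centroid
# of the largest cluster as a platform (ties go to the first cluster).
largest = max(range(len(clusters)), key=lambda i: len(clusters[i]))
platform = centroids[largest]
```

The ethical question in the slide is exactly the last two lines: once positions are coordinates, choosing a platform becomes an optimization target.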
Moving Up: Government Ethics
The White House has a plan to help society through data analysis
Fortune (2015-09-30)
http://fortune.com/2015/09/30/dj-patil-white-house-data/
“Opening up government data about child labor to concerned data
scientists; recruiting folks to help analyze data about suicide prevention,
social injustice and incarceration; a call for mandatory and ‘intrinsic’
ethics instruction in every course teaching students data science; and an
effort to help the transgender community create its own census of sorts,
so that members and society can get a better grasp on the issues that
matter to the group.”
Moving Up: Neuroscience
Analytics + Visualization for Neuroscience: Spark, Thunder, Lightning
Jeremy Freeman
2015-01-29
youtu.be/cBQm4LhHn9g?t=28m55s
For excellent examples of Science and Data together see CodeNeuro, particularly for use of Jupyter notebooks + Apache Spark
Learning
Learning: What About MOOCs?
Massive Open Online Courses – seven year trend, beginning with:
Connectivism and Connective Knowledge
George Siemens, Stephen Downes
University of PEI (2008)
http://cck11.mooc.ca/
Adios EdTech. Hola something else
George Siemens (2015-09-09)
http://www.elearnspace.org/blog/2015/09/09/adios-ed-tech-hola-something-else/
Online education: MOOCs taken by educated few
Ezekiel Emanuel, Nature 503, 342 (2013-11-21)
• 80% of students already have an advanced degree
• 80% come from the richest 6% of the population
Michael Shanks @Stanford: “retrenchment around traditional
disciplines will make disparities even more pronounced”
An Early Report Card on Massive Open Online Courses
Geoffrey Fowler, WSJ (2013-10-08)
Amherst, Duke, etc., have rejected edX
So then, what else works better?
How to Flip a Class
CTL @UT/Austin
http://ctl.utexas.edu/teaching/flipping-a-class/how
1. identify where the flipped classroom model makes the most sense for your course
2. spend class time engaging students in application activities with feedback
3. clarify connections between inside and outside of class learning
4. adapt your materials for students to acquire course content in preparation for class
5. extend learning beyond class through individual and collaborative practice
Learning: Inverted Classroom
Scalable Learning
David Black-Schaffer @Uppsala
Sverker Janson @KTH SICS
https://www.scalable-learning.com/
• active learning: Flipped Classroom and Just-in-Time Teaching
• exams built directly into specific diagrams within videos
• metrics for where in video+code that students get stuck
• instructor can customize subsequent classroom discussions (active teaching phase) based on stuck/unstuck metrics
Learning programming at scale
Philip Guo
O’Reilly Radar (2015-08-13)
http://radar.oreilly.com/2015/08/learning-programming-at-scale.html
• Python Tutor
• Codechella
Tutors could keep an eye on around 50 learners during a 30-minute session, start 12 chat conversations, and help 3 learners concurrently
Learning: Collaborative Learning
Data-driven Education and the Quantified Student
Lorena Barba @GWU
PyData Seattle (2015)
https://youtu.be/2YIZ2SY9mW4
• keynote talk: abstract, slides
• homepage
• Open edX Universities Symposium, DC 2015-11-11
Learning: If you study just one link from this talk…
If by some bizarre chance you haven’t used it already, go to https://jupyter.org/
• 50+ different language kernels
• new funding 2015-07
• UC Berkeley, Cal Poly
• nbgrader autograder by Jess Hamrick
• jupyterhub multi-user server
• curating a list of examples
• repeatable science!
see also:
Teaching with Jupyter Notebooks
http://tinyurl.com/scipy2015-education
Learning: Jupyter Project
Embracing Jupyter Notebooks at O'Reilly
Andrew Odewahn
O’Reilly Media (2015-05-07)
https://beta.oreilly.com/ideas/jupyter-at-oreilly
O’Reilly Media is using our Atlas platform to make Jupyter Notebooks a first class authoring environment for our publishing program
Jupyter, Thebe, Atlas, Docker, etc.
Learning: O’Reilly Media
https://beta.oreilly.com/
[Diagram: formats from in-person through blended to on-demand, labeled Mostly Synchronous, Mostly Asynch, Inverted Classroom, Subscription, Free Content]
Learning: Audience Patterns
Is it possible to measure “distance” between a learner and a subject community?
From Amateurs to Connoisseurs: Modeling the Evolution of User Expertise through Online Reviews
Julian McAuley, Jure Leskovec
http://i.stanford.edu/~julian/pdfs/www13.pdf
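One naive reading of “distance” is how far a learner's vocabulary sits from the vocabulary a community uses. The sketch below computes cosine distance between word-count vectors. The texts are invented, and this is an assumption made for illustration, not the model from the cited paper.

```python
# Illustrative sketch only: cosine distance between word-count vectors.
# The texts are made up; this is not the model from the cited paper.
import math
from collections import Counter

def cosine_distance(a, b):
    """1 - cosine similarity of two Counters; near 0 means similar usage."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # An empty text shares nothing with anything.
    return 1.0 - dot / (norm_a * norm_b)

community = Counter("tensor factorization spark dataframe tensor kernel".split())
novice = Counter("spreadsheet chart spark".split())
practitioner = Counter("spark dataframe tensor kernel".split())

novice_gap = cosine_distance(novice, community)
practitioner_gap = cosine_distance(practitioner, community)
```

With these toy inputs the practitioner lands closer to the community than the novice does. A usable measure would need to account for time, since expertise evolves.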
Learning: Machine Learning about People Learning
Learning, Assessment, Team Building, Diversity –
these can be accomplished together, in situ
Collective Intelligence in Human Groups
Anita Williams Woolley @CMU
https://youtu.be/Bz1dDiW2mvM
• balance of participation (no one dominates)
• 2+ women engaging within the group
• group size < 9
• diversity of formal backgrounds
People + Automation
Data Science teams apply machine learning (automation)
to help arrive at key insights, to learn what is important
in data sets – finding the proverbial needle in the haystack
Cognitive Computing exhibits people + automation as a process, in a learning context
That’s also a basic tenet of workflows in general: people + automation
And a key aspect of the emerging gig economy too…
People + Automation: Gig Economy
http://orchestra.unlimitedlabs.com/
“Workflows with humans and machines”
Workers in a World of Continuous Partial Employment
Tim O’Reilly
Medium (2015-08-31)
https://medium.com/the-wtf-economy/workers-in-a-world-of-continuous-partial-employment-4d7b53f18f96
http://conferences.oreilly.com/next-economy
Learning is key. Effective use of Data Science in these new
economic conditions requires people + automation, learning
together – albeit in different ways. Plus, there’s an excellent
framework for that:
Autopoiesis and Cognition
Humberto Maturana, Francisco Varela
Springer (1973)
https://books.google.es/books?id=nVmcN9Ja68kC
I’d like to leave this as a theme for you to consider about Data Science 2016, Moving Up into use cases…
We see an intersection of key points in both the emerging
Cognitive Computing context and the Gig Economy in general:
systems of people + automation, learning together
It posits an interesting duality for us to leverage
With that I wish you a great conference here at Big Data Spain!
Gracias
contact:
Just Enough Math
O’Reilly (2014)
justenoughmath.com
preview: youtu.be/TQ58cWgdCpA
monthly newsletter for updates, events, conf summaries, etc.:
liber118.com/pxn/
Intro to Apache Spark
O’Reilly (2015)
shop.oreilly.com/product/0636920036807.do

Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014
 
Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...
 Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ... Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...
Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...
 

Data Science in 2016: Moving Up

  • 2. Data Science in 2016: Moving Up 2015-10-15 • Madrid • http://bigdataspain.org/ Paco Nathan, @pacoid
 O’Reilly Media
  • 3. Data Science 2016: Moving Up • general patterns • trends and analysis: the discipline, the jobs • some good examples: moving up into use cases • glimpses ahead: an emerging context • a proposed theme
  • 5. Design Patterns Methodology for cloud-computing architecture
 (2008-06-29) http://ceteri.blogspot.com/2008/06/methodology-for-cloud-computing.html
  • 9. Design Patterns: Issues some cloud • integration could be better • that implies sharing markets • VCs in Silicon Valley dislike that • customers need integration
  • 15. Design Patterns: Where? some cloud • that playing field becomes overly crowded, soon… • what happens at that point?
  • 16. • so much emphasis on plumbing: `data engineering` • not enough on domain expertise, which trumps all Much activity in Big Data seems awkwardly focused at the bottom of the tech stack: infrastructure, not domain However, that may be changing… Design Patterns: Opinion
  • 18. Interesting Trends There are many possible trends to discuss, but let’s 
 concentrate on four of these going into 2016: • leveraging multicore and large memory spaces • generalized libraries for frequently repeated work • workflows blend the best of people and computing • framework for a big leap ahead, not just incremental
  • 19. Interesting Trend #1: Contemporary Hardware Original definitions for what became relational databases had less to do with dedicated SQL products, more similarity with something like Spark SQL
 A relational model of data for large shared data banks
 Edgar Codd
 Communications of the ACM (1970)
 dl.acm.org/citation.cfm?id=362685
  • 20. Interesting Trend #1: Contemporary Hardware [diagram, from Databricks] Unified API, One Engine, Automatically Optimized. Language frontend: Python, Java/Scala, R, SQL, … DataFrame, Logical Plan. Tungsten backend: JVM, LLVM, GPU, NVRAM, …
  • 21. Interesting Trend #1: Contemporary Hardware Deep Dive into Project Tungsten: 
 Bringing Spark Closer to Bare Metal
 Josh Rosen
 spark-summit.org/2015/events/deep-dive-into-project-tungsten-bringing-spark-closer-to-bare-metal/ Physical Execution: CPU Efficient Data Structures. Keep data closer to CPU cache (from Databricks)
  • 22. Interesting Trend #2: Generalized Libraries Tensors are a good way to handle time-series 
 geo-spatially distributed linked data with lots 
 of N-dimensional attributes. In other words, nearly a general case for handling much of the data that we’re likely to encounter. That’s better than attempting to shoehorn data into matrix representation, then writing lots of custom code to support it.
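As a rough illustration of the point on this slide, the sketch below stores a few readings as a rank-3 tensor indexed by (time step, location, attribute), then flattens the same data into a matrix. The place names, attribute names, values, and helper functions are all hypothetical, chosen only for illustration; none of them come from the talk.

```python
# Hypothetical axes: none of these names or values come from the talk.
times = ["t0", "t1", "t2"]
locations = ["madrid", "berlin"]
attributes = ["temperature", "humidity"]

# tensor[t][l][a]: a rank-3 tensor held as nested lists.
tensor = [
    [[21.0, 0.40], [15.0, 0.55]],
    [[23.0, 0.38], [16.0, 0.60]],
    [[22.0, 0.42], [14.0, 0.58]],
]


def shape(t):
    """Return the size of each axis of a regular nested list."""
    dims = []
    while isinstance(t, list):
        dims.append(len(t))
        t = t[0]
    return tuple(dims)


def mean_over_time(tensor, loc, attr):
    """Average one attribute at one location across all time steps."""
    values = [step[loc][attr] for step in tensor]
    return sum(values) / len(values)


def flatten_to_matrix(tensor):
    """Shoehorn the tensor into a matrix: one row per time step,
    one column per (location, attribute) pair."""
    return [[value for loc in step for value in loc] for step in tensor]


# With the tensor, each axis keeps its own index.
madrid_temp = mean_over_time(
    tensor, locations.index("madrid"), attributes.index("temperature")
)

# With the matrix, the caller must compute column offsets by hand.
matrix = flatten_to_matrix(tensor)
berlin_humidity_col = (
    locations.index("berlin") * len(attributes) + attributes.index("humidity")
)
```

The last expression is the kind of custom index arithmetic that the matrix layout forces on the caller, and it grows with every extra axis; the tensor layout keeps each axis addressable on its own.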
  • 23. Interesting Trend #2: Generalized Libraries Tensor factorization may be problematic, but probabilistic methods seem to provide relatively general-case solutions: The Tensor Renaissance in Data Science
 Anima Anandkumar @UC Irvine
 radar.oreilly.com/2015/05/the-tensor-renaissance-in-data-science.html Spacey Random Walks and 
 Higher Order Markov Chains
 David Gleich @Purdue
 slideshare.net/dgleich/spacey-random-walks-and-higher-order-markov-chains
  • 24. Interesting Trend #3: Leveraging Workflows [workflow diagram, circa 2010; stages: representation, optimization, evaluation; labels: ETL into cluster/cloud, Data Prep, Features, Explore, Unsupervised Learning, visualize, reporting, Learners, Parameters, train set, test set, models, Evaluate, Optimize, Scoring, production data, data pipelines, use cases, actionable results, decisions, feedback, developers, algorithms] APIs, algorithms, developer-centric template thinking – 
 these only go so far; the overall context is a workflow…
  • 25. Interesting Trend #3: Leveraging Workflows [same workflow diagram] look beyond an API, beyond a code repo … think of people and machines working together
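The stages named in the workflow diagram can be sketched end to end in a few lines. This is a minimal sketch under stated assumptions, not the speaker's pipeline: the toy data, the single-threshold "learner", the fixed every-fourth-row split, and all function names are hypothetical.

```python
def prepare(raw):
    """Data prep: drop records with a missing feature value."""
    return [r for r in raw if r["x"] is not None]


def split(rows):
    """Hold out every fourth row as the test set (deterministic split)."""
    train = [r for i, r in enumerate(rows) if i % 4 != 3]
    test = [r for i, r in enumerate(rows) if i % 4 == 3]
    return train, test


def accuracy(threshold, rows):
    """Evaluate: fraction of rows where 'x >= threshold' matches the label."""
    hits = sum((r["x"] >= threshold) == r["label"] for r in rows)
    return hits / len(rows)


def optimize(train, candidates):
    """Optimize: pick the candidate parameter that scores best on the train set."""
    return max(candidates, key=lambda t: accuracy(t, train))


# Hypothetical raw data: label is True when x is at least 10,
# plus one broken record that data prep should remove.
raw = [{"x": float(i), "label": i >= 10} for i in range(20)]
raw.append({"x": None, "label": False})

clean = prepare(raw)
train, test = split(clean)
best = optimize(train, candidates=[2.0, 6.0, 10.0, 14.0])
test_accuracy = accuracy(best, test)

# Scoring on new "production" data with the chosen parameter.
scores = [x >= best for x in (3.5, 12.0)]
```

Each function maps to one box in the diagram, which is the limit of what an API-centric view captures; the decisions and feedback that the slide places around the pipeline involve people and are not modeled here.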
  • 26. Interesting Trend #4: A Leap Ahead Chris Ré, @Stanford
 https://www.macfound.org/fellows/943/ Drugs, DNA, and Dinosaurs: Building High Quality Knowledge Bases with DeepDive
 Strata CA (2015) The Thorn in the Side of Big Data: too few artists
 Strata CA (2014)
  • 27. Interesting Trend #4: A Leap Ahead cognitive computing “flywheel”: probabilistic reasoning about complex data and predictions together
  • 30. William Cleveland 
 “Data Science: an Action Plan for Expanding 
 the Technical Areas of the Field of Statistics,” 
 International Statistical Review (2001), 69, 21-26 http://www.stat.purdue.edu/~wsc/papers/datascience.pdf Leo Breiman
 “Statistical modeling: the two cultures”, 
 Statistical Science (2001), 16:199-231 http://projecteuclid.org/euclid.ss/1009213726 …also good to mention John Tukey Data Scientists: Primary Sources
  • 31. Data Scientists: Five Years of Strata Conference
  • 32. Data Scientists: Everywhere, all the time? One 2015 report (RJMetrics) tallied a minimum of 
 11,400 data scientists worldwide by scraping LinkedIn. So many suddenly, really? Perhaps that’s doubtful… Comparing surveys: O’Reilly Media conducts salary surveys 
 for data scientists, which also explore the tools used: 2013 – tools, trends, not all data is “Big”, coding scripts! 2014 – correlation of tools and skills, rapid evolution 2015 – divide blurring between open source and proprietary
  • 36. Enlitic http://www.enlitic.com/ deep learning to assist doctors treating cancer Moving Up: Medicine
  • 37. Moving Up: Medicine “Whatever the models might discover or predict, Howard isn’t suggesting they’ll do away with a doctor’s judgment. Rather, artificially intelligent computers could provide strong, unbiased second opinions, or perhaps lead a doctor down 
 a path of investigation she otherwise wouldn’t have considered.” With Enlitic, a veteran data scientist plans 
 to fight disease using deep learning
 GigaOM (2014-08-22)
 https://gigaom.com/2014/08/22/with-enlitic-a-veteran-data-scientist-plans-to-fight-disease-using-deep-learning/
  • 38. Moving Up: Political Platform http://www.predikon.ch/en/voting-patterns/residents
  • 39. Moving Up: Political Platform Mining Democracy
 Matthias Grossglauser @EPFL
 ICT Labs (2015)
 http://ictlabs-summer-school.sics.se/slides/mining%20democracy.pdf What if a political candidate could cluster political positions in a multi-dimensional data space, to optimize for being recommended to voters? http://www.predikon.ch/en/voting-patterns/residents
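One way to make the question on this slide concrete is a toy model in which a recommender suggests, for each voter, the candidate whose platform is nearest in issue space. This is a deliberately simplified nearest-platform stand-in, not clustering and not the method from the cited talk; the voters, rivals, platform options, and function names are all hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two positions in issue space."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def recommended_count(platform, rivals, voters):
    """Count voters for whom `platform` is strictly closer than every rival."""
    return sum(
        all(
            squared_distance(v, platform) < squared_distance(v, r)
            for r in rivals
        )
        for v in voters
    )


# Hypothetical survey answers on two issues, each on an integer scale -2..2.
voters = [(-2, -2), (-2, -1), (-1, 0), (0, 0), (0, 1), (1, 0), (2, 2)]

# Hypothetical rival platforms and the positions our candidate is weighing.
rivals = [(-2, -2), (2, 2)]
options = [(-1, -1), (0, 0), (1, 1)]

counts = {p: recommended_count(p, rivals, voters) for p in options}
best = max(options, key=lambda p: counts[p])
```

In this toy data the centrist option wins because both rivals sit at the extremes; with real voting records the positions would have far more dimensions, and the interesting (and ethically loaded) part is that the platform becomes an optimization target.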
  • 40. Moving Up: Government Ethics The White House has a plan to help society through data analysis
 Fortune (2015-09-30)
 http://fortune.com/2015/09/30/dj-patil-white-house-data/
  • 41. Moving Up: Government Ethics The White House has a plan to help society through data analysis
 Fortune (2015-09-30)
 http://fortune.com/2015/09/30/dj-patil-white-house-data/ “Opening up government data about child labor to concerned data scientists; recruiting folks to help analyze data about suicide prevention, social injustice and incarceration; a call for mandatory and `intrinsic` ethics instruction in every course teaching students data science; and an effort to help the transgender community create its own census of sorts, so that members and society can get a better grasp on the issues that matter to the group.”
  • 42. Moving Up: Neuroscience Analytics + Visualization for Neuroscience: Spark, Thunder, Lightning Jeremy Freeman
 2015-01-29 youtu.be/cBQm4LhHn9g?t=28m55s
  • 43. For excellent examples of Science and Data together see CodeNeuro, particularly for 
 use of Jupyter notebooks + Apache Spark Moving Up: Neuroscience
  • 46. Massive Open Online Courses – 
 seven year trend, beginning with: Connectivism and Connective Knowledge
 George Siemens, Stephen Downes
 University of PEI (2008)
 http://cck11.mooc.ca/ Learning: What About MOOCs? Adios EdTech. Hola something else
 George Siemens (2015-09-09)
 http://www.elearnspace.org/blog/2015/09/09/adios-ed-tech-hola-something-else/
  • 47. Online education: MOOCs taken by educated few
 Ezekiel Emanuel, Nature 503, 342 (2013-11-21) • 80% of students already have an advanced degree • 80% come from the richest 6% of the population Michael Shanks @Stanford: “retrenchment around traditional disciplines will make disparities even more pronounced” An Early Report Card on Massive Open Online Courses
 Geoffrey Fowler, WSJ (2013-10-08) Amherst, Duke, etc., have rejected edX Learning: What About MOOCs?
  • 48. Learning: What About MOOCs? So then, what else works better?
  • 49. How to Flip a Class 
 CTL @UT/Austin
 http://ctl.utexas.edu/teaching/flipping-a-class/how 1. identify where the flipped classroom model makes 
 the most sense for your course 2. spend class time engaging students in application activities with feedback 3. clarify connections between inside and outside 
 of class learning 4. adapt your materials for students to acquire course content in preparation of class 5. extend learning beyond class through individual 
 and collaborative practice Learning: Inverted Classroom
  • 50. Scalable Learning
 David Black-Schaffer @Uppsala
 Sverker Janson @KTH SICS https://www.scalable-learning.com/ • active learning: Flipped Classroom and Just-in-time Teaching • exams built directly into specific diagrams within videos • metrics for where in video+code that students get stuck • instructor can customize subsequent classroom discussions 
 (active teaching phase) based on stuck/unstuck metrics Learning: Inverted Classroom
  • 51. Learning programming at scale Philip Guo 
 O’Reilly Radar (2015-08-13) http://radar.oreilly.com/2015/08/learning-programming-at-scale.html • Python Tutor • Codechella Tutors could keep an eye on around 
 50 learners during a 30-minute session, 
 start 12 chat conversations, and 
 concurrently help 3 learners at once Learning: Collaborative Learning
  • 52. Data-driven Education and the Quantified Student Lorena Barba @GWU PyData Seattle (2015) https://youtu.be/2YIZ2SY9mW4 • keynote talk: abstract, slides • homepage • Open edX Universities Symposium, DC 2015-11-11 Learning: If you study just one link from this talk…
  • 53. If by some bizarre chance you haven’t used 
 it already, go to https://jupyter.org/ • 50+ different language kernels • new funding 2015-07 • UC Berkeley, Cal Poly • nbgrader autograder by Jess Hamrick • jupyterhub multi-user server • curating a list of examples • repeatable science! see also:
 Teaching with Jupyter Notebooks
 http://tinyurl.com/scipy2015-education Learning: Jupyter Project
  • 54. Embracing Jupyter Notebooks at O'Reilly
 Andrew Odewahn
 O’Reilly Media (2015-05-07) https://beta.oreilly.com/ideas/jupyter-at-oreilly O’Reilly Media is using our Atlas platform 
 to make Jupyter Notebooks a first class authoring environment for our publishing program Jupyter, Thebe, Atlas, Docker, etc. Learning: O’Reilly Media
  • 57. Is it possible to measure “distance” between 
 a learner and a subject community? From Amateurs to Connoisseurs:
 Modeling the Evolution of User 
 Expertise through Online Reviews
 Julian McAuley, Jure Leskovec
 http://i.stanford.edu/~julian/pdfs/www13.pdf Learning: Machine Learning about People Learning
  • 58. Learning, Assessment, Team Building, Diversity – these can be accomplished together, in situ Collective Intelligence in Human Groups
 Anita Williams Woolley @CMU
 https://youtu.be/Bz1dDiW2mvM • balance of participation (no one dominates) • 2+ women engaging within the group • group size < 9 • diversity of formal backgrounds Learning: Machine Learning about People Learning
  • 60. People + Automation Data Science teams apply machine learning (automation) to help arrive at key insights, to learn what is important 
 in data sets – finding the proverbial needle in the haystack. Cognitive Computing exhibits people + automation 
 as a process, in a learning context. That’s also a basic tenet of workflows in general: 
 people + automation. And a key aspect of the emerging gig economy too…
  • 61. People + Automation: Gig Economy
  • 62. People + Automation: Gig Economy http://orchestra.unlimitedlabs.com/ “Workflows with humans and machines”
  • 63. People + Automation: Gig Economy Workers in a World of Continuous Partial Employment Tim O’Reilly Medium (2015-08-31)
 https://medium.com/the-wtf-economy/workers-in-a-world-of-continuous-partial-employment-4d7b53f18f96 http://conferences.oreilly.com/next-economy
  • 64. Learning is key. Effective use of Data Science in these new economic conditions requires people + automation, learning together – albeit in different ways. Plus, there’s an excellent framework for that: Autopoiesis and Cognition
 Humberto Maturana, Francisco Varela
 Springer (1973) https://books.google.es/books?id=nVmcN9Ja68kC People + Automation
  • 65. People + Automation I’d like to leave this as a theme for you to consider about 
 Data Science 2016, Moving Up into use cases… We see an intersection of key points in both the emerging Cognitive Computing context and the Gig Economy in general: systems of people + automation, learning together. It posits an interesting duality for us to leverage. With that I wish you a great conference here at Big Data Spain!
  • 67. contact: Just Enough Math O’Reilly (2014) justenoughmath.com
 preview: youtu.be/TQ58cWgdCpA monthly newsletter for updates, 
 events, conf summaries, etc.: liber118.com/pxn/ Intro to Apache Spark
 O’Reilly (2015)
 shop.oreilly.com/product/0636920036807.do