SlideShare una empresa de Scribd logo
1 de 23
Descargar para leer sin conexión
#datapopupseattle
What is Your Data Worth?
Chloe Mawer
Data Scientist, SVDS
SVDataScience
#datapopupseattle
UNSTRUCTURED
Data Science POP-UP in Seattle
www.dominodatalab.com
D
Produced by Domino Data Lab
Domino’s enterprise data science platform is used
by leading analytical organizations to increase
productivity, enable collaboration, and publish
models into production faster.
What is your data worth?
UNSTRUCTURED Data Pop-Up Seattle
October 7, 2015
Chloe Mawer |@chloemawer
© 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED.
@SVDataScience
3 © 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED.
@SVDataScience
How much is my data worth?
4 © 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED.
@SVDataScience
How much is my data worth?
© 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED.
@SVDataScience
Information goods
Information does not adhere to three main features
of the traditional market system:
1.  Excludability
2.  Rivalry
3.  Transparency
DeLong and Froomkin, 2000; Varian, 1995;
Varian, 1999;Dou and Wu, 2015
© 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED.
@SVDataScience
Intangible asset valuation
Intangible assets: non-physical assets with
commercial value but not in a form eligible for
traditional IP law protection.
Three main models:
1.  Cost-based
2.  Market-based
3.  Income-based
Matsuura, 2004
© 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED.
@SVDataScience
Added complexity of data
© 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED.
@SVDataScience
How much is my data worth to others?
© 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED.
@SVDataScience
Other considerations
•  Completing the chain
•  First copy costs
•  Liability
•  Substitutes
© 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED.
@SVDataScience
The data market (or lack thereof)
•  Data marketplaces have been bought by
aggregators.
•  Data exchanges occur through aggregators or 1:1
deals.
© 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED.
@SVDataScience
How much is 3rd party data worth to me?
© 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED.
@SVDataScience
Buying data
•  Completing the chain
•  First copy costs
•  Substitutes
•  Liability
© 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED.
@SVDataScience
External data’s value to you
•  A/B testing
•  Value of information (VOI) framework
VOI = Expected value of decision using data
- Expected value of decision without data
© 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED.
@SVDataScience
What is my data worth within the organization?
© 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED.
@SVDataScience
Assessing current data value
•  Mutual information (MI)
MI = Information entropy without added data
- Information entropy with added data
•  Take inventory
© 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED.
@SVDataScience
Increasing data’s value in your
organization
•  Accessibility
•  Integration
•  Ease of use
•  Coverage
•  Completeness
•  Accuracy
•  Latency
Choose metrics that are easy to measure and suited
to your business.
© 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED.
@SVDataScience
Data breach or loss
•  How much PII do you have?
•  What trust could you lose from your customers?
•  Average $154 per record
•  $363 healthcare
•  $300 education
•  $165 retail
•  $121 transportation
•  $68 public sector 2015 Ponemon
Institute Cost of Data
Breach Study
© 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED.
@SVDataScience
•  Unique properties of data make
it difficult to value.
•  Buying and selling is taking
place 1:1 or via aggregators.
•  Method for valuing data for
internal use depends on
intention.
CONCLUDING
THOUGHTS
#datapopupseattle
@datapopup
#datapopupseattle
#datapopupseattle
Thank You To Our Sponsors
© 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED.
@SVDataScience
THANK YOU

Más contenido relacionado

La actualidad más candente

Dow Chemical presentation at the Chief Analytics Officer Forum East Coast USA...
Dow Chemical presentation at the Chief Analytics Officer Forum East Coast USA...Dow Chemical presentation at the Chief Analytics Officer Forum East Coast USA...
Dow Chemical presentation at the Chief Analytics Officer Forum East Coast USA...Chief Analytics Officer Forum
 
Managing Data Science | Lessons from the Field
Managing Data Science | Lessons from the Field Managing Data Science | Lessons from the Field
Managing Data Science | Lessons from the Field Domino Data Lab
 
Moving Data Science from an Event to A Program: Considerations in Creating Su...
Moving Data Science from an Event to A Program: Considerations in Creating Su...Moving Data Science from an Event to A Program: Considerations in Creating Su...
Moving Data Science from an Event to A Program: Considerations in Creating Su...Domino Data Lab
 
London Jaspersoft Community User Group Event 2 KETL presentation
London Jaspersoft Community User Group Event 2 KETL presentationLondon Jaspersoft Community User Group Event 2 KETL presentation
London Jaspersoft Community User Group Event 2 KETL presentationKETL Limited
 
Creating a Data-Driven Organization -- thisismetis meetup
Creating a Data-Driven Organization -- thisismetis meetupCreating a Data-Driven Organization -- thisismetis meetup
Creating a Data-Driven Organization -- thisismetis meetupCarl Anderson
 
Anchormen corne versloot
Anchormen corne verslootAnchormen corne versloot
Anchormen corne verslootBigDataExpo
 
Becoming a data driven organization
Becoming a data driven organization Becoming a data driven organization
Becoming a data driven organization Magnus Backman
 
Week3 day6slide
Week3 day6slideWeek3 day6slide
Week3 day6slideRohitKar2
 
How can campus big data be captured and incorporated Craig napier_NSW Learnin...
How can campus big data be captured and incorporated Craig napier_NSW Learnin...How can campus big data be captured and incorporated Craig napier_NSW Learnin...
How can campus big data be captured and incorporated Craig napier_NSW Learnin...hazelj59
 
2013 10 cu leeds school big data conference - bill jacobs - revolution analytics
2013 10 cu leeds school big data conference - bill jacobs - revolution analytics2013 10 cu leeds school big data conference - bill jacobs - revolution analytics
2013 10 cu leeds school big data conference - bill jacobs - revolution analyticsBill Jacobs
 
Marketing Network presentation: Why marketers need to be concerned with data ...
Marketing Network presentation: Why marketers need to be concerned with data ...Marketing Network presentation: Why marketers need to be concerned with data ...
Marketing Network presentation: Why marketers need to be concerned with data ...KETL Limited
 
Setting up Data Science for Success: The Data Layer
Setting up Data Science for Success: The Data LayerSetting up Data Science for Success: The Data Layer
Setting up Data Science for Success: The Data LayerCarl Anderson
 
Why Data Science Projects Fail
Why Data Science Projects FailWhy Data Science Projects Fail
Why Data Science Projects FailSense Corp
 
Adoption is the only option hadoop is changing our world and changing yours f...
Adoption is the only option hadoop is changing our world and changing yours f...Adoption is the only option hadoop is changing our world and changing yours f...
Adoption is the only option hadoop is changing our world and changing yours f...DataWorks Summit
 
Social Media World presentation
Social Media World presentationSocial Media World presentation
Social Media World presentationkperi
 
Why Data Science Projects Fail?
Why Data Science Projects Fail?Why Data Science Projects Fail?
Why Data Science Projects Fail?Ethan Ram
 
Talend community user group Bristol & SW UK event
Talend community user group Bristol & SW UK eventTalend community user group Bristol & SW UK event
Talend community user group Bristol & SW UK eventKETL Limited
 
Industry Focus Camp SCB17 "How to build a data driven organization"
Industry Focus Camp SCB17 "How to build a data driven organization"Industry Focus Camp SCB17 "How to build a data driven organization"
Industry Focus Camp SCB17 "How to build a data driven organization"Bundesverband Deutsche Startups e.V.
 

La actualidad más candente (19)

Agile Analytics
Agile AnalyticsAgile Analytics
Agile Analytics
 
Dow Chemical presentation at the Chief Analytics Officer Forum East Coast USA...
Dow Chemical presentation at the Chief Analytics Officer Forum East Coast USA...Dow Chemical presentation at the Chief Analytics Officer Forum East Coast USA...
Dow Chemical presentation at the Chief Analytics Officer Forum East Coast USA...
 
Managing Data Science | Lessons from the Field
Managing Data Science | Lessons from the Field Managing Data Science | Lessons from the Field
Managing Data Science | Lessons from the Field
 
Moving Data Science from an Event to A Program: Considerations in Creating Su...
Moving Data Science from an Event to A Program: Considerations in Creating Su...Moving Data Science from an Event to A Program: Considerations in Creating Su...
Moving Data Science from an Event to A Program: Considerations in Creating Su...
 
London Jaspersoft Community User Group Event 2 KETL presentation
London Jaspersoft Community User Group Event 2 KETL presentationLondon Jaspersoft Community User Group Event 2 KETL presentation
London Jaspersoft Community User Group Event 2 KETL presentation
 
Creating a Data-Driven Organization -- thisismetis meetup
Creating a Data-Driven Organization -- thisismetis meetupCreating a Data-Driven Organization -- thisismetis meetup
Creating a Data-Driven Organization -- thisismetis meetup
 
Anchormen corne versloot
Anchormen corne verslootAnchormen corne versloot
Anchormen corne versloot
 
Becoming a data driven organization
Becoming a data driven organization Becoming a data driven organization
Becoming a data driven organization
 
Week3 day6slide
Week3 day6slideWeek3 day6slide
Week3 day6slide
 
How can campus big data be captured and incorporated Craig napier_NSW Learnin...
How can campus big data be captured and incorporated Craig napier_NSW Learnin...How can campus big data be captured and incorporated Craig napier_NSW Learnin...
How can campus big data be captured and incorporated Craig napier_NSW Learnin...
 
2013 10 cu leeds school big data conference - bill jacobs - revolution analytics
2013 10 cu leeds school big data conference - bill jacobs - revolution analytics2013 10 cu leeds school big data conference - bill jacobs - revolution analytics
2013 10 cu leeds school big data conference - bill jacobs - revolution analytics
 
Marketing Network presentation: Why marketers need to be concerned with data ...
Marketing Network presentation: Why marketers need to be concerned with data ...Marketing Network presentation: Why marketers need to be concerned with data ...
Marketing Network presentation: Why marketers need to be concerned with data ...
 
Setting up Data Science for Success: The Data Layer
Setting up Data Science for Success: The Data LayerSetting up Data Science for Success: The Data Layer
Setting up Data Science for Success: The Data Layer
 
Why Data Science Projects Fail
Why Data Science Projects FailWhy Data Science Projects Fail
Why Data Science Projects Fail
 
Adoption is the only option hadoop is changing our world and changing yours f...
Adoption is the only option hadoop is changing our world and changing yours f...Adoption is the only option hadoop is changing our world and changing yours f...
Adoption is the only option hadoop is changing our world and changing yours f...
 
Social Media World presentation
Social Media World presentationSocial Media World presentation
Social Media World presentation
 
Why Data Science Projects Fail?
Why Data Science Projects Fail?Why Data Science Projects Fail?
Why Data Science Projects Fail?
 
Talend community user group Bristol & SW UK event
Talend community user group Bristol & SW UK eventTalend community user group Bristol & SW UK event
Talend community user group Bristol & SW UK event
 
Industry Focus Camp SCB17 "How to build a data driven organization"
Industry Focus Camp SCB17 "How to build a data driven organization"Industry Focus Camp SCB17 "How to build a data driven organization"
Industry Focus Camp SCB17 "How to build a data driven organization"
 

Similar a What is Your Data Worth? - Data Science Pop-up Seattle

Beyond a Big Data Pilot: Building a Production Data Infrastructure - Stampede...
Beyond a Big Data Pilot: Building a Production Data Infrastructure - Stampede...Beyond a Big Data Pilot: Building a Production Data Infrastructure - Stampede...
Beyond a Big Data Pilot: Building a Production Data Infrastructure - Stampede...StampedeCon
 
Top Trends for Hadoop in 2015
Top Trends for Hadoop in 2015Top Trends for Hadoop in 2015
Top Trends for Hadoop in 2015Hortonworks
 
Actian forrester- hortonworks
Actian   forrester- hortonworksActian   forrester- hortonworks
Actian forrester- hortonworksHortonworks
 
Executive guidedatastrategy email
Executive guidedatastrategy emailExecutive guidedatastrategy email
Executive guidedatastrategy emailDATAVERSITY
 
Enabling a Culture of Self-Service Analytics
Enabling a Culture of Self-Service AnalyticsEnabling a Culture of Self-Service Analytics
Enabling a Culture of Self-Service AnalyticsPrecisely
 
Data Governance in a big data era
Data Governance in a big data eraData Governance in a big data era
Data Governance in a big data eraPieter De Leenheer
 
Immuta Overview - February 2016
Immuta Overview - February 2016Immuta Overview - February 2016
Immuta Overview - February 2016John Sarazen
 
Data Strategy - Enabling the Data-Guided Enterprise
Data Strategy - Enabling the Data-Guided EnterpriseData Strategy - Enabling the Data-Guided Enterprise
Data Strategy - Enabling the Data-Guided EnterpriseThoughtworks
 
DataEd Slides: Growing Practical Data Governance Programs
DataEd Slides: Growing Practical Data Governance ProgramsDataEd Slides: Growing Practical Data Governance Programs
DataEd Slides: Growing Practical Data Governance ProgramsDATAVERSITY
 
S ba0881 big-data-use-cases-pearson-edge2015-v7
S ba0881 big-data-use-cases-pearson-edge2015-v7S ba0881 big-data-use-cases-pearson-edge2015-v7
S ba0881 big-data-use-cases-pearson-edge2015-v7Tony Pearson
 
MIT ICIQ 2017 Keynote: Data Governance and Data Capitalization in the Big Dat...
MIT ICIQ 2017 Keynote: Data Governance and Data Capitalization in the Big Dat...MIT ICIQ 2017 Keynote: Data Governance and Data Capitalization in the Big Dat...
MIT ICIQ 2017 Keynote: Data Governance and Data Capitalization in the Big Dat...Pieter De Leenheer
 
The X Factor in Data Centric Security
The X Factor in Data Centric SecurityThe X Factor in Data Centric Security
The X Factor in Data Centric SecurityWatchful Software
 
Integrating the CDO Role Into Your Organization; Managing the Disruption (MIT...
Integrating the CDO Role Into Your Organization; Managing the Disruption (MIT...Integrating the CDO Role Into Your Organization; Managing the Disruption (MIT...
Integrating the CDO Role Into Your Organization; Managing the Disruption (MIT...Caserta
 
"How to Create a Practical and Profitable Personalization Strategy" - Brooks ...
"How to Create a Practical and Profitable Personalization Strategy" - Brooks ..."How to Create a Practical and Profitable Personalization Strategy" - Brooks ...
"How to Create a Practical and Profitable Personalization Strategy" - Brooks ...Tealium
 
Who, What, Where and How: Why You Want to Know
 Who, What, Where and How: Why You Want to Know Who, What, Where and How: Why You Want to Know
Who, What, Where and How: Why You Want to KnowEric Kavanagh
 
What makes an effective data team?
What makes an effective data team?What makes an effective data team?
What makes an effective data team?Snowplow Analytics
 
Real-World Data Governance: Build Your Own Data Governance Tools
Real-World Data Governance: Build Your Own Data Governance ToolsReal-World Data Governance: Build Your Own Data Governance Tools
Real-World Data Governance: Build Your Own Data Governance ToolsDATAVERSITY
 

Similar a What is Your Data Worth? - Data Science Pop-up Seattle (20)

Beyond a Big Data Pilot: Building a Production Data Infrastructure - Stampede...
Beyond a Big Data Pilot: Building a Production Data Infrastructure - Stampede...Beyond a Big Data Pilot: Building a Production Data Infrastructure - Stampede...
Beyond a Big Data Pilot: Building a Production Data Infrastructure - Stampede...
 
Top Trends for Hadoop in 2015
Top Trends for Hadoop in 2015Top Trends for Hadoop in 2015
Top Trends for Hadoop in 2015
 
Actian forrester- hortonworks
Actian   forrester- hortonworksActian   forrester- hortonworks
Actian forrester- hortonworks
 
Executive guidedatastrategy email
Executive guidedatastrategy emailExecutive guidedatastrategy email
Executive guidedatastrategy email
 
8 Steps to Creating a Data Strategy
8 Steps to Creating a Data Strategy8 Steps to Creating a Data Strategy
8 Steps to Creating a Data Strategy
 
Enabling a Culture of Self-Service Analytics
Enabling a Culture of Self-Service AnalyticsEnabling a Culture of Self-Service Analytics
Enabling a Culture of Self-Service Analytics
 
Data Governance in a big data era
Data Governance in a big data eraData Governance in a big data era
Data Governance in a big data era
 
06 summary
06 summary06 summary
06 summary
 
Immuta Overview - February 2016
Immuta Overview - February 2016Immuta Overview - February 2016
Immuta Overview - February 2016
 
Data Strategy - Enabling the Data-Guided Enterprise
Data Strategy - Enabling the Data-Guided EnterpriseData Strategy - Enabling the Data-Guided Enterprise
Data Strategy - Enabling the Data-Guided Enterprise
 
DataEd Slides: Growing Practical Data Governance Programs
DataEd Slides: Growing Practical Data Governance ProgramsDataEd Slides: Growing Practical Data Governance Programs
DataEd Slides: Growing Practical Data Governance Programs
 
S ba0881 big-data-use-cases-pearson-edge2015-v7
S ba0881 big-data-use-cases-pearson-edge2015-v7S ba0881 big-data-use-cases-pearson-edge2015-v7
S ba0881 big-data-use-cases-pearson-edge2015-v7
 
MIT ICIQ 2017 Keynote: Data Governance and Data Capitalization in the Big Dat...
MIT ICIQ 2017 Keynote: Data Governance and Data Capitalization in the Big Dat...MIT ICIQ 2017 Keynote: Data Governance and Data Capitalization in the Big Dat...
MIT ICIQ 2017 Keynote: Data Governance and Data Capitalization in the Big Dat...
 
Working With Different Kinds of Data
Working With Different Kinds of DataWorking With Different Kinds of Data
Working With Different Kinds of Data
 
The X Factor in Data Centric Security
The X Factor in Data Centric SecurityThe X Factor in Data Centric Security
The X Factor in Data Centric Security
 
Integrating the CDO Role Into Your Organization; Managing the Disruption (MIT...
Integrating the CDO Role Into Your Organization; Managing the Disruption (MIT...Integrating the CDO Role Into Your Organization; Managing the Disruption (MIT...
Integrating the CDO Role Into Your Organization; Managing the Disruption (MIT...
 
"How to Create a Practical and Profitable Personalization Strategy" - Brooks ...
"How to Create a Practical and Profitable Personalization Strategy" - Brooks ..."How to Create a Practical and Profitable Personalization Strategy" - Brooks ...
"How to Create a Practical and Profitable Personalization Strategy" - Brooks ...
 
Who, What, Where and How: Why You Want to Know
 Who, What, Where and How: Why You Want to Know Who, What, Where and How: Why You Want to Know
Who, What, Where and How: Why You Want to Know
 
What makes an effective data team?
What makes an effective data team?What makes an effective data team?
What makes an effective data team?
 
Real-World Data Governance: Build Your Own Data Governance Tools
Real-World Data Governance: Build Your Own Data Governance ToolsReal-World Data Governance: Build Your Own Data Governance Tools
Real-World Data Governance: Build Your Own Data Governance Tools
 

Más de Domino Data Lab

What's in your workflow? Bringing data science workflows to business analysis...
What's in your workflow? Bringing data science workflows to business analysis...What's in your workflow? Bringing data science workflows to business analysis...
What's in your workflow? Bringing data science workflows to business analysis...Domino Data Lab
 
The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...Domino Data Lab
 
Racial Bias in Policing: an analysis of Illinois traffic stops data
Racial Bias in Policing: an analysis of Illinois traffic stops dataRacial Bias in Policing: an analysis of Illinois traffic stops data
Racial Bias in Policing: an analysis of Illinois traffic stops dataDomino Data Lab
 
Data Quality Analytics: Understanding what is in your data, before using it
Data Quality Analytics: Understanding what is in your data, before using itData Quality Analytics: Understanding what is in your data, before using it
Data Quality Analytics: Understanding what is in your data, before using itDomino Data Lab
 
Supporting innovation in insurance with randomized experimentation
Supporting innovation in insurance with randomized experimentationSupporting innovation in insurance with randomized experimentation
Supporting innovation in insurance with randomized experimentationDomino Data Lab
 
Leveraging Data Science in the Automotive Industry
Leveraging Data Science in the Automotive IndustryLeveraging Data Science in the Automotive Industry
Leveraging Data Science in the Automotive IndustryDomino Data Lab
 
Summertime Analytics: Predicting E. coli and West Nile Virus
Summertime Analytics: Predicting E. coli and West Nile VirusSummertime Analytics: Predicting E. coli and West Nile Virus
Summertime Analytics: Predicting E. coli and West Nile VirusDomino Data Lab
 
Reproducible Dashboards and other great things to do with Jupyter
Reproducible Dashboards and other great things to do with JupyterReproducible Dashboards and other great things to do with Jupyter
Reproducible Dashboards and other great things to do with JupyterDomino Data Lab
 
GeoViz: A Canvas for Data Science
GeoViz: A Canvas for Data ScienceGeoViz: A Canvas for Data Science
GeoViz: A Canvas for Data ScienceDomino Data Lab
 
Doing your first Kaggle (Python for Big Data sets)
Doing your first Kaggle (Python for Big Data sets)Doing your first Kaggle (Python for Big Data sets)
Doing your first Kaggle (Python for Big Data sets)Domino Data Lab
 
Leveraged Analytics at Scale
Leveraged Analytics at ScaleLeveraged Analytics at Scale
Leveraged Analytics at ScaleDomino Data Lab
 
How I Learned to Stop Worrying and Love Linked Data
How I Learned to Stop Worrying and Love Linked DataHow I Learned to Stop Worrying and Love Linked Data
How I Learned to Stop Worrying and Love Linked DataDomino Data Lab
 
Software Engineering for Data Scientists
Software Engineering for Data ScientistsSoftware Engineering for Data Scientists
Software Engineering for Data ScientistsDomino Data Lab
 
Building Data Analytics pipelines in the cloud using serverless technology
Building Data Analytics pipelines in the cloud using serverless technologyBuilding Data Analytics pipelines in the cloud using serverless technology
Building Data Analytics pipelines in the cloud using serverless technologyDomino Data Lab
 
Leveraging Open Source Automated Data Science Tools
Leveraging Open Source Automated Data Science ToolsLeveraging Open Source Automated Data Science Tools
Leveraging Open Source Automated Data Science ToolsDomino Data Lab
 
Domino and AWS: collaborative analytics and model governance at financial ser...
Domino and AWS: collaborative analytics and model governance at financial ser...Domino and AWS: collaborative analytics and model governance at financial ser...
Domino and AWS: collaborative analytics and model governance at financial ser...Domino Data Lab
 
The Role and Importance of Curiosity in Data Science
The Role and Importance of Curiosity in Data ScienceThe Role and Importance of Curiosity in Data Science
The Role and Importance of Curiosity in Data ScienceDomino Data Lab
 
Fuzzy Matching to the Rescue
Fuzzy Matching to the RescueFuzzy Matching to the Rescue
Fuzzy Matching to the RescueDomino Data Lab
 
How to Effectively Combine Numerical Features and Categorical Features
How to Effectively Combine Numerical Features and Categorical FeaturesHow to Effectively Combine Numerical Features and Categorical Features
How to Effectively Combine Numerical Features and Categorical FeaturesDomino Data Lab
 

Más de Domino Data Lab (20)

What's in your workflow? Bringing data science workflows to business analysis...
What's in your workflow? Bringing data science workflows to business analysis...What's in your workflow? Bringing data science workflows to business analysis...
What's in your workflow? Bringing data science workflows to business analysis...
 
The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...
 
Racial Bias in Policing: an analysis of Illinois traffic stops data
Racial Bias in Policing: an analysis of Illinois traffic stops dataRacial Bias in Policing: an analysis of Illinois traffic stops data
Racial Bias in Policing: an analysis of Illinois traffic stops data
 
Data Quality Analytics: Understanding what is in your data, before using it
Data Quality Analytics: Understanding what is in your data, before using itData Quality Analytics: Understanding what is in your data, before using it
Data Quality Analytics: Understanding what is in your data, before using it
 
Supporting innovation in insurance with randomized experimentation
Supporting innovation in insurance with randomized experimentationSupporting innovation in insurance with randomized experimentation
Supporting innovation in insurance with randomized experimentation
 
Leveraging Data Science in the Automotive Industry
Leveraging Data Science in the Automotive IndustryLeveraging Data Science in the Automotive Industry
Leveraging Data Science in the Automotive Industry
 
Summertime Analytics: Predicting E. coli and West Nile Virus
Summertime Analytics: Predicting E. coli and West Nile VirusSummertime Analytics: Predicting E. coli and West Nile Virus
Summertime Analytics: Predicting E. coli and West Nile Virus
 
Reproducible Dashboards and other great things to do with Jupyter
Reproducible Dashboards and other great things to do with JupyterReproducible Dashboards and other great things to do with Jupyter
Reproducible Dashboards and other great things to do with Jupyter
 
GeoViz: A Canvas for Data Science
GeoViz: A Canvas for Data ScienceGeoViz: A Canvas for Data Science
GeoViz: A Canvas for Data Science
 
Doing your first Kaggle (Python for Big Data sets)
Doing your first Kaggle (Python for Big Data sets)Doing your first Kaggle (Python for Big Data sets)
Doing your first Kaggle (Python for Big Data sets)
 
Leveraged Analytics at Scale
Leveraged Analytics at ScaleLeveraged Analytics at Scale
Leveraged Analytics at Scale
 
How I Learned to Stop Worrying and Love Linked Data
How I Learned to Stop Worrying and Love Linked DataHow I Learned to Stop Worrying and Love Linked Data
How I Learned to Stop Worrying and Love Linked Data
 
Software Engineering for Data Scientists
Software Engineering for Data ScientistsSoftware Engineering for Data Scientists
Software Engineering for Data Scientists
 
Making Big Data Smart
Making Big Data SmartMaking Big Data Smart
Making Big Data Smart
 
Building Data Analytics pipelines in the cloud using serverless technology
Building Data Analytics pipelines in the cloud using serverless technologyBuilding Data Analytics pipelines in the cloud using serverless technology
Building Data Analytics pipelines in the cloud using serverless technology
 
Leveraging Open Source Automated Data Science Tools
Leveraging Open Source Automated Data Science ToolsLeveraging Open Source Automated Data Science Tools
Leveraging Open Source Automated Data Science Tools
 
Domino and AWS: collaborative analytics and model governance at financial ser...
Domino and AWS: collaborative analytics and model governance at financial ser...Domino and AWS: collaborative analytics and model governance at financial ser...
Domino and AWS: collaborative analytics and model governance at financial ser...
 
The Role and Importance of Curiosity in Data Science
The Role and Importance of Curiosity in Data ScienceThe Role and Importance of Curiosity in Data Science
The Role and Importance of Curiosity in Data Science
 
Fuzzy Matching to the Rescue
Fuzzy Matching to the RescueFuzzy Matching to the Rescue
Fuzzy Matching to the Rescue
 
How to Effectively Combine Numerical Features and Categorical Features
How to Effectively Combine Numerical Features and Categorical FeaturesHow to Effectively Combine Numerical Features and Categorical Features
How to Effectively Combine Numerical Features and Categorical Features
 

Último

Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfJohn Sterrett
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...GQ Research
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...ssuserf63bd7
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryJeremy Anderson
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our WorldEduminds Learning
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanMYRABACSAFRA2
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.natarajan8993
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Boston Institute of Analytics
 

Último (20)

Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population Mean
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
 

What is Your Data Worth? - Data Science Pop-up Seattle

  • 1. #datapopupseattle What is Your Data Worth? Chloe Mawer Data Scientist, SVDS SVDataScience
  • 2. #datapopupseattle UNSTRUCTURED Data Science POP-UP in Seattle www.dominodatalab.com D Produced by Domino Data Lab Domino’s enterprise data science platform is used by leading analytical organizations to increase productivity, enable collaboration, and publish models into production faster.
  • 3. What is your data worth? UNSTRUCTURED Data Pop-Up Seattle October 7, 2015 Chloe Mawer |@chloemawer
  • 4. © 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED. @SVDataScience
  • 5. 3 © 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED. @SVDataScience How much is my data worth?
  • 6. 4 © 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED. @SVDataScience How much is my data worth?
  • 7. © 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED. @SVDataScience Information goods Information does not adhere to three main features of the traditional market system: 1.  Excludability 2.  Rivalry 3.  Transparency DeLong and Froomkin, 2000; Varian, 1995; Varian, 1999;Dou and Wu, 2015
  • 8. © 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED. @SVDataScience Intangible asset valuation Intangible assets: non-physical assets with commercial value but not in a form eligible for traditional IP law protection. Three main models: 1.  Cost-based 2.  Market-based 3.  Income-based Matsuura, 2004
  • 9. © 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED. @SVDataScience Added complexity of data
  • 10. © 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED. @SVDataScience How much is my data worth to others?
  • 11. © 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED. @SVDataScience Other considerations •  Completing the chain •  First copy costs •  Liability •  Substitutes
  • 12. © 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED. @SVDataScience The data market (or lack thereof) •  Data marketplaces have been bought by aggregators. •  Data exchanges occur through aggregators or 1:1 deals.
  • 13. © 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED. @SVDataScience How much is 3rd party data worth to me?
  • 14. © 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED. @SVDataScience Buying data •  Completing the chain •  First copy costs •  Substitutes •  Liability
  • 15. © 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED. @SVDataScience External data’s value to you •  A/B testing •  Value of information (VOI) framework VOI = Expected value of decision using data - Expected value of decision without data
  • 16. © 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED. @SVDataScience What is my data worth within the organization?
  • 17. © 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED. @SVDataScience Assessing current data value •  Mutual information (MI) MI = Information entropy without added data - Information entropy with added data •  Take inventory
  • 18. © 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED. @SVDataScience Increasing data’s value in your organization •  Accessibility •  Integration •  Ease of use •  Coverage •  Completeness •  Accuracy •  Latency Choose metrics that are easy to measure and suited to your business.
  • 19. © 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED. @SVDataScience Data breach or loss •  How much PII do you have? •  What trust could you lose from your customers? •  Average $154 per record •  $363 healthcare •  $300 education •  $165 retail •  $121 transportation •  $68 public sector 2015 Ponemon Institute Cost of Data Breach Study
  • 20. © 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED. @SVDataScience •  Unique properties of data make it difficult to value. •  Buying and selling is taking place 1:1 or via aggregators. •  Method for valuing data for internal use depends on intention. CONCLUDING THOUGHTS
  • 23. © 2015 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED. @SVDataScience THANK YOU