Integrating Disparate Data Sources into a Single Entity Graph

1. Integrating Disparate Data
May 27, 2010
Steve Newman, CTO, Gist.com

2. The Why: What We Believe In
  • All your important people already reside in email, calendars, contact lists, and social sites
  • The web is a rich source of information about the people you care about
  • One tool should exist that can pull all this together in a single, rich, integrated experience

3. Pain Points (External)
  • Disparate data/API sources and protocols (e.g. GNIP)
  • Change notification: when something changed and what changed (e.g. Linked Open Data Dataset Dynamics, PubSubHubbub)
  • Standard entity data structures (e.g. Portable Contacts, vCard, hCard)

4. The Problem (Internal)
  • Need a single, disambiguated set of entities, where each entity itself contains accurate, disambiguated attributes
  • Entity attributes can be sourced from one or more endpoints:
      • Email
      • Twitter/Facebook
      • Calendar
      • Google Contacts, Outlook Contacts, Plaxo
      • Google Social Graph API
      • Rapleaf API
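
One way to picture an entity whose attributes come from several endpoints is the minimal sketch below. It is not Gist's actual object model: the class names `Attribute` and `Entity`, the source labels, and the example address are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass(frozen=True)
class Attribute:
    name: str    # e.g. "email" or "full_name"
    value: str
    source: str  # hypothetical endpoint label, e.g. "google_contacts"


@dataclass
class Entity:
    entity_id: str
    attributes: List[Attribute] = field(default_factory=list)

    def add(self, name: str, value: str, source: str) -> None:
        self.attributes.append(Attribute(name, value, source))

    def values(self, name: str) -> Dict[str, Set[str]]:
        """Map each distinct value of `name` to the set of sources reporting it."""
        seen: Dict[str, Set[str]] = {}
        for attr in self.attributes:
            if attr.name == name:
                seen.setdefault(attr.value, set()).add(attr.source)
        return seen


# Made-up data: two endpoints agree on the email but disagree on the name.
person = Entity("person-1")
person.add("email", "brad@example.com", "email")
person.add("email", "brad@example.com", "google_contacts")
person.add("full_name", "Brad Feld", "google_contacts")
person.add("full_name", "Brad", "email")

emails = person.values("email")     # one value, confirmed by two sources
names = person.values("full_name")  # two competing values to disambiguate
```

Keeping the source next to every value is what makes later disambiguation possible: agreement between endpoints can be counted, and conflicting values can be compared rather than overwritten.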

5. The Problem (Internal), continued
  • Now that we have this data, we need to process it and make sense of it
  • Need to support recurring updates
  • Merge and unmerge support
  • Recursive derivation is a huge win if done correctly
  • Historical tracking is necessary both to drive operations and for debugging (and it's a cool user feature)
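
Merge, unmerge, and historical tracking fit together naturally if every attribute remembers which entity it originally belonged to and every operation is appended to a log. The sketch below illustrates that idea only; `EntityStore` and its methods are hypothetical names, and it handles a single level of merging (it does not untangle chains of merges).

```python
from datetime import datetime, timezone


class EntityStore:
    def __init__(self):
        self.attrs = {}    # entity_id -> list of (name, value, origin_id)
        self.history = []  # append-only list of (timestamp, operation, detail)

    def _log(self, op, detail):
        self.history.append((datetime.now(timezone.utc), op, detail))

    def add(self, entity_id, name, value):
        # The origin id stays with the attribute so a merge can be reversed.
        self.attrs.setdefault(entity_id, []).append((name, value, entity_id))
        self._log("add", (entity_id, name, value))

    def merge(self, keep_id, absorb_id):
        """Move every attribute of absorb_id onto keep_id."""
        moved = self.attrs.pop(absorb_id, [])
        self.attrs.setdefault(keep_id, []).extend(moved)
        self._log("merge", (keep_id, absorb_id))

    def unmerge(self, keep_id, absorb_id):
        """Give back the attributes that originally belonged to absorb_id."""
        current = self.attrs.get(keep_id, [])
        self.attrs[keep_id] = [a for a in current if a[2] != absorb_id]
        self.attrs[absorb_id] = [a for a in current if a[2] == absorb_id]
        self._log("unmerge", (keep_id, absorb_id))


store = EntityStore()
store.add("p1", "email", "brad@example.com")  # made-up data
store.add("p2", "full_name", "Brad Feld")
store.merge("p1", "p2")
merged_count = len(store.attrs["p1"])
store.unmerge("p1", "p2")
operations = [entry[1] for entry in store.history]
```

The same log that allows an unmerge also serves the debugging and user-facing history uses mentioned on the slide, since it records what happened and when.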

6. How We Did It
  • Enhancers
      • Execute the request and create the attribute data
      • Can be called synchronously or asynchronously
      • Cached, logged, rate limited
  • Metadata about attributes
      • Source, source type, when created, derived?, derived source, score
  • Rules for 'enhancement'
  • Rules for recursion
  • Scoring methodology (accuracy and relative prioritization)
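
The enhancer idea can be sketched as a small wrapper around a fetch function that adds caching, logging, and a simple rate limit, and that stamps each resulting attribute with the metadata fields listed above. This is an assumption-laden illustration, not the Gist implementation: `Enhancer`, `AttributeRecord`, `fake_fetch`, the `min_interval` parameter, and the default score are all invented here, and only the synchronous path is shown.

```python
import logging
import time
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Dict, List, Optional

log = logging.getLogger("enhancer")


@dataclass
class AttributeRecord:
    name: str
    value: str
    source: str                 # e.g. "email"
    source_type: str            # e.g. "mailbox" (hypothetical label)
    created: datetime
    derived: bool = False
    derived_source: Optional[str] = None
    score: float = 0.0


class Enhancer:
    def __init__(self, source: str, source_type: str,
                 fetch: Callable[[str], Dict[str, str]],
                 min_interval: float = 0.0) -> None:
        self.source = source
        self.source_type = source_type
        self.fetch = fetch                # performs the actual request
        self.min_interval = min_interval  # minimum seconds between requests
        self._cache: Dict[str, List[AttributeRecord]] = {}
        self._last_call: Optional[float] = None

    def enhance(self, key: str, base_score: float = 0.5) -> List[AttributeRecord]:
        if key in self._cache:
            log.info("cache hit for %s", key)
            return self._cache[key]
        if self._last_call is not None:
            wait = self.min_interval - (time.monotonic() - self._last_call)
            if wait > 0:
                time.sleep(wait)          # crude rate limiting
        self._last_call = time.monotonic()
        raw = self.fetch(key)
        now = datetime.now(timezone.utc)
        records = [
            AttributeRecord(name, value, self.source, self.source_type, now,
                            score=base_score)
            for name, value in raw.items()
        ]
        self._cache[key] = records
        log.info("%s produced %d attribute(s) for %s", self.source, len(records), key)
        return records


calls = []


def fake_fetch(key: str) -> Dict[str, str]:
    """Stand-in for a real endpoint call; returns made-up data."""
    calls.append(key)
    return {"full_name": "Brad Feld"}


enhancer = Enhancer("email", "mailbox", fake_fetch, min_interval=0.01)
first = enhancer.enhance("brad@example.com")
second = enhancer.enhance("brad@example.com")  # served from the cache
```

An asynchronous variant could wrap `enhance` in a task queue or thread pool; the metadata stamped on each record stays the same either way.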

7. Example: Email Enhancer
  • "Brad Feld" vs "Brad"
  • Table columns shown on the slide: Date/Time, Score, State, Value
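
The slide's table did not survive extraction, so the following is only a guess at the kind of decision it illustrated: several candidate values for a name, each with a timestamp, a score, and a state, with one value chosen. The scoring heuristic (more name parts means a more complete name), the state labels "selected" and "candidate", and the timestamps are all assumptions.

```python
from datetime import datetime
from typing import List, Tuple


def name_score(value: str) -> float:
    """Toy heuristic: a name with more parts is assumed to be more complete."""
    return min(1.0, 0.4 * len(value.split()))


def rank(candidates: List[Tuple[datetime, str]]) -> List[Tuple[datetime, float, str, str]]:
    """Return rows of (date_time, score, state, value) with one row selected."""
    if not candidates:
        return []
    scored = [(seen_at, name_score(value), value) for seen_at, value in candidates]
    # Highest score wins; the most recent observation breaks ties.
    best = max(range(len(scored)), key=lambda i: (scored[i][1], scored[i][0]))
    return [
        (seen_at, score, "selected" if i == best else "candidate", value)
        for i, (seen_at, score, value) in enumerate(scored)
    ]


rows = rank([
    (datetime(2010, 5, 1, 9, 0), "Brad"),        # made-up observations
    (datetime(2010, 5, 2, 14, 30), "Brad Feld"),
])
for row in rows:
    print(row)
```

Because the losing value is kept as a candidate rather than discarded, a later observation or a rule change can promote it without refetching anything.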

8. Key Takeaways
  • Worry about integration both external and internal to your application
  • Lots of good work has been done on the external issues, so take advantage of it!
  • Create a strong object model for internal data representation (workers, metadata, engines) so you can perform concise, discrete operations

9. Additional Info
  • Gist API coming out this summer
  • Direct interface to Fragments
  • Standard and third-party Enhancer support
  • @stevepnewman, @gist

10. "We know now that the source of wealth is something specifically human: knowledge. Applied to tasks that we already know how to do, it becomes 'productivity'. Applied to tasks that are new and different, we call it 'innovation'. Only knowledge allows us to achieve these two goals."
Peter Drucker, Management Challenges for the 21st Century, 1999
