Big Data and You
Preparing Current & Future Information Specialists

Sands Fish
Data Scientist / MIT Libraries
@sandsfish
sands@mit.edu
Knowing in the Age of
Networked Knowledge
Nothing is static.
Everything is connected.
Knowledge representation
is now complex
Scholarly Primitives
- Discovering
- Annotating
- Comparing
- Referring
- Sampling
- Illustrating
- Representing
John Unsworth, 2000.
http://people.brandeis.edu/~unsworth/Kings.5-00/primitives.html
Complex Knowledge
Objects
- Have multiple representations & ways of being consumed
- Can be a link in a chain, a node in a graph, or part of an
ecosystem of knowledge.

- Allow different perspectives and different ways of asking questions of them.

(none of these are true of physical books)
Complex Knowledge
Objects
Data Examples:
• JSON, XML, etc., esp. from a URL that allows it to be updated
• Visualizations, sonifications, etc. (mind-maps, interactives)
• Geospatial data, layered, constrained by area
• APIs
• Linked Data, integrated with many other resources
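
One way to picture "multiple representations" is the same record arriving as JSON and as XML. The sketch below is only an illustration: the record contents, the XML element names, and the helper names from_json and from_xml are assumptions made for the example, and it parses inline strings rather than fetching anything from a URL.

```python
import json
import xml.etree.ElementTree as ET

# Two hypothetical serializations of the same record (illustrative data only).
json_text = '{"title": "Big Data and You", "author": "Sands Fish"}'
xml_text = "<talk><title>Big Data and You</title><author>Sands Fish</author></talk>"


def from_json(text):
    """Parse a JSON object into a plain dict."""
    return json.loads(text)


def from_xml(text):
    """Parse a flat XML element into a dict of child tag -> text."""
    root = ET.fromstring(text)
    return {child.tag: child.text for child in root}


record_a = from_json(json_text)
record_b = from_xml(xml_text)

# Both representations reduce to the same in-memory record.
same = record_a == record_b
print(same)
```

Once both forms are reduced to one in-memory structure, the same questions can be asked of either source.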
Complex Knowledge
Objects
Tool / Platform Examples:
• Integrated Data Platforms
• Courseware
• MOOCs
• Interactive Visualizations
• Commons-based peer production (wikis, reviews, software, etc.)
• Tweets
• Data analysis tools
• Data Enclaves (limited access processing endpoints)
Methods of Exploration
In this diverse ecosystem, there is no one way of exploring a
topic.
- Manual Browsing

- Automated Spidering (e.g. Berkman / Media Cloud)
- Collection / Trawling (e.g. Browser Plugins)
- Conventional Big Data (e.g. Hadoop, Map/Reduce)
- Using Linked Data to branch out through related concepts
- Algorithmic Data Processing (e.g. Topic Modeling)
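
As a rough illustration of the map and reduce pattern listed above, the following sketch counts words across a few documents in a single process. It is not Hadoop and says nothing about how any particular cluster works; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample documents are assumptions chosen for the example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(doc):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in doc.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would do between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(docs):
    """Run map, shuffle, and reduce over a list of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in docs)
    return reduce_phase(shuffle(mapped))


# Hypothetical input: two short "documents".
docs = ["Nothing is static", "Everything is connected"]
counts = word_count(docs)
print(counts)
```

The point of the split is that each map call looks at one document only, so the work can be spread over many machines before the grouped results are summed.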
Problems of Completeness
- When do you know that you have enough information?

- What kind of compromises are made when information is
more massive than anyone can consume?
Problems of Integration
When data comes from many different silos, in many
different structures and formats, how do you bring all of this
knowledge together?

- One solution is RDF, which provides a common, generic
data model. Collaborative ontology development
can allow communities to work together.
- Open standards.
- Build tools and services that provide easy access to the
underlying data.
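
To make the integration point concrete, here is a minimal sketch of the RDF idea using plain Python tuples as subject / predicate / object triples. It deliberately uses no RDF library. The "ex:" identifiers, the two example silos, and the helper names (merge, match, creator_names) are hypothetical; dc:title, dc:creator, and foaf:name are written as prefixed names from the Dublin Core and FOAF vocabularies.

```python
# Silo 1: a hypothetical catalog that knows about works.
catalog = {
    ("ex:talk1", "dc:title", "Knowing in the Age of Networked Knowledge"),
    ("ex:talk1", "dc:creator", "ex:sands"),
}

# Silo 2: a hypothetical directory that knows about people.
people = {
    ("ex:sands", "foaf:name", "Sands Fish"),
}


def merge(*graphs):
    """Because every silo uses the same triple shape, merging is a set union."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def match(graph, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def creator_names(graph, work):
    """Branch out from a work to its creators, then to their names."""
    names = []
    for _, _, creator in match(graph, s=work, p="dc:creator"):
        for _, _, name in match(graph, s=creator, p="foaf:name"):
            names.append(name)
    return names


graph = merge(catalog, people)
names = creator_names(graph, "ex:talk1")
print(names)
```

Neither silo can answer "who made this talk, by name?" alone; the merged graph can, because shared identifiers link the two.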
How To Get A Grip
- Keep abreast of developments at the W3C and other standards
bodies.
- Don’t focus too much on single technologies. They will
shift quickly.
- Learn at least one data visualization technology.
- Remember to frame questions of data in more than one
way.
- Ask your own questions of the data. Understand
it from the user's point of view.
Sands Fish - Knowing in the Age of Networked Knowledge

Más contenido relacionado

La actualidad más candente

Rise of the Databrarian - Jeroen Rombouts
Rise of the Databrarian - Jeroen RomboutsRise of the Databrarian - Jeroen Rombouts
Rise of the Databrarian - Jeroen RomboutsLibrary_Connect
 
Library Connect Webinar - Data Sharing
Library Connect Webinar - Data Sharing Library Connect Webinar - Data Sharing
Library Connect Webinar - Data Sharing Library_Connect
 
Freedman Center for Digital Scholarship Colloquium - 14_1106
Freedman Center for Digital Scholarship Colloquium - 14_1106Freedman Center for Digital Scholarship Colloquium - 14_1106
Freedman Center for Digital Scholarship Colloquium - 14_1106jeffreylancaster
 
Towards a digital library for York
Towards a digital library for YorkTowards a digital library for York
Towards a digital library for YorkJulie Allinson
 
Linked Data for Knowledge Discovery: Introduction
Linked Data for Knowledge Discovery: IntroductionLinked Data for Knowledge Discovery: Introduction
Linked Data for Knowledge Discovery: IntroductionMathieu d'Aquin
 
Why should semantic technologies pay more attention to privacy... and vice-ve...
Why should semantic technologies pay more attention to privacy... and vice-ve...Why should semantic technologies pay more attention to privacy... and vice-ve...
Why should semantic technologies pay more attention to privacy... and vice-ve...Mathieu d'Aquin
 
Networked Science, And Integrating with Dataverse
Networked Science, And Integrating with DataverseNetworked Science, And Integrating with Dataverse
Networked Science, And Integrating with DataverseAnita de Waard
 
Incremental idcc 08_12_10_slideshare
Incremental idcc 08_12_10_slideshareIncremental idcc 08_12_10_slideshare
Incremental idcc 08_12_10_slideshareIncremental Project
 
Your digital humanities are in my library! No, your library is in my digital ...
Your digital humanities are in my library! No, your library is in my digital ...Your digital humanities are in my library! No, your library is in my digital ...
Your digital humanities are in my library! No, your library is in my digital ...Rebekah Cummings
 
Introduction to databases and metadata
Introduction to databases and metadataIntroduction to databases and metadata
Introduction to databases and metadatalibrarianrafia
 
Hiberlink: Prototypes of pro-active approaches to support the archiving of we...
Hiberlink: Prototypes of pro-active approaches to support the archiving of we...Hiberlink: Prototypes of pro-active approaches to support the archiving of we...
Hiberlink: Prototypes of pro-active approaches to support the archiving of we...EDINA, University of Edinburgh
 
The liaison librarian: connecting with the qualitative research lifecycle
The liaison librarian: connecting with the qualitative research lifecycleThe liaison librarian: connecting with the qualitative research lifecycle
The liaison librarian: connecting with the qualitative research lifecycleCelia Emmelhainz
 

La actualidad más candente (20)

Rise of the Databrarian - Jeroen Rombouts
Rise of the Databrarian - Jeroen RomboutsRise of the Databrarian - Jeroen Rombouts
Rise of the Databrarian - Jeroen Rombouts
 
Library Connect Webinar - Data Sharing
Library Connect Webinar - Data Sharing Library Connect Webinar - Data Sharing
Library Connect Webinar - Data Sharing
 
Freedman Center for Digital Scholarship Colloquium - 14_1106
Freedman Center for Digital Scholarship Colloquium - 14_1106Freedman Center for Digital Scholarship Colloquium - 14_1106
Freedman Center for Digital Scholarship Colloquium - 14_1106
 
Towards a digital library for York
Towards a digital library for YorkTowards a digital library for York
Towards a digital library for York
 
Linked Data for Knowledge Discovery: Introduction
Linked Data for Knowledge Discovery: IntroductionLinked Data for Knowledge Discovery: Introduction
Linked Data for Knowledge Discovery: Introduction
 
Organising and Documenting Data
Organising and Documenting DataOrganising and Documenting Data
Organising and Documenting Data
 
Why should semantic technologies pay more attention to privacy... and vice-ve...
Why should semantic technologies pay more attention to privacy... and vice-ve...Why should semantic technologies pay more attention to privacy... and vice-ve...
Why should semantic technologies pay more attention to privacy... and vice-ve...
 
The Blossoming of the Semantic Web
The Blossoming of the Semantic WebThe Blossoming of the Semantic Web
The Blossoming of the Semantic Web
 
DH2012_Bellamy
DH2012_BellamyDH2012_Bellamy
DH2012_Bellamy
 
Networked Science, And Integrating with Dataverse
Networked Science, And Integrating with DataverseNetworked Science, And Integrating with Dataverse
Networked Science, And Integrating with Dataverse
 
Presentation to KILT
Presentation to KILTPresentation to KILT
Presentation to KILT
 
Incremental idcc 08_12_10_slideshare
Incremental idcc 08_12_10_slideshareIncremental idcc 08_12_10_slideshare
Incremental idcc 08_12_10_slideshare
 
Your digital humanities are in my library! No, your library is in my digital ...
Your digital humanities are in my library! No, your library is in my digital ...Your digital humanities are in my library! No, your library is in my digital ...
Your digital humanities are in my library! No, your library is in my digital ...
 
Introduction to databases and metadata
Introduction to databases and metadataIntroduction to databases and metadata
Introduction to databases and metadata
 
MANTRA & Open Educational Resources
MANTRA & Open Educational ResourcesMANTRA & Open Educational Resources
MANTRA & Open Educational Resources
 
Engaging the Researcher in RDM
Engaging the Researcher in RDMEngaging the Researcher in RDM
Engaging the Researcher in RDM
 
Is Linked Open Data the way forward?
Is Linked Open Data the way forward?Is Linked Open Data the way forward?
Is Linked Open Data the way forward?
 
Hiberlink: Prototypes of pro-active approaches to support the archiving of we...
Hiberlink: Prototypes of pro-active approaches to support the archiving of we...Hiberlink: Prototypes of pro-active approaches to support the archiving of we...
Hiberlink: Prototypes of pro-active approaches to support the archiving of we...
 
The liaison librarian: connecting with the qualitative research lifecycle
The liaison librarian: connecting with the qualitative research lifecycleThe liaison librarian: connecting with the qualitative research lifecycle
The liaison librarian: connecting with the qualitative research lifecycle
 
Benoit Visual Only Retrieval
Benoit Visual Only RetrievalBenoit Visual Only Retrieval
Benoit Visual Only Retrieval
 

Destacado

Evaluation of preliminary task
Evaluation of preliminary task Evaluation of preliminary task
Evaluation of preliminary task emh3196
 
Plan for music magazine
Plan for music magazinePlan for music magazine
Plan for music magazineemh3196
 
Pushing MODIS to the edge: high-resolution applications of moderate-resolutio...
Pushing MODIS to the edge: high-resolution applications of moderate-resolutio...Pushing MODIS to the edge: high-resolution applications of moderate-resolutio...
Pushing MODIS to the edge: high-resolution applications of moderate-resolutio...Dániel Kristóf
 
Roman, britanian roman, peninggalan bangsa roman
Roman, britanian roman, peninggalan bangsa romanRoman, britanian roman, peninggalan bangsa roman
Roman, britanian roman, peninggalan bangsa romanApep Wahyudin
 
A Life Changing Opportunity
A Life Changing OpportunityA Life Changing Opportunity
A Life Changing OpportunityAnil Kumar
 
Evaluation question 3
Evaluation question 3Evaluation question 3
Evaluation question 3emh3196
 
Huge UX Design School Exercise
Huge UX Design School ExerciseHuge UX Design School Exercise
Huge UX Design School ExerciseElliott Romano
 
Handsketches for Huge Design Exercise
Handsketches for Huge Design ExerciseHandsketches for Huge Design Exercise
Handsketches for Huge Design ExerciseElliott Romano
 
Dampak IPTEK Terhadap kehidupan Sosial
Dampak IPTEK Terhadap kehidupan SosialDampak IPTEK Terhadap kehidupan Sosial
Dampak IPTEK Terhadap kehidupan SosialApep Wahyudin
 
Makalah ISBD(manusia dan lingkungan)
Makalah ISBD(manusia dan lingkungan)Makalah ISBD(manusia dan lingkungan)
Makalah ISBD(manusia dan lingkungan)Apep Wahyudin
 

Destacado (14)

Evaluation of preliminary task
Evaluation of preliminary task Evaluation of preliminary task
Evaluation of preliminary task
 
Plan for music magazine
Plan for music magazinePlan for music magazine
Plan for music magazine
 
Marcela ocampo
Marcela ocampoMarcela ocampo
Marcela ocampo
 
Pushing MODIS to the edge: high-resolution applications of moderate-resolutio...
Pushing MODIS to the edge: high-resolution applications of moderate-resolutio...Pushing MODIS to the edge: high-resolution applications of moderate-resolutio...
Pushing MODIS to the edge: high-resolution applications of moderate-resolutio...
 
Roman, britanian roman, peninggalan bangsa roman
Roman, britanian roman, peninggalan bangsa romanRoman, britanian roman, peninggalan bangsa roman
Roman, britanian roman, peninggalan bangsa roman
 
A Life Changing Opportunity
A Life Changing OpportunityA Life Changing Opportunity
A Life Changing Opportunity
 
Evaluation question 3
Evaluation question 3Evaluation question 3
Evaluation question 3
 
Huge UX Design School Exercise
Huge UX Design School ExerciseHuge UX Design School Exercise
Huge UX Design School Exercise
 
ART OBZOR
ART OBZORART OBZOR
ART OBZOR
 
Handsketches for Huge Design Exercise
Handsketches for Huge Design ExerciseHandsketches for Huge Design Exercise
Handsketches for Huge Design Exercise
 
Makalah Raeding
Makalah RaedingMakalah Raeding
Makalah Raeding
 
Makalah
MakalahMakalah
Makalah
 
Dampak IPTEK Terhadap kehidupan Sosial
Dampak IPTEK Terhadap kehidupan SosialDampak IPTEK Terhadap kehidupan Sosial
Dampak IPTEK Terhadap kehidupan Sosial
 
Makalah ISBD(manusia dan lingkungan)
Makalah ISBD(manusia dan lingkungan)Makalah ISBD(manusia dan lingkungan)
Makalah ISBD(manusia dan lingkungan)
 

Similar a Sands Fish - Knowing in the Age of Networked Knowledge

Linked Data Workshop Stanford University
Linked Data Workshop Stanford University Linked Data Workshop Stanford University
Linked Data Workshop Stanford University Talis Consulting
 
FAIR data: LOUD for all audiences
FAIR data: LOUD for all audiencesFAIR data: LOUD for all audiences
FAIR data: LOUD for all audiencesAlessandro Adamou
 
Semantic technologies for the Internet of Things
Semantic technologies for the Internet of Things Semantic technologies for the Internet of Things
Semantic technologies for the Internet of Things PayamBarnaghi
 
Hide the Stack: Toward Usable Linked Data
Hide the Stack:Toward Usable Linked DataHide the Stack:Toward Usable Linked Data
Hide the Stack: Toward Usable Linked Dataaba-sah
 
Toward universal information access on the digital object cloud
Toward universal information access on the digital object cloudToward universal information access on the digital object cloud
Toward universal information access on the digital object cloudNational Institute of Informatics
 
Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...
Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...
Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...giuseppe_futia
 
The Social Semantic Server: A Flexible Framework to Support Informal Learning...
The Social Semantic Server: A Flexible Framework to Support Informal Learning...The Social Semantic Server: A Flexible Framework to Support Informal Learning...
The Social Semantic Server: A Flexible Framework to Support Informal Learning...tobold
 
The Social Semantic Server - A Flexible Framework to Support Informal Learnin...
The Social Semantic Server - A Flexible Framework to Support Informal Learnin...The Social Semantic Server - A Flexible Framework to Support Informal Learnin...
The Social Semantic Server - A Flexible Framework to Support Informal Learnin...Sebastian Dennerlein
 
EuropeanaTech 2018: A distributed network of digital heritage information
EuropeanaTech 2018: A distributed network of digital heritage informationEuropeanaTech 2018: A distributed network of digital heritage information
EuropeanaTech 2018: A distributed network of digital heritage informationEnno Meijers
 
Putting Linked Data to Use in a Large Higher-Education Organisation
Putting Linked Data to Use in a Large Higher-Education OrganisationPutting Linked Data to Use in a Large Higher-Education Organisation
Putting Linked Data to Use in a Large Higher-Education OrganisationMathieu d'Aquin
 
Overview of Big Data by Sunny
Overview of Big Data by SunnyOverview of Big Data by Sunny
Overview of Big Data by SunnyDignitasDigital1
 
Tds — big science dec 2021
Tds — big science dec 2021Tds — big science dec 2021
Tds — big science dec 2021Gérard Dupont
 

Similar a Sands Fish - Knowing in the Age of Networked Knowledge (20)

Linked Data Workshop Stanford University
Linked Data Workshop Stanford University Linked Data Workshop Stanford University
Linked Data Workshop Stanford University
 
The Web of Data: The W3C Semantic Web Initiative
The Web of Data: The W3C Semantic Web InitiativeThe Web of Data: The W3C Semantic Web Initiative
The Web of Data: The W3C Semantic Web Initiative
 
FAIR data: LOUD for all audiences
FAIR data: LOUD for all audiencesFAIR data: LOUD for all audiences
FAIR data: LOUD for all audiences
 
Linked (Open) Data
Linked (Open) DataLinked (Open) Data
Linked (Open) Data
 
Shifting the Burden from the User to the Data Provider
Shifting the Burden from the User to the Data ProviderShifting the Burden from the User to the Data Provider
Shifting the Burden from the User to the Data Provider
 
Semantic technologies for the Internet of Things
Semantic technologies for the Internet of Things Semantic technologies for the Internet of Things
Semantic technologies for the Internet of Things
 
Hide the Stack: Toward Usable Linked Data
Hide the Stack:Toward Usable Linked DataHide the Stack:Toward Usable Linked Data
Hide the Stack: Toward Usable Linked Data
 
Aggregation as tactic sm new
Aggregation as tactic sm newAggregation as tactic sm new
Aggregation as tactic sm new
 
Aggregation as Tactic
Aggregation as TacticAggregation as Tactic
Aggregation as Tactic
 
Toward universal information access on the digital object cloud
Toward universal information access on the digital object cloudToward universal information access on the digital object cloud
Toward universal information access on the digital object cloud
 
Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...
Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...
Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...
 
Implementing Linked Data in Low-Resource Conditions
Implementing Linked Data in Low-Resource ConditionsImplementing Linked Data in Low-Resource Conditions
Implementing Linked Data in Low-Resource Conditions
 
Web Information Systems Introduction and Origin of World Wide Web
Web Information Systems Introduction and Origin of World Wide WebWeb Information Systems Introduction and Origin of World Wide Web
Web Information Systems Introduction and Origin of World Wide Web
 
The Social Semantic Server: A Flexible Framework to Support Informal Learning...
The Social Semantic Server: A Flexible Framework to Support Informal Learning...The Social Semantic Server: A Flexible Framework to Support Informal Learning...
The Social Semantic Server: A Flexible Framework to Support Informal Learning...
 
The Social Semantic Server - A Flexible Framework to Support Informal Learnin...
The Social Semantic Server - A Flexible Framework to Support Informal Learnin...The Social Semantic Server - A Flexible Framework to Support Informal Learnin...
The Social Semantic Server - A Flexible Framework to Support Informal Learnin...
 
EuropeanaTech 2018: A distributed network of digital heritage information
EuropeanaTech 2018: A distributed network of digital heritage informationEuropeanaTech 2018: A distributed network of digital heritage information
EuropeanaTech 2018: A distributed network of digital heritage information
 
Putting Linked Data to Use in a Large Higher-Education Organisation
Putting Linked Data to Use in a Large Higher-Education OrganisationPutting Linked Data to Use in a Large Higher-Education Organisation
Putting Linked Data to Use in a Large Higher-Education Organisation
 
Overview of Big Data by Sunny
Overview of Big Data by SunnyOverview of Big Data by Sunny
Overview of Big Data by Sunny
 
Tds — big science dec 2021
Tds — big science dec 2021Tds — big science dec 2021
Tds — big science dec 2021
 
Scaling the (evolving) web data –at low cost-
Scaling the (evolving) web data –at low cost-Scaling the (evolving) web data –at low cost-
Scaling the (evolving) web data –at low cost-
 

Último

TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 

Último (20)

TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 

Sands Fish - Knowing in the Age of Networked Knowledge

  • 1. Big Data and You Preparing Current & Future Information Specialists Sands Fish Data Scientist / MIT Libraries @sandsfish sands@mit.edu
  • 2.
  • 3.
  • 4. Knowing in the Age of Networked Knowledge
  • 5.
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 14. Scholarly Primitives - Discovering - Annotating - Comparing - Referring - Sampling - Illustrating - Representing John Unsworth, 2000. http://people.brandeis.edu/~unsworth/Kings.5-00/primitives.html
  • 15. Complex Knowledge Objects - Have multiple representations & ways of being consumed - Can be a link in a chain, node in a graph, or ecosystem of knowledge. - Allow different perspectives or ways to ask questions of. (none of these are true of physical books)
  • 16. Complex Knowledge Objects Data Examples: • • • • • JSON, XML, etc. esp. from a URL that allows it to be updated Visualizations, sonifications, etc. (mind-maps, interactives) Geospatial data, layered, constrained by area APIs Linked Data, integrated with many other resources
  • 17. Complex Knowledge Objects • Tool / Platform Examples: • • • • • Integrated Data Platforms Courseware MOOCs Interactive Visualizations Commons-based peer production (wikis, reviews, software, etc.) • Tweets • Data analysis tools • Data Enclaves (limited access processing endpoints)
  • 18.
  • 19.
  • 20.
  • 21.
  • 22.
  • 23. Methods of Exploration In this diverse ecosystem, there is no one way of exploring a topic. - Manual Browsing - Automated Spidering (e.g. Berkman / Media Cloud) - Collection / Trawling (e.g. Browser Plugins) - Conventional Big Data (e.g. Hadoop, Map/Reduce) - Using Linked Data to branch out through related concepts - Algorithmic Data Processing (e.g. Topic Modeling)
  • 24.
  • 25.
  • 26.
  • 27.
  • 28.
  • 29.
  • 30. Problems of Completeness - When do you know that you have enough information? - What kind of compromises are made when information is more massive than anyone can consume?
  • 31. Problems of Integration When data comes from many different silos, in many different structures and formats, how do you bring all of this knowledge together? - One solution is RDF, which provides a common, generic data model. Collaborative ontology development can allow communities to work together. - Open standards. - Build tools and services that provide easy access to the underlying data.
  • 32. How To Get A Grip - Keep abreast of developments at the W3C and other standards bodies. - Don’t focus too much on single technologies. They will shift quickly. - Learn at least one data visualization technology. - Remember to frame questions of data in more than one way. - Ask your own questions of the data yourself. Understand it from the point of view of the user.
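
The RDF point on the "Problems of Integration" slide can be illustrated with a minimal sketch. It does not use a real RDF library; it only assumes the core idea that every fact is a (subject, predicate, object) triple, so that records from differently shaped silos can be merged once they share that shape. The two "silos", the `ex:` identifiers, and all function names are hypothetical and exist only for this illustration.

```python
# Minimal sketch of the RDF idea: every fact is a (subject, predicate, object) triple.
# The records, the "ex:" identifiers, and the helper names are made up for illustration.

def triples_from_catalog(row):
    """Convert a flat catalog record (hypothetical silo A) into triples."""
    subject = f"ex:item/{row['id']}"
    return {
        (subject, "ex:title", row["title"]),
        (subject, "ex:creator", f"ex:person/{row['author_id']}"),
    }

def triples_from_people(doc):
    """Convert a nested person document (hypothetical silo B) into triples."""
    subject = f"ex:person/{doc['person']['id']}"
    return {
        (subject, "ex:name", doc["person"]["name"]),
        (subject, "ex:affiliation", doc["person"]["org"]),
    }

def match(graph, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

catalog_row = {"id": "42", "title": "Example Report", "author_id": "7"}
people_doc = {"person": {"id": "7", "name": "A. Author", "org": "Example Lab"}}

# Once both silos share the triple shape, merging them is a set union.
graph = triples_from_catalog(catalog_row) | triples_from_people(people_doc)

# Follow a link across the former silo boundary: item -> creator -> name.
creator = match(graph, s="ex:item/42", p="ex:creator")[0][2]
creator_name = match(graph, s=creator, p="ex:name")[0][2]
print(creator_name)
```

The design choice worth noting is that neither converter needs to know about the other silo: agreement on the triple shape and on shared identifiers is what makes the cross-silo lookup possible.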

Editor's notes

  1. We have crossed technological thresholds in the past, where the scale of operational parameters exceeded human conceptualization. 10-60 rpm.
  2. 1,000-6,000 rpm
  3. 40,000 rpm
  4. We are crossing a similar threshold now with big data. Mathematical complexity in 3 dimensions is conceivable.
  5. Anything beyond that is basically impossible to visualize or conceptualize, even though Euclidean geometry scales and continues to work in higher dimensions, for example in the Vector Space Model, which represents text as high-dimensional vectors for data mining.
  6. The scale of network complexity has crossed this threshold as well. (2007)
  7. Linked Open Data Cloud, 2010. Has scaled enormously since.
  8. This all leaves us in an environment where the knowledge we need to educate ourselves about a given topic is not static, or found in a stack of books, but live, connected, and spanning the borders of conventional containers.
  9. We now have to contend with new forms of knowledge, new ways of discovering it, and new challenges to making use of it. Amazon deleted copies of George Orwell’s 1984 off of users’ Kindles. We are no longer in a world where knowledge is a simple, static asset. If authors can change their writing, the government can change their data.
  10. Unsworth’s Scholarly Primitives are worth considering when planning for high-level functionality in a world where we rely on abstractions, in the form of tools and algorithms, to represent things in lower dimensionality. What are the tasks we need to provide access to on top of great scale?
  11. Knowledge is being represented in vastly more complex structures than it used to be. Previously, books, tables, encyclopedias, and simple web pages were the standard. Now we have a heterogeneous mix of knowledge representation. These go beyond the page or document. They have the following properties.
  12. They range from the concrete to the abstract.
  13. Enigma.io, connecting disparate but linkable gov’t data sets.
  14. WikiData as a complex knowledge object / ecosystem.
  15. Tweet metadata. http://stackoverflow.com/questions/16600099/how-to-output-json-data-correctly-using-php
  16. Mind maps (even in planning for this talk) are complex knowledge objects.
  17. Browsing history data; Civic Media; Catherine D’Ignazio is doing work to learn about where you pay attention and where you don’t. Browsing trackers are not independent, but networked. http://skyeome.net/wordpress/?paged=2
  18. Linked Data Wanderer: Archie Bunker -> Singing -> List of Sovereign States -> Republican Party -> Immigrants to Cuba -> Human Voice -> Homeward Bound (the album)
  19. Influence sub-graph on top of live DBpedia data. ( @sandsfish – http://dbpeople.herokuapp.com )
  20. Highly linked. Influence sub-graph on top of live DBpedia data. ( @sandsfish – http://dbpeople.herokuapp.com )
  21. Geo-parsing of place mentions in MIT Open Access collection. http://dspace.mit.edu
  22. Geo-parsing of place mentions in MIT Open Access collection. http://dspace.mit.edu
  23. For an Open Access collection to be useful to a wide population, it needs to be exposed for mining. I’m building an API to allow anyone to mine this information, instead of being limited to the representation of it on a web page or in a PDF. @sandsfish
  24. If you primarily work with raw data, or are a librarian, learn even the most basic methods of visualization. This will give you a vocabulary with which to interact with data owners or patrons, and provide a better way of conceptualizing what questions are possible with data. Shifting your perspective from geographic to temporal, or from a single answer to a range of possibilities, will help expand the knowledge you can acquire about a topic. This is one of the benefits of knowledge being complex.
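
Note 5 mentions the Vector Space Model as a case where Euclidean geometry keeps working in dimensions nobody can picture. The following is a minimal sketch of that idea, assuming the simplest possible setup: each document becomes a vector of raw word counts (one dimension per distinct word), and similarity is the cosine of the angle between two vectors. The sample sentences and function names are invented for illustration, and real text-mining pipelines typically add weighting and better tokenization.

```python
import math
from collections import Counter

def term_vector(text):
    """Represent a text as word counts: one dimension per distinct word."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors."""
    shared = a.keys() & b.keys()
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document has no direction to compare
    return dot / (norm_a * norm_b)

# Hypothetical sample documents, not taken from the talk.
docs = {
    "a": "linked data connects open data sets",
    "b": "open data sets are linked together",
    "c": "turbines spin at high speed",
}
vectors = {name: term_vector(text) for name, text in docs.items()}

sim_ab = cosine_similarity(vectors["a"], vectors["b"])
sim_ac = cosine_similarity(vectors["a"], vectors["c"])
print(f"a vs b: {sim_ab:.3f}")
print(f"a vs c: {sim_ac:.3f}")
```

The formula is the same one used for the angle between two arrows in a plane; only the number of dimensions changes, which is the point the note makes about the geometry scaling.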