Letizia Tanca - Exploring Databases: The Indiana Project

Letizia Tanca, Politecnico di Milano, gave this presentation for the Cognitive Systems Institute Speaker Series on July 21, 2016.

Letizia Tanca
Politecnico di Milano
joint work with Università della Basilicata
(credits in the last slide)
Cognitive Systems Institute
Speaker Series

User Interaction
Visualize
Annotation
Collaboration
Efficiency
Explanations
Sampling
Personalization
Intensional view
Query Suggestion

•  Rich data
•  Dialogue-based interaction
•  Based on intensional characterization
of the information
•  Meaningful feedback (relevance)
•  User experience
Database Exploration as a viewpoint of
Exploratory Computing:
→ only, more emphasis on efficiency

•  Starting point: a large,
“semantically-rich” db
•  Goals
•  explore, to learn
interesting things
•  without a clear, a-priori
perception of what we
are looking for

•  A classical db is inherently
transactional
•  “Data Enthusiasts” are not
willing to bear the cost of building a
warehouse
•  Interactive Data Cleaning
•  Let’s do it on the database!

[Architecture diagram: The UI Layer, The Engine Layer, The DB Layer]
“interesting” attributes
Activity (id, type, start, length, userId)

[View lattice diagram, spanning The Engine Layer and The DB Layer]
Tables: AcmeUser, Activity, Location, Sleep
Join views: AcmeUser ⨝ Location, Activity ⨝ AcmeUser, Sleep ⨝ AcmeUser
Attributes shown: type, sex, quality
View X is a parent of view Y means Y contains X as a subexpression
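
The parent relation between views can be illustrated with a small sketch. It assumes, as a simplification not stated on the slide, that a view is identified by the set of tables it joins, so "Y contains X as a subexpression" becomes "X's tables are a proper subset of Y's". The names `is_parent` and `build_lattice` are hypothetical.

```python
# Hypothetical representation: a view is the frozenset of tables it joins.
# "Subexpression" is approximated by "proper subset of joined tables".
TABLES = ["AcmeUser", "Activity", "Location", "Sleep"]


def is_parent(x: frozenset, y: frozenset) -> bool:
    """X is a parent of Y when Y contains X as a (strictly smaller) part."""
    return x < y


def build_lattice(views):
    """Map every view to the list of its parents among the given views."""
    return {y: [x for x in views if is_parent(x, y)] for y in views}


# Single-table views plus the three join views named on the slide.
views = [frozenset([t]) for t in TABLES] + [
    frozenset(["AcmeUser", "Location"]),
    frozenset(["Activity", "AcmeUser"]),
    frozenset(["Sleep", "AcmeUser"]),
]
lattice = build_lattice(views)

for view, parents in lattice.items():
    if parents:
        print(sorted(view), "has parents", [sorted(p) for p in parents])
```

Under this assumption each join view has exactly its two single-table views as parents, and a single-table view has none.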

•  Query Engine
•  Frequency distributions
of attribute values
•  Sampling
•  Statistical hypothesis
tests:
•  Real-valued attributes:
•  Kolmogorov-Smirnov
•  Categorical attributes
•  Chi-Square
•  or Entropy Test for low
frequencies
Query Engine
Computing Distributions
Running Hypothesis Tests
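
As a rough illustration of the statistics listed above, the sketch below computes a frequency distribution, the two-sample Kolmogorov-Smirnov statistic for real-valued attributes, and Pearson's Chi-Square statistic for categorical ones. It uses only the Python standard library and stops at the test statistics; turning them into p-values is left out. The function names and the toy values are illustrative assumptions, not the project's implementation.

```python
from bisect import bisect_right
from collections import Counter


def frequency_distribution(values):
    """Relative frequency of each distinct attribute value."""
    counts = Counter(values)
    total = sum(counts.values())
    return {v: c / total for v, c in counts.items()}


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def chi_square_statistic(observed_counts, expected_freq):
    """Pearson statistic of observed counts against expected frequencies.

    Assumes every category in expected_freq has a positive frequency and
    that observed categories outside expected_freq can be ignored.
    """
    n = sum(observed_counts.values())
    stat = 0.0
    for category, p in expected_freq.items():
        expected = n * p
        stat += (observed_counts.get(category, 0) - expected) ** 2 / expected
    return stat


# Toy values, made up for illustration only.
all_types = ["run", "run", "walk", "walk"]
subset_types = Counter(["run", "run", "run", "walk"])
print(chi_square_statistic(subset_types, frequency_distribution(all_types)))
print(ks_statistic([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
```

A larger statistic means the subset departs more from the reference distribution; for small expected counts the slide suggests an entropy test instead, which is not sketched here.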

1) Extraction
3) Iteration
4) Ranking of the
analyses based on the
Hellinger Distance
between the distributions
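
The ranking step can be sketched with the standard Hellinger distance between two discrete distributions, which ranges from 0 (identical) to 1 (disjoint support). The sketch assumes distributions are given as dictionaries of relative frequencies and that the most divergent analysis should come first; both choices, the name `rank_analyses`, and the toy data are assumptions for illustration.

```python
from math import sqrt


def hellinger(p, q):
    """Hellinger distance between two discrete distributions (dicts)."""
    keys = set(p) | set(q)
    total = sum((sqrt(p.get(k, 0.0)) - sqrt(q.get(k, 0.0))) ** 2 for k in keys)
    return sqrt(total / 2)


def rank_analyses(reference, candidates):
    """Sort candidate distributions by distance from the reference,
    most divergent first (an assumed ordering)."""
    scored = [(name, hellinger(reference, dist)) for name, dist in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy distributions, made up for illustration only.
reference = {"run": 0.5, "walk": 0.5}
candidates = {
    "same": {"run": 0.5, "walk": 0.5},
    "skewed": {"run": 0.9, "walk": 0.1},
}
for name, distance in rank_analyses(reference, candidates):
    print(name, round(distance, 3))
```

Because the distance is bounded, scores from attributes with different numbers of categories remain comparable on one scale.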

An interactive dialogue:
•  Users may change their
minds
•  Feedback: emphasis on
dataset properties, not on
extensions
•  Summarization
What is interesting is
discovered:
•  Discontinuities
•  Niche knowledge detection
is serendipitous: surprise vs.
previous subsets or vs. user’s
expectations
•  At each iteration the user
should understand
•  the “current” subset of
items (its properties)
•  the main differences vs.
one or more of the
previous subsets
•  where to focus her
attention (what is
interesting?)
•  Statistical approach to
finding discrepancies
•  A way to highlight relevant
properties

•  Politecnico di Milano: Paolo Paolini, Nicoletta Di Blas, Elisa
Quintarelli, Manuel Roveri, Mirjana Mazuran
•  Università della Basilicata: Giansalvatore Mecca, Donatello
Santoro, Marcello Buoncristiano, Antonio Giuzio
•  M. Buoncristiano, G. Mecca, E. Quintarelli, M. Roveri, D.
Santoro, L. Tanca: Database Challenges for Exploratory
Computing. SIGMOD Record, 2015
•  N. Di Blas, M. Mazuran, P. Paolini, E. Quintarelli, L. Tanca:
Exploratory computing: a draft Manifesto. DSAA 2014
•  S. Idreos, O. Papaemmanouil, S. Chaudhuri:
Overview of Data Exploration Techniques. SIGMOD 2015.
•  My post on the SIGMOD Blog
