
Human-in-the-loop @ ISWS 2019

"Pill" tutorial about human in the loop approaches at the International Semantic Web Research Summer School in Bertinoro, July 2019



  1. HUMAN IN THE LOOP Irene Celino – irene.celino@cefriel.com Cefriel, Viale Sarca 226, 20126 Milano In Depth Pill @ ISWS 2019 – Bertinoro, July 3rd, 2019
  2. AGENDA 1. Why humans in KG evolution and preservation 2. Approaches to human involvement 3. Need for human-in-the-loop & Explainable AI copyright © 2019 Cefriel – All rights reserved
  3. 1. WHY HUMANS IN KG EVOLUTION AND PRESERVATION Do we need to involve people in Knowledge Graph processing? What semantic data management tasks can we effectively “outsource” to humans? from ideation to business value
  4. HUMANS IN THE SEMANTIC WEB • Knowledge-intensive and/or context-specific character of Semantic Web tasks: e.g., conceptual modelling, multi-language resource labelling, content annotation with ontologies, concept/entity similarity recognition, … • Need to engage users and involve them in executing tasks: e.g., wikis for semantic content authoring, folksonomies to bootstrap formal ontologies, instance creation by data entry, …
  5. SEMANTIC WEB TASKS (ALSO) FOR HUMANS [diagram: human tasks arranged along two axes, fact level vs. schema level and collection/creation/correction/validation/filtering/ranking/linking: conceptual modelling, ontology population, quality assessment, ontology re-engineering, ontology pruning, ontology elicitation, knowledge acquisition, ontology repair, KG evolution, data search/selection, link generation, ontology alignment, ontology matching, KG population, KG preservation]
  6. KNOWLEDGE GRAPH REFINEMENT • Knowledge Graph Refinement is an emerging and hot topic that aims to (1) identify and correct errors and (2) add missing knowledge, often by means of statistical and/or machine learning • Machine learning approaches train automatic models on the basis of a training set, thus they require some partial gold standard, often also named “ground truth” • Ground truth is usually put together manually by experts, but sourcing training sets from humans is expensive! • Building a training set for Knowledge Graph refinement = asking people to execute a set of KG refinement tasks • Heiko Paulheim. Knowledge graph refinement: A survey of approaches and evaluation methods. Semantic Web Journal, 2017
  7. DATA LINKING TASKS • Data linking is the creation of links in the form of RDF triples (subject, predicate, object) • A link score σ ∈ [0,1] can be attached to each existing/created link, indicating the confidence in the truth value of the link • Common cases of Data Linking: • Link creation: a link l = (rs, p, ro) is created • Link ranking: a score σ ∈ [0,1] is assigned to each link l, representing the probability of the link to be recognized as true; links are ordered by their score σ (ranking) • Link validation: a score σ ∈ [0,1] is assigned to each link l, representing the actual truth value of the link; a threshold t ∈ [0,1] tells apart true (σ ≥ t) from false (σ < t) links
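The ranking and validation tasks can be sketched in a few lines of code. This is a minimal illustration only; the dictionary layout and threshold value are assumptions, not from any specific library:

```python
def rank_links(links):
    """Link ranking: order links by descending confidence score sigma."""
    return sorted(links, key=lambda link: link["sigma"], reverse=True)

def validate_links(links, t=0.5):
    """Link validation: a threshold t in [0, 1] tells apart
    true (sigma >= t) from false (sigma < t) links."""
    return [(link["s"], link["sigma"] >= t) for link in links]

# Link creation: each link l = (rs, p, ro) carries an attached score sigma
links = [
    {"s": "asset1", "p": "foaf:depiction", "o": "photo1", "sigma": 0.9},
    {"s": "asset2", "p": "foaf:depiction", "o": "photo2", "sigma": 0.3},
]
ranked = rank_links(links)          # asset1's link ranks above asset2's
validated = validate_links(links)   # [("asset1", True), ("asset2", False)]
```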
  8. OUTSOURCE DATA LINKING TO HUMANS • There are several incentive schemes to convince people to participate in a (possibly tedious) activity of KG refinement: money prizes, premium access, curiosity, recognition, … • One of such incentives is fun and enjoyment • The GWAP Enabler is a software framework to build gaming applications designed to solve some KG refinement task by involving participants as players (https://github.com/STARS4ALL/gwap-enabler): people play games and have fun, thereby executing KG refinement tasks and producing a training set (ground truth) used to train machine learning models for automatic KG refinement • Gloria Re Calegari, Andrea Fiano and Irene Celino. A Framework to build Games with a Purpose for Linked Data Refinement. International Semantic Web Conference, 2018
  9. EXAMPLE 1: INDOMILANDO (LINK RANKING) http://bit.ly/Indomilando • Cultural heritage assets in Milano and their pictures • Input: set of all links <asset> foaf:depiction <photo> • Goal: assign score σ to rank links on their recognisability/representativeness • The score σ is a function of X/N, where X is the no. of successes (= recognitions) and N the no. of trials of the Bernoulli process (guess or not guess) realized by the game • Game with a hidden purpose • Points, badges, leaderboard as intrinsic reward • Link ranking is a result of the “agreement” between players • The game also has an educational “collateral effect” • Irene Celino, Andrea Fiano, Riccardo Fino. Analysis of a Cultural Heritage Game with a Purpose with an Educational Incentive. 16th International Conference on Web Engineering, 2016
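The ranking score above can be sketched as follows. The slide only says σ is a function of X/N, so using the raw success ratio is an assumption (a smoothed estimate could equally be used):

```python
def recognizability(successes, trials):
    """Score sigma for one <asset> foaf:depiction <photo> link: the fraction
    X/N of recognitions (successes) over N guess/no-guess Bernoulli trials."""
    return successes / trials if trials else 0.0

# A picture recognized 8 times out of 10 is ranked above one recognized 3/10
sigma_a = recognizability(8, 10)
sigma_b = recognizability(3, 10)
```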
  10. EXAMPLE 2: LCV GAME (LINK VALIDATION) http://bit.ly/foss4game https://youtu.be/Q0ru1hhDM9Q • Two automatic classifications of land cover in disagreement: <land-cover-assigned-by-DUSAF> ≠ <land-cover-assigned-by-GL30> • Input: set of links <land-area> clc:hasLandCover <land-cover> • Goal: assign score σ to each link to discover the “right” land cover class • Score σ of each link is updated on the basis of players’ choices (incremented if the link is selected, decremented if not) • When the score of a link exceeds the threshold (σ ≥ t), the link is considered “true” (and removed from the game) • Game with a not-so-hidden purpose (played by “experts”) • Points, badges, leaderboard as intrinsic reward • A player scores if he/she guesses one of the two disagreeing classifications • Link validation is a result of the “agreement” between players • Maria Antonia Brovelli, Irene Celino, Andrea Fiano, Monia Elisa Molinari, Vijaycharan Venkatachalam. A crowdsourcing-based game for land cover validation. Applied Geomatics, 2017
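The increment/decrement validation loop can be sketched like this; the step size and threshold values are assumptions, since the slide does not specify them:

```python
def update_score(sigma, selected, step=0.1):
    """Increment the link score when a player selects it, decrement when the
    player picks the other classification; clamp to [0, 1]."""
    return min(1.0, max(0.0, sigma + (step if selected else -step)))

def play_round(scores, chosen, t=0.7):
    """Apply one round of a player's choices; links reaching the threshold t
    are validated as 'true' and removed from the game."""
    validated = []
    for link in list(scores):
        scores[link] = update_score(scores[link], link in chosen)
        if scores[link] >= t:
            validated.append(link)
            del scores[link]
    return validated

scores = {"area1->DUSAF": 0.65, "area1->GL30": 0.5}
done = play_round(scores, chosen={"area1->DUSAF"})
# the selected link rises past t and is validated; the other one drops
```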
  11. EXAMPLE 3: NIGHT KNIGHTS (LINK CREATION & VALIDATION) http://nightknights.eu • Identify pictures of cities from above among those taken on board the ISS (the pictures are then used in a scientific process in light pollution research) • Input: set of subject resources (pictures) and object resources (classification categories) • Goal: create links <picture> hasCategory <category> and assign score σ to each link • Score σ of each link is updated on the basis of players’ choices (incremented if the link is selected) • When the score of a link exceeds the threshold (σ ≥ t), the link is considered “true” (and the picture is removed from the game) • Pure game with a not-so-hidden purpose (but played by anybody) • Points, badges, leaderboard as intrinsic reward • A player scores if he/she agrees with another player • “Bonus” intrinsic reward with NASA pictures! • Gloria Re Calegari, Gioele Nasi, Irene Celino. Human Computation vs. Machine Learning: an Experimental Comparison for Image Classification. Human Computation Journal, vol. 5, issue 1, 2018 • Gloria Re Calegari and Irene Celino. Interplay of Game Incentives, Player Profiles and Task Difficulty in Games with a Purpose. EKAW 2018
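Unlike the previous example, here links are created (not pre-existing) and scores only grow. A toy sketch of the agreement mechanism; the threshold of two agreeing players and the picture/category names are assumptions:

```python
def record_choice(scores, picture, category, t=2):
    """A player assigns a picture to a category: the corresponding
    <picture> hasCategory <category> link gains one point, and is accepted
    as 'true' (the picture leaves the game) once t players agree on it."""
    link = (picture, category)
    scores[link] = scores.get(link, 0) + 1
    return scores[link] >= t

scores = {}
first = record_choice(scores, "iss_001", "city")    # first player: not yet accepted
second = record_choice(scores, "iss_001", "city")   # second player agrees: accepted
```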
  12. 2. APPROACHES TO HUMAN INVOLVEMENT What goals can humans help machines to achieve? How to involve a crowd of persons? What extrinsic rewards (money, prizes, etc.) or intrinsic incentives can we adopt to motivate people?
  13. GAMES WITH A PURPOSE • A GWAP outsources to humans some steps of a computational process in an entertaining way • The application has a “collateral effect”, because players’ actions are exploited to solve a hidden task • The application *IS* a fully-fledged game (as opposed to gamification, which is the use of game-like features in non-gaming environments) • The players are (usually) unaware of the hidden purpose; they simply meet game challenges • Luis von Ahn. Games with a purpose. Computer, 39(6):92–94, 2006 • Luis von Ahn and Laura Dabbish. Designing games with a purpose. Communications of the ACM, 51(8):58–67, 2008
  14. GAMES WITH A PURPOSE (GWAP) • Problem: AI is unable to achieve an adequate result with a satisfactory level of confidence • Solution: hide the task within a game, so that users are motivated by game challenges, often remaining unaware of the hidden purpose; task solution comes from agreement between players
  15. HUMAN COMPUTATION • Human Computation is a computer science technique in which a computational process is performed by outsourcing certain steps to humans. Unlike traditional computation, in which a human delegates a task to a computer, in Human Computation the computer asks a person or a large group of people to solve a problem; then it collects, interprets and integrates their solutions • The original concept of Human Computation by its inventor Luis von Ahn derived from the common-sense observation that people are intrinsically very good at solving some kinds of tasks which are, on the other hand, very hard to address for a computer; this is the case for a number of targets of Artificial Intelligence (like computer vision or natural language understanding) for which research is still open • Edith Law and Luis von Ahn. Human computation. Synthesis Lectures on Artificial Intelligence and Machine Learning, 2011
  16. HUMAN COMPUTATION • Problem: an Artificial Intelligence algorithm is unable to achieve an adequate result with a satisfactory level of confidence • Solution: ask people to intervene when the AI system fails, “masking” the task within another human process • Example: https://www.google.com/recaptcha/
  17. CROWDSOURCING • Crowdsourcing is the process of outsourcing tasks to a “crowd” of distributed people. The possibility to exploit the Internet as a vehicle to recruit contributors and to assign tasks led to the rise of micro-work platforms, thus often (but not always) implying a monetary reward. The term Crowdsourcing, although quite recent, is used to indicate a wide range of practices; however, its most common meaning implies that the “crowd” of workers involved in the solution of tasks is different from the traditional or intended groups of task solvers • Jeff Howe. Crowdsourcing: How the power of the crowd is driving the future of business. Random House, 2008
  18. CROWDSOURCING • Problem: a company needs to execute a lot of simple tasks, but cannot afford hiring a person to do that job • Solution: pack tasks in bunches (human intelligence tasks or HITs) and outsource them to a very cheap workforce through an online platform • Example: https://www.mturk.com/
  19. CITIZEN SCIENCE • Citizen Science is the involvement of volunteers to collect or process data as part of a scientific or research experiment; those volunteers can be the scientists and researchers themselves, but more often the name of this discipline “implies a form of science developed and enacted by citizens” including those “outside of formal scientific institutions”, thus representing a form of public participation in science. Formally, Citizen Science has been defined as “the systematic collection and analysis of data; development of technology; testing of natural phenomena; and the dissemination of these activities by researchers on a primarily avocational basis” • Alan Irwin. Citizen science: A study of people, expertise and sustainable development. Psychology Press, 1995
  20. CITIZEN SCIENCE • Problem: a scientific experiment requires the execution of a lot of simple tasks, but researchers are busy • Solution: engage the general audience in solving those tasks, explaining that they are contributing to science, research and the public good • Example: https://www.zooniverse.org/
  21. SPOT THE DIFFERENCE… [diagram: overlap of Human Computation, Crowdsourcing and Citizen Science] • Similarities: involvement of people; no automatic replacement • Variations: motivation; reward (glory, money, passion/need) • Hybrid or parallel approaches are possible!
  22. 3. NEED FOR HUMAN-IN-THE-LOOP & EXPLAINABLE AI Why are humans still on the rise in the AI era?
  23. WHEN TO INVOLVE HUMAN-IN-THE-LOOP IN MODELLING • Before creating a model: training set creation (on which data do I build a model?) • During the modelling phase: model validation (is my model correct?); active learning (what additional training data would improve my model performance?) • Using a model in production: algorithmic transparency (should I trust the way my model gave such a prediction?); prediction explainability (why did my model give such a prediction?)
  24. EXPLAINABLE AI (XAI) • “We are entering a new age of AI applications; machine learning is the core technology” • “Machine learning models are opaque, non-intuitive, and difficult for people to understand” • “Current AI systems offer tremendous benefits, but their effectiveness is limited by the machine’s inability to explain its decisions and actions to users” • “Explainable AI will be essential if users are to understand, appropriately trust, and effectively manage this incoming generation of artificially intelligent partners” • Source: https://www.darpa.mil/program/explainable-artificial-intelligence
  25. WHAT EXPLANATION MEANS • Explanation = set of hints to understand the relationship between the characteristics of an individual (e.g. an email) and the model prediction on that individual (e.g. this email is spam) • Different levels of prediction trust: trust in an individual prediction → requires an explanation about the individual (e.g. why this mail is spam); trust in the entire model → requires (1) selecting a representative sample of individuals (e.g. a set of spam/non-spam emails) + (2) explaining each individual in the sample • Characteristics of explanations: local fidelity → valid in the vicinity of the individual (but not necessarily globally); model agnosticism → independent of the specific black-box model; interpretability → qualitative understanding of the explanation (e.g. words, not word embeddings); this depends on the audience, because humans use their previous knowledge to interpret an explanation • M. Ribeiro, S. Singh, C. Guestrin. “Why should I trust you?” Explaining the predictions of any classifier. Proceedings of the 22nd ACM SIGKDD, 2016
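A toy, occlusion-style sketch of such an explanation (not the LIME algorithm from the cited paper; the spam model and word-dropping scheme are illustrative assumptions). It is local (valid for this one email), model-agnostic (it only calls the black box), and interpretable (it reports words, not embeddings):

```python
def explain(predict, words):
    """Drop each word in turn and record how much the black-box spam
    probability changes: a positive value means the word pushes the
    prediction towards 'spam'."""
    base = predict(words)
    return {w: base - predict([x for x in words if x != w]) for w in set(words)}

def spam_prob(words):
    """Toy black box standing in for any opaque classifier."""
    return min(1.0, 0.4 * sum(w in {"winner", "free"} for w in words))

contributions = explain(spam_prob, ["you", "are", "a", "winner", "free", "prize"])
# 'winner' and 'free' get positive contributions; neutral words get ~0
```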
  26. WHAT HUMAN EXPLANATION MEANS • Two meanings of explanation: “machine” explanation = what the machine thinks (scientific theory, phenomena comprehension); “human” explanation = what the human wants to know to interpret a model • Characteristics of explanations from the human point of view: selective explanations (not all possible reasons, but only “relevant” causes, not including pre-existing beliefs/assumptions); contrastive explanations (counterfactual causality, “why P and not Q?”); social explanations (dialogue/conversation, interaction, iteration) • Why human-in-the-loop is needed for Explainable AI: don't let computer scientists decide how to formulate explanations, because otherwise explanations are too close to the model and too far from human understanding; there is a large body of knowledge about explanations from social sciences • B. Mittelstadt, C. Russell, S. Wachter. Explaining explanations in AI. Proceedings of FAT 2019 • Tim Miller, Piers Howe and Liz Sonenberg. “Explainable AI: Beware of Inmates Running the Asylum”. IJCAI-17 Workshop on Explainable AI, 2017
  27. WHY IS THIS SO IMPORTANT? THE BUSINESS VIEW • Requirements for AI Adoption • 7 Pillars of Explainable AI (in Healthcare) • A. Teredesai, M. Aurangzeb Ahmad, C. Eckert, V. Kumar (KenSci Inc.). “Explainable Models for Healthcare AI”. https://www.kensci.com/explainable-machine-learning/ • Freddy Lecue. On The Role of Knowledge Graphs in Explainable AI. Submitted to Semantic Web Journal, 2019
  28. BEFORE YOU GO… bit.ly/isws2019hitl
  29. Thanks for your attention! Any question? Irene Celino, Knowledge Technologies Group, Digital Interaction Division, irene.celino@cefriel.com • MILANO: viale Sarca 226, 20126 Milano, Italy • LONDON: 4th floor, 57 Rathbone Place, London W1T 1JU, UK • NEW YORK: One Liberty Plaza, 165 Broadway, 23rd Floor, New York City, New York, 10006, USA • Cefriel.com • © copyright 2018 CEFRIEL – All rights reserved
