2. Overview
o My Background & Research Themes
o Structuring Evidence in Wikipedia Discussions
o Supporting Systematic Review of Biomedical Evidence
3. Themes in My Research
o How do people collaborate to generate knowledge?
o What counts as evidence in a given community?
o How can structuring evidence help synthesize info?
4. What knowledge should be included in Wikipedia?
Jodi Schneider, Krystian Samp, Alexandre Passant, and Stefan Decker. “Arguments about Deletion: How Experience Improves the Acceptability of Arguments in Ad-hoc Online Task Groups.” In CSCW 2013.
Jodi Schneider and Krystian Samp. “Alternative Interfaces for Deletion Discussions in Wikipedia: Some Proposals Using Decision Factors.” [Demo] In WikiSym 2012.
Jodi Schneider, Alexandre Passant, and Stefan Decker. “Deletion Discussions in Wikipedia: Decision Factors and Outcomes.” In WikiSym 2012.
13. Problem: Newcomers are confused about
Wikipedia’s standards.
o “Why should a local cricket club not have it's own
page on this website? Obviously a valid club and
been established for a while. Nothing offensive or
false on the page. All need to do is put in Emsworth
Cricket Club into a search engine and information
comes up. Why just because it is a small team
and not major does it not deserve it's own page
on here?” (sic)
o “At the end of the day the club has history which
being 200 years is just as special as a article on a
breed of dog or something similar.”
o “really is worth a mention. Especially on a
website, where pointless people ... gets a
mention.” (sic)
18. Problem Summary
o Long, no-consensus discussions
→ Summarize discussions
o Newcomers are confused about Wikipedia’s standards
→ Make article criteria more explicit
19. Approach: Structure Evidence
1. Understand what evidence the community uses to
establish knowledge.
2. Structure the evidence.
3. Build a computer support system.
4. Test and refine the system.
21. Sample Corpus
o 72 discussions started on 1 day.
Each discussion has
• 3–33 messages
• 2–15 participants
o In total, 741 messages contributed by 244 users.
Each message has
• 3–350+ words
o 98 printed A4 sheets
22. Structuring the Data: Annotation
o Content analysis of the corpus
o Compare two different annotation approaches
o Iterative annotation
• Multiple annotators
• Refine to get good inter-annotator agreement
• 4 rounds of annotation
23. 2 Types of Annotation
o 1. Walton’s Argumentation Schemes
(Walton, Reed, and Macagno 2008)
• Informal argumentation
(philosophical & computational argumentation)
• Identify & prevent errors in reasoning (fallacies)
• 60 patterns
o 2. Factors Analysis
(Ashley 1991)
• Case-based reasoning
• E.g. factors for deciding cases in trade secret law,
favoring either party (the plaintiff or the defendant).
26. Factor Example (used to justify ‘keep’)
4 Key Factors (& “Other”):
Notability: “Anyone covered by another encyclopedic reference is considered notable enough for inclusion in Wikipedia.”
Sources: “Basic information about this album at a minimum is certainly verifiable, it's a major label release, and a highly notable band.”
Maintenance: “…this article is savable but at its current state, needs a lot of improvement.”
Bias: “It is by no means spam (it does not promote the products).”
Other: “I'm advocating a blanket ‘hangon’ for all articles on newly-drafted players…”
From: Jodi Schneider, Alexandre Passant & Stefan Decker, “Deletion Discussions in Wikipedia: Decision Factors and Outcomes.”
27. Decision factors articulate values/criteria.
o 4 Factors in Deletion Discussions cover:
• 91% of comments
• 70% of discussions
o Readers who understand these criteria:
• Understand what content is appropriate.
• Are less likely to have content deleted, and less likely to
take deletion personally.
28. To structure the data, we chose factors.
o 1. Walton’s Argumentation Schemes
(Walton, Reed, and Macagno 2008)
• Most appropriate for writing support
• 15 categories + 2 non-argumentative categories
• Detailed analysis of content
o 2. Factors Analysis
(drawing on Ashley 1991)
• Close to the community rules & policies
• 4 categories + 1 catchall
• Good domain coverage
29. Approach: Structure Evidence
1. Understand what evidence the community uses
to establish knowledge.
2. Structure the evidence.
3. Build a computer support system.
4. Test and refine the system.
33. Approach: Structure Evidence
1. Understand what evidence the community uses to
establish knowledge.
2. Structure the evidence.
3. Build a computer support system.
4. Test and refine the system.
35. Build a computer support system.
[Pipeline diagram:] Original Discussion → Semantic Enrichment (using the Ontology) → Semantically Enriched RDFa → Querying → Queryable → User Interface with Barchart (a minimal RDFa sketch follows)
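To make the pipeline concrete, here is a minimal RDFa sketch of one enriched comment. The ex: vocabulary, property names, and IDs are hypothetical placeholders for illustration, not the project's actual ontology:

<!-- sketch only: ex: is a hypothetical vocabulary, not the project's ontology -->
<div vocab="http://example.org/afd#" typeof="Comment" resource="#comment-12">
  <!-- link the comment to the decision factor it invokes -->
  <span property="hasFactor" resource="#Notability"></span>
  <!-- record the recommendation the comment makes -->
  <meta property="recommends" content="keep" />
  <p property="text">Anyone covered by another encyclopedic reference
    is considered notable enough for inclusion in Wikipedia.</p>
</div>

Markup along these lines is what makes the discussion queryable: an RDFa processor extracts triples that the querying stage can then consume.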
47. Approach: Structure Evidence
1. Understand what evidence the community uses to
establish knowledge.
2. Structure the evidence.
3. Build a computer support system.
4. Test and refine the system.
53. PU* – Perceived usefulness
PE* – Perceived ease of use
DC – Decision completeness
PF – Perceived effort
IC* – Information completeness
Statistical significance:
PU* p < .001
PE* p = .001
IC* p = .039
55. Results: 84% prefer our system.
“Information is structured and I can quickly get an
overview of the key arguments.”
“The ability to navigate the comments made it a bit
easier to filter my mind set and to come to a
conclusion.”
“It offers the structure needed to consider each factor
separately, thus making the decision easier. Also, the
number of comments per factor offers a quick
indication of the relevance and the deepness of the
decision.”
16/19 respondents, based on a 20-participant user test; 1 participant did not take the final survey.
56. Approach: Structure Evidence
1. Understand what evidence the community uses to
establish knowledge.
2. Structure the evidence.
3. Build a computer support system.
4. Test…
… & refine the system.
57. Summary
o Information technology can organize information
based on a community’s key decision factors.
o In Wikipedia, we developed an alternate interface for
deletion discussions.
o In Wikipedia, 4 questions are used to evaluate
borderline articles:
o Notability – Is the topic appropriate for our encyclopedia?
o Sources – Is the article well-sourced?
o Maintenance – Can we maintain this article?
o Bias – Is the article neutral? Are points of view appropriately weighted?
58. Summary: Our Process
1. Get to know a community and its needs.
Ethnography
2. Structure the data.
Annotation & ontology development
3. Build a computer support system.
Web standards: HTML, JavaScript, RDF/OWL, SPARQL (see the query sketch below)
4. Test & refine the system.
Human computer interaction
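As a rough illustration of the Web-standards step, the per-factor comment counts behind the barchart could come from a SPARQL query along these lines (a sketch against the hypothetical ex: vocabulary used earlier, not the system's actual queries):

# sketch only: ex: is a hypothetical vocabulary
PREFIX ex: <http://example.org/afd#>

# count how many comments invoke each decision factor,
# i.e. the numbers the barchart interface displays
SELECT ?factor (COUNT(?comment) AS ?numComments)
WHERE {
  ?comment a ex:Comment ;
           ex:hasFactor ?factor .
}
GROUP BY ?factor
ORDER BY DESC(?numComments)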
61. Info overload now goes beyond papers
Bastian, Glasziou, and Chalmers. “75 trials and 11 systematic reviews a day: how will we ever keep up?” PLoS Medicine 7.9 (2010): e1000326.
62. For medication safety, how to structure evidence on drug-drug interactions and keep it up-to-date?
Jodi Schneider, Paolo Ciccarese, Tim Clark, and Richard D. Boyce. “Using the Micropublications ontology and the Open Annotation Data Model to represent evidence within a drug-drug interaction knowledge base.” 4th Workshop on Linked Science 2014: Making Sense Out of Data (LISC2014), at ISWC 2014.
Mathias Brochhausen, Jodi Schneider, Daniel Malone, Philip E. Empey, William R. Hogan, and Richard D. Boyce. “Towards a foundational representation of potential drug-drug interaction knowledge.” First International Workshop on Drug Interaction Knowledge Representation (DIKR-2014) at the International Conference on Biomedical Ontologies (ICBO 2014).
Jodi Schneider, Carol Collins, Lisa Hines, John R. Horn, and Richard Boyce. “Modeling Arguments in Scientific Papers to Support Pharmacists.” ArgDiaP 2014, The 12th ArgDiaP Conference: From Real Data to Argument Mining, Warsaw, Poland.
63. Part of a Larger Effort
o “Addressing gaps in clinically useful evidence on
drug-drug interactions”
o 4-year project, U.S. National Library of Medicine R01
grant
(PI, Richard Boyce; 1R01LM011838-01)
o Since February 2013:
evidence panel of domain experts
(Carol Collins, Lisa Hines, John R Horn, Phil Empey)
& informaticists
(Tim Clark, Paolo Ciccarese, Jodi Schneider)
o Programmer: Yifan Ning
65. Prescribers consult drug interaction references, which are maintained by expert pharmacists.
[Screenshots: Medscape, Epocrates, Micromedex 2.0]
67. Goals
o Support evidence-based updates to
drug-interaction reference databases.
o Make sense of the EVIDENCE:
• New clinical trials
• Adverse drug event reports
• Drug product labels
• FDA regulatory updates
http://jama.jamanetwork.com/article.aspx?articleid=18345467
69. Evidence Base Competency Questions
o 40 competency questions, such as:
• List all evidence by drug, drug pair, …
• List all default assumptions
(assertions not supported by evidence)
• Which single evidence items act as support or rebuttal
for multiple assertions of type X?
(e.g., substrate_of assertions)
• What data, methods, materials, were used in the study
reported in evidence item X?
• Which research group conducted the study reported in
evidence item X?
• Show me what evidence has been deprecated since my
last visit?
• Which assertions are supported by a specific FDA
guidance statement?
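To show how a competency question becomes a query, here is one possible SPARQL rendering of the “single evidence items supporting multiple assertions” question. It assumes Micropublications-style mp:supports links from evidence items to assertions; the DIKB's actual namespaces and predicates may differ:

# sketch only: assumes mp:supports links evidence to assertions;
# actual DIKB predicates may differ
PREFIX mp: <http://purl.org/mp/>

SELECT ?evidence (COUNT(DISTINCT ?assertion) AS ?numAssertions)
WHERE {
  ?evidence mp:supports ?assertion .
}
GROUP BY ?evidence
HAVING (COUNT(DISTINCT ?assertion) > 1)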
70. An Ontology for Representing Evidence
Clark, Ciccarese, Goble (2014) Micropublications: a semantic model for claims, evidence, arguments and
annotations in biomedical communications
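A toy instance may help: the Turtle below sketches a micropublication arguing a PDDI claim, with one supporting statement. The class and property names follow the Clark, Ciccarese & Goble model as I read it (mp:Micropublication, mp:Claim, mp:Statement, mp:argues, mp:supports), but the dikb: namespace and all instance data are invented:

@prefix mp:   <http://purl.org/mp/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dikb: <http://example.org/dikb#> .  # hypothetical namespace

# the claim the micropublication argues for (invented example)
dikb:claim1 a mp:Claim ;
    rdfs:label "Drug A inhibits the metabolism of drug B." .

# a statement quoted from a study report, supporting that claim
dikb:statement1 a mp:Statement ;
    rdfs:label "Co-administration of drug A increased exposure to drug B." ;
    mp:supports dikb:claim1 .

dikb:micropub1 a mp:Micropublication ;
    mp:argues dikb:claim1 .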
83. Next steps
o Continuing data model development & testing.
o NLP support: Create a pipeline for extracting
potential drug-drug interaction mentions from
scientific & clinical literature.
o NLP + "expertsourcing" and crowdsourcing
(distributed annotation).
o Test annotation tools: usability for domain experts.
o Resolving links to paywalled PDFs.
90. Walton’s Argumentation Schemes
Example Argumentation Scheme:
Argument from Rules – “we apply rule X”
Critical Questions
1. Does the rule require carrying out this type of
action?
2. Are there other established rules that might conflict
with or override this one?
3. Are there extenuating circumstances or an excuse
for noncompliance?
Walton, Reed, and Macagno 2008
Online discussions are the focus of the first project, which addresses the question of “What knowledge should be included in Wikipedia?"
Wikipedia is extremely popular: it’s the world’s 7th most visited website. But what knowledge gets included?
It’s a little known fact that Wikipedia deletes articles. For most readers, messages like these are the only sign of articles at risk for deletion,
or deleted articles.
In fact, 1 in 4 Wikipedia articles is deleted.
While many articles are deleted without discussion, each week about 500 borderline articles are considered for deletion, through open online discussions that anyone can comment on.
Here is an example discussion. First, someone nominates the article for deletion. In this case, the article is about a baseball pitcher. The nominator says that we should delete the article: Heath Totten doesn’t merit an article since he doesn’t have a very good record and hasn’t played in a few years.
The second message responds and suggests keeping the article. This message gives new evidence to support keeping the article about Heath Totten: that he is actively playing.
We find that there are a few problems with these discussions. First of all, some discussions have no consensus, even after lengthy discussion. The same article may be repeatedly proposed for deletion, in some cases over 20 times.
One goal of this work is to summarize long discussions.
Second, newcomers are confused about Wikipedia’s standards. Newcomers make comments like these:
"Why just because it is a small team and not major does it not deserve it’s (sic) own page on here?"
just as special as a article on a breed of dog
especially on a website where pointless people get a mention
A second goal of this work is to make the criteria (the community’s standards) more explicit.
Newcomers also do not understand particular terminology, such as “reliable secondary source”. A common argument from an old-hand in our corpus is that “Notability [is] not demonstrated in a reliable secondary source”.
Newcomers misunderstand what Wikipedia counts as a “reliable secondary source”. Here, a newcomer replies that the article “will have refs from other sources” once the website it is describing goes live. To a Wikipedian, this is not a convincing argument, because how does this person know this? The "refs from other sources" sound like press releases – but reliable secondary sources must be independent.
So again, this shows the need to make the community standards more explicit.
Technically, started or relisted on that day.
Corpus is https://en.wikipedia.org/wiki/Wikipedia:Articles_for_deletion/Log/2011_January_29
Categories (Walton’s argumentation schemes) vs. process (factors analysis)
Very few content standards need to be clearly communicated to readers in order to bring significant benefit.
69.5% of discussions and 91% of comments are well-represented by just four factors: Notability, Sources, Maintenance and Bias. The best way to avoid deletion is for readers to understand these criteria.
20 novice participants used both systems
“The ability to navigate the comments made it a bit easier to filter my mind set and to come to a conclusion.”
“summarise and, at the same time, evaluate which factor should be considered determinant for the final decision”
Identify and explicitly represent arguments, and in particular
successful arguments that are persuasive to a given audience.
Adverse drug events are a leading cause of death
Image from https://www.njpharmacy.com/wp-content/uploads/2013/02/drug-interactions-checker.png
Image from http://www.clipartbest.com/clipart-McLLpbGKi
Adverse drug events are a leading cause of death
Images from
http://www.knowabouthealth.com/android-version-of-medscape-app-ready-to-download/7568/
Android Play store
http://amazingsgs.blogspot.com/2011/10/top-5-free-android-medical-apps-for.html
Most sources of clinically-oriented PDDI knowledge disagree substantially in their content, including about which drug combinations should never be co-administered. For example, only one quarter of 59 contraindicated drug pairs were listed in three PDDI information sources [4]; only 18 (28%) of 64 pharmacy information and clinical decision support systems correctly identified 13 PDDIs considered clinically significant by a team of drug interaction experts [5]; and four clinically oriented drug information compendia agreed on only 2.2% of 406 PDDIs considered to be “major” by at least one source [6].
From our paper: http://ceur-ws.org/Vol-1309/paper2.pdf
4. Wang, L.M., Wong, M., Lightwood, J.M., Cheng, C.M.: Black box warning contraindicated comedications: concordance among three major drug interaction screening programs. Ann. Pharmacother. 44, 28–34 (2010).
5. Saverno, K.R., Hines, L.E., Warholak, T.L., Grizzle, A.J., Babits, L., Clark, C., Taylor, A.M., Malone, D.C.: Ability of pharmacy clinical decision-support software to alert users about clinically important drug-drug interactions. J. Am. Med. Inform. Assoc. (JAMIA) 18, 32–37 (2011).
6. Abarca, J., Malone, D.C., Armstrong, E.P., Grizzle, A.J., Hansten, P.D., Van Bergen, R.C., Lipton, R.B.: Concordance of severity ratings provided in four drug interaction compendia. J. Am. Pharm. Assoc. (JAPhA) 44, 136–141 (2004).
40 competency questions
https://docs.google.com/document/d/1o0DYpu9FuXGCz861OOGkhYKA-KWMY-hHRBQ-R8IlqXc/edit
Not the only competency questions – also have e.g.
Queries Supporting Drug Interaction Management
https://docs.google.com/spreadsheets/d/1ikYsOB09XHUQiSl-KPlDZBQWScbi15rHUeyfOcUQz5M/edit#gid=0
Very precise specification of the entities
Improve sensitivity of information retrieval (recall/precision)
From http://dailymed.nlm.nih.gov/dailymed/fda/fdaDrugXsl.cfm?setid=13bb8267-1cab-43e5-acae-55a4d957630a&type=display
Evidence entry form from:
https://docs.google.com/viewer?a=v&pid=sites&srcid=ZGVmYXVsdGRvbWFpbnxkZGlrcmFuZGlyfGd4OjE0ZGIwY2IwNzJhOWNjMjY
From http://dailymed.nlm.nih.gov/dailymed/fda/fdaDrugXsl.cfm?setid=13bb8267-1cab-43e5-acae-55a4d957630a&type=display
For adding annotations: Existing MP plugin for Domeo
For viewing annotations: Want them highlighted in a web-based interface BUT Resolving annotations requires a method for pointing to paywalled/subscription PDF & HTML
An existing Micropublication plugin for Domeo [Ciccarese2014] is being modified as part of the project. Our plan is to use the revised plugin to support the evidence board with the collection of the evidence and associated annotation data. It will also enable the broader community to access and view annotations of PDDIs highlighted in a web-based interface. We anticipate that this approach will enable a broader community of experts to review each PDDI recorded in the DIKB and examine the underlying research study to confirm its appropriateness and relevance to the evidence base.
The usability of the annotation plugin is critically important so that the panel of domain experts will not face barriers to annotating and entering evidence. This will require usability studies of the new PDDI Micropublication plugin. Another issue is that many PDDI evidence items can be found only in PDF documents. Currently, the tool chain for PDF annotation is relatively weak: compared to text and HTML, PDF annotation tools are not as widely available and not as familiar to end-users. Suitable tools will have to be integrated into the revised plugin.
PDF documents may be in proprietary portals or academic library systems
Annotations in the data model are a set of RDF resources that connect some target to a set of resources that are in some way about it.
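For concreteness, such an annotation could look like the following Turtle, using Open Annotation classes and properties (oa:Annotation, oa:hasBody, oa:hasTarget, oa:SpecificResource, oa:TextQuoteSelector); the instance URIs and the quoted string are invented:

@prefix oa:   <http://www.w3.org/ns/oa#> .
@prefix dikb: <http://example.org/dikb#> .  # hypothetical namespace

# the annotation connects a body (a resource "about" the target) ...
dikb:anno1 a oa:Annotation ;
    oa:hasBody dikb:statement1 ;
    oa:hasTarget dikb:target1 .

# ... to a target: a text selection inside a source document
dikb:target1 a oa:SpecificResource ;
    oa:hasSource <http://example.org/some-paper> ;
    oa:hasSelector [
        a oa:TextQuoteSelector ;
        oa:exact "the quoted evidence passage"
    ] .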
We would count this as an Argument from Rules
Major Premise: If carrying out types of actions including A is the established rule for a, then (unless the case is an exception), a must carry out A.
Minor Premise: Carrying out types of actions including A is the established rule for a.
Conclusion: Therefore, a must carry out A.
Earlier in CSCW: Jodi Schneider, Krystian Samp, Alexandre Passant, Stefan Decker. “Arguments about Deletion: How Experience Improves the Acceptability of Arguments in Ad-hoc Online Task Groups”. In Computer Supported Cooperative Work and Social Computing (CSCW). San Antonio, TX, February 23-27, 2013.
Used as categories (annotation rounds):
Initial annotation: 60 categories (each Walton argumentation scheme), all arguments in each message.
Round 4: 15 most common argumentation schemes, main argument in each message.
Good inter-annotator agreement for a hard task: 54% agreement (compared to 12% chance) among 2 annotators.
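For reference, if 54% is read as observed agreement and 12% as expected chance agreement, the corresponding Cohen's-kappa-style correction (my arithmetic, not a statistic reported in the source) is:

\kappa = \frac{p_o - p_e}{1 - p_e} = \frac{0.54 - 0.12}{1 - 0.12} \approx 0.48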