Identifying Biological Knowledge:
    Three Possible Strategies
             Anita de Waard
     Disruptive Technologies Director,
        Elsevier Labs, Amsterdam
           Casimir Researcher,
       UiL-OTS, Utrecht University

     XRCE, Grenoble, 24 September 2009
Overview
Problem: too much discourse, tools are not yet good
enough...
1. First attempt: allow authors to validate entities
2. Second attempt: discourse analysis
3. Third attempt: collaboration to identify hypotheses
Why Study Biological Discourse?

-   There is too much of it!

-   Text mining and ‘fact
    extraction’ techniques are
    gaining ground to tame this
    tangle

-   Emerging area of biological
    natural language processing
    (BioNLP): subfield of computational linguistics

-   Main focus: identifying biological entities (genes,
    proteins, drugs) and their relationships
Example state of the art: MEDIE


                              Add this knowledge during authoring?

Alteration of nm23, P53, and S100A4 expression may
contribute to the development of gastric



       Previous studies have implicated miR-34a as a tumor
       suppressor gene whose transcription is activated by p53.
First attempt: allow authors
     to validate entities
Improve time + quality of knowledgebase entry

 - For database curators: save time and money
 - For authors: lower the threshold to submitting papers with
   metadata
 - Structured Digital Abstract: an editorial experiment to increase
   the reach of online published articles
 - SDA encodes in a schema information contained in the article
expression of GSG1 stimulates TPAP targeting to the
ER, suggesting that interactions between the two
proteins lead to the redistribution of TPAP from the
cytosol to the ER.

MINT-6168263:
Gsg1 (uniprotkb:Q8R1W2), TPAP
(uniprotkb:Q9WVP6) and Calmegin
(uniprotkb:P52194) colocalize (MI:0403) by
cosedimentation (MI:0027)

MINT-6168204, MINT-6168178:
Gsg1 (uniprotkb:Q8R1W2) and TPAP
(uniprotkb:Q9WVP6) colocalize (MI:0403) by
fluorescence microscopy (MI:0416)

MINT-6167930:
Gsg1 (uniprotkb:Q8R1W2) physically interacts (MI:
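
A minimal sketch of how one annotation sentence of this kind could be read into a structured record. Only the sample text comes from the slide; the regular expressions and the names `Participant` and `parse_sda` are illustrative assumptions, not part of MINT or of the SDA schema.

```python
import re
from dataclasses import dataclass


@dataclass
class Participant:
    name: str      # protein name as written in the annotation
    uniprot: str   # UniProtKB accession


# Matches "Name (uniprotkb:ACCESSION)"
PARTICIPANT = re.compile(r"(\w+)\s*\(uniprotkb:(\w+)\)")
# Matches a PSI-MI term reference such as "(MI:0403)"
MI_TERM = re.compile(r"\(MI:(\d{4})\)")


def parse_sda(text):
    """Return (participants, MI term ids) found in one annotation sentence."""
    flat = " ".join(text.split())  # undo the slide's line wrapping
    participants = [Participant(n, acc) for n, acc in PARTICIPANT.findall(flat)]
    mi_terms = ["MI:" + code for code in MI_TERM.findall(flat)]
    return participants, mi_terms


sample = """Gsg1 (uniprotkb:Q8R1W2), TPAP
(uniprotkb:Q9WVP6) and Calmegin
(uniprotkb:P52194) colocalize (MI:0403) by
cosedimentation (MI:0027)"""

participants, mi_terms = parse_sda(sample)
```

Here the first MI term is the interaction type and the second the detection method, following the order in which the sentence states them.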
How? Word Plugin
-   Okkam4MsW: a Microsoft Word plugin that interacts with web services applying NLP
    and semantic technologies to detect entities and contextual information
-   The OKKAM repository is queried to get the right OKKAM id and alternative ids
    (UniProt in this case)




OKKAM Entity Editor in MS Word
http://sig.ma
Second attempt:
discourse analysis
What else is wrong with MEDIE?

                          without some idea of the status of the
                           sentence, it cannot be interpreted!

Alteration of nm23, P53, and S100A4 expression may
contribute to the development of gastric



        Previous studies have implicated miR-34a as a tumor
        suppressor gene whose transcription is activated by p53.
Discourse Analysis
Underlying model of text mining systems:

    -   Scientific paper is ‘statement of pertinent facts’
    -   So: finding entities and relationships will give you a summary of
        the knowledge within the paper
    -   However, information extracted this way is not very useful....
Proposed approach: treat scientific paper as a persuasive text: specific
genre, with genre characteristics and allowed persuasive techniques:
    -   ‘these results suggest’ (depersonification)

    -   ‘as fig. 2a shows’ (evidence is in the data)

    -   ‘oncogenes produce a stress response [Serrano, 2003]’

References and data form a “folded array of successive defense lines, behind
which scientists ensconce themselves” (Latour, 1986)
Overall Research Questions
i. How can we model the discourse/suasive moves in a
   biological paper?
ii. Can this model help enable automated epistemic
    markup?
iii. Can it improve knowledge representations of
     collections of papers?
Discourse analysis
Segmentation and classification:
1. Parse text into discourse segments (elementary discourse units, EDUs),
   each containing a single rhetorical move (if possible...)
2. Determine categories or types of discourse segments
   that have similar semantic/pragmatic properties
3. Look at a number of linguistic characteristics and see if
   these segment types share those characteristics.
Segmentation
Goal: ‘one new thought per segment’:
Figure 4A shows that following RASV12 stimulation, p53
was stabilized and activated, and its target gene, p21cip1,
was induced in all cases, indicating an intact p53 pathway
in these cells.

a.   Figure 4A shows that                                      Intratextual
b.   following RASV12 stimulation                              Method
c.   p53 was stabilized and activated                          Result
d.   and its target gene, p21cip1, was induced in all cases,   Result
e.   indicating an intact p53 pathway in these cells.          Implication
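
One way to hold the labelled segments above in code, as a sketch only: the class names `SegmentType` and `Segment` and the helpers `rejoin` and `move_sequence` are hypothetical, and the enum lists just the four labels used in this example.

```python
from dataclasses import dataclass
from enum import Enum


class SegmentType(Enum):
    INTRATEXTUAL = "Intratextual"
    METHOD = "Method"
    RESULT = "Result"
    IMPLICATION = "Implication"


@dataclass
class Segment:
    text: str
    kind: SegmentType


# The example sentence, split by hand exactly as on the slide.
segments = [
    Segment("Figure 4A shows that", SegmentType.INTRATEXTUAL),
    Segment("following RASV12 stimulation,", SegmentType.METHOD),
    Segment("p53 was stabilized and activated,", SegmentType.RESULT),
    Segment("and its target gene, p21cip1, was induced in all cases,",
            SegmentType.RESULT),
    Segment("indicating an intact p53 pathway in these cells.",
            SegmentType.IMPLICATION),
]


def rejoin(segs):
    """Concatenate segments back into the original sentence."""
    return " ".join(s.text for s in segs)


def move_sequence(segs):
    """Collapse consecutive segments of the same type into one move."""
    moves = []
    for s in segs:
        if not moves or moves[-1] != s.kind.value:
            moves.append(s.kind.value)
    return moves
```

Keeping the segment text verbatim means the original sentence can always be recovered, while `move_sequence` exposes the order of rhetorical moves within it.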
Segment Types
Segment       Description                            Example
Fact          a known fact, generally without        mature miR-373 is a homolog of miR-372
              explicit citation
Hypothesis    a proposed idea, not supported by      This could for instance be a result of high
              evidence                               mdm2 levels
Problem       unresolved, contradictory, or          However, further investigation is required to
              unclear issue                          demonstrate the exact mechanism of LATS2
                                                     action
Goal          research goal                          To identify novel functions of miRNAs,
Method        experimental method                    Using fluorescence microscopy and luciferase
                                                     assays,
Result        a restatement of the outcome of        all constructs yielded high expression levels
              an experiment                          of mature miRNAs
Implication   an interpretation of the results, in   our procedure is sensitive enough to detect
              light of earlier hypotheses and facts  mild growth differences

‘Other-segments’, related to (referenced) other work:

Regulatory segments, acting as matrix sentences framing other segments:
Linguistic and structural properties
1. Position in text

      -    Section of the paper (Introduction, Results, Discussion)
      -    Beginning/middle/end of section
      -    First/second/third part of sentence
2. Verb:

      -    Tense, aspect, voice
      -    Verb class: Thing (increase), Thing-Thing (inhibit),
           Person-Thing (examine, observe, operate, implicate), Person: Report
      -    Lexicon
3. Metadiscourse markers [Hyland, 2003]:

      -    Connectives
      -    Endophorics, Evidentials
      -    Hedges, Boosters
      -    Person markers
Results: Section and Sequence
1. Voorhoeve, 2006: Cell - 427 segments

2. Louiseau, 2008: European Neuropsychopharmacology - 281 segments

-   Introduction (90):
    Other-Result (24), Other-Implication (11), Problem (9), Fact (8)

-   Result (334):
    Goal (26) -> Method (68) -> Result (105) -> Reg-Implication (23)
    ->Implication (50)

-   Discussion (187):
    Implication (27), Result (21), Other-Result (24), Hypothesis (19),
    Problem (17)
Results: Verb Tense

-   Realm of the Present:
    Fact (82%), Hypothesis (71%), Implication (62%)

-   Realm of the Past:
    Result (82%), Method (76%) - 50% Passive, of Method
    50% Past Perfect

-   Realm of the Modal:
    44% in Hypothesis

-   Realm of the To-Infinitive:
    50% is Goal, 75% of Goal is to-infinitive (Purpose Clause)
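
A rough sketch of how two of these realms could be spotted with surface patterns alone. It is not the method used in the study: the function name `verb_realm` and the modal list are assumptions, and telling present from past tense would need a part-of-speech tagger, which is not attempted here.

```python
import re

# Assumed modal list; extend as needed.
MODALS = re.compile(r"\b(may|might|could|would)\b", re.IGNORECASE)
# A segment that opens with "to <word>" is treated as a purpose clause.
TO_INFINITIVE = re.compile(r"^\s*to\s+\w+", re.IGNORECASE)


def verb_realm(segment):
    """Return 'to-infinitive', 'modal', or 'other' for one segment."""
    if TO_INFINITIVE.match(segment):
        return "to-infinitive"
    if MODALS.search(segment):
        return "modal"
    return "other"


# Example segments taken from the Segment Types table.
goal = verb_realm("To identify novel functions of miRNAs,")
hypothesis = verb_realm("This could for instance be a result of high mdm2 levels")
result = verb_realm("all constructs yielded high expression levels of mature miRNAs")
```

The opening-"to" test is crude (a prepositional "to the left" would also match), so its output is best read as a cue rather than a label.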
Results: Verb Type

-   Thing - Thing: high in experimental (Method, Result)
    and conceptual (Problem, Hypothesis, Fact,
    Implication) segments:

    ‣ Need to differentiate between ‘concept’ things
      and ‘experimental’ things!

-   Person - Implicate: high in Hypothesis, Implication,
    Problem

-   Person - Operate: high in Methods (90%)

-   Person - Examine: high in Goal (87%)
Results: Metadiscourse Markers

-   Causative: high in Implications (therefore, thus)

-   Comparison: high in Results (whereas, in contrast)

-   Temporality: high in Methods (next, subsequently)

-   Person markers: high in Methods (50%) and Results

-   Boosters: high in Results (indeed, surprisingly,
    interestingly)

-   Hedges: high in Implication, Reg-Implication (raises the
    possibility that, explains at least in part)

    -   but modals and ‘suggest’ verbs are left out
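
The marker lists above can be turned into a simple cue lookup, sketched here under stated assumptions: the dictionary `MARKER_CUES` and the function `cue_votes` are hypothetical names, the hedges are filed under Implication only (the slide also associates them with Reg-Implication), and the counts are cues for a classifier, not a classifier in themselves.

```python
import re

# Marker phrases copied from the slide, grouped by the segment type
# in which they were reported to be frequent.
MARKER_CUES = {
    "Implication": ["therefore", "thus",
                    "raises the possibility that", "explains at least in part"],
    "Result": ["whereas", "in contrast",
               "indeed", "surprisingly", "interestingly"],
    "Method": ["next", "subsequently"],
}


def cue_votes(segment):
    """Count how many marker phrases of each type occur in a segment."""
    lowered = segment.lower()
    votes = {}
    for seg_type, markers in MARKER_CUES.items():
        hits = sum(
            1 for m in markers
            if re.search(r"\b" + re.escape(m) + r"\b", lowered)
        )
        if hits:
            votes[seg_type] = hits
    return votes


votes_implication = cue_votes("Therefore, these results point to LATS2 as a mediator")
votes_none = cue_votes("all constructs yielded high expression levels of mature miRNAs")
```

A segment with no marker returns an empty dictionary, which is the common case: these cues are sparse and would need to be combined with position and verb features.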
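i. How can we model the discourse moves in a biological paper?

Discourse as a Fact-ory

[Diagram: Discourse as a Fact-ory]
-   Bottom axis: Shared view | Own view
-   Paper sections marked: introduction, results, discussion
-   Nodes and connecting phrases: fact, fact, fact; problem; hypothesis;
    "to" goal; "we" method; "resulting in" result; "suggests that" implication
-   Realms: hypothetical realm (might, would), at hypothesis;
    realm of activity (to test, to see), at goal;
    realm of experience: past, at method and result;
    realm of models: present, at fact and implication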
ii. Is this useful for enabling automated epistemic markup?

✓ first efforts seem promising: simple markers (‘suggest’
    verbs, connectives, etc.) already help:
6> It is thus emerging that Aβ1-42-induced memory deficits may
involve subtler neuronal alternations leading to synaptic deficits, prior
to frank neurodegeneration in AD brains.
TRIPLET(that A_1_GENE:+ - 42 - induced memory
deficits,involve,subtler neuronal alternations)

‣   issue: segment parsing is difficult!
‣   issue: verb tense is not always accessible
‣   BioNLP: little work on full text so far, since
    commercial publishers are difficult :-)
‣   possible challenge at BioLINK 2011: watch this space...
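
A minimal sketch of how such simple markers might be spotted, assuming small hand-made cue lists. The slide names only the marker categories (‘suggest’ verbs, connectives), so the word lists and the function names `find_markers` and `is_hedged` below are illustrative assumptions, not the tool behind the TRIPLET output.

```python
import re

# Hypothetical cue lists: the slide gives only the categories, not the words.
CUES = {
    "suggest_verb": r"\b(suggest(s|ed|ing)?|indicat(e|es|ed|ing)|"
                    r"impl(y|ies|ied)|appear(s|ed)?|seem(s|ed)?)\b",
    "modal": r"\b(may|might|could)\b",
    "connective": r"\b(thus|therefore|hence|however)\b",
}


def find_markers(sentence):
    """Return the cue words found in a sentence, grouped by cue category."""
    found = {}
    for name, pattern in CUES.items():
        hits = [m.group(0).lower()
                for m in re.finditer(pattern, sentence, re.IGNORECASE)]
        if hits:
            found[name] = hits
    return found


def is_hedged(sentence):
    """Treat a sentence as hedged if it has a 'suggest' verb or a modal."""
    markers = find_markers(sentence)
    return "suggest_verb" in markers or "modal" in markers


# Shortened form of the example sentence on the slide.
example = ("It is thus emerging that ... memory deficits may involve "
           "subtler neuronal alternations leading to synaptic deficits")
print(find_markers(example))
print(is_hedged(example))
```

Matching on surface words like this does not address the two issues listed above: it neither finds segment boundaries nor recovers verb tense.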
[Figure, built up over several slides: a claim followed through three papers.
 Each paper is drawn as a row of Concepts nodes above an Experiment.
 Concepts node labels: KnownFact, KnownFact, Hypothesis, Implication, Fact.
 Experiment node labels (Experiment 1 and Experiment 2): Goal, Method, Result, Data.]

Quoted text in the figure, by paper:

Voorhoeve, 2006
    To investigate the possibility that miR-372 and miR-373 suppress the
    expression of LATS2, we...
    Therefore, these results point to LATS2 as a mediator of the miR-372 and
    miR-373 effects on cell proliferation and tumorigenicity,

Raver-Shapira et al., JMolCell 2007
    two miRNAs, miRNA-372 and -373, function as potential novel oncogenes in
    testicular germ cell tumors by inhibition of LATS2 expression, which
    suggests that Lats2 is an important tumor suppressor (Voorhoeve et al., 2006).

Yabuta, JBioChem 2007
    miR-372 and miR-373 target the Lats2 tumor suppressor
    (Voorhoeve et al., 2006)
Fact creation vs. Latour (1986)
Future research:

‣   Need co-annotators to verify semantic types
‣   Need to scale up with more (types of) texts!
I. How is a scientific fact created, as it moves from a
   hedged claim to a fact through successive citations?
II. Can we identify a rhetorically successful text, using
    these segments and characteristics?
III. Can we help authors create such texts (guidelines,
     tools)?
Third attempt:
collaboration!
Improve ‘what is claimed about an entity’

Entity: insulin ::: GB000841

Relation: maintaining          Object: glucose homeostasis
    Match:   ... diabetes defect) to overcome insulin resistance in maintaining
             glucose homeostasis, hyperglycemia and glucose intolerance
    Context: When insulin secretion cannot be increased adequately (type I diabetes
             defect) to overcome insulin resistance in maintaining glucose homeostasis,
             hyperglycemia and glucose intolerance ensues. Insulin resistance and glucose
             intolerance has been well recognized in patients with advanced chronic
             kidney diseases (CKD).

Relation: improve              Object: glucose homeostasis
    Match:   ... in T2D is able to increase insulin secretion and improve
             glucose homeostasis.
    Context: .. Incretin metabolism is abnormal in T2D, evidenced by a decreased
             incretin effect, reduction in nutrient-mediated secretion of GIP and GLP-1 in
             T2D, and resistance to GIP. GLP-1, on the other hand, when administered
             intravenously in T2D is able to increase insulin secretion and improve glucose
             homeostasis.

Relation: improves             Object: glucose homeostasis
    Match:   ... SIRT1, whose administration to insulin-resistant animals
             improves glucose homeostasis.
    Context: SIRT1, a NAD(+)-dependent protein deacetylase that regulates transcription
             factors involved in key cellular processes, has been implicated as a mediator
             of the beneficial effects of calorie restriction. In a recent issue of Nature,
             Milne et al. (2007) describe novel potent activators of SIRT1, whose
             administration to insulin-resistant animals improves glucose homeostasis.

Relation: is capable           Object: glucose homeostasis
    Match:   S15511 is a novel insulin sensitizer that is capable of
             improving glucose homeostasis in nondiabetic rats.
    Context: S15511 is a novel insulin sensitizer that is capable of improving glucose
             homeostasis in nondiabetic rats.... However, the mechanisms behind the insulin-
             sensitizing effect of S15511 are unknown. The aim of our study was to
             explore whether S15511 improves insulin sensitivity in skeletal muscles.
             S15511 treatment was associated with an increase in insulin-stimulated
             glucose transport in type IIb fibers, while type I fibers were unaffected.

Relation: maintains            Object: glucose homeostasis
    Match:   Pancreatic beta-cells possess a well-regulated insulin secretory
             property that maintains systemic glucose homeostasis.
    Context: Pancreatic beta-cells possess a well-regulated insulin secretory property that
             maintains systemic glucose homeostasis. Although it has long been
             thought that differentiated beta-cells are nearly static, recent studies
             have shown that beta-cell mass dynamically changes throughout the
             lifetime. In this article, recent progress of regenerative medicine of the
             pancreas is reviewed.

Relation: may be involved      Object: glucose homeostasis
    Match:   ... similar way to those of insulin, PANDER may be
             involved in glucose homeostasis.
    Context: ... Our results showed that glucose up-regulated PANDER mRNA and
             protein levels in a time- and dose-dependent manner in MIN6 cells and
             pancreatic islets. ...Because PANDER is expressed by pancreatic beta-cells
             and in response to glucose in a similar way to those of insulin, PANDER may be
             involved in glucose homeostasis.

Relation: participates         Object: glucose homeostasis
    Match:   Fine-tuning of insulin secretion from pancreatic beta-cells
             participates in blood glucose homeostasis.
    Context: Fine-tuning of insulin secretion from pancreatic beta-cells participates in blood
             glucose homeostasis. ... Our data identify miR124a and miR96 as novel
             regulators of the expression of proteins playing a critical role in insulin
             exocytosis and in the release of other hormones and neurotransmitters.
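
One way to picture the rows above is as relation/object/sentence records that can be grouped and filtered. The sketch below is only an illustration: the `Claim` record, the helper names and the three-word cue list for tentative relations are assumptions, not part of the system shown on the slide.

```python
from collections import defaultdict
from typing import NamedTuple


class Claim(NamedTuple):
    relation: str   # e.g. "improves", "may be involved"
    obj: str        # what the relation is about
    sentence: str   # the matched text


# Hypothetical cue words marking a relation as tentative.
TENTATIVE_CUES = ("may", "might", "could")


def group_by_object(claims):
    """Index claims by the object they are about."""
    index = defaultdict(list)
    for claim in claims:
        index[claim.obj].append(claim)
    return dict(index)


def is_tentative(claim):
    """A relation counts as tentative if it starts with a cue word."""
    return claim.relation.split()[0].lower() in TENTATIVE_CUES


# Three rows copied from the slide.
claims = [
    Claim("improves", "glucose homeostasis",
          "... SIRT1, whose administration to insulin-resistant animals "
          "improves glucose homeostasis."),
    Claim("maintains", "glucose homeostasis",
          "Pancreatic beta-cells possess a well-regulated insulin secretory "
          "property that maintains systemic glucose homeostasis."),
    Claim("may be involved", "glucose homeostasis",
          "... similar way to those of insulin, PANDER may be involved in "
          "glucose homeostasis."),
]

for claim in claims:
    label = "tentative" if is_tentative(claim) else "asserted"
    print(f"{claim.relation:<16} {label}")
```

Separating tentative from asserted relations is one possible reading of "improving" the claim view; the slide itself does not say how the improvement would be made.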
A network of hypotheses and evidence

[Figure, built up over several slides: the claim "PHC undergo Growth arrest"
 sits above two papers.
 Paper A: goal, method, results, implication, fact, fact; resting on data 1, data 2, data 3.
 Paper B: goal, method, results, implication, fact, fact; resting on data 4, data 5, data 6.
 Later builds add links labelled "underpinning" and "method link" between the two papers.]
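
The network in the diagram could be held in a small data structure along these lines. This is a sketch under assumptions: the class names, the choice of a labelled-edge list, and the direction of the example link are all illustrative, since the slide shows only the picture.

```python
from dataclasses import dataclass, field


@dataclass
class Paper:
    name: str
    data: list = field(default_factory=list)  # evidence items, e.g. "data 1"


@dataclass
class ClaimNetwork:
    claim: str
    papers: dict = field(default_factory=dict)  # paper name -> Paper
    links: list = field(default_factory=list)   # (source, target, label)

    def add_paper(self, paper):
        self.papers[paper.name] = paper

    def add_link(self, source, target, label):
        """Record a labelled link between two papers already in the network."""
        if source not in self.papers or target not in self.papers:
            raise KeyError("both papers must be added before linking")
        self.links.append((source, target, label))

    def evidence(self):
        """All data items, across papers, that bear on the claim."""
        return [item for paper in self.papers.values() for item in paper.data]


net = ClaimNetwork("PHC undergo Growth arrest")
net.add_paper(Paper("Paper A", ["data 1", "data 2", "data 3"]))
net.add_paper(Paper("Paper B", ["data 4", "data 5", "data 6"]))
# The link direction here is an assumption; the slide shows only the label.
net.add_link("Paper B", "Paper A", "method link")
print(net.evidence())
```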
For Example: SWAN
HypER Working Group:
-       Goal: Align and expand existing efforts on detection and
        analysis of Hypotheses, Evidence & Relationships

-       Partners:
    -    Harvard/MGH: SWAN, ARF

    -    Open University: Cohere

    -    Oxford University: CiTO, eLearning/Rhetoric

    -    DERI: SALT, aTags

    -    University of Trento: LiquidPub

    -    Xerox Research: XIP hypothesis identifier

    -    U Tilburg: ML for Science

    -    Elsevier, UUtrecht: Discourse analysis of biology
HypER Working Group:
Hypothesis 22: Intramembranous Aβ dimer may be toxic.
Derived from: POSTAT_CONTRIBUTION(This essay explores the possibility that a
fraction of these Abeta peptides never leave the membrane lipid bilayer after
they are generated, but instead exert their toxic effects by competing with and
compromising the functions of intramembranous segments of membrane-bound
proteins that serve many critical functions.
HypER Activities: http://hyper.wik.is

Current activities:

   -   Aligning discourse ontologies: joint task with W3C HCLSIG

   -   Aligning architectures to exchange hypotheses + evidence

   -   Format for a rhetorical conference paper (SALT + abcde)

   -   Parser test of hypothesis identification tools on pharmacology corpus
Further interests:

   -   Better structure of evidence: MyExperiment, KEfED, ...

   -   Granularity of annotation/access: entity, hypothesis, discussion?
Conclusion
Problem: too much discourse, tools are not yet good
enough...
1. First attempt: allow authors to validate entities -
   pursue
2. Second attempt: discourse analysis - any help is
   great!
3. Third attempt: collaboration to identify hypotheses:
   do join!
Questions?
       a.dewaard@elsevier.com
http://elsatglabs.elsevier.com/labs/anita
References
Hyland, K. (2004). Disciplinary Discourses: Social Interactions in Academic
Writing. Addison Wesley Publishing Company.

Latour, B., and Woolgar, S. (1986). Laboratory Life: The Construction of
Scientific Facts. 2nd ed. Princeton, NJ: Princeton University Press. ISBN:
9780691028323.

Latour, B. (1987). Science in Action: How to Follow Scientists and Engineers
through Society. Cambridge, MA: Harvard University Press.
Segmentation Criteria (summary)
Finite/Non-finite   Grammatical role                                         Segment?   Example

Finite/Non-finite   Subject                                                  N          The extent to which miRNAs specifically affect metastasis
Finite/Non-finite   Direct Object                                            Y          these miRNAs are potential novel oncogenes
Nonfinite           Phrase-level adjunct (restrictive and non-restrictive)   N          spanning a given miRNA genomic region
Nonfinite           Clause-level adjunct                                     Y          by cloning eight miR-Vec plasmids
Finite              Non-restrictive Phrase-level adjunct                     Y          which is only active when tamoxifen is added (De Vita et al, 2005) […]
Finite              Restrictive Phrase-level adjunct                         N          that we examined
Finite              Clause-level adjunct                                     Y          which correlates with the reported ES-cell expression pattern of the miR-371-3 cluster (Suh et al, 2004)
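The criteria above amount to a lookup on clause finiteness and grammatical role. A minimal sketch of that lookup follows; the names `SEGMENT_RULES` and `is_segment` are hypothetical, and the sketch assumes a parser has already supplied the finiteness and role labels, which the slides do not describe.

```python
from typing import Optional

# Rule table transcribed from the slide. Keys are (finiteness, grammatical role).
# "any" marks rows the slide lists as Finite/Non-finite.
SEGMENT_RULES = {
    ("any", "subject"): False,
    ("any", "direct object"): True,
    ("nonfinite", "phrase-level adjunct"): False,
    ("nonfinite", "clause-level adjunct"): True,
    ("finite", "non-restrictive phrase-level adjunct"): True,
    ("finite", "restrictive phrase-level adjunct"): False,
    ("finite", "clause-level adjunct"): True,
}


def is_segment(finiteness: str, role: str) -> Optional[bool]:
    """Return True/False per the slide's table, or None if no row applies."""
    fin = finiteness.lower()
    key_role = role.lower()
    # Try the exact row first, then the rows valid for any finiteness.
    for key in ((fin, key_role), ("any", key_role)):
        if key in SEGMENT_RULES:
            return SEGMENT_RULES[key]
    # Nonfinite phrase-level adjuncts are not segmented, restrictive or not.
    if fin == "nonfinite" and key_role.endswith("phrase-level adjunct"):
        return False
    return None


print(is_segment("finite", "restrictive phrase-level adjunct"))
print(is_segment("nonfinite", "clause-level adjunct"))
```

Returning `None` for combinations the table does not cover keeps the sketch from guessing at cases the criteria leave open.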
Basic Segment Types
Segment       Description                                                                  Example

Fact          a known fact, generally without explicit citation                            mature miR-373 is a homolog of miR-372
Hypothesis    a proposed idea, not supported by evidence                                   This could for instance be a result of high mdm2 levels
Problem       unresolved, contradictory, or unclear issue                                  However, further investigation is required to demonstrate the exact mechanism of LATS2 action
Goal          research goal                                                                To identify novel functions of miRNAs,
Method        experimental method                                                          Using fluorescence microscopy and luciferase assays,
Result        a restatement of the outcome of an experiment                                all constructs yielded high expression levels of mature miRNAs
Implication   an interpretation of the results, in light of earlier hypotheses and facts   our procedure is sensitive enough to detect mild growth differences
Two Types of Derived Segment Types
‘Other-segments’, related to (referenced) other work:

-   other-result: ‘they are also found in the FCX and other cortical structures
    [Sokoloff et al., 1990]’

-   other-goal: ‘the role of D3 receptors in the control of motivation and affect
    has been intensively studied [Heidbreder et al., 2005]’

-   other-implication: ‘D1 or, more likely, D5, receptors have been implicated in
    mechanisms underlying long-term spatial memory [Hersi et al., 1995]’

Regulatory segments, acting as matrix sentences framing other segments:

-   reg-hypothesis: ‘we hypothesized that’

-   reg-implication: ‘These observations suggest that’

-   intratextual: ‘Fig 4 shows that’

-   intertextual: ‘reviewed in (Serrano, 1997)’
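Because regulatory segments are short framing phrases, a first approximation can match them by cue phrase. The sketch below is only an illustration built from the four example phrases on this slide; the pattern list `REGULATORY_CUES` and the function `tag_regulatory` are hypothetical names, and a usable tagger would need a much larger cue inventory.

```python
import re
from typing import Optional

# One pattern per regulatory segment type, derived from the slide's examples.
REGULATORY_CUES = [
    ("reg-hypothesis", re.compile(r"\bwe hypothesi[sz]ed that\b", re.I)),
    ("reg-implication", re.compile(r"\bthese observations suggest that\b", re.I)),
    ("intratextual", re.compile(r"\bfig(ure|\.)?\s*\d+\w*\s+shows that\b", re.I)),
    ("intertextual", re.compile(r"\breviewed in\b", re.I)),
]


def tag_regulatory(segment: str) -> Optional[str]:
    """Return the first regulatory type whose cue occurs in the segment."""
    for label, pattern in REGULATORY_CUES:
        if pattern.search(segment):
            return label
    return None


for text in ("we hypothesized that", "Fig 4 shows that", "that we examined"):
    print(text, "->", tag_regulatory(text))
```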
My categories vs. Latour (1979)
Linguistic and structural properties
1. Position in text

      -    Section of the paper (Introduction, Results, Discussion)
      -    Beginning/middle/end of section
      -    First/second/third part of sentence
2. Verb:

      -    Tense, aspect, voice
      -    Verb class (idiosyncratic)
      -    Lexicon

3. Metadiscourse markers [Hyland, 2003]:

      -    Connectives
      -    Endophorics, Evidentials
      -    Hedges, Boosters
      -    Person markers
Verb class
Two types of entities interact in biology texts:
-   Thing:
    -   Thing: increase, die, etc.
    -   Thing -> Thing: affect, stimulate, etc.
-   People:
    -   People -> Thing:
        -   Examine (Goal)
        -   Operate (Method)
        -   Observe (Result)
        -   Implicate (Implication)
    -   People -> People: Report
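The People -> Thing verb classes map directly onto segment types, which suggests a two-step lookup: verb to class, then class to segment type. The sketch below takes the class-to-segment mapping from the slide. The verb lexicon in it is a hypothetical illustration and is not taken from the talk; the slide itself calls verb class idiosyncratic.

```python
from typing import Optional

# Mapping stated on the slide: verb class -> segment type.
VERB_CLASS_TO_SEGMENT = {
    "examine": "Goal",
    "operate": "Method",
    "observe": "Result",
    "implicate": "Implication",
}

# Hypothetical mini-lexicon assigning verb lemmas to classes.
# These assignments are illustrative guesses, not from the talk.
VERB_LEXICON = {
    "identify": "examine",
    "investigate": "examine",
    "clone": "operate",
    "transduce": "operate",
    "show": "observe",
    "yield": "observe",
    "suggest": "implicate",
    "indicate": "implicate",
}


def segment_type_for_verb(lemma: str) -> Optional[str]:
    """Look up a verb lemma's class, then the segment type for that class."""
    verb_class = VERB_LEXICON.get(lemma.lower())
    if verb_class is None:
        return None
    return VERB_CLASS_TO_SEGMENT[verb_class]


print(segment_type_for_verb("identify"))
print(segment_type_for_verb("yield"))
```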
Interpretation: 3 Realms of Science:
Conceptual realm:
-   (1) Oncogene-induced senescence is characterized by the appearance of
    cells with a flat morphology that express senescence associated
    (SA)-β-Galactosidase.
-   (4b) transduction with either miR-Vec-371&2 or miR-Vec-373 prevents
    RAS^V12-induced growth arrest in primary human cells.

Between conceptual and experimental realm:
-   (2a) Indeed,
-   (4a) Altogether, these data show that

Experimental realm:
-   (2b) control RAS^V12-arrested cells showed relatively high abundance of
    flat cells expressing SA-β-Galactosidase
-   (3a) Consistent with the cell growth assay,
-   (3b) very few cells showed senescent morphology when transduced with
    either miR-Vec-371&2, miR-Vec-373, or control p53^kd.

Between experimental and data realm:
-   (2c) (Figures 2G and 2H).

Data realm:
-   (Figures)
Tense 1: Concepts vs. Experiment
Concept realm:
-   (1) Oncogene-induced senescence is characterized by the appearance of
    cells with a flat morphology that express senescence associated
    (SA)-β-Galactosidase.
-   (4b) transduction with either miR-Vec-371&2 or miR-Vec-373 prevents
    RAS^V12-induced growth arrest in primary human cells.

Between concept and experimental realm:
-   (2a) Indeed,
-   (4a) Altogether, these data show that

Experimental realm (personal, past):
-   (2b) control RAS^V12-arrested cells showed relatively high abundance of
    flat cells expressing SA-β-Galactosidase
-   (3a) Consistent with the cell growth assay,
-   (3b) very few cells showed senescent morphology when transduced with
    either miR-Vec-371&2, miR-Vec-373, or control p53^kd.

Between experimental and data realm:
-   (2c) (Figures 2G and 2H).

Data realm (nonverbal):
-   (Figures)
Tense 2: Referral

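[Figure: referral timeline. Horizontal axis: past, present, future. Rows: own
paper (Introduction, Discussion) and other papers (Other Work).]
Labels along the timeline:
-   After other work: past
-   Before current work: present
-   Current work (= Results section)
-   After current work: past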
Tense 1+ 2 = 3:


[Figure: two rows plotted against reading time (past, present, future).
Conceptual: claim, fact. Experiential: experiment.]

Más contenido relacionado

La actualidad más candente

Using ontologies to do integrative systems biology
Using ontologies to do integrative systems biologyUsing ontologies to do integrative systems biology
Using ontologies to do integrative systems biology
Chris Evelo
 
Bioinformatics Final Report
Bioinformatics Final ReportBioinformatics Final Report
Bioinformatics Final Report
Shruthi Choudary
 
2013 talk at TGAC, November 4
2013 talk at TGAC, November 42013 talk at TGAC, November 4
2013 talk at TGAC, November 4
c.titus.brown
 
'Stories that persuade with data' - talk at CENDI meeting January 9 2014
'Stories that persuade with data' - talk at CENDI meeting January 9 2014'Stories that persuade with data' - talk at CENDI meeting January 9 2014
'Stories that persuade with data' - talk at CENDI meeting January 9 2014
Anita de Waard
 
PaulaTataru_PhD_defense
PaulaTataru_PhD_defensePaulaTataru_PhD_defense
PaulaTataru_PhD_defense
Paula Tataru
 

La actualidad más candente (20)

Using ontologies to do integrative systems biology
Using ontologies to do integrative systems biologyUsing ontologies to do integrative systems biology
Using ontologies to do integrative systems biology
 
Data analysis & integration challenges in genomics
Data analysis & integration challenges in genomicsData analysis & integration challenges in genomics
Data analysis & integration challenges in genomics
 
Mining Drug Targets, Structures and Activity Data
Mining Drug Targets, Structures and Activity DataMining Drug Targets, Structures and Activity Data
Mining Drug Targets, Structures and Activity Data
 
Ibn Sina
Ibn SinaIbn Sina
Ibn Sina
 
Advanced Bioinformatics for Genomics and BioData Driven Research
Advanced Bioinformatics for Genomics and BioData Driven ResearchAdvanced Bioinformatics for Genomics and BioData Driven Research
Advanced Bioinformatics for Genomics and BioData Driven Research
 
GoTermsAnalysisWithR
GoTermsAnalysisWithRGoTermsAnalysisWithR
GoTermsAnalysisWithR
 
2015 illinois-talk
2015 illinois-talk2015 illinois-talk
2015 illinois-talk
 
Next Generation Cancer Data Discovery, Access, and Integration Using Prizms a...
Next Generation Cancer Data Discovery, Access, and Integration Using Prizms a...Next Generation Cancer Data Discovery, Access, and Integration Using Prizms a...
Next Generation Cancer Data Discovery, Access, and Integration Using Prizms a...
 
Bioinformatics Final Report
Bioinformatics Final ReportBioinformatics Final Report
Bioinformatics Final Report
 
Scaling Genetic Data Analysis with Apache Spark with Jon Bloom and Tim Poterba
Scaling Genetic Data Analysis with Apache Spark with Jon Bloom and Tim PoterbaScaling Genetic Data Analysis with Apache Spark with Jon Bloom and Tim Poterba
Scaling Genetic Data Analysis with Apache Spark with Jon Bloom and Tim Poterba
 
ContentMine (TDM) at JISC Digifest
ContentMine (TDM) at JISC DigifestContentMine (TDM) at JISC Digifest
ContentMine (TDM) at JISC Digifest
 
A knowledge capture framework for domain specific search systems
A knowledge capture framework for domain specific search systemsA knowledge capture framework for domain specific search systems
A knowledge capture framework for domain specific search systems
 
2013 talk at TGAC, November 4
2013 talk at TGAC, November 42013 talk at TGAC, November 4
2013 talk at TGAC, November 4
 
ContentMine + EPMC: Finding Zika!
ContentMine + EPMC: Finding Zika! ContentMine + EPMC: Finding Zika!
ContentMine + EPMC: Finding Zika!
 
VariantSpark a library for genomics by Lynn Langit
VariantSpark a library for genomics by Lynn LangitVariantSpark a library for genomics by Lynn Langit
VariantSpark a library for genomics by Lynn Langit
 
An Up-to-date Knowledge Base and Focused Exploration System for Human Perform...
An Up-to-date Knowledge Base and Focused Exploration System for Human Perform...An Up-to-date Knowledge Base and Focused Exploration System for Human Perform...
An Up-to-date Knowledge Base and Focused Exploration System for Human Perform...
 
The Language of the Gene Ontology
The Language of the Gene OntologyThe Language of the Gene Ontology
The Language of the Gene Ontology
 
'Stories that persuade with data' - talk at CENDI meeting January 9 2014
'Stories that persuade with data' - talk at CENDI meeting January 9 2014'Stories that persuade with data' - talk at CENDI meeting January 9 2014
'Stories that persuade with data' - talk at CENDI meeting January 9 2014
 
Cshl minseqe 2013_ouellette
Cshl minseqe 2013_ouelletteCshl minseqe 2013_ouellette
Cshl minseqe 2013_ouellette
 
PaulaTataru_PhD_defense
PaulaTataru_PhD_defensePaulaTataru_PhD_defense
PaulaTataru_PhD_defense
 

Destacado

Cshals2012dewaardsmall
Cshals2012dewaardsmallCshals2012dewaardsmall
Cshals2012dewaardsmall
Anita de Waard
 

Destacado (16)

Sensemaking in Science
Sensemaking in ScienceSensemaking in Science
Sensemaking in Science
 
Cshals2012dewaardsmall
Cshals2012dewaardsmallCshals2012dewaardsmall
Cshals2012dewaardsmall
 
Towards Incidental Collaboratories For Experimental Data
Towards Incidental Collaboratories For Experimental DataTowards Incidental Collaboratories For Experimental Data
Towards Incidental Collaboratories For Experimental Data
 
Annotation systems
Annotation systemsAnnotation systems
Annotation systems
 
Small Data: Bridging the Gap Between Generic and Specific Repositories
Small Data: Bridging the Gap Between Generic and Specific RepositoriesSmall Data: Bridging the Gap Between Generic and Specific Repositories
Small Data: Bridging the Gap Between Generic and Specific Repositories
 
Executable papers
Executable papersExecutable papers
Executable papers
 
Social barriers at http://projects.iq.harvard.edu/attribution_workshop/
Social barriers at http://projects.iq.harvard.edu/attribution_workshop/Social barriers at http://projects.iq.harvard.edu/attribution_workshop/
Social barriers at http://projects.iq.harvard.edu/attribution_workshop/
 
Executing the Research Paper
Executing the Research PaperExecuting the Research Paper
Executing the Research Paper
 
How to persuade with data
How to persuade with dataHow to persuade with data
How to persuade with data
 
The Narrative Structure of Research Articles, or, Why Science is Like a Fairy...
The Narrative Structure of Research Articles, or, Why Science is Like a Fairy...The Narrative Structure of Research Articles, or, Why Science is Like a Fairy...
The Narrative Structure of Research Articles, or, Why Science is Like a Fairy...
 
Optimising Scientific Knowledge Transfer: How Collective Sensemaking Can Ena...
Optimising Scientific Knowledge Transfer: How Collective Sensemaking Can Ena...Optimising Scientific Knowledge Transfer: How Collective Sensemaking Can Ena...
Optimising Scientific Knowledge Transfer: How Collective Sensemaking Can Ena...
 
Charleston Conference 2016
Charleston Conference 2016Charleston Conference 2016
Charleston Conference 2016
 
Collaboratively creating a network of ideas, data and software
Collaboratively creating a network of ideas, data and softwareCollaboratively creating a network of ideas, data and software
Collaboratively creating a network of ideas, data and software
 
Vu210610futurejournal
Vu210610futurejournalVu210610futurejournal
Vu210610futurejournal
 
A syntagmatic/Paradigmatic analysis of scientific text
A syntagmatic/Paradigmatic analysis of scientific textA syntagmatic/Paradigmatic analysis of scientific text
A syntagmatic/Paradigmatic analysis of scientific text
 
The Rocky Road to Reuse
The Rocky Road to ReuseThe Rocky Road to Reuse
The Rocky Road to Reuse
 

Similar a Xerox2009

dkNET Webinar: The Collaborative Microbial Metabolite Center – Democratizing ...
dkNET Webinar: The Collaborative Microbial Metabolite Center – Democratizing ...dkNET Webinar: The Collaborative Microbial Metabolite Center – Democratizing ...
dkNET Webinar: The Collaborative Microbial Metabolite Center – Democratizing ...
dkNET
 

Similar a Xerox2009 (20)

dkNET Webinar: The Collaborative Microbial Metabolite Center – Democratizing ...
dkNET Webinar: The Collaborative Microbial Metabolite Center – Democratizing ...dkNET Webinar: The Collaborative Microbial Metabolite Center – Democratizing ...
dkNET Webinar: The Collaborative Microbial Metabolite Center – Democratizing ...
 
UKSG 2023 - Will artificial intelligence change how readers use the research ...
UKSG 2023 - Will artificial intelligence change how readers use the research ...UKSG 2023 - Will artificial intelligence change how readers use the research ...
UKSG 2023 - Will artificial intelligence change how readers use the research ...
 
Dynamic Semantic Metadata in Biomedical Communications
Dynamic Semantic Metadata in Biomedical CommunicationsDynamic Semantic Metadata in Biomedical Communications
Dynamic Semantic Metadata in Biomedical Communications
 
Introduction to Proteogenomics
Introduction to Proteogenomics Introduction to Proteogenomics
Introduction to Proteogenomics
 
De Waard Carusi
De Waard CarusiDe Waard Carusi
De Waard Carusi
 
De Waard Carusi
De Waard CarusiDe Waard Carusi
De Waard Carusi
 
Quality Assessment of Biomedical Metadata using Topic Modeling
Quality Assessment of Biomedical Metadata using Topic ModelingQuality Assessment of Biomedical Metadata using Topic Modeling
Quality Assessment of Biomedical Metadata using Topic Modeling
 
2014 naples
2014 naples2014 naples
2014 naples
 
2011-10-11 Open PHACTS at BioIT World Europe
2011-10-11 Open PHACTS at BioIT World Europe2011-10-11 Open PHACTS at BioIT World Europe
2011-10-11 Open PHACTS at BioIT World Europe
 
Open Minds Bring Open Collaborations
Open Minds Bring Open CollaborationsOpen Minds Bring Open Collaborations
Open Minds Bring Open Collaborations
 
AI Systems @ Manchester
AI Systems @ ManchesterAI Systems @ Manchester
AI Systems @ Manchester
 
Presentation to the J. Craig Venter Institute, Dec. 2014
Presentation to the J. Craig Venter Institute, Dec. 2014Presentation to the J. Craig Venter Institute, Dec. 2014
Presentation to the J. Craig Venter Institute, Dec. 2014
 
A systematic approach to Genotype-Phenotype correlations
A systematic approach to Genotype-Phenotype correlationsA systematic approach to Genotype-Phenotype correlations
A systematic approach to Genotype-Phenotype correlations
 
Ontology at Manchester
Ontology at ManchesterOntology at Manchester
Ontology at Manchester
 
rheumatoid arthritis
rheumatoid arthritisrheumatoid arthritis
rheumatoid arthritis
 
Results Vary: The Pragmatics of Reproducibility and Research Object Frameworks
Results Vary: The Pragmatics of Reproducibility and Research Object FrameworksResults Vary: The Pragmatics of Reproducibility and Research Object Frameworks
Results Vary: The Pragmatics of Reproducibility and Research Object Frameworks
 
Ismb2009
Ismb2009Ismb2009
Ismb2009
 
Seminario en CIFASIS, Rosario, Argentina - Seminar in CIFASIS, Rosario, Argen...
Seminario en CIFASIS, Rosario, Argentina - Seminar in CIFASIS, Rosario, Argen...Seminario en CIFASIS, Rosario, Argentina - Seminar in CIFASIS, Rosario, Argen...
Seminario en CIFASIS, Rosario, Argentina - Seminar in CIFASIS, Rosario, Argen...
 
Connected Data for Machine Learning | Paul Groth
Connected Data for Machine Learning | Paul GrothConnected Data for Machine Learning | Paul Groth
Connected Data for Machine Learning | Paul Groth
 
BM405 Lecture Slides 21/11/2014 University of Strathclyde
BM405 Lecture Slides 21/11/2014 University of StrathclydeBM405 Lecture Slides 21/11/2014 University of Strathclyde
BM405 Lecture Slides 21/11/2014 University of Strathclyde
 

Más de Anita de Waard

Más de Anita de Waard (20)

Mendeley Data: Enhancing Data Discovery, Sharing and Reuse
Mendeley Data: Enhancing Data Discovery, Sharing and ReuseMendeley Data: Enhancing Data Discovery, Sharing and Reuse
Mendeley Data: Enhancing Data Discovery, Sharing and Reuse
 
Why would a publisher care about open data?
Why would a publisher care about open data?Why would a publisher care about open data?
Why would a publisher care about open data?
 
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
 
NFAIS Talk on Enabling FAIR Data
NFAIS Talk on Enabling FAIR DataNFAIS Talk on Enabling FAIR Data
NFAIS Talk on Enabling FAIR Data
 
CNI 2018: A Research Object Authoring Tool for the Data Commons
CNI 2018: A Research Object Authoring Tool for the Data CommonsCNI 2018: A Research Object Authoring Tool for the Data Commons
CNI 2018: A Research Object Authoring Tool for the Data Commons
 
Enabling FAIR Data: TAG B Authoring Guidelines
Enabling FAIR Data: TAG B Authoring GuidelinesEnabling FAIR Data: TAG B Authoring Guidelines
Enabling FAIR Data: TAG B Authoring Guidelines
 
Scientific facts are myths, told through fairytales and spread by gossip.
Scientific facts are myths, told through fairytales and spread by gossip.Scientific facts are myths, told through fairytales and spread by gossip.
Scientific facts are myths, told through fairytales and spread by gossip.
 
Xerox2009

  • 1. Identifying Biological Knowledge: Three Possible Strategies Anita de Waard Disruptive Technologies Director, Elsevier Labs, Amsterdam Casimir Researcher, UiL-OTS, Utrecht University XRCE, Grenoble, 24 September 2009
  • 6. Overview
      Problem: too much discourse, tools are not yet good enough...
      1. First attempt: allow authors to validate entities
      2. Second attempt: discourse analysis
      3. Third attempt: collaboration to identify hypotheses
  • 12. Why Study Biological Discourse?
      - There is too much of it!
      - Text mining and ‘fact extraction’ techniques are gaining ground to tame this tangle
      - Emerging area of biological natural language processing (BioNLP): subfield of computational linguistics
      - Main focus: identifying biological entities (genes, proteins, drugs) and their relationships
  • 16. Example state of the art: MEDIE
      - “Alteration of nm23, P53, and S100A4 expression may contribute to the development of gastric [...]”
      - “Previous studies have implicated miR-34a as a tumor suppressor gene whose transcription is activated by p53.”
      Add this knowledge during authoring?
  • 17. First attempt: allow authors to validate entities
  • 22. Improve time + quality of knowledgebase entry
      - For database curators: save time and money
      - For authors: lower the threshold to submitting papers with metadata
      - Structured Digital Abstract: an editorial experiment to increase the reach of online published articles
      - SDA encodes in a schema information contained in the article
  • 23. expression of GSG1 stimulates TPAP targeting to the ER, suggesting that interactions between the two proteins lead to the redistribution of TPAP from the cytosol to the ER. MINT-6168263: Gsg1 (uniprotkb:Q8R1W2), TPAP (uniprotkb:Q9WVP6) and Calmegin (uniprotkb:P52194) colocalize (MI:0403) by cosedimentation (MI:0027) MINT-6168204, MINT-6168178: Gsg1 (uniprotkb:Q8R1W2) and TPAP (uniprotkb:Q9WVP6) colocalize (MI:0403) by fluorescence microscopy (MI:0416) MINT-6167930: Gsg1 (uniprotkb:Q8R1W2) physically interacts (MI:
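As a rough illustration of what "encoding in a schema" could look like, the sketch below parses one of the MINT annotation lines shown on the slide into a small record. The `SdaStatement` class, the `parse_sda_line` helper and the regular expressions are assumptions made for this example; they are not part of the SDA experiment or of any MINT tooling.

```python
import re
from dataclasses import dataclass

# Hypothetical record for one Structured Digital Abstract statement.
@dataclass
class SdaStatement:
    mint_ids: list   # MINT interaction identifiers
    proteins: dict   # protein name -> UniProtKB accession
    mi_terms: list   # PSI-MI term ids, in order of appearance

# Patterns inferred from the slide text only (an assumption, not a spec).
MINT_RE = re.compile(r"MINT-\d+")
PROTEIN_RE = re.compile(r"(\w+) \(uniprotkb:(\w+)\)")
MI_RE = re.compile(r"\((MI:\d{4})\)")

def parse_sda_line(line: str) -> SdaStatement:
    """Split 'ids: statement' and pull out identifiers with regexes."""
    head, _, body = line.partition(": ")
    return SdaStatement(
        mint_ids=MINT_RE.findall(head),
        proteins=dict(PROTEIN_RE.findall(body)),
        mi_terms=MI_RE.findall(body),
    )

example = (
    "MINT-6168204, MINT-6168178: Gsg1 (uniprotkb:Q8R1W2) and TPAP "
    "(uniprotkb:Q9WVP6) colocalize (MI:0403) by fluorescence microscopy (MI:0416)"
)
stmt = parse_sda_line(example)
print(stmt)
```

A record like this is the kind of structure a curator or an authoring tool could validate, instead of re-reading the free-text sentence.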
  • 29. How? Word Plugin
      - Okkam4MsW: a Microsoft Word plugin that interacts with Web Services applying NLP and semantic technologies to detect entities and contextual information
      - The OKKAM repository is queried to get the right OKKAM id and alternative ids (UniProt in this case)
  • 31. OKKAM Entity Editor in MS Word
  • 40. What else is wrong with MEDIE?
      - “Alteration of nm23, P53, and S100A4 expression may contribute to the development of gastric [...]”
      - “Previous studies have implicated miR-34a as a tumor suppressor gene whose transcription is activated by p53.”
      Without some idea of the status of the sentence, it cannot be interpreted!
  • 43. Discourse Analysis
      Underlying model of text mining systems:
      - Scientific paper is ‘statement of pertinent facts’
      - So: finding entities and relationships will give you a summary of the knowledge within the paper
      - However, information extracted this way is not very useful...
      Proposed approach: treat scientific paper as a persuasive text: specific genre, with genre characteristics and allowed persuasive techniques:
      - ‘these results suggest’ (depersonification)
      - ‘as fig. 2a shows’ (evidence is in the data)
      - ‘oncogenes produce a stress response [Serrano, 2003]’
      References and data form a “folded array of successive defense lines, behind which scientists ensconce themselves” (Latour, 1986)
  • 47. Overall Research Questions
      i. How can we model the discourse/suasive moves in a biological paper?
      ii. Can this model help enable automated epistemic markup?
      iii. Can it improve knowledge representations of collections of papers?
  • 52. Discourse analysis
      Segmentation and classification:
      1. Parse text into discourse segments (EDUs, elementary discourse units) containing a single rhetorical move (if possible...)
      2. Determine categories or types of discourse segments that have similar semantic/pragmatic properties
      3. Look at a number of linguistic characteristics and see if these segment types share those characteristics.
  • 57. Segmentation
      Goal: ‘one new thought per segment’:
      Figure 4A shows that following RASV12 stimulation, p53 was stabilized and activated, and its target gene, p21cip1, was induced in all cases, indicating an intact p53 pathway in these cells.
      a. Figure 4A shows that [Intratextual]
      b. following RASV12 stimulation [Method]
      c. p53 was stabilized and activated [Result]
      d. and its target gene, p21cip1, was induced in all cases, [Result]
      e. indicating an intact p53 pathway in these cells. [Implication]
  • 62. Segment Types (segment: description; example)
      - Fact: a known fact, generally without explicit citation. Example: “mature miR-373 is a homolog of miR-372”
      - Hypothesis: a proposed idea, not supported by evidence. Example: “This could for instance be a result of high mdm2 levels”
      - Problem: unresolved, contradictory, or unclear issue. Example: “However, further investigation is required to demonstrate the exact mechanism of LATS2 action”
      - Goal: research goal. Example: “To identify novel functions of miRNAs,”
      - Method: experimental method. Example: “Using fluorescence microscopy and luciferase assays,”
      - Result: a restatement of the outcome of an experiment. Example: “all constructs yielded high expression levels of mature miRNAs”
      - Implication: an interpretation of the results, in light of earlier hypotheses and facts. Example: “our procedure is sensitive enough to detect mild growth differences”
      ‘Other-segments’, related to (referenced) other work:
      Regulatory segments, acting as matrix sentences framing other segments:
  • 67. Linguistic and structural properties
      1. Position in text
         - Section of the paper (Introduction, Results, Discussion)
         - Beginning/middle/end of section
         - First/second/third part of sentence
      2. Verb:
         - Tense, aspect, voice
         - Verb class: Thing (increase), Thing-Thing (inhibit), Person-Thing (examine, observe, operate, implicate), Person: Report
         - Lexicon
      3. Metadiscourse markers [Hyland, 2003]:
         - Connectives
         - Endophorics, Evidentials
         - Hedges, Boosters
         - Person markers
  • 73. Results: Section and Sequence
      1. Voorhoeve, 2006: Cell - 427 segments
      2. Louiseau, 2008: European Neuropsychopharmacology - 281 segments
      - Introduction (90): Other-Result (24), Other-Implication (11), Problem (9), Fact (8)
      - Result (334): Goal (26) -> Method (68) -> Result (105) -> Reg-Implication (23) -> Implication (50)
      - Discussion (187): Implication (27), Result (21), Other-Result (24), Hypothesis (19), Problem (17)
  • 78. Results: Verb Tense
      - Realm of the Present: Fact (82%), Hypothesis (71%), Implication (62%)
      - Realm of the Past: Result (82%), Method (76%) - 50% Passive, of Method 50% Past Perfect
      - Realm of the Modal: 44% in Hypothesis
      - Realm of the To-Infinitive: 50% is Goal, 75% of Goal is to-infinitive (Purpose Clause)
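The tense "realms" above suggest a very simple first-pass heuristic. The sketch below is only an illustration under strong assumptions: it guesses a realm from surface cues (no parser, no real tense tagging), and the word lists, the `guess_realm` function and the `REALM_TO_TYPES` mapping are invented for this example rather than taken from the study. The example segments come from the Segment Types slide.

```python
import re

# Assumed cue lists; a real system would use a POS tagger or parser.
MODALS = {"may", "might", "could", "would", "should"}
PAST_AUX = {"was", "were", "had", "did"}

# Candidate segment types per realm, following the tendencies on the slide.
REALM_TO_TYPES = {
    "modal": ["Hypothesis"],
    "to-infinitive": ["Goal"],
    "past": ["Result", "Method"],
    "present": ["Fact", "Hypothesis", "Implication"],
}

def guess_realm(segment: str) -> str:
    """Crude surface heuristic for the tense realm of one segment."""
    words = re.findall(r"[A-Za-z][A-Za-z0-9-]*", segment.lower())
    if not words:
        return "present"
    if words[0] == "to" and len(words) > 1:
        return "to-infinitive"          # purpose clause, e.g. "To identify ..."
    if MODALS & set(words):
        return "modal"
    # "-ed" is a noisy past-tense cue (it also matches participles used as adjectives).
    if PAST_AUX & set(words) or any(w.endswith("ed") and len(w) > 4 for w in words):
        return "past"
    return "present"

for seg in [
    "This could for instance be a result of high mdm2 levels",
    "To identify novel functions of miRNAs,",
    "all constructs yielded high expression levels of mature miRNAs",
    "mature miR-373 is a homolog of miR-372",
]:
    realm = guess_realm(seg)
    print(realm, REALM_TO_TYPES[realm], "<-", seg)
```

The mapping returns candidate types rather than a single label, since the percentages on the slide describe tendencies, not rules.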
  • 83. Results: Verb Type
      - Thing - Thing: high in experimental (Method, Result) and conceptual (Problem, Hypothesis, Fact, Implication) segments:
        ‣ Need to differentiate between ‘concept’ things and ‘experimental’ things!
      - Person - Implicate: high in Hypothesis, Implication, Problem
      - Person - Operate: high in Methods (90%)
      - Person - Examine: high in Goal (87%)
  • 85. Results: Metadiscourse Markers
      - Causative: high in Implications (therefore, thus)
      - Comparison: high in Results (whereas, in contrast)
      - Temporality: high in Methods (next, subsequently)
      - Person markers: high in Methods (50%) and Results
      - Boosters: high in Results (indeed, surprisingly, interestingly)
      - Hedges: high in Implication, Reg-Implication (raises the possibility that, explains at least in part) - but modals and ‘suggest’ verbs are left out
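The marker classes listed above can be turned into a small lexicon lookup. This is a minimal sketch, assuming only the example markers named on the slide; the `MARKERS` table and `find_markers` function are hypothetical names, and the "typical segment type" column simply restates the tendencies reported above.

```python
import re

# marker class -> (example phrases from the slide, segment type where it is frequent)
MARKERS = {
    "causative": (["therefore", "thus"], "Implication"),
    "comparison": (["whereas", "in contrast"], "Result"),
    "temporality": (["next", "subsequently"], "Method"),
    "booster": (["indeed", "surprisingly", "interestingly"], "Result"),
    "hedge": (["raises the possibility that", "explains at least in part"], "Implication"),
}

def find_markers(segment: str) -> list:
    """Return (marker class, phrase, typical segment type) for each phrase found."""
    # Lowercase, replace everything except letters and spaces, pad with spaces
    # so that only whole words or whole phrases match.
    text = re.sub(r"[^a-z ]+", " ", segment.lower())
    text = " " + re.sub(r" +", " ", text).strip() + " "
    hits = []
    for marker_class, (phrases, segment_type) in MARKERS.items():
        for phrase in phrases:
            if f" {phrase} " in text:
                hits.append((marker_class, phrase, segment_type))
    return hits

print(find_markers(
    "Therefore, these results point to LATS2 as a mediator of the "
    "miR-372 and miR-373 effects on cell proliferation and tumorigenicity,"
))
```

A lookup like this only proposes a segment type; it says nothing about segments that carry no marker at all.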
  • 97. i. How can we model the discourse moves in a biological paper? Discourse as a Fact-ory
      [Diagram. Node labels: fact, fact, fact; hypothesis; problem; goal (“to ...”); method (“we ...”); result (“resulting in ...”); implication (“suggests that ...”).
       Realm labels: hypothetical realm (might, would); realm of activity (to test, to see); realm of experience (past); realm of models (present).
       Section labels: introduction, results, discussion. View labels: Shared view, Own view.]
  • 106. ii. Is this useful for enabling automated epistemic markup?
      ✓ first efforts seem promising: simple markers (‘suggest’ verbs, connectives, etc.) already help:
      6> It is thus emerging that A_1-42-induced memory deficits may involve subtler neuronal alternations leading to synaptic deficits, prior to frank neurodegeneration in AD brains.
      TRIPLET(that A_1_GENE:+ - 42 - induced memory deficits,involve,subtler neuronal alternations)
      ‣ issue: segment parsing is difficult!
      ‣ issue: verb tense is not always accessible
      ‣ BioNLP: not that much work on full text, since commercial publishers are difficult :-)!
      ‣ possible challenge at biolink 2011: watch this space...
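One way to picture "epistemic markup" on top of an extracted triple is to attach a status to it, based on hedging cues in the source sentence. The sketch below is an assumption-laden illustration: the `EpistemicTriple` class, the `HEDGE_CUES` list and `mark_epistemic_status` are made up for this example, the triple is a hand-cleaned version of the parser output quoted above, and a real system would need segment boundaries so that a cue is only applied to the clause it governs.

```python
import re
from dataclasses import dataclass, field

# Assumed cue list: modals and 'suggest' verbs, as mentioned on the slides.
HEDGE_CUES = {"may", "might", "could", "would", "suggest", "suggests", "possibility"}

@dataclass
class EpistemicTriple:
    subject: str
    predicate: str
    obj: str
    status: str = "asserted"              # "asserted" or "hedged"
    cues: list = field(default_factory=list)

def mark_epistemic_status(sentence, subject, predicate, obj):
    """Label a triple as hedged if the sentence contains any hedging cue."""
    tokens = set(re.findall(r"[a-z]+", sentence.lower()))
    cues = sorted(tokens & HEDGE_CUES)
    status = "hedged" if cues else "asserted"
    return EpistemicTriple(subject, predicate, obj, status, cues)

sentence = (
    "It is thus emerging that A_1-42-induced memory deficits may involve "
    "subtler neuronal alternations leading to synaptic deficits, "
    "prior to frank neurodegeneration in AD brains."
)
triple = mark_epistemic_status(
    sentence,
    "A_1-42-induced memory deficits",   # cleaned-up subject (assumption)
    "involve",
    "subtler neuronal alternations",
)
print(triple)
```

With the status attached, a knowledge base could keep "may involve" apart from a plain "involves" instead of storing both as the same fact.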
  • 107.
  • 108. KnownFact KnownFact Concepts
  • 109. To investigate the possibility that miR-372 and miR-373 suppress the expression of LATS2, we... KnownFact KnownFact Concepts Hypothesis
  • 110. To investigate the possibility that miR-372 and miR-373 suppress the expression of LATS2, we... KnownFact KnownFact Concepts Hypothesis Goal Method Result Data Experiment 1
  • 111. To investigate the possibility that miR-372 and miR-373 suppress the expression of LATS2, we... Therefore, these results point to LATS2 as a mediator of the miR-372 and miR-373 effects on cell proliferation and tumorigenicity, KnownFact KnownFact Concepts Hypothesis Implication Goal Method Result Data Experiment 1
  • 112. Voorhoeve, 2006 To investigate the possibility that miR-372 and miR-373 suppress the expression of LATS2, we... Therefore, these results point to LATS2 as a mediator of the miR-372 and miR-373 effects on cell proliferation and tumorigenicity, KnownFact KnownFact Concepts Hypothesis Implication Goal Method Result Data Experiment 1
  • 113. Voorhoeve, 2006 To investigate the possibility that miR-372 and miR-373 suppress the expression of LATS2, we... Therefore, these results point to LATS2 as a mediator of the miR-372 and miR-373 effects on cell proliferation and tumorigenicity, KnownFact KnownFact Concepts Hypothesis Implication Goal Goal Method Result Method Result Data Data Experiment 1 Experiment 2
  • 114. Voorhoeve, 2006: ‘To investigate the possibility that miR-372 and miR-373 suppress the expression of LATS2, we... Therefore, these results point to LATS2 as a mediator of the miR-372 and miR-373 effects on cell proliferation and tumorigenicity,’ Raver-Shapira et al., JMolCell 2007: ‘two miRNAs, miRNA-372 and-373, function as potential novel oncogenes in testicular germ cell tumors by inhibition of LATS2 expression, which suggests that Lats2 is an important tumor suppressor (Voorhoeve et al., 2006).’ KnownFact KnownFact Concepts Hypothesis Implication Fact Goal Goal Method Result Method Result Data Data Experiment 1 Experiment 2
  • 115. Yabuta, JBioChem 2007: ‘miR-372 and miR-373 target the Lats2 tumor suppressor (Voorhoeve et al., 2006)’ Voorhoeve, 2006: ‘To investigate the possibility that miR-372 and miR-373 suppress the expression of LATS2, we... Therefore, these results point to LATS2 as a mediator of the miR-372 and miR-373 effects on cell proliferation and tumorigenicity,’ Raver-Shapira et al., JMolCell 2007: ‘two miRNAs, miRNA-372 and-373, function as potential novel oncogenes in testicular germ cell tumors by inhibition of LATS2 expression, which suggests that Lats2 is an important tumor suppressor (Voorhoeve et al., 2006).’ KnownFact KnownFact Concepts Hypothesis Implication Fact Goal Goal Method Result Method Result Data Data Experiment 1 Experiment 2
  • 116. Fact creation vs. Latour (1986)
  • 123. Future research: ‣ Need co-annotators to verify semantic types ‣ Need to scale up with more (types of) texts! I. How is a scientific fact created, as it moves from a hedged claim to a fact throughout successive citations? II. Can we identify a rhetorically successful text, using these segments and characteristics? III. Can we help authors create such texts (guidelines, tools)?
  • 125. Improve ‘what is claimed about an entity’ insulin ::: GB000841
      maintaining | glucose homeostasis | ... diabetes defect) to overcome insulin resistance in maintaining glucose homeostasis, hyperglycemia and glucose intolerance
      improve | glucose homeostasis | ... in T2D is able to increase insulin secretion and improve glucose homeostasis.
      improves | glucose homeostasis | ... SIRT1, whose administration to insulin-resistant animals improves glucose homeostasis.
      is capable | glucose homeostasis | S15511 is a novel insulin sensitizer that is capable of improving glucose homeostasis in nondiabetic rats.
      maintains | glucose homeostasis | Pancreatic beta-cells possess a well-regulated insulin secretory property that maintains systemic glucose homeostasis.
      may be involved | glucose homeostasis | ... similar way to those of insulin, PANDER may be involved in glucose homeostasis.
      participates | glucose homeostasis | Fine-tuning of insulin secretion from pancreatic beta-cells participates in blood glucose homeostasis.
  • 126. Improve ‘what is claimed about an entity’ insulin ::: GB000841 (each claim shown with its source context)
      maintaining | glucose homeostasis | When insulin secretion cannot be increased adequately (type I diabetes defect) to overcome insulin resistance in maintaining glucose homeostasis, hyperglycemia and glucose intolerance ensues. Insulin resistance and glucose intolerance has been well recognized in patients with advanced chronic kidney diseases (CKD).
      improve | glucose homeostasis | ... Incretin metabolism is abnormal in T2D, evidenced by a decreased incretin effect, reduction in nutrient-mediated secretion of GIP and GLP-1 in T2D, and resistance to GIP. GLP-1, on the other hand, when administered intravenously in T2D is able to increase insulin secretion and improve glucose homeostasis.
      improves | glucose homeostasis | SIRT1, a NAD(+)-dependent protein deacetylase that regulates transcription factors involved in key cellular processes, has been implicated as a mediator of the beneficial effects of calorie restriction. In a recent issue of Nature, Milne et al. (2007) describe novel potent activators of SIRT1, whose administration to insulin-resistant animals improves glucose homeostasis.
      is capable | glucose homeostasis | S15511 is a novel insulin sensitizer that is capable of improving glucose homeostasis in nondiabetic rats.... However, the mechanisms behind the insulin-sensitizing effect of S15511 are unknown. The aim of our study was to explore whether S15511 improves insulin sensitivity in skeletal muscles. S15511 treatment was associated with an increase in insulin-stimulated glucose transport in type IIb fibers, while type I fibers were unaffected.
      maintains | glucose homeostasis | Pancreatic beta-cells possess a well-regulated insulin secretory property that maintains systemic glucose homeostasis. Although it has long been thought that differentiated beta-cells are nearly static, recent studies have shown that beta-cell mass dynamically changes throughout the lifetime. In this article, recent progress of regenerative medicine of the pancreas is reviewed.
      may be involved | glucose homeostasis | ... Our results showed that glucose up-regulated PANDER mRNA and protein levels in a time- and dose-dependent manner in MIN6 cells and pancreatic islets. ...Because PANDER is expressed by pancreatic beta-cells and in response to glucose in a similar way to those of insulin, PANDER may be involved in glucose homeostasis.
      participates | glucose homeostasis | Fine-tuning of insulin secretion from pancreatic beta-cells participates in blood glucose homeostasis. ... Our data identify miR124a and miR96 as novel regulators of the expression of proteins playing a critical role in insulin exocytosis and in the release of other hormones and neurotransmitters.
  • 128. A network of hypotheses and evidence
  • 129. A network of hypotheses and evidence: PHC undergo Growth arrest
  • 130. A network of hypotheses and evidence: PHC undergo Growth arrest. Paper A: implication, method, fact, goal, fact, results
  • 131. A network of hypotheses and evidence: PHC undergo Growth arrest. Paper A: implication, method, fact, goal, fact, results; data 1, data 2, data 3
  • 132. A network of hypotheses and evidence: PHC undergo Growth arrest. Paper A: implication, method, fact, goal, fact, results; data 1, data 2, data 3. Paper B: implication, method, fact, goal, fact, results; data 4, data 5, data 6
  • 135. A network of hypotheses and evidence: PHC undergo Growth arrest. Paper A: implication, method, fact, goal, fact, results; data 1, data 2, data 3. Paper B: implication, method, fact, goal, fact, results; data 4, data 5, data 6. Link label: underpinning
  • 137. A network of hypotheses and evidence: PHC undergo Growth arrest. Paper A: implication, method, fact, goal, fact, results; data 1, data 2, data 3. Paper B: implication, method, fact, goal, fact, results; data 4, data 5, data 6. Link label: method link
  • 142. HypER Working Group: - Goal: Align and expand existing efforts on detection and analysis of Hypotheses, Evidence & Relationships - Partners: - Harvard/MGH: SWAN, ARF - Open University: Cohere - Oxford University: CiTO, eLearning/Rhetoric - DERI: SALT, aTags - University of Trento: LiquidPub - Xerox Research: XIP hypothesis identifier - U Tilburg: ML for Science - Elsevier, UUtrecht: Discourse analysis of biology
  • 148. HypER Working Group: - Goal: Align and expand existing efforts on detection and analysis of Hypotheses, Evidence & Relationships - Partners: - Harvard/MGH: SWAN, ARF - Open University: Cohere - Oxford University: CiTO, eLearning/Rhetoric - DERI: SALT, aTags - University of Trento: LiquidPub - Xerox Research: XIP hypothesis identifier - U Tilburg: ML for Science - Elsevier, UUtrecht: Discourse analysis of biology. Hypothesis 22: Intramembranous Aβ dimer may be toxic. Derived from: POSTAT_CONTRIBUTION(This essay explores the possibility that a fraction of these Abeta peptides never leave the membrane lipid bilayer after they are generated, but instead exert their toxic effects by competing with and compromising the functions of intramembranous segments of membrane-bound proteins that serve many critical functions.)
  • 150. HypER Activities: http://hyper.wik.is Current activities: - Aligning discourse ontologies: joint task with W3C HCLSSig - Aligning architectures to exchange hypotheses + evidence - Format for a rhetorical conference paper (SALT + abcde) - Parser test of hypothesis identification tools on pharmacology corpus
  • 151. HypER Activities: http://hyper.wik.is Current activities: - Aligning discourse ontologies: joint task with W3C HCLSSig - Aligning architectures to exchange hypotheses + evidence - Format for a rhetorical conference paper (SALT + abcde) - Parser test of hypothesis identification tools on pharmacology corpus Further interests: - Better structure of evidence: MyExperiment, KeFeD, ... - Granularity of annotation/access: entity, hypothesis, discussion?
  • 153. Conclusion Problem: too much discourse, tools are not yet good enough...
  • 154. Conclusion Problem: too much discourse, tools are not yet good enough... 1. First attempt: allow authors to validate entities - pursue
  • 155. Conclusion Problem: too much discourse, tools are not yet good enough... 1. First attempt: allow authors to validate entities - pursue 2. Second attempt: discourse analysis - any help is great!
  • 156. Conclusion Problem: too much discourse, tools are not yet good enough... 1. First attempt: allow authors to validate entities - pursue 2. Second attempt: discourse analysis - any help is great! 3. Third attempt: collaboration to identify hypotheses: do join!
  • 157. Questions? a.dewaard@elsevier.com http://elsatglabs.elsevier.com/labs/anita
  • 158. References
      Hyland, K. (2004). Disciplinary Discourses: Social Interactions in Academic Writing. Addison Wesley Publishing Company.
      Latour, B., and Woolgar, S. (1986). Laboratory Life: The Construction of Scientific Facts. 2nd ed. Princeton, NJ: Princeton University Press. ISBN: 9780691028323.
      Latour, B. (1987). Science in Action: How to Follow Scientists and Engineers through Society. Cambridge, MA: Harvard University Press.
  • 159. Segmentation Criteria (summary)
      Finite/Non-finite | Grammatical role | Segment? | Example
      Finite/Non-finite | Subject | N | The extent to which miRNAs specifically affect metastasis
      Finite/Non-finite | Direct Object | Y | these miRNAs are potential novel oncogenes
      Nonfinite | Phrase-level adjunct (restrictive and non-restrictive) | N | spanning a given miRNA genomic region
      Nonfinite | Clause-level adjunct | Y | by cloning eight miR-Vec plasmids
      Finite | Non-restrictive Phrase-level adjunct | Y | which is only active when tamoxifen is added (De Vita et al, 2005) […]
      Finite | Restrictive Phrase-level adjunct | N | that we examined
      Finite | Clause-level adjunct | Y | which correlates with the reported ES-cell expression pattern of the miR-371-3 cluster (Suh et al, 2004)
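The criteria above amount to a lookup from (finiteness, grammatical role) to a yes/no segmentation decision. A minimal sketch of that lookup follows; the names `SEGMENTATION_RULES` and `is_segment` are hypothetical, and the sketch assumes finiteness and grammatical role have already been determined by a parser or an annotator.

```python
from typing import Optional

# (finiteness, grammatical role) -> is the clause a separate segment?
# "any" stands for the rows marked Finite/Non-finite on the slide.
SEGMENTATION_RULES = {
    ("any", "subject"): False,
    ("any", "direct object"): True,
    ("nonfinite", "phrase-level adjunct"): False,
    ("nonfinite", "clause-level adjunct"): True,
    ("finite", "non-restrictive phrase-level adjunct"): True,
    ("finite", "restrictive phrase-level adjunct"): False,
    ("finite", "clause-level adjunct"): True,
}


def is_segment(finiteness: str, role: str) -> Optional[bool]:
    """Look up the decision; return None if the slide's table has no matching row."""
    role_key = role.lower()
    for key in ((finiteness.lower(), role_key), ("any", role_key)):
        if key in SEGMENTATION_RULES:
            return SEGMENTATION_RULES[key]
    return None
```

For example, `is_segment("Finite", "Clause-level adjunct")` yields `True`, while a combination that is not in the table yields `None` instead of a guess.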
  • 160. Basic Segment Types
      Segment | Description | Example
      Fact | a known fact, generally without explicit citation | mature miR-373 is a homolog of miR-372
      Hypothesis | a proposed idea, not supported by evidence | This could for instance be a result of high mdm2 levels
      Problem | unresolved, contradictory, or unclear issue | However, further investigation is required to demonstrate the exact mechanism of LATS2 action
      Goal | research goal | To identify novel functions of miRNAs,
      Method | experimental method | Using fluorescence microscopy and luciferase assays,
      Result | a restatement of the outcome of an experiment | all constructs yielded high expression levels of mature miRNAs
      Implication | an interpretation of the results, in light of earlier hypotheses and facts | our procedure is sensitive enough to detect mild growth differences
  • 170. Two Types of Derived Segment Types ‘Other-segments’, related to (referenced) other work: - other-result: ‘they are also found in the FCX and other cortical structures ([Sokoloff et al., 1990])’ - other-goal: ‘the role of D3 receptors in the control of motivation and affect has been intensively studied [Heidbreder et al., 2005]’ - other-implication: ‘D1 or, more likely, D5, receptors have been implicated in mechanisms underlying long-term spatial memory [Hersi et al., 1995]’ Regulatory segments, acting as matrix sentences framing other segments: - reg-hypothesis: ‘we hypothesized that’ - reg-implication: ‘These observations suggest that’ - intratextual: ‘Fig 4 shows that’ - intertextual: ‘reviewed in (Serrano, 1997)’
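Because regulatory segments are short, formulaic matrix clauses, a few surface patterns can illustrate how they might be spotted. The regular expressions below are assumptions generalized from the four examples on the slide; they are not a validated grammar, and `regulatory_type` is a hypothetical helper name.

```python
import re
from typing import Optional

# Illustrative patterns built from the slide's four examples (assumed, not validated).
REGULATORY_PATTERNS = [
    ("reg-hypothesis", re.compile(r"\bwe hypothesi[sz]ed that\b", re.I)),
    ("reg-implication",
     re.compile(r"\bthese (observations|results|data) suggest that\b", re.I)),
    ("intratextual", re.compile(r"\bfig(ure|\.)?\s*\d+\w*\s+shows that\b", re.I)),
    ("intertextual", re.compile(r"\breviewed in\s*\(", re.I)),
]


def regulatory_type(segment: str) -> Optional[str]:
    """Return the first matching regulatory segment type, or None."""
    for label, pattern in REGULATORY_PATTERNS:
        if pattern.search(segment):
            return label
    return None
```

Patterns this narrow will miss paraphrases such as "our findings indicate that", so they are best read as seed cues for annotation, not as a classifier.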
  • 171. My categories vs. Latour (1979)
  • 186. Linguistic and structural properties 1. Position in text - Section of the paper (Introduction, Results, Discussion) - Beginning/middle/end of section - First/second/third part of sentence 2. Verb: - Tense, aspect, voice - Verb class (idiosyncratic) - Lexicon 3. Metadiscourse markers [Hyland, 2003]: - Connectives - Endophorics, Evidentials - Hedges, Boosters - Person markers
  • 188. Verb class Two types of entities interact in biology texts: - Thing: - Thing -> Increase, die, etc - Thing-thing: affect, stimulate etc. - People: - People -> Thing: - Examine (Goal) - Operate (Method) - Observe (Result) - Implicate (Implication) - People - people: Report
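The People -> Thing verb classes map directly onto segment types, which can be sketched as two small lookup tables. The class-to-segment mapping comes from the slide. The surface-verb lexicon (`SURFACE_VERBS`) is a hypothetical illustration and is not the lexicon used in the study.

```python
from typing import Optional

# From the slide: People -> Thing verb classes and the segment type each signals.
VERB_CLASS_TO_SEGMENT = {
    "examine": "Goal",
    "operate": "Method",
    "observe": "Result",
    "implicate": "Implication",
}

# Hypothetical surface verbs assigned to those classes, for illustration only.
SURFACE_VERBS = {
    "investigate": "examine",
    "identify": "examine",
    "transduce": "operate",
    "clone": "operate",
    "show": "observe",
    "yield": "observe",
    "suggest": "implicate",
    "point": "implicate",
}


def segment_type_for_verb(verb: str) -> Optional[str]:
    """Map a verb lemma to a segment type via its verb class, if known."""
    verb_class = SURFACE_VERBS.get(verb.lower())
    return VERB_CLASS_TO_SEGMENT.get(verb_class)
```

Verbs outside the lexicon return `None`, which keeps the idiosyncratic nature of the verb classes visible instead of forcing a label.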
  • 189. Interpretation: 3 Realms of Science: Conceptual realm Experimental realm Data realm
  • 190. Interpretation: 3 Realms of Science:
      Conceptual realm: (1) Oncogene-induced senescence is characterized by the appearance of cells with a flat morphology that express senescence associated (SA)-β-Galactosidase. (4b) transduction with either miR-Vec-371&2 or miR-Vec-373 prevents RASV12-induced growth arrest in primary human cells.
      Experimental realm: (2a) Indeed, (2b) control RASV12-arrested cells showed relatively high abundance of flat cells expressing SA-β-Galactosidase (3a) Consistent with the cell growth assay, (3b) very few cells showed senescent morphology when transduced with either miR-Vec-371&2, miR-Vec-373, or control p53kd. (4a) Altogether, these data show that
      Data realm (Figures): (2c) (Figures 2G and 2H).
  • 192. Tense 1: Concepts vs. Experiment
      Concept realm: (1) Oncogene-induced senescence is characterized by the appearance of cells with a flat morphology that express senescence associated (SA)-β-Galactosidase. (4b) transduction with either miR-Vec-371&2 or miR-Vec-373 prevents RASV12-induced growth arrest in primary human cells.
      Experimental realm (personal, past): (2a) Indeed, (2b) control RASV12-arrested cells showed relatively high abundance of flat cells expressing SA-β-Galactosidase (3a) Consistent with the cell growth assay, (3b) very few cells showed senescent morphology when transduced with either miR-Vec-371&2, miR-Vec-373, or control p53kd. (4a) Altogether, these data show that
      Data realm (Figures) (nonverbal): (2c) (Figures 2G and 2H).
  • 193. Tense 2: Referral past present future Introduction Discussion own paper After Before current Current work After current other work: present work: past (= Results section) work: past other papers Other Work
  • 194. Tense 1+ 2 = 3: Claim, fact Conceptual Experi ment Experiential past present future Reading time