1. Automatic Text Summarization
Katja Filippova
filippova@eml-research.de
EML Research gGmbH
TU Darmstadt
Text Summarization – 25.02.2009 – p. 1
2. Text summarization
• A summary is a text that is produced from one or more
texts, that contains a significant portion of the information in
the original text(s), and that is no longer than half of the
original text(s) (Hovy, 2003)
• information retrieval
• stock market prediction
• generation of abstracts
• online news summarization
• ...
3. Overview
• Introduction
• classification of summarization systems
• abstraction vs. extraction
• Text cohesion and coherence for summarization
• graph based methods
• discourse structure based methods
• Document Understanding Conference
• tasks
• an example
• Research directions
• sentence fusion and compression
• integrating world knowledge
6. Text summarization: types
• A summary is a text that is produced from one or more
texts, that contains a significant portion of the information in
the original text(s), and that is no longer than half of the
original text(s) (Hovy, 2003)
• Indicative
« indicates types of information
« “alerts”
• Informative
« includes quantitative/qualitative information
« “informs”
• Critical/evaluative
« evaluates the content of the document
7. Text summarization: types
INDICATIVE
• The work of Consumer Advice Centres is examined. The
information sources used to support this work are reviewed.
The recent closure of many CACs has seriously affected the
availability of consumer information and advice. The
contribution that public libraries can make in enhancing the
availability of consumer information and advice both to the
public and other agencies involved in consumer information
and advice, is discussed.
8. Text summarization: types
INFORMATIVE
• An examination of the work of Consumer Advice Centres
and of the information sources and support activities that
public libraries can offer. CACs have dealt with pre-shopping
advice, education on consumers’ rights and complaints
about goods and services, advising the client and often
obtaining expert assessment. They have drawn on a wide
range of information sources including case records, trade
literature, contact files and external links. The recent closure
of many CACs has seriously affected the availability of
consumer information and advice. Libraries can cooperate
closely with advice agencies through local coordinating
committees, shared premises, joint publicity, referral and the
sharing of professional expertise.
11. Text summarization: types
• Source: single-document vs. multi-document
« research paper
« proceedings of a conference
• Content: generic vs. query-based vs. user-focused
« equal coverage of all major topics
« based on a question “what are the causes of the war?”
« users interested in chemistry
• Form: extract vs. abstract
« fragments from the document
« newly re-written text
12. Extraction vs. abstraction
How should a text summarization system proceed?
• read the documents
• understand them – build
a semantic representation
• generate a summary from
this representation
13. Extraction vs. abstraction
• unfortunately, a rich semantic representation is not
possible yet
• to date, most summarization systems are extractive
• usually, extraction units are sentences
• low cost solution: could work without ontologies,
complex representations, etc.
• extractive summaries are usually incoherent
• trade-off between non-redundancy and completeness
14. Extraction vs. abstraction
Three sentences from related documents (Oct. 27, 2008):
• The Syrian foreign minister today condemned the killing of
eight civilians in a US raid as an act of “criminal and terrorist
aggression”. (The Guardian)
• Syria accused the United States on Monday of carrying out
a “terrorist aggression” after a deadly raid near its border
with Iraq which it said killed eight civilians. (Reuters)
• Lebanese President Michel Suleiman on Monday contacted
his Syrian counterpart Bashar Assad to denounce
“Sunday’s American aggression” against the Syrian village
of Abu Kamal near the border with Iraq, local Elnashra
website reported. (Aljazeera)
22. Extraction vs. abstraction
• extractive summaries are not coherent – sentences pulled
out of different documents each make sense on their own but
sound awkward when put together
• unresolved pronouns may distort the meaning
• beginning with a sentence which starts with However, ... is
not a good idea
• there is a striking difference with human generated texts –
pronouns and connectives are in the right place, the flow of
discourse makes sense
• How could one use this property of natural discourse for
summarization?
27. Text coherence vs. text cohesion
• John enjoys playing the piano. John wants to become a
famous piano player. John works hard and works hard every
day. Working hard is necessary to become a famous piano
player.
• John enjoys playing the piano. However, he woke up early
yesterday. But the day before yesterday the weather was
wonderful, because rain and snow started immediately and
continued the whole day through. By the way, his teacher
did the same.
• John enjoys playing the piano and wants to become famous.
He works hard and does it every day because it is
necessary for his goal.
28. Text coherence vs. text cohesion
• Text coherence represents the overall structure of a
multi-sentence text in terms of macro-level relations
between clauses or sentences (Halliday & Hasan, 1996).
« Rhetorical Structure Theory (Mann & Thompson, 1988)
« Discourse Representation Theory (Kamp, 1981)
« Discourse Lexicalized Tree Adjoining Grammar (Forbes,
2001)
• John enjoys playing the piano. [John wants to become a
famous piano player.] (that’s why) [John works hard and
works hard every day.] Working hard is necessary to
become a famous piano player.
29. Text coherence vs. text cohesion
• Text cohesion involves relations between words, word
senses, or referring expressions, which determine how
tightly connected the text is (Halliday & Hasan, 1996).
« anaphora, ellipsis, connectives
« synonymy and other lexical relations
• John enjoys playing the piano. However, he woke up early
yesterday. But the day before yesterday the weather was
wonderful, because rain and snow started immediately and
continued the whole day through. By the way, his teacher
did the same.
30. Coherence based summarization
• earlier systems considered technical documents and aimed
at identifying important information by assigning weights to
sentences (Luhn, 1958; Edmundson, 1969)
• several weighted features were used:
« word (stem) frequency
« presence of cue words (e.g., as a result, significant)
that signal important content
« sentence position
« document structure
• feature weights were tuned manually
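The weighted-feature idea above can be sketched as follows. This is a minimal illustration, not the procedure of Luhn (1958) or Edmundson (1969): the cue-word list, the stopword list, the weights, and the function names are all assumptions made for the example, and the document-structure feature is left out.

```python
import re
from collections import Counter

# Illustrative assumptions: none of these values come from the cited papers.
CUE_WORDS = {"significant", "result", "conclusion"}
STOPWORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "as", "for"}
WEIGHTS = {"frequency": 1.0, "cue": 1.0, "position": 1.0}


def tokenize(sentence):
    """Lowercase a sentence and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def score_sentences(sentences, weights=WEIGHTS):
    """Return one score per sentence as a weighted sum of three features."""
    tokens = [tokenize(s) for s in sentences]
    content = [[w for w in toks if w not in STOPWORDS] for toks in tokens]
    freq = Counter(w for toks in content for w in toks)
    max_freq = max(freq.values(), default=1)
    scores = []
    for i, toks in enumerate(content):
        # Frequency feature: mean normalized frequency of the content words.
        f_freq = sum(freq[w] / max_freq for w in toks) / len(toks) if toks else 0.0
        # Cue feature: 1 if any cue word occurs in the sentence.
        f_cue = 1.0 if CUE_WORDS & set(tokens[i]) else 0.0
        # Position feature: earlier sentences score higher.
        f_pos = 1.0 - i / len(sentences)
        scores.append(weights["frequency"] * f_freq
                      + weights["cue"] * f_cue
                      + weights["position"] * f_pos)
    return scores


def extract(sentences, n=1):
    """Return the n best-scoring sentences in document order."""
    scores = score_sentences(sentences)
    best = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:n]
    return [sentences[i] for i in sorted(best)]
```

Changing the entries of `WEIGHTS` by hand corresponds to the manual tuning mentioned above.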
31. Coherence based summarization
• Rhetorical Structure Theory (Mann & Thompson, 1988)
• elaboration
• example
• contrast
• background
• motivation
• etc.
[Figure: RST tree over the segments “I am optimistic” / said Mr. Smith / as the market plunged, labelled with the relations Attribution and Circumstance (from Sporleder & Lapata, 2005)]
32. Coherence based summarization
• one could use discourse structure for summarization
(Marcu, 2000)
• however, this is not done often:
• there are few discourse parsers and they are not very
precise
• it is debated whether a tree representation is
sufficient for discourse (Wolf & Gibson, 2005)
• it is not obvious how to classify rhetorical relations
• some relations are argued to be anaphoric rather than
discourse relations (Webber et al., 2003)
34. Cohesion based summarization
• it is common to represent a text as a graph, where nodes
are sentences and edges are some relations between them
(e.g., discourse relations or just similarity)
• a common graph connectivity assumption is that the nodes
which are connected to many other nodes are likely to carry
salient information
• it is also assumed that nodes whose removal affects the
structure of the document are important (Skorochodko, 1972,
cited in Mani, 2001)
37. Cohesion based summarization
• modern approaches extend this idea and use PageRank
(Brin & Page, 1998) to find salient nodes (Erkan & Radev,
2004; Mihalcea & Tarau, 2004) in such a graph
• similar sentences are connected
(bag-of-words similarity)
• a similarity threshold is used
• the top N sentences by PageRank score
are extracted
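The three steps above (connect similar sentences, apply a threshold, rank with PageRank) can be sketched in a few lines. This is an illustration in the spirit of the cited graph-based systems and does not reproduce the details of either one; the plain bag-of-words cosine, the threshold of 0.1, the damping factor of 0.85, and all function names are assumptions.

```python
import math
import re
from collections import Counter


def bow(sentence):
    """Bag-of-words vector (token counts) for one sentence."""
    return Counter(re.findall(r"[a-z]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def build_graph(sentences, threshold=0.1):
    """Adjacency lists: connect sentence pairs whose similarity exceeds the threshold."""
    vectors = [bow(s) for s in sentences]
    n = len(sentences)
    return [[j for j in range(n)
             if j != i and cosine(vectors[i], vectors[j]) > threshold]
            for i in range(n)]


def pagerank(neighbors, damping=0.85, iterations=50):
    """Power iteration on an undirected, unweighted graph."""
    n = len(neighbors)
    if n == 0:
        return []
    rank = [1.0 / n] * n
    for _ in range(iterations):
        new = [(1.0 - damping) / n] * n
        for i, outs in enumerate(neighbors):
            if outs:
                share = damping * rank[i] / len(outs)
                for j in outs:
                    new[j] += share
            else:
                # Isolated node: spread its rank uniformly over all nodes.
                for j in range(n):
                    new[j] += damping * rank[i] / n
        rank = new
    return rank


def top_n(sentences, n=2, threshold=0.1):
    """Return the n highest-ranked sentences in document order."""
    ranks = pagerank(build_graph(sentences, threshold))
    best = sorted(range(len(sentences)), key=lambda i: ranks[i], reverse=True)[:n]
    return [sentences[i] for i in sorted(best)]
```

A sentence that shares no words with the rest of the text ends up isolated in the graph and receives the lowest rank, which is the connectivity assumption at work.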
38. Coherence vs. cohesion based TS
• Coherence:
+ transparent; coherence of the output can be improved
– annotation of relations is still a challenge; preprocessing
difficulties
• Cohesion:
+ intuitively appealing; low-cost; even unsupervised
– requires WSD*, anaphora resolution; hard to pin down;
tuned thresholds
* word sense disambiguation
39. DUC competitions
• Document Understanding Conferences (2000-2007)
• from 2008 Text Analysis Conference (TAC)
• provide participants with
- a task
- data
- manual and automatic evaluation
• increasing challenge in tasks: from generic single-document
summarization to multi-document update summary (2008)
40. DUC competitions
Sample topic: D0740I
round-the-world balloon flight
Report on the planning, attempts and first
successful balloon circumnavigation of the earth
by Bertrand Piccard and his crew.
41. DUC competitions
<DOC>
<DOCNO> APW19981112.0453 </DOCNO>
<DOCTYPE> NEWS STORY </DOCTYPE>
<DATE_TIME> 11/12/1998 08:21:00 </DATE_TIME>
<HEADER> w1942 &Cx1f; wstm- r i &Cx13; &Cx11; BC-Switzerland-BalloonQu
11-12 0355 </HEADER>
<BODY>
<SLUG> BC-Switzerland-Balloon Quest </SLUG> <HEADLINE> Swiss challenger
prepares third attempt at global record </HEADLINE> &UR; AP Photos GEV
101-102 &QL; <TEXT> GENEVA (AP) _ Swiss balloon pilot Bertrand Piccard
and his new teammate, British flight engineer Tony Brown, said Thursday
they will be ready later this month for a new attempt to fly nonstop
round the world. Their new Breitling Orbiter 3 balloon will take off
from Chateau d’Oex, in the Swiss Alps, as soon after Nov. 25 as weather
conditions are favorable, they said. It will be Piccard’s third attempt
to become the first to pilot a balloon around the world. In February
the Swiss pilot, along with British flight engineer Andy
Elson and
42. The EML NLP group at DUC 2007
43. Preprocessing: Annotation
• Sentence splitting
• Tokenization
• PoS tagging
• Chunking
• Named entity recognition
45. Preprocessing: Problems
• Sentence splitting
<sentence>At Pine Ridge, a scrolling marquee
at Big Bat’s Texaco expressed both joy over
Clinton’s visit and wariness of all the
official attention: “Welcome President
Clinton.</sentence> <sentence>Remember our
treaties,” the sign read.
• and cleaning
<sentence>PINE RIDGE, S.D.</sentence>
<sentence>(AP) - President Clinton turned the
attention of his national poverty tour today
to arguably the poorest, most forgotten U.S.
citizens of them all: American
Indians.</sentence>
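Both problems above are easy to reproduce with a naive rule-based splitter. The sketch below is not the splitter used in the system; `naive_split` is a hypothetical helper that breaks after sentence-final punctuation followed by whitespace, and the dateline string is shortened from the example above.

```python
import re


def naive_split(text):
    """Split after '.', '!' or '?' followed by whitespace (a deliberately naive rule)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


quote = ("At Pine Ridge, a scrolling marquee at Big Bat's Texaco expressed both joy "
         "over Clinton's visit and wariness of all the official attention: "
         "\u201cWelcome President Clinton. Remember our treaties,\u201d the sign read.")
# Shortened version of the dateline example.
dateline = ("PINE RIDGE, S.D. (AP) - President Clinton turned the attention of his "
            "national poverty tour today to American Indians.")

# The quotation is cut in the middle because of the period inside the quote.
quote_parts = naive_split(quote)
# The abbreviation in the dateline is treated as the end of a sentence.
dateline_parts = naive_split(dateline)

for part in quote_parts + dateline_parts:
    print(repr(part))
```

The rule has no notion of quotation marks or abbreviations, so it splits the quoted sign text in two and detaches the dateline, as in the annotated examples.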
46. Preprocessing: Document filtering
• Match topic with document extracts
• Pick the top 5 matching documents
47. Semantic analysis
• Filter topic
• Connect topic words with words in
document sentences
• Compute sentence scores
« matching words
« matching word sequences
• Result: ranked list of sentences
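A minimal sketch of the scoring step, under stated assumptions: topic filtering is taken to mean stopword removal, word sequences are approximated by bigrams of the remaining words, and a sequence match counts twice as much as a single word match. The stopword list, the weight, and the function names are illustrative and not taken from the system described here.

```python
import re

# Illustrative stopword list (an assumption).
STOPWORDS = {"the", "a", "an", "of", "on", "and", "by", "his", "to", "in", "at", "for"}


def content_words(text):
    """Lowercased alphabetic tokens with stopwords removed."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def bigrams(words):
    """Set of adjacent word pairs (adjacent after stopword removal)."""
    return set(zip(words, words[1:]))


def score(topic, sentence, seq_weight=2.0):
    """Matching words plus weighted matching word sequences."""
    t, s = content_words(topic), content_words(sentence)
    word_matches = len(set(t) & set(s))
    seq_matches = len(bigrams(t) & bigrams(s))
    return word_matches + seq_weight * seq_matches


def rank(topic, sentences):
    """Sentences sorted by decreasing score."""
    return sorted(sentences, key=lambda s: score(topic, s), reverse=True)
```

With the sample topic shown earlier, a sentence mentioning "balloon" and "Bertrand Piccard" collects three word matches and one sequence match, while an unrelated sentence scores zero.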
48. Extractive summary generation
• Rerank sentences
• Select the top non-redundant sentences (250 word limit)
• Re-arrange sentences
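The selection and re-arrangement steps might look like the sketch below. The 250-word limit is from the slide; the Jaccard overlap test, its 0.5 threshold, and ordering by a position value are assumptions chosen for illustration, as are the function names.

```python
import re


def words(sentence):
    """Lowercased word tokens of a sentence."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def jaccard(a, b):
    """Word-set overlap between two token lists."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def select(ranked, word_limit=250, max_overlap=0.5):
    """Walk down the ranked list, skipping near-duplicates and sentences over budget."""
    chosen, used = [], 0
    for sent in ranked:
        toks = words(sent)
        if used + len(toks) > word_limit:
            continue
        if any(jaccard(toks, words(c)) > max_overlap for c in chosen):
            continue
        chosen.append(sent)
        used += len(toks)
    return chosen


def rearrange(chosen, position):
    """Order the selected sentences by a position value, e.g. their source order."""
    return sorted(chosen, key=lambda s: position[s])
```

Here `position` is a hypothetical mapping from each sentence to a sortable key such as its document date and index.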
49. A good summary
Round-the-world balloon flight: Report on the planning, attempts
and first successful balloon circumnavigation of the earth by
Bertrand Piccard and his crew.
Swiss balloon pilot Bertrand Piccard announced Wednesday
that he has chosen Brian Jones as his teammate for his next
attempt at circling the world in a balloon. Jones, 52, replaces
fellow British flight engineer Tony Brown. Achieving what
promoters called the last great milestone of aviation, Bertrand
Piccard and Brian Jones joined legends like the Wright Brothers
and Charles Lindbergh with Saturday’s completion of the first
manned round-the-world balloon flight. At 4:54 a.m. EST
Saturday, the two balloonists crossed the line of longitude from
which they had departed on March 1 at Chateau D’Oex,
Switzerland, ...
50. A bad summary
Angelina Jolie: What have been the most recent significant
events in the life and career of actress Angelina Jolie?
Angelina Jolie’s win for best supporting actress for her role in
“Girl, Interrupted” came 21 years after father Jon Voight was
awarded best actor for “Coming Home.“ ANGELINA JOLIE’S
LIFE ON THE EDGE After all, her career is in overdrive. But
Jolie cautions that she’s still a serious actress. It’s not like I’m
suddenly a better actress because I have awards or this box
office clout,” she says. “I am secure in the fact that I do have
something to offer as an actress,”Jolie says. ‘...
51. Evaluation
• automatic evaluation with ROUGE (Lin, 2004)
• manual evaluation with respect to
« responsiveness
« linguistic quality
1. grammaticality
2. non-redundancy
3. referential clarity
4. focus
5. structure and coherence
• our system scored above the average, top 5 for
non-redundancy and coherence (recall the document
filtering stage)
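For the automatic part of the evaluation, the core of ROUGE-N is n-gram recall against reference summaries. The sketch below shows only that core computation; it is not the official toolkit and leaves out its options such as stemming. The tokenizer and the function names are assumptions.

```python
import re
from collections import Counter


def ngrams(text, n):
    """Counts of word n-grams in a text."""
    toks = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))


def rouge_n_recall(candidate, references, n=1):
    """Clipped n-gram matches divided by the number of reference n-grams."""
    cand = ngrams(candidate, n)
    matched = total = 0
    for ref in references:
        ref_counts = ngrams(ref, n)
        # An n-gram counts as matched at most as often as it occurs in the candidate.
        matched += sum(min(c, cand[g]) for g, c in ref_counts.items())
        total += sum(ref_counts.values())
    return matched / total if total else 0.0
```

Because the measure is recall-oriented, a longer candidate can only gain matches, which is one reason the tasks impose a word limit on summaries.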
55. Research directions
• as in information retrieval, query expansion is expected to
improve recall
« WordNet (Fellbaum, 1998) for similarity
« Wikipedia for relatedness (Strube & Ponzetto, 2006)
« paraphrases
• coreference resolution is needed for preprocessing,
otherwise, e.g., pronouns are filtered as stopwords
• relevance vs. redundancy issue: in multi-document
summarization (MDS), how can we ensure non-redundancy of
the summary? (Carbonell & Goldstein, 1998)
• sentence ordering for extractive MDS (Barzilay & Lapata,
2005)
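For the relevance vs. redundancy question above, maximal marginal relevance (MMR, Carbonell & Goldstein, 1998) greedily picks the candidate that maximizes lam * sim(candidate, query) - (1 - lam) * max sim(candidate, already selected). The sketch below uses Jaccard word overlap for both similarities, which is an assumption made to keep the example self-contained; the function names and the default lam = 0.5 are illustrative as well.

```python
import re


def tokens(text):
    """Set of lowercased word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def sim(a, b):
    """Jaccard word overlap, standing in for the similarity measure."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def mmr_select(query, sentences, k=2, lam=0.5):
    """Greedy MMR selection of k sentences."""
    selected = []
    candidates = list(sentences)
    while candidates and len(selected) < k:
        def mmr(s):
            # Redundancy: similarity to the closest sentence already selected.
            redundancy = max((sim(s, t) for t in selected), default=0.0)
            return lam * sim(s, query) - (1.0 - lam) * redundancy
        best = max(candidates, key=mmr)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With lam = 1.0 the procedure reduces to ranking by relevance alone; lower values penalize sentences that repeat what has already been selected.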
56. Directions of research
• abstractive summarization is a distant goal but there are
ways to go beyond sentence extraction
« sentence compression
« sentence fusion
59. Sentence compression
This is true, regardless of the opinion that some people have of Syria, and of
their unhappiness at Syria’s presence in Lebanon.
• summarization on the sentence level
• in principle, a compression can be different from the input
(different wording and structure)
• to date, most systems use word deletion only
• a compression corpus is now available online:
http://homepages.inf.ed.ac.uk/s0460084/data
• the performance can be evaluated automatically
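Word-deletion compression can be represented as a keep/delete decision per token, which also makes automatic scoring against a reference compression straightforward. The sketch below is illustrative only: the token-level F1 is an assumed measure and not necessarily the one used with the corpus, and the function names are hypothetical.

```python
def compress(tokens, keep):
    """Word deletion: keep[i] decides whether tokens[i] survives; order is preserved."""
    return [t for t, k in zip(tokens, keep) if k]


def compression_rate(tokens, keep):
    """Fraction of the original tokens that are kept."""
    return sum(keep) / len(tokens) if tokens else 0.0


def token_f1(system_keep, gold_keep):
    """F1 over the token positions kept by the system vs. a reference compression."""
    tp = sum(1 for s, g in zip(system_keep, gold_keep) if s and g)
    if tp == 0:
        return 0.0
    precision = tp / sum(system_keep)
    recall = tp / sum(gold_keep)
    return 2 * precision * recall / (precision + recall)


# The sentence fragment is from the example above; both masks are made up
# for illustration and are not the compression shown in the talk.
tokens = "This is true regardless of the opinion that some people have of Syria".split()
gold_keep = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
system_keep = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

print(" ".join(compress(tokens, gold_keep)))
print(compression_rate(tokens, gold_keep), token_f1(system_keep, gold_keep))
```

A system that may reword or restructure the sentence cannot be described by such a mask, which is why deletion-only systems are the easier case to evaluate.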
61. Sentence fusion
1 John Smith, born November 15 1900, studied chemistry and physics at
the University of London.
2 From 1917 Mr. Smith studied at the University of London and in 1921 he
graduated with distinction.
« Mr. Smith studied chemistry and physics at the University of London
from 1917.
• pieces of related sentences are used to generate a novel
sentence
• can be seen as a middle ground between extractive and
abstractive summarization
• addresses the incompleteness-redundancy problem
62. Thank you!
(FOR YOUR ATTENTION)
63. References
• R. Barzilay & M. Lapata, 2005: Modeling local coherence:
An entity-based approach
• S. Brin & L. Page, 1998: The anatomy of a large-scale
hypertextual web search engine
• J. G. Carbonell & J. Goldstein, 1998: The use of MMR,
diversity-based reranking for reordering documents and
producing summaries
• H. P. Edmundson, 1969: New methods in automatic
extracting
• G. Erkan & D. Radev, 2004: LexRank: Graph-based lexical
centrality as salience in text summarization
• C. Fellbaum, 1998: WordNet: An electronic lexical database
• K. Forbes, E. Miltsakaki, R. Prasad, A. Sarkar, A. Joshi, B.
L. Webber, 2001: DLTAG system – discourse parsing with a
Lexicalized Tree Adjoining Grammar
• M. Halliday & R. Hasan, 1996: Cohesion in text
• E. H. Hovy, 2003: Text summarization
• H. Kamp, 1981: A theory of truth and semantic
representation
• C.-Y. Lin, 2004: Automatic evaluation of summaries using
N-gram co-occurrence statistics
• H. P. Luhn, 1958: The automatic creation of literature
abstracts
• I. Mani, 2001: Automatic summarization
• W. C. Mann & S. A. Thompson, 1988: Rhetorical structure
theory. Towards a functional theory of text organization
• D. Marcu, 2000: The theory and practice of discourse
parsing and summarization
• R. Mihalcea & P. Tarau, 2004: TextRank: Bringing order
into text
• E. Skorochodko, 1972: Adaptive method of automatic
abstracting and indexing
• C. Sporleder & M. Lapata, 2005: Discourse chunking and its
application to sentence compression
• M. Strube & S. P. Ponzetto, 2006: WikiRelate! Computing
semantic relatedness using Wikipedia
• B. L. Webber, M. Stone, A. Joshi, A. Knott, 2003: Anaphora
and discourse structure
• F. Wolf & E. Gibson, 2005: Representing discourse
coherence: A corpus-based study