Dragomir R. Radev
2907 Philadelphia Drive
Ann Arbor, MI 48103
Office: (734) 615-5225, home: (734) 623-9259
E-mail: radev@umich.edu
URL: http://tangra.si.umich.edu/~radev
RESEARCH GROUP URL: http://tangra.si.umich.edu/clair
RESEARCH INTERESTS:
Information retrieval and natural language processing (Web graph analysis, biologically-inspired
natural language processing, text and data mining, summarization, text generation, information
extraction), and in general: database systems, machine learning, artificial intelligence, and digital
libraries
EMPLOYMENT:
09/2005 — University of Michigan (on leave 2006-2007) Ann Arbor, MI
Associate Professor,
School of Information (SI)
Dept. of Electrical Engineering and Computer Science (EECS)
Dept. of Linguistics
Faculty Member, Program in Bioinformatics
Faculty Member, Center for Computational Medicine and Biology
01/2000 — 08/2005 University of Michigan Ann Arbor, MI
Assistant Professor
08/1998 — 12/1999 IBM TJ Watson Research Center Hawthorne, NY
Research Staff Member
01/1999 — 12/1999 Columbia University New York, NY
Adjunct Assistant Professor
Department of Computer Science
EDUCATION:
09/1996 — 05/1999 Columbia University New York, NY
Ph.D., Computer Science
09/1993 — 05/1996 Columbia University New York, NY
M.S., Computer Science
09/1991 — 05/1993 University of Maine Orono, ME
B.A., Computer Science
Concentration in Linguistics
10/1988 — 07/1991 Sofia Technical University Sofia, Bulgaria
Undergraduate student, Computer Systems
CURRENT EXTERNAL FUNDING:
10/01/2005 – 09/30/2008 BlogoCenter: Infrastructure for Collecting, Mining and Accessing Blogs
NSF
PI
(collaborative project with Junghoo Cho of UCLA)
Amount: $291,963 (Michigan portion only)
09/01/2003 – 08/31/2006 Probabilistic and link-based Methods for Exploiting Very Large
Textual Repositories
NSF
Principal Investigator
Amount: $310,000
09/30/2003 – 09/29/2007 Representing and Acquiring Knowledge of Genome Regulation
NIH (NLM)
Co-Investigator
PI is David States; other co-PIs are Steve Abney and H.V. Jagadish
Amount: $1,331,976
07/01/2003 – 06/30/2006 Collaborative research: semantic entity and relation extraction from
Web-scale text document collections
NSF (Human Languages and Communications Program)
Co-PI
PI is Steve Abney of Michigan, second co-PI is Michael Collins of MIT
Amount: $355,344 (Michigan portion only, the total is $524,046)
09/01/2005 – 08/31/2008 DHB: The dynamics of Political Representation and Political Rhetoric
NSF
Co-PI (joint project with Michigan State U., Penn. State U., U. of Georgia,
and Harvard U.; PI is Burt Monroe)
Amount: $749,724 (personal portion is $135,353)
09/30/2005 – 09/29/2010 National center for integrative bioinformatics
NIH
Investigator
PI is Brian Athey
Amount: $18,700,000
PREVIOUS EXTERNAL FUNDING:
09/01/2000 – 08/31/2003 ITR/IM: Information Fusion Across Multiple Text Sources: A Common
Theory
NSF (Information Technology Research Program)
Principal Investigator
Amount: $363,181
09/01/2000 – 08/31/2005 ITR/SOC+IM: Sustainable and Generalizable Technologies to Support
Collaboration in Science
NSF (Information Technology Research)
Senior Personnel
PI is Gary Olson; I was funded only in year one
Amount: $2,400,000
08/01/2002 – 01/31/2003 Human Agent Speech Interface Architecture
ONR
Senior Personnel (subcontract from Soar Technologies)
Subcontract amount: $30,982
06/01/2002 – 08/31/2002 Workshop On "Effective Tools And Methodologies For Teaching
Natural Language Processing And Computational Linguistics"
NSF
Principal Investigator
Amount: $11,750
U.S. PATENTS:
Eric Brown, Anni Coden, John Prager, and Dragomir Radev. U.S. Patent 6665666: System, method
and program product for answering questions using a search engine.
Joyce Chai, Sunil Govindappa, Nandakishore Kambhatla, Tetsunosuke Fujisaki, Catherine Wolf,
Dragomir Radev, Yiming Ye, and Wlodek Zadrozny. U.S. Patent 6829603: System, method and
program product for interactive natural dialog.
One additional U.S. patent is pending.
REFEREED JOURNAL ARTICLES:
[1] Wai Lam, Ki Chan, Dragomir Radev, Horacio Saggion, and Simone Teufel. Context-based generic cross-
lingual retrieval of documents and automated summaries. Journal of the American Society for
Information Science and Technology 56(2), February 2005.
[2] Dragomir R. Radev, Weiguo Fan, Hong Qi, Harris Wu, and Amardeep Grewal. Probabilistic question
answering on the web. Journal of the American Society for Information Science and Technology 56(3),
March 2005.
[3] Dragomir R. Radev, Jahna Otterbacher, Adam Winkel, and Sasha Blair-Goldensohn. Newsinessence:
Summarizing online news topics. Communications of the ACM, October 2005.
[4] Gunes Erkan and Dragomir R. Radev. Lexrank: Graph-based centrality as salience in text summarization.
Journal of Artificial Intelligence Research (JAIR), 2004.
[5] Dragomir R. Radev, Hongyan Jing, Malgorzata Stys, and Daniel Tam. Centroid-based summarization of
multiple documents. Information Processing and Management, 40:919-938, December 2004.
[6] James Allan, Jay Aslam, Nicholas Belkin, Chris Buckley, Jamie Callan, Bruce Croft, Sue Dumais,
Norbert Fuhr, Donna Harman, David J. Harper, Djoerd Hiemstra, Thomas Hofmann, Eduard Hovy,
Wessel Kraaij, John Lafferty, Victor Lavrenko, David Lewis, Liz Liddy, R. Manmatha, Andrew
McCallum, Jay Ponte, John Prager, Dragomir Radev, Philip Resnik, Stephen Robertson, Roni
Rosenfeld, Salim Roukos, Mark Sanderson, Rich Schwartz, Amit Singhal, Alan Smeaton, Howard
Turtle, Ellen Voorhees, Ralph Weischedel, Jinxi Xu, and Chengxiang Zhai. Challenges in information
retrieval and language modeling. SIGIR Forum, 37(1), March 2003.
[7] Dragomir R. Radev, Eduard Hovy, and Kathleen McKeown. Introduction to the special issue on text
summarization. Computational Linguistics, 28(4), December 2002.
[8] Dragomir R. Radev, Kelsey Libner, and Weiguo Fan. Getting Answers to Natural Language Queries on
the Web. Journal of the American Society for Information Science and Technology, 53(5):359-364,
2002.
[9] Alfred Aho, Shih-Fu Chang, Kathleen McKeown, Dragomir Radev, John Smith, and Kazi Zaman.
Columbia Digital News Project. International Journal of Digital Libraries, 1(4):377-385, 1998.
[10] Dragomir R. Radev and Kathleen R. McKeown. Generating natural language summaries from multiple
on-line sources. Computational Linguistics, 24(3):469-500, September 1998.
REFEREED CONFERENCE AND WORKSHOP PAPERS:
[11] Jahna Otterbacher, Dragomir Radev, and Omer Kareem. News to Go: Hierarchical Text Summarization
for Mobile Devices. In 29th Annual ACM SIGIR Conference on Research and Development in
Information Retrieval, Seattle, Washington, August 2006.
[12] Agam Patel and Dragomir R. Radev. Lexical similarity can distinguish between automatic and manual
translations. In LREC, Genoa, Italy, May 2006.
[13] Kevin M. Quinn, Burt L. Monroe, Michael Colaresi, Michael H. Crespin, and Dragomir R. Radev. An
automated method of topic-coding legislative speech over time with application to the 105th-108th U.S.
Senate. In Midwest Political Science Association Meeting, 2006.
[14] Jahna Otterbacher, Gunes Erkan, and Dragomir R. Radev. Using random walks for question-focused
sentence retrieval. In Proceedings of HLT-EMNLP, 2005.
[15] Gunes Erkan and Dragomir R. Radev. Lexpagerank: Prestige in multi-document text summarization. In
EMNLP, Barcelona, Spain, 2004.
[16] Franz Josef Och, Daniel Gildea, Sanjeev Khudanpur, Anoop Sarkar, Kenji Yamada, Alex Fraser,
Shankar Kumar, Libin Shen, David Smith, Katherine Eng, Viren Jain, Zhen Jin, and Dragomir Radev.
A smorgasbord of features for statistical machine translation. In Proceedings of HLT-NAACL 2004,
Boston, MA, May 2004.
[17] Jahna Otterbacher and Dragomir Radev. Revisionbank: A resource for revision-based multi-document
summarization and evaluation. In Proceedings of LREC 2004, Lisbon, Portugal, May 2004.
[18] Jahna C. Otterbacher and Dragomir Radev. Comparing semantically related sentences: The case of
paraphrase versus subsumption. COLING 2004, August 23rd-27th 2004.
[19] Frederick A. Peck, Suresh K. Bhavnani, Marilyn H. Blackmon, and Dragomir R. Radev. Exploring the
use of natural language systems for fact identification: Towards the automatic construction of
healthcare portals. ASIST 2004, November 13 - 18 2004.
[20] Dragomir Radev, Timothy Allison, Sasha Blair-Goldensohn, John Blitzer, Arda Celebi, Stanko
Dimitrov, Elliott Drabek, Ali Hakim, Wai Lam, Danyu Liu, Jahna Otterbacher, Hong Qi, Horacio
Saggion, Simone Teufel, Michael Topper, Adam Winkel, and Zhang Zhu. MEAD - a platform for
multidocument multilingual text summarization. In Proceedings of LREC 2004, Lisbon, Portugal, May
2004.
[21] Dragomir R. Radev, Hong Qi, Daniel Tam, and Adam Winkel. Computational linkuistics: word triggers
across hyperlinks. In Proceedings of HLT-NAACL 2004 (short paper), 2004.
[22] Zhu Zhang. Weakly-supervised relation classification for information extraction. In CIKM 2004,
Washington, DC, November 2004.
[23] Zhu Zhang and Dragomir R. Radev. Learning cross-document structural relationships using both labeled
and unlabeled data. In Proceedings of IJC-NLP 2004, Hainan Island, China, March 2004.
[24] Naomi Daniel, Dragomir R. Radev, and Timothy Allison. Sub-event based multidocument
summarization. In Proceedings, HLT-NAACL Workshop on Text Summarization, Edmonton, AB,
Canada, 2003.
[25] Amardeep Grewal, Timothy Allison, Stanko Dimitrov, and Dragomir R. Radev. Multi-document
summarization using off the shelf compression software. In Proceedings, HLT-NAACL Workshop on
Text Summarization, Edmonton, AB, Canada, 2003.
[26] James Pustejovsky, Jose Castano, Robert Ingria, Roser Sauri, Robert Gaizauskas, Andrea Setzer,
Graham Katz, and Dragomir R. Radev. TimeML: Robust specification of event and temporal
expressions in text. In Proceedings, AAAI Spring Symposium on New Directions in Question
Answering, Stanford, CA, March 2003.
[27] Dragomir R. Radev, Simone Teufel, Horacio Saggion, Wai Lam, John Blitzer, Hong Qi, Arda Celebi,
Danyu Liu, and Elliott Drabek. Evaluation challenges in large-scale multi-document summarization:
the MEAD project. In Proceedings of ACL 2003, Sapporo, Japan, 2003.
[28] Zhu Zhang, Jahna Otterbacher, and Dragomir R. Radev. Combining labeled and unlabeled data for
learning cross-document structural relationships. In Proceedings of ACM CIKM 2003, New Orleans,
LA, November 2003.
[29] Jahna C. Otterbacher, Dragomir R. Radev, and Airong Luo. Revisions that improve cohesion in multi-
document summaries: a preliminary study. In Proceedings of the Workshop on Automatic
Summarization (including DUC 2002), pages 27-36, Philadelphia, July 2002. Association for
Computational Linguistics.
[30] Dragomir R. Radev, Weiguo Fan, Hong Qi, Harris Wu, and Amardeep Grewal. Probabilistic Question
Answering from the Web. In The 11th International World Wide Web Conference, Honolulu, Hawaii,
May 2002.
[31] Horacio Saggion, Dragomir Radev, Simone Teufel, Wai Lam, and Stephanie Strassel. Developing
infrastructure for the evaluation of single and multi-document summarization systems in a cross-
lingual environment. In Proceedings of LREC'2002, Las Palmas, Spain, June 2002.
[32] Horacio Saggion, Dragomir Radev, Simone Teufel, and Wai Lam. Meta-evaluation of summaries in a
cross-lingual environment using content-based metrics. In Proceedings of COLING'2002, Taipei,
Taiwan, August 2002.
[33] Harris Wu, Dragomir R. Radev, and Weiguo Fan. Towards Answer-Focused Summarization. In
Proceedings of the 1st International Conference on Information Technology and Applications,
Bathurst, Australia, November 25-28 2002.
[34] Zhu Zhang, Sasha Blair-Goldensohn, and Dragomir Radev. Towards CST-enhanced summarization. In
Proceedings of the AAAI 2002 Conference, Edmonton, Alberta, July - August 2002.
[35] Suresh Bhavnani, Karen Drabenstott, and Dragomir Radev. Towards a unified framework of IR tasks
and strategies. In 2001 ASIST Annual Meeting, Washington, DC, November 2001.
[36] John Prager, Dragomir R. Radev, and Krzysztof Czuba. Answering what-is questions by virtual
annotation. In Proceedings, HLT-2001, San Diego, CA, March 2001.
[37] Dragomir R. Radev, Sasha Blair-Goldensohn, Zhu Zhang, and Revathi Sundara Raghavan. Interactive,
domain-independent identification and summarization of topically related news articles. In
Proceedings, 5th European Conference on Research and Advanced Technology for Digital Libraries,
Darmstadt, Germany, September 2001.
[38] Dragomir R. Radev, Weiguo Fan, and Zhu Zhang. Webinessence: A personalized web-based multi-
document summarization and recommendation system. In NAACL Workshop on Automatic
Summarization, Pittsburgh, PA, 2001.
[39] Dragomir R. Radev, Hong Qi, Zhiping Zheng, Sasha Blair-Goldensohn, Zhu Zhang, Weiguo Fan, and
John Prager. Mining the web for answers to natural language questions. In ACM CIKM 2001: Tenth
International Conference on Information and Knowledge Management, Atlanta, GA, 2001.
[40] John Prager, Eric Brown, Anni Coden, and Dragomir Radev. Question-answering by predictive
annotation. In Proceedings, 23rd Annual International ACM SIGIR Conference on Research and
Development in Information Retrieval, Athens, Greece, July 2000.
[41] Dragomir Radev. A common theory of information fusion from multiple text sources, step one: Cross-
document structure. In Proceedings, 1st ACL SIGDIAL Workshop on Discourse and Dialogue, Hong
Kong, October 2000.
[42] Dragomir Radev and Weiguo Fan. Automatic summarization of search engine hit lists. In Proceedings,
ACL Workshop on Recent Advances in NLP and IR, Hong Kong, October 2000.
[43] Dragomir R. Radev, Hongyan Jing, and Malgorzata Budzikowska. Summarization of multiple
documents: clustering, sentence extraction, and evaluation. In Proceedings, ANLP-NAACL Workshop
on Automatic Summarization, Seattle, WA, April 2000.
[44] Dragomir R. Radev, John Prager, and Valerie Samn. Ranking potential answers to natural language
questions. In Proceedings of the 6th Conference on Applied Natural Language Processing, Seattle,
WA, May 2000.
[45] Dragomir R. Radev, Nanda Kambhatla, Yiming Ye, Catherine Wolf, and Wlodek Zadrozny. DSML: A
proposal for XML standards for messaging between components of a natural language dialogue system.
In Proceedings, AISB Workshop on Reference Architectures and Data Standards for NLP, Edinburgh,
UK, April 1999.
[46] Dragomir R. Radev. Learning correlations between linguistic indicators and semantic constraints: Reuse
of context-dependent descriptions of entities. In Proceedings, 17th International Conference on
Computational Linguistics and 36th Annual Meeting of the Association for Computational Linguistics
COLING-ACL'98, Montreal, Canada, August 1998.
[47] Alfred Aho, Shih-Fu Chang, Kathleen McKeown, Dragomir Radev, John Smith, and Kazi Zaman.
Columbia Digital News System: An environment for briefing and search over multimedia information.
In Proceedings, IEEE International Conference on the Advances of Digital Libraries ADL'97,
Washington, DC, May 1997.
[48] Karen Kukich, Rebecca Passonneau, Kathleen McKeown, Dragomir Radev, Vasileios Hatzivassiloglou,
and Hongyan Jing. Software re-use and evolution in text generation applications. In Proceedings, ACL/
EACL Workshop - From Research to Commercial Applications: Making NLP Technology Work in
Practice, Madrid, Spain, July 1997.
[49] Dragomir R. Radev and Kathleen R. McKeown. Building a generation knowledge source using internet-
accessible newswire. In Proceedings, Fifth ACL Conference on Applied Natural Language Processing
ANLP'97, pages 221-228, Washington, DC, April 1997.
[50] Evelyne Tzoukermann and Dragomir R. Radev. Using word class for part-of-speech disambiguation. In
Proceedings, Fourth Workshop on Very Large Corpora WVLC'96, pages 1-13, Copenhagen, Denmark,
August 1996. COLING.
[51] Kathleen R. McKeown and Dragomir R. Radev. Generating summaries of multiple news articles. In
Proceedings, ACM Conference on Research and Development in Information Retrieval SIGIR'95,
pages 74-82, Seattle, Washington, July 1995.
[52] Evelyne Tzoukermann, Dragomir R. Radev, and William A. Gale. Combining linguistic knowledge and
statistical learning in French part-of-speech tagging. In Proceedings, EACL Workshop on Very Large
Corpora WVLC'95, pages 51-57, Dublin, Ireland, February 1995. EACL.
[53] Siwei Shen, Dragomir Radev, and Agam Patel. Using syntax and dynamic programming for aligning
comparable texts. ACL 2006 poster session.
EDITED PROCEEDINGS:
[54] Rada Mihalcea and Dragomir R. Radev, editors. Textgraphs: Graph-based methods for NLP, New York
City, 2006.
[55] Chris Brew and Dragomir R. Radev, editors. Effective Tools and Methodologies for Teaching Natural
Language Processing and Computational Linguistics, Ann Arbor, MI, 2005.
[56] Dragomir R. Radev and Simone Teufel, editors. Text Summarization, Edmonton, AB, Canada, 2003.
[57] Dragomir R. Radev and Chris Brew, editors. Effective Tools and Methodologies for Teaching Natural
Language Processing and Computational Linguistics, Philadelphia, PA, 2002.
[58] Udo Hahn, Chin-Yew Lin, Inderjeet Mani, and Dragomir Radev, editors. Automatic Summarization,
Proceedings of the ANLP/NAACL Workshop, Seattle, WA, 2000.
[59] Eduard Hovy and Dragomir R. Radev, editors. Intelligent Text Summarization, Working notes of the
1998 AAAI Spring Symposium, Stanford, California, March 1998. AAAI Technical Report SS-98-06.
REFEREED BOOK CHAPTERS:
[60] James Pustejovsky, Jose Castano, Roser Sauri, Robert Gaizauskas, Andrea Setzer, Graham Katz,
Dragomir Radev, and Beth Sundheim. Representing temporal and event knowledge for question
answering systems. In Mark Maybury, editor, New Directions in Question Answering. 2004.
[61] Dragomir R. Radev. Speech processing. In Philipp Strazny, editor, Encyclopedia of Linguistics. Fitzroy
Dearborn Publishers, 2004.
[62] Harris Wu, Dragomir R. Radev, and Weiguo Fan. Towards answer-focused summarization using search
engines. In Mark Maybury, editor, New Directions in Question Answering. 2004.
[63] Dragomir R. Radev, Hong Qi, Zhiping Zheng, Sasha Blair-Goldensohn, Zhu Zhang, Weiguo Fan, and
John Prager. Query modulation for web-based question answering. In Tomek Strzalkowski and Sanda
Harabagiu, editors, Advances in Open-Domain Question Answering. 2003.
[64] Kathleen R. McKeown and Dragomir R. Radev. Collocations. In Robert Dale, Hermann Moisl, and
Harold Somers, editors, A Handbook of Natural Language Processing. Marcel Dekker, 2000.
[65] Kathleen R. McKeown and Dragomir R. Radev. Generating summaries of multiple news articles. In
Inderjeet Mani and Mark Maybury, editors, Advances in Automatic Text Summarization. MIT Press,
1999.
[66] Evelyne Tzoukermann and Dragomir R. Radev. Use of weighted finite state transducers in part of speech
tagging. In Andras Kornai, editor, Extended Finite State Models of Language. Cambridge University
Press, 1999.
[67] Evelyne Tzoukermann, Dragomir R. Radev, and William A. Gale. Lexical vs. contextual probabilities
for tagging French: Combining linguistic knowledge and statistical learning. In Susan Armstrong,
Kenneth Ward Church, Pierre Isabelle, Sandra Manzi, Evelyne Tzoukermann, and David Yarowsky,
editors, Natural Language Processing Using Very Large Corpora. Kluwer Academic Publishers, 1999.
REFEREED PRESENTATIONS, POSTERS, AND DEMONSTRATIONS:
[68] Jahna Otterbacher and Dragomir Radev. Fact-focused Novelty Detection: a Feasibility Study. In Poster
session, 29th Annual ACM SIGIR Conference on Research and Development in Information Retrieval,
Seattle, Washington, August 2006.
[69] Dragomir R. Radev, Gunes Erkan, Anthony Fader, Patrick Jordan, Siwei Shen, and James Sweeney.
Lexnet: A graphical environment for graph-based natural language processing. In Demo session,
COLING-ACL 2006, Sydney, Australia, July 2006.
[70] Siwei Shen, Dragomir R. Radev, Agam Patel, and Gunes Erkan. Adding syntax to dynamic
programming for aligning comparable texts for the generation of paraphrases. In Poster session,
COLING-ACL 2006, Sydney, Australia, July 2006.
[71] Dragomir R. Radev, Omer Kareem, and Jahna Otterbacher. Hierarchical text summarization for wap-
enabled mobile devices. In SIGIR 2005 (Demo session), Salvador, Brazil, August 2005.
[72] Dragomir Radev, Timothy Allison, Matthew Craig, Stanko Dimitrov, Omer Kareem, Michael Topper,
and Adam Winkel. A scaleable multi-document centroid-based summarizer. In HLT-NAACL Demo
Session, Boston, MA, May 2004.
[73] Dragomir Radev, Jahna Otterbacher, and Zhu Zhang. CSTBank: A corpus for the study of cross-
document structural relationships. In LREC 2004 Poster Session, Lisbon, Portugal, May 2004.
[74] Dragomir R. Radev and Daniel Tam. Single-document and multi-document summary evaluation via
relative utility. In CIKM 2003 poster session, New Orleans, LA, November 2003.
[75] Dragomir R. Radev, Adam Winkel, and Hong Qi. Hypergraph based content transfer for information
retrieval and question answering. Second Workshop on Algorithms and Models for the Web Graph
(WAW 2003), May 2003.
[76] Cong Yu, H. V. Jagadish, and Dragomir R. Radev. Querying XML using structures and keywords in
Timber. In SIGIR 2003 (Demo session), Toronto, ON, Canada, August 2003.
[77] Dragomir R. Radev, Hong Qi, Harris Wu, and Weiguo Fan. Evaluating Web-based Question Answering
Systems. In Demo section, LREC 2002, Las Palmas, Spain, June 2002.
[78] Dragomir R. Radev, Adam Winkel, and Michael Topper. Multi Document Centroid-based Text
Summarization. In ACL 2002 (Demo Session), Philadelphia, PA, July 2002.
[79] Dragomir R. Radev, Sasha Blair-Goldensohn, Zhu Zhang, and Revathi Sundara Raghavan.
Newsinessence: A system for domain-independent, real-time news clustering and multi-document
summarization. In Demo Presentation, Human Language Technology Conference, San Diego, CA,
March 2001.
[80] Dragomir R. Radev. An architecture for distributed natural language summarization. In Poster
Presentation, Eighth International Workshop on Natural Language Generation INLG'96, pages 45-48,
Herstmonceux, England, June 1996.
[81] Dragomir R. Radev. Rendezvous: A WWW synchronization system. Poster Presentation, Second WWW
Conference, October 1994.
[82] Dragomir R. Radev. FREX: A verb conjugation system for French. Student Poster Session, ACM
Conference, March 1993.
TECHNICAL REPORTS AND MISCELLANEOUS NON-REFEREED PUBLICATIONS:
[83] Gunes Erkan and Dragomir Radev. The University of Michigan at DUC 2004. In Document Understanding
Conference (DUC), Boston, Massachusetts, May 2004.
[84] Dragomir R. Radev. Weakly supervised graph-based methods for classification. Technical Report CSE-
TR-500-04, University of Michigan. Department of Electrical Engineering and Computer Science,
2004.
[85] Franz Josef Och, Daniel Gildea, Sanjeev Khudanpur, Anoop Sarkar, Kenji Yamada, Alex Fraser,
Shankar Kumar, Libin Shen, David Smith, Katherine Eng, Viren Jain, Zhen Jin, and Dragomir Radev.
Syntax for statistical machine translation. Technical report, Center for Language and Speech
Processing, Johns Hopkins University, Baltimore, 2003. Johns Hopkins University 2003 summer
workshop final report.
[86] Jahna Otterbacher, Hong Qi, Ali Hakim, and Dragomir Radev. The University of Michigan at TREC 2003.
In Proceedings, TREC-2003 Conference, Gaithersburg, MD, November 2003.
[87] Dragomir R. Radev. Panel on web-based question answering. AAAI Spring Symposium on New
Directions in Question Answering, March 2003.
[88] Dragomir R. Radev, Jahna Otterbacher, Hong Qi, and Daniel Tam. MEAD ReDUCs: Michigan at DUC 2003.
In DUC 2003, Edmonton, AB, Canada, June 2003.
[89] Jan Hajic, Martin Cmejrek, Bonnie Dorr, Yuan Ding, Jason Eisner, Daniel Gildea, Terry Koo, Kristen
Parton, Gerald Penn, Dragomir Radev, and Owen Rambow. Natural language generation in the context
of machine translation. Technical report, Center for Language and Speech Processing, Johns Hopkins
University, Baltimore, 2002. Johns Hopkins University 2002 summer workshop final report.
[90] Hong Qi, Jahna Otterbacher, Adam Winkel, and Dragomir R. Radev. The University of Michigan at
TREC2002: Question Answering and Novelty Tracks. In Proceedings, TREC-2002 Conference,
Gaithersburg, MD, November 2002.
[91] Dragomir Radev, Simone Teufel, Horacio Saggion, Wai Lam, John Blitzer, Arda Celebi, Hong Qi,
Elliott Drabek, and Danyu Liu. Evaluation of text summarization in a cross-lingual information
retrieval framework. Technical report, Center for Language and Speech Processing, Johns Hopkins
University, Baltimore, MD, June 2002. Johns Hopkins University 2001 summer workshop final report.
[92] Dragomir Radev, Sasha Blair-Goldensohn, and Zhu Zhang. Experiments in single and multi-document
summarization using MEAD. In First Document Understanding Conference, New Orleans, LA,
September 2001.
[93] Breck Baldwin, Robert Donaway, Eduard Hovy, Elizabeth Liddy, Inderjeet Mani, Daniel Marcu,
Kathleen McKeown, Vibhu Mittal, Marc Moens, Dragomir Radev, Karen Sparck Jones, Beth
Sundheim, Simone Teufel, Ralph Weischedel, and Michael White. An evaluation road map for
summarization research. TIDES, July 2000.
[94] John Prager, Eric Brown, Dragomir Radev, and Krzysztof Czuba. One search engine or two for question-
answering. In Proceedings, TREC-9 Conference, Gaithersburg, MD, November 2000.
[95] John Prager, Dragomir Radev, Eric Brown, Anni Coden, and Valerie Samn. The use of predictive
annotation for question answering in TREC8. In Proceedings, TREC-8 Conference, Gaithersburg, MD,
November 1999.
[96] Dragomir R. Radev. Frequently asked questions about natural language processing - second edition.
Technical Report CUCS-027-99, Columbia University Department of Computer Science, September
1999.
[97] Dragomir R. Radev. Topic shift detection - finding new information in threaded news. Technical Report
CUCS-026-99, Columbia University, 1999.
[98] Dragomir R. Radev and Eduard Hovy. Intelligent text summarization - AAAI spring symposium report.
AI Magazine, 20(3), 1999.
[99] Dragomir R. Radev, Vasileios Hatzivassiloglou, and Kathleen R. McKeown. A description of the CIDR
system as used for TDT-2. In Proceedings, DARPA Broadcast News Workshop, Herndon, VA,
February 1999.
[100] Srikant Krishna and Dragomir R. Radev. Dictator: A GUI-based system for the analysis and
modification of information extraction rules. Technical Report CUCS-014-98, Columbia University,
1998.
[101] Rebecca Passonneau, Karen Kukich, Kathleen McKeown, Dragomir Radev, and Hongyan Jing.
Summarizing Web traffic: A portability exercise. Technical Report CUCS-009-97, Columbia
University, Department of Computer Science, New York, NY, USA, March 1997.
[102] Dragomir R. Radev. Frequently asked questions about natural language processing. Vivek, 10(3), July
1997.
[103] Dragomir R. Radev. Generating natural language summaries from multiple on-line sources. Technical
Report CUCS-005-97, Columbia University, Department of Computer Science, New York, NY, USA,
March 1997.
[104] Evelyne Tzoukermann and Dragomir R. Radev. Use of weighted finite state transducers in part of speech
tagging. Technical Report 11334-970220-02TM, Lucent Technologies Bell Laboratories, Murray Hill,
N.J., USA, February 1997.
[105] Dragomir R. Radev, Evelyne Tzoukermann, and William A. Gale. Part-of-speech tagger for
French: a user's manual. Technical Report 11222-950726-03TM, 11215-950727-08TM, AT&T Bell
Laboratories, Murray Hill, N.J., USA, August 1994.
PAPERS UNDER REVIEW:
[106] Dragomir R. Radev, Simone Teufel, Horacio Saggion, Wai Lam, John Blitzer, Arda Celebi, Elliott
Drabek, Danyu Liu, and Hong Qi. Large-scale summarization evaluation in a cross-lingual information
retrieval context. Submitted to Information Processing and Management.
[107] Jahna Otterbacher and Dragomir R. Radev. Retrieval of context-specific, dynamic information: A
survey of related work. Submitted to ACM Computing Surveys.
[108] Dragomir Radev, Daniel Tam, and Gunes Erkan. Single-document and multi-document summary
evaluation using relative utility. Submitted to Information Retrieval.
[109] Exploring Fact-Focused Relevance and Novelty Detection, submitted to Information Processing and
Management
[110] Hierarchical Summarization for Delivering Information to Mobile Devices, submitted to Decision
Support Systems
PAPERS IN PROGRESS:
[111] Modeling Burstiness in Discourse Using a Stochastic Stack
[112] A topological analysis of semisupervised graph-based learning with harmonic functions
[113] Protein-protein interaction with no external knowledge
[114] An empirical analysis of 100 lexical networks
[115] Hiring networks in information science and computer science
[116] Blind men and elephants: What do citation summaries tell us about a research article
[117] Reinforcement classifiers
[118] Dependency parsing using random walks
[119] Modeling Document Dynamics: An Evolutionary Approach
[120] Cross-document relationship classification for text summarization
PH.D. THESIS:
[121] Dragomir R. Radev. Language Reuse and Regeneration: Generating Natural Language Summaries
from Multiple On-Line Sources. PhD thesis, Department of Computer Science, Columbia University,
New York, April 1999.
AWARDS AND HONORS:
The Gosnell Prize for Excellence in Political Methodology (shared) (2006)
UROP Faculty Recognition Award for Outstanding Research Mentorship University of Michigan (2004)
(funded by Coca-Cola)
Ph.D. Teaching Award of Excellence Dept. of Computer Science, Columbia University (1995)
ACM International Collegiate Computer Programming Contest International Finalist (1993)
International Student Trustee Tuition Waiver University of Maine (1991 — 1993)
Scholarship Recipient Open Society Fund (1991 — 1993)
Department of Education Scholarship (merit-based) Technical University, Sofia (1988 — 1991)
High School Mathematical Linguistics Contest 3rd place in Bulgaria (1985)
EDITORIAL BOARD MEMBERSHIP:
Natural Language Engineering, since 2006.
Information Retrieval, since 2002.
JAIR (Journal of Artificial Intelligence Research), 2003—2006.
SERVICE ACTIVITIES:
Secretary of the ACL (2006-2010)
Program chair, 2007 US high school computational linguistics competition
Co-chair (with Rada Mihalcea) HLT-NAACL workshop on Graph-based methods for NLP, New
York, NY
Co-chair (with Tim Finin), AAAI Special Track on AI and the Web, AAAI 2006, Boston,
Massachusetts
Web chair, SIGMOD 2006, Chicago, Illinois
Local chair, ACL 2005, Ann Arbor, MI
Co-chair (with Chris Brew), ACL 2005 workshop on Effective Tools and Methodologies for
Teaching NLP and CL, Ann Arbor, MI, June 2005.
Treasurer, North American Chapter of the Association for Computational Linguistics (NAACL) 2002,
2003, (re-elected) 2004, 2005
Co-chair (with Simone Teufel), Document Understanding Conference (DUC 2003), Edmonton, Canada,
May-June 2003.
Publications co-chair (with Steve Abney), HLT-NAACL-03, Edmonton, Canada, May-June 2003.
Publications chair, ACL-02, Philadelphia, PA, July 2002.
Guest editor (with Eduard Hovy and Kathy McKeown), Special issue of Computational Linguistics on
Summarization, December 2002
Co-chair (with Chris Brew), ACL Workshop on Effective Tools and Methodologies for Teaching NLP
and CL, Philadelphia, PA, July 2002.
Co-Chair (with Eduard Hovy), AAAI Spring Symposium on Intelligent Text Summarization, Stanford,
CA, March 1998.
Co-Chair (with Maria Milosavljevic), COLING-ACL Student Session, Montréal, Canada, August 1998.
OTHER ORGANIZATIONAL ACTIVITIES:
Session chair, LREC 2006, Genoa, Italy
Session chair, HLT-NAACL 2006, New York City
Session chair, DUC-05, Vancouver, BC
Session chair, HLT-EMNLP 2005, Vancouver, BC
Session chair, RANLP 2005, Borovetz, Bulgaria
Organizing committee, ACL Workshop on Evaluation for Machine Translation and Summarization,
Ann Arbor, MI, June 2005.
Organizing committee, ACL Workshop on Text Summarization, Barcelona, Spain, July 2004.
Mentor, SIGIR 2004, Sheffield, UK, August 2004.
Organizing committee, Document Understanding Conference (DUC’04), Boston, MA, May 2004.
Session chair, DUC-02, Philadelphia, PA, July 2002.
Organizing committee, Document Understanding Conference (DUC’02), Philadelphia, PA, July 2002.
Session Chair, NAACL Workshop on Text Summarization, Pittsburgh, PA, June 2001.
Session Chair, ACL Workshop on Information Retrieval and Natural Language Processing, Hong Kong,
October 2000.
Session Chair, ACM SIGIR’00, Athens, Greece, July 2000.
Workshops Chair, International Conference on Natural Language Generation, Mitzpe Ramon, Israel,
June 2000.
Session Chair, ANLP/NAACL Workshop on Automatic Summarization, Seattle, WA, April 2000.
Organizing Committee, ANLP/NAACL Workshop on Automatic Summarization, Seattle, WA, April
2000.
JOURNAL REVIEWING:
VLDBJ: 2005
Technometrics: 2005
Artificial Intelligence: 1998, 1999, 2001
Machine Learning: 2004
IBM Systems Journal: 2002
Computational Linguistics: 2002, 2006
Journal of Artificial Intelligence Research (JAIR): 2001
Journal of the American Society for Information Science and Technology (JASIST): 2002, 2003, 2004
Information Processing and Management: 1998, 2000, 2001, 2002
ACM Transactions on Information Systems (TOIS): 1997, 2001, 2003
ACM Transactions on Internet Technology (TOIT): 2002
ACM Transactions on Asian Language Information Processing (TALIP): 2003, 2004
ACM Transactions on Speech and Language Processing (TSLP): 2005
IEEE Internet Computing: 1997
Natural Language Engineering: 2001, 2002, 2003, 2004, 2005
Journal of Web Intelligence and Agent Systems: 2003
IEEE Intelligent Systems: 2003
DKE Journal: 2004
Traitement Automatique de Langues (TAL): 2004
IEEE Transactions on Knowledge and Data Engineering (TKDE): 2005
International journal on intelligent systems: 2006
PROGRAM COMMITTEES AND CONFERENCE REVIEWING:
HLT-NAACL 2007
IJCAI-2007 WS: TextLink 2007
2007 AAAI International Conference on Weblogs and Social Media
DUC 2007
SLT 2006 Speech and Language Technology conference, Aruba
LINKKDD 2006
Third Midwest computational linguistics colloquium, 2006
COLING-ACL 2006 WS on sentiment and subjectivity in text
HLT-NAACL 2006 WS on Analyzing Conversations in Text and Speech (ACTS), NY
HLT-NAACL 2006 (area chair), New York, NY
LREC 2006, Genoa, Italy
EACL 2006 WS on the Web as a corpus, Trento, Italy
AAAI Spring Symposium 2006 on Computational Approaches for Analyzing Weblogs, Stanford
RANLP 2005, Borovetz, Bulgaria
RANLP 2005 Workshop on summarization: Borovetz, Bulgaria
SIGKDD 2005 LinkKDD workshop, Chicago, Illinois
IJCAI 2005, Edinburgh, UK
HLT-EMNLP 2005, Vancouver, BC, Canada
IJCNLP 2005 (area chair), Jeju Island, Korea
SIGIR 2005, Salvador, Brazil, August 2005
SIGIR 2005 (poster session), Salvador, Brazil, August 2005
CoNLL 2005, Ann Arbor, June 2005
ACL 2005 WS on Evaluation for Machine Translation and Summarization, June 2005
JNLE Special Issue on parallel texts, 2004
EMNLP 2004, Barcelona, Spain, July 2004
SIGIR 2004 WS on Information Retrieval for Question Answering, Sheffield, UK, July 2004
WAW 2004: 3rd WS on Algorithms and Models for the Web Graph, Rome, Italy, October 2004 (at
FOCS 2004)
TEDC: 2nd IEEE WS on Technology and Educ. in Developing Countries, Joensuu, Finland, August 2004
ACL 2004, Barcelona, Spain, July 2004 (reviewer)
IJCNLP 2004, Hainan Island, China, March 2004 (reviewer)
IJCNLP 2004 WS on Multilingual Summarization and Question Answering, Hainan Island, China
CHI 2004 (reviewer)
HLT-NAACL 2004, Boston, MA, May 2004 (reviewer)
FLAIRS 2004 – Web and AI track
EMNLP 2003, Sapporo, Japan, July 2003.
SIGIR 2003, Toronto, Canada, July 2003.
TEDC: 1st IEEE WS on Technology and Education in Developing Countries, Newark, NJ, August 2003
RANLP 2003, Borovetz, Bulgaria, September 2003.
CIKM 2003, New Orleans, LA, November 2003.
ASIST 2003, Long Beach, CA, October 2003.
ACL 2003, Sapporo, Japan, July 2003 (reviewer).
ACL 2003 WS on Multilingual Summarization and Question Answering, Sapporo, Japan, July 2003.
SIGIR 2003 (poster session), Toronto, Canada, August 2003.
EACL 2003 (student session), Budapest, Hungary, April 2003.
SIGIR 2002 (poster session), Tampere, Finland, August 2002.
COLING 2002, Taipei, Taiwan, August 2002.
AAAI 2002, Edmonton, Alberta, July-August 2002.
AAAI 2002 (student session), Edmonton, Alberta, July-August 2002.
HLT 2002, San Diego, CA, March 2002.
INLG 2002, Arden House, New York, July 2002.
DUC 2001, New Orleans, LA, September 2001.
RANLP 2001, Tzigov Chark, Bulgaria, September 2001.
ACL 2001, Toulouse, France, July 2001.
ACL 2001 Workshop on Open-Domain Question Answering, Toulouse, France, July 2001.
ACL 2001 Workshop on Temporal and Spatial Information Processing, Toulouse, France, July 2001.
NAACL Workshop on Automatic Summarization, Pittsburgh, Pennsylvania, June 2001.
HLT 2001, San Diego, CA, March 2001.
EMNLP/VLC 2000, Hong Kong, October 2000.
ACL 2000, Hong Kong, October 2000.
AAAI 2000, Austin, TX, July-August 2000.
INLG 2000, Mitzpe Ramon, Israel, June 2000.
ANLP/NAACL 2000, Seattle, WA, April — May 2000.
ACL/EACL 1997 Workshop on Intelligent Scalable Text Summarization, Madrid, Spain, July 1997.
ACL 1996 (student session), Santa Cruz, CA, 1996.
MISCELLANEOUS REVIEWING:
SAC (Symposium on Applied Computing), 2002.
CONTEXT, 2001.
ACL Student Session, College Park, MD, June 1999.
Formal Approaches to Slavic Linguistics (FASL8), 1999.
RANLP, Tzigov Chark, Bulgaria, September 1997.
AAAI Student Session, Seattle, WA, August 1994.
EDUCATIONAL OUTREACH ACTIVITIES:
Tutorial on Graph-based methods for NLP and IR (joint with Rada Mihalcea), New York City, June
2006. Attendance: around 40.
Tutorial on Text Summarization at ACM SIGIR’04, Sheffield, UK, August 2004. Attendance: 20
Tutorial on Text Summarization at ACM SIGIR’03, Toronto, Canada, August 2003. Attendance: 21
Text Generation and Summarization (lecture and lab), Summer School on Human Language
Technologies, Johns Hopkins University, July 3, 2003.
Text Generation (lab), Summer School on Human Language Technologies, Johns Hopkins University,
July 5, 2002. Attendance: 35.
Tutorial on Text Summarization at ACM SIGIR’01, New Orleans, LA, September 2001. Attendance:
22.
Tutorial on Text Summarization at ACM SIGIR’00, Athens, Greece, July 2000. Attendance: 20.
Tutorial on Text Summarization at AAAI’00, Austin, TX, July 2000. Attendance: 125.
Tutorial on Text Summarization at IBM TJ Watson Center, Yorktown Heights, NY, June 2000.
Attendance: 30.
OTHER PROFESSIONAL ACTIVITIES:
Webmaster, Association for Computational Linguistics (www.aclweb.org), 1994 — present.
Reviewer, Belgian Science Foundation, 2006
Reviewer, Research Grants Council of Hong Kong, 2005, 2006
ACL 2004 student session, panelist
Reviewer for CRDF, 2001, 2003, 2004, 2005
Reviewer for NSF in the IIS Division, 2001, 2002, 2003, 2004
Reviewer for NSERC, Canada, 2001
Co-chair, IBM Research Natural Languages and Linguistics Special Interest Group, 1999.
Reviewer, Moldovan-US Bilateral grants program
Reviewer, NIH, 2002
Reviewer, NSF SBIR/STTR, 2002, 2004
INVITED PARTICIPATION:
07/2003 – 08/2003 2003 Language Engineering Workshop, Center for Language and
Speech Processing
Johns Hopkins University, Baltimore, Maryland
External project member, Syntax for Statistical Machine Translation
http://www.clsp.jhu.edu/ws2003/groups/translate/
06/2003 ARDA-NRRC Workshop on Scenario-Based QA
Bedford, MA
06/2003 2003 Document Understanding Conference
Edmonton, AB, Canada
Panelist
11/2002 Planning meeting, 2003 JHU Summer Language Engineering Workshop
Turf Valley, Maryland
09/2002 Workshop on Challenges in Information Retrieval and Language
Modeling
Amherst, Massachusetts
http://ciir.cs.umass.edu/irchallenges
07/2002 – 08/2002 2002 Language Engineering Workshop, Center for Language and
Speech Processing
Johns Hopkins University, Baltimore, Maryland
Project member, Generation in the context of machine translation
http://www.clsp.jhu.edu/ws2002/groups/mt/
07/2002 2002 Document Understanding Conference
Philadelphia, PA
Panelist
01/2002 – 07/2002 TERQAS: An ARDA Workshop on Time and Event Recognition for
Question Answering Systems
MITRE and Brandeis University
http://time2002.org
11/2001 Planning meeting, 2002 JHU Summer Language Engineering Workshop
Turf Valley, Maryland
07/2001 – 08/2001 2001 Language Engineering Workshop, Center for Language and
Speech Processing
Johns Hopkins University, Baltimore, Maryland
Team Leader, Automatic Summarization of Multilingual Documents
http://www.clsp.jhu.edu/ws2001/groups/asmd/
05/2001 Linguistics Department, University of Michigan
Invited panelist
11/2000 Planning meeting, 2001 Document Understanding Conference
Gaithersburg, Maryland
10/2000 EMNLP/VLC-2000
Invited panelist, “Natural Language and the Web” panel. Panel moderator:
Hinrich Schütze (GroupFire). Participants: Ken Church (AT&T Research),
John Lowe (VP, Ask Jeeves), Joe Zhou (Intel Research, Beijing), and
Dragomir Radev (University of Michigan)
Hong Kong
04/2000 DARPA
TIDES Summarization Vision Committee
12/1999 Planning meeting, 2000 JHU Summer Language Engineering Workshop
Baltimore, Maryland
RESEARCH EXPERIENCE:
IBM T.J. Watson Research Center, June 2001 Visiting Professor
Participated in a question answering project.
Collaborators: Dr. John Prager and Dr. Yael Ravin.
IBM T.J. Watson Research Center, August 2000 Visiting Professor
Participated in two projects: question answering and natural language dialogue systems.
Collaborators: Dr. John Prager, Dr. Yael Ravin, and Krzysztof Czuba.
IBM T.J. Watson Research Center, 1998 — 1999 Research Staff Member
Participated in several projects in natural language dialogue systems, information retrieval,
question answering, and text summarization.
Collaborators: Dr. Wlodek Zadrozny, Dr. Joyce Chai, Dr. Malgorzata Styś-Budzikowska, Dr.
Veronika Horvath, Dr. Nanda Kambhatla, Dr. Yiming Ye, Sunil Govindappa, Dr. Catherine
Wolf, Dr. John Prager, Dr. Eric Brown, Dr. Anni Coden, Valerie Samn, Dr. Yael Ravin, and Dr.
Tetsunosuke Fujisaki.
Columbia University, 1993 — 1998 Graduate Research Assistant
Participated in various projects in Natural Language Summarization, Statistical Information
Extraction, and Digital Libraries.
Adviser: Prof. Kathleen McKeown. Collaborated with Prof. Alfred Aho, Prof. Shih-Fu Chang,
Dr. Vasileios Hatzivassiloglou, Prof. Luis Gravano, and Dr. Judith Klavans.
IBM T.J. Watson Research Center, Summer 1997 Research Staff
Participated in a project in Natural Language Summarization on the Internet.
Collaborator: Dr. Wlodek Zadrozny, Human Centric Solutions and Applications Department
Bell Communications Research, Summer 1996 Resident Visitor
Participated in a Natural Language Summarization project for World-Wide Web logs.
Collaborators: Dr. Karen Kukich, Dr. Rebecca Passonneau.
AT&T Bell Laboratories, Summer 1995 Member of Technical Staff, Consultant
Participated in a project involving statistical part-of-speech tagging for French for use in text-to-
speech applications.
Collaborator: Dr. Evelyne Tzoukermann.
AT&T Bell Laboratories, Summer 1994 Member of Technical Staff
Participated in a project involving the merging of statistical and symbolic knowledge in part-of-
speech tagging.
Collaborators: Dr. Evelyne Tzoukermann, Dr. William Gale, and Dr. Diane Lambert.
UNIVERSITY-LEVEL COURSES TAUGHT:
Instructor, “COMS 6998 Search Engine Technology” Columbia University
Spring 2007
Course material includes information retrieval, webometrics, social network analysis, text
mining, clustering and categorization, etc.
Instructor, “SI 767/EECS 767 Seminar: Advanced NLP and IR” University of Michigan
Winter 2006 (9 students)
The course takes place as a seminar in which students take turns presenting recent research
papers in NLP and IR. Such topics may include spectral methods, expectation maximization,
conditional random fields, noisy channel models, statistical machine translation, document
ranking methods, semi-supervised learning, label propagation, document models, text centrality,
mincut-based methods, sentiment and polarity analysis, text classification, and the Web as
corpus.
Instructor, “SI 654 Database Application Design” University of Michigan
Winter 2000 (26 students), Winter 2001 (40 students), Winter 2002 (45 students), Winter 2003
(37 students), Winter 2004 (30 students), Winter 2005 (13 students), Fall 2005 (20 students)
This course is an introduction to database management systems (DBMS). It covers both
theoretical and practical aspects of DBMS such as database design, use, and implementation.
Topics covered: Entity-Relationship model, relational model, relational algebra, SQL, database
design, application design, Web interfaces, transaction processing, database systems and tools,
system administration, XML and XSL, XML query languages, data mining.
Instructor, “SI 760/LING 792/EECS 597 Language and Information” University of Michigan
Fall 2000 (22 students), Fall 2002 (20 students), Winter 2004 (12 students)
The course presents a survey of techniques used in the statistical processing of natural language
and information. The material includes: introduction to computational linguistics, information
theory, data compression and coding, N-gram models, clustering, lexicography, collocations, text
summarization, information extraction, question answering, word sense disambiguation, analysis
of style, and other topics.
Instructor, “EECS 595/LING 541/SI 661/SI 761 Natural Language Processing” University of
Michigan
Fall 2001 (40 students), Fall 2003 (22 students), Fall 2004 (23 students), Fall 2005 (25 students)
The course covers: introduction to computational linguistics, morphology, part-of-speech
tagging, regular grammars and finite-state automata, context-free grammars, parsing with
context-free grammars, knowledge representation, semantics, text generation, discourse and
dialogue.
Instructor, “SI 650 Concepts of Information Retrieval” University of Michigan
Winter 2003 (12 students), Winter 2005 (11 students)
Course covers: information need, IR models, documents, queries, query languages, relevance,
retrieval evaluation, reference collections, query expansion and relevance feedback, indexing and
searching, XML retrieval, language modeling approaches, crawling the Web, hyperlink analysis,
measuring the Web, similarity and clustering, social network analysis for IR, hubs and
authorities, PageRank and HITS, focused crawling, relevance transfer, question answering.
Co-Instructor, “SI 503 Search and Retrieval” (w/Suresh Bhavnani) University of Michigan
Winter 2002 (110 students), Winter 2006 (120 students)
The course covers the following topics: organization and labeling, search behavior, cognitive
search and retrieval, search, indexing, and filtering, interfaces, social and organizational search,
Web search, navigational search, data structures for search, problem space search.
Instructor, “COMS 4705 Natural Language Processing” Columbia University
Fall 1999 (25 students). Prepared lectures, assignments, projects, exams. Supervised one teaching
assistant. Held office hours. (This course was later offered as a taped course in Sp02, Su02, F02,
Sp03, Sp04.)
Course material includes linguistic fundamentals, mathematical and information theoretic
fundamentals, regular languages and automata, phrase-structure grammars, feature grammars,
statistical techniques, part-of-speech tagging, semantics, information extraction, text generation,
and other topics.
Instructor, “COMS 4999 Computing and the Humanities” Columbia University,
Spring 1995 (45 students), Spring 1999 (29 students). Prepared lectures, assignments, projects,
exams. Supervised two teaching assistants. Held office hours.
Course material includes literary analysis, authorship analysis, information retrieval, statistical
language processing, digital library issues, legal applications, and text markup.
Co-instructor, “COMS 6998 Topics in Digital Libraries” Columbia University
Fall 1997 (Seminar enrollment: 13 students). Prepared course, assignments, reading lists (jointly
with Prof. Luis Gravano).
Course material includes information retrieval, text, image, video, and multimedia repositories,
information extraction, summarization, multilingual access, database issues, user interfaces and
visualization, integration of text and visual features.
Instructor, “COMS 1001 Introduction to Computers” Columbia University
Fall 1996 (102 students). Prepared lectures, assignments, projects, exams. Supervised four
teaching assistants. Held office hours.
Course material includes introduction to computing, programming in Scheme, the Internet, and
the UNIX operating system.
OTHER TEACHING EXPERIENCE:
Reading group organizer:
Models of the Web graph Fall 2004
Natural language processing Fall 2003
Guest lecturer University of Michigan, 2000 – 2003
Seminar in Linguistics (3 times), Artificial Intelligence, Digital Libraries (twice), Web-based
database systems, Natural Language Processing, SI Doctoral Foundations, Network theory
Course Manager, “COMS 4118 Operating Systems” Columbia University, Summer 1996
Enrollment: 7 students. Assisted in teaching a pre-taped course on the Columbia Video Network.
Prepared and graded homework assignments, projects, and exams. Held office hours. Installed
and maintained software for the course.
Course material includes processes, memory management, input/output, file systems,
networking.
Instructor, “Internet Programming” Ben-Gurion University, November 1997
Enrollment: 20 students. Taught a mini-course while visiting Ben-Gurion University. Prepared
lectures, example code, study materials, course software, and a course project.
Course material includes client/server programming, WWW-based user interfaces, relational
database back-ends, WWW robots, text indexing and search.
Teaching Assistant Columbia University, 1993 — 1996
Teaching assistant in 3 courses: “Introduction to Computer Programming - Fortran”, “Computing
and the Humanities”, “Introduction to Computers”. Held office hours, prepared and graded
assignments, projects, and exams. Occasionally taught lectures. Also taught single lectures in
“Natural Language Processing”.
Teacher/counselor Upward Bound Program, University of Maine, summer 1992, summer 1993
Taught SAT preparation course and performed various residential counseling activities.
PH. D. STUDENTS ADVISED:
Bryan Pardo (Ph.D. 2004, joint advisee with Bill Birmingham), music modeling and retrieval, now
Assistant Professor, Computer Science, Northwestern University
Zhu Zhang, 2000 — 2005, machine learning, information extraction, cross-document structure theory,
now Assistant Professor, Management Information Systems, University of Arizona
Jahna Otterbacher, 2002 — 2006, text summarization, information extraction, now visiting faculty,
Department of Public and Business Administration, University of Cyprus
Güneş Erkan, 2003 —, graph-based methods, machine learning, question answering, information
extraction, bioinformatics
Zhuoran Chen, 2005 —
Arzucan Ozgur, 2006 —
Xiaodong Shi, 2006 —
PH. D. THESIS COMMITTEE MEMBER:
Weiguo “Patrick” Fan (2002), University of Michigan (Business School) – information filtering – now
at Virginia Tech
Timothy Allison (2003), University of Michigan (Classics) – computational stylometrics – now at
Smith College
Mark Arehart (2003), University of Michigan (Linguistics) - computational semantics
Yu-Ying Chang (2004), University of Michigan (Linguistics) – discourse analysis and bibliometrics
Michael Gastner (2005), University of Michigan (Physics) – diffusion-based methods – now at Santa Fe
Institute
Carlos Santos (expected 2006), University of Michigan (Bioinformatics) – NLP in the biomedical
domain
Damian Fermin (expected 2007), University of Michigan (Bioinformatics) - bioinformatics
Andrew Nierman (expected 2007), University of Michigan (EECS) – XML databases
Victoria Fossum (expected 2007), University of Michigan (EECS) – machine translation
Yunyao Li (expected 2007), University of Michigan (EECS) – NLP interfaces to databases
Yang Ye (expected 2007), University of Michigan (Linguistics) – machine translation
Cong Yu (expected 2007), University of Michigan (EECS) – XML databases
Yvonne Ford (expected 2007), University of Michigan (Nursing) – informatics in nursing
Winston Hsu (expected 2006), Columbia University (EE) – image processing
Maria Fuentes Fort (expected 2007), U Politecnica de Catalunya – text summarization
GRADUATE AND UNDERGRADUATE STUDENT RESEARCH ASSISTANTS ADVISED:
Chuan-Chih Chou, 2006 —, PhD student, applied physics
Anthony Fader, 2006 —, BA student, mathematics
Kevin McGowan, 2005 — 2006, MS student: machine translation
Alejandro C de Baca, 2004 — 2005, MS student: models of the Web
Rohit Laungani, 2005 — 2006, MS student
James Sweeney, 2005 — 2006, MS student
Patrick Jordan, 2005 — 2006, MS student
Jacob Balazer, 2005 — 2006, MS student
Erin Rhode, 2005 —, PhD student
Aaron Elkiss, 2005 —, PhD student
Siwei Shen, 2003 — 2005, MS student, machine learning, grammar acquisition, bioinformatics
Stanko Dimitrov, 2002 — 2004, BA student (UROP)
Michael Topper, 2001 — 2003, BA student: multidocument summarization
Amardeep Singh Grewal, 2001 — 2003, BA student (UROP)
Dan Tam, 2002 — 2004, MS student: text generation, text summarization
Scott Gifford, 2002, MS student: information retrieval
Adam Winkel, 2001 — 2002, MS student: multidocument summarization
Naomi Daniel, 2001 — 2002, MS student: subevent-based summarization
Zhiping Zheng, 2001, MS student
Hong Qi, 2000 — 2003, question answering, information retrieval, PhD student
Sasha Blair-Goldensohn, 2000 — 2001, MS student
Revathi Sundara Raghavan, 2000, MS student
Arica Jackson, 2002, BA student (UROP)
Michael Gimbel, 2001 — 2002, BA student (UROP)
Oluremi Kufeji, 2001 — 2002, BA student (UROP)
Bojan Peovski, 2001 — 2002, BA student (UROP)
Kiran Divvela, 2000, BS student (UROP)
INDEPENDENT STUDY PROJECT ADVISER:
Amy Lau, Christopher Small, Srikant Krishna, Efrat Levy (at Columbia between 1994 and 1998)
Zachary Haberer, Stanley Cavin, Yingqi Feng, Kate Lockwood, Krittaya Alapol, Sasha Blair-
Goldensohn, Zhiping Zheng (2000—2001)
Krittaya Alapol, Yong Huang (2001—2002)
Ammar Qusaibaty, Erin Doumpoulaki (2002)
Omer Abdul Kareem, Siwei Shen, Renju Jacob, Matthew Forsythe, Ping Yu, Yang Ye (2003)
Jacob Balazer, Gumwon Hong, Güneş Erkan, Agam Patel, Dan Tam, Sean Gerrish, Cristina Negrut
(2004)
Esha Parvathi Krishnaswamy (2005)
DEPARTMENTAL SERVICE:
Strategic planning committee:
University of Michigan Dept. of EECS, 2005 — 2006
Curriculum committee:
University of Michigan Bioinformatics, 2006 —
Research committee:
University of Michigan School of Information, 2005 — 2007
Master’s committee:
University of Michigan School of Information, 2004 — 2006
Doctoral committee:
University of Michigan School of Information, 2002 — 2004
Dean’s advisory committee:
University of Michigan School of Information, 2000 — 2002, 2004 — 2005
Undergraduate committee:
University of Michigan School of Information, 2000 — 2002.
I proposed the creation of a 3-2 program with Linguistics which was approved by both SI and
Linguistics.
Ph.D. committee:
Columbia University, Department of Computer Science, 1996 — 1998.
MS admissions committee:
Columbia University, Department of Computer Science, 1994 — 1996.
Webmaster:
Columbia University, Department of Computer Science, 1994 — 1997.
ACM Programming Team Coach:
Coach, International Finals 1995: top 20 worldwide (out of 600 teams)
Coach, International Finals 1996: 7th (tie) worldwide (out of 1000 teams)
Coach, International Finals 1997: top 20 worldwide (out of 1000 teams)
RESEARCH TALKS (2004-05):
January 20, 2004, “Content diffusion on the Web graph”, General Motors Research
March 3, 2004 “Content diffusion on the Web graph”, University of Maryland (NLP group)
March 4, 2004 “Multilingual computing: syntax, generation, and summarization”, Georgetown
University
April 14, 2004 “Words, links, and patterns: novel representations for Web-scale text mining”,
Northwestern University
June 8, 2004 “Words, links, and patterns: novel representations for Web-scale text mining”, Columbia
University
July 13, 2004 “Words, links, and patterns: novel representations for Web-scale text mining”, Sofia
University
July 15, 2004 “Words, links, and patterns: novel representations for Web-scale text mining”, University
of Cambridge
July 19, 2004 “Words, links, and patterns: novel representations for Web-scale text mining”, University
of Wolverhampton
July 27, 2004 “Words, links, and patterns: novel representations for Web-scale text mining”, University
of Sheffield
July 29, 2004 “Words, links, and patterns: novel representations for Web-scale text mining”, University
of Essex
August 4, 2004 “Words, links, and patterns: novel representations for Web-scale text mining”,
Microsoft Research
September 23, 2004 “Social network analysis of text”, University of Michigan STIET seminar
September 24, 2004 “Social network analysis of text”, Cornell University AI seminar
October 7, 2004 “Social network analysis of text”, MIT
October 8, 2004 “Social network analysis of text”, Brown University
October 15, 2004 “Words, links, and patterns: novel representations for Web-scale text mining”,
Indiana University CS/Informatics Seminar
November 10, 2004, UC Berkeley, BISC seminar
November 11, 2004 “Social network analysis of text”, University of Southern California
November 12, 2004, ISI
November 15, 2004, “Words, links, and patterns: novel representations for Web-scale text mining”,
UC Irvine
December 2, 2004 “Words, links, and patterns: novel representations for Web-scale text mining”,
University of Illinois, DB seminar
December 3, 2004, University of Illinois, Chicago
December 3, 2004, University of Chicago
December 9, 2004 IBM T.J. Watson Research Center, NL PIC talk
December 10, 2004 New York University, CS department seminar and NYC-NLP forum
December 15, 2004 UCLA
March 28, 2005 Ask Jeeves
July 2005 MSN Search
July 2005 Microsoft Research
August 18, 2005 University of Washington “Random walk methods for natural language processing”
September 24, 2005 RANLP workshop on text summarization, Borovetz, Bulgaria “Graphs everywhere:
novel methods for summarization and NLP”
November 1, 2005 Michigan State University, East Lansing “Graphs everywhere: novel methods for
summarization and NLP”
November 21, 2005 Google
PAST TALKS (1997-2003, in alphabetical order):
AT&T Labs Research, Ben-Gurion University of the Negev, Bulgarian Academy of Sciences,
Columbia University (NLP group), Educational Testing Service, First Document Understanding
Conference (DUC 2001, New Orleans, Louisiana), Ford Research, Fordham University, George Mason
University, Georgetown University, Georgia Institute of Technology, IBM T.J. Watson Research
Center, Michigan-Ohio Chapter of ACM SIGCHI, New Jersey Chapter of the American Society for
Information Science, Rutgers University (SCILS), SUNY-Stony Brook, United States Department of
State Foreign Service Institute, University of California, San Diego (AI seminar), University of
Chicago, University of Edinburgh, University of Illinois (SLIS), University of Illinois-Chicago,
University of Maine, University of Maryland (NLP seminar), University of Massachusetts, University
of Michigan (IPOCSE), University of Michigan (School of Information), University of Minnesota,
University of Pennsylvania, University of Texas, University of Washington (CSE), University of
Washington (Information School), Virginia Tech
SYSTEMS BUILT OR UNDER DEVELOPMENT:
ALE – Web page link text indexer (with Scott Gifford)
ANSEL – question answering (PI was John Prager from IBM)
Argent – generation system for machine translation
CIDR – topic detection and tracking
Clairlib – generic NLP and IR
GIN – GenesInEssence – fact extraction about genes and proteins
LexRank – graph-theoretical, cosine centrality based summarizer
NewsInEssence – News summarization (www.newsinessence.com)
NSIR – Probabilistic question answering
PROFILE – information extraction
QASM – Natural language query modulation
SUMMONS – multi-document summarization
TUMBL – semi-supervised graph-based machine learning
WapMEAD – a WAP interface to MEAD
WebInEssence – Web-based multi-document summarization
ZEDDOC – Web-based report generation
PRESS COVERAGE:
NewsInEssence has been featured in Wired News, La Stampa (Turin, Italy), Il Giornale (Vicenza,
Italy), L'Arena (Verona, Italy), The Hindu (Chennai, India), The Michigan Daily, The Ann Arbor
News, The University Record, and NPR affiliate WEMU.
The work on Question Answering was featured in TRNews.
The work on LexRank was featured in TRNews.
ACL 2005 interview in WAAM-AM, Ann Arbor, June 27, 2005.
My interview on biases of search engines was cited by Reuters, the Associated Press, San Jose
Mercury News, WJLA-Los Angeles, Wired News, Orlando Sentinel, Rapid City Journal, Boston
Globe, Cleveland Morning Journal, Chicago Tribune, Los Angeles Times, South Bend Tribune, Las
Vegas Sun, Beaufort Gazette, Raleigh-Durham News Observer, Modesto Bee, Washington Post,
South Idaho Press, Augusta Chronicle, Kansas City Star, Sacramento Bee, Jefferson City News
Tribune, Miami Herald, Waterloo Cedar Falls Courier, Seattle Post-Intelligencer, New York
Newsday, WTOP-Washington, Bakersfield Californian, Daytona Beach News Journal, Victoria (TX)
Advocate, Chippewa Falls (WI) Herald, Porterville (CA) Recorder, Ocala (FL) Star Banner,
Worcester Telegram, The Harrisburg (PA) Patriot-News, Toronto Globe and Mail, Times of India
(Mumbai)
SOFTWARE RELEASED:
MEAD: Public-domain multi-document multi-lingual summarizer
(http://www.summarization.com/mead)
BLOG/MAILING LIST ACTIVITIES:
Dr-list (personal mailing list): http://tangra.si.umich.edu/~radev/dr-list/email/date.html
I-list (mailing list about Information): http://tangra.si.umich.edu/~radev/ilist/date.html
ACL-news (additions to the ACL Web site): http://tangra.si.umich.edu/~radev/acl-news-archive/
PROFESSIONAL AFFILIATION:
ACM
ACM SIGIR (Information Retrieval)
ACM SIGMOD (Management of Data)
ACL (Association for Computational Linguistics)
AAAI (American Association for Artificial Intelligence)
LANGUAGES:
English, Bulgarian, French, and Russian: fluent spoken and written
Spanish: advanced level
Italian: intermediate level
German and Japanese: beginner’s level
LAST UPDATED:
September 27, 2006