Developing Teaching Materials with Authentic Data and Corpus Analysis Tools
1. Developing Teaching Materials
with Authentic Data and Corpus
Analysis Tools
Hongyin Tao
University of California, Los Angeles
tao@humnet.ucla.edu
2. The Project
One of the National Resource Center
projects at the Center for Advanced Language
Proficiency Education and Research
(CALPER), Penn State University
Supported by the US Department of
Education
http://calper.la.psu.edu/chinese.php
3. Goals
This project is developing teaching materials
for advanced learners based on a collection of
authentic examples of contemporary spoken
Chinese.
The materials will highlight some of the
interactive aspects of spoken Chinese including
features such as topic
transition, assessments, repairs, linking, and
acknowledgement.
In addition, the materials will be useful to teach
grammar points, such as the use of the particle
le, from a discourse perspective.
4. BACKGROUND
The vast majority of spoken language teaching materials
available for learners of Chinese are based either on
constructed sentences or on some assumed features of
spoken language. Rarely do they rely on naturally occurring
spoken language.
Discourse linguists, however, have shown that there are
fundamental and systematic differences between written
and spoken language. Hence, examining natural
conversation offers important insights into the workings of
spoken language.
Example: discourse analyses show that spoken Chinese
makes heavy use of ellipsis, verbless constructions,
discourse markers, formulations, backchannels, and so forth.
These are much less frequent in written discourse and,
as a result, are rarely made explicit in second language
instruction.
5. ACTIVITIES (I): Research Database
Collection of conversational Mandarin
Chinese: over 60 hours of recordings,
transcribed into 300,000 words.
The data come from speakers discussing
readings, narrating stories based on past
experience, talking to each other while
playing games, talking about movies,
talking on campus tours, conversing with
each other at dinner parties, talking
while shopping at a farmers' market, etc.
6. ACTIVITIES (II): Materials
A practical guide to selected features of
natural spoken Chinese, which will highlight
important interactive aspects of the spoken
language and features of spoken grammar.
The materials are aimed at students who
have had at least 300 hours of instruction in
Chinese but will also be valuable for
teachers who would like to use them with
somewhat less advanced students.
7. ACTIVITIES (III): Other
Workshops: Intensive workshops have
been conducted for (K-12 and college)
teachers of Mandarin Chinese, informing
them about ways to improve language teaching
with natural discourse data.
Software: Developed a suite of software
tools called A Corpus Worker’s Toolkit
(ACWT) for data processing and analysis.
Publications: Papers and a Chinese Corpus
Resource Guide.
8. Material Development and Use
Select segments from the transcriptions
based on genre types and on linguistic and
discourse-pragmatic features;
Clean up the transcriptions;
Use various text analysis computer
programs to process the data.
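The clean-up step above can be illustrated with a minimal Python sketch. The annotation conventions it strips (a Latin-letter speaker label, timed pauses such as "(0.5)", square brackets for overlap, and double-parenthesis comments) are assumptions chosen for illustration, not the project's actual transcription scheme, and clean_line is a hypothetical helper name.

```python
import re

# Assumed (hypothetical) transcription conventions, for illustration only:
#   "A:"          speaker label at the start of a line
#   "(0.5)"       timed pause in seconds
#   "[ ... ]"     overlapping speech
#   "((laughs))"  transcriber comment
SPEAKER = re.compile(r"^\s*([A-Za-z]\w*)\s*[:：]\s*")
COMMENT = re.compile(r"\(\(.*?\)\)")
PAUSE = re.compile(r"\(\d+(?:\.\d+)?\)")
BRACKETS = re.compile(r"[\[\]]")


def clean_line(line):
    """Return (speaker, text) with annotation marks removed."""
    speaker = None
    match = SPEAKER.match(line)
    if match:
        speaker = match.group(1)
        line = line[match.end():]
    line = COMMENT.sub("", line)    # drop transcriber comments
    line = PAUSE.sub("", line)      # drop timed pauses
    line = BRACKETS.sub("", line)   # keep overlapped words, drop the brackets
    text = re.sub(r"\s+", " ", line).strip()
    return speaker, text


raw = "A: 我 [昨天] (0.5) 去 看 电影 了 ((laughs))"
print(clean_line(raw))
```

Keeping the speaker label separate from the cleaned text lets later steps (word lists, concordances) work on the words alone while still allowing per-speaker analysis.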
9. Structural Features
conditionals; extended coordination;
verbal classifiers; clause linking
devices; inclusive and exclusive
pronominals; complex interrogatives;
ba constructions; bei constructions;
aspect markers; syntactic
constructions (e.g. verb
complements); and relative clauses.
10. Discourse Pragmatic and
Sociolinguistic Features
story opening; disagreement in
strategic forms; topic transition;
making and soliciting assessments;
repairs; checking for clarification;
linking discourse units;
acknowledgement and response to
the previous speaker; expressing
personal emotions and stances; and
evidential marking.
11. Use of Technology
Chinese annotation tools for code
conversion, Romanization, etc.
Tokenization
Vocabulary/Word List
Concordance
Web/hyperlink presentation
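As a rough illustration of the tokenization, word list, and concordance steps listed above, here is a minimal Python sketch. It is not ACWT and does not reflect how that toolkit works: the tiny lexicon, the forward maximum matching segmenter, and the function names tokenize, word_list, and concordance are all assumptions made for the example.

```python
from collections import Counter

# Toy lexicon for illustration only; a real segmenter needs a full dictionary.
LEXICON = {"我们", "昨天", "电影", "看", "了", "然后", "去", "吃饭"}
MAX_LEN = max(len(word) for word in LEXICON)


def tokenize(text):
    """Forward maximum matching: take the longest lexicon word at each position.

    Characters not covered by the lexicon become single-character tokens.
    """
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(MAX_LEN, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in LEXICON:
                tokens.append(piece)
                i += size
                break
    return tokens


def word_list(tokens):
    """Return (word, frequency) pairs, most frequent first."""
    return Counter(tokens).most_common()


def concordance(tokens, keyword, width=3):
    """Return keyword-in-context lines with up to `width` tokens on each side."""
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{token}] {right}".strip())
    return lines


utterance = "我们昨天去看电影了然后去吃饭了"
tokens = tokenize(utterance)
print(tokens)
print(word_list(tokens))
for line in concordance(tokens, "了", width=2):
    print(line)
```

A concordance of this kind puts each occurrence of a form such as the particle le next to its immediate context, which is what makes it usable for teaching grammar points from a discourse perspective.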
13. Summary
Highlight differences between spoken and
written language and different types of
communicative strategies;
Maximize learner (and teacher) exposure to
real language use;
Expose learners to the ‘current’ state of the target
language;
Emphasize both linguistic and pragmatic
competences;
Emphasize both language and cultural
knowledge.