6. “Most of the prescriptive
rules of the language
mavens make no sense on
any level. They are bits of
folklore that originated for
screwball reasons several
hundred years ago… For as
long as they have existed,
speakers have flouted
them…”
7. “intellectual abdication”
“should be ashamed”
“current around 1900”
“a perversion of
grammatical education”
“blind to textual evidence
even when he himself
exhibits it”
“dishonest and stupid”
“vile little compendium
of tripe about style”
Grammarian
Geoffrey K Pullum on …
“More passives in
Orwell's pompous essay
with the warning about
how you mustn't use
them than in any
periodical you can lay
your hands on!”
8. This usage stuff is not straightforward and
easy. If ever someone tells you that the rules
of English grammar are simple and logical
and you should just learn them and obey
them, walk away, because you're getting
advice from a fool.
http://languagelog.ldc.upenn.edu/nll/?p=2790
15. BYU corpora available
COCA (contemporary Am English)
COHA (historical Am English)
GloWbE (global web English)
Wikipedia
Google Books (BrEng/AmEng)
BNC (British National Corpus)
Hansard (British parliamentary speeches)
Spanish/Portuguese
16. Access to COCA and related BYU
corpora is free…
…but free registration
required for more than
~10 queries a day
18. Other resources derived
from BYU corpora
WordFrequency.info
WordAndPhrase.info
AcademicWords.info
Collocates.info
28. The Corpus Magic
? Any one character
* Any number of characters (incl. 0)
[ ] Lemma (all inflectional forms of a word)
[n*] Part of speech tags (e.g. nouns)
Different corpora use slightly different codes. Read the manual.
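The two wildcard codes can be tried out offline before running a real query. The sketch below is only an illustration, not how the corpus interface works internally: it relies on Python's standard `fnmatch` module, which happens to give `*` and `?` the same meaning. The word list and the helper name `match_wildcard` are made up for the example. Note that square brackets mean something different in `fnmatch` (a character set, not a lemma), so only `*` and `?` carry over.

```python
from fnmatch import fnmatchcase

# Hypothetical word list for illustration only; not real corpus data.
WORDS = ["teach", "teaches", "reaches", "beaches", "each", "preaches", "ed", "used"]


def match_wildcard(pattern, words):
    """Return the words that match a wildcard pattern.

    * stands for any number of characters (including none),
    ? stands for exactly one character.
    """
    return [w for w in words if fnmatchcase(w, pattern)]


# ?each*s: one character, then "each", then anything, then a final "s".
print(match_wildcard("?each*s", WORDS))
# *ed: anything (or nothing) followed by "ed".
print(match_wildcard("*ed", WORDS))
```

Here "each" and "preaches" do not match `?each*s`, because `?` must stand for exactly one character before "each".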
37. You can also
search for idioms: cats and dogs
combine wildcards: ?each*s
search for synonyms: [=pretty]
search for alternatives: car|bike|horse
exclude words from a search: used -car
For more details see:
41. Other questions a corpus can answer
Are there more nouns or verbs ending in -ies?
*ies.[V*] vs. *ies.[N*]
Are there four-letter verbs ending in -ed in the present
tense? ??ed.[VVB]
What are the most common adjectives describing students
vs. pupils? [j*] [student] vs. [j*] [pupil]
What do we say teachers do most often?
[teacher] [vvb]
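To see what a word-plus-tag query such as `*ies.[V*]` is asking for, it can help to mimic it on a handful of hand-tagged words. This is a rough sketch under loud assumptions: the (word, tag) pairs are invented, the tags are simplified stand-ins for a real tagset, and `count_matches` is a hypothetical helper, not part of any corpus tool. Real counts have to come from the corpus itself.

```python
from fnmatch import fnmatchcase

# Hypothetical hand-tagged tokens, for illustration only.
# Tags are simplified stand-ins: nn2 = plural noun, vvz = -s form of a verb,
# vvb = base form of a verb, vvd = past tense of a verb.
TAGGED = [
    ("studies", "nn2"), ("studies", "vvz"), ("parties", "nn2"),
    ("carries", "vvz"), ("cities", "nn2"), ("need", "vvb"), ("used", "vvd"),
]


def count_matches(query, tagged):
    """Count tokens matching a 'wordpattern.[TAGPATTERN]' query.

    Both parts may use * (any number of characters) and ? (one character).
    Matching is done in lower case.
    """
    word_pat, tag_pat = query.split(".[")
    word_pat = word_pat.lower()
    tag_pat = tag_pat.rstrip("]").lower()
    return sum(
        1
        for word, tag in tagged
        if fnmatchcase(word.lower(), word_pat) and fnmatchcase(tag.lower(), tag_pat)
    )


print(count_matches("*ies.[V*]", TAGGED), "verb tokens ending in -ies")
print(count_matches("*ies.[N*]", TAGGED), "noun tokens ending in -ies")
print(count_matches("??ed.[VVB]", TAGGED), "four-letter base-form verbs ending in -ed")
```

In this toy list the only hit for `??ed.[VVB]` is "need", which shows why the query is interesting: a word can end in -ed without being a past tense form.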
42. Corpus, rules, and regularity
http://www.flickr.com/photos/51505078@N00/352492687
pre*
*ed
*ies.[V*]
49. Google as a Corpus
PRO: rare, low frequency usage,
up-to-date usage
CON: no sampling, no frequency
sort, no genre limit, no part
of speech tags
50. Google results counts are only
rough estimates…
http://searchengineland.com/why-google-cant-count-results-properly-53559
Different people searching in different geographic
locations can get different numbers
Sometimes searching for A gives fewer results
than searching for A without B, even though the
second search should return only a subset of the first
54. Be aware of limitations:
sampling, coverage, size,
presence of typos and errors,
bad part of speech tagging
Beware of low frequency
results
Beware of homographs
55. Check results come from
multiple sources
Check KWIC to confirm
relevance
Limit search by genre
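A KWIC (keyword in context) view simply lines up each hit with a few words on either side, which is what makes homographs and irrelevant hits easy to spot. The sketch below shows the idea on a made-up sentence; the function name `kwic`, the `width` parameter, and the example text are all assumptions for illustration, and a real corpus interface produces this view for you.

```python
def kwic(tokens, keyword, width=3):
    """Return one line per hit, with up to `width` words of context on each side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


# Made-up sentence containing a homograph: "bank" of a river vs. "bank" for money.
text = "She sat on the river bank while he went to the bank for cash"
for line in kwic(text.split(), "bank"):
    print(line)
```

Reading the two context lines side by side shows at once that the hits belong to different meanings, which a bare frequency count would hide.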
62. Teacher preparation
find relevant, common examples
prepare worksheets
check for exceptions
find out answers to student
questions about rules and usage
63. Student discovery
show search results to students to
work out rules or word meanings
teach students how to search for
answers to their own questions
ask students to give each other
puzzles for searching
64. For heavy classroom use…
register for
group access
to prevent
spam lockout