This document provides students an overview of how to use Yoshikoder to complete a computer-assisted content analysis.
It accompanies other handouts I've uploaded to Scribd and the following blog post: http://mattkushin.com/2014/04/01/applied-research-class-sentiment-analysis-project-reflection/
Shepherd University · Matthew J. Kushin, Ph.D.
Computer Assisted Content Analysis with Yoshikoder
For help with project and in-class activities
Comm 435: Communication Research
What is computer assisted content analysis?
The use of computer software to aid in content analysis. The software counts and reports the
frequency of keywords (the program calls these ‘patterns’) that were entered into a code sheet
(the program calls the code sheet a ‘dictionary’).
Program we’ll use: Yoshikoder (yoshikoder.org), free for Mac and PC, developed for the Identity
Project at Harvard’s Weatherhead Center for International Affairs.
Download: http://sourceforge.net/projects/yoshikoder/
Terms in Yoshikoder:
Preparing for Analysis
1. Dictionary = Code Sheet – a file we create and save that contains all of our categories and
their patterns. This dictionary is used to analyze our text file. We can create it, or we can
import and use an existing dictionary.
2. Category (issue category) – A group of terms that represents an attribute or aspect of
interest.
a. Example: We can create one category called “Tired,” another called “Awake,” etc.
3. Pattern – A pattern is a keyword that fits into a category. Many patterns (keywords) make
up a category. As a simple example:
a. Foreign affairs is our category. The words below are all the patterns (keywords) that
make up foreign affairs. In other words, if any of the patterns appear in our text, the
computer will categorize them in the “foreign affairs” category.
b. So: If the software were to analyze the following Tweet:
i. Last night, Hillary Clinton said she felt great about American foreign
policy
c. How many times would the category “Foreign Affairs” be counted?
i. The answer is: Twice. Once for “Hillary Clinton” and once for “foreign
policy.”
d. Hint: To catch the different forms of a word, enter a * after the word stem. Example:
i. Run* will count runner, running, and runs, but not ran.
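To make the relationship between dictionary, category, and pattern concrete, here is a minimal Python sketch of the counting idea. It is not Yoshikoder's own code (Yoshikoder is a point-and-click program), and the dictionary contents below are hypothetical patterns invented for illustration. The sketch assumes that a trailing * stands for any run of letters after the stem and that matching ignores capitalization.

```python
import re

# Hypothetical dictionary: category name -> list of patterns (keywords).
# These patterns are made up for this example, not taken from a real dictionary file.
DICTIONARY = {
    "Foreign affairs": ["hillary clinton", "foreign policy", "diplomat*"],
    "Exercise": ["run*"],
}


def pattern_to_regex(pattern):
    """Turn a keyword such as 'run*' into a case-insensitive, whole-word regex."""
    # Escape the keyword, then let each * stand for any further word characters.
    escaped = re.escape(pattern).replace(r"\*", r"\w*")
    return re.compile(r"\b" + escaped + r"\b", re.IGNORECASE)


def count_categories(text, dictionary):
    """Return {category: {pattern: number of hits in text}}."""
    counts = {}
    for category, patterns in dictionary.items():
        counts[category] = {
            p: len(pattern_to_regex(p).findall(text)) for p in patterns
        }
    return counts


tweet = ("Last night, Hillary Clinton said she felt great "
         "about American foreign policy")
report = count_categories(tweet, DICTIONARY)

for category, per_pattern in report.items():
    # The category total is the sum of the hits for its patterns.
    print(category, sum(per_pattern.values()), per_pattern)
```

With this hypothetical dictionary, the "Foreign affairs" category is counted twice for the example Tweet, because two of its patterns each appear once.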
Analyzing Data
Once a dictionary has been created or loaded into the program, the file you want to analyze
must also be loaded (go to: Document -> Add Document).
Analysis options are:
1. Content Analysis of your Dictionary (this is what you want to do!) – (Report -> Apply
dictionary -> current document) Shows the number of times every category in your dictionary
came up in the document being analyzed.
a. To see the frequency of each pattern (keyword) within a category, uncheck “show
categories only” at the bottom of the window.
2. Total Word Count – (Report -> Count Words -> Current Document) This shows how
many times each word in the entire document you loaded came up. It doesn’t matter if
the word is in your dictionary or not.
a. Very useful to see the ranking of each word in the file!
3. Highlight – (Highlight -> Highlight Entry) This simply highlights all the keywords
(patterns) in the document you are analyzing. Not super helpful, but you can see where
the words pop up.
4. Concordance – (Concordance -> Make Concordance) This provides keyword-in-context: it
lets the researcher see the words surrounding the keyword and interpret the context in
which the term is used, to better code the data if necessary.
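For students curious about what the Total Word Count and Concordance reports compute, the following Python sketch imitates both on a short made-up sample text. It is an illustration only, not Yoshikoder's implementation. The function names, the simple tokenizing rule, and the three-word context window are all assumptions chosen for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_counts(text):
    """Count every word in the document, whether or not it is in a dictionary."""
    return Counter(tokenize(text))


def concordance(text, keyword, window=3):
    """Return (left context, keyword, right context) for each hit of keyword."""
    tokens = tokenize(text)
    hits = []
    for i, token in enumerate(tokens):
        if token == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, token, right))
    return hits


# Made-up sample document for illustration.
sample = ("I felt tired this morning. After coffee I felt awake, "
          "but by noon I was tired again.")

counts = word_counts(sample)
# Ranking of the most frequent words, like the Total Word Count report.
for word, n in counts.most_common(3):
    print(word, n)

# Keyword in context, like the Concordance report.
for left, hit, right in concordance(sample, "tired"):
    print(f"{left} [{hit}] {right}")
```

Reading the words around each hit is what lets a researcher decide whether a counted keyword was really used in the sense the category intends.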