2. PRESENTATION BLURB
Every researcher is a cyborg! Academic researchers engage in various sorts of research in vitro (in the glass) and in
vivo (in the living body), or they engage in experimental laboratory work and analyze data in natural in-world
experiments. In between, many conduct surveys, focus groups, interviews, and other types of research work. In
the computer-assisted qualitative data analysis software (CAQDAS) space, NVivo is one of the foremost tools,
enabling the creation of manual codebooks, multimedia analysis, and various forms of “auto” or unsupervised
machine learning. NVivo works as a “database” for structured and unstructured data (multimedia). It enables the
drawing of content from various social media sites. Technologies augment human analytical capabilities, in the
qualitative and quantitative research spaces. This presentation demonstrates some of the capabilities of NVivo.
It also addresses how a researcher is changed by the computational capabilities they harness.
3. DEFINITION: CYBORG
Cyborg: “A fictional or hypothetical person whose physical
abilities are extended beyond normal human limitations by
mechanical elements built into the body”
Oxford English Dictionary (2022)
5. IN VITRO VS. IN VIVO
In vitro (in glass)
Research that may be conducted “in glass” test tubes
in laboratories
More typical in the so-called “hard” sciences
In vivo (in living body)
Research that may be conducted based on “in living
body” or in-world natural experiments (based on
observables, scraped data from real life)
More typical in the so-called “soft” sciences
NVivo
6. MIXED METHODS VS. MULTIMETHODOLOGY RESEARCH
Mixed methods research
A combination of qualitative and quantitative “data,
methods, methodologies, and / or paradigms in a
research study or set of related studies”; a type of
multimethodology research (Multimethodology, Jan.
25, 2022)
Multimethodology research
Use of “more than one method of data collection or
research in a research study or set of related
studies” (Multimethodology, Jan. 25, 2022)
7. SOME EXAMPLES OF MULTIMETHODOLOGY (AND MIXED METHODS)
RESEARCH
An experimental intervention (quant) and a follow-up online survey (qual) (sequential multimethod research)
A program performance audit based on documentation and data (quant) and interviews (qual) (multi-sourced
data, multimethod research: content analysis, interviews)
A simulation study (quant) combined with social data and social network analysis (qual) (multimethod and multi-
sourced data)
Medical trials of a new drug (quant) along with long-term participant health data and surveys (qual) (multimethod
research)
Longitudinal research combining laboratory-based health data (quant) and surveys (qual) (multimethod research)
Autoethnography or ethnography (qual) studied in the context of external population data (quant) (multi-sourced
data of both types)
8. SOME EXAMPLES OF MULTIMETHODOLOGY (AND MIXED METHODS)
RESEARCH (CONT.)
Scientific research in the lab (quant) combined with external focus groups (or interviews or surveys) (qual)
(multimethod and mixed method research)
A quasi-experimental learning intervention (quant / qual) with assessment of grade data (quant)
Learning management system (LMS) data at scale (quant) combined with student surveys (qual) (mixed data)
Social media data (quant) combined with e-Delphi method study (qual) (mixed data)
Student grades (quant) and student survey responses (quant / qual) (mixed data)
Online-based interviews (qual) and sensor data (quant) (multimethods, mixed data sources)
9. SOME EXAMPLES OF MULTIMETHODOLOGY (AND MIXED METHODS)
RESEARCH (CONT.)
An oral history project (qual) with computational text analysis (quant / qual) with demographic data (quant)
(mixed data)
Mapping the state of a nation’s research by bibliometrics (quant / qual) and demographic analysis (quant) and
interviews (qual) (multimethod and mixed method)
And innumerable other variations
10. CAQDAS: COMPUTER-ASSISTED QUALITATIVE DATA ANALYSIS
SOFTWARE
Computer-assisted qualitative data analysis software (CAQDAS)
Includes a wide range of software programs, including…
NVivo
[data exploration with word frequency counts, text searches; matrix queries; qualitative cross-tab analysis; compound queries; coding
queries; coding similarity analysis; manual coding; codebook export; memo export; reports export; machine learning: topic modeling (with
human researcher in-the-loop), sentiment analysis, speaker coding from transcripts, style coding, “NV” coding based on manual codebook;
data visualizations; manual model drawing; automated model drawing, and others]
[runs on Windows, Mac, and servers] [some differing capabilities]
11. A LIGHT COMPARISON / CONTRAST BETWEEN QUANT AND QUAL
APPROACHES
Quantitative research
Epistemological approaches (ways of knowing, ways
of making meaning)
Assumption of objectivity and absolutism, normal curve
to represent populations
Striving for high-rigor and reproducible research
Practical and applied, problem-solving; theoretical
relevance and implications
Qualitative research
Epistemological approaches (ways of knowing, ways
of making meaning)
Assumption of subjectivity and relativism on the part of
researchers
Striving for rich data (coded to saturation)
Practical and applied, problem-solving; also theoretical
relevance and implications
12. A LIGHT COMPARISON / CONTRAST BETWEEN QUANT AND QUAL
APPROACHES (CONT.)
Quantitative research
Experimental research
Gold standard is experimental research
Lab-based
Field-based
High-precision measures, highly defined research
methodologies, high rigor
Qualitative research
Natural experiments, field observations,
Data elicitations through focus groups, interviews
(structured and semi-structured)
Valuing of voice
Informants based on positionality
All content has data value: content analysis, gray
literature, metadata
13. A LIGHT COMPARISON / CONTRAST BETWEEN QUANT AND QUAL
APPROACHES (CONT.)
Quantitative research
Reliance on statistical analysis, descriptive statistics, other
statistical methods, deductive logic
Independent variable(s), dependent variable(s)
Controls for potential other influences (noise)
Evaluate whether p-values justify rejecting null hypotheses
Use randomization for seating panels, participants, and so on
Can go with convenience samples, can go with snowball
sampling, and others, but these are weaker sampling methods,
with room for biasing
Require “power” in terms of numbers for representation
Qualitative research
Reliance on researcher expertise, thematic (and other) coding,
statistical methods
Can learn from small datasets
Can learn from an n = 1
Can learn from individual cases / case studies / groups of cases
Can make case for a construct based on coding similarity
analysis (using Cohen’s Kappa, Kappa Coefficient)
Usually a range of .6 to .8 where 1.0 is full agreement of what
is relevant and what is not relevant in the coding
Need to avoid “reification” (assuming an abstraction has
instantiation in concrete reality), “hallucinated” senses of reality
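The Kappa coefficient mentioned above can be illustrated with a minimal sketch. The two coders' decisions below are hypothetical (1 = passage coded to the node, 0 = not coded), and the unit of comparison that NVivo itself uses in a coding comparison may differ from this simple item-by-item version.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders' labels over the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(coder_a)
    # Observed agreement: share of items where both coders chose the same label.
    p_observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement, estimated from each coder's own label frequencies.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    p_expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both coders used the same single label throughout
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical coding decisions for ten passages
coder_a = [1, 1, 0, 1, 0, 0, 1, 0, 1, 1]
coder_b = [1, 0, 0, 1, 0, 1, 1, 0, 1, 1]
kappa = cohens_kappa(coder_a, coder_b)
print(round(kappa, 3))
```

Here the coders agree on 8 of 10 passages, but because some of that agreement would be expected by chance, the coefficient lands just under the .6 to .8 range noted above.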
14. A LIGHT COMPARISON / CONTRAST BETWEEN QUANT AND QUAL
APPROACHES (CONT.)
Quantitative research
Experimental reproducibility and repeatability
Generalizability if certain standards are met
Qualitative research
Not striving for generalizability but for patterns and
insights
No assumption of being able to totally recreate a
prior qualitative study
May do follow-on studies with the “same” population
15. A LIGHT COMPARISON / CONTRAST BETWEEN QUANT AND QUAL
APPROACHES (CONT.)
Quantitative research
Assumption: Interchangeability of similarly trained
researchers
Integrity required
Complex skills required
Qualitative research
Assumption: A non-interchangeability of researchers
Celebration of the researchers’ unique interpretive lens
Uniqueness of researchers as a strength
Willingness to challenge status quo, cultural understandings;
be transgressive and revolutionary
Openness to novel experiences
Ability to take on challenging work
Control of own cognitive biases, perceptual slants,
preferential thinking
16. A LIGHT COMPARISON / CONTRAST BETWEEN QUANT AND QUAL
APPROACHES (CONT.)
Quantitative research
Work can be challenged:
Repeat of the experimental research but with new data
Finding of errors in the original handling and / or
analysis of the data
Unclear evidentiary chains
Finding of logic errors
Poor methodologies
Identification of research or other fraud
Qualitative research
Work can be challenged:
Finding of incorrect application of logic or theory
Insufficient richness of data
Researcher biographical bias
Poor methodologies
Identification of research or other fraud
17. SOME COMPUTATIONAL DIFFERENCES
Quantitative research
Statistically significant data patterns through…
Cross-tab analysis
Factor analysis
Principal components analysis
Cluster analysis
Network analysis, social network analysis, word networks,
related tags networks, and others
And others
Qualitative research
Focus on natural language analytics
Spoken, written, mixed
Various genres and forms
Harnessing of multimedia, gray literature, various “found”
contents, and others
Data elicitations using computational means
Data patterns through…
Topic modeling
Sentiment analysis
Predictive analysis
Qualitative cross-tab analysis
Text and data mining
18. SOME COMPUTATIONAL DIFFERENCES (CONT.)
Quantitative research
Machine learning
Supervised machine learning
Unsupervised machine learning
Data modeling from machine learning based on training
data (such as for predictive analytics, with automated
creation of confusion matrices and f-scores)
Artificial intelligence (AI)-based “experiential” learning
High performance computing with big data and big data
streams
Qualitative research
Can be applied at scale now
Can “remember” unique coding fists of unique
researchers and apply their coding computationally
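The confusion matrices and f-scores mentioned above can be worked through by hand. The sketch below is illustrative only: the labels are hypothetical, and it computes the F1 variant of the f-score from a two-class confusion matrix rather than reproducing any particular tool's output.

```python
def confusion_counts(actual, predicted, positive):
    """Return (tp, fp, fn, tn) for one class treated as the positive class."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and their harmonic mean (F1)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

# Hypothetical human labels versus model predictions
actual = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg"]
predicted = ["pos", "neg", "neg", "pos", "pos", "neg", "pos", "neg"]
counts = confusion_counts(actual, predicted, positive="pos")
precision, recall, f1 = precision_recall_f1(*counts[:3])
print(counts, precision, recall, f1)
```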
19. COMPUTATIONAL INSTRUMENTATION / TOOLS
AND DIGITAL RESOURCES
Quantitative research
Software programs, code, script, macros
Curated datasets
Datasets
Data models
Connected script and datasets
Survey instruments
Interview instruments
Qualitative research
Manual codebook creation, automated codebook
creation (both coded to saturation)
Created with top-down coding (based on theory or
framework or model, or some combination; based on pre-
determined research questions; based on a priori
hypothesizing); bottom-up coding (grounded theory); both
top-down and bottom-up coding
Codebooks named and often with easy-reference acronyms
.qdc format for digital codebook sharing and heritability,
Microsoft Word or LaTeX formats for appendices
20. COMPUTATIONAL INSTRUMENTATION / TOOLS
AND DIGITAL RESOURCES (CONT.)
Quantitative research
Research journals
Field notes
Rubrics
Matrices
Checklists
Qualitative research
Coding dictionaries
Software programs
Curated datasets
Research journals
Field notes
Rubrics
Matrices
Memos
22. TYPICAL RESEARCH DATA SHARING PRACTICES
Quantitative research
Full dataset (into perpetuity, at the time of publication)
Data exploitable (in a constructive sense) for other
analyses (but need to cite the creator of the dataset)
The code used to interact with the data and to create
data visualizations
Sometimes derived or “shadow” datasets
De-identified data (privacy protections)
Canonical collections (image sets, video sets, others) for
further study
Clear evidentiary chains
Qualitative research
Project files sometimes
De-identified data (privacy protections)
Instruments may be shared, like codebooks (manual and
digital) and computer programs
Partnerships more rare than in quant research teams
23. (IN)CONCLUSIVENESS OF RESEARCH FINDINGS?
Quantitative research
Convergence towards a consensus
Not fully definitive for all time (may be overturned at
any point with new research)
No absolute “proof” in most cases but leaning in
certain directions
Even paradigms shift
Reproducibility of computational outcomes given the
same dataset and the same queries or autocoding
processes
Qualitative research
Never an absolute last word, but a momentary
provisional observation for a particular point-in-time
No absolute “proof” in most cases
Even paradigms shift here, too
Reproducibility of computational outcomes given the
same dataset and the same queries or autocoding
processes
24. HUMAN SUBJECTS RESEARCH AND STANDARDS
Professional ethics and regulations / laws that protect the following and more:
IRB (institutional review board) oversight prior, during, and post research
Non-use of duplicity except in rare approved-by-IRB cases
Research value
Legally procured data
Research subjects’ well-being
Informed consent for research subjects, ability to withdraw from the research at any time
Research subjects’ privacy
Data preservation
26. SELECTED QUALITATIVE RESEARCH PRECEPTS
It helps to have broad and general knowledge along with in-depth focused knowledge. All knowledge can inform
the work.
There are data everywhere. Everything is datafy-able.
Everything is culturally informed. Everything is seen through a cultural lens. It helps to be aware of culture, one’s
own and others’.
The data source may be anywhere from [raw and “found” in-world] to [refined, edited, vetted, and “worked
through”].
27. SELECTED QUALITATIVE RESEARCH PRECEPTS (CONT.)
With some work, data may be transcoded to information.
All human creations have potential informational value:
formal published work, gray literature (brochures), private letters, cultural artifacts, artworks, commenting on social media,
building designs, stamps, candy wrappers, private collections of anything, etc.
The informational value may differ based on research context and researcher interests. Different researchers will
extract different meanings from the same dataset.
28. SELECTED QUALITATIVE RESEARCH PRECEPTS (CONT.)
All researchers are subjective. They have built-in biases. They need to be self-aware and control for their own
biases in order to conduct effective research.
In their work, they need to report on their biases and how they mitigate their biases.
Different researchers approaching a particular topic will likely take different approaches and emerge with different findings
(to a degree).
Researchers have their own “coding fists” or “coding hands.” They identify relevant data differently. They create
different coding categories, and these categories may be mutually exclusive or not. They may engage greedy or
frugal coding (whether a coded object can be coded more than once and in different categories).
Computation enables researcher “coding fists” to be preserved and re-used into the future.
29. SELECTED QUALITATIVE RESEARCH PRECEPTS (CONT.)
The research findings are not about generalizing to a population per se but about surfacing relevant insights.
Researchers strive to see differently. One of their “superpowers” is in re-interpreting, at various levels: micro
(ego), meso (group, entity), and macro (larger systems).
Researchers work across cultures and contexts. They are able to disengage from the context in order to view the situation
analytically.
Values (stated and implied) are an integral part of the research.
Qualitative researchers do not assume that the status quo is all as it should be. In qualitative research, advocating
for social change and equity and justice is considered a professional responsibility.
Studies may be disciplinary or interdisciplinary.
30. SELECTED QUALITATIVE RESEARCH PRECEPTS (CONT.)
CAQDAS tools support the human researcher.
The human researcher is foremost in the research and is not displaced by the
technology. However, the human is changed by using technologies, too.
One graduate student wanted to use autocoding alone for her master’s thesis,
without bringing her own expertise to bear. Not a good idea… Unless you can
create the code for the data analytics informed by your knowledge, a generalized
software tool will output generalized insights.
CAQDAS enables scalability of various types of computational analytics. For
example, a human-created manual codebook in NVivo can be applied to a
larger dataset and coded with a Cohen’s Kappa coefficient of 1.0. (albeit in a
machine sensibility, not a human one)
31. TEMPORAL LEGACIES OF QUALITATIVE RESEARCHERS
A lifetime body of work
Particular research works
Unique or powerful contributions to particular
insights, theories, practices, and others
Coining of new terms
Originating new research methods, how research is
operationalized
Research instruments
Ability to reach the lesser-reached
Language skills
Professional and other affiliations
Professional collaborations
Personality, persona, charisma
Promotion of social change, advocacy for certain
values
Effective funding and uses of available resources
Style(s) and aesthetics
And others…
34. WHAT IS “CODING” ANYWAY?
Manual coding
Reading collected data (transcripts, articles, maps,
audio, video, photos, etc.) and identifying elements of
interest and coding them to a codebook in natural
language
Organizing the codebook in a rational order, with
child nodes, grandchild nodes, great grandchild
nodes, etc. (structured codebook)
Automated coding
Distant reading by machine using the following:
Word counts
Algorithmic topic extraction
Application of sentiment dictionaries to text at varying
levels of granularity (sentences, paragraphs, or data
cells…depending on the formatting of the textual data)
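Two of the "distant reading" steps listed above, word counts and the application of a sentiment dictionary at sentence granularity, can be sketched in a few lines. The stopword list, the sentiment dictionary, and the sample responses are all hypothetical stand-ins; this is not how NVivo implements autocoding.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny word lists for illustration
STOPWORDS = {"the", "a", "and", "was", "but", "of", "to", "is", "i"}
SENTIMENT = {"helpful": 1, "clear": 1, "great": 1,
             "confusing": -1, "slow": -1, "poor": -1}

def tokenize(text):
    """Lowercase the text and pull out runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())

def word_frequencies(text):
    """Count every token that is not a stopword."""
    return Counter(t for t in tokenize(text) if t not in STOPWORDS)

def sentence_sentiment(text):
    """Score each sentence by summing dictionary values of its words."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [(s, sum(SENTIMENT.get(t, 0) for t in tokenize(s))) for s in sentences]

responses = ("The training was clear and helpful. "
             "The software was slow. "
             "Support was not helpful.")
freqs = word_frequencies(responses)
scores = sentence_sentiment(responses)
print(freqs.most_common(3))
print(scores)
```

A plain dictionary lookup like this one scores the third sentence as positive, because it matches "helpful" and ignores "not". That is one reason a human researcher needs to review autocoded output.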
35. WHAT IS A CODEBOOK ANYWAY?
A basic codebook contains the following: coding nodes (classifications of codes) and descriptions for each node
so that coders understand what information belongs in that classification
A codebook may be hierarchical, with top-level nodes, child nodes, grandchild nodes, and so forth
The nodes may be sectioned based on topics. They may be sectioned in alphabetical order. They may be ordered
with leading 0s. There are many accepted ways for the ordering of the codes.
In table format, a codebook looks like the following (in the simplest construct):
Codes | Descriptions of Coding Categories
36. WHY SHOULD A CODEBOOK HAVE A NAME?
A codebook should have a name that describes what is coded by that codebook. The foci and discipline should
be identifiable.
A codebook name should have a clear acronym, for easy reference.
A codebook needs a name so that it is easily citable by other researchers.
A codebook needs a name so that researchers can credit the original codebook instrument creator when they
use the codebook…or when they create a module to add to it, etc.
A codebook should have a name because of how it is used in the research and academic space.
A codebook shows a culmination of expertise…and expert interactions with a sufficient amount of relevant data.
37. THINK IN SEQUENCING
What are data patterns in a particular set of core data files? (word frequency counts, text searches, topic
modeling, and others)
What are proxemic terms around particular names, dates, labeled phenomena, symbols, and others? (proximity
searches)
Who are the different individuals who responded to the survey / focus group / interviews based on demographic
data? Based on topics of interest? Based on general sentiment? (classification sheets w/ demographic
information and case nodes, topic modeling, sentiment analysis, and others)
What are features of the created manual coding? Automated coding? (matrix coding queries)
In a sentiment analysis of a social network’s discussion, what topics are seen in the most positive sentiment?
Which topics are seen in the most negative sentiment? (sentiment analysis, topic modeling)
38. THINK IN SEQUENCING (CONT.)
In conducting a review of the literature, a large number of files have been downloaded from various subscription
and open-source web-facing databases. The research is focused on a particular subset of articles. The researcher
does not want to read all the articles. How can the researcher home in on the particular works of interest?
(topic modeling by article set; topic modeling by titles and abstracts; word searches in the database of articles)
39. THINK IN SEQUENCING (CONT.)
In a geographical analysis of responses, what are topical and sentiment patterns and attitudes? (classification
sheets, geographical modeling, topic modeling, sentiment analysis)
In creating a team’s consensus codebook, based on collected .nvp and .nvpx project files (or even server files),
how do the various human-generated manual codebooks differ? What are the outlier ideas? (event logging for
objective record of individual researcher / coders and contributions, matrix coding query, coding comparison for
Kappa coefficient, transcoding of project files from-to Windows, Mac, server)
In the “use existing coding patterns” in which “NV” (the software) codes by emulating the human-generated
codebook, various individuals’ and teams’ coding fists are emulated computationally…to scale…to computational
speeds. This enables preservation of people’s points-of-view and coding patterns. What are ways to ensure that a
codebook is coded to saturation, since this feature does not add any new nodes (coding classifications)? (use
existing coding patterns)
40. THINK IN MULTIPLE NVIVO FILES
When working on a large-size or longitudinal project (including doctorate degrees), use a number of files to
achieve your aims.
Make a file for the review of the literature. Make a file for the focus group. Make a file for the fieldwork. Make a file for
the social video analysis. Make a file for the analysis of the geographical maps.
Combine data only when you need to run data queries and / or autocoding on the particular set of information.
Do not clump everything into a large file unless your queries require access to all the included data.
Always have a backup set of files in the cloud or in multiple physical locations (so as not to lose work
accidentally).
41. THINK IN MULTIPLE LANGUAGES
NVivo enables coding in a number of languages:
simplified Chinese
English (US)
English (UK)
French
German
Japanese
Portuguese
Spanish
UTF-8 and UTF-16 enable representations of all languages on the Web and Internet.
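To make the encoding point concrete, the sketch below encodes the same idea ("coding") in a few languages and checks that each encoding round-trips without loss. The sample words are arbitrary choices for illustration.

```python
# Arbitrary sample words in several scripts
samples = {
    "English": "coding",
    "Spanish": "codificación",
    "Japanese": "コーディング",
    "Chinese": "编码",
}

sizes = {}
round_trips = True
for language, word in samples.items():
    utf8 = word.encode("utf-8")
    utf16 = word.encode("utf-16-le")  # little-endian, no byte-order mark
    # Decoding must return exactly the original text
    if utf8.decode("utf-8") != word or utf16.decode("utf-16-le") != word:
        round_trips = False
    # (characters, UTF-8 bytes, UTF-16 bytes)
    sizes[language] = (len(word), len(utf8), len(utf16))

print(sizes)
print(round_trips)
```

The byte counts differ by script (UTF-8 is more compact for English, UTF-16 for the Chinese sample), but both encodings preserve the text.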
42. THINK IN TEAMING
Aim for wide dissensus when originating a team codebook, so that the widest variety of ideas may be captured
initially before there is convergence to a consensus codebook
Aim for narrower consensus when training a team to use a defined codebook on defined data for a sufficiently
high Cohen’s Kappa / Kappa coefficient to establish the validity of a construct
44. ABOUT NVIVO
NVivo is a qualitative data analytics software tool that acts like a database (that enables the storage of structured
and unstructured data, the running of queries, the interaction with data, the drawing of data visualizations, the
export of reports, and so on)
The prior version of the software was known as NUD*IST (1981 – 1997), and N4 to NVivo from 1997 to
present (“NVivo,” Oct. 14, 2021)
NUD*IST stood for “Non numerical Unstructured Data Indexing Searching and Theorizing software”
45. BASIC SELECTED ANALYTICAL CAPABILITIES OF NVIVO INCLUDE…
Exploration of data
Word frequency count
Text search with various parameters
Similarity cluster analysis
Coding analysis
Matrix coding
Qualitative crosstab analysis (with case nodes and
classification sheet data)
Coding comparison (with Cohen’s Kappa / Kappa
coefficient)
Compound queries
Group queries
Locational geographical mapping from social media
data
Ego neighborhood mapping for following network in
directed graphs
Various data tables
Various data visualizations (dendrograms, treemap
diagrams, word trees, ring lattices, cluster diagrams
2d, cluster diagrams 3d, and others)
46. BASIC SELECTED ANALYTICAL CAPABILITIES OF NVIVO
INCLUDE…(CONT.)
Autocoding from data (various forms of machine
learning)
Topic modeling (“distant reading” of texts and
extraction of topics)
Sentiment analysis
Coding by style
Coding by name in transcript
Use of existing coding patterns (machine copies human
manual codebook to dataset scale and computation
speed)
Autocoding from survey downloads (to case nodes
and topic modeling and sentiment analysis)
47. SOME SCREENSHOTS FROM AN ORIGINAL DEMO
PROJECT
FROM NVIVO 12 (ONE VERSION PRIOR TO LATEST) AND NVIVO (LATEST VERSION) ON WINDOWS
59. SOME ANALYTICAL APPLICATIONS OF NVIVO IN THE RESEARCH
LITERATURE
Manual coding of various research data and the extraction of manual codebooks (based on a variety of target
topics)
Reproducible / repeatable autocoded topic modeling to compare against human coding
Autocoded sentiment analysis (positive or negative sentiment) of text sets
Respondent profiling by topics of focus and sentiment
Codebook analysis (analysis of the code, whether manual or autocoded or combined)
Qualitative cross-tab analysis for data patterns of respondents by various attributes (demographic and others)
Social media data extractions (tweetsets from a microblogging site, poststreams from a social networking site,
social video from a social video sharing site with comments, and so on)
61. STRUCTURED VS. UNSTRUCTURED / SEMI-STRUCTURED DATA
Structured data
Labeled data in data tables
Each value in a cell is labeled by the column header and
the row header
Each value in a cell is identified by type of data with
attendant features
Unstructured, semi-structured data
Text
Imagery
Audio
Video
Multimodal, multimedia-based
* The argument for “semi-structured” vs. “unstructured”
is that there is no absolutely unstructured data unless
it’s randomness (even pseudo-randomness is not fully
unstructured). Natural language has an inherent
structure. Ditto storytelling, audio, video, and so on.
62. TYPES OF USABLE DATA IN NVIVO
Text files (incl. pdf)
Image files (maps, screenshots, photos, diagrams, and
others)
Audio files
Video files
Survey data (Qualtrics, Survey Monkey)
Web bibliography sources
Online notetaking sites
Email message and identity data
Excel workbooks
SPSS datasets (note the tie to quant methods from a
qual analytics tool…and vice versa)
NVivo projects (for team collaborations)
.qdc codebooks, .docx codebooks
(code category names and descriptions for what goes into
each category, not exemplars within the categories)
NVivo memos, NVivo reports, and others
* For multimedia, there have to be text equivalencies for
the imagery, audio, and video (transcripts)
63. USING CLEAN DATA
To have clean data, select the files purposefully.
Take out personally identifiable information (PII).
Ensure that metadata does not carry sensitive information (in the imagery, in the text files, in the video files, etc.)
Do not digitally annotate the files before you ingest those into the NVivo project, or you’ll have introduced noise
into your data. (If you want to annotate files, do so, but keep those files separate from the pristine ones that will
be ingested into the .nvp or .nvpx files.)
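One part of removing PII can be sketched as a pattern-based redaction pass run on text before it is ingested. The two patterns and the sample sentence are illustrative assumptions; they catch only simple email and phone formats.

```python
import re

# Illustrative patterns only; real data will contain formats these miss
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")

def redact(text):
    """Replace matched email addresses and phone numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)

sample = "Contact Pat at pat.lee@example.org or 555-010-1234 after the interview."
clean = redact(sample)
print(clean)
```

Note that the name "Pat" survives the pass. Names, places, and other indirect identifiers still need human review.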
64. ENSURING USABLE DATA TABLES AND DATA VISUALIZATIONS
OUTSIDE OF NVIVO
It helps to export data tables and data visualizations from NVivo unless you have a perpetual license and can assume that
the software will always be available. NVivo is proprietary software, and you will need a version of the software to
open NVivo files.
An older version of NVivo cannot open newer .nvp or .nvpx files. Upgrading files will mean that the upgraded version of the
software is needed.
Record the exported contents clearly with consistent file-naming protocols. Record the parameters used to extract the data table
or data visualization. [Review the data. Do “sanity checks” of the data before exporting and saving.]
Just make sure that you have clear naming protocols for any exported data, so you know what you’re looking at when
you access the files later. Document the parameters you use to run various data queries and machine learning
sequences, so you can represent them clearly in a publication or presentation. [The data analytics process is sequence-
sensitive. The order of operations affects data at each step and the ultimate outcomes. Error introduced at any one
point potentially amplifies.]
Any raw data you ingest into an NVivo project should also be stored externally to the project as well, so they are
available for reference external to the project file.
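One possible way to follow the naming and documentation advice above is to generate export file names from the same fields every time and to write the query parameters to a small sidecar record. The field order, the project name, and the parameter names below are hypothetical, not NVivo settings.

```python
import json
from datetime import date

def export_name(project, query_type, label, run_date, ext):
    """Build a consistent file name: project_query_label_date.ext."""
    parts = [project, query_type, label, run_date.isoformat()]
    slug = "_".join(p.strip().lower().replace(" ", "-") for p in parts)
    return f"{slug}.{ext}"

def parameter_record(file_name, step, parameters):
    """Serialize the step number and query parameters alongside the file name."""
    record = {"file": file_name, "step": step, "parameters": parameters}
    return json.dumps(record, indent=2, sort_keys=True)

# Hypothetical export from the first step of an analysis sequence
name = export_name("Oral History", "word frequency", "top 100",
                   date(2022, 3, 1), "xlsx")
record = parameter_record(name, 1, {
    "min_word_length": 4,          # illustrative parameter names
    "grouping": "exact matches",
    "stopwords": "default list",
})
print(name)
print(record)
```

Recording the step number matters because the process is sequence-sensitive: the record shows the order in which queries were run as well as their settings.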
65. DATA EXTRACTIONS FROM SOCIAL MEDIA
USING NCAPTURE (WEB BROWSER ADD-ON TO GOOGLE CHROME)
66. SOCIAL MEDIA DATASETS
Profiling social groups and sub-groups
Capturing a sense of mass mood / sentiment around particular topics
Identifying the most high-degree social nodes in a social network; mapping the social network to understand
dynamics
Mapping http networks from social media
Analyzing social images on social media (for content, for sentiment, for identified peoples, and others)
Identifying synthetic persons (‘bots)
Identifying general geographical locations of respective linked social accounts
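Identifying the highest-degree nodes in a social network, as listed above, comes down to counting connections per account. The sketch below uses a hypothetical edge list and treats each tie as undirected. A directed analysis would count in-degree and out-degree separately.

```python
from collections import Counter

def degree_counts(edges):
    """Count how many ties touch each account (undirected degree)."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    return degree

# Hypothetical ties between accounts (for example, replies or mentions)
edges = [
    ("ana", "ben"),
    ("ana", "cy"),
    ("ana", "dee"),
    ("ben", "cy"),
    ("eve", "ana"),
]

degree = degree_counts(edges)
top_account, top_degree = degree.most_common(1)[0]
print(top_account, top_degree)
```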
67. ABOUT SOCIAL MEDIA DATA AND NCAPTURE
NCapture works on Google Chrome (and the aging-out unsupported Internet Explorer / IE web browser)
Various social media platforms are not supported in IE now, so using NCapture does not enable access to the
various platforms, like Facebook / Meta
Developers are working on a bridge to Facebook via Chrome, but that has not been available for many months
Access to social media data is rarely an n = all (without paying for the data from the social media platform
provider or a third-party source)
Given dynamism in the space (due to various dependencies and other factors), if you have a chance to collect the
social data, do so. Do not assume that the chance will always be there.
Take a screenshot of the landing page of the social account you’ve profiled, so you have the “state” of the account
at the time of the data capture. (You may have to go deeper for more than summary data.)
You can scrape images using third-party web browser add-ons to capture that data in thumbnail format.
68. LIMITS TO COMPUTATIONAL TEXT ANALYSIS IN NVIVO (IMHO)
Some limits:
may view words as individual unigrams and not bigrams, trigrams, four-grams, etc.; does not capture phrases
may have an insufficient stopwords list
does not understand negatives
does not understand humor
does not understand irony
does not understand external referents to a text or text corpus
machine logic and not human logic in “use existing coding patterns” machine emulation of human coding
is limited by human manual coding when using “use existing coding patterns”
does not capture an n = all in social media accounts with NCapture (unless the accounts have a limited amount of Tweet or
poststream contents) … given API (application programming interface) limits on the various social media platforms
There are commercial ways to acquire n = all datasets, but these require queries run online on “big data” datasets (using different data query methods like
versions of structured query language)
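The first limit above, counting single words rather than phrases, can be seen by comparing unigram and bigram counts on the same text. The sentence below is made up for illustration, and the code is a generic sketch rather than a description of NVivo's internals.

```python
import re
from collections import Counter

def ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

text = ("machine learning supports qualitative research "
        "and machine learning scales coding")
tokens = re.findall(r"[a-z]+", text.lower())

unigram_counts = Counter(ngrams(tokens, 1))
bigram_counts = Counter(ngrams(tokens, 2))
print(unigram_counts.most_common(2))
print(bigram_counts.most_common(1))
```

The unigram counts report "machine" and "learning" as two separate frequent words. Only the bigram counts surface "machine learning" as a phrase.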
70. INDIVIDUALS AS RESEARCH INSTRUMENTS…
An individual researcher is a “research instrument”.
A group of researchers is a collaborative “research instrument”.
Researchers are sentients, and their aperture and vision and methods inform their power and capability.
Their social connections are part of their power and capability.
Their positionality—which they can change—affects what they have access to and what they can achieve.
71. CYBORGS HAVE MORE SKILLS TO DEPLOY
Building up new knowledge and new skills in the computational space can extend the
researcher’s capabilities.
A “cyborg” is a “bionic” personage, who is both flesh and machine (as technical enhancement).
CAQDAS and other data analytics software have a forcing function: they force a researcher to get more precise
and to explicate and to explore and to ultimately form a sense of the target research topic.
Technologies change the researcher. [Some see this as a strength. Others see this as a threat. People choose how to
wield and apply certain tools.]
72. THE “BIONICS” FOR THE RESEARCHER INCLUDE THE FOLLOWING…
Technologies enable the capture of otherwise-inaccessible data, in vitro, in vivo, and in cyber. They extend collection.
Technologies enable various forms of discovery and exploration of the available data. Technologies enable rich review of
data. They extend perception.
Technologies enable permanent archival of data. They extend memory.
Technologies enable expanding askable research questions…and the testing of various hypotheses. They extend
asking and hypothesizing. They extend thinking.
Technologies can enable complementary insights to those attained by manual methods alone. They extend
learning. They extend conceptualization.
73. “BLACK BOX” ELEMENTS IN RESEARCH AND DATA ANALYSIS
There are some “black box” elements to both human and
computational machines and methods.
For as much effort as goes into transparency in research and
computational sequences, there are “inexplicables” (as in ANNs
and how neural networks process data, although computer
scientists are getting closer to some explanations; as in human
intuitions and cognitive leaps; as in workings of the human
subconscious and unconscious).
Then again, not everything has to be fully understood and
explicated.
75. GETTING STARTED WITH CAQDAS
CAQDAS is Computer-Assisted Qualitative Data Analysis Software.
Study how computation is applied to various qualitative data analytics challenges…and what may be asserted from
various analytics.
Explore the available software tools and their respective capabilities.
Decide which software programs provide the capabilities you will use.
Decide which software tools have a comfortable user interface.
There are some free software tools available, if you’re comfortable with the command line (vs. a graphical user interface).
Go for it! Start slow. Be nice to yourself. Be nice to others. Build the skillset. Share your knowledge, skills, and
abilities (KSAs).
76. CONTACT
Dr. Shalin Hai-Jew
ITS
Kansas State University
shalin@ksu.edu
785-532-5262
Gentle caveat: This presentation uses one software tool to bridge to CAQDAS. There are many other tools and
methods and capabilities…
Resource: Using NVivo: An Unofficial and Unauthorized Primer
77. EXTRA: AND A DIFFERENT FAVORITE CAQDAS TOOL: LIWC-22
LIWC-22*
[validated instrument trained on a number of natural language datasets; measures psychometrics, linguistic features, and sentiment;
provides four scale scores related to (1) analytical thinking, (2) clout, (3) authenticity / warmth, (4) emotional tone (sentiment); and other
foci; enables custom dictionaries focused on various objectives; has versions in a number of different languages]
[related to a variety of insightful research]
[runs on Windows]
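The custom dictionary idea noted above can be illustrated with a minimal sketch of dictionary-based word counting in plain Python. This is not LIWC-22's code, dictionaries, or scoring method. The category names, word lists, sample sentence, and function name are all hypothetical and chosen only to show the general approach of scoring a text by the share of its words that fall into each category.

```python
import re

# Hypothetical custom dictionary: category name -> set of words.
# These are NOT LIWC categories or word lists.
CUSTOM_DICTIONARY = {
    "positive_emotion": {"happy", "glad", "love", "great"},
    "negative_emotion": {"sad", "angry", "hate", "awful"},
}


def category_percentages(text, dictionary):
    """Return, per category, the percentage of tokens found in that category."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    if total == 0:
        # Avoid dividing by zero on empty input.
        return {category: 0.0 for category in dictionary}
    return {
        category: 100.0 * sum(token in words for token in tokens) / total
        for category, words in dictionary.items()
    }


# Made-up sample sentence for illustration only.
scores = category_percentages(
    "I love this great tool but hate the awful menus", CUSTOM_DICTIONARY
)
print(scores)
```

A researcher could swap in word lists for the constructs of interest. A simple count like this shares the limits of other word-level methods: it does not handle negation, irony, or context.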
78. EXTRA: SOME ANALYTICAL APPLICATIONS OF LIWC IN THE RESEARCH LITERATURE
Author identification (historical and present), (in)validation of authorship
Predictive analytics
Fraud detection (people and “self-invisible” tells)
Suicide intervention
Remote personality (ego and entity) profiling (longitudinal, episodic)
Political leader profiling
Political group profiling
Ideology profiling
Human landscape mapping, social network mapping
Elicitation of power dynamics
Terrorist group mapping