3. What Is Sustainability?
“...using resources to meet the needs of the present
without compromising the ability of future generations to
meet their own needs...”
The concept spans agriculture, materials engineering, energy,
economics, political science, sociology, and management.
Silos of knowledge have emerged across these disciplines, and
the divergence in perceptions of sustainability has become
noticeable.
4. Goals and Means
Goals:
Map the semantic mindspace regarding the concept of
sustainability.
Develop a transferable mapping tool.
Means:
Collect and analyze term data available from various
sustainability-related sources, using semantic network
analysis.
5. Method Workflow
Acquire term data from a variety of data sources
Select most commonly used terms
Evaluate term similarity
Cluster terms, based on similarity
Extract motifs (meta-terms) using crowdsourcing via
Amazon Mechanical Turk
6. Data Framework and Sources
Paper keywords from EBSCO academic database (supplied by
authors)—scholarly aspect [KWD]
Paper subject tags from EBSCO academic database (supplied
by editors)—scholarly aspect [TAG]
Interests from LiveJournal (supplied by sustainability-related
communities' moderators and individual bloggers, both involved
in the sustainability-related communities and not)—consumer
aspect [LJ]
7. Term Structure
Select 600–700* most
frequently used terms from
each data source
Only 6% of the terms are
used in all three term
corpora
The overlap between two
scholarly corpora is only
25%! (Marketing to
blame?)
* (limited by the performance of the
similarity calculation procedure)
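The term selection and overlap figures above can be illustrated with a minimal sketch. The toy corpora, the function names, and the choice of the union as the denominator of the overlap share are all assumptions made for illustration; the slides do not state how the 6% and 25% figures were computed.

```python
from collections import Counter

def top_terms(term_occurrences, n):
    """Return the set of the n most frequent terms in one corpus."""
    counts = Counter(term_occurrences)
    return {term for term, _ in counts.most_common(n)}

def overlap_share(*term_sets):
    """Share of terms common to all sets, relative to their union (assumed definition)."""
    common = set.intersection(*term_sets)
    union = set.union(*term_sets)
    return len(common) / len(union) if union else 0.0

# Toy corpora (hypothetical data, not from the study).
kwd = ["energy", "energy", "policy", "water", "water", "soil"]
tag = ["energy", "policy", "policy", "climate", "climate", "soil"]
lj = ["energy", "yoga", "yoga", "vegan", "vegan", "water"]

top_kwd = top_terms(kwd, 3)
top_tag = top_terms(tag, 3)
top_lj = top_terms(lj, 3)

scholarly_overlap = overlap_share(top_kwd, top_tag)
three_way_overlap = overlap_share(top_kwd, top_tag, top_lj)
print(scholarly_overlap, three_way_overlap)
```

In the study itself `n` would be in the 600 to 700 range, one set per corpus (KWD, TAG, LJ).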
8. Term-Artifact Structure
Seven incidence matrices:
Profiles of sustainability-related LJ communities vs
interests (CORE)
Profiles of LJ bloggers in sustainability-related
communities vs interests (PPL)
Profiles of random LJ bloggers vs interests (BASE)
EBSCO papers vs keywords (KP)
EBSCO authors vs keywords (via papers; KA)
EBSCO papers vs subject tags (TP)
EBSCO authors vs subject tags (via papers; TA)
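As a rough illustration of the artifact-by-term structure, the sketch below builds a binary incidence matrix from a mapping of artifacts to terms, and derives an author-level mapping by pooling the terms of each author's papers. The pooling rule is an assumption about what "via papers" means, and all names and data are hypothetical.

```python
def incidence_matrix(artifact_terms):
    """Build a binary artifact-by-term matrix from {artifact: set of terms}."""
    artifacts = sorted(artifact_terms)
    terms = sorted({t for ts in artifact_terms.values() for t in ts})
    column = {t: j for j, t in enumerate(terms)}
    matrix = [[0] * len(terms) for _ in artifacts]
    for i, artifact in enumerate(artifacts):
        for term in artifact_terms[artifact]:
            matrix[i][column[term]] = 1
    return artifacts, terms, matrix

def authors_via_papers(paper_terms, paper_authors):
    """Derive {author: terms} by pooling the terms of each author's papers (assumed rule)."""
    author_terms = {}
    for paper, authors in paper_authors.items():
        for author in authors:
            author_terms.setdefault(author, set()).update(paper_terms.get(paper, ()))
    return author_terms

# Toy data (hypothetical): two papers, two authors.
paper_terms = {"p1": {"energy", "policy"}, "p2": {"soil"}}
paper_authors = {"p1": ["ann", "bob"], "p2": ["ann"]}

kp = incidence_matrix(paper_terms)                                     # papers vs keywords
ka = incidence_matrix(authors_via_papers(paper_terms, paper_authors))  # authors vs keywords
print(kp)
print(ka)
```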
9. Similarity Calculation
Generalized similarity [-1...1] between terms/artifacts
(Kovács 2010):
Two terms are similar if they are associated with
similar artifacts.
Two artifacts are similar if they are associated with
similar terms.
Iterative procedure calculates two similarity matrices: one
for artifacts (not used) and another for terms
Evaluated for each incidence matrix
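The mutual definition above can be sketched as an alternating iteration. This is a simplified illustration in the spirit of the description on the slide, not a faithful reproduction of the published procedure: the row centering, the weighted cosine form, the identity starting point, and the fixed iteration count are all assumptions.

```python
import math

def center_rows(matrix):
    """Subtract each row's mean so similarities can range over [-1, 1]."""
    return [[x - sum(row) / len(row) for x in row] for row in matrix]

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def weighted_cosine(rows, weights):
    """Cosine similarity between rows under the inner product x^T W y."""
    n = len(rows)
    # weighted[j] is W applied to row j.
    weighted = [[sum(w * x for w, x in zip(w_row, row)) for w_row in weights]
                for row in rows]
    dots = [[sum(a * b for a, b in zip(rows[i], weighted[j])) for j in range(n)]
            for i in range(n)]
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            denom = math.sqrt(max(dots[i][i], 0.0) * max(dots[j][j], 0.0))
            if denom > 0:
                sim[i][j] = max(-1.0, min(1.0, dots[i][j] / denom))
    return sim

def generalized_similarity(incidence, iterations=10):
    """Alternate between artifact and term similarities, starting from identity."""
    artifact_rows = center_rows(incidence)
    term_rows = center_rows(transpose(incidence))
    n_terms = len(term_rows)
    term_sim = [[1.0 if i == j else 0.0 for j in range(n_terms)]
                for i in range(n_terms)]
    artifact_sim = []
    for _ in range(iterations):
        artifact_sim = weighted_cosine(artifact_rows, term_sim)
        term_sim = weighted_cosine(term_rows, artifact_sim)
    return artifact_sim, term_sim

# Toy incidence matrix (hypothetical): rows are artifacts, columns are terms.
toy = [[1, 1, 0, 0],
       [1, 1, 0, 0],
       [0, 0, 1, 1],
       [0, 0, 1, 1]]
artifact_sim, term_sim = generalized_similarity(toy)
print(term_sim[0][1], term_sim[0][2])
```

In the toy matrix the first two terms always appear together and never with the last two, so they end up maximally similar to each other and maximally dissimilar to the other pair.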
11. Clustering
The maps have a clear
clustering structure
Extract clusters of
terms from each map
One map—one level;
one cluster—one node;
connection widths
proportional to the
overlap
A, B, and C to be
addressed later
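The "one cluster, one node" layout can be sketched as follows: clusters from two maps become nodes on two levels, and a link is drawn wherever two clusters share terms. Using the raw count of shared terms as the overlap and scaling widths linearly are assumptions; the cluster names and terms below are hypothetical.

```python
def cluster_links(level_a, level_b):
    """Return {(cluster_a, cluster_b): number of shared terms} for overlapping clusters."""
    links = {}
    for name_a, terms_a in level_a.items():
        for name_b, terms_b in level_b.items():
            shared = len(terms_a & terms_b)
            if shared:
                links[(name_a, name_b)] = shared
    return links

def link_widths(links, max_width=10.0):
    """Scale link widths proportionally to the overlap (largest overlap gets max_width)."""
    largest = max(links.values(), default=0)
    return {pair: max_width * n / largest for pair, n in links.items()} if largest else {}

# Toy clusters on two levels (hypothetical data).
core_level = {"c1": {"green", "nature", "energy"}, "c2": {"vegan", "food"}}
ppl_level = {"p1": {"green", "energy", "politics"}, "p2": {"food", "health", "vegan", "energy"}}

links = cluster_links(core_level, ppl_level)
widths = link_widths(links)
print(links)
print(widths)
```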
12. Semantic Network Stats
Network | Nodes | Average degree centrality | Density | Major/minor clusters | Modularity | Average clustering coefficient
Keywords, by paper (KP) | 753 | 162 | 0.228 | 4+3 | 0.1 | 0.76
Keywords, by author (KA) | 679 | 16 | 0.027 | 8+7 | 0.67 | 0.53
Subject Tags, by paper (TP) | 755 | 42 | 0.06 | 6 | 0.59 | 0.62
Subject Tags, by author (TA) | 666 | 48 | 0.067 | 5+1 | 0.59 | 0.63
Communities’ interests (CORE) | 769 | 107 | 0.148 | 4 | 0.56 | 0.74
Members’ interests (PPL) | 752 | 57 | 0.079 | 5+2 | 0.58 | 0.62
Random bloggers’ interests (BASE) | 615 | 24 | 0.043 | 6+3 | 0.58 | 0.53
Mean | 713 | 65 | 0.093 | 5+2 | 0.52 | 0.63
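For readers unfamiliar with these statistics, the sketch below computes the node count, average degree, density, and average clustering coefficient for a small undirected network given as an edge list. How the similarity matrices were turned into network edges is not stated on the slides, so the edge list is taken as given; the toy graph is hypothetical.

```python
from itertools import combinations

def network_stats(edges):
    """Basic statistics of an undirected network given as (u, v) pairs."""
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    n = len(adjacency)
    m = sum(len(neighbors) for neighbors in adjacency.values()) // 2
    coefficients = []
    for neighbors in adjacency.values():
        k = len(neighbors)
        if k < 2:
            coefficients.append(0.0)
            continue
        # Count pairs of neighbors that are themselves connected.
        closed = sum(1 for a, b in combinations(neighbors, 2) if b in adjacency[a])
        coefficients.append(2 * closed / (k * (k - 1)))
    return {
        "nodes": n,
        "average_degree": 2 * m / n if n else 0.0,
        "density": 2 * m / (n * (n - 1)) if n > 1 else 0.0,
        "average_clustering": sum(coefficients) / n if n else 0.0,
    }

# Toy network (hypothetical): a triangle a-b-c with a pendant node d attached to c.
stats = network_stats([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])
print(stats)
```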
13. Motif Extraction
Motifs—“meta-terms” describing a semantic cluster
Identified via Amazon Mechanical Turk (mTurk) by asking:
“Describe the following group of 25 / 50 words with
a single most suitable word or a two-word or three-word phrase.”
100 mTurk workers per cluster (50 for top 25 terms and 50
for top 50 terms)
Normalize responses (remove typos, Anglicisms,
stopwords, punctuation; do stemming; select stems that
are on both 25- and 50-word lists)
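The normalization step can be sketched roughly as below. The stopword list is a tiny illustrative one, the truncating `crude_stem` function is a stand-in for a real stemmer, typo correction is omitted, and summing the counts from the two lists is an assumption; the responses are hypothetical.

```python
import string
from collections import Counter

STOPWORDS = {"a", "an", "and", "the", "of", "for", "to", "in"}  # illustrative only

def crude_stem(word, length=5):
    """Stand-in stemmer: truncate the word; a real pipeline would use a proper stemmer."""
    return word[:length]

def stem_counts(responses):
    """Lowercase, strip punctuation, drop stopwords, stem, and count."""
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    counts = Counter()
    for response in responses:
        for word in response.lower().translate(table).split():
            if word not in STOPWORDS:
                counts[crude_stem(word)] += 1
    return counts

def extract_motifs(responses_25, responses_50):
    """Keep stems present in both response lists, with their combined counts."""
    c25, c50 = stem_counts(responses_25), stem_counts(responses_50)
    return {stem: c25[stem] + c50[stem] for stem in c25.keys() & c50.keys()}

# Toy worker responses (hypothetical data).
responses_25 = ["Green living", "the environment", "Eco-friendly!"]
responses_50 = ["environmental", "green energy", "nature"]

motifs = extract_motifs(responses_25, responses_50)
print(motifs)
```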
14. Motif Examples
LJ Core, cluster “262”: SOC-/12, POLIT-/10, LIB-/10,
DEMOCR-/9, HUM-/8, RIGHT-/7, HIPPY-/5, GOVERN-/4,
FREEDOM-/4
LJ Core, cluster “260”: GREEN-/16, ENVIRON-/15, LIV-/14,
NAT-/11, ECO-FRIENDLY-/8, FRIEND-/6
LJ Core, cluster “163”: ENVIRON-/24, ENERGY-/16,
GREEN-/12, NAT-/9, SCI-/7, EAR-/7, RENEW-/6
LJ Core, cluster “84”: HEAL-/11, FOOD-/10, LIV-/9,
HEALTHY-/8, VEGET-/7, ORG-/5
Numbers show the total number of times the stem was used
by the mTurk workers with respect to the cluster.
15. Bipartite Network Again!
Motifs and semantic term clusters form a bipartite network
Generalized similarities between motifs and term clusters
can be calculated:
Clustered network of motifs, based on their
generalized similarity
Clustered network of semantic term clusters, based
on their generalized similarity
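To make the bipartite structure concrete, the sketch below encodes the four LJ Core clusters from the motif examples as a weighted cluster-to-motif mapping and lists the motifs that each pair of clusters shares. The plain cosine similarity used here is a first-order stand-in, not the generalized similarity named on the slide, and the function names are hypothetical.

```python
import math
from itertools import combinations

# Motif counts per LJ Core cluster, taken from the motif examples slide.
cluster_motifs = {
    "262": {"SOC": 12, "POLIT": 10, "LIB": 10, "DEMOCR": 9, "HUM": 8,
            "RIGHT": 7, "HIPPY": 5, "GOVERN": 4, "FREEDOM": 4},
    "260": {"GREEN": 16, "ENVIRON": 15, "LIV": 14, "NAT": 11,
            "ECO-FRIENDLY": 8, "FRIEND": 6},
    "163": {"ENVIRON": 24, "ENERGY": 16, "GREEN": 12, "NAT": 9,
            "SCI": 7, "EAR": 7, "RENEW": 6},
    "84": {"HEAL": 11, "FOOD": 10, "LIV": 9, "HEALTHY": 8, "VEGET": 7, "ORG": 5},
}

def shared_motifs(a, b):
    """Motifs linked to both clusters in the bipartite network."""
    return sorted(cluster_motifs[a].keys() & cluster_motifs[b].keys())

def cosine(a, b):
    """Plain cosine similarity between two clusters' motif-count vectors."""
    va, vb = cluster_motifs[a], cluster_motifs[b]
    dot = sum(va[m] * vb[m] for m in va.keys() & vb.keys())
    norm = math.sqrt(sum(x * x for x in va.values()) * sum(x * x for x in vb.values()))
    return dot / norm if norm else 0.0

for a, b in combinations(cluster_motifs, 2):
    common = shared_motifs(a, b)
    if common:
        print(a, b, common, round(cosine(a, b), 2))
```

Clusters "260" and "163" share three motifs, "260" and "84" share one, and "262" shares none with the others, which is the kind of pattern a clustering of this network would pick up.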
17. Three-Cluster “Cluster” Network
Term clusters and their motifs co-belong to the same meta-clusters A, B, and C!
A, B, and C are semantic domains of sustainability
18. Sustainability Lattice
A, B, and C are
semantic domains,
each formed by term
clusters and respective
motifs
A: “Environmental /
Farming”
B: “Politics /
Economics”
C: “Healthy Lifestyle”
(absent from the
EBSCO keywords
levels)
19. Marketing and Multidisciplinarity
Lack of congruence between the keywords (KA/KP) and
subject tags (TA/TP) layers may indicate a marketing
element: authors may choose keywords to target potential
readers, while tag editors concentrate more on the
substance of the papers
Lack of congruence between the keywords-by-author (KA)
and keywords-by-paper (KP) layers is probably the result of
multidisciplinary cooperation, where authors from different
disciplines (not unlike us ☺) infuse keywords from their
“native” disciplines.
20. Scholars vs Consumers
Drastically different patterns of shared motifs by scholars
and consumers.
The two communities share the most common ground
in the Environment / Farming domain (more than 40% of
the motifs).
Not so good for Healthy Lifestyle domain (about 35.5%;
consumers-dominated).
Bad for Politics / Economics domain (28%; scholars-dominated).
Fewer perceptions or interests are shared by
both communities in the other two semantic domains.
21. Knowledge Aggregation
The average degree centrality, network density, and
clustering coefficient increase in the directions
KA→TP→TA and BASE→PPL→CORE
The aggregating networks (TA/TP and CORE) are denser
(have more similarity connections between individual
terms) and less structured (have more transitive similarity
connections) than the “grassroots” networks, KA/KP and
BASE/PPL.
Similarities emerge that are not visible to individual
consumers and researchers, but are captured by
community moderators and subject tag editors over time.
22. Conclusion
(1) We developed a transferable semi-automated framework
for multifaceted analysis of “fuzzy” concepts, such as
“sustainability,” “resilience,” “complexity,” “success”
(2) We applied the framework to the concept of
“sustainability”
(3) We identified 74 motifs, describing sustainability and
grouped into three major semantic domains
(4) We discovered differences between scholarly and
consumer-oriented views of sustainability