Dr. Lev Manovich

Professor of Computer Science, The Graduate Center, City
University of New York / Director, Cultural Analytics Lab

lab.culturalanalytics.info 

email: manovich.lev@gmail.com
What Does Data Want?
(answer: help humans to think without categories) 

- How to use big cultural data without aggregation and
summarization?

- How to think without (traditional) categories?

- How to learn from computers to understand the world
differently?
- How to work with big data without numbers?
1960 - Born in Moscow


1981 - came to NYC

1982-1985: BA (NYU Film School)



1986-1988: MA in Vision Science

1989-1993: PhD in Visual Culture

1992-2012: Professor of Digital Art



2013 - Professor of Computer Science
1973 - started art lessons
1975 - started learning computer programming

1978 - how are images structured? How do they communicate?

1984 - started working in CGI

1986 - took classes in computer vision

2005 - the idea of cultural analytics

2007 - established Cultural Analytics Lab (UCSD)

2020 - Cultural Analytics book is published
Using data science to analyze contemporary culture:



Companies - marketing research, consumer preferences,
new product development, analysis of online and physical
behaviors
Non-profits - museums, universities, etc.
AI, Network science, many areas of Computer Science

Computational social science

Communication studies

Political science, Psychology, Sociology

Urban planning, Urban Studies
Data visualization, data design, data art

Digital Humanities (sometimes)
Research examples:



"Cultural diffusion & trends in Facebook photographs" (2017)

"StreetStyle: Exploring world-wide clothing styles from millions of
photos" (2017)

"Why the songs of the summer sound the same" (2018)
"Neuroaesthetics in Fashion: Modeling the Perception of
Fashionability" (2015)
Every Noise at Once (2013)
"Quantifying reputation and success in art" (2018)

What is “Culture”? Humanities vs Cultural Anthropology

Humanities: culture is material artifacts, texts, and media
objects created by a small number of authors

Cultural Anthropology: culture is behaviors, symbols, rituals,
values, beliefs; looking at society as a whole



For contemporary culture, it seems easy to combine both
perspectives using data - e.g. social media (SM) is both artifacts & behaviors.
But how informative are these behaviors?
Analyzing Culture: Digital Humanities vs Computer
Science

DH (mostly): 

- analyzes historical artifacts made by professional creatives

Computer Science (mostly):

- analyzes contemporary artifacts & behaviors of ordinary
people (e.g. SM posts, images, video, online and physical
behaviors by billions of “normal” users)
Cultural analytics: using data methods to see
contemporary global culture (2005-):
Inspiration: scientometrics, evolutionary biology, GIS
Research goals:

1) What are the themes, styles, behaviors and their
patterns in contemporary global culture?
2) Where are they active? (spatial distributions)

3) When do they emerge, and how do they diffuse and change over time?



We now have enough data to map some of this at
relatively high resolution - but -
The main challenge in studying contemporary culture with
data science (as I see it):
- Shall we aggregate big cultural data and reduce it to a small set
of patterns: the ideas, themes, styles, and behaviors that occur
most frequently in the data? (The statistical paradigm, standard
today in data science and data-driven research.)

- This paradigm focuses on what is common across many objects;
it leaves out what occurs infrequently.
- Or shall we refuse this dominant paradigm and focus instead
on diversity, variability & differences (including tiny ones),
i.e. work on big cultural data without aggregation?



- include everything

- pay attention to the infrequent (but not outliers)

- identify small cultural islands (that usually disappear when
researchers use the dominant paradigm)
- question similarity (categories, clusters, dimension reduction,
etc.)
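The contrast between the two paradigms can be sketched in a few lines. This is a minimal illustration, not a method from the talk: the keyword lists, the `split_by_frequency` helper, and the threshold of 3 are all hypothetical. It shows how a frequency cut-off of the kind used in the aggregation paradigm keeps a few common keywords and silently drops the rare ones.

```python
from collections import Counter

# Hypothetical keyword annotations for seven photos (illustrative data only).
photo_keywords = [
    ["beach", "sunset"],
    ["beach", "friends"],
    ["beach", "dog"],
    ["sunset", "friends"],
    ["puppet theatre"],
    ["ice fishing", "friends"],
    ["sunset", "kite festival"],
]


def split_by_frequency(annotations, min_count):
    """Return (frequent, infrequent) keyword Counters for a given threshold."""
    counts = Counter(kw for photo in annotations for kw in photo)
    frequent = Counter({kw: n for kw, n in counts.items() if n >= min_count})
    infrequent = Counter({kw: n for kw, n in counts.items() if n < min_count})
    return frequent, infrequent


# Assumed threshold: a keyword must appear at least 3 times to survive pruning.
frequent, infrequent = split_by_frequency(photo_keywords, min_count=3)

# Photos that carry at least one keyword the pruned vocabulary no longer contains.
affected = [p for p in photo_keywords if any(kw in infrequent for kw in p)]

print("kept after pruning:", sorted(frequent))
print("dropped by pruning:", sorted(infrequent))
print(f"{len(affected)} of {len(photo_keywords)} photos lose part of their description")
```

In this toy data the dropped keywords are exactly the candidates for "small cultural islands"; keeping the `infrequent` counter instead of discarding it is one simple way to keep them visible.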
High-resolution data allows us to think outside of the
dominant intellectual paradigm of the modern period:
aggregation, reduction, categorization

Individualization paradigm in the media/data industry:
- Ad platforms making predictive models for each user (even
for different times of the day) & custom recommendations
- Search engines indexing every webpage they can find
- Spotify and other companies analyzing every music track
(40M+)
Examples of the dominant paradigm in data-driven
culture analysis:



Cultural Diffusion and Trends in Facebook Photographs (2017):
“We are interested in recognizing many different types of
cultural lifestyles or activities in photographs…we select
the most common concepts”
“…asked annotators to describe the main visible concepts
of images using a few keywords”
“After pruning infrequent keywords..”


“Faces Engage Us: Photos with Faces Attract More Likes and
Comments on Instagram” (2014): 



“Our dataset consists of 23 million Instagram photos and
over 3 million Instagram users…we randomly selected 1
million photos from this data set.”
“the existence of a face in a photo significantly affects its
social engagement. This effect is substantial, increasing the
chances of receiving likes by 38% and comments by 32%.”
“Exploring world-wide clothing styles from millions of photos” (2017)



Paper goals:
“- Identify common, visually correlated combinations of these
basic attributes (e.g., blue sweater with jacket and wool hat).
- Identify styles that appear more frequently in one city versus
another or more frequently during particular periods of time.
- Identify finer-grained, visually coherent versions of these
elements (e.g., sports jerseys in a particular style).”
Examples of style clusters
“GeoStyle: Discovering Fashion Trends and Events” (2019)



“Most attribute combinations are uninteresting because of
their rarity: e.g., pink, short-sleeved, suits.
We want to focus on the limited set of attribute combinations
that are actually prevalent in the data.”
Supervised vs unsupervised machine learning for seeing
culture:



- Supervised machine learning used for classification:
start with existing categories (defined by experts, or by
“common sense”) and then classify the rest of the data /
new data using these categories. Using neural nets only
makes this problem bigger.
Cultural analytics - how my vision changed over time 

- We want to challenge existing categories; ask if rigid
categories make sense for a particular cultural field; discover
its structure (2007)

- Unsupervised machine learning is well suited for these
goals; but success depends on how we represent a
phenomenon as data, what features we use (2010)
- But unsupervised machine learning also requires
aggregation, as does classical statistics - how to avoid this? (2016)
- Data paradigm offers a new language for describing and
thinking about culture

- Numerical (continuous) scales instead of verbal categories
- Representing continuous change over time
- Representing differences between cultural artifacts and actors
as numerical distances in feature space
- Detecting clusters instead of starting with already existing
categories
- In a cluster any object has a particular distance to the center
(in traditional categories it's either/or membership)
- But there are still key challenges -
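The feature-space items in the list above can be made concrete with a small sketch. Everything here is assumed for illustration: the two features, the image names, and the cluster assignment are hypothetical, and the centroid-plus-Euclidean-distance measure is just one common choice, not the lab's method. It shows how an object in a cluster has a graded distance to each center rather than either/or membership.

```python
import math

# Hypothetical 2-D features per image (e.g. brightness, saturation), scaled to 0-1.
features = {
    "img_a": (0.20, 0.30),
    "img_b": (0.25, 0.35),
    "img_c": (0.80, 0.70),
    "img_d": (0.75, 0.80),
    "img_e": (0.50, 0.52),  # sits between the two groups
}

# Assumed cluster assignment, of the kind an unsupervised method might produce.
clusters = {"dark": ["img_a", "img_b"], "bright": ["img_c", "img_d"]}


def centroid(names):
    """Mean position of the named images in feature space."""
    points = [features[n] for n in names]
    return tuple(sum(axis) / len(points) for axis in zip(*points))


centers = {label: centroid(names) for label, names in clusters.items()}


def distances_to_centers(name):
    """Euclidean distance from one image to every cluster center."""
    return {label: math.dist(features[name], c) for label, c in centers.items()}


for name in features:
    d = distances_to_centers(name)
    nearest = min(d, key=d.get)
    print(name, "nearest:", nearest, {k: round(v, 3) for k, v in d.items()})
```

A categorical label would file `img_e` under one group, while the distances show it is almost equally far from both centers.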
The problems with representing cultural artifacts using
numerical features:
- how do we know we have the right features?
- we don’t know how the brain combines visual features
- gestalt theory - the whole is not a mechanical sum of the
parts
- many images may be identical from a statistical point of view,
and yet they have crucial differences for a human observer -
tiny differences that make a difference
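The last point has a narrow, purely statistical version that is easy to demonstrate. In this sketch the two 4x4 grayscale "images" and the `neighbor_contrast` measure are hypothetical, chosen only to show that global summary features (mean, standard deviation, histogram) can match exactly while the spatial arrangement differs. It says nothing about how a human observer actually perceives either image.

```python
from collections import Counter
from statistics import mean, pstdev

# Two hypothetical 4x4 grayscale images built from the same 16 pixel values.
blocks = [
    [0, 0, 85, 85],
    [0, 0, 85, 85],
    [170, 170, 255, 255],
    [170, 170, 255, 255],
]
scattered = [
    [0, 255, 85, 170],
    [255, 0, 170, 85],
    [85, 170, 0, 255],
    [170, 85, 255, 0],
]


def summary(image):
    """Global statistics that ignore where each pixel sits."""
    pixels = [p for row in image for p in row]
    return {"mean": mean(pixels), "std": pstdev(pixels), "histogram": Counter(pixels)}


def neighbor_contrast(image):
    """Mean absolute difference between horizontally adjacent pixels."""
    diffs = [abs(a - b) for row in image for a, b in zip(row, row[1:])]
    return mean(diffs)


print("same global statistics:", summary(blocks) == summary(scattered))
print("neighbor contrast:", neighbor_contrast(blocks), "vs", neighbor_contrast(scattered))
```

Adding a spatial feature separates the two images here, but that only moves the question of whether we have the right features one step further.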
- Next slides: examples of Instagram photography
(2005-2016)
- Can data science and AI today capture all the differences
between these artifacts - between the authors’ visions and among
all the individual photographs? (in content, visual
language, mood, emotions - and all of this together for each
image - because this is how many people see)
- Every person may see each artifact differently depending on
her background, education, knowledge of codes, what she has
seen before, etc. Can a data approach capture such variability?
(recommendation engines research?)
- Cultural Analytics vision (2007-2008)
- Examples of projects from our lab (2009-2015)
Elsewhere project (2018-)
- Instead of using social networks data (posts by
individuals), we use information about cultural events
shared by organizations on different platforms.
- During the last 20 years, the numbers of these places and
events have become so large that we can now treat
them as “big data.”
- Examples of data sources: TEDx events, e-flux archive,
Meetup, Behance. Our dataset: 4.5 million events
Elsewhere project (2018-) - using locations, categories, dates
and text descriptions of millions of cultural events
1) What is the presence of some of contemporary culture
(CC) - as represented by our data sources - on a world map
today? What is the density and depth of this presence in
different places? Are there still big white spots?
2) What is the temporal growth and diffusion of CC (1990 -) ?
3) What are the topics, concepts and themes in CC? What
occurs everywhere, what is only somewhere, what is elsewhere
(outside of top cities), what is unique?
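One way to read question 3 operationally is sketched below. This is an assumption-laden toy, not the project's pipeline: the (city, topic) records, the `top_cities` set, and the labeling rules in `classify_topics` are all hypothetical, and a real analysis would need to derive topics from the event descriptions first.

```python
from collections import defaultdict

# Hypothetical (city, topic) pairs standing in for event records; not project data.
events = [
    ("New York", "design"), ("Berlin", "design"),
    ("Lagos", "design"), ("Tbilisi", "design"),
    ("New York", "blockchain"), ("Berlin", "blockchain"),
    ("Lagos", "street food"), ("Tbilisi", "street food"),
    ("Lagos", "afrofuturism"),
    ("Berlin", "sound art"), ("Tbilisi", "sound art"),
]
top_cities = {"New York", "Berlin"}  # assumed list of "top" cities


def classify_topics(records, top):
    """Label each topic by where it occurs (simplified, mutually exclusive rules)."""
    cities_by_topic = defaultdict(set)
    for city, topic in records:
        cities_by_topic[topic].add(city)
    all_cities = {city for city, _ in records}

    labels = {}
    for topic, cities in cities_by_topic.items():
        if cities == all_cities:
            labels[topic] = "everywhere"
        elif len(cities) == 1:
            labels[topic] = "unique"  # checked first, even if outside top cities
        elif cities.isdisjoint(top):
            labels[topic] = "elsewhere"  # several cities, none of them "top"
        else:
            labels[topic] = "somewhere"
    return labels


for topic, label in sorted(classify_topics(events, top_cities).items()):
    print(f"{topic}: {label}")
```

Note that nothing is pruned: a topic occurring in a single city is kept and labeled rather than removed as too rare.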
Thank you!

Questions, comments, collaboration:

manovich.lev@gmail.com



Projects and publications:

lab.culturalanalytics.info

manovich.net 




What Does Data Want? (2019-2020)
