Data Literacy
and
Ethics in the Lab
Chris Wiggins + Matt Jones
data-ppf.github.io
Course supported by Collaboratory Fellows Fund, Columbia University
(talk presented 2018-02-13 at Digital Life Seminar, Cornell Tech)
overview
1. hypotheses driving the class
2. student-eye view of the course (following “lecture 1”)
3. ethics
a. defining vs enforcing
b. research vs industry: two institutional moments
c. curricula extant vs curricula needed
4. show and tell:
a. syllabus/readings
b. paired Python/Jupyter notebooks (readings on Mondays; Python on Wednesdays)
c. Slack (not just for discussions)
1. hypotheses
hypotheses driving the class
1. there is important material being taught neither to future statisticians nor to
future senators
a. outside the technical canon yet also
b. present only at the advanced level in STS, to our knowledge unpaired with technical
engagement
2. multicapabilities [1] are teachable without prerequisite
a. functional: via pre-authored Jupyter notebooks, as in-class labs
b. rhetorical: in-class labs as well as discussion
c. critical: discussions + readings as well as in-class labs
3. pair intellectual changes with political and ethical context
a. what powers motivated this advance?
b. how did this advance rearrange power? (cf. Rogaway) [2]
[1] cf. Selber, S. (2004). Multiliteracies for a digital age. SIU Press.
[2] Rogaway, P. (2015). The Moral Character of Cryptographic Work. IACR Cryptology ePrint Archive, 2015, 1162
2. student-eye view
(i.e., “Lecture 1”)
Data
Past Present Future
Wiggins + Jones
SEAS + A&S
data-ppf.github.io
how did this end up in my news feed?
- math
- hardware
- system
- funding
- market
- regulation
- data
this was not possible 20 years ago.
- why?
- what did people do instead?
Session replay scripts
“Automated Inference on Criminality using Face Images”
(arXiv:1611.04135v1)
“In all cultures and all
periods of recorded human
history, people share the
belief that the face alone
suffices to reveal innate
traits of a person.”
We’ve been here before
J Am Inst. Criminal Law, 1912, on Lombroso, 1899
We’ve been here before
Insert quote from Blaise here
Statistical sciences always political
Dream of sciences of social difference
central to development of
statistics
and the
data sciences
Florence Nightingale
& Data Visualization
“Experience has shown that
without special information and
skilful application of the resources
of science in preserving health,
the drain on our home population
must exhaust our means. The
introduction, therefore, of a proper
sanitary system into the British
army is of essential importance to
the public interests.” (1858)
Florence Nightingale
& Data Visualization
“Upon the British race alone the
integrity of that empire at this
moment appears to depend. The
conquering race must retain
possession.” (1858)
Every week:
Scientific and mathematical development
Technologies and engineering
Driving forces: money, prestige, resources, Imperial competition
Power, ethics, and data intensive knowledge
Tech story: three chronological stages
Data and Math
Data and Engineering
Data and Technology
Data technologies
Census and government survey
Information processing machines and
digital computers
Always on network infrastructure
Power
How should social and political order be organized on the basis
of science and engineering?
How do technologies transform the social and political order?
How do technologies augment and diminish democratic orders? Autocratic ones?
Power and politics*
New technologies mean new capabilities.
These capabilities are first available to those in power
(cf. “The future is already here — it's just not very evenly distributed.” --Gibson)
- How does this distribution of capability reorder power?
- How are data-empowered algorithms an example of this dynamic
■ of capability, and
■ of reinforcing or distributing power?
* politics here meaning the dynamics of power, not to be confused with “voting”
Arc of class
data 1770s-present: capabilities
Qualitative statistics, “Vulgar” statistics, Mathematical statistics, Computation + Crypto, “intelligence”, AI winter, AI renaissance, Machine learning
1770s, 1830s, 1900, WWII, 1960s, 1980s, 1990s, You are here.
Regression + Hypothesis testing
huge breakpoint
.gov .edu .mil .ai .com
data 1770s-present: capabilities & intents
(statecraft) (policy+eugenics) (math-vs-science) (data @ war) (SIGINT) (data wars (for funding)) (surveillance economy) (defense+advertising)
persistent observation:
dynamics of capabilities implies
dynamics of power, i.e., is political
(also, how did this end up in my news feed?)
How we know what the state of the state is
Halley, 1693
Regression, correlation and eugenics
Galton, 1886 MacKenzie on Galton
Experimental design,
hypothesis tests, and
decision theory
“To play this game with the greatest chance of
success, the experimenter cannot afford to exclude
the possibility of any possible arrangement of soil
fertilities, and his best strategy is to equalize the
chance that any treatment shall fall on any plot by
determining it by chance himself.”
- Joan Fisher Box
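The strategy in the quote above, letting chance decide which treatment falls on which plot, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `randomize_plots`, the treatment labels, the plot count, and the seed are all hypothetical and do not come from the course notebooks.

```python
import random


def randomize_plots(treatments, n_plots, seed=None):
    """Assign treatments to plots in equal numbers, in an order set by chance.

    Hypothetical helper for illustration; not taken from the course materials.
    """
    if n_plots % len(treatments) != 0:
        raise ValueError("n_plots must be a multiple of the number of treatments")
    rng = random.Random(seed)  # seeded only so the layout can be reproduced
    replicates = n_plots // len(treatments)
    # Start from a balanced list (each treatment repeated equally often) ...
    layout = [t for t in treatments for _ in range(replicates)]
    # ... then shuffle, so every arrangement of this list is equally likely.
    rng.shuffle(layout)
    return layout


layout = randomize_plots(["A", "B", "C"], n_plots=12, seed=0)
for plot, treatment in enumerate(layout, start=1):
    print(f"plot {plot:2d}: treatment {treatment}")
```

Because the experimenter, not the soil, decides the layout, no pattern of soil fertility can systematically favor one treatment. The seed is there only so a layout can be reproduced for discussion.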
World War 2: Turing and statistical cryptography
AI and its many winters
Pattern Recognition to . . .
… Machine Learning
Research ethics (and the lack thereof)
Silicon Valley and the Attention Economy
De-anonymization
Paul Ohm via Terms of Service (2014),
Keller+Neufeld latanyasweeney.org
Weekly structure
Monday
Lecture and discussion
Expectation
arrive having done the week’s readings
Wednesday
Laboratory
Expectation
arrive with laptop ready to collaborate
No prerequisites
Two tracks
more technical background track (60%)
● pursue a semester-long project
culminating in a 15 pp paper and any
associated code
● complete 3 problem sets
● short final presentation on paper
more humanistic background track (60%)
● write a 10 pp paper on a topic of their
choice
● complete 5 problem sets involving both
computational work and writing work
● short final presentation on paper
Jupyter notebooks:
code without any coding
INSTALLING PYTHON + JUPYTER
USING JUPYTER NOTEBOOKS
Required work
Postings on readings in slack
Problem sets (extensions of lab work, done in Jupyter)
Final paper
Your presence
3. ethics
ethics
1. placement of ethics within multicapabilities:
Q: integrate or separate?
A: our approach: lead up to, i.e., foreshadow throughout, via shock and awe, then provide framing
ethics
2. A curriculum / ideational setting for:
- granularities
- belmont, menlo park [1]
common rule / IRB
- users+ products
- society+markets [2]
three-party game (including
law/regulation) [3]
define v. enforce; industry v. “research”
[1] Salganik, M. J. (2017). Bit by bit: social research in the digital age. Princeton University Press.
[2] Pasquale, F. (2015). The black box society: The secret algorithms that control money and information.
[3] Janeway, W. H. (2012). Doing capitalism in the innovation economy: markets, speculation and the state.
defining via granularities: principles->standards->rules [1]
1. Respect for Persons:
- informed consent; respect for individuals’ autonomy and individuals impacted;
- protection for individuals with diminished autonomy or decision making capability.
2. Beneficence: do not harm; assess risk.
3. Justice: fair distribution of benefits of research; selection of subjects; and allocation of
burdens.
4. Respect for Law and Public Interest:
- legal due diligence;
- transparency in methods and results;
- accountability.
[1] Solum, L. B. (2009). Legal theory lexicon: Rules, standards, and principles (blog post).
[2] Dittrich, D., & Kenneally, E. (2012). The Menlo Report: Ethical principles guiding information and
communication technology research. US Department of Homeland Security.
4. show and tell:
a. syllabus
b. paired Python/Jupyter notebooks
c. Slack
recap
1. hypotheses driving the class
2. student-eye view of the course (following “lecture 1”)
3. what we talk about when we talk about ethics
a. defining vs enforcing
b. two industrial moments: Research vs industry
c. curricula extant, curricula needed
4. show and tell:
a. syllabus/readings
b. paired Python/Jupyter notebooks (readings on Mondays; Python on Wednesdays)
c. Slack (not just for discussions)
for more info, code, syllabus, etc. please see
“data: past present and future” course page
data-ppf.github.io
Course supported by Collaboratory Fellows Fund, Columbia University
(responses from 2018-02-13 talk follow)
responses: uses of history
1. “make the present strange” (- J. Grimmelmann)
- emphasize the diverse ways of thinking about the problem from
before it was “settled”: help us see what could have been?
- make human by revealing the human interests and conflicts
2. provides a new way into the technical materials which doesn’t
presume or advantage a particular prior curriculum
3. provides window into the ethical questions
- distance us from our “settled” narrative via the past
- show debates that are different now (e.g., eugenics debates)
responses: illustration by @zamchick

Data: Past, Present, and Future (Cornell Digital Life Seminar on Data Literacy & Ethics in the Lab on February 13th, 2018)

  • 1. Data Literacy and Ethics in the Lab Chris Wiggins + Matt Jones data-ppf.github.io Course supported by Collaboratory Fellows Fund, Columbia University (talk presented 2018-02-13 at Digital Life Seminar, Cornell Tech)
  • 2. overview 1. hypotheses driving the class 2. student-eye view of the course (following “lecture 1”) 3. ethics a. defining vs enforcing b. research vs industry: two institutional moments c. curricula extant vs curricula needed 4. show and tell: a. syllabus/readings b. paired Python/Jupyter notebooks (readings on monday; Python on wednesdays) c. Slack (not just for discussions)
  • 4. hypotheses driving the class 1. there is important material being taught neither to future statisticians nor to future senators a. outside the technical canon yet also b. present only at the advanced level in STS, to our knowledge unpaired with technical engagement 2. multicapabilities [1] are teachable without prerequisite a. functional: via pre-authored Jupyter notebooks, as in-class labs b. rhetorical: in-class labs as well as discussion c. critical: discussions + readings as well as in-class labs 3. pair intellectual changes with political and ethical context a. what powers motivated this advance? b. how did this advance rearrange power? (cf. Rogaway) [2] [1] cf. Selber, S. (2004). Multiliteracies for a digital age. SIU Press. [2] Rogaway, P. (2015). The Moral Character of Cryptographic Work. IACR Cryptology ePrint Archive, 2015, 1162
  • 5. hypotheses driving the class 1. there is important material being taught neither to future statisticians nor to future senators a. outside the technical canon yet also b. present only at the advanced level in STS, to our knowledge unpaired with technical engagement 2. multicapabilities [1] are teachable without prerequisite a. functional: via pre-authored Jupyter notebooks, as in-class labs b. rhetorical: in-class labs as well as discussion c. critical: discussions + readings as well as in-class labs 3. pair intellectual changes with political and ethical context a. what powers motivated this advance? b. how did this advance rearrange power? (cf. Rogaway) [2] [1] cf. Selber, S. (2004). Multiliteracies for a digital age. SIU Press. [2] Rogaway, P. (2015). The Moral Character of Cryptographic Work. IACR Cryptology ePrint Archive, 2015, 1162
  • 6. 2. student-eye view (i.e., “Lecture 1”)
  • 7. Data Past Present Future Wiggins + Jones SEAS + A&S data-ppf.github.io
  • 8.
  • 9.
  • 10.
  • 11.
  • 12.
  • 13.
  • 14.
  • 15.
  • 16.
  • 17. how did this end up in my news feed? - math - hardware - system - funding - market - regulation - data this was not possible 20 years ago. - why? - what did people do instead?
  • 19.
  • 20.
  • 21.
  • 22.
  • 23. “Automated Inference on Criminality using Face Images” (arXiv:1611.04135v1) “In all cultures and all periods of recorded human history, people share the belief that the face alone suffices to reveal innate traits of a person.”
  • 24. We’ve been here before J Am Inst. Criminal Law, 1912, on Lombroso, 1899
  • 27. We’ve been here before Insert quote from Blaise here
  • 28. Statistical sciences always political Dream of sciences of social difference central to development of statistics and the data sciences
  • 29. Florence Nightingale & Data Visualization “Experience has shown that without special information and skilful application of the resources of science in preserving health, the drain on our home population must exhaust our means. The introduction, therefore, of a proper sanitary system into the British army is of essential importance to the public interests.” (1858)
  • 30. Florence Nightingale & Data Visualization “Upon the British race alone the integrity of that empire at this moment appears to depend. The conquering race must retain possession.” (1858)
  • 31. Every week: Scientific and mathematical development Technologies and engineering Driving forces: money, prestige, resources, Imperial competition Power, ethics, and data intensive knowledge
  • 32. Tech story: three chronological stages Data and Math Data and Engineering Data and Technology
  • 33. Data technologies Census and government survey Information processing machines and digital computers Always on network infrastructure
  • 34. Power How should social and political order be organized on basis of science and engineering? How do technologies transform the social and political order? How do technologies augment and diminish democratic orders? Autocratic ones?
  • 35. Power and politics* New technologies mean new capabilities. These capabilities are first available to those in power (cf. “The future is already here — it's just not very evenly distributed.” --Gibson) - How does this distribution of capability reorder power? - How are data-empowered algorithms an example of this dynamic ■ of capability, and ■ of reinforcing or distributing power? * politics here meaning the dynamics of power, not to be confused with “voting”
  • 37. Qualitative statistics “Vulgar” statistics Mathematical statistics Computation+ Crypto “intelligence” AI winter AI renaissance Machine learning 1770s 1830s 1900 WWII 1960s 1980s 1990s You are here. data 1770s-present: capabilities Regression + Hypothesis testing
  • 38. Qualitative statistics “Vulgar” statistics Mathematical statistics Computation+ Crypto “intelligence” AI winter AI renaissance Machine learning 1770s 1830s 1900 WWII 1960s 1980s 1990s You are here. data 1770s-present: capabilities Regression + Hypothesis testing huge breakpoint
  • 39. Qualitative statistics “Vulgar” statistics Mathematical statistics Computation+ Crypto “intelligence” AI winter AI renaissance Machine learning 1770s 1830s 1900 WWII 1960s 1980s 1990s You are here. .gov .edu .mil .ai .com Regression + Hypothesis testing data 1770s-present: capabilities & intents
  • 40. Qualitative statistics “Vulgar” statistics Mathematical statistics Computation+ Crypto “intelligence” AI winter AI renaissance Machine learning 1770s 1830s 1900 WWII 1960s 1980s 1990s You are here. (statecraft) (policy+eugenics) (math-vs-science) (data @ war) (SIGINT) (data wars (for funding)) (surveillance economy) (defense+advertising) Regression + Hypothesis testing data 1770s-present: capabilities & intents
  • 41. Qualitative statistics “Vulgar” statistics Mathematical statistics Computation+ Crypto “intelligence” AI winter AI renaissance Machine learning 1770s 1830s 1900 WWII 1960s 1980s 1990s You are here. Regression + Hypothesis testing data 1770s-present: capabilities & intents persistent observation: dynamics of capabilities implies dynamics of power
  • 42. Qualitative statistics “Vulgar” statistics Mathematical statistics Computation+ Crypto “intelligence” AI winter AI renaissance Machine learning 1770s 1830s 1900 WWII 1960s 1980s 1990s You are here. Regression + Hypothesis testing data 1770s-present: capabilities & intents persistent observation: dynamics of capabilities implies dynamics of power, i.e., is political
  • 43. Qualitative statistics “Vulgar” statistics Mathematical statistics Computation+ Crypto “intelligence” AI winter AI renaissance Machine learning 1770s 1830s 1900 WWII 1960s 1980s 1990s You are here. Regression + Hypothesis testing data 1770s-present: capabilities & intents (also, how did this end up in my news feed?)
  • 44. How we know what the state of the state is Halley, 1693
  • 45. Regression, correlation and eugenics Galton, 1886 MacKenzie on Galton
  • 46. Experimental design, hypothesis tests, and decision theory “To play this game with the greatest chance of success, the experimenter cannot afford to exclude the possibility of any possible arrangement of soil fertilities, and his best strategy is to equalize the chance that any treatment shall fall on any plot by determining it by chance himself.” - Joan Fisher
  • 47. World War 2: Turing and statistical cryptography
  • 48. AI and its many winters
  • 51. Research ethics (and the lack thereof)
  • 52. Research ethics (and the lack thereof)
  • 53. Silicon Valley and the Attention Economy
  • 54. De-anonymization Paul Ohm via Terms of Service (2014), Keller+Neufeld latanyasweeney.org
  • 55. Weekly structure. Monday: lecture and discussion; expectation: arrive having done the week’s readings. Wednesday: laboratory; expectation: arrive with laptop, ready to collaborate.
  • 57. Two tracks. More technical background track (60%): ● pursue a semester-long project culminating in a 15 pp paper and any associated code ● complete 3 problem sets ● give a short final presentation on the paper. More humanistic background track (60%): ● write a 10 pp paper on a topic of their choice ● complete 5 problem sets involving both computational work and writing work ● give a short final presentation on the paper.
  • 58. Jupyter notebooks: code without any coding INSTALLING PYTHON + JUPYTER USING JUPYTER NOTEBOOKS
  • 59. Required work: postings on readings in Slack; problem sets (extensions of lab work, done in Jupyter); final paper; your presence.
  • 61. ethics 1. placement of ethics within multicapabilities: Q: integrate or separate? A: our approach: lead up to, i.e., foreshadow throughout, via shock and awe, then provide framing
  • 62. ethics 2. A curriculum / ideational setting for: - granularities - belmont, menlo park [1] common rule / IRB - users+ products - society+markets [2] three-party game (including law/regulation) [3] define v. enforce; industry v. “research” [1] Salganik, M. J. (2017). Bit by bit: social research in the digital age. Princeton University Press. [2] Pasquale, F. (2015). The black box society: The secret algorithms that control money and information. [3] Janeway, W. H. (2012). Doing capitalism in the innovation economy: markets, speculation and the state.
  • 63. defining via granularities: principles->standards->rules [1] 1. Respect for Persons: - informed consent; respect for individuals’ autonomy and individuals impacted; - protection for individuals with diminished autonomy or decision making capability. 2. Beneficence: do not harm; assess risk. 3. Justice: fair distribution of benefits of research; selection of subjects; and allocation of burdens. 4. Respect for Law and Public Interest: - legal due diligence; - transparency in methods and results; - accountability. [1] Solum, L. B. (2009). Legal theory lexicon: Rules, standards, and principles (blog post). [2] Dittrich, D., & Kenneally, E. (2012). The Menlo Report: Ethical principles guiding information and communication technology research. US Department of Homeland Security.
  • 64. 4. show and tell: a. syllabus b. paired Python/Jupyter notebooks c. Slack
  • 65. recap 1. hypotheses driving the class 2. student-eye view of the course (following “lecture 1”) 3. what we talk about when we talk about ethics a. defining vs enforcing b. two institutional moments: research vs industry c. curricula extant, curricula needed 4. show and tell: a. syllabus/readings b. paired Python/Jupyter notebooks (readings on monday; Python on wednesdays) c. Slack (not just for discussions)
  • 66. for more info, code, syllabus, etc. please see “data: past present and future” course page data-ppf.github.io Course supported by Collaboratory Fellows Fund, Columbia University
  • 68. responses: uses of history 1. “make the present strange” (- J. Grimmelmann) - emphasize the diverse ways of thinking about the problem from before it was “settled”: help us see what could have been? - make human by revealing the human interests and conflicts 2. provides a new way into the technical materials which doesn’t presume or advantage a particular prior curriculum 3. provides window into the ethical questions - distance us from our “settled” narrative via the past - show debates that are different now (e.g., eugenics debates)