Open Refine for Librarians
How a power tool for Google is now
being used by librarians to clean up
data and connect it to the world
Mita Williams
Scholarly Communications Librarian
University of Windsor
October 24, 2018 : 2:45 - 3:15 pm
NISO: That Cutting Edge: Technology’s Impact on Scholarly
Research Processes in the Library
PART ONE:
AN INTRODUCTION
TO OPEN REFINE
The most popular library tool you’ve never heard of…
PART TWO:
WHY NOT KEEP USING EXCEL?
The most popular library tool you’ve never heard of…
Why use Open Refine?
• Ability to handle more types of data
TSV, CSV, *SV, Excel (.xls and .xlsx), JSON, XML, RDF as XML
• Ability to handle larger amounts of data
Excel’s max: 1,048,576 rows by 16,384 columns
• Better control of data
• Ability to script processes
• Ability to share and reproduce these scripts
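As a rough illustration of the last two points: OpenRefine can export the steps applied to a project as JSON (from the Undo / Redo tab), and that file can be shared and re-applied to another project. The sketch below is not from the talk. It assumes a history shaped like that export, with made-up column names, and simply lists the recorded steps. The field names reflect a typical export and may differ between OpenRefine versions.

```python
import json

# Hypothetical operation history, shaped like the JSON that OpenRefine's
# Undo / Redo "Extract..." dialog produces. Column names are made up.
HISTORY_JSON = """
[
  {
    "op": "core/text-transform",
    "engineConfig": {"facets": [], "mode": "row-based"},
    "columnName": "title",
    "expression": "value.trim()",
    "onError": "keep-original",
    "repeat": false,
    "repeatCount": 10,
    "description": "Text transform on cells in column title using expression value.trim()"
  },
  {
    "op": "core/column-rename",
    "oldColumnName": "pub_yr",
    "newColumnName": "publication_year",
    "description": "Rename column pub_yr to publication_year"
  }
]
"""


def summarize_history(history_json):
    """Return one (operation id, description) pair per recorded step."""
    steps = json.loads(history_json)
    return [(step.get("op", "unknown"), step.get("description", "")) for step in steps]


if __name__ == "__main__":
    # Print a numbered, human-readable list of the recorded steps.
    for number, (op, description) in enumerate(summarize_history(HISTORY_JSON), start=1):
        print(f"{number}. {op}: {description}")
```

Because the history is plain JSON, it can be kept alongside the data, reviewed by a colleague, and pasted back into OpenRefine to repeat the same clean-up on a new file.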
PART THREE:
HOW ARE LIBRARIANS
USING OPENREFINE?
!!! OpenRefine is NOT Excel !!!
• An institution was changing its library management system
and wished to migrate its catalogue data
• Approximately 50,000 bibliographic records
• MARC output from existing system would not load into
new system
Been there! Done that! Bought the t-shirt!
(Any questions?)
