SHARE Phase II
Judy Ruttenberg, Association of Research Libraries
Erin Braswell, Center for Open Science
Fabian von Feilitzsch, Center for Open Science
NISO Virtual Conference: October 1, 2015
Using Alerting Systems to Ensure OA Policy Compliance
Founded by Academic Leaders, Built with Open Technology
Research universities are long-lived and are mission-driven to
generate, make accessible, and preserve over time new
knowledge and understanding.
What is SHARE?
SHARE is building a free, open data set about
research and scholarly activities across their
life cycle.
Research Lifecycle
Open Science Framework
http://osf.io
Using SHARE’s Search API
● The API is currently a slightly restricted Elasticsearch instance
● You can hit the API with any valid Elasticsearch query
● We'll go over a few quick, interesting aggregations that are available
What it looks like: The query
$ curl -XPOST "https://osf.io/api/v1/share/search/" -H 'content-type: application/json' -d '{
    "query": {"match_all": {}},
    "size": 0,
    "aggs": {
        "top tags": {
            "terms": {
                "field": "tags"
            }
        }
    }
}'
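The same request can be put together with Python's standard library. This is a minimal sketch that only builds the request object; the sending step is left commented out, on the assumption that the endpoint from the slide may no longer be reachable. The helper name build_top_tags_request is hypothetical.

```python
import json
import urllib.request

SHARE_SEARCH_URL = "https://osf.io/api/v1/share/search/"  # endpoint from the slide


def build_top_tags_request(url=SHARE_SEARCH_URL):
    """Build (but do not send) the POST request shown in the curl example."""
    body = {
        "query": {"match_all": {}},
        "size": 0,  # no hits wanted, only the aggregation
        "aggs": {"top tags": {"terms": {"field": "tags"}}},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_top_tags_request()
# To actually send it (assumes the endpoint is still reachable):
# with urllib.request.urlopen(request) as response:
#     payload = json.load(response)
```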
What it looks like: The response
{
    "count": 2137806,
    "time": 0.1,
    "results": [],
    "aggs": null,
    "aggregations": {
        "top tags": {
            "buckets": [
                {"key": "ecological", "doc_count": 20347},
                {"key": "long", "doc_count": 20179},
                {"key": "term", "doc_count": 20021},
                {"key": "lter", "doc_count": 18862},
                {"key": "data", "doc_count": 17086}, ....
                ...
                {"key": "research", "doc_count": 16539},
                {"key": "earth", "doc_count": 16395},
                {"key": "water", "doc_count": 16150},
                {"key": "program", "doc_count": 16098},
                {"key": "remote", "doc_count": 15963}
            ],
            "sum_other_doc_count": 1049093,
            "doc_count_error_upper_bound": 3000
        }
    }
}
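Reading the buckets out of a response like this takes only a few lines. The sketch below works on a trimmed copy of the response above (three buckets only) instead of a live call; the helper name top_buckets is hypothetical.

```python
# Trimmed copy of the response on the slide (only three of the buckets).
response = {
    "count": 2137806,
    "aggregations": {
        "top tags": {
            "buckets": [
                {"key": "ecological", "doc_count": 20347},
                {"key": "long", "doc_count": 20179},
                {"key": "term", "doc_count": 20021},
            ],
            "sum_other_doc_count": 1049093,
        }
    },
}


def top_buckets(response, agg_name):
    """Return (key, doc_count) pairs for one aggregation, largest first."""
    buckets = response["aggregations"][agg_name]["buckets"]
    pairs = [(b["key"], b["doc_count"]) for b in buckets]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


for key, doc_count in top_buckets(response, "top tags"):
    print(f"{key}: {doc_count}")
```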
Kind of a pain
● We have an experimental Python library to
help cut down the verbosity a bit
Same Example
>> from sharepa import ShareSearch
>> from sharepa.analysis import bucket_to_dataframe
>> search = ShareSearch()
>> search.aggs.bucket('top tags', 'significant_terms', field='tags')
Internal structure is:
{
    "query": {
        "match_all": {}
    },
    "aggs": {
        "top tags": {
            "significant_terms": {
                "field": "tags"
            }
        }
    }
}
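The internal structure is an ordinary nested dictionary. As a rough illustration (this is not sharepa's actual implementation), a hypothetical helper aggregation_query can produce it. Building both variants side by side shows that this aggregation is of type significant_terms, whereas the curl example used terms.

```python
def aggregation_query(name, agg_type, field):
    """Return a match_all search body with a single named aggregation."""
    return {
        "query": {"match_all": {}},
        "aggs": {name: {agg_type: {"field": field}}},
    }


# The structure shown for the sharepa example.
significant = aggregation_query("top tags", "significant_terms", "tags")
# The aggregation used in the curl example, for comparison.
plain = aggregation_query("top tags", "terms", "tags")
```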
Now we send the JSON blob to the SHARE search API
>> results = search.execute()
And we get back a response in the same format (significant_terms buckets also include a bg_count field).
We can then use some of our utilities to convert
the Elasticsearch response to a dataframe
(basically just a table)
>> df = bucket_to_dataframe(
       'top tags',
       results.aggregations['top tags']['buckets']
   ).sort_values('top tags', ascending=False)
and plot it as well:
>> df.plot(kind='bar', x='key', y=['bg_count', 'top tags'])
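The conversion step can be pictured without pandas. The sketch below is an assumption about the general idea, not sharepa's code: each bucket becomes one row, with doc_count stored under the aggregation's name, which is why 'top tags' can be used as a column above. The function name buckets_to_rows and the sample buckets are made up for illustration.

```python
def buckets_to_rows(name, buckets):
    """Flatten aggregation buckets into row dicts, storing doc_count under `name`."""
    rows = []
    for bucket in buckets:
        row = dict(bucket)  # copy so the input is left untouched
        row[name] = row.pop("doc_count")
        rows.append(row)
    return rows


# Made-up buckets in the shape a significant_terms aggregation returns.
sample_buckets = [
    {"key": "alpha", "doc_count": 5, "bg_count": 50},
    {"key": "beta", "doc_count": 9, "bg_count": 20},
]

# Sort rows by the renamed count column, largest first.
rows = sorted(
    buckets_to_rows("top tags", sample_buckets),
    key=lambda row: row["top tags"],
    reverse=True,
)
for row in rows:
    print(row)
```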
Cancer
Flu
Influenza
Vaccine
Phase II
Contact Us!
info@share-research.org
https://osf.io/yg3xj/
