DSpace at ILRI: A semi-technical overview of “CGSpace”


ILRI

Presented by Alan Orth at the KAINET Open Data and Open Science Workshop, Nairobi, Kenya, 18 June 2015


A semi-technical overview of “CGSpace”
DSpace at ILRI
Alan Orth
KAINET Open Data and Open Science Workshop
Nairobi, Kenya, 18 June 2015

History of DSpace at ILRI
● 2009: ILRI launches Mahider (“repository” in
Amharic)
● 2010: Other CGIAR centers and programs join
our platform and share hard / soft costs
● 2011: Rebranded as “CGSpace”
● 2015: 9 CGIAR centers, ~50,000 items, ~250k
hits/month

How we use DSpace
● Content people embedded in each department
help capture results (presentations, papers,
brochures, etc)
● Primary location for institutional outputs!
● No posting PDFs on corporate website!
● Integrate with website and blogs via RSS feeds
● Direct ALL traffic to DSpace!
● For data sets, videos, etc. we make a metadata-only
accession with a link to e.g. YouTube

DSpace hierarchies
● Communities, sub-communities, and collections
● Tempting to model after organization hierarchy!
● (we did)
● … but organization hierarchies change!

Metadata
● Standard Dublin Core is available
● No AGROVOC
● You can create custom controlled vocabularies in
arbitrary namespaces, eg: cg.subject.ilri
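
A field name such as cg.subject.ilri follows DSpace's dotted schema.element.qualifier convention. The sketch below only illustrates that split; the MetadataField type and parse_field helper are hypothetical names made up for this example, not part of DSpace.

```python
from typing import NamedTuple, Optional


class MetadataField(NamedTuple):
    schema: str               # namespace, e.g. "dc" or "cg"
    element: str              # e.g. "subject"
    qualifier: Optional[str]  # e.g. "ilri"; None when the field is unqualified


def parse_field(name: str) -> MetadataField:
    """Split a dotted field name into schema, element and optional qualifier."""
    parts = name.split(".")
    if len(parts) not in (2, 3) or not all(parts):
        raise ValueError(f"expected schema.element[.qualifier], got {name!r}")
    qualifier = parts[2] if len(parts) == 3 else None
    return MetadataField(parts[0], parts[1], qualifier)


if __name__ == "__main__":
    print(parse_field("cg.subject.ilri"))  # custom namespace from the slide
    print(parse_field("dc.title"))         # standard Dublin Core, no qualifier
```

Keeping custom vocabularies in their own schema (here "cg") separates them from standard Dublin Core fields, so nobody mistakes them for AGROVOC terms.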

Custom metadata in ILRI report
Not AGROVOC!

“Discovery” facets
● Context-aware metadata summaries
● Side effect: helps spot metadata inconsistencies!
● … Open Access, Open access, open Access, etc.
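
The same kind of inconsistency can be spotted outside the facet sidebar. This is a minimal sketch, assuming the metadata values have already been exported as a plain list of strings; find_case_variants is a hypothetical helper, not a DSpace feature.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def find_case_variants(values: Iterable[str]) -> Dict[str, Set[str]]:
    """Group values that differ only by letter case or spacing.

    Returns only the groups with more than one spelling, keyed by the
    normalized (case-folded, whitespace-collapsed) form.
    """
    groups: Dict[str, Set[str]] = defaultdict(set)
    for value in values:
        key = " ".join(value.split()).casefold()
        groups[key].add(value)
    return {key: spellings for key, spellings in groups.items() if len(spellings) > 1}


if __name__ == "__main__":
    sample = ["Open Access", "Open access", "open Access", "Limited Access"]
    for key, spellings in find_case_variants(sample).items():
        print(key, "->", sorted(spellings))
```

Each group it reports is a candidate for cleanup to a single agreed spelling.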

Search engine optimization (SEO)
Help Google Scholar consume your content!
● XML sitemaps
● Consistent domain name, eg: cgspace.cgiar.org
● Persistent links for resources
● Website speed and HTTPS both a plus
● Sign up for Google Webmaster Tools to submit
sitemap, control indexing, see stats, etc
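
For readers who have not seen one, an XML sitemap is a short list of URLs in the sitemaps.org format. The sketch below builds one by hand purely to show the structure. The item URLs, including the /handle/ path form, are assumptions for illustration, and build_sitemap is a hypothetical helper; a real repository would normally use whatever sitemap generation its platform provides.

```python
import xml.etree.ElementTree as ET
from typing import Iterable

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls: Iterable[str]) -> str:
    """Return a minimal sitemap document with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical item pages on one consistent domain name.
    print(build_sitemap([
        "https://cgspace.cgiar.org/handle/10568/67073",
        "https://cgspace.cgiar.org/handle/10568/1",
    ]))
```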

Importance of persistent links
● Website addresses change…
● mahider.ilri.org -> cgspace.cgiar.org
● But resources stay the same!
http://hdl.handle.net/10568/67073
● “Handle” service from handle.net
● Everything under prefix 10568 is CGSpace
● Default DSpace handle prefix is 123456789!
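
A handle is a prefix and a suffix joined by a slash, and the persistent link is that handle appended to the resolver address. A minimal sketch of the idea, using the prefix values from the slide; the function names are hypothetical.

```python
from typing import Tuple

HANDLE_RESOLVER = "http://hdl.handle.net"  # resolver address from the slide
CGSPACE_PREFIX = "10568"                   # everything under this prefix is CGSpace
DEFAULT_DSPACE_PREFIX = "123456789"        # placeholder prefix of a fresh DSpace


def split_handle(handle: str) -> Tuple[str, str]:
    """Split 'prefix/suffix' into its two parts, rejecting malformed input."""
    prefix, sep, suffix = handle.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"expected prefix/suffix, got {handle!r}")
    return prefix, suffix


def handle_url(handle: str) -> str:
    """Build the persistent link for a handle."""
    split_handle(handle)  # validate before building the URL
    return f"{HANDLE_RESOLVER}/{handle}"


def is_cgspace(handle: str) -> bool:
    """True when the handle lives under the CGSpace prefix."""
    return split_handle(handle)[0] == CGSPACE_PREFIX


if __name__ == "__main__":
    print(handle_url("10568/67073"))  # http://hdl.handle.net/10568/67073
    print(is_cgspace(f"{DEFAULT_DSPACE_PREFIX}/1"))  # False
```

Because the link points at the resolver rather than at the repository's own domain, a move such as mahider.ilri.org -> cgspace.cgiar.org does not break it.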

dc.identifier.uri specifies an item’s persistent uniform resource identifier (URI)

Getting data INTO DSpace
● Day-to-day submission is manual, by a small
army of editors
● One-time batch uploads of items from other
systems in CSV format (InMagic!)
● OAI-PMH for metadata only
● OAI-ORE for metadata + bitstreams (eg, from
another DSpace or Sharepoint, etc)
● SWORD (haven't tried)
● REST API (DSpace 5+, haven't tried)
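
A batch upload from another system usually starts by reshaping that system's export into the CSV layout DSpace's batch metadata import expects. The sketch below assumes a hypothetical export with "Title", "Authors" (semicolon-separated) and "Year" columns, and a made-up collection handle. It follows the DSpace CSV conventions of "+" in the id column for new items and "||" between multiple values; check both against the documentation for your DSpace version.

```python
import csv
import io
from typing import Dict, Iterable

# Column headers: id and collection, then one column per metadata field.
COLUMNS = ["id", "collection", "dc.title", "dc.contributor.author", "dc.date.issued"]


def to_dspace_csv(rows: Iterable[Dict[str, str]], collection_handle: str) -> str:
    """Convert exported rows (hypothetical column names) into a metadata CSV."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(COLUMNS)
    for row in rows:
        authors = [a.strip() for a in row["Authors"].split(";") if a.strip()]
        writer.writerow([
            "+",                   # marks a new item rather than an edit
            collection_handle,     # target collection for the new item
            row["Title"].strip(),
            "||".join(authors),    # multiple values share one cell
            row["Year"].strip(),
        ])
    return out.getvalue()


if __name__ == "__main__":
    exported = [{"Title": "Sample report", "Authors": "Doe, J.; Roe, R.", "Year": "2015"}]
    print(to_dspace_csv(exported, "10568/1"))  # "10568/1" is a placeholder handle
```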

Getting data OUT OF DSpace
● REST API for structured JSON or XML
● OAI-PMH for metadata
● OAI-ORE for metadata + bitstreams (PDFs, etc)
● RSS feeds for websites / blogs
● XML sitemaps for search engines*
*Google discontinued the use of OAI for discovering
site content in 2008!
http://googlewebmastercentral.blogspot.com/2008/04/retiring-support-for-oai-pmh-in.html
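
As an illustration of metadata harvesting, an OAI-PMH client sends a request with a verb such as ListRecords and reads Dublin Core elements out of the XML response. This is a minimal sketch that works on an inline sample response rather than a live server. The base URL is a placeholder, the sample record is invented, and a real harvester would also need to follow resumption tokens.

```python
import xml.etree.ElementTree as ET
from typing import List
from urllib.parse import urlencode

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented sample response, trimmed to the parts used below.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:10568/67073</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample title</dc:title>
          <dc:identifier>http://hdl.handle.net/10568/67073</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build a ListRecords request URL for the given OAI-PMH endpoint."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def extract_titles(xml_text: str) -> List[str]:
    """Collect every dc:title found inside the records of a response."""
    root = ET.fromstring(xml_text)
    titles: List[str] = []
    for record in root.iterfind(".//oai:record", NS):
        for title in record.iterfind(".//dc:title", NS):
            if title.text:
                titles.append(title.text.strip())
    return titles


if __name__ == "__main__":
    print(list_records_url("https://example.org/oai/request"))  # placeholder endpoint
    print(extract_titles(SAMPLE_RESPONSE))
```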

CCAFS website, driven by Drupal + DSpace APIs

“Latest outputs” on project blog populated via RSS, links to CGSpace
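
A "latest outputs" box of this kind can be fed by reading titles and links from the repository's feed. The sketch below assumes an RSS 2.0 layout (channel, then item, each with title and link) and parses an invented sample feed; latest_outputs is a hypothetical helper and the item titles are placeholders.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Invented RSS 2.0 sample; the links reuse the handle form shown in the slides.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Sample collection</title>
    <item>
      <title>First sample output</title>
      <link>http://hdl.handle.net/10568/67073</link>
    </item>
    <item>
      <title>Second sample output</title>
      <link>http://hdl.handle.net/10568/1</link>
    </item>
  </channel>
</rss>"""


def latest_outputs(rss_text: str, limit: int = 5) -> List[Tuple[str, str]]:
    """Return up to `limit` (title, link) pairs in feed order."""
    root = ET.fromstring(rss_text)
    outputs: List[Tuple[str, str]] = []
    for item in root.iterfind("./channel/item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        if title and link:  # skip incomplete entries
            outputs.append((title, link))
    return outputs[:limit]


if __name__ == "__main__":
    for title, link in latest_outputs(SAMPLE_FEED, limit=1):
        print(f"{title}: {link}")
```

Since every link points back to the repository, the blog shows the outputs without hosting copies of them.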

Open source workflow on GitHub
https://github.com/ilri/DSpace

Skills needed in your organization
Besides content people(!)...
● Prioritize Linux systems administration
experience (Tomcat, httpd, PostgreSQL, DNS,
SSH, git)
● General: computer science background
● Web developers a diverse bunch...
● Java development experience doesn't hurt

Extra considerations
● Item mapping
● Maintenance tasks (background batch jobs)
● Backups of assetstore and PostgreSQL!
● Altmetrics tracks social media mentions
● Separate production / development
environments
● CGSpace server is $80/month
● ~20GB of PDFs, ~8GB of Solr data
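
The backup item covers two things: a PostgreSQL dump and an archive of the assetstore directory. The sketch below only assembles the two command lines and does not run them. The database name, paths and file naming are assumptions for illustration, and backup_commands is a hypothetical helper, not how CGSpace is actually backed up.

```python
from datetime import date
from pathlib import PurePosixPath
from typing import List


def backup_commands(db_name: str, assetstore_dir: str, backup_dir: str,
                    day: date) -> List[List[str]]:
    """Return argv lists for a database dump and an assetstore archive."""
    stamp = day.isoformat()
    target = PurePosixPath(backup_dir)
    assetstore = PurePosixPath(assetstore_dir)
    dump_file = target / f"{db_name}-{stamp}.dump"
    archive_file = target / f"assetstore-{stamp}.tar.gz"
    return [
        # pg_dump in its custom format, written to a dated file
        ["pg_dump", "--format=custom", f"--file={dump_file}", db_name],
        # gzip-compressed tar of the assetstore, relative to its parent directory
        ["tar", "-czf", str(archive_file), "-C", str(assetstore.parent), assetstore.name],
    ]


if __name__ == "__main__":
    # All names and paths below are placeholders.
    for argv in backup_commands("dspace", "/dspace/assetstore", "/backups", date(2015, 6, 18)):
        print(" ".join(argv))
```

The two parts need to be backed up together, because the database holds the metadata while the assetstore holds the files it refers to.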

Getting help
● “DSpace Tech” mailing list
● “dspace” tag on StackOverflow website
● a.orth@cgiar.org

3. “CGSpace” in June, 2015


6. Mostly organized by output type now...


11. Sitemap view in Google Webmaster Tools


Editor's notes

Introduce self as computer scientist, apologize for limited knowledge of library stuff. How we do things plus lessons learned.
Mention search engine stumbling and parsing vs consuming structured content
