CORE
CO-author REcommendation using network information and interest similarity
Overview
• Problem statement
• Data collection & storage
• Calculations
• Technical infrastructure
• Conclusion
Problem statement
• Researchers in search of future cooperation partners
• writing a paper
• writing a project proposal
• finding people with similar interest
Whom to choose / ask when you want to work together?
(Photo: http://www.flickr.com/photos/jaygooby/)
CORE
• Similarity (of interest) / homophily (Ibarra, 1992; Lazarsfeld & Merton, 1954; McPherson, Smith-Lovin, & Cook, 2001; Stahl, 2005)
• Influence / power over information/dissemination flow (similar to word-of-mouth (Money, Gilly, & Graham, 1998; Park & Suh, 2013))
Data collection
• data:
• dspace.ou.nl (recommendation)
• Google Scholar h-index (visualisation)
• Mendeley hr-index (visualisation)
• storage: MAMP
dspace harvester response
• identifier
• timestamp
• title
• creator: authors
• subject: keywords
• description: APA ref, sponsors
• language
• type: conf. paper, article, book chapter
<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/
http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
<responseDate>2002-02-08T08:55:46Z</responseDate>
<request verb="GetRecord" identifier="oai:arXiv.org:cs/0112017"
metadataPrefix="oai_dc">http://arXiv.org/oai2</request>
<GetRecord>
<record>
<header>
<identifier>oai:arXiv.org:cs/0112017</identifier>
<datestamp>2001-12-14</datestamp>
<setSpec>cs</setSpec>
<setSpec>math</setSpec>
</header>
<metadata>
<oai_dc:dc
xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/
http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Using Structural Metadata to Localize Experience of
Digital Content</dc:title>
<dc:creator>Dushay, Naomi</dc:creator>
<dc:subject>Digital Libraries</dc:subject>
<dc:description>With the increasing technical sophistication of
both information consumers and providers, there is
increasing demand for more meaningful experiences of digital
information. We present a framework that separates digital
object experience, or rendering, from digital object storage
and manipulation, so the rendering can be tailored to
particular communities of users.
</dc:description>
<dc:description>Comment: 23 pages including 2 appendices,
8 figures</dc:description>
<dc:date>2001-12-14</dc:date>
</oai_dc:dc>
</metadata>
</record>
</GetRecord>
</OAI-PMH>
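The fields listed on the slide (identifier, title, creator, subject) can be pulled out of such a response with a standard XML parser. The following is a minimal sketch using Python's standard library, not the project's actual harvester; the function name `parse_record` and the shortened `SAMPLE` string are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes used in the XPath expressions below.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Shortened copy of the sample response shown on the slide (illustrative only).
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <GetRecord>
    <record>
      <header>
        <identifier>oai:arXiv.org:cs/0112017</identifier>
        <datestamp>2001-12-14</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Using Structural Metadata to Localize Experience of Digital Content</dc:title>
          <dc:creator>Dushay, Naomi</dc:creator>
          <dc:subject>Digital Libraries</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>"""


def parse_record(xml_text):
    """Extract identifier, title, creators and subjects from a GetRecord response."""
    root = ET.fromstring(xml_text)
    record = root.find(".//oai:record", NS)
    return {
        "identifier": record.findtext("oai:header/oai:identifier", namespaces=NS),
        "title": record.findtext(".//dc:title", namespaces=NS),
        # A record may carry several creators (authors) and subjects (keywords).
        "creators": [e.text for e in record.findall(".//dc:creator", NS)],
        "subjects": [e.text for e in record.findall(".//dc:subject", NS)],
    }


record = parse_record(SAMPLE)
print(record["creators"], record["subjects"])
```

The creators would feed the co-author network and the subjects the keyword vectors described later in the slides.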
Data storage
4.3.2 New data collection
On top of the initial data we also collect data from two different sources.
Illustration 10: Data structure database Sie; red objects are relevant to COCOON CORE project
Additional data: h-index
• For each article:
• search Google Scholar
• scrape citations
• 1000 requests → CAPTCHA
• → switch to another server
• total runtime: 1 hour
• Compute h-index per year
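Once citation counts have been scraped, the h-index itself is a short computation. The sketch below uses the standard definition (the largest h such that h publications have at least h citations each); the function name `h_index` and the plain list input are assumptions. The slides do not say how the per-year variant was computed, so that part is left out.

```python
def h_index(citation_counts):
    """Return the largest h such that h items have at least h citations each."""
    h = 0
    # Walk through the counts from most cited to least cited.
    for rank, count in enumerate(sorted(citation_counts, reverse=True), start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


print(h_index([10, 8, 5, 4, 3]))
```

The same function could be applied to Mendeley reader counts instead of citation counts to obtain the hr-index mentioned on the next slide.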
Additional data: hr-index
• h-index for Mendeley reads
• Reader Meter
Architecture
Interest similarity
• Vector space model (Salton, Wong, & Yang, 1975)
• every author has a keyword vector
• per keyword: TF-IDF = term frequency * inverse document frequency
• boolean TF: 1 if author uses keyword, 0 otherwise
• IDF: total number of authors / number of authors who use the keyword
• compute cosine similarity between vectors
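The steps above can be sketched as follows. This is an illustration under stated assumptions, not the project's code: the function names and the toy author data are hypothetical, and IDF is taken as the plain ratio given on the slide, whereas many TF-IDF formulations take its logarithm.

```python
import math


def tfidf_vectors(author_keywords):
    """Build one sparse keyword vector per author (boolean TF times ratio IDF)."""
    n_authors = len(author_keywords)
    # Count how many authors use each keyword.
    author_count = {}
    for keywords in author_keywords.values():
        for kw in set(keywords):
            author_count[kw] = author_count.get(kw, 0) + 1
    # Boolean TF is 1 for every keyword an author uses, so the weight is the IDF.
    return {
        author: {kw: n_authors / author_count[kw] for kw in set(keywords)}
        for author, keywords in author_keywords.items()
    }


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * v[kw] for kw, weight in u.items() if kw in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy data: author -> keywords taken from their publications.
authors = {
    "A": ["networks", "learning"],
    "B": ["networks", "learning"],
    "C": ["usability"],
}
vectors = tfidf_vectors(authors)
print(cosine_similarity(vectors["A"], vectors["B"]))
print(cosine_similarity(vectors["A"], vectors["C"]))
```

Authors with identical keyword sets get a similarity close to 1, and authors with no keyword in common get 0.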
Betweenness centrality
• requirement: a network
• co-author network
OUNL co-author network
Fig. 1. Co-authorship network (Sie et al., accepted)
Betweenness centrality
• betweenness = number of times an author is on the shortest path between two other authors / total number of shortest paths
Betweenness centrality
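$$g(v) = \sum_{s \neq v \neq t} \frac{\sigma_{s,t}(v)}{\sigma_{s,t}}$$
• $\sigma_{s,t}(v)$: number of times v is on the shortest path between s and t
• $\sigma_{s,t}$: number of shortest paths between s and t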
Betweenness centrality
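$$g(\text{Hendrik}) = \frac{\sigma_{\text{Marlies},\text{Erik}}(\text{Hendrik})}{\sigma_{\text{Marlies},\text{Erik}}} + \frac{\sigma_{\text{Marlies},\text{Denis}}(\text{Hendrik})}{\sigma_{\text{Marlies},\text{Denis}}} + \frac{\sigma_{\text{Peter},\text{Erik}}(\text{Hendrik})}{\sigma_{\text{Peter},\text{Erik}}} + \frac{\sigma_{\text{Peter},\text{Denis}}(\text{Hendrik})}{\sigma_{\text{Peter},\text{Denis}}} + \frac{\sigma_{\text{Rory},\text{Erik}}(\text{Hendrik})}{\sigma_{\text{Rory},\text{Erik}}} + \frac{\sigma_{\text{Rory},\text{Denis}}(\text{Hendrik})}{\sigma_{\text{Rory},\text{Denis}}}$$
$$g(\text{Hendrik}) = \frac{2}{2} + \frac{2}{2} + \frac{1}{1} + \frac{1}{1} + \frac{1}{1} + \frac{1}{1} = 6$$
Normalization by $(N-1)(N-2)/2$ gives $C_b(\text{Hendrik}) = 0.6$
Nodes in the network diagram: Marlies, Peter, Rory, Hendrik, Erik, Denis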
Hendrik is on the edge of his network
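The worked example above can be reproduced in a few lines. The network diagram itself is not part of this transcript, so the edge list below is an assumption: a hypothetical six-node graph chosen so that the shortest-path counts match the numbers in the example. The function names are likewise illustrative, and the pairwise approach is a simple sketch rather than an efficient algorithm for large networks.

```python
from collections import deque
from itertools import combinations

# Hypothetical edges (an assumption), chosen to match the example's path counts.
EDGES = [
    ("Marlies", "Peter"), ("Marlies", "Rory"), ("Peter", "Rory"),
    ("Peter", "Hendrik"), ("Rory", "Hendrik"),
    ("Hendrik", "Erik"), ("Hendrik", "Denis"), ("Erik", "Denis"),
]


def build_graph(edges):
    """Turn an undirected edge list into an adjacency dict."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path_counts(graph, source):
    """Breadth-first search returning distances and shortest-path counts from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in graph[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]
    return dist, sigma


def betweenness(graph, v):
    """Sum over all pairs (s, t) of the fraction of shortest s-t paths through v."""
    info = {node: shortest_path_counts(graph, node) for node in graph}
    dist_v, sigma_v = info[v]
    others = [node for node in graph if node != v]
    g = 0.0
    for s, t in combinations(others, 2):
        dist_s, sigma_s = info[s]
        if v not in dist_s or t not in dist_s:
            continue  # s cannot reach v or t
        # v lies on a shortest s-t path only if the detour through v costs nothing.
        if dist_s[v] + dist_v[t] == dist_s[t]:
            g += sigma_s[v] * sigma_v[t] / sigma_s[t]
    return g


def normalized_betweenness(graph, v):
    """Divide by (N-1)(N-2)/2, the number of pairs that exclude v."""
    n = len(graph)
    return betweenness(graph, v) / ((n - 1) * (n - 2) / 2)


graph = build_graph(EDGES)
print(betweenness(graph, "Hendrik"), normalized_betweenness(graph, "Hendrik"))
```

With N = 6 the normalization constant is 5 · 4 / 2 = 10, so a raw score of 6 becomes 0.6, matching the slide.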
Problem
• Find a new co-author with:
• similar interest (vector similarity)
• influence (betweenness centrality)
GUI
• fill out additional keywords
• add keyword to user's vector
• adjust sliders to your liking
• press the button
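The slides do not state how the sliders enter the recommendation. One plausible reading, shown here purely as an assumption, is that they weight interest similarity against betweenness centrality in a combined score. The function name, the weighted-sum formula and the candidate data are all hypothetical.

```python
def rank_candidates(candidates, w_similarity=0.5, w_betweenness=0.5):
    """Rank candidate co-authors by a weighted sum of their two scores.

    candidates maps a name to (interest_similarity, betweenness), both in [0, 1].
    The weighted sum is an illustrative assumption, not the documented method.
    """
    scores = {
        name: w_similarity * similarity + w_betweenness * centrality
        for name, (similarity, centrality) in candidates.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical candidates: (interest similarity, normalized betweenness).
candidates = {"A": (0.9, 0.1), "B": (0.2, 0.8)}
print(rank_candidates(candidates, w_similarity=1.0, w_betweenness=0.0))
print(rank_candidates(candidates, w_similarity=0.0, w_betweenness=1.0))
```

Moving a slider would then correspond to changing one of the two weights and re-ranking.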
Recommendation result
Author page (1/2)
Author page (2/2)
Keyword page
Usability
• SUS System Usability Scale: 67/100 points
• Q4: no help from a technical person needed
[…] COCOON CORE (questions 4 and 10, Figures 7 and 8), for instance not needing a
technical person to use COCOON CORE (question 4). Also, when looking at the
proportions of responses (Figure 8), participants think that there are few
inconsistencies in COCOON CORE (question 6) and that COCOON CORE is not
unnecessarily complex (question 2).
Fig. 7. Median score for each question of the System Usability Scale (SUS)
[Bar chart: x-axis "Questions" (1–10), y-axis "Median" (0–4)]
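For reference, a score such as 67/100 comes from the standard SUS scoring rule: each of the ten items is answered on a 1 to 5 scale, odd-numbered items contribute (response − 1), even-numbered items contribute (5 − response), and the sum is multiplied by 2.5. The sketch below implements that rule for one participant; the function name `sus_score` is an assumption, and the study's raw responses are not part of these slides.

```python
def sus_score(responses):
    """Compute the SUS score (0-100) from ten responses on a 1-5 scale."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 responses")
    total = 0
    for item, response in enumerate(responses, start=1):
        if not 1 <= response <= 5:
            raise ValueError("responses must be between 1 and 5")
        # Odd items are positively worded, even items negatively worded.
        total += (response - 1) if item % 2 == 1 else (5 - response)
    return total * 2.5


print(sus_score([3] * 10))
```

Each item therefore contributes between 0 and 4 points before scaling.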
Considerations
• Interest similarity: Keyword vector or keyword network?
• average distance between their keywords?
• WordNet as keyword network
• GUI:
• KISS
• Connect individuals directly
• Performance and scalability:
• graph search depth
• smart indexing
• PHP or Java?
References
• Ibarra, H. (1992). Homophily and Differential Returns: Sex Differences in Network Structure and Access in an Advertising Firm. Administrative Science Quarterly, 37(3), 422–447.
• Lazarsfeld, P. F., & Merton, R. K. (1954). Friendship as a social process: A substantive and methodological analysis. In M. Berger, T. Abel, & C. H. Page (Eds.), Freedom and Control in Modern Society (Vol. 18, pp. 18–66). Van Nostrand. Retrieved from http://www.questia.com/PM.qst?a=o&docId=23415760
• McPherson, M., Smith-Lovin, L., & Cook, J. M. (2001). Birds of a Feather: Homophily in Social Networks. Annual Review of Sociology, 27(1), 415–444. doi:10.1146/annurev.soc.27.1.415
• Money, R. B., Gilly, M. C., & Graham, J. L. (1998). Explorations of National Culture and Word-of-Mouth Referral Behavior in the Purchase of Industrial Services in the United States and Japan. Journal of Marketing, 62(October), 76–87.
• Park, J. H., & Suh, B. (2013). The impact of influential's betweenness centrality on the WOM effect under the online social networking service environment. In Pacific Asia Conference on Information Systems (PACIS 2013). Jeju Island, Korea: The Korea Society of Management Information Systems.
• Salton, G., Wong, A., & Yang, C. S. (1975). A vector space model for automatic indexing. Communications of the ACM, 18(11), 613–620.
• Sie, R. L. L., Van Engelen, B. J., Bitter-Rijpkema, M., & Sloep, P. B. (accepted). COCOON CORE: CO-author Recommendation based on Betweenness Centrality and Interest Similarity. Springer Volume on Recommender Systems for Technology Enhanced Learning: Research Trends & Applications, pp.
• Stahl, G. (2005). Group cognition in computer-assisted collaborative learning. Journal of Computer Assisted Learning, 21(2), 79–90. doi:10.1111/j.1365-2729.2005.00115.x
Networks are everywhere
Thank you for your attention!
rory.sie@ou.nl
http://www.open.ou.nl/rse
openrory, maisonpoublon
Rory Sie
openrse
http://nl.linkedin.com/in/rorysie
thebigbangrory.blogspot.com


Editor's notes

  1. Thank you for your attention, and I hope to see you at the Career day