SlideShare una empresa de Scribd logo
1 de 17
On cluster stability
Nees Jan van Eck
Centre for Science and Technology Studies (CWTS), Leiden University
15th International Conference on Scientometrics & Informetrics
Istanbul, Turkey, June 30, 2015
Introduction
• A clustering technique can be used to obtain highly
detailed clustering results (i.e., a large number of
clusters)
• A clustering technique can be used to force each
publication to be assigned to a cluster
• However, in a highly detailed clustering, is the
assignment of publications to clusters still meaningful?
1
Example: Waltman and Van Eck (2012)
2
Cluster stability
• To ensure that publications are assigned to clusters in a
meaningful way, we introduce the notion of stable
clusters
• Essentially, a cluster is stable if it is insensitive to small
changes in the underlying data
• Bootstrapping is used to make small changes in the data
3
Identification of stable clusters:
Step 1
• Collect the citation network of publications
• Create a large number (e.g., 100) of bootstrap citation
networks:
– A bootstrap citation network is a weighted variant of the original citation
network in which each edge has an integer weight drawn from a
Poisson distribution with mean 1 (cf. Rosvall & Bergstrom, 2009)
• In each bootstrap citation network, perform clustering
• For each pair of publications, calculate the proportion of
the bootstrap clustering results in which the publications
are in the same cluster
4
5
Original network Bootstrap networks
1
1
1
0
1
1
2
1
1
0
1
3
1
1
1 2
1
1
1
1
0
1
1
3
1
0
4
1
1
1
2 2
1
1
1
1
0
2
1
0
0
1
3
1
1
0
1 1
Clustering
1
1
1
0
1
1
2
1
1
0
1
3
1
0
1 2
1
1
0
0
1
1
1
3
0
0
4
1
1
1
0 2
1
1
1
1
0
1
1
0
0
1
3
1
2
0
1 1
1.0
0.9
0.9
0.4
0.6
0.9
0.9
0.9
0.1
0.1
0.9
1.0
0.9
0.5
0.9 1.0
Weighted network Clustered bootstrap networks
Identification of stable clusters:
Step 2
• Create a network of publications with an edge between
two publications if the publications are in the same
cluster in at least a certain proportion (e.g., 0.9) of the
bootstrap clustering results
• Identify connected components in the newly created
network
• Each connected component represents a stable cluster
6
1.0
0.9
0.9
0.4
0.6
0.9
0.9
0.9
0.1
0.1
0.9
1.0
0.9
0.5
0.9 1.0
Weighted network
7
Binary network
Connected components
Stable clusters
Data
• Library & Information Sciences (LIS):
– Time period: 1996-2013
– Publications: 31,534
– Citation links: 131,266
• Astrophysics (Berlin dataset):
– Time period: 2003-2010
– Publications: 101,828
– Citation links: 924,171
8
Cluster stability LIS
9
Stable clusters LIS (resolution 2)
10
Stable clusters LIS (resolution 2)
11
Cluster stability Berlin
12
Cluster stability
13
LIS Berlin
Conclusions
• If we want to have an accurate and detailed clustering,
we need to be satisfied with a clustering that doesn’t
comprehensively cover all publications
• Publications that do not clearly belong to one of the main
topics in a field cannot be assigned to a cluster
• Cluster stability analysis can be used to distinguish
between meaningful and non-meaningful assignments of
publications to clusters
14
Thank you for your attention!
15
References
Rosvall, M., & Bergstrom, C.T. (2009). Mapping change in large
networks. PLoS ONE, 5(1), e8694.
http://dx.doi.org/10.1371/journal.pone.0008694
Waltman, L., & Van Eck, N.J. (2012). A new methodology for
constructing a publication-level classification system of
science. JASIST, 63(12), 2378-2392.
http://dx.doi.org/10.1002/asi.22748
Waltman, L., & Van Eck, N.J. (2013). A smart local moving
algorithm for large-scale modularity-based community
detection. European Physical Journal B, 86(11), 471.
http://dx.doi.org/10.1140/epjb/e2013-40829-0
16

Más contenido relacionado

La actualidad más candente

Large-scale analysis of bibliometric data sources
Large-scale analysis of bibliometric data sourcesLarge-scale analysis of bibliometric data sources
Large-scale analysis of bibliometric data sourcesNees Jan van Eck
 
Multiple perspectives on bibliometric data
Multiple perspectives on bibliometric dataMultiple perspectives on bibliometric data
Multiple perspectives on bibliometric dataNees Jan van Eck
 
Large-scale analysis of bibliometric networks
Large-scale analysis of bibliometric networksLarge-scale analysis of bibliometric networks
Large-scale analysis of bibliometric networksNees Jan van Eck
 
Advanced citation matching and large-scale cited reference extraction
Advanced citation matching and large-scale cited reference extractionAdvanced citation matching and large-scale cited reference extraction
Advanced citation matching and large-scale cited reference extractionNees Jan van Eck
 
CWTS Leiden Ranking: An advanced bibliometric approach to university ranking
CWTS Leiden Ranking: An advanced bibliometric approach to university rankingCWTS Leiden Ranking: An advanced bibliometric approach to university ranking
CWTS Leiden Ranking: An advanced bibliometric approach to university rankingNees Jan van Eck
 
Science Mapping and Research Positioning
Science Mapping and Research PositioningScience Mapping and Research Positioning
Science Mapping and Research PositioningNees Jan van Eck
 
VOSviewer and CitNetExplorer Tutorial
VOSviewer and CitNetExplorer TutorialVOSviewer and CitNetExplorer Tutorial
VOSviewer and CitNetExplorer TutorialNees Jan van Eck
 
Visual exploration of scientific literature using VOSviewer and CitNetExplorer
Visual exploration of scientific literature using VOSviewer and CitNetExplorerVisual exploration of scientific literature using VOSviewer and CitNetExplorer
Visual exploration of scientific literature using VOSviewer and CitNetExplorerNees Jan van Eck
 
Intermediacy of publications
Intermediacy of publicationsIntermediacy of publications
Intermediacy of publicationsNees Jan van Eck
 
Open data sources in VOSviewer
Open data sources in VOSviewerOpen data sources in VOSviewer
Open data sources in VOSviewerNees Jan van Eck
 
VOSviewer and CitNetExplorer: Software tools for bibliometric analysis of s...
VOSviewer and CitNetExplorer: Software tools for bibliometric analysis of s...VOSviewer and CitNetExplorer: Software tools for bibliometric analysis of s...
VOSviewer and CitNetExplorer: Software tools for bibliometric analysis of s...Nees Jan van Eck
 
VOSviewer: A software tool for analyzing and visualizing scientific literature
VOSviewer: A software tool for analyzing and visualizing scientific literatureVOSviewer: A software tool for analyzing and visualizing scientific literature
VOSviewer: A software tool for analyzing and visualizing scientific literatureNees Jan van Eck
 
Large-scale visualization of science
Large-scale visualization of scienceLarge-scale visualization of science
Large-scale visualization of scienceNees Jan van Eck
 
Large-scale visualization of science: Methods, tools, and applications
Large-scale visualization of science: Methods, tools, and applicationsLarge-scale visualization of science: Methods, tools, and applications
Large-scale visualization of science: Methods, tools, and applicationsLudo Waltman
 
Bibliometric visualization using VOSviewer
Bibliometric visualization using VOSviewerBibliometric visualization using VOSviewer
Bibliometric visualization using VOSviewerLudo Waltman
 
Visualizing science based on open data sources
Visualizing science based on open data sourcesVisualizing science based on open data sources
Visualizing science based on open data sourcesNees Jan van Eck
 
Using full-text data to create improved term maps
Using full-text data to create improved term mapsUsing full-text data to create improved term maps
Using full-text data to create improved term mapsNees Jan van Eck
 
Scientometric approaches to classification
Scientometric approaches to classificationScientometric approaches to classification
Scientometric approaches to classificationNees Jan van Eck
 
Web of Science, Scopus, Dimensions, and beyond: The evolving landscape of bib...
Web of Science, Scopus, Dimensions, and beyond: The evolving landscape of bib...Web of Science, Scopus, Dimensions, and beyond: The evolving landscape of bib...
Web of Science, Scopus, Dimensions, and beyond: The evolving landscape of bib...Ludo Waltman
 
The landscape of research on research
The landscape of research on researchThe landscape of research on research
The landscape of research on researchLudo Waltman
 

La actualidad más candente (20)

Large-scale analysis of bibliometric data sources
Large-scale analysis of bibliometric data sourcesLarge-scale analysis of bibliometric data sources
Large-scale analysis of bibliometric data sources
 
Multiple perspectives on bibliometric data
Multiple perspectives on bibliometric dataMultiple perspectives on bibliometric data
Multiple perspectives on bibliometric data
 
Large-scale analysis of bibliometric networks
Large-scale analysis of bibliometric networksLarge-scale analysis of bibliometric networks
Large-scale analysis of bibliometric networks
 
Advanced citation matching and large-scale cited reference extraction
Advanced citation matching and large-scale cited reference extractionAdvanced citation matching and large-scale cited reference extraction
Advanced citation matching and large-scale cited reference extraction
 
CWTS Leiden Ranking: An advanced bibliometric approach to university ranking
CWTS Leiden Ranking: An advanced bibliometric approach to university rankingCWTS Leiden Ranking: An advanced bibliometric approach to university ranking
CWTS Leiden Ranking: An advanced bibliometric approach to university ranking
 
Science Mapping and Research Positioning
Science Mapping and Research PositioningScience Mapping and Research Positioning
Science Mapping and Research Positioning
 
VOSviewer and CitNetExplorer Tutorial
VOSviewer and CitNetExplorer TutorialVOSviewer and CitNetExplorer Tutorial
VOSviewer and CitNetExplorer Tutorial
 
Visual exploration of scientific literature using VOSviewer and CitNetExplorer
Visual exploration of scientific literature using VOSviewer and CitNetExplorerVisual exploration of scientific literature using VOSviewer and CitNetExplorer
Visual exploration of scientific literature using VOSviewer and CitNetExplorer
 
Intermediacy of publications
Intermediacy of publicationsIntermediacy of publications
Intermediacy of publications
 
Open data sources in VOSviewer
Open data sources in VOSviewerOpen data sources in VOSviewer
Open data sources in VOSviewer
 
VOSviewer and CitNetExplorer: Software tools for bibliometric analysis of s...
VOSviewer and CitNetExplorer: Software tools for bibliometric analysis of s...VOSviewer and CitNetExplorer: Software tools for bibliometric analysis of s...
VOSviewer and CitNetExplorer: Software tools for bibliometric analysis of s...
 
VOSviewer: A software tool for analyzing and visualizing scientific literature
VOSviewer: A software tool for analyzing and visualizing scientific literatureVOSviewer: A software tool for analyzing and visualizing scientific literature
VOSviewer: A software tool for analyzing and visualizing scientific literature
 
Large-scale visualization of science
Large-scale visualization of scienceLarge-scale visualization of science
Large-scale visualization of science
 
Large-scale visualization of science: Methods, tools, and applications
Large-scale visualization of science: Methods, tools, and applicationsLarge-scale visualization of science: Methods, tools, and applications
Large-scale visualization of science: Methods, tools, and applications
 
Bibliometric visualization using VOSviewer
Bibliometric visualization using VOSviewerBibliometric visualization using VOSviewer
Bibliometric visualization using VOSviewer
 
Visualizing science based on open data sources
Visualizing science based on open data sourcesVisualizing science based on open data sources
Visualizing science based on open data sources
 
Using full-text data to create improved term maps
Using full-text data to create improved term mapsUsing full-text data to create improved term maps
Using full-text data to create improved term maps
 
Scientometric approaches to classification
Scientometric approaches to classificationScientometric approaches to classification
Scientometric approaches to classification
 
Web of Science, Scopus, Dimensions, and beyond: The evolving landscape of bib...
Web of Science, Scopus, Dimensions, and beyond: The evolving landscape of bib...Web of Science, Scopus, Dimensions, and beyond: The evolving landscape of bib...
Web of Science, Scopus, Dimensions, and beyond: The evolving landscape of bib...
 
The landscape of research on research
The landscape of research on researchThe landscape of research on research
The landscape of research on research
 

Similar a On cluster stability

Text clustering
Text clusteringText clustering
Text clusteringKU Leuven
 
Energy efficient protocol with static clustering (eepsc) comparing with low e...
Energy efficient protocol with static clustering (eepsc) comparing with low e...Energy efficient protocol with static clustering (eepsc) comparing with low e...
Energy efficient protocol with static clustering (eepsc) comparing with low e...Alexander Decker
 
CERN User Story
CERN User StoryCERN User Story
CERN User StoryTim Bell
 
Based on Heterogeneity and Electing Probability of Nodes Improvement in LEACH
Based on Heterogeneity and Electing Probability of Nodes Improvement in LEACHBased on Heterogeneity and Electing Probability of Nodes Improvement in LEACH
Based on Heterogeneity and Electing Probability of Nodes Improvement in LEACHijsrd.com
 
Distributed Near Real-Time Processing of Sensor Network Data Flows for Smart ...
Distributed Near Real-Time Processing of Sensor Network Data Flows for Smart ...Distributed Near Real-Time Processing of Sensor Network Data Flows for Smart ...
Distributed Near Real-Time Processing of Sensor Network Data Flows for Smart ...Otávio Carvalho
 
Enhanced Leach Protocol
Enhanced Leach ProtocolEnhanced Leach Protocol
Enhanced Leach Protocolijceronline
 
An efficient tree based self-organizing protocol for internet of things
An efficient tree based self-organizing protocol for internet of thingsAn efficient tree based self-organizing protocol for internet of things
An efficient tree based self-organizing protocol for internet of thingsredpel dot com
 
Open Science Data Cloud (IEEE Cloud 2011)
Open Science Data Cloud (IEEE Cloud 2011)Open Science Data Cloud (IEEE Cloud 2011)
Open Science Data Cloud (IEEE Cloud 2011)Robert Grossman
 
algoritma klastering.pdf
algoritma klastering.pdfalgoritma klastering.pdf
algoritma klastering.pdfbintis1
 
Using the Open Science Data Cloud for Data Science Research
Using the Open Science Data Cloud for Data Science ResearchUsing the Open Science Data Cloud for Data Science Research
Using the Open Science Data Cloud for Data Science ResearchRobert Grossman
 
Slide-TIF311-DM-10-11.ppt
Slide-TIF311-DM-10-11.pptSlide-TIF311-DM-10-11.ppt
Slide-TIF311-DM-10-11.pptSandinoBerutu1
 
Slide-TIF311-DM-10-11.ppt
Slide-TIF311-DM-10-11.pptSlide-TIF311-DM-10-11.ppt
Slide-TIF311-DM-10-11.pptImXaib
 
A ROUTING PROTOCOL ORPHAN-LEACH TO JOIN ORPHAN NODES IN WIRELESS SENSOR NETWORK
A ROUTING PROTOCOL ORPHAN-LEACH TO JOIN ORPHAN NODES IN WIRELESS SENSOR NETWORKA ROUTING PROTOCOL ORPHAN-LEACH TO JOIN ORPHAN NODES IN WIRELESS SENSOR NETWORK
A ROUTING PROTOCOL ORPHAN-LEACH TO JOIN ORPHAN NODES IN WIRELESS SENSOR NETWORKcscpconf
 
A Routing Protocol Orphan-Leach to Join Orphan Nodes in Wireless Sensor Netwo...
A Routing Protocol Orphan-Leach to Join Orphan Nodes in Wireless Sensor Netwo...A Routing Protocol Orphan-Leach to Join Orphan Nodes in Wireless Sensor Netwo...
A Routing Protocol Orphan-Leach to Join Orphan Nodes in Wireless Sensor Netwo...csandit
 
IRJET- Study on Hierarchical Cluster-Based Energy-Efficient Routing in Wi...
IRJET-  	  Study on Hierarchical Cluster-Based Energy-Efficient Routing in Wi...IRJET-  	  Study on Hierarchical Cluster-Based Energy-Efficient Routing in Wi...
IRJET- Study on Hierarchical Cluster-Based Energy-Efficient Routing in Wi...IRJET Journal
 

Similar a On cluster stability (20)

Text clustering
Text clusteringText clustering
Text clustering
 
Energy efficient protocol with static clustering (eepsc) comparing with low e...
Energy efficient protocol with static clustering (eepsc) comparing with low e...Energy efficient protocol with static clustering (eepsc) comparing with low e...
Energy efficient protocol with static clustering (eepsc) comparing with low e...
 
C04511822
C04511822C04511822
C04511822
 
CERN User Story
CERN User StoryCERN User Story
CERN User Story
 
Scientific Publication Retrieval in Linked Data
Scientific Publication Retrieval in Linked DataScientific Publication Retrieval in Linked Data
Scientific Publication Retrieval in Linked Data
 
Based on Heterogeneity and Electing Probability of Nodes Improvement in LEACH
Based on Heterogeneity and Electing Probability of Nodes Improvement in LEACHBased on Heterogeneity and Electing Probability of Nodes Improvement in LEACH
Based on Heterogeneity and Electing Probability of Nodes Improvement in LEACH
 
I04503075078
I04503075078I04503075078
I04503075078
 
Distributed Near Real-Time Processing of Sensor Network Data Flows for Smart ...
Distributed Near Real-Time Processing of Sensor Network Data Flows for Smart ...Distributed Near Real-Time Processing of Sensor Network Data Flows for Smart ...
Distributed Near Real-Time Processing of Sensor Network Data Flows for Smart ...
 
Enhanced Leach Protocol
Enhanced Leach ProtocolEnhanced Leach Protocol
Enhanced Leach Protocol
 
An efficient tree based self-organizing protocol for internet of things
An efficient tree based self-organizing protocol for internet of thingsAn efficient tree based self-organizing protocol for internet of things
An efficient tree based self-organizing protocol for internet of things
 
Open Science Data Cloud (IEEE Cloud 2011)
Open Science Data Cloud (IEEE Cloud 2011)Open Science Data Cloud (IEEE Cloud 2011)
Open Science Data Cloud (IEEE Cloud 2011)
 
algoritma klastering.pdf
algoritma klastering.pdfalgoritma klastering.pdf
algoritma klastering.pdf
 
Using the Open Science Data Cloud for Data Science Research
Using the Open Science Data Cloud for Data Science ResearchUsing the Open Science Data Cloud for Data Science Research
Using the Open Science Data Cloud for Data Science Research
 
Ed33777782
Ed33777782Ed33777782
Ed33777782
 
Ed33777782
Ed33777782Ed33777782
Ed33777782
 
Slide-TIF311-DM-10-11.ppt
Slide-TIF311-DM-10-11.pptSlide-TIF311-DM-10-11.ppt
Slide-TIF311-DM-10-11.ppt
 
Slide-TIF311-DM-10-11.ppt
Slide-TIF311-DM-10-11.pptSlide-TIF311-DM-10-11.ppt
Slide-TIF311-DM-10-11.ppt
 
A ROUTING PROTOCOL ORPHAN-LEACH TO JOIN ORPHAN NODES IN WIRELESS SENSOR NETWORK
A ROUTING PROTOCOL ORPHAN-LEACH TO JOIN ORPHAN NODES IN WIRELESS SENSOR NETWORKA ROUTING PROTOCOL ORPHAN-LEACH TO JOIN ORPHAN NODES IN WIRELESS SENSOR NETWORK
A ROUTING PROTOCOL ORPHAN-LEACH TO JOIN ORPHAN NODES IN WIRELESS SENSOR NETWORK
 
A Routing Protocol Orphan-Leach to Join Orphan Nodes in Wireless Sensor Netwo...
A Routing Protocol Orphan-Leach to Join Orphan Nodes in Wireless Sensor Netwo...A Routing Protocol Orphan-Leach to Join Orphan Nodes in Wireless Sensor Netwo...
A Routing Protocol Orphan-Leach to Join Orphan Nodes in Wireless Sensor Netwo...
 
IRJET- Study on Hierarchical Cluster-Based Energy-Efficient Routing in Wi...
IRJET-  	  Study on Hierarchical Cluster-Based Energy-Efficient Routing in Wi...IRJET-  	  Study on Hierarchical Cluster-Based Energy-Efficient Routing in Wi...
IRJET- Study on Hierarchical Cluster-Based Energy-Efficient Routing in Wi...
 

Más de Nees Jan van Eck

Crossref as a source of open bibliographic metadata
Crossref as a source of open bibliographic metadataCrossref as a source of open bibliographic metadata
Crossref as a source of open bibliographic metadataNees Jan van Eck
 
Bibliometrische visualisaties voor het bijhouden van wetenschappelijke litera...
Bibliometrische visualisaties voor het bijhouden van wetenschappelijke litera...Bibliometrische visualisaties voor het bijhouden van wetenschappelijke litera...
Bibliometrische visualisaties voor het bijhouden van wetenschappelijke litera...Nees Jan van Eck
 
Community detection using citation relations and textual similarities in a la...
Community detection using citation relations and textual similarities in a la...Community detection using citation relations and textual similarities in a la...
Community detection using citation relations and textual similarities in a la...Nees Jan van Eck
 
Visualizing science using VOSviewer based on Crossref, Microsoft Academic, an...
Visualizing science using VOSviewer based on Crossref, Microsoft Academic, an...Visualizing science using VOSviewer based on Crossref, Microsoft Academic, an...
Visualizing science using VOSviewer based on Crossref, Microsoft Academic, an...Nees Jan van Eck
 
A scientometric perspective on university ranking
A scientometric perspective on university rankingA scientometric perspective on university ranking
A scientometric perspective on university rankingNees Jan van Eck
 
Open data sources in VOSviewer
Open data sources in VOSviewerOpen data sources in VOSviewer
Open data sources in VOSviewerNees Jan van Eck
 
A scientometric perspective on university ranking
A scientometric perspective on university rankingA scientometric perspective on university ranking
A scientometric perspective on university rankingNees Jan van Eck
 
CWTS Leiden Ranking: An advanced bibliometric approach to university ranking
CWTS Leiden Ranking: An advanced bibliometric approach to university rankingCWTS Leiden Ranking: An advanced bibliometric approach to university ranking
CWTS Leiden Ranking: An advanced bibliometric approach to university rankingNees Jan van Eck
 
Open data sources in VOSviewer
Open data sources in VOSviewerOpen data sources in VOSviewer
Open data sources in VOSviewerNees Jan van Eck
 
Accuracy of citation data in Web of Science and Scopus
Accuracy of citation data in Web of Science and ScopusAccuracy of citation data in Web of Science and Scopus
Accuracy of citation data in Web of Science and ScopusNees Jan van Eck
 
How to design a ranking system: Criteria and opportunities for a comparison
How to design a ranking system: Criteria and opportunities for a comparisonHow to design a ranking system: Criteria and opportunities for a comparison
How to design a ranking system: Criteria and opportunities for a comparisonNees Jan van Eck
 

Más de Nees Jan van Eck (11)

Crossref as a source of open bibliographic metadata
Crossref as a source of open bibliographic metadataCrossref as a source of open bibliographic metadata
Crossref as a source of open bibliographic metadata
 
Bibliometrische visualisaties voor het bijhouden van wetenschappelijke litera...
Bibliometrische visualisaties voor het bijhouden van wetenschappelijke litera...Bibliometrische visualisaties voor het bijhouden van wetenschappelijke litera...
Bibliometrische visualisaties voor het bijhouden van wetenschappelijke litera...
 
Community detection using citation relations and textual similarities in a la...
Community detection using citation relations and textual similarities in a la...Community detection using citation relations and textual similarities in a la...
Community detection using citation relations and textual similarities in a la...
 
Visualizing science using VOSviewer based on Crossref, Microsoft Academic, an...
Visualizing science using VOSviewer based on Crossref, Microsoft Academic, an...Visualizing science using VOSviewer based on Crossref, Microsoft Academic, an...
Visualizing science using VOSviewer based on Crossref, Microsoft Academic, an...
 
A scientometric perspective on university ranking
A scientometric perspective on university rankingA scientometric perspective on university ranking
A scientometric perspective on university ranking
 
Open data sources in VOSviewer
Open data sources in VOSviewerOpen data sources in VOSviewer
Open data sources in VOSviewer
 
A scientometric perspective on university ranking
A scientometric perspective on university rankingA scientometric perspective on university ranking
A scientometric perspective on university ranking
 
CWTS Leiden Ranking: An advanced bibliometric approach to university ranking
CWTS Leiden Ranking: An advanced bibliometric approach to university rankingCWTS Leiden Ranking: An advanced bibliometric approach to university ranking
CWTS Leiden Ranking: An advanced bibliometric approach to university ranking
 
Open data sources in VOSviewer
Open data sources in VOSviewerOpen data sources in VOSviewer
Open data sources in VOSviewer
 
Accuracy of citation data in Web of Science and Scopus
Accuracy of citation data in Web of Science and ScopusAccuracy of citation data in Web of Science and Scopus
Accuracy of citation data in Web of Science and Scopus
 
How to design a ranking system: Criteria and opportunities for a comparison
How to design a ranking system: Criteria and opportunities for a comparisonHow to design a ranking system: Criteria and opportunities for a comparison
How to design a ranking system: Criteria and opportunities for a comparison
 

Último

Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real timeSatoshi NAKAHIRA
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |aasikanpl
 
Boyles law module in the grade 10 science
Boyles law module in the grade 10 scienceBoyles law module in the grade 10 science
Boyles law module in the grade 10 sciencefloriejanemacaya1
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsSumit Kumar yadav
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPirithiRaju
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfSumit Kumar yadav
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxgindu3009
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsAArockiyaNisha
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRDelhi Call girls
 
Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PPRINCE C P
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisDiwakar Mishra
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)PraveenaKalaiselvan1
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptxRajatChauhan518211
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsSérgio Sacani
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptMAESTRELLAMesa2
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCEPRINCE C P
 
Broad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptxBroad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptxjana861314
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfmuntazimhurra
 

Último (20)

Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real time
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
 
Boyles law module in the grade 10 science
Boyles law module in the grade 10 scienceBoyles law module in the grade 10 science
Boyles law module in the grade 10 science
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questions
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdf
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptx
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based Nanomaterials
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C P
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptx
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.ppt
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
 
Broad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptxBroad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptx
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 

On cluster stability

  • 1. On cluster stability Nees Jan van Eck Centre for Science and Technology Studies (CWTS), Leiden University 15th International Conference on Scientometrics & Informetrics Istanbul, Turkey, June 30, 2015
  • 2. Introduction • A clustering technique can be used to obtain highly detailed clustering results (i.e., a large number of clusters) • A clustering technique can be used to force each publication to be assigned to a cluster • However, in a highly detailed clustering, is the assignment of publications to clusters still meaningful? 1
  • 3. Example: Waltman and Van Eck (2012) 2
  • 4. Cluster stability • To ensure that publications are assigned to clusters in a meaningful way, we introduce the notion of stable clusters • Essentially, a cluster is stable if it is insensitive to small changes in the underlying data • Bootstrapping is used to make small changes in the data 3
  • 5. Identification of stable clusters: Step 1 • Collect the citation network of publications • Create a large number (e.g., 100) of bootstrap citation networks: – A bootstrap citation network is a weighted variant of the original citation network in which each edge has an integer weight drawn from a Poisson distribution with mean 1 (cf. Rosvall & Bergstrom, 2009) • In each bootstrap citation network, perform clustering • For each pair of publications, calculate the proportion of the bootstrap clustering results in which the publications are in the same cluster 4
  • 6. 5 Original network Bootstrap networks 1 1 1 0 1 1 2 1 1 0 1 3 1 1 1 2 1 1 1 1 0 1 1 3 1 0 4 1 1 1 2 2 1 1 1 1 0 2 1 0 0 1 3 1 1 0 1 1 Clustering 1 1 1 0 1 1 2 1 1 0 1 3 1 0 1 2 1 1 0 0 1 1 1 3 0 0 4 1 1 1 0 2 1 1 1 1 0 1 1 0 0 1 3 1 2 0 1 1 1.0 0.9 0.9 0.4 0.6 0.9 0.9 0.9 0.1 0.1 0.9 1.0 0.9 0.5 0.9 1.0 Weighted network Clustered bootstrap networks
  • 7. Identification of stable clusters: Step 2 • Create a network of publications with an edge between two publications if the publications are in the same cluster in at least a certain proportion (e.g., 0.9) of the bootstrap clustering results • Identify connected components in the newly created network • Each connected component represents a stable cluster 6
  • 9. Data • Library & Information Sciences (LIS): – Time period: 1996-2013 – Publications: 31,534 – Citation links: 131,266 • Astrophysics (Berlin dataset): – Time period: 2003-2010 – Publications: 101,828 – Citation links: 924,171 8
  • 11. Stable clusters LIS (resolution 2) 10
  • 12. Stable clusters LIS (resolution 2) 11
  • 15. Conclusions • If we want to have an accurate and detailed clustering, we need to be satisfied with a clustering that doesn’t comprehensively cover all publications • Publications that do not clearly belong to one of the main topics in a field cannot be assigned to a cluster • Cluster stability analysis can be used to distinguish between meaningful and non-meaningful assignments of publications to clusters 14
  • 16. Thank you for your attention! 15
  • 17. References Rosvall, M., & Bergstrom, C.T. (2009). Mapping change in large networks. PLoS ONE, 5(1), e8694. http://dx.doi.org/10.1371/journal.pone.0008694 Waltman, L., & Van Eck, N.J. (2012). A new methodology for constructing a publication-level classification system of science. JASIST, 63(12), 2378-2392. http://dx.doi.org/10.1002/asi.22748 Waltman, L., & Van Eck, N.J. (2013). A smart local moving algorithm for large-scale modularity-based community detection. European Physical Journal B, 86(11), 471. http://dx.doi.org/10.1140/epjb/e2013-40829-0 16