SlideShare una empresa de Scribd logo
1 de 39
Stack Overflow
Network Graph
Team: Heineken
Motivation
What we want?
- We want overall quality of a question
- We want to see how much the similar
questions related to the question
Solution
Visualize the Question Network by A Graph
- Inspired by http://www.visuwords.com/
Dataset
Dataset
How Can We Translate
the Questions Into
Network Graph?
Three Properties of Graph
1. Size of Node
2. Distance
Between Nodes
3. Directed Edges
1.Quality of A Question
2.Similarity Between
Questions (Content Based)
1.Link Between Questions
(Item based)
Quality
SimilarityAssociation
Question
Concept
Data Exploration
Data Exploration
Data Exploration
Data
Cleansing
Sed Pig Latin
Bash - Sed
Pig
Data Analysis
Cosine
Similarity
Aggregation
Integration
Hive
● Library: “tm"
● Data Format: Corpus
● Standardization:
○ Cleansing
○ SMART IR system
○ Porter Stemmer Approach
● Term Frequency matrix
○ Normalization: Cornell SMART system
● Generate Vector Space Model Matrix
● Dot Product of the Matrix
Finding Similarities Among the Questions
in RStudio With R Language
factorial number of similarities
Visualization
Discovery - First 300 Questions TfIdf
Discovery - First 300 Questions Bool
Discovery - Links Rank Validation
Initial
OpenOrd
Discovery - Links Rank Validation
Filter
Discovery - Links Rank Validation
Edge View
OpenOrd
Discovery - Links Rank Validation
GEXF
Deployment
Sigma.js
Graphs from Gephi can be exported but they are static
But what we want is for the user to interact with the graph
Sigma.js is a JavaScript library that renders graphs on web pages
Export graph from Gephi and parse it through Sigma.js
Sigma.js
Init function
Sigma.js
parse.Gexf function
Creating Nodes
Sigma.js
Creating Edges
Final Product
Code Available on Git Repo
Cheers!

Más contenido relacionado

La actualidad más candente

srd117.final.512Spring2016
srd117.final.512Spring2016srd117.final.512Spring2016
srd117.final.512Spring2016
Saurabh Deochake
 
Final Presentation
Final PresentationFinal Presentation
Final Presentation
Love Tyagi
 

La actualidad más candente (20)

Models for Information Retrieval and Recommendation
Models for Information Retrieval and RecommendationModels for Information Retrieval and Recommendation
Models for Information Retrieval and Recommendation
 
Team CDTW Capstone Presentation
Team CDTW Capstone Presentation Team CDTW Capstone Presentation
Team CDTW Capstone Presentation
 
srd117.final.512Spring2016
srd117.final.512Spring2016srd117.final.512Spring2016
srd117.final.512Spring2016
 
Automatic Classification of Springer Nature Proceedings with Smart Topic Miner
Automatic Classification of Springer Nature Proceedings with Smart Topic MinerAutomatic Classification of Springer Nature Proceedings with Smart Topic Miner
Automatic Classification of Springer Nature Proceedings with Smart Topic Miner
 
From TREC to Watson: is open domain question answering a solved problem?
From TREC to Watson: is open domain question answering a solved problem?From TREC to Watson: is open domain question answering a solved problem?
From TREC to Watson: is open domain question answering a solved problem?
 
Red Blue Presentation
Red Blue PresentationRed Blue Presentation
Red Blue Presentation
 
Answering complex open domain questions with multi-hop dense retrieval
Answering complex open domain questions with multi-hop dense retrievalAnswering complex open domain questions with multi-hop dense retrieval
Answering complex open domain questions with multi-hop dense retrieval
 
Similarity & Recommendation - CWI Scientific Meeting - Sep 27th, 2013
Similarity & Recommendation - CWI Scientific Meeting - Sep 27th, 2013Similarity & Recommendation - CWI Scientific Meeting - Sep 27th, 2013
Similarity & Recommendation - CWI Scientific Meeting - Sep 27th, 2013
 
Question Answering over Linked Data - Reasoning Issues
Question Answering over Linked Data - Reasoning IssuesQuestion Answering over Linked Data - Reasoning Issues
Question Answering over Linked Data - Reasoning Issues
 
MUDROD - Ranking
MUDROD - RankingMUDROD - Ranking
MUDROD - Ranking
 
Aspects of broad folksonomies
Aspects of broad folksonomiesAspects of broad folksonomies
Aspects of broad folksonomies
 
Selection of Tags for Tag Clouds
Selection of Tags for Tag CloudsSelection of Tags for Tag Clouds
Selection of Tags for Tag Clouds
 
The Largest Data Science Program in the World: The Johns Hopkins Data Science...
The Largest Data Science Program in the World: The Johns Hopkins Data Science...The Largest Data Science Program in the World: The Johns Hopkins Data Science...
The Largest Data Science Program in the World: The Johns Hopkins Data Science...
 
Social Phrases Having Impact in Altmetrics - SOPHIA
Social Phrases Having Impact in Altmetrics - SOPHIASocial Phrases Having Impact in Altmetrics - SOPHIA
Social Phrases Having Impact in Altmetrics - SOPHIA
 
Final Presentation
Final PresentationFinal Presentation
Final Presentation
 
Detection of Embryonic Research Topics by Analysing Semantic Topic Networks
Detection of Embryonic Research Topics by Analysing Semantic Topic NetworksDetection of Embryonic Research Topics by Analysing Semantic Topic Networks
Detection of Embryonic Research Topics by Analysing Semantic Topic Networks
 
A Knowledge Discovery Framework for Planetary Defense
A Knowledge Discovery Framework for Planetary DefenseA Knowledge Discovery Framework for Planetary Defense
A Knowledge Discovery Framework for Planetary Defense
 
QALL-ME: Ontology and Semantic Web
QALL-ME: Ontology and Semantic WebQALL-ME: Ontology and Semantic Web
QALL-ME: Ontology and Semantic Web
 
BDACA1516s2 - Lecture3
BDACA1516s2 - Lecture3BDACA1516s2 - Lecture3
BDACA1516s2 - Lecture3
 
Data Analytics Capstone
Data Analytics CapstoneData Analytics Capstone
Data Analytics Capstone
 

Destacado

Object_Recognition_using_Deeplearning_Report.pptx
Object_Recognition_using_Deeplearning_Report.pptxObject_Recognition_using_Deeplearning_Report.pptx
Object_Recognition_using_Deeplearning_Report.pptx
Yaopeng (Gyoho) Wu
 

Destacado (20)

Object_Recognition_using_Deeplearning_Report.pptx
Object_Recognition_using_Deeplearning_Report.pptxObject_Recognition_using_Deeplearning_Report.pptx
Object_Recognition_using_Deeplearning_Report.pptx
 
Implementación Repositorio De Objetos De Aprendizajes Basado En
Implementación Repositorio De Objetos De Aprendizajes Basado EnImplementación Repositorio De Objetos De Aprendizajes Basado En
Implementación Repositorio De Objetos De Aprendizajes Basado En
 
groovy & grails - lecture 13
groovy & grails - lecture 13groovy & grails - lecture 13
groovy & grails - lecture 13
 
What is Node.js used for: The 2015 Node.js Overview Report
What is Node.js used for: The 2015 Node.js Overview ReportWhat is Node.js used for: The 2015 Node.js Overview Report
What is Node.js used for: The 2015 Node.js Overview Report
 
Repositorio Institucional para el manejo de Investigaciones de la UNAN-Manag...
 Repositorio Institucional para el manejo de Investigaciones de la UNAN-Manag... Repositorio Institucional para el manejo de Investigaciones de la UNAN-Manag...
Repositorio Institucional para el manejo de Investigaciones de la UNAN-Manag...
 
Soluciones tecnológicas para REA
Soluciones tecnológicas para REASoluciones tecnológicas para REA
Soluciones tecnológicas para REA
 
Presentacion MoodleMoot 2014 Colombia - Integración Moodle con un Repositorio...
Presentacion MoodleMoot 2014 Colombia - Integración Moodle con un Repositorio...Presentacion MoodleMoot 2014 Colombia - Integración Moodle con un Repositorio...
Presentacion MoodleMoot 2014 Colombia - Integración Moodle con un Repositorio...
 
Responsive Design
Responsive DesignResponsive Design
Responsive Design
 
Stack Overflow - It's all about performance / Marco Cecconi (Stack Overflow)
Stack Overflow - It's all about performance / Marco Cecconi (Stack Overflow)Stack Overflow - It's all about performance / Marco Cecconi (Stack Overflow)
Stack Overflow - It's all about performance / Marco Cecconi (Stack Overflow)
 
Modern HTML & CSS Coding: Speed, Semantics & Structure
Modern HTML & CSS Coding: Speed, Semantics & StructureModern HTML & CSS Coding: Speed, Semantics & Structure
Modern HTML & CSS Coding: Speed, Semantics & Structure
 
StrongLoop Overview
StrongLoop OverviewStrongLoop Overview
StrongLoop Overview
 
Why Scala for Web 2.0?
Why Scala for Web 2.0?Why Scala for Web 2.0?
Why Scala for Web 2.0?
 
Curso avanzado de capacitación en DSpace
Curso avanzado de capacitación en DSpaceCurso avanzado de capacitación en DSpace
Curso avanzado de capacitación en DSpace
 
Asynchronous programming done right - Node.js
Asynchronous programming done right - Node.jsAsynchronous programming done right - Node.js
Asynchronous programming done right - Node.js
 
Html5 devconf nodejs_devops_shubhra
Html5 devconf nodejs_devops_shubhraHtml5 devconf nodejs_devops_shubhra
Html5 devconf nodejs_devops_shubhra
 
Toronto node js_meetup
Toronto node js_meetupToronto node js_meetup
Toronto node js_meetup
 
Webstock 2010 - Stack Overflow: Building Social Software for the Anti-Social
Webstock 2010 - Stack Overflow: Building Social Software for the Anti-SocialWebstock 2010 - Stack Overflow: Building Social Software for the Anti-Social
Webstock 2010 - Stack Overflow: Building Social Software for the Anti-Social
 
Node.js Frameworks & Design Patterns Webinar
Node.js Frameworks & Design Patterns WebinarNode.js Frameworks & Design Patterns Webinar
Node.js Frameworks & Design Patterns Webinar
 
Introducing the New DSpace User Interface
Introducing the New DSpace User InterfaceIntroducing the New DSpace User Interface
Introducing the New DSpace User Interface
 
STACK OVERFLOW DATASET ANALYSIS
STACK OVERFLOW DATASET ANALYSISSTACK OVERFLOW DATASET ANALYSIS
STACK OVERFLOW DATASET ANALYSIS
 

Similar a Stack_Overflow-Network_Graph

FOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter Boncz
FOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter BonczFOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter Boncz
FOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter Boncz
Ioan Toma
 
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
Joaquin Delgado PhD.
 
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
 RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning... RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
S. Diana Hu
 
Rated Ranking Evaluator: An Open Source Approach for Search Quality Evaluation
Rated Ranking Evaluator: An Open Source Approach for Search Quality EvaluationRated Ranking Evaluator: An Open Source Approach for Search Quality Evaluation
Rated Ranking Evaluator: An Open Source Approach for Search Quality Evaluation
Alessandro Benedetti
 
Haystack 2019 - Rated Ranking Evaluator: an Open Source Approach for Search Q...
Haystack 2019 - Rated Ranking Evaluator: an Open Source Approach for Search Q...Haystack 2019 - Rated Ranking Evaluator: an Open Source Approach for Search Q...
Haystack 2019 - Rated Ranking Evaluator: an Open Source Approach for Search Q...
OpenSource Connections
 

Similar a Stack_Overflow-Network_Graph (20)

FOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter Boncz
FOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter BonczFOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter Boncz
FOSDEM2014 - Social Network Benchmark (SNB) Graph Generator - Peter Boncz
 
Using Linked Data Traversal to Label Academic Communities - SAVE-SD2015
Using Linked Data Traversal to Label Academic Communities - SAVE-SD2015Using Linked Data Traversal to Label Academic Communities - SAVE-SD2015
Using Linked Data Traversal to Label Academic Communities - SAVE-SD2015
 
Question answering in linked data
Question answering in linked dataQuestion answering in linked data
Question answering in linked data
 
Extracting Relevant Questions to an RDF Dataset Using Formal Concept Analysis
Extracting Relevant Questions to an RDF Dataset Using Formal Concept AnalysisExtracting Relevant Questions to an RDF Dataset Using Formal Concept Analysis
Extracting Relevant Questions to an RDF Dataset Using Formal Concept Analysis
 
Data Science Keys to Open Up OpenNASA Datasets
Data Science Keys to Open Up OpenNASA DatasetsData Science Keys to Open Up OpenNASA Datasets
Data Science Keys to Open Up OpenNASA Datasets
 
Data Science Keys to Open Up OpenNASA Datasets - PyData New York 2017
Data Science Keys to Open Up OpenNASA Datasets - PyData New York 2017Data Science Keys to Open Up OpenNASA Datasets - PyData New York 2017
Data Science Keys to Open Up OpenNASA Datasets - PyData New York 2017
 
NLP & DBpedia
 NLP & DBpedia NLP & DBpedia
NLP & DBpedia
 
keyword query routing
keyword query routingkeyword query routing
keyword query routing
 
IEEE 2014 JAVA DATA MINING PROJECTS Keyword query routing
IEEE 2014 JAVA DATA MINING PROJECTS Keyword query routingIEEE 2014 JAVA DATA MINING PROJECTS Keyword query routing
IEEE 2014 JAVA DATA MINING PROJECTS Keyword query routing
 
2014 IEEE JAVA DATA MINING PROJECT Keyword query routing
2014 IEEE JAVA DATA MINING PROJECT Keyword query routing2014 IEEE JAVA DATA MINING PROJECT Keyword query routing
2014 IEEE JAVA DATA MINING PROJECT Keyword query routing
 
Predicting query performance and explaining results to assist Linked Data con...
Predicting query performance and explaining results to assist Linked Data con...Predicting query performance and explaining results to assist Linked Data con...
Predicting query performance and explaining results to assist Linked Data con...
 
Final presentation
Final presentationFinal presentation
Final presentation
 
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
 
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
 RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning... RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
 
Search Quality Evaluation to Help Reproducibility : an Open Source Approach
Search Quality Evaluation to Help Reproducibility : an Open Source ApproachSearch Quality Evaluation to Help Reproducibility : an Open Source Approach
Search Quality Evaluation to Help Reproducibility : an Open Source Approach
 
Graph Analytics: Graph Algorithms Inside Neo4j
Graph Analytics: Graph Algorithms Inside Neo4jGraph Analytics: Graph Algorithms Inside Neo4j
Graph Analytics: Graph Algorithms Inside Neo4j
 
Search Quality Evaluation to Help Reproducibility: An Open-source Approach
Search Quality Evaluation to Help Reproducibility: An Open-source ApproachSearch Quality Evaluation to Help Reproducibility: An Open-source Approach
Search Quality Evaluation to Help Reproducibility: An Open-source Approach
 
Rated Ranking Evaluator: An Open Source Approach for Search Quality Evaluation
Rated Ranking Evaluator: An Open Source Approach for Search Quality EvaluationRated Ranking Evaluator: An Open Source Approach for Search Quality Evaluation
Rated Ranking Evaluator: An Open Source Approach for Search Quality Evaluation
 
Haystack 2019 - Rated Ranking Evaluator: an Open Source Approach for Search Q...
Haystack 2019 - Rated Ranking Evaluator: an Open Source Approach for Search Q...Haystack 2019 - Rated Ranking Evaluator: an Open Source Approach for Search Q...
Haystack 2019 - Rated Ranking Evaluator: an Open Source Approach for Search Q...
 
Rated Ranking Evaluator: an Open Source Approach for Search Quality Evaluation
Rated Ranking Evaluator: an Open Source Approach for Search Quality EvaluationRated Ranking Evaluator: an Open Source Approach for Search Quality Evaluation
Rated Ranking Evaluator: an Open Source Approach for Search Quality Evaluation
 

Stack_Overflow-Network_Graph

Notas del editor

  1. Most of the time by simply visualizing it, we can solve a lot of problems.