20320140501002 2
International Journal of Advanced Research in Engineering and Technology (IJARET)
ISSN 0976 – 6480 (Print), ISSN 0976 – 6499 (Online)
Volume 5, Issue 1, January (2014), pp. 07-17
© IAEME: www.iaeme.com/ijaret.asp
Journal Impact Factor (2013): 5.8376 (Calculated by GISI) www.jifactor.com

WEB PAGE CLUSTERING USING CEMETERY ORGANIZATION BEHAVIOR OF ANTS

Priyank Thakkar1, Samir Kariya2, K Kotecha3

1 Assistant Professor, CSE Department, Institute of Technology, Nirma University, Ahmedabad - 382 481, Gujarat, India
2 Assistant Professor, IT Department, B. H. Gardi College of Engineering & Technology, Rajkot - 361 162, Gujarat, India
3 Director, Institute of Technology, Nirma University, Ahmedabad - 382 481, Gujarat, India

ABSTRACT

Clustering is the unsupervised classification of patterns (data items, observations or feature vectors) into groups (clusters). The clustering problem has been addressed by researchers of many disciplines in different contexts. Due to the escalating amount of data available online, the World Wide Web has become one of the most precious resources for information retrieval and knowledge discovery. Web mining technologies are the right solutions for knowledge discovery on the Web. In this paper, we focus on clustering web pages based on their content. A web page clustering system is valuable in web search for grouping search results into strongly related sets of documents. It can improve similarity search by focusing on sets of pertinent documents. At the same time, because a large variety of noisy information is embedded in web pages, web page clustering is much more intricate than pure-text clustering. This paper addresses the web page clustering problem through a technique inspired by the cemetery organization behavior of ants.
The proposed technique begins by reducing the dimensionality of the index of web pages through Latent Semantic Indexing (LSI). Web pages are then transformed to a two-dimensional grid space using the cemetery organization behavior of ants. Web pages represented in this two-dimensional grid space are finally clustered using the k-means algorithm. The paper also demonstrates the impact of dimensionality reduction by means of LSI, and of the distance measure, on web page clustering results.

Keywords: Web Page Clustering, Latent Semantic Indexing, Cemetery Organization Behavior of Ants.

1. INTRODUCTION
A web page clustering system can be valuable in web search for grouping search results into strongly associated sets of documents. It can improve similarity search by focusing on sets of pertinent documents [9]. At the same time, because a large variety of noisy information is embedded in web pages, web page clustering is much more intricate than pure-text clustering. Clustering also serves as a valuable technique for analyzing the Web. Duplications, patterns and other interesting structures on the Web can be exposed by matching the content-based clustering against the hyperlink structure. There is a range of clustering types, depending on the way clusters are characterized, the types of algorithms and the cluster properties employed for clustering [9].

2. WEB PAGE CLUSTERING

In web mining, web page clustering is one of the foremost preprocessing steps [9]. Web page clustering groups web pages based on similarity or another relationship measure.

2.1 Representation of Web Pages

As long as an appropriate representation of objects exists, clustering can be applied to any set of objects. The most general representation is the attribute-value (or feature-value) representation. In this representation a number of attributes (features) are identified for the entire population, and each object is represented by a set of attribute-value pairs. Alternatively, if one fixes the order of the features, a vector of values (a data point) can be used instead. The document vector space model is exactly this type of representation, where the features are terms [9].

2.1.1 Vector Space Model

In the vector space model, web pages are defined as vectors (or points) in a multidimensional Euclidean space where the terms represent the axes (dimensions).
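As a minimal sketch of this representation, the Python fragment below fixes the order of the terms once and then maps each page to a vector with one coordinate per term. The whitespace tokenizer, the toy two-page corpus and the helper names `build_vocabulary` and `to_vector` are illustrative assumptions, not part of the paper.

```python
from collections import Counter


def build_vocabulary(pages):
    """Return the sorted list of distinct terms; its order fixes the axes."""
    return sorted({term for page in pages for term in page.lower().split()})


def to_vector(page, vocabulary):
    """Map one page to a vector of raw term counts, one entry per axis."""
    counts = Counter(page.lower().split())
    return [counts[term] for term in vocabulary]


# Hypothetical toy corpus standing in for the text of two web pages.
pages = ["ants sort items", "ants move items and ants drop items"]
vocabulary = build_vocabulary(pages)
vectors = [to_vector(page, vocabulary) for page in pages]

print(vocabulary)
print(vectors)
```

Every page is mapped onto the same axes, so any two pages can be compared coordinate by coordinate. The raw counts used here are only one possible choice of coordinate.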
Depending on the type of vector components (coordinates), there are three essential versions of this representation: Boolean, term frequency (TF), and term frequency-inverse document frequency (TFIDF) [9]. The easiest way to use a term as a feature in document representation is to check whether the term is present in the document or not. The term is then treated as a Boolean attribute, and the representation is called Boolean.

In the term frequency (TF) approach, a function of the term counts, usually normalized by the document length, is used as the coordinates of the web document vector $\vec{d}$. For each term $t_i$ and document $d_j$ in the document collection $D$ (where each document is represented using $m$ terms), term frequency is defined as

$$TF(t_i, d_j) = \begin{cases} 0 & \text{if } n_{ij} = 0 \\ \dfrac{n_{ij}}{\sum_{k=1}^{m} n_{kj}} & \text{if } n_{ij} > 0 \end{cases} \qquad (1)$$

where $n_{ij}$ denotes the count of term $t_i$ in document $d_j$.

In the TFIDF representation, each component of the document vector is calculated as the product of the TF and IDF components:

$$d_j^{i} = TF(t_i, d_j)\, IDF(t_i) \qquad (2)$$

where the inverse document frequency of term $t_i$ is defined as

$$IDF(t_i) = \log \frac{1 + |D|}{|D_{t_i}|} \qquad (3)$$

with $D_{t_i}$ the set of documents in $D$ that contain $t_i$.

3. RELATED WORK
A new web page clustering algorithm, QDC, was proposed in [12]. In this algorithm, the user's query is used as part of a reliable measure of cluster quality. The five key novelties proposed in that paper are: use of the association between clusters and the query in a query-directed cluster quality guide; use of cluster description similarity in addition to cluster overlap to generate semantically coherent clusters; use of a new cluster splitting method to fix the cluster drifting (cluster chaining) problem; use of the query-directed cluster quality guide to improve the heuristic for cluster selection; and use of ranking of the pages by relevance to the cluster to improve clustering results.

Methods such as hierarchical (divisive and agglomerative) clustering, partitioning (k-medoids, k-means, probabilistic) approaches, density-based clustering, grid-based clustering, fuzzy c-means clustering, Kohonen self-organizing maps and many more [6, 7, 13, 15] have been used for web page clustering. Many algorithms, such as partitioning and hierarchical algorithms, use data similarity measures to build clusters; when applied directly to web page data, these similarity-based methods are not effective at producing semantically meaningful clusters.

A method based on Harmony Search (HS) optimization is proposed in [8] for web page clustering. By modeling clustering as an optimization problem, the authors propose a pure HS-based clustering algorithm that finds near-globally-optimal clusters within a reasonable time. They also hybridize K-means and harmony clustering to achieve better clustering results.

4. PROPOSED APPROACH

Our proposed approach follows the three steps discussed below.

4.1 Generating Term-Document Matrix

First, all the web pages are saved as text documents, and then the following steps are applied.
• Documents are tokenized by removing all punctuation marks, which leaves character strings without spaces.
• All characters in the document are converted to lowercase so that keyword matching can be carried out.
• All stop words are removed, and the resulting documents are used to generate the term-document matrix. We have used the TF-IDF representation of the documents.

4.2 Latent Semantic Indexing
LSI, proposed by Deerwester et al. [20], uses a statistical technique called Singular Value Decomposition (SVD) [21]. LSI begins with an m × n term-document matrix A. SVD factors the matrix A into the product of three matrices, i.e.

$$A = U \Sigma V \qquad (4)$$

where U and V are the matrices of the left and right singular vectors, respectively (in this notation the right singular vectors form the rows of V), and Σ is the diagonal matrix of singular values. A key feature of SVD is that we can delete some insignificant dimensions in the transformed space to optimally approximate the matrix A. The importance of the dimensions is indicated by the magnitudes of the singular values in Σ. LSI approximates A with a rank-k matrix

$$A_k = U_k \Sigma_k V_k \qquad (5)$$
where $U_k$ consists of the first k columns of the matrix U, $V_k$ consists of the first k rows of the matrix V, and $\Sigma_k = \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_k)$ contains the first k singular values. In our approach, we treat a document d as an m × 1 matrix and transform it into the reduced-dimension space using

$$\hat{d} = d^{T} U_k \Sigma_k \qquad (6)$$

where k is the rank of the approximation, which allows us to control the dimensionality of the document in the transformed space. After reducing the dimensionality of the index of the web pages as discussed above, we apply a clustering algorithm inspired by the cemetery organization behavior of ants. This algorithm maps the web pages onto a two-dimensional grid space and is discussed in the next sub-section.

4.3 Clustering Based on Cemetery Organization Behavior of Ants
The general idea is that isolated items (web pages in our case) should be picked up and dropped at some other location where more items of that type are present [18]. The algorithm pioneered by Lumer and Faieta [22] (LF) projects the space of attributes onto some lower-dimensional space, typically of dimension z = 2. The LF algorithm works as follows. Instead of embedding the set of web pages into $\mathbb{R}^2$, this embedding is approximated by considering a grid, that is, a subspace of $\mathbb{Z}^2$. Ants moving in this discrete space can directly perceive a surrounding region of area $s^2$ (a square $\mathrm{Neigh}_{(s \times s)}$ of $s \times s$ sites surrounding site r). Let $d(o_i, o_j)$ be the distance between two web pages $o_i$ and $o_j$ in the space of attributes. Let us also assume that an ant is located at site r at time t and finds a web page $o_i$ at that site.
The "local density" $f(o_i)$ with respect to web page $o_i$ at site r is given by

$$f(o_i) = \begin{cases} \dfrac{1}{s^2} \sum\limits_{o_j \in \mathrm{Neigh}_{(s \times s)}(r)} \left[ 1 - \dfrac{d(o_i, o_j)}{\alpha} \right] & \text{if } f > 0, \\ 0 & \text{otherwise} \end{cases} \qquad (7)$$

$f(o_i)$ is a measure of the average similarity of web page $o_i$ with the other web pages $o_j$ present in the neighborhood of $o_i$. The scale for dissimilarity is defined by α. It is important because it determines when two items should or should not be located next to each other. Lumer and Faieta defined the picking-up and dropping probabilities as follows:

$$p_p(o_i) = \left( \frac{k_1}{k_1 + f(o_i)} \right)^2 \qquad (8)$$

$$p_d(o_i) = \begin{cases} 2 f(o_i) & \text{if } f(o_i) < k_2, \\ 1 & \text{if } f(o_i) \ge k_2 \end{cases} \qquad (9)$$

where $k_1$ and $k_2$ are two constants.
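Equations (7) to (9) translate almost directly into code. The following is a minimal sketch in Python, not the implementation used in the experiments; the function names and the sample values of s, α, $k_1$ and $k_2$ are assumptions chosen only for illustration.

```python
def local_density(distances, s, alpha):
    """Eq. (7): average similarity of page o_i to the pages in its s x s neighbourhood.

    `distances` holds d(o_i, o_j) for every page o_j currently in the neighbourhood.
    """
    total = sum(1.0 - d / alpha for d in distances)
    return max(0.0, total / (s * s))  # negative values are clipped to 0


def pick_probability(f, k1):
    """Eq. (8): isolated pages (small f) are picked up with high probability."""
    return (k1 / (k1 + f)) ** 2


def drop_probability(f, k2):
    """Eq. (9): a carried page is dropped with certainty once f reaches k2."""
    return 2.0 * f if f < k2 else 1.0


# Illustrative values only (assumed, not taken from the paper).
s, alpha, k1, k2 = 3, 0.5, 0.1, 0.15
f = local_density([0.1, 0.2, 0.1], s, alpha)
p_pick = pick_probability(f, k1)
p_drop = drop_probability(f, k2)
```

With three close neighbours in a 3 × 3 window, f is about 0.24, so the pick-up probability is small and the drop probability is 1: a page surrounded by similar pages tends to stay where it is, and a carried page is readily dropped there.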
High-Level Description of the Lumer-Faieta Algorithm [18][22]

/* Initialization */
For every web page o_i do
    Place o_i randomly on grid
End For
For all ants do
    Place ant at randomly selected site
End For
/* Main loop */
For t = 1 to t_max do
    For all ants do
        If ((ant unladen) and (site occupied by web page o_i)) then
            Compute f(o_i) and p_p(o_i)
            Draw random real number R between 0 and 1
            If (R ≤ p_p(o_i)) then
                Pick up web page o_i
            End If
        Else If ((ant carrying web page o_i) and (site empty)) then
            Compute f(o_i) and p_d(o_i)
            Draw random real number R between 0 and 1
            If (R ≤ p_d(o_i)) then
                Drop web page o_i
            End If
        End If
        Move to randomly selected neighboring site not occupied by other ant
    End For
End For
Print location of web pages

4.4 Modifications to Lumer-Faieta Algorithm [18][22]
The Lumer and Faieta algorithm described above tends to produce more clusters than desired. This was also the case in our initial implementation. Lumer and Faieta suggested three features to correct this behavior.

• Ants with different moving speeds: Let v be the speed of an ant (v is the number of grid units walked per time unit by an ant along a given grid axis); v is distributed uniformly in $[1, v_{\max}]$. We use $v_{\max} = 6$ in our simulations. v also influences, through the function f, the tendency of an ant to either pick up or drop a web page:

$$f(o_i) = \begin{cases} \dfrac{1}{s^2} \sum\limits_{o_j \in \mathrm{Neigh}_{(s \times s)}(r)} \left[ 1 - \dfrac{d(o_i, o_j)}{\alpha \left( 1 + \dfrac{v - 1}{v_{\max}} \right)} \right] & \text{if } f > 0, \\ 0 & \text{otherwise} \end{cases} \qquad (10)$$
Therefore, slow-moving ants are more selective than fast ants in their judgment of the average similarity of a web page to its neighbors. Because the ants differ in speed, clusters form simultaneously at different scales.

• A short-term memory: Ants can memorize the last m web pages they have dropped, along with their locations. Each time an ant picks up a web page, it compares the properties of that page with those of the m memorized web pages and moves toward the location of the most similar one instead of moving randomly. This behavior reduces the number of equivalent clusters, since similar web pages have a low probability of initiating independent clusters.

• Behavioral switches: Web pages become less and less likely to be manipulated as clusters of similar objects form, so the system exhibits a kind of self-annealing. To allow a "heating up" of the system to escape local non-optimal configurations, Lumer and Faieta added the possibility for ants to start destroying clusters if they have not performed an action for a given number of time steps [18][22].

We have integrated the first two features in our implementation. Incorporating the third feature deteriorated the quality of the clusters formed in our simulations.

5. RESULTS AND DISCUSSION

5.1 DataSets
To verify the effectiveness of the web page clustering system proposed in this paper, we have conducted experiments on three different datasets, as shown in Table 1. The aim of using different datasets is to demonstrate the consistency of the proposed system. The first two datasets (Bank Search [25] and Syskill & Webert [23][24]) are publicly available, and the third dataset was created by us by downloading 364 pages from eight different categories (we name this dataset the "All Text Combine" dataset).
From the total available documents in the Bank Search dataset [25], we have selected 300 documents each from two different categories and used them in our simulations.

Table 1. Data Set Statistics

| Data Set | Number of Documents | Number of Attributes | Number of Clusters |
|---|---|---|---|
| Bank Search | 600 | 22513 | 2 |
| Syskill & Webert | 331 | 21231 | 4 |
| All Text Combine | 364 | 25927 | 8 |

5.2 Evaluation Measures
Assume a confusion matrix with m classes (rows) and k clusters (columns), as shown in Table 2. With $n_{ij}$ denoting the number of web pages from cluster j that belong to class i, precision and recall are defined as follows.

$$\text{Precision: } P(i, j) = \frac{n_{ij}}{\sum_{i=1}^{m} n_{ij}} \qquad (11)$$

$$\text{Recall: } R(i, j) = \frac{n_{ij}}{\sum_{j=1}^{k} n_{ij}} \qquad (12)$$
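As a small worked illustration of Eqs. (11) and (12), consider a hypothetical confusion matrix with two classes and two clusters. The counts below are invented for illustration and are not taken from the experiments reported here.

```latex
\[
\begin{array}{c|cc}
 & \text{cluster } 1 & \text{cluster } 2 \\ \hline
\text{class } 1 & n_{11} = 8 & n_{12} = 2 \\
\text{class } 2 & n_{21} = 1 & n_{22} = 9
\end{array}
\]
\[
P(1,1) = \frac{n_{11}}{n_{11} + n_{21}} = \frac{8}{9} \approx 0.889,
\qquad
R(1,1) = \frac{n_{11}}{n_{11} + n_{12}} = \frac{8}{10} = 0.8
\]
```

Precision divides by the column total (the size of cluster 1), while recall divides by the row total (the size of class 1).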
Table 2. Confusion matrix for m classes and k clusters

| Classes \ Clusters | 1 | … | j | … | k |
|---|---|---|---|---|---|
| 1 | $n_{11}$ | … | $n_{1j}$ | … | $n_{1k}$ |
| … | … | … | … | … | … |
| i | $n_{i1}$ | … | $n_{ij}$ | … | $n_{ik}$ |
| … | … | … | … | … | … |
| m | $n_{m1}$ | … | $n_{mj}$ | … | $n_{mk}$ |

We have used the F-measure as the evaluation measure for our system. The F-measure gives a more exact account of the error than overall accuracy. It is the harmonic mean of precision and recall:

$$F(i, j) = \frac{2 \cdot P(i, j) \cdot R(i, j)}{P(i, j) + R(i, j)} \qquad (13)$$

For each class, the maximum of F(i, j) over all clusters is taken, and these maxima are then summed across classes. As classes normally contain different numbers of documents, their contribution to the sum is weighted by the fraction of documents in each. Thus, we obtain the F-measure for the entire clustering:

$$F = \sum_{i=1}^{m} \frac{n_i}{n} \max_{j = 1, \ldots, k} F(i, j) \qquad (14)$$

where $n_i = \sum_{j=1}^{k} n_{ij}$ (the number of web pages belonging to class i, i.e. the total of row i) and $n = \sum_{i=1}^{m} \sum_{j=1}^{k} n_{ij}$ (the total number of web pages in the sample).

To reduce the stochastic noise in the evaluation, we ran our algorithm 10 times, and all the results presented in this paper are averaged over these 10 runs. Fig. 1, 2 and 3 depict the results on the Bank Search, Syskill & Webert and All Text Combine datasets, respectively. All the figures show web pages mapped onto the two-dimensional grid space. In our experiments, the best results on the Bank Search dataset are achieved when 1 − cosine similarity is used as the distance measure. For the other two datasets, the best results are produced when Euclidean distance is used as the distance measure.

Fig. 1: Results on Bank Search Data Set
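The overall F-measure of Eqs. (11) to (14) can be computed from a confusion matrix in a few lines. The sketch below is one possible Python rendering under stated assumptions: the function name and the sample matrix are hypothetical, the matrix is assumed to be non-empty, and F(i, j) is taken to be 0 when $n_{ij} = 0$.

```python
def overall_f_measure(confusion):
    """Eq. (14). confusion[i][j] = number of pages of class i placed in cluster j."""
    m = len(confusion)
    k = len(confusion[0])
    n = sum(sum(row) for row in confusion)
    # Column totals: size of each cluster (denominator of precision, Eq. 11).
    cluster_sizes = [sum(confusion[i][j] for i in range(m)) for j in range(k)]
    total = 0.0
    for i in range(m):
        n_i = sum(confusion[i])  # row total: size of class i (denominator of recall)
        best = 0.0
        for j in range(k):
            n_ij = confusion[i][j]
            if n_ij == 0:
                continue  # P = R = 0 here, so F(i, j) is treated as 0
            p = n_ij / cluster_sizes[j]
            r = n_ij / n_i
            best = max(best, 2 * p * r / (p + r))  # Eq. (13)
        total += (n_i / n) * best  # weight by the class's share of all pages
    return total


# Hypothetical 2-class, 2-cluster example (counts invented for illustration).
example = [[8, 2],
           [1, 9]]
f_example = overall_f_measure(example)
```

For this invented matrix the per-class maxima are 16/19 and 6/7, each weighted by 1/2, which gives an overall F-measure of 113/133, about 0.85.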
Fig. 2: Syskill and Webert Data Set

Once the web pages are mapped onto the two-dimensional space, the k-means algorithm is used to produce the final clustering. Tables 3, 4 and 5 present the final clustering results in terms of F-measure for the Bank Search, Syskill & Webert and All Text Combine data sets, respectively. We have evaluated our proposed system using both 1 − cosine similarity and Euclidean distance as the distance measure for all three data sets. The results also reveal the impact of dimensionality reduction on clustering, as we have varied the number of dimensions from 300 to 700. We have also run our algorithm without reducing the dimensionality of the web page corpus. However, the results are far superior when the dimensionality is reduced appropriately: on Bank Search with 1 − cosine similarity, for example, the F-measure is 0.8859 at 500 dimensions against 0.6185 with all 22513 attributes.

Fig. 3: All Text Combine Data Set

Table 3. F-Measure for Bank Search Data Set

| No. of Dimensions | Euclidean | 1 − Cosine Similarity |
|---|---|---|
| 300 | 0.6366 | 0.8340 |
| 400 | 0.6566 | 0.8747 |
| 500 | 0.6634 | 0.8859 |
| 600 | 0.6687 | 0.8800 |
| 700 | 0.6313 | 0.8523 |
| All (22513) | 0.5307 | 0.6185 |
As Table 3 shows, on the Bank Search dataset the best result, 0.8859, is obtained when the number of dimensions is 500 and 1 − cosine similarity is used as the distance measure.

Table 4. F-Measure for Syskill and Webert Data Set

| No. of Dimensions | Euclidean | 1 − Cosine Similarity |
|---|---|---|
| 300 | 0.8882 | 0.2910 |
| 400 | 0.9049 | 0.4012 |
| 500 | 0.9155 | 0.5084 |
| 600 | 0.9008 | 0.5113 |
| 700 | 0.8904 | 0.4788 |
| All (21231) | 0.6658 | 0.3196 |

As shown in Table 4, on the Syskill & Webert dataset the best result, 0.9155, is attained when the number of dimensions is 500 and Euclidean distance is used as the distance measure. On the All Text Combine dataset, as indicated in Table 5, the best result, 0.8964, is achieved when the number of dimensions is 500 and Euclidean distance is used as the distance measure. Fig. 4 depicts the F-measure over 10 different runs for each dataset. In each run, we have used 500 dimensions for each dataset.

Table 5. F-Measure for All Text Combine Data Set

| No. of Dimensions | Euclidean | 1 − Cosine Similarity |
|---|---|---|
| 300 | 0.8759 | 0.6527 |
| 400 | 0.8928 | 0.6859 |
| 500 | 0.8964 | 0.6911 |
| 600 | 0.8939 | 0.6966 |
| 700 | 0.8801 | 0.6759 |
| All (25927) | 0.6427 | 0.4745 |

Fig. 4: F-measure over 10 different runs for each data set
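The final step of the pipeline, running k-means on the grid coordinates produced by the ant-based stage, can be sketched as follows. The paper does not state which k-means implementation or initialization was used, so this is only an assumed, minimal version of Lloyd's iteration with caller-supplied initial centroids; the function name and the sample grid positions are hypothetical.

```python
def kmeans_2d(points, centroids, max_iter=100):
    """Lloyd's algorithm on 2-D grid positions; returns (labels, centroids)."""
    centroids = [tuple(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(len(centroids)),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                              + (p[1] - centroids[c][1]) ** 2)
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                new_centroids.append((sum(x for x, _ in members) / len(members),
                                      sum(y for _, y in members) / len(members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_labels == labels and new_centroids == centroids:
            break  # nothing changed: converged
        labels, centroids = new_labels, new_centroids
    return labels, centroids


# Hypothetical grid positions of six web pages after the ant-based stage.
grid_positions = [(1, 1), (2, 1), (1, 2), (10, 10), (11, 10), (10, 11)]
labels, centres = kmeans_2d(grid_positions, centroids=[(0, 0), (12, 12)])
```

Because the ant-based stage has already pulled similar pages together on the grid, k-means only has to separate spatially compact groups, which is the setting in which it works best.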
6. CONCLUSION AND FUTURE WORK
This paper proposes a technique that combines an algorithm inspired by the cemetery organization behavior of ants with k-means for clustering web pages. Latent Semantic Indexing (LSI) is first used to represent the documents in a lower-dimensional space, and then the algorithm based on the cemetery organization behavior of ants is used to map the web pages onto a two-dimensional grid space. Once the web pages are represented in the two-dimensional grid space, k-means is used to cluster them. Implementation results are promising and show the effectiveness of the proposed framework. The impact of dimensionality reduction is also demonstrated: selecting the right number of dimensions to represent web pages improves the clustering result, and the best F-measure on all three datasets was obtained at 500 dimensions. In future, we would like to incorporate link information among web pages in the representation and evaluate its impact on the results.

REFERENCES
1. K.-C. H. Chun-Wei Tsai and M-C. Chiang, "Ant colony optimization with dual pheromone tables for clustering", IEEE International Conference on Fuzzy Systems, June 2011.
2. H. K. and A. K., "Clustering Algorithm Employ in Web Usage Mining: An Overview", Bharati Vidyapeeths Institute of Computer Applications and Management, New Delhi, March 2011.
3. W. Xiong and C. Wang, "A novel hybrid clustering based on adaptive ACO and PSO", IEEE, 2011.
4. M. V. S. G. Mr. Pankaj K. Bharne and M. S. K. Yewale, "Data clustering algorithms based on swarm intelligence", IEEE, 2011.
5. O. M. Jafar and R. Sivakumar, "Ant-based clustering algorithms: A brief survey", International Journal of Computer Theory and Engineering, October 2010.
6. K. Gupta and M. Shrivastava, "Web usage mining clustering using hybrid FCM with GA", International Journal of Advanced Computer Research, June 2010.
7. V. B. Praveen, "Influence of various clustering algorithms on web personalization", Proceeding of the International Workshop on Machine Intelligence Research, 2009.
8. Rana Forsati, Mehrdad Mahdavi, Mohammadreza Kangavari and Banafsheh Safarkhani, "Web Page Clustering Using Harmony Search Optimization", Department of Computer Engineering, Tehran Azad University, Tehran, Iran, IEEE, 2008.
9. Z. Markov and D. T. Larose, "Data Mining The Web: Uncovering Patterns in Web Content, Structure, and Usage", John Wiley & Sons, 2007.
10. B. Liu, "Web Data-Mining: Exploring Hyperlinks, Content, and Usage Data", Springer, 2007.
11. J. Han and M. Kamber, "Data Mining: Concepts and Techniques", Elsevier Inc., 2006.
12. Daniel Crabtree, Peter Andreae and Xiaoying Gao, "Query Directed Web Page Clustering", Victoria University of Wellington, New Zealand, IEEE, 2006.
13. R. Xu and D. W. II, "Survey of Clustering Algorithms", IEEE Trans. on Neural Networks, May 2005.
14. L. Wanner, "Introduction to Clustering Techniques", July 2004.
15. Kate A. Smith and Alan Ng, "Web page clustering using a self-organizing map of user navigation patterns", Monash University, P.O. Box 63B, Victoria 3800, Australia, Elsevier Science, 2003.
16. Jerome Moore and Eui-Hong, "Web Page Categorizing and feature selection using Association Rule and Principal Component Clustering", University of Minnesota, IEEE, 2000.
17. A. K. Jain, M. N. Murty and P. J. Flynn, "Data clustering: A review", ACM Computing Surveys, September 1999.
18. E. Bonabeau, M. Dorigo and G. Theraulaz, "Swarm Intelligence: From Natural to Artificial Systems", Santa Fe Institute Studies in the Sciences of Complexity, Oxford University Press, 1999.
19. M. V. Shrivastava and M. N. Gupta, "Performance improvement of web usage mining by using learning based k-mean clustering", International Journal of Computer Science and its Applications.
20. S. Deerwester, S. T. Dumais, G. W. Furnas, T. K. Landauer and R. Harshman, "Indexing by Latent Semantic Analysis", Journal of the American Society for Information Science, 41, pp. 391–407, 1990.
21. G. H. Golub and C. F. Van Loan, "Matrix Computations", The Johns Hopkins University Press, 1983.
22. E. Lumer and B. Faieta, "Diversity and Adaptation in Populations of Clustering Ants", In Proceedings of the Third International Conference on Simulation of Adaptive Behavior: From Animals to Animats 3, pp. 499-508, Cambridge, MA: MIT Press, 1994.
23. Michael Pazzani, Jack Muramatsu and Daniel Billsus, "Syskill & Webert: Identifying Interesting Web Sites", In AAAI, Vol. 1 (1996), pp. 54-61.
24. K. Bache and M. Lichman, UCI Machine Learning Repository [http://archive.ics.uci.edu/ml], Irvine, CA: University of California, School of Information and Computer Science.
25. Mark P. Sinka and David W. Corne, "The BankSearch Web Document Dataset: Investigating Unsupervised Clustering and Category Similarity", Journal of Network and Computer Applications 28 (2004), pp. 129-146, Science Direct.
26. Alamelu Mangai J, Santhosh Kumar V and Sugumaran V, "Recent Research in Web Page Classification – A Review", International Journal of Computer Engineering & Technology (IJCET), Volume 1, Issue 1, 2010, pp. 112 - 122, ISSN Print: 0976 – 6367, ISSN Online: 0976 – 6375.
27. Sudip Kumar Sahana, Dr. Aruna Jain and Abijit Mustafi, "A Comparative Study on Multicast Routing using Dijkstra's, Prims and Ant Colony Systems", International Journal of Computer Engineering & Technology (IJCET), Volume 1, Issue 2, 2010, pp. 16 - 25, ISSN Print: 0976 – 6367, ISSN Online: 0976 – 6375.
28. R. Manickam, D. Boominath and V. Bhuvaneswari, "An Analysis of Data Mining: Past, Present and Future", International Journal of Computer Engineering & Technology (IJCET), Volume 3, Issue 1, 2012, pp. 1 - 9, ISSN Print: 0976 – 6367, ISSN Online: 0976 – 6375.