SlideShare una empresa de Scribd logo
1 de 12
Web Mining
Prepared by
Dr. Swarnalatha K.S
Professor, Dept. of ISE
NMIT
Table of contents
• Introduction to web mining
• Types of web mining
• Challenges in web mining
• Applications of web mining
• Over the last few years, the World Wide Web has become a
significant source of information and simultaneously a
popular platform for business.
• Web mining can define as the method of utilizing data
mining techniques and algorithms to extract useful
information directly from the web, such as Web documents
and services, hyperlinks, Web content, and server logs.
• The World Wide Web contains a large amount of data that
provides a rich source to data mining. The objective of Web
mining is to look for patterns in Web data by collecting and
examining data in order to gain insights.
What is Web Mining?
• Web mining can widely be seen as the application of
adapted data mining techniques to the web, whereas data
mining is defined as the application of the algorithm to
discover patterns on mostly structured data embedded into
a knowledge discovery process.
• Web mining has a distinctive property to provide a set of
various data types. The web has multiple aspects that yield
different approaches for the mining process, such as web
pages consist of text, web pages are linked via hyperlinks,
and user activity can be monitored via web server logs.
These three features lead to the differentiation between
the three areas are web content mining, web structure
mining, web usage mining.
Types of web mining
1. Web Content Mining:
• Web content mining can be used to extract useful data,
information, knowledge from the web page content.
• In web content mining, each web page is considered as an
individual document.
• The individual can take advantage of the semi-structured nature of
web pages, as HTML provides information that concerns not only
the layout but also logical structure.
• The primary task of content mining is data extraction, where
structured data is extracted from unstructured websites. The
objective is to facilitate data aggregation over various web sites by
using the extracted structured data.
• Web content mining can be utilized to distinguish topics on the
web. For Example, if any user searches for a specific task on the
search engine, then the user will get a list of suggestions.
2. Web Structured Mining:
• The web structure mining can be used to find the link structure of
hyperlink. It is used to identify that data either link the web pages or
direct link network.
• In Web Structure Mining, an individual considers the web as a directed
graph, with the web pages being the vertices that are associated with
hyperlinks.
• The most important application in this regard is the Google search engine,
which estimates the ranking of its outcomes primarily with the PageRank
algorithm.
• It characterizes a page to be exceptionally relevant when frequently
connected by other highly related pages. Structure and content mining
methodologies are usually combined.
• For example, web structured mining can be beneficial to organizations to
regulate the network between two commercial sites.
Web Usage Mining:
• Web usage mining is used to extract useful data, information,
knowledge from the weblog records, and assists in recognizing the
user access patterns for web pages.
• In Mining, the usage of web resources, the individual is thinking
about records of requests of visitors of a website, that are often
collected as web server logs.
• While the content and structure of the collection of web pages
follow the intentions of the authors of the pages, the individual
requests demonstrate how the consumers see these pages.
• Web usage mining may disclose relationships that were not
proposed by the creator of the pages.
• Some of the methods to identify and analyze the web usage
patterns are given below:
I. Session and visitor analysis:
• The analysis of preprocessed data can be
accomplished in session analysis, which
incorporates the guest records, days, time,
sessions, etc. This data can be utilized to
analyze the visitor's behavior.
• The document is created after this analysis,
which contains the details of repeatedly
visited web pages, common entry, and exit.
II. OLAP (Online Analytical Processing):
• OLAP accomplishes a multidimensional
analysis of advanced data.
• OLAP can be accomplished on various parts of
log related data in a specific period.
• OLAP tools can be used to infer important
business intelligence metrics
Challenges in Web Mining:
the web pretends incredible challenges for resources, and knowledge
discovery based on the following observations:
The complexity of web pages:
• The site pages don't have a unifying structure. They are extremely
complicated as compared to traditional text documents. There are
enormous amounts of documents in the digital library of the web.
These libraries are not organized according to a specific order.
The web is a dynamic data source:
• The data on the internet is quickly updated. For example, news,
climate, shopping, financial news, sports, and so on.
Diversity of client networks:
• The client network on the web is quickly expanding. These clients
have different interests, backgrounds, and usage purposes. There
are over a hundred million workstations that are associated with
the internet and still increasing tremendously.
Relevancy of data:
• It is considered that a specific person is generally
concerned about a small portion of the web,
while the rest of the segment of the web contains
the data that is not familiar to the user and may
lead to unwanted results.
The web is too broad:
• The size of the web is tremendous and rapidly
increasing. It appears that the web is too huge for
data warehousing and data mining.
Application of Web Mining:
Web mining has an extensive application because of
various uses of the web. The list of some
applications of web mining is given below.
• Marketing and conversion tool
• Data analysis on website and application
accomplishment.
• Audience behavior analysis
• Advertising and campaign accomplishment
analysis.
• Testing and analysis of a site.

Más contenido relacionado

La actualidad más candente

Planing and optimizing data lake architecture
Planing and optimizing data lake architecturePlaning and optimizing data lake architecture
Planing and optimizing data lake architectureMilos Milovanovic
 
Hadoop data-lake-white-paper
Hadoop data-lake-white-paperHadoop data-lake-white-paper
Hadoop data-lake-white-paperSupratim Ray
 
The Data Lake and Getting Buisnesses the Big Data Insights They Need
The Data Lake and Getting Buisnesses the Big Data Insights They NeedThe Data Lake and Getting Buisnesses the Big Data Insights They Need
The Data Lake and Getting Buisnesses the Big Data Insights They NeedDunn Solutions Group
 
5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data Lake5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data LakeMetroStar
 
Building the Enterprise Data Lake: A look at architecture
Building the Enterprise Data Lake: A look at architectureBuilding the Enterprise Data Lake: A look at architecture
Building the Enterprise Data Lake: A look at architecturemark madsen
 
From Hadoop to Enterprise Data Warehouse
From Hadoop to Enterprise Data WarehouseFrom Hadoop to Enterprise Data Warehouse
From Hadoop to Enterprise Data WarehouseBui Ha
 
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...Hortonworks
 
Microsoft Data Platform - What's included
Microsoft Data Platform - What's includedMicrosoft Data Platform - What's included
Microsoft Data Platform - What's includedJames Serra
 
Integrating hadoop - Big Data TechCon 2013
Integrating hadoop - Big Data TechCon 2013Integrating hadoop - Big Data TechCon 2013
Integrating hadoop - Big Data TechCon 2013Jonathan Seidman
 
Research on vector spatial data storage scheme based
Research on vector spatial data storage scheme basedResearch on vector spatial data storage scheme based
Research on vector spatial data storage scheme basedAnant Kumar
 
Introduction to Data Engineering
Introduction to Data EngineeringIntroduction to Data Engineering
Introduction to Data EngineeringDurga Gadiraju
 
Enterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable DigitalEnterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable Digitalsambiswal
 
Cloud Storage Spring Cleaning: A Treasure Hunt
Cloud Storage Spring Cleaning: A Treasure HuntCloud Storage Spring Cleaning: A Treasure Hunt
Cloud Storage Spring Cleaning: A Treasure HuntSteven Moy
 
Practical guide to architecting data lakes - Avinash Ramineni - Phoenix Data...
Practical guide to architecting data lakes -  Avinash Ramineni - Phoenix Data...Practical guide to architecting data lakes -  Avinash Ramineni - Phoenix Data...
Practical guide to architecting data lakes - Avinash Ramineni - Phoenix Data...Avinash Ramineni
 
Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...
Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...
Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...Zaloni
 
Hadoop Powers Modern Enterprise Data Architectures
Hadoop Powers Modern Enterprise Data ArchitecturesHadoop Powers Modern Enterprise Data Architectures
Hadoop Powers Modern Enterprise Data ArchitecturesDataWorks Summit
 

La actualidad más candente (20)

Planing and optimizing data lake architecture
Planing and optimizing data lake architecturePlaning and optimizing data lake architecture
Planing and optimizing data lake architecture
 
Data Lake,beyond the Data Warehouse
Data Lake,beyond the Data WarehouseData Lake,beyond the Data Warehouse
Data Lake,beyond the Data Warehouse
 
Hadoop data-lake-white-paper
Hadoop data-lake-white-paperHadoop data-lake-white-paper
Hadoop data-lake-white-paper
 
The Data Lake and Getting Buisnesses the Big Data Insights They Need
The Data Lake and Getting Buisnesses the Big Data Insights They NeedThe Data Lake and Getting Buisnesses the Big Data Insights They Need
The Data Lake and Getting Buisnesses the Big Data Insights They Need
 
5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data Lake5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data Lake
 
Building the Enterprise Data Lake: A look at architecture
Building the Enterprise Data Lake: A look at architectureBuilding the Enterprise Data Lake: A look at architecture
Building the Enterprise Data Lake: A look at architecture
 
From Hadoop to Enterprise Data Warehouse
From Hadoop to Enterprise Data WarehouseFrom Hadoop to Enterprise Data Warehouse
From Hadoop to Enterprise Data Warehouse
 
Disaster Recovery Site Implementation with MySQL
Disaster Recovery Site Implementation with MySQLDisaster Recovery Site Implementation with MySQL
Disaster Recovery Site Implementation with MySQL
 
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
 
Microsoft Data Platform - What's included
Microsoft Data Platform - What's includedMicrosoft Data Platform - What's included
Microsoft Data Platform - What's included
 
Integrating hadoop - Big Data TechCon 2013
Integrating hadoop - Big Data TechCon 2013Integrating hadoop - Big Data TechCon 2013
Integrating hadoop - Big Data TechCon 2013
 
Research on vector spatial data storage scheme based
Research on vector spatial data storage scheme basedResearch on vector spatial data storage scheme based
Research on vector spatial data storage scheme based
 
Introduction to Data Engineering
Introduction to Data EngineeringIntroduction to Data Engineering
Introduction to Data Engineering
 
Enterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable DigitalEnterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable Digital
 
Cloud Storage Spring Cleaning: A Treasure Hunt
Cloud Storage Spring Cleaning: A Treasure HuntCloud Storage Spring Cleaning: A Treasure Hunt
Cloud Storage Spring Cleaning: A Treasure Hunt
 
SQL Server Disaster Recovery Implementation
SQL Server Disaster Recovery ImplementationSQL Server Disaster Recovery Implementation
SQL Server Disaster Recovery Implementation
 
Practical guide to architecting data lakes - Avinash Ramineni - Phoenix Data...
Practical guide to architecting data lakes -  Avinash Ramineni - Phoenix Data...Practical guide to architecting data lakes -  Avinash Ramineni - Phoenix Data...
Practical guide to architecting data lakes - Avinash Ramineni - Phoenix Data...
 
Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...
Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...
Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...
 
2022 02 Integration Bootcamp
2022 02 Integration Bootcamp2022 02 Integration Bootcamp
2022 02 Integration Bootcamp
 
Hadoop Powers Modern Enterprise Data Architectures
Hadoop Powers Modern Enterprise Data ArchitecturesHadoop Powers Modern Enterprise Data Architectures
Hadoop Powers Modern Enterprise Data Architectures
 

Similar a Web mining

Personal Web Usage Mining
Personal Web Usage MiningPersonal Web Usage Mining
Personal Web Usage MiningDaminda Herath
 
Personal web usage mining
Personal web usage miningPersonal web usage mining
Personal web usage miningDaminda Herath
 
RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING ijcax
 
RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING ijcax
 
RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING ijcax
 
RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING ijcax
 
RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING ijcax
 
RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING ijcax
 
RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING ijcax
 
WEB BASED INFORMATION RETRIEVAL SYSTEM
WEB BASED INFORMATION RETRIEVAL SYSTEMWEB BASED INFORMATION RETRIEVAL SYSTEM
WEB BASED INFORMATION RETRIEVAL SYSTEMSai Kumar Ale
 
WEBMINING_SOWMYAJYOTHI.pdf
WEBMINING_SOWMYAJYOTHI.pdfWEBMINING_SOWMYAJYOTHI.pdf
WEBMINING_SOWMYAJYOTHI.pdfSowmyaJyothi3
 
Introduction to Web Mining and Spatial Data Mining
Introduction to Web Mining and Spatial Data MiningIntroduction to Web Mining and Spatial Data Mining
Introduction to Web Mining and Spatial Data MiningAarshDhokai
 

Similar a Web mining (20)

Web mining
Web miningWeb mining
Web mining
 
Gaurav web mining
Gaurav web miningGaurav web mining
Gaurav web mining
 
Web mining
Web miningWeb mining
Web mining
 
Web Mining
Web MiningWeb Mining
Web Mining
 
Web mining
Web miningWeb mining
Web mining
 
WEB MINING.
WEB MINING.WEB MINING.
WEB MINING.
 
Personal Web Usage Mining
Personal Web Usage MiningPersonal Web Usage Mining
Personal Web Usage Mining
 
Personal web usage mining
Personal web usage miningPersonal web usage mining
Personal web usage mining
 
RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING
 
RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING
 
RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING
 
RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING
 
RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING
 
RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING
 
RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING
 
WEB BASED INFORMATION RETRIEVAL SYSTEM
WEB BASED INFORMATION RETRIEVAL SYSTEMWEB BASED INFORMATION RETRIEVAL SYSTEM
WEB BASED INFORMATION RETRIEVAL SYSTEM
 
WEBMINING_SOWMYAJYOTHI.pdf
WEBMINING_SOWMYAJYOTHI.pdfWEBMINING_SOWMYAJYOTHI.pdf
WEBMINING_SOWMYAJYOTHI.pdf
 
Introduction to Web Mining and Spatial Data Mining
Introduction to Web Mining and Spatial Data MiningIntroduction to Web Mining and Spatial Data Mining
Introduction to Web Mining and Spatial Data Mining
 
Aa03401490154
Aa03401490154Aa03401490154
Aa03401490154
 
Bb31269380
Bb31269380Bb31269380
Bb31269380
 

Último

Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Arindam Chakraborty, Ph.D., P.E. (CA, TX)
 
Online food ordering system project report.pdf
Online food ordering system project report.pdfOnline food ordering system project report.pdf
Online food ordering system project report.pdfKamal Acharya
 
Moment Distribution Method For Btech Civil
Moment Distribution Method For Btech CivilMoment Distribution Method For Btech Civil
Moment Distribution Method For Btech CivilVinayVitekari
 
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKARHAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKARKOUSTAV SARKAR
 
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments""Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"mphochane1998
 
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills KuwaitKuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwaitjaanualu31
 
Thermal Engineering Unit - I & II . ppt
Thermal Engineering  Unit - I & II . pptThermal Engineering  Unit - I & II . ppt
Thermal Engineering Unit - I & II . pptDineshKumar4165
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptDineshKumar4165
 
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...drmkjayanthikannan
 
Wadi Rum luxhotel lodge Analysis case study.pptx
Wadi Rum luxhotel lodge Analysis case study.pptxWadi Rum luxhotel lodge Analysis case study.pptx
Wadi Rum luxhotel lodge Analysis case study.pptxNadaHaitham1
 
Introduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaIntroduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaOmar Fathy
 
Work-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxWork-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxJuliansyahHarahap1
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayEpec Engineered Technologies
 
Computer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to ComputersComputer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to ComputersMairaAshraf6
 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationBhangaleSonal
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VDineshKumar4165
 
data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfdata_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfJiananWang21
 
Online electricity billing project report..pdf
Online electricity billing project report..pdfOnline electricity billing project report..pdf
Online electricity billing project report..pdfKamal Acharya
 

Último (20)

Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
 
Online food ordering system project report.pdf
Online food ordering system project report.pdfOnline food ordering system project report.pdf
Online food ordering system project report.pdf
 
Moment Distribution Method For Btech Civil
Moment Distribution Method For Btech CivilMoment Distribution Method For Btech Civil
Moment Distribution Method For Btech Civil
 
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKARHAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
 
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments""Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
 
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills KuwaitKuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
 
Thermal Engineering Unit - I & II . ppt
Thermal Engineering  Unit - I & II . pptThermal Engineering  Unit - I & II . ppt
Thermal Engineering Unit - I & II . ppt
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
 
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
 
Integrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - NeometrixIntegrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - Neometrix
 
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
 
Wadi Rum luxhotel lodge Analysis case study.pptx
Wadi Rum luxhotel lodge Analysis case study.pptxWadi Rum luxhotel lodge Analysis case study.pptx
Wadi Rum luxhotel lodge Analysis case study.pptx
 
Introduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaIntroduction to Serverless with AWS Lambda
Introduction to Serverless with AWS Lambda
 
Work-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxWork-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptx
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power Play
 
Computer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to ComputersComputer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to Computers
 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equation
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - V
 
data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfdata_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdf
 
Online electricity billing project report..pdf
Online electricity billing project report..pdfOnline electricity billing project report..pdf
Online electricity billing project report..pdf
 

Web mining

  • 1. Web Mining Prepared by Dr. Swarnalatha K.S Professor, Dept. of ISE NMIT
  • 2. Table of contents • Introduction to web mining • Types of web mining • Challenges in web mining • Applications of web mining
  • 3. • Over the last few years, the World Wide Web has become a significant source of information and simultaneously a popular platform for business. • Web mining can define as the method of utilizing data mining techniques and algorithms to extract useful information directly from the web, such as Web documents and services, hyperlinks, Web content, and server logs. • The World Wide Web contains a large amount of data that provides a rich source to data mining. The objective of Web mining is to look for patterns in Web data by collecting and examining data in order to gain insights.
  • 4. What is Web Mining? • Web mining can widely be seen as the application of adapted data mining techniques to the web, whereas data mining is defined as the application of the algorithm to discover patterns on mostly structured data embedded into a knowledge discovery process. • Web mining has a distinctive property to provide a set of various data types. The web has multiple aspects that yield different approaches for the mining process, such as web pages consist of text, web pages are linked via hyperlinks, and user activity can be monitored via web server logs. These three features lead to the differentiation between the three areas are web content mining, web structure mining, web usage mining.
  • 5. Types of web mining 1. Web Content Mining: • Web content mining can be used to extract useful data, information, knowledge from the web page content. • In web content mining, each web page is considered as an individual document. • The individual can take advantage of the semi-structured nature of web pages, as HTML provides information that concerns not only the layout but also logical structure. • The primary task of content mining is data extraction, where structured data is extracted from unstructured websites. The objective is to facilitate data aggregation over various web sites by using the extracted structured data. • Web content mining can be utilized to distinguish topics on the web. For Example, if any user searches for a specific task on the search engine, then the user will get a list of suggestions.
  • 6. 2. Web Structured Mining: • The web structure mining can be used to find the link structure of hyperlink. It is used to identify that data either link the web pages or direct link network. • In Web Structure Mining, an individual considers the web as a directed graph, with the web pages being the vertices that are associated with hyperlinks. • The most important application in this regard is the Google search engine, which estimates the ranking of its outcomes primarily with the PageRank algorithm. • It characterizes a page to be exceptionally relevant when frequently connected by other highly related pages. Structure and content mining methodologies are usually combined. • For example, web structured mining can be beneficial to organizations to regulate the network between two commercial sites.
  • 7. Web Usage Mining: • Web usage mining is used to extract useful data, information, knowledge from the weblog records, and assists in recognizing the user access patterns for web pages. • In Mining, the usage of web resources, the individual is thinking about records of requests of visitors of a website, that are often collected as web server logs. • While the content and structure of the collection of web pages follow the intentions of the authors of the pages, the individual requests demonstrate how the consumers see these pages. • Web usage mining may disclose relationships that were not proposed by the creator of the pages. • Some of the methods to identify and analyze the web usage patterns are given below:
  • 8. I. Session and visitor analysis: • The analysis of preprocessed data can be accomplished in session analysis, which incorporates the guest records, days, time, sessions, etc. This data can be utilized to analyze the visitor's behavior. • The document is created after this analysis, which contains the details of repeatedly visited web pages, common entry, and exit.
  • 9. II. OLAP (Online Analytical Processing): • OLAP accomplishes a multidimensional analysis of advanced data. • OLAP can be accomplished on various parts of log related data in a specific period. • OLAP tools can be used to infer important business intelligence metrics
  • 10. Challenges in Web Mining: the web pretends incredible challenges for resources, and knowledge discovery based on the following observations: The complexity of web pages: • The site pages don't have a unifying structure. They are extremely complicated as compared to traditional text documents. There are enormous amounts of documents in the digital library of the web. These libraries are not organized according to a specific order. The web is a dynamic data source: • The data on the internet is quickly updated. For example, news, climate, shopping, financial news, sports, and so on. Diversity of client networks: • The client network on the web is quickly expanding. These clients have different interests, backgrounds, and usage purposes. There are over a hundred million workstations that are associated with the internet and still increasing tremendously.
  • 11. Relevancy of data: • It is considered that a specific person is generally concerned about a small portion of the web, while the rest of the segment of the web contains the data that is not familiar to the user and may lead to unwanted results. The web is too broad: • The size of the web is tremendous and rapidly increasing. It appears that the web is too huge for data warehousing and data mining.
  • 12. Application of Web Mining: Web mining has an extensive application because of various uses of the web. The list of some applications of web mining is given below. • Marketing and conversion tool • Data analysis on website and application accomplishment. • Audience behavior analysis • Advertising and campaign accomplishment analysis. • Testing and analysis of a site.