Ranking algorithms

Ankit Raj, Assistant System Engineer at Tata Consultancy Services
RANKING ALGORITHMS
[DESCRIBES PAGE RANKING AND HITS ALGORITHM]
BY ANKIT RAJ
1309113012 [IT-1]
CONTENT
• INTRODUCTION
• SEARCHING
• SEARCH ENGINE OPTIMIZATION [SEO]
• TECHNIQUES OF SEO
• RANKING
• TYPES OF RANKING ALGORITHMS
• PAGERANK ALGORITHM
• HITS ALGORITHM
• PRECISION AND RECALL
• CONCLUSION
• FUTURE ASPECTS
• REFERENCES
INTRODUCTION
• The Internet is the global system of interconnected mainframe, personal, and wireless computer networks that use the Internet protocol suite (TCP/IP) to link billions of devices worldwide.
• It is a network of networks that consists of millions of private, public, academic, business, and government networks of local to global scope.
• The Web has also enabled individuals and organizations to publish ideas and information to a potentially large audience online at greatly reduced expense and time delay.
SEARCHING
[SEARCH ENGINES]
• What is searching? Trying to find something by looking.
• On the web, we cannot find a specific thing just by looking, because a huge volume of data, files, directories, and other content is present there.
• So we need a tool to search for the required content on the web. That tool is a search engine.
• A search engine is a software system that is designed to search for information on the World Wide Web.
• Examples are Google, Bing, and Yahoo.
SEARCH ENGINE OPTIMIZATION
[HOW ONE SEARCH ENGINE DIFFERS FROM OTHERS OF ITS KIND]
• Search engine optimization (SEO) is the process of affecting the visibility of a website or a web page in a search engine's results.
• Optimization techniques differ from one search engine to another.
• The better a search engine's optimization techniques, the more visitors it attracts, and the better it is considered to be.
[Source: http://www.oshup.com/3-defining-parameters-for-search-engine-marketing/]
TECHNIQUES OF SEO
Search engine efficiency and effectiveness depend on many parameters; the basic ones are the following:
• links
• page
• update
• rank
• content
• keywords
• crawling
• indexing
RANKING
• What is rank? A position in a hierarchy or scale.
• Searching the web with a search engine would be a tedious task without a proper ranking technique.
• It is very important for any search engine to use an algorithm that ranks the retrieved pages according to the user's requirements.
• Simply returning the search results pleases the user far less than returning well-ranked results.
[Source: http://www.shutterstock.com/s/angry+person+computer/search.html]
TYPES OF RANKING ALGORITHMS
• Text-based ranking algorithm: The ranking scheme used in conventional search engines is purely text-based, i.e. pages are ranked by their textual content and by the number of terms that match the query string, which seems logical.
• HITS (Hyperlink-Induced Topic Search)
• SALSA: the Stochastic Approach for Link-Structure Analysis, a probabilistic extension of the HITS algorithm.
• PageRank algorithm
• Weighted PageRank algorithm: an extension of the PageRank algorithm. It allocates higher rank values to the more significant pages rather than dividing the rank value of a page evenly among the pages it links to.
• Distance Rank algorithm: The distance between pages is considered as a factor. The algorithm calculates the minimum average distance between two or more web pages.
• Topic-Sensitive Rank algorithm: This algorithm computes the score of a web page according to the importance of the content available on that page.
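The text-based scheme in the first bullet can be illustrated with a minimal sketch. It assumes that a "matched term" is a distinct whitespace-separated query word that also occurs in the page text; the function names and the sample pages are hypothetical and are not taken from any particular search engine.

```python
def text_score(query, document):
    """Count the distinct query terms that occur in the document (case-insensitive)."""
    doc_terms = set(document.lower().split())
    return sum(1 for term in set(query.lower().split()) if term in doc_terms)


def rank_by_text(query, pages):
    """Return page names ordered by descending matched-term count.

    `pages` maps a hypothetical page name to its text. Pages with equal
    scores keep their original order because sorted() is stable.
    """
    return sorted(pages, key=lambda name: text_score(query, pages[name]), reverse=True)


# Hypothetical sample data for illustration only.
sample_pages = {
    "a": "page rank algorithm",
    "b": "cooking recipes",
    "c": "rank of pages in a search engine algorithm",
}
print(rank_by_text("search rank algorithm", sample_pages))
```

A score of this kind ignores links entirely, which is the gap that the link-based algorithms in the list try to fill.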
PAGERANK ALGORITHM
• The "Page" in "PageRank" does not stand for web page, even though the algorithm is used for ranking pages.
• The PageRank algorithm was originally developed at Stanford University by Larry Page (with Sergey Brin) in 1996 as part of a research project about a new search engine, and it takes its name from Larry Page.
• PageRank is an algorithm used by the Google web search engine to rank websites in its search results.
• PageRank does not rank a whole website; it is determined for each page individually.
• Formula for calculating the rank of a web page:
\[ PR(A) = (1 - d) + d \left( \frac{PR(T_1)}{C(T_1)} + \cdots + \frac{PR(T_n)}{C(T_n)} \right) \]
• Where:
PR(A) = PageRank of page A
T_1, …, T_n = all pages that link to page A
PR(T_i) = PageRank of page T_i
C(T_i) = the number of pages to which T_i links
d = damping factor, which can be set between 0 and 1
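The formula can be tried out with a short sketch that applies it to every page repeatedly until the values stop changing much. Several things here are assumptions rather than part of the slides: the function name `pagerank_formula`, the starting value of 1.0 for every page, the default d = 0.85 (a commonly quoted choice), the fixed number of iterations, and the requirement that every page has at least one outgoing link.

```python
def pagerank_formula(links, d=0.85, iterations=50):
    """Repeatedly apply PR(A) = (1 - d) + d * sum(PR(T) / C(T)) to every page.

    `links` maps each page to the list of pages it links to.
    Assumes every page has at least one outgoing link, so C(T) is never zero.
    """
    pr = {page: 1.0 for page in links}  # assumed starting value
    for _ in range(iterations):
        new_pr = {}
        for page in links:
            # T_1 ... T_n: all pages that link to `page`
            incoming = [t for t in links if page in links[t]]
            new_pr[page] = (1 - d) + d * sum(pr[t] / len(links[t]) for t in incoming)
        pr = new_pr
    return pr


# Hypothetical three-page graph: A links to B and C, B links to C, C links to A.
example = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank_formula(example)
print(sorted(ranks, key=ranks.get, reverse=True))
```

With this form of the formula the rank values add up to the number of pages rather than to 1, as long as no page lacks outgoing links.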
Now let's take a look at how it works: http://www.math.cornell.edu/~mec/Winter2009/RalucaRemus/Lecture3/lecture3.html
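[Figures: STEP 1 and STEP 2]
\[
A = \begin{pmatrix}
0 & 0 & 1 & \frac{1}{2} \\
\frac{1}{3} & 0 & 0 & 0 \\
\frac{1}{3} & \frac{1}{2} & 0 & \frac{1}{2} \\
\frac{1}{3} & \frac{1}{2} & 0 & 0
\end{pmatrix},
\qquad
V = \begin{pmatrix} 0.25 \\ 0.25 \\ 0.25 \\ 0.25 \end{pmatrix}
\]
The matrix A is built by studying the link graph of the pages: column j holds 1/C(j) in each row i that page j links to, so every column sums to 1.
The vector V starts with every entry equal to 1/(number of pages).
[Figures: 1st to 5th iterations]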
Looking at the 7th and 8th iterations, the values appear to settle to constants, so these are taken as the final rank values of the algorithm.
[Figure: 6th, 7th, and 8th iterations]
RANK
1: page 1
2: page 3
3: page 4
4: page 2
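The iteration in this worked example can be reproduced with a small sketch. It assumes the link graph of the cited Cornell lecture, in which page 3 links only to page 1, so the entry in row 1, column 3 of A is taken to be 1 and every column sums to 1. It also assumes that each iteration is the plain matrix-vector product A·V with no damping. The names `power_iteration` and `ranking` are illustrative.

```python
# Link matrix of the four-page example; column j describes the links out of page j+1.
A = [
    [0,     0,     1, 1 / 2],
    [1 / 3, 0,     0, 0],
    [1 / 3, 1 / 2, 0, 1 / 2],
    [1 / 3, 1 / 2, 0, 0],
]


def power_iteration(matrix, steps):
    """Start from V = (1/n, ..., 1/n) and multiply by the matrix `steps` times."""
    n = len(matrix)
    v = [1 / n] * n
    for _ in range(steps):
        v = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    return v


def ranking(scores):
    """Return 1-based page numbers ordered from highest to lowest score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [i + 1 for i in order]


v8 = power_iteration(A, 8)
print([round(x, 2) for x in v8])
print(ranking(v8))
```

Under these assumptions the order after eight iterations is page 1, page 3, page 4, page 2, which agrees with the RANK list.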
HITS ALGORITHM
• HITS stands for "Hyperlink-Induced Topic Search" (also expanded as "Hypertext-Induced Topic Selection") and is used for rating and ranking websites based on link information when identifying topic areas.
• IBM's Clever project builds on the HITS algorithm, developed at IBM's Almaden Research Lab in San Jose, CA.
• Unlike PageRank, which is a static ranking algorithm, HITS is query dependent: the ranking of a web page is decided by analysing its textual contents against a given query.
• The algorithm distinguishes two types of pages:
Authority: pages that provide important information on the topic.
Hub: pages that contain links to authorities.
• In this algorithm, a web page is called an authority if many hyperlinks point to it, and a hub if it points to many other pages.
• HITS is a topic-specific search. First, a subset of web pages containing good hub and authority pages with respect to a query is created. This is done by running the query and collecting an initial set of documents relevant to it. This set is called the root set for the query.
[Source: International Journal of Engineering Research & Technology (IJERT), Vol. 1, Issue 8, October 2012, ISSN: 2278-0181]
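The hub and authority idea can be made concrete with a minimal sketch of the usual mutual-reinforcement update: a page's authority score is the sum of the hub scores of the pages pointing to it, and its hub score is the sum of the authority scores of the pages it points to. The slides do not spell out this update, so the normalisation step, the fixed iteration count, the function name `hits`, and the sample graph are assumptions. Building the root set from a query is not shown.

```python
import math


def hits(links, iterations=20):
    """Compute hub and authority scores for a small link graph.

    `links` maps each page to the pages it points to; every target
    must also appear as a key.
    """
    pages = list(links)
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority: sum of the hub scores of the pages that point to p.
        auth = {p: sum(hub[q] for q in pages if p in links[q]) for p in pages}
        # Hub: sum of the authority scores of the pages that p points to.
        hub = {p: sum(auth[q] for q in links[p]) for p in pages}
        # Normalise so the scores do not grow without bound.
        a_norm = math.sqrt(sum(x * x for x in auth.values())) or 1.0
        h_norm = math.sqrt(sum(x * x for x in hub.values())) or 1.0
        auth = {p: x / a_norm for p, x in auth.items()}
        hub = {p: x / h_norm for p, x in hub.items()}
    return hub, auth


# Hypothetical graph: two hub-like pages pointing at two authority-like pages.
graph = {"h1": ["a1", "a2"], "h2": ["a1"], "a1": [], "a2": []}
hub_scores, auth_scores = hits(graph)
print(max(hub_scores, key=hub_scores.get), max(auth_scores, key=auth_scores.get))
```

In this sample graph the page that is pointed to by both hubs ends up with the highest authority score, and the page that points to both authorities ends up with the highest hub score.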
PRECISION AND RECALL
[TO CHECK EFFICIENCY OF RANKING ALGORITHM]
• Precision (also called positive predictive value) is the fraction of retrieved instances that are relevant, while recall (also known as sensitivity) is the fraction of relevant instances that are retrieved.
• Both precision and recall are therefore based on an understanding and measure of relevance.
[Source: www2.hawaii.edu/~donnab/lis670/]
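The two definitions translate directly into a few lines of code. This is a small sketch with a hypothetical function name and made-up page identifiers. Returning 0.0 when a set is empty is an assumption made here to avoid division by zero.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved and a set of relevant items."""
    retrieved, relevant = set(retrieved), set(relevant)
    relevant_retrieved = retrieved & relevant
    precision = len(relevant_retrieved) / len(retrieved) if retrieved else 0.0
    recall = len(relevant_retrieved) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: 4 pages retrieved, 3 pages actually relevant, 2 in common.
p, r = precision_recall(["p1", "p2", "p3", "p4"], ["p1", "p3", "p5"])
print(p, r)
```

Here precision is 2/4 because two of the four retrieved pages are relevant, and recall is 2/3 because two of the three relevant pages were retrieved.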
Comparison between VSM [vector space model] and PageRank:
[Figure: comparison chart]
[Source: http://www.webology.org/2007/v4n3/a44.html]
Comparison between HITS and VSM:
[Figure: comparison chart]
[Source: http://www.webology.org/2007/v4n3/a44.html]
CONCLUSION
• To optimise search, we require a better ranking algorithm.
• On the basis of this study, we conclude that PageRank and HITS are different link analysis algorithms that employ different models to calculate web page rank.
• PageRank is the more popular algorithm and is used as the basis for the very popular Google search engine.
• This popularity is due to features such as efficiency, feasibility, lower query-time cost, and less susceptibility to localized links, which are absent in the HITS algorithm.
• Although the HITS algorithm itself has not been very popular, different extensions of it have been employed in a number of different web sites.
FUTURE ASPECTS
• Proposed work on the PageRank algorithm includes an implementation that solves the dangling page problem. Dangling pages are pages that have no outbound links, i.e. pages that do not reference any other page. They cause many problems when calculating an accurate PageRank for the pages of a website.
• Work is also under way to remove circular references, so that proper ranking can be done.
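The dangling page problem is easy to see on a small link graph. The sketch below finds the dangling pages and applies one common workaround, treating a dangling page as if it linked to every other page. That workaround is an assumption chosen for illustration and is not necessarily the proposed work mentioned above. The function names and the sample graph are hypothetical.

```python
def dangling_pages(links):
    """Return the pages that have no outbound links, in sorted order."""
    return sorted(page for page, targets in links.items() if not targets)


def patch_dangling(links):
    """Return a copy of the graph in which each dangling page links to all other pages."""
    pages = list(links)
    return {
        page: list(targets) if targets else [p for p in pages if p != page]
        for page, targets in links.items()
    }


# Hypothetical graph: B has no outbound links.
web = {"A": ["B"], "B": [], "C": ["A", "B"]}
print(dangling_pages(web))
patched = patch_dangling(web)
print(patched["B"], dangling_pages(patched))
```

After patching, every page has at least one outbound link, so the rank that flows into B is passed on instead of being lost at each iteration.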
REFERENCES
• http://www.webology.org/2007/v4n3/a44.html
• www2.hawaii.edu/~donnab/lis670/
• International Journal of Engineering Research & Technology (IJERT), Vol. 1, Issue 8, October 2012. ISSN: 2278-0181
• http://www.math.cornell.edu/~mec/Winter2009/RalucaRemus/Lecture3/lecture3.html
• International Journal of Advanced Research in Computer and Communication Engineering, Vol. 3, Issue 2, February 2014. ISSN (Online): 2278-1021. ISSN (Print): 2319-5940