Web intelligence-future of next generation web

WEB INTELLIGENCE
Seminar Report
Submitted in partial fulfilment of the requirements
for the award of the degree of
Bachelor of Technology
in
Computer Science Engineering
of
Cochin University Of Science And Technology
by

NIJIL Y
(12080050)

DIVISION OF COMPUTER SCIENCE
SCHOOL OF ENGINEERING
COCHIN UNIVERSITY OF SCIENCE AND TECHNOLOGY
KOCHI-682022


Certificate
Certified that this is a bonafide record of the seminar entitled
“WEB INTELLIGENCE”
presented by the following student
NIJIL Y
of the VIIth semester, Computer Science and Engineering, in the year 2010,
in partial fulfillment of the requirements for the award of the Degree of
Bachelor of Technology in Computer Science and Engineering of Cochin
University of Science and Technology.

Mr. SUDEEP EDAYILAM
Seminar guide

Dr. DAVID PETER
Head Of Division

ACKNOWLEDGEMENT

I thank GOD almighty for guiding me throughout the seminar. I would like to thank all those
who have contributed to the completion of the seminar and helped me with valuable
suggestions for improvement.
I am extremely grateful to Dr. David Peter, Head Of Division, Division of Computer
Science, for providing me with the best facilities and atmosphere for creative work, guidance
and encouragement. I am profoundly indebted to my seminar guide Mr. Sudheep Elayidom,
Sr. Lecturer, Division of Computer Science, for all the help and support extended to me. I thank
all staff members of my college and friends for extending their cooperation during my
seminar.

Above all I would like to thank my parents without whose blessings, I would not have been
able to accomplish my goal.

NIJIL Y

ABSTRACT

Web Intelligence is a new direction for scientific research and development that explores
the fundamental roles as well as practical impacts of artificial intelligence and advanced
information technology for the next generation of Web-empowered systems, services, and
environments. Web Intelligence is regarded as the key research field for the development of the
Wisdom Web (including the Semantic Web). The Web revolutionizes the way we gather,
process, and use information. Despite current technological advances, we still cannot predict
what the Web’s next paradigm shift will be. However, we propose that this change will
transform the Web into an intelligent entity, hence the term Web intelligence.
The next-generation Web will go beyond improved information search and knowledge
queries and will help people achieve better ways of living, working, playing, and learning. To
fulfil its potential, the intelligent Web’s design and development must incorporate and integrate
several fundamental capabilities. A few of its capabilities are Reflexive server propagation,
Growth Specialization, Autocatalysis, etc. Intelligent Web agents can use the Problem Solver
Mark-up Language (PSML) to specify their roles, settings, and relationships with any other
services. The intelligent Web must also have the ability to process and understand natural
language. It must understand and correctly judge the meaning of concepts expressed in words,
such as “good,” “best,” and “season”. WI research incorporates knowledge from existing
disciplines, such as artificial intelligence and information technology, in a totally new domain.
At the same time, Web Intelligence research also enriches these established disciplines as it
introduces new topics and challenges.

TABLE OF CONTENTS
CHAPTER NO.   CHAPTER TITLE

1     Introduction
2     Perspectives of WI
3     Intelligence Exploration
3.1   A New Field of Science, Technology and Engineering
3.2   Design Philosophy and Principles of the Web
3.3   The Laws of the Web
3.4   The Web Revolution: One Link at a Time
3.5   The More Things Change, the More They Stay the Same
4     Components of Web Intelligence
4.1   Web Data
4.2   Representation
4.3   PSML and Web Inference Engine
4.4   Social Network Intelligence
5     Computational Web Intelligence
5.1   Web Uncertainty
5.2   Computational Web Intelligence for Web Uncertainty
5.3   Granular Web Intelligence for Web Uncertainty
6     Trends and Challenges of WI-Related Research and Development
6.1   Intelligent Web Agents
6.2   From WA to Web-Based Services
7     Semantic Search Engine
8     Conclusion
      References

Web Intelligence

CHAPTER 1

INTRODUCTION
With the rapid growth of the Internet and the World Wide Web (WWW), we have now entered
a new information age. The Web provides a totally new medium for communication, which goes
far beyond traditional communication media such as radio, telephone and television. The
Web has a significant impact on both academic research and ordinary daily life. It revolutionizes
the way in which information is gathered, stored, processed, presented, shared, and used. The
Web offers new opportunities and challenges for many areas, such as business, commerce,
marketing, finance, publishing, education, research and development. For computer scientists, the
Web introduces many new research topics and provides a new platform on which to reconsider old
problems. It may be high time to create a new sub-discipline of computer science covering
theories and technologies related to the Web. Web Intelligence is our proposal for this purpose.

Through the billions of Web pages created with HTML and XML, or generated
dynamically by underlying Web database service engines, the Web captures almost all aspects of
human endeavor and provides a fertile ground for data mining. However, searching,
comprehending, and using the semi-structured information stored on the Web pose a significant
challenge because this data is more sophisticated and dynamic than the information that
commercial database systems store. To supplement keyword-based indexing, which forms the
cornerstone of Web search engines, researchers have applied data mining to Web-page ranking.
In this context, data mining helps Web search engines find high-quality Web pages.

WI explores the fundamental and practical impact that artificial intelligence and advanced
information technology will have on the next generation of Web-empowered systems, services,
and environments. In an era dominated by the World Wide Web, Grid computing, intelligent-agent
technology, and ubiquitous social computing, WI represents information technology’s next
challenge.

Motivations and Justifications for WI

The introduction of Web Intelligence (WI) can be motivated and justified from both academic and
industrial perspectives. Two features of the Web make it a useful and unique platform for
computer applications and research: its size and its complexity. The Web contains a huge number
of interconnected Web documents known as Web pages. For example, the popular search engine
Google claims that it can search 1,346,966,000
pages as of February 2001. The sheer size of the Web leads to difficulties in the storage,
management, and efficient and effective retrieval of Web documents. The complexity of the Web,
in terms of connectivity and diversity of Web documents, forces us to reconsider many existing
information systems, as well as the theories, methodologies and technologies underlying those
systems. One has to deal with a heterogeneous collection of structured, unstructured,
semi-structured, interrelated, and distributed Web documents consisting of texts, images and sounds,
instead of a homogeneous collection of structured and unrelated objects. The latter is the subject of
study of many conventional information systems, such as databases, information retrieval, and
multi-media systems. To accommodate the needs of the Web, one needs to study issues in the
design and implementation of Web-based information systems by combining and extending
results from existing intelligent information systems. Existing theories and technologies need
to be modified or enhanced to deal with the complexity of the Web. Although individual Web-based
information systems are constantly being deployed, advanced issues and techniques for
developing and benefiting from the Web remain to be systematically studied. The challenges
that the Web brings to computer scientists may justify the creation of the new sub-discipline, WI,
for carrying out Web-related research.

The Web increases the availability and accessibility of information to a much
larger community than any other computer application. The introduction of Personal Computers
(PCs) brought computational power to ordinary people. It is the Web that delivers
information more effectively to everyone’s fingertips. The Web, no doubt, offers a new means of
sharing and transmitting information unmatched by other media. The revolution started by the
Web is just beginning. New business opportunities, such as e-commerce, e-banking, and
e-publication, will increase as the Web matures. One can hardly overemphasize the
impact of the Web on the business and industrial world. The creation of a new sub-discipline
devoted to Web-related research and applications might have significant value in the future.
The need for WI may be further illustrated by the fast-growing research and industrial
activities centered on it. We searched the Web using the keyword “Web Intelligence” through
several search engines in February 2001.



What is Web Intelligence?

“Web Intelligence (WI) exploits Artificial Intelligence (AI) and advanced Information
Technology (IT) on the Web and Internet.”

This definition has the following implications. The basis of WI is AI and IT. The “I”
happens to be shared by both “AI” and “IT”, although with a different meaning in each, and “W”
defines the platform on which WI research is carried out. The goal of WI is the joint goal of AI
and IT on the new platform of the Web. That is, WI applies AI and IT to the design and
implementation of Intelligent Web Information Systems (IWIS). An IWIS should be able to
perform functions normally associated with human intelligence, such as reasoning, learning, and
self-improvement. There may be no standard and non-controversial definition of WI,
just as there is no standard definition of AI. One may argue that our definition of WI
focuses more on the software aspects of the Web. It is not our intention to exclude any research
topic by using the proposed definition. The term Web Intelligence should be considered as an
umbrella or a label for a new branch of research centered on the Web. Our definition simply states
the scope and goals of WI. This allows us to include any theories and technologies that either fall
within the scope or aim at the same goals. To complement the formal definition, we try to make the
picture clearer by listing topics to be covered by WI.

WI will be an ever-changing research branch. It will evolve with the development of the
Web as a new medium for information gathering, storage, processing, delivery and utilization. It is
our expectation that WI will evolve into an inseparable research branch of computer science.
Although no one can predict the future in detail and without uncertainty, it is clear that WI will
have a huge impact on the application of computers, which in turn will affect our everyday lives.



CHAPTER 2
Perspectives of WI

As a new branch of research, Web Intelligence exploits Artificial Intelligence (AI) and
Information Technology (IT) on the Web. On the one hand, it may be viewed as applying results
from these existing disciplines to a totally new domain. On the other hand, WI may also introduce
new problems and challenges to the established disciplines. WI may also be viewed as an
enhancement or an extension of AI and IT. It remains to be seen whether WI will become a sub-area
of AI and IT or a child of a successful marriage of AI and IT. However, no matter what happens,
studies on WI can benefit a great deal from the results, experience, successes and lessons of AI and
IT. In their very popular textbook, Russell and Norvig examined different definitions of artificial
intelligence from eight other textbooks in order to decide what exactly AI is. They observed that
the definitions vary along two dimensions. One dimension deals with the functionality and
ability of an AI system, ranging from the thought processes and reasoning ability of the system to
its behavior. The other dimension deals with the design philosophy of AI systems,
ranging from imitating human problem solving to making rational decisions. The combination of
the two dimensions results in four categories of AI systems, adopted from Russell and Norvig:

Systems that think like humans.     Systems that think rationally.

Systems that act like humans.       Systems that act rationally.

This classification provides a basis for the study of various views and approaches for AI.
It also clearly defines goals in the design of AI systems. According to Russell and Norvig, the
categories correspond to four approaches: the cognitive modeling approach (thinking humanly), the
Turing test approach (acting humanly), the laws of thought approach (thinking rationally), and the
rational agent approach (acting rationally). The two rows, which separate AI systems in terms of
thinking and acting, may not be the most suitable classification. Action is normally the final result of
a thinking process. One may argue that the class of systems acting humanly is a superset of the
class of systems thinking humanly. In contrast, the separation of the human-centered approach and
the rationality-centered approach may have significant implications for the study of AI. While earlier
research on AI focused more on the human-centered approach, the rationality-centered approach
has received more attention recently.


The first column is centered around humans and leads to the treatment of AI as an
empirical science involving hypothesis and experimental confirmation. A human-centered
approach represents the descriptive view of AI. Under this view, a system is designed by
imitating human problem solving. This implies that a system should have the usual human
capabilities such as knowledge representation, natural language processing, reasoning, planning
and learning. The performance of an AI system is measured or evaluated through the Turing
test. A system is said to be intelligent if it provides human-level performance. Such a descriptive
view dominates the majority of earlier studies of expert systems, a special type of AI system.
The second column represents the prescriptive or normative view of AI. It deals with theoretical
principles and laws that an AI system must follow, instead of imitating humans. That is, a
rationalist approach deals with an ideal concept of intelligence, which may be independent of
human problem solving. An AI system is rational if it does the right thing and makes the right
decision. The normative view of AI is based on well-established disciplines such as
mathematics, logic, and engineering. The descriptive and normative views also reflect the
experimental and theoretical aspects of AI research.

The experimental study represents the descriptive view. It covers theories and models for
the explanation of the workings of the human mind, and applications of AI to solving problems
that normally require human intelligence. The theoretical study aims at the development of theories
of rationality, and focuses on the foundations of AI. The two views are complementary to each
other. Studies in one direction may provide valuable insights into the other. Web Intelligence
concerns the design and development of intelligent Web information systems. The previous
framework for the study of AI can be immediately applied to that of Web Intelligence. More
specifically, we can cluster research in WI into the descriptive approach and the normative
approach, and cluster Web information systems in terms of thinking and acting. Various research
topics can be identified and grouped accordingly. Like AI, a foundation of WI can be established
by drawing results from many related disciplines:
• Mathematics: computation, logic, probability.

• Applied Mathematics and Statistics: algorithms, non-classical logics, decision theory,
  information theory, measurement theory, utility theory, theories of uncertainty,
  approximate reasoning.

• Psychology: cognitive psychology, cognitive science, human-machine interaction, user
  interface.

• Linguistics: computational linguistics, natural language processing, machine translation.

• Information Technology: information science, databases, information retrieval systems,
  knowledge discovery and data mining, expert systems, knowledge-based systems, decision
  support systems, intelligent information agents.

The topics under each entry are only intended as examples. They do not form an exhaustive
list. In the development of AI, we have witnessed the formulation of many of its new
sub-branches, such as knowledge-based systems, artificial neural networks, genetic algorithms, and
intelligent agents. Recently, non-classical AI topics have received much attention under the name
of computational intelligence. Computational intelligence focuses on the computational aspect of
intelligent systems. The application of AI in other disciplines also leads to new techniques in the
corresponding fields. For instance, Business Intelligence (BI) is a result of applying artificial
intelligence to the business domain. Artificial Intelligence in Medicine has also proved to be a
successful application. When viewing WI in such settings, we can identify at least two of its roles.
WI may be interpreted as “Web-based Artificial Intelligence”, that is, the study of particular aspects
of AI in the context of the Web, in parallel to the study of computational intelligence.

WI may also be interpreted as “Artificial Intelligence on the Web”, which regards it as a
new application of AI. A more practical goal of WI is the design and implementation of intelligent
Web information systems (IWIS). It should be realized that an IWIS is an integrated system
containing many sub-systems. To design such a system, it is necessary to apply a variety of
theories and technologies.

In his work on vision, Marr convincingly made the point that a full understanding of an intelligent
system involves explanations at various levels. The same argument is applicable to the
development of an IWIS. We can identify at least two levels: the conceptual formulation and the
physical implementation. The conceptual formulation deals with the foundations of an IWIS, while
the physical implementation is concerned with the construction of an IWIS. The former depends on
mathematics and logic, and the latter depends on algorithms and programming. Each level may be
further divided into sub-levels. Research in WI should include topics at all of these levels.



CHAPTER 3

WEB INTELLIGENCE EXPLORATION
Web intelligence further explores the transformation of information into knowledge, and
of knowledge into wisdom, in its search for the Wisdom Web. Some of the important issues,
although they may not be well conceived yet, are briefly discussed in this section.

3.1 A new field of science, technology and engineering

The Web, as a new technical and social phenomenon and a growing organism, creates a
new field of science that involves a multi-disciplinary study and enquiry for the understanding of
the Web and its relationships to us. The Web may be studied from many perspectives, such as
philosophical foundations, theoretical and technical foundations, applications, and social impacts.
Some examples are given below:
• Webology,

• Web Science,

• Web Technology,

• Web Engineering,

• Weblization.

The term webology is coined to label the study of the Web as a new field of science. By
postfixing the phrase “science and technology”, one clearly states the scope. By postfixing the phrase
“engineering”, one emphasizes the design and implementation aspects. Together, they are driving
forces for the information revolution. The term weblization concisely summarizes the development
of the Web and web-based systems so far. The process of weblization involves building the Web
itself and reconstructing existing tools and systems on the web platform.

3.2 Design philosophy and principles of the Web
The design philosophy and principles set the direction of web growth and its ultimate
destiny. It may be difficult to compile a non-controversial and complete list. However, examples
include the decentralization principle, the universalist principles, the minimum constraint principle,
and the separation of form and content principle. The decentralization principle is inherited from the
decentralization property of the Internet. The universalist principles cover universal connectivity,
universal accessibility, as well as diversity of web contents and users. The minimum constraint
principle suggests that the Web should be as unconstraining as possible to realize its universality.
The separation principle deals with the presentation of web documents, in order to achieve
location, machine, and application independence. The design principles ensure that the Web has
desirable properties such as decentralization, adaptability, evolvability, scalability, universal
connectivity and accessibility, affordability, anonymity, diversity, and many others. The Web is
able to support communication, collaboration, interaction, and intercreation.

3.3 The laws of the Web
Two sets of laws have been studied, namely, the set of laws governing the Web and the set of
empirical laws observable on the Web. The Web has given new meaning to publishing and
libraries, but has not changed their underlying principles. Noruzi argued that Ranganathan’s Five
Laws of Library Science are as applicable today as they were more than 70 years ago. Ranganathan’s
Five Laws of Library Science state:
• Books are for use.

• Every reader his or her book.

• Every book its reader.

• Save the time of the reader.

• The Library is a growing organism.

These laws describe a user-oriented, as well as a service-oriented, view of library science. The
Web consists of a massive collection of resources. By replacing “book”, “reader”, and “library”
with “web resource”, “user”, and “web”, respectively, Noruzi stated the Five Laws of the Web:
• Web resources are for use.

• Every user his or her web resource.

• Every web resource its user.

• Save the time of the user.

• The Web is a growing organism.


They concisely represent the underlying philosophy of the Web and web services. They also
describe the ideal Web - “of the people, by the people, for the people”. Many researchers have studied
empirical laws revealed by the Web, concerning its growth, web page distributions, or user surfing
patterns. An example set of such laws is reported by Huberman:
1. Power Law of Distribution.
2. Small World Law.
3. Law of Surfing.
4. Law of Congestion.
5. The Free Ride Law.
6. The Law of Downloading.
Website designers, webmasters, and organizations can apply such laws to the design of better
websites and web resources.
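
As an illustration of how the first of these laws might be checked in practice, the following minimal sketch estimates the exponent of a power-law distribution from observed counts (for example, pages per site). The synthetic data, the function names, and the choice of a continuous maximum-likelihood estimator are assumptions made for this example; they are not taken from Huberman's study.

```python
import math
import random


def sample_power_law(alpha, x_min, n, rng):
    """Draw n values from a continuous power law p(x) ~ x**(-alpha), x >= x_min."""
    # Inverse-transform sampling; 1 - rng.random() lies in (0, 1], so the base is never zero.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def estimate_alpha(values, x_min):
    """Maximum-likelihood estimate of the exponent, using only values >= x_min."""
    tail = [v for v in values if v >= x_min]
    log_sum = sum(math.log(v / x_min) for v in tail)
    return 1.0 + len(tail) / log_sum


# Synthetic "pages per site" data (hypothetical), generated with a known exponent.
rng = random.Random(42)
page_counts = sample_power_law(alpha=2.1, x_min=1.0, n=20000, rng=rng)

alpha_hat = estimate_alpha(page_counts, x_min=1.0)
print(f"estimated exponent: {alpha_hat:.2f}")
```

With real measurements, `page_counts` would be replaced by the observed values, and the choice of `x_min` would itself need justification.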

3.4 The Web revolution: one link at a time
The story of the invention of the Web and the revolution brought by the Web provides a
good case study for web intelligence. It poses a challenge: how to derive insights and wisdom
from existing data, information, and knowledge. Regarding the pre-web uses of hypertext
links, Berners-Lee commented, “The research community had used the links between paper
documents for ages: Tables of contents, indexes, bibliographies, and reference sections are
hypertext links.” A crucial question is what we can get from this common knowledge and
practice. Two types of approaches have been proposed and studied. One focuses on the
exploration of the potential implications of such knowledge, which led to the creation of a field
of science known as citation indexing and analysis. The other focuses on the representation,
storage, and access of similar types of data and knowledge using new media as they become
available, which led to the invention of the Web.
A basic idea of citation indexing and analysis is to index and study the literature of science
based on how scientists cite each other. Although it mainly uses bibliographies, citation indexing
and analysis brings more insights into science, publishing, scientific research, and many more
fields. Information retrieval systems, based on citation indexing and analysis, have been
implemented and used by scientists for many years. The same methods have been applied or
rediscovered in many recent studies, such as web search engines, social network analysis, and so
on.
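
The core data structure behind citation indexing can be sketched in a few lines. The example below inverts a map from each paper to the papers it cites, giving for every paper the set of papers that cite it, and then ranks papers by citation count. The paper identifiers and function names are hypothetical, and real citation indexes involve much more (author disambiguation, co-citation analysis, and so on).

```python
from collections import defaultdict


def build_citation_index(references):
    """Invert 'paper -> papers it cites' into 'paper -> set of papers citing it'."""
    cited_by = defaultdict(set)
    for paper, cited in references.items():
        for target in cited:
            cited_by[target].add(paper)
    return cited_by


def rank_by_citations(references):
    """Return all papers, most cited first; ties are broken alphabetically."""
    cited_by = build_citation_index(references)
    papers = set(references) | set(cited_by)
    return sorted(papers, key=lambda p: (-len(cited_by[p]), p))


# Hypothetical bibliography data: each key cites the papers in its list.
references = {
    "A": ["C"],
    "B": ["A", "C"],
    "C": [],
    "D": ["C", "A"],
}

index = build_citation_index(references)
ranking = rank_by_citations(references)
print(ranking)
```

The same inversion applies unchanged to web pages and hyperlinks, which is one reason these methods transferred so readily to web search.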
A basic idea of the Web is to create a global space in which anything can be linked to
anything. The development of the Web emphasizes the implementation of this idea using
different types of machines and media. The Web attempts to make the existing associations and
links, which people had used either explicitly or implicitly, concrete and computer manageable.
Similar concepts had been explored in the pre-web age. Vannevar Bush described a
photo-electro-mechanical machine called the Memex that can make and follow cross-references
among microfilm documents. Ted Nelson introduced the concept of hypertext, so that people can
use computers to read, write and publish non-linear texts. Doug Engelbart demonstrated a
collaborative work space called NLS, which supports hypertext browsing, editing, email, and so on.
Thanks to the timely invention of the Internet, which provides global connectivity, the dream of the
Web became a reality. The revolution of the Web is brought about by a grassroots effort that builds
the Web link by link. There are recent research efforts in cross-applications of the two types of
approaches. The methods developed for citation indexing and analysis are used and extended to
analyze the links and connectivity of the web. Existing systems for citation indexing and analysis
are moved to, and new such systems are implemented on, the Web.
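
One widely known example of extending citation analysis to web links is PageRank-style ranking. The report does not name this method, so it is used here only as an assumed illustration: a page is important if important pages link to it. The sketch below uses simple power iteration; the site graph, the damping factor of 0.85, and the iteration count are all assumptions for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict mapping each page to the pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives the same small "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its rank evenly over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical four-page site.
links = {
    "home": ["about", "products"],
    "about": ["home"],
    "products": ["home", "about"],
    "orphan": ["home"],
}

rank = pagerank(links)
for page, score in sorted(rank.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this small graph the page with the most incoming links ends up with the highest score, and the scores sum to one.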
The above brief description, which is almost common knowledge, is repeated here to serve
one special purpose. It demonstrates that the great minds of our time bring revolutions by
analyzing what everyone has already known or by implementing, alternatively, what everyone has
already used. The question is: Can web intelligence help in the future?

3.5 The more things change, the more they stay the same
Now, we turn our attention to the other side of the same coin by investigating the things that the
revolutions do not change. In spite of the technological changes, achievements of the current Web
and associated systems lie in the process of weblization. The weblization of a specific field or an
organization does not change its fundamental principles, although it may make the field or
organization more effective and efficient, and let it operate at a different scale. For example,
electronic commerce does not change the principles of doing business, but does introduce more
dynamics, opportunities, flexibility, and other new properties. Another example is the Five Laws of
the Web: the subject matter has changed, but the philosophy remains the same. Both paper
documents and the Web use links.
The physical implementations are different, one on paper and the other on computer, but the
logical meanings stay more or less the same. The same analytical tools and methods apply to both.
The property of “unchangeness” makes it possible to apply the same principles again and again,
with possible adaptation and adjustment. The philosophy and principles that have proved to
be effective in the past can be applied to design and implement intelligent web information systems.
Some illustrative examples are listed here:
• Separation of logical view and physical view.
• Separation of knowledge and inference engine.
• Keep It Simple, Stupid!
The first two separation principles are along the same line as the separation of content and form
principle. The first one is widely used in the design and implementation of database systems. Its
application to the Web implies that one can generate many virtual logical views from the same
physical web. The second principle is a fundamental one in expert systems. It is applicable to the
design of web inference engines. The last rule, also known as the KISS principle, is universally
applicable. It has been applied throughout the design of the Web.
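
The separation of knowledge and inference engine can be made concrete with a small sketch. Below, the engine is a generic forward-chaining loop that knows nothing about the domain, while the rules and facts are plain data that could be edited or replaced without touching the engine. The rule contents and names are hypothetical and chosen only for illustration; this is not a description of any particular web inference engine.

```python
def forward_chain(facts, rules):
    """Generic inference engine: fire rules repeatedly until no new fact is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and set(premises) <= known:
                known.add(conclusion)
                changed = True
    return known


# Knowledge base (hypothetical): each rule is (premises, conclusion), stored as data.
rules = [
    (("visited_product_page", "added_to_cart"), "purchase_intent"),
    (("purchase_intent", "returning_visitor"), "offer_discount"),
]

facts = {"visited_product_page", "added_to_cart", "returning_visitor"}
derived = forward_chain(facts, rules)
print(sorted(derived - facts))
```

Changing the behavior of the system means editing `rules`, not `forward_chain`, which is the point of the separation principle.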



CHAPTER 4
Components of Web Intelligence

4.1 Web Data

The data available in electronic commerce environments is three-fold and includes server
data in the form of log files, site-specific web meta-data representing the structure of the
web site, and marketing information, which depends on the products and services provided.

Server data is generated by the interactions between the persons browsing an individual site and
the web server. This data can be divided into log files and query data. Historically, web servers
recording server activity, errors and referrer information used a log file to record each event. It is
now standard for web servers to use a combined log file format, called Common Log file Format.
This format combines the server and error logs into one file. More recently, the Extended Log file
Format has been used, which consolidates the Common format with additional information,
namely the referrer and cookie information. By incorporating referrer information, the output of
mining these log files becomes much more useful and actionable in marketing terms.

Cookies are tokens generated by the web server and held by the clients. The information stored in a
cookie helps to ameliorate the stateless nature of web server HTTP interactions, enabling
servers to track client access across their hosted web pages. The logged cookie data is
customizable and can contain keys for relating the navigational data to the content of the
marketing data, including transactional data. Usually the following information is contained in a
cookie: user ID, source IP address, time-to-live, randomly generated unique ID and user-defined
information.

A further data source that is typically generated on electronic commerce sites is
query data to a web server. This data is usually generated when users of the web site use search or
product locator facilities on the web site to search for relevant pages or products. This is often user
interaction with a product database, via the company’s Internet site.

The final source of data is web meta-data. This data describes the structure of the web site and is
usually generated dynamically and automatically after a site update. Web meta-data generally
includes neighbor pages, leaf nodes and entry points. This information is usually implemented as a
site-specific index table, which represents a labeled, directed graph. Meta-data also indicates
whether a page has been created statically or dynamically and whether user interaction is required
or not. In addition to the structure of a site, web meta-data can also contain information of a more
semantic nature, usually represented in XML.
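
To make the log file data described above concrete, the following minimal sketch parses one log line into named fields. It assumes the widely used Apache-style layout (host, identity, user, timestamp, request line, status, bytes, optionally followed by quoted referrer and user-agent fields). The sample line, host name, and field names are made up for illustration, and real server logs may be configured differently.

```python
import re
from datetime import datetime

# Assumed layout: host ident user [time] "request" status size ["referrer" "agent"]
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def parse_log_line(line):
    """Return a dict of fields for one log line, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # A dash in the size field means that no response body was sent.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    return entry


# Hypothetical log line with referrer and user-agent fields.
sample = (
    '192.0.2.10 - - [10/Oct/2000:13:55:36 -0700] '
    '"GET /products/index.html HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" "Mozilla/4.08"'
)

entry = parse_log_line(sample)
print(entry["host"], entry["request"], entry["status"], entry["referrer"])
```

Once lines are parsed this way, the referrer field gives the entry point of a visit, which is the information the text identifies as valuable for marketing analysis.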

Web Mining

In the context of web intelligence, web mining may be defined as the application of data
mining techniques to Internet data. This definition is sometimes extended to include statistical,
database optimization, and artificial intelligence techniques. Web mining has been sub-divided
into web structure, web usage, and web content mining. Web structure mining is the application
of data mining techniques to web site structures. In many cases this may be the entire web;
research on intelligent search engines and intelligent agents is described in many articles. In our
research, we define web structure mining as the mining of Internet data, together with data about
the structure of the site. This may be thought of as enriching the efficacy of the data mining
process with domain knowledge. The application of domain knowledge is further discussed in the
analytical process section. Web usage mining is the application of data mining to Internet web
server log file data, which is described in the earlier section on web data. Web usage mining
forms the core of our research in web mining for web intelligence, and log files provide the
foundation data for visitor analysis. This type of analysis of the visitors to a web site can be
subdivided into technographic and psychographic analysis. Technographic analysis focuses on
what is known about the visitor’s technical platform, i.e., operating system, browser, plug-ins,
user language, and cookie information. On its own, this information is not a rich source of
discriminatory data for visitor profiling, but in conjunction with the homogeneous data sets
available after extract, transform and load operations into a data warehouse, it contributes
significantly. Psychographic analysis is the examination of what we know about the behavioral
patterns of web site visitors. This includes the routes taken by visitors through a site, the time
spent on each page, route differences based on differing entry points to the site, aggregated route
behavior, general click stream behavior, etc. This is the information of most use to web marketers,
and is equivalent to marketing intelligence about where shoppers enter the store, where shoppers
go in the store, where they leave the store, what they look at but don’t buy, what they buy and
how quickly, etc.
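
As a rough illustration of psychographic analysis, the sketch below derives routes, time per page, and entry and exit pages from click-stream records. The record layout (visitor ID, timestamp in seconds, page) and all names are assumptions for this example and are not taken from any particular log format.

```python
from collections import defaultdict

# Hypothetical click-stream records: (visitor ID, seconds since log start, page).
CLICKS = [
    ("v1", 0, "/home"),
    ("v1", 30, "/products"),
    ("v1", 90, "/checkout"),
    ("v2", 10, "/products"),
    ("v2", 25, "/home"),
]


def routes(clicks):
    """Group clicks by visitor and return each visitor's time-ordered route."""
    by_visitor = defaultdict(list)
    for visitor, timestamp, page in sorted(clicks, key=lambda c: (c[0], c[1])):
        by_visitor[visitor].append((timestamp, page))
    return dict(by_visitor)


def time_on_pages(route):
    """Seconds spent on each page, taken as the gap to the next click.

    The last page of a route has no following click, so its dwell time is
    unknown and it is left out.
    """
    return [(page, next_ts - ts) for (ts, page), (next_ts, _) in zip(route, route[1:])]


def entry_and_exit(route):
    """First and last page of a route."""
    return route[0][1], route[-1][1]
```

Aggregating these per-visitor results over many sessions would give the route and dwell-time statistics mentioned above.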
Web content mining is the application of data and text mining algorithms and techniques
to the contents of web pages, usually written in HTML. At its simplest, this entails the extraction
of text between HTML tags for headings and titles, or the extraction of the HTML meta tag
content. Our research is based upon XML and RDF-based data schemas that help to ensure
correctness and proper context.
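
The simplest form of content mining mentioned above, pulling out title and heading text and meta tag content, can be sketched with Python's standard `html.parser` module. The class and function names are illustrative assumptions, and the sketch ignores malformed or nested heading markup.

```python
from html.parser import HTMLParser


class HeadingMetaExtractor(HTMLParser):
    """Collect <title>/<h1>-<h6> text and <meta name=... content=...> pairs."""

    TEXT_TAGS = {"title", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.texts = []        # list of (tag, text)
        self.meta = {}         # meta name -> content
        self._open_tag = None  # heading/title tag currently being read
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in self.TEXT_TAGS:
            self._open_tag = tag
            self._buffer = []
        elif tag == "meta":
            attributes = dict(attrs)
            if "name" in attributes and "content" in attributes:
                self.meta[attributes["name"]] = attributes["content"]

    def handle_data(self, data):
        if self._open_tag is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._open_tag:
            self.texts.append((tag, "".join(self._buffer).strip()))
            self._open_tag = None


def extract(html):
    """Return (heading/title texts, meta name-to-content mapping) for a page."""
    parser = HeadingMetaExtractor()
    parser.feed(html)
    parser.close()
    return parser.texts, parser.meta
```

Text inside inline tags such as `<b>` within a heading is kept, while body paragraphs are ignored.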

4.2 Representation

Intelligent Web agents can use the Problem Solver Markup Language (PSML) to specify
their roles, settings, and relationships with any other services. The intelligent Web must also have
the ability to process and understand natural language. It must understand and correctly judge the
meaning of concepts expressed in words, such as “good,” “best,” and “season.” Further, the
intelligent Web must grasp the granularities of these terms’ corresponding subjects and the
location of their ontology definitions.

Self-direction and learning

In addition to the semantic knowledge that an intelligent search can extract and
manipulate, intelligent Web agents must also incorporate a dynamically created source of
meta-knowledge that deals with the relationships between concepts and the spatial or temporal
constraint knowledge that planning and executing services use. This allows the agents to
self-resolve their conflicts. To solve specific problems, intelligent Web agents must be able to plan.
The planning process uses goals and associated subgoals, as well as constraints. In the intelligent
Web, ontologies alone will not be sufficient.

Personalization

The intelligent Web can personalize
interactions by remembering a particular user’s recent encounters and relating the topics and sites
that a user accesses during different online sessions. It may further identify other goals and
courses of action as a user’s interactions broaden and deepen, providing ever more data upon
which to base its recommendations. As part of its personalized approach to user services, the
intelligent Web will interact with the user when executing these tasks. In summary, semantics
contributes a vital aspect to the intelligent Web. We expect the Web to extend not only the
knowledge of artificial assistants, but also their intelligence.

WI’s Four Levels
We can study Web intelligence on at least four conceptual levels, ranging from the lower,
hardware-centered level to the higher, application-centered level. This framework builds upon the
fast development and application of various Web technologies.
• Internet-level communication, infrastructure, and security protocols.
At its core, the Web is a computer-network system. WI techniques for this level include Web data
prefetching systems built upon Web surfing patterns to resolve latency issues. The intelligence of
the Web’s prefetching routines comes from an adaptive learning process based on observations of
user surfing behavior.
• Interface-level multimedia presentation standards.
The Web functions as an interface for human-Internet interaction. At this level, the Web
interfaces require adaptive cross-language processing, personalized multimedia representation,
and multimodal data-processing capabilities.
• Knowledge-level information processing and management tools.
The Web serves as a distributed data and knowledge base. Accessing and manipulating this
information requires semantic markup languages to represent the Web’s contents in
machine-understandable formats. Agent-based autonomic computing functions such as searching,
aggregation, classification, filtering, managing, mining, and discovery can then use this data.
• Application-level ubiquitous computing and social intelligence environments.
The Web can form the basis for establishing social networks that contain communities of people,
organizations, or other social entities. Social relationships such as friendship, co-working, or
exchanging information about common interests connect these entities. The study of WI thus
encompasses issues central to social network intelligence. Users access the Web’s multimedia
content from stationary desktop computers and increasingly from mobile platforms as well.
Ubiquitous Web access and computing from various wireless devices requires even greater
adaptive personalization. WI should suit these needs well by providing techniques for use in
constructing interest models derived from implicit inferences based on user behavior.
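
The Internet-level idea of learning from observed surfing patterns in order to hide latency can be sketched as a first-order transition model: count which page tends to follow which, then fetch the most likely successor in advance. This is only one simple way such a predictor might be built; the class and method names are assumptions for illustration.

```python
from collections import Counter, defaultdict


class PrefetchPredictor:
    """First-order model: count which page tends to follow which."""

    def __init__(self):
        self._next_counts = defaultdict(Counter)

    def observe(self, session):
        """Update transition counts from one ordered list of visited pages."""
        for current, following in zip(session, session[1:]):
            self._next_counts[current][following] += 1

    def predict(self, page, k=1):
        """Return up to k pages most often seen right after `page`."""
        counts = self._next_counts.get(page, Counter())
        return [candidate for candidate, _ in counts.most_common(k)]
```

Each new session passed to `observe` adapts the counts, which is the sense in which the predictor learns from user behavior.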



4.3 PSML and Web inference engine

Distributed inference engines form PSML’s core. These engines can perform automatic
reasoning on the Web by incorporating autonomically collected and transformed content and
meta-knowledge into locally operational knowledge and databases. A feasible way to implement
PSML is to use an existing Prolog-like logic language supplemented with agents that perform
dynamic-content and meta-knowledge updates.
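
PSML itself is not specified in this report, so the following is only a generic sketch of rule-based inference over collected facts. It uses simple forward chaining on propositional facts rather than the backward chaining of Prolog, and every fact, rule and function name in it is a hypothetical example.

```python
def forward_chain(facts, rules):
    """Apply rules (premises -> conclusion) until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return known


# Hypothetical facts, as if collected from two different Web sources.
FACTS = {"in_season(mango)", "in_stock(mango)"}

# Hypothetical rules: a tuple of premises and a single conclusion.
RULES = [
    (("in_season(mango)", "in_stock(mango)"), "good_buy(mango)"),
    (("good_buy(mango)",), "recommend(mango)"),
]
```

Calling `forward_chain(FACTS, RULES)` adds "good_buy(mango)" and then "recommend(mango)" to the known facts, showing how newly collected content can lead to new conclusions.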

4.4 Social network intelligence
The social intelligence approach to Web computing presents new opportunities for WI
research and development. As the Web becomes an integral part of our society, WI can and
should support Web-based social networks at all levels. Study in this area must receive as much
attention as Web mining, Web agents, ontologies, and related topics.

Web-based computing

The intelligent Web seeks to provide not only a medium for seamless information exchange and
knowledge sharing, but also the sort of human-crafted resources that encourage sustainable
knowledge creation and scientific and social evolution. The intelligent Web will rely on Grid-like
service agencies that self-organize, learn, and evolve their courses of action to perform service
tasks and transform their identities and interrelationships in communities. These services will also
cooperate and compete among themselves to optimize their resources and utilities and those of
others.

4.5 Benchmark applications

To effectively develop and evaluate systems and applications that address WI research issues, we
must consider benchmark applications that will demonstrate these capabilities. Suppose we want
to conduct a Web-based search to compile the data and generate a market report for an existing
product or a potential new product. To perform these tasks, an information agent will mine and
integrate available Web information, which will in turn be passed to a market analysis agent.
The analysis will involve the quantitative simulation of customer behavior in a marketplace,
instantaneously handled by other service agencies involving a large number of Grid agents. Given
that the variables can number in the hundreds or thousands, generating one prediction
can easily require significant computing resources.


CHAPTER 5
Computational Web Intelligence and Granular
Web Intelligence for Web Uncertainty

With the explosive growth of Web data on wired and wireless networks, a challenging
problem for a new generation of intelligent Web techniques is how to handle uncertain Web data
and make the right decisions under Web uncertainty. It is therefore necessary to develop new intelligent
Web techniques for Web applications under different types of uncertainty, including probability,
possibility, fuzziness, roughness, randomness, etc. Web Intelligence (WI), a new direction for
scientific research and development, exploits Artificial Intelligence (AI) and advanced
Information Technology (IT) on the Web and Internet. In general, AI-based Web techniques can
be used to handle probabilistic Web data. Since there are lots of fuzzy Web data and other kinds
of uncertain Web data, we need to apply relevant intelligent techniques to process the
uncertain Web data that cannot be processed by traditional precise techniques such as
Boolean logic. To promote the use of fuzzy logic on the Internet, Zadeh stated that "fuzzy logic may
replace classical logic as what may be called the brainware of the Internet" at the 2001 BISC
International Workshop on Fuzzy Logic and the Internet (FLINT2001). Fuzzy intelligent
agents are used in smart e-Commerce applications. Conceptual fuzzy sets are applied to Web
search engines to improve the quality of Web service. Clearly, intelligent e-brainware based on
soft computing plays an important role in smart e-Business applications. Soft computing
techniques can therefore play an important role in building the intelligent Web brain, and
soft-computing-based Web techniques can enhance Web QoI (Quality of Intelligence). In order to use CI
(Computational Intelligence) techniques to make intelligent wired and wireless systems with high
QoI, Computational Web Intelligence (CWI) was proposed at the special session on CWI at
FUZZ-IEEE'02 of the 2002 World Congress on Computational Intelligence. CWI is a hybrid
technology of CI and Web Technology (WT) dedicated to increasing the QoI of e-Business
application systems on wired and wireless networks. The main CWI techniques include:
• Fuzzy Web Intelligence (FWI)
• Neural Web Intelligence (NWI)
• Evolutionary Web Intelligence (EWI)
• Granular Web Intelligence (GWI)
• Rough Web Intelligence (RWI)
• Probabilistic Web Intelligence (PWI)


5.1 WEB UNCERTAINTY

The Web holds various data sets distributed over a huge number of computers, just as a human
brain contains biological data stored in a large number of biological neurons. The biological data
in the human brain are not always precise but are uncertain in most cases due to information
incompleteness, linguistic vagueness, imperfect measurement, knowledge limitations, etc.
Similarly, Web data on the Internet are usually not accurate but uncertain because of partial Web
information, dynamic Web data, fuzzy Web data, Web ontology, unpredictable Web information,
different Web users, different hardware environments, different data formats, etc. So the big
challenge is how to design intelligent Web techniques for Web-based applications under
uncertainty. With the explosive growth of wired and wireless networks, Web users are overwhelmed by
huge amounts of raw Web data because current Web tools still cannot find satisfactory
information and knowledge effectively or make decisions correctly in the face of uncertain Web
data, uncertain Web information, uncertain Web knowledge and uncertain Web intelligence. Today
the Internet and wireless networks connect an enormous number of computing devices including
computers, PDAs (Personal Digital Assistants), cell phones, home appliances, etc. CI is used in
telecommunication network applications. Clearly, such a huge networked computing system spanning
the world provides a complex, dynamic and global environment for developing new
distributed intelligent theory and technology based on AI, BI (Biological Intelligence) and CI.
Therefore, we must design an intelligent Web technology for dealing with Web uncertainty.

5.2 COMPUTATIONAL WEB INTELLIGENCE FOR WEB
UNCERTAINTY

Zadeh states that traditional (hard) computing is the computational paradigm that underlies
artificial intelligence, whereas soft computing is the basis of CI. Based on the discussions of CI
and AI, the basic conclusion is that CI is different from AI, but the two overlap.
In general, hard computing and soft computing can be used in intelligent hard Web applications
and intelligent soft Web applications, respectively. To enhance the QoI (Quality of Intelligence) of e-Business,
Computational Web Intelligence (CWI) is proposed to use CI and Web Technology (WT) to make
intelligent e-Business applications on the Internet and wireless networks. The concise relation
is given by CWI = CI + WT. Fuzzy logic, neural networks, evolutionary computation, granular
computing, rough sets and probabilistic methods are major CI techniques for intelligent
e-Applications on the Internet and wireless networks. Currently, the six major research areas of CWI
are (1) Fuzzy WI (FWI), (2) Neural WI (NWI), (3) Evolutionary WI (EWI), (4) Probabilistic WI
(PWI), (5) Granular WI (GWI), and (6) Rough WI (RWI). In the future, more CWI research areas
will be added. The six current major CWI techniques are described below.
• FWI has two major techniques: fuzzy logic and WT. The main goal of FWI is to design
intelligent fuzzy e-agents that deal with the fuzziness of Web data, Web information and Web
knowledge, and make good decisions for e-Applications effectively.

• NWI has two major techniques: neural networks and WT. The main goal of NWI is to
design intelligent neural e-agents that can learn Web knowledge from Web data and
Web information and make smart decisions for e-Applications.

• EWI has two major techniques: evolutionary computing and WT. The main goal of EWI
is to design intelligent evolutionary e-agents that optimize e-Application tasks effectively.

• PWI has two major techniques: probabilistic computing and WT. The main goal of PWI is
to design intelligent probabilistic e-agents that deal with the probability of Web data, Web
information and Web knowledge for e-Applications effectively.

• GWI has two major techniques: granular computing and WT. The main goal of GWI is to
design intelligent granular e-agents that deal with Web data granules, Web information
granules and Web knowledge granules for e-Applications effectively.

• RWI has two major techniques: rough sets and WT. The main goal of RWI is to design
intelligent rough e-agents that deal with the roughness of Web data, Web
information and Web knowledge for e-Applications effectively.

CWI can be used to increase the
QoI of e-Business applications, and it has many wired and wireless applications in intelligent
e-Business. Currently, FWI, NWI, EWI, PWI, GWI and RWI are the major CWI techniques. CWI can
be used to deal with the uncertainty and complexity of Web applications. Hybrid Web Intelligence (HWI), a broader area
than CWI, can be applied to more complex e-Business applications. HWI, including
CWI, will play an important role in designing smart e-Application systems for wired and
wireless users. In summary, CWI technology is based on multiple CI techniques and WT.
Relevant CI techniques and WT are selected to make a powerful CWI system for a specific
e-Business application.
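
As a small illustration of the fuzzy side of CWI, the sketch below grades how well an offer matches the vague request "cheap and well rated" instead of answering with a Boolean yes or no. The membership shapes, the thresholds and the function names are assumptions chosen for the example; the minimum is used as the fuzzy AND, which is one common choice among several.

```python
def falling(x, full, zero):
    """Membership that is 1 up to `full`, 0 from `zero` on, linear in between."""
    if x <= full:
        return 1.0
    if x >= zero:
        return 0.0
    return (zero - x) / (zero - full)


def rising(x, zero, full):
    """Membership that is 0 up to `zero`, 1 from `full` on, linear in between."""
    if x <= zero:
        return 0.0
    if x >= full:
        return 1.0
    return (x - zero) / (full - zero)


def good_offer(price, rating):
    """Degree to which an offer is 'cheap AND well rated' (min as fuzzy AND)."""
    cheap = falling(price, full=20.0, zero=60.0)       # assumed price thresholds
    well_rated = rising(rating, zero=2.0, full=4.5)    # assumed rating thresholds
    return min(cheap, well_rated)
```

An offer priced at 40 with a rating of 4.5 receives a degree of 0.5: it is fully "well rated" but only half "cheap", and the minimum keeps the weaker of the two grades.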

5.3 GRANULAR WEB INTELLIGENCE FOR WEB UNCERTAINTY

Granular computing technology can be used to do high-level information processing and
knowledge discovery based on data granules that are clustered intelligently from raw data with
uncertainty. Since there are huge amounts of Web data at different geographical places, it is
natural to use granular computing technology to preprocess raw Web data, then do
granular Web data mining, and finally discover granular Web knowledge. So GWI is a general
intelligent technology for dealing with raw Web data with uncertainty. Mathematically speaking,
to handle Web uncertainty effectively, it is necessary to develop a novel granular set theory.
A general framework for granular sets is briefly described below to deal with data
uncertainty such as Web data uncertainty.

Definition 1 (A Granular Set). Let $X$ be a universal set of data elements. A granular set $A$ in $X$ is
characterized by $m$ granular membership functions $F_k(x)$, with $F_k(x) \in [0, 1]$ for $x \in X$ and
$k = 1, 2, \ldots, m$.
For example:
If $m = 1$, a granular set is a fuzzy set (with a crisp set as a special case), since one membership function is
used. Traditional fuzzy sets use only truth values in $[0, 1]$ to handle data
uncertainty.

If $m = 2$, a granular set is an intuitionistic fuzzy set [25], since two membership functions are used.
Intuitionistic fuzzy sets use both truth values and falsity values in $[0, 1]$ to deal with data
uncertainty. If $m = 3$, a granular set is a neutrosophic set, since three membership functions are
used. For example, interval neutrosophic sets are defined on a truth-membership function, an
indeterminacy-membership function and a falsity-membership function. The major advantage of
interval neutrosophic sets is that they reduce data uncertainty by using three types of information,
namely truth values, falsity values and indeterminacy values, in order to make the right decision.


We hope that new granular sets and new granular logical systems with four or more membership
functions will be developed in the future to handle Web uncertainty effectively and
fundamentally.
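
The granular set of Definition 1 can be sketched as a small Python class that holds the membership functions and returns all of their grades for an element. The class name, the relevance scores and the particular membership functions are assumptions for illustration; the sketch only checks that each grade lies in [0, 1] and does not enforce any further constraints that specific set theories may impose.

```python
class GranularSet:
    """A set over a universe X described by m membership functions into [0, 1]."""

    def __init__(self, *membership_functions):
        if not membership_functions:
            raise ValueError("at least one membership function is required")
        self.functions = membership_functions

    @property
    def m(self):
        """Number of membership functions."""
        return len(self.functions)

    def grades(self, x):
        """Return (F_1(x), ..., F_m(x)), checking that each lies in [0, 1]."""
        values = tuple(f(x) for f in self.functions)
        for value in values:
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"membership grade {value} is outside [0, 1]")
        return values


# Hypothetical example: x is a page relevance score between 0 and 10.
fuzzy_relevant = GranularSet(lambda s: s / 10)      # one function: fuzzy set
neutrosophic_relevant = GranularSet(
    lambda s: s / 10,        # truth membership
    lambda s: 0.25,          # indeterminacy membership (assumed constant)
    lambda s: 1 - s / 10,    # falsity membership
)
```

With one function the object behaves like a fuzzy set, and with three it carries truth, indeterminacy and falsity grades in the manner of a neutrosophic set.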

Web uncertainty is a long-term challenging problem related to many Web applications like
semantic Web, Web mining, Web knowledge discovery, Web agents, Web search engines, Web
security, e-Commerce, e-Business, etc. To handle Web uncertainty, we need to develop relevant
intelligent Web technology such as CWI and GWI. Importantly, we need to continue to create
new granular sets such as neutrosophic sets to try to solve Web uncertainty effectively.

Web uncertainty is a difficult long-term problem, so we need to use different intelligent
techniques together to address it. Hybrid Web Intelligence (HWI), a broad hybrid
research area, uses AI, CI, BI (Biological Intelligence) and WT to build hybrid intelligent Web
systems that handle Web uncertainty effectively and efficiently. In the future, HWI will have many
intelligent Web applications under uncertainty. The main HWI applications include (1) intelligent
Web agents for e-Applications such as e-Commerce, e-Government, e-Education and e-Health,
(2) intelligent Web security systems such as intelligent homeland security systems, (3) intelligent
Web bioinformatics systems, (4) intelligent grid computing systems, (5) intelligent wireless
mobile agents, (6) intelligent Web expert systems, (7) intelligent Web entertainment systems, (8)
intelligent Web services, (9) Web data mining and Web knowledge discovery, and (10) intelligent
distributed and parallel Web computing systems based on a large number of networked computing
resources, among others.



CHAPTER 6
Trends and Challenges of WI Related Research and
Development

Web Intelligence presents excellent opportunities and challenges for the research and
development of new generation Web-based information processing technology, as well as for
exploiting business intelligence. With the rapid growth of the Web, research and development on
WI have received much attention. We expect that more attention will be focused on WI in the
coming years. Many specific applications and systems have been proposed and studied. Several
dominant trends can be observed and are briefly reviewed in this section. E-commerce is one of
the most important applications of WI. The e-commerce activity that involves the end user is
undergoing a significant revolution. The ability to track users’ browsing behavior down to
individual mouse clicks has brought the vendor and end customer closer than ever before. It is
now possible for a vendor to personalize the product message for individual customers
on a massive scale. This is called targeted marketing or direct marketing.

Web mining and Web usage analysis play an important role in e-commerce for customer
relationship management (CRM) and targeted marketing. Web mining is the use of data mining
techniques to automatically discover and extract information from Web documents and services.
Zhong et al. proposed a way of mining peculiar data and peculiarity rules that can be used for
Web-log mining. They also proposed ways of doing targeted marketing by mining classification rules
and market value functions. A challenge is to explore the connection between Web mining and
the related agent paradigm such as Web farming, which is the systematic refining of information
resources on the Web for business intelligence. Text analysis, retrieval, and Web-based digital
libraries form another fruitful research area in WI. Topics in this area include semantic models of the
Web, text mining, and automatic construction of citations. Abiteboul et al. systematically investigated the
data on the Web and the features of semi-structured data. Zhong et al. studied text mining on the
Web, including automatic construction of ontologies, e-mail filtering systems, and Web-based
e-business systems. Web-based intelligent agents are aimed at improving a Web site or providing
help to a user. Liu et al. worked on e-commerce agents. Liu and Zhong worked on Web agents
and KDDA (Knowledge Discovery and Data Mining Agents). We believe that Web agents will be
a very important issue. It is therefore not surprising that we decided to hold the WI conference in
parallel with the Intelligent Agents conference. In the next section, we provide a more detailed
description of intelligent Web agents.

The Web itself has been studied from two aspects: the structure of the Web as a graph and
the semantics of the Web. Studies on Web structures investigate several structural properties of
graphs arising from the Web, including the graph of hyperlinks and the graph induced by
connections between distributed search servants. The study of the Web as a graph is not only
fascinating in its own right, but also yields valuable insight into Web algorithms for crawling,
searching and community discovery, and into the sociological phenomena which characterize its
evolution. Studies of the semantics of the Web were initiated by Tim Berners-Lee, the creator of
the World Wide Web. The resulting vision is referred to as the “semantic Web”, where information will be
machine-processable in ways that support intelligent network services such as information brokers
and search agents.
The semantic Web requires interoperability standards that address not only the syntactic
form of documents but also their semantic content. A semantic Web also lets agents utilize all the
data on all Web pages, allowing them to gain knowledge from one site and apply it to logical
mappings on other sites for ontology-based Web retrieval and e-business intelligence. Ontologies
and agent technology can play a crucial role in enabling such Web-based knowledge processing,
sharing, and reuse between applications. A new DARPA program called DAML (DARPA Agent
Markup Language) is a step toward a “semantic Web” where agents, search engines and other
programs can read DAML mark-up to decipher meaning rather than just the content on a Web
site.

6.1 Intelligent Web Agents

Intelligent agents are computational entities that are capable of making decisions on behalf
of their users and self-improving their performance in dynamically changing and unpredictable
task environments. Liu provided a comprehensive overview of related research work in the
field of autonomous agents and multi-agent systems, with an emphasis on its theoretical and
computational foundations as well as in-depth discussions of the useful techniques for developing
various embodiments of agent-based systems, such as autonomous robots, collective vision and
motion, autonomous animation, and search and segmentation agents. The core of those techniques
is the notion of synthetic or emergent autonomy based on behavioral self-organization. Intelligent
Web Agents (WA) are software programs that primarily serve two important roles: (a)
autonomous entities for exploring and exploiting Web-based services, and (b) prototype entities
for exhibiting and explaining Web-generated regularities. These two roles are summarized below.

6.2 From WA to Web-Based Services

The first role for WA can be readily described and appreciated by examining the following
typical scenarios in which various tasks and objectives are achieved.
• Personalized Multimodal Interface. WA can provide users with a user-friendly style of
presentation that personalizes both the interaction with users and the content presentation.
This activity involves the creation of various cognitive aids, including tables, charts,
executive summaries, indices, and personalized visual assistants (e.g., graphically
animated personas and virtual-reality avatars). WA as interfaces must make electronic
services easy to use. The provided cognitive aids must be concise (i.e., accessible
with as few manipulations and as little memorization as possible) and
consistent (i.e., understandable based on users’ previously customized cognitive styles).

• Push and Pull. WA can play an important role in dynamically creating pull-and-push
advertising. Here, by pull-and-push advertising we mean that a user expresses his or her
preferences during the interaction with the agents (pull advertising) and in return the agents
search for and deliver information about the preferred items dynamically to the user (push
advertising). Such agents can also increase the positive externality of products, that is,
the better people are informed about certain products, the more likely the products are to be
sold.

• Pattern Discovery and Self-Organization. WA make it possible to detect which buying
patterns are forming among users and how they are structured, and hence to manage online
commerce effectively. Collaborative recommendation agents can help individual users aggregate into
groups, which can in turn form a dynamic marketplace.

• Information Gateway. WA can provide users with immediate access to the most relevant
information. This support encompasses a wide spectrum of information filtering and
delivery activities: manipulating various heterogeneous Web sources including
databases, data warehouses, newswire, financial reports, newsletters, newsgroups,
outbound emails, electronic bulletin boards, and hypermedia documents, and, based on
users’ profiles, tailoring and delivering the retrieved information to the users. The
provided summary information must be just-in-time (i.e., delivered whenever it is needed),
relevant (i.e., focused on whichever topics the users are concerned with), and
up-to-the-minute (i.e., refreshed whenever a new piece of information arrives). An example of
an application with this type of agent support is comparison shopping that utilizes WA with
mobile and filtering capabilities. Some related experiences have been reported in the literature.

• Reward. WA can motivate users to enter and re-enter a certain electronic service. While an
ever-greater proliferation of content continues to consume individuals’ attention, e.g.,
through push technology to sell something or to support users, WA can play a crucial role
in creating a captive audience, in educating it constantly, and even in removing
users’ old purchase habits. To be rewarding is to add value. The motivational rewards or
incentives can be created by offering free access to certain information and utility
resources (e.g., free software download), opportunities to participate in multi-user
information/commodity exchange activities (e.g., collaborative recommendation, chat,
bidding, and auction), and scheduled plans for promotional deals.

• Matchmaking. WA can serve as a new means for trading commodities. Since the interests
of users as well as the availability of products from dealers can change dynamically from
time to time, what usually happens in present-day electronic commerce is: (1) a dealer
sells his or her items simply because these are the only items that he or she has at the
moment, or (2) a user buys a certain item simply because it is the last item that he or she
can find that partially fits his or her need. WA-based customized business attempts to
change the existing online buying and selling into the following new scenarios: (1) a
dealer identifies and offers exactly what users are interested in, and (2) a user finds and
purchases what he or she really loves. Some technical issues related to matchmaking
have been addressed in the literature.

• Decision. WA can assist Web users in making decisions. Such decision support may take
the form of evaluations or recommendations on the various features of certain specific
items, cost-benefit analysis, inference support for optimizing utility and resources with
respect to functional, time, and cost requirements, and model-based trend analysis and
projections concerning new patterns of demand.

• Delegation. WA can act on behalf of Web users in online activities. The
tasks that may be delegated to WA include matchmaking, server monitoring,
negotiation, bidding, auction, transaction, transfer of goods, and follow-up support. This
scenario will enable a paradigm shift from user-centric to user-delegated
electronic business. The delegation of these tasks may be carried out in either a
semi-autonomous (with users’ intervention on decisions) or a fully autonomous manner. To this
end, various computational theories and models have been proposed and reported in the literature.

• Collaborative Work Support. WA can offer the infrastructure support as well as the
necessary functions for collaboratively solving problems and managing workflow
activities.
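
Among the scenarios listed above, collaborative recommendation lends itself to a short sketch: items are suggested to a user because other users with overlapping purchases bought them. The scoring rule (summed purchase overlap), the sample data and the function name are assumptions for illustration, not a method taken from the cited work.

```python
from collections import Counter


def recommend(purchases, user, k=2):
    """Suggest up to k items bought by users who share purchases with `user`.

    Each candidate item is scored by the summed overlap (number of items in
    common) between `user` and every other user who bought it.
    """
    own = purchases[user]
    scores = Counter()
    for other, items in purchases.items():
        if other == user:
            continue
        overlap = len(own & items)
        if overlap == 0:
            continue
        for item in items - own:
            scores[item] += overlap
    # Highest score first; ties broken alphabetically for a stable result.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:k]]


# Hypothetical purchase histories: user -> set of items bought.
PURCHASES = {
    "ann": {"book", "lamp"},
    "bob": {"book", "lamp", "desk"},
    "cat": {"book", "pen"},
    "dan": {"sofa"},
}
```

Here "ann" is recommended "desk" before "pen", because "bob" shares two purchases with her while "cat" shares only one. Users with similar scores could also be grouped, which hints at how such agents might aggregate individuals into communities.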



CHAPTER 7
Semantic Search Engine

The framework’s search engine component queries the information generated by the annotation
component. It accepts queries posed in SPARQL and returns a set of links to matching resources.
A specialized search interface lets users develop an abstract model of a semantic query, pose it to
the engine, and then review the resulting matched documents. The search interface gives end
users (people who aren’t experts in Semantic Web technologies) a way to access the resources
filtered and annotated by the semantic annotator component. Users can also add and delete
entities and properties (with related values), so that they can interact with the knowledge base to
fine-tune the query, making subsequent searches more accurate. The key aim for the query
interface is to give the user an intuitive and clear abstract query model that hides, as much as
possible, the underlying complexity of representation and reasoning. Furthermore, the agents in
the search engine multi-agent system exhibit various autonomic features that aim to make the
system more robust and scalable.

The QS system has been deployed in two different commercial test cases in the UK. In the first
case, QS was used to examine specific Web-published documents for commercial opportunities
matching the business interests of the customer company. In the second deployment, QS was used
to perform knowledge-based searches over existing database sources. In evaluating the
performance of the search system in both applications, we found that by using ontological
knowledge and ontology-based annotations, users could perform more accurate queries while
being returned up to 71 percent fewer documents than with a keyword-based search engine, in the
best cases eliminating more than 90 percent of the irrelevant documents. We are now further
refining these two deployments, and we are planning more industrial deployments in the near
future with other UK companies.
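
To make the idea of querying annotations concrete, the following Python sketch matches simple triple patterns against a small set of annotations and returns the links of the matching resources. It is a simplified stand-in for SPARQL basic graph pattern matching, not the QS implementation: the ANNOTATIONS data, the example.org links, the property names, and the functions query and keyword_search are all assumptions made for illustration. A naive keyword search is included only to show why a structured query can return fewer documents; it does not reproduce the figures reported above.

```python
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical annotations as (resource link, property, value) triples.
# All links, properties, and values are invented for this example.
ANNOTATIONS: List[Triple] = [
    ("http://example.org/doc1", "type", "Tender"),
    ("http://example.org/doc1", "sector", "Construction"),
    ("http://example.org/doc2", "type", "Tender"),
    ("http://example.org/doc2", "sector", "Software"),
    ("http://example.org/doc3", "type", "NewsItem"),
    ("http://example.org/doc3", "sector", "Software"),
]


def match(pattern: Triple, triple: Triple,
          binding: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Extend `binding` so that `pattern` equals `triple`, or return None."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):           # a variable such as "?doc"
            if new.setdefault(p, t) != t:
                return None             # conflicts with an earlier binding
        elif p != t:
            return None                 # a constant that does not match
    return new


def query(patterns: List[Triple], select: str) -> Set[str]:
    """Return the values of variable `select` that satisfy all patterns."""
    bindings: List[Dict[str, str]] = [{}]
    for pattern in patterns:
        extended = []
        for b in bindings:
            for triple in ANNOTATIONS:
                result = match(pattern, triple, b)
                if result is not None:
                    extended.append(result)
        bindings = extended
    return {b[select] for b in bindings}


def keyword_search(word: str) -> Set[str]:
    """Naive baseline: any resource with a value containing `word`."""
    return {s for s, _, o in ANNOTATIONS if word.lower() in o.lower()}


# Roughly corresponds to the SPARQL query (prefix declarations omitted):
#   SELECT ?doc WHERE { ?doc :type :Tender . ?doc :sector :Software }
semantic_links = query(
    [("?doc", "type", "Tender"), ("?doc", "sector", "Software")], "?doc"
)
keyword_links = keyword_search("Software")

print(sorted(semantic_links))
print(sorted(keyword_links))
```

Here the structured query returns only the software tender, while the keyword baseline also returns the news item that merely mentions the same sector.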



CHAPTER 8
CONCLUSION

While it may be difficult to define exactly what Web Intelligence (WI) is, one can easily
argue for the need to create such a subfield of study in computer science. With the
rapid growth of the Web, we foresee a fast-growing interest in Web Intelligence. Roughly
speaking, we define Web Intelligence as a field that “exploits Artificial Intelligence (AI) and
advanced Information Technology (IT) on the Web and Internet.” It may be viewed as a marriage
of artificial intelligence and information technology in the new setting of the Web. By examining
the scope and historical development of artificial intelligence, we discuss some fundamental
issues of Web Intelligence in a similar manner. There is no doubt in our mind that results from AI
and IT will influence the development of WI. Instead of searching for a precise and
non-controversial definition of WI, we list topics that might interest a researcher working on
Web-related issues. In particular, we identify some challenging issues of WI, including
e-commerce, studies of Web structures and Web semantics, Web information storage and retrieval,
Web mining, and intelligent Web agents. The aim is to examine the performance characteristics of
various approaches in Web-based intelligent information technology, and to cross-fertilize ideas
on the development of Web-based intelligent information systems among different domains.

This report is not intended to be a complete and systematic study of the field, but rather a
record of personal observations, scattered (perhaps immature) ideas, general comments,
speculations, and opinions. We hope that a careful study of these not yet well-connected points
may lead to a web of knowledge for Web Intelligence. We examined the Web from several
perspectives, which enables us to see clearly the current status, the scope, and the future of
Web Intelligence research. We then commented on the exploration of the Web by Web Intelligence
from a few angles and posed a couple of challenges. Finally, Web-based Support Systems (WSS)
were used to demonstrate the ideas presented, which may further enhance the Web as a tool “of
the people, by the people, for the people.”



REFERENCES

[1] Research Challenges and Trends in the New Information Age, Y.Y. Yao, Ning Zhong,
Jiming Liu, and Setsuo Ohsuga, IEEE.

[2] Web Intelligence: New Frontiers of Exploration, Yiyu (Y.Y.) Yao, Department of
Computer Science, University of Regina, Regina, Saskatchewan, IEEE.

[4] Education and the Semantic Web, Vladan Devedzic, Department of Information Systems
and Technologies, FON – School of Business Administration, University of Belgrade.

[5] Computational Web Intelligence and Granular Web Intelligence for Web Uncertainty,
Yan-Qing Zhang, Member, IEEE.

