WEB INTELLIGENCE
Seminar Report
Submitted in partial fulfilment of the requirements
for the award of the degree of
Bachelor of Technology
in
Computer Science Engineering
of
Cochin University Of Science And Technology
by

NIJIL Y
(12080050)

DIVISION OF COMPUTER SCIENCE
SCHOOL OF ENGINEERING
COCHIN UNIVERSITY OF SCIENCE AND TECHNOLOGY
KOCHI-682022
DIVISION OF COMPUTER SCIENCE
SCHOOL OF ENGINEERING
COCHIN UNIVERSITY OF SCIENCE AND TECHNOLOGY
KOCHI-682022

Certificate
Certified that this is a bonafide record of the seminar entitled
“WEB INTELLIGENCE”
presented by the following student
NIJIL Y
of the VIIth semester, Computer Science and Engineering in the year 2010
in partial fulfillment of the requirements in the award of Degree of
Bachelor of Technology in Computer Science and Engineering of Cochin
University of Science and Technology.

Mr. SUDEEP EDAYILAM
Seminar guide

Dr. DAVID PETER
Head Of Division
ACKNOWLEDGEMENT

I thank GOD almighty for guiding me throughout the seminar. I would like to thank all those
who have contributed to the completion of the seminar and helped me with valuable
suggestions for improvement.
I am extremely grateful to Dr. David Peter, Head Of Division, Division of Computer
Science, for providing me with the best facilities and atmosphere for creative work, and for
his guidance and encouragement. I am profoundly indebted to my seminar guide Mr. Sudheep Elayidom,
Sr. Lecturer, Division of Computer Science, for all the help and support extended to me. I thank
all staff members of my college and my friends for extending their cooperation during my
seminar.

Above all I would like to thank my parents without whose blessings, I would not have been
able to accomplish my goal.

NIJIL Y
ABSTRACT

Web Intelligence is a new direction for scientific research and development that explores
the fundamental roles as well as practical impacts of artificial intelligence and advanced
information technology for the next generation of Web-empowered systems, services, and
environments. Web Intelligence is regarded as the key research field for the development of the
Wisdom Web (including the Semantic Web). The Web revolutionizes the way we gather,
process, and use information. Despite current technological advances, we still cannot predict
what the Web’s next paradigm shift will be. However, we propose that this change will
transform the Web into an intelligent entity; hence the term Web intelligence.
The next-generation Web will go beyond improved information search and knowledge
queries and will help people achieve better ways of living, working, playing, and learning. To
fulfil its potential, the intelligent Web’s design and development must incorporate and integrate
several fundamental capabilities. A few of its capabilities are reflexive server propagation,
growth specialization, and autocatalysis. Intelligent Web agents can use the Problem Solver
Mark-up Language (PSML) to specify their roles, settings, and relationships with any other
services. The intelligent Web must also have the ability to process and understand natural
language. It must understand and correctly judge the meaning of concepts expressed in words,
such as “good,” “best,” and “season”. WI research incorporates knowledge from existing
disciplines, such as artificial intelligence and information technology, in a totally new domain.
At the same time, Web Intelligence research also enriches these established disciplines as it
introduces new topics and challenges.
TABLE OF CONTENTS

1 Introduction
2 Perspectives of WI
3 Intelligence Exploration
    3.1 A New Field of Science, Technology and Engineering
    3.2 Design Philosophy and Principles of the Web
    3.3 The Laws of the Web
    3.4 The Web Revolution: One Link at a Time
    3.5 The More Things Change, the More They Stay the Same
4 Components of Web Intelligence
    4.1 Web Data
    4.2 Representation
    4.3 PSML and Web Inference Engine
    4.4 Social Network Intelligence
5 Computational Web Intelligence
    5.1 Web Uncertainty
    5.2 Computational Web Intelligence for Web Uncertainty
    5.3 Granular Web Intelligence for Web Uncertainty
6 Trends and Challenges of WI Related Research and Development
    6.1 Intelligent Web Agents
    6.2 From WA to Web-Based Services
7 Semantic Search Engine
8 Conclusion
References
Web Intelligence

CHAPTER 1

INTRODUCTION
With the rapid growth of the Internet and the World Wide Web (WWW), we have now entered
a new information age. The Web provides a totally new medium for communication, which goes
far beyond traditional communication media such as radio, telephone and television. The
Web has a significant impact on both academic research and ordinary daily life. It revolutionizes
the way in which information is gathered, stored, processed, presented, shared, and used. The
Web offers new opportunities and challenges for many areas, such as business, commerce,
marketing, finance, publishing, education, research and development. For computer scientists, the
Web introduces many new research topics and provides a new platform on which to reconsider old
problems. It might be high time to create a new sub-discipline of computer science covering
theories and technologies related to the Web. Web Intelligence is our proposal for this purpose.

Through the billions of Web pages created with HTML and XML, or generated
dynamically by underlying Web database service engines, the Web captures almost all aspects of
human endeavor and provides a fertile ground for data mining. However, searching,
comprehending, and using the semi-structured information stored on the Web poses a significant
challenge because this data is more sophisticated and dynamic than the information that
commercial database systems store. To supplement keyword-based indexing, which forms the
cornerstone of Web search engines, researchers have applied data mining to Web-page ranking.
In this context, data mining helps Web search engines find high-quality Web pages.

WI explores the fundamental and practical impact that artificial intelligence and advanced
information technology will have on the next generation of Web-empowered systems, services,
and environments. In an era dominated by the World Wide Web, Grid computing, intelligent-agent
technology, and ubiquitous social computing, WI represents information technology’s next
challenge.

Motivations and Justifications for WI

The introduction of Web Intelligence (WI) can be motivated and justified from both academic and
industrial perspectives. Two features of the Web make it a useful and unique platform for
computer applications and research: its size and its complexity. The Web contains a huge number
of interconnected Web documents known as Web pages. For example, the popular search engine
Google claimed that it could search 1,346,966,000 pages as of February 2001. The sheer size of
the Web leads to difficulties in the storage,
management, and efficient and effective retrieval of Web documents. The complexity of the Web,
in terms of connectivity and diversity of Web documents, forces us to reconsider many existing
information systems, as well as the theories, methodologies and technologies underlying those
systems. One has to deal with a heterogeneous collection of structured, unstructured,
semi-structured, interrelated, and distributed Web documents consisting of texts, images and sounds,
instead of a homogeneous collection of structured and unrelated objects. The latter is the subject of
study of many conventional information systems, such as databases, information retrieval, and
multi-media systems. To accommodate the needs of the Web, one needs to study issues in the
design and implementation of Web-based information systems by combining and extending
results from existing intelligent information systems. Existing theories and technologies need
to be modified or enhanced to deal with the complexity of the Web. Although individual Web-based
information systems are constantly being deployed, advanced issues and techniques for
developing and benefiting from the Web remain to be systematically studied. The challenges
brought by the Web to computer scientists may justify the creation of the new sub-discipline, WI,
for carrying out Web-related research.

The Web increases the availability and accessibility of information to a much
larger community than any other computer application. The introduction of Personal Computers
(PCs) brought computational power to ordinary people. It is the Web that delivers information
more effectively to everyone's fingertips. The Web, no doubt, offers a new means for
sharing and transmitting information unmatched by other media. The revolution started by the
Web is just beginning. New business opportunities, such as e-commerce, e-banking, and
e-publication, will increase with the maturity of the Web. One can hardly overemphasize the
impact of the Web on the business and industrial world. The creation of a new sub-discipline
devoted to Web-related research and applications might have significant value in the future.
The need for WI may be further illustrated by the fast-growing research and industrial
activities centered on it. We searched the Web by using the keyword “Web Intelligence” through
several search engines in February 2001.

What is Web Intelligence?

“Web Intelligence (WI) exploits Artificial Intelligence (AI) and advanced Information
Technology (IT) on the Web and Internet.”

This definition has the following implications. The basis of WI is AI and IT. The “I”
happens to be shared by both “AI” and “IT”, although with a different meaning in each, and “W”
defines the platform on which WI research is carried out. The goal of WI is the joint goal of AI
and IT on the new platform of the Web. That is, WI applies AI and IT to the design and
implementation of Intelligent Web Information Systems (IWIS). An IWIS should be able to
perform functions normally associated with human intelligence, such as reasoning, learning, and
self-improvement. There may not be a standard and non-controversial definition of WI,
just as there is no standard definition of AI. One may argue that our definition of WI
focuses more on the software aspects of the Web. It is not our intention to exclude any research
topic by using the proposed definition. The term Web Intelligence should be considered as an
umbrella or a label for a new branch of research centered on the Web. Our definition simply states
the scope and goals of WI. This allows us to include any theories and technologies that either fall
within the scope or aim at the same goals. To complement the formal definition, we try to make the
picture clearer by listing topics to be covered by WI.

WI will be an ever-changing research branch. It will evolve with the development of the
Web as a new medium for information gathering, storage, processing, delivery and utilization. It is
our expectation that WI will evolve into an inseparable research branch of computer science.
Although no one can predict the future in detail and without uncertainty, it is clear that WI will
have a huge impact on the application of computers, which in turn will affect our everyday lives.

CHAPTER 2
Perspectives of WI

As a new branch of research, Web Intelligence exploits Artificial Intelligence (AI) and
Information Technology (IT) on the Web. On the one hand, it may be viewed as applying results
from these existing disciplines to a totally new domain. On the other hand, WI may also introduce
new problems and challenges to the established disciplines. WI may also be viewed as an
enhancement or an extension of AI and IT. It remains to be seen whether WI will become a sub-area
of AI and IT or a child of a successful marriage of AI and IT. However, no matter what happens,
studies on WI can benefit a great deal from the results, experience, successes and lessons of AI and
IT. In their very popular textbook, Russell and Norvig examined different definitions of artificial
intelligence from eight other textbooks in order to decide what exactly AI is. They observed that
the definitions vary along two dimensions. One dimension deals with the functionality and
ability of an AI system, ranging from the thought processes and reasoning ability of the system to
its behavior. The other dimension deals with the design philosophy of AI systems,
ranging from imitating human problem solving to making rational decisions. The combination of
the two dimensions results in four categories of AI systems, adopted from Russell and Norvig:

            Human-centered                     Rationality-centered
Thinking    Systems that think like humans.    Systems that think rationally.
Acting      Systems that act like humans.      Systems that act rationally.

This classification provides a basis for the study of various views of and approaches to AI.
It also clearly defines goals in the design of AI systems. According to Russell and Norvig, the
categories correspond to four approaches: the cognitive modeling approach (thinking humanly), the
Turing test approach (acting humanly), the laws of thought approach (thinking rationally), and the
rational agent approach (acting rationally). The two rows separating AI systems in terms of
thinking and acting may not be the most suitable classification. Action is normally the final result of
a thinking process. One may argue that the class of systems acting humanly is a superset of the
class of systems thinking humanly. In contrast, the separation of the human-centered approach and
the rationality-centered approach may have significant implications for the study of AI. While earlier
research on AI focused more on the human-centered approach, the rationality-centered approach
has received more attention recently.

The first column is centered around humans and leads to the treatment of AI as an
empirical science involving hypothesis and experimental confirmation. A human-centered
approach represents the descriptive view of AI. Under this view, a system is designed by
imitating human problem solving. This implies that a system should have the usual human
capabilities such as knowledge representation, natural language processing, reasoning, planning
and learning. The performance of an AI system is measured or evaluated through the Turing
test. A system is said to be intelligent if it provides human-level performance. Such a descriptive
view dominates the majority of earlier studies of expert systems, a special type of AI system.
The second column represents the prescriptive or normative view of AI. It deals with theoretical
principles and laws that an AI system must follow, instead of imitating humans. That is, a
rationalist approach deals with an ideal concept of intelligence, which may be independent of
human problem solving. An AI system is rational if it does the right thing and makes the right
decision. The normative view of AI is based on well-established disciplines such as
mathematics, logic, and engineering. The descriptive and normative views also reflect the
experimental and theoretical aspects of AI research.

The experimental study represents the descriptive view. It covers theories and models for
the explanation of the workings of the human mind, and applications of AI to solving problems
that normally require human intelligence. The theoretical study aims at the development of theories
of rationality, and focuses on the foundations of AI. The two views are complementary to each
other. Studies in one direction may provide valuable insights into the other. Web Intelligence
concerns the design and development of intelligent Web information systems. The previous
framework for the study of AI can be immediately applied to that of Web Intelligence. More
specifically, we can cluster research in WI into the descriptive approach and the normative
approach, and cluster Web information systems in terms of thinking and acting. Various research
topics can be identified and grouped accordingly. Like AI, a foundation of WI can be established
by drawing results from many related disciplines, including the following:
• Mathematics: computation, logic, probability.

• Applied Mathematics and Statistics: algorithms, non-classical logics, decision theory,
  information theory, measurement theory, utility theory, theories of uncertainty,
  approximate reasoning.

• Psychology: cognitive psychology, cognitive science, human-machine interaction, user
  interface.

• Linguistics: computational linguistics, natural language processing, machine translation.

• Information Technology: information science, databases, information retrieval systems,
  knowledge discovery and data mining, expert systems, knowledge-based systems, decision
  support systems, intelligent information agents.

The topics under each entry are only intended as examples. They do not form an exhaustive
list. In the development of AI, we have witnessed the formation of many new sub-branches,
such as knowledge-based systems, artificial neural networks, genetic algorithms, and
intelligent agents. Recently, non-classical AI topics have received much attention under the name
of computational intelligence. Computational intelligence focuses on the computational aspect of
intelligent systems. The application of AI in other disciplines also leads to new techniques in the
corresponding fields. For instance, Business Intelligence (BI) is a result of applying artificial
intelligence to the business domain. Artificial Intelligence in Medicine has also proved to be a
successful application. When viewing WI in such settings, we can identify at least two of its roles.
WI may be interpreted as “Web-based Artificial Intelligence”, the study of particular aspects of AI
in the context of the Web, in parallel to the study of computational intelligence.

WI may also be interpreted as “Artificial Intelligence on the Web”, which regards it as a
new application of AI. A more practical goal of WI is the design and implementation of intelligent
Web information systems (IWIS). It should be realized that an IWIS is an integrated system
containing many sub-systems. To design such a system, it is necessary to apply a variety of
theories and technologies.

In his work on vision, Marr convincingly made the point that a full understanding of an intelligent
system involves explanations at various levels. The same argument is applicable to the
development of an IWIS. We can identify at least two levels: the conceptual formulation and
the physical implementation. The conceptual formulation deals with the foundations of an IWIS, while
the physical implementation is concerned with its construction. The former depends on
mathematics and logic, and the latter depends on algorithms and programming. Each level may be
further divided into sub-levels. Research in WI should include topics at all of these levels.

CHAPTER 3

WEB INTELLIGENCE EXPLORATION
Web intelligence further explores the transformation of information into knowledge, and of
knowledge into wisdom, in its search for the Wisdom Web. Some of the important issues,
although they may not be well conceived yet, are briefly discussed in this section.

3.1 A new field of science, technology and engineering

The Web, as a new technical and social phenomenon and a growing organism, creates a
new field of science that involves a multi-disciplinary study and enquiry for the understanding of
the Web and its relationships to us. The Web may be studied from many perspectives, such as
philosophical foundations, theoretical and technical foundations, applications, and social impacts.
Some examples are given below:
• Webology,
• Web Science,
• Web Technology,
• Web Engineering,
• Weblization.

The term webology is coined to label the study of the Web as a new field of science. By
post-fixing the phrase “science and technology”, one clearly states the scope. By post-fixing the
word “engineering”, one emphasizes the design and implementation aspects. Together, they are driving
forces of the information revolution. The term weblization concisely summarizes the development
of the Web and web-based systems so far. The process of weblization involves building the Web
itself and reconstructing existing tools and systems on the web platform.

3.2 Design philosophy and principles of the Web
The design philosophy and principles set the direction of the Web's growth and its ultimate
destiny. It may be difficult to compile a non-controversial and complete list. However, examples
include the decentralization principle, the universalist principles, the minimum constraint principle, and the
separation of form and content principle. The decentralization principle is inherited from the
decentralization property of the Internet. The universalist principles cover universal connectivity,
universal accessibility, as well as diversity of web contents and users. The minimum constraint
principle suggests that the Web should be as unconstraining as possible to realize its universality.
The separation principle deals with the presentation of web documents, in order to achieve
location, machine, and application independence. The design principles ensure that the Web has
desirable properties such as decentralization, adaptability, evolvability, scalability, universal
connectivity and accessibility, affordability, anonymity, diversity, and many others. The Web is
able to support communication, collaboration, interaction, and intercreation.

3.3 The laws of the Web
Two sets of laws have been studied, namely, the set of laws governing the Web and the set of
empirical laws observable on the Web. The Web has given new meaning to publishing and
libraries, but not to their underlying principles. Noruzi argued that Ranganathan’s Five Laws of
Library Science are as applicable today as they were more than 70 years ago. Ranganathan’s Five
Laws of Library Science state:
• Books are for use.
• Every reader his or her book.
• Every book its reader.
• Save the time of the reader.
• The Library is a growing organism.

These laws describe a user-oriented, as well as a service-oriented, view of library science. The
Web consists of a massive collection of resources. By replacing “book”, “reader”, and “library”
with “web resource”, “user”, and “web”, respectively, Noruzi stated the Five Laws of the Web:
• Web resources are for use.
• Every user his or her web resource.
• Every web resource its user.
• Save the time of the user.
• The Web is a growing organism.
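
The term replacement described above can be sketched in a few lines of Python. This is only an illustration of the substitution; the names `LIBRARY_LAWS`, `SUBSTITUTIONS`, and `to_web_laws` are hypothetical and do not come from Noruzi's work.

```python
# Ranganathan's Five Laws of Library Science, as listed in the text.
LIBRARY_LAWS = [
    "Books are for use.",
    "Every reader his or her book.",
    "Every book its reader.",
    "Save the time of the reader.",
    "The library is a growing organism.",
]

# Term replacements described in the text; the capitalized plural form
# is listed first so that it is handled before the singular form.
SUBSTITUTIONS = [
    ("Books", "Web resources"),
    ("book", "web resource"),
    ("reader", "user"),
    ("library", "Web"),
]


def to_web_laws(laws):
    """Apply every substitution, in order, to each law."""
    web_laws = []
    for law in laws:
        for old, new in SUBSTITUTIONS:
            law = law.replace(old, new)
        web_laws.append(law)
    return web_laws


if __name__ == "__main__":
    for law in to_web_laws(LIBRARY_LAWS):
        print(law)
```

The point of the sketch is that only the subject matter changes; the form of each law is left untouched.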

They concisely represent the underlying philosophy of the Web and web services. They also
describe the ideal Web: “of the people, by the people, for the people”. Many researchers have studied
empirical laws revealed by the Web, concerning its growth, web page distributions, or user surfing
patterns. An example set of such laws is reported by Huberman:
1. Power Law of Distribution.
2. Small World Law.
3. Law of Surfing.
4. Law of Congestion.
5. The Free Ride Law.
6. The Law of Downloading.
Website designers, webmasters, and organizations can apply such laws to the design of better
websites and web resources.

3.4 The Web revolution: one link at a time
The story of the invention of the Web and the revolution brought by the Web provides a
good case study for web intelligence. It poses a challenge: how to derive insights and wisdom
from the existing data, information, and knowledge. Regarding the pre-web uses of hypertext
links, Berners-Lee commented, “The research community had used the links between paper
documents for ages: Tables of contents, indexes, bibliographies, and reference sections are
hypertext links.” A crucial question is what we can get from this common knowledge and
practice. Two types of approaches have been proposed and studied. One focuses on the
exploration of the potential implications of such knowledge, which led to the creation of a field
of science known as citation indexing and analysis. The other focuses on the representation,
storage, and access of similar types of data and knowledge using new media as they become
available, which led to the invention of the Web.
A basic idea of citation indexing and analysis is to index and study the literature of science
based on how scientists cite each other. Although it mainly uses bibliographies, citation indexing
and analysis brings more insights into science, publishing, scientific research, and many more
fields. Information retrieval systems, based on citation indexing and analysis, have been
implemented and used by scientists for many years. The same methods have been applied or
rediscovered in many recent studies, such as web search engines, social network analysis, and so
on.
A basic idea of the Web is to create a global space in which anything can be linked to
anything. The development of the Web emphasizes the implementation of this idea using
different types of machines and media. The Web attempts to make the existing associations and links
that people had used, either explicitly or implicitly, concrete and computer manageable. Similar
concepts had been explored in the pre-web age. Vannevar Bush described a
photoelectromechanical machine called the Memex that could make and follow cross-references
among microfilm documents. Ted Nelson introduced the concept of hypertext, so that people can
use computers to read, write and publish non-linear texts. Doug Engelbart demonstrated a
collaborative work space called NLS, which supported hypertext browsing, editing, email, and so on.
Thanks to the timely invention of the Internet, which provides global connectivity, the dream of the
Web became a reality. The revolution of the Web was brought about by a grassroots effort that built the
Web link by link. There are recent research efforts in cross-applications of the two types of
approaches. The methods developed for citation indexing and analysis are used and extended to
analyze the links and connectivity of the Web. Existing systems for citation indexing and analysis
are being moved to, and new such systems are being implemented on, the Web.
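
As a minimal sketch of how citation-style analysis carries over to web links, the Python below counts incoming links (the analogue of a citation count) and computes a PageRank-style score by power iteration. PageRank is used here only as a well-known example of link analysis; the text does not name a specific algorithm, and the function names, the damping value of 0.85, and the toy graph are assumptions made for illustration.

```python
def in_degree(links):
    """Count incoming links per page, the web analogue of a citation count.

    `links` maps each page to the list of pages it links to.
    """
    counts = {}
    for page, targets in links.items():
        counts.setdefault(page, 0)
        for target in targets:
            counts[target] = counts.get(target, 0) + 1
    return counts


def pagerank(links, damping=0.85, iterations=50):
    """PageRank-style scores by power iteration (illustrative sketch)."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on in equal shares to its targets.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical three-page web: pages "a" and "b" both link to "c".
    toy_web = {"a": ["c"], "b": ["c"], "c": ["a"]}
    print(in_degree(toy_web))
    print(pagerank(toy_web))
```

In the toy graph, page "c" receives the most links and therefore the highest score, just as a frequently cited paper stands out in a citation index.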
The above brief description, which is almost common knowledge, is repeated here to serve
one special purpose. It demonstrates that the great minds of our time bring revolutions by
analyzing what everyone has already known or by implementing, alternatively, what everyone has
already used. The question is: Can web intelligence help in the future?

3.5 The more things change, the more they stay the same
Now, we turn our attention to the other side of the same coin by investigating the things that the
revolutions do not change. In spite of the technological changes, the achievements of the current Web
and associated systems lie in the process of weblization. The weblization of a specific field or an
organization does not change its fundamental principles, although it may become more effective
and efficient, as well as operate at a different scale. For example, electronic commerce does
not change the principles of doing business, but does introduce more dynamics, opportunities,
flexibility, and other new properties. Another example is the Five Laws of the Web: the subject
matter has changed, but the philosophy remains the same. Both paper documents and the
Web use links.
The physical implementations are different, one on paper and the other on computer, but the
logical meanings stay more or less the same. The same analytical tools and methods apply to both.
This property of “unchangingness” makes it possible to apply the same principles again and again,
with possible adaptation and adjustment. The philosophy and principles that have proved to
be effective in the past can be applied to the design and implementation of intelligent web
information systems. Some illustrative examples are listed here:
• Separation of logical view and physical view.
• Separation of knowledge and inference engine.
• Keep It Simple, Stupid!
The first two separation principles are along the same lines as the separation of content and form
principle. The first one is widely used in the design and implementation of database systems. Its
application to the Web implies that one can generate many virtual logical views from the same
physical web. The second principle is a fundamental one in expert systems. It is applicable to the
design of web inference engines. The last rule, also known as the KISS principle, is universally
applicable. It has been applied throughout the design of the Web.
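
The separation of knowledge and inference engine can be illustrated with a small forward-chaining sketch in Python. The engine below knows nothing about any domain; all domain knowledge lives in the rule list, which can be replaced without touching the engine. The rule contents and all names (`forward_chain`, `RULES`) are hypothetical and chosen only for illustration; this is not a description of any particular web inference engine.

```python
def forward_chain(facts, rules):
    """Generic inference engine: apply rules until no new fact is derived.

    Each rule is a pair (premises, conclusion); the engine contains no
    domain knowledge of its own.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return known


# Knowledge base, kept apart from the engine (hypothetical example rules).
RULES = [
    (("visited_product_page", "added_to_cart"), "interested_buyer"),
    (("interested_buyer", "returning_visitor"), "offer_discount"),
]


if __name__ == "__main__":
    observed = {"visited_product_page", "added_to_cart", "returning_visitor"}
    print(sorted(forward_chain(observed, RULES)))
```

Swapping in a different rule list changes what the system concludes while the engine code stays the same, which is the practical benefit the principle aims at.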

CHAPTER 4
Components of Web Intelligence

4.1 Web Data

The data available in electronic commerce environments is three-fold and includes server
data in the form of log files, site-specific web meta-data representing the structure of the
web site, and marketing information, which depends on the products and services provided. Server
data is generated by the interactions between the persons browsing an individual site and the web
server. This data can be divided into log files and query data. Historically, web servers recording
server activity, errors and referrer information used a log file to record each type of event. It is now
standard for web servers to use a combined log file format, called the Common Log File Format. This
format combines the server and error logs into one file. More recently, the Extended Log File
Format has been used, which supplements the Common format with additional information,
namely the referrer and cookie information. By incorporating referrer information, the output of
mining these log files becomes much more useful and actionable in marketing terms. Cookies
are tokens generated by the web server and held by the clients. The information stored in a
cookie helps to compensate for the stateless nature of HTTP interactions with a web server, enabling
servers to track client access across their hosted web pages. The logged cookie data is
customizable and can contain keys for relating the navigational data to the content of the
marketing data, including transactional data. Usually the following information is contained in a
cookie: user ID, source IP address, time-to-live, a randomly generated unique ID and user-defined
information. A further data source that is typically generated on electronic commerce sites is
query data sent to a web server. This data is usually generated when users of the web site use search or
product locator facilities on the web site to search for relevant pages or products. This is often user
interaction with a product database via the company’s Internet site. The final source of data is
web meta-data. This data describes the structure of the web site and is usually generated
dynamically and automatically after a site update. Web meta-data generally includes neighbor
pages, leaf nodes and entry points. This information is usually implemented as a site-specific
index table, which represents a labeled, directed graph. Meta-data also indicates
whether a page has been created statically or dynamically and whether user interaction is required
or not. In addition to the structure of a site, web meta-data can also contain information of a more
Division Of Computer Science , SOE CUSAT

semantic nature, usually represented in XML.
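To make the log file discussion concrete, the following is a minimal sketch of how one log line could be turned into named fields before mining. It assumes the widely used Combined Log Format (the Common Log Format fields followed by quoted referrer and user-agent fields); the function name parse_log_line, the sample line and the host names in it are illustrative and are not taken from any particular server.

```python
import re
from datetime import datetime

# Assumed layout: Common Log Format fields, optionally followed by the
# quoted referrer and user-agent fields of the Combined Log Format.
LOG_LINE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)

def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # A size of "-" means that no body was sent.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    # The request field looks like: METHOD PATH PROTOCOL
    parts = entry["request"].split()
    entry["path"] = parts[1] if len(parts) >= 2 else None
    return entry

# Illustrative sample line (not real traffic).
sample = ('192.0.2.10 - - [10/Oct/2010:13:55:36 +0530] '
          '"GET /products/index.html HTTP/1.0" 200 2326 '
          '"http://www.example.com/start.html" "Mozilla/4.08"')
entry = parse_log_line(sample)
print(entry["host"], entry["path"], entry["referrer"])
```

The referrer field recovered here is the piece of information that lets a later mining step relate a page view to the page or site the visitor came from.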

Web Mining Components of Web Intelligence

In the context of web intelligence, web mining may be defined as the application of data
mining techniques to Internet data. This definition is sometimes extended to include statistical,
database optimization, and artificial intelligence techniques. Web mining has been subdivided
into web structure, web usage, and web content mining.

Web structure mining is the application of data mining techniques to web site structures. In many
cases this may be the entire web, and research in intelligent search engines and intelligent
agents is described in many articles. In our research, we define web structure mining as the
mining of Internet data, together with data about the structure of the site. This may be thought
of as improving the efficacy of the data mining process with domain knowledge. The application of
domain knowledge is further discussed in the analytical process section.

Web usage mining is the application of data mining to web server log file data, which is described
in the earlier section on web data. Web usage mining forms the core of our research in web mining
for web intelligence, and log files provide the foundation data for visitor analysis. This type of
analysis of the visitors to a web site can be subdivided into technographic and psychographic
analysis.

Technographic analysis focuses on what is known about the visitor's technical platform, i.e.,
operating system, browser, plug-ins, user language and cookie information. On its own, this
information is not a rich source of discriminatory data for visitor profiling, but in conjunction
with the homogeneous data sets available after extract, transform and load operations into a data
warehouse, it contributes significantly.

Psychographic analysis is the examination of what we know about the behavioral patterns of web
site visitors. This includes the routes taken by visitors through a site, the time spent on each
page, route differences based on differing entry points to the site, aggregated route behavior,
general click stream behavior, etc. This is the information of most use to web marketers, and is
equivalent to marketing intelligence about where shoppers enter the store, where shoppers go in
the store, where they leave the store, what they look at but don't buy, what they buy and
how quickly, etc.
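The psychographic measures listed above (routes, entry and exit points, time spent per page) can be derived from click stream records roughly as follows. This is a sketch under stated assumptions: each hit is a (visitor, timestamp in seconds, page) tuple, visits are split after 30 minutes of inactivity, and the names build_sessions and summarize are hypothetical helpers introduced only for this illustration.

```python
from collections import defaultdict

SESSION_TIMEOUT = 30 * 60  # seconds; an assumed inactivity threshold

def build_sessions(hits, timeout=SESSION_TIMEOUT):
    """Group (visitor, timestamp, page) hits into per-visitor sessions."""
    by_visitor = defaultdict(list)
    for visitor, timestamp, page in sorted(hits, key=lambda h: (h[0], h[1])):
        by_visitor[visitor].append((timestamp, page))
    sessions = []
    for visitor, visits in by_visitor.items():
        current = [visits[0]]
        for previous, hit in zip(visits, visits[1:]):
            if hit[0] - previous[0] > timeout:
                # Too long since the last click: close the session, start a new one.
                sessions.append((visitor, current))
                current = []
            current.append(hit)
        sessions.append((visitor, current))
    return sessions

def summarize(session):
    """Return entry page, exit page, route and time spent per page for one session."""
    visitor, visits = session
    dwell = defaultdict(int)
    for (t0, page), (t1, _) in zip(visits, visits[1:]):
        dwell[page] += t1 - t0  # time on the last page is unknown from logs alone
    return {
        "visitor": visitor,
        "entry": visits[0][1],
        "exit": visits[-1][1],
        "route": [page for _, page in visits],
        "dwell": dict(dwell),
    }

# Illustrative click stream (made-up data).
hits = [
    ("v1", 0, "/home"), ("v1", 40, "/catalog"), ("v1", 100, "/cart"),
    ("v2", 10, "/catalog"), ("v2", 25, "/home"),
    ("v1", 5000, "/home"),
]
summaries = [summarize(s) for s in build_sessions(hits)]
for summary in summaries:
    print(summary)
```

Aggregating the "route" and "entry" fields over many sessions gives the aggregated route behavior and the route differences by entry point mentioned in the text.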
Web content mining is the application of data and text mining algorithms and techniques
to the contents of web pages, usually written in HTML. At its simplest, this entails the extraction
of text between HTML tags for headings and titles, or the extraction of the HTML meta tag
content. Our research is based upon XML and RDF-based data schemas that help to ensure
correctness and proper context.
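The simplest form of content mining described above, pulling out title, heading and meta tag text, can be sketched with the HTML parser from the Python standard library. The class name HeadingExtractor and the sample page are assumptions made for this example; a real system would go on to apply text mining to the extracted strings.

```python
from html.parser import HTMLParser

class HeadingExtractor(HTMLParser):
    """Collect text inside <title> and <h1>-<h3>, plus named <meta> tag content."""
    TEXT_TAGS = {"title", "h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.current = None   # tag whose text is currently being collected
        self.texts = []       # list of (tag, text) pairs
        self.meta = {}        # meta name -> content

    def handle_starttag(self, tag, attrs):
        if tag in self.TEXT_TAGS:
            self.current = tag
            self.texts.append((tag, ""))
        elif tag == "meta":
            attributes = dict(attrs)
            name = attributes.get("name")
            content = attributes.get("content")
            if name and content is not None:
                self.meta[name.lower()] = content

    def handle_endtag(self, tag):
        if tag == self.current:
            self.current = None

    def handle_data(self, data):
        # Text inside nested inline tags (e.g. <b>) still belongs to the heading.
        if self.current is not None:
            tag, text = self.texts[-1]
            self.texts[-1] = (tag, text + data)

# Illustrative page (made-up content).
page = """<html><head><title>Acme Store</title>
<meta name="keywords" content="tools, hardware">
</head><body><h1>Power <b>Drills</b></h1><p>Body text.</p></body></html>"""

extractor = HeadingExtractor()
extractor.feed(page)
# Normalize whitespace in the collected heading text.
headings = [(tag, " ".join(text.split())) for tag, text in extractor.texts]
print(headings)
print(extractor.meta)
```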

4.2 Representation

Intelligent Web agents can use the Problem Solver Markup Language (PSML) to specify
their roles, settings, and relationships with any other services. The intelligent Web must also have
the ability to process and understand natural language. It must understand and correctly judge the
meaning of concepts expressed in words, such as “good,” “best,” and “season.” Further, the
intelligent Web must grasp the granularities of these terms’ corresponding subjects and the
location of their ontology definitions.

Self-direction and learning

In addition to the semantic knowledge that an intelligent search can extract and
manipulate, intelligent Web agents must also incorporate a dynamically created source of
meta-knowledge that deals with the relationships between concepts and the spatial or temporal
constraint knowledge that planning and executing services use. This allows the agents to
self-resolve their conflicts. To solve specific problems, intelligent Web agents must be able to
plan. The planning process uses goals and associated subgoals, as well as constraints. In the
intelligent Web, ontologies alone will not be sufficient.

Personalization

The intelligent Web can personalize interactions by remembering a particular user's recent
encounters and relating the topics and sites that a user accesses during different online
sessions. It may further identify other goals and courses of action as a user's interactions
broaden and deepen, providing ever more data upon which to base its recommendations. As part of
its personalized approach to user services, the intelligent Web will interact with the user when
executing these tasks. In summary, semantics contributes a vital aspect to the intelligent Web.
We expect the Web to extend not only the knowledge of artificial assistants, but also their
intelligence.

WI’s Four Levels
We can study Web intelligence on at least four conceptual levels, ranging from the lower,
hardware-centered level to the higher, application-centered level. This framework builds upon the
fast development and application of various Web technologies.
• Internet-level communication, infrastructure, and security protocols.
At its core, the Web is a computer-network system. WI techniques for this level include Web data
prefetching systems built upon Web surfing patterns to resolve latency issues. The intelligence of
the Web's prefetching routines comes from an adaptive learning process based on observations of
user surfing behavior.
• Interface-level multimedia presentation standards.
The Web functions as an interface for human-Internet interaction. At this level, the Web
interfaces require adaptive cross-language processing, personalized-multimedia-representation,
and multimodal-data-processing capabilities.
• Knowledge-level information processing and management tools.
The Web serves as a distributed data and knowledge base. Accessing and manipulating this
information requires semantic markup languages to represent the Web's contents in
machine-understandable formats. Agent-based autonomic computing functions such as searching,
aggregation, classification, filtering, managing, mining, and discovery can then use this data.
• Application-level ubiquitous computing and social intelligence environments.
The Web can form the basis for establishing social networks that contain communities of people,
organizations, or other social entities. Social relationships such as friendship, co-working, or
exchanging information about common interests connect these entities. The study of WI thus
encompasses issues central to social network intelligence. Users access the Web's multimedia
content from stationary desktop computers and increasingly from mobile platforms as well.
Ubiquitous Web access and computing from various wireless devices requires even greater
adaptive personalization. WI should suit these needs well by providing techniques for use in
constructing interest models derived from implicit inferences based on user behavior.
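The adaptive learning behind the Internet-level technique, fetching pages ahead of time based on observed surfing patterns, can be illustrated with a very small model. The sketch below assumes a first-order Markov model that counts page-to-page transitions; this is one common choice rather than the method of any specific system, and the class name PrefetchPredictor and the sample routes are hypothetical.

```python
from collections import Counter, defaultdict

class PrefetchPredictor:
    """First-order Markov model: counts page-to-page transitions seen in past routes."""

    def __init__(self):
        self.transitions = defaultdict(Counter)

    def observe(self, route):
        """Learn from one observed route (an ordered list of pages)."""
        for current_page, next_page in zip(route, route[1:]):
            self.transitions[current_page][next_page] += 1

    def predict(self, page, top_n=1):
        """Return up to top_n most frequently observed successors of page."""
        successors = self.transitions.get(page, Counter())
        return [next_page for next_page, _ in successors.most_common(top_n)]

# Illustrative surfing routes (made-up data).
predictor = PrefetchPredictor()
for route in [
    ["/home", "/catalog", "/cart"],
    ["/home", "/catalog", "/item"],
    ["/home", "/search"],
    ["/catalog", "/cart"],
]:
    predictor.observe(route)

# Pages worth fetching in advance while the user is still reading the current one.
print(predictor.predict("/home"))
print(predictor.predict("/catalog", top_n=2))
```

Because the counts are updated with every new route, the predictions adapt as user behavior changes, which is the sense in which such a routine learns.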


4.3 PSML and Web inference engine

Distributed inference engines form PSML’s core. These engines can perform automatic
reasoning on the Web by incorporating autonomically collected and transformed content and
meta-knowledge into locally operational knowledge and databases. A feasible way to implement
PSML is to use an existing Prolog-like logic language supplemented with agents that collect and
transform dynamic content and meta-knowledge.
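As an illustration of automatic reasoning over collected facts, the following sketch derives new facts from rules until nothing more can be concluded. It is not PSML and not Prolog: it uses simple forward chaining over ground (variable-free) rules, whereas Prolog resolves goals backward. The function name forward_chain, the facts and the rules are all hypothetical.

```python
def forward_chain(facts, rules):
    """Apply ground rules of the form (body, head) until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            # Fire the rule when every condition in its body is already known.
            if head not in known and all(atom in known for atom in body):
                known.add(head)
                changed = True
    return known

# Hypothetical facts an agent might have collected from Web content.
facts = {"product(x100)", "price_drop(x100)", "in_stock(x100)"}

# Hypothetical local knowledge: each rule is (conditions, conclusion).
rules = [
    (("product(x100)", "price_drop(x100)"), "promotion_candidate(x100)"),
    (("promotion_candidate(x100)", "in_stock(x100)"), "recommend(x100)"),
]

derived = forward_chain(facts, rules)
print(sorted(derived - facts))
```

In the setting described above, agents would keep the fact base current by adding newly collected content, after which the same rules can be applied again.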

4.4 Social network intelligence
The social intelligence approach to Web computing presents new opportunities for WI
research and development. As the Web becomes an integral part of our society, WI can and
should support Web-based social networks at all levels. Study in this area must receive as much
attention as Web mining, Web agents, ontologies, and related topics.

Web-based computing

The intelligent Web seeks to provide not only a medium for seamless information exchange and
knowledge sharing, but also the sort of human-crafted resources that encourage sustainable
knowledge creation and scientific and social evolution. The intelligent Web will rely on Grid-like
service agencies that self-organize, learn, and evolve their courses of action to perform service
tasks and transform their identities and interrelationships in communities. These services will also
cooperate and compete among themselves to optimize their resources and utilities and those of
others.

4.5 Benchmark applications

To effectively develop and evaluate systems and applications that address WI research issues, we
must consider benchmark applications that will demonstrate these capabilities. Suppose we want
to conduct a Web-based search to compile the data and generate a market report for an existing
product or a potential new product. To perform these tasks, an information agent will mine and
integrate available Web information, which will in turn be passed to a market analysis agent.
The analysis will involve the quantitative simulation of customer behavior in a marketplace,
instantaneously handled by other service agencies involving a large number of Grid agents. Given
that the number of variables can number in the hundreds or thousands, generating one prediction
can easily require significant computer resources.

CHAPTER 5
Computational Web Intelligence and Granular
Web Intelligence for Web Uncertainty

With the explosive growth of Web data on wired and wireless networks, a challenging
problem for a new generation of intelligent Web techniques is how to handle uncertain Web data
and make the right decisions under Web uncertainty. It is therefore necessary to develop new
intelligent Web techniques for Web applications under different types of uncertainty, including
probability, possibility, fuzziness, roughness, randomness, etc. Web Intelligence (WI), a new
direction for scientific research and development, exploits Artificial Intelligence (AI) and
advanced Information Technology (IT) on the Web and Internet. In general, AI-based Web techniques
can be used to handle probabilistic Web data. Since there is a great deal of fuzzy Web data and
other kinds of uncertain Web data, we need to apply relevant intelligent techniques to process
uncertain Web data that cannot be processed by traditional precise techniques such as Boolean
logic.

To promote the use of fuzzy logic in the Internet, Zadeh stated that "fuzzy logic may replace
classical logic as what may be called the brainware of the Internet" at the 2001 BISC
International Workshop on Fuzzy Logic and the Internet (FLINT2001). Fuzzy intelligent agents are
used in smart e-Commerce applications. Conceptual fuzzy sets are applied to Web search engines to
improve the quality of Web service. Clearly, intelligent e-brainware based on soft computing plays
an important role in smart e-Business applications, and soft computing techniques can play an
important role in building the intelligent Web brain. Soft-computing-based Web techniques can
therefore enhance Web QoI (Quality of Intelligence).

In order to use CI (Computational Intelligence) techniques to make intelligent wired and wireless
systems with high QoI, Computational Web Intelligence (CWI) was proposed at the special session on
CWI at FUZZ-IEEE'02 of the 2002 World Congress on Computational Intelligence. CWI is a hybrid
technology of CI and Web Technology (WT) dedicated to increasing the QoI of e-Business
application systems on wired and wireless networks. The main CWI techniques include:
• Fuzzy Web Intelligence (FWI)
• Neural Web Intelligence (NWI)
• Evolutionary Web Intelligence (EWI)
• Granular Web Intelligence (GWI)
• Rough Web Intelligence (RWI)
• Probabilistic Web Intelligence (PWI)

5.1 WEB UNCERTAINTY

The Web holds various data sets distributed on a huge number of computers, just as a human
brain contains biological data stored on a large number of biological neurons. The biological data
in the human brain are not always precise but are uncertain in most cases due to information
incompleteness, linguistic vagueness, imperfect measurement, knowledge limitations, etc.
Similarly, Web data on the Internet are usually not accurate but uncertain because of partial Web
information, dynamic Web data, fuzzy Web data, Web ontology, unpredictable Web information,
different Web users, different hardware environments, different data formats, etc. So the big
challenge is how to design intelligent Web techniques for Web-based applications with
uncertainty.

With the explosive growth of wired and wireless networks, Web users are overwhelmed by huge
amounts of raw Web data, because current Web tools still cannot find satisfactory information and
knowledge effectively or make decisions correctly in the presence of uncertain Web data, uncertain
Web information, uncertain Web knowledge and uncertain Web intelligence. The Internet and wireless
networks now connect an enormous number of computing devices including computers, PDAs (Personal
Digital Assistants), cell phones, home appliances, etc. CI is used in telecommunication network
applications. Clearly, such a huge networked computing system provides a complex, dynamic and
global environment for developing new distributed intelligent theory and technology based on AI,
BI (Biological Intelligence) and CI. Therefore, we must design an intelligent Web technology for
dealing with Web uncertainty.

5.2 COMPUTATIONAL WEB INTELLIGENCE FOR WEB
UNCERTAINTY

Zadeh states that traditional (hard) computing is the computational paradigm that underlies
artificial intelligence, whereas soft computing is the basis of CI. Based on the discussions on CI
and AI, the basic conclusion is that CI is different from AI, but the two overlap. In general, hard
computing and soft computing can be used in intelligent hard Web applications and intelligent soft
Web applications, respectively. To enhance the QoI (Quality of Intelligence) of e-Business,
Computational Web Intelligence (CWI) is proposed to use CI and Web Technology (WT) to make
intelligent e-Business applications on the Internet and wireless networks. The concise relation
is CWI = CI + WT. Fuzzy logic, neural networks, evolutionary computation, granular computing,
rough sets and probabilistic methods are major CI techniques for intelligent e-Applications on the
Internet and wireless networks. Currently, the six major research areas of CWI are (1) Fuzzy WI
(FWI), (2) Neural WI (NWI), (3) Evolutionary WI (EWI), (4) Probabilistic WI (PWI), (5) Granular WI
(GWI), and (6) Rough WI (RWI). In the future, more CWI research areas will be added. The six
current major CWI techniques are described below.
• FWI has two major techniques: fuzzy logic and WT. The main goal of FWI is to design
intelligent fuzzy e-agents that deal with the fuzziness of Web data, Web information and Web
knowledge, and make good decisions for e-Applications effectively.

• NWI has two major techniques: neural networks and WT. The main goal of NWI is to
design intelligent neural e-agents that can learn Web knowledge from Web data and
Web information and make smart decisions for e-Applications.

• EWI has two major techniques: evolutionary computing and WT. The main goal of EWI
is to design intelligent evolutionary e-agents to optimize e-Application tasks effectively.

• PWI has two major techniques: probabilistic computing and WT. The main goal of PWI is
to design intelligent probabilistic e-agents to deal with the probability of Web data, Web
information and Web knowledge for e-Applications effectively.

• GWI has two major techniques: granular computing and WT. The main goal of GWI is to
design intelligent granular e-agents to deal with Web data granules, Web information
granules and Web knowledge granules for e-Applications effectively.

• RWI has two major techniques: rough sets and WT. The main goal of RWI is to design
intelligent rough e-agents to deal with the roughness of Web data, Web information and Web
knowledge for e-Applications effectively.

CWI can be used to increase the QoI of e-Business applications and has many wired and wireless
applications in intelligent e-Business. It can also be used to deal with the uncertainty and
complexity of Web applications. Hybrid Web Intelligence (HWI), a broader area than CWI, can be
applied to more complex e-Business applications. HWI, including CWI, will play an important role
in designing smart e-Application systems for wired and wireless users. In summary, CWI technology
is based on multiple CI techniques and WT. Relevant CI techniques and WT are selected to build a
powerful CWI system for a specific e-Business application.

5.3 GRANULAR WEB INTELLIGENCE FOR WEB UNCERTAINTY

Granular computing technology can be used for high-level information processing and
knowledge discovery based on data granules that are clustered intelligently from raw, uncertain
data. Since there are huge amounts of Web data at different geographical locations, it is
natural to use granular computing technology to preprocess raw Web data, then perform
granular Web data mining, and finally discover granular Web knowledge. GWI is thus a general
intelligent technology for dealing with raw Web data under uncertainty. Mathematically speaking,
handling Web uncertainty effectively requires a novel granular set theory. A general framework
for granular sets is briefly described below to deal with data uncertainty such as Web data
uncertainty.

Definition 1 (A Granular Set) Let $X$ be a universal set of data elements. A granular set $A$ in
$X$ is characterized by $m$ granular membership functions $F_k(x)$ for $x \in X$, with
$F_k(x) \in [0, 1]$ and $k = 1, 2, \ldots, m$.

For example:
If $m = 1$, a granular set is a fuzzy set (with a crisp set as a special case), since one
membership function is used. Traditional fuzzy sets use only truth values in [0, 1] to handle data
uncertainty.

If $m = 2$, a granular set is an intuitionistic fuzzy set [25], since two membership functions are
used. Intuitionistic fuzzy sets use both truth values and falsity values in [0, 1] to deal with
data uncertainty.

If $m = 3$, a granular set is a neutrosophic set, since three membership functions are used. For
example, interval neutrosophic sets are defined on a truth-membership function, an
indeterminacy-membership function and a falsity-membership function. The major advantage of
interval neutrosophic sets is that they reduce data uncertainty by using three types of
information (truth values, falsity values and indeterminacy values) in order to make the right
decision.

We hope that new granular sets and new granular logical systems with four or more membership
functions will be developed in the future to handle Web uncertainty effectively and
fundamentally.
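Definition 1 can be made concrete with a small sketch in which a granular set is simply a collection of membership functions, each mapping an element to a grade in [0, 1]. The class name GranularSet and the two membership functions (a relevance score for Web pages on an assumed 0 to 10 scale) are illustrative assumptions and do not come from the granular set literature.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

Membership = Callable[[float], float]

@dataclass(frozen=True)
class GranularSet:
    """A set over a universe X described by several membership functions into [0, 1]."""
    functions: Sequence[Membership]

    def grades(self, x):
        """Return the tuple of membership grades of x, one per function."""
        values = tuple(f(x) for f in self.functions)
        if not all(0.0 <= v <= 1.0 for v in values):
            raise ValueError("membership grades must lie in [0, 1]")
        return values

def relevance(score):
    """Hypothetical truth-membership: how relevant a page with this 0-10 score is."""
    return min(1.0, max(0.0, score / 10.0))

def irrelevance(score):
    """Hypothetical falsity-membership, scaled so that truth + falsity <= 1."""
    return 0.8 * (1.0 - relevance(score))

fuzzy = GranularSet([relevance])                        # one function: a fuzzy set
intuitionistic = GranularSet([relevance, irrelevance])  # two functions

print(fuzzy.grades(5))
print(intuitionistic.grades(5))
```

With two functions, whatever remains after adding the truth and falsity grades can be read as the degree of hesitation about the element; a third function would represent indeterminacy explicitly, as in the neutrosophic case.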

Web uncertainty is a long-term challenging problem related to many Web applications like
semantic Web, Web mining, Web knowledge discovery, Web agents, Web search engines, Web
security, e-Commerce, e-Business, etc. To handle Web uncertainty, we need to develop relevant
intelligent Web technology such as CWI and GWI. Importantly, we need to continue to create
new granular sets such as neutrosophic sets to try to solve Web uncertainty effectively.

Web uncertainty is a difficult long-term problem, so different intelligent techniques need to be
used together. Hybrid Web Intelligence (HWI), a broad hybrid research area, uses AI, CI, BI
(Biological Intelligence) and WT to build hybrid intelligent Web systems that handle Web
uncertainty effectively and efficiently. In the future, HWI will have many intelligent Web
applications under uncertainty. The main HWI applications include (1) intelligent
Web agents for e-Applications such as e-Commerce, e-Government, e-Education and e-Health,
(2) intelligent Web security systems such as intelligent homeland security systems, (3) intelligent
Web bioinformatics systems, (4) intelligent grid computing systems, (5) intelligent wireless
mobile agents, (6) intelligent Web expert systems, (7) intelligent Web entertainment systems, (8)
intelligent Web services, (9) Web data mining and Web knowledge discovery, (10) intelligent
distributed and parallel Web computing systems based on a large number of networked computing
resources, and so on.


CHAPTER 6
Trends and Challenges of WI Related Research and
Development

Web Intelligence presents excellent opportunities and challenges for the research and
development of new generation Web-based information processing technology, as well as for
exploiting business intelligence. With the rapid growth of the Web, research and development on
WI have received much attention. We expect that more attention will be focused on WI in the
coming years. Many specific applications and systems have been proposed and studied. Several
dominant trends can be observed and are briefly reviewed in this section. E-commerce is one of
the most important applications of WI. The e-commerce activity that involves the end user is
undergoing a significant revolution. The ability to track users’ browsing behavior down to
individual mouse clicks has brought the vendor and end customer closer than ever before. It is
now possible for a vendor to personalize the product message for individual customers
on a massive scale. This is called targeted marketing or direct marketing.

Web mining and Web usage analysis play an important role in e-commerce for customer
relationship management (CRM) and targeted marketing. Web mining is the use of data mining
techniques to automatically discover and extract information from Web documents and services.
Zhong et al. proposed a way of mining peculiar data and peculiarity rules that can be used for
Web-log mining. They also proposed ways for targeted marketing by mining classification rules
and market value functions. A challenge is to explore the connection between Web mining and
the related agent paradigm such as Web farming, which is the systematic refining of information
resources on the Web for business intelligence.

Text analysis, retrieval, and Web-based digital libraries form another fruitful research area in
WI. Topics in this area include semantic models of the Web, text mining, and automatic
construction of citations. Abiteboul et al. systematically investigated the data on the Web and
the features of semi-structured data. Zhong et al. studied text mining on the Web, including
automatic construction of ontologies, e-mail filtering systems, and Web-based e-business systems.

Web-based intelligent agents are aimed at improving a Web site or providing help to a user. Liu
et al. worked on e-commerce agents. Liu and Zhong worked on Web agents and KDDA (Knowledge
Discovery and Data Mining Agents). We believe that Web agents will be a very important issue. It
is therefore not surprising that we decide to hold the WI conference in parallel to the
Intelligent Agents conference. In the next section, we provide a more detailed description of
intelligent Web agents.

The Web itself has been studied from two aspects, the structure of the Web as a graph and
the semantics of the Web. Studies on Web structures investigate several structural properties of
graphs arising from the Web, including the graph of hyperlinks, and the graph induced by
connections between distributed search servants. The study of the Web as a graph is not only
fascinating in its own right, but also yields valuable insight into Web algorithms for crawling,
searching and community discovery, and the sociological phenomena which characterize its
evolution. Studies of the semantics of the Web were initiated by Tim Berners-Lee, the creator of
the World Wide Web. In this vision, the Web is referred to as the “semantic Web”, where
information will be machine-processable in ways that support intelligent network services such as
information brokers and search agents.
The semantic Web requires interoperability standards that address not only the syntactic
form of documents but also the semantic content. A semantic Web also lets agents utilize all the
data on all Web pages, allowing them to gain knowledge from one site and apply it to logical
mappings on other sites for ontology-based Web retrieval and e-business intelligence. Ontologies
and agent technology can play a crucial role in enabling such Web-based knowledge processing,
sharing, and reuse between applications. A DARPA program called DAML (DARPA Agent
Markup Language) is a step toward a “semantic Web” where agents, search engines and other
programs can read DAML mark-up to decipher meaning rather than just the content on a Web
site.

6.1 Intelligent Web Agents

Intelligent agents are computational entities that are capable of making decisions on behalf
of their users and self-improving their performance in dynamically changing and unpredictable
task environments. Liu provided a comprehensive overview of related research work in the
field of autonomous agents and multi-agent systems, with an emphasis on its theoretical and
computational foundations as well as in-depth discussions of useful techniques for developing
various embodiments of agent-based systems, such as autonomous robots, collective vision and
motion, autonomous animation, and search and segmentation agents. The core of those techniques
is the notion of synthetic or emergent autonomy based on behavioral self-organization.

Intelligent Web Agents (WA) are software programs that primarily serve two important roles:
(a) autonomous entities for exploring and exploiting Web-based services, and (b) prototype
entities for exhibiting and explaining Web-generated regularities. The first of these roles is
described below.

6.2 From WA to Web-Based Services

The first role for WA can be readily described and appreciated by examining the following
typical scenarios in which various tasks and objectives are achieved.
• Personalized Multimodal Interface: WA can provide users with a user-friendly style of
presentation that personalizes both the interaction with users and the content presentation.
This activity involves the creation of various cognitive aids, including tables, charts,
executive summaries, indices, and personalized visual assistants (e.g., graphically
animated personas and virtual-reality avatars). WA as interfaces must make electronic
services easy to use. The provided cognitive aids must be concise (i.e., accessible
with as few manipulations and as little memorization as possible) and
consistent (i.e., understandable based on users’ previously customized cognitive styles).

• Push and Pull: WA can play an important role in dynamically creating pull-and-push
advertising. Here, by pull-and-push advertising we mean that a user expresses his or her
favorites during the interaction with the agents (pull advertising) and in return the agents
search and deliver the information about the favorite items dynamically to the user (push
advertising). Such agents can also increase the positive externality of products, that is,
the better people are informed about certain products, the more likely the products will be
sold.

• Pattern Discovery and Self-Organization: WA make it possible to detect which buying
patterns are forming among users and how they are structured, and hence to manage online
commerce effectively. Collaborative recommendation agents can help individual users aggregate
into groups, which can in turn form a dynamic marketplace.

• Information Gateway: WA can provide users with immediate access to the most relevant
information. This support encompasses a wide spectrum of information filtering and
delivery activities: manipulating various heterogeneous Web sources, including
databases, data warehouses, newswire, financial reports, newsletters, newsgroups,
outbound emails, electronic bulletin boards, and hypermedia documents, and, based on
users’ profiles, tailoring and delivering the retrieved information to the users. The
provided summary information must be just-in-time (i.e., delivered whenever it is needed),
relevant (i.e., focused on whichever topics the users are concerned with), and
up-to-the-minute (i.e., refreshed whenever a new piece of information arrives). An example
application with this type of agent support is comparison shopping that utilizes WA with
mobile and filtering capabilities. Some related experiences have been reported in the
literature.

• Reward: WA can motivate users to enter and re-enter a certain electronic service. While an
ever-greater proliferation of content continues to consume individuals’ attention, e.g.,
through push technology to sell something or to support users, WA can play a crucial role
in creating a captive audience, in educating it constantly, and even in removing
users’ old purchase habits. To be rewarding is to add value. The motivational rewards or
incentives can be created by offering free access to certain information and utility
resources (e.g., free software download), opportunities to participate in multi-user
information/commodity exchange activities (e.g., collaborative recommendation, chat,
bidding, and auction), and scheduled plans for promotional deals.

•

Matchmaking WA can serve as a new means for trading commodities. Since the interests
of users as well as the availability of products from dealers can change dynamically from
time to time, what usually happens in present day electronic commerce is: (1) a dealer
sells his or her items simply because these are the only items that he or she has at the
moment, or (2) a user buys a certain item simply because it is the last item that he or she
can find that partially fits his or her need. WA-based customized business attempts to
change the existing online buying and selling into the following new scenarios: (1) a
dealer identifies and offers what exactly users are interested in, and (2) a user finds and
purchases what he or she really loves – some technical issues related to matchmaking
have been addressed in .

•

Decision WA can assist Web users in making decisions. Such decision support may be in
the forms of evaluations or recommendations on the various features of certain specific
items, cost-benefit analysis, inference support for optimizing utility and resources with
respect to functional, time, and cost requirements, and model-based trend analysis and
projections concerning new patterns of demand.
•

Delegation WA can act on behalf of Web users in online activities. The

tasks that WA may delegate to achieve include matchmaking, server monitoring,
negotiation, bidding, auction, transaction, transfer of goods, and follow-up support. This
Division Of Computer Science , SOE CUSAT

Page 26

scenario will empower a new paradigm shift from user-centric to user-delegated
Web Intelligence

electronic business. The delegations of these tasks may be carried out in either semiautonomous (with users’ intervention on decisions) or fully autonomous manners. To this
end, various computational theories and models have been proposed and reported in.

• Collaborative Work Support: WA can offer infrastructure support as well as the
necessary functions for collaboratively solving problems and managing workflow
activities.
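The matchmaking service in the list above can be illustrated with a small sketch. It assumes that user interests and dealer offers are both described by sets of feature tags and that they are compared with Jaccard similarity. The report does not specify any matching method, so the `Offer` class, the tag vocabulary, the dealer names, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """A hypothetical dealer offer described by a set of feature tags."""
    dealer: str
    item: str
    tags: frozenset


def jaccard(a, b):
    """Jaccard similarity of two tag sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match(interests, offers, threshold=0.5):
    """Return (score, offer) pairs at or above the threshold, best match first."""
    scored = [(jaccard(interests, offer.tags), offer) for offer in offers]
    kept = [pair for pair in scored if pair[0] >= threshold]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)


# Illustrative data only; none of these dealers or items come from the report.
offers = [
    Offer("dealer_a", "trail bike", frozenset({"bike", "offroad", "aluminium"})),
    Offer("dealer_b", "city bike", frozenset({"bike", "commuter", "steel"})),
    Offer("dealer_c", "kayak", frozenset({"boat", "river"})),
]
user_interests = frozenset({"bike", "offroad", "aluminium", "disc-brakes"})
results = match(user_interests, offers)
```

With this data only the trail bike passes the threshold, since it shares three of the four distinct tags with the user's interests. A real matchmaking agent would also have to track the changing availability and interests that the text describes, which this sketch leaves out.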


CHAPTER 7
Semantic Search Engine

The framework’s search engine component queries the information generated by the annotation
component. It accepts queries posed in SPARQL and returns a set of links to matching resources.
A specialized search interface lets users develop an abstract model of a semantic query, pose it to
the engine, and then review the resulting matched documents. The search interface gives end
users (people who aren’t experts in Semantic Web technologies) a way to access the resources
filtered and annotated by the semantic annotator component. It is also possible to add and delete
entities and properties (with related values), so that a user can interact with the knowledge base to
fine-tune the query, making subsequent searches more accurate. The key aim for the query
interface is to give the user an intuitive and clear abstract query model that hides, as much as
possible, the underlying complexity of representation and reasoning. Furthermore, the agents in
the search engine multi-agent system exhibit various autonomic features that aim at making the
system more robust and scalable. The QS system has been deployed in two different commercial
test cases in the UK. In the first case, QS was used to examine specific Web-published documents
for commercial opportunities matching the business interests of the customer company. In the
second deployment, QS was used to perform knowledge-based searches over existing database
sources. In evaluating the performance of the search system in both applications, we found
that by using ontological knowledge and ontology-based annotations, users could perform
more accurate queries while being returned up to 71 percent fewer documents than with a
keyword-based search engine; in the best cases, more than 90 percent of the irrelevant
documents were eliminated. We are now in the process of further refining these two
deployments, and we are planning more industrial deployments in the near future with other
UK companies.
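As a rough illustration of how a query over semantic annotations differs from a keyword search, the sketch below evaluates a conjunction of triple patterns against a handful of annotation triples and returns links to the matching documents. It is not the QS system and not a SPARQL implementation. The triples, the predicate names `mentions` and `sector`, and the `example.org` URLs are assumptions made for the example, and the SPARQL text is included only to show what the equivalent query might look like.

```python
# Hypothetical annotations as (subject, predicate, object) triples.
TRIPLES = [
    ("http://example.org/doc1", "mentions", "TenderNotice"),
    ("http://example.org/doc1", "sector", "Construction"),
    ("http://example.org/doc2", "mentions", "TenderNotice"),
    ("http://example.org/doc2", "sector", "Catering"),
    ("http://example.org/doc3", "mentions", "PressRelease"),
]

# Roughly equivalent SPARQL query (illustrative only, never executed here).
SPARQL_TEXT = """
PREFIX : <http://example.org/vocab#>
SELECT ?doc WHERE {
  ?doc :mentions :TenderNotice .
  ?doc :sector   :Construction .
}
"""


def is_var(term):
    """Variables are written with a leading '?', as in SPARQL."""
    return term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`; return None on mismatch."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # Bind the variable, or check it against an earlier binding.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new


def query(patterns, triples):
    """Evaluate a conjunction of triple patterns and return all variable bindings."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


patterns = [("?doc", "mentions", "TenderNotice"), ("?doc", "sector", "Construction")]
links = sorted({b["?doc"] for b in query(patterns, TRIPLES)})
```

Here `links` contains only `http://example.org/doc1`. A keyword search for "tender" would also return the catering document, which gives a sense of how ontology-based annotations can reduce the number of irrelevant results.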


CHAPTER 8
CONCLUSION

While it may be difficult to define exactly what Web Intelligence (WI) is, one can easily
argue for the need to create such a subfield of study in computer science. With the
rapid growth of the Web, we foresee a fast-growing interest in Web Intelligence. Roughly
speaking, we define Web Intelligence as a field that “exploits Artificial Intelligence (AI) and
advanced Information Technology (IT) on the Web and Internet.” It may be viewed as a marriage
of artificial intelligence and information technology in the new setting of the Web. By examining
the scope and historical development of artificial intelligence, we discuss some fundamental
issues of Web Intelligence in a similar manner. There is no doubt in our mind that results from AI
and IT will influence the development of WI. Instead of searching for a precise and
non-controversial definition of WI, we list topics that might interest a researcher working on
Web-related issues. In particular, we identify some challenging issues of WI, including
e-commerce, studies of Web structures and Web semantics, Web information storage and retrieval,
Web mining, and intelligent Web agents. The aims are to examine the performance characteristics
of various approaches in Web-based intelligent information technology and to cross-fertilize
ideas on the development of Web-based intelligent information systems among different domains.

This report is not intended to be a complete and systematic study of the field, but rather a
record of personal observations, scattered (perhaps immature) ideas, general comments,
speculations, and opinions. We hope that a careful study of these not yet well-connected points
may lead to a web of knowledge for Web Intelligence. We examined the Web from several
perspectives, which enables us to see clearly the current status, the scope, and the future of
Web Intelligence research. Web Intelligence exploration of the Web was then commented on from a
few angles, and a couple of challenges were posed. Finally, Web-based Support Systems (WSS) were
used to demonstrate the ideas presented, which may further enhance the Web as a tool “of the
people, by the people, for the people.”


REFERENCES

[1] Research Challenges and Trends in the New Information Age, Y.Y. Yao, Ning Zhong,
Jiming Liu, and Setsuo Ohsuga, IEEE

[2] Web Intelligence: New Frontiers of Exploration, Yiyu (Y.Y.) Yao, Department of
Computer Science, University of Regina, Regina, Saskatchewan, IEEE

[4] Education and the Semantic Web, Vladan Devedzic, Department of Information Systems
and Technologies, FON – School of Business Administration, University of Belgrade

[5] Computational Web Intelligence and Granular Web Intelligence for Web Uncertainty,
Yan-Qing Zhang, Member, IEEE

Web intelligence-future of next generation web

  • 1. WEB INTELLIGENCE Seminar Report Submitted in partial fulfilment of the requirements for the award of the degree of Bachelor of Technology in Computer Science Engineering of Cochin University Of Science And Technology by NIJIL Y (12080050) DIVISION OF COMPUTER SCIENCE SCHOOL OF ENGINEERING COCHIN UNIVERSITY OF SCIENCE AND TECHNOLOGY KOCHI-682022
  • 2. WEB INTELLIGENCE Seminar Report Submitted in partial fulfilment of the requirements for the award of the degree of Bachelor of Technology in Computer Science Engineering of Cochin University Of Science And Technology by NIJIL Y (12080050) DIVISION OF COMPUTER SCIENCE SCHOOL OF ENGINEERING COCHIN UNIVERSITY OF SCIENCE AND TECHNOLOGY KOCHI-682022
  • 3. DIVISION OF COMPUTER SCIENCE SCHOOL OF ENGINEERING COCHIN UNIVERSITY OF SCIENCE AND TECHNOLOGY KOCHI-682022 Certificate Certified that this is a bonafide record of the seminar entitled “WEB INTELLIGENCE” Presented by the following student NIJIL Y of the VII th semester, Computer Science and Engineering in the year 2010 in partial f ulfillment of the requirements in the award of Degree of Bachelor of Technology in Computer Science and E ngineering of Cochin University of Science and Technology. Mr. SUDEEP EDAYILAM Seminar guide Dr. DAVID PETER Head Of Division
  • 4. ACKNOWLEDGEMENT I thank GOD almighty for guiding me throughout the seminar. I would like to thank all those who ha ve contributed to t he c ompletion of t he s eminar a nd he lped me with va luable suggestions for improvement. I a m e xtremely grateful to Dr. David Peter, Head Of Division, Division of Computer Science, for providing me with best facilities and atmosphere for the creative work guidance and encouragement. I am profoundly indebted to my seminar guide Mr. Sudheep Elayidom, sr.Lecturer, Division of Computer Science, for all help and support extend to me. I thank all Staff me mbers of my c ollege a nd f riends f or e xtending t heir c ooperation during m y seminar. Above all I would like to thank my parents without whose blessings, I would not have been able to accomplish my goal. NIJIL Y
  • 5. ABSTRACT Web Intelligence is a new direction for scientific research and development that explores the f undamental roles as w ell as practical i mpacts of ar tificial i ntelligence and adva nced information t echnology f or t he ne xt ge neration of Web-empowered systems, services, and environments. Web Intelligence is regarded as the key research field for the development of the Wisdom Web ( including t he S emantic W eb). The Web r evolutionizes t he w ay w e ga ther, process, a nd us e i nformation. Despite cu rrent t echnological adva nces, w e st ill ca nnot pred ict what t he Web’s ne xt pa radigm s hift w ill b e. H owever, w e pr opose t hat t his c hange w ill transform the Web into an intelligent entity—hence, the term Web intelligence. The ne xt-generation W eb w ill go b eyond i mproved i nformation s earch a nd know ledge queries a nd will h elp p eople a chieve be tter w ays of l iving, working, pl aying, a nd l earning. T o fulfil its potential, the intelligent Web’s design and development must incorporate and integrate several f undamental capa bilities. A f ew o f i ts capa bilities a re R eflexive ser ver pro pagation , Growth Specialization , A utocatalysis et c. Intelligent Web agents can use t he P roblem S olver Mark-up L anguage ( PSML) t o s pecify t heir r oles, s ettings, a nd r elationships w ith a ny ot her services. The i ntelligent Web must a lso ha ve the a bility t o pr ocess and unde rstand na tural language. It must understand and c orrectly judge the meaning of concepts expressed in words, such as “go od,” “be st,” and “season” et c. WI r esearch incorporates k nowledge f rom e xisting disciplines, such as artificial intelligence and information technology, in a t otally new domain. At t he sam e t ime, Web Intelligence r esearch also enriches t hese established disciplines as it introduces new topics and challenges.
  • 6. TABLE OF CONTENTS CHAPTER NO. CHAPTER TITLE PAGE NO. 1 Introduction 1 2 Perspectives Of Wi 4 3 Intelligence Exploration 8 3.1 A New Field Of Science, Technology And Engineering 8 3.2 Design Philosophy And Principles Of The Web 8 3. 3 The Laws Of The Web 9 3. 4 The Web Revolution: One Link At A Time 10 3.5 The More Things Change, The More They Stay The Same 11 4 Components Of Web Intelligence 13 4.1 Web Data 13 4.2 Representation 15 4.3 Psml And Web Inference Engine 17 4.4 Social Network Intelligence 17 4.4 Social Network Intelligence 17 5 Computational Web Intelligence 18 5.1 Web Uncertainty 19 5.2 Computational Web Intelligence For Web Uncertainty 19 5.3 Granular Web Intelligence For Web Uncertainty 21 6 Trends And Challenges Of Wi Related Research And Development 23 6.1 Intelligent Web Agents 24 6.2 From Wa To Web-Based Services 25 7 Semantic Search Engine 28 8 Conclusion 29 References 30
  • 7. Web Intelligence CHAPTER 1 INTRODUCTION With the rapid growth of Internet and World Wide Web (WWW), we have now entered into a new information age. The Web provides a total new media for communication, which goes far beyond the traditional communication media, such as radio, telephone and television. The Web has significant impacts on both academic research and ordinary daily life. It revolutionizes the way in which information is gathered, stored, processed, presented, shared, and used. The Web offers new opportunities and challenges for many areas, such as business, commerce, marketing, finance, publishing, education, research and development. For computer scientists, the Web introduces many new research topics and provides a new platform to reconsider old problems. It might be high time to create a new sub-discipline of computer science covering theories and technologies related to the Web. Web Intelligence is our proposal for this purpose. Through the billions of Web pages created with HTML and XML, or generated dynamically by underlying Web database service engines, the Web captures almost all aspects of human endeavor and provides a fertile ground for data mining. However, searching, comprehending, and using the semi-structured information stored on the Web poses a significant challenge because this data is more sophisticated and dynamic than the information that commercial database systems store. To supplement keyword-based indexing, which forms the cornerstone for Web search engines, researchers have applied data mining to Web-page ranking. In this context, data mining helps Web search engines find high-quality site administrator. WI explores the fundamental and practical impact that artificial intelligence and advanced information technology will have on the next generation of Web-empowered systems, services, and environments. 
In an era dominated by the World Wide Web, Grid computing, intelligentagent technology, and ubiquitous social computing, WI represents information technology’s next challenge. 3 Motivations and Justifications for WI The introduction of Web Intelligence (WI) can be motivated and justified fromboth academic and industrial perspectives. Two features of the Web make it a useful and unique platform for computer applications and research, the size and complexity. The Web contains a huge amount of interconnected Web documents known as Web pages. For example, the popular search engine Google claims that it can search 1,346,966,000 pages as of February 2001. The sheer size of the Web leads to difficulties in the storage, Division Of Computer Science , SOE CUSAT Page 1
  • 8. Web Intelligence management, and efficient and effective retrieval of Web documents. The complexity of the Web, in terms of connectivity and diversity of Web documents, forces us to reconsider many existing information systems, as well as theories, methodologies and technologies underlying those systems. One has to deal with a heterogeneous collection of structured, unstructured, semistructured, interrelated, and distributed Web documents consisting of texts, images and sounds, instead of homogeneous collection of structured and unrelated objects. The latter is the subject of study of many conventional information systems, such as databases, information retrieval, and multi-media systems. To accommodate the needs of the Web, one needs to study issues on the design and implementation of the Web-based information systems by combining and extending results from existing intelligent information systems. Existing theories and technologies need to be modified or enhanced to deal with complexity of the Web. Although individual Web-based information systems are constantly being deployed, advanced issues and techniques for developing and for benefiting from the Web remain to be systematically studied. The challenges brought by the Web to computer scientists may justify the creation of the new sub-discipline, WI, for carrying out Web-related research. The Web increases the availability and accessibility of information to a much larger community than any other computer applications. The introduction of Personal Computers (PCs) brought the computational power to ordinary people. It is the Web that delivers more effectively information to everyone at finger tips. The Web, no doubt, offers a new means for sharing and transmitting information unmatchable by other media. The revolution started by the Web is just beginning. New business opportunities, such as e-commerce, e-banking, and e-publication, will increase with the maturity of the Web. 
One can hardly overemphasize the impact of the Web on the business and industrial world. The creation of a new sub-discipline devoted to Web-related research and applications may have significant value in the future. The need for WI may be further illustrated by the fast-growing research and industrial activities centered on it. We searched the Web using the keyword “Web Intelligence” through several search engines in February 2001.
What is Web Intelligence?
“Web Intelligence (WI) exploits Artificial Intelligence (AI) and advanced Information Technology (IT) on the Web and Internet.”
This definition has the following implications. The basis of WI is AI and IT. The “I” happens to be shared by both “AI” and “IT”, although with a different meaning in each, and “W” defines the platform on which WI research is carried out. The goal of WI is the joint goal of AI and IT on the new platform of the Web. That is, WI applies AI and IT to the design and implementation of Intelligent Web Information Systems (IWIS). An IWIS should be able to perform functions normally associated with human intelligence, such as reasoning, learning, and self-improvement.
There may be no standard, non-controversial definition of WI, just as there is no standard definition of AI. One may argue that our definition of WI focuses more on the software aspects of the Web. It is not our intention to exclude any research topic by using the proposed definition. The term Web Intelligence should be considered an umbrella or a label for a new branch of research centered on the Web. Our definition simply states the scope and goals of WI. This allows us to include any theories and technologies that either fall within the scope or aim at the same goals. To complement the formal definition, we try to make the picture clearer by listing topics to be covered by WI.
WI will be an ever-changing research branch. It will evolve with the development of the Web as a new medium for information gathering, storage, processing, delivery and utilization. It is our expectation that WI will evolve into an inseparable research branch of computer science. Although no one can predict the future in detail and without uncertainty, it is clear that WI will have a huge impact on the application of computers, which in turn will affect our everyday lives.
CHAPTER 2
Perspectives of WI
As a new branch of research, Web Intelligence exploits Artificial Intelligence (AI) and Information Technology (IT) on the Web. On the one hand, it may be viewed as applying results from these existing disciplines to a totally new domain. On the other hand, WI may also introduce new problems and challenges to the established disciplines. WI may also be viewed as an enhancement or an extension of AI and IT. It remains to be seen whether WI will become a sub-area of AI and IT or a child of a successful marriage of AI and IT. However, no matter what happens, studies on WI can benefit a great deal from the results, experience, successes and lessons of AI and IT.
In their very popular textbook, Russell and Norvig examined different definitions of artificial intelligence from eight other textbooks in order to decide what exactly AI is. They observed that the definitions vary along two dimensions. One dimension deals with the functionality and ability of an AI system, ranging from the thought processes and reasoning ability of the system to its behavior. The other dimension deals with the design philosophy of AI systems, ranging from imitating human problem solving to making rational decisions. The combination of the two dimensions results in four categories of AI systems, adopted from Russell and Norvig and arranged here in two rows (thinking, acting) and two columns (human-centered, rationality-centered):
Systems that think like humans. | Systems that think rationally.
Systems that act like humans. | Systems that act rationally.
This classification provides a basis for the study of various views of and approaches to AI. It also clearly defines goals in the design of AI systems.
According to Russell and Norvig, they correspond to four approaches: the cognitive modeling approach (thinking humanly), the Turing test approach (acting humanly), the laws of thought approach (thinking rationally), and the rational agent approach (acting rationally). The two rows, which separate AI systems in terms of thinking and acting, may not be the most suitable classification. Action is normally the final result of a thinking process. One may argue that the class of systems acting humanly is a superset of the class of systems thinking humanly. In contrast, the separation of the human-centered approach and the rationality-centered approach may have significant implications for the study of AI. While earlier research on AI focused more on the human-centered approach, the rationality-centered approach has received more attention recently.
The first column is centered around humans and leads to the treatment of AI as an empirical science involving hypothesis and experimental confirmation. A human-centered approach represents the descriptive view of AI. Under this view, a system is designed by imitating human problem solving. This implies that a system should have the usual human capabilities such as knowledge representation, natural language processing, reasoning, planning and learning. The performance of an AI system is measured or evaluated through the Turing test. A system is said to be intelligent if it provides human-level performance. Such a descriptive view dominates the majority of earlier studies of expert systems, a special type of AI system.
The second column represents the prescriptive or normative view of AI. It deals with theoretical principles and laws that an AI system must follow, instead of imitating humans. That is, a rationalist approach deals with an ideal concept of intelligence, which may be independent of human problem solving. An AI system is rational if it does the right thing and makes the right decision. The normative view of AI is based on well-established disciplines such as mathematics, logic, and engineering.
The descriptive and normative views also reflect the experimental and theoretical aspects of AI research. The experimental study represents the descriptive view. It covers theories and models for explaining the workings of the human mind, and applications of AI to solving problems that normally require human intelligence. The theoretical study aims at the development of theories of rationality, and focuses on the foundations of AI. The two views are complementary. Studies in one direction may provide valuable insights into the other.
Web Intelligence concerns the design and development of intelligent Web information systems.
The previous framework for the study of AI can be immediately applied to that of Web Intelligence. More specifically, we can cluster research in WI into the descriptive approach and the normative approach, and cluster Web information systems in terms of thinking and acting. Various research topics can be identified and grouped accordingly. Like AI, a foundation of WI can be established by drawing results from many related disciplines:
• Mathematics: computation, logic, probability.
• Applied Mathematics and Statistics: algorithms, non-classical logics, decision theory, information theory, measurement theory, utility theory, theories of uncertainty, approximate reasoning.
• Psychology: cognitive psychology, cognitive science, human-machine interaction, user interface.
• Linguistics: computational linguistics, natural language processing, machine translation.
• Information Technology: information science, databases, information retrieval systems, knowledge discovery and data mining, expert systems, knowledge-based systems, decision support systems, intelligent information agents.
The topics under each entry are only intended as examples. They do not form an exhaustive list.
In the development of AI, we have witnessed the formation of many new sub-branches, such as knowledge-based systems, artificial neural networks, genetic algorithms, and intelligent agents. Recently, non-classical AI topics have received much attention under the name of computational intelligence. Computational intelligence focuses on the computational aspect of intelligent systems. The application of AI in other disciplines also leads to new techniques in the corresponding fields. For instance, Business Intelligence (BI) is a result of applying artificial
intelligence to the business domain. Artificial Intelligence in Medicine has also proved to be a successful application. When viewing WI in such settings, we can identify at least two of its roles. WI may be interpreted as “Web-based Artificial Intelligence”, the study of particular aspects of AI in the context of the Web, in parallel to the study of computational intelligence. WI may also be interpreted as “Artificial Intelligence on the Web”, which regards it as a new application of AI.
A more practical goal of WI is the design and implementation of intelligent Web information systems (IWIS). It should be realized that an IWIS is an integrated system containing many sub-systems. To design such a system, it is necessary to apply a variety of theories and technologies. In his work on vision, Marr convincingly made the point that a full understanding of an intelligent system involves explanations at various levels. The same argument is applicable to the development of an IWIS. We can identify at least two levels: the conceptual formulation and the physical implementation. The conceptual formulation deals with the foundations of an IWIS, while the physical implementation concerns its construction. The former depends on mathematics and logic, and the latter depends on algorithms and programming. Each level may be further divided into sub-levels. Research in WI should include topics at any of these levels.
CHAPTER 3
WEB INTELLIGENCE EXPLORATION
Web intelligence further explores the transformation of information into knowledge, and of knowledge into wisdom, in its search for the Wisdom Web. Some of the important issues, although they may not be well conceived yet, are briefly discussed in this chapter.
3.1 A new field of science, technology and engineering
The Web, as a new technical and social phenomenon and a growing organism, creates a new field of science that involves multi-disciplinary study and enquiry for the understanding of the Web and its relationship to us. The Web may be studied from many perspectives, such as philosophical foundations, theoretical and technical foundations, applications, and social impacts. Some examples are given below:
• Webology,
• Web Science,
• Web Technology,
• Web Engineering,
• Weblization.
The term webology is coined to label the study of the Web as a new field of science. By appending the words science and technology, one clearly states the scope. By appending the word engineering, one emphasizes the design and implementation aspects. Together, they are driving forces of the information revolution. The term weblization concisely summarizes the development of the Web and web-based systems so far. The process of weblization involves building the Web itself and reconstructing existing tools and systems on the web platform.
3.2 Design philosophy and principles of the Web
The design philosophy and principles set the direction of web growth and its ultimate destiny. It may be difficult to compile a non-controversial and complete list. However, examples include the decentralization principle, the universalist principles, the minimum constraint principle,
and the separation of form and content principle. The decentralization principle is inherited from the decentralized nature of the Internet. The universalist principles cover universal connectivity and universal accessibility, as well as the diversity of web contents and users. The minimum constraint principle suggests that the Web should be as unconstraining as possible in order to realize its universality. The separation principle deals with the presentation of web documents, in order to achieve location, machine, and application independence. The design principles ensure that the Web has desirable properties such as decentralization, adaptability, evolvability, scalability, universal connectivity and accessibility, affordability, anonymity, diversity, and many others. The Web is able to support communication, collaboration, interaction, and intercreation.
3.3 The laws of the Web
Two sets of laws have been studied, namely, the set of laws governing the Web and the set of empirical laws observable on the Web. The Web has given new meaning to publishing and libraries, but has not changed their underlying principles. Noruzi argued that Ranganathan’s Five Laws of Library Science are as applicable today as they were more than 70 years ago. Ranganathan’s Five Laws of Library Science state:
• Books are for use.
• Every reader his or her book.
• Every book its reader.
• Save the time of the reader.
• The Library is a growing organism.
These laws describe a user-oriented, as well as a service-oriented, view of library science. The Web consists of a massive collection of resources. By replacing “book”, “reader”, and “library” with “web resource”, “user”, and “web”, respectively, Noruzi stated the Five Laws of the Web:
• Web resources are for use.
• Every user his or her web resource.
• Every web resource its user.
• Save the time of the user.
• The Web is a growing organism.
They concisely represent the underlying philosophy of the Web and web services. They also describe the ideal Web - “of the people, by the people, for the people”. Many researchers have studied empirical laws revealed by the Web, whether in its growth, web page distributions, or user surfing patterns. An example set of such laws is reported by Huberman:
1. Power Law of Distribution.
2. Small World Law.
3. Law of Surfing.
4. Law of Congestion.
5. The Free Ride Law.
6. The Law of Downloading.
Website designers, webmasters, and organizations can apply such laws to design better websites and web resources.
3.4 The Web revolution: one link at a time
The story of the invention of the Web and the revolution brought about by the Web provides a good case study for web intelligence. It poses a challenge: how to derive insights and wisdom from existing data, information, and knowledge. Regarding the pre-web uses of hypertext links, Berners-Lee commented, “The research community had used the links between paper documents for ages: Tables of contents, indexes, bibliographies, and reference sections are hypertext links.” A crucial question is what we can get from this common knowledge and practice. Two types of approaches have been proposed and studied. One focuses on the exploration of the potential implications of such knowledge, which led to the creation of a field of science known as citation indexing and analysis. The other focuses on the representation, storage, and access of similar types of data and knowledge using new media as they become available, which led to the invention of the Web.
A basic idea of citation indexing and analysis is to index and study the literature of science based on how scientists cite each other. Although it mainly uses bibliographies, citation indexing
and analysis brings more insights into science, publishing, scientific research, and many more fields. Information retrieval systems based on citation indexing and analysis have been implemented and used by scientists for many years. The same methods have been applied or rediscovered in many recent studies, such as web search engines, social network analysis, and so on.
A basic idea of the Web is to create a global space in which anything can be linked to anything. The development of the Web emphasizes the implementation of this idea using different types of machines and media. The Web attempts to make the existing associations and links that people had used, either explicitly or implicitly, concrete and computer manageable. Similar concepts had been explored in the pre-web age. Vannevar Bush described a photo-electromechanical machine called the Memex that can make and follow cross-references among microfilm documents. Ted Nelson introduced the concept of hypertext, so that people can use computers to read, write and publish non-linear texts. Doug Engelbart demonstrated a collaborative work space called NLS which does hypertext browsing, editing, email, and so on. Thanks to the timely invention of the Internet for providing global connectivity, the dream of the Web became a reality. The revolution of the Web is brought about by a grassroots effort that builds the Web link by link.
There are recent research efforts in cross-applications of the two types of approaches. The methods developed for citation indexing and analysis are used and extended to analyze the links and connectivity of the Web. Existing systems for citation indexing and analysis are moved to, and new such systems are implemented on, the Web. The above brief description, which is almost common knowledge, is repeated here to serve one special purpose.
It demonstrates that the great minds of our time bring about revolutions by analyzing what everyone has already known or, alternatively, by implementing what everyone has already used. The question is: Can web intelligence help in the future?
3.5 The more things change, the more they stay the same
Now, we turn our attention to the other side of the same coin by investigating the things that the revolutions do not change. In spite of the technological changes, the achievements of the current Web and associated systems lie in the process of weblization. The weblization of a specific field or an organization does not change its fundamental principles, although it may become more effective and efficient, as well as operate at a different scale. For example, electronic commerce does not change the principles of doing business, but does introduce more dynamics, opportunities, flexibility, and other new properties. Another example is the Five Laws of the Web: the subject
matters are changed, but the philosophy remains the same. Both paper documents and the Web use links. The physical implementations are different, one on paper and the other on computer, but the logical meanings stay more or less the same. The same analytical tools and methods apply to both.
The property of “unchangeness” makes it possible to apply the same principles again and again, with possible adaptation and adjustment. The philosophy and principles that have proved effective in the past can be applied to the design and implementation of intelligent web information systems. Some illustrative examples are listed here:
• Separation of logical view and physical view.
• Separation of knowledge and inference engine.
• Keep It Simple, Stupid!
The first two separation principles are along the same lines as the separation of content and form principle. The first one is widely used in the design and implementation of database systems. Its application to the Web implies that one can generate many virtual logical views from the same physical web. The second principle is a fundamental one in expert systems. It is applicable to the design of web inference engines. The last rule, also known as the KISS principle, is universally applicable. It has been applied throughout the design of the Web.
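The separation of knowledge and inference engine can be illustrated with a minimal sketch. The engine below is a generic forward-chaining loop that knows nothing about the domain, while the rules are plain data kept apart from it. The rule contents, the fact names, and the function name forward_chain are hypothetical and chosen only for illustration; they are not taken from any system discussed in this report.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# A rule is (premises, conclusion): when every premise is a known fact,
# the conclusion may be added to the known facts.
Rule = Tuple[FrozenSet[str], str]


def forward_chain(facts: Iterable[str], rules: List[Rule]) -> Set[str]:
    """Generic inference engine: apply rules until no new fact is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


# Knowledge base (hypothetical rules), stored separately from the engine.
RULES: List[Rule] = [
    (frozenset({"visited_product_page", "added_to_cart"}), "likely_buyer"),
    (frozenset({"likely_buyer", "returning_visitor"}), "offer_discount"),
]

derived = forward_chain(
    {"visited_product_page", "added_to_cart", "returning_visitor"}, RULES
)
print(sorted(derived))
```

Because the rules are only data, the same engine could in principle be reused with a different rule set, which is the point of the separation principle.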
CHAPTER 4
Components of Web Intelligence
4.1 Web Data
The data available in electronic commerce environments is three-fold and includes server data in the form of log files, site-specific web meta-data representing the structure of the web site, and marketing information, which depends on the products and services provided.
Server data is generated by the interactions between the persons browsing an individual site and the web server. This data can be divided into log files and query data. Historically, web servers recording server activity, errors and referrer information used a separate log file for each type of event. It is now the standard that web servers use a combined log file format, called the Common Log File Format. This format combines the server and error logs into one file. More recently, the Extended Log File Format has been used, which consolidates the Common format with additional information, namely the referrer and cookie information. By incorporating referrer information, the output of mining these log files becomes much more useful and actionable in marketing terms.
Cookies are tokens generated by the web server and held by the clients. The information stored in a cookie helps to compensate for the stateless nature of HTTP interactions with the web server, enabling servers to track client access across their hosted web pages. The logged cookie data is customizable and can contain keys for relating the navigational data to the content of the marketing data, including transactional data. Usually the following information is contained in a cookie: user ID, source IP address, time-to-live, a randomly generated unique ID, and user-defined information.
A further data source that is typically generated on electronic commerce sites is query data sent to a web server. This data is usually generated when users of the web site use its search or product locator facilities to find relevant pages or products.
This often takes the form of user interaction with a product database through the company’s Internet site.
The final source of data is web meta-data. This data describes the structure of the web site and is usually generated dynamically and automatically after a site update. Web meta-data generally includes neighbor pages, leaf nodes and entry points. This information is usually implemented as a site-specific index table, which represents a labeled, directed graph. Meta-data also indicates whether a page has been created statically or dynamically and whether user interaction is required or not. In addition to the structure of a site, web meta-data can also contain information of a more semantic nature, usually represented in XML.
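As a rough illustration of how server log data might be prepared for mining, the sketch below parses one log entry into named fields. It assumes the widely used Common Log Format field order (host, identity, user, timestamp, request, status, size), optionally followed by quoted referrer and user-agent fields; this layout, the sample entry, and the name parse_log_line are assumptions made for the example, not details taken from the systems described above.

```python
import re
from typing import Dict, Optional

# Common Log Format fields, optionally followed by the quoted referrer
# and user-agent fields (assumed layout, see the note above).
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?$'
)


def parse_log_line(line: str) -> Optional[Dict[str, Optional[str]]]:
    """Return the fields of one log entry, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


# Hypothetical log entry used only to exercise the parser.
sample = (
    '192.0.2.10 - - [10/Feb/2001:13:55:36 +0000] '
    '"GET /products/index.html HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" "Mozilla/4.08"'
)
record = parse_log_line(sample)
print(record["host"], record["request"], record["referrer"])
```

Entries without the two trailing fields still parse; in that case the referrer and agent values are simply None.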
Web Mining
In the context of web intelligence, web mining may be defined as the application of data mining techniques to Internet data. This definition is sometimes extended to include statistical, database optimization, and artificial intelligence techniques. Web mining has been sub-divided into web structure, web usage, and web content mining.
Web structure mining is the application of data mining techniques to web site structures. In many cases this may be the entire web, and research in intelligent search engines and intelligent agents is described in many articles. In our research, we define web structure mining as the mining of Internet data together with data about the structure of the site. This may be thought of as enriching the efficacy of the data mining process with domain knowledge. The application of domain knowledge is further discussed in the analytical process section.
Web usage mining is the application of data mining to Internet web server log file data, which is described in the earlier section on web data. Web usage mining forms the core of our research in web mining for web intelligence, and log files provide the foundation data for visitor analysis. This type of analysis of the visitors to a web site can be subdivided into technographic and psychographic analysis. Technographic analysis focuses on what is known about the visitor’s technical platform, i.e., operating system, browser, plug-ins, user language, and cookie information. On its own, this information is not a rich source of discriminatory data for visitor profiling, but in conjunction with the homogeneous data sets available after extract, transform, and load operations into a data warehouse, it contributes significantly. Psychographic analysis is the examination of what we know about the behavioral patterns of web site visitors.
This includes the routes taken by visitors through a site, the time spent on each page, route differences based on differing entry points to the site, aggregated route behavior, general click stream behavior, etc. This is the information of most use to web marketers, and is equivalent to marketing intelligence about where shoppers enter the store, where they go in the store, where they leave the store, what they look at but don’t buy, what they buy and how quickly, etc.
Web content mining is the application of data and text mining algorithms and techniques to the contents of web pages, usually written in HTML. At its simplest, this entails the extraction of text between HTML tags for headings and titles, or the extraction of the HTML Meta tag content. Our research is based upon XML and RDF-based data schemas that help to ensure correctness and proper context.
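The simplest form of content mining mentioned above, extracting the text of title and heading tags and the content of Meta tags, can be sketched with Python’s standard html.parser module. The class name HeadingExtractor, the set of tags collected, and the sample page are illustrative assumptions rather than part of the research described here.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional, Tuple


class HeadingExtractor(HTMLParser):
    """Collect title/heading text and named Meta tag content from a page."""

    TARGET_TAGS = {"title", "h1", "h2", "h3"}

    def __init__(self) -> None:
        super().__init__()
        self.found: List[Tuple[str, str]] = []   # (tag, text) pairs
        self.meta: Dict[str, str] = {}           # meta name -> content
        self._current: Optional[str] = None      # tag being collected
        self._buffer: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.TARGET_TAGS:
            self._current = tag
            self._buffer = []
        elif tag == "meta":
            attr = dict(attrs)
            if attr.get("name") and attr.get("content") is not None:
                self.meta[attr["name"].lower()] = attr["content"]

    def handle_endtag(self, tag):
        if tag == self._current:
            self.found.append((tag, "".join(self._buffer).strip()))
            self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)


# Hypothetical page used only to exercise the extractor.
page = """<html><head><title>Spring Sale</title>
<meta name="keywords" content="shoes, boots">
</head><body><h1>New Arrivals</h1><p>Ignored text.</p></body></html>"""

extractor = HeadingExtractor()
extractor.feed(page)
print(extractor.found)
print(extractor.meta)
```

Text outside the selected tags, such as the paragraph in the sample page, is deliberately ignored; a fuller content miner would go well beyond this.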
4.2 Representation
Intelligent Web agents can use the Problem Solver Markup Language (PSML) to specify their roles, settings, and relationships with any other services. The intelligent Web must also have the ability to process and understand natural language. It must understand and correctly judge the meaning of concepts expressed in words, such as “good,” “best,” and “season.” Further, the intelligent Web must grasp the granularities of these terms’ corresponding subjects and the location of their ontology definitions.
Self-direction and learning
In addition to the semantic knowledge that an intelligent search can extract and manipulate, intelligent Web agents must also incorporate a dynamically created source of meta-knowledge that deals with the relationships between concepts and the spatial or temporal constraint knowledge that planning and executing services use. This allows the agents to self-resolve their conflicts. To solve specific problems, intelligent Web agents must be able to plan. The planning process uses goals and associated subgoals, as well as constraints. In the intelligent Web, ontologies alone will not be sufficient.
Personalization
The intelligent Web can personalize interactions by remembering a particular user’s recent encounters and relating the topics and sites that a user accesses during different online sessions. It may further identify other goals and courses of action as a user’s interactions broaden and deepen, providing ever more data upon which to base its recommendations. As part of its personalized approach to user services, the intelligent Web will interact with the user when executing these tasks. In summary, semantics contributes a vital aspect to the intelligent Web. We expect the Web to extend not only the knowledge of artificial assistants, but also their intelligence.
WI’s Four Levels
We can study Web intelligence on at least four conceptual levels, ranging from the lower, hardware-centered level to the higher, application-centered level. This framework builds upon the fast development and application of various Web technologies.
• Internet-level communication, infrastructure, and security protocols. At its core, the Web is a computer-network system. WI techniques for this level include Web data prefetching systems built upon Web surfing patterns to resolve latency issues. The intelligence of
the Web’s prefetching routines comes from an adaptive learning process based on observations of user surfing behavior.
• Interface-level multimedia presentation standards. The Web functions as an interface for human-Internet interaction. At this level, the Web interfaces require adaptive cross-language processing, personalized multimedia representation, and multimodal data processing capabilities.
• Knowledge-level information processing and management tools. The Web serves as a distributed data and knowledge base. Accessing and manipulating this information requires semantic markup languages to represent the Web’s contents in machine-understandable formats. Agent-based autonomic computing functions such as searching, aggregation, classification, filtering, managing, mining, and discovery can then use this data.
• Application-level ubiquitous computing and social intelligence environments. The Web can form the basis for establishing social networks that contain communities of people, organizations, or other social entities. Social relationships such as friendship, co-working, or exchanging information about common interests connect these entities. The study of WI thus encompasses issues central to social network intelligence.
Users access the Web’s multimedia content from stationary desktop computers and increasingly from mobile platforms as well. Ubiquitous Web access and computing from various wireless devices requires even greater adaptive personalization. WI should suit these needs well by providing techniques for constructing interest models derived from implicit inferences based on user behavior.
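The Internet-level idea of learning from observed surfing behavior in order to fetch pages ahead of time can be sketched as a first-order transition model: count which page tends to follow which, then suggest the most frequent successor as the candidate to prefetch. This is only one simple way to realize such adaptive learning. The class name PrefetchPredictor, the page paths, and the sample sessions are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional


class PrefetchPredictor:
    """Learn page-to-page transition counts from observed sessions."""

    def __init__(self) -> None:
        self.transitions: Dict[str, Counter] = defaultdict(Counter)

    def observe(self, session: List[str]) -> None:
        # Count each consecutive (current page, next page) pair.
        for current, following in zip(session, session[1:]):
            self.transitions[current][following] += 1

    def predict(self, page: str) -> Optional[str]:
        # Most frequently observed successor, or None for an unseen page.
        counts = self.transitions.get(page)
        if not counts:
            return None
        return counts.most_common(1)[0][0]


# Hypothetical browsing sessions used only for illustration.
sessions = [
    ["/home", "/products", "/cart"],
    ["/home", "/products", "/reviews"],
    ["/home", "/products", "/cart"],
    ["/home", "/about"],
]

predictor = PrefetchPredictor()
for session in sessions:
    predictor.observe(session)

print(predictor.predict("/home"))
print(predictor.predict("/products"))
```

The model adapts as more sessions are observed, since every new session updates the counts that drive the next prediction.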
4.3 PSML and Web inference engine
Distributed inference engines form PSML’s core. These engines can perform automatic reasoning on the Web by incorporating autonomically collected and transformed content and meta-knowledge into locally operational knowledge and databases. A feasible way to implement PSML is to use an existing Prolog-like logic language supplemented with agents that perform dynamic-content updates, meta-knowledge.
4.4 Social network intelligence
The social intelligence approach to Web computing presents new opportunities for WI research and development. As the Web becomes an integral part of our society, WI can and should support Web-based social networks at all levels. Study in this area must receive as much attention as Web mining, Web agents, ontologies, and related topics.
Web-based computing
The intelligent Web seeks to provide not only a medium for seamless information exchange and knowledge sharing, but also the sort of human-crafted resources that encourage sustainable knowledge creation and scientific and social evolution. The intelligent Web will rely on Grid-like service agencies that self-organize, learn, and evolve their courses of action to perform service tasks and transform their identities and interrelationships in communities. These services will also cooperate and compete among themselves to optimize their resources and utilities and those of others.
4.5 Benchmark applications
To effectively develop and evaluate systems and applications that address WI research issues, we must consider benchmark applications that will demonstrate these capabilities. Suppose we want to conduct a Web-based search to compile the data and generate a market report for an existing product or a potential new product. To perform these tasks, an information agent will mine and integrate available Web information, which will in turn be passed to a market analysis agent.
The analysis will involve the quantitative simulation of customer behavior in a marketplace, instantaneously handled by other service agencies involving a large number of Grid agents. Given that the variables can number in the hundreds or thousands, generating one prediction can easily require significant computer resources.
  • 24. Web Intelligence
CHAPTER 5
Computational Web Intelligence and Granular Web Intelligence for Web Uncertainty
With the explosive growth of Web data on wired and wireless networks, a challenging problem for a new generation of intelligent Web techniques is how to handle uncertain Web data and make the right decisions under Web uncertainty. It is therefore necessary to develop new intelligent Web techniques for Web applications under different types of uncertainty, including probability, possibility, fuzziness, roughness, randomness, etc.

Web Intelligence (WI), a new direction for scientific research and development, exploits Artificial Intelligence (AI) and advanced Information Technology (IT) on the Web and Internet. In general, AI-based Web techniques can be used to handle probabilistic Web data. Since there are large amounts of fuzzy Web data and other kinds of uncertain Web data, we need to apply relevant intelligent techniques to process uncertain Web data that cannot be processed by traditional precise techniques such as Boolean logic. To promote the use of fuzzy logic on the Internet, Zadeh stated that "fuzzy logic may replace classical logic as what may be called the brainware of the Internet" at the 2001 BISC International Workshop on Fuzzy Logic and the Internet (FLINT2001). Fuzzy intelligent agents are used in smart e-Commerce applications, and conceptual fuzzy sets are applied to Web search engines to improve the quality of Web service. Clearly, intelligent e-brainware based on soft computing plays an important role in smart e-Business applications. Soft computing techniques can therefore play an important role in building the intelligent Web brain, and soft-computing-based Web techniques can enhance Web QoI (Quality of Intelligence).

In order to use CI (Computational Intelligence) techniques to build intelligent wired and wireless systems with high QoI, Computational Web Intelligence (CWI) was proposed at the special session on CWI at FUZZ-IEEE'02, part of the 2002 World Congress on Computational Intelligence. CWI is a hybrid technology of CI and Web Technology (WT) dedicated to increasing the QoI of e-Business application systems on wired and wireless networks. The main CWI techniques include:
    • Fuzzy Web Intelligence (FWI)
    • Neural Web Intelligence (NWI)
    • Evolutionary Web Intelligence (EWI)
    • Granular Web Intelligence (GWI)
    • Rough Web Intelligence (RWI)
    • Probabilistic Web Intelligence (PWI)
  • 25. Web Intelligence
5.1 WEB UNCERTAINTY
The Web holds various data sets distributed over a huge number of computers, just as a human brain holds biological data stored in a large number of biological neurons. The biological data in the human brain are usually uncertain rather than precise, owing to information incompleteness, linguistic vagueness, imperfect measurement, knowledge limitations, etc. Similarly, Web data on the Internet are usually uncertain rather than accurate because of partial Web information, dynamic Web data, fuzzy Web data, Web ontology, unpredictable Web information, different Web users, different hardware environments, different data formats, etc. So the major challenge is how to design intelligent Web techniques for Web-based applications under uncertainty.

With the explosive growth of wired and wireless networks, Web users suffer from huge amounts of raw Web data: current Web tools still cannot find satisfactory information and knowledge effectively or make decisions correctly, because of uncertain Web data, uncertain Web information, uncertain Web knowledge and uncertain Web intelligence. The Internet and wireless networks now connect an enormous number of computing devices, including computers, PDAs (Personal Digital Assistants), cell phones, home appliances, etc. CI is used in telecommunication network applications. Clearly, such a huge networked computing system spanning the world provides a complex, dynamic and global environment for developing new distributed intelligent theory and technology based on AI, BI (Biological Intelligence) and CI. Therefore, we must design intelligent Web technology for dealing with Web uncertainty.

5.2 COMPUTATIONAL WEB INTELLIGENCE FOR WEB UNCERTAINTY
Zadeh states that traditional (hard) computing is the computational paradigm that underlies artificial intelligence, whereas soft computing is the basis of CI. Based on the discussions of CI and AI, the basic conclusion is that CI is different from AI, but the two overlap. In general, hard computing and soft computing can be used in intelligent hard Web applications and intelligent soft Web applications. To enhance the QoI (Quality of Intelligence) of e-Business, Computational Web Intelligence (CWI) is proposed: it uses CI and Web Technology (WT) to build intelligent e-Business applications on the Internet and wireless networks. The concise relation is CWI = CI + WT. Fuzzy logic, neural networks, evolutionary computation, granular computing, rough sets and probabilistic methods are the major CI techniques for intelligent e-Applications on the Internet and wireless networks.

  • 26. Web Intelligence
Currently, the six major research areas of CWI are (1) Fuzzy WI (FWI), (2) Neural WI (NWI), (3) Evolutionary WI (EWI), (4) Probabilistic WI (PWI), (5) Granular WI (GWI), and (6) Rough WI (RWI). In the future, more CWI research areas will be added. The six current major CWI techniques are described below.
    • FWI has two major techniques: fuzzy logic and WT. The main goal of FWI is to design intelligent fuzzy e-agents that deal with the fuzziness of Web data, Web information and Web knowledge, and make good decisions for e-Applications effectively.
    • NWI has two major techniques: neural networks and WT. The main goal of NWI is to design intelligent neural e-agents that can learn Web knowledge from Web data and Web information and make smart decisions for e-Applications.
    • EWI has two major techniques: evolutionary computing and WT. The main goal of EWI is to design intelligent evolutionary e-agents that optimize e-Application tasks effectively.
    • PWI has two major techniques: probabilistic computing and WT. The main goal of PWI is to design intelligent probabilistic e-agents that deal with the probability of Web data, Web information and Web knowledge for e-Applications effectively.
    • GWI has two major techniques: granular computing and WT. The main goal of GWI is to design intelligent granular e-agents that deal with Web data granules, Web information granules and Web knowledge granules for e-Applications effectively.
    • RWI has two major techniques: rough sets and WT. The main goal of RWI is to design intelligent rough e-agents that deal with the roughness of Web data, Web information and Web knowledge for e-Applications effectively.

CWI can be used to increase the QoI of e-Business applications, and it has many wired and wireless applications in intelligent e-Business. It can also be used to deal with the uncertainty and complexity of Web applications. HWI (Hybrid Web Intelligence), a broader area than CWI, can be applied to more complex e-Business applications.

  • 27. Web Intelligence
In summary, HWI, which includes CWI, will play an important role in designing smart e-Application systems for wired and wireless users. CWI technology is based on multiple CI techniques and WT; the relevant CI techniques and WT are selected to build a powerful CWI system for a specific e-Business application.

5.3 GRANULAR WEB INTELLIGENCE FOR WEB UNCERTAINTY
Granular computing technology can be used to perform high-level information processing and knowledge discovery based on data granules that are clustered intelligently from raw, uncertain data. Since huge amounts of Web data are held at different geographical locations, it is natural to use granular computing to preprocess raw Web data, then perform granular Web data mining, and finally discover granular Web knowledge. GWI is thus a general intelligent technology for dealing with raw Web data under uncertainty. Mathematically speaking, handling Web uncertainty effectively requires a novel granular set theory. A general framework for granular sets, intended to deal with data uncertainty such as Web data uncertainty, is briefly described below.

Definition 1 (A Granular Set). Let $X$ be a universal set of data elements. A granular set $A$ in $X$ is characterized by $m$ granular membership functions $F_k(x)$, where $F_k(x) \in [0, 1]$ for $x \in X$ and $k = 1, 2, \ldots, m$.

For example:
    • If $m = 1$, a granular set is a fuzzy set (with a crisp set as a special case), since one membership function is used. Traditional fuzzy sets use only truth values in [0, 1] to handle data uncertainty.
    • If $m = 2$, a granular set is an intuitionistic fuzzy set [25], since two membership functions are used. Intuitionistic fuzzy sets use both truth values and falsity values in [0, 1] to deal with data uncertainty.
    • If $m = 3$, a granular set is a neutrosophic set, since three membership functions are used. For example, interval neutrosophic sets are defined by a truth-membership function, an indeterminacy-membership function and a falsity-membership function. The major advantage of interval neutrosophic sets is that they reduce data uncertainty by using three types of information (truth values, falsity values and indeterminacy values) in order to reach the right decision.
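Definition 1 can be made concrete with a short sketch. The class below is only an illustration of a set described by m membership functions; the class name, the membership functions and the "relevance score" interpretation of x are assumptions made for this example and do not come from the cited work.

```python
from typing import Callable, Sequence, Tuple

Membership = Callable[[float], float]


class GranularSet:
    """A set described by m membership functions, each mapping x into [0, 1]."""

    def __init__(self, functions: Sequence[Membership]):
        if not functions:
            raise ValueError("a granular set needs at least one membership function")
        self.functions = tuple(functions)

    @property
    def m(self) -> int:
        """Number of membership functions (1: fuzzy, 2: intuitionistic, 3: neutrosophic)."""
        return len(self.functions)

    def grades(self, x: float) -> Tuple[float, ...]:
        """Evaluate every membership function at x and check the [0, 1] range."""
        values = tuple(f(x) for f in self.functions)
        if any(not 0.0 <= v <= 1.0 for v in values):
            raise ValueError(f"membership grade outside [0, 1]: {values}")
        return values


# Hypothetical membership functions over a relevance score x in [0, 1].
# truth(x) + falsity(x) never exceeds 1, as an intuitionistic fuzzy set requires.
def truth(x: float) -> float:
    return 0.8 * x


def falsity(x: float) -> float:
    return 0.8 * (1.0 - x)


def indeterminacy(x: float) -> float:
    return 0.2


fuzzy = GranularSet([truth])                                 # m = 1
intuitionistic = GranularSet([truth, falsity])               # m = 2
neutrosophic = GranularSet([truth, indeterminacy, falsity])  # m = 3
```

Calling `neutrosophic.grades(x)` for a Web document's relevance score returns its truth, indeterminacy and falsity grades together, which is the extra information the neutrosophic case uses to support a decision.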
  • 28. Web Intelligence
We hope that new granular sets and new granular logical systems with four or more membership functions will be developed in the future to handle Web uncertainty effectively and fundamentally.

Web uncertainty is a difficult, long-term problem that affects many Web applications, such as the semantic Web, Web mining, Web knowledge discovery, Web agents, Web search engines, Web security, e-Commerce and e-Business. To handle it, we need to develop relevant intelligent Web technology such as CWI and GWI. Importantly, we need to continue to create new granular sets, such as neutrosophic sets, to try to solve Web uncertainty effectively. We also need to use different intelligent techniques together for this complicated problem. Hybrid Web Intelligence (HWI), a broad hybrid research area, uses AI, CI, BI (Biological Intelligence) and WT to build hybrid intelligent Web systems that handle Web uncertainty effectively and efficiently. In the future, HWI will have many intelligent Web applications under uncertainty. The main HWI applications include:
    (1) intelligent Web agents for e-Applications such as e-Commerce, e-Government, e-Education and e-Health,
    (2) intelligent Web security systems such as intelligent homeland security systems,
    (3) intelligent Web bioinformatics systems,
    (4) intelligent grid computing systems,
    (5) intelligent wireless mobile agents,
    (6) intelligent Web expert systems,
    (7) intelligent Web entertainment systems,
    (8) intelligent Web services,
    (9) Web data mining and Web knowledge discovery,
    (10) intelligent distributed and parallel Web computing systems based on a large number of networked computing resources, and so on.
  • 29. Web Intelligence
CHAPTER 6
Trends and Challenges of WI Related Research and Development
Web Intelligence presents excellent opportunities and challenges for the research and development of a new generation of Web-based information processing technology, as well as for exploiting business intelligence. With the rapid growth of the Web, research and development on WI have received much attention, and we expect that more attention will be focused on WI in the coming years. Many specific applications and systems have been proposed and studied. Several dominant trends can be observed and are briefly reviewed in this section.

E-commerce is one of the most important applications of WI. The e-commerce activity that involves the end user is undergoing a significant revolution. The ability to track users’ browsing behavior down to individual mouse clicks has brought the vendor and the end customer closer than ever before. It is now possible for a vendor to personalize a product message for individual customers on a massive scale. This is called targeted marketing or direct marketing. Web mining and Web usage analysis play an important role in e-commerce for customer relationship management (CRM) and targeted marketing. Web mining is the use of data mining techniques to automatically discover and extract information from Web documents and services. Zhong et al. proposed a way of mining peculiar data and peculiarity rules that can be used for Web-log mining. They also proposed ways of performing targeted marketing by mining classification rules and market value functions. A challenge is to explore the connection between Web mining and related agent paradigms such as Web farming, the systematic refining of information resources on the Web for business intelligence.

Text analysis, retrieval, and Web-based digital libraries form another fruitful research area in WI. Topics in this area include semantic models of the Web, text mining, and the automatic construction of citations. Abiteboul et al. systematically investigated data on the Web and the features of semi-structured data. Zhong et al. studied text mining on the Web, including automatic construction of ontologies, e-mail filtering systems, and Web-based e-business systems.

Web-based intelligent agents are aimed at improving a Web site or providing help to a user. Liu et al. worked on e-commerce agents. Liu and Zhong worked on Web agents and KDDA (Knowledge Discovery and Data Mining Agents). We believe that Web agents will be a very important issue. It is therefore not surprising that we decided to hold the WI conference in parallel to the Intelligent Agents conference. In the next section, we provide a more detailed description of intelligent Web agents.

  • 30. Web Intelligence
The Web itself has been studied from two aspects: the structure of the Web as a graph, and the semantics of the Web. Studies on Web structures investigate several structural properties of graphs arising from the Web, including the graph of hyperlinks and the graph induced by connections between distributed search servants. The study of the Web as a graph is not only fascinating in its own right, but also yields valuable insight into Web algorithms for crawling, searching and community discovery, and into the sociological phenomena that characterize its evolution.

Studies of the semantics of the Web were initiated by Tim Berners-Lee, the creator of the World Wide Web. The resulting vision is referred to as the “semantic Web”, where information will be machine-processable in ways that support intelligent network services such as information brokers and search agents. The semantic Web requires interoperability standards that address not only the syntactic form of documents but also their semantic content. A semantic Web also lets agents utilize all the data on all Web pages, allowing them to gain knowledge from one site and apply it to logical mappings on other sites for ontology-based Web retrieval and e-business intelligence. Ontologies and agent technology can play a crucial role in enabling such Web-based knowledge processing, sharing, and reuse between applications. A new DARPA program called DAML (DARPA Agent Markup Language) is a step toward a “semantic Web” where agents, search engines and other programs can read DAML mark-up to decipher meaning rather than just the content on a Web site.

6.1 Intelligent Web Agents
Intelligent agents are computational entities that are capable of making decisions on behalf of their users and of self-improving their performance in dynamically changing and unpredictable task environments. Liu provided a comprehensive overview of related research in the field of autonomous agents and multi-agent systems, with an emphasis on its theoretical and computational foundations as well as in-depth discussions of useful techniques for developing various embodiments of agent-based systems, such as autonomous robots, collective vision and motion, autonomous animation, and search and segmentation agents. The core of those techniques is the notion of synthetic or emergent autonomy based on behavioral self-organization.

  • 31. Web Intelligence
Intelligent Web Agents (WA) are software programs that primarily serve two important roles: (a) autonomous entities for exploring and exploiting Web-based services, and (b) prototype entities for exhibiting and explaining Web-generated regularities. The first of these roles is described below.

6.2 From WA to Web-Based Services
The first role for WA can be readily described and appreciated by examining the following typical scenarios, in which various tasks and objectives are achieved.
    • Personalized Multimodal Interface: WA can provide users with a user-friendly style of presentation that personalizes both the interaction with users and the content presentation. This activity involves the creation of various cognitive aids, including tables, charts, executive summaries, indices, and personalized visual assistants (e.g., graphically animated personas and virtual-reality avatars). WA as interfaces must make electronic services easy to use. The cognitive aids provided must be concise (i.e., accessible with as few manipulations and as little memorization as possible) and consistent (i.e., understandable based on users’ previously customized cognitive styles).
    • Push and Pull: WA can play an important role in dynamically creating pull-and-push advertising. By pull-and-push advertising we mean that a user expresses his or her favorites during the interaction with the agents (pull advertising) and in return the agents search for and dynamically deliver information about the favorite items to the user (push advertising). Such agents can also increase the positive externality of products; that is, the better people are informed about certain products, the more likely the products are to be sold.
    • Pattern Discovery and Self-Organization: WA make it possible to detect which buying patterns are forming among users and how they are structured, and hence to manage online commerce effectively. Collaborative recommendation agents can help individual users aggregate into groups, which can in turn form a dynamic marketplace.
    • Information Gateway: WA can provide users with immediate access to the most relevant information. This support encompasses a wide spectrum of information filtering and delivery activities: manipulating various heterogeneous Web sources, including databases, data warehouses, newswire, financial reports, newsletters, newsgroups, outbound emails, electronic bulletin boards, and hypermedia documents, and then tailoring and delivering the retrieved information to users based on their profiles.

  • 32. Web Intelligence
      The summary information provided must be just-in-time (i.e., delivered whenever it is needed), relevant (i.e., focused on whichever topics the users are concerned with), and up-to-the-minute (i.e., refreshed whenever a new piece of information arrives). An example application with this type of agent support is comparison shopping that utilizes WA with mobile and filtering capabilities. Some related experiences have been reported in the literature.
    • Reward: WA can motivate users to enter and re-enter a certain electronic service. While an ever-greater proliferation of content continues to consume individuals’ attention, e.g., through push technology to sell something or to support users, WA can play a crucial role in creating a captive audience, in educating it constantly, and even in breaking users’ old purchasing habits. To be rewarding is to add value. Motivational rewards or incentives can be created by offering free access to certain information and utility resources (e.g., free software downloads), opportunities to participate in multi-user information or commodity exchange activities (e.g., collaborative recommendation, chat, bidding, and auction), and scheduled plans for promotional deals.
    • Matchmaking: WA can serve as a new means for trading commodities. Since the interests of users, as well as the availability of products from dealers, can change dynamically over time, what usually happens in present-day electronic commerce is: (1) a dealer sells his or her items simply because these are the only items that he or she has at the moment, or (2) a user buys a certain item simply because it is the last item that he or she can find that partially fits his or her need. WA-based customized business attempts to change existing online buying and selling into the following new scenarios: (1) a dealer identifies and offers exactly what users are interested in, and (2) a user finds and purchases what he or she really loves. Some technical issues related to matchmaking have been addressed in the literature.
    • Decision: WA can assist Web users in making decisions. Such decision support may take the form of evaluations or recommendations on the various features of specific items, cost-benefit analysis, inference support for optimizing utility and resources with respect to functional, time, and cost requirements, and model-based trend analysis and projections concerning new patterns of demand.
    • Delegation: WA can act on behalf of Web users in online activities. The tasks that may be delegated to WA include matchmaking, server monitoring, negotiation, bidding, auction, transaction, transfer of goods, and follow-up support. This scenario will empower a new paradigm shift from user-centric to user-delegated electronic business.

  • 33. Web Intelligence
      The delegation of these tasks may be carried out in either a semi-autonomous manner (with users’ intervention on decisions) or a fully autonomous manner. To this end, various computational theories and models have been proposed and reported in the literature.
    • Collaborative Work Support: WA can offer the infrastructure support as well as the necessary functions for collaboratively solving problems and managing workflow activities.
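The matchmaking scenario above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a method taken from the report: it scores each item by the Jaccard similarity between a user's interest tags and the item's tags, and the catalogue, tags and threshold are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two tag collections: |a ∩ b| / |a ∪ b|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def match(user_interests, catalogue, threshold=0.3):
    """Return item names whose similarity to the user's interests meets the threshold.

    Results are ordered from best to worst match; ties are broken by name.
    """
    scored = [(jaccard(user_interests, tags), item) for item, tags in catalogue.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item for score, item in scored if score >= threshold]


# Hypothetical dealer catalogue and user profile.
catalogue = {
    "pocket_cam": {"camera", "compact"},
    "dslr_kit": {"camera", "lens", "tripod"},
    "guide_book": {"travel", "book"},
}
interests = {"camera", "travel", "compact"}

best = match(interests, catalogue)                     # strict threshold
broader = match(interests, catalogue, threshold=0.2)   # looser threshold
```

Because the scores are recomputed from the current profile and catalogue on each call, the same function can be re-run whenever users' interests or dealers' stock change, which is the dynamic behaviour the scenario describes.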
  • 34. Web Intelligence
CHAPTER 7
Semantic Search Engine
The framework’s search engine component queries the information generated by the annotation component. It accepts queries posed in SPARQL and returns a set of links to matching resources. A specialized search interface lets users develop an abstract model of a semantic query, pose it to the engine, and then review the resulting matched documents. The search interface gives end users (people who are not experts in Semantic Web technologies) a way to access the resources filtered and annotated by the semantic annotator component. Users can also add and delete entities and properties (with related values), so that they can interact with the knowledge base to fine-tune the query and make subsequent searches more accurate. The key aim of the query interface is to give the user an intuitive and clear abstract query model that hides, as much as possible, the underlying complexity of representation and reasoning. Furthermore, the agents in the search engine’s multi-agent system exhibit various autonomic features that aim to make the system more robust and scalable.

The QS system has been deployed in two different commercial test cases in the UK. In the first case, QS was used to examine specific Web-published documents for commercial opportunities matching the business interests of the customer company. In the second deployment, QS was used to perform knowledge-based searches over existing database sources. In evaluating the performance of the search system in both applications, we could see that by using ontological knowledge and ontology-based annotations, users could perform more accurate queries while receiving up to 71 percent fewer documents than with a keyword-based search engine; in the best cases, more than 90 percent of the irrelevant documents were eliminated. We are now in the process of further refining these two deployments, and we are planning more industrial deployments in the near future with other UK companies.
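To illustrate how a SPARQL-style query over annotations returns links to matching resources, here is a minimal sketch. It is not the QS system and not a SPARQL engine: it only matches a list of triple patterns (strings starting with `?` are variables) against an in-memory list of triples, and the documents, properties and values are hypothetical.

```python
def match_pattern(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` matches `triple`; return None on failure."""
    new_binding = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier, if any.
            if new_binding.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new_binding


def query(triples, patterns):
    """Return all variable bindings that satisfy every pattern (a conjunctive query)."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in triples:
                result = match_pattern(pattern, triple, binding)
                if result is not None:
                    extended.append(result)
        bindings = extended
    return bindings


# Hypothetical annotations produced by an annotation component.
annotations = [
    ("doc1", "mentions", "AcmeCorp"),
    ("doc2", "mentions", "Globex"),
    ("doc3", "mentions", "AcmeCorp"),
    ("AcmeCorp", "sector", "energy"),
    ("Globex", "sector", "retail"),
]

# Roughly: SELECT ?doc WHERE { ?doc :mentions ?org . ?org :sector "energy" }
results = query(annotations, [("?doc", "mentions", "?org"), ("?org", "sector", "energy")])
links = sorted(binding["?doc"] for binding in results)
```

A keyword search for "energy" would find none of these documents, since the word appears only in the knowledge about the organizations; the pattern query finds `doc1` and `doc3` by joining the annotations with that knowledge.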
  • 35. Web Intelligence
CHAPTER 8
CONCLUSION
While it may be difficult to define exactly what Web Intelligence (WI) is, one can easily argue for the need to create such a subfield of study in computer science. With the rapid growth of the Web, we foresee a fast-growing interest in Web Intelligence. Roughly speaking, we define Web Intelligence as a field that “exploits Artificial Intelligence (AI) and advanced Information Technology (IT) on the Web and Internet.” It may be viewed as a marriage of artificial intelligence and information technology in the new setting of the Web. By examining the scope and historical development of artificial intelligence, we discuss some fundamental issues of Web Intelligence in a similar manner. There is no doubt in our mind that results from AI and IT will influence the development of WI.

Instead of searching for a precise and noncontroversial definition of WI, we list topics that might interest a researcher working on Web-related issues. In particular, we identify some challenging issues of WI, including e-commerce, studies of Web structures and Web semantics, Web information storage and retrieval, Web mining, and intelligent Web agents, in order to examine the performance characteristics of various approaches in Web-based intelligent information technology and to cross-fertilize ideas on the development of Web-based intelligent information systems among different domains. This report is not intended to be a complete and systematic study of the field, but rather a record of personal observations, scattered (perhaps immature) ideas, general comments, speculations, and opinions. We hope that a careful study of these not yet well-connected points may lead to a web of knowledge for Web intelligence.

We examined the Web from several perspectives. This enables us to see clearly the current status, the scope, and the future of Web intelligence research. Web intelligence exploration of the Web was then commented on from a few angles, and a couple of challenges were posed. Finally, Web-based Support Systems (WSS) were used to demonstrate the ideas presented, which may further enhance the Web as a tool “of the people, by the people, for the people”.
  • 36. Web Intelligence
REFERENCES
[1] Y.Y. Yao, Ning Zhong, Jiming Liu, and Setsuo Ohsuga, "Research Challenges and Trends in the New Information Age," IEEE.
[2] Yiyu (Y.Y.) Yao, "Web Intelligence: New Frontiers of Exploration," Department of Computer Science, University of Regina, Regina, Saskatchewan, IEEE.
[4] Vladan Devedzic, "Education and the Semantic Web," Department of Information Systems and Technologies, FON – School of Business Administration, University of Belgrade.
[5] Yan-Qing Zhang, Member, IEEE, "Computational Web Intelligence and Granular Web Intelligence for Web Uncertainty."