Authors: Wuwongse, Vilas; Vacharasintopchai, Thiti; and
Intaraksa, Neelawat
Issue Date: 12-Dec-2008
Type: Article
Series/Report no.: Proc. 2nd Hanoi Forum on Information-Communication Technology (ICT-Hanoi 2008);
Abstract: Information and Communication Technology has advanced at an unprecedented rate, resulting in a mass of electronic contents being produced and disseminated at an exponential rate. In general, such contents are not systematically organized, making them inaccessible when they are most needed. A software infrastructure to commonly support the management of knowledge and information on the Internet and an intranet is proposed. It can be used to capture, preserve and manage information and knowledge so that they stay intact and do not vanish with time. It allows pieces of knowledge in repositories to be located and shared effectively across boundaries. Contents from repositories can be readily utilized and published. Opinions and discussions about contents can also be captured and archived for later reference. Once adopted and deployed on a large scale, the Common Infrastructure for Knowledge and Information Management will play a crucial role in creating a universal source of knowledge for humanity.
URI: http://dspace.siu.ac.th/handle/1532/129
Common Infrastructure for Knowledge and Information Management
VNU Journal of Science, Natural Sciences and Technology xx (2008) 0-0
Vilas Wuwongse1*, Thiti Vacharasintopchai2, Neelawat Intaraksa3
1 Professor, School of Engineering and Technology, Asian Institute of Technology, P.O. Box 4 Klong Luang, Pathumthani 12120 Thailand
2 Lecturer, School of Technology, Shinawatra University, 99 Moo 10 Bangtoey, Samkhok, Pathumthani 12160 Thailand
3 Research Associate, The Greater Mekong Subregion Academic and Research Network, Asian Institute of Technology, P.O. Box 4 Klong Luang, Pathumthani 12120 Thailand
Received ...
Keywords: knowledge management, information retrieval, software interoperability, digital library
* Corresponding author. Tel.: +66 2524-5700. E-mail: vilasw@ait.ac.th
1. Introduction
Information and Communication Technology (ICT) has advanced at an unprecedented rate. We have witnessed constant growth in the performance of computing and communication devices, yet a constant decline in their prices. Personal computers and Internet access have penetrated most households, schools and offices, from capital cities to remote rural towns. Computers and mobile phones come equipped with video and audio recording capabilities, like those found in ubiquitous digital cameras and camcorders, making the creation of electronic multimedia contents more affordable and more convenient than ever. This ease of access to ICT tools results in a mass of electronic contents being produced and disseminated at an exponential rate. The contents are typically stored on personal hard drives and shared on the Internet as e-mails, static web pages or user-contributed contents using Web 2.0 technologies: forums, weblogs, wikis, content management systems (CMS) and learning management systems (LMS). In general, they are not systematically organized, or if so, merely by folder hierarchies. Information is located and discovered through operating system search features, website search sections, or Internet search engines like Google. There are times when users cannot find the information they most need, or are presented with piles of duplicated, irrelevant information that demands manual examination, sometimes only to find that the files are damaged and can only partially be recovered. It is not uncommon that pieces of knowledge are not shared within communities or organizations because of two extremes: people do not know that they exist, or there are just too many of them.
2. How Problems Can Be Alleviated
The problems described earlier can be categorized into three groups, namely, the problem of metadata control and indexing, the problem of digital content preservation, and the problem of system interoperability. The lack of metadata control and indexing in mainstream information systems leaves users unable to locate and retrieve particular information precisely and in time, or presents them with piles of irrelevant information. The lack of content preservation makes some retrieved contents inaccessible because of file damage, or unreliable because the original creators cannot be identified and trusted. The lack of a standard interoperability protocol makes cross-system searches infeasible without a global search engine like Google, resulting in users being unaware of critical pieces of knowledge that already exist at the time they are most needed.
The metadata problem can be alleviated by information service providers adopting international metadata standards and cataloging their contents accordingly. Technologies and tools for ontologies and natural language processing are available to help, so that contents are more properly clustered and cataloged and queries are better aligned and matched with indexes. The preservation problem can be alleviated by organizations adopting digital library technologies, which enable effective archiving and preservation of digital contents so that they stay intact over time. The system interoperability problem can be alleviated by information service providers adopting open standards which allow information in repositories to be exchanged freely among participating organizations by means of metadata. Such standards include the Search/Retrieval via URL (SRU) protocol [1], sanctioned by the United States Library of Congress, for peer-to-peer queries of content metadata in repositories. The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) [2] and RSS web syndication [3] can be used to harvest and build up directories of metadata from information sources. The adoption of these open standards will widen the range of information sources accessible to users from the consumer point of view, and increase the effectiveness with which new contents are delivered to potential target groups from the provider point of view.
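To make the role of these open standards more concrete, the following is a minimal Python sketch of how metadata from an OAI-PMH repository and an RSS feed might be merged into one directory, and how an SRU query URL might be formed. It is an illustration under stated assumptions, not the implementation described in this paper: the host names, the sample XML documents and the helper names (`sru_search_url`, `oai_list_records_url`, `parse_oai_dc`, `parse_rss`) are all hypothetical. Only the request parameters (`operation`, `version`, `query`, `maximumRecords` for SRU; `verb`, `metadataPrefix` for OAI-PMH) and the XML namespaces are taken from the public specifications.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def sru_search_url(base_url, cql_query, max_records=10):
    """Build an SRU searchRetrieve request URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "maximumRecords": max_records,
    }
    return base_url + "?" + urlencode(params)


def oai_list_records_url(base_url):
    """Build an OAI-PMH ListRecords request for unqualified Dublin Core."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    return base_url + "?" + urlencode(params)


def parse_oai_dc(xml_text, source):
    """Extract identifier and title from each record of a ListRecords response."""
    root = ET.fromstring(xml_text)
    entries = []
    for record in root.iterfind(".//oai:record", NS):
        entries.append({
            "source": source,
            "id": record.findtext("oai:header/oai:identifier", default="", namespaces=NS),
            "title": record.findtext(".//dc:title", default="", namespaces=NS),
        })
    return entries


def parse_rss(xml_text, source):
    """Extract link and title from each item of an RSS 2.0 feed."""
    root = ET.fromstring(xml_text)
    return [
        {
            "source": source,
            "id": item.findtext("link", default=""),
            "title": item.findtext("title", default=""),
        }
        for item in root.iterfind("./channel/item")
    ]


# Hypothetical sample responses, hard-coded so the sketch runs offline.
SAMPLE_OAI = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repo.example.org:item-1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample learning material</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

SAMPLE_RSS = """<rss version="2.0"><channel><title>Example weblog</title>
<item><title>Discussion of a lesson plan</title>
<link>http://blog.example.org/post/1</link></item>
</channel></rss>"""

# One directory of metadata built from two different kinds of source.
directory = parse_oai_dc(SAMPLE_OAI, "digital-library") + parse_rss(SAMPLE_RSS, "weblog")

for entry in directory:
    print(entry["source"], entry["id"], entry["title"])

print(oai_list_records_url("http://repo.example.org/oai"))
print(sru_search_url("http://repo.example.org/sru", 'dc.title = "knowledge"'))
```

Both parsers reduce their input to the same small record shape, so the directory does not need to know which protocol a given entry came from. In a real deployment the XML would come from HTTP responses, and an OAI-PMH harvester would also need to follow resumption tokens; both are omitted here for brevity.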
Fig 1. System Architecture of Common Infrastructure for
Knowledge and Information Management.
3. A Common Infrastructure for Knowledge and Information Management
A system architecture which unifies the individual solutions described previously into an infrastructure for the management of information and knowledge on the Internet and an intranet is proposed in Figure 1. It consists of four layers, namely, the Network and Internet layer, the System Software layer, the Data Exchange and Metadata layer, and the Application layer.
The Network and Internet layer is the basic infrastructure for information exchanges between information services. Such an infrastructure is provided by Internet service providers and is commonly taken for granted in software development.
The System Software layer is responsible for the creation, storage and utilization of knowledge. It is composed of three sub-layers: the Knowledge Creation and Capturing layer, which records personal knowledge and experiences as well as information and opinions as digital contents; the Data Center layer, which systematically collects, catalogs, and preserves the digital contents so that they stay intact and are ready for random retrieval on demand; and the Data Utilization layer, which enables intelligent and effective searches for pieces of knowledge in data centers such that they can be quickly posted on websites. Content management systems and learning management systems allow general users without programming skills to publish contents online conveniently. Querying data centers loaded with useful pieces of knowledge is analogous to querying a human-filtered search engine.
The Data Exchange and Metadata layer comprises the standards and software services that facilitate the exchange of knowledge pieces between information systems. These include the SRU, OAI-PMH and RSS protocols introduced earlier, as well as the system software components that handle information exchanges based on such protocols.
The Application layer involves the human processes that utilize individuals’ knowledge
and experiences captured, shared and discovered across information systems to perform tasks in various disciplines, which could range from education, preservation of cultural heritage, and planning and development to agricultural and environmental activities.
4. Prototype System
The components for the proposed infrastructure are being developed at the Asian Institute of Technology based on open-source software tools. Core components in the Data Center, Data Utilization, and Data Exchange and Metadata layers have been implemented. Components for the Knowledge Creation and Capturing layer are being designed. The DSpace [4] and Greenstone [5] digital library servers were chosen as the engines for the collection and preservation of digital contents in knowledge repositories. The Moodle [6] learning management system and the Drupal [7] content management system were chosen as the platforms to utilize knowledge in repositories for academic and non-academic purposes, respectively. A metadata harvester has been developed to aggregate metadata from various information sources into a central directory. The metadata harvested include those from digital libraries and library management systems, through the OAI-PMH protocol, as well as those from Web 2.0 sites, through RSS web syndication. A “single search” engine has been developed to assist users in exploring this directory. Contents can be retrieved by keywords in metadata fields and can be browsed and explored by facets of metadata. SRU support for peer-to-peer metadata queries has been added to the original DSpace code and is being incorporated into the single-search facility. An offspring of this implementation was applied in an e-learning application, under the title “The Knowledge, Imagination, Discovery and Sharing (KIDS-D) Project,” in which networks of digital libraries are created to archive and exchange useful learning materials among teachers and students at pilot schools and institutes across Thailand. Rare historical books from the National Archive have also been digitized and preserved in the digital libraries. Such contents, together with discussions about them among students and teachers, are hoped to alleviate the academic resource deficiency problem in Thailand. It should be noted that, unlike other software infrastructures in which components are tightly coupled and deployed, our implementation of the Common Infrastructure is loosely coupled, meaning that software components interoperate through open standards and protocols and can be readily replaced by alternative components that are standard-compliant. Therefore, the implementation of the Common Infrastructure for Knowledge and Information Management is also highly flexible and scalable in this regard.
5. Conclusion
This paper has presented a software infrastructure to commonly support the management of knowledge and information on the Internet and an intranet. The infrastructure can be used to capture, preserve and manage the information and knowledge that belong to communities and organizations so that they stay intact and do not vanish with time. It allows pieces of knowledge in repositories to be located and shared effectively across boundaries within an organization or between organizations and nations through open standards, without necessitating a homogeneous software suite. Contents from repositories can be quickly utilized and published with the assistance of learning
management systems and content management systems. Opinions and discussions about contents can also be captured and archived back into the repositories for later reference. Once adopted and deployed on a large scale, the Common Infrastructure for Knowledge and Information Management will play a crucial role in creating a universal source of knowledge for humanity.
Acknowledgements
The authors would like to thank the Royal Thai Government for its financial support through the Greater Mekong Subregion Academic and Research Network Knowledge Management Toolkit and Applications project. They would also like to thank the countless friends and colleagues whose constructive comments have contributed to this research. The developers of the free and open source software tools used are gratefully acknowledged for their time and contributions.
References
[1] The Library of Congress, Search/Retrieval via URL, June 2008. Available online: http://www.loc.gov/standards/sru
[2] The Open Archives Initiative, Open Archives Initiative - Protocol for Metadata Harvesting, October 2004. Available online: http://www.openarchives.org/OAI/openarchivesprotocol.html
[3] RSS Advisory Board, RSS 2.0 Specification, 2006. Available online: http://www.rssboard.org/rss-specification
[4] DSpace, DSpace – An Open-source Solution for Accessing, Managing and Preserving Scholarly Works, 2008. Available online: http://www.dspace.org
[5] Greenstone, Greenstone Digital Library Software, 2008. Available online: http://www.greenstone.org
[6] Moodle, Moodle – A Free Open Source Course Management System for Online Learning, 2008. Available online: http://moodle.org
[7] Drupal, Drupal – An Open Source Content Management System Platform, 2008. Available online: http://drupal.org