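Division of Synergetic Information Science, Master's Thesis (February 14, 2007)
A Study on Website Classification
Using Words Characteristic of Categories
Research Group of Complex Systems Engineering, Laboratory of Harmonious Systems Engineering, Master's Course 2nd Year, Name: Takatomo Honda (本田崇智)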
Automated Classification Method of Websites by using Characteristic
Keywords
Research Group of Complex Systems Engineering
Laboratory of Harmonious Systems Engineering
MC2 Takatomo Honda
Abstract: In recent years, a growing number of tourists have obtained tourism information
from websites. At the same time, the amount of tourism information on the World Wide Web
(WWW) has grown significantly. This large volume of information makes it difficult for
tourists to find what they need. To help tourists obtain clear and useful tourism
information, this study proposes a method for searching websites by category. Categories
such as restaurants or accommodation help tourists acquire appropriate and satisfactory
information. Searching websites by category requires collecting websites from the WWW and
classifying them into categories. This research focuses on the classification step.
Previous research on website classification from text information has used the tf-idf
method and the Bayesian classifier. This research proposes a website classification method
that concentrates on the websites belonging to a given category and on characteristic
keywords that appear only in that category. The effectiveness of this approach was
examined using data from the directory-based search engine “Yahoo! Japan” and websites
collected randomly from the WWW. The experimental results show that the proposed method
classifies websites into the proper categories more accurately than a previous method.
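The idea of characteristic keywords that appear only in a given category can be illustrated with a minimal sketch. It is not the implementation used in this research: the whitespace tokenizer, the scoring rule (the fraction of a page's words that are characteristic keywords), the threshold of 0.2, and the sample pages are all assumptions made for illustration.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (an assumed stand-in for real keyword extraction)."""
    return text.lower().split()


def characteristic_keywords(in_category, out_of_category):
    """Return counts of words that occur in in-category pages but never in the other pages."""
    inside = Counter(w for page in in_category for w in tokenize(page))
    outside = {w for page in out_of_category for w in tokenize(page)}
    return {w: n for w, n in inside.items() if w not in outside}


def score(page, keywords):
    """Fraction of the page's tokens that are characteristic keywords (hypothetical scoring rule)."""
    tokens = tokenize(page)
    if not tokens:
        return 0.0
    return sum(1 for w in tokens if w in keywords) / len(tokens)


def belongs(page, keywords, threshold=0.2):
    """Decide category membership; the threshold value is an arbitrary assumption."""
    return score(page, keywords) >= threshold


# Toy data, invented for illustration only.
restaurant_pages = ["sushi menu lunch dinner reservation", "ramen menu lunch price"]
other_pages = ["hotel room reservation price", "museum ticket price hours"]

keywords = characteristic_keywords(restaurant_pages, other_pages)
print(sorted(keywords))
print(belongs("sushi lunch menu today", keywords))
print(belongs("hotel room price", keywords))
```

In this toy example, "reservation" and "price" are dropped from the restaurant keywords because they also occur on pages outside the category, so only words exclusive to the category contribute to the score.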
Research achievements (refereed journal papers, international conference papers, domestic conference papers, etc.):
1. Takatomo Honda, Masahito Yamamoto, Azuma Ohuchi, “Automatic Classification of Websites based on
Keyword Extraction of Nouns”, Information and Communication Technologies in Tourism 2006,
SpringerWienNewYork, pp.263-272. (2006).
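2. Takatomo Honda, Masahito Yamamoto, Azuma Ohuchi, “Automatic Classification of Websites toward Building a Tourism
Internet Directory”, Society for Tourism Informatics (観光情報学会), Vol.2, No.1, pp.49-57. (2005). (in Japanese)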
Others: 6 domestic conference papers