ICDM 2005 Workshop Proposal

         Workshop on Knowledge Acquisition from Distributed, Autonomous, Semantically
                        Heterogeneous Data and Knowledge Sources

Description of the workshop topic and the associated research issues

Recent advances in high performance computing, high speed and high bandwidth communication, massive storage,
and software (e.g., web services) that can be remotely invoked on the Internet present unprecedented opportunities in
data-driven knowledge acquisition in a broad range of applications in virtually all areas of human endeavor including
collaborative cross-disciplinary discovery in e-science, bioinformatics, e-government, environmental informatics,
health informatics, security informatics, e-business, education, and social informatics. Given the explosive
growth in the number and diversity of potentially useful information sources in many domains, there is an urgent need
for sound approaches to integrative and collaborative analysis and interpretation of distributed, autonomous (and
hence, inevitably semantically heterogeneous) data sources.

Machine learning offers some of the most cost-effective approaches to automated or semi-automated knowledge
acquisition (discovery of features, correlations, and other complex relationships and hypotheses that describe
potentially interesting regularities from large data sets) in many data rich application domains. However, applying
current approaches to machine learning in emerging data rich application domains presents several challenges in
practice:


      (a) Centralized access to data (assumed by most machine learning algorithms) is infeasible because of the large
           size and/or access restrictions imposed by the autonomous data sources. Hence, there is a need for
           knowledge acquisition systems that can perform the necessary analysis of data at the locations where the
           data and the computational resources are available and transmit the results of analysis (knowledge acquired
           from the data) to the locations where they are needed.
      (b) Ontological commitments associated with a data source (that is, assumptions concerning the objects that
           exist in the world, the properties or attributes of the objects, the possible values of attributes, and their
           intended meaning) are determined by the intended use of the data repository (at design time). In addition,
           data sources that are created for use in one context often find use in other contexts or applications.
           Therefore, semantic differences among autonomous data sources are simply unavoidable. Because users
           often need to analyze data in different contexts from different perspectives, there is no single privileged
           ontology that can serve all users, or for that matter, even a single user, in every context. Effective use of
           multiple sources of data in a given context requires reconciliation of such semantic differences from the
           user’s point of view.
      (c) Explicitly associating ontologies with data repositories results in partially specified data, i.e., data that are
           described in terms of attribute values at different levels of abstraction. For example, the program of a
           student in a data source can be specified as Graduate, while the program of a different student in the same
           data source (or even a different data source) can be specified as Doctoral.
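
Challenge (c) can be made concrete with a small sketch. The taxonomy below is a hypothetical attribute value taxonomy for the program attribute; only the values Graduate and Doctoral come from the example above, and the remaining values and all function names are assumptions introduced for illustration.

```python
# Hypothetical attribute value taxonomy, stored as child -> parent.
# Only "Graduate" and "Doctoral" appear in the text; the rest is assumed.
PARENT = {
    "Undergraduate": "Student",
    "Graduate": "Student",
    "Masters": "Graduate",
    "Doctoral": "Graduate",
}


def ancestors(value):
    """Return the value followed by its ancestors, most specific first."""
    chain = [value]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def is_partially_specified(value):
    """A value is partially specified if it is an internal node, not a leaf."""
    return value in PARENT.values()


def generalize(value, level):
    """Lift a value to a coarser level, or return None if level is not an ancestor."""
    return level if level in ancestors(value) else None
```

Under these assumptions, a record with program Graduate is partially specified (it could be Masters or Doctoral), while a record with program Doctoral is fully specified and can be generalized to Graduate when two sources must be compared at a common level of abstraction.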

Against this background, the proposed workshop seeks to bring together researchers in relevant areas of artificial
intelligence (machine learning, data mining, knowledge representation, ontologies), information systems
(information integration, databases, semantic web), distributed computing, and selected application areas (e.g.,
bioinformatics, security informatics, environmental informatics) to address several questions such as:

    1) What are some of the research challenges presented by emerging data-rich application domains such as
       bioinformatics, health informatics, security informatics, social informatics, and environmental informatics?
    2) How can we perform knowledge discovery from distributed data (assuming different types of data
       fragmentation, e.g., horizontal or vertical data fragmentation; different hypothesis classes, e.g., naïve Bayes,
       decision tree, support vector machine classifiers; different performance criteria, e.g., accuracy versus
       complexity versus reliability of the model generated, etc.)?
    3) How can we make semantically heterogeneous data sources self-describing (e.g., by explicitly associating
       ontologies with data sources and mappings between them) in order to help collaborative science from
       autonomous information sources?
    4) How can we represent, manipulate, and reason with ontologies and mappings between ontologies?
    5) How can we learn ontologies from data (e.g., attribute value taxonomies)?
    6) How can we learn mappings between semantically heterogeneous data source schemas and between their
       associated ontologies?
    7) How can we perform knowledge discovery in the presence of ontologies (e.g., attribute value taxonomies)
       and partially specified data (data that are described at different levels of abstraction within an ontology)?
    8) How can we achieve online query relaxation when an initial query posed to the data sources fails (i.e., returns
       no tuples)? That is, how do we perform a query-driven mining of the individual sources that will result in
       knowledge that can be used for query relaxation?
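
As one illustration of question 2, the sketch below shows how a naïve Bayes classifier might be learned from horizontally fragmented data without centralizing the raw records. It assumes that every source holds complete rows over the same attributes and is willing to release only counts; the function names, the sample data, and the use of Laplace smoothing are assumptions made for the example, not a method prescribed by the proposal.

```python
from collections import Counter


def local_counts(rows):
    """Run at each data source: count labels and (attribute, value, label) triples."""
    class_counts = Counter()
    feature_counts = Counter()
    for features, label in rows:
        class_counts[label] += 1
        for attr, value in features.items():
            feature_counts[(attr, value, label)] += 1
    return class_counts, feature_counts


def merge_counts(per_source):
    """Run at the learner: add up the counts received from all sources."""
    total_class, total_feature = Counter(), Counter()
    for class_counts, feature_counts in per_source:
        total_class.update(class_counts)      # Counter.update adds counts
        total_feature.update(feature_counts)
    return total_class, total_feature


def predict(features, class_counts, feature_counts, domain_sizes):
    """Naive Bayes prediction with Laplace smoothing from the merged counts."""
    n = sum(class_counts.values())
    best_label, best_score = None, -1.0
    for label, c in class_counts.items():
        score = c / n
        for attr, value in features.items():
            seen = feature_counts[(attr, value, label)]
            score *= (seen + 1) / (c + domain_sizes[attr])
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Two hypothetical sources holding different rows over the same attribute.
site_a = [({"program": "Doctoral"}, "funded"), ({"program": "Masters"}, "unfunded")]
site_b = [({"program": "Doctoral"}, "funded"), ({"program": "Masters"}, "unfunded"),
          ({"program": "Doctoral"}, "unfunded")]
merged = merge_counts([local_counts(site_a), local_counts(site_b)])
```

Because the counts are additive, the merged statistics are identical to those that would be computed over the pooled data, so only counts, not records, need to leave each source. Vertically fragmented data and other hypothesis classes would need different statistics and are not covered by this sketch.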

Reasons why an ICDM workshop on this topic should take place

As noted above, given the explosive growth in the number and diversity of potentially useful information sources in
many domains, there is an urgent need for sound approaches to integrative and collaborative analysis and
interpretation of distributed, autonomous (and hence, inevitably semantically heterogeneous) data sources. At present,
while several research conferences focus on well-established research areas (e.g., machine learning, data mining,
knowledge representation, databases), there is relatively little interaction among the different research communities.
For example, machine learning researchers working on algorithms for learning predictive models from distributed data
are isolated from the large community of database researchers working on data integration and from the community of
artificial intelligence researchers focused on knowledge representation and inference. Researchers in this area can also
benefit from a better understanding of the specific challenges posed by emerging informatics-enabled application
domains such as bioinformatics, health informatics, security informatics, and environmental informatics.

Fundamental advances in collaborative approaches to knowledge acquisition and data-driven decision making from
distributed, autonomous, semantically heterogeneous data and knowledge sources require synergistic synthesis of
research advances, insights, algorithms, and results in multiple areas of:
     • artificial intelligence – especially machine learning, data mining, knowledge representation and inference,
          intelligent agents and multi-agent systems;
     • information systems – especially databases, information integration, semantic web;
     • distributed computing (e.g., service-oriented computing).

The proposed workshop aims to bring these areas together in order to enable discussion of research problems,
approaches, insights, and results drawn from multiple, and at present largely disparate, areas of artificial intelligence,
computer science, and emerging informatics-enabled disciplines. At present, there is no annual conference or workshop
dedicated to this topic. It is hoped that the resulting exchanges will stimulate further interaction between these
communities and result in the development of new approaches that would advance the current state of the art in
systems for collaborative analysis, interpretation, and decision making from distributed, autonomous, semantically
heterogeneous data and knowledge sources.

Workshop Format

The workshop will consist of:

    •    An opening session for introducing the workshop topics, goals, participants, and expected outcomes
    •    A small number of invited talks carefully intermixed with presentation of contributed papers. The invited
         talks will give overviews of the key topics (learning from distributed data, semantic Web, ontology-based
         information integration, distributed description logics, selected applications, etc.). A possible list of invited
         speakers:
              • Alex Borgida (ontologies and databases) - Rutgers University
              • Katy Borner (information visualization) - Indiana University
              • Foster Provost (machine learning and data mining) - New York University
              • James Hendler (semantic web) - University of Maryland at College Park
              • Alon Halevy (information integration) - University of Washington
              • Dieter Fensel (ontologies) - University of Innsbruck
              • Tom Dietterich (machine learning and environmental informatics) - Oregon State University
              • H. Jagadish (biological data management) - University of Michigan
              • Daphne Koller (probabilistic models) - Stanford University
              • Munindar Singh (service-oriented computing) - North Carolina State University
              • Michael Pazzani (intelligent information systems) - National Science Foundation
    •    Presentations of contributed papers that represent completed work.
    •    Breaks between sessions, meant to encourage informal discussions related to the topics discussed in the
         sessions and to create opportunities for collaborations.
    •    A panel discussion on challenges and future research directions
    •    A wrap-up session summarizing the workshop (including formal or informal discussions).

Description of the anticipated target group(s) of attendees

The workshop is of interest to researchers, students, and practitioners in a number of areas of artificial intelligence,
information systems, and related areas including: machine learning and data mining, information extraction,
information integration, knowledge representation, semantic web, software agents and multi-agent systems, and
service-oriented computing. The workshop is also of interest to researchers and practitioners in emerging informatics-
enabled application domains such as bioinformatics, environmental informatics, health informatics, security
informatics, e-business, and social informatics.

The organizers will make an effort to ensure a good mix of established researchers as well as graduate students and
junior researchers on the one hand and academic and industrial participants on the other.

Potential authors and attendees

A number of researchers who were informally contacted have expressed an interest in the proposed workshop. We put
together a short list of potential participants. (The list below does not include members of the program committee or
participants named on the list of potential invited speakers). We expect the target size of the workshop to be around 40
participants to allow for fruitful interactions and discussion in an informal setting among the workshop participants.

AnHai Doan - University of Illinois at Urbana-Champaign
Lise Getoor - University of Maryland
Barbara Eckman - IBM Life Sciences Solution Development
George Forman - Hewlett Packard Labs
Simon Kasif - Boston University
Zoe Lacroix - Arizona State University
Pat Langley - Stanford University
Bertram Ludaescher - University of California, Davis and San Diego Supercomputer Center
Sanjay Madria - University of Missouri-Rolla
Nina Mishra - Stanford University and IBM
Vibhu Mittal - Google
Joyce Mitchell - University of Utah
Katia Sycara - Carnegie Mellon University
Lee Giles - Pennsylvania State University
Peter Tarczy-Hornoch - University of Washington

Workshop Organizing Committee – Contact Information

Dr. Doina Caragea (Contact Person)
226 Atanasoff Hall
Department of Computer Science
Iowa State University
Ames, IA 50011-1040 USA
dcaragea@cs.iastate.edu
Phone: 1-515-292-3704

Professor Vasant Honavar
226 Atanasoff Hall
Department of Computer Science
Iowa State University
Ames, IA 50011-1040 USA
honavar@cs.iastate.edu
Phone: 1-515-294-4377

Dr. Ion Muslea
Language Weaver, Inc.
4640 Admiralty Way
Suite 1210
Marina del Rey, CA 90292
imuslea@languageweaver.com
Phone: 1-310-437-7300

Professor Raghu Ramakrishnan
Department of Computer Sciences
University of Wisconsin-Madison
1210 West Dayton Street
Madison, WI 53706-1685 USA
raghu@cs.wisc.edu
Phone: 1-608-262-9759

Preliminary Program Committee

Naoki Abe - IBM
Liviu Badea - ICI, Romania
Marie desJardins - University of Maryland, Baltimore County
Tim Finin - University of Maryland, Baltimore County
Joydeep Ghosh - University of Texas
Hillol Kargupta - University of Maryland, Baltimore County
Sally McClean - University of Ulster at Coleraine
Dragos Margineantu - Boeing
Bamshad Mobasher - DePaul University
Jay Modi - Carnegie Mellon University
C. David Page Jr. - University of Wisconsin, Madison
Alexandrin Popescul - Ask Jeeves, Inc.
Adrian Silvescu - Iowa State University
Steffen Staab - University of Koblenz

Previously Organized Related Workshops

    •   IJCAI-2001 Workshop on "Knowledge Discovery from Heterogeneous, Distributed, Autonomous, Dynamic
        Data and Knowledge Sources". Vasant Honavar, Chair.
    •   AAAI-2004 workshop on "Adaptive Text Extraction and Mining". Ion Muslea, Chair.
    •   IJCAI-2001 workshop on "Adaptive Text Extraction and Mining". Ion Muslea, Co-chair.
    •   AAAI-99 workshop on "Machine Learning for Information Extraction". Ion Muslea, Co-Chair.

CALL FOR PAPERS

ICDM 2005 WORKSHOP ON KNOWLEDGE ACQUISITION FROM DISTRIBUTED, AUTONOMOUS, SEMANTICALLY
HETEROGENEOUS DATA AND KNOWLEDGE SOURCES
NOVEMBER 27TH, NEW ORLEANS, LOUISIANA, USA

Important Dates

Aug. 12th: Paper Due
Sept. 4th: Notification
Sept. 26th: Camera Ready
Nov. 27th: Workshop

Workshop Goals

The workshop aims to bring together researchers in relevant areas of artificial intelligence (machine learning, data
mining, knowledge representation, ontologies), information systems (information integration, databases, semantic
web), distributed computing, and selected application areas (e.g., bioinformatics, security informatics, environmental
informatics) to address several questions that arise in the process of knowledge acquisition from distributed,
autonomous, semantically heterogeneous data and knowledge sources.

Topics of Interest

Topics of interest include, but are not restricted to:

    • Challenges presented by emerging data-rich application domains such as bioinformatics, health informatics,
      security informatics, social informatics, environmental informatics.
    • Knowledge discovery from distributed data (assuming different types of data fragmentation, e.g., horizontal or
      vertical data fragmentation; different hypothesis classes, e.g., naïve Bayes, decision tree; different performance
      criteria, e.g., accuracy versus complexity versus reliability of the model generated, etc.).
    • Making semantically heterogeneous data sources self-describing (e.g., by explicitly associating ontologies with
      data sources and mappings between them) in order to help collaborative science.
    • Representation, manipulation, and reasoning with ontologies and mappings between ontologies.
    • Learning ontologies from data (e.g., attribute value taxonomies).
    • Learning mappings between semantically heterogeneous data source schemas and between their associated
      ontologies.
    • Knowledge discovery in the presence of ontologies (e.g., attribute value taxonomies) and partially specified data
      (data described at different levels of abstraction within an ontology).
    • Online query relaxation when an initial query posed to the data sources fails (i.e., returns no tuples), or
      equivalently, query-driven mining of the individual sources that will result in knowledge that can be used for
      query relaxation.

Submission Instructions

Postscript or PDF versions of papers, no more than 10 pages long (including figures, tables, and references) in the
ICDM camera-ready format (IEEE 2-column format), should be submitted electronically to dcaragea@cs.iastate.edu by
August 12th. Each paper will be rigorously refereed by at least 2 reviewers for technical soundness, originality, and
clarity of presentation. Accepted papers will be included in informal workshop proceedings published by ICDM and
distributed at the workshop. More details about the workshop can be found at www.cs.iastate.edu/~dcaragea/ICDM-KA.

Organizing Committee

Doina Caragea, Iowa State University, dcaragea@cs.iastate.edu
Vasant Honavar, Iowa State University, honavar@cs.iastate.edu
Ion Muslea, Language Weaver, Inc., imuslea@languageweaver.com
Raghu Ramakrishnan, University of Wisconsin-Madison, raghu@cs.wisc.edu

Program Committee

Naoki Abe, IBM
Liviu Badea, ICI, Romania
Doina Caragea, Iowa State Univ.
AnHai Doan, UIUC
Marie desJardins, UMBC
Joydeep Ghosh, Univ. of Texas
C. Lee Giles, Penn State Univ.
Vasant Honavar, Iowa State Univ.
Hillol Kargupta, UMBC
Sally McClean, U. of Ulster, UK
Bamshad Mobasher, DePaul U.
Jay Modi, Carnegie Mellon Univ.
C. David Page, Univ. of Wisconsin
Alexandrin Popescul, Ask Jeeves
Raghu Ramakrishnan, Univ. of Wisconsin
Zbigniew Ras, UNC-Charlotte
Steffen Staab, Univ. of Koblenz

Más contenido relacionado

La actualidad más candente

Relationship Web: Trailblazing, Analytics and Computing for Human Experience
Relationship Web: Trailblazing, Analytics and Computing for Human ExperienceRelationship Web: Trailblazing, Analytics and Computing for Human Experience
Relationship Web: Trailblazing, Analytics and Computing for Human ExperienceAmit Sheth
 
A Review: Text Classification on Social Media Data
A Review: Text Classification on Social Media DataA Review: Text Classification on Social Media Data
A Review: Text Classification on Social Media DataIOSR Journals
 
Research Inventy : International Journal of Engineering and Science
Research Inventy : International Journal of Engineering and ScienceResearch Inventy : International Journal of Engineering and Science
Research Inventy : International Journal of Engineering and Scienceresearchinventy
 
Learning Relations from Social Tagging Data
Learning Relations from Social Tagging DataLearning Relations from Social Tagging Data
Learning Relations from Social Tagging DataHang Dong
 
Applying machine learning techniques to big data in the scholarly domain
Applying machine learning techniques to big data in the scholarly domainApplying machine learning techniques to big data in the scholarly domain
Applying machine learning techniques to big data in the scholarly domainAngelo Salatino
 
Creating Effective Data Visualizations for Online Learning
Creating Effective Data Visualizations for Online Learning Creating Effective Data Visualizations for Online Learning
Creating Effective Data Visualizations for Online Learning Shalin Hai-Jew
 
User studies: enquiry foundations and methodological considerations
User studies: enquiry foundations and methodological considerationsUser studies: enquiry foundations and methodological considerations
User studies: enquiry foundations and methodological considerationsGiannis Tsakonas
 
Semantic Interoperability and Information Brokering in Global Information Sys...
Semantic Interoperability and Information Brokering in Global Information Sys...Semantic Interoperability and Information Brokering in Global Information Sys...
Semantic Interoperability and Information Brokering in Global Information Sys...Amit Sheth
 
DoRES — A Three-tier Ontology for Modelling Crises in the Digital Age
DoRES — A Three-tier Ontology for Modelling Crises in the Digital AgeDoRES — A Three-tier Ontology for Modelling Crises in the Digital Age
DoRES — A Three-tier Ontology for Modelling Crises in the Digital AgeGregoire Burel
 
Capitalizing on Machine Reading to Engage Bigger Data
Capitalizing on Machine Reading to Engage Bigger DataCapitalizing on Machine Reading to Engage Bigger Data
Capitalizing on Machine Reading to Engage Bigger DataShalin Hai-Jew
 
An Abridged Version of My Statement of Research Interests
An Abridged Version of My Statement of Research InterestsAn Abridged Version of My Statement of Research Interests
An Abridged Version of My Statement of Research Interestsadil raja
 
How do we know what we don’t know: Using the Neuroscience Information Framew...
How do we know what we don’t know:  Using the Neuroscience Information Framew...How do we know what we don’t know:  Using the Neuroscience Information Framew...
How do we know what we don’t know: Using the Neuroscience Information Framew...Maryann Martone
 
Building a Digital Learning Object w/ Articulate Storyline 2
Building a Digital Learning Object w/ Articulate Storyline 2Building a Digital Learning Object w/ Articulate Storyline 2
Building a Digital Learning Object w/ Articulate Storyline 2Shalin Hai-Jew
 
How to Make Your Content Smarter
How to Make Your Content SmarterHow to Make Your Content Smarter
How to Make Your Content SmarterBianca Pereira
 
Cognitive Retrieval Model
Cognitive Retrieval ModelCognitive Retrieval Model
Cognitive Retrieval ModelFirdaus Rahaman
 
from local/regional OER Silos towards an OER Global Dataspace
from local/regional OER Silos towards an OER Global Dataspacefrom local/regional OER Silos towards an OER Global Dataspace
from local/regional OER Silos towards an OER Global DataspaceOpen Education Consortium
 

La actualidad más candente (20)

Relationship Web: Trailblazing, Analytics and Computing for Human Experience
Relationship Web: Trailblazing, Analytics and Computing for Human ExperienceRelationship Web: Trailblazing, Analytics and Computing for Human Experience
Relationship Web: Trailblazing, Analytics and Computing for Human Experience
 
A Review: Text Classification on Social Media Data
A Review: Text Classification on Social Media DataA Review: Text Classification on Social Media Data
A Review: Text Classification on Social Media Data
 
Research Inventy : International Journal of Engineering and Science
Research Inventy : International Journal of Engineering and ScienceResearch Inventy : International Journal of Engineering and Science
Research Inventy : International Journal of Engineering and Science
 
Learning Relations from Social Tagging Data
Learning Relations from Social Tagging DataLearning Relations from Social Tagging Data
Learning Relations from Social Tagging Data
 
Applying machine learning techniques to big data in the scholarly domain
Applying machine learning techniques to big data in the scholarly domainApplying machine learning techniques to big data in the scholarly domain
Applying machine learning techniques to big data in the scholarly domain
 
Creating Effective Data Visualizations for Online Learning
Creating Effective Data Visualizations for Online Learning Creating Effective Data Visualizations for Online Learning
Creating Effective Data Visualizations for Online Learning
 
What is What, When?
What is What, When?What is What, When?
What is What, When?
 
Shifting from librarian to data manager
Shifting from librarian to data managerShifting from librarian to data manager
Shifting from librarian to data manager
 
User studies: enquiry foundations and methodological considerations
User studies: enquiry foundations and methodological considerationsUser studies: enquiry foundations and methodological considerations
User studies: enquiry foundations and methodological considerations
 
Semantic Interoperability and Information Brokering in Global Information Sys...
Semantic Interoperability and Information Brokering in Global Information Sys...Semantic Interoperability and Information Brokering in Global Information Sys...
Semantic Interoperability and Information Brokering in Global Information Sys...
 
Research Statement
Research StatementResearch Statement
Research Statement
 
DoRES — A Three-tier Ontology for Modelling Crises in the Digital Age
DoRES — A Three-tier Ontology for Modelling Crises in the Digital AgeDoRES — A Three-tier Ontology for Modelling Crises in the Digital Age
DoRES — A Three-tier Ontology for Modelling Crises in the Digital Age
 
Capitalizing on Machine Reading to Engage Bigger Data
Capitalizing on Machine Reading to Engage Bigger DataCapitalizing on Machine Reading to Engage Bigger Data
Capitalizing on Machine Reading to Engage Bigger Data
 
An Abridged Version of My Statement of Research Interests
An Abridged Version of My Statement of Research InterestsAn Abridged Version of My Statement of Research Interests
An Abridged Version of My Statement of Research Interests
 
Improving Tag Clouds
Improving Tag CloudsImproving Tag Clouds
Improving Tag Clouds
 
How do we know what we don’t know: Using the Neuroscience Information Framew...
How do we know what we don’t know:  Using the Neuroscience Information Framew...How do we know what we don’t know:  Using the Neuroscience Information Framew...
How do we know what we don’t know: Using the Neuroscience Information Framew...
 
Building a Digital Learning Object w/ Articulate Storyline 2
Building a Digital Learning Object w/ Articulate Storyline 2Building a Digital Learning Object w/ Articulate Storyline 2
Building a Digital Learning Object w/ Articulate Storyline 2
 
How to Make Your Content Smarter
How to Make Your Content SmarterHow to Make Your Content Smarter
How to Make Your Content Smarter
 
Cognitive Retrieval Model
Cognitive Retrieval ModelCognitive Retrieval Model
Cognitive Retrieval Model
 
from local/regional OER Silos towards an OER Global Dataspace
from local/regional OER Silos towards an OER Global Dataspacefrom local/regional OER Silos towards an OER Global Dataspace
from local/regional OER Silos towards an OER Global Dataspace
 

Destacado

introducción a Machine Learning
introducción a Machine Learningintroducción a Machine Learning
introducción a Machine Learningbutest
 
Machine Learning for Non-technical People
Machine Learning for Non-technical PeopleMachine Learning for Non-technical People
Machine Learning for Non-technical Peopleindico data
 
Künstliche Intelligenz - Maschinelles Lernen - Grundlagen
Künstliche Intelligenz - Maschinelles Lernen - GrundlagenKünstliche Intelligenz - Maschinelles Lernen - Grundlagen
Künstliche Intelligenz - Maschinelles Lernen - GrundlagenHighStreamw
 
Spamming and Spam Filtering
Spamming and Spam FilteringSpamming and Spam Filtering
Spamming and Spam FilteringiNazneen
 
An introduction to Machine Learning
An introduction to Machine LearningAn introduction to Machine Learning
An introduction to Machine Learningbutest
 
Machine Learning and Data Mining: 12 Classification Rules
Machine Learning and Data Mining: 12 Classification RulesMachine Learning and Data Mining: 12 Classification Rules
Machine Learning and Data Mining: 12 Classification RulesPier Luca Lanzi
 
Machine Learning presentation.
Machine Learning presentation.Machine Learning presentation.
Machine Learning presentation.butest
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine LearningRahul Jain
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine LearningLior Rokach
 
Introduction to Big Data/Machine Learning
Introduction to Big Data/Machine LearningIntroduction to Big Data/Machine Learning
Introduction to Big Data/Machine LearningLars Marius Garshol
 
An Introduction to Supervised Machine Learning and Pattern Classification: Th...
An Introduction to Supervised Machine Learning and Pattern Classification: Th...An Introduction to Supervised Machine Learning and Pattern Classification: Th...
An Introduction to Supervised Machine Learning and Pattern Classification: Th...Sebastian Raschka
 

Destacado (13)

introducción a Machine Learning
introducción a Machine Learningintroducción a Machine Learning
introducción a Machine Learning
 
Machine Learning for Non-technical People
Machine Learning for Non-technical PeopleMachine Learning for Non-technical People
Machine Learning for Non-technical People
 
Künstliche Intelligenz - Maschinelles Lernen - Grundlagen
Künstliche Intelligenz - Maschinelles Lernen - GrundlagenKünstliche Intelligenz - Maschinelles Lernen - Grundlagen
Künstliche Intelligenz - Maschinelles Lernen - Grundlagen
 
Spamming and Spam Filtering
Spamming and Spam FilteringSpamming and Spam Filtering
Spamming and Spam Filtering
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
An introduction to Machine Learning
An introduction to Machine LearningAn introduction to Machine Learning
An introduction to Machine Learning
 
Machine Learning and Data Mining: 12 Classification Rules
Machine Learning and Data Mining: 12 Classification RulesMachine Learning and Data Mining: 12 Classification Rules
Machine Learning and Data Mining: 12 Classification Rules
 
Machine Learning presentation.
Machine Learning presentation.Machine Learning presentation.
Machine Learning presentation.
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
Machine Learning for Dummies
Machine Learning for DummiesMachine Learning for Dummies
Machine Learning for Dummies
 
Introduction to Big Data/Machine Learning
Introduction to Big Data/Machine LearningIntroduction to Big Data/Machine Learning
Introduction to Big Data/Machine Learning
 
An Introduction to Supervised Machine Learning and Pattern Classification: Th...
An Introduction to Supervised Machine Learning and Pattern Classification: Th...An Introduction to Supervised Machine Learning and Pattern Classification: Th...
An Introduction to Supervised Machine Learning and Pattern Classification: Th...
 
