The AQUAINT Question Answering (AQUA) System project aims to develop an advanced question answering system that can retrieve and integrate information from multiple sources to provide relevant answers to complex questions. The system will be developed by Science Applications International Corporation with Stanford University as a subcontractor. The goal is to incorporate new question answering technologies that can be used by various government agencies. Key innovations the project expects to contribute include context-relevant search, novel information retrieval techniques, source reliability assessment, conflict resolution, and improved explanation of answers.
AQUAINT Question & Answering System (AQUA) Science Applications International Corporation
Stanford University Knowledge Systems Laboratory
AQUAINT Question Answering (AQUA) System
Project Summary
Prime Contractor: Science Applications International Corporation
With Subcontractor: Stanford University – Knowledge Systems Laboratory
Technical Points of Contact:
Ms. Maureen Caudill
Phone: (858) 826-5743
E-Mail: Maureen.Caudill@saic.com
Fax: (858) 826-5517
10260 Campus Point Court
San Diego, CA 92121
Ms. Barbara Starr
Phone: (858) 826-3047
E-Mail: Barbara.H.Starr@saic.com
Fax: (858) 826-5517
10260 Campus Point Court
San Diego, CA 92121
AQUAINT QUESTION ANSWERING SYSTEM PROJECT SUMMARY
The main goal of the AQUAINT Question Answering (AQUA) System’s technical approach
is to incorporate breakthrough advancements in question-answering technologies that ultimately
can be transitioned for use at a variety of U.S. government agencies. We will develop a
question-answering system that will seek sources of information across a variety of information genres,
assimilate that knowledge into sophisticated knowledge bases, and produce relevant, timely, and
helpful answers to complex questions.
We will create advanced technologies to search and retrieve relevant text from unstructured
text, structured databases, and metadata markups, as well as provide a capability to interpret and
integrate highly colloquial and informal text such as chat room or message board text. We will
provide source reliability assessments for that new data, and will translate its contents from
text to knowledge base representations. The knowledge bases (KBs) will have a sophisticated
suite of tools to automatically partition them into context- and query-sensitive segments, to
reason efficiently across those segments, and to identify and resolve conflicts between new
information and that already stored. Furthermore, we will provide answer explanations that are
carefully pruned and edited for readability, conciseness, and interestingness.
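As a rough illustration of how source reliability assessment and conflict resolution might interact, the following minimal sketch keeps, for each single-valued fact, the assertion from the more reliable source. It is not the AQUA design: the `Assertion` record, the `integrate` function, the numeric reliability scores, and the assumption that each (subject, predicate) pair holds only one value are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Assertion:
    """One extracted fact plus a hypothetical reliability score in [0, 1]."""
    subject: str
    predicate: str
    value: str
    source: str
    reliability: float


KB = Dict[Tuple[str, str], Assertion]


def integrate(kb: KB, new: Assertion) -> Optional[Assertion]:
    """Add `new` to `kb`, resolving conflicts by source reliability.

    Assumes each (subject, predicate) pair is single-valued.
    Returns the assertion that lost a conflict, or None if there was none.
    """
    key = (new.subject, new.predicate)
    old = kb.get(key)
    if old is None:
        kb[key] = new
        return None
    if old.value == new.value:
        # Same claim from two sources: keep the better-attested copy.
        if new.reliability > old.reliability:
            kb[key] = new
        return None
    # Conflicting values: the more reliable source wins; ties keep the stored fact.
    if new.reliability > old.reliability:
        kb[key] = new
        return old
    return new
```

With this sketch, a low-reliability message-board claim that contradicts a stored newswire claim would be returned as the rejected assertion rather than overwriting the KB, which leaves room for a later step to log or review it.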
The SAIC Team is confident that the new technology components that comprise the AQUA
system will achieve the goals of this AQUAINT Program Phase 1 effort. The AQUA system will
be delivered as an integrated component solution, using a variety of technology approaches. We
will advance the existing state of the art in question answering, focusing on important
opportunities to leverage multiple synergistic approaches and encompassing a variety of
promising research topics.
In the course of our AQUA system development efforts, we will operate on multidimensional
data, with a focus primarily on unstructured text, but also including structured data sources
(including numerical and statistical sources), and metadata sources, especially in terms of
DARPA Agent Markup Language (DAML) metadata. As a third data dimension, the AQUA
system will provide access to degraded text, such as that derived from closed-captioning video
sources. The use of message board text, complete with its misspellings and ungrammatical and
colloquial forms, will provide text that is midway between “clean” and “degraded.”
The SAIC Team is dedicated to making the AQUAINT Program a success. To that end, we
have brought together a talented, experienced staff, developed a unique and innovative solution,
and defined a radical new research and development approach that will allow us to achieve the
goals of the “Determining the Answer” technical area of AQUAINT for this first phase of the
program. We look forward to the opportunity of developing the AQUA system for ARDA and of
working with the other contractors and contractor teams not only through this first phase, but
also in succeeding phases of the AQUAINT Program.
The key innovations the SAIC Team expects to contribute toward the AQUAINT Program
goals include the following:
• Context-relevant search and retrieval removes the largest inefficiency in a
question-answering system, and thus reduces the level of effort required by all other
system components.
• Using novel information search and retrieval techniques specially designed to handle
large volumes of documents ensures that the AQUA system will provide robust
functionality and real-world practicality.
• Novel source reliability assessment techniques restrict the use of incorrect
information, allowing a broader and deeper overall understanding of the answer.
Allowing unreliable but potentially information-rich sources gives the AQUA system
the ability to extend beyond formally written documents and interface with informal
text.
• Conflict resolution automates logical consistency checking for data items extracted
and retrieved from unstructured sources.
• Context-aware KB partitioning means the AQUA system need only reference
material within a single KB context, which significantly narrows the potential answer
space that must be processed to determine the answer to a question.
• New algorithms to reason across multiple partitioned KBs will improve the efficiency
of query answering in partitioned KBs. Reasoning efficiently across multiple KB
partitions means that query context can be more closely tracked.
• Techniques to markedly improve and shorten explanation proof trees and the output
of theorem provers will significantly increase overall system ease of use,
believability, and usefulness to analysts.
• The AQUA system’s use of generated markup with embedded semantic content
increases the precision of search results. More accurate search of documents means
returned documents are more relevant to the search intent and thus reduces the effort
needed to determine the final answer.
• The AQUA system in this Phase 1 effort will lay the groundwork for the future
development of techniques offering reliable detection of misstatements in source
documents. This will raise question-answering systems to a new threshold of
achievement and provide a real-world capability that does not exist today. Not all
documents and data sources are truthful, and if misstatements can be flagged early, it
not only prevents corruption of KBs, but also sets the stage for advanced behavioral
modeling and predictions of goals, actions, and intentions of untruthful sources.
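To make the context-aware KB partitioning point more concrete, the sketch below shows how restricting a query to one partition narrows the set of facts that must be examined. It is only an illustration under stated assumptions: the summary describes automatic partitioning, whereas here each fact carries an explicit, hand-assigned context label, and the function names `partition_by_context` and `answer` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Fact = Tuple[str, str, str]              # (subject, predicate, value)
LabeledFact = Tuple[str, str, str, str]  # (context, subject, predicate, value)


def partition_by_context(facts: Iterable[LabeledFact]) -> Dict[str, List[Fact]]:
    """Group facts into partitions keyed by their (assumed) context label."""
    partitions: Dict[str, List[Fact]] = defaultdict(list)
    for context, subject, predicate, value in facts:
        partitions[context].append((subject, predicate, value))
    return partitions


def answer(partitions: Dict[str, List[Fact]], context: str,
           subject: str, predicate: str) -> List[str]:
    """Look up values in a single partition only, ignoring all other contexts."""
    return [value
            for s, p, value in partitions.get(context, [])
            if s == subject and p == predicate]
```

A query scoped to one context scans only that partition's facts, so the work grows with the size of the partition rather than the size of the whole KB. Reasoning across several partitions, as the list above mentions, would need additional machinery that this sketch does not attempt.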