Schedule

    09:00   Registration, poster set-up, and continental breakfast
    09:30   Welcome
    09:45   Invited Talk: Machine Learning in Space
            Kiri L. Wagstaff, N.A.S.A.
    10:15   A General Agnostic Active Learning Algorithm
            Claire Monteleoni, UC San Diego
    10:35   Bayesian Nonparametric Regression with Local Models
            Jo-Anne Ting, University of Southern California
    10:55   Coffee Break
    11:15   Invited Talk: Applying machine learning to a real-world problem:
            real-time ranking of electric components
            Marta Arias, Columbia University
    11:45   Generating Summary Keywords for Emails Using Topics
            Hanna Wallach, University of Cambridge
    12:05   Continuous-State POMDPs with Hybrid Dynamics
            Emma Brunskill, MIT
    12:25   Spotlights
    12:45   Lunch
    14:20   Invited Talk: Randomized Approaches to Preserving Privacy
            Nina Mishra, University of Virginia
    14:50   Clustering Social Networks
            Isabelle Stanton, University of Virginia
    15:10   Coffee Break
    15:30   Invited Talk: Applications of Machine Learning to Image Retrieval
            Sally Goldman, Washington University
    16:00   Improvement in Performance of Learning Using Scaling
            Soumi Ray, University of Maryland Baltimore County
    16:20   Poster Session
    17:10   Panel / Open Discussion
    17:40   Concluding Remarks
Invited Talks

Machine Learning in Space
Kiri L. Wagstaff, N.A.S.A.

      Remote space environments simultaneously present significant challenges to the
      machine learning community and enormous opportunities for advancement. In
      this talk, I present recent work on three key issues associated with machine
      learning in space: on-board data classification and regression, on-board
      prioritization of analysis results, and reliable computing in high-radiation
      environments. Support vector machines are currently being used on-board the
      EO-1 Earth orbiter, and they are poised for adoption by the Mars Odyssey orbiter
      as well. We have developed techniques for learning scientist preferences for
      which subset of images is most critical for transmission, so that we can make the
      most use of limited bandwidth. Finally, we have developed fault-tolerant SVMs
      that can detect and recover from radiation-induced errors while performing
      on-board data analysis.

About the speaker:
      Kiri L. Wagstaff is a senior researcher at the Jet Propulsion Laboratory in
      Pasadena, CA. She is a member of the Machine Learning and Instrument
      Autonomy group, and her focus is on developing new machine learning
      methods that can be used for data analysis on-board spacecraft. She has
      applied these techniques to data being collected by the EO-1 Earth-orbiting
      spacecraft, Mars Odyssey, and Mars Pathfinder. She has also worked on crop
      yield prediction from orbital remote sensing observations, the fault protection
      system for the MESSENGER mission to Mercury, and automatic code
      generation for the Electra radio used by the Mars Reconnaissance Orbiter and
      the Mars Science Laboratory. She is very interested in issues such as
      robustness (developing fault-tolerant machine learning methods for
      high-radiation environments) and infusion (how can machine learning be used
      to advance science?). She holds a Ph.D. in Computer Science from Cornell
      University and is currently working on an M.S. in Geology from the University
      of Southern California.
Applying machine learning to a real-world problem: real-time ranking of electric
components
Marta Arias, Columbia University

      In this talk, I will describe our experience with applying machine learning
      techniques to a concrete real-world problem: the generation of rankings of
      electric components according to their susceptibility to failure. The system's goal
      is to aid operators in the replacement strategy of most at-risk components and in
      handling emergency situations. In particular, I will address the challenge of
      dealing with the concept drift inherent in the electrical system and will describe
      our solution based on a simple weighted-majority voting scheme.

About the speaker:
      Marta Arias received her bachelor's degree in Computer Science from the
      Polytechnic University of Catalunya (Barcelona, Spain) in 1998. After that she
      worked for a year at Incyta S.A. (Barcelona, Spain), a company specializing in
      software products for Natural Language Processing applications. She then
      enrolled in the graduate student program at Tufts University, receiving her PhD
      in Computer Science in 2004. That same year she joined the Center for
      Computational Learning Systems of Columbia University as an Associate
      Research Scientist. Dr. Arias' research interests include the theory and
      application of machine learning.
Randomized Approaches to Preserving Privacy
Nina Mishra, University of Virginia, Microsoft Research

      The Internet is arguably one of the most important inventions of the last century.
      It has altered the very nature of our lives -- the way we communicate, work, shop,
      vote, recreate, etc. The impact has been phenomenal for the machine learning
      community since both old and newly created information repositories, such as
      medical records and web click streams, are readily available and waiting to be
      mined. However, opposite these capabilities and advances is the basic right to
      privacy: On the one hand, in order to best serve and protect its citizens, the
      government should ideally have access to every available bit of societal
      information. On the other hand, privacy is a fundamental right and human need,
      which theoretically is served best when the government knows nothing about the
      personal lives of its citizens. This raises the natural question of whether it is even
      possible to simultaneously realize both of these diametrically opposed goals,
      namely, information transparency and individual privacy. Surprisingly, the answer
      is yes and I will describe solutions where individuals randomly perturb and
      publish their data so as to preserve their own privacy and yet large-scale
      information can still be learned. Joint work with Mark Sandler.

About the speaker:
      Nina Mishra is an Associate Professor in the Computer Science Department at
      the University of Virginia. Her research interests are in data mining and
      machine learning algorithms as well as privacy. She previously held joint
      appointments as a Senior Research Scientist at HP Labs, and as an Acting
      Faculty member at Stanford University. She was Program Chair of the
      International Conference on Machine Learning in 2003 and has served on
      numerous data mining and machine learning program committees. She also
      serves on the editorial boards of Machine Learning, IEEE Transactions on
      Knowledge and Data Engineering, IEEE Intelligent Systems and the Journal of
      Privacy and Confidentiality. She is currently on leave in Search Labs at
      Microsoft Research. She received a PhD in Computer Science from UIUC.
Applications of Machine Learning to Image Retrieval
Sally Goldman, Washington University

      Classic Content-Based Image Retrieval (CBIR) takes a single non-annotated
      query image, and retrieves similar images from an image repository. Such a
      search must rely upon a holistic (or global) view of the image. Yet often the
      desired content of an image is not holistic, but is localized. Specifically, we define
      Localized Content-Based Image Retrieval as a CBIR task where the user is only
      interested in a portion of the image, and the rest of the image is irrelevant. We
      discuss our localized CBIR system, Accio!, that uses labeled images in
      conjunction with a multiple-instance learning algorithm to first identify the desired
      object and re-weight the features, and then to rank images in the database using
      a similarity measure that is based upon individual regions within the image. We
      will discuss both the image representation and multiple-instance learning
      algorithm that we have used in the localized CBIR systems that we have
      developed. We also look briefly at ways in which multiple-instance learning can
      be applied to knowledge-based image segmentation.

About the speaker:
      Dr. Sally Goldman is the Edwin H. Murty Professor of Engineering at
      Washington University in St. Louis and the Associate Chair of the Department
      of Computer Science and Engineering. She received a Bachelor of Science in
      Computer Science from Brown University in December 1984. Under the
      guidance of Dr. Ronald Rivest at the Massachusetts Institute of Technology,
      Dr. Goldman completed her Master of Science in Electrical Engineering and
      Computer Science in May 1987 and her Ph.D. in July 1990. Dr. Goldman's
      research is in the area of algorithm design and analysis and machine learning
      with a recent focus on applications to the area of content-based image
      retrieval. Dr. Goldman has received many teaching awards and honors
      including the Emerson Electric Company Excellence in Teaching Award in
      1999, and the Governor's Award for Excellence in Teaching in 2001. Dr.
      Goldman and her husband, Dr. Ken Goldman, have just completed a book
      titled, A Practical Guide to Data Structures and Algorithms using Java.
Talks

A General Agnostic Active Learning Algorithm
Claire Monteleoni, UC San Diego

      We present a simple, agnostic active learning algorithm that works for any
      hypothesis class of bounded VC dimension, and any data distribution. Most
      previous work on active learning either makes strong distributional assumptions,
      or else is computationally prohibitive. Our algorithm extends a scheme due to
      Cohn, Atlas, and Ladner to the agnostic setting (i.e. arbitrary noise), by (1)
      reformulating it using a reduction to supervised learning and (2) showing how to
      apply generalization bounds even for the non-i.i.d. samples that result from
      selective sampling. We provide a general characterization of the label
      complexity of our algorithm. This quantity is never more than the usual PAC
      sample complexity of supervised learning, and is exponentially smaller for some
      hypothesis classes and distributions. We also demonstrate improvements
      experimentally.

      This is joint work with Sanjoy Dasgupta and Daniel Hsu. Currently in submission,
      but for a full version, please see UCSD tech report:
      http://www.cse.ucsd.edu/Dienst/UI/2.0/Describe/ncstrl.ucsd_cse/CS2007-0898
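
      A minimal, illustrative sketch of the disagreement-based idea described above
      (not the authors' exact algorithm, which applies careful generalization bounds in
      the reduction to supervised learning): with a toy class of 1-D threshold
      classifiers, a label is queried only where the near-best hypotheses still
      disagree, and inferred otherwise. All data and constants here are hypothetical.

      import numpy as np

      rng = np.random.default_rng(0)

      # Hypothesis class: 1-D thresholds h_t(x) = 1 iff x >= t (VC dimension 1).
      thresholds = np.linspace(0.0, 1.0, 201)

      def errors(xs, ys):
          """Empirical error of every threshold on the labeled pairs collected so far."""
          if not xs:
              return np.zeros(len(thresholds))
          xs, ys = np.asarray(xs), np.asarray(ys)
          preds = (xs[None, :] >= thresholds[:, None]).astype(int)
          return (preds != ys[None, :]).mean(axis=1)

      # Unlabeled pool drawn i.i.d.; labels are noisy around 0.5 (agnostic setting).
      pool = rng.uniform(0.0, 1.0, 500)
      labels = ((pool >= 0.5) ^ (rng.uniform(size=500) < 0.1)).astype(int)

      seen_x, seen_y, queries = [], [], 0
      for x, y in zip(pool, labels):
          err = errors(seen_x, seen_y)
          slack = 1.0 / np.sqrt(len(seen_x) + 1)          # stand-in for a generalization bound
          alive = thresholds[err <= err.min() + slack]    # near-best hypotheses
          if alive.min() <= x < alive.max():              # they disagree on x: ask for the label
              seen_x.append(x); seen_y.append(y); queries += 1
          else:                                           # they agree on x: infer the label
              seen_x.append(x); seen_y.append(int(x >= alive.min()))

      best = thresholds[np.argmin(errors(seen_x, seen_y))]
      print(f"queried {queries}/{len(pool)} labels, learned threshold ~ {best:.3f}")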

Bayesian Nonparametric Regression with Local Models
Jo-Anne Ting, University of Southern California

      We propose a Bayesian nonparametric regression algorithm with locally linear
      models for high-dimensional, data-rich scenarios where real-time, incremental
      learning is necessary. Nonlinear function approximation with high-dimensional
      input data is a nontrivial problem. An application example is a high-dimensional
      movement system like a humanoid robot, where real-time learning of internal
      models for compliant control may be needed. Fortunately, many real-world
      data sets tend to have locally low dimensional distributions, despite having high
      dimensional embedding (e.g., Tenenbaum et al. 2000, Roweis & Saul, 2000). A
      successful algorithm, thus, must avoid numerical problems arising potentially
      from redundancy in the input data, eliminate irrelevant input dimensions, and be
      computationally efficient to allow for incremental, online learning.

      Several methods have been proposed for nonlinear function approximation, such
      as Gaussian process regression (Williams & Rasmussen, 1996), support vector
      regression (Smola & Schölkopf, 1998) and variational Bayesian mixture models
      (Ghahramani & Beal, 2000). However, these global methods tend to be
      unsuitable for fast, incremental function approximation. Atkeson, Moore & Schaal
      (1997) have shown that in such scenarios, learning with spatially localized
      models is more appropriate, particularly in the framework of locally weighted
      learning.
In recent years, Vijayakumar & Schaal (2000) have introduced a learning
     algorithm designed to fulfill the fast, incremental requirements of locally weighted
     learning, specifically targeting high-dimensional input domains through the use of
      local projections. This algorithm, called Locally Weighted Projection Regression
      (LWPR), performs competitively in its generalization performance with
      state-of-the-art batch regression methods. It has been applied successfully to
     sensorimotor learning on a humanoid robot for the purpose of executing fast,
     accurate movements in a feedforward controller.

      The major issue with LWPR is that it requires gradient descent (with
      leave-one-out cross-validation) to optimize the local distance metrics in each local
     regression model. Since gradient descent search is sensitive to the initial values,
     we propose a novel Bayesian treatment of locally weighted regression with
     locally linear models that eliminates the need for any manual tuning of meta
     parameters, cross-validation approaches or sampling. Combined with variational
     approximation methods to allow for fast, tractable inference, this Bayesian
     algorithm learns the optimal distance metric value for each local regression
      model. It is able to automatically determine the size of the neighborhood data
      (i.e., the "bandwidth") that should contribute to each local model. A Bayesian
     approach offers error bounds on the distance metrics and incorporates this
     uncertainty in the predictive distributions. By being able to automatically detect
      relevant input dimensions, our algorithm is able to handle high-dimensional data
     sets with a large number of redundant and/or irrelevant input dimensions and a
     large number of data samples. We demonstrate competitive performance of our
     Bayesian locally weighted regression algorithm with Gaussian Process
     regression and LWPR on standard benchmark sets. We also explore extensions
     of this locally linear Bayesian algorithm to a real-time setting, to offer a
     parameter-free alternative for incremental learning in high-dimensional spaces.
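
      For readers unfamiliar with the locally weighted setting, here is a minimal sketch
      of locally weighted linear regression with a hand-set Gaussian distance metric
      (the bandwidth that the Bayesian algorithm above learns automatically). It
      illustrates the general framework, not the proposed method; the data and
      bandwidth are toy placeholders.

      import numpy as np

      def lwr_predict(query, X, y, bandwidth=0.2):
          """Prediction of a single locally weighted linear model centered at `query`."""
          d = X - query                                   # local coordinates
          w = np.exp(-0.5 * np.sum(d ** 2, axis=1) / bandwidth ** 2)   # Gaussian weights
          A = np.hstack([d, np.ones((len(X), 1))])        # local linear model plus offset
          W = np.diag(w)
          beta = np.linalg.solve(A.T @ W @ A + 1e-6 * np.eye(A.shape[1]), A.T @ W @ y)
          return beta[-1]                                 # the offset is the prediction at `query`

      # Toy 1-D example: a noisy sine wave.
      rng = np.random.default_rng(1)
      X = rng.uniform(0, 2 * np.pi, (200, 1))
      y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)
      print(lwr_predict(np.array([np.pi / 2]), X, y))     # should be close to 1.0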

Generating Summary Keywords for Emails Using Topics.
Hanna Wallach, University of Cambridge

     Email summary keywords, used to concisely represent the gist of an email, can
     help users manage and prioritize large numbers of messages. Previous work on
     email keyword selection has focused on a two-stage supervised learning system
     that selects nouns from individual emails using pre-defined linguistic rules [1]. In
     this work we present an unsupervised learning framework for selecting email
     summary keywords. A good summary keyword for an email message is not best
     characterized as a word that is unique to that message, but a word that relates
     the message to other topically similar messages. We therefore use latent
     representations of the underlying topics in a user's mailbox to find words that
     describe each message in the context of existing topics rather than selecting
     keywords based on a single message in isolation. We present and compare
     several methods for selecting email summary keywords, based on two well-
known models for inferring latent topics: latent semantic analysis (LSA) and
     latent Dirichlet allocation (LDA).

     Summary keywords for an email message are generated by selecting the
     words that are most topically similar to the words in the email. We use two
     approaches for selecting these words, one based on query-document similarity,
     and the other based on word association. Each approach may be used in
     conjunction with either LSA or LDA. We evaluate keyword quality by generating
     summaries for emails from twelve users in the Enron corpus and comparing each
      method's performance with a TF-IDF baseline. The quality of the keywords is
      assessed using two proxy tasks, in which the summaries are used in place of
     whole messages: recipient prediction and foldering. In the recipient prediction
     task, the keywords for each email are used to predict the intended recipients of
     the current message. In the foldering task, each user's email messages are
     sorted into folders using the selected keywords as features. Our topic-based
     methods out-perform TF-IDF on both tasks, demonstrating that topic-based
     methods yield better summary keywords. By selecting keywords based on user-
     specific topics, we find summaries that represent each message in the context of
     the entire mailbox, not just that of a single message. Furthermore, combining the
     summary for an email with the email's subject improves foldering and recipient
     prediction results over those obtained using either summaries or subjects alone.

     References:
     [1] S. Muresan, E. Tzoukermann, and J. Klavans (2001). Combining
     linguistic and machine learning techniques for email
     summarization. CONLL.
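
      As a rough illustration of the topic-based selection described above, the sketch
      below uses scikit-learn's LatentDirichletAllocation as a stand-in topic model and
      scores each vocabulary word by how probable it is under the email's inferred
      topic mixture (one simple variant of the word-association approach). The
      mailbox, parameters, and helper names are hypothetical, not the paper's setup.

      import numpy as np
      from sklearn.feature_extraction.text import CountVectorizer
      from sklearn.decomposition import LatentDirichletAllocation

      # Toy stand-in for a user's mailbox; in practice this is every message the user has.
      mailbox = [
          "budget meeting friday finance review quarterly numbers",
          "ski trip this weekend snow cabin friends",
          "finance report numbers spreadsheet budget approval",
          "weekend plans cabin trip snow gear",
      ]

      vectorizer = CountVectorizer()
      counts = vectorizer.fit_transform(mailbox)
      lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)

      def summary_keywords(email_text, n_keywords=3):
          """Rank words by p(word | email) = sum_t p(topic t | email) * p(word | topic t)."""
          theta = lda.transform(vectorizer.transform([email_text]))[0]        # topic mixture
          phi = lda.components_ / lda.components_.sum(axis=1, keepdims=True)  # word distributions
          scores = theta @ phi
          vocab = np.array(vectorizer.get_feature_names_out())
          return list(vocab[np.argsort(scores)[::-1][:n_keywords]])

      print(summary_keywords("quarterly budget numbers due friday"))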

Continuous-State POMDPs with Hybrid Dynamics
Emma Brunskill, MIT

     Partially observable Markov decision processes (POMDPs) provide a rich
     framework for describing many important planning problems that arise in
     situations with hidden state and stochastic actions. Most previous work has
     focused on solving POMDPs with discrete state, action and observation spaces.
     However, in a number of applications, such as navigation or robotic grasping, the
     world is most naturally represented using continuous states. Though any
     continuous domain can be described using a sufficiently fine grid, the number of
     discrete states grows exponentially with the dimensionality of the underlying state
     space. Existing discrete state POMDP algorithms can only scale up to the order
     of a few thousand states, beyond which they become computationally infeasible.
     Therefore, approaches for dealing efficiently with continuous-state POMDPs are
     of great interest.

     Previous work (such as [1]) on planning for continuous-state POMDPs has
     typically modeled the world dynamics using a single linear Gaussian model to
     describe the effects of an action. Unfortunately, this model is not powerful
enough to represent the multi-modal state-dependent dynamics that arise in a
       number of problems of interest. For example, in legged locomotion the different
       "modes" of walking and running are described best by significantly different
       dynamics. We instead employ a hybrid dynamics model for continuous-state
       POMDPs that can represent stochastic state-dependent distributions over a
       number of different linear dynamic models. We developed a new point-based
       approximation algorithm for solving these hybrid-dynamics POMDP planning
        problems that builds on Porta et al.'s continuous-state point-based approach [1].
       One nice attribute of our algorithm is that by representing the value function and
       belief states using a weighted sum of Gaussians, the belief state updates and
       value function backups can be computed in closed form. An additional
       contribution of our work is a new procedure for constructing a better
       approximation of the alpha functions composing the value function. We
       conducted experiments on a set of small problems to illustrate how the
       representational power of the hybrid dynamics model allows us to address
       problems not previously solvable by existing continuous-state approaches. In
       addition, we examined the toy problem of a simulated robot searching blindly (no
       observations) for a power supply in a long hallway. This problem requires a
       variable level of representational granularity in order to perform well. Here our
       hybrid continuous-state planner outperforms a discrete state POMDP planner,
       demonstrating the potential of continuous-state approaches.
       [1] J. Porta, M. Spaan, N. Vlassis, and P. Poupart. Point-based value iteration
       for continuous POMDPs. Journal of Machine Learning Research, 7:2329-2367,
       2006
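
        To make the closed-form claim above concrete, here is a minimal sketch of the
        prediction step of the belief update when the belief is a weighted sum of
        Gaussians and each dynamics mode is linear-Gaussian: every (component, mode)
        pair yields another Gaussian, so the belief stays a mixture. This is an
        illustrative fragment with toy 1-D parameters, not the authors' point-based
        algorithm, which must also bound the number of mixture components.

        import numpy as np

        # Belief state: weighted sum of 1-D Gaussians as (weight, mean, variance) triples.
        belief = [(0.6, 0.0, 0.5), (0.4, 3.0, 1.0)]

        # Hybrid dynamics for one action: a distribution over linear-Gaussian modes
        # x' = a * x + b + noise, with mode probabilities p.
        modes = [(0.7, 1.0, 1.0, 0.2),    # (p, a, b, process noise variance): move forward
                 (0.3, 1.0, -0.5, 0.2)]   # slip backward

        def predict(belief, modes):
            """Closed-form prediction: propagate every belief component through every mode."""
            new_belief = []
            for w, mu, var in belief:
                for p, a, b, q in modes:
                    new_belief.append((w * p, a * mu + b, a * a * var + q))
            return new_belief

        for w, mu, var in predict(belief, modes):
            print(f"weight={w:.2f}  mean={mu:+.2f}  var={var:.2f}")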

Clustering Social Networks
Isabelle Stanton, University of Virginia

       Social networks have gained popularity recently with the advent of sites such as
       MySpace, Friendster, Facebook, etc. The number of users participating in these
       networks is large, e.g., a hundred million in MySpace, and growing. These
       networks are a rich source of data as users populate their sites with personal
       information. Of particular interest in this paper is the graph structure induced by
       the friendship links.

       A fundamental problem related to these networks is the discovery of clusters or
       communities. Intuitively, a cluster is a collection of individuals with dense
       friendship patterns internally and sparse friendships externally. There are many
       reasons to seek tightly-knit communities in networks, for instance, target
       marketing schemes can be designed based on clusters and terrorist cells can be
       uncovered.

       Existing clustering criteria are limited in that clusters typically do not overlap, all
       vertices are clustered and/or external sparsity is ignored. We introduce a new
       criterion that overcomes these limitations by combining internal density with
        external sparsity in a natural way. Our criterion does not require a strict
        partitioning of the data, which is particularly important in social networks, where
      one user may be a member of many communities.

      This work focuses on the combinatorial properties of the new criterion. In
      particular, we bound the amount that clusters can overlap, as well as find a loose
      bound for the number of clusters in a graph. From these properties we have
      developed deterministic and randomized algorithms for provably finding the
      clusters, provided there is a sufficiently large gap between internal density and
       external sparsity. Finally, we perform experiments on real social networks to
       illustrate the effectiveness of the algorithm.
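
       A minimal sketch of the two quantities the criterion combines, internal density
       and external sparsity, computed for candidate clusters of a toy friendship graph.
       The exact criterion, its overlap bounds, and the gap condition are in the paper;
       this only shows the bookkeeping, and the graph is hypothetical.

       from itertools import combinations

       # Toy friendship graph as an undirected edge set.
       edges = {("a", "b"), ("a", "c"), ("b", "c"),   # a tight triangle
                ("c", "d"), ("d", "e")}               # a sparse attachment
       adj = {}
       for u, v in edges:
           adj.setdefault(u, set()).add(v)
           adj.setdefault(v, set()).add(u)

       def internal_density(cluster):
           """Fraction of possible friendships inside the cluster that actually exist."""
           pairs = list(combinations(sorted(cluster), 2))
           present = sum(1 for u, v in pairs if v in adj.get(u, set()))
           return present / len(pairs) if pairs else 1.0

       def external_sparsity(cluster):
           """Fraction of the cluster's friendship edges that leave it (lower is better)."""
           inside = sum(1 for u in cluster for v in adj.get(u, set()) if v in cluster) / 2
           leaving = sum(1 for u in cluster for v in adj.get(u, set()) if v not in cluster)
           total = inside + leaving
           return leaving / total if total else 0.0

       for cluster in [{"a", "b", "c"}, {"c", "d", "e"}]:
           print(sorted(cluster), internal_density(cluster), external_sparsity(cluster))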


Improvement in Performance of Learning Using Scaling
Soumi Ray, University of Maryland Baltimore County

      Reinforcement learning often requires many training iterations to get an optimal
      policy. We are interested in trying to speed up learning in a domain using scaling,
      which works as follows: partial learning is performed to learn a sub-optimal action
      value function, Q, in the domain using standard Q-learning for a few iterations. The
      Q-values of Q are then multiplied by a constant factor to scale the Q-values.
      Then learning continues using the scaled Q-values of the new Q-table as the
      initial values. Surprisingly, in many situations this scaling significantly reduces the
      number of iterations required to learn compared to learning without scaling.

      We can summarize our method of scaling in the following steps:
      1. Partial learning is done in the domain.
      2. The Q-values of the partially learned domain are scaled, using a scaling factor
      decided manually.
      3. Finally learning in the domain is carried out using the new scaled Q-values.

      This method can reduce the number of steps required to learn in the domain
      compared to learning without scaling. Two important aspects of scaling are the
      scaling factor and the time of scaling. If the scaling factor and the time of scaling
      are chosen correctly then we can get great improvements in the performance of
      learning in a domain. We have used 10×10 grid world domains with the starting
      position at the top left corner and the goal at the bottom right corner to run our
      experiments.
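
      A minimal sketch of the three steps above on a small deterministic grid world
      with tabular Q-learning. The scaling factor, switch point, and learning
      parameters are hypothetical choices for illustration, not the values used in the
      experiments.

      import numpy as np

      rng = np.random.default_rng(0)
      N, ACTIONS = 10, [(-1, 0), (1, 0), (0, -1), (0, 1)]   # 10x10 grid, four moves
      GOAL = (N - 1, N - 1)

      def step(s, a):
          r, c = max(0, min(N - 1, s[0] + a[0])), max(0, min(N - 1, s[1] + a[1]))
          return (r, c), (10.0 if (r, c) == GOAL else -1.0)

      def q_learn(Q, episodes, alpha=0.5, gamma=0.95, eps=0.1):
          for _ in range(episodes):
              s = (0, 0)
              while s != GOAL:
                  a = rng.integers(4) if rng.random() < eps else int(np.argmax(Q[s]))
                  s2, r = step(s, ACTIONS[a])
                  Q[s][a] += alpha * (r + gamma * np.max(Q[s2]) - Q[s][a])
                  s = s2
          return Q

      Q = {(i, j): np.zeros(4) for i in range(N) for j in range(N)}
      Q = q_learn(Q, episodes=20)        # step 1: partial learning
      for s in Q:                        # step 2: scale the partially learned Q-values
          Q[s] *= 2.0                    #         (scaling factor chosen by hand)
      Q = q_learn(Q, episodes=200)       # step 3: continue learning from the scaled table
      print("greedy value at the start state:", np.max(Q[(0, 0)]))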

A Theory of Similarity Functions for Clustering
Maria-Florina Balcan, Carnegie Mellon University

      Problems of clustering data from pairwise similarity information are ubiquitous in
      Computer Science. Theoretical treatments typically view the similarity information
      as ground-truth and then design algorithms to (approximately) optimize various
      graph-based objective functions. However, in most applications, this similarity
      information is merely based on some heuristic: the true goal is to cluster the
points correctly rather than to optimize any specific graph property. In this work,
we initiate a theoretical study of the design of similarity functions for clustering
from this perspective. In particular, motivated by recent work in learning theory
that asks "what natural properties of a similarity function are sufficient to be able
to learn well?" we ask "what natural properties of a similarity function are
sufficient to be able to cluster well?"

We develop a notion of the clustering complexity of a given property (analogous
to notions of capacity in learning theory), that characterizes its information-
theoretic usefulness for clustering. We then analyze this complexity for several
natural game-theoretic and learning-theoretic properties, as well as design
efficient algorithms that are able to take advantage of them. We consider two
natural clustering objectives: (a) list clustering: analogous to the notion of list-
decoding, the algorithm can produce a small list of clusterings (which a user can
select from) and (b) hierarchical clustering: the desired clustering is some
pruning of this tree (which a user could navigate). Our algorithms for hierarchical
clustering combine recent learning-theoretic approaches with linkage-style
methods.

This is joint work with Avrim Blum and Santosh Vempala.
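
      A minimal sketch of the hierarchical-clustering objective described above: given
      only a pairwise similarity function, repeatedly merge the two most similar
      clusters (a linkage-style method), producing a merge tree whose prunings are
      the candidate clusterings. The similarity function and data here are toy
      placeholders, not the properties analyzed in the paper.

      import numpy as np

      def linkage_tree(points, similarity):
          """Greedy single-linkage merging; records every merge so the tree can be pruned later."""
          clusters = [[i] for i in range(len(points))]
          merges = []
          while len(clusters) > 1:
              best, pair = -np.inf, None
              for a in range(len(clusters)):
                  for b in range(a + 1, len(clusters)):
                      s = max(similarity(points[i], points[j])
                              for i in clusters[a] for j in clusters[b])
                      if s > best:
                          best, pair = s, (a, b)
              a, b = pair
              merges.append((clusters[a][:], clusters[b][:], best))
              clusters[a] = clusters[a] + clusters[b]
              del clusters[b]
          return merges

      rng = np.random.default_rng(0)
      pts = np.vstack([rng.normal(0, 0.3, (4, 2)), rng.normal(5, 0.3, (4, 2))])  # two blobs
      sim = lambda p, q: -np.linalg.norm(p - q)          # toy similarity: negative distance
      for left, right, s in linkage_tree(pts, sim):
          print(left, "+", right, f"(similarity {s:.2f})")
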
Spotlights

Advancing Associative Classifiers - Challenges and Solutions
Luiza Antonie, University of Alberta

      In recent years, associative classifiers, classifiers that use association rules,
      have started to attract attention. An important advantage that these classification
      systems bring is that, using association rule mining, they are able to examine
      several features at a time, while other state-of-the-art methods, like decision
      trees or naive Bayesian classifiers, assume that each feature is independent of
      the others. However, in real-life applications, the independence assumption is
      not necessarily true, and it has been shown that correlations and co-occurrence of
      features can be very important. In addition, associative classifiers can handle
      a large number of features, while other classification systems do not work well for
      high-dimensional data. Associative classification systems have proved to perform
      as well as, or even better than, other techniques in the literature. The associative
      classifiers are models that can be read, understood, modified by humans and
      thus can be manually enriched with domain knowledge.

      We have proposed the integration of new types of association rules and new
      methods to reduce the number of rules in the model. In our research work we
      studied the behaviour of associative classifiers when negative association rules,
      maximal and closed itemsets are employed. These types of association rules
      have not been used in associative classifiers before, thus bringing new
      challenges and opportunities to our work. Given that one advantage of the
      classifiers based on association rules is their readability, another direction that
      we investigated is reducing the number of association rules used in the
      classification model. Pruning of rules not only improves readability, but it may
      minimize overfitting of the model as well. Another challenge is the use of rules in
      the classification stage. We proposed a new technique where the system
      automatically learns how to use the rules.

      Many applications can benefit from a good classification model. Given the
      readability of the associative classifiers, they are especially well suited to applications
      where the model may assist domain experts in their decisions. The medical field is a
      good example where such applications may appear. Let us consider an example
      where a physician has to examine a patient. There is a considerable amount of
      information associated with the patient (e.g. personal data, medical tests, etc.). A
      classification system can assist the physician in this process. The system can
      predict if the patient is likely to have a certain disease or present incompatibility
      with some treatments. Considering the output of the classification model, the
      physician can make a better decision on the treatment to be applied to this
      patient. Given the transparency of our model, a health practitioner can
      understand how the classification model reached its decision.
      Real-life applications are usually characterized by unbalanced datasets. Classes
      of interest may be under-represented, thus making the discovery of knowledge
      associated with them harder. We evaluated the performance of our system
      under these difficult conditions. We studied the performance of our classification
      model on real-life applications (mammography classification, text categorization,
      preterm birth prediction) where the classes of interest are typically under-
      represented.

      This is joint work with my supervisors, Osmar R. Zaiane and Robert C.
      Holte.

Learning to Predict Prices in a Supply Chain Management Game
Shuo Chen, UC Berkeley

      Economic decisions can benefit greatly from accurate predictions of market
      prices, but making such predictions is a difficult problem and an area of active
      research. In this paper, we present and compare several techniques for
      predicting market prices that we have employed in the Trading Agent Competition
      Supply Chain Management (TAC SCM) Prediction Challenge. These strategies
      include simple heuristics and various machine learning approaches, such as
      simple perceptrons and support vector regression. We show that the heuristic
      methods are very good, especially for predicting current prices, but that the
      machine learning techniques may be more appropriate for future price
      predictions.

Sonar Terrain Mapping with BDI Agents
Shivali Gupta, University of Maryland, Baltimore County

      Mapping a constantly changing environment is a challenge that necessitates a
      team of agents working together. These agents must continually explore the
      terrain and assemble the map in a distributed fashion. In a real-world instance of
      this problem, such as surveillance, agents have limited sensor and
      communication ranges, further compounding the problem.

      Our solution is to create multiple “Explorer" agents and a centralized “Base
      station" agent using the BDI architecture. The BDI architecture provides a
      framework for agents that have their individual beliefs, desires and intentions
      (goals). The environment is ripe with uncertainty given its continually changing
      nature which makes BDI architecture well suited to this problem. Mobile Explorer
      agents have limited range of communication and partial observability of the
      environment. The Base station agent is stable and it maintains the global map of
      the environment from the information of the Explorer agents. Explorer agents use
      the Base station’s global map (its beliefs about the world) to decide which area to
      explore next, and after exploration they send their updated map to the Base
      station agent. The Base station agent merges its copy with the information
      received from the explorer agent. The Explorer agents must stay within
      communication range of each other to maintain a complete communication
      network between all agents and the base station.

      The system models the environment as a grid of cells and the Base station
      assigns each cell a "Curiosity level", based on how long it has been since that
      region was explored. A higher curiosity level implies that the cell has not been
      explored recently. Therefore, the curiosity level drives exploration toward
      regions of uncertainty. Explorer agents calculate a force vector,

            force_vector = Σ_{every cell} distance_based_penalty * curiosity_value * unit_vector   (1)

      where distance_based_penalty is the inverse of the manhattan distance of cells
      from agents, to find the direction to explore. This calculation ensures that not all
      the agents move in one direction at the same time. One of the major advantages
      of this distributed approach is that a failure of an agent does not affect the
      system in general. If an Explorer agent fails, then the other agents can still
      continue to explore the environment.

      The results show that more Explorer agents prevent the average curiosity level
      from rising at a fast pace, and that the average curiosity level eventually
      stabilizes after a limited number of Explorer agents explore the map. Another
      result shows that a distance penalty based on the manhattan distance provides a
      better solution because it allows Explorer agents to explore the local area around
      them, as well as the outer edges of the map, in comparison to a penalty based on
      euclidian distance, which localizes the search procedure. In our future work, we
      are interested in adding a learning mechanism to the algorithm which would
      enable Explorer agents to predict the changing behavior of the environment and
      how to explore it optimally. Learning would also enable Explorer agents to avoid
      obstacles in their environment.
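
      A minimal sketch of equation (1) on a toy curiosity grid: the Explorer sums, over
      every cell, the cell's curiosity value scaled by the inverse Manhattan-distance
      penalty, along the unit vector from the agent toward that cell. The grid size and
      curiosity values are hypothetical.

      import math

      # Toy 5x5 curiosity grid: entry [r][c] is how long it has been since cell (r, c) was explored.
      curiosity = [[(r * 5 + c) % 7 for c in range(5)] for r in range(5)]

      def force_vector(agent):
          """Equation (1): sum over every cell of distance_based_penalty * curiosity_value * unit_vector."""
          fx = fy = 0.0
          ar, ac = agent
          for r, row in enumerate(curiosity):
              for c, value in enumerate(row):
                  if (r, c) == (ar, ac):
                      continue                                  # no pull from the agent's own cell
                  penalty = 1.0 / (abs(r - ar) + abs(c - ac))   # inverse Manhattan distance
                  norm = math.hypot(r - ar, c - ac)
                  fx += penalty * value * (r - ar) / norm       # unit vector toward the cell
                  fy += penalty * value * (c - ac) / norm
          return fx, fy

      print(force_vector((0, 0)))   # an Explorer in the corner is pulled toward unexplored cells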

Online Learning for OffRoad Robots
Raia Hadsell, NYU

     We present a learning-based solution to the problem of long-range obstacle
     detection in autonomous robots. The system uses sparse traversability
     information from a stereo module to train a classifier online. The trained classifier
     can then predict the traversability of the entire scene. This learning strategy is
     called self-supervised, near-to-far learning, and, if it is done in an online manner,
     it allows the robot to adapt to changing environments and still accurately predict
     the traversability of distant areas.

     A distance-normalized image pyramid makes it possible to efficiently train on
     each frame seen by the robot, using large windows that contain contextual
      information as well as shape, color, and texture. Traversability labels are initially
     obtained for each target using a stereo module, then propagated to other views
     of the same target using temporal and spatial concurrences, thus training the
classifier to be view-invariant. A ring buffer simulates short-term memory and
ensures that the discriminative learning is balanced and consistent. This long-
range obstacle detection system sees obstacles and paths at 30-40 meters, far
beyond the maximum stereo range of 12 meters, and adapts very quickly to new
environments.
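
      A minimal sketch of the near-to-far training loop described above: short-range
      stereo labels fill a fixed-size ring buffer, and an online linear classifier is
      updated on the buffer every frame so that it can then label far-away windows.
      The features, labels, and classifier here are synthetic stand-ins, not the actual
      image features or LAGR data.

      import numpy as np
      from collections import deque
      from sklearn.linear_model import SGDClassifier

      rng = np.random.default_rng(0)
      ring_buffer = deque(maxlen=500)           # short-term memory of (feature, stereo label) pairs
      clf = SGDClassifier(random_state=0)       # online linear classifier

      def fake_frame(n=50, dim=8):
          """Stand-in for per-window image features and near-field stereo traversability labels."""
          X = rng.normal(size=(n, dim))
          y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)   # hidden "traversable" rule
          return X, y

      for frame in range(30):
          X_near, y_near = fake_frame()         # labels exist only where stereo reaches
          ring_buffer.extend(zip(X_near, y_near))
          X_buf = np.array([x for x, _ in ring_buffer])
          y_buf = np.array([y for _, y in ring_buffer])
          clf.partial_fit(X_buf, y_buf, classes=np.array([0, 1]))   # incremental update

      X_far, y_far = fake_frame(200)            # far-field windows have no stereo labels
      print("far-field accuracy:", (clf.predict(X_far) == y_far).mean())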

Experiments were run on the LAGR (Learning Applied to Ground Robots) robot
platform. Both the robot and the reference ``baseline'' software were built by
Carnegie Mellon University and the National Robotics Engineering Center. In this
program, in which all participants are constrained to use the given hardware, the
goal is to drive from a given start to a predefined (GPS) goal position through
unknown, offroad terrain using only passive vision. Both qualitative and
quantitative results are given by comparing the field performance of the robot
with and without learning-based, long-range vision enabled.
Posters
Untitled
Mair Allen-Williams, University of Southampton

      Two particular challenges faced by agents within dynamic, uncertain multi-agent
      systems are learning and acting in uncertain environments, and coordination with
      other agents about whom they may have little or no knowledge. Although
      uncertainty and coordination have each been tackled as separate problems,
      existing formal models for an integrated approach make a number of simplifying
      assumptions, and often have few guarantees. In this report we explore the
      extension of a Bayesian learning model into partially observable multi-agent
      domains. In order to implement such a model practically we make use of a
      number of approximation techniques. In addition to traditional methods such as
      repair sampling and state clustering, we apply graphical inference methods within
      the learning step to propagate information through partially observable nodes.
      We demonstrate the scalability of this approach with an ambulance rescue
      problem inspired by the Robocup Rescue system.

Supervised Learning by Training on Aggregate Outputs
Janara Christensen, Carleton College

      Supervised learning is a classic data mining problem where one wishes to be
      able to predict an output value associated with a particular input vector. We
      present a new twist on this classic problem where, instead of having the training
      set contain an individual output value for each input vector, the output values in
      the training set are only given in aggregate over a number of input vectors. This
      new problem arose from a particular need in learning on mass spectrometry
      data, but could easily apply to situations when data has been aggregated in order
      to maintain privacy. We provide a formal description of this new problem for both
      classification and regression. We then examine how k-nearest neighbor, neural
      networks, and support vector machines can be adapted for this problem.
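
      A minimal sketch of one simple way to adapt k-nearest-neighbor regression to
      aggregate outputs (not necessarily the adaptation studied in the paper): every
      input vector in a bag inherits the bag's aggregate output as a pseudo-target,
      after which ordinary kNN regression is applied. Data and parameters are toy
      placeholders.

      import numpy as np
      from sklearn.neighbors import KNeighborsRegressor

      rng = np.random.default_rng(0)
      true_w = np.array([1.0, -2.0, 0.5])            # hidden per-instance rule

      # Training data arrives as bags of inputs with only an aggregate (mean) output per bag.
      bags = []
      for _ in range(40):
          X_bag = rng.normal(size=(5, 3))
          bags.append((X_bag, float(np.mean(X_bag @ true_w))))

      # Naive adaptation: each instance is given its bag's aggregate output as a pseudo-target.
      X_train = np.vstack([X for X, _ in bags])
      y_train = np.concatenate([np.full(len(X), y) for X, y in bags])
      knn = KNeighborsRegressor(n_neighbors=5).fit(X_train, y_train)

      X_test = rng.normal(size=(200, 3))
      print("mean absolute error:", np.abs(knn.predict(X_test) - X_test @ true_w).mean())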

Disparate Data Fusion for Protein Phosphorylation Prediction
Genetha Gray, Sandia National Labs

      New challenges in knowledge extraction include interpreting and classifying data
      sets while simultaneously considering related information to confirm results or
      identify false positives. We discuss a data fusion algorithmic framework targeted
      at this problem. It includes separate base classifiers for each data type and a
      fusion method for combining the individual classifiers. The fusion method is an
      extension of current ensemble classification techniques and has the advantage
      of allowing data to remain in heterogeneous databases. In this poster, we focus
      on the applicability of such a framework to the protein phosphorylation prediction
      problem and show some numerical results.
Real Boosting a la Carte with an Application to Boosting Oblique Decision Trees
Claudia Henry, Université des Antilles et de la Guyane

      In the past ten years, boosting has become a major field of machine learning and
      classification. We bring contributions to its theory and algorithms. We first unify a
      well-known top-down decision tree induction algorithm due to Kearns and
      Mansour, and discrete AdaBoost, as two versions of the same higher-level
      boosting algorithm. It may be used as the basic building block to devise simple
      provable boosting algorithms for complex classifiers. We provide one example:
      the first boosting algorithm for Oblique Decision Trees, an algorithm which turns
      out to be simpler, faster and significantly more accurate than previous
      approaches.
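
      A minimal sketch of discrete AdaBoost with decision stumps as the weak
      learner, one of the two algorithms the abstract unifies; the oblique-decision-tree
      booster itself is not reproduced here, and the data are synthetic.

      import numpy as np
      from sklearn.tree import DecisionTreeClassifier

      def adaboost(X, y, rounds=20):
          """Discrete AdaBoost with depth-1 trees (stumps); labels y must be in {-1, +1}."""
          w = np.full(len(y), 1.0 / len(y))             # example weights
          stumps, alphas = [], []
          for _ in range(rounds):
              stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
              pred = stump.predict(X)
              err = np.clip(w[pred != y].sum(), 1e-10, 1 - 1e-10)
              alpha = 0.5 * np.log((1 - err) / err)     # the weak learner's vote
              w *= np.exp(-alpha * y * pred)            # boost the weight of misclassified examples
              w /= w.sum()
              stumps.append(stump); alphas.append(alpha)
          return lambda Xq: np.sign(sum(a * s.predict(Xq) for a, s in zip(alphas, stumps)))

      rng = np.random.default_rng(0)
      X = rng.normal(size=(300, 2))
      y = np.where(X[:, 0] * X[:, 1] > 0, 1, -1)        # XOR-like labels a single stump cannot fit
      predict = adaboost(X, y)
      print("training accuracy:", (predict(X) == y).mean())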


Multimodal Integration for Multiparty Dialogue Understanding: A Machine
Learning Framework
Pei-Yun Sabrina Hsueh, University of Edinburgh

      Recent advances in recording and storage technologies have led to huge
      archives of multimedia conversational speech recordings in widely ranging areas,
      such as clinical use, online sharing services, and meeting analysis. While it is
      straightforward to replay such recordings, finding information from the often
      lengthy archives has become more difficult. It is therefore essential to provide
      sufficient aids to guide the users through the recordings and to point out the most
      important events that need their attention. In particular, my research concerns
      how to infer human communicative intention from low level audio and video
      signals. In particular, I focus on identifying multimodal integration patterns (e.g.,
      people tend to speak more firmly and address the whole group more often
      when they are making decisions) in human conversations, using approaches
      ranging from statistical analysis, empirical study, to machine learning.

      Past research has shown that the identified multimodal integration patterns are
      useful for recognizing local speaker intention in recorded speech such as speech
      disfluency (e.g., false start). My research attempts to recover speaker intentions
      that serve a more global communicative goal, such as "initiate-discussion" and
      "reach-decision." A learning framework that can identify characteristic features of
      different semantic classes has been developed. This framework has been proven
      to be useful for automatic topic segmentation (and labeling) and automatic
      decision detection. The ultimate goal of this research is to enhance the current
      browsing and search utilities of multimedia archives.
A POMDP for Automatic Software Customization
Bowen Hui, University of Toronto

     Providing personalized software for individuals has the potential to increase work
     productivity and user satisfaction. In order to accommodate a wide variety of user
     needs, skills, and preferences, today's software is typically packed with
     functionality suitable for everyone. As a result, the interface is complicated,
      functionalities go unexplored and hence unused, and users are dissatisfied
     with the product. Many attempts in the user adaptive systems literature have
     explored ways to customize software according to the inferred user needs.

     Recent probabilistic approaches model the uncertainty in the application domain
     and typically optimize single objective functions, i.e., helping the user complete a
      task faster or interact with the interface more easily, but not both. A few exceptions
     exist that provide a principled treatment to modeling the uncertainty and the
     tradeoffs that are needed to satisfy multiple objectives. Nevertheless, existing
      work has done little to address three important issues:
     * the interaction principles that govern the nature of the problem's objective
     functions
     * the hidden user variables that explain observed preferences and behaviour
     * the value of information available in the repeated, sequential nature of the
     interaction between the user and the system

     We are interested in designing a software agent that assists the user by adapting
     the interface and suggesting task completion help. In particular, the sequential
     nature of the human-computer interaction (HCI) naturally lends itself as a partially
     observable Markov decision process (POMDP). We propose to develop a
     customization POMDP that learns the type of user it is dealing with and adapts
     its behaviour in order to maximize expected rewards formulated by the
     interaction principles for that specific user. Overall, modeling the automatic
     customization problem as a POMDP enables the system to take optimal actions
     with respect to the value of information gain of an exploratory action and the
     immediate rewards obtained by exploitation. This approach provides a decision-
     theoretic treatment to balancing the opportunities to learn about the user versus
     exploiting what the system already knows about the user.

     This work pools together techniques and insights from artificial intelligence and
     machine learning to construct and solve the POMDP. Specifically, we adopt
     methods from the Bayesian user modeling literature to construct a generic user
     model, the activity recognition literature to build a goal model of user activities,
     the HCI literature to formulate the reward model specifying user objectives, the
     preference elicitation literature to learn the user's utility function for adaptive
     systems, and the machine learning literature to populate model parameters with
     incomplete data and to do approximate inference. In addition to the development
     of the novel user model and reward model, a major contribution here is
demonstrating that the customization POMDP is able to model real world
      applications tractably and is able to adapt to different types of users quickly.

Using Probabilistic Graphical Models in Bio-Surveillance Research
Masoumeh Izadi, McGill University

      Artificial intelligence methods can support and assist optimal use of clinical and
      administrative knowledge in diverse areas, from diagnostic assistance and
      detection of epidemics to improved efficiency of health care delivery processes.
      Probabilistic graphical models have been successfully used for many medical
      problems. We describe a decision support system in public health bio-
      surveillance research. A long line of research has shown that current outbreak
      detection methods are ineffective; they both raise false alarms and miss attacks.
      Our approach tries to bring us closer to an effective detection system that detects
      real attacks and only those. I show how Partially Observable Markov Decision
      Processes (POMDPs) can be applied to outbreak detection methods for
      improving alarm function in the case of anthrax. Our results show that this
      method significantly outperforms existing solutions, in terms of both sensitivity
      and timeliness.

Incorporating a New Relational Feature in Online Handwritten Character
Recognition
Sara Izadi, Concordia University

      Artificial neural networks have shown good capabilities in performing
      classification tasks. However, classifier models used for learning in pattern
      classification are challenged when the differences between the patterns of the
      training set are small. Therefore, the choice of effective features is
      mandatory for reaching a good performance. Statistical and geometrical features
      alone are not suitable for recognition of hand printed characters due to variations
      in writing styles that may result in deformations of character shapes. We address
      this problem by using a relational context feature combined with a local
      descriptor for training a neural network-based recognition system in a user-
      independent online character recognition application. Our feature extraction
      approach provides a rich representation of the global shape characteristics, in a
      considerably compact form. This new relational feature generally provides a
      higher distinctiveness and robustness to character deformations, thus potentially
      increasing the recognition rate in a user-independent system. While enhancing
      the recognition accuracy, the feature extraction is computationally simple. We
      show that the ability to discriminate between handwritten characters is increased
      by adopting this mechanism, which provides input to the feedforward neural network
      architecture. Our experiments on Arabic character recognition show comparable
      results with the state-of-the-art methods for online recognition of these
      characters.
Description Length and the Multiple Motif Problem
Anna Ritz, Brown University

      Protein interactions drive many biological functions in the cell. A source protein
      can interact with several proteins; the specificity of this interaction is partly
      determined by the sequence around the binding site. In the 20-letter alphabet of
      protein sequences (denoting the 20 amino acids), a motif is a pattern that
      describes these binding preferences for a given protein. The motif-finding
      problem is to extract a motif from a set of sequences that interact with a given
      protein. The problem is solved by identifying statistically enriched patterns in this
      foreground set compared to a background set of non-interacting sequences.
      Finding such patterns is well-studied in Computational Biology.

      Recent advances in technology require us to rethink the approach to the motif-
      finding problem. Mass spectrometry, for example, allows high-throughput
      measurements of multiple proteins interacting simultaneously. This creates a
      foreground set that is a mixture of motifs. The Multiple Motif problem is described
      as follows: find a collection of motifs, called a motif model, that best describes the
      foreground. The motif model is empty if the background distributions describe the
      foreground better than any set of patterns.

      A few algorithms to find multiple motifs exist, but they use either overly simplistic
      or overly descriptive motif representations. Overly simplistic motifs provide limited
      information about the structure of the data, while overly descriptive motifs use
      many parameters that require unrealistically large datasets. We use a
      representation between these extremes: some positions in a motif are exact,
      while others are restricted to a few letters.

      When comparing motif models, we want to know which model describes the
      foreground the best. We use description length as a metric. Our goal is to learn
      the motif model that produces the most compact representation of the foreground
      by minimizing description length. Using minimum description length in this
      context circumvents some of the limitations of other representations. Each motif
      in the model must contribute to describing the foreground as concisely as
      possible, avoiding both redundancy and overfitting. Description length also gives
      a criterion for merging multiple exact motifs into a single, inexact motif, a task
      that is often ambiguous in other algorithms.

      We describe the use of minimum description length to filter the results of known
      algorithms and to discover novel motifs in synthetic and real datasets.
      This is joint work with Benjamin Raphael and Gregory Shakhnarovich at Brown
      University.
Machine Translation with Self Organized Maps
Aparna Subramanian, University of Maryland, Baltimore County

      I am investigating the idea of using Self Organizing Maps for the purposes of
      Machine Translation. Human translators seem to translate based on their
      knowledge of what words/phrases of one language best represent the translation
      of the word/phrase in another. While choosing these word/phrase equivalents,
      they rely on similarity in the underlying concept to which the two words/phrases
      in different languages correspond. This gives a good reason for a machine
      translation system to do something similar, i.e. translating at a conceptual level.
      Conceptual relativism of languages indicates a good source to parameterize
      concepts for the purpose of translation. Self Organizing Maps (SOM) can be
      used to formalize such concept categories and improve them by learning over
      time. Contextual information can also be captured in SOMs and be used for
      translation. Major challenges in practical application of SOMs to problems such
      as translation which require large vectors of concepts to be stored and processed
      are speed and space. This can be resolved in at least the following two ways:
      SOMs stored and processed as a hierarchy of concepts and SOMs maintained
      as different modules each catering to a group of similar concepts. I plan to
      further investigate the feasibility of these methods.
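
      As a point of reference (this is generic SOM training, not the proposed
      translation system), the sketch below trains a basic Self-Organizing Map on
      toy "concept" vectors. The grid size, learning-rate schedule, and random input
      vectors are illustrative assumptions.

```python
import numpy as np

def train_som(data, grid=(10, 10), epochs=50, lr0=0.5, sigma0=3.0, seed=0):
    """Basic Self-Organizing Map: each grid node holds a weight vector that is
    pulled toward inputs, with neighboring nodes pulled along (Gaussian kernel)."""
    rng = np.random.default_rng(seed)
    h, w = grid
    dim = data.shape[1]
    weights = rng.random((h, w, dim))
    # Grid coordinates, used by the neighborhood function.
    coords = np.stack(np.meshgrid(np.arange(h), np.arange(w), indexing="ij"), axis=-1)
    for t in range(epochs):
        lr = lr0 * np.exp(-t / epochs)        # decaying learning rate
        sigma = sigma0 * np.exp(-t / epochs)  # shrinking neighborhood
        for x in rng.permutation(data):
            # Best-matching unit: the node whose weight vector is closest to x.
            dists = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(dists), (h, w))
            # Neighborhood strength falls off with grid distance from the BMU.
            grid_dist2 = np.sum((coords - np.array(bmu)) ** 2, axis=-1)
            influence = np.exp(-grid_dist2 / (2 * sigma ** 2))
            weights += lr * influence[..., None] * (x - weights)
    return weights

# Toy "concept" vectors (e.g. word context counts); in the proposed system these
# would come from source-language text rather than random noise.
concepts = np.random.default_rng(1).random((200, 8))
som = train_som(concepts)
print(som.shape)  # (10, 10, 8): one 8-dimensional prototype per grid cell
```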

      One approach to translation, therefore, is to average the contextual relevance
      of a given piece of text, e.g., a sentence, over the whole conversation or
      text in the source language. This can be done with a context SOM that learns
      from every input sentence in the text. The mapping of the input sentence in
      this SOM can then be used as input to the Word Category Map of the source
      language, and its outputs in turn serve as input to the target-language Word
      Category Map. The words and phrases produced by this step can be organized
      into a sentence using the context SOM for the target language, aided by
      knowledge of the target-language grammar.

      The investigation is in its initial stages, but the idea appears promising
      because this kind of translation system can evolve through learning and takes
      the pragmatics of the input into account. The approach also seems viable given
      that there have been past attempts to use SOMs for Natural Language Processing
      in general. The present work would be significant because the use of
      Self-Organizing Maps for machine translation does not appear to have been
      explored, though it has been suggested as a possibility in previous work.

Policy Recognition for Multi-Player Tactical Scenarios
Gita Sukthankar, University of Central Florida

      This research addresses the problem of recognizing policies given logs of battle
      scenarios from multi-player games. The ability to identify individual and team
      policies from observations is important for a wide range of applications including
      automated commentary generation, game coaching, and opponent modeling.
     We define a policy as a preference model over possible actions based on the
     game state, and a team policy as a collection of individual policies along with an
     assignment of players to policies. Given a sequence of input observations, O,
     (including observable game state and player actions), a set of player policies, P,
     and team policies, T, the goal is to identify the individual policies p that were
     employed during the scenario.

     A team policy is an allocation of players to tactical roles and is typically arranged
     prior to the scenario as a locker-room agreement. However, circumstances
     during the battle (such as the elimination of a teammate or unexpected enemy
     reinforcements) can frequently force players to take actions that were a priori
     lower in their individual preference model. In particular, one difference between
     policy recognition in a tactical battle and typical plan recognition is that agents
     rarely have the luxury of performing a pre-planned series of actions in the face of
      enemy threat. This means that methods that rely on temporal structure, such as
      Dynamic Bayesian Networks (DBNs) and Hidden Markov Models (HMMs), are not
      necessarily well-suited to this task. An additional challenge is that, over the
     course of a single scenario, one only observes a small fraction of the possible
     game states, which makes policy learning difficult.

     This research explores a model-based system for combining evidence from
     observed events using the Dempster-Shafer theory of evidential reasoning. The
     primary benefit of this approach is that the model generalizes easily to different
     initial starting states (scenario goals, agent capabilities, number and composition
      of the team). Unlike traditional probability theory, where evidence is
      associated with mutually exclusive outcomes, the Dempster-Shafer theory
      quantifies belief over sets of events. We evaluate our Dempster-Shafer-based
      approach on logs of real and simulated games played using Open Gaming
      Foundation d20, the rule system used by many popular tabletop games, including
      Dungeons and Dragons, and compute the average accuracy over the set of battles
      for each of three rules of combination.
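
      For readers unfamiliar with evidential reasoning, the sketch below shows
      Dempster's rule of combination on a tiny, made-up frame of three candidate
      policies. It illustrates how belief is assigned to sets of outcomes rather
      than to single outcomes; the masses and hypotheses are hypothetical, and the
      other combination rules evaluated in this work are not reproduced here.

```python
from itertools import combinations

def powerset(frame):
    return [frozenset(s) for r in range(len(frame) + 1)
            for s in combinations(frame, r)]

def dempster_combine(m1, m2, frame):
    """Dempster's rule: combine two basic mass assignments (dicts mapping
    frozensets of hypotheses to mass) and renormalize away the conflict."""
    combined = {s: 0.0 for s in powerset(frame)}
    conflict = 0.0
    for a, ma in m1.items():
        for b, mb in m2.items():
            inter = a & b
            if inter:
                combined[inter] += ma * mb
            else:
                conflict += ma * mb   # mass assigned to contradictory evidence
    if conflict >= 1.0:
        raise ValueError("total conflict: evidence cannot be combined")
    return {s: v / (1.0 - conflict) for s, v in combined.items() if v > 0}

# Frame of discernment: three candidate policies a player might be following.
frame = {"ambush", "flank", "retreat"}
# Evidence from one observed event (hypothetical numbers).
m1 = {frozenset({"ambush", "flank"}): 0.7, frozenset(frame): 0.3}
# Evidence from a second event.
m2 = {frozenset({"flank"}): 0.6, frozenset(frame): 0.4}
print(dempster_combine(m1, m2, frame))
```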

Advice-based Transfer in Reinforcement Learning
Lisa Torrey, University of Wisconsin

     This report is an overview of our work on transfer in reinforcement learning using
     advice-taking mechanisms. The goal in transfer learning is to speed up learning
     in a target task by transferring knowledge from a related, previously learned
     source task. Our methods are designed to do so robustly, so that positive transfer
     will speed up learning but negative transfer will not slow it down. They are also
     designed to allow human teachers to provide simple guidance that increases the
      benefit of transferred knowledge. These methods allow us to push the boundaries
     of current work in this area and perform transfer between complex and dissimilar
     tasks in the challenging RoboCup simulated soccer domain.
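
      As a generic illustration only (the actual advice-taking mechanism used in
      this work is more sophisticated and is not reproduced here), the sketch below
      shows one simple way human advice could bias a learner: seeding a tabular
      Q-function so that advised actions start out preferred, while later updates
      remain free to overwrite bad advice. The state and action names are
      hypothetical.

```python
from collections import defaultdict

def make_q_table(advice, default=0.0, bonus=1.0):
    """Generic illustration: seed a tabular Q-function so that advised actions
    start with a higher value. If the advice turns out to be bad, ordinary
    Q-learning updates can overwrite it, which is one simple way to keep
    transfer from hurting (not the mechanism used in the work above)."""
    q = defaultdict(lambda: defaultdict(lambda: default))
    for state, action in advice:
        q[state][action] = default + bonus
    return q

# Hypothetical advice: "when an opponent is close and a teammate is open, pass".
advice = [(("opponent_close", "teammate_open"), "pass")]
q = make_q_table(advice)
print(q[("opponent_close", "teammate_open")]["pass"])   # 1.0 (advised)
print(q[("opponent_close", "teammate_open")]["shoot"])  # 0.0 (default)
```
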
Determining a Relationship Between Two Distinct Atmospheric Data Sets of
Different Granularities
Emma Turetsky, Carleton College

      Regression analysis is a classic data mining problem with many real-world
      applications. We present several methods of using data mining and statistical
      analysis to find a relationship between two different data sets: atmospheric
      particles (and their elemental constituents) and elemental carbon (EC).
      Specifically, we wish to determine which elements in the atmosphere cause
      elemental carbon, which is common in industrial zones and large cities and can
      normally be found in exhaust fumes and areas with visible carbon. To do this,
      we used machine learning regression algorithms, including SVM regression and
      Lasso regression, as well as ordinary linear regression. We have created
      several models that correlate specific elements with the amount of elemental
      carbon in the atmosphere.
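
      A minimal sketch of the kind of model fitting involved, using synthetic data
      in place of the actual atmospheric measurements; the feature count, noise
      level, and regularization strength are arbitrary assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

rng = np.random.default_rng(0)

# Synthetic stand-in: 200 air samples, 8 measured elements; suppose only the
# first two elements actually drive elemental carbon (EC).
X = rng.normal(size=(200, 8))
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

lasso = Lasso(alpha=0.1).fit(X, y)
linear = LinearRegression().fit(X, y)

# Lasso drives the coefficients of irrelevant elements toward zero, which helps
# pick out which elements correlate with EC.
print("lasso coefficients: ", np.round(lasso.coef_, 2))
print("linear coefficients:", np.round(linear.coef_, 2))
```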

Inferring causal relationships between genes from steady state observations and
topological ordering information
Xin Zhang, Arizona State University

      The development of high-throughput genomic technologies, such as cDNA
      microarrays and oligonucleotide chips, empowers researchers to reveal gene
      interactions. Mathematical modeling and in-silico simulation can be used to
      analyze gene interactions unambiguously and to predict network dynamic
      behavior in a systematic way. Various network inference models have been
      developed to identify gene regulatory networks using gene expression data,
      but none of them address inferring causal relationships between genes, which
      is a very important issue in systems biology. Among the developed methods,
      the Inductive Causation (IC) algorithm has been proven effective for inferring
      causal relationships among variables. However, simulation studies in the
      context of gene regulatory networks show that the IC algorithm, which uses
      only a single data source, results in low precision and recall rates. To
      improve performance, we propose a joint learning scheme that integrates
      multiple data sources. We present a modified IC (mIC) algorithm that combines
      steady-state data with partial prior knowledge of gene topological ordering
      for jointly learning causal relationships among genes.

      We perform three sets of experiments on synthetic datasets for learning causal
      relationships between genes using the IC and mIC algorithms. Each experiment
      contains 100 randomly generated Boolean networks (DAGs), each of which
      contains 10 genes connected by proper functions, together with the gene
      topological ordering information. The distribution of each network is
      generated from the probability distribution of the root genes and the proper
      functions, and Monte Carlo sampling is used to draw 200 samples per network
      from this distribution. We compare the simulation results of the mIC
      algorithm with those of the IC algorithm.
      From the simulation-based evaluation we conclude that (i) the IC algorithm
      does not work well for learning gene regulatory networks from steady-state
      data alone, (ii) a better way to learn gene causal relationships from
      steady-state data is to use additional knowledge such as gene topological
      ordering, and (iii) the precision and recall rates of the mIC algorithm are
      significantly improved compared with the IC algorithm, with statistical
      confidence of 95%. For randomly generated networks, the mIC algorithm works
      well for jointly learning the causal regulatory network by combining
      steady-state data and gene topological ordering knowledge, achieving a
      precision rate greater than 60% and a recall rate greater than 50%.
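
      As a small worked example of how such precision and recall figures can be
      computed for recovered networks, the sketch below compares an inferred set of
      directed edges against the true edges; the gene names and edge sets are made
      up for illustration.

```python
def edge_precision_recall(true_edges, inferred_edges):
    """Precision and recall for directed edges (gene pairs) recovered by an
    inference algorithm, compared against the known generating network."""
    true_edges, inferred_edges = set(true_edges), set(inferred_edges)
    tp = len(true_edges & inferred_edges)          # correctly recovered edges
    precision = tp / len(inferred_edges) if inferred_edges else 0.0
    recall = tp / len(true_edges) if true_edges else 0.0
    return precision, recall

# Hypothetical 10-gene network: true causal edges vs. edges returned by inference.
true_edges = [("g1", "g3"), ("g2", "g3"), ("g3", "g5"), ("g4", "g5")]
inferred   = [("g1", "g3"), ("g3", "g5"), ("g5", "g7")]
print(edge_precision_recall(true_edges, inferred))  # approximately (0.67, 0.5)
```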

      We further apply the mIC algorithm to gene expression profiles used in the
      study of melanoma. 31 malignant melanoma samples were quantized to a ternary
      format such that the expression level of each gene is assigned -1
      (down-regulated), 0 (unchanged), or 1 (up-regulated). The 10 genes involved in
      this study are chosen from the 587 genes in the melanoma dataset. The results
      show that some of the important causal relationships associated with the
      WNT5A gene are identified by the mIC algorithm, and these causal connections
      have been verified in the literature.
Workshop Organization

Organizers:
      Hila Becker, Columbia University
      Bethany Leffler, Rutgers University

Faculty Advisor:
      Lise Getoor, University of Maryland, College Park

Reviewers:
      Hila Becker
      Finale Doshi
      Seyda Ertekin
      Katherine Heller
      Bethany Leffler
      Özgür Şimşek
      Jenn Wortmann
Thanks to our sponsors:

      CRA Committee on the Status of Women in Computing Research (CRA-W)
      Princeton University
Más contenido relacionado

La actualidad más candente

Identification of Learning Goals in Forum-based Communities
Identification of Learning Goals in Forum-based CommunitiesIdentification of Learning Goals in Forum-based Communities
Identification of Learning Goals in Forum-based CommunitiesMilos Kravcik
 
Bridging Sensor Data Streams and Human Knowledge
Bridging Sensor Data Streams and Human KnowledgeBridging Sensor Data Streams and Human Knowledge
Bridging Sensor Data Streams and Human KnowledgeMattia Zeni
 
Developing Data Services to Support eScience/eResearch
Developing Data Services to Support eScience/eResearchDeveloping Data Services to Support eScience/eResearch
Developing Data Services to Support eScience/eResearchJian Qin
 
A Tableau-based Federated Reasoning Algorithm for Modular Ontologies
A Tableau-based Federated Reasoning Algorithm for Modular OntologiesA Tableau-based Federated Reasoning Algorithm for Modular Ontologies
A Tableau-based Federated Reasoning Algorithm for Modular OntologiesJie Bao
 
Introduction to Sohuman2012
Introduction to Sohuman2012Introduction to Sohuman2012
Introduction to Sohuman2012CUbRIK Project
 
Nature Inspired Reasoning Applied in Semantic Web
Nature Inspired Reasoning Applied in Semantic WebNature Inspired Reasoning Applied in Semantic Web
Nature Inspired Reasoning Applied in Semantic Webguestecf0af
 
Application of Virtual Reality in a Learning Experience
Application of Virtual Reality in a Learning ExperienceApplication of Virtual Reality in a Learning Experience
Application of Virtual Reality in a Learning ExperienceIJERA Editor
 
USING THE MANDELBROT SET TO GENERATE PRIMARY POPULATIONS IN THE GENETIC ALGOR...
USING THE MANDELBROT SET TO GENERATE PRIMARY POPULATIONS IN THE GENETIC ALGOR...USING THE MANDELBROT SET TO GENERATE PRIMARY POPULATIONS IN THE GENETIC ALGOR...
USING THE MANDELBROT SET TO GENERATE PRIMARY POPULATIONS IN THE GENETIC ALGOR...csandit
 
Interfacing of Java 3D objects for Virtual Physics Lab (VPLab) Setup for enco...
Interfacing of Java 3D objects for Virtual Physics Lab (VPLab) Setup for enco...Interfacing of Java 3D objects for Virtual Physics Lab (VPLab) Setup for enco...
Interfacing of Java 3D objects for Virtual Physics Lab (VPLab) Setup for enco...IOSR Journals
 
Newsletter 2013-fall
Newsletter 2013-fallNewsletter 2013-fall
Newsletter 2013-fallHoa Bien
 
Molecules of Knowledge: Self-Organisation in Knowledge-Intensive Environments
Molecules of Knowledge: Self-Organisation in Knowledge-Intensive EnvironmentsMolecules of Knowledge: Self-Organisation in Knowledge-Intensive Environments
Molecules of Knowledge: Self-Organisation in Knowledge-Intensive EnvironmentsStefano Mariani
 
Sign Language Recognition with Gesture Analysis
Sign Language Recognition with Gesture AnalysisSign Language Recognition with Gesture Analysis
Sign Language Recognition with Gesture Analysispaperpublications3
 
english_cv_final.doc
english_cv_final.docenglish_cv_final.doc
english_cv_final.docbutest
 

La actualidad más candente (20)

Identification of Learning Goals in Forum-based Communities
Identification of Learning Goals in Forum-based CommunitiesIdentification of Learning Goals in Forum-based Communities
Identification of Learning Goals in Forum-based Communities
 
Bridging Sensor Data Streams and Human Knowledge
Bridging Sensor Data Streams and Human KnowledgeBridging Sensor Data Streams and Human Knowledge
Bridging Sensor Data Streams and Human Knowledge
 
cv-seyoung
cv-seyoungcv-seyoung
cv-seyoung
 
Developing Data Services to Support eScience/eResearch
Developing Data Services to Support eScience/eResearchDeveloping Data Services to Support eScience/eResearch
Developing Data Services to Support eScience/eResearch
 
AI that/for matters
AI that/for mattersAI that/for matters
AI that/for matters
 
3234150
32341503234150
3234150
 
A Tableau-based Federated Reasoning Algorithm for Modular Ontologies
A Tableau-based Federated Reasoning Algorithm for Modular OntologiesA Tableau-based Federated Reasoning Algorithm for Modular Ontologies
A Tableau-based Federated Reasoning Algorithm for Modular Ontologies
 
Introduction to Sohuman2012
Introduction to Sohuman2012Introduction to Sohuman2012
Introduction to Sohuman2012
 
Nature Inspired Reasoning Applied in Semantic Web
Nature Inspired Reasoning Applied in Semantic WebNature Inspired Reasoning Applied in Semantic Web
Nature Inspired Reasoning Applied in Semantic Web
 
Application of Virtual Reality in a Learning Experience
Application of Virtual Reality in a Learning ExperienceApplication of Virtual Reality in a Learning Experience
Application of Virtual Reality in a Learning Experience
 
USING THE MANDELBROT SET TO GENERATE PRIMARY POPULATIONS IN THE GENETIC ALGOR...
USING THE MANDELBROT SET TO GENERATE PRIMARY POPULATIONS IN THE GENETIC ALGOR...USING THE MANDELBROT SET TO GENERATE PRIMARY POPULATIONS IN THE GENETIC ALGOR...
USING THE MANDELBROT SET TO GENERATE PRIMARY POPULATIONS IN THE GENETIC ALGOR...
 
Itbi
ItbiItbi
Itbi
 
119 128
119 128119 128
119 128
 
Interfacing of Java 3D objects for Virtual Physics Lab (VPLab) Setup for enco...
Interfacing of Java 3D objects for Virtual Physics Lab (VPLab) Setup for enco...Interfacing of Java 3D objects for Virtual Physics Lab (VPLab) Setup for enco...
Interfacing of Java 3D objects for Virtual Physics Lab (VPLab) Setup for enco...
 
Collins seattle-2014-final
Collins seattle-2014-finalCollins seattle-2014-final
Collins seattle-2014-final
 
Newsletter 2013-fall
Newsletter 2013-fallNewsletter 2013-fall
Newsletter 2013-fall
 
Molecules of Knowledge: Self-Organisation in Knowledge-Intensive Environments
Molecules of Knowledge: Self-Organisation in Knowledge-Intensive EnvironmentsMolecules of Knowledge: Self-Organisation in Knowledge-Intensive Environments
Molecules of Knowledge: Self-Organisation in Knowledge-Intensive Environments
 
Sign Language Recognition with Gesture Analysis
Sign Language Recognition with Gesture AnalysisSign Language Recognition with Gesture Analysis
Sign Language Recognition with Gesture Analysis
 
english_cv_final.doc
english_cv_final.docenglish_cv_final.doc
english_cv_final.doc
 
ChenhuiHu_CV
ChenhuiHu_CVChenhuiHu_CV
ChenhuiHu_CV
 

Destacado

Machine Learning Meets Human Learning
Machine Learning Meets Human LearningMachine Learning Meets Human Learning
Machine Learning Meets Human Learningbutest
 
01Introduction.pptx - C280, Computer Vision
01Introduction.pptx - C280, Computer Vision01Introduction.pptx - C280, Computer Vision
01Introduction.pptx - C280, Computer Visionbutest
 
Web Design Contract
Web Design ContractWeb Design Contract
Web Design Contractbutest
 
Computer Security: A Machine Learning Approach
Computer Security: A Machine Learning ApproachComputer Security: A Machine Learning Approach
Computer Security: A Machine Learning Approachbutest
 
University of Hyderabad Vacancies* *http://www.uohyd. ernet.in ...
University of Hyderabad Vacancies* *http://www.uohyd. ernet.in ...University of Hyderabad Vacancies* *http://www.uohyd. ernet.in ...
University of Hyderabad Vacancies* *http://www.uohyd. ernet.in ...butest
 
EL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEEL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEbutest
 
Machine Learning
Machine LearningMachine Learning
Machine Learningbutest
 
Machine Learning
Machine LearningMachine Learning
Machine Learningbutest
 
Machine Learning in Bioinformatics
Machine Learning in BioinformaticsMachine Learning in Bioinformatics
Machine Learning in Bioinformaticsbutest
 
NEB Step-1 Day-8 (Review of Pelvis & Perineum)
NEB Step-1 Day-8 (Review of Pelvis & Perineum)NEB Step-1 Day-8 (Review of Pelvis & Perineum)
NEB Step-1 Day-8 (Review of Pelvis & Perineum)DrSaeed Shafi
 
投影片 1
投影片 1投影片 1
投影片 1butest
 
Bị Trật Khớp đầu Gối
Bị Trật Khớp đầu GốiBị Trật Khớp đầu Gối
Bị Trật Khớp đầu Gốicallie520
 
Presentation
PresentationPresentation
Presentationbutest
 
Search Engines
Search EnginesSearch Engines
Search Enginesbutest
 
Web Design Course Outline
Web Design Course OutlineWeb Design Course Outline
Web Design Course Outlinebutest
 
Новые инструментальные решения
Новые инструментальные решенияНовые инструментальные решения
Новые инструментальные решенияЗАО "НИР"
 
Tearn Up pitch deck.pdf
Tearn Up pitch deck.pdfTearn Up pitch deck.pdf
Tearn Up pitch deck.pdfasenju
 

Destacado (20)

Machine Learning Meets Human Learning
Machine Learning Meets Human LearningMachine Learning Meets Human Learning
Machine Learning Meets Human Learning
 
01Introduction.pptx - C280, Computer Vision
01Introduction.pptx - C280, Computer Vision01Introduction.pptx - C280, Computer Vision
01Introduction.pptx - C280, Computer Vision
 
Web Design Contract
Web Design ContractWeb Design Contract
Web Design Contract
 
Computer Security: A Machine Learning Approach
Computer Security: A Machine Learning ApproachComputer Security: A Machine Learning Approach
Computer Security: A Machine Learning Approach
 
University of Hyderabad Vacancies* *http://www.uohyd. ernet.in ...
University of Hyderabad Vacancies* *http://www.uohyd. ernet.in ...University of Hyderabad Vacancies* *http://www.uohyd. ernet.in ...
University of Hyderabad Vacancies* *http://www.uohyd. ernet.in ...
 
static website quotation
static website quotationstatic website quotation
static website quotation
 
EL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEEL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBE
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
Machine Learning in Bioinformatics
Machine Learning in BioinformaticsMachine Learning in Bioinformatics
Machine Learning in Bioinformatics
 
From Thoughts to Action
From Thoughts to ActionFrom Thoughts to Action
From Thoughts to Action
 
PPT
PPTPPT
PPT
 
NEB Step-1 Day-8 (Review of Pelvis & Perineum)
NEB Step-1 Day-8 (Review of Pelvis & Perineum)NEB Step-1 Day-8 (Review of Pelvis & Perineum)
NEB Step-1 Day-8 (Review of Pelvis & Perineum)
 
投影片 1
投影片 1投影片 1
投影片 1
 
Bị Trật Khớp đầu Gối
Bị Trật Khớp đầu GốiBị Trật Khớp đầu Gối
Bị Trật Khớp đầu Gối
 
Presentation
PresentationPresentation
Presentation
 
Search Engines
Search EnginesSearch Engines
Search Engines
 
Web Design Course Outline
Web Design Course OutlineWeb Design Course Outline
Web Design Course Outline
 
Новые инструментальные решения
Новые инструментальные решенияНовые инструментальные решения
Новые инструментальные решения
 
Tearn Up pitch deck.pdf
Tearn Up pitch deck.pdfTearn Up pitch deck.pdf
Tearn Up pitch deck.pdf
 

Similar a Machine Learning:

Elegant Resume
Elegant ResumeElegant Resume
Elegant Resumebutest
 
Elegant Resume
Elegant ResumeElegant Resume
Elegant Resumebutest
 
Robotic Simulation of Human Brain Using Convolutional Deep Belief Networks
Robotic Simulation of Human Brain Using Convolutional Deep Belief NetworksRobotic Simulation of Human Brain Using Convolutional Deep Belief Networks
Robotic Simulation of Human Brain Using Convolutional Deep Belief NetworksDR.P.S.JAGADEESH KUMAR
 
Machine Learning: Theory, Applications, Experiences
Machine Learning: Theory, Applications, ExperiencesMachine Learning: Theory, Applications, Experiences
Machine Learning: Theory, Applications, Experiencesbutest
 
Computer Vision: Pattern Recognition
Computer Vision: Pattern RecognitionComputer Vision: Pattern Recognition
Computer Vision: Pattern Recognitionedsfocci
 
The UVA School of Data Science
The UVA School of Data ScienceThe UVA School of Data Science
The UVA School of Data SciencePhilip Bourne
 
Agent-Based Problem Solving Methods In Big Data Environment
Agent-Based Problem Solving Methods In Big Data EnvironmentAgent-Based Problem Solving Methods In Big Data Environment
Agent-Based Problem Solving Methods In Big Data EnvironmentLaurie Smith
 
thesis_background.ppt
thesis_background.pptthesis_background.ppt
thesis_background.pptbutest
 
201404 Multimodal Detection of Affective States: A Roadmap Through Diverse Te...
201404 Multimodal Detection of Affective States: A Roadmap Through Diverse Te...201404 Multimodal Detection of Affective States: A Roadmap Through Diverse Te...
201404 Multimodal Detection of Affective States: A Roadmap Through Diverse Te...Javier Gonzalez-Sanchez
 
UVA School of Data Science
UVA School of Data ScienceUVA School of Data Science
UVA School of Data SciencePhilip Bourne
 
H2O with Erin LeDell at Portland R User Group
H2O with Erin LeDell at Portland R User GroupH2O with Erin LeDell at Portland R User Group
H2O with Erin LeDell at Portland R User GroupSri Ambati
 
Rise of AI through DL
Rise of AI through DLRise of AI through DL
Rise of AI through DLRehan Guha
 
Program Leader, Symbolic Machine Learning and Knowledge ...
Program Leader, Symbolic Machine Learning and Knowledge ...Program Leader, Symbolic Machine Learning and Knowledge ...
Program Leader, Symbolic Machine Learning and Knowledge ...butest
 
Hot Topics in Machine Learning for Research and Thesis
Hot Topics in Machine Learning for Research and ThesisHot Topics in Machine Learning for Research and Thesis
Hot Topics in Machine Learning for Research and ThesisWriteMyThesis
 
Resume It Industry Strath Address
Resume It Industry Strath AddressResume It Industry Strath Address
Resume It Industry Strath Addressdroussinov
 
About the authors
About the authorsAbout the authors
About the authorsbutest
 
Reality Mining (Nathan Eagle)
Reality Mining (Nathan Eagle)Reality Mining (Nathan Eagle)
Reality Mining (Nathan Eagle)Jan Sifra
 

Similar a Machine Learning: (20)

Elegant Resume
Elegant ResumeElegant Resume
Elegant Resume
 
Elegant Resume
Elegant ResumeElegant Resume
Elegant Resume
 
Robotic Simulation of Human Brain Using Convolutional Deep Belief Networks
Robotic Simulation of Human Brain Using Convolutional Deep Belief NetworksRobotic Simulation of Human Brain Using Convolutional Deep Belief Networks
Robotic Simulation of Human Brain Using Convolutional Deep Belief Networks
 
Machine Learning: Theory, Applications, Experiences
Machine Learning: Theory, Applications, ExperiencesMachine Learning: Theory, Applications, Experiences
Machine Learning: Theory, Applications, Experiences
 
Computer Vision: Pattern Recognition
Computer Vision: Pattern RecognitionComputer Vision: Pattern Recognition
Computer Vision: Pattern Recognition
 
The UVA School of Data Science
The UVA School of Data ScienceThe UVA School of Data Science
The UVA School of Data Science
 
Agent-Based Problem Solving Methods In Big Data Environment
Agent-Based Problem Solving Methods In Big Data EnvironmentAgent-Based Problem Solving Methods In Big Data Environment
Agent-Based Problem Solving Methods In Big Data Environment
 
thesis_background.ppt
thesis_background.pptthesis_background.ppt
thesis_background.ppt
 
201404 Multimodal Detection of Affective States: A Roadmap Through Diverse Te...
201404 Multimodal Detection of Affective States: A Roadmap Through Diverse Te...201404 Multimodal Detection of Affective States: A Roadmap Through Diverse Te...
201404 Multimodal Detection of Affective States: A Roadmap Through Diverse Te...
 
UVA School of Data Science
UVA School of Data ScienceUVA School of Data Science
UVA School of Data Science
 
H2O with Erin LeDell at Portland R User Group
H2O with Erin LeDell at Portland R User GroupH2O with Erin LeDell at Portland R User Group
H2O with Erin LeDell at Portland R User Group
 
Rise of AI through DL
Rise of AI through DLRise of AI through DL
Rise of AI through DL
 
linkedin summary
linkedin summarylinkedin summary
linkedin summary
 
Program Leader, Symbolic Machine Learning and Knowledge ...
Program Leader, Symbolic Machine Learning and Knowledge ...Program Leader, Symbolic Machine Learning and Knowledge ...
Program Leader, Symbolic Machine Learning and Knowledge ...
 
Hot Topics in Machine Learning for Research and Thesis
Hot Topics in Machine Learning for Research and ThesisHot Topics in Machine Learning for Research and Thesis
Hot Topics in Machine Learning for Research and Thesis
 
Resume It Industry Strath Address
Resume It Industry Strath AddressResume It Industry Strath Address
Resume It Industry Strath Address
 
dissertation
dissertationdissertation
dissertation
 
About the authors
About the authorsAbout the authors
About the authors
 
Reality Mining (Nathan Eagle)
Reality Mining (Nathan Eagle)Reality Mining (Nathan Eagle)
Reality Mining (Nathan Eagle)
 
Machine learning
Machine learningMachine learning
Machine learning
 

Más de butest

1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同butest
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALbutest
 
Timeline: The Life of Michael Jackson
Timeline: The Life of Michael JacksonTimeline: The Life of Michael Jackson
Timeline: The Life of Michael Jacksonbutest
 
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...butest
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALbutest
 
Com 380, Summer II
Com 380, Summer IICom 380, Summer II
Com 380, Summer IIbutest
 
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet JazzThe MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazzbutest
 
MICHAEL JACKSON.doc
MICHAEL JACKSON.docMICHAEL JACKSON.doc
MICHAEL JACKSON.docbutest
 
Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1butest
 
Facebook
Facebook Facebook
Facebook butest
 
Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...butest
 
Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...butest
 
NEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTNEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTbutest
 
C-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docC-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docbutest
 
MAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docMAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docbutest
 
Mac OS X Guide.doc
Mac OS X Guide.docMac OS X Guide.doc
Mac OS X Guide.docbutest
 
WEB DESIGN!
WEB DESIGN!WEB DESIGN!
WEB DESIGN!butest
 
Download
DownloadDownload
Downloadbutest
 

Más de butest (20)

1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIAL
 
Timeline: The Life of Michael Jackson
Timeline: The Life of Michael JacksonTimeline: The Life of Michael Jackson
Timeline: The Life of Michael Jackson
 
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIAL
 
Com 380, Summer II
Com 380, Summer IICom 380, Summer II
Com 380, Summer II
 
PPT
PPTPPT
PPT
 
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet JazzThe MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
 
MICHAEL JACKSON.doc
MICHAEL JACKSON.docMICHAEL JACKSON.doc
MICHAEL JACKSON.doc
 
Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1
 
Facebook
Facebook Facebook
Facebook
 
Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...
 
Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...
 
NEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTNEWS ANNOUNCEMENT
NEWS ANNOUNCEMENT
 
C-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docC-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.doc
 
MAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docMAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.doc
 
Mac OS X Guide.doc
Mac OS X Guide.docMac OS X Guide.doc
Mac OS X Guide.doc
 
hier
hierhier
hier
 
WEB DESIGN!
WEB DESIGN!WEB DESIGN!
WEB DESIGN!
 
Download
DownloadDownload
Download
 

Machine Learning:

  • 1.
  • 2. ELEVATOR L GUEST LAUNDRY
  • 3. Schedule 09:00 Registration, poster set-up, and continental breakfast 09:30 Welcome 09:45 Invited Talk: Machine Learning in Space Kiri L. Wagstaff, N.A.S.A. 10:15 A general agnostic active learning algorithm Claire Monteleoni, UC San Diego 10:35 Bayesian Nonparametric Regression with Local Models Jo-Anne Ting, University of Southern California 10:55 Coffee Break 11:15 Invited Talk: Applying machine learning to a real-world problem: real-time ranking of electric components Marta Arias, Columbia University 11:45 Generating Summary Keywords for Emails Using Topics. Hanna Wallach, University of Cambridge 12:05 Continuous-State POMDPs with Hybrid Dynamics Emma Brunskill, MIT 12:25 Spotlights 12:45 Lunch 14:20 Invited Talk: Randomized Approaches to Preserving Privacy Nina Mishra, University of Virginia 14:50 Clustering Social Networks Isabelle Stanton, University of Virginia 15:10 Coffee Break 15:30 Invited Talk: Applications of Machine Learning to Image Retrieval Sally Goldman, Washington University 16:00 Improvement in Performance of Learning Using Scaling Soumi Ray, University of Maryland Baltimore County 16:20 Poster Session 17:10 Panel/ Open Discussion 17:40 Concluding Remarks
  • 4. Invited Talks Machine Learning in Space Kiri L. Wagstaff, N.A.S.A. Remote space environments simultaneously present significant challenges to the machine learning community and enormous opportunities for advancement. In this talk, I present recent work on three key issues associated with machine learning in space: on-board data classification and regression, on-board prioritization of analysis results, and reliable computing in high-radiation environments. Support vector machines are currently being used on-board the EO-1 Earth orbiter, and they are poised for adoption by the Mars Odyssey orbiter as well. We have developed techniques for learning scientist preferences for which subset of images is most critical for transmission, so that we can make the most use of limited bandwidth. Finally, we have developed fault-tolerant SVMs that can detect and recover from radiation-induced errors while performing on- board data analysis. About the speaker: Kiri L. Wagstaff is a senior researcher at the Jet Propulsion Laboratory in Pasadena, CA. She is a member of the Machine Learning and Instrument Autonomy group, and her focus is on developing new machine learning methods that can be used for data analysis on-board spacecraft. She has applied these techniques to data being collected by the EO-1 Earth-orbiting spacecraft, Mars Odyssey, and Mars Pathfinder. She has also worked on crop yield prediction from orbital remote sensing observations, the fault protection system for the MESSENGER mission to Mercury, and automatic code generation for the Electra radio used by the Mars Reconnaissance Orbiter and the Mars Science Laboratory. She is very interested in issues such as robustness (developing fault-tolerant machine learning methods for high-radiation environments) and infusion (how can machine learning be used to advance science?). She holds a Ph.D. in Computer Science from Cornell University and is currently working on an M.S. in Geology from the University of Southern California.
  • 5. Applying machine learning to a real-world problem: real-time ranking of electric components Marta Arias, Columbia University In this talk, I will describe our experience with applying machine learning techniques to a concrete real-world problem: the generation of rankings of electric components according to their susceptibility to failure. The system's goal is to aid operators in the replacement strategy of most at-risk components and in handling emergency situations. In particular, I will address the challenge of dealing with the concept drift inherent in the electrical system and will describe our solution based on a simple weighted-majority voting scheme. About the speaker: Marta Arias received her bachelor's degree in Computer Science from the Polytechnic University of Catalunya (Barcelona, Spain) in 1998. After that she worked for a year at Incyta S.A. (Barcelona, Spain), a company specializing in software products for Natural Language Processing applications. She then enrolled in the graduate student program at Tufts University, recieving her PhD in Computer Science in 2004. That same year she joined the Center for Computational Learning Systems of Columbia University as an Associate Research Scientist. Dr. Arias' research interest include the theory and application of machine learning.
  • 6. Randomized Approaches to Preserving Privacy Nina Mishra, University of Virginia, Microsoft Research The Internet is arguably one of the most important inventions of the last century. It has altered the very nature of our lives -- the way we communicate, work, shop, vote, recreate, etc. The impact has been phenomenal for the machine learning community since both old and newly created information repositories, such as medical records and web click streams, are readily available and waiting to be mined. However, opposite these capabilities and advances is the basic right to privacy: On the one hand, in order to best serve and protect its citizens, the government should ideally have access to every available bit of societal information. On the other hand, privacy is a fundamental right and human need, which theoretically is served best when the government knows nothing about the personal lives of its citizens. This raises the natural question of whether it is even possible to simultaneously realize both of these diametrically opposed goals, namely, information transparency and individual privacy. Surprisingly, the answer is yes and I will describe solutions where individuals randomly perturb and publish their data so as to preserve their own privacy and yet large-scale information can still be learned. Joint work with Mark Sandler. About the speaker: Nina Mishra is an Associate Professor in the Computer Science Department at the University of Virginia. Her research interests are in data mining and machine learning algorithms as well as privacy. She previously held joint appointments as a Senior Research Scientist at HP Labs, and as an Acting Faculty member at Stanford University. She was Program Chair of the International Conference on Machine Learning in 2003 and has served on numerous data mining and machine learning program committees. She also serves on the editorial Boards of Machine Learning, IEEE Transactions on Knowledge and Data Engineering, IEEE Intelligent Systems and the Journal of Privacy and Confidentiality. She is currently on leave in Search Labs at Microsoft Research. She received a PhD in Computer Science from UIUC.
  • 7. Applications of Machine Learning to Image Retrieval Sally Goldman, Washington University Classic Content-Based Image Retrieval (CBIR) takes a single non-annotated query image, and retrieves similar images from an image repository. Such a search must rely upon a holistic (or global) view of the image. Yet often the desired content of an image is not holistic, but is localized. Specifically, we define Localized Content-Based Image Retrieval as a CBIR task where the user is only interested in a portion of the image, and the rest of the image is irrelevant. We discuss our localized CBIR system, Accio!, that uses labeled images in conjunction with a multiple-instance learning algorithm to first identify the desired object and re-weight the features, and then to rank images in the database using a similarity measure that is based upon individual regions within the image. We will discuss both the image representation and multiple-instance learning algorithm that we have used in the localized CBIR systems that we have developed. We also look briefly at ways in which multiple-instance learning can be applied to knowledge-based image segmentation. About the speaker: Dr. Sally Goldman is the Edwin H. Murty Professor of Engineering at Washington University in St. Louis and the Associate Chair of the Department of Computer Science and Engineering. She received a Bachelor of Science in Computer Science from Brown University in December 1984. Under the guidance of Dr. Ronald Rivest at the Massachusetts Institute of Technology, Dr. Goldman completed her Master of Science in Electrical Engineering and Computer Science in May 1987 and her Ph.D. in July 1990. Dr. Goldman's research is in the area of algorithm design and analysis and machine learning with a recent focus on applications to the area of content- based image retrieval. Dr. Goldman has received many teaching awards and honors including the Emerson Electric Company Excellence in Teaching Award in 1999, and the Governor's Award for Excellence in Teaching in 2001. Dr. Goldman and her husband, Dr. Ken Goldman, have just completed a book titled, A Practical Guide to Data Structures and Algorithms using Java.
  • 8. Talks A General Agnostic Active Learning Algorithm Claire Monteleoni, UC San Diego We present a simple, agnostic active learning algorithm that works for any hypothesis class of bounded VC dimension, and any data distribution. Most previous work on active learning either makes strong distributional assumptions, or else is computationally prohibitive. Our algorithm extends a scheme due to Cohn, Atlas, and Ladner to the agnostic setting (i.e. arbitrary noise), by (1) reformulating it using a reduction to supervised learning and (2) showing how to apply generalization bounds even for the non-i.i.d. samples that result from selective sampling. We provide a general characterization of the label complexity of our algorithm. This quantity is never more than the usual PAC sample complexity of supervised learning, and is exponentially smaller for some hypothesis classes and distributions. We also demonstrate improvements experimentally. This is joint work with Sanjoy Dasgupta and Daniel Hsu. Currently in submission, but for a full version, please see UCSD tech report: http://www.cse.ucsd.edu/Dienst/UI/2.0/Describe/ncstrl.ucsd_cse/CS2007-0898 Bayesian Nonparametric Regression with Local Models Jo-Anne Ting, University of Southern California We propose a Bayesian nonparametric regression algorithm with locally linear models for high-dimensional, data-rich scenarios where real- time, incremental learning is necessary. Nonlinear function approximation with high-dimensional input data is a nontrivial problem. An application example is a high-dimensional movement system like a humanoid robot, where real-time learning of internal models for compliant control may be needed. Fortunately, many real-world data sets tend to have locally low dimensional distributions, despite having high dimensional embedding (e.g., Tenenbaum et al. 2000, Roweis & Saul, 2000). A successful algorithm, thus, must avoid numerical problems arising potentially from redundancy in the input data, eliminate irrelevant input dimensions, and be computationally efficient to allow for incremental, online learning. Several methods have been proposed for nonlinear function approximation, such as Gaussian process regression (Williams & Rasmussen, 1996), support vector regression (Smola & Schölkopf, 1998) and variational Bayesian mixture models (Ghahramani & Beal, 2000). However, these global methods tend to be unsuitable for fast, incremental function approximation. Atkeson, Moore & Schaal (1997) have shown that in such scenarios, learning with spatially localized models is more appropriate, particularly in the framework of locally weighted learning.
  • 9. In recent years, Vijayakumar & Schaal (2000) have introduced a learning algorithm designed to fulfill the fast, incremental requirements of locally weighted learning, specifically targeting high-dimensional input domains through the use of local projections. This algorithm, called Locally Weighted Projection Regression (LWPR),performs competitively in its generalization performance with state-of- the-art batch regression methods. It has been applied successfully to sensorimotor learning on a humanoid robot for the purpose of executing fast, accurate movements in a feedforward controller. The major issue with LWPR is that it requires gradient descent (with leave-one- out cross-validation) to optimize the local distance metrics in each local regression model. Since gradient descent search is sensitive to the initial values, we propose a novel Bayesian treatment of locally weighted regression with locally linear models that eliminates the need for any manual tuning of meta parameters, cross-validation approaches or sampling. Combined with variational approximation methods to allow for fast, tractable inference, this Bayesian algorithm learns the optimal distance metric value for each local regression model. It is able to automatically determine thesize of the neighborhood data (i.e., the ``bandwidth’’) that should contribute to each local model. A Bayesian approach offers error bounds on the distance metrics and incorporates this uncertainty in the predictive distributions. By being able to automatically detect relevant input dimensions, our algorithm is able to handle high- dimensional data sets with a large number of redundant and/or irrelevant input dimensions and a large number of data samples. We demonstrate competitive performance of our Bayesian locally weighted regression algorithm with Gaussian Process regression and LWPR on standard benchmark sets. We also explore extensions of this locally linear Bayesian algorithm to a real-time setting, to offer a parameter-free alternative for incremental learning in high-dimensional spaces. Generating Summary Keywords for Emails Using Topics. Hanna Wallach, University of Cambridge Email summary keywords, used to concisely represent the gist of an email, can help users manage and prioritize large numbers of messages. Previous work on email keyword selection has focused on a two-stage supervised learning system that selects nouns from individual emails using pre-defined linguistic rules [1]. In this work we present an unsupervised learning framework for selecting email summary keywords. A good summary keyword for an email message is not best characterized as a word that is unique to that message, but a word that relates the message to other topically similar messages. We therefore use latent representations of the underlying topics in a user's mailbox to find words that describe each message in the context of existing topics rather than selecting keywords based on a single message in isolation. We present and compare several methods for selecting email summary keywords, based on two well-
  • 10. known models for inferring latent topics: latent semantic analysis (LSA) and latent Dirichlet allocation (LDA). Summary keywords for an email message are generated by selecting the words that are most topically similar to the words in the email. We use two approaches for selecting these words, one based on query-document similarity, and the other based on word association. Each approach may be used in conjunction with either LSA or LDA. We evaluate keyword quality by generating summaries for emails from twelve users in the Enron corpus and comparing each method's performance with a TF-IDF baseline. The quality of keywords are assessed using two proxy tasks, in which the summaries are used in place of whole messages: recipient prediction and foldering. In the recipient prediction task, the keywords for each email are used to predict the intended recipients of the current message. In the foldering task, each user's email messages are sorted into folders using the selected keywords as features. Our topic-based methods out-perform TF-IDF on both tasks, demonstrating that topic-based methods yield better summary keywords. By selecting keywords based on user- specific topics, we find summaries that represent each message in the context of the entire mailbox, not just that of a single message. Furthermore, combining the summary for an email with the email's subject improves foldering and recipient prediction results over those obtained using either summaries or subjects alone. References: [1] S. Muresan, E. Tzoukermann, and J. Klavans (2001). Combining linguistic and machine learning techniques for email summarization. CONLL. Continuous-State POMDPs with Hybrid Dynamics Emma Brunskill, MIT Partially observable Markov decision processes (POMDPs) provide a rich framework for describing many important planning problems that arise in situations with hidden state and stochastic actions. Most previous work has focused on solving POMDPs with discrete state, action and observation spaces. However, in a number of applications, such as navigation or robotic grasping, the world is most naturally represented using continuous states. Though any continuous domain can be described using a sufficiently fine grid, the number of discrete states grows exponentially with the dimensionality of the underlying state space. Existing discrete state POMDP algorithms can only scale up to the order of a few thousand states, beyond which they become computationally infeasible. Therefore, approaches for dealing efficiently with continuous-state POMDPs are of great interest. Previous work (such as [1]) on planning for continuous-state POMDPs has typically modeled the world dynamics using a single linear Gaussian model to describe the effects of an action. Unfortunately, this model is not powerful
  • 11. enough to represent the multi-modal state-dependent dynamics that arise in a number of problems of interest. For example, in legged locomotion the different "modes" of walking and running are described best by significantly different dynamics. We instead employ a hybrid dynamics model for continuous-state POMDPs that can represent stochastic state-dependent distributions over a number of different linear dynamic models. We developed a new point-based approximation algorithm for solving these hybrid-dynamics POMDP planning problems that builds on Porta et al.'s continuous-state point-based approach[1]. One nice attribute of our algorithm is that by representing the value function and belief states using a weighted sum of Gaussians, the belief state updates and value function backups can be computed in closed form. An additional contribution of our work is a new procedure for constructing a better approximation of the alpha functions composing the value function. We conducted experiments on a set of small problems to illustrate how the representational power of the hybrid dynamics model allows us to address problems not previously solvable by existing continuous-state approaches. In addition, we examined the toy problem of a simulated robot searching blindly (no observations) for a power supply in a long hallway. This problem requires a variable level of representational granularity in order to perform well. Here our hybrid continuous-state planner outperforms a discrete state POMDP planner, demonstrating the potential of continuous-state approaches. [1] J. Porta, M. Spaan, N. Vlassis, and P. Poupart. Point-based value iteration for continuous POMDPs. Journal of Machine Learning Research, 7:2329-2367, 2006 Clustering Social Networks Isabelle Stanton, University of Virginia Social networks have gained popularity recently with the advent of sites such as MySpace, Friendster, Facebook, etc. The number of users participating in these networks is large, e.g., a hundred million in MySpace, and growing. These networks are a rich source of data as users populate their sites with personal information. Of particular interest in this paper is the graph structure induced by the friendship links. A fundamental problem related to these networks is the discovery of clusters or communities. Intuitively, a cluster is a collection of individuals with dense friendship patterns internally and sparse friendships externally. There are many reasons to seek tightly-knit communities in networks, for instance, target marketing schemes can be designed based on clusters and terrorist cells can be uncovered. Existing clustering criteria are limited in that clusters typically do not overlap, all vertices are clustered and/or external sparsity is ignored. We introduce a new criterion that overcomes these limitations by combining internal density with external sparsity in a natural way. Our criterion does not require a strict
  • 12. partitioning of the data which is particularly important in social networks, where one user may be a member of many communities. This work focuses on the combinatorial properties of the new criterion. In particular, we bound the amount that clusters can overlap, as well as find a loose bound for the number of clusters in a graph. From these properties we have developed deterministic and randomized algorithms for provably finding the clusters, provided there is a sufficiently large gap between internal density and external sparsity. Finally, we perform experiments on real social networks illustrate the effectiveness of the algorithm. Improvement in Performance of Learning Using Scaling Soumi Ray, University of Maryland Baltimore County Reinforcement learning often requires many training iterations to get an optimal policy. We are interested in trying to speed up learning in a domain using scaling, which works as follows: partial learning is performed to learn a sub-optimal action value function, Q, in the domain using standard Q-learning for few iterations. The Q-values of Q are then multiplied by a constant factor to scale the Q-values. Then learning continues using the scaled Q-values of the new Q-table as the initial values. Surprising, in many situations this scaling significantly reduces the number of iterations required to learn compared to learning without scaling. We can summarize our method of scaling in the following steps: 1. Partial learning is done in the domain. 2. The Q-values of the partially learned domain are scaled, using a scaling factor decided manually. 3. Finally learning in the domain is carried out using the new scaled Q-values. This method can reduce the number of steps required to learn in the domain compared to learning without scaling. Two important aspects of scaling are the scaling factor and the time of scaling. If the scaling factor and the time of scaling are chosen correctly then we can get great improvements in the performance of learning in a domain. We have used 10×10 grid world domains with the starting position at the top left corner and the goal at the bottom right corner to run our experiments. A Theory of Similarity Functions for Clustering Maria-Florina Balcan, Carnagie Mellon University Problems of clustering data from pairwise similarity information are ubiquitous in Computer Science. Theoretical treatments typically view the similarity information as ground-truth and then design algorithms to (approximately) optimize various graph-based objective functions. However, in most applications, this similarity information is merely based on some heuristic: the true goal is to cluster the
  • 13. points correctly rather than to optimize any specific graph property. In this work, we initiate a theoretical study of the design of similarity functions for clustering from this perspective. In particular, motivated by recent work in learning theory that asks "what natural properties of a similarity function are sufficient to be able to learn well?" we ask "what natural properties of a similarity function are sufficient to be able to em cluster well?" We develop a notion of the clustering complexity of a given property (analogous to notions of capacity in learning theory), that characterizes its information- theoretic usefulness for clustering. We then analyze this complexity for several natural game-theoretic and learning-theoretic properties, as well as design efficient algorithms that are able to take advantage of them. We consider two natural clustering objectives: (a) list clustering: analogous to the notion of list- decoding, the algorithm can produce a small list of clusterings (which a user can select from) and (b) hierarchical clustering: the desired clustering is some pruning of this tree (which a user could navigate). Our algorithms for hierarchical clustering combine recent learning-theoretic approaches with linkage-style methods. This is joint work with Avrim Blum and Santosh Vempala.
  • 14. Spotlights Advancing Associative Classifiers - Challenges and Solutions Luiza Antonie, University of Alberta In the past years, associative classifiers, classifiers that use association rules, have started to attract attention. An important advantage that these classification systems bring is that, using association rule mining, they are able to examine several features at a time, while other state-of-the-art methods, like decision trees or naive Bayesian classifiers, consider that each feature is independent of one another. However, in real-life applications, the independence assumption is not necessary true, and it was shown that correlations and co-occurrence of features can be very important. In addition, the associative classifiers can handle a large number of features, while other classification systems do not work well for high dimensional data. The associative classification systems proved to perform as well as, or even better, than other techniques in the literature. The associative classifiers are models that can be read, understood, modified by humans and thus can be manually enriched with domain knowledge. We have proposed the integration of new types of association rules and new methods to reduce the number of rules in the model. In our research work we studied the behaviour of associative classifiers when negative association rules, maximal and closed itemsets are employed. These types of association rules have not been used in associative classifiers before, thus bringing new challenges and opportunities to our work. Given that one advantage of the classifiers based on association rules is their readability, another direction that we investigated is reducing the number of association rules used in the classification model. Pruning of rules not only improves readability, but it may minimize overfitting of the model as well. Another challenge is the use of rules in the classification stage. We proposed a new technique where the system automatically learns how to use the rules. Many applications can benefit from a good classification model. Given the readability of the associative classifiers, they are especially fit to applications were the model may assist domain experts in their decisions. Medical field is a good example were such applications may appear. Let us consider an example were a physician has to examine a patient. There is a considerable amount of information associated with the patient (e.g. personal data, medical tests, etc.). A classification system can assist the physician in this process. The system can predict if the patient is likely to have a certain disease or present incompatibility with some treatments. Considering the output of the classification model, the physician can make a better decision on the treatment to be applied to this patient. Given the transparency of our model, a health practitioner can understand how the classification model reached its decision.
• 15. Learning to Predict Prices in a Supply Chain Management Game
Shuo Chen, UC Berkeley

Economic decisions can benefit greatly from accurate predictions of market prices, but making such predictions is a difficult problem and an area of active research. In this paper, we present and compare several techniques for predicting market prices that we have employed in the Trading Agent Competition Supply Chain Management (TAC SCM) Prediction Challenge. These strategies include simple heuristics and various machine learning approaches, such as simple perceptrons and support vector regression. We show that the heuristic methods are very good, especially for predicting current prices, but that the machine learning techniques may be more appropriate for future price predictions.

Sonar Terrain Mapping with BDI Agents
Shivali Gupta, University of Maryland, Baltimore County

Mapping a constantly changing environment is a challenge that necessitates a team of agents working together. These agents must continually explore the terrain and assemble the map in a distributed fashion. In a real-world instance of this problem, such as a surveillance problem, agents have limited sensor and communication ranges, further compounding the problem. Our solution is to create multiple "Explorer" agents and a centralized "Base station" agent using the BDI architecture. The BDI architecture provides a framework for agents that have their individual beliefs, desires and intentions (goals). The environment is ripe with uncertainty given its continually changing nature, which makes the BDI architecture well suited to this problem. Mobile Explorer agents have a limited range of communication and partial observability of the environment. The Base station agent is stable, and it maintains the global map of the environment from the information provided by the Explorer agents. Explorer agents use the Base station's global map (its beliefs about the world) to decide which area to explore next, and after exploration they send their updated map to the Base station agent, which merges its copy with the information received from the Explorer agent.
• 16. The Explorer agents must stay within communication range of each other to maintain a complete communication network between all agents and the base station. The system models the environment as a grid of cells, and the Base station assigns each cell a "Curiosity level" based on how long it has been since that region was explored. A higher curiosity level implies that the cell has not been explored recently; therefore, the curiosity level drives exploration toward regions of uncertainty. Explorer agents calculate a force vector,

force_vector = distance_based_penalty * curiosity_value * unit_vector    (1)

computed for every cell, where distance_based_penalty is the inverse of the Manhattan distance of the cell from the agent, to find the direction to explore. This calculation ensures that not all agents move in one direction at the same time. One of the major advantages of this distributed approach is that a failure of an agent does not affect the system in general: if an Explorer agent fails, then the other agents can still continue to explore the environment. The results show that more Explorer agents prevent the average curiosity level from rising at a fast pace, and it eventually stabilizes after a limited number of Explorer agents explore the map. Another result shows that a distance penalty based on the Manhattan distance provides a better solution because it allows Explorer agents to explore the local area around them as well as the outer edges of the map, in comparison to a penalty based on Euclidean distance, which localizes the search procedure. In our future work, we are interested in adding a learning mechanism to the algorithm that would enable Explorer agents to predict the changing behavior of the environment and how to explore it optimally. Learning would also enable Explorer agents to avoid obstacles in their environment.
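A minimal sketch of the force-vector rule in equation (1), assuming a 2-D grid, Manhattan-distance penalties, and that the per-cell contributions are summed to give the direction of travel; the grid size and curiosity values are made up for illustration.

# Sketch of the force-vector rule of equation (1): for every cell, weight the
# unit vector toward that cell by its curiosity value and by a distance-based
# penalty (inverse Manhattan distance), then sum the contributions.
# Grid contents and curiosity values are illustrative assumptions.

def force_vector(agent_pos, curiosity):
    ax, ay = agent_pos
    fx, fy = 0.0, 0.0
    for (cx, cy), c in curiosity.items():
        dx, dy = cx - ax, cy - ay
        manhattan = abs(dx) + abs(dy)
        if manhattan == 0:
            continue  # the agent's own cell exerts no pull
        penalty = 1.0 / manhattan          # distance_based_penalty
        norm = (dx * dx + dy * dy) ** 0.5  # length for the unit vector
        fx += penalty * c * dx / norm
        fy += penalty * c * dy / norm
    return fx, fy

# Curiosity grows with time since a cell was last explored (toy values here).
curiosity = {(x, y): float(x + y) for x in range(5) for y in range(5)}
print(force_vector((0, 0), curiosity))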
Online Learning for OffRoad Robots
Raia Hadsell, NYU

We present a learning-based solution to the problem of long-range obstacle detection in autonomous robots. The system uses sparse traversability information from a stereo module to train a classifier online. The trained classifier can then predict the traversability of the entire scene. This learning strategy is called self-supervised, near-to-far learning, and, if it is done in an online manner, it allows the robot to adapt to changing environments and still accurately predict the traversability of distant areas. A distance-normalized image pyramid makes it possible to train efficiently on each frame seen by the robot, using large windows that contain contextual information as well as shape, color, and texture. Traversability labels are initially obtained for each target using a stereo module, then propagated to other views of the same target using temporal and spatial concurrences, thus training the
• 17. classifier to be view-invariant. A ring buffer simulates short-term memory and ensures that the discriminative learning is balanced and consistent. This long-range obstacle detection system sees obstacles and paths at 30-40 meters, far beyond the maximum stereo range of 12 meters, and adapts very quickly to new environments. Experiments were run on the LAGR (Learning Applied to Ground Robots) robot platform. Both the robot and the reference "baseline" software were built by Carnegie Mellon University and the National Robotics Engineering Center. In this program, in which all participants are constrained to use the given hardware, the goal is to drive from a given start to a predefined (GPS) goal position through unknown, offroad terrain using only passive vision. Both qualitative and quantitative results are given by comparing the field performance of the robot with and without learning-based, long-range vision enabled.
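The ring-buffer idea above can be pictured with a small sketch: a fixed-length buffer per label keeps the online training set balanced as new frames arrive. The per-class capacity and the two labels are assumptions made for illustration, not details from the paper.

from collections import deque

# One illustrative way to keep online training balanced: a fixed-length ring
# buffer per label, so old samples fall out and neither class dominates.
# The capacity and the two labels (traversable / obstacle) are assumptions.

class BalancedRingBuffer:
    def __init__(self, capacity_per_class=100):
        self.buffers = {}
        self.capacity = capacity_per_class

    def add(self, features, label):
        buf = self.buffers.setdefault(label, deque(maxlen=self.capacity))
        buf.append(features)

    def training_batch(self):
        # Draw the same number of recent samples from every class.
        if not self.buffers:
            return []
        n = min(len(b) for b in self.buffers.values())
        batch = []
        for label, buf in self.buffers.items():
            batch.extend((x, label) for x in list(buf)[-n:])
        return batch

memory = BalancedRingBuffer(capacity_per_class=3)
for i in range(10):
    memory.add([float(i)], "traversable" if i % 3 else "obstacle")
print(memory.training_batch())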
• 18. Posters

Untitled
Mair Allen-Williams, University of Southampton

Two particular challenges faced by agents within dynamic, uncertain multi-agent systems are learning and acting in uncertain environments, and coordination with other agents about whom they may have little or no knowledge. Although uncertainty and coordination have each been tackled as separate problems, existing formal models for an integrated approach make a number of simplifying assumptions and often have few guarantees. In this report we explore the extension of a Bayesian learning model into partially observable multi-agent domains. In order to implement such a model practically, we make use of a number of approximation techniques. In addition to traditional methods such as repair sampling and state clustering, we apply graphical inference methods within the learning step to propagate information through partially observable nodes. We demonstrate the scalability of this approach with an ambulance rescue problem inspired by the Robocup Rescue system.

Supervised Learning by Training on Aggregate Outputs
Janara Christensen, Carleton College

Supervised learning is a classic data mining problem where one wishes to be able to predict an output value associated with a particular input vector. We present a new twist on this classic problem where, instead of having the training set contain an individual output value for each input vector, the output values in the training set are only given in aggregate over a number of input vectors. This new problem arose from a particular need in learning on mass spectrometry data, but could easily apply to situations in which data has been aggregated in order to maintain privacy. We provide a formal description of this new problem for both classification and regression. We then examine how k-nearest neighbor, neural networks, and support vector machines can be adapted for this problem.

Disparate Data Fusion for Protein Phosphorylation Prediction
Genetha Gray, Sandia National Labs

New challenges in knowledge extraction include interpreting and classifying data sets while simultaneously considering related information to confirm results or identify false positives. We discuss a data fusion algorithmic framework targeted at this problem. It includes separate base classifiers for each data type and a fusion method for combining the individual classifiers. The fusion method is an extension of current ensemble classification techniques and has the advantage of allowing data to remain in heterogeneous databases. In this poster, we focus on the applicability of such a framework to the protein phosphorylation prediction problem and show some numerical results.
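For the aggregate-outputs poster above, the setting can be made concrete with a toy least-squares baseline: if the model is linear, a group's predicted aggregate is just the weight vector applied to the group's summed features. The synthetic data and the closed-form fit below are illustrative assumptions, not the authors' adaptations of k-NN, neural networks, or SVMs.

import numpy as np

# Aggregate-output regression with a linear model: if y_group is the sum of the
# unseen individual outputs and the model is linear, then the prediction for a
# group is w . (sum of the group's feature vectors). Fitting w by least squares
# on the summed features is one simple baseline; the data here is synthetic.

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])

groups = []  # each group: (matrix of individual inputs, aggregate output)
for _ in range(50):
    X = rng.normal(size=(rng.integers(2, 6), 2))
    y_individual = X @ true_w
    groups.append((X, y_individual.sum()))

A = np.vstack([X.sum(axis=0) for X, _ in groups])  # summed features per group
b = np.array([y for _, y in groups])
w_hat, *_ = np.linalg.lstsq(A, b, rcond=None)

print("recovered weights:", w_hat)  # close to true_w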
• 19. Real Boosting a la Carte with an Application to Boosting Oblique Decision Trees
Claudia Henry, Université des Antilles et de la Guyane

In the past ten years, boosting has become a major field of machine learning and classification. We bring contributions to its theory and algorithms. We first unify a well-known top-down decision tree induction algorithm due to Kearns and Mansour, and discrete AdaBoost, as two versions of the same higher-level boosting algorithm. It may be used as the basic building block to devise simple provable boosting algorithms for complex classifiers. We provide one example: the first boosting algorithm for Oblique Decision Trees, an algorithm which turns out to be simpler, faster and significantly more accurate than previous approaches.

Multimodal Integration for Multiparty Dialogue Understanding: A Machine Learning Framework
Pei-Yun Sabrina Hsueh, University of Edinburgh

Recent advances in recording and storage technologies have led to huge archives of multimedia conversational speech recordings in widely ranging areas, such as clinical use, online sharing services, and meeting analysis. While it is straightforward to replay such recordings, finding information in the often lengthy archives has become more difficult. It is therefore essential to provide sufficient aids to guide users through the recordings and to point out the most important events that need their attention. In particular, my research concerns how to infer human communicative intention from low-level audio and video signals. I focus on identifying multimodal integration patterns (e.g., people tend to speak more firmly and address the whole group more often when they are making decisions) in human conversations, using approaches ranging from statistical analysis and empirical study to machine learning. Past research has shown that the identified multimodal integration patterns are useful for recognizing local speaker intention in recorded speech, such as speech disfluency (e.g., false starts). My research attempts to recover speaker intentions that serve a more global communicative goal, such as "initiate-discussion" and "reach-decision." A learning framework that can identify characteristic features of different semantic classes has been developed. This framework has proven to be useful for automatic topic segmentation (and labeling) and automatic decision detection. The ultimate goal of this research is to enhance the current browsing and search utilities of multimedia archives.
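For readers unfamiliar with the discrete AdaBoost building block named in the boosting abstract above, here is a bare-bones version with one-dimensional threshold stumps (the standard textbook algorithm, not the oblique-decision-tree booster the abstract describes); the toy data is made up.

import numpy as np

# Bare-bones discrete AdaBoost with threshold stumps on 1-D data (labels +/-1).
# This is the standard algorithm the abstract builds on, not the oblique
# decision tree booster itself. The data below is a toy example.

def best_stump(x, y, w):
    # Pick the threshold/polarity minimizing the weighted error.
    best = None
    for t in np.unique(x):
        for sign in (1, -1):
            pred = np.where(x >= t, sign, -sign)
            err = np.sum(w[pred != y])
            if best is None or err < best[0]:
                best = (err, t, sign)
    return best

def adaboost(x, y, rounds=10):
    w = np.ones(len(x)) / len(x)
    model = []
    for _ in range(rounds):
        err, t, sign = best_stump(x, y, w)
        err = max(err, 1e-12)                    # guard against zero error
        alpha = 0.5 * np.log((1 - err) / err)    # stump weight
        pred = np.where(x >= t, sign, -sign)
        w *= np.exp(-alpha * y * pred)           # up-weight mistakes
        w /= w.sum()
        model.append((alpha, t, sign))
    return model

def predict(model, x):
    score = sum(a * np.where(x >= t, s, -s) for a, t, s in model)
    return np.sign(score)

x = np.array([0.1, 0.3, 0.4, 0.6, 0.8, 0.9])
y = np.array([-1, -1, -1, 1, 1, 1])
print(predict(adaboost(x, y), x))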
• 20. A POMDP for Automatic Software Customization
Bowen Hui, University of Toronto

Providing personalized software for individuals has the potential to increase work productivity and user satisfaction. In order to accommodate a wide variety of user needs, skills, and preferences, today's software is typically packed with functionality suitable for everyone. As a result, the interface is complicated, functionalities are unexplored, and hence unused, and users are dissatisfied with the product. Many attempts in the user-adaptive systems literature have explored ways to customize software according to inferred user needs. Recent probabilistic approaches model the uncertainty in the application domain and typically optimize single objective functions, i.e., helping the user complete a task faster or interact with the interface more easily, but not both. A few exceptions exist that provide a principled treatment of modeling the uncertainty and the tradeoffs needed to satisfy multiple objectives. Nevertheless, existing work has done little to address three important issues:
* the interaction principles that govern the nature of the problem's objective functions
* the hidden user variables that explain observed preferences and behaviour
* the value of information available in the repeated, sequential nature of the interaction between the user and the system
We are interested in designing a software agent that assists the user by adapting the interface and suggesting task-completion help. In particular, the sequential nature of human-computer interaction (HCI) naturally lends itself to a partially observable Markov decision process (POMDP). We propose to develop a customization POMDP that learns the type of user it is dealing with and adapts its behaviour in order to maximize expected rewards formulated by the interaction principles for that specific user. Overall, modeling the automatic customization problem as a POMDP enables the system to take optimal actions with respect to the value of information gained by an exploratory action and the immediate rewards obtained by exploitation. This approach provides a decision-theoretic treatment of balancing the opportunities to learn about the user against exploiting what the system already knows about the user. This work pools together techniques and insights from artificial intelligence and machine learning to construct and solve the POMDP. Specifically, we adopt methods from the Bayesian user modeling literature to construct a generic user model, the activity recognition literature to build a goal model of user activities, the HCI literature to formulate the reward model specifying user objectives, the preference elicitation literature to learn the user's utility function for adaptive systems, and the machine learning literature to populate model parameters with incomplete data and to do approximate inference. In addition to the development of the novel user model and reward model, a major contribution here is
• 21. demonstrating that the customization POMDP is able to model real-world applications tractably and is able to adapt to different types of users quickly.

Using Probabilistic Graphical Models in Bio-Surveillance Research
Masoumeh Izadi, McGill University

Artificial intelligence methods can support and assist the optimal use of clinical and administrative knowledge in diverse settings, from diagnostic assistance and detection of epidemics to improved efficiency of health care delivery processes. Probabilistic graphical models have been successfully used for many medical problems. We describe a decision support system in public health bio-surveillance research. A long line of research has shown that current outbreak detection methods are ineffective; they both raise false alarms and miss attacks. Our approach tries to bring us closer to an effective detection system that detects real attacks and only those. I show how Partially Observable Markov Decision Processes (POMDPs) can be applied to outbreak detection methods to improve the alarm function in the case of anthrax. Our results show that this method significantly outperforms existing solutions, in terms of both sensitivity and timeliness.

Incorporating a New Relational Feature in Online Handwritten Character Recognition
Sara Izadi, Concordia University

Artificial neural networks have shown good capabilities in performing classification tasks. However, classifier models used for learning in pattern classification are challenged when the differences between the patterns of the training set are small. Therefore, the choice of effective features is mandatory for reaching good performance. Statistical and geometrical features alone are not suitable for recognition of hand-printed characters due to variations in writing styles that may result in deformations of character shapes. We address this problem by using a relational context feature combined with a local descriptor for training a neural network-based recognition system in a user-independent online character recognition application. Our feature extraction approach provides a rich representation of the global shape characteristics in a considerably compact form. This new relational feature generally provides higher distinctiveness and robustness to character deformations, thus potentially increasing the recognition rate in a user-independent system. While enhancing recognition accuracy, the feature extraction is computationally simple. We show that the ability to discriminate between handwritten characters is increased by adopting this mechanism, which provides input to a feed-forward neural network architecture. Our experiments on Arabic character recognition show results comparable with state-of-the-art methods for online recognition of these characters.
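Both POMDP-based abstracts above (software customization and outbreak detection) rest on tracking a belief state over hidden variables. As a generic illustration, and not either author's model, the standard discrete belief update b'(s') ∝ O(o | s', a) Σ_s T(s' | s, a) b(s) can be sketched as follows; the two-state toy model and its probabilities are invented.

# Generic discrete POMDP belief update: b'(s') ∝ O(o | s', a) * Σ_s T(s' | s, a) b(s).
# The two-state model (novice/expert user) and its probabilities are toy numbers,
# not taken from either poster.

def belief_update(belief, action, observation, T, O):
    states = list(belief)
    new_belief = {}
    for s2 in states:
        predicted = sum(T[(s, action)][s2] * belief[s] for s in states)
        new_belief[s2] = O[(s2, action)][observation] * predicted
    total = sum(new_belief.values())
    return {s: p / total for s, p in new_belief.items()}

T = {("novice", "observe"): {"novice": 0.9, "expert": 0.1},
     ("expert", "observe"): {"novice": 0.05, "expert": 0.95}}
O = {("novice", "observe"): {"slow_click": 0.8, "fast_click": 0.2},
     ("expert", "observe"): {"slow_click": 0.3, "fast_click": 0.7}}

belief = {"novice": 0.5, "expert": 0.5}
belief = belief_update(belief, "observe", "fast_click", T, O)
print(belief)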
• 22. Description Length and the Multiple Motif Problem
Anna Ritz, Brown University

Protein interactions drive many biological functions in the cell. A source protein can interact with several proteins; the specificity of this interaction is partly determined by the sequence around the binding site. In the 20-letter alphabet of protein sequences (denoting the 20 amino acids), a motif is a pattern that describes these binding preferences for a given protein. The motif-finding problem is to extract a motif from a set of sequences that interact with a given protein. The problem is solved by identifying statistically enriched patterns in this foreground set compared to a background set of non-interacting sequences. Finding such patterns is well studied in computational biology. Recent advances in technology require us to rethink the approach to the motif-finding problem. Mass spectrometry, for example, allows high-throughput measurements of multiple proteins interacting simultaneously. This creates a foreground set that is a mixture of motifs. The Multiple Motif problem is described as follows: find a collection of motifs, called a motif model, that best describes the foreground. The motif model is empty if the background distributions describe the foreground better than any set of patterns. A few algorithms to find multiple motifs exist, but they use either overly simplistic or overly descriptive motif representations. Overly simplistic motifs provide limited information about the structure of the data, while overly descriptive motifs use many parameters that require unrealistically large datasets. We use a representation between these extremes: some positions in a motif are exact, while others are restricted to a few letters. When comparing motif models, we want to know which model best describes the foreground. We use description length as a metric. Our goal is to learn the motif model that produces the most compact representation of the foreground by minimizing description length. Using minimum description length in this context circumvents some of the limitations of other representations. Each motif in the model must contribute to describing the foreground as concisely as possible, avoiding both redundancy and overfitting. Description length also gives a criterion for merging multiple exact motifs into a single, inexact motif, a task that is often ambiguous in other algorithms. We describe the use of minimum description length to filter the results of known algorithms and to discover novel motifs in synthetic and real datasets. This is joint work with Benjamin Raphael and Gregory Shakhnarovich at Brown University.
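As a toy illustration of comparing motif models by description length (and not the authors' actual encoding), the sketch below scores a foreground set of fixed-length peptides under a uniform 20-letter background versus under a single motif whose constrained positions allow only a few letters; the per-position model cost and the sequences are simplifying assumptions.

import math

# Toy description-length comparison (not the authors' encoding): score the
# foreground under a uniform 20-letter background versus under one motif.
# Non-matching sequences fall back to the background; the model cost is a
# crude 20-bit mask per position. All costs here are simplifying assumptions.

ALPHABET = 20

def background_bits(seqs):
    return sum(len(s) * math.log2(ALPHABET) for s in seqs)

def motif_bits(seqs, motif):
    # motif: one set of allowed letters per position.
    model_cost = ALPHABET * len(motif)
    data_cost = 0.0
    for s in seqs:
        if all(s[i] in allowed for i, allowed in enumerate(motif)):
            data_cost += sum(math.log2(len(allowed)) for allowed in motif)
        else:
            data_cost += len(s) * math.log2(ALPHABET)
    return model_cost + data_cost

seqs = ["RKYAS", "RKWAS", "RKYAT", "GLPQN"]
motif = [{"R"}, {"K"}, {"Y", "W"}, {"A"}, {"S", "T"}]
print(background_bits(seqs), motif_bits(seqs, motif))
# On a set this small the model cost dominates, echoing the abstract's point
# that overly descriptive motifs need unrealistically large datasets.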
• 23. Machine Translation with Self Organized Maps
Aparna Subramanian, University of Maryland, Baltimore County

I am investigating the idea of using Self-Organizing Maps for the purposes of Machine Translation. Human translators seem to translate based on their knowledge of which words/phrases of one language best represent the translation of a word/phrase in another. While choosing these word/phrase equivalents, they rely on similarity in the underlying concept to which the two words/phrases in different languages correspond. This gives a good reason for a machine translation system to do something similar, i.e., to translate at a conceptual level. Conceptual relativism of languages indicates a good source for parameterizing concepts for the purpose of translation. Self-Organizing Maps (SOMs) can be used to formalize such concept categories and improve them by learning over time. Contextual information can also be captured in SOMs and used for translation. The major challenges in the practical application of SOMs to problems such as translation, which require large vectors of concepts to be stored and processed, are speed and space. This can be addressed in at least the following two ways: SOMs stored and processed as a hierarchy of concepts, and SOMs maintained as different modules, each catering to a group of similar concepts. I plan to further investigate the feasibility of these methods. One approach for translation, therefore, is to average over the contextual relevance of the given piece (e.g., a sentence) over the whole conversation or text in the source language under consideration. This can be done using a SOM for contexts that learns with every input sentence in the text. The mapping of the input sentence in the SOM can then be used as input to the Word Category Map of the source language. The output(s) of this exercise can be the input to the target language Word Category Map. The words/phrases that are the outcome of this step can be organized into a sentence using the context SOM for the target language, aided by knowledge of the grammar of the target language. The investigation is in its initial stages, though the idea appears promising because this kind of translation system has the capacity to evolve through learning and takes care of the pragmatics of the input. The approach also seems viable since there have been attempts in the past to use SOMs for Natural Language Processing in general. The present work will be significant, as the use of Self-Organizing Maps for Machine Translation does not appear to have been explored, though it has been indicated as a possibility in previous work.

Policy Recognition for Multi-Player Tactical Scenarios
Gita Sukthankar, University of Central Florida

This research addresses the problem of recognizing policies given logs of battle scenarios from multi-player games. The ability to identify individual and team policies from observations is important for a wide range of applications including
• 24. automated commentary generation, game coaching, and opponent modeling. We define a policy as a preference model over possible actions based on the game state, and a team policy as a collection of individual policies along with an assignment of players to policies. Given a sequence of input observations, O (including observable game state and player actions), a set of player policies, P, and team policies, T, the goal is to identify the individual policies p that were employed during the scenario. A team policy is an allocation of players to tactical roles and is typically arranged prior to the scenario as a locker-room agreement. However, circumstances during the battle (such as the elimination of a teammate or unexpected enemy reinforcements) can frequently force players to take actions that were a priori lower in their individual preference model. In particular, one difference between policy recognition in a tactical battle and typical plan recognition is that agents rarely have the luxury of performing a pre-planned series of actions in the face of enemy threat. This means that methods that rely on temporal structure, such as Dynamic Bayesian Networks (DBNs) and Hidden Markov Models, are not necessarily well suited to this task. An additional challenge is that, over the course of a single scenario, one only observes a small fraction of the possible game states, which makes policy learning difficult. This research explores a model-based system for combining evidence from observed events using the Dempster-Shafer theory of evidential reasoning. The primary benefit of this approach is that the model generalizes easily to different initial starting states (scenario goals, agent capabilities, number and composition of the team). Unlike traditional probability theory, where evidence is associated with mutually exclusive outcomes, the Dempster-Shafer theory quantifies belief over sets of events. We evaluate our Dempster-Shafer based approach on logs of real and simulated games played using Open Gaming Foundation d20, the rule system used by many popular tabletop games, including Dungeons and Dragons, and we computed the average accuracy over the set of battles for each of three rules of combination.

Advice-based Transfer in Reinforcement Learning
Lisa Torrey, University of Wisconsin

This report is an overview of our work on transfer in reinforcement learning using advice-taking mechanisms. The goal in transfer learning is to speed up learning in a target task by transferring knowledge from a related, previously learned source task. Our methods are designed to do so robustly, so that positive transfer will speed up learning but negative transfer will not slow it down. They are also designed to allow human teachers to provide simple guidance that increases the benefit of transferred knowledge. These methods allow us to push the boundaries of current work in this area and perform transfer between complex and dissimilar tasks in the challenging RoboCup simulated soccer domain.
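For the policy-recognition abstract above, the core operation is Dempster's rule of combination over mass functions defined on subsets of a frame of discernment. A minimal sketch follows; the candidate policies and the mass values are made up for illustration and are not from the paper.

from itertools import product

# Dempster's rule of combination for two mass functions over subsets of a small
# frame of discernment (here, candidate policies). The frame and the masses are
# invented numbers; the abstract's system combines evidence from observed events.

def combine(m1, m2):
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb
    # Normalize by the non-conflicting mass (1 - K).
    return {s: w / (1.0 - conflict) for s, w in combined.items()}

FLANK, DEFEND, RETREAT = "flank", "defend", "retreat"
m1 = {frozenset({FLANK}): 0.6, frozenset({FLANK, DEFEND}): 0.3,
      frozenset({FLANK, DEFEND, RETREAT}): 0.1}
m2 = {frozenset({DEFEND}): 0.5, frozenset({FLANK, DEFEND, RETREAT}): 0.5}

for subset, mass in combine(m1, m2).items():
    print(sorted(subset), round(mass, 3))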
• 25. Determining a Relationship Between Two Distinct Atmospheric Data Sets of Different Granularities
Emma Turetsky, Carleton College

Regression analysis is a classic data mining problem with many real-world applications. We present several methods of using data mining and statistical analysis to find a relationship between two different data sets: atmospheric particles (and their elemental constituents) and elemental carbon (EC). Specifically, we wish to determine which elements in the atmosphere cause elemental carbon, which is common in industrial zones and large cities and can normally be found in exhaust fumes and areas where there is visible carbon. To do this, we used machine learning regression algorithms, including SVM regression and Lasso regression, as well as regular linear regression. We've created several models that correlate specific elements with the amount of elemental carbon in the atmosphere.

Inferring causal relationships between genes from steady state observations and topological ordering information
Xin Zhang, Arizona State University

The development of high-throughput genomic technologies, such as cDNA microarrays and oligonucleotide chips, empowers researchers to reveal gene interactions. Mathematical modeling and in-silico simulation can be used to analyze gene interactions unambiguously and to predict network dynamic behavior in a systematic way. Various network inference models have been developed to identify gene regulatory networks using gene expression data, but none of them address inferring causal relationships between genes, which is a very important issue in systems biology. Among the developed methods, the Inductive Causation (IC) algorithm has been proven to be effective for inferring causal relationships among variables. However, a simulation study in the context of gene regulatory networks shows that the IC algorithm, which uses only a single data source, results in low precision and recall rates. To improve the performance, we propose a joint learning scheme that integrates multiple data sources. We present a modified IC (mIC) algorithm that combines steady state data with partial prior knowledge of gene topological ordering information for jointly learning causal relationships among genes. We perform three sets of experiments on synthetic datasets for learning causal relationships between genes using the IC and the mIC algorithms. Each experiment contains 100 randomly generated Boolean networks (DAGs), each of which contains 10 genes connected by proper functions, with the gene topological ordering information. The distribution of the network is generated based on the probability distribution of the root genes and the proper functions. The Monte Carlo sampling method is used to generate 200 samples in a dataset for each network based on the probability distribution. We compare the simulation results from the mIC algorithm with the ones from the IC algorithm.
From the simulation-based evaluation we conclude that (i) the IC algorithm does not work well for learning gene regulatory networks from steady state data alone, (ii) a better way to learn gene causal relationships from steady state data is to use additional knowledge such as gene topological ordering, and (iii) the precision and recall rates of the mIC algorithm are significantly improved compared with the IC algorithm, with statistical confidence of 95%. For randomly generated networks, the mIC algorithm works well for jointly learning the causal regulatory network by combining steady state data and gene topological ordering knowledge, with a precision rate greater than 60% and a recall rate greater than 50%. We further apply the mIC algorithm to gene expression profiles used in the study of melanoma. 31 malignant melanoma samples were quantized to a ternary format such that the expression level of each gene is assigned to −1 (down-regulated), 0 (unchanged) or 1 (up-regulated). The 10 genes involved in this study were chosen from the 587 genes in the melanoma dataset. The results showed that some of the important causal relationships associated with the WNT5A gene were identified using the mIC algorithm, and those causal connections have been verified in the literature.
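The precision and recall figures quoted above are computed over recovered network edges; a small, generic evaluation helper (not the mIC implementation itself) makes the definitions concrete. The edge sets below are examples only.

# Generic edge-recovery evaluation of an inferred network against a ground-truth
# DAG: precision = correct recovered edges / all recovered edges,
# recall = correct recovered edges / all true edges. Edge sets are illustrative.

def precision_recall(true_edges, inferred_edges):
    true_edges, inferred_edges = set(true_edges), set(inferred_edges)
    correct = true_edges & inferred_edges
    precision = len(correct) / len(inferred_edges) if inferred_edges else 0.0
    recall = len(correct) / len(true_edges) if true_edges else 0.0
    return precision, recall

true_net = {("g1", "g2"), ("g2", "g3"), ("g1", "g4"), ("g4", "g5")}
inferred = {("g1", "g2"), ("g2", "g3"), ("g3", "g5")}
print(precision_recall(true_net, inferred))  # roughly (0.667, 0.5)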
  • 27. Workshop Organization Organizers: Hila Becker, Columbia University Bethany Leffler, Rutgers University Faculty Advisor: Lise Getoor, University of Maryland, College Park Reviewers: Hila Becker Finale Doshi Seyda Ertekin Katherine Heller Bethany Leffler Özgür Şimşek Jenn Wortmann
• 28. Thanks to our sponsors: the CRA Committee on the Status of Women in Computing Research and Princeton University.