[1]
|
M. Sridharan, R. Dearden, and J. Wyatt.
E-HiPPo: Extensions to Hierarchical POMDP-based Visual
Planning on a Robot.
In The 27th PlanSIG Workshop, December 11-12 2008.
[ bib |
.pdf ]
One major challenge to the widespread deployment of mobile robots
is the ability to autonomously tailor the sensory processing to the
task at hand. In our prior work [?], we proposed
an approach for such general-purpose processing of visual input in
an application domain where a robot and a human jointly converse
about and manipulate objects on a tabletop by processing the regions
of interest (ROIs) in input images. We posed the visual processing
management problem as a partially observable Markov decision process
(POMDP), and introduced a hierarchical decomposition to make it tractable
to plan with POMDPs. In this paper we analyze and eliminate some
of the limitations of the existing approach. First, in addition to
tackling visual actions that analyze the state of the world represented
by the image, we show how to incorporate actions that can change
the state. Second, we show how policy caching can be used to speed
up planning, and analyze the tradeoff between planning
speed and plan quality.
|
|
[2]
|
M. Sridharan, J. Wyatt, and R. Dearden.
HiPPo: Hierarchical POMDPs for Planning Information
Processing and Sensing Actions on a Robot.
In International Conference on Automated Planning and Scheduling
(ICAPS), September 14-18 2008.
[ bib |
.pdf ]
Flexible general purpose robots need to tailor their visual processing
to their task, on the fly. We propose a new approach to this within
a planning framework, where the goal is to plan a sequence of visual
operators to apply to the regions of interest (ROIs) in a scene.
We pose the visual processing problem as a Partially Observable Markov
Decision Process (POMDP). This requires probabilistic models of operator
effects to quantitatively capture the unreliability of the processing
actions, and thus reason precisely about trade-offs between plan
execution time and plan reliability. Since planning in
practical-sized POMDPs is intractable, we show how to ameliorate this intractability
somewhat for our domain by defining a hierarchical POMDP. We compare
the hierarchical POMDP approach with a Continual Planning (CP) approach.
On a real robot visual domain, we show empirically that all the planning
methods outperform naive application of all visual operators. The
key result is that the POMDP methods produce more robust plans than
either naive visual processing or the CP approach. In summary, we
believe that visual processing problems represent a challenging and
worthwhile domain for planning techniques, and that our hierarchical
POMDP-based approach to them opens up a promising new line of research.
|
|
[3]
|
Henrik Jacobsson, Nick Hawes, Geert-Jan Kruijff, and Jeremy Wyatt.
Crossmodal content binding in information-processing architectures.
In HRI '08: Proceedings of the 3rd ACM/IEEE International
Conference on Human Robot Interaction, pages 81-88, New York, NY, USA,
March 2008. ACM.
[ bib |
http ]
Operating in a physical context, an intelligent robot faces two fundamental
problems. First, it needs to combine information from its different
sensors to form a representation of the environment that is more
complete than any representation a single sensor could provide. Second,
it needs to combine high-level representations (such as those for
planning and dialogue) with sensory information, to ensure that the
interpretations of these symbolic representations are grounded in
the situated context. Previous approaches to this problem have used
techniques such as (low-level) information fusion, ontological reasoning,
and (high-level) concept learning. This paper presents a framework
in which these, and related approaches, can be used to form a shared
representation of the current state of the robot in relation to its
environment and other agents. Preliminary results from an implemented
system are presented to illustrate how the framework supports behaviours
commonly required of an intelligent robot.
|
|
[4]
|
Bastian Leibe, Aleš Leonardis, and Bernt Schiele.
Robust object detection with interleaved categorization and
segmentation.
Int. J. Comput. Vision, 77(1-3):259-289, 2008.
[ bib |
.pdf ]
This paper presents a novel method for detecting and localizing objects
of a visual category in cluttered real-world scenes. Our approach
considers object categorization and figure-ground segmentation as
two interleaved processes that closely collaborate towards a common
goal. As shown in our work, the tight coupling between those two
processes allows them to benefit from each other and improve the
combined performance. The core part of our approach is a highly flexible
learned representation for object shape that can combine the information
observed on different training examples in a probabilistic extension
of the Generalized Hough Transform. The resulting approach can detect
categorical objects in novel images and automatically infer a probabilistic
segmentation from the recognition result. This segmentation is then
in turn used to again improve recognition by allowing the system
to focus its efforts on object pixels and to discard misleading influences
from the background. Moreover, the information from where in the
image a hypothesis draws its support is employed in an MDL-based
hypothesis verification stage to resolve ambiguities between overlapping
hypotheses and factor out the effects of partial occlusion. An extensive
evaluation on several large data sets shows that the proposed system
is applicable to a range of different object categories, including
both rigid and articulated objects. In addition, its flexible representation
allows it to achieve competitive object detection performance already
from training sets that are between one and two orders of magnitude
smaller than those used in comparable systems.
Keywords: playmate
|
|
[5]
|
P. Lison and G.J.M. Kruijff.
Salience-driven contextual priming of speech recognition for
human-robot interaction.
In Proceedings of ECAI 2008, Athens, Greece, 2008.
[ bib |
.pdf ]
The paper presents an implemented model for priming speech recognition,
using contextual information about salient entities. The underlying
hypothesis is that, in human-robot interaction, speech recognition
performance can be improved by exploiting knowledge about the immediate
physical situation and the dialogue history. To this end, visual
salience (objects perceived in the physical scene) and linguistic
salience (objects, events already mentioned in the dialogue) are
integrated into a single cross-modal salience model. The model is
dynamically updated as the environment changes. It is used to establish
expectations about which words are most likely to be heard in the
given context. The update is realised by continuously adapting the
word-class probabilities specified in a statistical language model.
The paper discusses the motivations behind the approach, and presents
the implementation as part of a cognitive architecture for mobile
robots. Evaluation results on a test suite show a statistically significant
improvement in speech recognition performance (WER) with salience-driven priming over
a commercial baseline system.
|
|
[6]
|
D. Skocaj, M. Kristan, and A. Leonardis.
Continuous learning of simple visual concepts using incremental
kernel density estimation.
In International Conference on Computer Vision Theory and
Applications, pages 598-604, Funchal, Madeira, Portugal, January 2008.
[ bib |
.pdf ]
In this paper we propose a method for continuous learning of simple
visual concepts. The method continuously associates words describing
observed scenes with automatically extracted visual features. Since
in our setting every sample is labelled with multiple concept labels,
and there are no negative examples, reconstructive representations
of the incoming data are used. The associated features are modelled
with kernel density probability distribution estimates, which are
built incrementally. The proposed approach is applied to the learning
of object properties and spatial relations.
|
|
[7]
|
Nick Hawes, Aaron Sloman, Jeremy Wyatt, Michael Zillich, Henrik Jacobsson,
Geert-Jan Kruijff, Michael Brenner, Gregor Berginc, and Danijel Skocaj.
Towards an integrated robot with multiple cognitive functions.
In Robert C. Holte and Adele Howe, editors, Proceedings of the
Twenty-Second AAAI Conference on Artificial Intelligence (AAAI 2007), pages
1548-1553, Vancouver, Canada, July 2007. AAAI Press.
[ bib |
.pdf ]
We present integration mechanisms for combining heterogeneous components
in a situated information processing system, illustrated by a cognitive
robot able to collaborate with a human and display some understanding
of its surroundings. These mechanisms include an architectural schema
that encourages parallel and incremental information processing,
and a method for binding information from distinct representations
that when faced with rapid change in the world can maintain a coherent,
though distributed, view of it. Provisional results are demonstrated
in a robot combining vision, manipulation, language, planning and
reasoning capabilities interacting with a human and manipulable objects.
|
|
[8]
|
M. Fritz, G.J.M. Kruijff, and B. Schiele.
Cross-modal learning of visual categories using different levels of
supervision.
In The 5th International Conference on Computer Vision Systems,
2007.
[ bib |
.pdf ]
Today's object categorization methods use either supervised or unsupervised
training methods. While supervised methods tend to produce more accurate
results, unsupervised methods are highly attractive due to their
potential to use far larger amounts of unlabeled training data. This paper
proposes a novel method that uses unsupervised training to obtain
visual groupings of objects and a cross-modal learning scheme to
overcome inherent limitations of purely unsupervised training. The
method uses a unified and scale-invariant object representation that
allows labeled as well as unlabeled information to be handled in a coherent
way. One of the potential settings is to learn object category models
from many unlabeled observations and a few dialogue interactions
that can be ambiguous or even erroneous. First experiments demonstrate
the ability of the system to learn meaningful generalizations across
objects already from a few dialogue interactions.
|
|
[9]
|
Michael Zillich.
Incremental Indexing for Parameter-Free Perceptual
Grouping.
In 31st Workshop of the Austrian Association for Pattern
Recognition, 2007.
[ bib |
.pdf ]
The detection of closed convex contours in edge segmented images quickly
leads to a large number of hypotheses. Typically two methods are
used to limit the combinatorial explosion inherent in such perceptual
grouping tasks: indexing and early thresholding of less salient hypotheses.
We show that the adoption of an incremental indexing scheme removes
the need for thresholds, leading to improved robustness. Furthermore,
incremental processing quite naturally leads to anytime processing.
|
|
[10]
|
Somboon Hongeng and Jeremy Wyatt.
Learning causality and intention in human actions.
In Proceedings of the 6th IEEE-RAS International Conference on
Humanoid Robots (Humanoids'06). IEEE, December 2006.
[ bib |
.pdf ]
Previous research has shown that human actions can be detected by
motion patterns. However, labeling motion patterns is not sufficient
in a cognitive system that requires reasoning about the agent's intentions,
and how the environmental context affects the way an action is performed.
In this paper, we develop a graphical model that captures how the
movements that realize the action vary depending on the situations,
and present statistical learning algorithms. Using object manipulation
tasks, we illustrate how a system infers the agent's goals from visual
observation and compare results with findings in psychological experiments.
|
|
[11]
|
Geert-Jan M. Kruijff, John D. Kelleher, and Nick Hawes.
Information fusion for visual reference resolution in dynamic
situated dialogue.
In Elisabeth Andre, Laila Dybkjaer, Wolfgang Minker, Heiko Neumann,
and Michael Weber, editors, Perception and Interactive Technologies:
International Tutorial and Research Workshop, PIT 2006, volume 4021 of
Lecture Notes in Computer Science, pages 117-128, Kloster Irsee, Germany,
June 2006. Springer Berlin / Heidelberg.
[ bib |
.pdf ]
Human-Robot Interaction (HRI) invariably involves dialogue about objects
in the environment in which the agents are situated. The paper focuses
on the issue of resolving discourse references to such visual objects.
The paper addresses the problem using strategies for intra-modal
fusion (identifying that different occurrences concern the same object),
and inter-modal fusion (relating object references across different
modalities). Core to these strategies are sensorimotor coordination
and ontology-based mediation between content in different modalities.
The approach has been fully implemented, and is illustrated with
several working examples.
|
|
[12]
|
Michael Brenner and Bernhard Nebel.
Continual planning and acting in dynamic multiagent environments.
Journal of Autonomous Agents and Multiagent Systems. To appear
(accepted for publication).
[ bib ]
In highly dynamic environments, e.g. multiagent systems, finding optimal
action plans is practically impossible since individual agents lack
important knowledge at planning time or this knowledge has become
obsolete when a plan is executed. It is often more practical in such
environments to enable agents to actively extend their knowledge
as part of their plans and then revise their decisions in light of
these updates. In this paper, we describe a new principled approach
to Continual Planning, i.e. the integration of Planning, Execution
and Monitoring. The algorithm deliberately postpones parts of the
planning process to later stages in an agent's plan-act-monitor cycle
and automatically determines when to switch back to refining or revising
a partly executed plan. To evaluate our (and others') Continual Planning
techniques we have developed a simulation environment where formal
MA Planning domains are not only used by planning agents but also
as the basis of the simulation model such that agents can not only
plan, but execute actions and perceive their environment. Our experiments
show that, using continual planning techniques, deliberate action
planning can be used efficiently even in complex multiagent environments.
|