Explorer
Activities
SLAM
A key component of the system is the ability to build a map of the
environment from sensor data. This is done using so-called
Simultaneous Localization And Mapping (SLAM). Most traditional systems
rely on a laser scanner as the main sensor for SLAM. The laser scanner is
expensive, and the information it provides about the environment is
limited to distances. In comparison, a camera offers much richer
information, including the appearance of the environment. However,
building a map from camera data requires significantly more processing,
and a number of challenging issues need to be addressed. In the explorer
scenario the main sensor is still the laser scanner, but a number of
alternative approaches are being investigated, using either a combination
of laser and vision or vision only.
Multi-level conceptual spatial representations
Most existing approaches to robot map building, or Simultaneous
Localization And Mapping (SLAM), use a metric representation of space.
Humans, though, have a more qualitative, topological perspective on
spatial organization (McNamara 1986). We adopt an approach in which we
build a multi-level representation of the environment, combining metrical
maps and topological graphs (as an abstraction over metrical
information), as in (Kuipers 2000). We extend these representations with
structural descriptions that capture aspects of spatial and functional
organization. The robot obtains these descriptions either through
interaction with a human, or through inference combining its own
observations ("I see a coffee machine") with ontological knowledge
("Coffee machines are usually found in kitchens, so this is likely to
be a kitchen!"). We store objects in the spatial representations,
and so associate the functionality of a location with the
functions of the objects present there. A schematic view of the
multi-level representation is given
here.
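As a rough illustration of the idea, the following Python sketch stores objects in the places of a topological graph and infers an area type either from a human-given label or from the objects observed there. It is a minimal sketch, not the explorer implementation: the class Place, the table TYPICAL_LOCATIONS, and the function infer_area_type are hypothetical names, and the toy ontology is invented for the example.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple

# Hypothetical toy ontology: the area type each object usually indicates.
TYPICAL_LOCATIONS = {
    "coffee machine": "kitchen",
    "oven": "kitchen",
    "desk": "office",
    "printer": "office",
}


@dataclass
class Place:
    """A node of the topological graph, abstracting a region of the metric map."""
    name: str
    metric_position: Tuple[float, float]  # (x, y) in the metric map frame
    objects: List[str] = field(default_factory=list)
    neighbours: Set[str] = field(default_factory=set)
    human_label: Optional[str] = None  # area type given by a human, if any


def connect(a: Place, b: Place) -> None:
    """Add an undirected edge between two places in the topological graph."""
    a.neighbours.add(b.name)
    b.neighbours.add(a.name)


def infer_area_type(place: Place) -> Optional[str]:
    """Prefer the human-given label; otherwise vote over the observed objects."""
    if place.human_label is not None:
        return place.human_label
    votes = Counter(
        TYPICAL_LOCATIONS[obj] for obj in place.objects if obj in TYPICAL_LOCATIONS
    )
    if not votes:
        return None  # nothing known about this place yet
    return votes.most_common(1)[0][0]
```

In this sketch a human-given label always overrides the inference, which mirrors the two sources of descriptions named above; a real system would more likely weigh both sources rather than let one win outright.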
Situated dialogue
A core characteristic of the explorer system is that each utterance is
analyzed to obtain a representation of the meaning it expresses, and of
how it (syntactically) conveys that meaning, rather than relying on, for
example, keyword spotting. This way, we can properly handle the variety of
ways in which people may express assertions, questions, and commands.
Furthermore, having a representation of the meaning of the utterance, we
can combine it with further inferences over ontologies to obtain a
complete conceptual description of the location or object being talked
about. This way we can ground situated dialogue in the situational
awareness of the robot.
Human augmented mapping
Following (Topp and Christensen 2005) we talk about Human-Augmented
Mapping (HAM) to indicate the active role that human-robot
interaction plays in the robot's acquisition of qualitative spatial
knowledge.
Existing dialogue-based approaches to HRI usually implement a
master/slave model of dialogue: the human speaks, the robot listens.
However, situations naturally arise in which the robot needs to take the
initiative, e.g. to clarify an issue with the human. This is one form of
mixed-initiative interaction, enabling a robot to recognize when help is
needed from a human, and to learn from this interaction (Bruemmer & Walton,
2003). One situation that may require clarification is when uncertainty
arises in automatic area classification: doors provide important
knowledge about spatial organization, but are difficult to recognize
robustly and reliably. Clarification dialogues can help to improve the
quality of the spatial representation the robot constructs, and to
increase the robot's robustness in dealing with uncertain information.
The basic idea is to allow for any modality to raise an issue. The image
below shows the timeline for an example with clarification dialogue.
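The idea that any modality can raise an issue might be sketched as follows in Python. This is only an illustration under stated assumptions: the Hypothesis class, the function names, and the confidence thresholds 0.3 and 0.7 are hypothetical and are not taken from the explorer system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hypothesis:
    """A claim about the environment raised by one modality."""
    modality: str      # e.g. "laser" or "vision"
    claim: str         # e.g. "there is a door here"
    confidence: float  # in [0, 1]


def clarification_question(h: Hypothesis,
                           low: float = 0.3,
                           high: float = 0.7) -> Optional[str]:
    """Return a question for the human if the hypothesis is too uncertain.

    The band (low, high) is an assumed tuning parameter: inside it the
    robot takes the initiative, outside it the robot decides on its own.
    """
    if low < h.confidence < high:
        return f"Is it true that {h.claim}?"
    return None


def resolve(h: Hypothesis, answer: Optional[bool]) -> bool:
    """Use the human's answer when given; otherwise fall back on confidence."""
    if answer is not None:
        return answer
    return h.confidence >= 0.5
```

For example, a door hypothesis from the laser with confidence 0.5 would produce the question "Is it true that there is a door here?", while one with confidence 0.95 would be accepted without asking.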
Situation awareness
Situation awareness (SA) can be paraphrased as knowing [the important
aspects of] what is going on around you, where importance is defined
in terms of the goals and decision tasks for [the current] job (Endsley
and Garland 2000). Endsley defines three levels of SA: perception,
comprehension, and projection. A smart robot should be able to perceive and
comprehend the situation and adapt its behavior accordingly.
An example is smart handling of doors: when the user approaches a door,
the robot can cause problems if it continues in normal following mode. If
the user intends to close an open door or open a closed door the robot
might end up in a situation where it blocks the user from, for example,
swinging open a closed door leaf. A smart robot should be aware of this
danger and take appropriate action.
Visual place recognition
Current research on vision-based localization systems faces several
issues, of which robustness and adaptability are probably the most
challenging. The system should be robust to many types of variations such
as changes in illumination conditions, people moving around, or objects
being used and moved. Moreover, the visual appearance of indoor
environments changes continuously in time. This poses serious problems
for recognition algorithms trained off-line on data acquired once and for
all during a fixed time span. At the same time, when used on a robot, the
system must run in real-time on hardware with limited processing and
memory resources. To cope with these variations, the system must adapt
over time and update its representation of the environment.
Object search and localization
Before specific objects in the environment can be used for navigation,
manipulation, or information-gathering tasks, they must first be found.
The system must be able both to plan its sensing, including movement
around the map, so that it performs efficiently and quickly, and to use
robust and effective recognition methods. All of this
needs to be backed by a good representation of the distribution of the
objects in space.
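One simple way to combine an object distribution with sensing cost is a greedy plan that visits places in order of the probability of finding the object per unit of travel distance. The Python sketch below illustrates this under stated assumptions: the function plan_search, the prior probabilities, and the place coordinates are hypothetical, and the greedy rule is a stand-in for whatever planner the system actually uses.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def plan_search(prior: Dict[str, float],
                positions: Dict[str, Point],
                start: Point) -> List[str]:
    """Greedily order places by probability of success per metre travelled.

    prior maps each place to the assumed probability that the object is
    there; positions maps each place to its (x, y) location on the map.
    """
    remaining = dict(prior)
    current = start
    order: List[str] = []
    while remaining:
        def utility(place: str) -> float:
            # Small constant avoids division by zero when already at the place.
            distance = math.dist(current, positions[place])
            return remaining[place] / (distance + 1e-6)

        best = max(remaining, key=utility)
        order.append(best)
        current = positions[best]
        del remaining[best]
    return order
```

With a prior of 0.7 for a kitchen 4 m away, 0.2 for an office 1 m away, and 0.1 for a corridor 10 m away along the same line, this rule visits the office first, since it is cheap to check, then the kitchen, then the corridor.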