[1]
|
Michael Stark, Philipp Lies, Michael Zillich, Jeremy Wyatt, and Bernt Schiele.
Functional object class detection based on learned affordance cues.
In 6th International Conference on Computer Vision Systems
(ICVS), May 2008.
Accepted.
[ bib |
.pdf ]
Current approaches to visual object class detection mainly focus on
the recognition of abstract object categories, such as cars, motorbikes,
mugs and bottles. Although these approaches have demonstrated impressive
performance in terms of recognition, their restriction to abstract
categories seems artificial and inadequate in the context of embodied,
cognitive agents. Here, distinguishing objects according to functional
aspects based on object affordances is vital for a meaningful human-machine
interaction. In this paper, we propose a complete system for the
detection of functional object classes, based on a representation
of visually distinct hints on object affordances (affordance cues).
It spans the complete cycle from tutor-driven acquisition of affordance
cues, through one-shot learning of corresponding object models, to the
detection of novel instances of functional object classes in real images.
|
|
[2]
|
M. Kristan, D. Skocaj, and A. Leonardis.
Incremental learning with Gaussian mixture models.
In Computer Vision Winter Workshop CVWW 2008, pages 25-32,
Moravske toplice, Slovenia, February 2008.
[ bib |
.pdf ]
In this paper we propose a new incremental estimation of Gaussian
mixture models which can be used for applications of online learning.
Our approach allows for adding new samples incrementally as well
as removing parts of the mixture by the process of unlearning. Low
complexity of the mixtures is maintained through a novel compression
algorithm. In contrast to existing approaches, our approach does
not require fine-tuning parameters for a specific application, does
not assume specific forms of the target distributions, and imposes no
temporal constraints on the observed data. The strength of
the proposed approach is demonstrated with an example of online estimation
of a complex distribution, an example of unlearning, and an example of
interactive learning of basic visual concepts.
|
|
[3]
|
S. Hongeng and J. Wyatt.
Learning Causality and Intentional Actions, pages 27-46.
LNAI: Towards Affordance-Based Robot Control. Springer, 2008.
[ bib |
.pdf ]
Previous research has shown that human actions can be detected by
motion patterns. However, labeling motion patterns is not sufficient
in a cognitive system that requires reasoning about the agent's intentions,
and how the environmental context affects the way an action is performed.
In this paper, we develop a graphical model that captures how the
movements that realize the action vary depending on the situations,
and present statistical learning algorithms. Using object manipulation
tasks, we illustrate how a system infers the agent's goals
from visual observation and compare results with findings in psychological
experiments.
|
|
[4]
|
M. Kristan, D. Skočaj, and A. Leonardis.
Online kernel density estimation for interactive learning.
Image and Vision Computing, 2008.
[ bib ]
In this paper we propose a Gaussian-kernel-based online kernel density
estimation which can be used for applications of online probability
density estimation and online learning. Our approach generates a
Gaussian mixture model of the observed data and allows online adaptation
from positive examples as well as from the negative examples. The
adaptation from the negative examples is realized by a novel concept
of unlearning in mixture models. Low complexity of the mixtures is
maintained through a novel compression algorithm. In contrast to
existing approaches, our approach does not require fine-tuning
parameters for a specific application, does not assume specific
forms of the target distributions, and imposes no temporal constraints
on the observed data. The strength of the proposed approach
is demonstrated with examples of online estimation of complex distributions,
an example of unlearning, and an example of interactive learning of basic
visual concepts.
|
|
[5]
|
D. Skocaj, M. Kristan, and A. Leonardis.
Continuous learning of simple visual concepts using incremental
kernel density estimation.
In International Conference on Computer Vision Theory and
Applications, pages 598-604, Funchal, Madeira, Portugal, January 2008.
[ bib |
.pdf ]
In this paper we propose a method for continuous learning of simple
visual concepts. The method continuously associates words describing
observed scenes with automatically extracted visual features. Since
in our setting every sample is labelled with multiple concept labels,
and there are no negative examples, reconstructive representations
of the incoming data are used. The associated features are modelled
with kernel density probability distribution estimates, which are
built incrementally. The proposed approach is applied to the learning
of object properties and spatial relations.
|
|
[6]
|
Michael Stark and Bernt Schiele.
How good are local features for classes of geometric objects.
In Eleventh IEEE International Conference on Computer Vision
(ICCV), October 2007.
Accepted.
[ bib |
.pdf ]
Recent work in object categorization often uses local image descriptors
such as SIFT to learn and detect object categories. As such descriptors
explicitly code local appearance they have shown impressive results
on objects with sufficient local appearance statistics. However,
many important object classes such as tools, cups and other man-made
artifacts seem to require features that capture the respective shape
and geometric layout of those object classes. Therefore this paper
compares, on a novel data collection of 10 geometric object classes,
various shape-based features with more appearance-based descriptors
such as SIFT. The analysis includes a direct comparison of feature
statistics as well as the results within standard recognition frameworks.
The results suggest that there are indeed differences between shape-based
and more appearance-based features but that those differences
do not always conform with what one might expect.
|
|
[7]
|
D. Skocaj, G. Berginc, B. Ridge, A. Štimec, M. Jogan, O. Vanek,
A. Leonardis, M. Hutter, and N. Hewes.
A system for continuous learning of visual concepts.
In International Conference on Computer Vision Systems ICVS
2007, Bielefeld, Germany, March 2007.
[ bib |
.pdf ]
We present an artificial cognitive system for learning visual concepts.
It comprises vision, communication and manipulation subsystems,
which provide visual input, enable verbal and non-verbal communication
with a tutor and allow interaction with a given scene. The main goal
is to learn associations between automatically extracted visual features
and words that describe the scene in an open-ended, continuous manner.
In particular, we address the problem of cross-modal learning of
visual properties and spatial relations. We introduce and analyse
several learning modes requiring different levels of tutor supervision.
|
|
[8]
|
D. Skocaj, B. Ridge, G. Berginc, and A. Leonardis.
A framework for continuous learning of simple visual concepts.
In Computer Vision Winter Workshop 2007, pages 99-105, St.
Lambrecht, Austria, February 2007.
[ bib |
.pdf ]
We present a continuous learning framework for learning simple visual
concepts and its implementation in an artificial cognitive system.
The main goal is to learn associations between automatically extracted
visual features and words that describe the scene in an open-ended,
continuous manner. In particular, we address the problem of cross-modal
learning of elementary visual properties and spatial relations; we
show that the same learning mechanism can be used for both types of
concepts. We introduce and analyse several learning modes requiring
different levels of tutor supervision, ranging from a completely
tutor-driven to a completely autonomous exploratory approach.
|
|
[9]
|
Somboon Hongeng and Jeremy Wyatt.
Learning causality and intention in human actions.
In Proceedings of the 6th IEEE-RAS International Conference on
Humanoid Robots (Humanoids'06). IEEE, December 2006.
[ bib |
.pdf ]
Previous research has shown that human actions can be detected by
motion patterns. However, labeling motion patterns is not sufficient
in a cognitive system that requires reasoning about the agent's intentions,
and how the environmental context affects the way an action is performed.
In this paper, we develop a graphical model that captures how the
movements that realize the action vary depending on the situations,
and present statistical learning algorithms. Using object manipulation
tasks, we illustrate how a system infers the agent's goals from visual
observation and compare results with findings in psychological experiments.
Keywords: cosy; irlab
|
|
[10]
|
Sanja Fidler, Danijel Skocaj, and Aleš Leonardis.
Combining reconstructive and discriminative subspace methods for
robust classification and regression by subsampling.
IEEE Transactions on Pattern Analysis and Machine Intelligence,
28(3):337-350, March 2006.
[ bib |
.pdf ]
Linear subspace methods that provide sufficient reconstruction of
the data, such as PCA, offer an efficient way of dealing with missing
pixels, outliers, and occlusions that often appear in visual
data. Discriminative methods such as LDA and CCA, on the other
hand, are better suited for classification and regression tasks
but are highly sensitive to corrupted data. We present a theoretical
framework for achieving the best of both types of methods: an approach
that combines the discrimination power of discriminative methods
with the reconstruction property of reconstructive methods, which
enables one to work on subsets of pixels in images and to efficiently
detect and reject outliers. The proposed approach is therefore
capable of robust classification/regression with a high breakdown
point. The theoretical results are demonstrated on several computer
vision tasks showing that the proposed approach significantly outperforms
the standard discriminative methods in the case of missing pixels
and images containing occlusions and outliers.
|
|
[11]
|
B. Leibe, A. Leonardis, and B. Schiele.
An implicit shape model for combined object categorization and
segmentation.
In M. Hebert, J. Ponce, C. Schmid, and A. Zisserman, editors,
Towards Category-Level Object Recognition, LNCS. Springer, 2006.
to appear.
[ bib ]
We present a method for object categorization in real-world scenes.
Following a common consensus in the field, we do not assume that
a figure-ground segmentation is available prior to recognition. However,
in contrast to most standard approaches for object class recognition,
our approach automatically segments the object as a result of the
categorization. This combination of recognition and segmentation
into one process is made possible by our use of an Implicit Shape
Model, which integrates both capabilities into a common probabilistic
framework. This model can be thought of as a non-parametric approach
which can easily handle configurations of large numbers of object
parts. In addition to the recognition and segmentation result, it
also generates a per-pixel confidence measure specifying the area
that supports a hypothesis and how much it can be trusted. We use
this confidence to derive a natural extension of the approach to
handle multiple objects in a scene and resolve ambiguities between
overlapping hypotheses with an MDL-based criterion. In addition,
we present an extensive evaluation of our method on a standard dataset
for car detection and compare its performance to existing methods
from the literature. Our results show that the proposed method outperforms
previously published methods while needing one order of magnitude
fewer training examples. Finally, we present results for articulated
objects, which show that the proposed method can categorize and segment
unfamiliar objects in different articulations and with widely varying
texture patterns, even under significant partial occlusion.