There is a growing interest
in Machine Learning, in applying geometrical and topological tools to
high-dimensional data analysis and processing [
14,
3,
20]. Considering a finite set of
points in a high-dimensional space, the approaches developed in the field of
Topology Learning intend to learn, explore and exploit the topology of the
shapes (topological invariants such as the intrinsic dimension or the Betti
numbers), manifolds or not, from which these points are supposed to be
drawn. Applications likely to benefit from these topological characteristics
have been identified in the field of Exploratory Data Analysis [
13], Pattern
Recognition [
11], Process Control [
26], Semi-Supervised Learning [
5, 8],
Manifold Learning [
24,
8] and Clustering [
22].
Some works concerning
manifolds have been done by researchers from
Machine Learning. In the
framework of Manifold Learning [
8, chapter 5],
[
19,
27], non linear
dimension reduction techniques are used to represent or
visualize high-dimensional
data or as pre-processing step for supervised or
unsupervised learning tasks
[
12,
18,
25]. However, the final dimension of
the output data and the
topology of the space they live in, are constrained
a priori. In
Semi-Supervised Learning, it is intended to perform manifold
regularization by taking
into account the topology of the shapes using the
Laplacian of some proximity
graph of the data [
2,
5]. A similar approach
is also used in Spectral
Clustering [
4]. But in both cases, the choice of the
proximity graph greatly
impacts the results, making these methods sensitive
to noise [
21] or outliers,
and there is no objective way to choose a good graph.
In the field of
Computational Geometry, lots of efforts have been realized
in processing geometric
data in dimension 2 or 3. Using concepts such as epsilon-
samples [
1] and restricted Delaunay
triangulations [
17], efficient tools [
1,
15]
for estimating topological
and geometrical properties of shapes have been
developed. However, it is
necessary to assume that the unknown shape
is a smooth manifold and
that the sample is not too noisy to ensure the
correctness of the
reconstruction algorithms.
Topology Learning aims at
going beyond all these limits by providing
methods able to extract
geodesic paths or topological invariants from noisy
high-dimensional
out-of-control sampled shapes with theoretical guarantees
and algorithmic
tractability, and with the least a priori constraints
or knowledge.
It lies at the crossing of
Machine Learning, Computational Geometry
and Topology.
In the last few years, new
ideas have emerged to face the Topology Learning
problems from the algorithmic
and theoretical points of view. They lead
to new methods that are
based upon geometric and algebraic topology tools.
For example, the distance
function point of view [
9] allows to unify and
generalize what was
previously known about geometric inference [
23]. The
concept of topological
persistence [
16] proved very helpful for filtering out
noise in geometric signals
[
10] or visualizing sub-structures in 3D images [
22].
Recent attempts have also
been made to deal with manifold reconstruction
in high-dimension [
6], and
to mix statistical and topological tools, extending
Voronoï concepts to Bregman
divergence [
7], or defining generative models
based on simplicial
complexes [
3] combining the ideas of generative Principal
Manifolds [
24] and Witness
Complexes [
13]. The development of these newideas still face several
theoretical and algorithmic difficulties that need to be addressed.
[1] N. Amenta and M. Bern. Surface
reconstruction by voronoi filtering. Discrete
and Computational Geometry,
22(4), pp 481-504, 1999.
[2] A. Argyriou, M. Herbster,
and M. Pontil. Combining graph laplacians for semi-supervised
learning.
Proc. of NIPS 2005.
[3] M. Aupetit. Learning topology
with the generative gaussian graph and the EM algorithm.
Proc. of NIPS 2005
[4] M. Belkin and P. Niyogi.
Laplacian eigenmaps and spectral techniques for
embedding and clustering. Proc. of NIPS 2002
[5] M. Belkin, P. Niyogi,
and V. Sindhwani. Manifold regularization: a geometric
framework for learning from
labeled and unlabeled examples. Journal of
machine Learning Research 7,
pp 2399-2434, 2006.
[6] J.-D. Boissonnat, L. Guibas, and
S. Oudot. Manifold reconstruction in arbitrary dimensions using witness
complexes. In Proc. of Symp. on Computational Geometry (SoCG),
pages 194-203, 2007.
[7] J.-D. Boissonnat, F.
Nielsen, and R. Nock. On bregman voronoi diagrams. In
Proc. 18th ACM-SIAM Symp. on
Discrete Algorithms (SODA), 2007.
[8] O. Chapelle, B. Schölkopf,
and A. Zien, editors. Semi-Supervised Learning.
MIT Press, Cambridge, MA,
2006.
[9] F. Chazal, D.
Cohen-Steiner, and A. Lieutier. A sampling theory for compact
sets in euclidean spaces. To
appear in Discrete & Computational Geometry,
Springer,
2007.
[10] F. Chazal and A. Lieutier.
Topology guaranteeing manifold reconstruction using distance function to
noisy data. in proc. Symp. on Computational
Geometry
(SoCG'06),
2006.
[11] A. Collins, A. Zomorodian,
G. Carlsson, and L. Guibas. A barcode shape
descriptor for curve point
cloud data. Eurographics Symposium on Point-Based
Graphics M. Alexa, S. Rusinkiewicz,
(Editors), 2004.
[12] D. de Ridder, O. Kouropteva,
O. Okun, M. Pietikäinen, and R. P. W. Duin.
Supervised locally linear
embedding. In ICANN, pages 333-341,
2003.
[13] V. de Silva and G. Carlsson.
Topological estimation using witness complexes.
In M. Alexa and S. Rusinkiewicz
(Eds) Eurographics Symposium on Point-
Based Graphics, ETH, Zürich,Switzerland, June 2-4,
2004.
[14] V. de Silva and J. B. Tenenbaum.
Global versus local methods for nonlinear dimensionality reduction. Proc. of NIPS 2002
[15] H. Edelsbrunner, J. Harer,
V. Natarajan, and V. Pascucci. Morse-smale complexes
for piecewise linear
3-manifolds. Proc. 19th Ann. Symp. Comp. Geom.
(SOCG), 2003.
[16] H. Edelsbrunner, D. Letscher,
and A. Zomorodian. Topological persistence
and simplification. In IEEE Symp.
on Found. of Comp. Sci., pages 454-463, 2000.
[17] H. Edelsbrunner and N.
R. Shah. Triangulating topological spaces. International
Journal on Computational
Geometry and Applications, 7:365-378, 1997.
[18] A. Errity and J.
McKenna. An investigation of manifold learning for speech analysis. In Proc.
of the Int. Conf. on Spoken Language Processing (Interspeech
2006 - ICSLP), pages 2506-2509,
2006.
[19] A. M. Farahmand, C. Szepesvari, and
J-Y Audibert. Manifold adaptive dimension
estimation. In Proceedings
of the 24th ICML, 2007.
[20] G. Haro, G. Randall,
and G. Sapiro. Stratification learning: Detecting mixed
density and dimensionality
in high dimensional point clouds. Proc. of NIPS 2007.
[21] M. Hein and M. Maier. Manifold
denoising. Advances in Neural Information
Processing Systems 19,
2007.
[22] D. Laney, P.-T. Bremer,
A. Mascarenhas, P. Miller, and V. Pascucci. Understanding
the structure of the
turbulent mixing layer in hydrodynamic instabilities.
IEEE Trans. on Vis. and
Comp. Graph. Vol. 12, No. 5, pages 1053-1060, 2006.
[23] P. Niyogi, S. Smale,
and S. Weinberger. Finding the homology of submanifolds
with high confidence from
random samples. Discrete and Computational . Geometry, 2006.
[24] R. Tibshirani. Principal
curves revisited. Statistics and Computing,
(2):183-190, 1992.
[25] M. Vlachos, C. Domeniconi,
D. Gunopulos, G. Kollios, and N. Koudas. Nonlinear
dimensionality reduction
techniques for classification and visualization.
Proc of KDD
'02
[26] M. Zeller, R. Sharma,
and K. Schulten. Topology representing network for
sensor-based robot motion
planning. WCNN, INNS Press,
pages 100-103, 1996.
[27] Z. Zhang and H. Zha. Principal
manifolds and nonlinear dimensionality reduction
via tangent space
alignment. SIAM J. Sci.
Comp., 26:313-338, 2005.