HDM 2013 - The 1st International Workshop on High Dimensional Data Mining
Topics/Call for Papers
The 1st International Workshop on High Dimensional Data Mining (HDM)
In conjunction with the IEEE International Conference on Data Mining (IEEE ICDM 2013) in Dallas, Texas.
Description
Some 13 years ago, Stanford statistician D. Donoho predicted that the 21st century would be the century of data. "We can say with complete confidence that in the coming century, high-dimensional data analysis will be a very significant activity, and completely new methods of high-dimensional data analysis will be developed; we just don't know what they are yet." -- D. Donoho, 2000.
Indeed, unprecedented technological advances are producing increasingly high dimensional data sets in all areas of science, engineering and business. These include genomics and proteomics, biomedical imaging, signal processing, astrophysics, finance, web and market-basket analysis, among many others. The number of features in such data is often of the order of thousands or millions, far larger than the available sample size. This renders classical data analysis methods inadequate, questionable, or at best inefficient, and calls for new approaches.
Some of the manifestations of this curse of dimensionality are the following:
- High dimensional geometry defeats intuition rooted in low dimensional experience, so that data presentation and visualisation become particularly challenging.
- Distance concentration is the phenomenon, observed in high dimensional probability spaces, whereby the contrast between pairwise distances vanishes as the dimensionality increases; this makes distances effectively meaningless and affects all methods that rely on a notion of distance (a short numerical sketch follows this list).
- Spurious correlations and misleading estimates may result when trying to fit complex models whose effective dimensionality is too large compared to the number of data points available.
- The accumulation of noise may confound our ability to find low dimensional intrinsic structure hidden in the high dimensional data.
- The computational cost of processing high dimensional data is often prohibitive.
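As a minimal, illustrative sketch (not part of the call), the distance concentration effect mentioned above can be observed numerically: for i.i.d. uniform random points, the relative spread of pairwise Euclidean distances shrinks as the dimensionality d grows. The snippet below assumes only NumPy; the point counts and dimensions are arbitrary choices for illustration.

    # Illustrative sketch of distance concentration: the relative contrast
    # (max - min) / min over pairwise Euclidean distances shrinks as d grows.
    import numpy as np

    rng = np.random.default_rng(0)

    def relative_contrast(d, n=200):
        """(max - min) / min over all pairwise distances of n uniform points in [0, 1]^d."""
        X = rng.random((n, d))
        sq = (X ** 2).sum(axis=1)
        # Squared distances via ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y (clipped at 0 for safety)
        d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
        iu = np.triu_indices(n, k=1)   # each pair once, diagonal excluded
        dists = np.sqrt(d2[iu])
        return (dists.max() - dists.min()) / dists.min()

    for d in (2, 10, 100, 1000, 10000):
        print(f"d = {d:>5}: relative contrast = {relative_contrast(d):.3f}")

Running this shows the relative contrast dropping steadily with d, which is why distance-based methods degrade in very high dimensions.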
Topics
This workshop aims to promote new advances and research directions to address the curses, and to uncover and exploit the blessings of high dimensionality in data mining. Topics of interest range from theoretical foundations, to algorithms and implementation, to applications and empirical studies of mining high dimensional data, including (but not limited to) the following:
o Systematic studies of how the curse of dimensionality affects data mining methods
o New data mining techniques that exploit some properties of high dimensional data spaces
o Theoretical underpinnings of mining data whose dimensionality is larger than the sample size
o Stability and reliability analyses for data mining in high dimensions
o Adaptive and non-adaptive dimensionality reduction for noisy high dimensional data sets
o Methods of random projections, compressed sensing, and random matrix theory applied to high dimensional data mining
o Models of low intrinsic dimension, such as sparse representation, manifold models, latent structure models, and studies of their noise tolerance
o Classification, regression, clustering of high dimensional complex data sets
o Functional data mining
o Data presentation and visualisation methods for very high dimensional data sets
o Data mining applications to real problems in science, engineering or businesses where the data is high dimensional
Paper submission
High-quality original submissions are solicited for oral and poster presentation at the workshop. Papers must not exceed 8 pages and must follow the IEEE ICDM format requirements of the main conference. All submissions will be peer-reviewed, and all accepted workshop papers will be published in the proceedings by the IEEE Computer Society Press.
Important dates
Submission deadline: August 3, 2013.
Notifications to authors: September 24, 2013.
Workshop: December 8, 2013.
Programme committee
Adam Kowalczyk - Victoria Research Laboratory, NICTA, Australia
Arthur Zimek - LMU Munich, Germany
Barbara Hammer - Clausthal University of Technology, Germany
John A. Lee - Universite Catholique de Louvain, Belgium
Laurens van der Maaten - Delft University of Technology, The Netherlands
Mark Last - Ben-Gurion University of the Negev, Israel
Milos Radovanovic - University of Novi Sad, Serbia
Pierre Alquier - University College Dublin, Ireland
Robert J. Durrant - University of Birmingham, UK
Stephan Gunnemann - Carnegie Mellon University, USA
Yiming Ying - University of Exeter, UK
Workshop organiser
Ata Kaban
School of Computer Science, University of Birmingham, UK