2011 - Sixth Summer School on Advanced Statistics and Data Mining
Topics/Call for Papers
This summer school, now in its sixth edition, is organized by the Artificial Intelligence Department of the Computer Science Faculty of the Univ. Politécnica de Madrid. It is an intensive course that aims to give attendees an introduction to the theoretical foundations and practical applications of some of the modern statistical analysis techniques currently in use. The summer school runs for 2 weeks and is divided into 17 courses. Each course has theoretical classes as well as practical classes in which each technique is put into practice with a computer program. Students may register for only those courses that interest them.
Academic interest. This course complements the background of students from a variety of disciplines with the theoretical and practical fundamentals of the modern techniques employed in the analysis and modelling of large data sets. Its academic interest is high since there are no specific university studies on these techniques.
Scientific interest. Scientists in most fields (engineering, life sciences, economics, etc.) are confronted with the problem of drawing conclusions from a set of experimental data. This course gives experimentalists sufficient resources to select the appropriate analysis technique and apply it to their specific problem.
Professional interest. Modern data analysis is widespread in industry, since it is needed in nearly all disciplines. It is also in high demand in the job market: a search on Monster.com as of March 2011 retrieves more than 96,000 offers for "data analysis" and more than 25,000 offers for "data mining", within the USA alone.
http://www.dia.fi.upm.es/index.php?page=presentati...
*List of courses and brief description*
Week 1 (July 4th - July 8th, 2011)
Course 1: Bayesian networks (15 h)
Bayesian networks basics. Inference in Bayesian networks.
Learning Bayesian networks from data.
Course 2: Statistical inference (15 h)
Introduction. Some basic statistical tests. Multiple testing.
Introduction to bootstrapping.
Course 3: Probabilistic modelling for evolutionary computation (15 h)
Evolutionary algorithms. Estimation of distribution algorithms.
EDAs for discrete, continuous and multi-objective optimization
problems. Real-world applications.
Course 4: Supervised pattern recognition (Classification) (15 h)
Introduction. Assessing the Performance of Supervised
Classification Algorithms. Classification techniques. Combining
Classifiers. Comparing Supervised Classification Algorithms.
Course 5: Multivariate data analysis (15 h)
Introduction. Data Examination. Principal component analysis
(PCA). Factor Analysis. Multidimensional Scaling (MDS).
Correspondence analysis. Multivariate Analysis of Variance
(MANOVA). Canonical correlation.
Course 6: Neural networks (15 h)
Introduction to the biological models. Nomenclature. Perceptron
networks. The Hebb rule. Foundations of multivariate
optimization. Numerical optimization.
Widrow-Hoff rule. Backpropagation algorithm.
Practical data modelling with neural networks.
Course 7: Feature Subset Selection (15 h)
Introduction. Redundancy and irrelevance. Filter approaches.
Wrapper methods. Embedded methods. Drawbacks and future directions.
Stability and consistency. Practical session with presentation.
Course 8: Regression (15 h)
Introduction. Simple Linear Regression Model. Measures of model
adequacy. Multiple Linear Regression. Regression Diagnostics and
model violations. Polynomial regression. Variable selection.
Indicator variables as regressors. Logistic regression.
Non-linear Regression.
Week 2 (July 11th - July 15th, 2011)
Course 9: Hot topics in intelligent data analysis (15 h)
Multi-label and multi-dimensional classification. Advanced
Bayesian classifiers. Data streams in a semi-supervised learning
context. Advanced Clustering. Spatial and circular point patterns.
Course 10: Machine learning in computer vision (15 h)
The scene understanding problem. Visual features for object
detection and classification. Common classification techniques in
computer vision. Object detection. Object classification.
Course 11: Hidden Markov Models (15 h)
Introduction. Discrete Hidden Markov Models. Basic algorithms
for Hidden Markov Models. Semi-continuous Hidden Markov Models.
Continuous Hidden Markov Models. Unit selection and clustering.
Speaker and Environment Adaptation for HMMs.
Other applications of HMMs.
Course 12: Time series analysis (15 h)
Introduction. Probability models for time series. Regression and
Fourier analysis. Forecasting and Data mining.
Course 13: Data mining: A practical perspective (15 h)
Introduction to Data Mining and Knowledge Discovery. Prediction
in data mining. Classification. Association studies. Data mining
in free-form texts: text mining.
Course 14: Unsupervised pattern recognition (clustering) (15 h)
Introduction. Prototype-based clustering. Density-based
clustering. Graph-based clustering. Cluster evaluation.
Miscellanea.
Course 15: Support vector machines and kernel methods (15 h)
Linear classifiers. Perceptrons. Linear SVMs. Non-linear SVMs.
Kernelization. Support Vector Regression. Related models.
SVM Learning algorithms. Kernel PCA. Kernel FDA. Kernel K-means.
Course 16: Practical statistical questions (15 h)
The basics. How do I collect the data? Experimental design.
Parameter estimation. Correlation. Hypothesis testing.
Sample size. Case studies from different fields.
Course 17: Statistics and machine learning with R (15 h)
The R environment. Data in R. Programming in R. Graphics in R.
Statistical Analysis with R. Practical sessions.
Last modified: 2011-03-29 13:35:04