An introductory practical course by Florent Krzakala and Antoine Baker, Ecole Doctorale EDPIF 2019

This is a basic introductory course in machine learning and statistical inference, with an emphasis on simple methods, theoretical understanding, and practical exercises. The course alternates between methodology with theoretical foundations and practical computational aspects, with exercises in Python using scikit-learn and PyTorch. The topics will be chosen from the following outline:

- Statistical theory: maximum likelihood, Bayes, VC bounds and uniform convergence
- Supervised learning: linear regression, ridge, lasso, high-dimensional data, kernel methods, boosting
- Deep learning: multi-layer nets, conv-nets, auto-encoders
- Unsupervised learning: mixture models, PCA & kernel PCA
- Basics of Generative models & Reinforcement learning
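As a small taste of the supervised-learning topics above, ridge regression has a closed-form solution that fits in a few lines of numpy. This is a minimal sketch on synthetic data (the variable names and the toy problem are ours, not from the course notebooks):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 3

# Synthetic regression data: y = X w_true + noise
X = rng.normal(size=(n, d))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=n)

lam = 0.1  # ridge regularization strength
# Closed-form ridge solution: w = (X^T X + lam I)^{-1} X^T y
w_hat = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
```

With enough samples and mild regularization, `w_hat` recovers `w_true` up to the noise level; the lasso and kernel variants discussed in the course trade this closed form for sparsity or nonlinearity.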

All exercises will be in Python. For installation, we recommend Anaconda (especially for Apple computers). More precisely, we shall use Python 3.7 with the following modules: numpy, scipy, matplotlib, pandas, h5py, datasets, scikit-learn and scikit-image. All of these can be installed within Anaconda or with pip.
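With Anaconda already installed, a setup along these lines should work (a sketch, not an official requirements file; the environment name is arbitrary):

```shell
# Create and activate a dedicated environment with Python 3.7
conda create -n edpif python=3.7
conda activate edpif

# Course modules available through conda
conda install numpy scipy matplotlib pandas h5py scikit-learn scikit-image

# Modules installed via pip
pip install datasets
```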

For deep learning, we shall use Keras (version >= 2.2.4) and TensorFlow (version >= 1.13.1). All the exercises will be given as Jupyter notebooks, so Jupyter should be installed as well. We will also use PyTorch later in the course.

**Nota Bene**: For Lecture 4, you will need JupyterLab and some additional packages; see the installation instructions.

- Lecture 1: Introduction to supervised machine learning: KNN, linear models, optimization. Some words on VC and Rademacher bounds [slides], [notebooks]
- Lecture 2: The kernel trick, kernel methods, and random features [notebooks]
- Lecture 3: Unsupervised learning with PCA and kernel PCA, ensemble methods (boosting and bagging) [slides], [notebooks]
- Lecture 4: How to work with data [notebooks] [installation requirements]
- Lecture 5: How to work with models [notebooks]
- Lecture 6: Introduction to neural networks and deep learning [slides], [notebooks]
- Lecture 7: Some special architectures: CNN, RNN and LSTM [notebooks]
- Lecture 8: Introduction to generative models, auto-encoders and reinforcement learning [notebooks]
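To illustrate the random-features idea from Lecture 2: random Fourier features approximate the RBF kernel with an explicit, finite feature map, so a linear method on the features mimics a kernel method. A minimal numpy sketch (our own toy example, not the course's notebook code):

```python
import numpy as np

rng = np.random.default_rng(1)
d, D = 5, 5000  # input dimension, number of random features

# Random Fourier features for the RBF kernel k(x, y) = exp(-||x - y||^2 / 2):
# draw frequencies from N(0, I) and phases uniformly in [0, 2*pi)
W = rng.normal(size=(D, d))
b = rng.uniform(0.0, 2.0 * np.pi, size=D)

def phi(x):
    """Feature map whose inner products approximate the RBF kernel."""
    return np.sqrt(2.0 / D) * np.cos(W @ x + b)

x, y = rng.normal(size=d), rng.normal(size=d)
approx = phi(x) @ phi(y)                     # random-feature estimate
exact = np.exp(-np.sum((x - y) ** 2) / 2.0)  # true RBF kernel value
```

The approximation error shrinks like `1/sqrt(D)`, which is why a few thousand features already track the exact kernel closely.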

Registration should be made with the “Ecole Doctorale EDPIF”: e-mail

Lectures will be on Mondays in April and May 2019, at the Ecole Normale Superieure in Paris, in the physics department, Rue Lhomond, on the third floor. The lectures will be in room L357 from 9:00 to 10:30, followed by practical exercises in rooms L363 and L378.

There will be eight sessions: April 1, 8, 15, 29 and May 6, 13, 20, 27.

- A good book for probability and statistics, accessible to students, is Larry A. Wasserman's All of Statistics.
- A good introduction to statistical learning is given in The Elements of Statistical Learning by Trevor Hastie, Robert Tibshirani, and Jerome H. Friedman.
- Another great reference is Machine Learning: A Probabilistic Perspective by Kevin P. Murphy.
- Deep learning is well covered in this new book: Dive into Deep Learning by A. Zhang, Z. Lipton, M. Li, and A.J. Smola.
- A recent, and excellent, reference book in French: Introduction au Machine Learning by Chloé-Agathe Azencott.
- A very nice review on machine learning for physics.
- An introduction to machine learning for physicists.

Would you like to try deep learning but don't have a GPU? Or would you rather not install software on your computer? A great solution is to use Google's Colaboratory platform: it requires no specific hardware or software!

You can find the CNN example from our own Lecture 7 here. Additionally, the examples of Lecture 8 are given here, here and here. Have fun modifying them!