Lab Exercises

Recommended reading from Statistics, Data Mining, and Machine Learning in Astronomy ("SDMML") and Numerical Recipes ("NR"):
 
SDMML: Chapter 6
Scikit-learn's Related projects page for a glimpse of some ML resources.


Exercise 1.

Classification and machine learning, continued from last time: The digits files in the directory /u/inscc/bromley/courses/ap7730/data/, train-images-idx3-ubyte [images] and train-labels.idx1-ubyte [labels], represent 60,000 handwritten digits (0-9). Using the code in ~bromley/courses/ap7730/examples/digitslab.py as a starting point (it is very near to the end point!), use sklearn's Multi-Layer Perceptron classifier (MLPClassifier) to distinguish 4's from 9's.

Plot a bunch of the images for which the classification failed.
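
As a rough sketch of the workflow (this is not the contents of digitslab.py), the example below trains an MLPClassifier on 4's and 9's and collects the test images it gets wrong. It assumes sklearn's bundled 8x8 digits (load_digits) as a stand-in for the 28x28 course files; the hidden-layer size, the train/test split, and the helper plot_failures are illustrative choices, not part of the assignment.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# ASSUMPTION: stand-in data (sklearn's 8x8 digits), not the course idx files.
digits = load_digits()
keep = np.isin(digits.target, [4, 9])      # keep only the 4's and 9's
X = digits.data[keep] / 16.0               # pixel values run 0..16; scale to 0..1
y = digits.target[keep]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# Illustrative network: one hidden layer of 32 units.
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0)
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)
accuracy = np.mean(y_pred == y_test)
failed = np.flatnonzero(y_pred != y_test)  # indices of misclassified test images
print(f"test accuracy: {accuracy:.3f}, misclassified: {failed.size}")


def plot_failures(images, true_labels, pred_labels, shape=(8, 8), ncols=5):
    """Hypothetical helper: show misclassified images with true/predicted labels."""
    import matplotlib.pyplot as plt
    n = len(images)
    if n == 0:
        print("no misclassified images to plot")
        return
    nrows = int(np.ceil(n / ncols))
    fig, axes = plt.subplots(nrows, ncols, figsize=(2 * ncols, 2 * nrows),
                             squeeze=False)
    for ax in axes.ravel():
        ax.axis("off")
    for ax, img, t, p in zip(axes.ravel(), images, true_labels, pred_labels):
        ax.imshow(img.reshape(shape), cmap="gray_r")
        ax.set_title(f"true {t}, pred {p}")
    plt.show()
```

Calling plot_failures(X_test[failed], y_test[failed], y_pred[failed]) displays the failures. For the course data the same steps should apply with shape=(28, 28) and pixel values scaled by 255 instead of 16.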

Exercise 2.

Generate n=5 samples from a Gaussian distribution, N(0.0, 1.0), and use kernel density estimation (KDE) to estimate the underlying probability density function. In a plot, compare the output of the Gaussian KDE with the true probability density function. Repeat with different n and with different bandwidths ("silverman", or a numerical value).
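
One possible starting point, assuming scipy.stats.gaussian_kde as the KDE implementation (the exercise does not prescribe one). The seed, the evaluation grid, the numerical bandwidths 0.3 and 1.0, and the helper plot_comparison are illustrative assumptions. Note that a numerical bw_method in gaussian_kde is a factor that multiplies the sample standard deviation, not an absolute kernel width.

```python
import numpy as np
from scipy.stats import gaussian_kde, norm

rng = np.random.default_rng(42)            # fixed seed, for repeatability only
n = 5
samples = rng.normal(loc=0.0, scale=1.0, size=n)

x = np.linspace(-6.0, 6.0, 601)            # evaluation grid
true_pdf = norm.pdf(x, loc=0.0, scale=1.0)

# One KDE per bandwidth choice: Silverman's rule plus two numerical factors.
estimates = {}
for bw in ("silverman", 0.3, 1.0):
    kde = gaussian_kde(samples, bw_method=bw)
    estimates[bw] = kde(x)


def plot_comparison(x, true_pdf, estimates, samples):
    """Hypothetical helper: overlay the KDE curves on the true density."""
    import matplotlib.pyplot as plt
    plt.plot(x, true_pdf, "k-", label="true N(0, 1)")
    for bw, est in estimates.items():
        plt.plot(x, est, label=f"KDE, bw={bw}")
    plt.plot(samples, np.zeros_like(samples), "k|", ms=20, label="samples")
    plt.xlabel("x")
    plt.ylabel("density")
    plt.legend()
    plt.show()
```

Calling plot_comparison(x, true_pdf, estimates, samples) draws the comparison. Changing n at the top and rerunning shows how quickly the estimate approaches the true curve.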

Exercise 3.

Create a 2-D map of n=200 points, clustered as in cluster2d.py in the class examples directory. Use the friends-of-friends algorithm to identify the high-density region. Try 2-D KDE to do the same.
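
A minimal sketch of both approaches, under stated assumptions: cluster2d.py is not reproduced here, so the point set below (150 uniform background points plus a 50-point Gaussian clump) is a hypothetical stand-in. Friends of friends is implemented as connected components of the graph that links every pair of points closer than a linking length; the value 0.025 and the 80th-percentile KDE threshold are illustrative choices to experiment with.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components
from scipy.spatial import cKDTree
from scipy.stats import gaussian_kde

# ASSUMPTION: stand-in for cluster2d.py, a uniform background plus one clump.
rng = np.random.default_rng(1)
center = np.array([0.5, 0.5])
background = rng.uniform(0.0, 1.0, size=(150, 2))
clump = rng.normal(loc=center, scale=0.03, size=(50, 2))
points = np.vstack([background, clump])    # n = 200 points in total


def friends_of_friends(points, linking_length):
    """Label each point with its group: points closer than linking_length
    are 'friends', and groups are the connected components of that graph."""
    n = len(points)
    pairs = cKDTree(points).query_pairs(linking_length, output_type="ndarray")
    adjacency = coo_matrix((np.ones(len(pairs)), (pairs[:, 0], pairs[:, 1])),
                           shape=(n, n))
    _, labels = connected_components(adjacency, directed=False)
    return labels


# Friends of friends: take the largest group as the high-density region.
labels = friends_of_friends(points, linking_length=0.025)
group_sizes = np.bincount(labels)
fof_members = points[labels == np.argmax(group_sizes)]

# 2-D KDE: gaussian_kde expects shape (n_dims, n_points), hence the transpose.
kde = gaussian_kde(points.T)
density = kde(points.T)                    # density estimate at each point
kde_members = points[density >= np.percentile(density, 80)]
densest_point = points[np.argmax(density)]

print(f"largest FoF group: {len(fof_members)} points, "
      f"centroid {fof_members.mean(axis=0)}")
print(f"densest point by KDE: {densest_point}")
```

The linking length controls what counts as a group: too small and the clump fragments, too large and the background percolates into one giant group. The KDE threshold plays a similar role on the density side.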