Machine Learning

CS 445/545
Computer Science Department
Winter Quarter 2010


Time: Mondays and Wednesdays, 2:00-3:50pm

Location: Fourth Avenue Building (FAB), Room 40-07.

Instructor: Melanie Mitchell, FAB 120-24, (503) 725-2412

Office hours: Mondays and Wednesdays, 12:30-1:30pm, or by appointment.

Course Website:
http://www.cs.pdx.edu/~mm/MachineLearningWinter2010/index.html

Course description: This course provides a broad introduction to techniques for building computer systems that learn from experience. It provides both conceptual grounding and practical experience with several learning systems. The course provides grounding for advanced study in statistical learning methods, and for work with adaptive technologies used in speech and image processing, robotic planning and control, diagnostic systems, complex system modeling, and iterative optimization. Students will gain practical experience implementing and evaluating systems applied to pattern recognition, prediction, and optimization problems.

Prerequisites: Undergraduate-level courses in calculus, linear algebra, and probability and statistics. Facility in at least one high-level programming language.

Textbook: S. Marsland, Machine Learning: An Algorithmic Perspective

Homework: The class will have weekly or bi-weekly homework assignments, involving writing code for and/or experimenting with various machine learning methods. 


Late homework policy: Students must request and be granted an extension on any homework assignment before the assignment is due. Otherwise, 5% of the assignment grade will be subtracted for each day the homework is late.
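As an illustration of the arithmetic only, and not an official grading tool, the late penalty might be computed as in the Python sketch below. It assumes that the deduction is 5% of the earned assignment score per day, that any part of a day counts as a full day, and that a score cannot drop below zero. The policy above states none of these details, and the function name `late_adjusted_score` is hypothetical.

```python
import math


def late_adjusted_score(score, days_late, penalty_per_day=0.05):
    """Return an assignment score after the per-day late deduction.

    Assumed interpretation: each late day removes 5% of the earned score.
    """
    if days_late <= 0:
        return score
    # Assumption: any part of a day counts as a full day late.
    whole_days = math.ceil(days_late)
    # Assumption: the adjusted score is never allowed to go below zero.
    factor = max(0.0, 1.0 - penalty_per_day * whole_days)
    return score * factor


# Hypothetical example: an assignment scored at 90 and turned in 2 days late.
print(late_adjusted_score(90.0, 2))
```

Under these assumptions an extension granted before the due date corresponds to calling the function with `days_late=0`, which leaves the score unchanged.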

 

Exams: The class will have a take-home midterm exam and a take-home final exam. Both will be open-book. There will also be a number of short in-class quizzes. 

 

Grading: Homework 40%, Presentation 10%, In-Class Quizzes 15%, Midterm 15%, Final 20%.
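For students who want to estimate their standing, the weights above can be combined as a simple weighted sum. The Python sketch below is only an illustration: it assumes each component is expressed as a percentage from 0 to 100, the name `course_percentage` is hypothetical, and the example scores are made up. It does not address how a percentage maps to a letter grade, which is not specified here.

```python
# Weights taken from the grading breakdown above; they sum to 1.0.
WEIGHTS = {
    "homework": 0.40,
    "presentation": 0.10,
    "quizzes": 0.15,
    "midterm": 0.15,
    "final": 0.20,
}


def course_percentage(scores):
    """Combine per-component percentages (0-100) into a weighted course percentage."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical component scores, for illustration only.
example = {
    "homework": 90,
    "presentation": 80,
    "quizzes": 70,
    "midterm": 85,
    "final": 75,
}
print(round(course_percentage(example), 2))
```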

PSU Code of Academic Integrity:
Each student in this course is expected to abide by the Portland State University Code of Academic Integrity. Any work submitted by a student in this course for academic credit will be the student's own work. You are encouraged to study together and to discuss information and concepts covered in lecture and the sections with other students. Should copying occur, both the student who copied work from another student and the student who gave material to be copied will automatically receive a zero for the assignment. Penalties for violating this Code can also include failure of the course and University disciplinary action. During examinations, you must do your own work. Talking or discussion is not permitted during the examinations, nor may you compare papers, copy from others, or collaborate in any way. Any collaborative behavior during the examinations will result in failure of the exam, and may lead to failure of the course and University disciplinary action.

Schedule for student presentations

Syllabus (subject to change):

Date

Class Topic(s)

Homework and Reading

Mon. Jan. 4

Class introduction (slides)

Perceptrons (slides)

 

Reading: Textbook, Chapters 1-2

Homework, due Wed. Jan. 13, 2pm:

HW 1: Perceptron learning

Here is the data in gzipped tarball format.
 

Wed. Jan. 6

Neural Networks (slides)

Code and data for Tom Mitchell's neural network face-analysis system.

Instructions for compiling this system on Windows

...

Mon. Jan. 11

Neural Networks, continued

Evaluating and comparing models (slides)

Study sheet for quiz on Wednesday.

Reading:
Textbook, Chapter 3

Wed. Jan. 13

Quiz 1

Evaluating and comparing models, continued

slides, part 2
summary slides



Homework, due Mon. Jan. 25, 2pm:

HW 2: Neural networks and evaluating hypotheses

On-line statistics tables

Mon. Jan. 18

No Class

(Martin Luther King Day)

...

Wed. Jan. 20

Support Vector Machines (slides)

Student presentations:

Keith Wilson: Elsas et al., Fast learning of document ranking functions with the committee perceptron

Joshua Hughes: Dailey et al., EMPATH: A neural network that categorizes facial expressions

Reading: Textbook, Chapter 5

Mon. Jan. 25

Support Vector Machines, continued (slides)

Student presentations:

Spenser Barlow: Stuart et al., A neural network classifier for junk e-mail

Geoffrey Shauger: Mazurowski et al., Training neural network classifiers for medical decision making: The effects of imbalanced datasets on classification performance


Homework, due Mon. Feb. 1:
HW 3: Support Vector Machines

Here is the svm_light code.

Here is the optdigits data.

Here is an example Perl script.

Wed. Jan. 27

Quiz 2

Decision Trees (slides)

Student presentations:

Ryan Mitchell: Cusano et al., Image annotation using SVM

Nick Insalata: Schohn et al., Less is more: Active learning with support vector machines

Reading: Textbook, Chapter 6

Mon. Feb. 1

Decision Trees, continued

Student presentations:

Adam Jones: Min and Lee, Bankruptcy prediction using support vector machine with optimal choice of kernel function parameters

Homework, due Monday Feb. 8: Decision Trees

Download the C4.5 executable, manual pages, and data

Optional: Download the original source code

Wed. Feb. 3

Ensemble Learning (slides)

Student presentations:

Vasile Mihaescu: Lee et al., Effective Value of Decision Tree with KDD 99 Intrusion Detection Datasets

Christopher Eigner: Parkka et al., Activity classification using realistic data from wearable sensors

Reading: Textbook, Chapter 7

Mon. Feb. 8

Probability and Learning (slides)

Student presentations:

Yidou Wang: Howe et al., Boosted decision trees for word recognition in handwritten document retrieval

Optional reading:
R. Schapire et al., Boosting the Margin: A New Explanation for the Effectiveness of Voting Methods

L. Reyzin and R. Schapire, How Boosting the Margin Can Also Boost Classifier Complexity

Reading: Textbook, Chapter 8, Sections 1-2

Take-home midterm, due Monday Feb 15

Wed. Feb. 10

Probability and Learning

Student presentations:

David Gibbs: Pang et al., Thumbs up? Sentiment classification using machine learning techniques

Shreemoyee Sarkar: M. Hall, A Decision Tree-Based Attribute Weighting Filter for Naive Bayes

...

Mon. Feb. 15

Clustering (slides)

Student presentations:

Chad Versace: Tieu and Viola, Boosting Image Retrieval

Daniel Cristofani: R. Mukras et al., Information gain feature selection for ordinal text classification using probability redistribution

Reading: Textbook Chapter 9, Section 1

Homework, due Monday March 1: HW 5: Naive Bayes and Boosting

Handout on Binning Naive Bayes

Handout on Boosting Naive Bayes

Wed. Feb. 17

Gaussian Mixture Models and Expectation Maximization (slides)

Genetic Algorithms (slides)

Student presentations:

Allen Grimm: Hamerly and Elkan, Learning the k in k-means

Reading: Textbook Chapter 8, Sections 3-4.

Optional Reading: Nigam et al., Text classification from labeled and unlabeled documents using EM

Mon. Feb. 22

Class Cancelled

...

Wed. Feb. 24

Genetic Algorithms, continued (slides)

Student presentations:

Ryan Price: Vallabha et al., Unsupervised learning of vowel categories from infant-directed speech

Ryan Ledbetter: Erman et al., Traffic classification using clustering algorithms

Russell Nakamura: Blanzieri and Melgani, Nearest neighbor classification of remote sensing images with the maximum margin principle

Pavana Anur: Julien and Saitta, Image databases browsing by unsupervised learning

Reading: Textbook Chapter 12.

Mon. Mar. 1

Genetic Algorithms for Feature Selection

Coevolutionary Learning (slides)

Student presentations:

Jeffrey Brock: Raina et al., Self-taught learning: Transfer learning from unlabeled data

Marek Dolgos: Liu and Yu, Toward integrating feature selection algorithms for classification and clustering

Optional Reading: Textbook Chapter 10

Homework, due Monday March 15: Genetic Algorithms for Feature Selection

Here is the gzipped tarball for the Simple GA in C.

Here is the spambase data.

Extra section (optional for undergraduates) on Reinforcement Learning will be assigned next time.

Wed. Mar. 3

Guest Lecturer: Bart Massey

Mon. Mar. 8

Quiz 3

Reinforcement Learning (slides)

Reading: Textbook Chapter 13

Wed. Mar. 10

Reinforcement Learning, continued

Student presentations:

Joshua Hoak: W. Weimer et al., Automatically finding patches using genetic programming

Jason Akers: Branavan et al., Reinforcement learning for mapping instructions to actions

Danny Voils: Eiben et al., Reinforcement learning for online control of evolutionary algorithms

Take-home final exam assigned, due Friday March 19

Mon. Mar. 15

No class (finals week).

...

Wed. Mar. 17

No class (finals week)