Machine Learning

CSE 410/510 TOP: Machine Learning
Winter Quarter 2006


Time: Tuesdays and Thursdays, 12:00-1:50pm
Location: Shattuck Hall (SH), Room 145.

Instructor: Melanie Mitchell, FAB 120-24, (503) 725-2412, mm-AT-cs.pdx.edu.
Office hours: Tuesdays and Thursdays, 2:00-3:00pm, or by appointment.

Teaching Assistant: Lanfranco Muzi, FAB 115-14, muzila-AT-cs.pdx.edu.
Office hours: Tuesdays and Thursdays, 3:45-4:45pm, or by appointment.

Course website: http://www.cs.pdx.edu/~mm/MachineLearningWinter2006/index.html

Prerequisites: Undergraduate-level courses in calculus, linear algebra, and probability and statistics. Facility in at least one high-level programming language.

Course objectives:

  1. Introduce students to several prominent areas of machine learning, including feature extraction, decision trees, neural networks, genetic algorithms, Bayesian learning, and reinforcement learning, and illustrate what types of problems the different methods are suited for. (An advanced machine learning course will be offered in the Spring quarter.)
  2. Give students hands-on experience with these methods, and with tools for implementing and using them on real-world problems.
  3. Give students experience with performing simulations and doing statistical data analysis of the results.
  4. Provide students with experience in reading research papers and giving presentations.

Textbook: T. M. Mitchell, Machine Learning, McGraw-Hill, 1997. The less costly paperback version of this book was not officially published in the U.S., but can be ordered either new or used via Amazon.com.

Reserve Readings: TBA

Assignments: There will be eight short computer-based homework assignments, each corresponding to a topic covered in the course. All assignments are due at the beginning of class on the date specified. Late assignments will be accepted only in exceptional circumstances, preferably with prior approval.

Presentations: Each student will be assigned one technical paper on a machine learning topic, and will give an in-class presentation (of approximately 15 minutes) on this paper.

Exams: There will be an in-class midterm exam and an in-class final exam.

Grading: Homework: 50%; Presentations: 10%; Midterm: 20%; Final exam: 20%.

Academic integrity: Students will be responsible for following the PSU Student Conduct Code, and in particular, the policy concerning academic honesty.

Collaboration policy: Students may discuss the general concepts and principles behind an assignment with other students. In fact, you are encouraged to do this whenever possible, because it is often a valuable way to reinforce ideas, and to learn new perspectives. However, in doing assignments, each student is expected to develop, write up, and hand in an individual solution and, in doing so, develop a sufficient understanding of the problem and solution so as to be able to explain it adequately to the instructor. Under no circumstances should a student copy or consult the solution of another student, or copy a solution from any other source, including the Internet.

Cheating will result in a grade of zero on the assignment or exam on which the student cheats and the initiation of disciplinary action at the university level.

Students with disabilities: If you are a student with a disability in need of academic accommodations, you should register with Disability Services for Students and notify the instructor immediately to arrange for support services.

Syllabus (subject to change):

Date | Topics | Homework and Reading

Tuesday, Jan. 10


Class overview;
Intro. to machine learning;
Feature extraction;
Decision trees I

Homework 1 ("Feature Extraction") assigned. Due Tuesday Jan. 17. Here are the spam examples you need to use for this assignment.

Short papers assigned:

S. Salzberg, Locating Protein Coding Regions in Human DNA using a Decision Tree Algorithm. Journal of Computational Biology, vol. 2, no. 3, pp. 473-485, 1995.

D. D. Lewis and M. Ringuette, A comparison of two learning algorithms for text categorization. Symposium on Document Analysis and Information Retrieval, ISRI, 1994.

Reading: Textbook, Chapter 3, Sections 3.1-3.4.

Thursday, Jan. 12


Decision Trees II


Reading: Textbook, Chapter 3, Sections 3.5-3.8.

Tuesday, Jan. 17

Decision Trees III


Student presentations


Judy Fischbach (Decision trees for coding regions)
Ian Elliot (Decision trees for text categorization)

Homework 1 ("Feature Extraction") due.

Homework 2 ("Decision Trees") assigned, due Tuesday, Jan. 24.

Here are the spam test examples for Step 2.

Here are the non-spam test examples for Step 2.

Here is UCI-spam.names for Step 6.

Here is UCI-spam.data for Step 6.

Here is UCI-spam.test for Step 6.

Thursday, Jan. 19

Neural Networks I

Guest lecturer: Lanfranco Muzi

Reading: Textbook, Chapter 4, Sections 4.1-4.4.

Tuesday, Jan. 24

Neural Networks II

Homework 2 ("Decision Trees") due.

Homework 3 ("Neural Networks") assigned, due Tuesday, Jan. 31. Here are the files (in gzipped tar format) for Homework 3.

Short papers assigned:

M. N. Dailey, G. W. Cottrell, C. Padgett, and R. Adolphs. EMPATH: A neural network that categorizes facial expressions, Journal of Cognitive Neuroscience, 14(8):1158-1173, 2002.

T. J. Sejnowski & C. R. Rosenberg, Parallel networks that learn to pronounce English text. Complex Systems 1, 145-168, 1987.

Reading: Textbook, Chapter 4, Sections 4.5, 4.7.

Thursday, Jan. 26

Neural Networks III

Reading: Textbook, Chapter 4, Sections 4.6, 4.8-4.9.

Tuesday, Jan. 31

Student presentations on neural networks:
Ryan Hieber: NetTalk
JD Huntington: Recognizing facial expression with neural networks

Evaluating Hypotheses I

Homework 3 ("Neural Networks") due.

Reading: Textbook, Chapter 5, Sections 5.1-5.3.

Thursday, Feb. 2

Evaluating Hypotheses II

Homework 4 ("Evaluating Hypotheses" and other topics) assigned, due Thursday, Feb. 9.

Reading: Textbook, Chapter 5, Sections 5.4-5.7.

Tuesday, Feb. 7

Bayesian Learning I

Short papers assigned:

M. J. Pazzani, Searching for Dependencies in Bayesian Classifiers. In Proceedings of the Fifth International Workshop on AI and Statistics, 1995.

K. Nigam, A. McCallum, and T. Mitchell, Learning to classify text from labeled and unlabeled documents. 2000.

Reading: Textbook, Chapter 6, Sections 6.1-6.3, 6.7, 6.9.

Thursday, Feb. 9

Bayesian Learning II

Homework 4 ("Evaluating Hypotheses") due.

Homework 5 ("Bayesian Learning") assigned, due Tuesday, Feb. 21 (note you have one and a half weeks for this assignment).

Here is the addendum to Homework 5.

Reading: Textbook, Chapter 6, Section 6.10.

Tuesday, Feb. 14

Bayesian Learning III

Student presentations on Bayesian learning

Nathan Linger, Jimmy Pierce

Review for midterm

...

Thursday, Feb. 16

Midterm

...

Tuesday, Feb. 21

Go over midterm

Computational Learning Theory I

Reading: Textbook, Chapter 7, Sections 7.1-7.3.

Thursday, Feb. 23

Computational Learning Theory II

Support Vector Machines I

Homework 5 ("Bayesian Learning") due.

Homework 6 ("Computational Learning Theory and Support Vector Machines") assigned, due Thursday, March 2.

Data sets for Homework 6:
spam.svm.train
spam.svm.test

Reading: Textbook, Chapter 7, Sections 7.4-7.6.

Tuesday, Feb. 28

Support Vector Machines II

Genetic Algorithms I

Reading: M. A. Hearst et al. Support Vector Machines. IEEE Intelligent Systems, 18-28, July/August 1998.

Textbook, Chapter 9, Sections 9.1-9.3.

Thursday, March 2

Genetic Algorithms II

Homework 6 ("Computational Learning Theory") due.

Homework 7 ("Genetic Algorithms") assigned, due Thursday, March 9.

Here is the tarball for the Simple GA in C.

Short papers assigned:

J. Busch et al., Automatic generation of control programs for walking robots using genetic programming.

P. Zhang et al., Neural vs. statistical classifier in conjunction with genetic algorithm based feature selection.

Reading: Textbook, Chapter 9, Sections 9.4-9.8.

Tuesday, March 7

Genetic Algorithms III

...

Thursday, March 9

Student presentations on genetic algorithms

Chuan-kai Lin, Thanh Dang

Reinforcement Learning I

Homework 7 ("Genetic Algorithms") due.

Homework 8 ("Reinforcement Learning") assigned, due Thursday, March 16.

Tuesday, March 14

Reinforcement Learning II

Guest lecturer: Roberto Santiago

Reading: Textbook, Chapter 13, Sections 13.1-13.3.

Thursday, March 16

Reinforcement Learning III

Review for final exam

Reading: Textbook, Chapter 13, Sections 13.1-13.3.

Homework 8 ("Reinforcement Learning") due.

Tuesday, March 21

No class

Thursday, March 23

Final exam, 10:15am-12:05pm

...