Machine Learning Seminar

CS 570
Computer Science Department, Portland State University

Spring 2017

Topic for Spring Term 2017: Learning to Integrate Vision and Language

Time: Thursdays, 4:00-5:30pm

Location: FAB 88-03

Instructor: Melanie Mitchell, FAB 115-13, (503) 720-2412, e-mail
Office hours: Tuesdays and Thursdays, 2-3pm, or by appointment

Course Mailing list:

Course description: This course is a one-credit graduate seminar for students who have already taken a course in Machine Learning. Students will read and discuss recent papers from the Machine Learning literature. Each student will be responsible for presenting at least one paper during the term. This one-credit course will be offered each term, and students may take it multiple times. CS MS students who take this course for three terms may count it as one of the courses for the "Artificial Intelligence and Machine Learning" master's track requirement.

Prerequisites: CS 445/545 or permission of the instructor.

Textbook: None. We will read recent papers from the literature.

Course Work and Homework: One or more papers will be assigned per week for everyone in the class to read, along with a list of questions about the paper(s) that each student needs to answer before the following class. Each week one or more students will be assigned as discussion leaders for the week's papers.

Schedule for Spring Term 2017: This will be progressively filled in during the term.






April 6

No class (instructor out of town)

April 13

Introduction; Learning to Generate Image Captions

Sheng and Anthony

O. Vinyals et al., Show and tell: A neural image caption generator

A. Karpathy & L. Fei-Fei, Deep visual-semantic alignments for generating image descriptions


April 20

Learning to Generate Image Captions, Continued

Erik and Ben

X. Chen and C. L. Zitnick, Mind's eye: A recurrent visual representation for image caption generation

K. Xu et al., Show, attend, and tell: Neural image caption generation with visual attention


April 27

Evaluating Image Captions

Rachel, Mike

P. Anderson et al., SPICE: Semantic propositional image caption evaluation

S. Liu et al., Improved image captioning via policy gradient optimization of SPIDEr


May 4

Evaluating Image Captions (continued); Scene Graphs

Shiran, Alex

M. Kilickaya et al., Re-evaluating automatic metrics for image captioning

J. Johnson et al., Image retrieval using scene graphs


May 11

Visual Q/A

Thomas and Henry

S. Antol et al., VQA: Visual question answering

Y. Zhu et al., Visual7W: Grounded question answering in images


May 18

Visual Q/A, continued

Mohamed and Noah

A. Fukui et al., Multimodal compact bilinear pooling for visual question answering and visual grounding

J. Lu et al., Hierarchical question-image co-attention for visual question answering


May 25

Evaluating and Understanding Visual Q/A Models

Devin and Melanie

A. Jabri et al., Revisiting visual question answering baselines

K. Kafle et al., An analysis of visual question answering algorithms


June 1

Visual semantics and visual reasoning

Cody and Robert

J. Johnson et al., CLEVR: A diagnostic dataset for compositional language and elementary visual reasoning

J. Johnson et al., Inferring and executing programs for visual reasoning


June 8

Visual semantics and visual reasoning, continued

Casey and Sharad

A. Mahendru et al., The promise of premise: Harnessing question premises in visual question answering

R. Li and J. Jia, Visual question answering with question representation update (QRU)