The Portland State Dog-Walking Images is a dataset of photographs, each of which is an instance of the visual concept (or situation) "Dog-Walking". This dataset was created by Melanie Mitchell's research group at Portland State University. Members of our research group took all the photographs, which feature natural scenes in different locations.
This dataset was originally used for the Situate Project, which investigates how high-level conceptual knowledge can be integrated with lower-level vision in order to flexibly recognize and make analogies between visual situations. A paper describing Situate can be downloaded here.
The dataset can be downloaded as a .zip file or as a .tar.gz file.
The dataset is grouped into two folders: PrototypicalDogWalking and NonPrototypicalDogWalking. The 500 images in the PrototypicalDogWalking folder each feature exactly one human dog-walker, one dog, and one leash. The 200 images in the NonPrototypicalDogWalking folder depart from this prototype in various ways. The images in PrototypicalDogWalking are named with the abbreviation "pdw" (for "prototypical dog walking"), followed by a number (1 to 500). The images in NonPrototypicalDogWalking are named with the abbreviation "npdw" (for "nonprototypical dog walking"), followed by a number (1 to 200).
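As a rough sketch, the naming scheme can be turned into a list of expected image paths. The helper name expected_image_paths and the root directory are hypothetical. The sketch also assumes that the two folders sit directly under the extracted root, that numbers are not zero-padded, and that images use the ".jpg" extension (as in pdw1.jpg); check these against the extracted archive.

```python
from pathlib import Path
from typing import Iterator


def expected_image_paths(root: Path) -> Iterator[Path]:
    """Yield the image paths implied by the naming scheme (hypothetical helper)."""
    # (folder name, file-name prefix, number of images), as stated in the text.
    layout = (
        ("PrototypicalDogWalking", "pdw", 500),
        ("NonPrototypicalDogWalking", "npdw", 200),
    )
    for folder, prefix, count in layout:
        for i in range(1, count + 1):
            # Assumption: no zero padding and a ".jpg" extension, e.g. "pdw1.jpg".
            yield root / folder / f"{prefix}{i}.jpg"
```

Comparing these paths against the files actually present on disk is a quick way to confirm that an archive was extracted completely.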
The dog-walkers, dogs, and leashes in the images have been labeled with ground truth bounding boxes by members of the Mitchell research group. Each JPEG image has an accompanying label file, with extension ".labl". The label file specifies the coordinates and label of each bounding box. The labels give the object category and sometimes the orientation of the object. The main set of possible labels is:
In every image, every dog-walker, dog, and leash is labeled. In some images there are other objects labeled with other categories, such as pedestrian, tree, car, trash-can, but most objects are unlabeled.
Each label file is a text file that has the following format:
image-width | image-height | number-of-bounding-boxes | x0 | y0 | w0 | h0 | x1 | y1 | w1 | h1 |...| label0 | label1 ...
Values are separated by vertical bars (|). The first value is the width of the image in pixels, the second is the height of the image in pixels, and the third is the number of bounding boxes. Next come four values for each bounding box: the (x,y) coordinates of its top-left corner, followed by its width and height. After the coordinates of all the boxes come the labels, one per box, in the same order. The origin (0,0) of the coordinate system is the top-left corner of the image.
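The format described above can be read with a few lines of Python. The following is a minimal sketch, not code distributed with the dataset: the names Box, LabelFile, and parse_labl are hypothetical, and it assumes that all numeric fields are integers and that labels contain no vertical bar.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    x: int       # x coordinate of the top-left corner
    y: int       # y coordinate of the top-left corner
    width: int
    height: int
    label: str


class LabelFile(NamedTuple):
    image_width: int
    image_height: int
    boxes: List[Box]


def parse_labl(text: str) -> LabelFile:
    """Parse the contents of one label file (hypothetical helper)."""
    fields = [field.strip() for field in text.strip().split("|")]
    # The first three fields: image width, image height, number of boxes.
    image_width, image_height, n = (int(v) for v in fields[:3])
    # Then 4 values per box (x, y, w, h), then one label per box.
    coords = fields[3:3 + 4 * n]
    labels = fields[3 + 4 * n:3 + 5 * n]
    if len(coords) != 4 * n or len(labels) != n:
        raise ValueError("field count does not match the declared number of boxes")
    boxes = []
    for i, label in enumerate(labels):
        # Assumption: coordinates and sizes are integers.
        x, y, w, h = (int(v) for v in coords[4 * i:4 * i + 4])
        boxes.append(Box(x, y, w, h, label))
    return LabelFile(image_width, image_height, boxes)
```

Calling parse_labl on the text of a label file returns the image size and one Box per bounding box. A file whose field count disagrees with its declared number of boxes raises ValueError instead of being silently misread.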
For example, here is the label file for pdw1.jpg.
2162|2029|3|346|1605|126|222|832|1060|220|588|450|1393|392|246|dog back|dog-walker back|leash-/
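This label indicates that the image is 2162 pixels in width, 2029 pixels in height, and has been labeled with three bounding boxes:
Box 0: top-left corner at (346, 1605), width 126, height 222, label "dog back".
Box 1: top-left corner at (832, 1060), width 220, height 588, label "dog-walker back".
Box 2: top-left corner at (450, 1393), width 392, height 246, label "leash-/".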
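Many tools expect boxes as corner coordinates rather than a top-left corner plus width and height. As an illustrative sketch, the values from the pdw1.jpg label file can be converted as follows. The helper to_corners is hypothetical, and treating the right and bottom edges as x + width and y + height is an assumption about the convention.

```python
# Values copied from the pdw1.jpg label file: (x, y, width, height, label).
IMAGE_WIDTH, IMAGE_HEIGHT = 2162, 2029
BOXES = [
    (346, 1605, 126, 222, "dog back"),
    (832, 1060, 220, 588, "dog-walker back"),
    (450, 1393, 392, 246, "leash-/"),
]


def to_corners(x, y, w, h):
    """Return (left, top, right, bottom) for a top-left/width/height box."""
    # Assumption: the right and bottom edges are x + w and y + h.
    return x, y, x + w, y + h


for x, y, w, h, label in BOXES:
    left, top, right, bottom = to_corners(x, y, w, h)
    # A box lies inside the image if all four edges are within its bounds.
    inside = (0 <= left and 0 <= top
              and right <= IMAGE_WIDTH and bottom <= IMAGE_HEIGHT)
    print(f"{label}: ({left}, {top}) to ({right}, {bottom}), inside image: {inside}")
```

The bounds check relies on the origin being the top-left corner of the image, so valid x values run from 0 to the image width and valid y values from 0 to the image height.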
We are making this dataset available only for non-commercial, research purposes. Users are prohibited from re-posting any of these images in any medium, except with written permission from Melanie Mitchell.
If you publish work that uses this dataset, please cite our paper: M. H. Quinn, A. D. Rhodes, and M. Mitchell, "Active object localization in visual situations". If you have any questions or comments about this dataset, please send email to Melanie Mitchell: mm @ pdx dot edu.
Collection, labeling, and publication of this dataset were made possible by funding from the National Science Foundation under Grant Number IIS-1423651. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.