The Images of Groups Dataset


We are interested in the intersection between social behavior and computer vision. For example, in group shots, people generally choose where to stand based on social factors (e.g. standing next to a significant other) or physical factors (e.g. taller males tend to be in the back row).

To study these ideas, we built a collection of images of people from Flickr. The following three searches were conducted:

“wedding+bride+groom+portrait”

“group shot” or “group photo” or “group portrait”

“family portrait”

A standard set of negative query terms was used to remove undesirable images. To prevent any single photographer’s images from being over-represented, a maximum of 100 images was returned for any given image capture day, and the search was repeated for 270 different days.

In each image, we labeled the gender and the age category of each person. Because we are not studying face detection, we manually added missed faces, although 86% of the faces were found automatically. Each face was labeled as belonging to one of seven age categories: 0-2, 3-7, 8-12, 13-19, 20-36, 37-65, and 66+, roughly corresponding to different life stages. In all, 5,080 images containing 28,231 faces are labeled with age and gender, making this, we believe, the largest dataset of its kind.

Many faces have low resolution. The median face has only 18.5 pixels between the eye centers, and 25% of the faces have fewer than 12.5 pixels. As expected with Flickr images, there is a great deal of variety. In some images, people are sitting, lying down, or standing on elevated surfaces. People often have dark glasses, face occlusions, or unusual facial expressions.
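
The seven age categories and the eye-center distance used above to describe face resolution can be illustrated with a short Python sketch. This is not code shipped with the dataset: the function names age_category and inter_eye_distance are hypothetical, and the sketch assumes ages given in whole years and eye centers given as (x, y) pixel coordinates. The labels in the dataset itself are categories assigned by hand, not exact ages.

```python
import math
from bisect import bisect_right

# Lower bounds (in years) and labels of the seven age categories listed above.
AGE_LOWER_BOUNDS = [0, 3, 8, 13, 20, 37, 66]
AGE_LABELS = ["0-2", "3-7", "8-12", "13-19", "20-36", "37-65", "66+"]


def age_category(age_years):
    """Return the category label for an age in whole years (hypothetical helper)."""
    if age_years < 0:
        raise ValueError("age must be non-negative")
    # bisect_right counts how many lower bounds are <= age_years;
    # subtracting one gives the index of the matching category.
    return AGE_LABELS[bisect_right(AGE_LOWER_BOUNDS, age_years) - 1]


def inter_eye_distance(left_eye, right_eye):
    """Euclidean distance in pixels between two (x, y) eye centers."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.hypot(dx, dy)
```

For example, age_category(25) falls in the 20-36 category, and a face whose eye centers are at (100, 50) and (112, 55) has an inter-eye distance of 13 pixels, which is below the median of 18.5 reported above.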


Terms of Use
Please adhere to the following terms of use for this dataset.
This dataset is for non-commercial research purposes (such as academic research) only. If you find this collection useful for your research, please cite the paper below.


People
Andrew Gallagher
Tsuhan Chen


Citation
A. Gallagher, T. Chen, “Understanding Images of Groups of People,” IEEE Conference on Computer Vision and Pattern Recognition, 2009.

Bibtex

@inproceedings{gallagher_cvpr_09_groups,

author = {A. Gallagher and T. Chen},

title = {Understanding Images of Groups of People},

booktitle = {Proc. CVPR},

year = {2009},

}


Datafiles

Zip files containing the images and raw text data files:

Fam2a.zip  85 MBytes

Fam4a.zip  44 MBytes

Fam5a.zip  71 MBytes

Fam8a.zip  43 MBytes

Group2a.zip  53 MBytes

Group4a.zip  114 MBytes

Group5a.zip  37 MBytes

Group8a.zip  71 MBytes

Wed2a.zip  42 MBytes

Wed3a.zip  18 MBytes

Wed5a.zip  14 MBytes
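
Once the image archives listed above have been downloaded, they can be unpacked in one pass. The following Python sketch assumes the zip files were saved to a single local folder. The function name extract_archives and both folder arguments are hypothetical, and because the internal layout of the archives is not described here, each archive is simply unpacked into its own subfolder.

```python
import zipfile
from pathlib import Path

# Archive names exactly as listed above.
ARCHIVES = [
    "Fam2a.zip", "Fam4a.zip", "Fam5a.zip", "Fam8a.zip",
    "Group2a.zip", "Group4a.zip", "Group5a.zip", "Group8a.zip",
    "Wed2a.zip", "Wed3a.zip", "Wed5a.zip",
]


def extract_archives(download_dir, output_dir, names=ARCHIVES):
    """Unpack each archive found in download_dir into output_dir/<archive stem>.

    Returns the list of archive names that were not found.
    """
    download_dir = Path(download_dir)
    output_dir = Path(output_dir)
    missing = []
    for name in names:
        archive = download_dir / name
        if not archive.is_file():
            missing.append(name)
            continue
        # One subfolder per archive, e.g. output_dir/Fam2a
        target = output_dir / archive.stem
        target.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
    return missing
```

Calling extract_archives("downloads", "images_of_groups") would unpack whichever archives are present in the (assumed) downloads folder and return the names of any that are still missing.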


Matlab Files that contain the data from the images and faces:

MatlabFiles.zip 155 MBytes


Matlab Files that contain Row Labelings for the Group Shot 5a and 8a images (used in the ICME 2009 paper "Finding rows of people in group images"):

RowLabeling.zip 2 KBytes


Files associated with labeling age and gender:

ageGenderClassification.zip 25 MBytes


A Matlab Function to help view the images:

viewGroupImages.m


The README file:

README.txt