The Images of Groups Dataset

We are interested in the intersection between social behavior and computer vision. For example, in group shots, people generally choose where to stand based on social factors (e.g. standing next to a significant other) or physical factors (e.g. taller males standing in the back row).

To study these ideas, we built a collection of images of people from Flickr. The following three searches were conducted:

“wedding+bride+groom+portrait”
“group shot” or “group photo” or “group portrait”
“family portrait”

A standard set of negative query terms was used to remove undesirable images. To prevent a single photographer’s images from being over-represented, a maximum of 100 images are returned for any given image capture day, and this search is repeated for 270 different days.

In each image, we labeled the gender and the age category for each person. As we are not studying face detection, we manually added missed faces, but 86% of the faces were found automatically. We labeled each face as being in one of seven age categories: 0-2, 3-7, 8-12, 13-19, 20-36, 37-65, and 66+, roughly corresponding to different life stages. In all, 5,080 images containing 28,231 faces are labeled with age and gender, making this what we believe is the largest dataset of its kind.

Many faces have low resolution. The median face has only 18.5 pixels between the eye centers, and 25% of the faces have under 12.5 pixels. As is expected with Flickr images, there is a great deal of variety. In some images, people are sitting, lying down, or standing on elevated surfaces. People often have dark glasses, face occlusions, or unusual facial expressions.
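As an illustration of the seven age categories listed above, the following minimal sketch maps an age in whole years to its category label. It is not part of the dataset tools: the names AGE_LOWER_BOUNDS, AGE_LABELS, and age_category are hypothetical, and the labels in the dataset itself were assigned by human annotators from appearance rather than computed from known ages.

```python
from bisect import bisect_right

# Lower bound (in whole years) of each of the seven age categories.
# These names are hypothetical and used only for illustration.
AGE_LOWER_BOUNDS = [0, 3, 8, 13, 20, 37, 66]
AGE_LABELS = ["0-2", "3-7", "8-12", "13-19", "20-36", "37-65", "66+"]


def age_category(age_years: int) -> str:
    """Return the category label whose range contains age_years."""
    if age_years < 0:
        raise ValueError("age must be non-negative")
    # bisect_right counts the lower bounds that are <= age_years;
    # subtracting 1 gives the index of the matching category.
    index = bisect_right(AGE_LOWER_BOUNDS, age_years) - 1
    return AGE_LABELS[index]


if __name__ == "__main__":
    for age in (1, 7, 19, 36, 37, 80):
        print(age, age_category(age))
```

A lookup like this could be useful when comparing the dataset's labels against another source that records exact ages, provided that source uses whole years.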

Terms of Use
Please adhere to the following terms of use of this dataset.
This dataset is for non-commercial research purposes (such as academic research) only. If you find this collection useful for your research, please cite the paper below.