Advanced Multimedia Processing (AMP) Lab, Cornell University

Kinship Classification by Modeling Facial Feature Heredity


Ruogu Fang, Andrew C. Gallagher, Tsuhan Chen, Alexander Loui



We propose a new and challenging problem in kinship classification: recognizing which family a query person belongs to from a set of candidate families. We introduce a novel framework that models this problem as reconstructing the query face from a mixture of parts drawn from a set of families. To accomplish this, we reconstruct the query face from a sparse set of samples among the candidate families. Our sparse group reconstruction roughly models the biological process of inheritance: a child inherits genetic material from two parents and therefore may not closely resemble either parent alone, but is instead a composite of both. The family classification is determined by the reconstruction error for each family. On our newly collected "Family101" dataset, we discover links between familial traits among family members and achieve state-of-the-art family classification performance.


  • Ruogu Fang, Andrew C. Gallagher, Tsuhan Chen, Alexander Loui. Kinship Classification by Modeling Facial Feature Heredity. IEEE International Conference on Image Processing, Australia, 2013 (ICIP '13) [PDF]

Dataset: Family101

The "Family101" dataset is the first large-scale dataset of families across several generations. It contains 101 different families with distinct family names, comprising 206 nuclear families and 607 individuals, with 14,816 images. The dataset is composed of renowned public families.

We used Amazon Mechanical Turk to assemble the dataset by asking workers to upload images of the family members we specified. The identities of the individuals were then verified. Each family contains 1 to 7 nuclear families. In total there are 206 nuclear families (both parents and their children), each with 3 to 9 family members. The final dataset includes around 72% Caucasians, 23% Asians, and 5% African Americans to guarantee a widespread distribution of facial characteristics that depend on race, gender, and age. We attempted to exclude parent-child pairs that are not biologically related by checking the familial relationships against public information available online. For pair-wise relationships, there are 213 father-son relations, 147 father-daughter relations, 184 mother-son relations, and 148 mother-daughter relations.
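As a rough illustration of how pair-wise relation counts like those above can be tallied from nuclear families, here is a minimal Python sketch. It assumes each parent-child pair is counted once per nuclear family; the role labels, the `count_relations` helper, and the two example families are hypothetical and are not taken from the dataset.

```python
from collections import Counter

PARENTS = {"father", "mother"}
CHILDREN = {"son", "daughter"}

def count_relations(nuclear_families):
    """Count parent-child pairs, given one list of role labels per nuclear family."""
    counts = Counter()
    for roles in nuclear_families:
        parents = [r for r in roles if r in PARENTS]
        children = [r for r in roles if r in CHILDREN]
        # Every parent is paired with every child in the same nuclear family.
        for parent in parents:
            for child in children:
                counts[f"{parent}-{child}"] += 1
    return counts

# Two made-up nuclear families, for illustration only.
example = [
    ["father", "mother", "son", "daughter"],
    ["father", "mother", "son", "son"],
]
counts = count_relations(example)
print(dict(counts))
```

Under this counting convention, a father with two sons contributes two father-son relations.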

The images below show one of the families: the Kennedy family, with its three-generation family tree and 48 images of one member, Caroline Kennedy.

Kennedy Caroline



Family Structure: FAMILY101.txt


  • The face images in this version of the Family101 database are formatted at 150x120 pixels. While the number of families is 101, the numbers of people and images may differ from those reported in the original paper because of further refinement and additions. If you spot any mismatch between a face image and the person it is labeled as, please report it to us.
  • The Family101.txt format is explained below:
  1. The first column is a SERIAL NUMBER. 0 indicates the start of a new extended family. 1, 2, ... indicate the sequence of nuclear families within this extended family.
  2. The second column is the ROLE. FAMI, which follows the serial number 0, marks the family name of the extended family. HUSB, WIFE, SONN, and DAUG indicate the role of the person in a nuclear family. Note that a son is written as SONN.
  3. The third column is the NAME. The full name (first name_middle name_last name) of the person with the given ROLE is listed here. Underscores replace spaces in the names.
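
The three-column format described above could be read with a short Python sketch like the following. It assumes the columns are separated by whitespace and that the file has no header line; neither is stated here, so adjust as needed. The `parse_family101` helper and the sample rows are hypothetical and do not come from the actual file.

```python
def parse_family101(lines):
    """Parse rows into {family_name: {serial: [(role, name), ...]}}."""
    families = {}
    current = None
    for raw in lines:
        parts = raw.split()  # assumption: whitespace-separated columns
        if len(parts) < 3 or not parts[0].isdigit():
            continue  # skip blank or malformed rows
        serial, role = int(parts[0]), parts[1]
        name = parts[2].replace("_", " ")  # underscores stand in for spaces
        if serial == 0 and role == "FAMI":
            # Serial number 0 starts a new extended family.
            current = families.setdefault(name, {})
        elif current is not None:
            # Serial numbers 1, 2, ... group members into nuclear families.
            current.setdefault(serial, []).append((role, name))
    return families

# Made-up rows in the described layout, for illustration only.
sample = """\
0 FAMI Example
1 HUSB Adam_Example
1 WIFE Beth_Q_Example
1 SONN Carl_Example
2 HUSB Carl_Example
2 WIFE Dana_Example
2 DAUG Erin_Example
""".splitlines()

families = parse_family101(sample)
for serial, members in families["Example"].items():
    print(serial, members)
```

To read the real file, the same function could be called on an open file handle, for example `parse_family101(open("FAMILY101.txt"))`, assuming the file sits in the working directory under that name.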


If you use the "Family101" dataset in your research, please cite the following paper:

  • Ruogu Fang, Andrew C. Gallagher, Tsuhan Chen, Alexander Loui. Kinship Classification by Modeling Facial Feature Heredity. IEEE International Conference on Image Processing, Australia, 2013 (ICIP '13)

Please let us know the accuracies you obtain on this dataset, so that we can make them publicly available for easy comparison and citation.