Research and Technology - Forensic Science Communications - April 2008


April 2008 - Volume 10 - Number 2

 

Research and Technology

The MORPH Database: Investigating the Effects of Adult Craniofacial Aging on Automated Face-Recognition Technology

A. Midori Albert
Associate Professor
Forensic Anthropologist
Department of Anthropology
University of North Carolina Wilmington
Wilmington, North Carolina

Karl Ricanek, Jr.
Assistant Professor
Department of Computer Science
University of North Carolina Wilmington
Wilmington, North Carolina


Introduction

Current studies and developing technologies in automated (i.e., computer-based) face-recognition research are frequently geared toward testing the effectiveness of computer algorithms to match real-world faces or facial images with "database faces." Database facial images are assembled from scanned photographs or digitized images, which may or may not include anthropometric measures of faces (Phillips et al. 2005). A primary aim of testing the effectiveness of automated face-recognition systems is to assess their capabilities for use in security and law enforcement venues. While image quality, pose, lighting, and other variables have been studied for their influence on face-recognition system accuracy, little research has been done on the influence of natural age changes on the accuracy and reliability of automated face recognition (Lanitis et al. 2002; Ricanek, Jr., et al. 2004). Our research work focuses on the ways in which adult age progression affects the efficacy of computer algorithms to correctly match faces of the same adult individual at different ages. To successfully explore this line of research, we needed a large sample of faces, male and female, of diverse ancestries, where the facial images of each individual at different adult ages would be available. Thus, we developed the MORPH Database.

The MORPH Database is a distinctive longitudinal database of facial images. Since its inception in 2003, the MORPH Database has allowed investigations into the effects of normal adult age progression on automated face-recognition systems. MORPH is a dynamic database, and as new data are compiled—where data include digital facial images of individuals from young adulthood throughout old age—new algorithms based on these data are developed and tested for their face-recognition accuracy. The express contribution of this database for automated face recognition is elucidated here through an evaluation of a standard face-recognition algorithm. Performance results of the algorithm we tested illustrate the direct impact that natural aging can have on recognition rates. In this paper, we (1) highlight some general adult age-related changes in the craniofacial complex, (2) introduce to the forensic science community the MORPH Database, and (3) report findings from an experiment designed to test the quantifiable impact natural aging has on automated face-recognition systems. Results from experiments such as these could potentially revolutionize our current system of identifying adult missing persons or fugitives who have aged a significant number of years.

Adult Craniofacial Age-Related Changes

The literature on craniofacial aging is robust with respect to the development, growth, and maturation of "subadults" (i.e., fetuses, infants, children, teenagers); by comparison, changes in craniofacial morphology occurring throughout adulthood are less well understood (Albert et al. 2004). Even so, a good broad understanding exists of rates and patterns of adult facial aging (Neave 1998; Taylor 2001; Zimbler et al. 2001) and of craniofacial skeletal changes, or remodeling, across the adult lifespan (Albert et al. 2004; Behrents 1985; Milner et al. 2001). Some of these changes include the formation of lines and wrinkles in the skin and sagging because of diminished elasticity and muscle tone (Coleman and Grover 2006). Other changes stem from hyperdynamic facial expressions, gravity, and mild, natural remodeling of the facial skeleton and jaws (Taister et al. 2000; Taylor 2001; Zimbler et al. 2001). While the aging features of soft tissue, which occur across the adult age span, are perhaps more visible, bony modifications also may play a role in the changing appearance of the face with advancing age.

Some of the bony changes include increases in head circumference, head length, bizygomatic breadth (i.e., maximum width from cheekbone to cheekbone), and total face height (Behrents 1985; Guagliardo 1982). There are dentoalveolar changes (i.e., changes in the jaws), resulting in some shifts in the dimensions of the upper palate, as well as facial skeletal diminishment, such as in the mandible. There is also evidence for increases in anterior facial height, particularly the lower four-fifths of the face, possibly linked to the continued eruption of the teeth in adulthood (Albert et al. 2004; Bondevik 1995; Forsberg et al. 1991; West and McNamara 1999). Adult age-related facial changes essentially manifest as shape changes or morphological changes. The pace at which craniofacial aging occurs is variable. Regarding the facial soft tissues, for example, very few age-related changes may occur in the decade spanning the 20s to the 30s, whereas significant aging may be apparent in the decade spanning the 50s to the 60s (Taylor 2001). Differences arising from normal human variation, such as sex, ancestry, body size, and idiosyncratic (i.e., individual) features stemming from genetic and environmental influences, play a major role in how faces age and, thus, in our ability to recognize people as they age. Further, the extent of the impact that age-related bone and soft-tissue changes may have on the effectiveness of face-recognition technology is unknown. We have begun to better understand these phenomena now that the development of the MORPH Database has largely removed a long-standing barrier: the lack of research samples of adequate size and composition.

The MORPH Database

Background

Face databases have typically been developed for researchers interested in face-based biometrics: face recognition, face modeling, photorealistic animation, etc. Numerous existing face databases tend to focus on one particular niche or a combination of niches, such as pose, lighting, high or low resolution, sex, ancestry features, and three-dimensional and multispectral approaches to studying faces. Yet only three publicly available databases are known to contain multiple images of individuals at different ages. The MORPH Database is among these three. The other two are the FERET (Facial Recognition Technology) (Phillips et al. 1997) and FG-NET (Face and Gesture Recognition Research Network) (Lanitis 2002) Databases.

Whereas the FERET Database contains numerous facial images, the images do not span a range of years for each individual. The FG-NET Database contains facial images spanning up to five years per individual, but it is limited to a small number of subjects (n = 82) (Ricanek, Jr., and Tesafaye 2006). The MORPH data corpus comprises thousands of facial images of individuals across time. Moreover, these images are available to the public for continued research, and we encourage studies of forensic science relevance and utility.

Source and Specifics

The recently developed and continually expanding (2003 to present) MORPH data corpus comprises facial images of numerous individuals and includes essential metadata, such as age, sex, ancestry, height, and weight, organized into two "albums." Album 1 contains digital scans of photographs of individuals taken between October 26, 1962, and April 7, 1998—which we refer to as acquisition dates. The acquisition dates correspond to increasing ages for individuals in the database; these dates range anywhere from 46 days to 29 years after the earliest photograph. MORPH Album 1 contains 1690 digitized images from 515 individuals, men and women of various ancestry groups (Table 1). The individuals range in age from 15 to 68 years, and the images are organized by "decade of life" categories (Table 2). Decade-of-life categories were established according to when the most notable craniofacial age-related changes appear.

Table 1: MORPH Album 1: Number of Facial Images by Sex and Ancestry (n = 1690)

| Sex | Americans of African Descent | Americans of European Descent | Americans of "Other" Descent | Total |
| --- | --- | --- | --- | --- |
| Male | 1037 | 365 | 3 | 1405 |
| Female | 216 | 69 | 0 | 285 |
| Total | 1253 | 434 | 3 | 1690 |


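Table 2: MORPH Album 1: Number of Facial Images by Decade-of-Life Categories (in Years)

| Sex | <18 | 18–29 | 30–39 | 40–49 | 50+ | Total |
| --- | --- | --- | --- | --- | --- | --- |
| Male | 142 | 803 | 345 | 93 | 22 | 1405 |
| Female | 15 | 182 | 70 | 18 | 0 | 285 |
| Total | 157 | 985 | 415 | 111 | 22 | 1690 |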

Table 3 indicates how many additional images exist for each individual in the database—the initial image being the individual at his or her younger/youngest age, followed by one, two, three, four, or more additional images at older ages. These additional images are the crux of our studies, essential in researching age-related changes and testing our automated face-recognition algorithms.

Table 3: MORPH Album 1: Numbers of Additional Facial Images

| Sex | 1 | 2 | 3 | 4+ | Total |
| --- | --- | --- | --- | --- | --- |
| Male | 519 | 289 | 67 | 9 | 884 |
| Female | 105 | 58 | 13 | 2 | 178 |
| Total | 624 | 347 | 80 | 11 | 1062 |

In creating MORPH Album 1, we used a consumer-grade flatbed scanner to digitize photographs. First, we scanned images in full 48-bit color with a capture resolution of 300 dpi (dots per inch); then we cropped them to a size of 400 by 500 pixels, to reveal only the head and face. Next, we converted the images to a grayscale for uniformity; and finally, we stored them in the database in portable gray map (PGM) format. Figure 1 shows what these images look like; depicted are three subjects from Album 1 in whom general changes in facial appearance related to age can be seen. As noted, MORPH Album 1 was followed by the development of a second collection of facial images: MORPH Album 2.

Figure 1: MORPH Database sample images showing age progression
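The digitization steps described for Album 1 (crop to 400 by 500 pixels, convert to grayscale, store as PGM) can be sketched in a few lines. This is a minimal illustration, not the authors' software. It assumes the scan has already been reduced to 8-bit RGB values held in nested Python lists, it uses ITU-R BT.601 luma weights for the grayscale conversion (the paper does not state which conversion was used), and it writes the binary (P5) variant of PGM. The function names `crop`, `to_gray`, and `encode_pgm` are hypothetical.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]


def crop(pixels: List[List[RGB]], left: int, top: int,
         width: int = 400, height: int = 500) -> List[List[RGB]]:
    """Return a width x height window whose top-left corner is (left, top)."""
    if left < 0 or top < 0:
        raise ValueError("crop origin must be non-negative")
    if top + height > len(pixels) or left + width > len(pixels[0]):
        raise ValueError("crop window falls outside the scan")
    return [row[left:left + width] for row in pixels[top:top + height]]


def to_gray(pixels: List[List[RGB]]) -> List[List[int]]:
    """Convert 8-bit RGB to 8-bit gray (BT.601 weights, an assumed choice)."""
    return [
        [min(255, round(0.299 * r + 0.587 * g + 0.114 * b)) for r, g, b in row]
        for row in pixels
    ]


def encode_pgm(gray: List[List[int]]) -> bytes:
    """Serialize a grayscale image as binary PGM (magic number P5, maxval 255)."""
    height, width = len(gray), len(gray[0])
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    return header + bytes(value for row in gray for value in row)
```

In use, `encode_pgm(to_gray(crop(scan, left, top)))` would yield the bytes of one 400 by 500 database image; the crop origin would have to be chosen per photograph so that only the head and face remain.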

As of this writing, MORPH Album 2 contains more than 14,000 digital images from the middle 1990s to the present, obtained from more than 4000 individuals whose metadata (age, sex, ancestry, height, and weight) are also recorded. Table 4 shows the number of facial images by sex and ancestry; Table 5 lists the numbers of facial images by decade-of-life categories; and Table 6 shows the numbers of additional images that exist aside from the initial acquisition facial image. Interestingly, Album 1 contains a larger number of images of individuals at younger ages, 20s to 30s (Table 2), whereas Album 2 images are skewed slightly older, with a greater number of images of individuals in their 40s to 50s (Table 5). All data in the MORPH Database, Albums 1 and 2, were collected with appropriate legal considerations and Institutional Review Board approval.

Table 4: MORPH Album 2: Number of Facial Images by Sex and Ancestry (n = 15,204)

| Sex | Americans of African Descent | Americans of European Descent | Americans of Asian Descent | Americans of Hispanic Descent | Americans of "Other" Descent | Total |
| --- | --- | --- | --- | --- | --- | --- |
| Male | 10,283 | 2,650 | 5 | 34 | 12 | 12,984 |
| Female | 1,665 | 550 | 2 | 0 | 3 | 2,220 |
| Total | 11,948 | 3,200 | 7 | 34 | 15 | 15,204 |


Table 5: MORPH Album 2: Number of Facial Images by Decade-of-Life Categories (in Years) (n = 15,204)

| Sex | <18 | 18–29 | 30–39 | 40–49 | 50+ | Total |
| --- | --- | --- | --- | --- | --- | --- |
| Male | 0 | 29 | 5,371 | 5,679 | 1,905 | 12,984 |
| Female | 0 | 4 | 964 | 1,017 | 235 | 2,220 |
| Total | 0 | 33 | 6,335 | 6,696 | 2,140 | 15,204 |


Table 6: MORPH Album 2: Numbers of Additional Facial Images

| Sex | 1 | 2 | 3 | 4+ | Total |
| --- | --- | --- | --- | --- | --- |
| Male | 3,440 | 1,954 | 1,192 | 2,958 | 9,544 |
| Female | 599 | 322 | 200 | 500 | 1,621 |
| Total | 4,039 | 2,276 | 1,392 | 3,458 | 11,165 |

For the experiment presented in this paper, we used images from MORPH Album 1. Of the 1690 images, 1253 were of individuals of African American descent, 434 images were of individuals of European American descent, and 3 images were classified as "other." There were 1405 images of men and 285 images of women (Table 1). It is interesting to note that of the male images, 76 percent had some form of facial hair, commonly a mustache. Table 2 lists the number of facial images for men and women across the different decades of life from under 18 through to 50 years and older.

The average age of an individual at the time of acquisition was 27.3 years, with a standard deviation of 8.6 years, a minimum age of 15 years, and a maximum age of 68 years. The number of images for any single individual in the database includes the initial and youngest-age image, plus what we call additional images. Additional images are images of the same individual at varying older ages, with a "soonest" acquisition date of 46 days after the initial acquisition and a "latest" of 29 years. Table 3 shows the number of additional images obtained for men and women in the database. As mentioned, the MORPH Database, unlike other similar databases, contains metadata on each individual: age, sex, ancestry, height, and weight. These metadata are vital elements in the study of how adult human faces change in appearance over the years. Our aim was to test, preliminarily, how well faces could be "recognized" after certain spans of time had elapsed between one image and a subsequent image of the same individual.

The Impact of Age Progression on Face-Recognition Rates: Experiments

Materials and Methods

In a series of experiments, we tested the performance of a standard principal component analysis (PCA) face-recognition (FR) algorithm, known as a PCA-FR algorithm, for its ability to correctly match facial images of the same individual at different ages. The time elapsed between a "younger photo" and an "older photo" of the same person was grouped into four spans: 0–5 years, 6–10 years, 11–15 years, and 16–20 years. Experiments to evaluate the impact of an aging face on face-recognition accuracy were conducted on the total set of images in the database sample we used, which was Album 1, a subset of the MORPH data corpus, described above. The experiment on the total set of images was referred to as Experiment T. We then tested men only (Experiment M), women only (Experiment F), African American men and women (Experiment A), and European American men and women (Experiment W).

For Experiment T (total sample), the chronologically youngest photo of each of 515 randomly selected individuals composed the T Gallery, where a gallery is a collection of the youngest-age image for each individual. Thus, the T Gallery was made up of a youngest-age image of each of the 515 individuals. Each of the subsequent experiments was essentially conducted on a subset of the T Gallery (or the total). The M Gallery consisted of the youngest-age image for each of 420 men (i.e., 420 out of 515 individuals were men), whereas the F Gallery consisted of the youngest-age image for each of 95 women. The A Gallery (African American) contained the youngest-age images from 377 individuals (men and women combined), and the W Gallery (Americans of European descent) contained the youngest-age images of 137 individuals (men and women combined).
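The gallery construction described above amounts to selecting, for each individual, the image with the lowest age at acquisition, optionally after filtering by sex or ancestry. The following sketch illustrates that selection under stated assumptions: the `ImageRecord` fields, the codes `"M"`, `"F"`, `"A"`, and `"W"`, and the function names are all hypothetical and are not taken from the MORPH distribution.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional


@dataclass(frozen=True)
class ImageRecord:
    subject_id: str
    image_id: str
    age_years: float  # age of the individual at acquisition
    sex: str          # "M" or "F" (hypothetical coding)
    ancestry: str     # "A", "W", or "other" (hypothetical coding)


def build_gallery(
    records: Iterable[ImageRecord],
    keep: Optional[Callable[[ImageRecord], bool]] = None,
) -> Dict[str, ImageRecord]:
    """Map each subject to his or her youngest-age image, after optional filtering."""
    gallery: Dict[str, ImageRecord] = {}
    for record in records:
        if keep is not None and not keep(record):
            continue
        current = gallery.get(record.subject_id)
        if current is None or record.age_years < current.age_years:
            gallery[record.subject_id] = record
    return gallery


def probes_for(
    records: Iterable[ImageRecord], gallery: Dict[str, ImageRecord]
) -> List[ImageRecord]:
    """Return every non-gallery image of the subjects present in the gallery."""
    return [
        record
        for record in records
        if record.subject_id in gallery
        and record.image_id != gallery[record.subject_id].image_id
    ]
```

Under these assumptions, the T Gallery would correspond to `build_gallery(records)`, the M Gallery to `build_gallery(records, keep=lambda r: r.sex == "M")`, and the A and W Galleries to the analogous filters on `ancestry`.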

Inasmuch as gallery images were all the "youngest" of the images of individuals in the samples, they served as a baseline. The additional images of the individuals at varying older years are referred to as the probe images. Probe images are classified according to age spans of 0–5 years older than the gallery image, 6–10 years older, 11–15 years older, and 16–20 years older. Our experiments tested the accuracy of automated face recognition via a PCA-FR algorithm, where accuracy was measured by the number of correct matches between gallery and probe images of the same individuals. Experiments were conducted on all galleries (T, M, F, A, and W) using a series of probe images, and these experiments are referred to as Experiment T (total individuals), Experiment M (men only), Experiment F (women only), Experiment A (African American men and women only) and Experiment W (European American men and women only).
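The paper does not give implementation details for the PCA-FR algorithm, so the following is only a generic eigenface-style sketch of gallery-to-probe matching, not the authors' code. It assumes each image has been flattened to a vector of pixel values, finds principal components of the gallery by power iteration with deflation (a choice made here to keep the example dependency-free), and matches a probe to the gallery image nearest to it in the projected space by Euclidean distance. The names `fit_pca`, `project`, and `best_match`, the number of components, and the distance measure are all assumptions. Building the full covariance matrix is practical only for very small vectors.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def fit_pca(gallery: List[Vector], k: int,
            iterations: int = 200) -> Tuple[Vector, List[Vector]]:
    """Return (mean, components): the top-k principal components of the gallery."""
    n, d = len(gallery), len(gallery[0])
    mean = [sum(column) / n for column in zip(*gallery)]
    centered = [[x - m for x, m in zip(row, mean)] for row in gallery]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    components: List[Vector] = []
    for _component in range(k):
        # Deterministic, non-uniform start vector for power iteration.
        v = [float(i + 1) for i in range(d)]
        norm = math.sqrt(dot(v, v))
        v = [x / norm for x in v]
        for _step in range(iterations):
            w = [dot(row, v) for row in cov]
            norm = math.sqrt(dot(w, w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        eigenvalue = dot(v, [dot(row, v) for row in cov])
        components.append(v)
        # Deflate so the next pass converges to the next component.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return mean, components


def project(image: Sequence[float], mean: Vector,
            components: List[Vector]) -> Vector:
    """Project a mean-centered image onto the principal components."""
    centered = [x - m for x, m in zip(image, mean)]
    return [dot(component, centered) for component in components]


def best_match(probe: Sequence[float], gallery: List[Vector],
               mean: Vector, components: List[Vector]) -> int:
    """Return the index of the gallery image nearest to the probe in PCA space."""
    target = project(probe, mean, components)
    distances = [math.dist(target, project(g, mean, components)) for g in gallery]
    return distances.index(min(distances))
```

A probe counts as a correct match when `best_match` returns the index of the gallery image belonging to the same individual; the recognition rate is the fraction of probes matched correctly.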

Experiments M and F were designed to assess how face-recognition accuracy might be affected by sex. Similarly, Experiments A and W informed us of any influence that ancestry has on face-recognition accuracy. Again, the galleries contained images of the individuals in the samples at the youngest ages for which a photo was available. Gallery images were classified in age ranges: younger than 18 years, 18–29 years, 30–39 years, and 40–49 years where applicable (i.e., Experiments T and M). These age ranges reflect decades of life within which craniofacial age changes are fewer or less intense than the changes seen between decades. Table 2 shows the number of images that fell into each decade age category. Table 3 shows the number of additional images there were for the 515 individuals.
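The tabulation implied by this design, with one recognition rate per combination of gallery age category and elapsed-time span, can be sketched as follows. This is an illustration under assumptions, not the authors' procedure: the paper does not say how elapsed times falling between the stated spans (for example, 5.5 years) were assigned, so the sketch places each probe in the first span whose upper bound it does not exceed and skips probes more than 20 years after the gallery image. The function names and the trial format are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple


def age_category(age_years: float) -> str:
    """Decade-of-life category of the gallery image."""
    if age_years < 18:
        return "<18"
    if age_years < 30:
        return "18-29"
    if age_years < 40:
        return "30-39"
    if age_years < 50:
        return "40-49"
    return "50+"


def span_category(elapsed_years: float) -> Optional[str]:
    """Elapsed-time span between gallery and probe; None if beyond 20 years."""
    if elapsed_years < 0:
        raise ValueError("probe image must not predate the gallery image")
    for upper, label in ((5, "0-5"), (10, "6-10"), (15, "11-15"), (20, "16-20")):
        if elapsed_years <= upper:  # boundary handling is an assumption
            return label
    return None


def recognition_rates(
    trials: Iterable[Tuple[float, float, bool]]
) -> Dict[Tuple[str, str], float]:
    """Rate of correct matches per (gallery age category, span) cell.

    Each trial is (gallery_age, probe_age, matched_correctly).
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [correct, total]
    for gallery_age, probe_age, correct in trials:
        span = span_category(probe_age - gallery_age)
        if span is None:
            continue
        cell = (age_category(gallery_age), span)
        counts[cell][0] += int(correct)
        counts[cell][1] += 1
    return {cell: hits / total for cell, (hits, total) in counts.items()}
```

Each key of the returned dictionary corresponds to one cell of a results table laid out by gallery age category (rows) and elapsed-time span (columns).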

Results and Discussion

For all experiments, the face-recognition algorithm performed best in the age span of 0–5 years. This meant that the most accurate matches between gallery images and probe images occurred when image acquisition dates were less than five years apart. Algorithm performance was optimal when the time span between images—the gallery image and the probe image—was short. Moreover, algorithm performance increased as age increased. Individuals in the age range of 40–49 years were better recognized than those in the youngest age range, <18 years. For the age span of 0–5 years between image acquisition dates, algorithm performance, or recognition rate, was 0.344 for the age range <18 years. Performance increased to 0.420 for the age range 18–29 years, then to 0.452 for the age range 30–39 years, and to 0.800 for the age range 40–49 years (Table 7). This finding corroborated the results of Phillips et al. (2002), who evaluated several face-recognition systems, yielding a similar trend.

Table 7: Face-Recognition Accuracy Results for Experiment T: Total Sample

Rate of Correct Matches Between Gallery Images and Probe Images Based on Number of Years Between Acquisition Dates

| Gallery Image Age Category of Initial Acquisition† (in Years) | Probe Image 0–5 Years After Gallery Image | Probe Image 6–10 Years After Gallery Image | Probe Image 11–15 Years After Gallery Image | Probe Image 16–20 Years After Gallery Image |
| --- | --- | --- | --- | --- |
| <18 | 0.344 | 0.155 | 0.078 | 0.125 |
| 18–29 | 0.420 | 0.257 | 0.134 | 0.080 |
| 30–39 | 0.452 | 0.300 | 0.231 | * |
| 40–49 | 0.800 | * | * | * |

†Acquisition is the date (and age of an individual) when a photograph or digital image was taken.

*No data
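As a worked illustration of the decline shown in Table 7, the relative drop in recognition rate from the 0–5 year span to the last span with data can be computed directly from the reported values. The sketch below only rearranges the published numbers; the dictionary layout and the function name `relative_decline` are hypothetical, and the last reported span differs by row (16–20 years for the two youngest categories, 11–15 years for 30–39).

```python
from typing import Dict, List, Optional

# Recognition rates from Table 7 (Experiment T); None marks "no data".
# Columns: 0-5, 6-10, 11-15, and 16-20 years after the gallery image.
TABLE_7: Dict[str, List[Optional[float]]] = {
    "<18": [0.344, 0.155, 0.078, 0.125],
    "18-29": [0.420, 0.257, 0.134, 0.080],
    "30-39": [0.452, 0.300, 0.231, None],
    "40-49": [0.800, None, None, None],
}


def relative_decline(rates: List[Optional[float]]) -> Optional[float]:
    """Percent drop from the 0-5 year rate to the last reported rate."""
    reported = [rate for rate in rates if rate is not None]
    if len(reported) < 2:
        return None  # a single value gives nothing to compare
    return 100.0 * (reported[0] - reported[-1]) / reported[0]


for category, rates in TABLE_7.items():
    decline = relative_decline(rates)
    if decline is None:
        print(f"{category}: not enough data")
    else:
        print(f"{category}: {decline:.1f}% lower at the last reported span")
```

For the 18–29 category, for example, the rate falls from 0.420 to 0.080, a relative decline of about 81 percent.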

Recognition rates addressing sex and ancestry differences have not yet been reported in the literature, and more research here is warranted; yet our findings are interesting to note. A similar trend to Experiment T (total sample) was found for Experiments M (men), A (African Americans), and W (European Americans): face-recognition rates declined as the time span between gallery and probe images increased, and older faces tended to be better recognized than younger faces (Tables 8, 9, and 10). Results of Experiment F on women are not included because of the paucity of women across the ages in the data corpus. Whereas our results indicated that individuals of European descent (Table 10) were better recognized than those of African American descent (Table 9), further research is needed before any reasonable conclusions may be drawn.

Table 8: Face-Recognition Accuracy Results for Experiment M: All Men

Rate of Correct Matches Between Gallery Images and Probe Images Based on Number of Years Between Acquisition Dates

| Gallery Image Age Category of Initial Acquisition† (in Years) | Probe Image 0–5 Years After Gallery Image | Probe Image 6–10 Years After Gallery Image | Probe Image 11–15 Years After Gallery Image | Probe Image 16–20 Years After Gallery Image |
| --- | --- | --- | --- | --- |
| <18 | 0.354 | 0.163 | 0.085 | 0.083 |
| 18–29 | 0.431 | 0.294 | 0.199 | 0.073 |
| 30–39 | 0.513 | 0.292 | 0.167 | * |
| 40–49 | 0.800 | * | * | * |

†Acquisition is the date (and age of an individual) when a photograph or digital image was taken.

*No data

Table 9: Face-Recognition Accuracy Results for Experiment A: African American Men and Women

Rate of Correct Matches Between Gallery Images and Probe Images Based on Number of Years Between Acquisition Dates

| Gallery Image Age Category of Initial Acquisition† (in Years) | Probe Image 0–5 Years After Gallery Image | Probe Image 6–10 Years After Gallery Image | Probe Image 11–15 Years After Gallery Image | Probe Image 16–20 Years After Gallery Image |
| --- | --- | --- | --- | --- |
| <18 | 0.257 | 0.051 | 0.105 | 0.077 |
| 18–29 | 0.423 | 0.252 | 0.107 | 0.093 |
| 30–39 | 0.444 | 0.304 | 0.333 | * |
| 40–49 | * | * | * | * |

†Acquisition is the date (and age of an individual) when a photograph or digital image was taken.

*No data

Table 10: Face-Recognition Accuracy Results for Experiment W: European American Men and Women

Rate of Correct Matches Between Gallery Images and Probe Images Based on Number of Years Between Acquisition Dates

| Gallery Image Age Category of Initial Acquisition† (in Years) | Probe Image 0–5 Years After Gallery Image | Probe Image 6–10 Years After Gallery Image | Probe Image 11–15 Years After Gallery Image | Probe Image 16–20 Years After Gallery Image |
| --- | --- | --- | --- | --- |
| <18 | 0.346 | 0.211 | 0.077 | * |
| 18–29 | 0.413 | 0.250 | 0.227 | * |
| 30–39 | 0.571 | 0.286 | * | * |
| 40–49 | * | * | * | * |

†Acquisition is the date (and age of an individual) when a photograph or digital image was taken.

*No data

Conclusion

The accuracy of automated face-recognition systems can and should be tested using experiments that take into account the variables that affect their reliability. These variables include age, sex, ancestry, height, and weight, as well as idiosyncratic factors, that is, features unique to a particular individual. In addition, there are image-related issues with which to contend, such as pose, lighting, and image quality. To this end, we created the MORPH data corpus, developing first our Album 1, followed by the continuously expanding Album 2. Images in the MORPH Database are structured such that image-related variables are controlled for as well as possible. We endeavored to assemble the best array of facial images, ensuring as much as possible that we had a large sample of both men and women of different ancestry groups, with a significant number of additional images. The additional image component is key to testing automated face-recognition accuracy because it is, by definition, a facial image of an individual with an acquisition date (i.e., the time/age when the image was taken) subsequent to the date/time/age of the youngest adult image available. As many additional images as could be obtained for each individual in the database were sought to ensure favorable coverage of adult age-related facial changes, ideally from adolescence to senescence.

With a sizeable and representative database of facial images, we began testing PCA-FR algorithms to evaluate accuracy rates in terms of correct matches between gallery images (youngest-age images) and probe images (varying older-age images). Results from these early experiments, presented in this paper, inform the ways in which normal adult craniofacial age changes might affect automated face-recognition accuracy rates. Understanding the effects of natural craniofacial morphological changes occurring over the adult lifespan is critical to improving automated face-recognition technology that holds promise for wide use in the forensic science community.

Acknowledgments

The authors thank the U.S. Department of Defense for funding this research.

For further information, please contact:

A. Midori Albert
Associate Professor
Forensic Anthropologist
Department of Anthropology
University of North Carolina Wilmington
601 S. College Road
Wilmington, North Carolina  28403-5907
+ 1-910-962-7078 (Voice)
+ 1-910-962-3543 (Fax)
albertm@uncw.edu

Karl Ricanek, Jr.
Assistant Professor
Department of Computer Science
University of North Carolina Wilmington
601 S. College Road
Wilmington, North Carolina  28403-5907
+ 1-910-962-4261 (Voice)
+ 1-910-962-7457 (Fax)
ricanekk@uncw.edu

References

Albert, A. M., Ricanek, K., Jr., and Patterson, E. The aging skull and face: A review of the literature and report on factors and processes of change, UNCW Technical Report. Call No. WRG FSC-A. University of North Carolina Wilmington, Wilmington, North Carolina, 2004.

Behrents, R. G. Growth in the Aging Craniofacial Skeleton. Center for Human Growth and Development, University of Michigan, Ann Arbor, Michigan, 1985.

Bondevik, O. Growth changes in the cranial base and the face: A longitudinal cephalometric study of linear and angular changes in adult Norwegians, European Journal of Orthodontics (1995) 17:525–532.

Coleman, S. R. and Grover, R. The anatomy of the aging face: Volume loss and changes in 3-dimensional topography, Aesthetic Surgery Journal (2006) 26(Suppl 1):S4–S9.

Forsberg, C. M., Eliasson, S., and Westergren, H. Face height and tooth eruption in adults—A 20-year follow-up investigation, European Journal of Orthodontics (1991) 13:249–254.

Lanitis, A. FG-NET Aging Database [Online]. (2002). Available: http://www-prima.inrialpes.fr/FGnet/.

Lanitis, A., Taylor, C. J., and Cootes, T. F. Toward automatic simulation of aging effects on face images, IEEE Transactions on Pattern Analysis and Machine Intelligence (2002) 24:442–455.

Milner, C. S., Neave, R. A., and Wilkinson, C. M. Predicting growth in the aging craniofacial skeleton, Forensic Science Communications [Online]. (July 2001). Available: http://www.fbi.gov/hq/lab/fsc/backissu/july2001/milner.htm.

Neave, R. A. H. Age changes to the face in adulthood. In: Craniofacial Identification in Forensic Medicine. J. G. Clement and D. L. Ranson, eds. Arnold Publications, Sydney, 1998, pp. 215–231.

Phillips, P. J., Flynn, P. J., Scruggs, T., Bowyer, K. W., Chang, J., Hoffman, K., Marques, J., Min, J., and Worek, W. Overview of the face recognition grand challenge. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, San Diego, California, 2005.

Phillips, P. J., Moon, H., Rizvi, S. A., and Rauss, P. J. The FERET evaluation methodology for face-recognition algorithms. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, San Juan, Puerto Rico, 1997.

Ricanek, K., Jr., Patterson, E. K., and Albert, A. M. Age-related morphological changes: Effects on facial recognition technologies, UNCW Technical Report. Call No. WRG FSC-R. University of North Carolina Wilmington, Wilmington, North Carolina, 2004.

Ricanek, K., Jr., and Tesafaye, T. MORPH: A longitudinal image database of normal adult age-progression. In: Proceedings of the IEEE 7th International Conference on Automatic Face and Gesture Recognition, Southampton, England, April 2006.

Taister, M. A., Holliday, S. D., and Borrman, H. I. M. Comments on facial aging in law enforcement investigation, Forensic Science Communications [Online]. (April 2000). Available: http://www.fbi.gov/hq/lab/fsc/backissu/april2000/taister.htm.

Taylor, K. T. Forensic Art and Illustration. CRC Press-Taylor & Francis, Boca Raton, Florida, 2001, pp. 251–281.

West, K. S. and McNamara, J. A., Jr. Changes in the craniofacial complex from adolescence to midadulthood: A cephalometric study, American Journal of Orthodontics and Dentofacial Orthopedics (1999) 115:521–532.

Zimbler, M. S., Kokosa, M. S., and Thomas, J. R. Anatomy and pathophysiology of facial aging, Facial Plastic Surgery Clinics of North America (2001) 9:179–187.