

July 2009 - Volume 11 - Number 3

 

Statistical Weight of a DNA Match in Cold-Hit Cases

Ranajit Chakraborty
Professor
Center for Genome Information
Department of Environmental Health
University of Cincinnati College of Medicine
Cincinnati, Ohio

Jianye Ge
Doctoral Student
Center for Genome Information
Department of Environmental Health
University of Cincinnati College of Medicine
Cincinnati, Ohio

Department of Biomedical Engineering
University of Cincinnati
Cincinnati, Ohio

Introduction | Probable Cause Versus Cold Hit | Different Questions, Different Answers | Universality of Random-Match Probability | The History and Future of Database Searches | Conclusion and Epilogue | Note | Acknowledgments | References

Introduction

On June 6, 2008, the Supreme Court of California ruled that the “Random Match Probability” (RMP) is a relevant statistic for assessing the statistical strength of DNA evidence in a cold-hit case (see People v. Nelson 2008). The ruling also states (see footnote 3 of the decision) that alternative statistics (such as the “Database Match Probability,” DMP) also may be admissible and, more intriguingly, hints that RMP (and/or DMP) conveys information as to “how likely it is that someone other than the defendant is the source of the crime scene evidence” (People v. Nelson 2008, pp. 30–31).

This decision is not the only discussion on the subject of relevant statistics for assessing the significance of a DNA match in cold-hit cases (see, e.g., Felch and Dolan 2008; Kaye 2008; People v. Johnson 2006; Storvik and Egeland 2007 and its cited references; and United States v. Jenkins 2005). Indeed, in some of these cases, the defense argued for inadmissibility of the entire DNA evidence on the grounds of a “raging debate” over the statistics for DNA evidence in cold-hit cases (see, e.g., State of Arizona v. Luong 2008 and United States v. Jenkins 2005). Although most courts have rejected such arguments, our commentary is prompted by a need to clearly distinguish the different concepts, as well as to clarify scenarios involving cold-hit cases. Confusion arises because often the interpretations of the different statistics are not clearly stated, and hence it is implied that a jury would be misled by a decision based on the RMP and would reverse itself if the court had an opportunity to hear about DMP (see, e.g., Felch and Dolan 2008).

In this commentary we first describe how (if at all) a cold-hit case may differ, from a weight-of-evidence point of view, from other cases in which DNA evidence is presented. Subsequently, we describe the different approaches for evaluating the strength of a DNA match in cold-hit cases, with the relevant questions that they answer, to establish that different solutions to different questions do not necessarily constitute a controversy. With these differences clearly defined, we argue that nothing in the literature, or in court testimonies, establishes that the concept or the numerical value of a random-match probability is invalidated because a cold-hit identification of a suspect occurred, and that no other number should be advocated to replace the RMP under its original (and only) interpretation. In fact, we proffer that DMP and RMP actually provide the same strength of a DNA match in cold-hit cases, if the frequentist’s interpretation of DMP is correctly spelled out.

Finally, with some brief statements about the rationale of establishing the offender database (the Combined DNA Index System, CODIS) in the United States and in other countries, we surmise that because such systems of suspect identification are becoming more and more successful with the enrichment of such databases, the attempt to modify RMP using database size as additional information without any mention of the RMP value would produce an inconsistent, if not erroneous, statistic, completely unrepresentative of the strength of the DNA evidence.

Probable Cause Versus Cold Hit

Crime investigations generally start with one or more suspects’ being identified based on leads obtained from informants or personal identification clues (not related to DNA profiles) left at the crime scene. Such cases are described as “probable cause” cases because the suspect is implicated in the crime on the basis of his or her association with the crime scene based on such leads taken alone or in combination with one another. If the crime scene provides any biological sample that may be from the perpetrator (and, in some cases, the victim), its DNA profile is compared with that of the suspect, and a resultant match constitutes DNA evidence in such a case. Because DNA data confirm the probable-cause scenario, these cases are also referred to as “confirmatory cases.”

When suspects are not identified from any such investigative clues or when all implicated suspects are eliminated after further investigation, the crime may be referred to as a “cold” or “no suspect” case. If biological evidence is found at the crime scene and a DNA profile can be generated, a DNA databank(s) of offenders, including those convicted of other crimes, is (are) then searched to identify a suspect. A cold hit occurs when such a search process obtains a DNA match of the evidence sample profile (often called the “forensic sample profile”) with the DNA of an offender profile. Although the initial identification of a suspect starts with such cold hit(s), several other events precede charging the suspect with the crime. These include (and may not be limited to) (1) obtaining personal identification of the subject(s) whose DNA profile in the database matches that of the evidence sample, (2) verifying whether the person is the correct sex (and of the appropriate age range) to be the perpetrator, (3) ensuring that the person is out of custody to have had access to the crime scene when the crime occurred, (4) corroborating the match with the evidence-sample DNA profile after a fresh collection (and typing) of the sample from the person in the database, and/or (5) determining if other non-DNA evidence supports or refutes that the person is a reasonable suspect. Often, details of the whereabouts of the suspect in proximity to the crime (temporal as well as spatial) constitute the basis for bringing the suspect to trial, unless the suspect has a strong alibi negating such information. Thus a DNA match begins the investigation of a cold-hit case; however, the match in the database itself may not be the one and only reason to bring a cold-hit suspect to trial.

The above descriptions indicate that the differences between probable-cause and cold-hit cases are not as definitive as discussed in court litigations. It is true that a probable-cause case starts with non-DNA evidence and may be bolstered by DNA evidence, and in contrast, a cold-hit case starts with a DNA match in a database (of past offenders) and often then is corroborated by non-DNA facts subsequently obtained. However, a statement such as “The defendant is brought to trial based on a database match,” giving rise to the term database trawl (Balding 1997), is not necessarily the full depiction of a cold-hit case. The People v. Nelson case is a good example because if today’s DNA technology had been available in 1976, it would not have been a cold-hit case at all. Much the same would have been true, at the time of the crime, of People v. Puckett, the case on which the newspaper article by Felch and Dolan (2008) is based.

Different Questions, Different Answers

Although the above discussion reflects that a probable-cause case is not necessarily clearly distinct from a cold-hit one, the focus of some arguments has been on the “raging debate” that exists on how the statistical strength of a cold-hit case should be assessed and ultimately presented to the court. Four different suggested approaches exist for such calculations. They are described below, along with the relevant questions that they answer. It is extremely important that the question asked be well understood so that the answer provided can be appropriately appreciated. The fact that there are different questions guarantees that the answers will differ. The different answers in no way constitute a controversy or support the claim that one exists.

1. RMP without any adjustment for the database search. The first approach is to report the RMP (say, P), representing the expected rarity of the profile in the population at large, computed for one or more populations relevant to the crime. Answering the question of how common (or rare) the profile is, RMP apparently ignores how the suspect is (initially) identified to be related to the crime. However, as we will discuss later, irrespective of a database search or not, the rarity of a profile in a population remains unchanged, an opinion that is the foundation of all proffered approaches for calculating cold-hit statistics; and further, a logical adjustment (ascertainment) for the strength of the (single) match found in the database gives a number close to that offered by P, for current database sizes (see, e.g., Balding and Donnelly 1995 and Dawid and Mortera 1996; also discussed in Storvik and Egeland 2007).

2. RMP based on an independent set of markers. The first committee of the U.S. National Research Council (NRC), in its report, often referred to as NRC-I (NRC 1992), objected to the above practice, stating that the chance of finding the profile in a database is different from that of RMP (in general, larger) and that any adjustment for a database-search event has to depend upon a number of assumptions, not necessarily verifiable in every case. Hence a noncontroversial approach would be to confirm (extend would be a better description) the DNA match (of the crime scene sample and that of the subject in the database) by typing the samples with additional loci not used in the search process and, should the match extend over these additional loci, report the RMP value with the additional loci alone. Although this suggestion by NRC-I gained the support of others as well (see, e.g., Lempert 1997 and Morton 1997), there are operational as well as technical problems regarding its implementation and adequacy.

First, although an additional battery of forensic markers may be available, they would have to be validated with rigor equivalent to those used in databases, a scenario that may not apply in all cases. Second, testing additional loci on the relevant “forensic sample” presumes that sufficient sample is still available for additional typing and not degraded so much that it does not provide useful information on all such additional loci. Third, and perhaps more important, the RMP based solely on additional markers constitutes only partial information and totally ignores the DNA match (and a large number of nonmatches) with the original loci of the database-search process. Of important significance is that in the initial database-search process, every profile in the database other than the suspect’s (usually a very large number) did not match the DNA profile of the evidence sample. Consequently, the suggestion by NRC-I did not find support in the worldwide forensic community, nor is it practiced on a routine basis in cold-hit cases.

3. Database Match Probability. In a reconvened committee of the National Research Council, this issue was discussed further, from which the concept of database-match probability arose. This committee, in its report published in 1996 (NRC 1996), also called the NRC-II report, recommended that a relevant question for a cold hit would be: How often a DNA profile matching that of the relevant forensic sample would be found in a database of size N. A simplified formula to answer this question is

$$\mathrm{DMP} = 1 - (1 - P)^{N} \approx NP, \tag{1}$$

in which the approximation is valid when P, the relevant RMP for the crime-sample DNA profile, is much smaller than 1/N, the reciprocal of the size of the database searched. The approximation (NP), however, is the exact answer to the question of the expected number of DNA profiles in the database that would match the crime-sample DNA profile. Although the DMP uses RMP as its baseline, it clearly answers a question different from the one answered by the RMP.
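As a minimal sketch of how the exact expression and the NP approximation compare, the following computes DMP = 1 − (1 − P)^N alongside NP. The function names and the input values are hypothetical, chosen only for illustration; they are not taken from any case discussed here.

```python
import math


def database_match_probability(p: float, n_db: int) -> float:
    """Exact DMP = 1 - (1 - P)^N for RMP p and database size n_db."""
    # log1p/expm1 keep precision when p is tiny, where (1 - p) ** n_db
    # computed directly would lose significant digits.
    return -math.expm1(n_db * math.log1p(-p))


def expected_matches(p: float, n_db: int) -> float:
    """NP: expected number of database profiles matching the target."""
    return n_db * p


# Hypothetical inputs, for illustration only.
p = 1e-9          # assumed random-match probability
n_db = 1_000_000  # assumed database size

exact = database_match_probability(p, n_db)
approx = expected_matches(p, n_db)
print(f"exact DMP = {exact:.6e}")
print(f"NP        = {approx:.6e}")
```

With P far below 1/N the two values nearly coincide, and the exact DMP never exceeds NP; as NP approaches 1 the approximation overstates the exact probability.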

The NRC-II report also may have contributed to the confusion by inconsistently stating the question. On page 40 of the report, the question is framed differently: “If one wishes to describe the impact of the DNA evidence under the hypothesis that the source of the evidence sample is someone in the database, then the likelihood ratio should be divided by N.” This is a different question from that stated immediately above and likely requires a Bayesian approach to answer adequately. Regardless, Stockmarr (1999) gave a justification of the database size adjustment using the concept of likelihood ratio, although the two hypotheses he used to form his likelihood ratio are different from that used in the typical probable-cause cases or the ones considered by Balding and Donnelly (1995, 1996). The U.S. DNA Advisory Board (DAB) endorsed the NP recommendation of the NRC-II report, with a clear statement that it answers a question different from the one answered by the RMP: DMP is not a reflection of the rarity of the profile; it reflects the chance of finding such a profile in a database of N profiles under some basic assumptions (DNA Advisory Board 2000).

4. Likelihood ratio using the event of a single match in a database. Arguing that the recommendation by the NRC-II report, namely the calculation of DMP, erroneously dilutes the strength of the DNA evidence in a cold-hit case, Balding and Donnelly (1995) computed the likelihood ratio (LR) for the hypothesis Hp: the suspect is the source of the crime scene DNA profile, contrasted with its complement; namely, someone else is the source of this DNA. Balding and Donnelly (1995) then derived an LR, giving

$$\mathrm{LR} = \frac{1}{P}\left[1 + \frac{N - 1}{n - N}\right], \tag{2}$$

where n is the number of potential perpetrators; N, the size of database searched; and P, the RMP for the relevant DNA profile (see Note 1). When N = 1, i.e., only one suspect is searched (in which case the cold-hit case reduces to the probable-cause scenario), LR = 1/P, the one derived from RMP alone. Although the logic of deriving this LR value is somewhat technical (Balding and Donnelly 1995), its thrust is on the observation that although N persons’ profiles had been searched, none other than the suspect matches the DNA profile of the crime scene sample. When n (number of potential suspects) is much larger than N (database size), the LR of this equation approximates 1/P (in fact, slightly higher), again predicted by the RMP alone. Furthermore, if the DNA profiles of all potential suspects are searched, the LR of this equation becomes infinite, meaning that the data (i.e., a single match in the database) are consistent with the hypothesis Hp alone, essentially identifying the true source of the evidence sample. Later we will return to this last observation, which will clarify further that these four approaches are not necessarily discordant with one another.
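The three limiting behaviors described above (N = 1, n much larger than N, and N = n) can be checked numerically. The sketch below evaluates LR = [1 + (N − 1)/(n − N)]/P as written in equation 2; the function name and all numerical inputs are hypothetical and serve only to illustrate the formula.

```python
import math


def balding_donnelly_lr(p: float, n_db: int, n_pop: int) -> float:
    """LR = [1 + (N - 1)/(n - N)] / P, with N = n_db and n = n_pop."""
    if not 1 <= n_db <= n_pop:
        raise ValueError("require 1 <= N <= n")
    if n_db == n_pop:
        # Every potential perpetrator was searched and only one matched.
        return math.inf
    return (1 + (n_db - 1) / (n_pop - n_db)) / p


# Hypothetical inputs, for illustration only.
p = 1e-9             # assumed random-match probability
n_pop = 10_000_000   # assumed number of potential perpetrators

for n_db in (1, 100_000, 1_000_000, n_pop):
    lr = balding_donnelly_lr(p, n_db, n_pop)
    print(f"N = {n_db:>10,}  LR = {lr:.4e}  LR * P = {lr * p:.4f}")
```

The last column shows the factor by which the LR exceeds 1/P: it equals 1 for a single suspect, stays close to 1 while the database is a small fraction of the population, and grows without bound as the database approaches the whole population.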

Universality of Random-Match Probability

The above discussions clearly demonstrate that the four approaches for providing statistics for DNA evidence in cold-hit cases ask four different questions (i.e., rarity in the population at large; rarity based on additional loci, ignoring the ones used in the search process; chance of finding the target profile in a database as large as the one searched; and likelihood ratio based on the finding that no profile in the database other than the suspect’s matches the evidence DNA profile). The concept of the RMP is imbedded in every one of these computations, implying that the rarity of the profile is unchanged whether a database search has occurred or not. In this sense, the RMP is universal.

Discussion also has occurred in the literature as to why the LR based on the DMP (as derived by Stockmarr 1999) is different from the one suggested by Balding and Donnelly (1995, 1996). Evett et al. (2000a) and Storvik and Egeland (2007) showed that although both approaches are mathematically correct, they consider two different sets of hypotheses to construct their respective LRs. For example, Balding and Donnelly’s LR is related to the hypotheses Hp, the suspect is the source of the crime scene sample, versus Hd, some person other than the suspect is the source; while Stockmarr considered Hp, one of the persons in the database is the source of the crime scene sample, versus Hd, the crime scene sample belongs to a person outside the database searched. It is an elementary statistical fact that when the contrasting hypotheses differ, the resulting LRs can differ. Again, different questions giving different answers does not constitute a controversy of statistical strength for cold-hit DNA evidence, but the issue could be which question is relevant and which is not.
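To make the contrast concrete, the sketch below evaluates both likelihood ratios for the same inputs. The first function follows equation 2. The second is an assumption made for illustration: it takes the database-adjusted LR to be 1/P divided by N, following the NRC-II wording quoted earlier, rather than reproducing Stockmarr's full derivation. Function names and numbers are hypothetical.

```python
def lr_suspect_vs_other(p: float, n_db: int, n_pop: int) -> float:
    """Equation 2: Hp 'the suspect is the source' vs. Hd 'someone else is'."""
    return (1 + (n_db - 1) / (n_pop - n_db)) / p


def lr_database_vs_outside(p: float, n_db: int) -> float:
    """Assumed simplification: 1/P divided by N, for Hp 'the source is in
    the database' vs. Hd 'the source is outside the database'."""
    return 1 / (n_db * p)


# Hypothetical inputs, for illustration only.
p = 1e-9              # assumed random-match probability
n_db = 1_000_000      # assumed database size
n_pop = 100_000_000   # assumed number of potential perpetrators

lr_bd = lr_suspect_vs_other(p, n_db, n_pop)
lr_db = lr_database_vs_outside(p, n_db)
print(f"suspect vs. other person : {lr_bd:.4e}")
print(f"database vs. outside     : {lr_db:.4e}")
```

Both numbers are built from the same P; they differ only because the pairs of hypotheses being contrasted differ.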

Critics argue that the RMP is not relevant, a position that contradicts the fact that rarity is unchanged even when a database search is made. Consequently, the RMP is a relevant, robust statistic. This robustness is further supported because the LR based on the fact that no profile in the database other than the suspect’s matches the crime scene DNA profile (see Balding and Donnelly 1996) is close to the one provided by the RMP alone. Expressing the result in the form of an LR has created problems in the literature on this subject several times, as evident in the exchange of letters to the editor by Devlin (2000) and Evett et al. (2000b), also summarized again in Balding (2002). This problem arises from reverse conditioning, or “Prosecutor’s Fallacy,” which is not uncommon for statistical experts as well as judges and juries (see, e.g., the last sentence of the California Supreme Court ruling in People v. Nelson 2008).

A solution to Prosecutor’s Fallacy would be to provide the probabilities of the numerator and denominator of any LR separately, which can be simple to explain in nontechnical words and more easily understood, creating less confusion. Thus how common the profile is (the answer to which is given by RMP) and what the chance is of finding the profile in a database of size N (answer given by DMP) remain germane questions in providing the statistical strength of cold-hit DNA matches. This is the position espoused by the DAB (2000), while endorsing the NRC-II (1996) recommendation on the subject. Before providing this clarification of the NRC-II recommendation (in particular the language of Recommendation 5.1), the DAB sought comments from the chair of the NRC-II committee, James F. Crow, who, on a later occasion, gave an affidavit stating, “In my view, the rarity of a profile is always a relevant question for the trier of fact regardless of how a suspect is first identified. This question is answered by the RMP calculation.” He further stated, “As the chair of NRC II, I do not believe that Recommendation 5.1 was intended to eliminate the use of the RMP as one number in the context of cold hit cases” (see the affidavit, filed on 9/3/2004, in United States v. Jenkins 2005).

Having gone through the brief accounts of the different approaches, let us specifically examine if DMP and RMP values should be regarded as vastly different weights of DNA evidence in cold-hit cases, as implied in the last few sentences of the newspaper article on the People v. Puckett decision published in the Los Angeles Times on May 4, 2008 (Felch and Dolan 2008). Leaving technical details aside, in that case, the target profile was a seven-locus profile, the search of which in California’s DNA database (N = 338,000 approximately) identified Puckett as the only person matching the crime-sample profile. The RMP value (1 in 1.1 million) reported to the court in that case was based on five and a half loci, because some alleles in the evidence sample were below the analytical threshold set by the laboratory. Thus the RMP value of 1 in 1.1 million quoted in the Los Angeles Times article is not the one that should enter the DMP calculation giving the resultant number of 1 in 3 (approximately). Regardless of the correct details, one must be cautious to explain what this number (1 in 3) truly represents. Foremost, it does not ever mean that the RMP (signifying the rarity of the specific target profile) is 1 in 3.

First, the fact that the search of the California DNA database found a match of the evidence profile only for Puckett makes the question of how often this should have occurred somewhat moot, because the event has occurred already. The DMP number (properly calculated) truly estimates the expected number of replicates of such databases in which this profile is to be found. Thus a DMP of 1 in 3 (setting aside its inaccuracy) really means that in 1 of every 3 replicates of databases of a size equivalent to that of the California DNA database, this profile is expected to be found. Note that 3 multiplied by the database size (N = 338,000 approximately) brings us back to the reciprocal of the RMP, making the inference from DMP in perfect agreement with that from RMP. Thus any statement (such as the one made in Felch and Dolan 2008) that court decisions based on presentation of the RMP estimate alone might have been reversed if the DMP also had been presented is a mere reflection of the misunderstanding of the interpretation of the DMP. The fact that the evidence profile did indeed match Puckett’s profile in the California DNA database possibly had other reasons as well, which we will revisit in the next section.
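The arithmetic behind this argument can be traced in a few lines. The sketch uses the approximate figures quoted in the text (N of about 338,000 and an RMP of 1 in 1.1 million), with the caveat already stated above that this RMP is not the value that should strictly enter the DMP calculation; the variable names are hypothetical.

```python
# Approximate figures as quoted in the text; illustrative only.
n_db = 338_000        # approximate size of the database searched
rmp = 1 / 1_100_000   # RMP value reported to the court

np_rule = n_db * rmp            # NP approximation of the DMP
replicates = 1 / np_rule        # database replicates per expected match
profiles_per_match = replicates * n_db  # total profiles searched per match

print(f"NP                          = {np_rule:.3f}")
print(f"replicates per match        = {replicates:.2f}")
print(f"profiles searched per match = {profiles_per_match:,.0f}")
```

NP comes out near 0.31, roughly 1 in 3, and multiplying the number of replicates by the database size returns the reciprocal of the RMP by construction, which is the agreement between the two statistics described above.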

The History and Future of Database Searches

The CODIS database, or similar ones in other countries, stems from projects that originated from sociological observations made in criminological sciences. Two phenomena, recidivism and graduated offenses, contribute to the theory that the earlier a criminal is detected and apprehended, the lesser the burden of future crimes will be. Recidivism refers to repeat crimes, and graduated offenses means progressive increase in the severity of crimes in repeat offenses; empirical data on both are reported from crime statistics of many countries (see, e.g., McEwen and Reilly 1994; U.S. Department of Justice, Bureau of Justice Statistics 1989; and citations in Walsh and Buckleton 2005). Although the estimates of the rate of repeat offenses (recidivism) are somewhat debatable (depending upon the survey design of such studies; see, e.g., Fischer 2005), it is generally concluded that a majority of crimes are committed by repeat offenders.

As cited in Butler (2005), estimates of recidivism for violent crimes may be more than 60 percent within three years of the first offense. Wickenheiser (2004) cites that an “average” serial rapist commits eight sexual assaults prior to apprehension. These statistics are staggering, implying that in no-suspect cases, DNA databases could be the primary tool for searching to develop a suspect lead. Because a sizable number of subjects in such offender databases may be out on the street at the time of the crime being investigated, it may be argued that databases such as the one in CODIS in the United States or any other country (see Butler [2005] and Walsh and Buckleton [2005] for lists of such worldwide databases) may not be truly random samples in which matches of crime-scene-related DNA profiles will be found by coincidence alone.

In fact, Walsh and Buckleton (2005) reported a strong correlation between the submission rate (of DNA profiles) from the crime scene and the hit rate (of cold hits) across different districts of New Zealand. Table 1 provides similar summary statistics based on NDIS (the National DNA Index System) data from CODIS (extracted from the Web site http://www.fbi.gov/hq/lab/codis/clickmap.htm, as of January 16, 2009). State-to-state variation of the number of offender profiles as well as forensic profiles is quite wide, as is the number of cases aided by the database (equated as the number of cold hits for the purpose of this discussion).

Table 1: Statistics of Cold Hits and Success Rates Based on NDIS Data from CODIS (Extracted from the website http://www.fbi.gov/about-us/lab/biometric-analysis/codis/ndis-statistics, as of January 16, 2009)

Figure 1 shows that the number of investigations aided (by the database) is strongly correlated with the number of offender profiles (panel a; r = 0.826, p < 10⁻⁴) and even more so with the number of forensic samples in the respective state database (panel b; r = 0.929, p < 10⁻⁴). Comparatively, however, if the ratio of the number of investigations aided by the database in relation to the number of forensic samples is considered the success rate, its correlation with the number of offenders is weaker (r = 0.481, p < 10⁻³), though significant (Figure 2). Taken together, these analyses show that the larger the database size, the greater the success rate of the crime investigation (i.e., cold hit), but other factors may be involved in predicting the chance of finding a match of DNA profiles of crime scene samples in the database, explaining its lower correlation with the number of offender samples in the database.

Figure 1: Correlation between (a) offender profiles and investigations aided and (b) forensic samples and investigations aided

Figure 2: Correlation between offender profiles (x-axis) and proportions of success (investigations aided/forensic samples; y-axis); the correlation coefficient is 0.4811

Overall, with total CODIS data of more than 6.3 million offender samples and 79,320 investigations aided for 245,171 forensic samples submitted (as of January 16, 2009), the success rate of 32.4 percent (derived from 79,320/245,171) provides a basis for predicting what the future might entail. With the optimum success of the CODIS program defined as containing the DNA profiles of all active criminals of the country in the database, the chance of finding a match of a crime scene sample would be dictated solely by the probability that the crime is committed by a repeat offender and very little by anything else. The NP rule of the NRC (1996) in that scenario becomes virtually uninformative by itself (and probably is misleading as well); first, because the database is not a random sample in relation to crime-related profiles, and second, because it does not use the information that profiles of everyone else in the database (consisting of both active and inactive criminals) other than the suspect do not match that of the crime scene sample.
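The overall success rate quoted above follows directly from the two NDIS totals given in the text. A minimal check, with hypothetical variable names:

```python
# NDIS totals as quoted in the text (as of January 16, 2009).
investigations_aided = 79_320
forensic_samples = 245_171

# Success rate as defined in the text: investigations aided per forensic sample.
success_rate = investigations_aided / forensic_samples
print(f"success rate = {success_rate:.1%}")
```

Note that the offender database size does not enter this ratio at all, which is consistent with the point that the hit rate reflects factors other than the rarity of any particular profile.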

In contrast, P, the random-match probability, is still the signature of the rarity of the profile, which gives a conservative strength of the evidence even if the LR argument is used (because the Balding-Donnelly LR value is always larger than 1/P; see equation 2). The observation that, as of now, cold hits have occurred in nearly 80,000 cases also supports the notion that the chance of such hits is dictated largely by factors that have little to do with the rarity (i.e., RMP) of the target DNA profile of interest. Although the nonrandomness of offender samples in relation to crime scene samples had been a known premise from the beginning, the seemingly approaching plateau of the scatter points of success rates of cold hits plotted against increasing values of offender database size (seen in Figure 2) further supports the notion that any empirical observations on finding a match of a crime-related DNA profile in such databases cannot be used to estimate its rarity.

Conclusion and Epilogue

Recent court discussions, media writings, and scientific literature on the statistical strength of DNA evidence in cold-hit cases do not clearly reflect that in cold-hit cases, a DNA match in the database is the primary initial step in developing a suspect but not the only one before bringing the suspect to trial. Thus the distinction between cold-hit and probable-cause cases is not as sharp as generally portrayed in these discussions. Although the suggestion of four different approaches for presenting DNA statistics in cold-hit cases is criticized for lack of general agreement, our commentary clearly spells out that their differences lie in the questions they answer and not the manner used to calculate the statistic after the question is defined.

One approach, the RMP based on an independent set of additional markers suggested by the NRC-I report (1992), has not been well received and generally has been regarded as impractical (because of the unavailability of suitable, sufficiently validated, additional markers and/or the compromising quality and quantity of the evidence sample) and, even if employed, ignores important information. The RMP is always relevant, and the other two questions (i.e., leading to the concepts of DMP and LR) may be helpful in some scenarios, but answers to them rest on the premise that the offender database consists of DNA profiles of samples that are random with regard to crime-related samples. Although not discussed in detail, two other statistics, recently proposed by Song et al. (2009), also suffer from this wrong assumption.

The RMP plays a pivotal role in answering all of these suggestions; it reflects the rarity of a profile, irrespective of how this profile has become the subject of investigation (i.e., cold hit or not). Further, the usual (mis)interpretation of the NRC-II report’s Recommendation 5.1 (1996) is clarified here, in that its NP rule is not a replacement of the RMP. In fact, a properly interpreted DMP statistic (i.e., DMP implies the inverse of the expected number of replicates of the database needed to find the target profile) leads to a statistical weight exactly identical to that provided by RMP. The LR interpretation (as presented in DAB 2000 and Stockmarr 1999) of the DMP relates to two contrasting hypotheses that differ from the ones contrasted in the LR constructed by Balding and Donnelly (1995, 1996). Presentation of the RMP is almost equivalent to presenting the Balding-Donnelly LR statistic, because with the current size of the DNA databases that are searched, their LR value is only slightly higher than the one (1/P) obtained from the RMP.

Finally, presentation of RMP is also operationally simple, and it avoids the mistake of reverse conditioning (i.e., Prosecutor’s Fallacy), common in the presentation of the LR-based statistic. Thus it can be reasoned that in cold-hit cases in which the suspect is identified in the absence of valid alibis for not having access to the crime scene, a DNA match can and should be quantified by RMP alone without any additional changes.

Note

1. The notations in equation 2 have been changed from those of the cited authors (Balding and Donnelly 1995) to make them consistent with those of the NRC-II (1996), where N was designated as the size of the database searched.

Acknowledgments

Wen Niu, University of Cincinnati, helped in computations and graphical display of the summary statistics from the CODIS database. Access to the summary statistics of the CODIS database from the Web site helped immensely in the logic presented in this commentary. Bruce Budowle, formerly with the FBI Laboratory, provided several editorial comments that improved the presentation.

References

Balding, D. J. Errors and misunderstanding in the second NRC report, Jurimetrics Journal (1997) 37:469–476.

Balding, D. J. The DNA database search controversy, Biometrics (2002) 58:241–244.

Balding, D. J. and Donnelly, P. Inference in forensic identification, Journal of the Royal Statistical Society, Series A, Statistics in Society (1995) 158:21–53.

Balding, D. J. and Donnelly, P. Evaluating DNA profile evidence when the suspect is identified through a database search, Journal of Forensic Sciences (1996) 41:603–607.

Beck, A. J. and Shipley, B. E. Recidivism of prisoners released in 1983, Bureau of Justice Statistics Special Report. NCJ-116261. A. J. Beck, Ed. U.S. Government Printing Office, Washington, D.C., 1989. Available: http://www.ojp.usdoj.gov/bjs/pub/pdf/rpr83.pdf.

Butler, J. M. Forensic DNA Typing: Biology, Technology, and Genetics of STR Markers. 2nd ed. Elsevier, Amsterdam, 2005.

Dawid, A. P. and Mortera, J. Coherent analysis of forensic identification evidence, Journal of the Royal Statistical Society, Series B, Methodological (1996) 58:425–443.

Devlin, B. The evidentiary value of a DNA database search, Biometrics (2000) 56:1276–1277.

DNA Advisory Board. Statistical and population genetic issues affecting the evaluation of the frequency of occurrence of DNA profiles calculated from pertinent population database(s), Forensic Science Communications [Online]. (July 2000). Available: http://www.fbi.gov/hq/lab/fsc/backissu/july2000/dnastat.htm.

Evett, I. W., Foreman, L. A., and Weir, B. S. Letter to the editor of Biometrics, Biometrics (2000a) 56:1274–1275.

Evett, I. W., Foreman, L. A., and Weir, B. S. A response to Devlin, Biometrics (2000b) 56:1277.

Felch, J. and Dolan, M. The odds of justice can be long. Los Angeles Times, California/Local, May 4, 2008. http://articles.latimes.com/2008/may/04/local/me-dna4-copy.

Fischer, R. G. Are California’s recidivism rates really the highest in the nation? It depends upon what rates of recidivism you use, The Bulletin (2005) 1:1–4. (See also http://ucicorrections.seweb.uci.edu/pubs.)

Kaye, D. H. Rounding up the usual suspects: A logical and legal analysis of DNA trawling cases, Science and Law Blog, April 5, 2008, and North Carolina Law Review (2009) 87:425–503.

Lempert, R. After the DNA wars: Skirmishing with NRC II, Jurimetrics Journal (1997) 37:439–468.

McEwen, J. E. and Reilly, P. R. A review of state legislation on DNA forensic data banking, American Journal of Human Genetics (1994) 54:941–958. Correction printed in American Journal of Human Genetics (1995) 56:358.

Morton, N. E. The forensic DNA endgame, Jurimetrics Journal (1997) 37:477–494.

National Research Council. Committee on DNA Technology in Forensic Science. DNA Technology in Forensic Science. National Academy Press, Washington, D.C., 1992.

National Research Council. Committee on DNA Forensic Science: An update. The Evaluation of Forensic DNA Evidence. National Academy Press, Washington, D.C., 1996.

People v. Johnson, 43 Cal. Rptr. 3d 587 (Ct. App. 2006).

People v. Nelson, 48 Cal. Rptr. 3d 399 (Ct. App. 3 Dist. 2006), rev. granted 147 P. 3d 1011 (Cal. 2006).

Song, Y. S., Patil, A., Murphy, E. E., and Slatkin, M. Average probability that a “cold hit” in a DNA database search results in an erroneous attribution, Journal of Forensic Sciences (2009) 54:22–27.

State of Arizona v. Luong, CR 2007-145984-002 SE (2008).

Stockmarr, A. Likelihood ratios for evaluating DNA evidence when the suspect is found through a database search, Biometrics (1999) 55:671–677.

Storvik, G. and Egeland, T. The DNA database search controversy revisited: Bridging the Bayesian-frequentist gap, Biometrics (2007) 63:922–925.

United States v. Jenkins, 887 A.2d 1013b (D.C. 2005).

Walsh, S. and Buckleton, J. DNA intelligence databases. In: Forensic DNA Evidence Interpretation. J. Buckleton, C. M. Triggs, and S. J. Walsh, Eds. CRC Press-Taylor & Francis, Boca Raton, Florida, 2005, pp. 439–469.

Wickenheiser, R. A. The business case for using forensic DNA technology to solve and prevent crime, Journal of Biolaw & Business (2004) 7:34–50.