Reply to Weir (2001)
Bruce Budowle
Senior Scientist
Federal Bureau of Investigation
Washington, DC
Ranajit Chakraborty
Professor
Human Genetics Center
University of Texas School of Biomedical Sciences
Houston, Texas
George Carmody
Professor
Department of Biology
Carleton University
Ottawa, Canada
Keith L. Monson
Research Chemist
Forensic Science Research Unit
Federal Bureau of Investigation
Quantico, Virginia
The comment by Weir (2001) criticizes
the approach for assessing source attribution by Budowle et al.
(2000) by suggesting that it is of little consequence that a
profile is rare in the population and that because of relatedness
in the population, the assumption of independence implicit in
using the binomial (1 - p_x)^N
is inappropriate. Weir's criticisms fail to acknowledge a pragmatic
application of a robust statistical model, the predictions of
which comport with observations even though its assumptions are
not strictly satisfied, and he also fails to recognize the low
degree of relatedness in the populations typically encountered.
Weir (2001) states that our
approach is inappropriate because the assumption of independence
is violated in the population. In statistics, models often are
used to help interpret data. These models are simplified representations
of a phenomenon and do not exactly represent the real world.
Weir's model for independence is a population with no common
evolutionary history, complete random mating, population homogeneity,
no linkage, no selection, no mutation, and no migration; in
other words, an idealized Hardy-Weinberg population. Such
a population does not exist. Yet, the Hardy-Weinberg model
prevails and often is used to describe genetic markers. Even
though complete independence is never realized, the genetic markers
used in forensics generally meet Hardy-Weinberg expectations
(NRC-II 1996). This is because the assumption of independence
is a reasonable approximation for the genetic markers used in
forensic human identity testing.
Both the NRC-II Report (1996)
and the DNA Advisory Board (DAB) recommendations on statistics
(DAB 2000) recognize that rarely is there only one statistical
approach to interpret and explain evidence. The DAB recommendations
state, "The choice of approach is affected by the philosophy
and experience of the user, the legal system, the practicality
of the approach, the question(s) posed, available data, and/or
assumptions." Moreover, the DAB recognizes that simplistic
and less rigorous approaches can be employed, as long as false
inferences are not conveyed (DAB 2000). We wholeheartedly agree
and therefore justify our approach (which actually was first
proposed by the NRC-II Report [1996]) as it is easy to understand
and compute and is a demonstrably conservative approximation.
Although it is true that no population meets the above-stated
Hardy-Weinberg criteria, extant data indicate that population
substructure has minimal effect on computations of profile frequencies.
The NRC-II Report (1996) recommends using a θ of 0.01
to correct for substructure; however, the true value of θ is
much lower than 0.01. Budowle et al. (in press) found that θ (or
F_ST) estimates over all 13 core CODIS STR loci are
0.0006 for African Americans, 0.0005 for U.S. Caucasians,
0.0021 for Hispanics, and 0.0039 for Asians. The population data
on nine of the thirteen CODIS STR loci described by Chakraborty
et al. (1999) were shown to have G_ST values of 0.000
for African Americans, 0.001 for Caucasians, and 0.001 for Asians
(unpublished data).
Although Weir argues that
source attribution should be based on computations of the conditional
probability, the effect is of little consequence for forensically
relevant populations. For example, Chakraborty et al. (1999)
showed that even when considering the upper 95 percent confidence
limit of the most common 13 STR locus profile in African Americans,
the rarity of the profile changes from 1 in 100 × 10^9
to 1 in 72 × 10^9 if the conditional
probability is used. Therefore, the assumption of independence
has little practical consequence on such estimates.
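To make the comparison concrete, the following is a minimal sketch, not the computation reported by Chakraborty et al. (1999). It assumes the conditional probability takes the θ-adjusted form recommended in the NRC-II Report (1996) for genotype match probabilities within a subpopulation. The allele frequencies, the 13 identical heterozygous loci, and the function names are hypothetical and chosen only for illustration.

```python
def het_conditional(p_i, p_j, theta):
    """Theta-adjusted match probability for a heterozygous genotype (assumed form)."""
    numerator = 2 * (theta + (1 - theta) * p_i) * (theta + (1 - theta) * p_j)
    return numerator / ((1 + theta) * (1 + 2 * theta))


def hom_conditional(p_i, theta):
    """Theta-adjusted match probability for a homozygous genotype (assumed form)."""
    numerator = (2 * theta + (1 - theta) * p_i) * (3 * theta + (1 - theta) * p_i)
    return numerator / ((1 + theta) * (1 + 2 * theta))


def profile_probability(loci, theta):
    """Multiply per-locus match probabilities across loci.

    Each locus is a tuple of allele frequencies: (p,) for a homozygote,
    (p_i, p_j) for a heterozygote.
    """
    prob = 1.0
    for alleles in loci:
        if len(alleles) == 1:
            prob *= hom_conditional(alleles[0], theta)
        else:
            prob *= het_conditional(alleles[0], alleles[1], theta)
    return prob


if __name__ == "__main__":
    # Hypothetical profile: 13 heterozygous loci with allele frequencies 0.2 and 0.1.
    hypothetical_profile = [(0.2, 0.1)] * 13
    independent = profile_probability(hypothetical_profile, 0.0)
    for theta in (0.0005, 0.001, 0.01):
        conditional = profile_probability(hypothetical_profile, theta)
        print(f"theta={theta}: conditional / independent = {conditional / independent:.2f}")
```

With θ = 0 the functions reduce to the usual product-rule terms 2 p_i p_j and p_i^2, so the printed ratios show how far the conditional probability moves the estimate away from the independence assumption for a given θ.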
However, we recognize that
when using simplified models a degree of conservatism should
be built into the analysis. First, the FBI uses a population
size (N) of 260 million. Rarely, if ever, would a population
of 260 million be meaningful in a forensic context. In principle,
assessing source attribution should be considered within the
context of the case. However, defining the true size of the potential
population can be difficult. The large value of N adds
a substantial buffer, by orders of magnitude, to the threshold
estimate. Second, the rarity of the profile is estimated according
to the recommendations in the NRC-II Report (1996). The frequency
p_x includes a correction for deviations from
Hardy-Weinberg expectations, using θ, and
is further corrected for sampling variation by multiplying the
frequency by a factor of ten. As already stated, the value of
θ used is 0.01, even though realistic estimates
of θ are much smaller. Third, the threshold confidence
level for opining source attribution is 0.99, resulting in a
minimum threshold match probability of 1 in 2.6 × 10^10
for a population of 260 million (Budowle et al. 2000). Rarely
is this specific threshold value observed. The average match
probability for a 13-locus STR profile, with adjustments for
the effect of population substructure, ranges from less than
1 in 10^12 (in Apaches) to 1 in 10^15 (in
major population groups; Chakraborty et al. 1999). Therefore,
with the genetic typing tools used today, the degree of confidence
typically is several orders of magnitude higher than 0.99.
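The threshold quoted above follows from requiring (1 - p_x)^N to be at least the chosen confidence level and solving for p_x. The sketch below reproduces that arithmetic under the independence assumption. The function names are hypothetical, and the smaller population sizes in the loop are illustrative values, not figures drawn from casework.

```python
import math


def threshold_frequency(population_size, confidence):
    """Largest p_x satisfying (1 - p_x)**N >= confidence."""
    # p_x = 1 - confidence**(1/N), written with expm1 to keep precision for tiny p_x.
    return -math.expm1(math.log(confidence) / population_size)


def confidence_unique(p_x, population_size):
    """(1 - p_x)**N: chance that none of N unrelated individuals shares the profile."""
    return math.exp(population_size * math.log1p(-p_x))


if __name__ == "__main__":
    # Threshold for the population size in the text and two illustrative smaller ones.
    for n in (260e6, 10e6, 100e3):
        p = threshold_frequency(n, 0.99)
        print(f"N = {n:.0e}: threshold about 1 in {1 / p:.2e}")

    # Confidence implied by the average match probabilities cited in the text.
    for one_in in (1e12, 1e15):
        c = confidence_unique(1 / one_in, 260e6)
        print(f"profile 1 in {one_in:.0e}: confidence = {c:.10f}")
```

For N = 260 million and a confidence of 0.99 this gives a threshold near 1 in 2.6 × 10^10, in agreement with the value cited from Budowle et al. (2000). A smaller N yields a less stringent threshold, which is the sense in which the large N acts as a buffer.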
For practical purposes, the
need for a conditional probability logically applies only when
the true contributor of the profile belongs to the same subpopulation
as the suspect (i.e., shares a common evolutionary history).
Rarely does such a situation occur. The probability of observing
an extremely rare profile again would most likely be greater
in a group of individuals with a common evolutionary history.
Should such an occasion arise, Weir would presumably advocate
a higher value for θ and employ a conditional probability.
Our approach for assessment of source attribution would be more
conservative, even under this scenario. As θ increases,
logically the size of the subpopulation must decrease. To apply
a realistic conditional probability, the effective size of the
relevant population (N) would have to be substantially
less than the currently used value of 260 million. Thus, the
threshold frequency for opining source attribution would not
be as stringent as the one currently used if a conditional
probability were employed.
We conclude that there is
support (including the NRC-II Report [1996]) for use of our simple
model, both from a practical point of view and from extant population
data. Our approach is a reasonable approximation because of the
low level of substructure in forensically relevant populations.
There is nothing inappropriate about being conservative (NRC-II
1996). The threshold is conspicuously conservative and thus would
not create any undue bias. In the end, we see no difference between
our approach and that of Weir (1995) after his testimony in the
O. J. Simpson trial where he opined, "Presentation of a
number such as 1 in 57 billion suggests that it is inconceivable
that the rear-gate profile ... would be found in a random individual
(after all, there are only 5 billion people on the planet). Thus,
the frequency ... in a population will be so low that the need
for presenting probability numbers in cases where one identifiable
profile is present appears to me to be superfluous."
References
Budowle, B., Chakraborty,
R., Carmody, G., and Monson, K. L. Source attribution of a forensic
DNA profile, Forensic Science Communications (July 2000).
Available at: http://www.fbi.gov/programs/lab/fsc/backissu/july2000/source.htm
Budowle, B., Shea, B., Niezgoda,
S., and Chakraborty, R. CODIS STR Loci Data from 41 Sample Populations,
Journal of Forensic Sciences (in press).
Chakraborty, R., Stivers,
D. N., Su, B., Zhong, Y., and Budowle, B. The utility of STR
loci beyond human identification: Implications for the development
of new DNA typing systems, Electrophoresis (1999) 20:1682-1696.
DNA Advisory Board (DAB).
Statistical and population genetics issues affecting the evaluation
of the frequency of occurrence of DNA profiles calculated from
pertinent population database(s) (approved February 23, 2000),
Forensic Science Communications (July 2000). Available
at: http://www.fbi.gov/programs/lab/fsc/backissu/july2000/dnastat.htm
National Research Council
(NRC-II). The Evaluation of Forensic DNA Evidence. National
Academy Press, Washington, DC, 1996.
Weir, B. S. DNA match and
profile probabilities: Comment on Budowle et al. (2000) and Fung
and Hu (2000), Forensic Science Communications (January
2001). Available at: http://www.fbi.gov/programs/lab/fsc/current/weir.htm
Weir, B. S. DNA statistics
in the Simpson matter, Nature Genetics (1995) 11:365-368.