Molecular Ecology, In
Press, all rights reserved
Title: Nucleotide Diversity in Populations of
Authors: Richard L. Borowsky1 and Chavalit Vidthayanon2
Addresses: 1) Cave Biology Research Group, Department of
Biology,
Date of Receipt:
Keywords: Nucleotide Diversity, Troglobitic Fishes,
Balitoridae, RAPD
Running Head: Nucleotide diversity in fishes
Correspondence:
Richard L. Borowsky, Cave Biology Research Group, Department of Biology,
Abstract
Genetic
variabilities in four cave and eight surface species of balitorid freshwater
fishes from
Introduction
Obligatorily
cave dwelling (troglobitic) fishes warrant attention because so many of their
populations are threatened. Of the 85
currently recognized cave fish species, five are listed as
"Endangered" or "Critically Endangered" and 46 as
"Vulnerable," on the IUCN Red Lists (Proudlove, 1997; Weber et al. 1998). Their proportionate representation on the Red
Lists exceeds that of almost any other animal group.
Because of the limited accessibility
and occult nature of their habitat, relatively little is known about the
population attributes of troglobitic fishes.
Data on effective population sizes and relative genetic variability, for
example, are sparse, although they are of particular importance in assessment
and management of threatened populations.
The few data available suggest troglobitic fish populations have low
genetic variability and are relatively small in size. Poulson
(1963) sampled ten populations of three species of troglobitic amblyopsids, and
estimated their numbers to range on the orders of 10^1 to 10^2. Mitchell et
al. (1977) estimated the size of the
Only three studies have directly
compared genetic variability in troglobitic and related epigean fish species
(Avise and Selander, 1972; Swofford et
al., 1980; Perez and Moodie, 1993).
Each assessed genetic variability by enzyme electrophoresis, and showed
the troglobites to have lower heterozygosities than the comparison surface
species: on average, H = 0.01 vs. 0.06 (Table 1). The number of animals used in these studies
varied from scores to hundreds. Thus,
the study methodology placed a non-negligible cost on the studied population.
Here, we sought to test the
generality of the observation that cave fish populations have relatively low
genetic variability, and do so in a way that used a minimal number of
animals. We did this by sampling a small
number of individuals at a large number of loci. This strategy compensates for the increased
sampling variance from the small number of individuals by averaging over the
many loci. Nei (1978) has theorized that
reliable estimates of genetic heterozygosity can be obtained with sample sizes
as small as two individuals.
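The sampling argument can be illustrated with a small simulation. This is a sketch only: it assumes codominant, biallelic loci in random-mating populations and a hypothetical Beta(0.5, 0.5) distribution of allele frequencies, none of which comes from the present study, and it uses the unbiased single-locus estimator for a sample of 2n gene copies. It shows the general effect of averaging over many loci, not the RAPD analysis itself.

```python
import random
from statistics import mean, pstdev


def mean_heterozygosity_estimate(n_individuals, n_loci, rng):
    """Average the unbiased per-locus heterozygosity estimator over n_loci loci."""
    copies = 2 * n_individuals  # gene copies sampled per locus (diploid)
    estimates = []
    for _ in range(n_loci):
        # Hypothetical allele-frequency distribution; not taken from the paper.
        p = rng.betavariate(0.5, 0.5)
        count = sum(rng.random() < p for _ in range(copies))
        x = count / copies  # sample allele frequency
        estimates.append(copies / (copies - 1) * (1 - x * x - (1 - x) ** 2))
    return mean(estimates)


def spread_of_estimates(n_individuals, n_loci, replicates=100, seed=1):
    """Standard deviation of the multi-locus estimate across simulated surveys."""
    rng = random.Random(seed)
    runs = [mean_heterozygosity_estimate(n_individuals, n_loci, rng)
            for _ in range(replicates)]
    return pstdev(runs)


few_loci = spread_of_estimates(2, 10)     # two individuals, 10 loci
many_loci = spread_of_estimates(2, 1000)  # two individuals, 1000 loci
```

With two individuals in both cases, the spread of the estimate is expected to be much smaller for 1000 loci than for 10, since the sampling error of the average falls roughly with the square root of the number of loci.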
With small sample sizes it is
essential that sampling be unbiased and representative; this approach should be
applied cautiously if there is any evidence of non-random distributions of
related individuals. We feel this is
unlikely in the present case because the hypogean fishes sampled were sparsely
distributed and taken from different areas within the caves and the epigean
fishes were selected at random from larger numbers of fishes collected from
different parts of the stream.
We looked at phenotypic variability
of RAPD bands, which are anonymous markers amplified out of genomic DNA using
single or paired primers at low stringency (Welsh and McClelland, 1990; Williams
et al. 1990). An advantage of RAPDs in a study of this type
is that large numbers of loci can be screened with minimal effort. We examined over 1000 markers in each
population, using sample sizes of two or four individuals. We estimated nucleotide diversity (p)
from the RAPD data using a simple relationship between band sharing probability
and p (Borowsky, 2001).
Materials and Methods
Balitorid fishes
Twelve
(or thirteen, see below) species were studied, of which four are
troglobites: Cryptotora thamicola, Nemacheilus
troglocataractus, Schistura oedipus,
and a new cave fish species from Phitsanulok, S. aff. reidi. The first species belongs in the subfamily
balitorinae while the other three species belong to the nemacheilinae.
At present, there is no phylogenetic
hypothesis for the balitorids of
RAPD techniques
RAPD
bands were amplified using published techniques (Borowsky et al. 1995) modified as follows: amplifications were done using
Stoffel Taq polymerase (PE Cetus) and pairs of short primers (10 and 11 mers),
rather than AmpliTaq and long single primers.
Ten microliter reactions were cycled as follows: (94° for 3min); 40
cycles (94° for 1 min, 35° for 1 min, 72° for 2 min); 72° for 7 min. Products were labeled by the incorporation
of 33P, separated on 0.4 mm, 4% polyacrylamide sequencing gels under
denaturing conditions (50% Urea), and visualized by autoradiography.
The nine primers we used were: OPN24
(5'AGGGGCACCA3'), OPN28 (GCACCAGGGG), US6 (GTGGTGACAG), US12 (ACAGACAGTG), N1+
(ACGAAGAAGAG), N2+ (AAGAAGAGCAA), KA1+ (GAGGGTGCCTT), KA2+ (GGTGCCTTTGG) and
KB1- (TCTGGCTTGAA). These were used in
the following ten pairings: OPN24 with N1+ or N2+ or KA1+ or KA2+; OPN28 with
US6 or US12 or KA1+ or KB1-; US6 with N1+ or N2+.
Statistics
The
average number of bands scored per primer combination in each population ranged
from 68 to 127 (mean = 104 ± 17.5).
We scored ten different primer sets and obtained over 1000 bands per
population. Pairs of individuals from
the same populations were run in adjacent lanes, and we counted the numbers of
shared and unique bands for each pair. The
proportion of unmatched RAPD bands between pairs of individuals randomly chosen
from the population is our basic measure of diversity. This proportion is denoted fe, the Phenotypic Heterogeneity
Index. In each population, fe was determined separately for each
primer combination. These separate
estimates were then averaged to obtain an overall population estimate of fe and its standard error. Of the seventeen populations we studied,
eleven were represented by two individuals and six others by four. These six populations allowed us independent
replicate estimates of f, in order to assess its
consistency. Because f
is a linear function of p, at least in the biologically realistic
range of values (Borowsky, 2001), we were able to estimate p
from the data.
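The scoring procedure described above can be sketched as follows. The sketch assumes that f for one primer combination is the number of unique (unmatched) bands divided by the total number of bands scored for the pair, and that the standard error is the standard deviation across primer combinations divided by the square root of their number. The band counts are hypothetical and serve only to show the arithmetic.

```python
from math import sqrt
from statistics import mean, stdev


def f_for_pair(shared, unique):
    """Proportion of scored bands not shared by the two individuals of a pair."""
    total = shared + unique
    if total == 0:
        raise ValueError("no bands scored for this primer combination")
    return unique / total


def population_f(counts):
    """counts: (shared, unique) per primer combination. Returns (mean f, SE)."""
    values = [f_for_pair(s, u) for s, u in counts]
    se = stdev(values) / sqrt(len(values)) if len(values) > 1 else float("nan")
    return mean(values), se


# Hypothetical (shared, unique) band counts for four primer combinations.
example_counts = [(98, 2), (110, 5), (95, 5), (120, 0)]
f_mean, f_se = population_f(example_counts)
```

In the study itself ten primer combinations were scored per population, so `counts` would hold ten such pairs.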
Technical control
f is an
experimentally determined estimate and should be corrected before use for any
significant procedural estimation bias.
The most important potential source of bias would be variation in band
repeatability. To establish a baseline
for the reliability of RAPD band amplification under our experimental
conditions, we separately amplified replicates from the same DNA samples. We used two different DNA sources from each
of two species, eight replicates per source and one primer combination in this
control series.
Results
Control Series
The
total number of bands that were amplified for all the DNA samples in the
control series was 570. Of these, 566
were completely consistent across replicates, while four were
"sporadics," being present only about half the time. Thus, the strict
repeatability of RAPD band amplification in the control series was 99.3% (566/570). Because the existence of a sporadic would
contribute to f only when exhibited by one individual in
a pair, the correction to f is approximately 2s(1-s) times
0.007, where s is the probability that a sporadic band appears in a given amplification and 0.007 (4/570) is the proportion of sporadic bands. This correction was applied to
all estimates of f.
Its uniform application had no effect on statistical inferences, but did
correct for a potential source of bias in estimates of p.
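The control-series arithmetic can be checked with a few lines. In this sketch s is taken to be the chance that a sporadic band appears in a given amplification (about 0.5, since sporadics were present about half the time), and the correction is assumed to be subtracted from the raw estimate of f; the text states only that a correction was applied, so the subtraction step is an assumption.

```python
TOTAL_BANDS = 570       # bands amplified in the control series
CONSISTENT_BANDS = 566  # bands present in every replicate

sporadic_bands = TOTAL_BANDS - CONSISTENT_BANDS
repeatability = CONSISTENT_BANDS / TOTAL_BANDS
sporadic_fraction = sporadic_bands / TOTAL_BANDS  # about 0.007


def sporadic_correction(s, fraction):
    """Expected contribution of sporadic bands to f: one individual of a
    pair shows the band and the other does not, with probability 2s(1-s)."""
    return 2 * s * (1 - s) * fraction


def corrected_f(raw_f, s=0.5, fraction=sporadic_fraction):
    """Assumed form of the correction: subtract the bias, floor at zero."""
    return max(raw_f - sporadic_correction(s, fraction), 0.0)


correction = sporadic_correction(0.5, sporadic_fraction)
```

With s = 0.5 the correction is about 0.0035, which is small relative to the epigean values of f but not negligible relative to the lowest cave values.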
RAPD Band Variability
Figure
1 shows the results for populations of the three nemacheiline genera. The three troglobitic populations have the
lowest f values, with estimates ranging from 0.013 ± 0.006 to
0.047 ± 0.008. Values of f
for the epigean populations range from 0.074 ± 0.012 to 0.187 ±
0.025.
Figure 2 shows the results for the
balitorine genera. The troglobitic
species C. thamicola has a f
value of 0.014 ± 0.004. Estimates
for the epigean populations range in f values from 0.025 ± 0.008 to
0.195 ± 0.017.
In spite of variability within the
troglobitic and epigean groups, all four troglobites have lower mean f
values than the comparison epigean species (ranges of 0.020 to 0.035 vs 0.041 to 0.189). The mean value of f
for the cave fishes (0.025 ± 0.005 SE) is about 23% that of the epigean
species (0.110 ± 0.018) and significantly lower (t11 = 3.14,
P < 0.01).
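As a rough check on this comparison, a t statistic can be recomputed from the rounded summary values. The sketch assumes a pooled-variance two-sample t-test on four cave taxa and nine epigean taxa; the test type and the sample sizes are inferred from the 11 degrees of freedom and are not stated explicitly in the text.

```python
from math import sqrt


def pooled_t_from_summary(mean1, se1, n1, mean2, se2, n2):
    """Pooled-variance t statistic and degrees of freedom from means, SEs, ns."""
    var1 = se1 ** 2 * n1  # sample variance recovered from the standard error
    var2 = se2 ** 2 * n2
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    se_diff = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean2 - mean1) / se_diff, df


# Cave: 0.025 +/- 0.005 (assumed n = 4); epigean: 0.110 +/- 0.018 (assumed n = 9).
t_f, df_f = pooled_t_from_summary(0.025, 0.005, 4, 0.110, 0.018, 9)
```

Under these assumptions the statistic comes out close to, though not identical with, the reported value of 3.14; a small difference is expected because the inputs are rounded.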
Replication and Repeatability
Five
populations are represented by replicate estimates of f. In four cases, the replicate estimates are
almost identical. Only in one case, the
new cave species of Schistura from
Phitsanulok, do the estimates differ to the extent that the error bars do not
overlap. Even in this case, the high
estimate for the cave fish population is lower than any of the estimates of f
for comparison epigean populations.
We studied two populations of S. poculi and five of Homaloptera smithi to test the repeatability
of f among populations of the same species. The two estimates for S. poculi were virtually identical (0.145 ± 0.010 vs. 0.155 ± 0.013) although the
populations are 80 kilometers apart.
For H. smithi, f
values fell into two significantly different, non-overlapping ranges: high (0.181 to 0.197) for populations from
the mainland and low (0.035 to 0.046) for populations from the Malaysian
peninsula. Estimates of f
within each geographic area are virtually identical. The differences among areas are so great as
to suggest there are two distinct species, although this hypothesis awaits
testing. For statistical analyses, we
treated populations from these two areas as separate taxa: H. smithi from the mainland and
H. aff. smithi from the peninsula.
Preliminary mtDNA sequence data (16SrRNA, 12SrRNA and cytochrome b) strongly
support the specific distinction of the two forms (Borowsky, unpublished).
Nucleotide Diversity
Values
of p were calculated from estimates of fe by the approximation:
, where m is the number of bases per band screened in the
process (Borowsky, 2001). In this study,
in which RAPD bands were amplified with pairs of primers (either 10 or 11
bp), m averaged 20.8. Calculated values of p
for the troglobitic cave fishes were about 23% those of comparison surface
fishes: averages of 0.9 × 10^-3 vs.
3.9 × 10^-3 (Figures 1 and 2).
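The value of m can be reproduced from the primer sequences and pairings listed in Materials and Methods. The sketch assumes that the number of bases screened per band is the summed length of the two primers in a pairing; variable names are illustrative.

```python
# Primer sequences as listed in Materials and Methods (5' to 3').
PRIMERS = {
    "OPN24": "AGGGGCACCA",
    "OPN28": "GCACCAGGGG",
    "US6": "GTGGTGACAG",
    "US12": "ACAGACAGTG",
    "N1+": "ACGAAGAAGAG",
    "N2+": "AAGAAGAGCAA",
    "KA1+": "GAGGGTGCCTT",
    "KA2+": "GGTGCCTTTGG",
    "KB1-": "TCTGGCTTGAA",
}

# The ten pairings used in the study.
PAIRINGS = [
    ("OPN24", "N1+"), ("OPN24", "N2+"), ("OPN24", "KA1+"), ("OPN24", "KA2+"),
    ("OPN28", "US6"), ("OPN28", "US12"), ("OPN28", "KA1+"), ("OPN28", "KB1-"),
    ("US6", "N1+"), ("US6", "N2+"),
]

# Assumed: bases screened per band = length of primer A + length of primer B.
bases_per_band = [len(PRIMERS[a]) + len(PRIMERS[b]) for a, b in PAIRINGS]
m = sum(bases_per_band) / len(bases_per_band)

# Approximate bases assessed per population when about 1000 bands are scored.
bases_assessed = 1000 * m
```

Eight pairings combine a 10-mer with an 11-mer and two combine two 10-mers, which gives the average of 20.8 and roughly 20,800 bases assessed per population.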
Discussion
Sampling the genome
The basic measure of variability we used, fe, is the empirically determined
proportion of RAPD bands differing between two individuals drawn at random from
a population. The ten primer
combinations employed in this study allowed us to score over 1000 RAPD loci in
each population. RAPD loci are generally
unlinked or loosely linked with each other and sample variation throughout the
genome. With an average of 20.8 bases
screened per band, over 20,000 bases were assessed in each estimate
of fe. Thus, it is not surprising that the data show
that replicate estimates of fe are consistent within populations and among populations of
the same species. This lends empirical
support to Nei’s (1978) theoretical prediction that estimates of genetic
variability derived from small numbers of individuals may be quite reliable, if
a large number of gene loci are used.
Because RAPD is able to screen large numbers of loci with minimal
effort, this approach may prove particularly efficient for comparative studies
of genetic variation among large numbers of populations or species.
Nucleotide diversity in hypogean and
epigean fishes
The
limited data on allozyme variation cited in the introduction suggest that
troglobitic fish populations have lower genetic variability than related
surface populations. The present results
support this generalization with data at the level of DNA variation. Each of the four balitorid cave fishes we
tested fell significantly below the average genetic variability of an
appropriate panel of related comparison epigean fishes. The decreased genetic variation observed is
consistent with the expectation that the troglobitic fishes have smaller
population sizes than the epigean species.
Limited to single caves and cave systems, they certainly have smaller
ranges.
The foregoing does not preclude the
possibility that hypogean fish species with large population sizes may
eventually be found. Should such be
found, we predict they will have relatively high nucleotide diversities.
Estimates of p
in nuclear DNA can be obtained directly from sequence data, or indirectly from
restriction fragment analysis or RAPD/AFLP data. Sequence and restriction studies are
generally focused on small portions of the genome, while RAPD/AFLP studies are
genome-wide scans. Because specific
sites interesting or well enough known to warrant sequencing or restriction
analysis often have significant functions, selection may limit their
variability (Innan et al., 1999). As expected, average estimates of p
from sequencing and restriction are lower than those from RAPD and AFLP data
[0.0038 ± 0.0009 vs. 0.0135 ±
0.0032 (S.E.); Kreitman, 1991; Li and
Sadler, 1991; Takano et al., 1991;
Fullerton et al., 1994; Rogers and
Kidd, 1996; Kawabe et al., 1997;
Nickerson et al., 1998, vs. Harada et al., 1994; Martinez-Torres et
al., 1997; Silveira et al., 1998;
Innan et al., 1999; Verovnik et al., 1999].
Our estimates of p
for the epigean balitorids using RAPD data averaged lower (0.0039 ±
0.005) than those in the RAPD/AFLP studies cited above, but are nearly
identical to those obtained from sequence analysis. Our estimates of p for
the four cave balitorids (0.0009 ± 0.0002) average significantly lower
than those for epigean balitorids (t11 = 4.43, P < 0.05).
The consistently large difference in
variability between mainland and peninsular forms of "Homaloptera smithi" may simply reflect their suspected
taxonomic distinction or, alternatively, be a function of habitat differences. The peninsular populations are located in
independent drainages feeding Phang Nga bay, which are much less extensive than
the drainage systems of the mainland.
Larger drainage systems more likely support larger populations of
freshwater fishes.
The
average value of p calculated for troglobites was 23% that
of epigean populations. Because p
is a linear function of effective population size and mutation rate (Nei and
Li, 1979), it is tempting to suggest that effective population sizes also
differ four- to five-fold.
Troglobites are generally long lived, however, and it is not clear how
this might affect their per generation mutation rates. If these are significantly higher than those
of their surface counterparts, the data could be consistent with an even
greater disparity in effective population sizes between surface and cave
fishes. The use of nucleotide diversity
data to reliably assess population sizes in troglobitic organisms will probably
require a greater knowledge of their basic biology.
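The dependence of this inference on mutation rate can be made explicit. The sketch assumes the standard neutral expectation for diploid nuclear loci, p = 4 * Ne * mu, so that the ratio of effective sizes equals the ratio of nucleotide diversities divided by the ratio of mutation rates. The mutation-rate ratio is a hypothetical input; no value for it is available from this study.

```python
def ne_ratio(p_cave, p_epigean, mu_ratio=1.0):
    """Ne(cave) / Ne(epigean) under p = 4 * Ne * mu.

    mu_ratio is mu(cave) / mu(epigean) per generation; 1.0 assumes equal rates.
    """
    return (p_cave / p_epigean) / mu_ratio


# Average nucleotide diversities reported for cave and epigean balitorids.
equal_rates = ne_ratio(0.0009, 0.0039)
# Hypothetical case: cave fishes with twice the per-generation mutation rate.
higher_cave_rate = ne_ratio(0.0009, 0.0039, mu_ratio=2.0)
```

With equal mutation rates the cave populations would have about 23% of the epigean effective size, a bit more than a four-fold difference. A higher per-generation rate in the cave fishes would imply a proportionally larger disparity.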
The reliability of RAPD
The
RAPD technique is widely employed, but in practice bands are amplified and
visualized by a variety of techniques.
Because the use of RAPDs for the estimation of nucleotide variability
requires that results be repeatable, a brief comment on technical aspects is
appropriate here. The amplified band
pattern consistency of 99.3% demonstrated in the present study is probably
universally attainable if several precautions are taken. Following Welsh and McClelland (1990) and
others, and from our practical experience, we recommend: 1) all template concentrations must be uniform across samples. Spectrophotometry and/or fluorimetry are
inadequately reliable measures of high molecular weight DNA concentration. The most consistent results are obtained if
all template samples are titrated to concentration and measured against known
quantitative DNA standards on agarose gels (Borowsky et al. 1995). This is probably the most important step to
ensure consistent RAPD results, and is almost never employed. 2) In our experience, the most robust and
repeatable RAPD reactions come when two RAPD primers are used, instead of one
(Welsh and McClelland, 1991). The use of
two different primers prevents self-annealing of the two ends of PCR product
and provides more efficient template for the continuing reaction. 3) Running products on thin, denaturing
("sequencing") PAGE gels and visualizing them using 33P or
32P provides a sensitive and quantitative measure of band
presence. In our experience, silver staining
is not quantitative and is unacceptably variable among gels (although we
recognize that other laboratories may be able to produce more consistent
results with silver staining), and ethidium staining of agarose gels is
insufficiently sensitive. We find the
use of 33P particularly convenient because of its relatively long
half-life and low emission energy, and it avoids the need for toxic reagents,
like ethidium bromide and silver salts.
Thus, using RAPDs and a new analytic
model it is possible to reliably assess nucleotide diversity in populations using
DNA samples from as few as two individuals.
This approach should prove useful in the study of populations that are
endangered or relatively inaccessible, and in large scale comparisons of many
species.
Acknowledgments
This work was supported by grants to
RB from the National Science Foundation (INT9605200), the H. R. Axelrod
Foundation and The Research Challenge Fund (NYU). The Royal Thai Forestry and Fisheries
Departments provided material support in the field. Identifications of some specimens were
verified by Maurice Kottelat,
References
Avise JC, Selander RK (1972)
Evolutionary genetics of cave-dwelling fishes of the genus Astyanax. Evolution 26, 1-19.
Borowsky R (2001)
Estimating Nucleotide Diversity From Random Amplified Polymorphic DNA
and Amplified Fragment Length Polymorphism Data. Mol. Phylogenet. and Evol. 18, 143-148.
Borowsky R, McClelland M, Cheng R,
Welsh J (1995) Arbitrarily primed DNA
fingerprinting for phylogenetic reconstruction in vertebrates. Mol.
Biol. Evol. 12, 1022‑1032.
Fullerton
SM, Harding RM, Boyce AJ, Clegg JB
(1994) Molecular and population
genetic analysis of allelic sequence diversity at the human beta-globin
locus. Proc. Nat. Acad. Sci. (
Harada
K, Kinoshita A, Shukor NAA, Tachida H, Yamazaki T (1994)
Genetic variation estimated in three Shorea
species by RAPD analysis. Jpn. J. Genet. 69, 713-718.
Innan
H, Terauchi R, Kahl G, Tajima F
(1999) A method for estimating
nucleotide diversity from AFLP data. Genetics 151, 1157-1164.
Kawabe A, Innan H, Terauchi R,
Kottelat, M (1989)
Zoogeography of the fishes from the Indochinese inland waters with an
annotated check-list. Bull. Zoöl. Mus. Univ. Amsterdam. 12, 1-54.
Kottelat, M (1990)
Indochinese nemacheilines. A revision of the nemacheiline loaches
(Pisces: Cypriniformes) of
Kottelat, M (1998)
Homaloptera yuwonoi, a new
species of hillstream loach from
Kreitman,
M (1991) Detecting selection at the level of
DNA. Chapter 10 in: Evolution
at the Molecular Level (eds. Selander RK, Clark AG, Whittam TS), Sinauer,
Li
W-H, Sadler LA (1991) Low nucleotide
diversity in man. Genetics 129, 513-524.
Martinez-Torres
D, Carrio R, Latorre A, Simon JC, Hermoso A, Moya A (1997) Assessing the nucleotide diversity of
three aphid species by RAPD. J. Evol. Biol. 10, 459-477.
Matsuda
M, Yonekawa H, Hamaguchi S, Sakaizumi M
(1997) Geographic variation and diversity in the mitochondrial DNA of
the medaka, Oryzias latipes, as
determined by restriction endonuclease analysis. Zoological
Science (
Mitchell RW, Russell WH, Elliott
WR (1977) Mexican eyeless characin fishes, genus Astyanax: Environment, distribution and evolution. Special
Publ. Mus. Texas Tech. Univ. 12, 1-89,
21 Figs.
Nickerson
DA, Taylor SL, Weiss KM, et al. (1998)
DNA sequence diversity in a 9.7-kb region of the human lipoprotein
lipase gene. Nature Genetics. 19, 233-240.
Nei M (1978)
Estimation of average heterozygosity and genetic distance from a small
number of individuals. Genetics 89, 583-590.
Nei M, Li W-H
(1979) Mathematical model for
studying genetic variation in terms of restriction endonucleases. Proc.
Nat. Acad. Sci.,
Perez JE, Moodie GE (1993)
Genetic variation in a cave-dwelling Venezuelan catfish. Zoologia
(Acta Cientifica Venezolana) 44, 28-31.
Poulson TL (1963)
Cave adaptation in Amblyopsid fishes. Amer. Midl. Nat. 70, 257-290.
Proudlove, GS (1997)
The conservation status of hypogean fishes. pp. 355-358 In: Proc. 12th Int. Cong. Speleol., La
Chaux de Fonds,
Rogers
J, Kidd K (1996)
Nucleotide polymorphism, effective population size, and dispersal
distances in the yellow baboons (Papio
hamadryas cynocephalus) of
Silveira
EB, Al-Janabi SM, Magalhaes BP, Carvalho LJ, Tigano MS (1998)
Polymorphism of the grasshopper Schistocerca
pallens (Thunberg) (Orthoptera: Acrididae) and its natural pathogen Metarhizium flavoviride (Gams and
Rozsypal) (Hyphomycetes), revealed by RAPD analysis. Anais Da Sociedade Entomologica do Brasil. 27, 91-99.
Swofford DL, Branson BA, Sievert
G (1980)
Genetic differentiation of cavefish populations (Amblyopsidae). Isozyme Bull 13, 109-110.
Takano
TS, Kusakabe S, Mukai T (1991) The genetic structure of natural populations
of Drosophila melanogaster XXII. Comparative study of DNA polymorphism in
northern and southern natural populations.
Genetics 129, 753-761.
Trajano E (1997)
Population ecology of Trichomycterus
itacarambiensis, a cave catfish from
Verovnik R, Trontelj P, Sket B (1999)
Genetic differentiation and species status within the snail leech Glossiphonia complanata aggregate
(Hirudinea: Glossiphoniidae) revealed by RAPD analysis. Archiv
Fuer Hydrobiologie. 144, 327-338.
Weber A, Proudlove GS, Parzefall J,
Wilkens H, Nalband TT (1998) Pisces (Teleostei). Pp. 1179-1190 in: Juberthie C, Decu V (eds.),
Encyclopaedia Biospeologica, Tome II.
Société de Biospéologie,
Welsh J, McClelland M
(1990) Fingerprinting genomes
using PCR with arbitrary primers. Nucleic Acids Res. 18, 7213-7218.
Welsh J, McClelland M
(1991) Genomic fingerprinting
with AP-PCR using pairwise combinations of primers: application to genetic
mapping of the mouse. Nucleic Acids Res. 19, 5275-5279.
Williams
JGK, Kubelik AR, Livak KJ, Rafalski JA, Tingey SV (1990)
DNA polymorphisms amplified by arbitrary primers are useful as genetic
markers. Nucleic Acids Res. 18, 6531-6535.
Author Information
This work is part of a larger
comparative study of troglobitic and regressive evolution in cave fishes
(www.homepages.nyu.edu/~rb4). Richard
Borowsky is an evolutionist specializing in the population genetics and microevolution
of tropical freshwater fishes, both epigean and troglobitic. Chavalit Vidthayanon is an ichthyologist
specializing in the fishes of
Figure Legends
Figure 1: Fig1Nema.PDF Variability in populations of
nemacheiline hillstream loaches; Phenotypic Heterogeneity Index, f;
Nucleotide Diversity, p.
Epigean populations are as follows: 1 -- Acanthocobitis zonalternans (Huai Mae La Mao, 27 km e. of Mae Sot,
Tak Prov.,
Figure 2: Fig2Bali.PDF
Variability in populations of balitorine hillstream loaches; Phenotypic
Heterogeneity Index, f; Nucleotide Diversity, p. Epigean populations are as follows: Homaloptera smithi from two mainland
areas (11 -- same as #8; 12 -- same as #7) and from three populations northeast
of Phang Nga on the Malaysian peninsula in small independent tributaries of
Phang Nga Bay (13 -- Tham Phet, 1 km n of Ban Nok, Phang Nga Prov. = PN; 14 --
Tham Nam I, 2 km ne Ban Nok, PN.; 15 -- Stream 7 km sw Thap Put, PN); 16 -- Balitora burmanica (Mae Sanghi stream,
17 km n of Mae Hong Son, MHS, Salween dr.).
The cave population (17) is of Cryptotora
thamicola (Tham Susa, 7 km s of Nam Khong, MHS, Salween dr.).
Table 1. Genetic variation of enzyme
coding loci in cave fish species and related surface taxa. Genetic variation is expressed as
heterozygosity (H), as reported in or calculated from data in the cited
studies. Cited studies: 1) Swofford,
Branson and Sievert, 1980. 2) Avise and Selander, 1972. 3) Perez and Moodie, 1993.
Fish Family                                     Heterozygosity
  Habit             Scope of Study              (mean or range)

Amblyopsidae (1)    5 species, 19-22 loci
  troglobites       3 species, 18 pops.         0.000 - 0.019
  troglophile       1 species, 10 pops.         0.0280
  epigean           1 species, 11 pops.         0.0400

Characidae (2)      1 species, 17 loci
  troglobites       2 caves, 91 inds.           0.000 - 0.033
  hybrid swarm      1 cave, 45 inds.            0.096
  epigean           6 pops., 257 inds.          0.081 - 0.139

Trichomycteridae (3)  1 species, 40 loci
  troglobite        1 pop., 30 inds.            0.000
  epigean           1 pop., 80 inds.            0.025