This blog post is about species delimitation: what it is, and which methods we have for telling species apart.
Species delimitation — what it is and why it matters
Species delimitation is the set of methods and conceptual approaches used to determine where one species ends and another begins. It asks the fundamental biological question: which groups of organisms represent independently evolving lineages that should be recognized as distinct species?
Setting species boundaries is trivial in some cases and difficult in others, but the outcome matters. Accurate species delimitation underpins biodiversity measures, conservation planning, ecological studies, pest and disease management, and evolutionary research, to mention a few. It is especially important for conservation, where “knowing what to conserve” depends on understanding, categorizing and delimiting species.
Different species concepts
Biology has several species concepts, and the choice of concept ultimately affects how species are delimited. Different species concepts emphasize different properties. Among the most common are:
- Biological species concept: species are groups of actually or potentially interbreeding natural populations that are reproductively isolated from others. Emphasizes gene flow and reproductive barriers.
- Morphological species concept: species are defined by consistent morphological differences. Practical for many taxa but risks lumping cryptic species or splitting polymorphic species.
- Phylogenetic species concept: defines a species as the smallest group of (monophyletic) organisms that share a common ancestor and are unified by a shared evolutionary history. Emphasizes diagnosability and shared ancestry.
- Evolutionary species concept: species are lineages with their own evolutionary tendencies and historical fate.
- Ecological species concept: classifies species based on their role in the environment, meaning they are a set of organisms that exploit the same ecological niche.
Methods of species delimitation
There are many ways to decide what constitutes a species. In our blog post in Door 6 we learned that traditional taxonomy is the process of naming species and assigning them to ranks in a hierarchical system based on shared characteristics, most commonly morphological traits. As such, one way to delimit species is:
- Morphological approaches – traditional taxonomic practice: examine diagnostic characters, measurements, meristic counts, colour patterns, etc.
Pros: accessible, inexpensive, applicable to museum specimens and fossils.
Cons: subjective, confounded by phenotypic plasticity, sexual dimorphism, and cryptic species.
Further, directly linked to the biological species concept, one can also delimit species using:
- Biological / reproductive data – Cross-breeding experiments, observations of mating behavior, hybrid zones, and fertility of hybrids.
Pros: direct test of reproductive isolation (BSC).
Cons: impractical for many organisms (e.g., long-lived or uncultivable species), field observations can be limited.
Next, as genetic data become ever easier to obtain, further approaches are:
- Molecular barcoding
Use of a standardized genetic marker (e.g., COI in animals) to assign individuals to molecular operational taxonomic units (MOTUs) based on genetic distance thresholds.
Pros: rapid, useful for large-scale surveys and for immature or fragmentary specimens.
Cons: threshold choice is arbitrary; single-locus data can be misleading because of introgression or incomplete lineage sorting (ILS).
- Phylogenetic methods
Construct gene trees or species trees and identify monophyletic clusters.
Clades with diagnostic synapomorphies may be considered species under the phylogenetic species concept.
- Coalescent-based and model-based delimitation and barcode gap methods
Incorporate population genetic models to distinguish population structure from species divergence.
Examples:
- GMYC (Generalized Mixed Yule-Coalescent): uses a single-locus ultrametric tree to detect transition points between interspecific (Yule) branching and intraspecific (coalescent) branching (Pons et al., 2006).
- PTP (Poisson Tree Processes): models speciation and coalescent processes on the branch lengths of a phylogenetic tree, without requiring an ultrametric tree (Zhang et al., 2013).
- ASAP (Assemble Species by Automatic Partitioning): a distance-based algorithm that hierarchically clusters sequences into candidate species and ranks the resulting partitions with a score that combines the probability that each group is a single panmictic species with the relative width of the barcode gap (Puillandre et al., 2021).
- ABGD (Automatic Barcode Gap Discovery): automatically detects the barcode gap in the distribution of pairwise genetic distances and proposes partitions of sequences into candidate species (Puillandre et al., 2012).
Pros: these single-locus methods are simple, automatic, and fast for barcode-level datasets. The distance-based methods (ABGD and ASAP) do not require tree-building and work directly on pairwise distances.
Cons: still single-locus and thus vulnerable to ILS and introgression.
- BPP (Bayesian Phylogenetics and Phylogeography): a full-likelihood Bayesian method using multilocus sequence data under the multispecies coalescent (MSC) to test delimitation models and estimate divergence parameters.
Pros: theoretically grounded in coalescent theory; accounts for gene tree discordance due to ILS.
Cons: computationally intensive; sensitive to priors, model assumptions, and sampling. MSC-based delimitation can mistake population structure for species when geographic sampling is patchy or demographic history is complex.
- Genomic methods
SNP datasets, RADseq and whole-genome datasets enable population genomic approaches such as clustering (e.g., STRUCTURE, PCA), demographic modelling and species tree inference.
Pros: high resolution, can detect subtle differentiation, admixture, and historical demography.
Cons: cost, data-processing complexity, and the need for careful sampling design.
- Integrative taxonomy
Explicitly combines morphology, multiple genetic loci/genomes, ecological data, behaviour and distributional information. Considered best practice: congruence among independent data supports delimitation decisions strongly (as also explained in Door 6).
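To make the distance-based idea behind barcoding and barcode gap methods more concrete, here is a minimal Python sketch. It is not an implementation of ABGD or ASAP, which use more careful statistics; it only illustrates the logic of computing pairwise distances, looking for the widest gap in their distribution, and grouping sequences that fall below that gap. The function names and the four toy sequences are made up for illustration, and the sequences are assumed to be already aligned.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or ambiguity code are skipped.
    """
    compared = differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            differences += a != b
    if compared == 0:
        raise ValueError("sequences share no comparable sites")
    return differences / compared


def largest_gap_threshold(distances):
    """Midpoint of the widest gap between consecutive sorted distances."""
    ordered = sorted(distances)
    if len(ordered) < 2:
        raise ValueError("need at least two pairwise distances")
    low, high = max(zip(ordered, ordered[1:]), key=lambda pair: pair[1] - pair[0])
    return (low + high) / 2


def cluster_motus(sequences, threshold):
    """Single-linkage clustering: link any two sequences closer than threshold."""
    parent = {name: name for name in sequences}

    def find(name):
        # Follow parent links to the representative of this cluster.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(sequences, 2):
        if p_distance(sequences[a], sequences[b]) < threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in sequences:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


# Made-up 20 bp alignment, for illustration only (not real COI data).
toy_alignment = {
    "ind1": "ACGTACGTACGTACGTACGT",
    "ind2": "ACGTACGTACGTACGTACGA",
    "ind3": "TCGAACGTTCGTACCTACGT",
    "ind4": "TCGAACGTTCGTACCTACGA",
}

pairwise = [
    p_distance(toy_alignment[a], toy_alignment[b])
    for a, b in combinations(toy_alignment, 2)
]
threshold = largest_gap_threshold(pairwise)
print(cluster_motus(toy_alignment, threshold))
# prints [['ind1', 'ind2'], ['ind3', 'ind4']]
```

In this toy case the within-group distances (0.05) and between-group distances (0.20 to 0.25) are cleanly separated, so the gap is obvious. With real data the two distributions often overlap, which is exactly why the choice of threshold is criticized as arbitrary.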
Do these methods really work – and how?
The major goal of species delimitation, no matter which species concept one operates with, is to set species boundaries with confidence and to detect independently evolving lineages. From a genetic perspective, the mechanisms that generate separate lineages are barriers to gene flow (geographic, behavioural and temporal), divergent selection and genetic drift, all of which let genetic divergence accumulate between isolated populations. Coalescent theory models how gene copies trace back in time to common ancestors and predicts the expected patterns of gene tree discordance and allele sharing. If populations have been isolated for long enough relative to their effective population sizes, most genes will have sorted into distinct lineages (reciprocal monophyly), and delimiting species is typically straightforward. By contrast, when divergence is recent or effective population sizes are large, incomplete lineage sorting (ILS) leaves alleles shared across groups; in those cases single-gene trees can be misleading, and multilocus or genomic datasets combined with coalescent-aware models are needed to tell population structure apart from true speciation.
In short, molecular and coalescent-based approaches succeed because they explicitly model how gene lineages coalesce and split over time under evolutionary processes, allowing probabilistic inference about whether the observed genetic patterns are best explained by distinct species-level divergences or by within-species variation.
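The dependence of lineage sorting on divergence time relative to effective population size can be put into numbers. The sketch below uses two standard coalescent results for a diploid population of constant effective size with no gene flow: the chance that two gene copies have not yet coalesced after t generations is exp(-t / 2Ne), and for three species related as ((A, B), C), with one gene copy sampled per species, the chance that a gene tree disagrees with the species tree is (2/3) exp(-t / 2Ne), where t is the length of the internal branch. The function names and example values are illustrative assumptions, not taken from any particular program.

```python
import math


def prob_not_coalesced(t_generations, ne):
    """Probability that two gene copies in one diploid population of constant
    effective size ne still have distinct ancestors t_generations back."""
    return math.exp(-t_generations / (2 * ne))


def prob_gene_tree_discordance(t_internal, ne):
    """Probability that a three-taxon gene tree disagrees with the species tree
    ((A, B), C), given an internal branch of t_internal generations and an
    ancestral effective size ne (no gene flow, one copy per species)."""
    return (2 / 3) * prob_not_coalesced(t_internal, ne)


# Same divergence time, two hypothetical effective population sizes.
for ne in (10_000, 100_000):
    p = prob_gene_tree_discordance(20_000, ne)
    print(f"Ne = {ne}: P(discordant gene tree) = {p:.3f}")
```

With an internal branch of 20,000 generations, roughly a quarter of gene trees are expected to be discordant when Ne is 10,000, and about 60 % when Ne is 100,000. This is why a single locus can easily point to the wrong grouping in recently diverged or very large populations.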
Pitfalls and issues
However, despite the promise of each method, there are multiple pitfalls. No single method is perfect; with these models we aim for the most plausible hypothesis given the data we have. As Carstens et al. (2013) point out, best practice is to apply multiple methods and draw a joint conclusion from all of the results.
Common issues include:
– Incomplete lineage sorting (ILS): gene trees disagree with species trees; single-locus markers can mislead.
– Gene flow / hybridization: ongoing or historical introgression blurs species boundaries; some methods assume no gene flow.
– Population structure vs. species: geographically structured populations can be misinterpreted as separate species.
– Sampling bias: poor geographic or genomic sampling reduces power and increases error.
– Choice of markers and models: single-marker barcodes and inappropriate priors can produce false splits or lump real species.
To summarize, best practice is to use an integrative approach: combine independent data types (morphology, multiple loci/genomes, ecology, behavior). Sample broadly across geography and individuals to capture within-population variation and contact zones. Use coalescent-aware and genomic methods when possible, but interpret the results in light of natural history and ecology.
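One simple way to act on the advice to compare several methods is to check which candidate species are recovered identically by all of them. The Python sketch below does only that; the method labels and partitions are invented placeholders, not output from any real analysis, and stricter or looser notions of congruence are of course possible.

```python
def congruent_groups(partitions):
    """Return the groups that every method recovers with identical membership."""
    if not partitions:
        return []
    group_sets = [
        {frozenset(group) for group in groups} for groups in partitions.values()
    ]
    shared = set.intersection(*group_sets)
    return sorted(sorted(group) for group in shared)


def unresolved_individuals(partitions):
    """Return individuals that are not part of any fully congruent group."""
    resolved = {name for group in congruent_groups(partitions) for name in group}
    everyone = {
        name
        for groups in partitions.values()
        for group in groups
        for name in group
    }
    return sorted(everyone - resolved)


# Hypothetical partitions from three delimitation analyses of the same samples.
partitions = {
    "method_a": [["ind1", "ind2"], ["ind3", "ind4"], ["ind5"]],
    "method_b": [["ind1", "ind2"], ["ind3", "ind4", "ind5"]],
    "method_c": [["ind1", "ind2"], ["ind3"], ["ind4", "ind5"]],
}

print(congruent_groups(partitions))        # prints [['ind1', 'ind2']]
print(unresolved_individuals(partitions))  # prints ['ind3', 'ind4', 'ind5']
```

Groups that all methods agree on are the safest candidates for species hypotheses. The unresolved individuals are where additional data, such as morphology, ecology or more loci, are most needed.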
References:
Carstens, B. C., Pelletier, T. A., Reid, N. M., & Satler, J. D. (2013). How to fail at species delimitation. Molecular Ecology, 22(17), 4369-4383. https://doi.org/10.1111/mec.12413
Luo, A., Ling, C., Ho, S. Y. W., & Zhu, C. D. (2018). Comparison of methods for molecular species delimitation across a range of speciation scenarios. Systematic Biology, 67(5), 830-846. https://doi.org/10.1093/sysbio/syy011
Magoga, G., Fontaneto, D., & Montagna, M. (2021). Factors affecting the efficiency of molecular species delimitation in a species-rich insect family. Molecular Ecology Resources, 21(5), 1475-1489. https://doi.org/10.1111/1755-0998.13352
Pons, J., Barraclough, T. G., Gomez-Zurita, J., Cardoso, A., Duran, D. P., Hazell, S., . . . Vogler, A. P. (2006). Sequence-based species delimitation for the DNA taxonomy of undescribed insects. Systematic Biology, 55(4), 595-609. https://doi.org/10.1080/10635150600852011
Puillandre, N., Brouillet, S., & Achaz, G. (2021). ASAP: assemble species by automatic partitioning. Molecular Ecology Resources, 21(2), 609-620. https://doi.org/10.1111/1755-0998.13281
Puillandre, N., Lambert, A., Brouillet, S., & Achaz, G. (2012). ABGD, Automatic Barcode Gap Discovery for primary species delimitation. Molecular Ecology, 21(8), 1864-1877. https://doi.org/10.1111/j.1365-294X.2011.05239.x
Zhang, J., Kapli, P., Pavlidis, P., & Stamatakis, A. (2013). A general species delimitation method with applications to phylogenetic placements. Bioinformatics, 29(22), 2869-2876. https://doi.org/10.1093/bioinformatics/btt499