Advanced Research Journal of Biochemistry and Biotechnology

Molecular Analyses of Genetic Variation and Phylogenetic Relationship in the Family Sapindaceae. A Review Paper


Abstract

 

The Sapindaceae, commonly known as the soapberry family, is a cosmopolitan group of approximately 1900 species in 144 genera and forms part of the economically and ecologically significant angiosperm order Sapindales. Despite earlier taxonomic efforts, relationships within Sapindaceae and across Sapindales have remained poorly resolved because of complex morphological variation and incomplete infra-familial classification systems. Recent advances in molecular systematics, particularly the use of nuclear ribosomal DNA internal transcribed spacer (ITS) sequences and Angiosperms353 target enrichment datasets, have enabled substantial progress in reconstructing evolutionary relationships within this group. ITS-based phylogenetic analyses have achieved species-level resolution within Indian Sapindus, clearly distinguishing S. emarginatus from S. trifoliatus and revealing the divergent position of S. oligophyllus, which clusters with Allophylus of tribe Thouinieae. Estimates of evolutionary divergence varied considerably among tribes, with the greatest divergence observed between Paullinieae and Harpullieae (0.20) and the least between Sapindeae and Lepisantheae (0.06), consistent with earlier taxonomic hypotheses. Complementary phylogenomic analyses using Angiosperms353 markers across 123 Sapindaceae genera (86% coverage) recovered 21 clades, providing the basis for a revised classification into four subfamilies and 20 tribes, including six newly proposed tribes within Sapindoideae. Broader Sapindales-wide analyses, comprising 448 samples and covering 85% of genera, confirmed the monophyly of the constituent families and resolved core clades, but also revealed persistent difficulties in resolving subfamily-level relationships because of paralogy, likely linked to ancient hybridisation and polyploidy events. The presence of paralogous loci, particularly in Meliaceae and Rutaceae, underscores the need for careful data curation and highlights the impact of ancient genome duplications on phylogenetic inference.
This integrated molecular framework provides the most comprehensive phylogenetic resolution of Sapindaceae and Sapindales to date. It offers a robust foundation for future evolutionary, biogeographic, taxonomic, and conservation-orientated studies, while emphasising the need for continued taxon sampling and for explicitly accounting for genomic complexity, such as paralogy, hybridisation, and polyploidy, in phylogenetic reconstruction.

 

Key Words: Sapindaceae, phylogenomics, ITS sequences, Angiosperms353, infra-familial classification