Introduction
The leaves of the coca plant have been used as a medicine and mild stimulant in South America for over 8,000 years (Plowman 1984; Dillehay et al. 2010). In more recent history, few plants have had such far-reaching effects on human health and international relations (Restrepo et al. 2019). Coca crops produce the alkaloid cocaine: a natural insecticide (Nathanson et al. 1993), Western medicine’s first local anesthetic, and a controlled narcotic whose supply chains and illicit international markets have caused decades of social disaster.
Coca is classified into two species, Erythroxylum coca and E. novogranatense (Erythroxylaceae, Malpighiales), each with two taxonomic varieties. These two species are found only in cultivation, having resulted from independent origins of domestication from the wild E. gracilipes (White et al. 2020).
The two varieties used in this study, E. coca var. ipadu Plowman, known as Amazonian coca, and E. novogranatense var. truxillense (Rusby) Plowman, known as Trujillo coca, are regionally distinct crops. Erythroxylum coca var. ipadu is cultivated by indigenous groups in the lowland Amazon basin of Colombia, Brazil, and Perú. Erythroxylum novogranatense var. truxillense is grown primarily in the dry valleys of northwestern Perú and is exported as a flavoring agent for Coca-Cola®. These taxa have been crossed to produce improved hybrid varieties for the cocaine market, which are currently grown in southern Colombia and possibly southern Mexico (Casale, Mallette, and Jones 2014; Rodríguez Zapata 2015).
Complete genome sequences for E. coca var. ipadu and E. novogranatense var. truxillense will provide insight into the origins, evolution, and modern breeding patterns of coca crops, as well as into the cocaine biosynthesis pathway.
Methods
DNA from each species was provided by the USDA/ARS Sustainable Perennial Crops Laboratory for use in this study.
Sequencing libraries were constructed with the Illumina TruSeq kit using standard protocols for the 2x150 bp format. Sequencing was performed on an Illumina X-Ten platform.
Raw, paired-end sequence data were trimmed of adapter sequence and low-quality regions using Trimmomatic (Bolger, Lohse, and Usadel 2014). Genome preassemblies were constructed using SPAdes (Bankevich et al. 2012) and finished with Zanfona (Kieras et al. 2021).
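For readers who wish to reproduce the trimming and preassembly steps, the following is a minimal sketch of how the two commands might be assembled in Python. It is an illustration only and not the exact invocation used in this study. The file names, thread count, and output prefix are hypothetical. The trimming parameters are the generic example settings from the Trimmomatic manual, not values reported here, and the sketch assumes a `trimmomatic` wrapper script on the PATH (as provided by common package managers). The Zanfona finishing step is omitted because its command-line usage is not described in this text.

```python
import shlex


def trimmomatic_cmd(r1, r2, out_prefix, adapters="TruSeq3-PE.fa", threads=8):
    """Build a Trimmomatic paired-end command as an argument list.

    Assumption: the trimming steps below are the Trimmomatic manual's
    example settings, NOT the parameters used in this study.
    """
    return [
        "trimmomatic", "PE", "-threads", str(threads), "-phred33",
        r1, r2,
        # Output order expected by Trimmomatic PE:
        # forward paired, forward unpaired, reverse paired, reverse unpaired.
        f"{out_prefix}_R1.paired.fq.gz", f"{out_prefix}_R1.unpaired.fq.gz",
        f"{out_prefix}_R2.paired.fq.gz", f"{out_prefix}_R2.unpaired.fq.gz",
        f"ILLUMINACLIP:{adapters}:2:30:10",  # adapter clipping
        "LEADING:3", "TRAILING:3",           # trim low-quality read ends
        "SLIDINGWINDOW:4:15",                # sliding-window quality trim
        "MINLEN:36",                         # drop reads that end up too short
    ]


def spades_cmd(r1_paired, r2_paired, out_dir, threads=8):
    """Build a SPAdes command for one paired-end library."""
    return [
        "spades.py",
        "-1", r1_paired,
        "-2", r2_paired,
        "-o", out_dir,
        "-t", str(threads),
    ]


if __name__ == "__main__":
    # Hypothetical sample name and file paths, for illustration only.
    prefix = "sample"
    trim = trimmomatic_cmd("sample_R1.fq.gz", "sample_R2.fq.gz", prefix)
    asm = spades_cmd(f"{prefix}_R1.paired.fq.gz",
                     f"{prefix}_R2.paired.fq.gz",
                     f"{prefix}_spades")
    # Print the commands rather than executing them, so the sketch runs
    # even where the tools are not installed.
    print(shlex.join(trim))
    print(shlex.join(asm))
```

Each list can be passed to `subprocess.run(cmd, check=True)` to execute the step. Building the commands as lists rather than shell strings avoids quoting problems with file paths.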
Results
The results of genome assemblies are as follows:
Acknowledgements
Dawson White is supported by an NSF Postdoctoral Fellowship in Biology, award number 2010821.
Funding
Funding was provided by Iridian Genomes, grant # IRGEN_RG_2021-1345 Genomic Studies of Eukaryotic Taxa.