Introduction
Periodical cicadas in North America spend 13 or 17 years underground in the larval stage and then emerge in very large numbers for 4–6 weeks to mate and lay eggs. This strategy, known as “predator satiation”, is thought to ensure that most cicadas survive even after all predators have eaten as much as possible (Williams and Simon 1995). Emergence in prime-numbered years is thought to be a mechanism for avoiding competition between species for egg-laying sites and accidental cross-species mating, because the emergences of the 13- and 17-year cicadas coincide only once every 221 years (Tanaka, Yoshimura, Simon, et al. 2009). The length of time spent in the larval stage is thought to depend on a single gene, although this has not yet been demonstrated at the genomic level (Cox and Carlton 1991).
Complete genome sequences for two periodical cicada species, Magicicada septendecim and Magicicada septendecula, will assist with studies on taxonomy, longevity, and the timing of long-term larval development.
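The 221-year figure cited above is the least common multiple of the two cycle lengths. The following minimal Python sketch works through that arithmetic; the function name is illustrative only, and the 12- and 18-year cycles are a hypothetical contrast, not real broods.

```python
from math import gcd


def coincidence_interval(cycle_a: int, cycle_b: int) -> int:
    """Years between joint emergences of two cycles (their least common multiple)."""
    return cycle_a * cycle_b // gcd(cycle_a, cycle_b)


# The 13- and 17-year cycles share no common factor, so they coincide rarely.
print(coincidence_interval(13, 17))  # 221

# Hypothetical non-prime cycles of similar length would coincide far more often.
print(coincidence_interval(12, 18))  # 36
```

Because 13 and 17 are coprime, the interval is simply their product; cycles that share a factor meet much sooner.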
Methods
Wild-caught specimens of Magicicada septendecim and Magicicada septendecula, collected in Newark, Delaware, USA from a small premature emergence of Brood X (2017), were used in this study. DNA was extracted using the Qiagen DNeasy genomic extraction kit for tissue, following the standard protocol. A paired-end sequencing library was constructed using the Illumina TruSeq kit according to the manufacturer’s instructions. The library was sequenced on an Illumina HiSeq platform in paired-end, 2 × 150 bp format.
The resulting fastq files were trimmed of adapter/primer sequences and low-quality regions with Trimmomatic v0.33 (Bolger, Lohse, and Usadel 2014). The trimmed reads were assembled with SPAdes v2.5 (Bankevich et al. 2012), followed by a finishing step using RagTag v1.0.0 (Alonge 2020) to make additional contig joins based on conserved regions in related insect species: Rhopalosiphum maidis (GCA_003676215), Euschistus heros (GCA_003667255), and Aphis glycines (GCA_009928515). Default parameters were used for all assembly steps.
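As a rough illustration of how the trimming, assembly, and finishing steps chain together, the sketch below builds (but does not run) one command line per step. It is not the authors' actual invocation. All file names are hypothetical, the Trimmomatic clipping and quality steps are assumed values taken from that tool's documented example, and the option syntax follows the current documentation of each tool, which may differ from the older versions cited above. Only one reference genome is shown, since the text does not say how the three references were combined.

```python
import shlex


def build_commands(sample, reads_1, reads_2, reference_fasta):
    """Return trimming, assembly, and scaffolding commands as argument lists."""
    # Hypothetical output names for paired (P) and unpaired (U) trimmed reads.
    p1, u1 = f"{sample}_R1.P.fastq.gz", f"{sample}_R1.U.fastq.gz"
    p2, u2 = f"{sample}_R2.P.fastq.gz", f"{sample}_R2.U.fastq.gz"

    trim = [
        "java", "-jar", "trimmomatic-0.33.jar", "PE",
        reads_1, reads_2, p1, u1, p2, u2,
        # Assumed trimming steps; the source does not list the ones used.
        "ILLUMINACLIP:TruSeq3-PE.fa:2:30:10", "SLIDINGWINDOW:4:15", "MINLEN:36",
    ]
    # Paired-end assembly with otherwise default SPAdes settings.
    assemble = ["spades.py", "-1", p1, "-2", p2, "-o", f"{sample}_spades"]
    # Reference-guided joins against a single related genome.
    scaffold = [
        "ragtag.py", "scaffold", reference_fasta,
        f"{sample}_spades/scaffolds.fasta", "-o", f"{sample}_ragtag",
    ]
    return [trim, assemble, scaffold]


commands = build_commands(
    "msept", "msept_R1.fastq.gz", "msept_R2.fastq.gz", "reference.fasta"
)
for command in commands:
    print(shlex.join(command))
```

Returning argument lists rather than shell strings keeps the commands easy to inspect, and lets each one be passed to `subprocess.run` unchanged once the tools are installed.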
Annotation was performed fully de novo using GeneMark-ES v2.0 (Lomsadze et al. 2005), with default parameters and without a curated training set.
Results
The genome assembly of Magicicada septendecim yielded a total sequence length of 1,579,033,894 bp with an N50 value of 983 kb and 27,124 gene models.
The genome assembly of Magicicada septendecula yielded a total sequence length of 1,585,977,997 bp with an N50 value of 281 kb and 28,651 gene models.
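For readers unfamiliar with the N50 statistic reported above, the following Python sketch shows how it is conventionally computed from a list of sequence lengths. The function name and the toy lengths are illustrative assumptions and are not data from these assemblies.

```python
def n50(lengths):
    """Return the N50 of a set of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("no sequence lengths given")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest sequence downwards until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# Toy example: total is 32, and the two longest sequences already cover 16.
toy_lengths = [8, 8, 4, 3, 3, 2, 2, 2]
print(n50(toy_lengths))  # 8
```

A higher N50 at a similar total length, as reported for Magicicada septendecim relative to Magicicada septendecula, indicates that more of the assembly sits in longer sequences.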
Data availability
Raw reads and assembled genomes are available from GenBank:
Author information
Harold B. White is now retired from the Department of Chemistry and Biochemistry, University of Delaware.
Funding
Funding was provided by Iridian Genomes, grant# IRGEN_RG_2021-1345 Genomic Studies of Eukaryotic Taxa.