Mitochondrial COI gene is valid to delimitate Tylenchidae (Nematoda: Tylenchomorpha) species

Abstract Tylenchidae is a widely distributed soil-inhabiting nematode family. Regardless of their abundance, molecular phylogeny based on rRNA genes is problematic, and the delimitation of taxa in this group remains poorly documented and highly uncertain. The mitochondrial Cytochrome Oxidase I (COI) gene is an important barcoding gene that has been widely used in species identification and phylogenetic analyses. However, COI data are currently available for only one species in Tylenchidae. In the present study, we newly obtained 27 COI sequences from 12 species and 26 sequences from rRNA genes. The results suggest that the COI gene is valid to delimitate Tylenchidae species but fails to resolve phylogenetic relationships.

Tylenchidae is a widely distributed soil-inhabiting nematode family characterized by a weak stylet, an undifferentiated non-muscular pharyngeal corpus, and a filiform tail. Currently, it comprises 412 nominal species belonging to 44 genera, and the estimated species number ranges from 2,000 to 10,000 (Qing and Bert, 2019). Regardless of their abundance, the delimitation of taxa in this group remains poorly documented and highly uncertain. Consequently, there is no consensus regarding their classification from species level up to family level (Andrássy, 2007; Brzeski, 1998; Qing and Bert, 2019; Siddiqi, 2000).
With the improved availability of genetic sequencing, molecular sequences have become one of the most powerful tools for species diagnosis and phylogenetic analysis in current taxonomy. Among marker genes, the ribosomal RNA (rRNA) genes are used as the standard barcode for almost all animals and have successfully resolved several groups in Nematoda (Bert et al., 2008; Holterman et al., 2006; Subbotin et al., 2006). However, rRNA genes are problematic in Tylenchidae phylogeny, and the unresolved status is unlikely to be improved by intensive species sampling (Qing et al., 2017; Qing and Bert, 2019). Therefore, finding a suitable molecular marker gene is crucial for the study of Tylenchidae. In this study, we examined the mitochondrial Cytochrome Oxidase I (COI) gene of 12 species belonging to Tylenchidae (sensu Geraert, 2008). The goals were to evaluate the potential of COI sequences for the identification of Tylenchidae species, and to compare the resolution, sequence variability, and tree topologies obtained from one COI and two rRNA markers (i.e., the 18S and 28S rRNA genes).

Sample collection and processing
Soil samples were collected in China from 2018 to 2019. Details of sampling locations and habitats are given in Table 1. The nematodes were extracted from soil samples using a Baermann tray and subsequently collected on a 400-mesh sieve (37 μm opening) after 24 h of incubation. For morphological analysis, the extracted nematodes were manually picked, fixed with 4% formalin, rinsed several times with deionized water, and then transferred to anhydrous glycerin, following the protocol of Seinhorst (1962) and Sohlenius and Sandor (1987).

Morphological analysis
Measurements and photographs were made from slides using a Nikon Eclipse Ni-U 931609 microscope (Nikon Corporation, Tokyo, Japan). Illustrations were prepared manually based on light microscope drawings and edited with Adobe Illustrator CS3 and Adobe Photoshop CS3.
For scanning electron microscopy (SEM), the samples were fixed in formalin, gradually washed with water, and post-fixed with 2% PFA + 2.5% glutaraldehyde in 0.1 M Sorensen buffer. They were then washed, dehydrated in ethanol solutions, and critical-point dried with CO2. After mounting on stubs, the samples were coated with gold using a JFC-1200 and observed with a JSM-3680 (JEOL, Tokyo, Japan).

Molecular analysis
Fresh nematodes were used directly for DNA extraction. A single nematode was placed in 10 μl of worm lysis buffer (50 mM KCl, 10 mM Tris pH 8.3, 2.5 mM MgCl2, 0.45% NP40, 4.5% Tween 20, pH = 8.3) on a glass slide. The nematode cuticle was broken with a needle, and the nematode was subsequently transferred to a 200 μl Eppendorf tube. After freezing in liquid nitrogen for 1 min, 1 μl of proteinase K (1.0 mg/ml) was added and the sample was incubated for 1 h at 65 °C and 10 min at 95 °C.
The obtained sequences were analyzed together with other relevant reference sequences available in the PPNID database. Multiple alignments of rRNA genes were made using the Q-INS-I algorithm of MAFFT v. 7.205 (Katoh and Standley, 2013), and the COI gene was aligned using TranslatorX (Abascal et al., 2010) under the invertebrate mitochondrial genetic code. The best-fitting substitution model was estimated using AIC in jModelTest v. 2.1.2 (Darriba et al., 2012). Maximum likelihood (ML) and Bayesian inference (BI) analyses were performed at the CIPRES Science Gateway (Miller et al., 2010) using RAxML 8.1.11 (Stamatakis et al., 2008) and MrBayes 3.2.3 (Ronquist et al., 2012), respectively. The ML analysis included 1,000 bootstrap (BS) replicates under the GTRCAT model. The Bayesian phylogenetic analysis was carried out using the GTR + I + G model. Analyses were run for 5 × 10^6 generations, Markov chains were sampled every 100 generations, and 25% of the converged runs were discarded as burn-in. Gaps were treated as missing data in all phylogenetic analyses. ML bootstrap values and posterior probabilities (PP) were plotted on Bayesian 50% majority rule consensus trees using TreeView v. 1.6.6 (Page, 1996) and Illustrator CS3.
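The sampling settings above imply a fixed number of retained trees per run. The following minimal Python sketch works through that arithmetic. The function name `mcmc_sample_counts` is hypothetical, and the sketch assumes that burn-in is taken as a fraction of the sampled states and that the starting state is not counted (some programs also record it, which would add one sample).

```python
def mcmc_sample_counts(generations, sample_every, burnin_fraction):
    """Return (total, discarded, retained) sample counts for one MCMC run.

    Assumption: the starting state (generation 0) is not counted.
    """
    total = generations // sample_every
    discarded = int(total * burnin_fraction)
    return total, discarded, total - discarded


# Settings taken from the text: 5 x 10^6 generations, sampled every 100,
# with 25% treated as burn-in.
total, discarded, retained = mcmc_sample_counts(5 * 10**6, 100, 0.25)
print(total, discarded, retained)  # 50000 12500 37500
```

Under these assumptions, each run contributes 37,500 sampled trees to the 50% majority rule consensus.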

Results
To evaluate the validity and robustness of the COI phylogeny in comparison with the well-established rRNA phylogeny, we newly sequenced the corresponding 28S and 18S rRNA genes of the analyzed Tylenchidae species. Our results concur with previous studies that both regions show serious limitations: phylogenies are poorly resolved and support values do not agree with each other (Qing et al., 2017). In general, the newly sequenced species are placed in the same cluster as, or closely related to, their corresponding species in GenBank (the morphology details are given in Figs. [...] Fig. 1) suggested that L. leptosoma population 2 has a broader amphidial aperture than population 1. These differences fall within the range of variation reported for L. leptosoma by Geraert (2008).
Given the limited knowledge of this genus and the overall problematic taxonomy of Tylenchidae, we here consider these differences to be intraspecific variation of L. leptosoma. We obtained 27 new COI sequences from 12 species, with lengths ranging from 436 bp to 445 bp. The identification of our representatives was confirmed by their key morphological features (Supplementary Figs. 1-15 in https://doi.org/10.6084/m9.figshare.12110667.v1) together with rRNA molecular evidence. We compared the compositional bias of COI sequences, and the result suggested that Tylenchidae has a GC content similar to that of Hoplolaimina (sensu Siddiqi, 2000) at all three codon positions but differs from Criconematina (sensu Siddiqi, 2000) in the GC content of the first and third codon positions (Table 2). The analysis of genetic distance suggested that most species can be well separated, except for those of two similar genera, Aglenchus and Coslenchus (Table 3).
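As an illustration of the two summary statistics discussed above, the following Python sketch computes GC content per codon position and a pairwise genetic distance. It is not the procedure used to produce Tables 2 and 3. The function names are hypothetical, the sequences are toy examples rather than data from this study, the input is assumed to be in frame, and the distance is assumed to be the uncorrected p-distance, which the text does not specify.

```python
def gc_by_codon_position(seq):
    """GC fraction at codon positions 1, 2 and 3 of an in-frame sequence.

    Characters other than A, C, G, T (gaps, ambiguity codes) are skipped.
    """
    seq = seq.upper()
    fractions = []
    for pos in range(3):
        bases = [b for b in seq[pos::3] if b in "ACGT"]
        gc = sum(b in "GC" for b in bases)
        fractions.append(gc / len(bases) if bases else 0.0)
    return tuple(fractions)


def p_distance(a, b):
    """Uncorrected p-distance between two aligned sequences.

    Only sites where both sequences carry A, C, G or T are compared,
    so gaps are effectively treated as missing data.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


# Toy sequences for illustration only.
print(gc_by_codon_position("GCAGCTGCG"))
print(p_distance("ATG-CC", "ATGACA"))
```

In the second call, one of the five comparable sites differs, so the distance is 0.2. A barcode is informative when distances between species clearly exceed those within species.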
A total of 52 species of Tylenchomorpha and outgroups (alignment of 1,581 characters) were used for the COI phylogenetic analysis. The resulting ML and BI trees diverged largely in topology, and their phylogenies are therefore presented separately. In both ML and BI analyses, Hirschmanniella mucronata (KR819278) was placed as sister to Basiria aberrans (MN577605, MN577606). Such placement was contrary to its morphological assignment and rRNA-based phylogeny (Bert et al., 2008). Since the placement of this standalone sequence was not supported by morphology, and other related species (e.g. Pratylenchus spp.) were properly placed, we considered it likely that this sequence had been mislabeled. On the basis of this assumption, the monophyly of Tylenchidae was moderately supported (BS = 83) by the ML analysis but not supported by the BI analysis (split into three clusters, Figs. 4, 5). In all analyses, individuals of the same population clustered together, either in a fully supported clade in BI (PP = 1) or in a weakly supported clade in ML (BS from 43 to 72). Although COI phylogeny was unable to reject rRNA phylogenies with full confidence, several

Discussion
In the present study, we recovered two populations of L. leptosoma that are similar in morphology but divergent in phylogenetic placement. Such inconsistency is not surprising, as similar cases have been reported in the genera Malenchus and Labrys (Qing et al., 2017). Lelenchus leptosoma is the most frequently encountered species in the genus Lelenchus and includes all Lelenchus spp. without distinct incisures. This species shows great variation in morphology, e.g., body length ranges from 470 to 780 μm and tail length from 145 to 278 μm (Geraert, 2008). We demonstrated that, even with only minor morphological variation, two populations can be significantly divergent genetically. We concur with De Ley (2000) that the extremely small size of nematodes masks the actual morphological differences. Indeed, only a few morphological characters (including those from SEM) are practically helpful for Tylenchidae diagnosis, and a substantial number of cryptic species have therefore been overlooked (Qing and Bert, 2019). Similarly, our two recovered L. leptosoma populations are likely to contain at least one cryptic species. However, current knowledge of Lelenchus is far from sufficient, especially regarding type material and molecular data from different reported populations. Consequently, we followed the suggestion of De Ley (2000) that the key priority for a difficult taxonomic group is to understand major patterns and clades rather than the compilation of a single taxonomic unit.
The mitochondrial COI gene is one of the most important standard barcoding genes and has been used for almost all animals (Hebert et al., 2004). Its higher mutation rate provides better differentiation of closely related species and is particularly useful for the identification and description of hybrid or cryptic species (Palomares-Rius et al., 2014; Powers, 2004; Shaw et al., 2013). Although it has been explored for only a limited number of nematode species compared to rRNA (Palomares-Rius et al., 2014), the COI gene has recently received increasing attention for nematode barcoding and phylogeny. In plant-parasitic species, COI data are already available for several important taxa, e.g. Bursaphelenchus spp. (Kanzaki and Giblin-Davis, 2012; Ye et al., 2007), Aphelenchoides spp. (Sánchez-Monge et al., 2017; Xu et al., XXXX), Meloidogyne spp. (Kiewnick et al., 2014), Pratylenchus spp. (Qing et al., 2019a), and Scutellonema spp. (Van den Berg et al., 2013). However, due to the problematic taxonomic status of the Tylenchidae and a lack of taxonomic attention to the family (Qing and Bert, 2019), COI data are available for only one species (L. brevislitus) (Soleymanzadeh et al., 2016) despite its great diversity. Here we added 27 new COI sequences covering 12 species of Tylenchidae. Our results suggested that the overall resolution of the COI phylogeny was low and that the inferred tree topologies failed to reject rRNA phylogenies. We therefore demonstrated that, in addition to the less informative 18S and 28S genes (Qing and Bert, 2019; Qing et al., 2017), COI is also inadequate to resolve Tylenchidae, and searching for valid alternative genes is the key to Tylenchidae phylogeny.
Figure 6: Bayesian 50% majority rule consensus tree inferred from the 18S rRNA gene. New sequences original to this study are indicated in bold. Branch support is indicated in the following order: PP value in BI analysis/BS value from ML analysis.
Although it failed to definitively resolve phylogenies, our analysis of interspecific and intergeneric differences confirms the validity of COI as a barcode for Tylenchidae. Together with our high success rate in PCR amplification using the universal COI primer pair JB3/JB4.5 (Bowles et al., 1992), we therefore consider COI a suitable option for Tylenchidae diagnosis.
Figure 7: Bayesian 50% majority rule consensus tree inferred from the 28S rRNA gene. New sequences original to this study are indicated in bold. Branch support is indicated in the following order: PP value in BI analysis/BS value from ML analysis.