Discovery and Identification of Meloidogyne Species Using COI DNA Barcoding

DNA barcoding with a new cytochrome c oxidase subunit 1 (COI) primer set generated a 721 to 724 bp fragment used for the identification of 322 Meloidogyne specimens, including 205 new sequences combined with 117 from GenBank. A maximum likelihood analysis grouped the specimens into 19 well-supported clades and four single-specimen lineages. The “major” tropical apomictic species (Meloidogyne arenaria, Meloidogyne incognita, Meloidogyne javanica) were not discriminated by this barcode, although some closely related species such as Meloidogyne konaensis were characterized by fixed diagnostic nucleotides. Species that were collected from multiple localities and strongly characterized as discrete lineages or species include Meloidogyne enterolobii, Meloidogyne partityla, Meloidogyne hapla, Meloidogyne graminicola, Meloidogyne naasi, Meloidogyne chitwoodi, and Meloidogyne fallax. Seven unnamed groups illustrate the limitations of DNA barcoding without the benefit of a well-populated reference library. The addition of these DNA sequences to GenBank and the Barcode of Life Database (BOLD) should stimulate and facilitate root-knot nematode identification and provide a first step in new species discovery.

The term DNA-barcoding has multiple definitions. The earliest mention of barcoding in nematology was in 1998 by Dr Mark Blaxter, then of Edinburgh University, referring to the "(d)evelopment of a molecular barcode system for soil nematode identification" in the first volume of the Natural Environment Research Council Soil Biodiversity Newsletter (http://soilbio.nerc.ac.uk/newsletters.htm). The barcode he was referring to was the 18S nuclear (small subunit) ribosomal gene. Other gene regions proposed for DNA-barcoding soon followed, creating a broader definition that generally applied to the use of DNA sequences for species identification (Floyd et al., 2002; Blaxter, 2004; Powers, 2004). A widely cited paper by Hebert et al. (2003) proposed a standardization of the barcode definition linked to the amplification of a 658 bp gene region within the cytochrome oxidase subunit 1 mitochondrial gene. The goal of this conceptual paper was the development of a global bioidentification system for animals. Considerable controversy immediately followed this publication. Theoretical criticism concerned the use of a single gene, the ability of an organelle gene to track species boundaries, and barcoding's impact on the process of taxonomic investigation (DeSalle et al., 2005). Practical concerns were expressed about lack of amplification with some groups, the designation of types, taxonomic resolution, and economic cost at the expense of traditional taxonomic approaches (Meyer and Paulay, 2005; Rubinoff et al., 2006; McFadden et al., 2011). Now, 15 years later, DNA-barcoding has become a component within the broader scope of integrated taxonomy and a routine tool for identification (Hodgetts et al., 2016; Janssen et al., 2016).
As a diagnostic and discovery enterprise, DNA barcoding has generated thousands of publications, features biennial international conferences, has a dedicated database (BOLD, the Barcode of Life Database), and has multiple administrative structures such as the International Barcode of Life (IBOLD) and its affiliates (www.boldsystems.org/index.php/default).
Nematology was slow to adopt this formalized version of barcoding, perhaps due to poor amplification with the original "Folmer" primer sets (Folmer et al., 1994). Now multiple primer sets for amplification of nematode cytochrome c oxidase subunit 1 (COI) are available (Derycke et al., 2010; Prosser et al., 2013; Kiewnick et al., 2014; Powers et al., 2014; Janssen et al., 2016). These primer sets typically have limited taxonomic scope, with amplifications specific for genera or in some cases extending across families and superfamilies (Powers et al., 2014). The objective of this study is to present a primer set used for the amplification of 721 to 724 bp of COI sequence from Meloidogyne. A maximum likelihood (ML) tree is provided to illustrate the ability of this gene region to discriminate among many described Meloidogyne species. The primers also function as a means to amplify DNA from juvenile stages in community analyses, possibly leading to new species discoveries. Contributions to a COI reference library should aid future taxonomic and ecological research in the genus.

Nematode collection
Most of the specimens DNA barcoded in this study were submitted to the UNL Nematology Diagnostics Clinic, contributed by colleagues, or collected during grant-funded surveys (NSF project DEB-1145440; USDA Multistate Project W3186).

Primer sequences
The primer set for amplification of the COI gene region was:
After removal of the primer sequences, amplification products from the Meloidogyne specimens were either 721 or 724 bp. GenBank sequences used in this study generally were 100 to 300 nucleotides shorter than sequences generated with the new primer set.
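The two product lengths differ by exactly three nucleotides, which matches the single amino acid deletion that the tree figure notes for some clades (Δ 721 bp). A minimal Python sketch of that length check follows. The function name and the labels it returns are illustrative assumptions, not part of the study's workflow.

```python
# Product lengths reported in the text, after primer removal.
FULL_LENGTH_BP = 724
DELETION_LENGTH_BP = 721
CODON_BP = 3  # one amino acid is encoded by three nucleotides


def classify_amplicon(trimmed_seq: str) -> str:
    """Label a primer-trimmed COI sequence by its length (hypothetical helper)."""
    n = len(trimmed_seq)
    if n == FULL_LENGTH_BP:
        return "full-length"
    if n == DELETION_LENGTH_BP:
        return "one-codon deletion"
    return "unexpected length"


# A 3 bp difference is exactly one codon, so the reading frame is preserved.
missing_codons = (FULL_LENGTH_BP - DELETION_LENGTH_BP) // CODON_BP
```

Shorter GenBank sequences, such as those 100 to 300 nucleotides below the new amplicon length, would fall into the third category and need separate handling.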

Amplification conditions
Nematodes amplified at the UNL Nematology Laboratory were individually smashed in 18 µl of sterile H2O with a transparent microfuge micropipette tip on a coverslip and added to a 0.5 ml microfuge tube. Nematode lysate was either amplified immediately or stored at -20°C. Amplification conditions were as follows: denaturation at 94°C for 5 min, followed by 45 cycles of denaturation at 94°C for 30 sec, annealing at 48.0°C for 30 sec, and extension at 72°C for 90 sec with a 0.5°C per second ramp rate to 72°C. A final extension was performed at 72°C for 5 min as described by Powers et al. (2014) and Olson et al. (2017). Polymerase chain reaction (PCR) products were separated and visualized on 1% agarose using 0.5× TBE and stained with ethidium bromide. PCR products of sufficiently high quality were cleaned and sent for sequencing of both strands by the University of California-Davis DNA sequencing facility.
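The cycling program above can be written down as data, which makes the run time easy to estimate. The Python sketch below transcribes the stated temperatures and hold times. The class and function names are assumptions made for illustration, and only the one ramp whose rate is given in the text (annealing to extension) is modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


# Values transcribed from the amplification conditions in the text.
INITIAL = Step("initial denaturation", 94.0, 5 * 60)
CYCLE = (
    Step("denaturation", 94.0, 30),
    Step("annealing", 48.0, 30),
    Step("extension", 72.0, 90),
)
N_CYCLES = 45
FINAL = Step("final extension", 72.0, 5 * 60)
RAMP_C_PER_S = 0.5  # stated ramp rate from annealing up to extension


def hold_seconds() -> int:
    """Total programmed hold time, ignoring all ramping."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return INITIAL.seconds + N_CYCLES * per_cycle + FINAL.seconds


def anneal_to_extend_ramp_seconds() -> float:
    """Time per cycle spent on the one ramp whose rate is given in the text."""
    anneal, extend = CYCLE[1], CYCLE[2]
    return (extend.temp_c - anneal.temp_c) / RAMP_C_PER_S


# Holds plus the slow annealing-to-extension ramp, in minutes.
total_min = (hold_seconds() + N_CYCLES * anneal_to_extend_ramp_seconds()) / 60
```

Under these assumptions the holds sum to 7,350 sec (122.5 min) and the slow ramp adds 48 sec per cycle, for about 158.5 min in total. The remaining ramps depend on the instrument and are not included.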

Data storage
Nucleotide sequences have been submitted to GenBank (accession numbers MH128384-MH128585) and the Barcode of Life Database (BOLD).

Phylogenetic analysis
Phylogenetic trees were constructed under ML and Neighbor Joining (NJ) criteria in MEGA version 6. Sequences were edited using CodonCode Aligner version 7.1 (www.codoncode.com/) and aligned using Muscle within MEGA version 6 (Tamura et al., 2013). Gap opening penalty was set at -400 with a gap extension penalty of -200. The General Time Reversible Model with Gamma distributed rates (GTR+G) was determined to be the best substitution model by Bayesian Information Criterion using the Best Fit Substitution Model tool in MEGA 6.0. ML trees used the "use all sites" option for gaps and 200 bootstrap replications to assess clade support.
Figure caption (ML tree): Support values that designate clades and haplotype groups are circled. Clades that correspond to named and unnamed species or haplotype groups are numbered. Clades that include specimens with a single amino acid deletion are denoted by (Δ 721 bp). Group 1 has been reduced to a box of species names. Sequences within Group 1 are presented in Table 2. A list of GenBank accession numbers for specimens included in Group 1 is found in supplementary Table 1.
[...] National Preserve, Texas comes from a native lowland plant community, compared with other specimens from New Mexico collected from commercial pecan (Carya illinoinensis (Wangenh.) K. Koch) production.
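Bootstrap support, as used for the ML trees in the methods above, rests on resampling alignment columns with replacement. The Python sketch below shows only that resampling step, using 200 replicates to match the text. It is not MEGA's implementation and does not infer trees. The function name, the seed, and the toy sequences are assumptions made for illustration.

```python
import random


def bootstrap_alignments(alignment, n_replicates=200, seed=0):
    """Yield pseudo-replicate alignments by resampling columns with replacement.

    `alignment` maps a specimen label to its aligned sequence; all sequences
    must have the same length.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to equal length")
    n_sites = lengths.pop()
    rng = random.Random(seed)
    for _ in range(n_replicates):
        # Draw column indices once per replicate so every specimen
        # receives the same resampled columns.
        cols = [rng.randrange(n_sites) for _ in range(n_sites)]
        yield {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}


# Toy alignment; labels and sequences are invented for illustration.
toy = {"spec_A": "ATGCATGC", "spec_B": "ATGCATGA", "spec_C": "ATGTATGA"}
replicates = list(bootstrap_alignments(toy))
```

A tree would then be inferred from each replicate, and the support for a clade is the fraction of replicate trees in which that clade appears.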

Results
There are seven groups labeled as unnamed, all with sequence derived from J2-stage specimens except for N4431 and N4496, which were males collected from native chestnut (Castanea dentata (Marshall) Borkh.) in Great Smoky Mountains National Park (GRSM), North Carolina. All specimens in the unnamed groups 4, 5, and 9 to 13 were isolated from soil samples within Gulf Coast or Eastern North American forests. Groups 9 and 12 were associated with American beech (Fagus grandifolia Ehrh.) and chestnut or oak, respectively. Measurements of the unidentified juveniles are presented in Table 3, and Fig. 2 illustrates juveniles from three of the unnamed groups.

Discussion
The COI gene region used as a diagnostic marker in this study appears to discriminate many of the described species of Meloidogyne. It does not separate the apomictic "major species" and their close relatives, except possibly M. konaensis and M. incognita grahami. Other mitochondrial genes such as NAD 5 may help resolve some of those species boundaries (Janssen et al., 2016). Aside from an inability to discriminate among the tropical clade 1 species, there are advantages to using COI as a DNA barcode. Because COI is a protein-coding gene, nucleotide alignment is easier than with non-protein-coding regions. Taxonomic resolution is at the population and species level, although for many genera, mutational saturation, lineage extinctions, or inadequate sampling may obscure deeper relationships that aid in the recognition of species groupings. Nonetheless, COI barcodes, in combination with an adequately curated sequence database, provide a powerful tool for identification and discovery. The limitation of DNA barcoding without a corresponding database is illustrated by the unnamed groups in the Meloidogyne dataset. For example, there was an expectation that focal samples from soil around individual chestnut and oak trees in GRSM might yield Meloidogyne querciana Golden, 1979, which was described from northern red oak (Quercus rubra L.) and chestnut hosts within the same ecoregion. Meloidogyne specimens were indeed found in these samples; however, the barcode data demonstrate that multiple COI lineages were associated with chestnut and oaks in the park. Similarly, unnamed lineages were also discovered associated with American beech and baygall plant communities in Big Thicket National Preserve, Texas (www.nps.gov/bith/plant-communities.htm). These results indicate that considerable Meloidogyne diversity exists in the primary and secondary forests of the eastern and southern United States.
Characterization of this diversity by COI barcoding allows us to rule out described species with representation in the COI database, yet neither the COI barcode nor the morphometrics of juvenile specimens permits unequivocal assignment of a species name to these specimens. For these unknown specimens, a more complete taxonomic analysis that includes obtaining adult stages will be required before a barcode sequence can be linked to a formal Latin binomial.
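The dependence of barcode identification on a reference library can be illustrated with a simple nearest-match rule. The Python sketch below compares a query with named reference sequences by uncorrected p-distance and reports "unnamed" when no reference is close enough. The distance cutoff, the function names, and the toy sequences are all assumptions made for illustration; the study itself relied on ML tree placement, not on this rule.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites, skipping positions with a gap or N."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    compared = diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        diffs += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


def assign_name(query, references, max_distance=0.02):
    """Return (name, distance) for the closest reference, or ("unnamed", distance).

    `max_distance` is an arbitrary illustrative cutoff, not a value from the study.
    """
    name, dist = min(
        ((ref_name, p_distance(query, seq)) for ref_name, seq in references.items()),
        key=lambda pair: pair[1],
    )
    return (name, dist) if dist <= max_distance else ("unnamed", dist)


# Invented toy sequences, far shorter than a real barcode.
refs = {"species_X": "ATGCATGCAT", "species_Y": "ATGGTTGCAA"}
hit = assign_name("ATGCATGCAT", refs)
miss = assign_name("TTGCAAGCTT", refs)
```

A query with no close reference stays unnamed regardless of how good its sequence is, which is the situation of the seven unnamed groups. The reverse caveat also applies: a close match within the tropical Group 1 species would not resolve the species, because this barcode does not discriminate among them.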