Identification of Suitable Meloidogyne spp. Housekeeping Genes

Abstract Gene expression studies often require reliable housekeeping (HK) genes to accurately capture gene expression levels under given conditions. This is especially true for root-knot nematodes (RKN, Meloidogyne spp.), whose drastic developmental changes are strongly dependent upon their environment. Here we utilized a publicly available M. hapla RNAseq database to identify putative HK genes throughout the nematode lifecycle. We then validated these candidate HK genes on M. incognita in order to develop a small library of suitable HK genes for RKN. Seven putative HK genes were selected for validation based on high expression level and ease of primer design. The expression of these genes was quantified by qPCR at different developmental stages to capture the entire life cycle of M. incognita which included eggs and naive infective juveniles through 3-wk post inoculation. Two algorithms, geNorm and Normfinder, identified three genes (Disu, Poly, and Skinase) constitutively and uniformly expressed throughout the entire life cycle of RKN. We believe these genes are superior HK genes suitable to be used as internal reference genes at all stages of RKN. Importantly, while we identified Actin, a commonly used HK gene, as a candidate gene within our RNAseq analyses, our qPCR results did not demonstrate stable expression throughout the nematode life cycle of this gene. This study successfully validated suitable HK genes utilizing both RNAseq data and standard qPCR methods across two species of RKN; suitable HK genes are likely applicable to other species of RKN, or even plant-parasitic nematodes. Additional lists of potential HK genes are also provided if the nematode of interest does not have homologues of the three superior reference genes described here. Gene expression studies on RKN should use validated HK genes to ensure accurate representation of transcript abundance.

Root-knot nematodes (RKN, Meloidogyne spp.) are sedentary endoparasites with wide host ranges, including most important agricultural crops, and cause billions of dollars in yield losses (Sasser and Freckman, 1987). The first genomes of M. hapla and M. incognita were available in 2008 (Abad et al., 2008;Opperman et al., 2008), and now the genomes of seven RKN species, including the abovementioned two, and M. arenaria, M. enterolobii, M. floridensis, M. javanica are publicly available. The emergence of available genomes and transcriptomes are enabling the discovery of plantnematode interaction mechanisms and nematode loci involved in parasitism (Gleason et al., 2017). Expression of nematode genes is an important factor in determining a role in parasitism. However, housekeeping (HK) genes for RKN have not been defined and validated, reducing the accuracy and credibility of gene expression studies.
Reverse transcription quantitative PCR (RT-qPCR) is a standard for quantifying mRNA because of its high sensitivity and flexibility (Bustin, 2000). Advantages of RT-qPCR include accurate quantification, flexible wide range, potential high throughput, and wide application in different types of samples (Huggett et al., 2005). However, the reliability of RT-qPCR depends on normalization to control variations such as quantity and quality of RNA in the starting material, efficiency of cDNA transcription synthesis, and pipetting during any of the process from RNA preparation to qPCR (Gal et al., 2006). Generating standard curves for target genes is a common way to quantify gene expression, but with thousands of putative parasitism genes within RKN genomes, it is impractical to produce a unique standard curve for every gene of interest. An alternative approach is to use HK genes which are stably expressed regardless of the given conditions. HK genes are genes necessary to maintain basic cellular function and are presumed to be expressed in all cells of organisms (Butte et al., 2001). HK genes are often used as endogenous references to normalize the variations in RT-qPCR experiments (Jian et al., 2008). However, under conditions such as heat stress, viral infection, and cancer development, the expression of HK genes of animals can vary (Sahu et al., 2018). The utilization of invalidated reference genes results in unreliable and misleading conclusions. There is an urgent need to discover and validate suitable HK genes for plant-parasitic nematodes to ensure accurate representation of RKN transcript abundance.
Commonly used HK genes in plants and animals are tubulins, actins, glyceraldehyde-3-phosphate dehydrogenase (GAPDH), albumins, cyclophilin, micro-globulins, ribosomal units (18S or 28S rRNA), ubiquitin (UBQ), and elongation factors (de Jesus Miranda et al., 2013). Suitable HK genes of plants such as Arabidopsis (Hofmann and Grundler, 2007), tomato (Exposito-Rodriguez et al., 2008), and soybean (de Jesus Miranda et al., 2013) have been discovered and validated using RNAseq and RT-qPCR. However, there is a huge gap in defining and identifying superior HK genes as internal reference genes in RKN, even though the genome of various species of which has been assembled and annotated. Without validation, the detection of "parasitism" gene expression differences during nematode development is hardly convincing (Duarte et al., 2016).
This study utilizes a publicly available M. hapla RNAseq database to identify putative HK genes suitable throughout the RKN life cycle. We validated across a select number of these candidate HK genes with a second species of RKN, M. incognita, using standard RT-qPCR, demonstrating that the HK genes provided in this study could apply to other species of RKN. Importantly, we discovered that the widely used HK gene-Actin may not be suitable reference gene throughout the life cycle of RKN. This study serves as a stepping stone for gene expression studies of RKN, and can greatly facilitate transcriptomic studies of plant-parasitic nematode.

Candidate HK gene selection
We utilized the publicly available M. hapla RNAseq database (https://brcwebportal.cos.ncsu.edu/haplatome/1-0_hapla.php) to discover potential HK genes for RKN (Cha, 2016). Genes with more than five reads per sample across all the treatments were analyzed. Differently expressed genes (DEG) were determined in R v.3.3.3 (Team, 2006) using package "edgeR" (McCarthy et al., 2012). Specifically, the count of the genes was normalized using function "calcNormFactors", and then the dispersion was estimated using function "estimateDisp"; finally, pairwise comparison was accomplished using function "exactTest". Genes with comparision value 0 across all treatments were considered non-DEG. The non-DEG with matching loci within the M. hapla genome were subject to Sig-nalP 4.1 to determine whether they encode secreted proteins ( Petersen et al., 2011), putative secreted protines were omitted as candidates due to the possibility of direct involvement in the plant-nematode interaction. The details of candidate HK genes are in listed in Table A1. The gene functions were annotated utilizing both KEGG (Kanehisa and Goto, 2000) and WormBase parasite website (Howe et al., 2017). Seven genes of the non-DEG pool were selected based upon ease of primer design and high expression level to validate stability of gene expression throughout the life cycle of M. incognita in growth chamber assays.

Plant and nematode inoculation
To test whether the genes can be widely applied to all RKN, a growth chamber experiment was set up to validate candidate HK genes on M. incognita. Tomato seeds (variety "Rutgers") were surface sterilized and planted in potting soil. After 3 wk, seedlings were transferred into individual pots with sand. Second stage juveniles (J2) of M. incognita were hatched from freshly collected eggs, and 300 J2 were inoculated into each pot. The pots were kept in a growth room with 16 hr light/8 hr dark at 26°C. The freshly collected eggs and J2 were also stored in -80°C for RNA extraction.

Root sample collection
Roots were gently washed in sterile water, dried and immediately frozen in liquid nitrogen. A total of 11 time points were collected with five time points during early infection, (1, 2, 4, 5, and 7 days post inoculation-dpi) and six time points during nematode maturity (21 dpi) ( Table 1). To test diurnal effects on RKN gene expression, at the nematode maturity stage, root samples were collected three time points in dark and three time points in light ( Table 1). Each of the total 11 time points had four biological replicates, and 3-wk old non-infected tomato plants were used as negative control.

Primer design
The homologues of M. hapla HK gene of M. incognita were obtained through WormBase parasite website. The transcript sequences of M. incognita V3 were used as template in IDT primerQuest tool for RT-qPCR primer design with the default parameters. To avoid any possible DNA amplification which would alter transcript level detection, primers were designed to span an exon-exon junction if possible or target adjacent exons. We checked the specificity of the primers against the whole transcriptome of M. incognita using Primer-BLAST on the NCBI website. The corresponding gene ID and annotation of the candidate HK genes are included in Table 2, and two genes with high to low DEG value, plus a known actin gene (Duarte et al., 2016) were used as controls (Table A2).

RNA extraction and cDNA synthesis
RNA was extracted from the root samples, eggs and naïve, pre-parasitic J2 using RNeasy Plant Mini kit (Qiagen, USA) according to the manufacturer's protocol. cDNA was synthesized with the iScript Reverse Transcription Supermix (Bio-RAD) for RT-qPCR according to the manufacturer's instructions.

RT-qPCR
RT-qPCR was carried out with a StepOne Plus Real Time PCR detection system (Applied Biosystem, USA) using SYBR Green as detection agent. PCR reactions were conducted with three technical replications for each sample in MicroAmp Fast 96 well reaction plate (0.1 ml) (Applied Biosystems). The total volume of the reaction was 10 μ l with 0.5 μ l forward and reverse primers each at 10 μ M, 2.5 μ l cDNA template, 1.5 μ l of Nuclease free water and 5 μ l SYBR Green qPCR MasterMix (iTaq Universal SYBR Green Supermix, Bio-RAD). PCR was performed at 95°C for 20 sec followed by 40 cycles at 95°C for 3 sec, and 60°C for 30 sec, the melting curve was generated at 95°C for 15 sec, 60°C for 1 min, and 95°C for 15 sec. Ct value was calculated using the StepOne software V2.3 (Applied Biosystems).

Data analyses
To ensure no off-target amplicons were produced, the melt curve was checked carefully to only include samples with single peaks at the correct melting temperature for downstream analyses. The mean C T (cycle threshold) value of each sample was calculated in excel from three technical replicates. The data were organized in excel according to the suggested format by NormFinder (Jensen and Ørntoft, 2004). According to the algorithm of NormFinder, candidate HK genes were compared pairwise within and across groups resulting in a stability value; the most stable HK genes are those with the lowest stability value. In our study, each time point was treated as a group factor. To test the robust of NormFinder analyses, the C T value was  Serine/threonineprotein kinase smg-1 also submitted to geNorm, another similar HK gene algorithm, for stability calculation (Vandesompele et al., 2002).

Functional summary of candidate HK genes
Candidate HK genes were identified utilizing RNAseq database and then annotated with KEGG and Warm-Base parasite. After filtering out genes with low count and DEG >0, we identified 48 candidate HK genes (Fig.  1). While only 39 out of the 48 genes were mapped to M. hapla genome, the 39 genes did not have the potential to encode secreted protein (Fig. 1) and are less likely to be directly involved in the plant-nematode interaction. The remaining 39 genes were submitted to KEGG for functional annotation; only 15 out of 39 genes were annotated (Fig. 2), and majority of the genes belong to genetic information processing and metabolism (Fig. 2). The annotation incorporated the results of KEGG and Wormbase was listed in Table A1.

HK gene validation
Seven out of the 39 candidate HK genes were validated in M. incognita based on high expression level and ease of primer design. Primer specificity was confirmed by single peak on the melt curve. Uninfected tomato plant root samples were not amplified by the primer sets, indicating that all primers were specific to RKN. We used two algorithms to validate suitable HK genes through all developmental stages, including a 24 hr period at 3-wk post inoculation. We determined genes with low stability values as stably expressed genes. Gene stability is slightly different between the two algorithms and throughout the life cycle of RKN.
The geNorm algorithm defines any gene with stability value less than 0.15 as stable genes; RKN genes under the cut-off value were selected as HK genes. According to geNorm, Actin1, Disu, ELF, Poly, Skinase were all suitable HK genes throughout all the developmental stages of RKN (Fig. 3). However, for naïve juveniles, Actin2, Disu, Minc04440, PTP, and Skinase were better reference genes for transcript quantification (Fig. 3). In addition, over a 24 hr period (diurnal) at the adult stage, Actin1, ELF, Poly, Skinase were determined suitable HK genes (Fig. 3). In summary, Disu, Poly, and Skinase had stability values below the threshold for any developmental stage. Normfinder yielded similar yet slightly different results than geNorm. When including all stages, Disu, ELF, Poly, Skinase were suitable reference genes (Fig. 4), but Actin2, Disu, Minc04440, Poly, PTP, and Skinase were stably expressed gene at naïve juvenile stage (Fig. 4). Lastly, Disu, ELF, Poly, and Skinase were better HK genes at the mature stage over the 24 hr period (Fig. 4). Overall, Disu, Poly, and Skinase were suitable HK genes for any developmental stage as suggested by Normfinder. Importantly, our negative control gene, Colla, was highly variable based on RNAseq data,

Identification of Suitable Meloidogyne spp. Housekeeping Genes
and was also determined to be highly variably expressed throughout all developmental stages by both algorithms.

Discussion
Gene expression is in a new era with the constant advancement of next generation sequencing. The expression of thousands of genes can be evaluated in a single experiment, but it is impossible to validate the expression of these genes through standard qPCR by generating standard curves. Reliable and suitable HK genes are essential for reporting accurate gene expression; however, there are no validated HK genes for RKN. The commonly used HK genes, such as 16S and Actin, are frequently used, yet lack any experimental support (Gleason et al., 2017). Although Actin was one of the first HK genes to be used as a normalizer, transcript levels have been demonstrated to vary widely under experimental conditions in human breast epithelial cells, blastomeres, porcine tissues, and canine myocardium (Bustin, 2000). Our results also indicate that Actin might not be a suitable HK gene RKN (Figs. 3,4). Ribosomal 16S/18S rRNA are useful internal control genes, but rRNA abnundance is much higher than target mRNA transcript levels, which makes pairwise statistical analyses inaccurate (Bustin, 2000). We successfully identified and validated the most suitable HK genes throughout the life cycle of RKN: Disu, Poly, and Skinase. These three genes are constitutively and uniformly expressed at all stages of RKN development and can be widely used as reference genes to quantify relative gene expression levels. Importantly, the homologs these genes can be found in over 150 nematode species. The HK genes for a specific species can be easily secured by looking up the homologs of those two genes, as we tested that they were stably expressed  We used two standard algorithms to validate putative HK genes identified in the RNAseq analyses to ensure reliable results. Both algorithms produced consistent results, suggesting that Normfinder and geNorm are robust programs for finding HK genes. In fact, those two programs have been used widely to determine HK genes in human, animals, and plants (Czechowski et al., 2005;Jian et al., 2008;Jongsik et al., 2007;Sahu et al., 2018). The stability value of our candidate HK genes were much lower than those in plants (Czechowski et al., 2005), suggesting that the candidate genes were stably expressed. The threshold of Normfinder was not suggested by the developer, but in this specific study, a similar pattern was observed if the cut-off value was 0.08. It is largely agreed that a single HK gene is not suitable to normalize expression data; at least three HK genes are suggested by geNorm (Vandesompele et al., 2002). Overall, both programs suggested that Disu, Skinase, and Poly are suitable HK genes at any developmental stage of RKN. We also suggest that genes with lower stability value should be chosen if only quantifying the expression of genes in a specific stage of the life cycle. Such as only for naïve juvenile stage, Disu, Minc04440, and PTP are great HK genes, while at the adult stage, Poly, PTP, and Skinase are better options (Figs. 3,4). Similar phenomena were observed in human cells and plants when certain HK genes are not suitable for experimental condition. Some HK genes constitutively expressed only in a particular tissue, and only in fetal or adult in human (Butte et al., 2001). The potential HK genes of Arabidopsis thaliana were tested

Control
Candidate HK on various tissues and under different biotic conditions to ensure the consistency (Czechowski et al., 2005). Interestingly, widely used HK genes such as UBQ10, ACT2, and EF1a for Arabidopsis thaliana were not stable when infected with RKN or soybean cyst nematode (Hofmann and Grundler, 2007). In addition, different combination of HK genes were also suggested for the tomato development process (Exposito-Rodriguez et al., 2008). The expression level of the gene is highly related with its function during the life cycle. The RKN nematode gene Colla is involved in the nematode cuticle collagen process and was expected to be highly variable during the life cycle of RKN. RNAseq analyses (DEG = 10) and standard qPCR results (Figs. 3,4) demonstrated highly variable expression in naïve juvenile, adult, and all stages of the life cycle, as expected for a cuticular collagen gene. Another negative control gene-NHL-with DEG value 1, is an amino acid repeat found in many enzymes. RNAseq data of this gene suggested that it was only differently expressed between egg and non-infected juvenile (Cha, 2016), which coincided with our results that it only expressed at the mature stage when eggs are formed, and naïve juveniles were hatched (Fig. 4). We deduce that genes involved in encoding disulfide-isomerase (Disu), Polyadenylate-binding protein (Poly), protein transport (PTP), and Serine/threonine-protein kinase (Skinase) are good HK genes for plant-parasitic nematodes as these are loci involved in processes required throughout the lifecycle of the nematode.