Citation Information : Postępy Mikrobiologii - Advancements of Microbiology. Volume 60, Issue 1, Pages 3-12, DOI: https://doi.org/10.21307/PM-2021.60.1.01
License : (CC-BY-NC-ND 4.0)
Received Date : May-2020 / Accepted: November-2020 / Published Online: 24-March-2021
The clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated (Cas) proteins are components of an adaptive immunity system that protects against foreign DNA and is present in many bacterial species. Recent years have brought extensive research on this system; however, not all of its biological properties have been discovered so far. It was recently discovered that CRISPR-Cas can regulate the formation of biofilm and is closely associated with the DNA repair system in bacterial cells. It is also likely that some of the spacer sequences are complementary to short sequences in the bacterial genome, which may influence the regulation of bacterial genes, e.g. virulence factors. In addition, phages can carry anti-CRISPR genes, which could be used in the future to develop an alternative therapy against multi-drug resistant bacterial strains. Here we present the basic characteristics of the CRISPR-Cas system, including its structure, a brief description of its mechanism of action, its systematic classification, and its importance for medicine and biotechnology. We would like to stress the huge potential of CRISPR-Cas by discussing selected but varied aspects.
1. Introduction. 2. Structure, operation and differences. 3. Bacterial typing. 4. Correlation with bacterial pathogenicity. 5. Potential tool for medicine. 5.1. CRISPR-tool for genome editing. 5.2. Instances of CRISPR-tool strategies in medicine. 6. Phage response. 7. Conclusions
Clustered regularly interspaced short palindromic repeats (CRISPR) were identified in approximately 40% of bacterial and approximately 90% of archaeal genomes . These loci are usually accompanied by conserved sets of genes, encoding nucleic acid processing enzymes like nuclease or helicase proteins, named CRISPR-associated (cas) genes. Typically, CRISPR-Cas loci contain repeats of sequences interspaced by spacers with noncoding unique sequences of similar lengths [2–5]. The function of CRISPR-Cas as part of a defense system against foreign DNA was confirmed in 2007 . Functionally, the CRISPR-Cas system is based on molecular memory of a previous infection, which means that during second contact with a bacteriophage or other mobile genetic material, the foreign nucleic acid is recognized and inactivated.
The CRISPR sequences were previously classified as a group of interspaced short sequence repeats (SSRs) with a unique structure. Those repeats are interspaced by nonrepetitive DNA fragments, which, as was thought at the beginning, were not important. These distinct interspaced SSRs were detected for the first time in the Escherichia coli K-12 chromosome in 1987 . In subsequent years, many scientists have identified this specific group of SSRs in many other bacteria, e.g. Streptococcus pyogenes, Mycobacterium tuberculosis, Campylobacter jejuni, and Clostridium difficile [4, 8, 9]. The function of SSRs was unknown and many scientists have named the same group of sequences differently. In 2002 Jansen et al.  used the acronym CRISPR for the first time as a name for a characteristic class of repetitive DNA, to avoid confusing nomenclature in the future. After the discovery of the CRISPR sequence function, the CRISPR-Cas designation has become a part of the generally accepted nomenclature. Interestingly, before recognition of the biological role of CRISPR-Cas, these sequences were used as a typing tool in molecular diagnostics. The method was established for Yersinia pestis, Bacillus anthracis, Salmonella enterica serovar Typhi, Francisella tularensis, E. coli O157, and Mycobacterium leprae . Currently, scientists have isolated several subgroups that differ in the internal structure of CRISPR-Cas sequences and associated genes [11–14]. In this work we present a brief history of CRISPR-Cas, describing the structure and the mechanism of operation as well as its application on selected examples.
The CRISPR-Cas system is based on the involvement of small, noncoding sequences in conjunction with Cas proteins. It was found that cas genes are always located near a CRISPR locus and the most common arrangement of these genes is cas3-cas4-cas1-cas2. The Cas3 protein appears to be a helicase, whereas Cas4 resembles the RecB family of exonucleases and contains a cysteine-rich motif, which suggests DNA binding properties. Cas1 is generally highly conservative and it was found consistently in all species that contain CRISPR loci. Repeated sequences in size from 24 to 48 nucleotides are separated by unique spacers of similar length. The number of repeated sequences is highly variable in bacterial species [11, 15, 16].
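The repeat-spacer layout described above can be illustrated with a minimal Python sketch. It assumes the repeat sequence is already known and perfectly conserved, which real arrays do not guarantee. The function name `split_array` and all sequences are invented for illustration and do not come from any real genome.

```python
def split_array(array: str, repeat: str) -> list[str]:
    """Return the spacers lying between consecutive copies of `repeat`."""
    parts = array.split(repeat)
    # parts[0] and parts[-1] are the sequences flanking the array;
    # every piece in between sits between two repeats, i.e. is a spacer.
    return [p for p in parts[1:-1] if p]


# Toy sequences invented for illustration (assumption, not real data).
REPEAT = "GTTCCATCGACCTTAGGCAATGCA"  # 24 nt, within the 24-48 nt range
SPACER_1 = "AAACCCGGGTTTAAACCCGGGTTT"
SPACER_2 = "TTGACGTACGATCGATCGTAGCTA"
ARRAY = "ATGC" + REPEAT + SPACER_1 + REPEAT + SPACER_2 + REPEAT + "TTAA"

spacers = split_array(ARRAY, REPEAT)
print(len(spacers), spacers)
```

Exact string matching is used only to keep the sketch short; a practical tool would have to tolerate point mutations in the repeats.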
The classification of CRISPR-Cas systems is complex and still not settled. However, three basic stages of the mechanism of action of the CRISPR-Cas defense system, namely a) spacer integration, b) expression of the CRISPR locus and cas genes, and c) target recognition and neutralization, are more or less conserved among all CRISPR types and hence can be described in general. In this work the CRISPR-Cas type II system is used to illustrate this model of operation in detail, as it was the key to developing the CRISPR genome editing technique [17, 18]. The first step of this mechanism is the integration of short sequences of foreign DNA into the CRISPR locus (Fig. 1). Identification of foreign DNA begins with recognition of the viral protospacer adjacent motif (PAM) by a protein complex of two Cas1 dimers and a single Cas2 dimer. The PAM is a short sequence of 2–8 nucleotides flanking the protospacer; the Cas1 protein is an endonuclease and Cas2 is an endoribonuclease. These properties allow the Cas1-Cas2 complex to excise the protospacer and integrate it into the CRISPR array. Before the integration of the new spacer, specific enzymes cut the sequence and create sticky ends. This mechanism allows integration of the new spacer between the first and second repeats on the leader sequence side. During integration, the DNA is broken at two points: at the 5’ end of the first repeat and then at the 5’ end on the opposing side of the repeat. As a result, the single-stranded DNA (ssDNA) that arises contains the repeated sequence. In the next step, specific enzymes repair the gaps in the ssDNA. During the expression stage, the CRISPR region with the acquired foreign sequences is transcribed into a precursor RNA transcript (pre-crRNA), which is then cut by the double-stranded RNA-specific RNase III into small RNA molecules (crRNAs). The maturation of crRNA is possible due to a small trans-activating crRNA (tracrRNA) (Fig. 1).
The tracrRNA is complementary to pre-crRNA and thus these two molecules form a duplex. Each mature crRNA contains a fragment of a single spacer flanked by part of the repeated sequence. In the recognition stage, a complex of crRNA and Cas9 protein scans the cell for foreign nucleic acid targets. This search relies on the complementarity between crRNA and foreign DNA, which allows the protein complex to recognize and bind its target. The Cas9 enzyme in this complex cuts the foreign DNA at the binding site, which leads to its degradation (Fig. 1) [20, 21]. It was shown that deletion or insertion of a spacer sequence complementary to the phage genome increases or decreases the bacterial sensitivity to phage infection, respectively.
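The recognition stage can be sketched in a few lines of Python: a site counts as a target only if the spacer sequence matches and a PAM sits immediately next to it. This is a simplified illustration, not a model of Cas9 biochemistry. It assumes the `NGG` PAM commonly reported for the S. pyogenes Cas9 (not stated in the text above), searches one strand only, and requires an exact match. The function name `find_protospacers` and all sequences are hypothetical.

```python
import re


def find_protospacers(dna: str, spacer: str, pam: str = "NGG") -> list[int]:
    """Return 0-based starts where `spacer` is directly followed by a PAM."""
    # "N" in the PAM pattern stands for any nucleotide.
    pam_re = re.compile(pam.replace("N", "[ACGT]"))
    hits = []
    start = dna.find(spacer)
    while start != -1:
        end = start + len(spacer)
        # The PAM must flank the protospacer; a match without it is ignored.
        if pam_re.fullmatch(dna[end:end + len(pam)]):
            hits.append(start)
        start = dna.find(spacer, start + 1)
    return hits


# Toy sequences invented for illustration (assumption, not real data).
SPACER = "ACGTTGCATCGGATCCTAGA"  # 20 nt
DNA = "CC" + SPACER + "TGG" + "AAAA" + SPACER + "TTT"

print(find_protospacers(DNA, SPACER))
```

In this toy input the spacer sequence occurs twice, but only the first copy is followed by a sequence matching `NGG`, so only that site is reported. The same PAM requirement explains why the spacer stored in the bacterium's own CRISPR array, which is flanked by repeats rather than a PAM, is not treated as a target.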
The first polythetic CRISPR-Cas system classification was proposed by Makarova et al. (2011). In this study, CRISPR-Cas systems were differentiated according to target molecule and mode of operation. In this classification, three major types and 12 subtypes were distinguished. Type I and II systems cleave and degrade DNA. The main difference between type I and type II systems is the protein present in the complex: Cas3 and Cas9, respectively. The CRISPR-Cas type III contains the Cas10 protein and cleaves both DNA and RNA. Both type I and II require two principal factors to effectively target the DNA. The first is a CRISPR RNA spacer complementary to the target protospacer sequence. The second is a PAM sequence specific to each CRISPR-Cas system. Similar factors are required for DNA targeting by type III systems. All these systems involve base pairing between the target sequence and the region flanking the protospacer. The CRISPR-Cas type III system seems to be more beneficial for bacteria because of its ability to recognize and degrade both DNA and RNA. Therefore, it is effective against a wider group of phages. On the other hand, the type I and type II systems are simpler because fewer Cas proteins are engaged. Those mechanisms require less energy from bacteria for transcription and operation. This may allow for a faster response and defense in case of a phage infection. This classification of CRISPR-Cas systems was updated by Makarova et al. in 2015 and again in 2020 [13, 25]. The rapid progress and major achievements in the study of the diversity of CRISPR-Cas systems have led to the creation of a robust classification, which is essential to understanding their functionality in microorganisms and their utility as genome-editing tools. Recently, the authors put special emphasis on the classification of the quickly proliferating class 2 variants.
In comparison with the previous classification system (from 2015), the new class 2 includes 3 types (II, V, VI) and 26 subtypes. Interestingly, the previous type IIC was divided into two extra subtypes (IIC1 and IIC2) and type V was described in detail (17 subtypes were found). Moreover, the new type VI was distinguished. The newest classification is shown in Table I.
The origin and cause of the system’s mechanism diversity are still unexplained. An interesting task now would be to determine which of the systems was created first, or if they were developed independently of each other.
Primarily, CRISPR sequences have been used as a typing tool in epidemiology and phylogenetic studies. The main elements useful for analysis are the palindromic repeats. The nucleotide sequence, length, and loci are highly variable and often strain-specific among some bacterial species such as M. tuberculosis, in which CRISPR-Cas type III-A has been described [8, 10, 26, 27]. Due to this, the CRISPR-based approach has already been used for differentiation within many bacterial species such as M. tuberculosis, C. jejuni, Corynebacterium diphtheriae, Legionella pneumophila, and E. coli. This CRISPR utility in bacterial differentiation may be associated with their diverse pathogenicity profile and the unique genetic fingerprints of particular strains. Because of their unique structure, CRISPR sequences can be used for clonal differentiation [33, 34]. As a result of optimization, a technique called CRISPR high-resolution melt analysis was developed. This method is a combination of CRISPR typing with amplified fragment length polymorphisms (AFLP) and multilocus sequence typing (MLST) [29, 34]. Unfortunately, there are limitations to the use of CRISPR typing in phylogenetic studies, which are based on the evaluation of the number of evolutionary events resulting in sequence changes. For now, it is impossible to prove which mutagenic factor could be responsible for the loss of a cluster of neighboring spacers from CRISPR [24, 34]. Published data indicate that the multiplex nature of the CRISPR-Cas mechanism enables recognition of multiple loci simultaneously, which leads to large deletions, inversions, or translocations [35–37]. The system does not always require perfect complementarity to target DNA sequences, and therefore it is able to cause indels outside the target sequence, leading to random mutations.
In turn, the stability of CRISPR-Cas in the bacterial genome and its mechanism of regulation seem to be crucial factors in the process of adaptation to unfavorable environmental conditions.
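The idea behind CRISPR-based typing discussed above can be sketched as a comparison of the CRISPR content of strains. The Python example below assumes that each strain is summarized by the set of spacers detected in its array, scored against a fixed reference panel. This is one simple way to build a fingerprint and is not the specific protocol of any study cited here. The spacer labels, strain contents, and function names are hypothetical.

```python
def spacer_profile(strain_spacers: set[str], reference: list[str]) -> str:
    """Encode presence (1) or absence (0) of each reference spacer."""
    return "".join("1" if s in strain_spacers else "0" for s in reference)


def jaccard_distance(a: set[str], b: set[str]) -> float:
    """Return 1 - |intersection| / |union|; 0.0 means identical content."""
    union = a | b
    if not union:
        return 0.0
    return 1 - len(a & b) / len(union)


# Hypothetical reference panel and strains, invented for illustration.
REFERENCE = ["sp1", "sp2", "sp3", "sp4", "sp5"]
STRAIN_A = {"sp1", "sp2", "sp3", "sp4"}
STRAIN_B = {"sp1", "sp4", "sp5"}

print(spacer_profile(STRAIN_A, REFERENCE))
print(spacer_profile(STRAIN_B, REFERENCE))
print(jaccard_distance(STRAIN_A, STRAIN_B))
```

A set-based distance ignores spacer order and cannot tell one deletion of several neighboring spacers from several independent losses, which mirrors the limitation for phylogenetic inference noted above.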
Presumably, the CRISPR-Cas system acts not only as a defense against foreign genetic material, but also plays a role in regulating the expression of other bacterial genes [39, 40]. The polymorphism of CRISPR-Cas genes in enterohemorrhagic E. coli (EHEC) is associated with the expression of the virulence genes stx (phage-encoded Shiga toxin) and eae (intimin virulence factor). It has also been confirmed that deletion of genes coding for transcription regulators affects the expression of cas genes. For example, removal of the ompR regulator in Y. pestis causes changes in the expression of 244 genes, including repression of transcription of cas1 [42, 43]. It is also very likely that cas genes are involved in the bacterial stress response. In Enterococcus faecalis, which causes opportunistic intestinal infections, the stress response is mediated by effector molecules, e.g. (p)ppGpp. The synthesis of these molecules is regulated by GTP pyrophosphokinase, which was correlated with lower expression of cas genes in E. faecalis. This observation was described by Yan et al. (2009) after antibiotic therapy. García-Gutiérrez et al. (2015) showed a relationship between an increased number of virulence factors and a reduced number of repeated sequences in E. coli CRISPR loci. Moreover, it was shown that pathogenic E. coli strains isolated from distant ecological niches varied in the number of CRISPR repeats. There is also evidence that the CRISPR-Cas regions correlate with higher drug resistance of E. coli. It has also been noticed that the spacer sequence was complementary to plasmids harboring antibiotic resistance genes. In another experiment, differences in biofilm production among various E. faecalis strains were observed. The strains that possessed CRISPR-Cas systems formed biofilm more intensively than bacteria without this system.
A similar observation was made in Pseudomonas aeruginosa PA14, where cas genes affected biofilm production and the formation of swarming cells [47–49]. Thus, it can be assumed that modification of CRISPR-Cas genes may change the profiles of bacterial virulence. It is clear that the CRISPR-Cas system is associated with modulation of bacterial pathogenicity, including drug resistance, but many “white spots” still remain in CRISPR-Cas biology. An interesting question is whether particular types of CRISPR-Cas systems correlate with the pathogenicity of bacteria or phages.
Recently, the CRISPR-Cas system has been considered a tool for genome editing. The ease of use of such systems is their main advantage. It stems from the nature of sequence recognition, which differs significantly from conventional, nuclease-mediated DNA editing techniques: DNA recognition comes not from a protein but from the 20-nt guide RNA sequence. Instead of preparing individual DNA-recognition domains (via protein engineering), multiple site-specific guide RNA molecules may be used. This significantly broadens applicability across a variety of genome editing experiments. Jinek et al. (2012) characterized the mechanism by which crRNA (CRISPR RNA) guides the silencing of invading nucleic acids. A two-RNA structure, which can also be engineered as a single RNA chimera, directs the Cas9 endonuclease to introduce double-stranded breaks in target DNA: the Cas9 HNH nuclease domain cleaves the complementary strand, whereas the Cas9 RuvC-like domain cleaves the noncomplementary strand. Their study highlights the potential to exploit the CRISPR-Cas system for RNA-programmable genome editing.
Generally, the challenge in CRISPR-Cas genome editing technologies is engineering highly specific and programmable nucleases that generate DNA double-strand breaks (DSBs), and controlling the DNA repair pathways they induce. One of them is the nonhomologous end-joining (NHEJ) repair pathway, which is an error-prone process and often causes random deletions or insertions. It turned out to be a faster and highly efficient (up to 80%) process in comparison to traditional homologous recombination (HR), useful especially for the generation of knock-out mutants [52, 53]. The DSBs induced by CRISPR-Cas can also be repaired through the homology-directed repair (HDR) pathway, which is also more effective than the traditional HR technique and more precise than NHEJ. Gene editing using the CRISPR-Cas system with HDR recombination between genomic and homologous exogenous DNA is a tremendous opportunity for gene correction therapy; however, the simultaneous suppression of the NHEJ response, which generates harmful byproducts, remains a challenge for researchers. Another unresolved obstacle to the application of CRISPR-Cas in genome editing is the risk of off-target cuts due to the tolerance of gRNA to mismatched DNA [38, 55]. This is a serious problem for the application of this technology in gene therapy because the consequences of off-target genome editing are poorly understood. Here, bioinformatics specialists have a lot to do, and intensive research is underway to develop appropriate bioinformatic tools to identify potential off-target sites or design the most specific gRNAs and other components of the CRISPR-tool. Currently, intense research is being performed on Cas9 nuclease manipulation to increase the specificity of this technology.
These modifications include, for example, temporary expression of the nuclease and the application of a modified nSpCas9 (nCas9) protein with its own sgRNA, which cuts a single strand owing to the inactivation of the nuclease domain RuvC or HNH, or the attenuation of the helicase activity (eSpCas9). Another modification includes the fusion of the nuclease FokI with the dead SpCas9 (dSpCas9), which has inactivated HNH and RuvC domains, or (tested in mouse models) the binary system Split-SpCas9, which uses the expression of the nuclease lobe and the α-helical domain independently, where modified gRNA abolishes the Cas9 activity and divides the dimer. Scientists are also looking for alternative solutions in newly discovered nucleases (e.g. SaCas9, St1Cas9, or NmCas9) encoded by other cas genes, which could be more easily packaged into viral vectors due to their smaller size. Brand-new enzymes like Cpf1 with shorter crRNA sequences, or the RNA-cleaving C2c2 (Cas13a) and C2c6 (Cas13b), are also candidates to become better substitutes for SpCas9 in CRISPR engineering [63, 64].
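The off-target problem discussed above, where a gRNA tolerates mismatched DNA, is what bioinformatic screening tools try to anticipate. The Python sketch below shows the core idea in its simplest form: slide the guide along a sequence and report sites that differ from it by only a few bases. It is an illustration under strong assumptions, not how any particular published tool works. It scans one strand, ignores the PAM and the position of mismatches, and the threshold of 3 mismatches is an arbitrary choice. All names and sequences are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def off_target_sites(genome: str, guide: str,
                     max_mismatches: int = 3) -> list[tuple[int, int]]:
    """Return (position, mismatches) for near-matches of `guide`.

    Perfect matches (0 mismatches) are the intended target and are skipped.
    """
    n = len(guide)
    sites = []
    for i in range(len(genome) - n + 1):
        mm = hamming(genome[i:i + n], guide)
        if 0 < mm <= max_mismatches:
            sites.append((i, mm))
    return sites


# Toy sequences invented for illustration (assumption, not real data).
GUIDE = "ACGTTGCATCGGATCCTAGA"
NEAR_MATCH = "TCGTTGCATCGGATCCTAGC"  # differs from GUIDE at both ends
GENOME = "GG" + GUIDE + "TTTT" + NEAR_MATCH + "CC"

for position, mismatches in off_target_sites(GENOME, GUIDE):
    print(position, mismatches)
```

This brute-force scan takes time proportional to genome length times guide length, which is fine for a toy sequence; screening a whole genome in practice calls for indexed search.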
The first genome-editing technologies, i.e. zinc finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs), are still useful, but the CRISPR-Cas9 system is now more popular. In particular, genome-editing endonucleases have significantly improved our ability to make precise changes in the DNA of eukaryotic cells, as well as therapeutic strategies against viral infections. The human immunodeficiency virus (HIV) remains a major global public health issue, with more than 35 million individuals infected worldwide. In the HIV context, CRISPR-Cas9 applications might go further than current anti-retroviral therapy. In the case of HIV-1, the product of the CXCR4 gene in primary T cells mediates viral entry into human CD4+ cells by binding to envelope protein gp120. Modifications of CXCR4 may induce resistance to HIV-1. When evaluating the therapeutic strategy based on CRISPR-Cas9, it is critical to understand that not only can HIV be eliminated from latently infected cells, but the majority of uninfected cells will become resistant to HIV infection too. HIV gene therapy had progressed very slowly until the recent breakthroughs in gene-editing methods using CRISPR-Cas9. Recently, Dash et al. demonstrated that CRISPR-Cas9, used in conjunction with the so-called LASER ART (long-acting slow-effective release antiviral therapy), successfully eliminated latent HIV-1 reservoirs from infected, humanized mice. Although important challenges still need to be overcome, it seems that a promising pathway to an HIV cure has been found. CRISPR-Cas9 also seems to be a viable tool for eliminating other viruses that persist post-infection in host cells. Promising results have been shown for the hepatitis B virus (HBV). Sakuma et al. used a multiplexed CRISPR-Cas9-nuclease and Cas9-nickase vector system, simultaneously targeting three domains in the HBV genome. One of the goals achieved was the elimination of HBV DNA from host cells via fragmentation.
This fact seems crucial because the presence of stable cccHBV DNA in the liver cells is a major obstacle in suppressing HBV infection . The use of CRISPR-Cas9 based strategies has shown promising results against high-risk human papillomavirus (HPV) infection which is directly associated with an increased risk of developing cancer in an infected individual. Two studies show that CRISPR-Cas9 could be useful in HPV-related cancer therapy. This system can cause virus degradation and cell line growth inhibition (cervical cancer and anal cancer) [68, 69]. Results showing the antiviral potency of the CRISPR-Cas9 strategy have also been obtained in studies of viruses like the hepatitis C virus (HCV) , the Epstein-Barr virus (EBV) , and the African swine fever virus (ASFV) ; the latter happens to be a major epidemiological, as well as economical issue in Europe nowadays.
The CRISPR-Cas application is important for possible use in gene therapy for monogenic recessive disorders due to loss-of-function mutations, such as cystic fibrosis (CF) and Duchenne muscular dystrophy (DMD). Studies of these diseases have underscored the huge potential of the CRISPR-Cas technology for human gene therapy. CF is an inherited disease related to a defect in the gene coding for the cystic fibrosis transmembrane conductance regulator (CFTR). It was found that in 70% of CF cases, phenylalanine at position 508 of the protein chain is deleted. Using cultured primary adult intestinal stem cells derived from cystic fibrosis patients, the CFTR locus responsible for cystic fibrosis was corrected by homologous recombination (HR) with available homologous donor templates. This resulted in the clonal expansion of organoid cultures (miniature organ-like cell clusters) harboring the exact genetic change. The authors proved that the corrected allele is expressed and fully functional as measured in clonally expanded organoids. DMD is a recessive X-linked disorder caused by a defective gene coding for dystrophin, which afflicts primarily males and affects both skeletal and cardiac muscles. In the context of gene therapy of DMD, there are a few ongoing studies on the mdx mouse model of DMD with a nonsense mutation in exon 23, which prematurely terminates protein production [72, 73]. These studies have shown that the Cas9 nuclease is targeted to introns 22 and 23 by two single guide RNAs. Generation of double-strand breaks (DSBs) by the nuclease leads to excision of the region surrounding the mutated exon 23. The distal ends are repaired through the non-homologous end joining (NHEJ) system. Consequently, the reading frame of the dystrophin gene is recovered and protein expression is restored. It was established that CRISPR-treated mice showed no phenotypic evidence of toxicity [73, 74].
Zhang et al. (2020) applied the CRISPR-Cas9 technology to correct diverse genetic mutations in animal models of DMD, but the high doses of adeno-associated virus (AAV) were a significant problem in further clinical application. Their recent study on the DMD mouse model demonstrated a 20-fold higher efficiency of this technology by packaging Cas9 nuclease in single-stranded AAV (ssAAV) and delivering CRISPR single guide RNAs in self-complementary AAV (scAAV). The authors observed a restoration of dystrophin expression and improved muscle contractility in mice. It was shown that the CRISPR-Cas system type II can directly correct a genetic defect through NHEJ or HR mediated gene editing. This provides a proof of concept for gene correction by homologous recombination in stem cells cultured from patients with a single-gene hereditary defect.
Standard antimicrobial strategies, such as the use of antibiotics, have a non-specific effect, removing sensitive microorganisms, both pathogenic and commensals. The development of new antibiotics and antimicrobial peptides requires detailed knowledge of the metabolism and physiology of bacteria. Lytic bacteriophages, in turn, offer exquisite specificity, but individual bacteriophages must be isolated against a specific strain, and this requires additional screening to determine the degree of specificity . There is evidence that CRISPR-Cas systems could be used as a “smart antibiotic” acting specifically against particular pathogenic bacteria and saving natural microbiota [23, 77]. In Escherichia coli with expression of the type I-E or II-A system, transformation of a plasmid with spacers targeting endogenous genes or a lysogenized bacteriophage has led to extremely low recovery of viable transformants [78–82]. The method still requires further optimization, but offers many possibilities to fight multidrug resistant bacteria.
A response to the CRISPR-Cas system was observed for the first time among phages closely related to the group of Mu-like phages. Nine anti-CRISPR genes in the phage genome were characterized [83, 84]. The CRISPR-Cas system can be blocked at the stage of complex formation and recognition of foreign DNA. In infected P. aeruginosa, the phage genome contains genes acrF2, acrF1, and acrF3 which encode anti-CRISPR proteins [84, 85]. Production of anti-CRISPR proteins is an excellent example of an “arms race” between phages and bacteria, where phages are constantly evolving to overcome bacterial defense systems. Anti-CRISPR proteins could be used in phage therapy of pathogenic strains that express resistance induced by the CRISPR-Cas system. It is extremely relevant in the case of infections where phage therapy is the sole solution .
The bacterial CRISPR-Cas system targets and degrades foreign DNA from all mobile genetic elements, including plasmids, transposons, and pathogenicity islands [9, 86]. Similar to phages, successful transfer of other mobile genetic elements will depend on the inactivation of the CRISPR-Cas systems of particular bacteria. In Pseudomonas sp., the anti-CRISPR protein sequences are found in genome loci that are not associated with the CRISPR-Cas region. Such regions include several elements that are involved in DNA transfer and conjugation . Interestingly, also the anti-CRISPR genes in mobile genetic elements may play an important role in increasing the virulence of bacterial strains. For example, an anti-CRISPR homolog was discovered in an active pathogenicity island of a highly virulent P. aeruginosa clinical isolate .
The first step of the CRISPR-Cas system operation is the adaptation of foreign DNA (spacer) by its insertion into the end of the leader CRISPR sequence and duplication of the CRISPR repeat [3, 88]. Considering occurrences like cutting, insertion, and duplication of DNA fragments, it is highly probable that other mechanisms of DNA metabolism are involved in this system. Ivančić-Bace et al. (2015) proved that replication proteins and DNA repair proteins, i.e. DNA polymerase I, RecG, and PriA, were required to insert new spacers in CRISPR loci . It is not excluded that more proteins specific to other important cellular processes can be involved in this system.
The CRISPR-Cas system is associated with multiple functions in bacterial cells, starting from the original function of resistance to foreign DNA and ending with the regulation of biofilm production. It is also likely that some of the spacer sequences are complementary to short sequences in the bacterial genome, which may influence the regulation of bacterial genes, e.g. virulence factors. As a result of the “arms race”, phages have acquired anti-CRISPR genes, which could be extremely relevant in the therapy of infections caused by multi-drug resistant (MDR) bacteria. In the last decades, a significant increase in the number of MDR strains has been observed, not only in hospitals but also in the environment. In Europe, such infections cause as many as 25,000 deaths per year. According to predictions, this number will probably increase in the following years. Phage therapy could be an alternative strategy against MDR bacteria, which can also be equipped with the CRISPR-Cas system. Initially, it was proposed that the CRISPR-Cas system may be used in genetic modification of organisms and, in the future, in gene therapy. The wide range of applications of CRISPR-Cas opens many possibilities for contemporary medicine, biotechnology, industry, and environmental protection. However, it is still necessary to understand all aspects of this phenomenon.
Funding: This work was supported under the program of the Minister of Science and Higher Education under the name “Regional Initiative of Excellence” in 2019–2022, project number: 024/RID/2018/19, financing amount: 11.999.000,00 PLN. This work was also cofinanced by miniGrant UJK number: SMGR.RN.20.181.
Conflict of Interest: The authors declare that they have no conflict of interest.
Ethical approval: For this type of study formal consent is not required. This article does not contain any studies with human participants or animals carried out by any of the authors.
Informed consent: For this type of study formal consent is not required.