Prophage text description
PROPHAGE AND CLUSTERED REGULARLY INTERSPACED SHORT PALINDROMIC REPEATS
The 54 L. plantarum strains contain 15 regions encoding typical prophage proteins (Table 13). Every strain appears to carry prophage genes, and the total number of these genes varies from 58 to 305 per strain. As for the sugar cassettes (see Manuscript), 9 groups of strains appear to have the same set of prophage genes at the same loci. Again, these strains are closely related and, in this case, their relatedness correlates with the source of isolation and the country of origin. Phage region 11 (pr11) is by far the most common, occurring in 43 strains (34 with complete and 9 with incomplete prophages) (Table 13). Phage regions 10, 4 and 6 are the next most common, occurring in 30 (of which 17 complete), 15 and 14 strains, respectively. It is not immediately clear how similar the prophages within each region are, because similar prophages are known to insert at different positions in the chromosome.
Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA repeats typically composed of short, highly conserved repeats interspaced by variable sequences called spacers, and are found adjacent to cas genes (50). CRISPR play an important role in controlling gene transfer (51) and have been demonstrated to be an important phage-resistance system (52,53). It has been shown that L. rhamnosus strains isolated from the same niche mostly share identical CRISPR-cas loci (54). However, this is not the case for L. plantarum. The CRISPR-cas machinery has previously been described as absent in L. plantarum (55). In our dataset, 2 CRISPR loci were identified and 4 strains carry CRISPR genes. The CRISPR 1 locus consists of 4 cas genes (cas1/cas9, cas1, cas2 and csn2). This locus is present in only 3 strains with different origins (NIZO2029, NIZO2801, ZJ316). Surprisingly, they all share the same direct repeat of 36 nt, which has also been found in Lactobacillus pentosus MP-10 (56). The CRISPR 2 locus consists of 2 cas genes (cas1 and cas2) and was found only in strain NIZO1839.
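The repeat-spacer organisation described above can be illustrated with a minimal Python sketch. It assumes that the direct repeat is known and that every copy in the array is an exact match, which real arrays do not always satisfy. The function name extract_spacers and the toy sequences are hypothetical and are not taken from any of the strains analysed here.

```python
def extract_spacers(array_seq: str, repeat: str) -> list[str]:
    """Return the sequences lying between consecutive exact copies of `repeat`.

    Assumption: repeats are perfectly conserved; degenerate repeats are not handled.
    """
    parts = array_seq.split(repeat)
    # parts[0] and parts[-1] are the sequences flanking the array, not spacers.
    return [p for p in parts[1:-1] if p]


# Toy example (hypothetical 8 nt repeat, not the 36 nt repeat reported in the text).
toy_repeat = "GTTTTAGA"
toy_array = "AAA" + toy_repeat + "ACGTACGT" + toy_repeat + "TTGGCCAA" + toy_repeat + "CCC"
spacers = extract_spacers(toy_array, toy_repeat)
```

In this toy array, three repeat copies delimit two spacers. Comparing such spacer lists between strains is one simple way to judge whether two loci are identical or merely share the same repeat.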
LARGE INTERGENIC CRUCIFORM-LIKE SUPERMOTIFS
Large Intergenic Cruciform-Like Supermotifs (LPSMs) are highly conserved genetic elements in the intergenic regions of L. plantarum (55). It has been reported that LPSMs might be transcribed and be active as a transcript, analogous to the CRISPR system. However, the physiological function of the LPSM transcript(s) remains to be established. A search of the 54 L. plantarum genomes for the LPSM sequences described in WCFS1 (55), using the HMM provided in the same paper, showed that all strains contain at least one LPSM occurrence in their genome. In a majority of strains (48 out of 54), we identified 19-25 LPSMs, which is approximately the number found in WCFS1 (24). Only 4 strains of different origins (IPLA88, CNW10, UCMA303 and ATCC14917) contained far fewer LPSM occurrences (3, 10, 12 and 15, respectively). Two strains (NIZO2264 and NIZO1839, isolated from silage and sour cassava, respectively) contained a relatively high number of LPSM occurrences (29 and 31). Notably, the phylogenetic analysis of the core genome showed these two strains to be very closely related (Fig. 2 Manuscript). Manual inspection of the genomic regions containing LPSM occurrences showed that the LPSMs do not seem to be located in the same genomic context in the different strains, suggesting that there is no direct link between the LPSMs and the surrounding genes. An updated search against the complete database of (draft and complete) bacterial genomes, using one of the WCFS1 sequences as a seed, did not reveal any full-length occurrences of the LPSM outside the species L. plantarum. The best hit outside L. plantarum was in the draft genome sequence of Lactobacillus acidipiscis KCTC 13900 and showed 82% identity over 78% of the length of the LPSM. Only a single BLAST hit was observed in the genome of L. acidipiscis KCTC 13900.
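The grouping of strains by LPSM count can be sketched in Python as follows. Only counts stated explicitly in the text are used. The group labels ("low", "typical", "high") and the function name lpsm_group are assumptions introduced for illustration, with the 19-25 range taken as the typical band.

```python
# LPSM counts stated in the text; strains without a stated count are omitted.
lpsm_counts = {
    "WCFS1": 24,
    "IPLA88": 3,
    "CNW10": 10,
    "UCMA303": 12,
    "ATCC14917": 15,
}

# Range reported for 48 of the 54 strains.
TYPICAL_MIN, TYPICAL_MAX = 19, 25


def lpsm_group(count: int) -> str:
    """Assign an illustrative label relative to the 19-25 range (labels are assumed)."""
    if count < TYPICAL_MIN:
        return "low"
    if count > TYPICAL_MAX:
        return "high"
    return "typical"


groups = {strain: lpsm_group(n) for strain, n in lpsm_counts.items()}
```

With these thresholds, counts of 29 and 31 fall in the "high" group, and WCFS1 falls in the "typical" group.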
50. Haft DH, Selengut J, Mongodin EF, Nelson KE. A guild of 45 CRISPR-associated (Cas) protein families and multiple CRISPR/Cas subtypes exist in prokaryotic genomes. PLoS Comput Biol. 2005 Nov;1(6):e60.
51. Barrangou R, Fremaux C, Deveau H, Richards M, Boyaval P, Moineau S, et al. CRISPR provides acquired resistance against viruses in prokaryotes. Science. 2007 Mar 23;315(5819):1709–12.
52. Jansen R, Embden JDAV, Gaastra W, Schouls LM. Identification of genes that are associated with DNA repeats in prokaryotes. Mol Microbiol. 2002 Mar;43(6):1565–75.
53. Westra ER, Brouns SJJ. The rise and fall of CRISPRs--dynamics of spacer acquisition and loss. Mol Microbiol. 2012 Sep;85(6):1021–5.
54. Douillard FP, Ribbera A, Kant R, Pietilä TE, Järvinen HM, Messing M, et al. Comparative Genomic and Functional Analysis of 100 Lactobacillus rhamnosus Strains and Their Comparison with Strain GG. PLoS Genet. 2013 Aug 15;9(8):e1003683.
55. Wels M, Bongers RS, Boekhorst J, Molenaar D, Sturme M, de Vos WM, et al. Large intergenic cruciform-like supermotifs in the Lactobacillus plantarum genome. J Bacteriol. 2009 May;191(10):3420–3.
56. Abriouel H, Benomar N, Pulido RP, Cañamero MM, Gálvez A. Annotated genome sequence of Lactobacillus pentosus MP-10, which has probiotic potential, from naturally fermented Aloreña green table olives. J Bacteriol. 2011 Sep;193(17):4559–60.