Naturally-occurring fusion between the regulatory and catalytic components of type IIP restriction-modification systems


The University of Toledo

The University of Toledo Digital Repository
Theses and Dissertations

2013

Naturally-occurring fusion between the regulatory
and catalytic components of type IIP restriction-modification systems
Jixiao Liang
The University of Toledo

Recommended Citation
Liang, Jixiao, "Naturally-occurring fusion between the regulatory and catalytic components of type IIP restriction-modification
systems" (2013). Theses and Dissertations. Paper 134.

This Thesis is brought to you for free and open access by The University of Toledo Digital Repository. It has been accepted for inclusion in Theses and
Dissertations by an authorized administrator of The University of Toledo Digital Repository. For more information, please see the repository's About
page.

A Thesis
entitled
Naturally-Occurring Fusion Between the Regulatory and Catalytic Components of Type
IIP Restriction-Modification Systems
by
Jixiao Liang
Submitted to the Graduate Faculty as partial fulfillment of the requirements for the
Master of Science Degree in Biomedical Sciences

_________________________________________

Dr. Robert Blumenthal, Committee Chair
_________________________________________
Dr. Steve Patrick, Committee Member
_________________________________________
Dr. Jason Huntley, Committee Member
_________________________________________
Dr. Patricia R. Komuniecki, Dean
College of Graduate Studies

The University of Toledo
December 2013

Copyright 2013, Jixiao Liang
This document is copyrighted material. Under copyright law, no parts of this document
may be reproduced without the expressed permission of the author.

An Abstract of
Naturally-Occurring Fusion Between the Regulatory and Catalytic Components of Type
IIP Restriction-Modification Systems
by
Jixiao Liang
Submitted to the Graduate Faculty as partial fulfillment of the requirements for the
Master of Science Degree in Biomedical Sciences
The University of Toledo
December 2013
Restriction-modification (R-M) systems play key roles in controlling gene flow
among bacteria and archaea, and their own genetic mobility depends critically on their
regulation, but the regulation of these systems is poorly understood. The PvuII R-M
system is a Type IIP R-M system in that the protective DNA methyltransferase (MTase)
is a separate and independently-active protein from the potentially lethal restriction
endonuclease (REase). PvuII is one of the best studied of the R-M systems that use a
positive feedback regulatory loop, involving a transcriptional regulator called C protein,
to delay expression of the REase relative to that of the MTase. This allows protective
methylation of a new host cell’s DNA before the REase is produced. In searching for
R-M systems related to PvuII, in order to study evolution and variation of its regulatory
system, a putative system was found in the genome sequence of the bacterium Niabella
soli strain DSM 19437, in which the regulatory C protein and the REase are
translationally fused. The hypothesis is that N. soli truly produces a fused C-R protein,
and that it is active both as a REase and as an autogenous regulator. The genes for the N.
soli R-M system were synthesized, the encoded proteins were produced and purified with
affinity tags, and the
production of full-length C-REase fusion protein was confirmed. The dual activity of the
fusion protein was determined by in vitro restriction of known DNAs, and in vivo
transcriptional activation of a lacZ fusion to the promoter on which the C protein acts.


This work is dedicated to my parents, Zhao-jun Liang and Gui-ying Xu for their love and
support.

Acknowledgements

This thesis and the associated research would not have been possible without the
ever-patient guidance of my mentor, Dr. Robert Blumenthal. I would like to express my
sincere gratitude to my major advisor Dr. Robert Blumenthal for his continuous support
of my graduate study and research, for his patience, encouragement, guidance and
support. He recognized my strengths and weaknesses, which kept me motivated. I am also
grateful for all his advice about life, career and everything else.

I would additionally like to thank my committee members, Dr. Jason Huntley and
Dr. Steve Patrick for their valuable time, constructive suggestions, and criticisms during
my study.

Further, for her constant support as an instructor in lab and a friend in life, I
would like to sincerely thank my lab mate Dr. Kristen Williams. Also, my friends Dr.
Guo-ping Ren and Dr. Gang Ren have offered me valuable advice and help on my
experiments. Last but not least, I would like to thank all the students, faculty, and staff in
the Medical Microbiology and Immunology Department. Thank you all!


Table of Contents

Abstract
Acknowledgements
Table of Contents
List of Figures
Chapter 1: Literature Review
Chapter 2: Materials and Methods
Chapter 3: Results
Chapter 4: Discussion and Conclusion
References


List of Figures

Figure 1. Complex formed by R.PvuII and its cognate DNA.
Figure 2. PvuII R-M system control region.
Figure 3. Structure of C.AhdI.
Figure 4. Sequence of synthesized NsoJS138I R-M system.
Figure 5. Vector map of constructed plasmids.
Figure 6. Alignment of CR fusion proteins orthologous to C.PvuII and R.PvuII.
Figure 7. Test of CR fusion protein production.
Figure 8. Test of CR fusion protein production.
Figure 9. Assessment of REase activity in CR.NsoJS138I.
Figure 10. Confirmation of specific digestion conditions.
Figure 11. Assessment of C activity in CR.NsoJS138I.
Figure 12. Possible interactions of C-REase fusion polypeptides.


Chapter 1

Literature Review

1. Restriction-modification (R-M) systems
The biological phenomena of restriction and modification were first recognized
in the early 1950s, and the first R-M system was cloned in E. coli in the late 1970s [1].
R-M systems are present in the great majority of bacteria and archaea, with more than 3000
being found to date (most by detecting MTase gene sequences) [2]. As the term indicates,
a typical R-M system comprises two activities: a restriction endonuclease (REase) that
cleaves DNA at a target sequence, and a methyltransferase (MTase) that modifies the
same sequence to protect it from the cognate REase [2]. Four broad types of R-M systems
have been reported so far, each with unique characteristics, and the two enzymes have
been combined into a single multi-subunit protein in some of the systems [3]. However, in
Type IIP R-M systems, the REase and MTase separately execute their opposing
intracellular enzymatic activities [3].

1.1 Restriction Endonuclease (REase)
The REase catalyzes the cleavage of double-stranded DNA, generally on both
strands. REases recognize specific sequences on the target DNA, and the cleavage occurs
via hydrolysis of one phosphate-deoxyribose bond in the backbone of each DNA strand
[4]. Typically, such enzymatic activity takes place without energy input, but commonly
requires Mg2+ or a similar divalent cation; some REases also require, or are stimulated by,
ATP or S-adenosylmethionine (AdoMet) [5]. REases appear to come from very different
backgrounds, and are difficult to identify from their sequences alone [6-8].

1.2 Modification Methyltransferase (MTase)
REase cleavage of DNA could be lethal to cells producing R-M systems. To
protect endogenous DNA from REase, the paired (cognate) MTase catalyzes addition of a
methyl group to one nucleotide in each strand of the recognition sequence, with the
identities and positions varying from MTase to MTase [9]. AdoMet always serves as the
methyl donor and is thus an essential cofactor for methylation [10]. The sensitivity of the
REase of R-M systems to methylation on the recognition sequences usually prevents
cleavage of endogenous DNA. However, while cleavage can be prevented by the cognate
methylation, noncognate methylation occurring elsewhere in the recognition sequence
may or may not prevent the cleavage [11].

1.3 Types of restriction modification systems
R-M systems are classified based on enzyme composition and cofactor
requirements, recognition sequence symmetry, and cleavage position [3, 12]. Because my
research defines a new subtype of R-M system, in which the REase and regulatory C
protein are fused, it is appropriate to describe the various known types of R-M systems.


1.3.1 Type I Systems
Type I systems are considered the most complex R-M systems, as they consist
of three polypeptides: R (Restriction), M (Modification) and S (Specificity). These form a
complex that can both cleave and methylate DNA in an energy (ATP) dependent manner.
About half of bacterial genomes contain closely linked genes that are predicted to
code for these three polypeptides, based on screening of the present database of complete
genomic sequences [13]. Furthermore, cleavage in most cases occurs at a considerable
distance from the recognition site, which makes it difficult to visualize discrete
bands by gel electrophoresis [14]. Thus these enzymes have substantial biological
significance, but have not yet found major biotechnological uses.

1.3.2 Type II Systems
Type II systems are believed to be the simplest and most prevalent R-M systems.
As opposed to type I systems, Type II REase and MTase act independently without the
need of a specificity protein, and each has its own simple catalytic requirement: REase
requires Mg2+ (or similar divalent cation) and MTase requires AdoMet [14]. Type IIP
REases are generally active as homodimers, while most Type
II MTases act as monomers when catalyzing the addition of methyl groups to the
cognate DNA [14, 15]. Early on it was recognized that while typical Type II enzymes
recognized palindromic sequences and cleaved symmetrically within them, the Type IIS
enzymes cut outside their normally asymmetric sequences and differed in other
interesting ways [16]. There are many subdivisions of Type II enzymes, classified based
on their recognition and cleavage differences [3]. Specifically, some of the criteria are
based on the sequence cleaved and others on the structure of the enzymes themselves, so
not all subdivisions are mutually exclusive [3]. Type IIP designates the enzymes that
recognize symmetric sequences (palindromes) [3]. Some new subclasses of Type II R-M
systems involve fusion of components, such as between the REase and MTase [17-20].
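
The symmetry criterion behind the Type IIP designation can be checked computationally: a site is palindromic when it equals its own reverse complement. The following is a minimal Python sketch under that definition. The function names are illustrative, the three enzyme sites are ones named in this thesis, and the last entry is a made-up asymmetric sequence included only for contrast.

```python
# Minimal sketch: test whether a recognition sequence is palindromic,
# i.e. identical to its own reverse complement (the Type IIP criterion).
# Function names are illustrative and not taken from any library.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(site: str) -> bool:
    """True if the site reads the same 5'->3' on both strands."""
    return site.upper() == reverse_complement(site)


SITES = {
    "PvuII": "CAGCTG",
    "BamHI": "GGATCC",
    "NcoI": "CCATGG",
    "asymmetric example (hypothetical)": "GAAGAC",
}

for name, site in SITES.items():
    print(f"{name:35s} {site}  palindromic: {is_palindromic(site)}")
```

Only unambiguous A/C/G/T sites are handled here; sites containing degenerate bases would need an extended complement table.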

1.3.3 Type III and Type IV Systems
Type III MTase and REase form a complex that carries out both modification and cleavage [21].
Similar to Type II systems, Mg2+ and AdoMet are essential cofactors for Type III REase
and MTase, respectively; and in the presence of such cofactors, a complex formed from
REase and MTase competes internally for modifying and restricting at the same DNA
position [22]. As a consequence, incomplete digestions are typical [14]. The Type IV
REases cleave only modified DNA, which may contain methylated, hydroxymethylated or
glucosyl-hydroxymethylated bases [3]. However, their recognition sequences have
usually not been well defined except for EcoK-McrBC, and cleavage occurs at ~30 bp
away from one of the sites [3]. The Escherichia coli McrBC enzyme, the best studied of
the Type IV REases and the only one that is commercially available, requires two purine
methylcytosine/hydroxymethylcytosine sites separated by 40–3000 base pairs for
cleavage [23].

2. Roles and Control of R-M systems
One major function of R-M systems is to protect bacterial cells from
bacteriophage infection or invasion by foreign DNA [24]. In addition to being bacterial
defense systems, R-M systems manifest themselves in a diverse range of functions such
as stabilization of genomic islands, maintenance of bacterial fitness and nutrition,
immigration control, recombination and genome rearrangement, evolution of genomes,
enforcing methylation on the genome and so forth [25].

Lethal DNA damage would occur if the two R-M enzymatic activities (MTase
and REase) were unbalanced [26]. This is particularly true when R-M genes first enter a
new host cell that has completely unmethylated DNA [27]. Therefore, a timing delay
between expression of the MTase and REase is expected to occur in Type
IIP systems, and this has been documented in PvuII [28]. Specifically, there is a
~10-min delay between the appearance of MTase and REase transcripts and activities
[28]. Documenting this delay improves our understanding of how R-M systems establish
themselves in new hosts, and thus of their mobility.

3. PvuII R-M system and its regulatory characteristics
3.1 Overview of PvuII
PvuII was discovered [29] and then cloned into E. coli from its original host
Proteus vulgaris about three decades ago [30]. Since then, it has been the subject of many
regulatory studies [31-34]. This system was also the first R-M system to have had both
the REase [35, 36] and MTase [37] structures crystallographically determined. Because
this study reports a REase-C protein fusion, it is important to discuss the structures of
those two components.


3.2 Structure and function of PvuII restriction endonuclease

Figure 1. Complex formed by R.PvuII and its cognate DNA. In this
view, the enzyme is in ribbons representation in purple, with the DNA strands in
green and cyan. The amino termini of the two REase subunits are at the right. The
image is structure 1EYU of the Protein Data Bank (managed by the Research
Collaboratory for Structural Bioinformatics). The image is in the public domain.

X-ray crystallography has shown that active PvuII endonuclease is a homodimeric
protein, with the subunit
interface region consisting of a pseudo three-helix bundle at the amino end [35]. Three
regions have been determined in R.PvuII, namely the subunit interface region, catalytic
region and DNA recognition region. The recognition sequence for R.PvuII cleavage is
CAG↓CTG, and such cleavage is prevented by N4-methylcytosine (yielding
CAG[N4mC]TG), generated by its cognate methyltransferase [27].
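
As an illustration of what the CAG↓CTG specificity implies for a digest, the following minimal Python sketch locates R.PvuII sites in a linear DNA and lists the expected fragment lengths. The test sequence is invented, the function names are illustrative, and the `protected` flag is a deliberate simplification that treats cognate methylation as all-or-none across every site.

```python
# Minimal sketch: locate R.PvuII sites (CAG^CTG, blunt cut after the G)
# in a linear DNA and list the resulting fragment lengths.
# The test sequence is invented for illustration only.

SITE = "CAGCTG"
CUT_OFFSET = 3  # cleavage occurs between CAG and CTG


def cut_positions(seq: str, site: str = SITE, offset: int = CUT_OFFSET) -> list:
    """Return 0-based cut coordinates for every occurrence of the site."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def fragment_lengths(seq: str, protected: bool = False) -> list:
    """Fragment sizes after complete digestion of a linear molecule.

    protected=True models DNA carrying the cognate N4-methylcytosine at
    every site, which is assumed here to block cleavage entirely.
    """
    cuts = [] if protected else cut_positions(seq)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


toy = "ATATCAGCTGTTTTTTCAGCTGAA"
print(cut_positions(toy))                     # [7, 19]
print(fragment_lengths(toy))                  # [7, 12, 5]
print(fragment_lengths(toy, protected=True))  # [24]
```

Because the site is palindromic, scanning one strand is sufficient; a circular substrate such as a plasmid would need the first and last fragments joined.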

3.3 C protein and its regulatory roles
3.3.1 Overview
In addition to the MTase and REase genes, a subset of Type II R-M systems
contains regulatory genes. The regulatory C (controller) gene was first discovered in the
PvuII [38, 39] and BamHI [40] R-M systems. A milestone in characterizing the PvuII
system was the identification of regulatory elements called “C-boxes” between the pvuIIM
and pvuIIR genes, which exert temporal control over the expression of REase and MTase [28,
39, 41]. C-boxes are where the C protein binds to exert its effects [31]. While the location
of the MTase gene varies among R-M systems, in those that have C proteins the C gene is
typically upstream of the REase gene [31].

Figure 2. PvuII R-M system control region. Two transcription starts for
pvuIICR are identified by rightward bent arrows: from the C-independent weak
promoter (left) and C-dependent strong promoter (right) [38]. The two pvuIIM
promoters are also shown (leftward bent arrows). Gray wavy lines represent the
resulting mRNAs.

3.3.2 C protein-dependent regulatory circuit in PvuII
C proteins (encoded by C genes), where tested, activate transcription of their own
genes (‘autogenous’ activation). They are believed to be responsible for the delay in
REase activity, since the REase gene typically does not have its own promoter [33] and is
completely dependent on transcription from the upstream autogenously regulated C gene
[42]. Thus when the R-M genes enter a new cell, and no C protein is present, MTase is
expressed while C protein (and REase) are initially produced at very low levels. As C
protein accumulates, the positive feedback loop results in a sharp increase in C and
REase expression [33, 34]. The C protein acts as both an activator and a repressor, so it
can prevent overexpression of the REase [43].
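
The qualitative behavior of this circuit can be illustrated with a toy simulation. The sketch below is not a fitted model of PvuII: every rate constant is hypothetical and in arbitrary units, REase is assumed to follow the same curve as C because both are transcribed from the same promoter, and the repressor role of C at high concentrations is left out. It only shows that weak basal expression plus cooperative self-activation delays the C/REase rise relative to a constitutively expressed MTase.

```python
# Toy model of the MTase-before-REase timing delay produced by
# autogenous (positive-feedback) activation of the C/REase operon.
# Every parameter value is hypothetical and in arbitrary units.

def simulate(t_end=200.0, dt=0.01):
    """Forward-Euler integration; returns (times, mtase, c_and_rease)."""
    k_m = 1.0      # MTase synthesis from its own C-independent promoters
    basal = 0.01   # weak C-independent promoter of the C/REase operon
    v_max = 1.0    # strong C-dependent promoter at full activation
    k_half = 0.5   # C level giving half-maximal activation
    hill = 2       # cooperative response, since C acts as a dimer
    decay = 0.1    # first-order loss (dilution plus degradation)

    m = c = 0.0    # a naive host cell contains neither protein
    times, mtase, c_and_rease = [0.0], [m], [c]
    for step in range(1, int(round(t_end / dt)) + 1):
        activation = c**hill / (k_half**hill + c**hill)
        dm = k_m - decay * m
        dc = basal + v_max * activation - decay * c
        m += dm * dt
        c += dc * dt
        times.append(step * dt)
        mtase.append(m)
        c_and_rease.append(c)
    return times, mtase, c_and_rease


def time_to_half_final(times, values):
    """First time at which a rising curve reaches half of its final value."""
    target = 0.5 * values[-1]
    for t, v in zip(times, values):
        if v >= target:
            return t
    raise ValueError("curve never reaches half of its final value")


times, mtase, c_and_rease = simulate()
t_m = time_to_half_final(times, mtase)
t_r = time_to_half_final(times, c_and_rease)
print(f"MTase half-maximal at t = {t_m:.1f}")
print(f"C/REase half-maximal at t = {t_r:.1f}")
print(f"delay = {t_r - t_m:.1f} (arbitrary time units)")
```

With these made-up values the C/REase curve stays near zero while MTase accumulates, then rises sharply once C approaches the activation threshold; the size of the delay depends entirely on the chosen parameters.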


3.3.3 C protein structure
Figure 3. Structure of C.AhdI [44]. In this view, the dimeric protein is in ribbons
representation. The image is structure 1Y7Y of the Protein Data Bank (managed by the
Research Collaboratory for Structural Bioinformatics). The image is in the public domain.

Studies in Type II R-M systems have indicated that C proteins are active only
as homodimers [44, 45]. Dimerization of C proteins is required for
DNA binding and, considering the relatively low stability of the dimer itself, this appears
to be an important component of the genetic switch that delays transcription of the C
gene, and consequently that of the endonuclease (R) gene transcribed from the same
promoter [46]. The regulatory C protein of another R-M system named AhdI has been
crystallized [44], and a high-resolution crystal structure of C.AhdI was described two
years later by the same group of scientists [47]. The high-resolution structure of C.AhdI
reveals a compact, single-domain homodimer and can be classified as an all-alpha
protein: 65% of the residues are in a helical conformation with no beta-sheet present [44]
(Figure 3).


4. CR fusion protein in Type II R-M systems
The PvuII R-M system is one of the best studied of the group that uses a positive
feedback regulatory loop to delay restriction endonuclease (REase) expression with
respect to DNA methyltransferase (MTase) expression [43], allowing protective
methylation of a new host cell’s DNA before the REase is produced. To better understand
the variation in and evolution of this regulatory system, I searched for other R-M systems
closely related to PvuII. This work is described under Results; notably, a group of related
systems had naturally-occurring fusions between the C and REase proteins. I provide
here some background on the considerations underlying my studies on one of these fused
systems. Gene fusion is a major contributor to the evolution of multi-domain bacterial
proteins, and typically results in one long composite protein in one organism in place of
two or more smaller split proteins in another organism [48, 49].

4.1 Identification of the CR fused Type II R-M systems
To search for R-M systems closely related to PvuII, the REase (R.PvuII) amino
acid sequence was used as the search seed in TBLASTN [61]. This was done because the
C proteins are fairly well conserved [31, 33, 50], and the MTase proteins have
well-conserved motifs [24, 51], so using them as search seeds would likely give a higher
background of unrelated R-M systems. In contrast, the generally poor conservation of
REases implies that only closely related R-M systems would have similar REase
sequences. One fused polypeptide with portions similar to both C.PvuII and R.PvuII was
found in the bacterium Niabella soli.
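
TBLASTN compares a protein query against nucleotide sequences translated in all six reading frames, which is why a fused open reading frame can be found without relying on existing gene annotation. The sketch below illustrates only that six-frame translation step in Python; it is not BLAST, and the exact substring match stands in for BLAST's scored alignment. The query peptide MNEPNA is the translation of the first six codons present in the full-length cloning primer listed in Chapter 2, while the two leading G bases in the toy DNA are an invented flank.

```python
# Minimal sketch of the six-frame translation step that underlies a
# TBLASTN-style search (protein query vs. translated nucleotide database).
# This is not BLAST itself; the toy DNA flank is invented for illustration.

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(seq: str) -> str:
    """Translate from the first base, ignoring any trailing partial codon."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frames(dna: str) -> dict:
    """Return translations of the three forward and three reverse frames."""
    dna = dna.upper()
    rev = dna.translate(COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rev[offset:])
    return frames


def frames_containing(query: str, dna: str) -> list:
    """Frames whose translation contains the query as an exact substring."""
    return [name for name, prot in six_frames(dna).items() if query in prot]


toy_dna = "GG" + "ATGAACGAACCAAATGCT"  # invented 2-nt flank + primer-derived codons
for name, protein in six_frames(toy_dna).items():
    print(name, protein)
print(frames_containing("MNEPNA", toy_dna))  # ['+3']
```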


4.1.1 Overview of Niabella soli
The genus Niabella was proposed by Kim et al. (2007) [52] for a bacterium
isolated from soil. This genus was characterized as comprising Gram-negative, aerobic,
non-flagellated, flexirubin-pigment-producing bacteria that form short rods. Shortly after that,
a dark yellow-colored bacterium, JS13-8T, was isolated from a soil sample from Jeju
Island, Republic of Korea [53]. The cells were aerobic, Gram-negative, non-motile, short
rods. Growth occurred at 15–35 °C (optimally at 30 °C). On the basis of the phylogenetic,
physiological and chemotaxonomic data, strain JS13-8T was deemed to represent a novel
species of the genus Niabella, for which the name Niabella soli sp. nov. was proposed
[53]. Subsequent to our discovery of a fused system in N. soli, it was also detected by an
automated sequence search by the curators of REBASE [2], which is a continuously
updated R-M system database. We have adopted their name, NsoJS138I, for
this system, following their entry on April 10, 2013. They performed no biochemical
characterization of the R-M system.

4.1.2 Translational frameshifting as a possible mechanism for production of free C
protein in such fused systems
C.NsoJS138I and R.NsoJS138I are clearly fused at the sequence level, as
described in Results. However, it is possible that a certain amount of free NsoJS138I C or
REase protein is produced via translational frameshifting or post-translational processing.
Post-translational processing could involve proteolytic cleavage that yields free C and
free REase polypeptides. Alternatively, free C protein (but not free REase) could result
from ribosomal frameshifting during translation, which can occur when a ribosome
encounters certain sequence patterns in the mRNA [54]. Translational frameshifting
represents an alternative process of protein translation [55], and occurs much more
frequently than was originally expected [56]. For instance, a study of ribosomal
frameshifting on the sequence GCAAAA has shown that this pattern is associated with
efficient -1 ribosomal frameshifting in Escherichia coli [57].
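
To make the consequence of a -1 shift concrete, the following minimal Python sketch scans a coding sequence for the GCAAAA motif mentioned above and shows how the codons read downstream change after the ribosome slips back one nucleotide. The toy sequence is invented, the function names are illustrative, and placing the slip at the codon boundary immediately after the motif is an assumption made for illustration rather than a claim about where the ribosome actually shifts.

```python
# Minimal sketch: find candidate slippery motifs in an mRNA-like sequence
# and show how a -1 frameshift changes the codons read downstream.
# The motif comes from the text; the toy sequence is invented.

MOTIF = "GCAAAA"


def find_motifs(seq: str, motif: str = MOTIF) -> list:
    """Return (0-based index, codon phase) for each motif occurrence."""
    seq = seq.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append((start, start % 3))
        start = seq.find(motif, start + 1)
    return hits


def codons(seq: str, start: int = 0) -> list:
    """Split seq into complete triplets beginning at index 'start'."""
    return [seq[i:i + 3] for i in range(start, len(seq) - 2, 3)]


def codons_with_minus1_shift(seq: str, shift_at: int) -> list:
    """Read in frame 0 up to shift_at, then slip back one nucleotide.

    shift_at must be a frame-0 codon boundary (a positive multiple of 3).
    """
    if shift_at < 3 or shift_at % 3 != 0:
        raise ValueError("shift_at must be a positive multiple of 3")
    return codons(seq[:shift_at]) + codons(seq, shift_at - 1)


toy = "ATGGCAAAAGGTGACCGT"
print(find_motifs(toy))                  # [(3, 0)]
print(codons(toy))                       # frame 0: no stop codon
print(codons_with_minus1_shift(toy, 9))  # shifted frame reaches TGA
```

In this toy case the unshifted frame contains no stop codon, whereas the -1 frame meets TGA two codons after the slip, which is the kind of early termination that could in principle release a free upstream polypeptide.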

4.1.3 Novel demonstration of CR fusions in Type II R-M systems
Natural and synthetic fusions of the REase and MTase polypeptides have been
observed, and found to be active [17-20]. However, this thesis focuses on
naturally-occurring fusions between the REase and the regulatory C protein. Such fusions have been
predicted by automated annotation systems, such as REBASE, but have never
been tested and shown to be active for either the REase or the C protein components.


Chapter 2

Materials and Methods

Gene synthesis
The sequence containing the complete R-M system of Niabella soli (1837 nt, from
NCBI database; GenBank accession # NZ_AGSA01000028) was obtained from
Genscript Inc. (Piscataway, NJ). Some modifications were made to optimize the distribution of
restriction sites, but without changing the specified amino acids (Figure 4). The inferred
NsoJS138I C-Box and promoter region (161 nt) was also obtained from Genscript, and for
cloning purposes the restriction sites XmaI (at C gene end) and BamHI (at R gene end)
were appropriately placed.

Cloning strategy
The R-M system Mru1279I (~2.4 kbp) was cloned into the high-copy vector
pUC19, using NruI (at CR gene end) and BamHI sites (at M gene end). Genscript
synthesized the complete NsoJS138I system, but could only clone it into a low-copy
number vector pCC1 (they normally use higher-copy pUC57). This presumably resulted
from a frameshift error in the MTase gene that was due to an error in the requested
sequence. To avoid the apparent toxicity, a truncated version was subcloned, consisting
of only the fused CR gene of NsoJS138I and missing a portion of the COOH-end of the
REase (so the MTase would not be required). The truncated NsoJS138ICR was cloned
into the pACYCDuet-1 vector (Novagen®), with the N-terminus (C protein end) in-frame
with the His-tag (using BamHI and SalI sites), and preceded by a T7 promoter. This
plasmid, pJL100, is referred to for readability as “pNsoShort”. Full length NsoJS138ICR
was also cloned into this vector, by transforming an E. coli strain containing the
pre-expressed PvuII MTase [58], with the NsoJS138ICR COOH-terminus (REase end)
in-frame with the His-tag (using the NcoI site); this plasmid was named pJL200 (“pNso”). The truncated
product would be ~1.5 kDa less than the full length one.

The synthesized NsoJS138I “C-Box” region was digested with BamHI and XmaI
and ligated into pBH403, which is a derivative of pKK232-8 and contains a promoterless
lacZ gene between two bidirectional transcription terminators [59], yielding pJL300
(“pBoxLac”). These plasmids are illustrated in Figure 5. The oligonucleotide primers
used for PCR amplification are shown below (all in the 5′→3′ direction).

Primer set for cloning the complete Mru1279I R-M system:
ggtTCGCGActtccgggtctacacctcaa; ggtGGATCCagccctaaccagccgtaaat

Primer set for making the truncated NsoJS138ICR PCR product for pJL100:
aatGTCGACttatttgggattattaatatccttatcac; aatGGATCCgatgaacgaaccaaatgc


Primer set for making the full length NsoJS138ICR PCR product for pJL200:
cgtCCATGGacaaaagtcttatgccat; cgtCCATGGatgaacgaaccaaatgctta
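
The upper-case portion of each primer marks the added restriction site. As a quick consistency check, the following Python sketch extracts that portion from each primer listed above and matches it against the standard recognition sequences of the enzymes named in the cloning strategy (NruI, BamHI, SalI and NcoI). The primer sequences are copied from the lists above; the dictionary keys and helper names are illustrative labels, not names used in the thesis.

```python
# Minimal sketch: confirm that the upper-case portion of each cloning
# primer corresponds to a restriction site named in the cloning strategy.
# Primer sequences are copied from the lists above; the labels and
# helper names are illustrative.

import re

ENZYME_SITES = {
    "NruI": "TCGCGA",
    "BamHI": "GGATCC",
    "SalI": "GTCGAC",
    "NcoI": "CCATGG",
}

PRIMERS = {
    "Mru1279I primer 1": "ggtTCGCGActtccgggtctacacctcaa",
    "Mru1279I primer 2": "ggtGGATCCagccctaaccagccgtaaat",
    "pJL100 primer 1": "aatGTCGACttatttgggattattaatatccttatcac",
    "pJL100 primer 2": "aatGGATCCgatgaacgaaccaaatgc",
    "pJL200 primer 1": "cgtCCATGGacaaaagtcttatgccat",
    "pJL200 primer 2": "cgtCCATGGatgaacgaaccaaatgctta",
}


def embedded_site(primer: str):
    """Return the first run of upper-case bases in the primer, if any."""
    match = re.search(r"[ACGT]+", primer)
    return match.group(0) if match else None


def enzyme_for(site):
    """Look up which listed enzyme recognizes exactly this site."""
    for enzyme, recognition in ENZYME_SITES.items():
        if recognition == site:
            return enzyme
    return None


for label, primer in PRIMERS.items():
    site = embedded_site(primer)
    print(f"{label:18s} {site}  {enzyme_for(site)}")
```

The GTCGAC site in the first pJL100 primer corresponds to SalI, and both pJL200 primers carry NcoI sites, in agreement with the single-site cloning described for pJL200.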


Figure 4. Sequence of synthesized NsoJS138I R-M system. The initiators of the
CR and M genes are in green. The red arrow near the top indicates the position at which
the C-REase gene is interrupted in the truncated clone (pJL100, pNsoShort).
