Operators and Enhancers/Silencers

Wikipedia has two images, of a eukaryotic gene and of a prokaryotic gene. They show the difference that the prokaryotic gene also has an operator while the eukaryotic gene does not. Both also have enhancers and silencers separately. I thought enhancers and silencers were types of operators. Is this wrong?


Enhancers and silencers are binding sequences for transcriptional activators and repressors, respectively, and they are often located some distance upstream or downstream of the gene they regulate. See regulation of transcription for information about how these interact with their target genes (through DNA bending, mediator, etc.). Note, though, that enhancers and silencers are not necessarily required for transcription: they help the gene attain robust up- or down-regulation of transcription.

The operator, on the other hand, determines whether the promoter does something or nothing. In an inducible system, the repressor is bound to the operator, blocking action to/from the promoter; an inducer binds the repressor and releases it from the operator, which frees the promoter. In a repressible system, the repressor needs a corepressor in order to bind the operator and abrogate transcription by blocking action to/from the promoter.
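The on/off logic of the two operator systems can be written as a pair of Boolean rules. This is only a minimal sketch: the function names are made up for illustration, and it treats operator occupancy as all-or-nothing, which real operons are not.

```python
def inducible_operon_on(repressor_present: bool, inducer_present: bool) -> bool:
    """Inducible system: the repressor sits on the operator unless an inducer binds it."""
    repressor_on_operator = repressor_present and not inducer_present
    return not repressor_on_operator


def repressible_operon_on(repressor_present: bool, corepressor_present: bool) -> bool:
    """Repressible system: the repressor binds the operator only with its corepressor."""
    repressor_on_operator = repressor_present and corepressor_present
    return not repressor_on_operator


if __name__ == "__main__":
    # The repressor protein is present in both systems; only the small molecule varies.
    for small_molecule in (False, True):
        print(
            f"small molecule present={small_molecule}: "
            f"inducible on={inducible_operon_on(True, small_molecule)}, "
            f"repressible on={repressible_operon_on(True, small_molecule)}"
        )
```

The same small molecule therefore has opposite effects in the two systems: it switches an inducible operon on and a repressible operon off.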


9.7: Footprinting

  • Contributed by John W. Kimball
  • Professor (retired) at Tufts University & Harvard

Footprinting is a method for determining the exact DNA sequence to which a particular DNA-binding protein binds. Examples:

  • hormone-receptor complexes that bind to their hormone response elements
  • transcription factors that bind eukaryotic operators, enhancers, and silencers
  • the lac repressor that shuts down the lac operon in E. coli

Procedure, taking the lac repressor as the example:

  • Clone a piece of DNA that contains the operator site to which the repressor binds.
  • Label one end of the DNA molecules with a radioactive molecule, e.g. radioactive ATP.
  • Digest the DNA with DNase I.
    • DNase I cuts DNA molecules randomly (in contrast to restriction enzymes, which cut only where they find a particular sequence).
    • Choose conditions gentle enough that most molecules will be cut only once.
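The digestion step can be illustrated with a small simulation. The sketch below rests on stated assumptions: each molecule is cut exactly once, cuts are uniformly random, and a bound protein completely shields a range of cut positions. The function names, the 60-bp length and the protected range are all hypothetical. Comparing the labelled fragment lengths obtained with and without protein reveals the protected stretch as a gap, the "footprint".

```python
import random


def labelled_fragment_lengths(dna_length, protected=None, n_molecules=5000, seed=0):
    """Return the set of end-labelled fragment lengths after one random cut per molecule.

    A cut after base i (1 <= i < dna_length) leaves a labelled fragment of length i.
    `protected` is an optional (start, end) range of cut positions shielded by the
    bound protein; positions start..end-1 cannot be cut.
    """
    rng = random.Random(seed)
    allowed = [
        i for i in range(1, dna_length)
        if protected is None or not (protected[0] <= i < protected[1])
    ]
    return {rng.choice(allowed) for _ in range(n_molecules)}


def footprint(dna_length, protected, **kwargs):
    """Fragment lengths seen with naked DNA but missing when the protein is bound."""
    naked = labelled_fragment_lengths(dna_length, None, **kwargs)
    bound = labelled_fragment_lengths(dna_length, protected, **kwargs)
    return sorted(naked - bound)


if __name__ == "__main__":
    # Hypothetical 60-bp fragment with the protein shielding cut positions 25..39.
    print(footprint(60, (25, 40)))
```

With enough molecules every unprotected position is cut at least once, so the only missing fragment lengths are those that would end inside the protected region.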

    Introduction

    The complex linear organisation [1] of many metazoan genomes encodes regulatory sequences that can be categorised into two major groups: enhancers and silencers. Enhancers are short motifs that contain binding sites for transcription factors; they activate their target genes without regard to orientation and often over great separations in cis or in trans [2]. Silencers suppress gene expression [3] and/or confine it within specific chromatin boundaries (and thus are also called 'insulators') [4]. The interplay between these contrasting regulatory elements, their target promoters and epigenetic modifications at all levels of three-dimensional organisation (that is, nucleosomes, chromatin fibres, loops, rosettes, chromosomes and chromosome location) [5–9] fine-tunes expression during development and differentiation. However, the mechanisms involved in this interplay remain elusive, although some can be computationally predicted [10]. Although enhancers and silencers have apparently opposite effects, accumulating evidence suggests they share more properties than intuition would suggest [11]. Herein we try to reconcile their apparently disparate modes of action. We suggest they act by tethering their target promoters close to, or distant from, hot spots of nucleoplasmic transcription (known as 'transcription factories') as they produce noncoding transcripts (ncRNAs) [12–15].

    Enhancers

    Enhancers were characterised almost 30 years ago [16], but their functional definitions vary because of their flexibility of action (whether in cis or in trans) [17, 18], position (relative orientation and/or distance) and genomic location (in gene deserts, introns and/or untranslated regions) [2]. Although sequence conservation between species can, in some cases, be an efficient predictor of enhancer identity, there are examples where genes with identical expression patterns in different species rely on enhancers that bear no similarities [19]. Within a single genome, however, sensitivity to DNase I and characteristic modifications of histone tails provide a more reliable means of identification. They typically occupy approximately 200 bp of 'open' chromatin (making them DNase-sensitive) [20], are flanked by regions rich in mono- and/or dimethylated lysine 4 of histone H3 (H3K4me1/H3K4me2) and acetylated lysine 27 of histone H3 (H3K27ac) and, generally, bind p300 [21]. Attempts have been made to classify enhancers into subclasses that are differentially used during development. Comparison between mouse embryonic stem (ES) cells, their differentiated derivatives and terminally differentiated murine cells allows distinctions between 'active', 'intermediate' and 'poised' enhancers (here additional marks are used, for example, H3K27me3 or H3K36me3) [21]. These accessible DNA stretches are often bound (and can thus be identified) by acetyltransferase p300, Mediator subunits, chromodomain helicase DNA binding protein 7, cohesin and/or CCCTC-binding factor (CTCF) [21, 22]. Most importantly, canonical enhancers are characterised by the presence of bound RNA polymerase II (RNAPII) [23, 24].

    The first and most studied example of gene regulation by an enhancer is provided by the β-globin locus; here, the locus control region (LCR) is located 40 to 60 kb upstream from the promoter it regulates. The two interact when the chromatin fibre forms new, or rearranges preexisting, loops [17, 25]. All other cis-regulatory elements in this locus are also in close proximity, where they form an 'active chromatin hub' [12, 26]. An active chromatin hub, as defined in the β-globin locus paradigm, arises from the three-dimensional clustering of DNA-hypersensitive sites, depends on specific DNA-protein interactions and brings together all essential components for transcriptional activation [17]. Similarly, in a comprehensive study of the immunoglobulin heavy-chain locus [6] (and many other loci), the multitude of preexisting loops and connecting regulatory elements are rearranged to form new ones that interact upon activation. Obviously, most of these conformations (and in fact most seen using chromosome conformation capture (3C)) concern a population of cells and will not be refined until single-cell 3C is developed and implemented.

    Enhancers are transcribed into RNAs (eRNAs) that do not encode proteins, run the length of the enhancer sequence and appear to stabilise enhancer-promoter interactions [11, 24, 27–29]. eRNAs derived from elements upstream of the Arc promoter depend on the activity of that promoter, as removing the promoter abolishes eRNA production [28]. By contrast, β-globin-associated ncRNAs are still produced in the absence of the β-globin promoter [28, 30, 31]. However, the rate at which eRNAs are turned over, the exact mechanism by which they function and their abundance (relative to the mRNAs they regulate) all remain to be determined.

    An additional class of ncRNAs longer than 200 nucleotides (long intergenic ncRNAs (lincRNAs)) was found in a survey of human transcripts, and some exhibited enhancer function [27]. In different human cells, more than 3,000 lincRNAs have now been identified [32, 33]. Some seem essential for the activation of the thymidine kinase promoter, as well as for the expression of neighbouring protein-coding genes (although not all act as bona fide enhancers) [34]. For example, HOTTIP (a lincRNA transcribed from the 5' end of the HOXA locus) coordinates the activation of several HOXA genes: chromatin looping brings HOTTIP close to its targets, and this drives H3K4 trimethylation and transcription [35].

    Silencers

    At the opposite functional extreme lie silencers. They prevent gene expression during differentiation and progression through the cell cycle [36]. This again correlates with RNA production (in some cases, through the generation of RNA duplexes that underlie the methylation of DNA at the promoter [37, 38]).

    Accumulating evidence supports a broad and general role of both long and short RNA molecules in transcriptional inhibition. Antigene RNAs (agRNAs) are small RNAs that target promoters and downstream regions [37]. The expression of genes encoding the progesterone receptor, the low-density lipoprotein receptor, the androgen receptor, cyclooxygenase-2, the major vault protein and huntingtin is inhibited by agRNAs [37, 39]. Similarly, miRNAs, which are 20 to 22 nucleotides long, regulate gene expression post-transcriptionally [40], and they may also act at the level of transcriptional initiation or elongation. This is now supported by deep sequencing of nuclear and cytoplasmic small RNA libraries, where the majority of mature miRNAs localise in the nucleus (and not only in the cytoplasm) [41]. For instance, introduction of miRNA mimics that target the progesterone receptor gene promoter decreases RNAPII occupancy. It also increases H3K9me2 levels in an Argonaute 2 (Ago-2)-dependent manner and leads to gene silencing [42]. Note that mature miRNAs in the nucleus can also act as 'enhancers' [43].

    Polycomb complexes PRC1 and PRC2 rely on noncoding transcripts from silencing elements for recruitment to target sites. A range of examples are available: for instance, repression in cis in CD4+ T cells and ES cells (where PRC2-catalysed H3K27 trimethylation recruits PRC1 to prevent chromatin remodelling of targeted loci [44]) and the PRC2-HOTAIR interaction (where transcripts produced from the HOXC locus establish repression of HOXD [33]). In human breast cancer cells, overexpression of HOTAIR results in the promiscuous association of PRC2 with more than 850 targets, which are in turn silenced [45]. Furthermore, in the well-studied cascade of X chromosome inactivation, the ncRNA Xist binds PRC2, which in turn drives H3K27 trimethylation [46, 47] and propagation of PRC1's binding to multiple sites along the silenced allele [48]. Here the three-dimensional conformation is also critical for efficient silencing and results in chromatin compaction and/or rearrangement [49]. Such equilibria may, however, be shifted by the eviction of Polycomb proteins to restore an active state [47].

    Insulators

    Functionally autonomous domains are strung along the chromatin fibre, and these need to be insulated from their neighbours to prevent the action of irrelevant enhancers and silencers. Insulator or boundary elements perform this task. These can be further categorised as enhancer blockers (when the insulator is located between a promoter and a cognate enhancer) and barriers (when located between a promoter and a silencer) [50]. Mutating or deleting insulators alters the pattern of gene expression and leads to developmental defects [51].

    It has been suggested that insulators evolved from a class of promoters binding a specific subset of transcription factors that drive chromatin remodelling and long-range interactions [11]. Many are marked by DNase I hypersensitivity [52] and/or the presence of bound RNAPII. Specifically, in the Drosophila Hox gene cluster, stalled polymerases, in conjunction with elongation factors DSIF and NELF, insulate four of eight promoters from Hox enhancers, and this correlates with the rearrangement and/or de novo formation of chromatin loops [53].

    Perhaps the most abundant protein associated with insulator activity is CTCF. In the well-studied example of the Igf2-H19 imprinted locus, CTCF prevents activation of the maternal Igf2 allele by a distal enhancer. When its cognate binding site is lost, the gene is reactivated [54]. However, in this locus, CTCF is a positive regulator of the H19 gene [45]. Moreover, CTCF mediates enhancer-promoter, insulator-insulator and insulator-promoter interactions [11]. The insulator function of CTCF is regulated by cohesins [55, 56]. Their respective binding sites coincide in various cell types and at many loci, including the IL-3 and granulocyte-macrophage colony-stimulating factor loci [57], as well as the renin, ETNK2 [58], CFTR [50, 52] and c-Myc genes [59].

    However, the CTCF-cohesin duplet is characteristic of only one type of insulator or boundary. In a comprehensive mapping of such Drosophila elements, additional factors, such as boundary element associated factor, GAGA and CP190, were used to identify and classify domain boundaries [60]. Again, DNase I hypersensitivity characterises many of these elements, and examples exist where their function is Ago-2-dependent (and so transcription-dependent, but RNAi-independent) [61].

    A model

    The following four models have been proposed to describe gene regulation by enhancers (Figure 1). (1) According to the tracking model, a protein loads onto the enhancer and tracks along the chromatin fibre towards the promoter, where it stimulates transcription [62]. (2) The linking model is similar, but here the loaded protein drives polymerisation of proteins in the direction of the promoter [63]. (3) In the relocation model, a given gene relocates to compartments in the nucleus where enhancer-promoter interactions (and so transcription) are favoured [64, 65]. (4) The looping model (which shares features with the relocation model) predicts a direct contact between an enhancer and a relevant promoter that loops out the intervening DNA [12, 65, 66] and thus is closely linked to the three-dimensional genome architecture [1, 7, 65]. Next, activators bound to the enhancer interact with the mediator complex, which recruits RNAPII and general transcription factors to the promoter [34, 67]. This last model is now favoured, as it readily explains enhancer-promoter interactions in trans [18, 68] and is supported by a wealth of experimental data derived from 3C [69] and modelling [1, 6–10, 15].

    Figure 1. Existing models for the function of enhancers. The four existing models describing gene regulation by enhancers are depicted. (A) The tracking model, where a transcription factor (purple hexagon) loads onto the enhancer and tracks along the chromatin fibre towards the promoter, where it stimulates transcription by association with the polymerase (pink oval). (B) The linking model, where the loaded transcription factor drives polymerisation of proteins in the direction of the promoter. (C) The relocation model, where a gene relocates to nuclear subcompartments (pink halo) favouring enhancer-promoter interactions, and so transcription. (D) The looping model, where the enhancer comes into proximity with the relevant promoter due to protein-protein interactions. This loops out the intervening chromatin and triggers transcriptional activation.

    Similarly, among the three major models proposed for insulator function (roadblock, sink/decoy and topological loop models), the topological loop model is best supported by experimental data: Rearrangement and/or de novo formation of appropriately oriented loops efficiently insulate promoters from enhancer elements [70]. Note also that recent data show how gene repression dependent on gypsy insulators in Drosophila propagates between distant loci to be repressed via the organisation of local loops [71].

    Gene regulation from distal regulatory elements via local looping or broader rearrangements in three-dimensional organisation is now widely accepted. For example, we have seen that the β-globin LCR loops back to its target promoter to activate it [17] through an active chromatin hub [12, 26], whereas Gata-1 represses the Kit gene locus via specific loop formation, and exchange with Gata-2 reforms the enhancer-promoter loop and reactivates expression [72]. The IgH locus is another example of how this might occur, because its approximately 2.7-Mbp region is reorganised spatially during activation [6]. Similarly, various transcription factors have been implicated in forming regulatory chromatin loops, including EKLF [26], Gata-1, Gata-2 and Gata-3 [72], CTCF [73, 74], Ldb1 [75] and cohesin [56, 76]. Knocking them out or down results in loss of looping and changes in transcriptional state [26, 77, 78].

    On a broader scale, the genome is organised nonrandomly in three-dimensional space [1, 6–10, 15] as a result of a variety of chromatin loops and rosettes [15, 64, 79], and the idea that transcription is also architecturally organised is gradually gaining ground [13–15]. It has been proposed that the transcription of protein-coding genes occurs in nucleoplasmic hot spots (that is, transcription factories) where a high local concentration of the required molecular machinery renders the whole process more efficient [14, 15]. By definition, these harbour at least two RNA polymerases, each transcribing a different template. The β-globin active chromatin hub can be classified as a transcription factory, as it contains at least two polymerases: one transcribing the enhancer and another transcribing a protein-coding gene. Not only do active genes tend to colocalise in the nucleus to be transcribed [80, 81], but different types of genes seem to cluster in 'specialised' transcription factories, where they are coregulated and expressed. For example, RNAPII genes are transcribed in separate factories from RNAPIII genes, whereas erythropoietic genes and TNFα-responsive genes are copied at sites distinct from those of constitutive and/or nonresponsive ones [75, 82–88]. Although factories with different polymerising activities can now be isolated and their proteins characterised using mass spectrometry [89], the mechanism by which factories are 'marked' by specific transcription factors and the relative representation of different subtypes of factories remain undetermined.

    How can these ideas be extended to explain the function of enhancers and silencers and/or insulators? As we have established, all share common features (for example, DNase I hypersensitivity, active chromatin marks and interaction with transcription factors and RNAPII); therefore, we propose that canonical regulatory elements are primarily transcription units (Table 1) and that, in order for them to be functional, they need to be transcribed (and so associated with a transcription factory). This hypothesis defines two key aspects of chromatin structure: proximity between distant DNA sequences due to looping and tethering of active genes to a factory.

    Does the number of factories in a given cell suffice to accommodate all transcription units, including enhancers and/or silencers? To date, the lowest estimate of about 200 factories concerns murine primary cells and comes from RNAPII immunostaining ex vivo [81]. This suggests that about 80 transcription units would share a factory (assuming 16,000 active transcription units, as in HeLa cells) [86] or that a number of them are transcribed outside a factory. Other approaches in HeLa cells return a number that is an order of magnitude higher: approximately 2,000 factories, each hosting an average of 8 transcription units [97, 98]. Moreover, the density and diameter of these transcriptional hot spots appear to be constant between cell types, suggesting an underlying topology accessible to transcription units in different nuclear neighbourhoods [86, 99]. The difference between these numbers may be explained by a difference in sensitivity of detection [86, 98]. But does most transcription occur in factories? It seems it may, as some estimates indicate that more than 95% of nascent nucleoplasmic RNA is found in factories (assessed using incorporation of various precursors in a variety of cell types) [13, 97–99]. Nonetheless, these issues will probably be resolved only by imaging factories in different types of living cells.
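The occupancy figures quoted above follow from a simple division, which the short sketch below reproduces. The constant and function names are illustrative, and the calculation assumes, as the text does, that all 16,000 active units are spread evenly across factories.

```python
ACTIVE_TRANSCRIPTION_UNITS = 16_000  # HeLa estimate quoted in the text


def units_per_factory(n_factories, n_units=ACTIVE_TRANSCRIPTION_UNITS):
    """Average number of transcription units sharing one factory."""
    return n_units / n_factories


if __name__ == "__main__":
    estimates = (
        ("lowest estimate (murine primary cells)", 200),
        ("HeLa estimate", 2_000),
    )
    for label, n_factories in estimates:
        share = units_per_factory(n_factories)
        print(f"{label}: {n_factories} factories -> {share:.0f} units per factory")
```

A tenfold difference in the factory count therefore translates directly into a tenfold difference in the average occupancy (80 versus 8 units per factory).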

    Now consider that an enhancer (transcription unit 1) (Figure 2A) tethers its target promoter (in unit 2) close to factory or hub A that contains the necessary machinery. As a result, the target promoter 2 will diffuse through the nucleoplasm and frequently collide with a polymerase in factory A to initiate transcription. Although promoter 3 is also tethered close to the same factory, it will initiate there rarely (because factory A lacks the transcription factors required by this particular promoter). Although promoter 3 can initiate in factory B (which contains high concentrations of the relevant factors), it will do so rarely, simply because it is tethered close to factory A and far from B. Thus, transcription unit 1 acts as an enhancer of unit 2 and as a silencer of unit 3. The addition of histone modifications that mark the various units as active or inactive will now reinforce the status quo. Then, once unit 1 has been transcribed, these marks will make it more likely that unit 1 or unit 2 will reinitiate in factory A to create a virtuous cycle. Similarly, at another developmental stage, when a different set of transcription factors is expressed (Figure 2B), unit 1 might be transcribed in factory C. It is again flanked by units 2 and 3, but these can now be transcribed efficiently only in factory B (which is rich in the necessary factors). As units 2 and 3 cannot stably interact with each other by binding to factory C, unit 1 now acts as an insulator or barrier. As before, histone marks will reinforce this different (virtuous) cycle.

    Figure 2. A simple model for the function of regulatory elements. Spheres A, B and C represent factories rich in different sets of transcription factors and associated halos indicate the probability that promoter 1, 2 or 3 will collide with a factory (red indicates high probability). (The low-probability zone immediately around the factory arises because the intrinsic stiffness of the chromatin fibre restricts the formation of very small loops.) Curved black arrow indicates collision between promoter and factory that yields a productive initiation. Dashed grey arrows indicate the preferred site of initiation (as factory B is rich in the relevant transcription factors). Blocked red arrows indicate unproductive collisions (as the factory contains few of the relevant factors). (A) Enhancers and silencers. Transcription unit 1 is being transcribed by a polymerase in factory A. This tethers unit 2 in a 'hot zone', where it has a high probability of colliding with a polymerase in factory A (which contains high local concentrations of factors necessary for initiation by promoters 1 and 2). As a result, unit 1 acts as an enhancer for unit 2. At the same time, unit 3 is tethered far from factory B (which is rich in the factors required for its initiation). Here unit 1 acts as a silencer of unit 3. (B) Insulator. At a different stage in development, a different constellation of transcription factors is expressed. Chromatin domains containing units 2 and 3 are separated by unit 1 (now transcribed in factory C, which contains low concentrations of the factors required by units 2 and 3), so they rarely bind to factory A and interact. Here unit 1 acts as an insulator or barrier.
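The reasoning behind panel A can be made concrete with a toy calculation. Everything numerical here is an assumption made purely for illustration and does not come from the text: the exponential decay of collision frequency with tether distance, the `leak` value for a factory lacking the required factors, the factor labels and the distances. The sketch also ignores the low-probability zone immediately around a factory. It only shows that proximity and factor content must both be favourable for frequent initiation.

```python
import math


def collision_probability(distance, scale=0.5):
    """Hypothetical decay of promoter-factory collision frequency with tether distance."""
    return math.exp(-distance / scale)


def initiation_score(required_factor, factory_factors, distance, leak=0.05):
    """Collision frequency weighted by whether the factory holds the required factor."""
    compatibility = 1.0 if required_factor in factory_factors else leak
    return collision_probability(distance) * compatibility


# Hypothetical factor content of factories A and B.
FACTORIES = {"A": {"f1", "f2"}, "B": {"f3"}}
# Hypothetical distances while unit 1 is transcribed in factory A:
# both flanking units are tethered close to A and far from B.
DISTANCES = {"unit2": {"A": 0.2, "B": 2.0}, "unit3": {"A": 0.2, "B": 2.0}}
# Unit 2 needs a factor found in A; unit 3 needs a factor found only in B.
REQUIRED = {"unit2": "f2", "unit3": "f3"}


def best_score(unit):
    """Highest initiation score the unit can reach in any factory."""
    return max(
        initiation_score(REQUIRED[unit], FACTORIES[name], distance)
        for name, distance in DISTANCES[unit].items()
    )


if __name__ == "__main__":
    for unit in ("unit2", "unit3"):
        print(unit, round(best_score(unit), 4))
```

Under these made-up numbers unit 2 scores far higher than unit 3, which mirrors the claim that the same tether acts as an enhancer for one neighbour and as a silencer for the other.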


    References

    Banerji, J., Rusconi, S. & Schaffner, W. Expression of a beta-globin gene is enhanced by remote SV40 DNA sequences. Cell 27, 299–308 (1981).

    Moreau, P. et al. The SV40 72 base repair repeat has a striking effect on gene expression both in SV40 and other chimeric recombinants. Nucleic Acids Res. 9, 6047–6068 (1981). Refs. 1 and 2 reported the identification of the SV40 enhancer whose properties became defining features of enhancers for the next 30 years.

    Banerji, J., Olson, L. & Schaffner, W. A lymphocyte-specific cellular enhancer is located downstream of the joining region in immunoglobulin heavy chain genes. Cell 33, 729–740 (1983).

    Gillies, S.D., Morrison, S.L., Oi, V.T. & Tonegawa, S. A tissue-specific transcription enhancer element is located in the major intron of a rearranged immunoglobulin heavy chain gene. Cell 33, 717–728 (1983).

    Mercola, M., Wang, X.F., Olsen, J. & Calame, K. Transcriptional enhancer elements in the mouse immunoglobulin heavy chain locus. Science 221, 663–665 (1983). Refs. 3–5 reported the first cellular enhancer, which, when placed in proximity to the MYC gene, leads to Burkitt's lymphoma.

    Taub, R. et al. Translocation of the c-myc gene into the immunoglobulin heavy chain locus in human Burkitt lymphoma and murine plasmacytoma cells. Proc. Natl. Acad. Sci. USA 79, 7837–7841 (1982).

    Schroeder, M.D., Greer, C. & Gaul, U. How to make stripes: deciphering the transition from non-periodic to periodic patterns in Drosophila segmentation. Development 138, 3067–3078 (2011).

    Levine, M. Transcriptional enhancers in animal development and evolution. Curr. Biol. 20, R754–R763 (2010).

    Klingler, M., Soong, J., Butler, B. & Gergen, J.P. Disperse versus compact elements for the regulation of runt stripes in Drosophila. Dev. Biol. 177, 73–84 (1996).

    Jack, J., Dorsett, D., Delotto, Y. & Liu, S. Expression of the cut locus in the Drosophila wing margin is required for cell type specification and is regulated by a distant enhancer. Development 113, 735–747 (1991).

    Spana, C., Harrison, D.A. & Corces, V.G. The Drosophila melanogaster suppressor of Hairy-wing protein binds to specific sequences of the gypsy retrotransposon. Genes Dev. 2, 1414–1423 (1988). This study showed that Su(Hw) directly bound sequences in the gypsy insulator, thus providing a mechanistic understanding for years of genetic studies and helping to launch the new field of insulator biology.

    Morcillo, P., Rosen, C., Baylies, M.K. & Dorsett, D. Chip, a widely expressed chromosomal protein required for segmentation and activity of a remote wing margin enhancer in Drosophila. Genes Dev. 11, 2729–2740 (1997).

    Morcillo, P., Rosen, C. & Dorsett, D. Genes regulating the remote wing margin enhancer in the Drosophila cut locus. Genetics 144, 1143–1154 (1996). This study used a genetic screen for factors regulating enhancer-promoter communication and identified genes later shown to encode the fly homologs of NIPBL and LDB1. These proteins were subsequently shown to be involved in the formation and/or maintenance of enhancer-promoter looping in mammals.

    Rollins, R.A., Korom, M., Aulner, N., Martens, A. & Dorsett, D. Drosophila nipped-B protein supports sister chromatid cohesion and opposes the stromalin/Scc3 cohesion factor to facilitate long-range activation of the cut gene. Mol. Cell. Biol. 24, 3100–3111 (2004).

    Rollins, R.A., Morcillo, P. & Dorsett, D. Nipped-B, a Drosophila homologue of chromosomal adherins, participates in activation by remote enhancers in the cut and Ultrabithorax genes. Genetics 152, 577–593 (1999).

    Agulnick, A.D. et al. Interactions of the LIM-domain-binding factor Ldb1 with LIM homeodomain proteins. Nature 384, 270–272 (1996).

    Lee, S.K. & Pfaff, S.L. Synchronization of neurogenesis and motor neuron specification by direct coupling of bHLH and homeodomain transcription factors. Neuron 38, 731–745 (2003).

    Deng, W. et al. Controlling long-range genomic interactions at a native locus by targeted tethering of a looping factor. Cell 149, 1233–1244 (2012). This study provided direct experimental confirmation that LDB1 bridges enhancer-promoter communication.

    Torigoi, E. et al. Chip interacts with diverse homeodomain proteins and potentiates bicoid activity in vivo. Proc. Natl. Acad. Sci. USA 97, 2686–2691 (2000).

    Soler, E. et al. The genome-wide dynamics of the binding of Ldb1 complexes during erythroid differentiation. Genes Dev. 24, 277–289 (2010).

    Kagey, M.H. et al. Mediator and cohesin connect gene expression and chromatin architecture. Nature 467, 430–435 (2010). This study used genome-wide analyses and chromatin conformation assays to reveal the general requirement of cohesin in enhancer-promoter communication in mouse embryonic stem cells.

    Schaaf, C.A. et al. Genome-wide control of RNA polymerase II activity by cohesin. PLoS Genet. 9, e1003382 (2013).

    Bulger, M. & Groudine, M. Functional and mechanistic diversity of distal transcription enhancers. Cell 144, 327–339 (2011).

    Kulaeva, O.I., Nizovtseva, E.V., Polikanov, Y.S., Ulianov, S.V. & Studitsky, V.M. Distant activation of transcription: mechanisms of enhancer action. Mol. Cell. Biol. 32, 4892–4897 (2012).

    Williams, T.N. & Weatherall, D.J. World distribution, population genetics, and health burden of the hemoglobinopathies. Cold Spring Harb. Perspect. Med. 2, a011692 (2012).

    Kioussis, D., Vanin, E., deLange, T., Flavell, R.A. & Grosveld, F.G. β-globin gene inactivation by DNA translocation in γβ-thalassaemia. Nature 306, 662–666 (1983).

    Van der Ploeg, L.H. et al. γβ-Thalassaemia studies showing that deletion of the γ- and δ-genes influences β-globin gene expression in man. Nature 283, 637–642 (1980). Refs. 26 and 27 helped link a deletion in the DNase-hypersensitive β-globin locus control region with thalassemias.

    Kioussis, D. & Festenstein, R. Locus control regions: overcoming heterochromatin-induced gene inactivation in mammals. Curr. Opin. Genet. Dev. 7, 614–619 (1997).

    Fraser, P. & Grosveld, F. Locus control regions, chromatin activation and transcription. Curr. Opin. Cell Biol. 10, 361–365 (1998).

    Gilbert, S.F. Developmental Biology (Sinauer Associates, Sunderland, Massachusetts, USA, 2000).

    Bonifer, C., Vidal, M., Grosveld, F. & Sippel, A.E. Tissue specific and position independent expression of the complete gene domain for chicken lysozyme in transgenic mice. EMBO J. 9, 2843–2848 (1990).


    Enhancers

    Some transcription factors ("Enhancer-binding protein") bind to regions of DNA that are thousands of base pairs away from the gene they control. Binding increases the rate of transcription of the gene.

    Enhancers can be located upstream, downstream, or even within the gene they control.

    There are thousands of enhancers in the genome, but which ones are active depends on the type of cell and the signals it is receiving. Most genes, at least in Drosophila, are regulated by 2–3 enhancers, but some may be controlled by 8 or more. Multiple enhancers are particularly characteristic of "housekeeping" genes.

    How does the binding of a protein to an enhancer regulate the transcription of a gene thousands of base pairs away?

    One possibility is that enhancer-binding proteins, in addition to their DNA-binding site, have sites that bind to transcription factors ("TF") assembled at the promoter of the gene.

    This would draw the DNA into a loop (as shown in the figure).

    Proteins implicated in forming such loops include:

    • a protein designated CTCF ("CCCTC binding factor", named for the nucleotide sequence to which it binds). The CTCF at one site on the DNA forms a dimer with the CTCF at another site on the DNA, binding the two regions together. CTCF has 11 zinc fingers.
    • cohesin, the same protein complex that holds sister chromatids together during mitosis and meiosis.

    Visual evidence

    The artificial DNA molecules used in this experiment contained:

    • several (4) promoter sites for Sp1 about 300 bases from one end. Sp1 is a zinc-finger transcription factor that binds to the sequence 5' GGGCGG 3' found in the promoters of many genes, especially "housekeeping" genes.
    • several (5) enhancer sites about 800 bases from the other end. These are bound by an enhancer-binding protein designated E2.
    • 1860 base pairs of DNA between the two.

    When these DNA molecules were added to a mixture of Sp1 and E2, the electron microscope showed that the DNA was drawn into loops with "tails" of approximately 300 and 800 base pairs.

    At the neck of each loop were two distinguishable globs of material, one representing Sp1 (red), the other E2 (blue) molecules. (The two micrographs are identical; the lower one has been labeled to show the interpretation.)

    Artificial DNA molecules lacking either the promoter sites or the enhancer sites, or with mutated versions of them, failed to form loops when mixed with the two proteins.
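The geometry of this experiment can be checked with a few lines of Python. The sketch below is illustrative only: the toy sequence, the function names, and the simplification that each cluster of sites sits at a single coordinate are assumptions, not part of the original experiment. It scans both strands for the Sp1 motif 5' GGGCGG 3' and then computes the tail and loop lengths expected when a promoter cluster and an enhancer cluster are held together.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_motif(seq: str, motif: str = "GGGCGG") -> list[tuple[int, str]]:
    """Return (0-based start, strand) for every motif match on either strand."""
    seq = seq.upper()
    motif = motif.upper()
    rc_motif = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        elif window == rc_motif:
            hits.append((i, "-"))
    return hits


def loop_geometry(total_len: int, promoter_pos: int, enhancer_pos: int) -> tuple[int, int, int]:
    """Return (first tail, loop, second tail) in bp when the two positions are held together."""
    left, right = sorted((promoter_pos, enhancer_pos))
    return left, right - left, total_len - right


# Toy sequence (made up): one Sp1 site on the top strand, one on the bottom strand.
toy = "AT" * 5 + "GGGCGG" + "TTTT" + "CCGCCC" + "AT" * 3
sp1_hits = find_motif(toy)

# Idealized molecule: 300 bp tail, 1860 bp between the clusters, 800 bp tail.
tails_and_loop = loop_geometry(total_len=300 + 1860 + 800, promoter_pos=300, enhancer_pos=300 + 1860)
print(sp1_hits, tails_and_loop)
```

With the clusters treated as points, the computed tails match the approximately 300 and 800 base pair tails seen in the micrographs.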

    Significance of "Looping"


    Emerging concepts in enhancer action

    Enhancer or silencer?

    Because chromatin structure is itself a strong impediment to transcription, research has concentrated heavily on mechanisms of transcriptional activation. However, eukaryotic genomes also contain position- and orientation-independent silencer elements that mediate active repression of transcription [251, 252]. Thousands of silencer elements have recently been discovered in multiple human cell lines [253], working from the premise that active silencers (i) would be nuclease-sensitive owing to the assembly of repressor protein complexes, as enhancers are, (ii) would be enriched with repressive epigenomic marks such as H3K27me3, and (iii) would suppress transcription of neighboring genes. These silencers are enriched with binding motifs for many repressors, and several of them were also experimentally validated in reporter assays. Another study focused on previously uncharacterized nuclease-hypersensitive CREs enriched with repressive TFBSs and employed massively parallel reporter assays that identified thousands of silencers in multiple human and murine cell lines [254]. Several of these elements were found in contact with inactive genes in published Hi-C datasets.
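The three-part premise above amounts to a conjunction of filters over candidate regions. The following is a minimal sketch of that logic, not the pipeline used in the cited studies: the `Region` fields, the log2 fold-change readout for criterion (iii), and the threshold of -1.0 are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    nuclease_sensitive: bool      # criterion (i): open, nuclease-accessible chromatin
    h3k27me3_enriched: bool       # criterion (ii): repressive mark present
    neighbor_expr_log2fc: float   # criterion (iii): hypothetical change in neighboring-gene expression


def is_candidate_silencer(region: Region, max_log2fc: float = -1.0) -> bool:
    """A region qualifies only if all three criteria hold (threshold is an assumption)."""
    return (
        region.nuclease_sensitive
        and region.h3k27me3_enriched
        and region.neighbor_expr_log2fc <= max_log2fc
    )


# Made-up example regions.
regions = [
    Region("region_A", True, True, -2.3),   # meets all three criteria
    Region("region_B", True, False, -1.8),  # lacks the repressive mark
    Region("region_C", False, True, -3.0),  # not nuclease-sensitive
    Region("region_D", True, True, 0.4),    # neighboring gene is not suppressed
]

candidates = [r.name for r in regions if is_candidate_silencer(r)]
print(candidates)
```

In practice each criterion would come from a separate assay, and candidates would still need validation, for example in reporter assays as the text describes.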

    Just as an enhancer is not expected to be active in all cells, a silencer is expected to be active only in specific cell types. Also, just as an enhancer can be inactivated through the recruitment of repressive proteins [255, 256], a silencer can potentially be deactivated. Thus, depending on the presence of coactivators or corepressors, the functional identity of enhancers and silencers may switch. Interestingly, several silencers reportedly act as enhancers in different cell types [254, 257]. We envisage six scenarios emerging from these studies: direct activation and repression by enhancers and silencers, respectively; passive suppression and expression upon dismissal of coactivators and corepressors from enhancers and silencers, respectively; and enforced reversal of enhancer and silencer functions when they gain repressors and activators, respectively. Whether such scenarios indeed play out awaits experimental evidence.
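The six scenarios can be laid out as a lookup table keyed by the type of element and the cofactor currently bound to it. This is only a sketch of the enumeration in the paragraph above; the enum names and outcome strings are labels chosen here, and the table encodes the proposed scenarios, not experimentally established behavior.

```python
from enum import Enum


class Element(Enum):
    ENHANCER = "enhancer"
    SILENCER = "silencer"


class Cofactor(Enum):
    COACTIVATOR = "coactivator"
    COREPRESSOR = "corepressor"
    NONE = "none"  # the element's usual cofactor has been dismissed


# One entry per proposed scenario.
OUTCOMES = {
    (Element.ENHANCER, Cofactor.COACTIVATOR): "direct activation",
    (Element.SILENCER, Cofactor.COREPRESSOR): "direct repression",
    (Element.ENHANCER, Cofactor.NONE): "passive suppression",
    (Element.SILENCER, Cofactor.NONE): "passive expression",
    (Element.ENHANCER, Cofactor.COREPRESSOR): "enforced repression (enhancer acts as a silencer)",
    (Element.SILENCER, Cofactor.COACTIVATOR): "enforced activation (silencer acts as an enhancer)",
}


def predicted_outcome(element: Element, cofactor: Cofactor) -> str:
    """Look up the proposed transcriptional outcome for an element and its bound cofactor."""
    return OUTCOMES[(element, cofactor)]


for (element, cofactor), outcome in OUTCOMES.items():
    print(f"{element.value:8s} + {cofactor.value:11s} -> {outcome}")
```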

    Condensates encompassing EPC

    In recent years, it has become increasingly clear that various biochemical reactions undergo liquid-liquid phase separation (LLPS) into condensates inside the cell, where not only are the efficiency and fidelity of biomolecular processes enhanced, but organizational and architectural specificity is also attained without requiring membranous confines [258, 259]. Two factors are important for phase separation of molecules: a network of interactions among the participating molecules, so that the local concentration is effectively high, and molecular surfaces conducive to such interactions, for example a multi-modular feature called “multivalency” [260], in which each module can potentially initiate interactions. Examples of multivalent interaction surfaces in macromolecules include unstructured patches or IDRs in proteins, the modular structure of RNA, and a high density of TFBSs, as in enhancers. Multivalency can spontaneously initiate oligomerization that soon results in a polymer, which forms high-density liquid droplets upon LLPS. A critical feature of these liquid droplets is the exchange of molecules with the surroundings. This provides an ideal platform for multicomponent biochemical reactions to take place. Transcription is one such molecular process, and LLPS was theorized to regulate transcription [261]. Subsequent studies have now demonstrated that the Mediator complex, various TFs with IDRs, coactivators, and RNA Polymerase II (RNAPII) form condensates during transcription [208, 262,263,264,265]. Multivalent DNA sequences on enhancers also promote condensation of the bound TFs and coactivators through LLPS even at moderate concentrations [266]. LLPS is reported to bring together enhancers that otherwise reside in faraway TADs [267], although this notion has not been tested rigorously. Nevertheless, these studies have established that phase-separated condensates form on enhancers and correlate with transcription activation. Taking into account the various components of transcription regulation undergoing LLPS, it is safe to assume that the RNAPII-coactivator condensates encompass EPCs (see Fig. 3 below), though definitive proof is desired. A recent study suggests that local RNA concentration can regulate condensate formation and dissolution, thereby functioning as a transcriptional feedback mechanism [268].

    Fig. 3. A schematic for transcriptional coordination between the enhancer, promoter, and gene body. (a) Multiple protein complexes specializing in many structural and catalytic activities establish EPC. RNAPII recruitment and transcription initiation occur in a phase-separated condensate (a “transcription bubble”) where productive transcript elongation takes place. RNAPII remains within the EPC-encompassing transcription bubble, necessitating extrusion of the downstream template DNA into a loop behind it as elongation proceeds. Sequential recruitment of RNAPII represents transcription of the gene in multiple transcription units, each forming a DNA loop as in the petals of a sunflower. (b) Additionally, the chromatin landscape of a transcriptionally poised gene can exist in a sunflower arrangement where proteins assembled at intronic TFBS clusters hold the enhancer and promoter in proximity without direct EPC. Transcription initiation accompanies direct EPC. This model can supplement and co-operate with (a). (c) Multiple gene promoters can exist in physical proximity with a given enhancer within a phase-separated condensate, facilitating coordinated transcription activation. Likewise, a given promoter can also exist in association with multiple enhancers simultaneously.

    Transcription bubble at EPC

    The relative motion of the template DNA vis-à-vis the elongating RNA polymerase is an area of active debate. There are two possibilities. In the first, the RNAPII leaves the promoter after transcription initiation, pauses, and “tracks forward” along the template DNA like a locomotive as RNA synthesis progresses. In the second, the RNAPII stays tethered to nuclear structures while the template DNA “extrudes backward” during chain elongation. Nearly 40 years ago, Peter Cook and colleagues observed that the “body” of nascent RNAs is tethered to nuclear structures, presumably through the RNAPII itself [269]. This led to the “extrusion backward” model, although the “tracking forward” model enjoys general acceptance. From a cellular perspective of population energetics, it would be wasteful to engage numerous RNAPIIs along numerous genes discretely such that at any given time thousands of transcribing genes populate the entire nuclear space. A more productive way would be to have transcription factories [204], where congregations of genes could be transcribed in a concerted fashion. In fact, transcription-associated RNAPII condensates lend support to this view [262].

    Over the years, many genome-wide [170, 241, 270] and locus-specific [271, 272] studies have observed promoter-gene body contacts that also associate with RNAPII. This is possible only if the RNAPII remains in contact with the promoter and downstream transcribed regions simultaneously. In that case, the only way for the RNAPII to elongate the RNA chain is to drag the template backward, causing DNA extrusion. Indeed, Blobel and colleagues observed the enhancer, the promoter, and the progressive regions of the gene body held together in close proximity as transcription continued [273]. While studying the transcription kinetics of the GREB1 locus, we observed that two gene body SRC-3-bound regions (GBS1 and GBS2) hold the enhancer and promoter in close proximity in estradiol (E2)-deprived MCF-7 cells. However, upon E2 stimulation, GBS1 and GBS2 (18 kb and 43 kb downstream of the promoter, respectively) promptly disengage from the enhancer, allowing direct EPC, and they then come back into contact with the EPC when their respective regions are undergoing transcription [128], supporting the extrusion model. We made a similar observation at the NRIP1 locus. Recently, strong evidence for this model has come in the form of “stripes” in an exemplary nucleosome-resolution “Micro-C” connectome of transcription-linked chromatin [274]. The stripes extend from the promoter and cover the entire length of active genes, which is possible only if the samples capture the promoter’s progressive contact with the body of the gene.
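Under the extrusion model, a gene-body region located a distance d downstream of the promoter would pass through the stationary RNAPII once a loop of length d has been extruded. The back-of-the-envelope sketch below makes that timing explicit. The constant elongation rate of 2 kb/min and the absence of pausing are illustrative assumptions, not values taken from the studies cited above; only the 18 kb and 43 kb distances come from the text.

```python
def arrival_time_min(distance_kb: float, rate_kb_per_min: float) -> float:
    """Minutes until a region distance_kb downstream is reeled in at a constant rate."""
    if rate_kb_per_min <= 0:
        raise ValueError("elongation rate must be positive")
    return distance_kb / rate_kb_per_min


def extruded_loop_kb(elapsed_min: float, rate_kb_per_min: float) -> float:
    """Length of the extruded loop behind a stationary polymerase after elapsed_min."""
    return elapsed_min * rate_kb_per_min


# Distances downstream of the GREB1 promoter, as stated in the text.
GBS_DISTANCES_KB = {"GBS1": 18.0, "GBS2": 43.0}

# Hypothetical elongation rate, chosen only to make the arithmetic concrete.
ASSUMED_RATE_KB_PER_MIN = 2.0

arrival = {
    name: arrival_time_min(distance, ASSUMED_RATE_KB_PER_MIN)
    for name, distance in GBS_DISTANCES_KB.items()
}
for name, minutes in arrival.items():
    print(f"{name}: re-contacts the EPC after about {minutes:.1f} min at the assumed rate")
```

Whatever the true rate, the model predicts that GBS1 re-engages before GBS2, in the order of their distance from the promoter.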

    Taking the above observations along with the recent transcriptional LLPS studies, we propose that the enhancer and the promoter reside within a phase-separated “transcription bubble” enriched with coactivators and the RNAPII, and that transcription elongation occurs when the downstream template DNA is dragged backward, causing a progressively extruding loop (Fig. 3a). A recent global RNA-RNA interaction map has revealed physical proximity between eRNAs and transcripts derived from target promoters [275]. The experimental methodology in this study rules out detection of RNA-RNA interactions in free-floating ribonucleoprotein particles, emphasizing that the specific eRNA-mRNA contacts are chromatin-mediated and, thus, recapitulate EPC. Such direct contacts between the chromatin-anchored transcripts are possible only when the enhancer and the promoter reside in a transcription bubble during transcription (Fig. 3). It is also possible that strategic gene body locations, like the SRC-3-enriched GBSs, may exist as preassembled TF-coactivator hubs. In fact, more than half of all genomic TFBS clusters reside in intronic regions [276], and many genes contain SRC-3-enriched GBSs that coincide with the intronic TFBS clusters [277]; however, their functional relevance has not been explored. We envisage that these intronic TFBS clusters assemble into bound TF-coactivator hubs to aid the enhancer-coordinated transcription of the gene in a dynamic architecture akin to the petals of a sunflower (Fig. 3b). In our opinion, such a scenario can explain the concordance of transcription seen at the enhancers and target genes, the coordination of enhancer-regulated transcription of the gene body, the multi-enhancer and multi-promoter contacts observed at genomic scales [170, 241], as well as the simultaneous regulation of more than one gene by a single enhancer (Fig. 3c) [221].

    Intronic regulation of enhancer action

    An offshoot of the transcription bubble and the sunflower model described above is the tantalizing prospect of intronic regulation of enhancer action. Since the enhancer, the promoter, and the strategic intronic TFBS clusters (e.g., GBSs) maintain near-constant contact, these nucleoprotein hubs can structurally and functionally influence each other (Fig. 3b). Interestingly, not only are the genomic interactions of GBS1 and GBS2 with the GREB1 enhancer and promoter recapitulated in vitro, but inclusion of GBS1 and GBS2 fragments also greatly enhances cell-free transcription of both the enhancer and promoter [128]. These results would suggest that GBS1 and GBS2 might be acting as intronic enhancers, but these elements do not exhibit any recognizable epigenomic signatures of active enhancers [144]. Thus, it is likely that strategic intronic TFBS clusters constitute a novel functional class of gene regulatory elements that not only regulate the EPC, but also impact transcriptional processivity.


    DATA COLLECTION AND DATABASE CONTENT

    Data collection

    For the collection of silencers, we adopted a series of standardized procedures to ensure consistent and reliable data collection (33–35). First, a total of 2300 abstracts with the keyword ‘silencer’ were retrieved from the PubMed database by June 2020. These candidate articles were then filtered based on availability of genomic locations of silencers and the form of identification. Validated silencers were identified by either high-throughput or low-throughput experimental techniques such as MPRA, CRISPR, transient transfection assays and reporter assays. Predicted silencers were collected using the correlation-based model (29), the SVM-based model (17), the gkmSVM-based model (30), and our newly developed deep learning-based model DeepSilencer. The current release of the database contains silencers retrieved from a total of 456 articles related to validated silencers and three articles related to predicted silencers. The full text of each candidate article was manually reviewed in detail by at least two independent researchers to extract information on the silencers. Each entry contains general information such as species, cell line, reference genome, genomic location and PubMed ID of the publication, as well as details about the experimental or computational method used for its identification (Figure 1).

    Overview of data collection, data processing and annotation, and database features of SilencerDB.


    The Cell

    Enhancers, Silencers, and Transcription Factors

    Enhancer and silencer elements regulate transcription through the binding of a multitude of TFs that activate or repress transcription. These transcription factor-binding elements influence transcription irrespective of location: they may lie proximal to the core promoter, several hundred base pairs from the TSS, or even on other chromosomes. Given the limited number of transcription factors compared with the number of genes under transcriptional regulation, and the repetitious use of the same transcriptional machinery, it is the enhancer and silencing motifs found proximally and distally that provide much of the temporal and spatial regulation of transcription. Through the binding of transcription factors, enhancer elements direct which genes are to be transcribed, when transcription takes place, for how long, and at what level of intensity.

    The sequence motif of enhancer elements determines the specificity of transcription factor binding, though these sequences are usually degenerate. Variants of consensus sequences may dictate the strength of association with a particular transcription factor, or preferential binding of specific dimerization partners. Sequences may also alter the conformation of a bound transcription factor, affecting its activity. Transcriptional activators may promote preinitiation complex (PIC) formation, regulate primary transcriptional events such as initiation and elongation, and recruit chromatin modifiers. Coactivators have similar functions, but do not bind directly to enhancer elements, instead associating with enhancer-bound activators.

    Transcription factors aid in the recruitment of the RNAPII transcriptional complex and recruit chromatin remodelers. How do distal enhancers regulate transcriptional activity at the core promoter, several hundred base pairs away? The multisubunit coactivator Mediator complex binds to distal transcription factors and causes the DNA to loop so that Mediator is in proximity to RNAPII, to which it binds at the C-terminal domain (CTD). In addition to bridging the distance between distal enhancers and the TSS, Mediator also mediates the phosphorylation of RNAPII at serine 5 (Figure 12).


    Results

    Selection of H3K9ac as best suited histone modification to identify active enhancers in maize

    In mammals, several histone modifications such as H3K27ac, H3K9ac and H3K4me1 were shown to mark active enhancers [27, 28, 30]. To define which of these histone modifications best indicates active enhancers in maize, we examined the enrichment of H3K27ac, H3K9ac and H3K4me1 at the hepta-repeat enhancer and other cis-regulatory sequences present at the B-I allele of the b1 gene. ChIP was performed on inner stem tissue from V5 maize seedlings (V5-IST) and husk tissue. The hepta-repeat enhancer of B-I, located 100 kb upstream of the b1 transcription start site (TSS), is inactive in V5-IST and active in husk leaves [36]. Previously, the hepta-repeat enhancer and regulatory sequences 45 kb upstream of b1 were shown to be enriched with H3K9K14ac when active [36]. The results presented here (Fig. 1) indicated that the enrichment in both H3K9ac and H3K27ac was significantly higher in husk compared to V5-IST at the hepta-repeat enhancer (R3 and R6), the 45 kb upstream regulatory sequences (g) and the untranslated 5’ region of b1 (UTR). Based on these results, both H3K9ac and H3K27ac appeared to mark active enhancers. In contrast, H3K4me1 enrichment levels were relatively low throughout the intergenic b1 region in both V5-IST and husk tissues. In addition, at the coding region, H3K4me1 enrichment levels were higher in low b1 expressing V5-IST than in high expressing husk tissue. Therefore, in contrast to animal systems [27, 37], H3K4me1 is probably not suited to identify enhancers in maize. Since the enrichment at the enhancer region in husk relative to V5-IST tissue was highest for H3K9ac, we chose this histone modification to identify active enhancers genome-wide.

    ChIP-quantitative polymerase chain reaction (qPCR) analysis at b1 for H3K27ac, H3K9ac and H3K4me1. a Schematic representation of the b1 locus. Vertical arrows with letters indicate the regions examined by ChIP-qPCR. The b1 hepta-repeat enhancer is indicated with seven black triangles, the b1 coding region by a black box and the TSS by a bent arrow. Grey bars represent TEs and other repetitive sequences. b Enrichment of H3K27ac, H3K9ac and H3K4me1 at the b1 locus relative to the enrichment at the maize actin 1 locus (actin). Error bars represent the standard error of the mean for two (H3K9ac, H3K4me1) or three (H3K27ac) biological replicates

    An integrated pipeline to identify tissue-specific enhancers in maize

    DNase-seq, H3K9ac ChIP-seq and RNA-seq experiments were carried out in two tissues, V2-IST and husk, isolated from the reference inbred line B73 (Additional file 1: Figure S1). These tissues were selected to identify tissue-specific as well as developmental stage-specific enhancers. Our study included material grown at two different locations (DNase-seq and H3K9ac ChIP-seq were performed at the Max Planck Institute for Plant Breeding Research and the University of Amsterdam, respectively); therefore, we performed RNA-seq experiments for each tissue in six biological replicates, three per location. Comparison of gene expression levels between replicates in reads per kilobase of transcript per million mapped reads (RPKM) revealed high correlations among replicates between the two locations (Additional file 1: Figure S2). This high correlation between replicates and locations indicated the data were comparable and implied that the chromatin states of the plants from both locations were similar. Gene expression levels and significant differential expression levels were calculated, taking the variability among six replicates into account. The genes determined as significantly differentially expressed thus showed statistically significant differences in their expression levels at both locations.
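
    As a reminder of the unit used above, RPKM scales a gene's read count by transcript length (in kilobases) and by sequencing depth (in millions of mapped reads). The following is only a minimal sketch: the function name and the example counts are hypothetical and are not taken from the study.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    kilobases = transcript_length_bp / 1_000
    millions = total_mapped_reads / 1_000_000
    return read_count / (kilobases * millions)


# Illustrative values only: 500 reads on a 2 kb transcript, 20 million mapped reads.
example = rpkm(500, 2_000, 20_000_000)
```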

    After pre-processing of the data, our enhancer prediction pipeline consisted of three steps of data integration (Fig. 2). First, enriched chromatin or DNA features were identified for three genome-wide datasets. In addition to calling DNase-seq and H3K9ac ChIP-seq peaks from our own datasets, we identified low and unmethylated DNA regions (LUMRs) by re-analysing published BS-seq data [35]. By taking an overlap between the three datasets, regions displaying all three features were selected as enhancer candidate regions. We focused on intergenic enhancer candidates, excluding promoter regions, as chromatin profiles of enhancers located in proximity of and within coding regions are more likely to overlap with chromatin profiles of genic regions, making it difficult to disentangle the underlying regulatory regions. Enhancer candidates predicted in only one tissue were defined as tissue-specific candidates. Transposable elements (TEs) were included in our analysis as some of them had been shown or suggested to act as enhancers in maize and other organisms [13, 38]. The second step involved determining the degree of tissue-specificity of the candidates identified in both tissues by ranking the candidates based on signal intensity differences between the two tissues. This was done for both chromatin accessibility and H3K9ac enrichment, followed by summing the ranks and re-ranking. The last step assigned target genes to enhancer candidates, assuming that enhancers most likely regulate genes located directly upstream or downstream and that gene expression and active chromatin marks at enhancers are positively correlated.
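
    The second step (rank each feature, sum the ranks, re-rank) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the dictionary keys, the toy values and the choice to give the largest between-tissue difference rank 1 are all hypothetical.

```python
def rank_descending(values):
    """Return 1-based ranks; the largest value gets rank 1 (ties keep input order)."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def rank_tissue_specificity(candidates):
    """candidates: list of dicts with hypothetical keys 'id', 'dhs_diff' and
    'h3k9ac_diff' (signal intensity difference between the two tissues)."""
    dhs_ranks = rank_descending([c["dhs_diff"] for c in candidates])
    ac_ranks = rank_descending([c["h3k9ac_diff"] for c in candidates])
    summed = [d + a for d, a in zip(dhs_ranks, ac_ranks)]
    # Re-rank on the summed ranks: the smallest sum is the most tissue-specific.
    order = sorted(range(len(candidates)), key=lambda i: summed[i])
    return [candidates[i]["id"] for i in order]


toy = [
    {"id": "c1", "dhs_diff": 0.2, "h3k9ac_diff": 0.1},
    {"id": "c2", "dhs_diff": 3.5, "h3k9ac_diff": 5.0},
    {"id": "c3", "dhs_diff": 1.0, "h3k9ac_diff": 4.0},
]
ranking = rank_tissue_specificity(toy)
```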

    Overall workflow of this study. First, chromatin accessibility data from DNase-seq, H3K9ac enrichment data from ChIP-seq and DNA methylation data from BS-seq were analysed individually. Second, the data on accessible regions, H3K9ac-enriched regions and low DNA methylated regions were integrated to predict enhancers. Third, the enhancer candidates were ranked based on signal intensity differences of the chromatin accessibility and H3K9ac enrichment data between V2-IST and husk tissue. Finally, enhancer candidates were linked to their putative target genes based on their tissue specificity and on the differential expression of flanking genes determined by RNA-seq data. For shared candidates, adjacent genes being expressed in both tissues were associated

    Distribution of chromatin features in the uniquely mappable part of the maize genome

    To identify chromatin accessibility, H3K9ac enrichment, and low DNA methylation within the genome, we partitioned the genic and intergenic regions of the genome into six sub-categories: promoters, exons, introns, flanking and distal intergenic regions, and TEs (Fig. 3a). Gene annotations were taken from the maize B73 annotation version 4 (AGPv4 assembly [39]) from Ensembl Plants [40]. Only intergenic TEs were considered in our study; TEs present in introns were counted as ‘introns’. Promoter regions were defined as 1 kb upstream to 200 bp downstream from the TSS, therefore including the first nucleosome downstream of the TSS. The composition of the B73 maize genome was quantified by counting the number of megabases in each genomic region (Fig. 3b). Since 85% of the maize genome is highly repetitive [41], an important fraction of the next-generation sequencing reads could not be mapped uniquely (Additional file 1: Table S1), which prevented enhancer identification in repetitive genomic regions. We determined the uniquely mappable parts of the genome by performing an all-against-all alignment for theoretical 93 bp single-end reads, allowing a maximum of two mismatches using the Uniqueome pipeline [42], which estimates the fraction of uniquely mapped reads for each nucleotide (Fig. 3c). In the uniquely mappable genome, the proportion of TEs was reduced to approximately one-quarter of the assembled genome.
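
    The promoter definition above (1 kb upstream to 200 bp downstream of the TSS) depends on strand, which is easy to get wrong. A minimal sketch, assuming 0-based half-open coordinates and a hypothetical helper name:

```python
def promoter_interval(tss, strand, upstream=1_000, downstream=200):
    """Return a half-open (start, end) promoter interval around a TSS, clipped at 0."""
    if strand == "+":
        start, end = tss - upstream, tss + downstream
    elif strand == "-":
        # On the minus strand, "upstream" lies at higher coordinates.
        start, end = tss - downstream, tss + upstream
    else:
        raise ValueError("strand must be '+' or '-'")
    return max(start, 0), end


plus_promoter = promoter_interval(5_000, "+")
minus_promoter = promoter_interval(5_000, "-")
```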

    Genomic composition and distribution of features. a Definition of genomic regions. Promoters are defined from 1 kb upstream to 200 bp downstream from the TSSs; flanking regions are 4 kb upstream from the promoters and 5 kb downstream from the TTSs. TE: transposable elements; distal: intergenic regions that are more than 5 kb away from genic regions and are not TEs. b Composition of the entire maize genome according to AGPv4 and (c) the uniquely mappable genome. Distribution of (d, f) DHSs, (h, j) H3K9ac, (l) LUMRs and (n, o) enhancer candidates over the different genomic regions, and (e, g, i, k, m) the fractions (Mbp/Mbp, from 0 to 1, y-axes) that the different features (x-axes) occupy at the various genomic regions in the uniquely mappable genome. The grey bars indicate the fraction of overall occupancy in the uniquely mappable genome.

    9212 intergenic DHSs are potential cis-regulatory elements

    DNase I hypersensitive sites (DHSs) are genomic regions that are more sensitive to DNase I endonuclease activity compared with flanking regions due to a lower nucleosome density [43]. The mapping of DHSs by DNase-seq is a powerful approach to identify cis-regulatory regions, including enhancers, and has been used in many organisms including plants [20, 25, 44,45,46]. DNase-seq experiments were performed in two biological replicates for both V2-IST and husk tissue (Additional file 1: Table S1). To take the intrinsic digestion bias of DNase I into account, we also included a control sample generated by digesting B73 genomic DNA (gDNA) with DNase I. After mapping the reads obtained from each library, DHSs were identified for each library using MACS2 peak calling [47].

    Data reproducibility between biological replicates was examined by counting the number of overlapping DHSs identified for all the possible combinations of replicates (Additional file 1: Table S2). This comparison showed that 54–92% of DHSs overlapped by at least 1 bp between replicates. The overlap between the two V2-IST replicates was the lowest (54% of the 35,906 V2-IST_2 peaks were overlapping with the 21,309 V2-IST_1 peaks) as 1.5 times more peaks were identified in the V2-IST_2 sample. The overlap between peaks identified in V2-IST and in husk samples appeared quite large (e.g. 80% of the peaks identified in V2-IST_1 were also observed in Husk_1), indicating that most DHSs are not tissue-specific. To select for high confidence DHSs in both V2-IST and husk tissue, only DHSs overlapping by at least 70% of their lengths between replicates were kept for further analysis. For signal intensity analysis, the reads in all biological replicates were pooled per tissue to estimate the overall coverage of the reads.
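
    The 70% length-overlap filter between replicates can be illustrated as below. The text does not say whether the 70% is measured on one peak or reciprocally on both, so this sketch assumes it is measured relative to the peak being tested; the function names and toy intervals are hypothetical.

```python
def overlap_bp(a, b):
    """Number of shared base pairs between two half-open intervals (start, end)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def reproducible_peaks(rep1, rep2, min_fraction=0.7):
    """Keep rep1 peaks for which some rep2 peak covers >= min_fraction of their length."""
    kept = []
    for peak in rep1:
        length = peak[1] - peak[0]
        if any(overlap_bp(peak, other) >= min_fraction * length for other in rep2):
            kept.append(peak)
    return kept


# Toy peaks on a single chromosome.
rep1_peaks = [(100, 200), (500, 600), (900, 1000)]
rep2_peaks = [(120, 220), (560, 700)]
high_confidence = reproducible_peaks(rep1_peaks, rep2_peaks)
```

    In this toy case only the first peak survives: it shares 80 of its 100 bp with a replicate peak, whereas the second shares 40 bp and the third none.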

    We correlated DNase I hypersensitivity and gene expression levels in gene bodies and their immediate 1 kb flanking regions for additional validation of the dataset. For each tissue, genes were binned according to their gene expression levels and the average DNase I hypersensitivity, measured in number of read counts per million mapped reads (RPM), was calculated for each bin using bwtools [48] (Fig. 4a and b). A positive correlation between expression levels and DNase-seq coverage over genic regions was observed, especially directly upstream of the TSSs and transcription termination sites (TTSs). Chromatin at gene bodies was rather inaccessible across the gradient of gene expression. Presence of DHSs at TSSs and a positive correlation with expression levels observed in our dataset confirm previous observations in both animals and plants [21, 26, 49,50,51].

    Average DNase I hypersensitivity and H3K9ac enrichment at genic regions. Average signal (in RPM) for DNase I hypersensitivity in (a) V2-IST and (b) husk, and for H3K9ac enrichment in (c) V2-IST and (d) husk at genes and their 1-kb flanking regions. Genes were binned based on their expression levels, from no expression (light colour) to high expression (dark colour): the lowest expression level bin contains all genes with an expression lower than 1 RPKM. The thresholds (in RPKM) are at 1.94, 4.17, 8.58, 16.64 and 36.28 for V2-IST and 1.88, 4.00, 8.34, 15.83 and 32.99 for husk tissue
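
    The expression binning described in this legend can be sketched with the V2-IST thresholds quoted above. This is a simplified illustration: it averages one signal value per gene rather than a positional profile as produced by bwtools, and how values that fall exactly on a threshold are assigned is an assumption.

```python
from bisect import bisect_right

# Bin edges in RPKM for V2-IST, taken from the figure legend above.
V2_IST_EDGES = [1.0, 1.94, 4.17, 8.58, 16.64, 36.28]


def expression_bin(rpkm_value, edges=V2_IST_EDGES):
    """Return the bin index (0 = lowest expression) for one gene."""
    return bisect_right(edges, rpkm_value)


def mean_signal_per_bin(genes, edges=V2_IST_EDGES):
    """genes: iterable of (rpkm_value, signal_rpm) pairs; returns {bin: mean signal}."""
    sums, counts = {}, {}
    for rpkm_value, signal in genes:
        b = expression_bin(rpkm_value, edges)
        sums[b] = sums.get(b, 0.0) + signal
        counts[b] = counts.get(b, 0) + 1
    return {b: sums[b] / counts[b] for b in sums}


# Hypothetical (expression, signal) pairs.
toy_genes = [(0.3, 0.1), (0.8, 0.3), (5.0, 1.0), (50.0, 2.0)]
profile = mean_signal_per_bin(toy_genes)
```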

    The number of DHSs per genomic region was counted to examine their fraction per genomic region (Fig. 3d, f). When comparing the distributions of DHSs to a randomised distribution within the mappable genome (Additional file 1: Figure S3A and B), we observed a clear over-representation of DHSs at promoters (p value < 0.001 permutation test). Still, 43% of DHSs, in total 9212 out of 21,445, were in intergenic regions excluding promoters (Fig. 3d, f): 7802 in V2-IST, 7123 in husk and 5130 shared between both tissues (Table 1A). In addition, the fraction of the genome scored as DHS (in Mbp) was calculated for each genomic category. In total, DHSs occupied about 2% of the mappable genome in both tissues (Fig. 3e, g). DHSs occupied 10% and 8% of the total mappable promoter regions in V2-IST and husk, respectively.
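
    A permutation test of this kind compares the observed number of features in a region class with counts obtained after random placement. The text does not describe the randomisation scheme, so the sketch below assumes uniform placement of feature midpoints on a toy linear genome; all names and coordinates are hypothetical.

```python
import random


def count_in_regions(positions, regions):
    """Count positions that fall inside any half-open (start, end) region."""
    return sum(any(s <= p < e for s, e in regions) for p in positions)


def permutation_p_value(positions, regions, genome_length, n_perm=1_000, seed=0):
    """One-sided p value for over-representation of positions inside regions."""
    rng = random.Random(seed)
    observed = count_in_regions(positions, regions)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        shuffled = [rng.randrange(genome_length) for _ in positions]
        if count_in_regions(shuffled, regions) >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_perm + 1)


promoters = [(1_000, 1_500), (40_000, 40_500)]  # toy promoter intervals
dhs_midpoints = [1_100, 1_200, 1_300, 40_100, 40_200, 40_300, 40_400, 70_000]
p_value = permutation_p_value(dhs_midpoints, promoters, genome_length=100_000)
```

    With 1000 permutations the smallest attainable estimate is 1/1001, which is why such results are reported as p < 0.001 rather than as an exact value.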

    ChIP-seq identifies 6511 intergenic H3K9ac-enriched regions

    ChIP-seq H3K9ac data were obtained from two and three biological replicates for V2-IST and husk tissue, respectively. The reads were aligned to the AGPv4 B73 reference genome and H3K9ac-enriched regions were identified, taking the input sample into account, by peak calling for each replicate using MACS2 [47].

    To examine the reproducibility between replicates, overlapping H3K9ac-enriched regions were counted for all replicate combinations, showing 62–96% overlap within a tissue (Additional file 1: Table S3). As for the DNase-seq data, H3K9ac-enriched regions with an overlap in length of at least 70% between all replicates were kept for further analysis and reads in replicates were pooled for coverage calculation in each tissue. We correlated gene expression levels with H3K9ac enrichment levels across gene bodies and their 1-kb flanking regions (Fig. 4c, d) and observed a peak of H3K9ac enrichment immediately after the TSS and increased levels across the gene bodies compared to gene flanking regions. At the TSS peak region, gene expression and H3K9ac levels showed a parabolic correlation, showing saturation for higher bins and signal reduction for the highest one. In gene bodies, H3K9ac was lower for the three highest bins than for the three following bins. Previous studies in yeast and maize have reported a genome-wide loss of nucleosomes at highly expressed genes [26, 52]. Reduced nucleosome levels could explain the reduction in H3K9ac observed at highly expressed maize genes. Correlations between enrichment levels of H3K9ac 3’ of the TSS and gene expression levels have been previously reported [30, 53, 54]. Our data suggest that H3K9ac enrichment levels reached saturation for genes with high expression levels.

    To estimate the number of potential intergenic enhancer candidates from the H3K9ac data sets, the genomic distribution of H3K9ac-enriched regions was examined by counting the numbers of H3K9ac-enriched regions in the different types of genomic regions (Fig. 3a, h, j). As seen for DHSs, a clear over-representation of H3K9ac-enriched regions at promoters was observed when compared with a randomised distribution (p value < 0.001 permutation test, Additional file 1: Figure S3C and D). In both tissues, nearly 70% of all H3K9ac-enriched regions were located at promoters; this enrichment is more pronounced than for DHSs (approximately 40%), suggesting a presence of H3K9ac at promoters in the absence of DHSs. The number of intergenic H3K9ac-enriched regions, excluding promoters, was 6511 in total: 3115 in V2-IST, 6213 in husk and 2668 shared between both tissues (Table 1B).

    The overall H3K9ac-enriched regions occupy 2% and 7% of the uniquely mappable genome for V2-IST and husk, respectively (Fig. 3i, k). The fraction in husk is larger than in V2-IST because there were 1.5-fold more H3K9ac-enriched regions in husk and these regions were also longer (Additional file 1: Figure S4A, medians of 603 bp and 1015 bp in V2-IST and husk, respectively). The latter aspect is partly due to merging H3K9ac-enriched regions from three replicates for husk and two for V2-IST. Interestingly, despite the increase in H3K9ac-enriched regions in husk compared to V2-IST, no difference in the distribution of gene expression levels between the two tissues was observed (Additional file 1: Figure S4B). This observation suggests that the number of active genes is similar between the two tissues and independent from the identified number of H3K9ac-enriched regions.

    46,935 intergenic regions with low DNA methylation are potential enhancer candidates

    Low DNA methylation was selected as the third feature to identify enhancers because of its positive correlation with enhancer activity in mammals and plants [29, 36, 55,56,57,58]. To count the number of potential enhancers in the B73 maize genome, publicly available BS-seq data obtained from B73 coleoptile shoots were used [35]. Studies in Arabidopsis have revealed that DNA methylation levels in CG (mCG) and CHG (mCHG) contexts (H being A, C or T) are highly stable in different vegetative tissues [59, 60]. Furthermore, locus-specific [36] and genome-wide studies in maize ([61] RO, MS and NMS, unpublished observations) provided little evidence for changes in mCG or mCHG levels in different vegetative tissues, justifying the use of the coleoptile shoot dataset. We identified regions with 20% or lower DNA methylation in CG and CHG contexts independently, followed by defining LUMRs as regions that were low in both mCG and mCHG. Data on DNA methylation in CHH context (mCHH) were not included in the enhancer prediction step since, compared with the average levels of mCG and mCHG (86% and 74%, respectively), mCHH levels are generally low in maize (2%), like in other plant species [35, 62, 63]. The distribution of LUMRs within the genome was investigated by counting their number in each genomic region (Fig. 3l). The distribution of LUMRs in the uniquely mappable genome revealed an enrichment at genic regions, especially in exons, and at promoters (p values < 0.001 permutation test for all genomic categories), but a scarcity at TEs (p value = 1 permutation test for TEs); this observation is consistent with the fact that most TEs are highly methylated [35, 64, 65]. Investigation of the LUMR fractions revealed that nearly 50% of the genic regions are lowly methylated, which increases to nearly 60% for promoter regions and exons, while almost all TEs are highly methylated (Fig. 3m). To identify potential intergenic enhancer candidates, we focused on intergenic LUMRs, excluding promoters. We identified 46,935 intergenic LUMRs as potential enhancer candidate regions.
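
    The LUMR definition (20% or lower methylation in the CG and CHG contexts, called independently and then intersected) can be sketched as follows. The per-window representation, the toy values and the function names are assumptions made for illustration; the study's actual segmentation procedure is not described here.

```python
def low_methylation_runs(levels, max_level=0.20):
    """Return half-open (start, end) index runs of consecutive windows with level <= max_level."""
    runs, start = [], None
    for i, level in enumerate(levels):
        if level <= max_level and start is None:
            start = i
        elif level > max_level and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(levels)))
    return runs


def intersect_runs(runs_a, runs_b):
    """Intersect two lists of half-open runs."""
    out = []
    for a_start, a_end in runs_a:
        for b_start, b_end in runs_b:
            start, end = max(a_start, b_start), min(a_end, b_end)
            if start < end:
                out.append((start, end))
    return out


# Toy per-window methylation frequencies (0-1); window size is left abstract.
mcg = [0.90, 0.15, 0.10, 0.05, 0.18, 0.85, 0.10]
mchg = [0.70, 0.60, 0.12, 0.08, 0.10, 0.75, 0.30]
lumrs = intersect_runs(low_methylation_runs(mcg), low_methylation_runs(mchg))
```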

    Integration of features for enhancer candidate prediction

    To predict enhancer candidates, we integrated the DHS, H3K9ac and LUMR datasets discussed above. First, we calculated how many LUMRs and DHSs, or LUMRs and H3K9ac-enriched regions, overlapped by at least 1 bp with each other. The overlap between the chromatin features was investigated in both tissues and revealed that more than 97% and 99% of the intergenic DHSs and H3K9ac-enriched regions, respectively, overlapped with LUMRs (Table 1). DHSs are generally shorter than LUMRs (Additional file 1: Figure S4A; medians of 484 bp and 452 bp for DHSs in V2-IST and husk, respectively, versus 834 bp for LUMRs). While most DHSs or H3K9ac-enriched regions co-localised within LUMRs, only about 20% of the total DHSs and H3K9ac-enriched regions overlapped with each other (Table 1).

    Active enhancers are expected to be indicated by a coincidence of chromatin accessibility, H3K9ac enrichment and low DNA methylation [29, 36]. We therefore filtered LUMRs based on the presence or absence of DHSs and H3K9ac-enriched regions and defined LUMRs overlapping with both DHSs and H3K9ac-enriched regions as active enhancer candidates (Fig. 2). Respectively, 398 and 1320 candidates in V2-IST and in husk were identified, of which 223 were shared between the tissues, resulting in 1495 enhancer candidates in total (Additional file 2: Dataset 1 and Additional file 3: Dataset 2). A total of 256 V2-IST and 775 husk candidates were located more than 5 kb away and 208 V2-IST and 623 husk candidates were located more than 10 kb away from their closest flanking genes. In V2-IST and husk tissue, the median distances between the candidates and their closest genes were 11.4 kb and 8.4 kb, while the largest distances were 438 kb (Zm00001d004626) and 498 kb (Zm00001d030489), respectively. Intersection of our candidates with a published dataset of sequence comparisons between rice and maize genomes indicated that 41 (10%) V2-IST and 241 (18%) husk candidates contained conserved non-coding sequences (CNSs). The overlap between enhancer candidates and CNSs is higher than expected for randomized features ([66], p value < 0.001 permutation test).
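
    The filtering rule (a LUMR that overlaps both a DHS and an H3K9ac-enriched region) and the union of the two tissues can be sketched as below. The intervals are toy values on a single chromosome and the helper names are hypothetical; only the three counts in the last line come from the text.

```python
def overlaps(a, b):
    """True if two half-open intervals share at least 1 bp."""
    return a[0] < b[1] and b[0] < a[1]


def enhancer_candidates(lumrs, dhss, h3k9ac_regions):
    """Keep LUMRs that overlap at least one DHS and at least one H3K9ac-enriched region."""
    return [
        lumr for lumr in lumrs
        if any(overlaps(lumr, d) for d in dhss)
        and any(overlaps(lumr, h) for h in h3k9ac_regions)
    ]


toy_lumrs = [(100, 900), (2_000, 2_600), (5_000, 5_400)]
toy_dhss = [(300, 500), (2_100, 2_300)]
toy_h3k9ac = [(450, 800), (5_100, 5_300)]
candidates = enhancer_candidates(toy_lumrs, toy_dhss, toy_h3k9ac)

# Union of the two tissues, using the counts reported above.
total_candidates = 398 + 1320 - 223
```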

    Enhancer candidates and transposable elements

    Interestingly, 133 (33%) V2-IST and 370 (28%) husk candidates overlapped by at least 1 bp with TEs (Table 2). In most cases, enhancer candidates intersecting with TEs (TE-enhancers) overlapped with TEs over more than 80% of their length or were entirely located within TEs. The number of TE-enhancers is the highest for long terminal repeat (LTR) retrotransposons, followed by helitrons and terminal inverted repeat (TIR) TEs, consistent with the fraction that each of the three TE orders contributes to the TE space of the maize genome [39]. This TE space is calculated taking the average length for TEs and their number into account (136,000 LTRs with an average length of 9282 bp, 21,000 helitrons with an average length of 3605 bp and 14,000 TIRs with an average length of 621 bp). A small number of TIR elements (seven) are embedded entirely within enhancer candidates, possibly representing rare cases where the insertion of a small TE into open chromatin does not disrupt enhancer function. Indeed, these seven TIRs are in the range of 83–199 bp; one overlaps with an H3K9ac peak, six do not overlap with either a DHS or H3K9ac peak, and all are enriched in mCHH (Additional file 1: Figure S5A and B). To further assess the potential of TEs to create enhancers, for the remaining analyses we focused on the subset of TEs that contained at least 80% of an enhancer (Table 2).
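
    The TE space mentioned above is simply number times average length for each order. A short worked calculation using the figures quoted in the text (the variable names are illustrative):

```python
# (number of elements, average length in bp), as quoted in the text above.
te_orders = {
    "LTR": (136_000, 9_282),
    "helitron": (21_000, 3_605),
    "TIR": (14_000, 621),
}

te_space_bp = {name: n * length for name, (n, length) in te_orders.items()}
total_bp = sum(te_space_bp.values())
te_space_fraction = {name: bp / total_bp for name, bp in te_space_bp.items()}
```

    With these figures, LTRs account for roughly 94% of the combined space of the three orders, helitrons for about 6% and TIRs for under 1%, which matches the ordering of TE-enhancer counts.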

    The average distance between TEs and their closest genes did not vary between all TEs and TEs containing enhancer candidates (mean distance of 40.4 kb and 42.5 kb, respectively; Additional file 1: Figure S6A and B). The TEs that contain candidates tend to be longer than other TEs. To assess if enhancer candidates are likely to overlap with promoters that create functional transcripts for the TEs, we examined the distribution of the candidates within TEs. They were distributed randomly within the TEs, while functional TE promoters are expected to be located at the TE ends, indicating most candidates within TEs are unlikely to be located at the functional promoter site of TEs (Additional file 1: Figure S6C).

    We explored the possibility that certain TE families could be a source of enhancers throughout the genome by looking for examples in which multiple members of the same TE family contained enhancer candidates (Additional file 4: Dataset 3). In most cases, only a single member of a TE family overlapped with enhancer candidates, with the exception of some very large TE families. Enrichment of TE families at enhancer candidates was tested by assuming a binomial distribution and applying Bonferroni correction for multiple testing. Only three TE families showed significant enrichment for enhancer candidates (RLG00010, RLG00357, RLG01570; annotations are available from Gramene [67] and the TE classifications from the Maize TE database [http://maizetedb.org]). The LTR Gypsy family RLG00010 was most significantly enriched (p value < 0.001), overlapping with seven V2-IST and 23 husk enhancer candidates. This represents a significant fraction of all TE-enhancers in the two tissues (7% and 8.6% for V2-IST and husk, respectively). The RLG00010 family was selected for further analysis.
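
    A binomial enrichment test with Bonferroni correction can be sketched as below. The text does not state how the null probability for each family was set, so this sketch assumes it is the family's share of all TEs; the family names and counts are hypothetical, not the study's data.

```python
from math import comb


def binomial_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def bonferroni(p_values):
    """Multiply each p value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical inputs: (TE-enhancers in family, total TE-enhancers, family share of all TEs).
families = {"famA": (10, 100, 0.01), "famB": (2, 100, 0.02), "famC": (1, 100, 0.05)}
raw = {name: binomial_sf(k, n, p) for name, (k, n, p) in families.items()}
adjusted = dict(zip(raw, bonferroni(list(raw.values()))))
significant = [name for name, p in adjusted.items() if p < 0.05]
```

    Only the toy family with ten hits where about one is expected passes the corrected threshold; the other two are close to their expected counts.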

    The same trends were observed for RLG00010 members overlapping with enhancer candidates as for all TEs: a similar distribution of distances of TEs to their closest flanking gene (Additional file 1: Figure S6B and D), and a longer average length for TEs overlapping with candidates (10,895 bp compared with 8517 bp; Additional file 1: Figure S6A and E). Typical examples of RLG00010 TEs overlapping with enhancer candidates are shown in Additional file 1: Figure S5C. To examine if RLG00010 family members overlapping with enhancer candidates were enriched for specific consensus sequences relative to other family members, several de novo motif analysis tools were used [68,69,70,71]. When comparing the results from different algorithms, the GGCCCA motif stood out as recurring (found by MEME with p value < 0.0039, DREME with p value < 0.043, RSAT Plants with E-value of 2.9e-7). This motif, also named site II motif, has been discovered in promoter regions of various genes that are highly expressed, for example ribosomal and DEAD-box RNA helicase genes [72,73,74]. TCP and ASR5 transcription factors are examples of proteins shown to bind the GGCCCA motif [75, 76]. Scanning for the motif using FIMO [77] revealed that most enhancer candidates contained the GGCCCA motif irrespective of an overlap with the RLG00010 family (Additional file 1: Table S4). In fact, compared with random intergenic sequences, enhancer candidates showed an about twofold enrichment for the motif (p < 0.001). In contrast, the motif was not enriched in the RLG00010 family as such, irrespective of the association of its members with candidates.

    Characterisation of enhancer candidates

    In humans, enhancers generally show a bi-directional pattern of DNA, chromatin and transcript features. Histone modifications such as H3K27ac, as well as eRNA transcription, are located at both sides relative to single DHS peaks [4]. We set out to analyse whether DNA and chromatin features at our candidate enhancers displayed directionality. The read coverages for DNase-seq, H3K9ac ChIP-seq and DNA methylation in all three contexts were extracted for each DHS located in enhancer candidates and their 1-kb upstream and downstream flanking regions (431 DHSs in V2-IST and 1437 in husk) (Fig. 5). Note that the number of DHSs was higher than that of enhancer candidates because multiple DHSs could be located in one candidate. The averages of the read coverages are presented in Fig. 6. Empirical observations indicated that H3K9ac was often enriched at only one side of DHSs (see e.g. Fig. 7 and Additional file 1: Figure S7). Therefore, the orientation of DHSs was defined based on H3K9ac enrichment levels 300 bp from DHSs, with the side showing the higher H3K9ac enrichment value, if present, being defined as the 3' end. The observed asymmetry was further validated by plotting the H3K9ac enrichment values from both sides of the DHSs with and without the previously defined orientations for all DHSs (Additional file 1: Figure S8). For DHSs showing H3K9ac enrichment of at least 0.5 RPM at either side, 241 out of 431 in V2-IST and 841 out of 1437 in husk showed asymmetric H3K9ac enrichment, as indicated by an at least twofold change in H3K9ac enrichment between the two flanking regions.
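
    The orientation and asymmetry rules described above (3' end = flank with higher H3K9ac; asymmetric = at least 0.5 RPM on one side and an at least twofold difference) can be sketched as follows. The function name, the return convention and the tie handling are assumptions for illustration.

```python
def orient_and_classify(left_rpm, right_rpm, min_rpm=0.5, min_fold=2.0):
    """Return (three_prime_side, is_asymmetric) for one DHS.

    three_prime_side is 'left', 'right', or None when neither flank reaches min_rpm.
    """
    high, low = max(left_rpm, right_rpm), min(left_rpm, right_rpm)
    if high < min_rpm:
        return None, False
    side = "left" if left_rpm > right_rpm else "right"
    # Written as a product so that a flank with zero signal needs no division.
    is_asymmetric = high >= min_fold * low
    return side, is_asymmetric


examples = [
    orient_and_classify(0.2, 1.6),  # strong enrichment on the right flank only
    orient_and_classify(1.0, 1.5),  # enriched on both sides, less than twofold apart
    orient_and_classify(0.1, 0.3),  # below the 0.5 RPM threshold on both sides
    orient_and_classify(3.0, 0.0),  # no signal at all on the right flank
]
```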

    Heatmaps of chromatin, DNA and transcript features at enhancer candidates. DNase I hypersensitivity, H3K9ac enrichment, mCG, mCHG and mCHH levels, presence of TEs and transcript levels at and around (±1 kb) DHSs in enhancer candidates. DHSs were scaled to equal size. The colour scales are in RPM for DNase I hypersensitivity, H3K9ac enrichment and transcript levels, and in methylation frequency (0–1) for DNA methylation. For TE sequences, red and white show the presence or absence of TEs, respectively. DHSs were clustered based on H3K9ac enrichment using a k-means (k = 4) clustering algorithm. The categories identified were numbered from 1 to 4 from the top to the bottom. All the DHSs were oriented based on H3K9ac enrichment intensity values 300 bp away from the DHS boundaries; the side with higher H3K9ac enrichment was defined as the 3' end

    Average profiles of the enhancer candidates in (a) V2-IST and (b) husk. Average signal intensities of DNase I hypersensitivity, H3K9ac enrichment in RPM and DNA methylation levels in methylation frequency at DHSs and their 1-kb flanking regions. DHSs were scaled to equal size. Prior to calculation of the average, all the DHSs were oriented based on H3K9ac enrichment intensity values 300 bp away from the DHS boundaries; the sides with higher H3K9ac enrichment were defined as the 3' end. The profiles show a clear preferential enrichment of H3K9ac 3’ of the DHSs and high levels of DNA methylation (CG and CHG context) around the DHSs and H3K9ac-enriched regions. The level of mCHH is low throughout the regions with a slight increase at the 5’ side of DHSs

    Example of data on (a) DICE and (b) b1 repeat enhancer. From the top: AGPv4 annotation and candidate annotation from our prediction (V V2-IST, H husk candidate), DNase I hypersensitivity and H3K9ac enrichment signal (all replicates pooled) and peak position (indicated as blue and green bars, respectively) in V2-IST and in husk tissue, mCG, mCHG and mCHH levels and unique mappability in percentage. The numbers under gene names indicate relative gene expression levels (V2-IST/husk). Although the b1 locus is on chromosome 2, in the current version of the AGPv4 assembly, the b1 gene is located in contig 44 (B, on the right of the grey vertical line). The dark blue bars in the gene annotation tracks indicate previously annotated known enhancers and putative cis-regulatory elements. The vertical red boxes indicate enhancer candidates identified in this study. Peaks at those tracks might not be present in each replicate, affecting enhancer candidate prediction

    The enhancer candidates were clustered into four categories based on H3K9ac enrichment patterns using the k-means clustering algorithm and the categories were numbered according to their appearance in the heatmaps (Fig. 5). For each category, average patterns were determined (Additional file 1: Figure S9). Heatmaps and profiles showed that H3K9ac can be primarily enriched on one side of the DHSs (category 1 and 2), within DHSs (category 3) or present at both sides but clearly enriched at one of them (category 4) (Fig. 5 and Additional file 1: Figure S9).
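
    To make the clustering step concrete, here is a minimal sketch of k-means (Lloyd's algorithm) applied to toy H3K9ac profiles. The source does not name the software or the initialisation used, so the deterministic farthest-first initialisation, the three-bin profiles (5' flank, DHS, 3' flank) and all values below are assumptions chosen purely for illustration.

```python
from typing import List, Sequence


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two profiles of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(profiles: List[List[float]], k: int, n_iter: int = 100) -> List[int]:
    """Plain Lloyd's k-means; returns one cluster label per profile."""
    # Assumption: deterministic farthest-first initialisation instead of random seeds.
    centroids = [list(profiles[0])]
    while len(centroids) < k:
        farthest = max(profiles, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))

    labels = [-1] * len(profiles)
    for _ in range(n_iter):
        # Assignment step: each profile goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in profiles
        ]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy H3K9ac profiles (hypothetical values): [5' flank, DHS, 3' flank]
profiles = [
    [0.0, 0.0, 5.0], [0.0, 0.0, 6.0],   # strong enrichment on the 3' side only
    [0.0, 5.0, 0.0], [0.0, 6.0, 0.0],   # enrichment within the DHS
    [3.0, 0.0, 6.0], [3.0, 0.0, 7.0],   # both sides, stronger on the 3' side
    [0.0, 0.0, 1.0], [0.0, 0.0, 1.5],   # weak enrichment on the 3' side
]
labels = kmeans(profiles, k=4)
```

    With these toy values each pair of similar profiles ends up in its own cluster. The cluster numbers themselves are arbitrary, which is why the categories in the text were numbered after clustering, by their order of appearance in the heatmaps.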

    Comparing DNase-seq or H3K9ac ChIP-seq read coverages with the distribution of mCG and mCHG levels, as well as the average profiles, indicated that high chromatin accessibility and H3K9ac enrichment levels were mutually exclusive with high DNA methylation levels (Figs. 5 and 6 and Additional file 1: Figure S9). The average profiles show a plateau and steep decline of mCG and mCHG at the 5' side of DHSs (Fig. 6). In categories 1, 2 and 4, at the 3' side of enhancer candidates, mCG and mCHG levels increased gradually (Fig. 6, Additional file 1: Figure S9). These patterns indicate a sharp transition in DNA methylation level at the 5' DHS boundaries and a more gradual transition at the H3K9ac boundaries. However, a sharp transition at the 3' ends of candidates may be masked in the average profile by the varying size of the H3K9ac-enriched regions. In line with this, the profile of category 3 candidates, having H3K9ac at the DHS itself, showed sharp boundaries at both sides of the candidates. Levels of mCHH were lower than mCG and mCHG levels, as expected [35]. In line with earlier studies [61, 62], mCHH marked boundaries between lowly and highly DNA methylated regions as shown by the relatively high level of mCHH, represented by a small mCHH peak in the average profiles, at the 5' boundaries of the DHSs (Figs. 5 and 6 and Additional file 1: Figure S9).

    Additional heatmaps and profiles were created to illustrate the locations of TEs and transcripts for the four categories. The heatmaps suggest that TEs covered all selected regions, showing a slight depletion across DHSs but no apparent pattern across other features (Fig. 5). In animal models, enhancers are characterised by bi-directional transcription and the transcribed regions are, among others, enriched with H3K27ac [4]. In our data, transcript levels were generally low at candidates except for a few showing transcripts within and/or outside of their DHS (Fig. 5), making the detection of bi-directional transcription very challenging. In addition to this absence of detectable levels of bi-directional transcription, the clear asymmetric H3K9ac distribution at a majority of maize enhancer candidates suggested that the candidates have more resemblance to TSSs than animal enhancers do [4].

    Profiles of DNA and chromatin features at enhancer candidates and TSSs are similar

    To rule out the possibility that our enhancer candidates were actually TSSs of unannotated genes, we compared the patterns of their DNA, chromatin and transcript features with those observed at annotated TSSs by randomly selecting 431 and 1,437 DHSs located at TSSs for V2-IST and husk, respectively (Additional file 1: Figure S10). The selected regions were oriented according to the 5' to 3' orientation of flanking genes and analysed using the k-means clustering algorithm (k = 3). In general, the heatmaps and average profiles of DHSs at TSSs displayed a strong DNA methylation signal at the 5' ends of DHSs and an enrichment in H3K9ac and an accumulation of transcripts at the 3' ends of DHSs (Additional file 1: Figures S10 and S11). The heatmaps and the average plots of TSSs and enhancer candidates revealed similar patterns of chromatin accessibility and H3K9ac, but they differed in transcript levels (higher at annotated TSSs) and distribution of mCG and mCHG (high on both sides for candidates, while restricted to the 5' side for annotated TSSs) (Figs. 5 and 6, Additional file 1: Figures S10 and S11). The median transcript level at the enhancer candidates was 6.6 times lower than that at coding sequences in V2-IST; the fold change could not be calculated for husk because the candidate expression levels had a median of 0 RPKM (Additional file 1: Figure S12). One category (category 3) showed transcriptional activity and H3K9ac enrichment on both sides (Additional file 1: Figure S10). The DHSs in this category were flanked either by two oppositely orientated and closely spaced genes or by alternative TSSs located in upstream regions.

    The H3K4me3 histone modification was previously described as distinguishing TSSs from enhancers [21, 78, 79, 80]. Analysis of published ChIP-seq data for H3K4me3 in maize third seedling leaf [61] indicated that 24% and 11% of the V2-IST and husk enhancer candidates, respectively, overlapped with H3K4me3 enriched regions (Additional file 1: Figure S13), which could hint at unannotated TSSs. The observed H3K4me3 enrichment at enhancer candidates was, however, on average weaker than at TSSs (Additional file 1: Figure S13), suggesting H3K4me3 may also differentiate TSSs and enhancers in maize. In addition, the H3K4me3 enrichment pattern did not entirely reflect the H3K9ac enrichment pattern at TSSs but was rather slightly shifted downstream of the H3K9ac peaks. Such a pattern has not been reported in humans [79] and was not observed in a previous study in rice [21].

    In summary, despite a shared polarity with respect to flanking H3K9ac enrichment, the profiles of enhancer candidates differ from those at TSSs by the levels of transcript accumulation, DNA methylation and H3K4me3.

    Ranking and selecting a list of tissue-specific enhancer candidates

    To facilitate linking enhancer candidates to putative target genes, we set out to determine the degree of tissue-specificity of our enhancer candidates by ranking the 398 V2-IST and 1,320 husk candidates based on the assumption that the levels of both DNase I hypersensitivity and H3K9ac enrichment are positively correlated with enhancer activity. The enhancer candidates were independently ranked based on the largest differences between the two tissues for DNase I hypersensitivity and H3K9ac levels. The most strongly tissue-specific candidates were assumed to exhibit large differences in both DNase I hypersensitivity and H3K9ac enrichment; therefore, the independent rankings for both features were summed for every candidate and the candidates were re-ranked (Additional file 2: Dataset 1 and Additional file 3: Dataset 2, column overall_rank). The ranking numbers were combined with a V for V2-IST or an H for husk as candidate IDs; the lower the number, the more tissue-specific the candidate. However, the rankings for DNase I hypersensitivity and H3K9ac enrichment did not correlate with each other (Additional file 2: Dataset 1 and Additional file 3: Dataset 2, columns DNase_rank and H3K9ac_rank; shared candidates were ranked in both tissues). For example, the candidate ranked second for V2-IST (candidate V2, Fig. 8) showed a large difference in DNase I hypersensitivity signal between V2-IST and husk tissue as expected, while the H3K9ac enrichment stayed almost the same for both tissues. The 313th candidate in V2-IST (candidate V313), on the other hand, is characterised by a large difference in H3K9ac enrichment but not in DNase I hypersensitivity. The 194th candidate in V2-IST (candidate V194) showed a large difference between the tissues for both DNase I and H3K9ac enrichment signals, but in opposite directions.

    The lack of correlation between the ranks obtained from the two chromatin features indicated that this combination of features does not reliably determine tissue-specificity. Experimental examination of a number of candidates will be necessary to determine the best feature (or combination of features) to predict tissue-specificity. For now, enhancer candidates identified in only one of the two tissues were defined as tissue-specific and the candidates shared between tissues as putative shared enhancers. With this definition, a total of 1,495 candidates were classified into 175 V2-IST-specific, 1,097 husk-specific and 223 shared candidates (Additional file 5: Dataset 4).
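
    The rank-sum scheme and the presence/absence classification described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names and toy values are hypothetical, and the sketch assumes that candidates are ranked by the signed signal difference in favour of the tissue in which they were called (the text only says "largest differences").

```python
from typing import Dict, List, Set, Tuple

# Hypothetical per-candidate signals for a V2-IST candidate list:
# (DNase V2-IST, DNase husk, H3K9ac V2-IST, H3K9ac husk)
Signals = Tuple[float, float, float, float]


def rank_desc(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank 1 = largest score; ties are broken by name to keep the result deterministic."""
    ordered = sorted(scores, key=lambda name: (-scores[name], name))
    return {name: i + 1 for i, name in enumerate(ordered)}


def overall_rank(candidates: Dict[str, Signals]) -> List[str]:
    """Rank each feature independently, sum the two ranks, then re-rank by that sum."""
    # Assumption: signed difference (own tissue minus other tissue).
    dnase_rank = rank_desc({n: s[0] - s[1] for n, s in candidates.items()})
    h3k9ac_rank = rank_desc({n: s[2] - s[3] for n, s in candidates.items()})
    rank_sum = {n: dnase_rank[n] + h3k9ac_rank[n] for n in candidates}
    return sorted(candidates, key=lambda n: (rank_sum[n], n))


def classify(v2ist: Set[str], husk: Set[str]) -> Tuple[Set[str], Set[str], Set[str]]:
    """Split candidates into V2-IST-specific, husk-specific and shared sets."""
    shared = v2ist & husk
    return v2ist - shared, husk - shared, shared


toy = {
    "a": (10.0, 1.0, 8.0, 1.0),  # large difference in both features
    "b": (9.0, 1.0, 3.0, 3.0),   # DNase difference only
    "c": (2.0, 2.0, 9.0, 1.0),   # H3K9ac difference only
    "d": (3.0, 2.0, 2.0, 1.0),   # small difference in both features
}
ranking = overall_rank(toy)  # ["a", "c", "b", "d"]
```

    Candidates "b" and "c" illustrate the problem noted in the text: each ranks near the top for one feature and at the bottom for the other, so the summed rank places them in the middle regardless of how strong the single-feature difference is. The classification counts reported above follow the same set logic: 398 - 223 = 175 and 1,320 - 223 = 1,097.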

    Examples of candidate rankings. From the top: identified candidate region with its ID (V V2-IST, H husk candidate) and coordinates, DNase I hypersensitivity and H3K9ac enrichment signal intensities in V2-IST and husk tissues. In these examples, the DNase I hypersensitivity and H3K9ac enrichment signal differences do not correlate positively with each other as had been assumed

    Predicting putative target genes of enhancer candidates based on expression levels of closest genes

    Lastly, we examined whether our candidates could be linked to putative target genes. Multiple approaches have been reported using data on chromatin accessibility, transcript levels and/or histone modification patterns at both enhancers and genes, across different tissues or developmental time points [4, 51, 81, 82]. We assumed that enhancers regulate the expression of either their adjacent upstream or downstream gene, though it has been observed that other genes can be located between enhancers and their target genes in animals and plants [17, 83, 84, 85]. We correlated the defined tissue-specificity of candidate enhancers with the gene expression levels of the nearest flanking genes in both tissues. Only genes showing significant differential expression between V2-IST and husk tissue (Cuffdiff [86]) were considered as targets of tissue-specific enhancer candidates; for shared candidates, flanking genes that are expressed in both tissues were considered as potential target genes. If a flanking gene showed a significant difference in gene expression that matched the enhancer candidate specificity (e.g. higher gene expression in V2-IST for V2-IST candidates), then the candidate and the gene(s) were linked. With this method, 38 (22%) V2-IST-specific, 143 (13%) husk-specific and 101 (45%) shared enhancer candidates were linked to one putative target gene (Additional file 5: Dataset 4). We also identified 13 (7%) V2-IST-specific, 182 (17%) husk-specific and 103 (46%) shared candidates in which both flanking genes showed expression levels matching the features of the candidates. The other candidates could not be linked to a gene because either none of the flanking genes had a significant expression level difference in the expected direction for tissue-specific candidates (124 [71%] in V2-IST, 772 [70%] in husk) or, in the case of shared enhancer candidates, neither of the flanking genes was expressed in both tissues (19 [9%] candidates).
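
    The linking rule described above can be expressed as a small decision function. The sketch below is illustrative only: the `Gene` record, the function names and the "expressed means level above zero" threshold are assumptions, and the `significant` flag stands in for a differential-expression call such as the one Cuffdiff provides.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Gene:
    name: str
    expr_v2ist: float   # expression level in V2-IST (units assumed, e.g. RPKM)
    expr_husk: float    # expression level in husk
    significant: bool   # significant differential expression between the tissues


def matches(gene: Optional[Gene], specificity: str) -> bool:
    """Does a flanking gene's expression match the candidate's tissue-specificity?"""
    if gene is None:
        return False
    if specificity == "shared":
        # Assumption: "expressed" means an expression level above zero.
        return gene.expr_v2ist > 0 and gene.expr_husk > 0
    if not gene.significant:
        return False
    if specificity == "V2-IST":
        return gene.expr_v2ist > gene.expr_husk
    if specificity == "husk":
        return gene.expr_husk > gene.expr_v2ist
    raise ValueError(f"unknown specificity: {specificity}")


def link(specificity: str, upstream: Optional[Gene], downstream: Optional[Gene]) -> List[str]:
    """Return the names of the flanking genes (zero, one or two) linked to a candidate."""
    return [g.name for g in (upstream, downstream) if matches(g, specificity)]


up = Gene("geneA", expr_v2ist=10.0, expr_husk=1.0, significant=True)
down = Gene("geneB", expr_v2ist=1.0, expr_husk=10.0, significant=True)
linked = link("V2-IST", up, down)  # ["geneA"]
```

    Because only the two nearest flanking genes are considered, a candidate whose true target lies further away is linked to the wrong gene or to none under this rule.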

    Identification of three known enhancers in maize

    In maize, five well-characterised and putative enhancers were reported, namely the b1 hepta-repeat, the enhancers of tb1 and p1, and the putative enhancers DICE and Vgt1 that regulate the expression of the genes bx1 and ZmRAP2.7, respectively [11, 13, 14, 15, 23, 85, 87]. In our screen, we identified the confirmed and putative enhancers of b1, tb1 and bx1 (Fig. 7 and Additional file 1: Figure S7), although these enhancers were mostly identified and characterised in maize lines other than B73, which could have affected their functionality. For example, the b1 hepta-repeat enhancer has been identified for the B-I epiallele and consists of seven copies of an 853-bp sequence in tandem, while B73 only carries a single copy of this sequence (90% identity with consensus repeat sequence) [12]. In our dataset, b1 showed differential expression in the same direction as observed in the line in which the b1 repeat enhancer was discovered [23], already indicating there is some degree of conservation in the regulatory region. The tb1 enhancer was identified in the inbred line W22 [13, 14] and DICE was shown to be required for high bx1 expression in Mo17 [85]. The enhancer candidates for b1 and DICE were not linked with b1 and bx1, respectively, because their known target genes were not the closest flanking gene. We identified neither the p1 enhancer nor Vgt1. In the case of the p1 locus, high repetitiveness of the region rendered the enhancer unmappable. For Vgt1, a clear DHS was present but H3K9ac enrichment was not detected within the overlapping LUMR.

    Four H3K9ac-enriched enhancer candidate regions identified by ChIP-seq (candidate H108, the b1 and tb1 enhancers and DICE) were selected for validation with ChIP-quantitative polymerase chain reaction (qPCR). For each region, primer pairs were designed to amplify sequences located at the summit of the peak of the ChIP-seq H3K9ac-enriched region (P), its slope (S) and outside of the peak (O; no enrichment by ChIP-seq) (Additional file 1: Figure S14). Results confirmed the presence and absence of H3K9ac enrichment at the identified candidate regions and their flanking regions, respectively. The differential H3K9ac enrichment observed for candidate H108 and the b1 enhancer fits their expected husk tissue-specificity based on the ranking. DICE had a high and low ranking in V2-IST and husk, respectively. Accordingly, DICE showed higher H3K9ac enrichment levels in V2-IST than in husk. The tb1 enhancer showed H3K9ac enrichment in both V2-IST and husk. This is in accordance with what is observed for the pooled ChIP-seq data (Additional file 1: Figure S14C). Due to our stringent criteria, the tb1 enhancer was only called as a candidate in husk.

    To examine if H3K4me1 is indeed not enriched at enhancers as suggested by the results depicted in Fig. 1, enrichment for H3K4me1 was determined for the same regions as for H3K9ac enrichment (Additional file 1: Figure S14). Except for the enhancer of tb1, none of the analysed regions showed a clear H3K4me1 enrichment, confirming our previous observation and supporting the idea that H3K4me1 does not generally mark plant enhancers.



    Silencers

    A major challenge in biology is to understand how complex gene expression patterns are encoded in the genome. In metazoans, gene expression is regulated in a tissue/cell-type-specific manner predominantly via stretches of noncoding sequence referred to as cis regulatory modules (CRMs). CRMs contain one or more DNA binding sites for one or more sequence-specific, regulatory transcription factors that function to modulate the expression of target gene(s). CRMs that activate gene expression are typically referred to as enhancers, while those that repress gene expression are referred to as silencers. Transcriptional enhancers activate gene expression in a tissue-specific manner in development and also in adult cells in response to cellular or environmental stimuli. Like enhancers, silencers can function in a cell-type-specific manner. Indeed, silencers may play a crucial role in the specification of precise gene expression patterns, thus enabling the establishment of sharp expression domains, such as during development.

    Numerous genomic and computational studies have focused primarily on predicting and characterizing enhancers. In contrast to enhancers, silencers are much less well understood. In a recent study (Gisselbrecht et al., Molecular Cell, 2020), we developed a novel strategy, termed silencer-FACS-Seq (sFS), to screen hundreds of sequences for tissue-specific silencer activity in whole Drosophila embryos. Strikingly, almost all transcriptional silencers we identified were also active enhancers in other cellular contexts. A subset of these silencers form long-range contacts with promoters. Our results challenge the common practice of treating enhancers and silencers as separate classes of regulatory elements and suggest the possibility that thousands or more bifunctional CRMs remain to be discovered in Drosophila and 10^4 to 10^5 in human.

    We are developing approaches to identify and quantify the activities of tissue-specific silencers, to identify the chromatin signatures of silencers, and to elucidate the regulatory roles of silencer-associated (co-)repressors and DNA sequence motifs. We will focus on the developing embryonic mesoderm in Drosophila melanogaster as our model system. We anticipate that the features and chromatin signatures of silencers identified in this project will be evolutionarily conserved across metazoans, including human.

