How to get the most out of your 3C assay sequence

https://dovetailgenomics.com/wp-content/uploads/2020/10/Index-4.gif

Chromatin conformation capture (3C) assays have been used for ~20 years to elucidate complex folding of genomic sequence in three-dimensional space.

Workflow:

All 3C-based assays follow the same core workflow:

Chromatin fixation → Digestion → Ligation of proximal cut ends → Removal of cross-links

What type of data is generated?

The result is chimeric DNA that is reflective of DNA fragments that were close in physical space.
Auxiliary molecular biology processes can be overlaid on the core workflow to:

Detect specific chimeric constructs by qPCR (3C)
Amplify all interactions that occur with known loci (4C)
Convert all chimeric DNA into an NGS sequencing library (Hi-C)
Target many loci through hybrid capture-NGS (Capture-C or CHi-C)
Conduct chromatin immunoprecipitation on a protein of interest (HiChIP or PLAC-seq)

All of these approaches capture the 3D topology of the genome through the lens of the chimeric DNA molecule.

That’s great! So what’s the problem?

Historically, these data types, regardless of which chimeric DNA they focus on, have had one thing in common – chromatin fragmentation is performed using restriction enzymes (REs) that recognize and cut at specific motifs. RE adoption was largely a product of convenience, as they:

Provided an endpoint assay.
Acted as a primary-sequence touchstone to computationally assess the data.

While convenient, REs are sequence-dependent, and that dependence has three primary downsides:

Generation of highly variable chromatin fragment sizes.
Uneven read distribution with data stacked at restriction sites.
Limited contact matrix resolution preventing detection of finer chromatin features.
- In the published literature, contact matrix resolution tops out at ~1 kb.

1 kb matrices require an extraordinary amount of sequencing – 1.2-1.6 billion paired-end reads (yes, that’s billion with a B).
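The first downside, highly variable fragment sizes, can be illustrated with a small in silico digest. This is a minimal sketch under stated assumptions: the sequence is random rather than real genomic DNA, the motif GATC (the recognition site of 4-cutters such as DpnII and MboI) is used as an example, and the cut is placed at the start of each motif for simplicity. Real chromatin also differs in accessibility, which this toy model ignores.

```python
import random
import re
import statistics


def fragment_lengths(sequence: str, motif: str) -> list[int]:
    """Lengths of the fragments produced by cutting at the start of every motif occurrence."""
    cut_sites = [m.start() for m in re.finditer(motif, sequence)]
    boundaries = [0] + cut_sites + [len(sequence)]
    # Skip zero-length fragments (e.g. a motif at position 0).
    return [b - a for a, b in zip(boundaries, boundaries[1:]) if b > a]


# Assumption: a random 200 kb sequence stands in for genomic DNA.
random.seed(0)
genome = "".join(random.choice("ACGT") for _ in range(200_000))

lengths = fragment_lengths(genome, "GATC")
print("fragments:", len(lengths))
print("shortest:", min(lengths))
print("median:", statistics.median(lengths))
print("longest:", max(lengths))
```

The spread between the shortest and longest fragment is the point of the exercise: motif spacing is dictated by sequence, so fragment size is not under the experimenter's control.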

How can you overcome these downsides?

Micro-C, a Hi-C approach using micrococcal nuclease (MNase) in place of REs, offers a solution.

For those of you not familiar, MNase is an endo/exonuclease that cuts at nucleosome-free regions (linker DNA) and then digests the DNA back to the nucleosome. The resulting fragments are short and consistently sized – roughly the length of DNA wrapped around a nucleosome (i.e., 146 bp). Following proximity ligation and cross-link reversal, the resulting chimeric DNA reveals the positioning of nucleosomes in physical space. Moreover, the short and consistent fragment sizes do not require shearing, thereby offering two key advantages:

Reduced hands-on time.
Elimination of self-ligation events from the final library, which increases signal-to-noise and makes better use of the sequence space.

Thus, replacing REs with MNase results in the enrichment of nucleosome-protected sequence and depletion of linker DNA (Figure 1).

https://dovetailgenomics.com/wp-content/uploads/2020/10/Fig1-1.png

Figure 1. Comparison of chromatin fragment size, possible ligation events, and coverage per fragment between Micro-C (blue) and RE-based Hi-C (green).

The result is the stacking of sequence reads over nucleosomes. The consistently sized, small fragments enable better read support per fragment and more efficient use of generated sequence data. The net effect is an ultra-high-resolution nucleosome position map that reaches the theoretical 3D resolution maximum (Davies et al., 2017).

So why should I care?

Nucleosomes are the fundamental building block of chromatin. Early Micro-C studies are detecting novel features reflective of chromatin dynamics (Figure 2).

https://dovetailgenomics.com/wp-content/uploads/2020/10/Fig2png.png

Figure 2. Discover the anatomy of a TAD. A) Micro-C’s nucleosome resolution enables the detection and description of chromatin dynamics at the sub-TAD scale, including enhancer-promoter (E-P) and promoter-promoter (P-P) interactions and loop extrusion features. Figure modified from Chang and Noordermeer, 2020. B) Chromatin loops identified in a Micro-C library sequenced to 800M read pairs from GM12878 compared to loops detected in Rao et al., 2014. C) Conformation features detectable by Micro-C and Hi-C, highlighting Micro-C’s ability to capture features below the sub-TAD scale.

These studies are providing an enhanced perspective on enhancer-promoter and promoter-promoter interactions that occur within topologically associating domains (TADs) and CTCF-mediated looping events (Hsieh et al., 2020). In addition, they are offering new insight into loop extrusion dynamics.

While loop extrusion stripes (formed when DNA is pulled through a cohesin ring and leaves contact streaks along distal DNA already anchored at the cohesin via CTCF) have been observed in RE-based Hi-C contact matrices, Micro-C is capturing loop extrusion at a level of detail that has not been previously observed (Krietenstein et al., 2020). For example, extrusion stripes in Micro-C data are flecked with punctate hot spots of increased contact frequency, and these features are at least five times more prevalent than the chromatin loops themselves. While the significance of these new observations is not yet clear, the rate of their occurrence makes this an area ripe for discovery.

But “Micro-C is going to require even more sequence if I’m increasing the resolution of my contact matrix!” I hear you say. There is some truth to that objection, but there is still significant benefit to Micro-C even at the resolutions you may currently be exploring.

As noted earlier in this article, RE-based Hi-C requires 1.2 – 1.6 billion paired-end reads to generate an informative contact matrix at 1 kb. Nucleosomes occur more frequently in chromatin than accessible restriction sites, so for any bin size the sequence data is more evenly distributed across bins. This coverage distribution results in almost every 1 kb bin in Micro-C data being informative. As a result, Micro-C reduces the sequence depth needed to build a 1 kb matrix to only 800 M paired-end reads (80X coverage). In our experience, with these parameters, Micro-C captures ~56,000 chromatin loops compared to only 7,000-9,000 loops identified with RE-based Hi-C (Figure 2).
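The depth figures above can be sanity-checked with some back-of-the-envelope arithmetic. The sketch below assumes a ~3.1 Gb human genome and 2 x 150 bp paired-end reads; neither value is stated in this article, and the function names are purely illustrative. Under those assumptions, 800 M read pairs works out to roughly 77X average coverage, in line with the ~80X quoted, and the reduction in read count relative to 1.2 – 1.6 billion pairs is roughly a third to a half.

```python
# Assumptions (not from the article): approximate human genome size and read length.
GENOME_SIZE_BP = 3.1e9
READ_LENGTH_BP = 150


def fold_coverage(read_pairs, read_length_bp=READ_LENGTH_BP, genome_size_bp=GENOME_SIZE_BP):
    """Average sequencing depth: two reads per pair, each read_length_bp long."""
    return read_pairs * 2 * read_length_bp / genome_size_bp


def read_ends_per_bin(read_pairs, bin_size_bp, genome_size_bp=GENOME_SIZE_BP):
    """Average number of read ends per bin if reads were spread perfectly evenly."""
    n_bins = genome_size_bp / bin_size_bp
    return read_pairs * 2 / n_bins


def read_reduction(baseline_pairs, new_pairs):
    """Fractional reduction in read pairs relative to a baseline."""
    return 1 - new_pairs / baseline_pairs


MICRO_C_PAIRS = 800e6  # read pairs quoted for a 1 kb Micro-C matrix

print(f"Micro-C coverage: {fold_coverage(MICRO_C_PAIRS):.1f}X")
print(f"Read ends per 1 kb bin: {read_ends_per_bin(MICRO_C_PAIRS, 1_000):.0f}")
for hi_c_pairs in (1.2e9, 1.6e9):  # range quoted for RE-based Hi-C
    print(f"Reduction vs {hi_c_pairs:.1e} pairs: {read_reduction(hi_c_pairs, MICRO_C_PAIRS):.0%}")
```

The per-bin number is an average under perfectly even coverage, which is exactly what RE-based data does not deliver; the evenness of the distribution matters as much as the total.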

Not only are Micro-C contact matrices more informative at 1 kb resolution, but they are also more data-rich (Figure 3).

https://dovetailgenomics.com/wp-content/uploads/2020/10/Fig3.png

Figure 3. Micro-C bins have superior coverage and are more informative than Hi-C bins, even at 1 kb. Coverage tracks from Micro-C (blue) and RE-based Hi-C (orange), displayed over the same region, with the coverage per bin and usability of each bin shown at 200 bp and 1 kb. The coverage gaps observed in the Hi-C data leave some bins either unusable or with reduced coverage compared to Micro-C.
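The idea of bin usability can be sketched in a few lines. This is a toy illustration, not the analysis behind Figure 3: the read positions are made up, and the minimum-read threshold that makes a bin "usable" is a hypothetical parameter chosen for the example. Evenly spaced reads stand in for nucleosome-like coverage, and reads stacked at a few sites stand in for restriction-site-like coverage.

```python
def bin_counts(positions, bin_size, region_start, region_end):
    """Count read positions falling in each fixed-width bin of a region."""
    n_bins = -(-(region_end - region_start) // bin_size)  # ceiling division
    counts = [0] * n_bins
    for pos in positions:
        if region_start <= pos < region_end:
            counts[(pos - region_start) // bin_size] += 1
    return counts


def usable_fraction(counts, min_reads):
    """Fraction of bins holding at least min_reads reads (threshold is hypothetical)."""
    return sum(c >= min_reads for c in counts) / len(counts)


# Toy data, 100 reads each over a 10 kb region (both sets are invented).
even_reads = list(range(0, 10_000, 100))                     # one read every 100 bp
stacked_reads = [500] * 40 + [4_200] * 40 + [9_100] * 20     # piled up at three sites

for bin_size, min_reads in ((1_000, 5), (200, 2)):
    even = usable_fraction(bin_counts(even_reads, bin_size, 0, 10_000), min_reads)
    stacked = usable_fraction(bin_counts(stacked_reads, bin_size, 0, 10_000), min_reads)
    print(f"{bin_size} bp bins: even={even:.2f}, stacked={stacked:.2f}")
```

Both toy data sets contain the same number of reads; only their distribution differs, which is what drives the gap in usable bins.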

Many bins in traditional Hi-C still lack sufficient coverage to confidently capture conformation features. The outcome of using Micro-C to build a 1 kb contact matrix is sequencing cost savings of 30-60%, AND the ability to detect more chromatin features.

In conclusion, whether or not you are ready to push resolution boundaries, Micro-C enables significantly increased potential for discovery of chromatin features and dynamics through decreased sequencing cost and the ability to capture chromatin dynamics at an entirely new scale. What’s there not to like about that?
