What makes Region-Specific Extraction (RSE) unique?
- RSE is able to selectively isolate genomic target segments that are typically an order of magnitude larger (20 kb) than fragments produced by other target enrichment methods. Long DNA sequence reads can greatly simplify the assembly of complex genomes, in particular if the target region contains difficult sequence elements.
- RSE requires only a small number of short (20-25 base) capture primers, typically spaced at distances of 3-8 kb, to pull down an extended region of interest. This makes it possible to enrich a region of interest even from potentially highly variable samples by creating a standard capture primer set based on known, conserved sequence elements.
- RSE allows the identification of regions with unknown sequence around each capture point. It can therefore generate accurate and complete sequence information even for complex regions of interest for which no reliable reference genomes exist. Other target enrichment methods are unable to capture a broad genomic context.
What amount of DNA do you require for each sample to ensure an efficient isolation of the targeted area?
600 ng – 6 µg per extraction are recommended depending on the project. DNA concentration should be at least 50 ng/µl.
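As a quick planning aid, the recommended input mass and minimum concentration above translate into a maximum sample volume per extraction. The following is a minimal sketch of that arithmetic only; the function name is hypothetical and not part of any RSE software.

```python
def max_input_volume_ul(mass_ng: float, min_conc_ng_per_ul: float = 50.0) -> float:
    """Largest volume (µl) that holds mass_ng at or above the minimum concentration."""
    if mass_ng <= 0 or min_conc_ng_per_ul <= 0:
        raise ValueError("mass and concentration must be positive")
    return mass_ng / min_conc_ng_per_ul


# Lower and upper ends of the recommended input range (600 ng and 6 µg).
low_volume_ul = max_input_volume_ul(600)
high_volume_ul = max_input_volume_ul(6000)
print(low_volume_ul, high_volume_ul)
```

A sample that is more concentrated than 50 ng/µl simply needs less volume for the same mass.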
Which protocol should I use to isolate the DNA?
Column- or bead-based methods that provide ‘high quality’ DNA (i.e. clean, reasonably long for the intended downstream application and not significantly entangled or nicked) are recommended. It is not necessary to shear the isolated DNA prior to use in RSE, although shearing can be performed if desired in order to limit the linkage distance of each capture primer. Typical commercial methods for DNA extraction from blood, tissue, cell lines etc. work well and are able to produce captured fragment lengths after RSE that are significantly greater than those obtained with other enrichment methods.
If the maximum available linkage distance is desired from any single capture point during RSE / HSE, use freshly prepared DNA from freshly collected biospecimens that were properly stored and handled to avoid DNA degradation. Genomic DNA that is freshly isolated from older specimens, in particular if from blood, often suffers from significant damage and degradation that affects the achievable linkage distance and on- vs. off-target ratio during RSE.
The presence of hemoglobin-based iron as well as other factors gradually damages and degrades nucleic acids in biospecimens even if they are kept frozen at -20 °C or -80 °C and stored in common collection buffers. In order to preserve the highest molecular weight DNA that is available from a biospecimen for later extraction, the sample should be stored at liquid nitrogen temperatures soon after collection. This turns the sample into a glassy state that prevents any molecular reactions.
DNA that has been extracted from blood, tissue, cell lines etc. should be kept in solution or frozen at all times to avoid precipitation and entanglement, which increases off-target background. Ethanol precipitation methods are acceptable as long as the isolated DNA is never allowed to fully dry out. The use of reconstituted DNA that was previously dried or lyophilized is not recommended. Some exceptions apply for forensic applications. Please contact us for further details.
What length of DNA segments can you pull down?
Depending on the quality of the input DNA, RSE generates a large proportion of enriched segments that can produce long read-length sequences across the targeted region of interest. Without prior fragmentation, most of the DNA after whole genome amplification (WGA) and RSE will be between 5 kb and 20 kb, with a portion over 40 kb. Forensic applications have reached linkage distances of 50 kb in each direction from a single capture point.
If desired, the genomic DNA can be subjected to a defined restriction enzyme digest prior to RSE in order to create defined sequence boundaries, limit the overall linkage distance and thereby eliminate off-target material from more distant loci. It is also possible to carry out a size-selection step prior to RSE (such as through the use of AMPure® beads) to further eliminate any undesirable DNA template that does not correspond to the intended target size.
What is the capture efficiency?
RSE achieves a capture efficiency of 20-30% per targeted locus, even for very large DNA target segments, based on proprietary active magnetic mixing. Capture efficiency is defined here as the copy number of a locus enriched by RSE relative to the copy number of this locus present in the input DNA. The material obtained after RSE is then typically amplified by whole genome amplification (WGA) in order to generate sufficient material for library preparation and sequencing.
For some applications, such as the use of RSE-enriched material on DNA arrays, no further amplification is required. We are working with providers of newer next-generation sequencing platforms that will be able to directly process very large DNA segments as produced by RSE in order to generate greatly increased DNA reads across a target region. For these instruments the intermediate WGA step, which is required for current NGS platforms, will no longer be needed.
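The capture efficiency definition given above (copies of a locus recovered after RSE divided by copies of that locus in the input DNA) can be written as a one-line calculation. This is a minimal sketch; the copy numbers are made-up illustration values (for example, as they might come from a qPCR assay), not measured data.

```python
def capture_efficiency(enriched_copies: float, input_copies: float) -> float:
    """Fraction of the input copies of a locus that is recovered after capture."""
    if input_copies <= 0:
        raise ValueError("input_copies must be positive")
    return enriched_copies / input_copies


# Hypothetical copy numbers for one targeted locus.
efficiency = capture_efficiency(enriched_copies=25_000, input_copies=100_000)
print(f"{efficiency:.0%}")
```

The measurement must be taken before any WGA step, since amplification changes the copy number of the enriched material.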
What is the specificity?
The enzymatic primer extension step provides dual specificity based on the condition that each capture primer has to both 1) hybridize to the corresponding target sequence and 2) have a matching 3’-end in order for it to get extended.
Due to the ability of RSE to pull down very large segments of the original template DNA, the amount of off-target material is high when compared with hybridization-based enrichment methods. Depending on the target region size, homology of target versus non-targeted regions and the duration of the WGA amplification step, the amount of off-target material is typically over 50% and in some cases has been as high as 90%.
This is in part due to the extension-based capture of very large fragments, which can lead to cross-hybridization of captured regions with other, non-targeted chromosomal segments that contain similar sequence, such as repeats, and the subsequent WGA amplification step, which can further introduce non-target sequences.
We therefore recommend periodically checking the amount of DNA generated during the WGA step via a fluorescence-based method (e.g. Qubit or NanoDrop 3300) and strictly limiting the WGA to the amount necessary for library preparation.
How does the distance of the primer extension step during RSE relate to read length?
The length of the extension itself is irrelevant for the linkage distances that can be achieved from each capture point (provided the template DNA has sufficient length). The biotinylated nucleotides that are incorporated during the extension step serve only as the handle for the pull-down of the original DNA template.
Each primer isolates the original DNA template in both directions – upstream and downstream – from the capture point. The read length and uniformity of sequence coverage achieved are dependent on the capture primer locations, the amplification and library preparation steps performed after RSE and the selected NGS platform.
How many capture primers are required per region?
RSE capture primers are typically spaced every 3-8 kb. This provides redundant capture across the region of interest even in cases where a capture primer may fail to perform optimally due to any unexpected issues, such as the presence of polymorphisms or other sequence variants at the capture point, DNA damage, secondary structure, or residual protein content. For low molecular weight DNA the distance between primers should be reduced.
The recommended average RSE primer spacing varies between 1-20 kb depending on the application and the overall number of primers that are used in one extraction:
- For small target regions (50-250 kb) and a low number of capture primers (10-50), we recommend an average distance between neighboring primers of 3-5 kb.
- For extended target regions (1-5 Mb) requiring a large number of capture primers (300+), the average distance between neighboring primers should be increased to 8-10 kb. This helps ensure adequate capture across the target region by retaining a sufficient capture primer concentration with a low risk for primer dimer formation.
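The spacing guidance above implies a rough primer count for a given target size. The sketch below simply divides region length by average spacing and rounds up. It is an estimate for planning purposes under that assumption, not the actual design procedure, and the function name is hypothetical.

```python
import math


def estimate_primer_count(region_bp: int, spacing_bp: int) -> int:
    """Approximate number of capture primers for a region at a given average spacing."""
    if region_bp <= 0 or spacing_bp <= 0:
        raise ValueError("region and spacing must be positive")
    return math.ceil(region_bp / spacing_bp)


# Small region: 250 kb at 5 kb average spacing.
small_region_primers = estimate_primer_count(250_000, 5_000)
# Extended region: 5 Mb at 10 kb average spacing.
large_region_primers = estimate_primer_count(5_000_000, 10_000)
print(small_region_primers, large_region_primers)
```

Actual primer positions depend on where unique sequence is available, so the real count and spacing will deviate from this even-spacing estimate.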
For the typical design of a primer set, the region of interest is first repeat-masked to identify unique sequence elements across the target region that can serve as capture points. These sequences should then be reviewed for the possible presence of known mutations, which can either be avoided, incorporated or exploited during capture primer design.
Can I pull down a region that contains unknown sequence?
Yes, as long as it contains or is flanked by known sequence elements that can be used to design unique capture primers (typically 15-25 bases are sufficient). For example, a 300 kb contiguous region that includes unknown sections of sequence can reliably be captured using about 40 capture points when working with input DNA of sufficiently high quality. Linkage distances of up to 50 kb in both directions from each capture point have been achieved in forensic applications.
Our proprietary primer design process is able to generate highly specific capture primer sets even for extremely difficult genomic regions that do not allow for the placement of reliable primer sets with conventional repeat masking procedures.
Due to the enzymatic nature of the RSE capture process, even a very restrictive target sequence can in all likelihood be utilized to design a successful capture primer at this position:
- The 5’-end of a primer can be allowed to partly overlap repetitive sequence as long as its 3’-end is unique.
- The presence of a known polymorphism under the primer can be accepted by designing primers for both variants.
- If an allelic discrimination is desired, a known polymorphism can be positioned at the 3’-end of a capture primer (= HSE). In this case the enzymatic biotin labeling step and subsequent capture will only occur for primers whose 3’-ends match the targeted allele but not for variants that create a 3’-mismatch with the primers.
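The 3’-end rule described in the list above can be illustrated with a deliberately simplified toy model. It assumes that a primer is extended only when its 3’-terminal base matches the target and that a small number of internal mismatches is tolerated. The tolerance value, the function name and the sequences are hypothetical; real behavior depends on the polymerase and reaction conditions.

```python
def would_be_extended(primer: str, target_site: str, max_internal_mismatches: int = 1) -> bool:
    """Toy model: extension requires a matching 3' base plus adequate hybridization.

    Both sequences are written 5'->3' in the sense of the primer, so a perfect
    target site is identical to the primer.
    """
    if len(primer) != len(target_site):
        raise ValueError("primer and target site must have equal length")
    primer, target_site = primer.upper(), target_site.upper()
    three_prime_matches = primer[-1] == target_site[-1]
    internal_mismatches = sum(p != t for p, t in zip(primer[:-1], target_site[:-1]))
    return three_prime_matches and internal_mismatches <= max_internal_mismatches


# Two hypothetical alleles that differ only at the last base of the primer site.
allele_a_site = "ACGTTGCAAGCTTGACCTGA"
allele_b_site = "ACGTTGCAAGCTTGACCTGG"
hse_primer = "ACGTTGCAAGCTTGACCTGA"  # 3' end matches allele A

extended_on_a = would_be_extended(hse_primer, allele_a_site)
extended_on_b = would_be_extended(hse_primer, allele_b_site)
print(extended_on_a, extended_on_b)
```

In this model only templates carrying allele A would be biotin-labeled and captured, which is the basis of the allelic discrimination described above.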
Does it matter in which direction the capture primers are oriented?
No. The orientation of the primers does not affect the linkage distance in either direction from the capture point.
Some customers prefer to orient the direction of the primers so that the enzymatic extension occurs into the region that is of particular interest to the user. The presence of the extended, biotinylated strand does not appear to interfere with any downstream assay, such as NGS or conventional Sanger sequencing, DNA arrays, or PCR / qPCR-based assays.
Can I design my own capture primers?
Yes. Existing primers and validated primer sets, such as those used for long-range PCR or qPCR, can often be used directly for RSE or as a starting point for the design of optimized RSE primers.
RSE capture primers are typically 15-25 bases long and designed to target unique sequence elements that distinguish the region of interest from the rest of the genomic material that is present in the sample. They should have a melting temperature of approximately 58°C and a GC content of no more than 50%. It is advantageous to avoid GC-rich regions for increased capture efficiency. The RSE capture primers contain no biotin and are used at an equimolar ratio for a combined (total) concentration of 100 µM.
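The basic design criteria above (length, GC content, melting temperature, equimolar pooling) lend themselves to a simple screening script. The sketch below is illustrative only. The melting temperature uses a common rule-of-thumb approximation, Tm = 64.9 + 41 × (GC count − 16.4) / length, which is an assumption here and not necessarily the method used for RSE primer design. The function names and the example sequence are hypothetical.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    seq = primer.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def estimate_tm_celsius(primer: str) -> float:
    """Rough Tm estimate (rule-of-thumb formula; an assumption, see lead-in)."""
    seq = primer.upper()
    gc_count = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc_count - 16.4) / len(seq)


def passes_basic_checks(primer: str, min_len: int = 15, max_len: int = 25,
                        max_gc: float = 0.50) -> bool:
    """Check the length range and the GC ceiling stated in the text."""
    return min_len <= len(primer) <= max_len and gc_fraction(primer) <= max_gc


def per_primer_concentration_um(n_primers: int, total_um: float = 100.0) -> float:
    """Equimolar share of the combined primer concentration, in µM."""
    if n_primers <= 0:
        raise ValueError("n_primers must be positive")
    return total_um / n_primers


example_primer = "ATGCATTAGCTAAGTCATGA"  # hypothetical 20-mer
print(passes_basic_checks(example_primer), round(gc_fraction(example_primer), 2))
print(round(estimate_tm_celsius(example_primer), 2))
print(per_primer_concentration_um(40))
```

Dedicated nearest-neighbor Tm calculators give more reliable values than this approximation and are preferable when selecting primers close to the 58°C target.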
RSE primers can be placed on any target strand (+/-) in any direction / orientation that is most convenient, although self-complementary capture primers that are targeting the same sequence via opposite strands should obviously be avoided. A pairwise bioinformatic analysis of all primers should be conducted to eliminate possible primer dimers that might result in reduced capture. RSE is very robust against the effects of possible primer dimer formation because, unlike in PCR, only a single extension step is required for capture and consequently no self-amplifying product is created.
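A pairwise primer-dimer screen of the kind mentioned above can be sketched as follows. This simplified check flags a pair when the last k bases at the 3’-end of one primer are perfectly complementary to a stretch of the other primer (or of itself). The threshold k = 5 and all sequences are hypothetical, and dedicated tools use thermodynamic models rather than exact matching.

```python
from itertools import combinations_with_replacement

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def three_prime_can_pair(primer_a: str, primer_b: str, k: int = 5) -> bool:
    """True if the last k bases of primer_a can base-pair somewhere within primer_b."""
    tail = primer_a.upper()[-k:]
    return reverse_complement(tail) in primer_b.upper()


def flag_dimer_pairs(primers: dict[str, str], k: int = 5) -> list[tuple[str, str]]:
    """Return name pairs (including self-pairs) with a potential 3'-end dimer."""
    flagged = []
    for (name_a, seq_a), (name_b, seq_b) in combinations_with_replacement(primers.items(), 2):
        if three_prime_can_pair(seq_a, seq_b, k) or three_prime_can_pair(seq_b, seq_a, k):
            flagged.append((name_a, name_b))
    return flagged


# Hypothetical primer set; P1's 3' end is complementary to a stretch of P2.
primer_set = {
    "P1": "TTGACCTAAGTCAGTACGGA",
    "P2": "CATTCCGTAAGATTGACCAT",
    "P3": "GGCTTAACGTTAGCAATCCA",
}
flagged_pairs = flag_dimer_pairs(primer_set)
print(flagged_pairs)
```

Flagged pairs are candidates for redesign or for moving one primer to a neighboring unique site.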
A patented variation of RSE called region-specific amplification (RSA) uses a modified protocol to isothermally generate amplified products across the region of interest. The primer design in that case requires more diligence than for RSE.
Allele-specific PCR primers can likewise be used as allele- or PSV-specific primers during haplotype-specific extraction (HSE). Generally it is best to place a mismatch in order to select between two heterozygous alleles at the second-to-last position at the 3’-end of a capture primer. Additional primer design considerations apply to HSE that can greatly increase the degree of separation and overall capture efficiency, in particular when separating single-base differences in sequence (i.e. SNPs or PSVs).
We provide assistance in custom primer design for RSE and HSE if you need help.
Specific validated capture primer sets are available for common regions of interest.