High-Sensitivity NGS Detection of Tumor Mutations: Challenges, Advances, and Applications
Next-generation sequencing (NGS) technologies have come of age as preferred technologies for screening of genomic variants of pathologic and therapeutic potential. Because of their capability for high-throughput and massively parallel sequencing, they can screen for a variety of genomic changes in multiple samples simultaneously. This has made them platforms of choice for clinical testing of solid tumors and hematological malignancies. Consequently, they are increasingly replacing conventional technologies, such as Sanger sequencing and pyrosequencing, expression arrays, real-time PCR, and fluorescence in situ hybridization methods, for routine molecular testing of tumors.
However, one limitation of routinely used NGS technologies is their inability to detect low-level genomic variants with high accuracy. This can be attributed to the frequent occurrence of low-level sequencing errors and artifacts in the NGS workflow, which need specialized approaches to be identified and eliminated.
LOD of Routine Clinical NGS Platforms
In NGS, the limit of detection (LOD) can be defined as the lowest allelic frequency at which a variant is reliably detected in 95% of the samples tested. The LOD can be influenced by several factors in the workflow. Consequently, it is not surprising that the reported LODs for a variety of NGS assays, including limited gene panels, whole exome sequencing, and whole genome sequencing, range from 2% to 15% variant allelic frequency (VAF).
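The link between sequencing depth and a 95% detection rate can be illustrated with a simple binomial sampling model. This is only a sketch under stated assumptions: reads are treated as independent draws, sequencing errors are ignored, and the function names and the `min_alt_reads` threshold are hypothetical, not taken from the review.

```python
from math import comb


def detection_probability(depth: int, vaf: float, min_alt_reads: int) -> float:
    """Probability of observing at least `min_alt_reads` variant reads
    at a site covered by `depth` reads, for a true allelic fraction `vaf`."""
    # P(X >= k) = 1 - sum over i < k of C(n, i) * p^i * (1 - p)^(n - i)
    p_below = sum(
        comb(depth, i) * vaf**i * (1 - vaf) ** (depth - i)
        for i in range(min_alt_reads)
    )
    return 1.0 - p_below


def min_depth_for_lod(vaf: float, min_alt_reads: int, target: float = 0.95) -> int:
    """Smallest depth at which the detection probability reaches `target`."""
    if not 0.0 < vaf <= 1.0:
        raise ValueError("vaf must be in (0, 1]")
    depth = min_alt_reads
    while detection_probability(depth, vaf, min_alt_reads) < target:
        depth += 1
    return depth


if __name__ == "__main__":
    print(detection_probability(100, 0.05, 1))
    print(min_depth_for_lod(0.05, 1))
```

In this model the required depth grows quickly as the VAF falls or as more supporting reads are demanded, which is one reason sampling alone limits the LOD before any error source is considered.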
Cancer is predominantly driven by somatic mutations, which can occur at any level in a sample because of tumor heterogeneity caused by co-existence of multiple mutations in the tumor cell or occurrence of multiple cancerous clones in the same tumor. Consequently, it is challenging to define an LOD as adequate for genomic screening of tumors. Considering that somatic mutations can manifest at any given level in a tumor sample, achieving progressively higher detection sensitivity (or lower LODs) will be crucial to improve tumor mutation screening. Although current NGS platforms routinely identify low-level mutations (<1% VAF), most are not equipped to distinguish them from spurious, low-level sequencing artifacts (false positives) that originate from various steps of NGS workflow.
Origins and Impact of Errors in NGS Workflow
Factors that influence LOD
Sample and DNA Processing Artifacts
Most solid organ tumor and bone marrow biopsy samples are formalin fixed and paraffin embedded (FFPE). However, formalin fixation has detrimental effects on the integrity, extraction efficiency, and amplifiability of nucleic acids.
Furthermore, several parameters of the formalin fixation and nucleic acid extraction processes have been identified that adversely affect the quality of nucleic acids, and strategies to minimize their effects have been proposed.
Another routinely used method for preparation of DNA for NGS is sonication (acoustic shearing), which is used to generate DNA fragments of required size ranges. This process also generates low-level 8-oxoguanine lesions in DNA because of oxidation, resulting in G:C>T:A artifacts that compromise the accuracy and confidence of low-level variant detection.
PCR Amplification and Polymerase Fidelity
Although high-fidelity polymerases with error rates of about one per million base pairs (10^-6) are routinely used in the NGS workflow, they still introduce errors; excessive PCR cycles and low-quality DNA carrying processing-induced chemical modifications compound the issue, resulting in sequencing artifacts.
In addition, polymerase-induced errors during clonal amplification of DNA strands (performed to isolate and amplify each DNA library strand for sequencing) and during the actual sequencing by synthesis contribute further errors.
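A rough calculation shows why the number of copying events matters even with a high-fidelity polymerase. The sketch below assumes that errors at a given base are independent across copying events and that a molecule's lineage is copied at most once per cycle, so it gives an upper-bound style estimate. The function name is hypothetical.

```python
import math


def copy_error_probability(error_rate: float, copy_events: int) -> float:
    """Probability that a given base carries at least one polymerase error
    after `copy_events` independent copying events along a lineage."""
    # 1 - (1 - e)^n, computed with log1p/expm1 for accuracy at small e
    return -math.expm1(copy_events * math.log1p(-error_rate))


if __name__ == "__main__":
    for cycles in (10, 20, 30):
        p = copy_error_probability(1e-6, cycles)
        print(f"{cycles} copy events: {p * 100:.4f}% of lineages affected per base")
```

With an error rate of 10^-6 and 30 copying events, roughly 0.003% of lineages carry an error at any given base, a level that is negligible at a 2% LOD but becomes relevant as assays aim for much lower VAFs.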
Not surprisingly, several studies investigating the role of polymerase fidelity have established improved accuracy of NGS by incorporating higher-fidelity polymerases, albeit to varying extents, depending on the target sequence and applications investigated.
Furthermore, GC- and AT-rich genomic regions can be underrepresented in the PCR-dependent workflow, resulting in low sequencing coverage and compromised variant detection, especially of gene copy number alterations.
Sequencing
Although all NGS platforms perform massively parallel sequencing, the underlying sequencing technologies are distinct. This results in intrinsic variations in major aspects, such as sequencing run time, sequencing output, read length, and error rates.
Sequencing Data Analysis
This represents a critical step in NGS workflow, which includes digital processing of the voluminous sequencing information generated by the sequencers to obtain meaningful genomic sequences and detect variants. It encompasses consolidation of raw signal information from the sequencer, base calling, elimination of low-quality base calls and sequencing reads using preset quality parameters, and alignment of sequence information obtained to a reference sequence to identify potential sequence variants.
However, this process can also be a source of sequence artifacts, which can originate at any of the steps involved.
- One of the important steps is sequence alignment, where the presence of repetitive sequences and the occurrence of complex insertions and deletions can lead to misalignments, adversely affecting variant calling.
- Another important step in data processing is the filtering of variants, depending on the sequencing quality and the VAFs.
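The filtering step can be sketched as a set of threshold checks on each candidate variant. The thresholds, the `VariantCall` fields, and the variant names below are illustrative assumptions only, not values recommended by the review or by any specific pipeline.

```python
from dataclasses import dataclass


@dataclass
class VariantCall:
    name: str
    alt_reads: int            # reads supporting the variant
    total_reads: int          # total reads covering the site
    mean_base_quality: float  # Phred-scaled quality of the variant bases

    @property
    def vaf(self) -> float:
        return self.alt_reads / self.total_reads if self.total_reads else 0.0


def passes_filters(
    call: VariantCall,
    min_vaf: float = 0.02,           # assumed cutoff (2% VAF)
    min_base_quality: float = 30.0,  # assumed cutoff
    min_depth: int = 250,            # assumed cutoff
) -> bool:
    """Keep a call only if depth, base quality, and VAF all clear their cutoffs."""
    return (
        call.total_reads >= min_depth
        and call.mean_base_quality >= min_base_quality
        and call.vaf >= min_vaf
    )


calls = [
    VariantCall("variant_A", 60, 1000, 36.0),  # 6% VAF, good quality
    VariantCall("variant_B", 8, 1000, 35.0),   # 0.8% VAF, good quality
    VariantCall("variant_C", 50, 1000, 18.0),  # 5% VAF, poor quality
]
kept = [c.name for c in calls if passes_filters(c)]
print(kept)
```

Note that `variant_B` is discarded whether or not it is real: a fixed VAF cutoff suppresses artifacts and true low-level mutations alike, which is the limitation the error-correction approaches try to address.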
Enhancing Reliability of Low-Level Variant Detection Capability of NGS
Template Tagging
Template tagging is an approach in which each template DNA molecule is tagged with a unique synthetic DNA sequence (a unique identifier, UID), which is then used to trace each detected variant back to its strand of origin.
Different studies using UIDs in NGS have reported high-confidence LODs, ranging from 0.01% to 1% VAF, which represents a significant improvement (by at least two orders of magnitude) in comparison to conventional NGS.
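The idea behind UID-based error correction can be sketched as follows: reads sharing a UID are grouped into a family, and a base is accepted only if most reads in the family agree on it. This is a minimal sketch, assuming reads in a family are already aligned and of equal length. The function name and the `min_family_size` and `min_agreement` parameters are hypothetical.

```python
from collections import Counter, defaultdict


def uid_consensus(reads, min_family_size=3, min_agreement=0.9):
    """Collapse (uid, sequence) reads into one consensus sequence per UID family.

    Families with fewer than `min_family_size` reads are dropped. Positions
    where the most common base falls below `min_agreement` are masked as 'N'.
    """
    families = defaultdict(list)
    for uid, seq in reads:
        families[uid].append(seq)

    consensus = {}
    for uid, seqs in families.items():
        if len(seqs) < min_family_size:
            continue
        bases = []
        for column in zip(*seqs):  # one column per aligned position
            base, count = Counter(column).most_common(1)[0]
            bases.append(base if count / len(seqs) >= min_agreement else "N")
        consensus[uid] = "".join(bases)
    return consensus


reads = [
    ("AAC", "ACGT"), ("AAC", "ACGT"), ("AAC", "ACTT"),  # one read carries an error
    ("GGT", "ACTT"), ("GGT", "ACTT"), ("GGT", "ACTT"),  # variant present in every read
    ("TTA", "ACGA"),                                    # family too small, dropped
]
print(uid_consensus(reads, min_family_size=3, min_agreement=0.6))
```

An error present in only one read of a family is outvoted, whereas a variant present in the original template appears in every read of its family. The need for several reads per family is also why this approach demands high sequencing depth.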
Duplex Sequencing
This is an improved version of the UID tagging approach, in which the DNA template is tagged with two distinct, random UIDs, one on either side, so that both strands of the DNA (+ and - strands) are tagged. As a result, the mutations detected can be traced back to each individual strand of the duplex DNA template after paired-end sequencing.
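The strand-level check in duplex sequencing can be sketched as a comparison of the two single-strand consensus sequences from the same original molecule. The sketch assumes both consensuses have already been built and expressed in the same reference orientation. The function name is hypothetical.

```python
def duplex_consensus(plus_consensus: str, minus_consensus: str) -> str:
    """Keep a base only where the + and - strand consensuses agree.

    Disagreeing or already masked positions become 'N'.
    """
    if len(plus_consensus) != len(minus_consensus):
        raise ValueError("strand consensuses must have the same length")
    return "".join(
        a if a == b and a != "N" else "N"
        for a, b in zip(plus_consensus, minus_consensus)
    )


if __name__ == "__main__":
    print(duplex_consensus("ACTT", "ACTT"))  # variant supported by both strands
    print(duplex_consensus("ACTT", "ACGT"))  # change seen on one strand only
```

A change that arises on one strand only, such as a damage-induced G>T artifact, fails this agreement test, whereas a true mutation is expected to be present on both strands.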
Non-UID Approaches
The use of UIDs comes with some drawbacks:
- High sequencing depth is required to generate enough redundant reads for low-level mutations to be represented in an adequate number of UID families. This increases the cost of sequencing.
- The UID tags can undergo off-target interactions among themselves and with the target sequences.
- Specialized analysis pipelines are needed for filtering and variant calling using the UID information.
To circumvent these drawbacks, some recent advances have used non-UID methods that can identify low-level variants with accuracy.
- Duplex proximity sequencing physically links copies of every template DNA strand so that both strands of the duplex are sequenced as a single cluster, and a mutation detected in the cluster can be traced back to both DNA strands (+ and -) without the need for duplex sequencing with UIDs.
- In another approach, after NGS, the reads that carry low-level mutations are physically and selectively isolated and amplified, and their sequence is confirmed by either NGS or Sanger sequencing, thus avoiding the need to sequence the wild-type background.
- A recent study has demonstrated the use of technical replicates, starting from the level of library preparation, as a way to accurately detect low-level mutations without UID tagging.
Although these methods represent significant improvements in non-UID options, their applicability in routine clinical laboratory workflow and performance to the expected standards of a clinical test are yet to be demonstrated.
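The technical-replicate idea can be illustrated with a simple concordance rule: a variant is kept only if it is called independently in enough replicate libraries. This is a generic sketch with a hypothetical function name and placeholder variant labels, not the procedure used in the study mentioned above.

```python
from collections import Counter


def replicate_consensus(replicate_calls, min_replicates=None):
    """Return the variants called in at least `min_replicates` replicates.

    By default a variant must be present in every replicate.
    """
    required = len(replicate_calls) if min_replicates is None else min_replicates
    counts = Counter(v for calls in replicate_calls for v in set(calls))
    return {variant for variant, n in counts.items() if n >= required}


replicates = [
    {"variant_A", "artifact_1"},
    {"variant_A", "artifact_2"},
    {"variant_A"},
]
print(replicate_consensus(replicates))
```

The reasoning is that random artifacts are unlikely to recur at the same position in independently prepared libraries, whereas a true mutation should be called in each of them, at the cost of preparing and sequencing every sample more than once.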
Single-Molecule Sequencing
Eliminating the clonal amplification step would remove the polymerase errors it introduces. This is possible with sequencing technologies referred to as single-molecule sequencing, in which each original strand of DNA is sequenced individually.
Applications of High-Sensitive NGS in Oncology
Monitoring of MRD
Minimal residual disease (MRD) is defined as low-level remnant tumor content after complete remission, following surgery or therapeutic intervention, and is undetectable by routine histologic or radiological examinations. MRD represents cells from the original tumor that have not been cleared by treatment approaches and therapy-resistant subclones.
MRD monitoring is more prevalent in hematological cancers because of the ease of obtaining blood samples and is performed using sensitive techniques, such as flow cytometry and allele-specific PCR (ASPCR). Consequently, MRD levels are also defined by the LOD of these techniques, which is generally 0.01% (1 tumor cell in a background of 10,000 normal cells). However, these methods are restricted to screening a limited number of known markers and are not helpful in preemptively detecting emergent malignant clones.
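A 0.01% detection level also places a floor on the amount of input DNA, because the mutant molecules must be physically present in the sample. The sketch below assumes roughly 3.3 pg of DNA per haploid human genome (an approximate textbook figure, not a value from the review) and treats the mutant fraction as a fraction of genome copies. The function names are hypothetical.

```python
HAPLOID_GENOME_PG = 3.3  # assumption: approximate mass of one haploid human genome, in pg


def genome_copies(dna_ng: float) -> float:
    """Approximate number of haploid genome copies in `dna_ng` nanograms of DNA."""
    return dna_ng * 1000.0 / HAPLOID_GENOME_PG


def expected_mutant_copies(dna_ng: float, mutant_fraction: float) -> float:
    """Expected number of mutant genome copies in the input."""
    return genome_copies(dna_ng) * mutant_fraction


def dna_needed_ng(mutant_fraction: float, min_mutant_copies: float) -> float:
    """DNA input (ng) needed to expect `min_mutant_copies` mutant copies."""
    return min_mutant_copies / mutant_fraction * HAPLOID_GENOME_PG / 1000.0


if __name__ == "__main__":
    print(expected_mutant_copies(10.0, 1e-4))  # mutant copies expected in 10 ng
    print(dna_needed_ng(1e-4, 3))              # ng needed to expect 3 mutant copies
```

Under these assumptions, 10 ng of DNA is expected to contain less than one mutant copy at a 0.01% fraction, so input amount, and not only sequencing accuracy, can cap the achievable sensitivity.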
NGS, with its capability to screen multiple markers simultaneously, can fulfill this need. In recent years, several studies have established the ability of NGS to detect MRD [74,75]. However, routine MRD evaluation by NGS has been hindered by the difficulties in detecting and authenticating low-level mutations (especially SNVs), cost, and long turnaround time. Consequently, MRD detection by NGS has been recommended for clinical trials only.
Recently developed, error-corrected NGS technologies have made NGS more attractive for routine MRD estimation.
Screening of ctDNA
The overall levels of cell-free DNA are observed to be higher in patients with cancer because of the relatively higher cell growth and necrosis in tumors compared with healthy tissues.
Studies have shown concordance between the mutation profiles of circulating tumor DNA (ctDNA) and the tumor, along with a direct correlation of ctDNA levels with tumor burden and tumor stage. Consequently, isolation and screening of cell-free DNA as a surrogate for DNA directly isolated from tumor tissue is an attractive proposition. This approach is especially valuable for solid tumors, where acquiring a tumor sample requires surgical resection or methods such as core-needle or fine-needle biopsies.
Ref
- Singh RR. Next-Generation Sequencing in High-Sensitive Detection of Mutations in Tumors: Challenges, Advances, and Applications. J Mol Diagn. 2020;22(8):994-1007.