Application of NGS in the Medical Field

Common NGS solutions in the medical field

• Whole genome sequencing (WGS)

WGS detects point mutations (SNPs), InDels, CNVs, SVs, and other variants across the whole genome. The data requirement is large: a single sample requires about 90 Gb of sequence data.

• Whole exome sequencing (WES)

WES captures all exons in the genome for sequencing. It requires far less sequencing data than WGS and can reach high sequencing depth (50X-150X).

• Targeted sequencing

Targeted sequencing captures and enriches specific regions of interest for sequencing. The workflow is similar to whole-exome capture. It requires less sequencing data than WES and can reach even higher depth (up to 500X).

• Transcriptome sequencing (RNA-Seq)

RNA-Seq sequences RNA to study the genome at the transcription level, covering mRNA, lncRNA, microRNA, and other RNA classes.
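The data volumes and depths quoted for these methods are linked by a simple relation: mean depth is roughly the number of sequenced bases, times the fraction that lands on target, divided by the target size. The Python sketch below illustrates that arithmetic only. The helper `mean_depth`, the 3.0 Gb genome size, the 50 Mb exome target, and the 50% on-target fraction are illustrative assumptions, not values taken from this document.

```python
def mean_depth(yield_bases: float, target_size_bases: float,
               on_target_fraction: float = 1.0) -> float:
    """Approximate mean sequencing depth over a target region.

    Hypothetical helper: depth = usable bases / target size.
    """
    if target_size_bases <= 0:
        raise ValueError("target size must be positive")
    if not 0.0 < on_target_fraction <= 1.0:
        raise ValueError("on-target fraction must be in (0, 1]")
    return yield_bases * on_target_fraction / target_size_bases


# Assumption: a human genome of about 3.0 Gb.
HUMAN_GENOME_BASES = 3.0e9
# Assumption: an exome capture design of about 50 Mb.
EXOME_TARGET_BASES = 50e6

# 90 Gb of WGS data spread over the whole genome.
wgs_depth = mean_depth(90e9, HUMAN_GENOME_BASES)

# 10 Gb of WES data, assuming only half of the reads fall on target.
wes_depth = mean_depth(10e9, EXOME_TARGET_BASES, on_target_fraction=0.5)

print(f"WGS: about {wgs_depth:.0f}X, WES: about {wes_depth:.0f}X")
```

Under these assumptions, 90 Gb of WGS data corresponds to about 30X, and 10 Gb of WES data falls inside the 50X-150X range. Real depth also depends on duplicate rate and capture uniformity, which this sketch ignores.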

Whole genome sequencing (WGS/DNA-Seq)

Whole-genome sequencing here means sequencing the genomes of individuals from a species whose reference genome is already known, and then analyzing differences at the individual or population level. As the cost of genome sequencing continues to fall, the study of pathogenic mutations in human diseases has expanded from exonic regions to the whole genome. By constructing libraries with inserts of different lengths and sequencing them with short, paired-end reads, high-throughput sequencing can detect common, low-frequency, and even rare mutation sites, as well as structural variations, across the whole genome. This has great value for both scientific research and industry.

Applications of whole genome sequencing

• SNP (single nucleotide polymorphism; differences between individuals)

• CNV (copy number variation of large fragments)

• InDel (small insertions and deletions)

• SV (structural variation)
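To make the distinction between these variant classes concrete, the following Python sketch classifies a variant from its reference and alternate alleles. It is a simplification under stated assumptions: the function `classify_variant` is hypothetical, and the 50 bp boundary between InDels and SVs is a commonly used convention rather than a value from this document. CNVs are usually inferred from read depth rather than from allele strings, so they are not covered here.

```python
# Assumption: events changing length by 50 bp or more are treated as SVs.
SV_MIN_LENGTH = 50


def classify_variant(ref: str, alt: str) -> str:
    """Classify a REF/ALT allele pair as 'SNP', 'InDel', 'SV', or 'other'."""
    ref, alt = ref.upper(), alt.upper()
    if len(ref) == 1 and len(alt) == 1:
        # Single-base substitution; identical alleles are not a variant.
        return "SNP" if ref != alt else "other"
    length_change = abs(len(ref) - len(alt))
    if length_change == 0:
        # Same-length, multi-base substitution: outside the four classes.
        return "other"
    return "SV" if length_change >= SV_MIN_LENGTH else "InDel"


examples = [
    ("A", "G"),             # substitution of one base
    ("AT", "A"),            # 1 bp deletion
    ("A", "A" + "C" * 60),  # 60 bp insertion
]
for ref, alt in examples:
    print(ref[:5], alt[:5], classify_variant(ref, alt))
```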

Why study SNPs?

• SNPs are recognized as genetic molecular markers of disease occurrence.

• SNPs are considered one of the main factors behind differences in drug response between individuals, so medication can be personalized according to a patient's SNPs.

• SNPs are widely distributed across the genome and relatively stable.

• SNPs can directly affect the expression of functional genes.

Most SNPs have no functional effect

• Most SNPs are silent mutations.

• Mutations in non-coding regions do not alter the protein sequence.

But there are exceptions.
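Whether a coding SNP is silent can be checked against the genetic code: a substitution is silent when the mutated codon still encodes the same amino acid. The Python sketch below uses the standard genetic code to illustrate this. The function name `classify_coding_snp` and its return labels are hypothetical, and the sketch ignores splicing and regulatory effects, which are among the reasons a "silent" or non-coding SNP can still matter.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(codon: str, position: int, new_base: str) -> str:
    """Classify a single-base change inside one codon (0-based position)."""
    codon, new_base = codon.upper(), new_base.upper()
    if len(codon) != 3 or position not in (0, 1, 2) or new_base not in BASES:
        raise ValueError("expected a 3-base codon, position 0-2, base in TCAG")
    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if before == after:
        return "silent"      # same amino acid (or still a stop codon)
    if after == "*":
        return "nonsense"    # amino acid replaced by a stop codon
    if before == "*":
        return "stop_lost"   # stop codon replaced by an amino acid
    return "missense"        # amino acid replaced by a different one


print(classify_coding_snp("GAA", 2, "G"))  # GAA and GAG both encode Glu
print(classify_coding_snp("GAG", 1, "T"))  # GAG (Glu) becomes GTG (Val)
print(classify_coding_snp("TGG", 2, "A"))  # TGG (Trp) becomes TGA (stop)
```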

Applications of WES

• Cancer-related research (human)

• Genetic diseases (human)

• Other non-communicable diseases (human)

WES is mainly used to find rare mutations, inherited mutations, and cancer-related somatic mutations.

• SNP and InDel analysis can be done, but because the captured regions are short, WES is generally not used for CNV and SV analysis.

Features of WES

• Low cost: a depth of 50-150X typically requires only 8-15 Gb of data. Restricting the analysis to exons reduces background and makes rare mutations easier to find.

• Deletions of approximately 50-80 bp can be detected. However, because the fragments captured by the exon chip are short, it is difficult to determine whether a missing fragment results from a true deletion or from off-target capture.

• For the same reason, CNV and SV analysis is generally not performed.

Transcriptome sequencing (RNA-Seq)

• Includes mRNA-Seq, lncRNA-Seq, sRNA-Seq, and related methods.

• High-throughput analysis of transcript information to discover unknown transcripts and improve gene annotation.

• Look for changes in the expression abundance of genes of interest between individuals, or within the same individual at different time points.
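Comparing expression abundance requires normalizing raw read counts, because longer transcripts and deeper libraries both yield more reads. The Python sketch below shows one common approach, transcripts per million (TPM) followed by a log2 fold change. This method is not named in this document and is used here only as an illustration. The gene names, counts, and lengths are made-up example values, and the pseudocount of 1.0 is an assumption that avoids division by zero.

```python
import math


def tpm(counts: dict, lengths_bp: dict) -> dict:
    """Convert raw read counts to transcripts per million (TPM)."""
    # Reads per base for each gene corrects for transcript length.
    rates = {gene: counts[gene] / lengths_bp[gene] for gene in counts}
    total = sum(rates.values())
    if total == 0:
        raise ValueError("no reads in sample")
    # Rescale so the values of one sample sum to one million.
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


def log2_fold_change(before: float, after: float,
                     pseudocount: float = 1.0) -> float:
    """log2 ratio of two abundance values, stabilized by a pseudocount."""
    return math.log2((after + pseudocount) / (before + pseudocount))


# Made-up example: geneB is three times longer and has three times the
# reads, so both genes end up with the same TPM.
sample = tpm({"geneA": 100, "geneB": 300}, {"geneA": 1000, "geneB": 3000})
print(sample)
print(log2_fold_change(3.0, 15.0))
```

A real differential expression analysis would also model biological replicates and variance, which this sketch leaves out.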

Advantages of RNA-Seq

RNA-Seq is not limited to known genomic sequence information, so it is suitable for high-throughput transcriptome research in species whose genome sequence is unknown. Compared with microarray (chip) technology, the background signal is low, there is no upper limit of detection, and the dynamic range for gene expression profiling is very wide. When an internal reference is used, quantification shows high accuracy and repeatability.

No cloning steps are required, the procedure is simple, and only a small amount of sample is needed. Expression profiling can be performed at the single-cell level, and the cost is lower than that of tiling arrays or large-scale EST sequencing.

Challenges of RNA-Seq

During library construction, long RNA molecules must be fragmented, which introduces some bias. PCR amplification can distort the measured expression levels. Aligning or assembling massive amounts of short-read data is complicated, and repetitive sequences and reads that map to multiple locations are hard to place precisely. The identification of alternative splicing and trans-splicing in higher eukaryotes still has considerable error. The required sequencing depth varies with species, organ, tissue, and time point, so it is difficult to calculate with a single formula.

What are CTCs?

CTCs (circulating tumor cells) are tumor cells released into the peripheral blood circulation by solid tumors or metastases, either spontaneously or as a result of diagnostic procedures.

CTCs are tumor cells that survive in the bloodstream during tumor metastasis, and their generation is considered a necessary prerequisite for metastasis.

Research significance of CTCs

In-depth study of CTCs can help to further understand the mechanism of tumor metastasis and provide a new basis for anti-metastatic treatment. The detection of CTCs can help diagnose patients with early metastatic tumors and monitor postoperative recurrence and metastasis. It can also help assess anti-tumor drug sensitivity and patient prognosis, and guide the selection of individualized treatment strategies.

Research difficulties of CTCs

CTCs lack obvious specific markers that clearly distinguish them from other blood cells, and tumors of different histological types and molecular phenotypes express different markers.

CTCs are scarce in peripheral blood. Generally, there is only 1 CTC per 10^6 to 10^7 white blood cells.
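The practical consequence of this scarcity can be estimated with simple arithmetic. The Python sketch below computes how many CTCs a blood sample might contain at a frequency of 1 per 10^6 to 10^7 white blood cells. The function `expected_ctc_range`, the 10 mL sample volume, and the white blood cell concentration of 5 x 10^6 per mL are illustrative assumptions, not values from this document.

```python
def expected_ctc_range(blood_volume_ml: float, wbc_per_ml: float,
                       wbc_per_ctc_low: float = 1e6,
                       wbc_per_ctc_high: float = 1e7) -> tuple:
    """Return (min, max) expected CTC count in a blood sample.

    Hypothetical helper: divides the total white blood cell count by
    the number of white blood cells per CTC at each end of the range.
    """
    if blood_volume_ml < 0 or wbc_per_ml < 0:
        raise ValueError("volume and concentration must be non-negative")
    total_wbc = blood_volume_ml * wbc_per_ml
    return total_wbc / wbc_per_ctc_high, total_wbc / wbc_per_ctc_low


# Assumptions: a 10 mL draw and 5e6 white blood cells per mL.
low, high = expected_ctc_range(blood_volume_ml=10.0, wbc_per_ml=5e6)
print(f"Expected CTCs in sample: about {low:.0f} to {high:.0f}")
```

Under these assumptions the whole sample holds only a handful to a few dozen CTCs among tens of millions of white blood cells, which is why capture and enrichment come before any sequencing step.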

Combination of CTCs and NGS

• Capture CTCs

• Identify CTCs

• Single-cell amplification

• Library construction from trace DNA (WES or targeted region capture)

• High-throughput sequencing

• Bioinformatics analysis