
Detecting copy number aberrations (CNA) in cancer genomes using high-throughput sequencing technologies

Abstract

The karyotypes of human tumors are often aneuploid. Besides these numerical deviations, there are often structural rearrangements within individual chromosomes, such as amplifications, deletions and translocations. In the resulting genome, profound and complex alterations in the underlying gene network and gene dosage occur, giving rise to the observed malignant phenotype. To understand how these events contribute to the biology of a tumor cell, a crucial first step is to detect variation in chromosomal structure and, in particular, copy number.

Recent technological developments, such as array comparative genomic hybridization (aCGH), SNP microarrays and, more recently, high-throughput sequencing, have made it feasible to detect copy number aberrations (CNA). Using these techniques it is possible to obtain, for thousands of regions across the genome, a numerical value proportional to the chromosome copy number of each assessed region. By comparing DNA from tumor and normal samples, copy number aberrations can be identified across the genome. However, the methods used to calculate CNA from the raw data usually make one or both of the following assumptions:
• The starting material consists of genetically homogeneous cells.
• The overall size of a tumor genome is very similar to the size of the normal genome.

The first assumption is reasonable when dealing with cell lines, but not when the DNA analyzed is isolated from patients’ tumors, where infiltration by stromal or endothelial cells is largely inevitable. The second assumption, which has a strong impact on the normalization step, is often incorrect. Given the scale, extent and severity of chromosomal structural changes, the total amount of genetic material in a cancer cell might differ significantly from (and is usually larger than) that of a normal cell. Therefore, assuming that the total size of a cancer genome is comparable to that of the normal genome might lead to artifacts and misleading results.

Some methods do not strictly require assumptions about the size of the genome, but they rely on the ability to detect SNP variants and to distinguish between the two alleles of a heterozygous region. They can then infer whether two, three, four or more copies are present. Although high-throughput sequencing could also detect SNP variants, a very high read coverage would be required, making the approach far too expensive at present.

Here, we propose a method to obtain CNA from high-throughput sequencing that avoids the two assumptions mentioned above. The method counts the number of reads mapped to regions of fixed length in both tumor and normal DNA from the same patient. For each region, the ratio between the number of reads from the tumor sample and from the normal sample depends mainly on three factors:
• The average number of copies (including stromal contamination) of a given chromosomal region in the two samples.
• The overall depth of coverage (total number of reads) of the two samples.
• Sampling error.

The goal is to detect, for each chromosomal region, the underlying ratio between the number of copies in the tumor genome and in the normal genome, despite the noise due to differing depth of coverage and sampling error. First, for each chromosomal region, we calculate the ratio of read counts between tumor and normal. This balances out various biases due to biological and technical issues (e.g. GC content, alignment artifacts). Second, to reduce the sampling error with minimal effect on the resolution, we apply a segmentation algorithm to the obtained ratios. Finally, we look at the distribution of ratios across all chromosomal regions and fit a model of several normal distributions with equally spaced means. The ratio from each region can thus be assigned to one of these distributions and the underlying CNA estimated. We are testing the algorithm on a series of simulated and real samples to assess the strength of the proposed method.
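
The ratio and model-fitting steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the model is fitted, so expectation-maximization (EM) with a single shared standard deviation is an assumption made here, and the segmentation step is skipped (the input ratios are assumed to be segment-level already). The function names, the starting guesses and the synthetic data are all hypothetical.

```python
import math


def read_ratios(tumor_counts, normal_counts):
    """Tumor/normal read-count ratio per region; regions without normal reads are skipped."""
    return [t / n for t, n in zip(tumor_counts, normal_counts) if n > 0]


def fit_equally_spaced_mixture(ratios, n_levels, base, step, sigma=0.05, n_iter=50):
    """EM for a Gaussian mixture with means base + k * step (k = 0 .. n_levels - 1).

    base, step and sigma are starting guesses; one shared sigma is assumed.
    Returns (base, step, sigma, weights, levels), where levels[i] is the index
    of the most probable component for ratios[i] in the final E-step.
    """
    n = len(ratios)
    weights = [1.0 / n_levels] * n_levels
    resp = []
    for _ in range(n_iter):
        # E-step: responsibilities, computed in log space for numerical stability.
        resp = []
        for x in ratios:
            logp = [math.log(max(w, 1e-12)) - 0.5 * ((x - base - k * step) / sigma) ** 2
                    for k, w in enumerate(weights)]
            top = max(logp)
            p = [math.exp(v - top) for v in logp]
            total = sum(p)
            resp.append([v / total for v in p])

        # M-step: mixture weights.
        weights = [sum(r[k] for r in resp) / n for k in range(n_levels)]

        # M-step: weighted least squares of ratio on level index gives base and step.
        mean_k = sum(w * k for k, w in enumerate(weights))
        mean_x = sum(ratios) / n
        cov = sum(r[k] * (k - mean_k) * (x - mean_x)
                  for x, r in zip(ratios, resp) for k in range(n_levels))
        var = sum(r[k] * (k - mean_k) ** 2 for r in resp for k in range(n_levels))
        if var > 0:
            step = cov / var
        base = mean_x - step * mean_k

        # M-step: shared standard deviation, floored to avoid collapse.
        sq = sum(r[k] * (x - base - k * step) ** 2
                 for x, r in zip(ratios, resp) for k in range(n_levels))
        sigma = max(math.sqrt(sq / n), 1e-3)

    levels = [max(range(n_levels), key=lambda k: r[k]) for r in resp]
    return base, step, sigma, weights, levels


# Made-up segment-level ratios scattered around the levels 0.4, 0.7, 1.0 and 1.3.
offsets = [-0.02, -0.01, 0.0, 0.01, 0.02]
ratios = [m + d for m in (0.4, 0.7, 1.0, 1.3) for d in offsets]
base, step, sigma, weights, levels = fit_equally_spaced_mixture(
    ratios, n_levels=4, base=0.5, step=0.25)
print(round(base, 3), round(step, 3), levels)
```

On this synthetic input the fit should recover a base near 0.4 and a step near 0.3 from deliberately offset starting guesses. Because only the spacing and offset of the levels are fitted, the sketch does not normalize by total read count; under a simple linear model of contamination and depth, both effects would shift or scale the levels without breaking their equal spacing. Mapping level indices to absolute copy numbers is left open here, as the abstract does not specify it.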
