Charles Chapple
Head of Bioinformatics
With the release of VarSome Clinical 11.4, we provide an additional solution for copy number variant (CNV) calling from Whole Genome Sequencing (WGS) data. The method offers improved recall of long CNVs, complemented by concise and intuitive visualizations.
VarSome Clinical CNV calling for whole genome samples (WGS) is currently based on a tool called Delly [Rausch et al., 2012]. This tool processes single samples to create a VCF file containing the detected structural variants. Variants are filtered to retain deletions and duplications (CNVs) that pass quality control criteria, then annotated and reported in the Analysis Results Table.
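The filtering step described above can be illustrated with a minimal sketch. It assumes the structural variant type is stored in the `SVTYPE` INFO key and that passing records carry `PASS` in the FILTER column, as in Delly's VCF output. The function names and the example records are hypothetical, and the actual quality control criteria used by VarSome Clinical are not described here, so `PASS` stands in for them.

```python
def parse_info(info_field):
    """Turn a VCF INFO string such as 'SVTYPE=DEL;END=2000' into a dict."""
    info = {}
    for item in info_field.split(";"):
        key, _, value = item.partition("=")
        # Flag-style entries (no '=') are stored as True.
        info[key] = value if value else True
    return info


def filter_cnv_records(vcf_lines, keep_types=("DEL", "DUP")):
    """Yield (chrom, start, end, svtype) for passing deletions and duplications."""
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, filt = fields[0], int(fields[1]), fields[6]
        info = parse_info(fields[7])
        if filt != "PASS":
            continue  # assumption: PASS stands in for the QC criteria
        if info.get("SVTYPE") not in keep_types:
            continue  # drop inversions, translocations, insertions
        yield chrom, pos, int(info["END"]), info["SVTYPE"]


# Hypothetical records for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1000\tDEL0001\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=61000",
    "chr1\t5000\tINV0001\tN\t<INV>\t.\tPASS\tSVTYPE=INV;END=9000",
    "chr2\t7000\tDUP0001\tN\t<DUP>\t.\tLowQual\tSVTYPE=DUP;END=90000",
]
cnvs = list(filter_cnv_records(example))
```

In this toy input only the first record survives: the inversion is not a CNV, and the duplication fails the filter.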
The performance of a CNV calling tool depends on CNV size and read coverage, both of which are known to affect the detection and accuracy of calls. Delly performs better on shorter CNVs and is capable of accurately detecting breakpoints. However, it remains challenging to clearly distinguish larger events from mapping artifacts [Mahmoud et al., 2019; Gabrielaite et al., 2021; Kosugi et al., 2019; Cameron et al., 2019]. Long CNVs are often causative factors in disease, and their detection is essential for accurate diagnosis. Alternative algorithms, such as read depth based methods, may improve recall of longer CNVs, even if they can only provide coarse breakpoint resolution.
Extending the CNV pipeline to process WGS samples
CNV calling for targeted sequencing samples (WES, panels, anything not WGS) on VarSome Clinical is performed using the read depth based tool ExomeDepth [Plagnol et al., 2012]. We have now adapted this method to also process WGS samples. The solution is suitable for samples with long CNVs (>50Kb) that may not be reliably called by Delly. Validation of the method on a proprietary dataset of 27 samples with 30 known CNVs (call range: 10-95,599Kb, median: 547Kb) showed a sensitivity of 97%, compared to 33% for Delly. Other performance tests in the range of 1-10Kb have indicated that the results of Delly and the ExomeDepth based pipeline are orthogonal, suggesting that in certain samples maximum sensitivity is achieved by running both solutions. We therefore recommend running the read depth CNV calling pipeline in addition to Delly, especially if large rearrangements are suspected to be present.
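To show why running two orthogonal callers can raise sensitivity, here is a small sketch that scores call sets against a list of known CNVs. The matching rule (at least 50% reciprocal overlap) is an assumption chosen for illustration, since the criterion used in the validation above is not stated. All intervals and function names are hypothetical and do not come from the validation dataset.

```python
def reciprocal_overlap(a, b):
    """Smaller of the two overlap fractions for intervals (chrom, start, end)."""
    if a[0] != b[0]:
        return 0.0
    shared = min(a[2], b[2]) - max(a[1], b[1])
    if shared <= 0:
        return 0.0
    return min(shared / (a[2] - a[1]), shared / (b[2] - b[1]))


def sensitivity(truth, calls, min_overlap=0.5):
    """Fraction of known CNVs matched by at least one call."""
    detected = sum(
        any(reciprocal_overlap(t, c) >= min_overlap for c in calls)
        for t in truth
    )
    return detected / len(truth)


# Hypothetical truth set: one short CNV and two long ones.
truth = [
    ("chr1", 100_000, 700_000),
    ("chr2", 10_000, 14_000),
    ("chr3", 0, 2_000_000),
]
# Hypothetical calls: the breakpoint-based caller finds the short event,
# the read depth caller finds the long ones with coarse boundaries.
delly_calls = [("chr2", 10_100, 13_900)]
read_depth_calls = [("chr1", 100_000, 700_000), ("chr3", 50_000, 2_000_000)]

sens_delly = sensitivity(truth, delly_calls)
sens_read_depth = sensitivity(truth, read_depth_calls)
sens_combined = sensitivity(truth, delly_calls + read_depth_calls)
```

In this toy example each caller misses events the other one finds, so the union of the two call sets recovers every known CNV.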
The solution is run in a similar fashion to WES/panels. A cohort of several (between 3 and 10) WGS samples is processed together in a single run, in which a selected subset of the cohort acts as the “reference” for each test sample. The requirements that apply to WES/panels also apply to WGS: samples must be unrelated and processed by the same laboratory, on the same sequencer and ideally in the same batch. In the case of WGS, the assay target regions comprise the complete genome, split into 50Kb bins. Please note that this imposes a hard minimum size limit: no CNVs smaller than 50Kb can be detected using this approach.
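The 50Kb binning can be pictured with a short sketch. It assumes 0-based, half-open coordinates (as in BED files) and a shorter final bin at the end of each chromosome. The pipeline's actual handling of chromosome ends is not described here, and the chromosome names and lengths below are toy values.

```python
BIN_SIZE = 50_000  # 50Kb, as stated in the text


def make_bins(chrom_lengths, bin_size=BIN_SIZE):
    """Split each chromosome into consecutive bins of (chrom, start, end)."""
    bins = []
    for chrom, length in chrom_lengths.items():
        for start in range(0, length, bin_size):
            # Assumption: the last bin is truncated at the chromosome end.
            end = min(start + bin_size, length)
            bins.append((chrom, start, end))
    return bins


# Toy chromosome lengths, not real genome coordinates.
toy_genome = {"chrA": 120_000, "chrB": 50_000}
bins = make_bins(toy_genome)
```

Read depth is compared per bin, so the bin width sets the resolution of the method. This is the reason for the 50Kb minimum size noted above.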
Users with WGS samples will benefit from the extensive functionality of the CNV calling pipeline. All components available to CNV analyses of exome and panel samples, such as CNV Browser visualizations, plots and additional quality control metrics, are also available for WGS.
The pipeline selection depends on the number of submitted samples. When a user submits a single WGS sample to the VarSome Clinical platform for CNV analysis, it will be processed by Delly. If the user selects multiple WGS samples for CNV analysis, these will be routed for processing by the adapted read depth based pipeline.
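The routing rule can be summarized in a few lines. This is only an illustration of the rule as described, not the platform's code. The function name and the returned labels are hypothetical, and the sketch does not enforce any cohort size bounds.

```python
def select_cnv_pipeline(sample_ids):
    """Pick a CNV pipeline from the number of submitted WGS samples."""
    n = len(sample_ids)
    if n == 0:
        raise ValueError("at least one WGS sample is required")
    # One sample goes to Delly; several go to the read depth based pipeline.
    return "delly" if n == 1 else "read_depth"


single = select_cnv_pipeline(["sample_01"])
cohort = select_cnv_pipeline(["sample_01", "sample_02", "sample_03"])
```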
Further Information and Support
Not already a VarSome Clinical user? Get in touch and ask for a free trial.
As ever, we hope you find these changes and improvements helpful, and we’d love to hear any suggestions you may have. Support is available as usual from support@varsome.com.
- The VarSome Team