페이지 정보작성자 관리자 작성일2022-07-18 조회5,283회
Sunho Lee, Seokchol Hong, Jonathan Woo, Jae-hak Lee, Kyunghee Kim, Lucia Kim, Kunsoo Park, and Jongsun Jung
Several tools have been developed for calling variants from next-generation sequencing (NGS) data. Although they are generally accurate and reliable, most of them have room for improvement, especially regarding calling variants in datasets with low read depth. In addition, the somatic variants predicted by several somatic variant callers tend to have very low concordance rates. In this study, we developed a new method (RDscan) for improving germline and somatic variant calling in NGS data. RDscan removes misaligned reads, repositions reads, and calculates RDscore based on the read depth distribution. With RDscore, RDscan improves the precision of variant callers by removing false-positive variant calls. When we tested our new tool using the latest variant calling algorithms and data from the 1000 Genomes Project and Illumina's public datasets, accuracy was improved for most of the algorithms. After screening variants with RDscan, calling accuracies increased for germline variants in 11 of 12 cases and for somatic variants in 21 of 24 cases. RDscan is simple to use and can effectively remove false-positive variants while maintaining a low computation load. Therefore, RDscan, along with existing variant callers, should contribute to improvements in genome analysis.