CERENKOV3: Clustering and molecular network-derived features improve computational prediction of functional noncoding SNPs.

Title	CERENKOV3: Clustering and molecular network-derived features improve computational prediction of functional noncoding SNPs.
Publication Type	Journal Article
Year of Publication	2020
Authors	Yao, Y, Ramsey, SA
Journal	Pac Symp Biocomput
Volume	25
Pagination	535-546
Date Published	2020
ISSN	2335-6936
Keywords	Algorithms; Cluster Analysis; Computational Biology; Genome-Wide Association Study; Humans; Machine Learning; Models, Genetic; Polymorphism, Single Nucleotide; RNA, Untranslated
Abstract	Identification of causal noncoding single nucleotide polymorphisms (SNPs) is important for maximizing the knowledge dividend from human genome-wide association studies (GWAS). Recently, diverse machine learning-based methods have been used for functional SNP identification; however, this task remains a fundamental challenge in computational biology. We report CERENKOV3, a machine learning pipeline that leverages clustering-derived and molecular network-derived features to improve prediction accuracy of regulatory SNPs (rSNPs) in the context of post-GWAS analysis. The clustering-derived feature, locus size (number of SNPs in the locus), derives from our locus partitioning procedure and represents the sizes of clusters based on SNP locations. We generated two molecular network-derived features from representation learning on a network representing SNP-gene and gene-gene relations. Based on empirical studies using a ground-truth SNP dataset, CERENKOV3 significantly improves rSNP recognition performance in AUPRC, AUROC, and AVGRANK (a locus-wise rank-based measure of classification accuracy we previously proposed).
Alternate Journal	Pac Symp Biocomput
PubMed ID	31797625
PubMed Central ID	PMC6897322
Grant List	OT2 TR002520 / TR / NCATS NIH HHS / United States
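
The abstract describes the locus size feature as the number of SNPs in a locus, where loci come from clustering SNPs by genomic position. The following is a minimal sketch of that idea, not the authors' locus partitioning procedure, which the abstract does not specify. It assumes a simple rule: consecutive SNPs on the same chromosome belong to the same locus when they are no more than `max_gap` base pairs apart. The function name `locus_sizes`, the `max_gap` parameter and its default, and the input tuple layout are all hypothetical.

```python
from collections import defaultdict


def locus_sizes(snps, max_gap=1000):
    """Return {snp_id: locus size} for a list of (snp_id, chrom, pos) tuples.

    Assumption (not from the paper): a new locus starts whenever the gap
    between consecutive SNPs on a chromosome exceeds max_gap base pairs.
    """
    # Group SNPs by chromosome so that gaps are only measured within one.
    by_chrom = defaultdict(list)
    for snp_id, chrom, pos in snps:
        by_chrom[chrom].append((pos, snp_id))

    sizes = {}
    for items in by_chrom.values():
        items.sort()  # order by position
        locus = [items[0]]
        for prev, cur in zip(items, items[1:]):
            if cur[0] - prev[0] > max_gap:
                # Gap too large: close the current locus and record its size.
                for _, snp_id in locus:
                    sizes[snp_id] = len(locus)
                locus = []
            locus.append(cur)
        # Record the size of the last open locus on this chromosome.
        for _, snp_id in locus:
            sizes[snp_id] = len(locus)
    return sizes


example = [
    ("rs1", "chr1", 100),
    ("rs2", "chr1", 600),
    ("rs3", "chr1", 5000),
    ("rs4", "chr2", 100),
]
result = locus_sizes(example, max_gap=1000)
```

In this toy input, `rs1` and `rs2` fall in one locus of size 2, while `rs3` and `rs4` each form a locus of size 1. Each SNP's locus size could then be appended to its feature vector before classification.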

Contact Info

Gary R. Carlson, MD, College of Veterinary Medicine
Oregon State University
700 SW 30th Street
Corvallis, OR 97331-4801
541-737-2141

vetmed@oregonstate.edu

The college is fully accredited by the American Veterinary Medical Association, Council on Education (COE).
