Basic/Translational Science -> Genomics: Bench D-PO03 - Poster Session III (ID 48) Poster

D-PO03-022 - Deep Learning Classification Of Atrial Fibrillation Using Genome-wide Association Study Dataset (ID 1079)


Background: Atrial fibrillation (AF) is known to be a heritable disease, and we previously reported multiple genetic loci associated with early-onset AF in the Korean population.
Objective: We hypothesized that combining genome-wide association study (GWAS) data with a deep learning (DL) algorithm improves the power to predict the AF phenotype.
Methods: We used a GWAS data set of 4,372 samples (672 AF cases and 3,700 controls) with 615,235 single nucleotide polymorphisms (SNPs) genotyped on the Affymetrix 6.0 SNP array. We tested the statistical association between each SNP and AF using logistic regression. The data were split into a training set (70%) and a test set (30%), and stratified k-fold cross-validation was performed to account for the imbalance between cases and controls in the GWAS data set. DL analyses were conducted on SNP subsets defined by progressively relaxed p-value cutoffs, to assess progressive improvement in performance (p < 5×10⁻⁸ [39 SNPs], 1×10⁻⁶ [63 SNPs], 1×10⁻⁵ [86 SNPs], 1×10⁻⁴ [192 SNPs], 1×10⁻³ [1,022 SNPs], and 1×10⁻² [8,045 SNPs]), and on a comparison subset of non-significant SNPs selected with an inverted cutoff (p > 9.8×10⁻¹ [12,018 SNPs]).
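The splitting and subsetting steps described in the Methods can be illustrated with a minimal standard-library sketch. It is not the authors' pipeline: the number of folds (k=5), the random seed, the function names, and the toy SNP identifiers are all assumptions made for illustration, and only the case/control counts and the 70/30 ratio come from the abstract.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx), keeping the case/control ratio in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train, test = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)


def stratified_kfold(labels, indices, k=5, seed=0):
    """Yield (fit_idx, val_idx); each fold gets a near-equal share of every class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i in indices:
        by_class[labels[i]].append(i)
    folds = [[] for _ in range(k)]
    for idx in by_class.values():
        rng.shuffle(idx)
        # Deal samples of each class round-robin across the k folds.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    for f in range(k):
        val = sorted(folds[f])
        fit = sorted(i for g in range(k) if g != f for i in folds[g])
        yield fit, val


def select_snps(p_values, cutoff):
    """Keep SNP ids whose association p-value is below the cutoff."""
    return [snp for snp, p in p_values.items() if p < cutoff]


# Case/control counts from the abstract: 672 AF cases (1), 3,700 controls (0).
labels = [1] * 672 + [0] * 3700
train_idx, test_idx = stratified_split(labels)            # 70% / 30%
folds = list(stratified_kfold(labels, train_idx, k=5))    # k=5 is an assumption

# Toy p-values with made-up SNP ids, only to show nested subsets.
toy_p = {"snp_a": 3e-9, "snp_b": 2e-6, "snp_c": 0.5}
genome_wide = select_snps(toy_p, 5e-8)   # strictest cutoff
relaxed = select_snps(toy_p, 1e-2)       # most relaxed cutoff
```

Because every SNP passing a strict cutoff also passes a looser one, the subsets are nested, so each relaxed cutoff adds SNPs to the previous set rather than replacing it.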
Results: DL computation time was 1.025 ms per sample. On the test set, the algorithm showed high predictive performance for AF (SNP subset with p < 1×10⁻², AUC = 0.9981). As the p-value cutoff was relaxed, AF predictive performance improved. In contrast, the experimental subset of non-significant SNPs could not predict AF at all, despite its large number of SNPs.
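For readers unfamiliar with the AUC metric reported above, the following sketch computes it from rank sums (the Mann-Whitney formulation). The function name and the toy labels and scores are illustrative assumptions; the abstract does not say how the authors computed AUC.

```python
def roc_auc(labels, scores):
    """AUC: probability that a random case outscores a random control (ties count 0.5)."""
    pairs = sorted(zip(scores, labels))
    ranks = [0.0] * len(pairs)
    i = 0
    while i < len(pairs):
        # Find the run of tied scores and give each member the average rank.
        j = i
        while j + 1 < len(pairs) and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[t] = avg_rank
        i = j + 1
    n_pos = sum(y for _, y in pairs)
    n_neg = len(pairs) - n_pos
    rank_sum = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Toy data: 1 = case, 0 = control; scores are made-up model outputs.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls. Unlike accuracy, it does not depend on the case/control ratio, which matters for an imbalanced data set such as this one.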
Conclusion: DL algorithms capture the cumulative effects of less significant or undiscovered SNPs that determine the AF phenotype. GWAS combined with a DL algorithm might be a useful tool for determining polygenic phenotypes.