Leqi Liu

Mentors: Jia Tao, Fadi Towfic

Computer Science

Classifying Diseases Based on their Genetic Causes

Efforts to classify diseases have traditionally focused on grouping them by sets of related phenotypes. However, as research yields a more refined understanding of genetic contributions to disease susceptibility and progression, scientists have started to notice links between diseases with different phenotypes but similar causes. For example, according to the results presented in the article Comorbidity of Type 1 Diabetes and Juvenile Idiopathic Arthritis [The Journal of Pediatrics, Vol. 166, No. 4, 2015, pp. 930-935], although type 1 diabetes and rheumatoid arthritis are considered separate diseases, children with type 1 diabetes are much more likely to develop a juvenile form of arthritis than children without type 1 diabetes. We plan to extract the genetic causes of diseases from publicly available genetic studies incorporated in the ClinVar database (a database containing over 140,000 genetic variations associated with human diseases). We will then cluster diseases based on their shared genetic causes. This method of disease classification may help identify sets of diseases that share similar causal mechanisms, which may in turn serve as targets for therapeutic intervention. Additionally, this classification may help identify diseases that might benefit from similar genetic screening procedures, reducing treatment costs by helping detect and treat diseases early and so decreasing potential complications.
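
As a rough illustration of what clustering diseases by shared genetic causes could look like, the sketch below groups diseases whose associated gene sets overlap. It is only one possible approach, not necessarily the method used in this project: the Jaccard similarity measure, the 0.3 threshold, the single-linkage grouping, and the names `disease_genes`, `jaccard`, and `cluster_diseases` are all assumptions made for the example, and the disease and gene labels are placeholders rather than ClinVar records.

```python
from itertools import combinations

# Hypothetical toy data: disease -> set of associated gene symbols.
# These are illustrative placeholders, not records taken from ClinVar.
disease_genes = {
    "disease_A": {"GENE1", "GENE2", "GENE3"},
    "disease_B": {"GENE2", "GENE3", "GENE4"},
    "disease_C": {"GENE7", "GENE8"},
    "disease_D": {"GENE8", "GENE9"},
    "disease_E": {"GENE10"},
}


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_diseases(disease_genes, threshold=0.3):
    """Group diseases whose gene sets have similarity >= threshold.

    Uses single-linkage grouping via union-find: two diseases land in the
    same cluster if a chain of sufficiently similar pairs connects them.
    The threshold value is an assumption chosen for this toy example.
    """
    parent = {d: d for d in disease_genes}

    def find(d):
        # Follow parent links to the cluster representative,
        # shortening the path along the way.
        while parent[d] != d:
            parent[d] = parent[parent[d]]
            d = parent[d]
        return d

    for d1, d2 in combinations(disease_genes, 2):
        if jaccard(disease_genes[d1], disease_genes[d2]) >= threshold:
            parent[find(d1)] = find(d2)

    clusters = {}
    for d in disease_genes:
        clusters.setdefault(find(d), set()).add(d)
    # Return clusters as sorted lists so the result is deterministic.
    return sorted(sorted(c) for c in clusters.values())


clusters = cluster_diseases(disease_genes)
print(clusters)
```

With real data, the disease-to-gene mapping would be built from the ClinVar records, and the similarity measure and threshold would need to be chosen and validated against known disease relationships.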