Abstract: FR-PO643
Phenome-Wide Association Study of APOL1 Risk Genotypes in the Mass General Brigham Biobank Using Data-Driven Disease Association Clustering
Session Information
- Genetic Kidney Diseases: Cohort Studies - Genetic Associations and Diagnoses
October 25, 2024 | Location: Exhibit Hall, Convention Center
Abstract Time: 10:00 AM - 12:00 PM
Category: Genetic Diseases of the Kidneys
- 1202 Genetic Diseases of the Kidneys: Non-Cystic
Authors
- Vanderwall, David R., Harvard Medical School, Department of Cell Biology, Boston, Massachusetts, United States
- McNulty, Michelle, Boston Children's Hospital, Division of Nephrology, Boston, Massachusetts, United States
- Wongboonsin, Janewit, Boston Children's Hospital, Division of Nephrology, Boston, Massachusetts, United States
- Sampson, Matt G., Boston Children's Hospital, Division of Nephrology, Boston, Massachusetts, United States
Group or Team Name
- Sampson Lab for Kidney Genomics
Background
We sought to identify clinical associations with APOL1 risk genotypes using genotype data and ICD-10 diagnostic codes from 53,395 participants in the Mass General Brigham Biobank (MGBB).
Methods
The 53,395 MGBB participants with exome sequencing data were queried for the presence of at least one APOL1 kidney risk variant: rs73885319 and rs60910145 (“G1”) or rs71785313 (“G2”). Presence of the G0 reference allele was also noted. Individuals were classified as “Low-Risk (LR)” (G0/G1 or G0/G2) or “High-Risk (HR)” (G1/G1, G2/G2, or G1/G2). Demographic data and ICD-10 codes were extracted for each patient. A pairwise incidence matrix of all ICD-10 codes from qualifying participants was generated; the value at each matrix position is the number of patients in whom both ICD-10 codes were recorded. Incidence data were subsequently clustered by Weighted Gene Co-expression Network Analysis (WGCNA) to produce clusters of broadly correlated ICD-10 codes, which were further sub-clustered into small disease modules of at least 5 highly related ICD-10 codes. Each disease module was tested for differential enrichment between the HR and LR cohorts using a chi-square test.
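The classification and matrix-building steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the per-haplotype allele counts passed to `classify_risk`, and the toy patient records are all hypothetical, and the WGCNA clustering step is not reproduced.

```python
from collections import Counter
from itertools import combinations


def classify_risk(n_g1: int, n_g2: int) -> str:
    """Classify a participant by APOL1 risk-allele count.

    Assumes G1 and G2 have already been called as haplotype-level
    allele counts (0, 1, or 2 in total); that calling step is not shown.
    """
    risk_alleles = n_g1 + n_g2
    if risk_alleles >= 2:
        return "HR"    # G1/G1, G2/G2, or G1/G2
    if risk_alleles == 1:
        return "LR"    # G0/G1 or G0/G2
    return "none"      # G0/G0, outside the HR vs. LR comparison


def cooccurrence_counts(codes_by_patient: dict) -> Counter:
    """For each unordered pair of ICD-10 codes, count the patients
    in whom both codes were recorded."""
    counts = Counter()
    for codes in codes_by_patient.values():
        # Deduplicate so a repeated code counts once per patient.
        for a, b in combinations(sorted(set(codes)), 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical toy records (illustrative codes, not study data).
patients = {
    "p1": ["N04", "N18.5", "I15.1"],
    "p2": ["N18.5", "I15.1"],
    "p3": ["E11.9", "N18.5"],
}
pairs = cooccurrence_counts(patients)
print(pairs[("I15.1", "N18.5")])  # number of patients with both codes
```

Storing only the observed pairs in a `Counter` keeps the sketch small; a full symmetric matrix over all codes would be needed as input to a clustering method.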
Results
A total of 1,949 participants had at least one G1 or G2 variant, including 349 HR and 1,600 LR. The HR cohort was significantly enriched for CKD, Nephrotic Syndrome, and ESRD (p < 0.0001), but did not differ from the LR cohort in T2D (p = 0.821). Discrete analysis of individual ICD-10 codes identified 55 codes enriched among the HR cohort; the most significant ICD-10 code was “Nephrotic Syndrome and FSGS” (OR = 15.4 [14.6 – 16.2]). Our network approach produced 251 disease modules of co-occurring ICD-10 codes, of which 31 were differentially enriched between the HR and LR cohorts. A majority of these disease modules were specific for renal pathology. The most significant disease module included ICD-10 codes for “Hypertension Secondary to Renal Disorders,” “Stage 5 CKD,” and “Secondary Hyperparathyroidism of Renal Origin.” Of the significant modules, 32.3% mapped to non-kidney diseases, raising hypotheses about other consequences of HR variants on human health.
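Comparisons like those reported above rest on 2x2 tables of cohort (HR vs. LR) by code or module status (present vs. absent). The sketch below shows one common way to compute a Pearson chi-square statistic, its 1-degree-of-freedom p-value, and an odds ratio with a Wald interval. The abstract does not state how its odds ratio or interval was derived, so the Wald method, the function names, and the cell counts here are assumptions chosen for illustration only.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic (no continuity correction)
    for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi_square_p_value_1df(stat: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def odds_ratio_wald(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio with a Wald interval on the log scale (all cells must be > 0)."""
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts, NOT study data:
# rows = HR, LR; columns = code present, code absent.
a, b, c, d = 20, 80, 10, 90
stat = chi_square_2x2(a, b, c, d)
p = chi_square_p_value_1df(stat)
odds_ratio, lower, upper = odds_ratio_wald(a, b, c, d)
print(f"chi2={stat:.3f} p={p:.4f} OR={odds_ratio:.2f} [{lower:.2f}, {upper:.2f}]")
```

With 251 modules tested, some adjustment for multiple comparisons would normally accompany per-module p-values; the abstract does not specify one, so none is shown.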
Conclusion
Disease network clustering across a large, densely phenotyped cohort offers an opportunity for discovery of potentially significant clinical associations of HR genotypes that would otherwise not be identified.