Boosting GWAS using biological networks: A study on susceptibility to familial breast cancer
Genome-wide association studies (GWAS) explore the genetic causes of complex diseases. However, classical approaches ignore the biological context of the genetic variants and genes under study. To address this shortcoming, one can use biological networks, which model functional relationships, to search for functionally related susceptibility loci. Many such network methods exist, each arising from different mathematical frameworks, pre-processing steps, and assumptions about the network properties of the susceptibility mechanism. Unsurprisingly, this results in disparate solutions. To explore how to exploit these heterogeneous approaches, we selected six network methods and applied them to GENESIS, a nationwide French study on familial breast cancer. First, we verified that network methods recovered more interpretable results than a standard GWAS. We addressed the heterogeneity of their solutions by studying their overlap, computing what we called the consensus. The key gene in this consensus solution was COPS5, a gene related to multiple cancer hallmarks. Another issue we observed was that network methods were unstable, selecting very different genes on different subsamples of GENESIS. Therefore, we proposed a stable consensus solution formed by the 68 genes most consistently selected across multiple subsamples. This solution was also enriched in genes known to be associated with breast cancer susceptibility (BLM, CASP8, CASP10, DNAJC1, FGFR2, MRPS30, and SLC4A7, P-value = 3 × 10−4). The most connected gene was CUL3, a regulator of several genes linked to cancer progression. Lastly, we evaluated the biases of each method and the impact of their parameters on the outcome. In general, network methods preferred highly connected genes, even after random rewirings that stripped the connections of any biological meaning. In conclusion, we present the advantages of network-guided GWAS, characterize their shortcomings, and provide strategies to address them. To compute the consensus networks, implementations of all six methods are available at https://github.com/hclimente/gwas-tools.