Combining gene mutation with gene expression data improves outcome prediction in myelodysplastic syndromes.
Gerstung M., Pellagatti A., Malcovati L., Giagounidis A., Porta MGD., Jädersten M., Dolatshad H., Verma A., Cross NCP., Vyas P., Killick S., Hellström-Lindberg E., Cazzola M., Papaemmanuil E., Campbell PJ., Boultwood J.
Cancer is a genetic disease, but two patients rarely have identical genotypes. Similarly, patients differ in their clinicopathological parameters, but how genotypic and phenotypic heterogeneity are interconnected is not well understood. Here we build statistical models to disentangle the effect of 12 recurrently mutated genes and 4 cytogenetic alterations on gene expression, diagnostic clinical variables and outcome in 124 patients with myelodysplastic syndromes. Overall, one or more genetic lesions correlate with expression levels of ~20% of all genes, explaining 20-65% of observed expression variability. Differential expression patterns vary between mutations and reflect the underlying biology, such as aberrant polycomb repression for ASXL1 and EZH2 mutations or perturbed gene dosage for copy-number changes. In predicting survival, genomic, transcriptomic and diagnostic clinical variables all have utility, with the largest contribution from the transcriptome. Similar observations are made on the TCGA acute myeloid leukaemia cohort, confirming the general trends reported here.
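To make the idea of "explained expression variability" concrete, the following minimal sketch regresses the expression of a single gene on binary indicators for genetic lesions and reports the coefficient of determination (R²). It is an illustration only and not the authors' actual model: the data are simulated, the ordinary least-squares fit is an assumed stand-in for the statistical models mentioned in the abstract, and the function name `explained_variance`, the lesion frequency of 0.2 and the effect size of 1.5 are all hypothetical. Only the dimensions (124 patients, 12 + 4 = 16 lesions) are taken from the text.

```python
import numpy as np


def explained_variance(lesions, expression):
    """Fit expression ~ intercept + lesions by ordinary least squares.

    lesions:    (n_patients, n_lesions) array of 0/1 lesion indicators
    expression: (n_patients,) array with the expression of one gene
    Returns the coefficient vector (intercept first) and R^2.
    """
    n = lesions.shape[0]
    design = np.column_stack([np.ones(n), lesions])  # prepend intercept column
    coef, *_ = np.linalg.lstsq(design, expression, rcond=None)
    fitted = design @ coef
    ss_res = np.sum((expression - fitted) ** 2)             # residual sum of squares
    ss_tot = np.sum((expression - expression.mean()) ** 2)  # total sum of squares
    return coef, 1.0 - ss_res / ss_tot


# Simulated data (hypothetical): cohort size and lesion count follow the abstract,
# everything else is invented for illustration.
rng = np.random.default_rng(0)
n_patients, n_lesions = 124, 16
lesions = rng.binomial(1, 0.2, size=(n_patients, n_lesions)).astype(float)

true_effects = np.zeros(n_lesions)
true_effects[0] = 1.5  # assume only the first lesion shifts this gene's expression
expression = 8.0 + lesions @ true_effects + rng.normal(0.0, 1.0, n_patients)

coef, r2 = explained_variance(lesions, expression)
print(f"estimated effect of lesion 0: {coef[1]:.2f}")
print(f"fraction of expression variance explained (R^2): {r2:.2f}")
```

Repeating such a fit for every gene would give a per-gene R², which is the kind of quantity summarised by the 20-65% range above. A real analysis would also need to address multiple testing and possible confounders, which this sketch ignores.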