Subgroup Discovery with the Cox Model

We study the problem of subgroup discovery with Cox regression models and introduce a method for finding an interpretable subset of the data on which a Cox model is highly accurate. Our method relies on two technical innovations: the expected prediction entropy, a novel metric for evaluating survival models that predict a hazard function; and the conditional rank distribution, a statistical object that quantifies the deviation of an individual point from the distribution of survival times in an existing subgroup. Because the discovered subgroups are interpretable, they not only improve the predictive accuracy of the model but can also form meaningful, data-driven patient cohorts for further study in a clinical setting.