Inverse classification is a utility-based data mining process that works backwards through an induced classifier to generate optimal, instance-centric recommendations: the feature changes that most increase the likelihood of a desirable event occurring. We have developed a framework for this task that distinguishes between features that can and cannot be changed and that accounts for the difficulty and feasibility of each feature change, as well as the cumulative effort involved in implementing all changes. Our three formulations of the problem account for classifiers that are smooth and differentiable, those that are not, and the causal nature of the problem.
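As a rough illustration of the idea, and not the formulations used in our published work, the sketch below assumes a logistic model with known weights, features bounded in [0, 1], a per-feature effort cost, and a total effort budget. All names (`inverse_classify`, `changeable`, `cost`, `budget`) and all numbers are hypothetical. Because the logit is linear in this simplified setting, spending the budget on changeable features in order of weight magnitude per unit cost is enough; nonlinear or non-differentiable classifiers would need a different optimizer.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def inverse_classify(x0, w, b, changeable, cost, budget, lo=0.0, hi=1.0):
    """Greedy budgeted recommendation for a logistic model (illustrative only)."""
    x = np.asarray(x0, dtype=float).copy()
    remaining = float(budget)
    # Rank changeable features by logit gain per unit of effort.
    order = sorted(np.flatnonzero(changeable), key=lambda i: -abs(w[i]) / cost[i])
    for i in order:
        if remaining <= 0 or w[i] == 0:
            break
        # Move toward whichever bound raises the logit.
        target = hi if w[i] > 0 else lo
        step = min(abs(target - x[i]), remaining / cost[i])
        x[i] += np.sign(w[i]) * step
        remaining -= step * cost[i]
    return x, sigmoid(w @ x0 + b), sigmoid(w @ x + b)

# Hypothetical instance: feature 2 (e.g., age) cannot be changed.
x0 = np.array([0.2, 0.5, 0.9, 0.4])
w = np.array([1.5, -2.0, 0.8, 1.0])   # assumed, already-fitted weights
b = -1.0
changeable = np.array([True, True, False, True])
cost = np.array([1.0, 4.0, 1.0, 2.0])  # effort per unit change
budget = 1.0

x_new, p_before, p_after = inverse_classify(x0, w, b, changeable, cost, budget)
print("recommended instance:", x_new)
print("probability before/after:", p_before, p_after)
```

The returned instance differs from the original only in changeable features, and the weighted size of the change stays within the budget, which mirrors the three ingredients described above: changeability, per-feature difficulty, and cumulative effort.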
The movie industry garners $30 billion in yearly revenue, yet fewer than a third of movies produced are profitable; on average, films amass $65 million in investment dollars. To better inform investment decisions and bring to light the factors that influence movie profitability, we constructed a movie investor assurance system (MIAS) capable of accurate profitability predictions (AUC, i.e., C-statistic, of 0.82). Using this predictive system, we were able to discover and characterize the factors (i.e., features) of movies most indicative of profitability [1,2].
The survivability of colorectal cancer (CRC) hovers around 40%. Timely and accurate diagnoses and prognoses are key to informing treatment decisions that ultimately affect survivability. Research on CRC and similar diseases has found that geographic location plays an important role in determining survival outlook. In these interdisciplinary works, we endeavor to forecast CRC survival more accurately, in the form of survival curves, for patients in the state of Iowa by developing and exploring geographical deep learning methodologies. We find that our devised representations produce more accurate results than baseline methods [7,10].
Hand hygiene, as practiced by healthcare workers, is a first line of defense against the spread of hospital-acquired infections, including antibacterial-resistant infections such as MRSA. In these multidisciplinary works, we use linear predictive models to examine sensor-based data provided by our industry partner, GOJO Industries, consisting of 24.5 million hand hygiene opportunity records collected from 19 distinct hospital facilities. The goal of the work is to leverage these models to uncover the factors that affect hand hygiene compliance, either positively or negatively, in order to inform future hand hygiene interventions aimed at increasing compliance and stemming the spread of disease [5,8]. These works were undertaken with the Computational Epidemiology Research Group (CompEpi).