Fitting and Testing Sparse High-Order Interaction Models

Finding statistically significant high-order interactions in predictive modeling is an important but challenging task because the number of possible high-order interactions is extremely large (e.g., $> 10^{17}$). In this study, we propose feature selection and statistical inference methods for sparse high-order interaction models. Our first contribution is an efficient optimization algorithm for L1-regularized empirical risk minimization problems. Our second contribution is to extend the recently developed selective inference framework to sparse high-order interaction models by developing a novel algorithm that efficiently characterizes the selection event. We demonstrate the effectiveness of the proposed methods by applying them to an HIV drug response prediction problem.
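
To give a sense of the scale involved, the following minimal sketch counts the candidate interaction terms among binary features. It assumes that an interaction of order k corresponds to a choice of k distinct features out of d. The function name `count_interactions` and the values d = 10,000 and maximum order 5 are illustrative assumptions, not figures taken from this study.

```python
from math import comb


def count_interactions(d: int, max_order: int) -> int:
    """Count feature subsets of size 1..max_order drawn from d features.

    Assumption: each interaction term is identified by a set of distinct
    features, so order-k interactions number C(d, k).
    """
    return sum(comb(d, k) for k in range(1, max_order + 1))


if __name__ == "__main__":
    # Hypothetical problem size, chosen only to illustrate the growth.
    d, max_order = 10_000, 5
    total = count_interactions(d, max_order)
    print(f"d={d}, max_order={max_order}: {total:.3e} candidate terms")
    print("exceeds 10^17:", total > 10**17)
```

Under these assumed sizes the count already exceeds $10^{17}$, which suggests why explicitly enumerating all interaction features is impractical and why both the optimization and the characterization of the selection event need to avoid doing so.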