CSE DSI Machine Learning Seminar with Ethan Fang (Duke University)
Estimation and Inference for Assortment Optimization
We present two works on assortment optimization. In the first part, we consider a class of assortment optimization problems in an offline data-driven setting. A firm does not know the underlying customer choice model but has access to an offline dataset consisting of historically offered assortments, customer choices, and revenues. The objective is to use the offline dataset to find an optimal assortment. Due to the combinatorial nature of assortment optimization, the offline dataset is likely to suffer from insufficient data coverage, which makes designing a provably efficient offline learning algorithm a significant challenge. To this end, we propose an algorithm based on the principle of pessimism, referred to as Pessimistic ASsortment opTimizAtion (PASTA for short). Under general settings, PASTA correctly identifies the optimal assortment while requiring only that the offline data cover the optimal assortment. In particular, we establish a regret bound for the offline assortment optimization problem under the celebrated multinomial logit model and show that this regret is minimax optimal. Joint work with Juncheng Dong, Weibin Mo, Zhengling Qi, Cong Shi, and Vahid Tarokh.
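To make the pessimism principle concrete, here is a minimal Python sketch under the multinomial logit model. It is an illustration only, not the PASTA algorithm itself: the abstract does not specify how PASTA builds its uncertainty set, so the box-shaped confidence intervals (`lower`, `upper`), the brute-force search, and all function names and numbers below are assumptions chosen for a three-product toy example.

```python
from itertools import combinations, product


def mnl_revenue(assortment, revenues, weights):
    """Expected revenue of an assortment under the multinomial logit model.

    weights[i] is the preference weight of product i; the no-purchase
    option is normalized to weight 1.
    """
    denom = 1.0 + sum(weights[i] for i in assortment)
    return sum(revenues[i] * weights[i] for i in assortment) / denom


def pessimistic_revenue(assortment, revenues, lower, upper):
    """Worst-case revenue when each weight lies in [lower[i], upper[i]].

    The revenue is linear-fractional in the weights, so its minimum over a
    box is attained at a corner; we enumerate the corners (toy sizes only).
    """
    items = list(assortment)
    corners = product(*[(lower[i], upper[i]) for i in items])
    return min(mnl_revenue(items, revenues, dict(zip(items, c))) for c in corners)


def best_assortment(revenues, score):
    """Brute-force search over all non-empty assortments for the best score."""
    n = len(revenues)
    candidates = [s for k in range(1, n + 1) for s in combinations(range(n), k)]
    return max(candidates, key=score)


# Hypothetical inputs: product 0 is poorly covered by the data (wide interval).
revenues = [10.0, 6.0, 3.9]
estimate = [0.5, 0.5, 0.5]    # assumed point estimates of the weights
lower = [0.1, 0.45, 0.45]     # assumed lower confidence bounds
upper = [0.9, 0.55, 0.55]     # assumed upper confidence bounds

plug_in = best_assortment(revenues, lambda s: mnl_revenue(s, revenues, estimate))
pessimistic = best_assortment(
    revenues, lambda s: pessimistic_revenue(s, revenues, lower, upper)
)
print(plug_in, pessimistic)
```

In this toy instance the plug-in rule selects products (0, 1), while the pessimistic rule selects (0, 1, 2): because the weight of product 0 is poorly estimated, maximizing the worst-case revenue hedges by also offering a well-estimated product.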
In the second part, we consider the inferential problem in assortment optimization. Uncertainty quantification for the optimal assortment remains largely unexplored despite its great practical significance. Instead of estimating and recovering the complete optimal offer set, decision-makers may only be interested in testing whether a given property holds for the optimal assortment, such as whether several products of interest should be included in the optimal set, or how many categories of products the optimal set should include. We propose a novel inferential framework for testing such properties. We reduce inference on a general property of the optimal assortment to quantifying the uncertainty in detecting the sign change point of the marginal revenue gaps. We show the asymptotic normality of the marginal revenue gap estimator and construct a maximum statistic from the gap estimators to detect the sign change point. By approximating the distribution of the maximum statistic with multiplier bootstrap techniques, we obtain a valid testing procedure. Joint work with Shuting Shen, Xi Chen, and Junwei Lu.
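The multiplier bootstrap step mentioned above can be sketched generically as follows. This is not the testing procedure from the talk: the abstract does not give the form of the marginal revenue gap estimator or of its maximum statistic, so the code below assumes a simple setting in which each observation contributes a vector of `d` terms and the null hypothesis is that every coordinate has mean zero. The function names, the unstudentized two-sided statistic, and the Gaussian multipliers are all assumptions made for illustration.

```python
import math
import random


def max_statistic(samples):
    """Two-sided maximum statistic: max_j |sqrt(n) * mean_j|."""
    n, d = len(samples), len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(d)]
    return math.sqrt(n) * max(abs(m) for m in means)


def multiplier_bootstrap_quantile(samples, alpha=0.05, n_boot=2000, seed=0):
    """Approximate the (1 - alpha) quantile of the maximum statistic.

    Each bootstrap draw reweights the centered observations with i.i.d.
    standard normal multipliers e_i and records
    max_j |n^{-1/2} * sum_i e_i * (x_ij - mean_j)|.
    """
    rng = random.Random(seed)
    n, d = len(samples), len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in samples]

    draws = []
    for _ in range(n_boot):
        e = [rng.gauss(0.0, 1.0) for _ in range(n)]
        stat = max(
            abs(sum(e[i] * centered[i][j] for i in range(n))) / math.sqrt(n)
            for j in range(d)
        )
        draws.append(stat)

    draws.sort()
    # Index of the empirical (1 - alpha) quantile, clamped to a valid position.
    idx = min(n_boot - 1, max(0, math.ceil((1 - alpha) * n_boot) - 1))
    return draws[idx]


def reject_null(samples, alpha=0.05, n_boot=2000, seed=0):
    """Reject when the observed maximum exceeds the bootstrap critical value."""
    critical = multiplier_bootstrap_quantile(samples, alpha, n_boot, seed)
    return max_statistic(samples) > critical
```

Calling `reject_null(samples)` on a list of per-observation vectors compares the observed maximum against the bootstrap critical value. Taking the maximum over coordinates lets a single critical value control all coordinates at once, which is the usual reason for pairing a maximum statistic with a multiplier bootstrap.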
Ethan Fang is an Associate Professor of Biostatistics & Bioinformatics at Duke and is affiliated with the Decision Sciences group at the Fuqua School of Business. He works on a range of data science problems from both computational and inferential perspectives. Before joining Duke, he was an Assistant Professor of Statistics at Penn State. He received his PhD in Operations Research and Financial Engineering from Princeton under Han Liu and Bob Vanderbei, and his Bachelor’s degree in Mathematics from the National University of Singapore under Kim-Chuan Toh. He currently serves as an Associate Editor of Annals of Statistics and Operations Research.