Emergent Mind

Dynamic pricing and assortment under a contextual MNL demand

(2110.10018)
Published Oct 19, 2021 in cs.LG

Abstract

We consider dynamic multi-product pricing and assortment problems under an unknown demand over T periods, where in each period, the seller decides on the price for each product or the assortment of products to offer to a customer who chooses according to an unknown Multinomial Logit Model (MNL). Such problems arise in many applications, including online retail and advertising. We propose a randomized dynamic pricing policy based on a variant of the Online Newton Step algorithm (ONS) that achieves a $O(d\sqrt{T}\log(T))$ regret guarantee under an adversarial arrival model. We also present a new optimistic algorithm for the adversarial MNL contextual bandits problem, which achieves a better dependency than the state-of-the-art algorithms in a problem-dependent constant $\kappa2$ (potentially exponentially small). Our regret upper bound scales as $\tilde{O}(d\sqrt{\kappa2 T}+ \log(T)/\kappa2)$, which gives a stronger bound than the existing $\tilde{O}(d\sqrt{T}/\kappa2)$ guarantees.

We're not able to analyze this paper right now due to high demand.

Please check back later (sorry!).

Generate a summary of this paper on our Pro plan:

We ran into a problem analyzing this paper.

Newsletter

Get summaries of trending comp sci papers delivered straight to your inbox:

Unsubscribe anytime.