Papers

Topics

Authors

Recent

View all

Detailed Answer

Quick Answer

Concise responses based on abstracts only

Detailed Answer

Well-researched responses based on abstracts and relevant paper content.

Custom Instructions Pro

Preferences or requirements that you'd like Emergent Mind to consider when generating responses

Gemini 2.5 Flash

Gemini 2.5 Flash 47 tok/s

Gemini 2.5 Pro 44 tok/s Pro

GPT-5 Medium 13 tok/s Pro

GPT-5 High 12 tok/s Pro

GPT-4o 64 tok/s Pro

Kimi K2 160 tok/s Pro

GPT OSS 120B 452 tok/s Pro

Claude Sonnet 4 36 tok/s Pro

2000 character limit reached

Rarely-switching linear bandits: optimization of causal effects for the real world (1905.13121v2)

Published 30 May 2019 in stat.ML and cs.LG

Abstract: Excessively changing policies in many real world scenarios is difficult, unethical, or expensive. After all, doctor guidelines, tax codes, and price lists can only be reprinted so often. We may thus want to only change a policy when it is probable that the change is beneficial. In cases that a policy is a threshold on contextual variables we can estimate treatment effects for populations lying at the threshold. This allows for a schedule of incremental policy updates that let us optimize a policy while making few detrimental changes. Using this idea, and the theory of linear contextual bandits, we present a conservative policy updating procedure which updates a deterministic policy only when justified. We extend the theory of linear bandits to this rarely-switching case, proving that such procedures share the same regret, up to constant scaling, as the common LinUCB algorithm. However the algorithm makes far fewer changes to its policy and, of those changes, fewer are detrimental. We provide simulations and an analysis of an infant health well-being causal inference dataset, showing the algorithm efficiently learns a good policy with few changes. Our approach allows efficiently solving problems where changes are to be avoided, with potential applications in medicine, economics and beyond.

Citations (4)

View on Semantic Scholar