
Learning to Persuade on the Fly: Robustness Against Ignorance (2102.10156v2)

Published 19 Feb 2021 in cs.GT, cs.LG, and econ.TH

Abstract: Motivated by information sharing in online platforms, we study repeated persuasion between a sender and a stream of receivers where at each time, the sender observes a payoff-relevant state drawn independently and identically from an unknown distribution, and shares state information with the receivers who each choose an action. The sender seeks to persuade the receivers into taking actions aligned with the sender's preference by selectively sharing state information. However, in contrast to the standard models, neither the sender nor the receivers know the distribution, and the sender has to persuade while learning the distribution on the fly. We study the sender's learning problem of making persuasive action recommendations to achieve low regret against the optimal persuasion mechanism with the knowledge of the distribution. To do this, we first propose and motivate a persuasiveness criterion for the unknown distribution setting that centers robustness as a requirement in the face of uncertainty. Our main result is an algorithm that, with high probability, is robustly-persuasive and achieves $O(\sqrt{T\log T})$ regret, where $T$ is the horizon length. Intuitively, at each time our algorithm maintains a set of candidate distributions, and chooses a signaling mechanism that is simultaneously persuasive for all of them. Core to our proof is a tight analysis about the cost of robust persuasion, which may be of independent interest. We further prove that this regret order is optimal (up to logarithmic terms) by showing that no algorithm can achieve regret better than $\Omega(\sqrt{T})$.
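The idea of keeping a set of candidate distributions and signaling in a way that is persuasive for all of them can be illustrated with a toy sketch. This is not the paper's algorithm. It assumes a hypothetical binary setting: the state is good or bad, the receiver gains +1 from adopting in the good state and loses 1 in the bad state, and the sender always prefers adoption. The sender recommends "adopt" with probability 1 in the good state and with probability q in the bad state. Under a prior mu = P(good), that recommendation is persuasive when mu - (1 - mu) q >= 0, so being persuasive for every mu in an interval [lo, hi] requires q <= lo / (1 - lo). The Hoeffding-style confidence radius and the names `robust_bad_state_prob` and `run` are assumptions made for illustration.

```python
import math
import random


def robust_bad_state_prob(lo: float) -> float:
    """Largest probability of recommending 'adopt' in the bad state that
    remains persuasive for every candidate prior P(good) >= lo.

    Toy binary model (an assumption, not taken from the paper):
    persuasive iff mu - (1 - mu) * q >= 0, which is tightest at mu = lo.
    """
    if lo <= 0.0:
        return 0.0          # no evidence the state is ever good: reveal fully
    if lo >= 0.5:
        return 1.0          # receiver adopts even with no information
    return lo / (1.0 - lo)


def run(true_mu: float, horizon: int, seed: int = 0) -> list[float]:
    """Simulate the sender learning P(good) while signaling robustly.

    Returns the bad-state recommendation probability q used in each round.
    """
    rng = random.Random(seed)
    good_count = 0
    qs = []
    for t in range(1, horizon + 1):
        n = t - 1  # number of states observed so far
        if n == 0:
            lo = 0.0  # no data yet: candidate set is all of [0, 1]
        else:
            mu_hat = good_count / n
            # Hypothetical confidence radius (Hoeffding-style).
            radius = math.sqrt(math.log(horizon) / (2 * n))
            lo = max(0.0, mu_hat - radius)
        # Mechanism that is persuasive for all priors in the candidate set.
        qs.append(robust_bad_state_prob(lo))
        # Observe this round's state and update the empirical count.
        good_count += rng.random() < true_mu
    return qs
```

For example, `run(0.3, 5000)` returns a sequence of q values that starts at 0 (full revelation) and grows as the candidate set shrinks. The gap between each q and the known-prior optimum 0.3 / 0.7 is the per-round cost of robustness in this toy model.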

