Harnessing the Continuous Structure: Utilizing the First-order Approach in Online Contract Design (2403.07143v3)
Abstract: This work studies the online contract design problem. The principal's goal is to learn the optimal contract that maximizes her utility through repeated interactions, without prior knowledge of the agent's type (i.e., the agent's cost and production functions). We leverage the structure provided by continuous action spaces, which allows first-order conditions (FOC) to characterize the agent's behavior. In some cases we rely on conditions from the first-order approach (FOA) in economics; in other settings we can apply FOC without additional assumptions, leading to simpler and more principled algorithms. We illustrate this approach in three problem settings. First, we study the problem of learning the optimal contract when there can be many outcomes. In contrast to prior works that design highly specialized algorithms, we show that the problem can be directly reduced to Lipschitz bandits. Second, we study the problem of learning linear contracts. While the contracting problem involves hidden action (moral hazard) and the pricing problem involves hidden value (adverse selection), the two problems share a similar optimization structure, which enables a direct reduction between learning linear contracts and dynamic pricing. Third, we study the problem of learning contracts with many outcomes when agents are identical, and we provide an algorithm with polynomial sample complexity.
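To make the role of the first-order condition concrete, here is a minimal toy sketch of a linear contract. The production function f(a) = a, the cost function c(a) = a^2 / 2, and the helper names `best_response`, `principal_utility`, and `grid_search` are all assumptions chosen for illustration; they are not taken from the paper, and the grid search is a stand-in for a learning algorithm, not the paper's method.

```python
# Toy illustration only. The choices f(a) = a and c(a) = a**2 / 2 are
# assumptions made for this sketch; they do not come from the paper.


def best_response(beta: float) -> float:
    """Agent's action under a linear contract that pays a share beta of output.

    The agent maximizes beta * f(a) - c(a) = beta * a - a**2 / 2.
    The first-order condition beta - a = 0 gives a = beta.
    """
    return beta


def principal_utility(beta: float) -> float:
    """Principal keeps the share (1 - beta) of the output f(a) = a."""
    a = best_response(beta)
    return (1.0 - beta) * a


def grid_search(num_points: int = 101) -> float:
    """Pick the best share on a uniform grid over [0, 1]."""
    grid = [i / (num_points - 1) for i in range(num_points)]
    return max(grid, key=principal_utility)


best_beta = grid_search()
print(best_beta, principal_utility(best_beta))
```

Under these assumed functions the principal's objective is (1 - beta) * beta, which has the same form as the posted-price revenue p * (1 - p) against a buyer whose value is uniform on [0, 1]. This coincidence only hints at the kind of shared optimization structure the abstract mentions; it is not the reduction itself.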