Robust Decision Aggregation with Adversarial Experts (2403.08222v2)

Published 13 Mar 2024 in cs.LG and cs.AI

Abstract: We consider a robust aggregation problem in the presence of both truthful and adversarial experts. The truthful experts report their private signals truthfully, while the adversarial experts can report arbitrarily. We assume experts are marginally symmetric in the sense that they share the same common prior and marginal posteriors. The rule maker needs to design an aggregator that predicts the true world state from the experts' reports, without knowledge of the underlying information structures or adversarial strategies. We aim to find the optimal aggregator, which outputs a forecast minimizing regret under the worst-case information structure and adversarial strategies. The regret is defined as the difference in expected loss between the aggregator and a benchmark that aggregates optimally given the information structure and the reports of the truthful experts. We focus on binary states and reports. Under L1 loss, we show that the truncated mean aggregator is optimal: when there are at most k adversaries, this aggregator discards the k lowest and k highest reported values and averages the remaining ones. For L2 loss, the optimal aggregators are piecewise linear functions. All these optimality results hold when the ratio of adversaries is bounded above by a value determined by the experts' priors and posteriors. The regret depends only on the ratio of adversaries, not on their total number. For hard aggregators that output a decision, we prove that a randomized version of the truncated mean is optimal for both L1 and L2: this aggregator follows one of the remaining values uniformly at random after discarding the k lowest and k highest reported values. We extend the hard aggregator to the multi-state setting. We evaluate our aggregators numerically in an ensemble learning task. We also obtain negative results for general adversarial aggregation problems under broader information structures and report spaces.
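
The abstract fully specifies the two aggregators for binary reports, so a short sketch can make them concrete. The Python below is illustrative only: the function names, the example reports, and the assumption that there are more than 2k reports (so something survives truncation) are ours, not taken from the paper's code.

```python
import random


def truncated_mean(reports, k):
    """Soft aggregator: drop the k lowest and k highest reports, average the rest.

    Assumes len(reports) > 2 * k so at least one report survives truncation.
    """
    if len(reports) <= 2 * k:
        raise ValueError("need more than 2k reports to discard k from each end")
    kept = sorted(reports)[k:len(reports) - k]
    return sum(kept) / len(kept)


def randomized_truncated_report(reports, k, rng=random):
    """Hard aggregator: drop the k lowest and k highest reports, then follow
    one of the remaining reports chosen uniformly at random."""
    if len(reports) <= 2 * k:
        raise ValueError("need more than 2k reports to discard k from each end")
    kept = sorted(reports)[k:len(reports) - k]
    return rng.choice(kept)


# Example: 5 experts report posteriors for a binary state, at most k = 1 adversary.
reports = [0.62, 0.58, 0.95, 0.60, 0.05]
print(truncated_mean(reports, k=1))              # averages 0.58, 0.60, 0.62
print(randomized_truncated_report(reports, k=1)) # follows one of the surviving reports
```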
