Black Box Adversarial Prompting for Foundation Models (2302.04237v2)

Published 8 Feb 2023 in cs.LG

Abstract: Prompting interfaces allow users to quickly adjust the output of generative models in both vision and language. However, small changes and design choices in the prompt can lead to significant differences in the output. In this work, we develop a black-box framework for generating adversarial prompts for unstructured image and text generation. These prompts, which can be standalone or prepended to benign prompts, induce specific behaviors in the generative process, such as generating images of a particular object or generating high-perplexity text.
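To make the black-box setting concrete, here is a minimal sketch of a query-only search for a prefix that is prepended to a benign prompt. It is not the paper's method: the abstract does not specify the optimizer, so plain random search is swapped in for illustration. The names `black_box_prompt_search`, `toy_score`, and the tiny vocabulary are all hypothetical; in a real setup `score` would query the generative model and measure the target behavior (for example, a classifier's confidence that the generated image shows a particular object).

```python
import random


def black_box_prompt_search(score, vocab, benign_prompt,
                            n_tokens=3, n_queries=200, seed=0):
    """Random search over a short prefix using only score queries.

    `score` is treated as a black box: no gradients or model internals
    are used, only the scalar it returns for a full prompt.
    """
    rng = random.Random(seed)
    best_tokens = [rng.choice(vocab) for _ in range(n_tokens)]
    best_score = score(" ".join(best_tokens + [benign_prompt]))
    for _ in range(n_queries):
        # Mutate one position of the current best prefix.
        candidate = list(best_tokens)
        candidate[rng.randrange(n_tokens)] = rng.choice(vocab)
        s = score(" ".join(candidate + [benign_prompt]))
        # Keep the candidate only if it strictly improves the objective.
        if s > best_score:
            best_tokens, best_score = candidate, s
    return " ".join(best_tokens), best_score


def toy_score(prompt):
    # Hypothetical stand-in for "run the model and measure the behavior":
    # counts occurrences of a target word in the prompt itself.
    return prompt.lower().count("dog")


vocab = ["dog", "cat", "tree", "car", "sky"]  # assumed toy vocabulary
benign = "a photo of a mountain"
prefix, value = black_box_prompt_search(toy_score, vocab, benign)
print(prefix, value)
```

The search only ever calls `score`, which is what makes it black-box; swapping in a more sample-efficient optimizer would change the loop but not the interface.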

Citations (42)

Tweets

https://twitter.com/max_paperclips/status/1745961150959976611

https://twitter.com/MrJonHowells/status/1755597224778846216

https://twitter.com/secemp9/status/1816768909053190586
