Leveraging MT to Sabotage AI Systems
Thread poster: brovxidfmgan (X)
brovxidfmgan (X)
Feb 1

The academics ran 520 harmful prompts through GPT-4, translating the queries from English into other languages and then translating the responses back again, and found that they were able to bypass its safety guardrails about 79 percent of the time using Zulu, Scots Gaelic, Hmong, or Guarani. The attack is about as successful as other types of jail-breaking methods that are more complex and technical to pull off, the team claimed.

[…]

They don't always work, however, and GPT-4 may generate nonsensical answers. It's not clear whether that issue lies with the model itself, or stems from a bad translation, or both.


https://www.theregister.com/2024/01/31/gpt4_gaelic_safety/


 
brovxidfmgan (X)
TOPIC STARTER
Feb 1

Why current generative AI systems are a glass cannon [PDFs]:


https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.100-2e2023.pdf
https://owasp.org/www-project-top-10-for-large-language-model-applications/assets/PDF/OWASP-Top-10-for-LLMs-2023-v1_1.pdf
https://arxiv.org/pdf/2312.00157.pdf
https://arxiv.org/pdf/2307.15043.pdf

[Edited at 2024-02-02 02:22 GMT]


 








