433 Central Ave., 4th Floor, St. Petersburg, FL 33701 | [email protected] | Office: (813) 563-2652
In this Help Net Security interview, Dr. Peter Garraghan, CEO of Mindgard, discusses their research around vulnerabilities in the guardrails used to protect large AI models. The findings highlight how even billion-dollar LLMs can be bypassed using surprisingly simple techniques, including emojis. To defend against prompt injection, many LLMs are wrapped in guardrails that inspect and filter prompts. But these guardrails are typically AI-based classifiers themselves, and, as Mindgard’s study shows, they are just as … More → The post Why security teams cannot rely solely on AI guardrails appeared first on Help Net Security.
http://news.poseidon-us.com/TKjKnL