Meta has recently announced a range of new tools and updates aimed at improving security, enhancing privacy, and fortifying the protection of its Llama large language model (LLM) for the open-source AI community.
In an effort to compete with OpenAI’s ChatGPT, Meta launched Meta AI on April 29, 2025, during its first-ever LlamaCon, an event focused on the Llama family of AI models.
Meta AI will operate as an independent app driven by Llama 4, offering tailored responses that leverage users’ Facebook and Instagram accounts, and will also integrate with Meta AI glasses. Additionally, it will feature a Discover feed that highlights interactions between the tool and social media connections.
During the LlamaCon event, the company unveiled several other updates focused on security, privacy, and tools for the open-source AI community.
Key announcements included:
- LlamaFirewall, a new security measure designed to thwart malicious activities targeting AI models and applications.
- CyberSec Eval 4, Meta’s latest open-source cybersecurity benchmark suite featuring two new tools—CyberSOC Eval and AutoPatchBench—to evaluate the defensive capabilities of AI systems.
- Llama Defenders Program, an initiative aimed at providing partners and developers access to various AI solutions to meet diverse security needs.
Additionally, Meta introduced Private Processing, a technology that purportedly allows WhatsApp users to access AI functionalities, such as message summarization, while ensuring message privacy. The company is collaborating with the security community to audit and refine this technology prior to its release.
New and Updated Llama Protection Tools
The highlight for the Llama open-source community is the introduction of LlamaFirewall. Meta claims this tool can coordinate with guard models and operates alongside its suite of protective tools to identify and mitigate risks to AI systems such as prompt injection and insecure code.
A companion paper providing a detailed technical overview of LlamaFirewall has also been released. Furthermore, Meta rolled out Llama Guard 4, an upgraded version of its safeguarding tool for Llama applications—both for image and text generation—designed to remove unwanted content, including violence, sexually explicit material, and privacy violations.
Llama Guard 4 is available via the new Llama API, currently in a limited preview. Jeff Boudier from Hugging Face expressed on LinkedIn that the release of Llama Guard 4 was the standout feature of LlamaCon.
Meta’s New LLM Benchmark for Cyber Applications
An important announcement made at LlamaCon was the unveiling of CyberSecEval 4, Meta’s latest benchmark suite for assessing the security risks and capabilities of LLMs in cybersecurity contexts. This suite introduces new tools:
- CyberSOC Eval, a framework developed in collaboration with CrowdStrike to evaluate the effectiveness of AI systems in Security Operation Centers (SOCs).
- AutoPatchBench, a standardized framework for researchers to assess how well Llama and other AI systems can automatically patch security vulnerabilities through fuzzing techniques.
AutoPatchBench can be downloaded from GitHub, while CyberSOC Eval is set to be publicly released in the coming weeks.
Meta’s Llama Defenders Program
In addition, Meta announced the launch of the Llama Defenders Program, which aims to offer various tools, including an Automated Sensitive Document Classification Tool and detectors for AI-generated audio content. These resources could help organizations identify threats such as scams and phishing attempts.
Meta is partnering with companies like Zendesk, Bell Canada, and AT&T to integrate these tools into their systems and has invited other organizations to seek further information about the Llama Defenders Program via its website.