
What is being said about LLM+RAG at Black Hat

The security risks of LLM+RAG were the talk of the town at Black Hat. What are those risks, and what can you do to address them?

One of the takeaways from Black Hat is the growing consensus among security practitioners about the unique and unparalleled data security risks associated with GenAI. Or as one attendee put it:

“Usually when I go to conferences I return inspired and full of new ideas. This year, I came back genuinely scared.”

In one particularly daunting presentation, Michael Bargury exposed the terrifying security risks of what can happen when an attacker gets access to Copilot. Copilot is a GenAI agent that answers a user’s questions using internal documents, data, and emails. When a user asks a question, Copilot looks for the relevant information to answer it using Retrieval-Augmented Generation (RAG). This has the potential to become an extremely powerful productivity tool, as it significantly lowers the technical threshold for getting valuable information and reduces context switching. This is why Morgan Stanley has given its financial advisors and support staff access to more than 100,000 documents using OpenAI.
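To make the mechanics concrete, here is a minimal sketch of the RAG pattern such agents follow: embed the question, rank documents by similarity, and let the LLM answer from the retrieved context. The embed() and generate() functions are hypothetical stand-ins for a real embedding model and LLM; this is not Copilot’s actual implementation.

```python
# Minimal sketch of the RAG pattern. embed() and generate() are
# hypothetical stand-ins for a real embedding model and LLM.
from typing import Callable

def retrieve(question: str,
             documents: list[str],
             embed: Callable[[str], list[float]],
             top_k: int = 3) -> list[str]:
    """Rank documents by cosine similarity to the question embedding."""
    def cosine(a: list[float], b: list[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        norm = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
        return dot / norm if norm else 0.0

    q_vec = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q_vec, embed(d)), reverse=True)
    return ranked[:top_k]

def answer(question: str, documents: list[str], embed, generate) -> str:
    """Retrieve the most relevant documents and let the LLM answer from them."""
    context = "\n---\n".join(retrieve(question, documents, embed))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return generate(prompt)
```

Note what this implies for security: the agent can answer from any document the retrieval step can reach, which is exactly why access controls on that document set matter so much.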

However, as organisations implement RAG, we’re becoming aware of several limitations that undermine trust in its output, such as naive ranking, information loss due to chunking, and parsing issues. And while the industry is addressing these limitations with multi-agent AI systems, we’re still faced with important security risks such as data poisoning and data loss through RAG.

Handing over the keys to your company’s body of knowledge to Copilot introduces cybersecurity risks that make even the most seasoned cybersecurity professional squirm. If an attacker gets access to the agent, they can exfiltrate or poison virtually all of your company’s data. Until recently, Microsoft made it very easy for external attackers to get access to your Copilot by making it publicly accessible over the web by default.

Even when access is limited to employees, the security risks remain significant. Credential theft is still the most popular mode of attack, and 62% of GenAI security breaches originate from internal parties, so the risk of data breaches through Copilot persists even when only employees can use it.

“30% of enterprises deploying AI had a security breach.” (Gartner, 2024)

It’s clear that Copilot, and by extension GenAI, has the potential to truly democratize access to information for non-technical users, but without good data security controls it poses huge cybersecurity risks. Below are five data security controls that will significantly reduce that risk:

  • Limit access to GenAI agents: Make sure only employees and internal parties get access to your GenAI agents.
  • Data security for fine-tuning: Fine-tuning is a powerful way to improve the performance of a general-purpose SLM or LLM on specific tasks by training it on specialised datasets, but that data ends up encoded in the model’s parameters, making it near impossible to prevent data loss through the model. When fine-tuning, either use synthetic data, apply differential privacy, or at the very least mask all sensitive data and apply appropriate filters (see the masking sketch after this list).
  • Reduce write access for GenAI agents to prevent data poisoning: Data poisoning happens when an attacker prompt-engineers a GenAI agent into corrupting the data used for training, fine-tuning, or RAG, so that the agent produces inaccurate results or is manipulated into unwanted actions. This can be prevented by removing all write access the agent might have (see the read-only tool registry sketch after this list).
  • Limit read access: Use federated identity to limit what the GenAI agent can retrieve to what the end user is allowed to access, following the principle of least privilege (see the filtered-retrieval sketch after this list).
  • Monitor and audit: Given RAG’s unpredictable querying behaviour, it is extremely important to continuously monitor what data the agent has access to and how it uses that data. On top of that, regularly review changes to access so you can detect suspicious activity.
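As a concrete illustration of masking sensitive data before fine-tuning, here is a minimal sketch using regular expressions. The patterns and placeholder labels are simplified assumptions; a production pipeline would rely on a dedicated PII detection service rather than a handful of regexes.

```python
# Illustrative sketch: masking obvious PII before a record enters a
# fine-tuning dataset. The patterns below are simplified assumptions;
# real deployments would use a dedicated PII detection service.
import re

PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "CARD": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def mask_pii(text: str) -> str:
    """Replace detected PII with typed placeholders."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

record = "Contact Jane at jane.doe@example.com, SSN 123-45-6789."
print(mask_pii(record))  # Contact Jane at [EMAIL], SSN [SSN].
```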
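Removing write access can be as simple as a deny-by-default tool registry that only exposes read-only tools to the agent, so a prompt-injected instruction has nothing to write with. The tool names and registry shape below are illustrative assumptions, not a specific product’s API.

```python
# Sketch: deny-by-default registry exposing only read-only tools, so a
# prompt-injected agent cannot write to or delete data.
READ_ONLY_TOOLS = {
    "search_documents": lambda query: f"results for {query!r}",
    "read_document": lambda doc_id: f"contents of {doc_id}",
}

def call_tool(name: str, *args):
    """Dispatch an agent tool call; anything not allowlisted is refused."""
    tool = READ_ONLY_TOOLS.get(name)
    if tool is None:
        raise PermissionError(f"tool {name!r} is not allowlisted (no write access)")
    return tool(*args)

print(call_tool("search_documents", "Q3 revenue"))
# call_tool("update_document", "doc-1", "new text")  -> PermissionError
```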
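Finally, a sketch of least-privilege retrieval combined with audit logging: the agent may only surface chunks the end user is entitled to read, and every retrieval is logged for review. The document and permission shapes are illustrative assumptions.

```python
# Sketch: filter retrieval candidates by the end user's entitlements
# (least privilege) and log every retrieval for auditing.
import logging
from dataclasses import dataclass

logging.basicConfig(level=logging.INFO)
audit = logging.getLogger("rag.audit")

@dataclass
class Chunk:
    text: str
    allowed_groups: frozenset[str]  # groups entitled to read this chunk

def authorized_retrieve(user: str, user_groups: set[str],
                        candidates: list[Chunk]) -> list[str]:
    """Return only the chunks the end user may access; log the decision."""
    permitted = [c for c in candidates if user_groups & c.allowed_groups]
    audit.info("user=%s retrieved=%d denied=%d",
               user, len(permitted), len(candidates) - len(permitted))
    return [c.text for c in permitted]

chunks = [
    Chunk("Q3 board deck", frozenset({"executives"})),
    Chunk("Public FAQ", frozenset({"everyone"})),
]
print(authorized_retrieve("alice", {"everyone"}, chunks))  # ['Public FAQ']
```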

How Raito helps

Raito offers a central platform to manage fine-grained access and data security controls for the structured and unstructured data used by GenAI. With Raito, you can RAG your body of knowledge without creating undue security risks.

Monitor

Raito lets you centrally monitor data access and usage, track your data security maturity, and detect and remediate data security risks in your multi-cloud environment. This helps you detect excessive access privileges for GenAI, service accounts being used to RAG data, and unusual access patterns.

Manage

With Raito, you can centrally manage a GenAI agent’s access to all your company’s structured and unstructured data. Raito’s identity-centric access controls, access management federation, and DevOps integration let you implement consistent, universal, fine-grained permissions at scale in a multi-cloud environment.

Automate

RAG’s non-deterministic nature and access to large amounts of data mean that traditional access control technologies like ACLs and RBAC do not scale. RAG requires a more dynamic and scalable approach: Attribute-Based Access Control (ABAC), where access is determined dynamically from the data’s and the user’s attributes. Raito’s ABAC policies let you dynamically grant access and mask and filter data using the metadata from your data providers or data catalogs.
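As a minimal illustration of the ABAC idea (not Raito’s actual policy model), here is a sketch where an access decision is computed at query time from user and data attributes; the attribute names and the policy itself are assumptions for the example.

```python
# Minimal ABAC sketch: access is computed at query time from user and
# data attributes rather than from static role or ACL lists.
from dataclasses import dataclass

@dataclass
class User:
    department: str
    region: str
    clearance: str

@dataclass
class Dataset:
    owner_department: str
    region: str
    sensitivity: str  # "public" | "internal" | "restricted"

def abac_allows(user: User, data: Dataset) -> bool:
    """Grant access only when user attributes satisfy the data's attributes."""
    if data.sensitivity == "public":
        return True
    same_region = user.region == data.region
    if data.sensitivity == "internal":
        return same_region
    # "restricted": same department, same region, and high clearance
    return (same_region
            and user.department == data.owner_department
            and user.clearance == "high")

analyst = User(department="finance", region="EU", clearance="high")
ledger = Dataset(owner_department="finance", region="EU", sensitivity="restricted")
print(abac_allows(analyst, ledger))  # True
```

Because the decision is a function of attributes, adding a new dataset or user requires no new role definitions; the same policy scales across a multi-cloud estate.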

Talk to the team