Data is the new frontier for AI
Anthropic recently open-sourced the Model Context Protocol (MCP), an open standard designed to give AI agents access to your organisation's data. MCP provides a universal protocol that connects AI systems with various data sources, such as databases, file systems, Git repositories, and APIs. Instead of building a custom connector for each data source, developers can rely on one protocol, which makes it much easier to integrate AI with different systems.
Once connected to data, AI agents can assist with all sorts of tasks, whether that is helping with code, generating summaries, or getting a sense of the organisation's sales numbers.
The enormous benefits of giving AI agents access to data and code repositories come with significant data security risks. We've already seen several incidents where AI agents accidentally leaked sensitive data. At the same time, regulators are tightening the rules around both data security and AI, so organisations need to implement strong safeguards.
Data Security for AI Agents
Because attackers can bypass AI guardrails through techniques like prompt injection, organisations must implement multiple lines of defence. The last and most important line of defence is strict data access control for AI agents.
Unfortunately, this is hard to achieve with traditional IAM technologies, which were initially designed to manage access to applications, not individual datasets. In modern cloud platforms, where data often spans multiple lines of business, domains, and geographies, this leads to overly broad access controls. To achieve fine-grained access controls for AI agents, we need a framework that integrates federated identity, identity-centric security, automation using data classification, and continuous monitoring.
Federated Identity
AI agents should access data using the end user's identity, so that they can only access the data the end user is authorised to see. If federated identity is not possible, you should limit the AI agent's access to the bare minimum needed for its use case.
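The rule above can be sketched as a small authorisation check. This is a minimal illustration, not a real identity provider integration: the `EndUser` type, the `agent_can_read` function, and the dataset names are all hypothetical, and in practice the user's entitlements would come from your identity platform rather than an in-memory set.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class EndUser:
    """Hypothetical end-user identity, as resolved by an identity provider."""
    name: str
    allowed_datasets: FrozenSet[str]


# Hypothetical fallback scope used when no end-user identity is available:
# the bare minimum the agent needs for its use case.
AGENT_MINIMUM_DATASETS = frozenset({"public_docs"})


def agent_can_read(dataset: str, on_behalf_of: Optional[EndUser]) -> bool:
    """Return True if the agent may read `dataset` for this request."""
    if on_behalf_of is not None:
        # Federated case: the agent inherits exactly the end user's permissions.
        return dataset in on_behalf_of.allowed_datasets
    # No user identity: fall back to the agent's own minimal scope.
    return dataset in AGENT_MINIMUM_DATASETS


alice = EndUser("alice", frozenset({"sales_emea", "public_docs"}))
print(agent_can_read("sales_emea", alice))   # True
print(agent_can_read("hr_salaries", alice))  # False
print(agent_can_read("sales_emea", None))    # False
```

The point of the design is that the agent never holds broader rights than the person it is acting for, and holds even fewer when it acts on its own.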
Identity-Centric Security
Managing access across multiple systems for AI agents requires an identity-centric approach. Rather than managing access for different accounts within the individual systems, access should be centralised and based on the identity of the end user or the machine.
Below you can see how the admin manages access for an AI agent (Claude) across its different accounts in Azure, Snowflake, GitHub and Google Drive. By doing so, they can monitor and control data access and usage across the organisation from a central point, rather than handling it in isolated silos.
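As a rough sketch of what "central" means here, the registry below keys every grant on one identity rather than on the per-system accounts. The `AccessRegistry` class and the account and role names are invented for illustration; a real implementation would also push each change to the target system, which this in-memory version does not do.

```python
from collections import defaultdict
from typing import Dict, List


class AccessRegistry:
    """Hypothetical central record of grants, keyed by identity."""

    def __init__(self) -> None:
        self._grants: Dict[str, List[dict]] = defaultdict(list)

    def grant(self, identity: str, system: str, account: str, role: str) -> None:
        # Record which account and role the identity holds in a given system.
        self._grants[identity].append(
            {"system": system, "account": account, "role": role}
        )

    def grants_for(self, identity: str) -> List[dict]:
        # One query answers "what can this identity reach, everywhere?"
        return list(self._grants.get(identity, []))

    def revoke_all(self, identity: str) -> List[dict]:
        # Remove every grant for the identity and return what was removed.
        return self._grants.pop(identity, [])


registry = AccessRegistry()
# Account and role names below are made up for the example.
registry.grant("claude", "Azure", "sp-claude", "Reader")
registry.grant("claude", "Snowflake", "CLAUDE_SVC", "READ_ONLY")
registry.grant("claude", "GitHub", "claude-bot", "read")
registry.grant("claude", "Google Drive", "claude@example.com", "viewer")

systems = [g["system"] for g in registry.grants_for("claude")]
print(systems)

removed = registry.revoke_all("claude")
print(len(removed), registry.grants_for("claude"))
```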
Automation using Data Classifications
With the massive amount of data stored in the cloud, manually managing access is nearly impossible. To handle access at scale, you need a framework where permissions are automatically assigned based on data classifications and user attributes.
In the example below, the owner gives Claude read-only access to all tables and views classified as customer data. Sensitive PII within those tables is automatically masked using a dynamic masking policy. This ensures compliance without manual intervention.
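The same idea can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the table catalogue, the `POLICIES` list, the `can_read` and `read_row` helpers, and the mask value. On a real data platform the classification tags and the masking policy would be enforced by the platform itself, not by application code.

```python
from typing import Dict, List

# Hypothetical catalogue: each table carries a classification and its PII columns.
TABLES: Dict[str, dict] = {
    "crm.customers": {"classification": "customer_data", "pii_columns": {"email", "phone"}},
    "finance.ledger": {"classification": "financial", "pii_columns": set()},
}

# One policy covers every table with the matching classification.
POLICIES: List[dict] = [
    {"principal": "claude", "classification": "customer_data", "permission": "read"},
]

MASK = "***MASKED***"


def can_read(principal: str, table: str) -> bool:
    """Access follows from the table's classification, not from per-table grants."""
    classification = TABLES[table]["classification"]
    return any(
        p["principal"] == principal
        and p["classification"] == classification
        and p["permission"] == "read"
        for p in POLICIES
    )


def read_row(principal: str, table: str, row: dict) -> dict:
    """Return the row with PII columns masked, or raise if access is not allowed."""
    if not can_read(principal, table):
        raise PermissionError(f"{principal} may not read {table}")
    pii = TABLES[table]["pii_columns"]
    return {col: (MASK if col in pii else value) for col, value in row.items()}


row = {"id": 1, "country": "BE", "email": "jane@example.com", "phone": "+32 000 00 00"}
masked = read_row("claude", "crm.customers", row)
print(masked)
print(can_read("claude", "finance.ledger"))  # False
```

A new table tagged as customer data is covered as soon as it is classified, with no extra grant to write.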
Continuous Monitoring
Lastly, you need to centrally monitor all the datasets your AI agents can access. It's no longer enough to simply track which groups an agent belongs to. To ensure regulatory compliance, you must keep a close eye on exactly which datasets the AI has access to and which ones it is actually interacting with.
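One simple way to picture this is to compare what an agent has been granted with what it has actually touched. The sketch below assumes a hypothetical access log made of `principal` and `dataset` entries; the `access_report` function and the dataset names are illustrative only.

```python
from typing import Dict, Iterable, Set


def access_report(
    principal: str, granted: Set[str], access_log: Iterable[dict]
) -> Dict[str, Set[str]]:
    """Compare the datasets a principal may access with those it actually used."""
    used = {e["dataset"] for e in access_log if e["principal"] == principal}
    return {
        "used": granted & used,       # granted and actually accessed
        "unused": granted - used,     # granted but never accessed
        "ungranted": used - granted,  # accessed without a matching grant
    }


granted = {"crm.customers", "crm.orders", "docs.handbook"}
access_log = [
    {"principal": "claude", "dataset": "crm.customers"},
    {"principal": "claude", "dataset": "hr.salaries"},
    {"principal": "other-agent", "dataset": "crm.orders"},
]

report = access_report("claude", granted, access_log)
for bucket, datasets in report.items():
    print(bucket, sorted(datasets))
```

Datasets in the "unused" bucket are candidates for revoking access, while anything in "ungranted" deserves immediate investigation.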
By combining automated access control, data classification, and centralised monitoring, you can ensure that your AI systems are both useful and secure, helping your organisation stay compliant in a rapidly evolving digital landscape.