In a previous post I wrote about the challenges of Data Security for Data Products, but after talking with several data engineers, data architects, and security experts, I realized that I missed something important. Access management for Data Products is not limited to the underlying data. Depending on your role in the data team, you also need the right access to infrastructure and code repositories, and the type of permissions you need depends on the job ahead.
Why am I writing about this now? Apart from having time on my hands now that I finished the second season of Foundation, the recent release of NIST CSF 2.0 prompted me to pick up my proverbial pen and write this addendum to my previous post. The NIST framework, like the upcoming NIS2 Directive, requires organisations to limit access to, and monitor usage of, not only data but also infrastructure and code bases. This matters all the more when you’re using managed services, as these are accessible over the web. As managed services are integrated in almost every aspect of a modern data team’s workload, it no longer suffices to think only about data access when thinking about access management. You’ll have to broaden your scope to include access to infrastructure and code bases.
This brings me to a point that I briefly touched upon in my previous article. The effort of managing access grows exponentially with every tool. As if managing cloud data access for a growing data team weren’t hard enough in itself, you now also have to manage access to the other tools in the data development stack. Managing access in isolation, tool by tool, won’t help much, nor will it make your engineers more productive.
Or, to quote a data engineer from a large software vendor:
“What’s the point of using one tool for access management to data, and another tool for managing access to Airflow?”
To quote a security leader from a fast-growing scale-up:
“When an engineer has access issues, they send me a ticket. Doesn’t matter what kind of access.”
To quote an MLOps leader from a leading worldwide retailer:
“Access management in GitHub is a massive pain”
As data teams are increasingly becoming software teams, we need to take a more holistic approach to access management for data products: one that covers access to data, infrastructure, and code repositories, and that integrates well with the data development process. Otherwise, data teams end up with a piecemeal solution that comes with huge productivity costs and security blind spots.
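To make the idea of a holistic approach a bit more concrete, here is a minimal Python sketch of what grouping permissions per data product and role could look like. Everything in it is hypothetical and for illustration only: the `Grant` and `AccessBundle` classes, the `missing_kinds` helper, and the resource names are my own assumptions, not the API of any real tool.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical resource kinds, mirroring the three areas discussed in this post.
KINDS = ("data", "infrastructure", "code")


@dataclass(frozen=True)
class Grant:
    kind: str        # one of KINDS
    resource: str    # e.g. a table, an Airflow DAG, a Git repository
    permission: str  # e.g. "read", "write", "trigger"


@dataclass
class AccessBundle:
    """All grants one role needs to work on one data product."""
    product: str
    role: str
    grants: list[Grant] = field(default_factory=list)

    def add(self, kind: str, resource: str, permission: str) -> AccessBundle:
        # Reject kinds outside the three areas so typos surface early.
        if kind not in KINDS:
            raise ValueError(f"unknown resource kind: {kind!r}")
        self.grants.append(Grant(kind, resource, permission))
        return self

    def by_kind(self) -> dict[str, list[Grant]]:
        # Group grants per area; areas without grants map to an empty list.
        grouped: dict[str, list[Grant]] = {k: [] for k in KINDS}
        for grant in self.grants:
            grouped[grant.kind].append(grant)
        return grouped


def missing_kinds(bundle: AccessBundle) -> list[str]:
    """Return the areas for which the bundle has no grants at all."""
    return [kind for kind, grants in bundle.by_kind().items() if not grants]


# Example: a data engineer has data and infrastructure access,
# but nobody has requested access to the code repository yet.
bundle = (
    AccessBundle(product="sales_forecast", role="data_engineer")
    .add("data", "warehouse.sales.orders", "read")
    .add("infrastructure", "airflow:dag/sales_forecast", "trigger")
)
gaps = missing_kinds(bundle)
```

In this example `gaps` contains only `"code"`, which is the kind of gap that would otherwise turn into a separate ticket in a separate tool. The point of the sketch is the shape of the model, a single bundle per product and role, rather than any particular implementation.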
Raito offers data teams a platform to centrally group all the permissions to data, infrastructure, and code repositories needed to work with Data Products. Reach out to info@raito.io to learn how this approach will make data teams more efficient.