Data literacy is quickly becoming the most important skill of the decade. Organisations that enable their data workers to innovate with data at scale through Self-Service Analytics can respond to change much faster, which gives them an important edge over their competitors. However, the many data breaches, regulatory fines, and growing privacy awareness among customers show that we need better control over access to data, which proves to be very difficult. From the hundreds of conversations we’ve had so far, we learned that data access management is a slow and tedious process, which means that it always comes at the expense of innovation.
Yet data access management should be a business enabler. I want to take this opportunity to discuss how data access management has to evolve, but first I have to start at the beginning and describe how we got here.
Early in my career I worked in the middle office of an investment bank, where I had written a VBA script to integrate my Excel file with the on-prem data warehouse. The script and the SQL were so poorly written that they killed performance, grinding the data warehouse to a halt. I still remember my heart sinking when someone from IT came around our floor asking who was running so many heavy queries. After a stern reprimand, IT closed access for the Excel service account.
For years, data access management was a very centralised affair. There was one on-prem data warehouse, which companies used for monthly reporting. Database administrators acted as gatekeepers to guarantee the data warehouse’s security and performance, and data analysts had to run their queries during off-hours so as not to stress the database. Access was managed centrally by the database administrators, and data access request workflows were long and rigid. This worked well because there weren’t many data consumers and the reports were predefined.
But the Cloud was going to change everything!
Back in 2012, AWS announced Redshift at AWS re:Invent as “the first fully managed, petabyte-scale cloud data warehouse”. Compared to on-prem data warehouses, Redshift’s queries were 10 times faster and ran at a fraction of the cost. Redshift fuelled an ecosystem of data startups that democratised data analytics and data science considerably. Companies could easily spin up a data stack to do data analytics at scale at a very low cost, and Self-Service Analytics made data analytics available to the broader organisation. Literally everyone could now do data analytics.
The mantra was still to move fast and break things, and with data consumption increasingly federated across an increasingly data-literate organisation, a centralised process for data access requests was creating bottlenecks. With slow access to data being the number one killer of innovation and cheap cloud compute being readily available, organisations decided to loosen the reins on access control. Many organisations gave their data analysts and data scientists admin rights, making them individually responsible for keeping customer data private and secure. This, in fact, boils down to an inverse Conway maneuver applied to data access management. Where access was once managed centrally, now everyone was responsible for deciding for themselves. It is no surprise that this spurred an unprecedented wave of innovation in data. However, ungoverned access to large amounts of sensitive data, combined with competitive pressure to create value from data, led to increasing excesses. The result was a significant increase in data breaches exposing astronomical amounts of PII, which made consumers progressively more concerned about their privacy.
In the late 2010s regulators realised they had to clamp down on the excesses and took action by putting privacy regulations in place. In 2018, the European Union’s GDPR came into effect, and it has served as a blueprint for other privacy regulations across the world such as the CCPA, LGPD, and PIPA. These privacy regulations give customers and employees more control over how their personal data is used, and require that organisations apply proper security measures to protect that personal data, including data access controls. Ever since, we’ve seen privacy regulations and security standards grow closer as it became clear that good security supports good privacy.
The result is that organisations have become increasingly concerned about their data compliance. And evidence shows they should be. The recent EUR 800,000 Discord fine shows that regulators are no longer going exclusively after Big Tech companies like Meta and Google, and the growing privacy awareness of customers is increasingly driving their buying decisions. A striking example is how Optus lost 10% of their customers after their recent data breach. This amounts to 1,000,000 customers! Let that sink in.
This creates a fascinating conundrum. Companies know they need to apply better access management to their cloud data, but this proves to be particularly difficult for analytical cloud data given its size and variability. At the same time, they have seen the value they can get from Self-Service Analytics, which will only grow in importance as organisations become more data literate in the coming years. And as analytical data integrates further into operational workflows, the end points where data is consumed become uncountable. This makes a return to centralised data access management out of the question. Yet a completely federated approach does not work either. The solution is always somewhere in the middle.
We need a solution that is centred around shared responsibilities, which we will aptly call the Conway Pincer Movement: the Central Data Governance team and the Domain Teams collaborate on and share ownership of the organisation’s data access controls. This will work because the Central Data Governance team knows the regulations, manages stakeholders in security and privacy, and is accountable for protecting the organisation’s most sensitive data. This expertise is complemented by the Domain Teams, who understand the data, have the business context, and are closer to the data consumer.
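To make the shared-ownership idea a bit more concrete, here is a minimal Python sketch. It is purely illustrative and not a description of how Raito or any other product works: the class names, sensitivity labels, roles, and datasets are all hypothetical. The assumption is that the central governance team defines guardrails per sensitivity label, while the domain team that owns a dataset decides who gets access within those guardrails.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Dataset:
    name: str
    owner_domain: str  # hypothetical: the domain team that owns this data
    sensitivity: str   # hypothetical label assigned under central governance rules


@dataclass
class AccessControl:
    # Central guardrails: sensitivity label -> roles that may ever be granted access.
    guardrails: dict[str, set[str]]
    # Grants made by domain teams: dataset name -> roles with access.
    grants: dict[str, set[str]] = field(default_factory=dict)

    def grant(self, requesting_domain: str, dataset: Dataset, role: str) -> bool:
        """A domain team grants a role access to one of its own datasets."""
        if requesting_domain != dataset.owner_domain:
            return False  # only the owning domain decides on its data
        if role not in self.guardrails.get(dataset.sensitivity, set()):
            return False  # the central guardrail overrides the domain decision
        self.grants.setdefault(dataset.name, set()).add(role)
        return True

    def can_read(self, dataset: Dataset, role: str) -> bool:
        return role in self.grants.get(dataset.name, set())


# Central Data Governance team: defines what is allowed at all (assumed values).
central_guardrails = {
    "internal": {"analyst", "data_scientist", "marketing"},
    "pii": {"data_scientist"},
}
acl = AccessControl(guardrails=central_guardrails)

orders = Dataset("orders", owner_domain="sales", sensitivity="internal")
customers = Dataset("customers", owner_domain="sales", sensitivity="pii")

# Domain Team: decides who actually gets access, within the guardrails.
ok_orders = acl.grant("sales", orders, "analyst")        # within guardrails
blocked_pii = acl.grant("sales", customers, "analyst")   # blocked centrally
wrong_domain = acl.grant("finance", orders, "marketing")  # not the owning domain
```

In this sketch neither side can act alone: a domain team cannot exceed the central guardrails, and the central team does not hand out individual grants. That is the division of labour described above, reduced to its simplest form.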
At Raito we believe that this is the decade in which we get rid of the excesses in data that were fuelled by low interest rates, and move to a sustainable level of data democratisation that is secure and private. The Conway Pincer Movement will enable companies to roll out Self-Service Analytics at scale without creating undue privacy and security risks, and Raito will drive this change.
Raito’s solution to scale data access management is built on three pillars:
Photo by cottonbro studio: https://www.pexels.com/photo/hand-holding-a-key-with-a-usb-flash-drive-5474298/