How to explain data meshes, fabrics, and clouds
Your CEO knows what a database is and probably thinks a data warehouse is a large data vault used for reporting and analytics. They know little about NoSQL data stores, why they need a Spark cluster, or how data lakes are used to ingest structured and unstructured data.
CEOs and business leaders focus on the business value of data, analytics, and machine learning and care less about the underlying technologies.
But therein lies a paradox, because they do want to understand the value of investing time and money into new technologies. Try explaining the latest data management technologies, including data meshes, data fabrics, and distributed data clouds, and watch your CEO’s head spin.
It’s not just CEOs, either. Data technology has exploded since the early web days, when the primary debate was whether to build your data warehouse on top of Oracle, Microsoft, or open source. Many non-IT leaders today are content to believe that the data is “in the cloud” and that data integration, quality, and performance are “IT issues.”
Anyone working with data should be prepared to explain the most critical technologies and practices in accessible language. In my book, Digital Trailblazer, I share a story about explaining what a browser cookie is to our startup’s board members when the web was new. You never know when you’ll be handed the microphone to answer a technical question. Responding with technobabble can easily deter or slow down key investments.
Gordon Allott, president and CEO of K3, suggests starting with a simple answer: “Data lake, data warehouse, mesh, and fabric all just refer to the overall company data strategy.”
What is a data mesh?
Keeping your answers simple is important, but it’s often not sufficient. When an executive asks me about a technical term, I want to answer the question in a way that encourages curiosity and follow-up questions.
Let’s start by explaining what a data mesh is. Steven Lin, product marketing manager at Semarchy, shared this concise answer: “A data mesh is a decentralized approach to managing data, where multiple teams within a company are responsible for their own data, promoting collaboration and flexibility,” he said.
There are no complex words in this definition, and it introduces the problems data meshes aim to solve, the type of solution, and why it’s important.
Expect to be asked for more technical details, though, especially if the executive has prior knowledge of other data management technologies. For example, “Weren’t data warehouses and data lakes supposed to solve the data management issue?”
This question can be a trap if you answer it with the technical differences between data warehouses, lakes, and meshes. Instead, focus your response on the business objective.
Satish Jayanthi, co-founder and CTO of Coalesce, offers this suggestion: “Data quality often affects the accuracy of business analytics and decision-making. By implementing data mesh paradigms, the quality and accuracy of data can be enhanced, resulting in increased trust among businesses to utilize data more extensively for informed decision-making.”
I like this answer and hope the executive wants to dive deeper into how data mesh paradigms help improve data quality. Jayanthi answers, “One of the core principles, domain ownership, guarantees the team producing the data is responsible for quality and accuracy. This principle of data as a product ensures that the data shared with other groups is accurate, reusable, self-documented, and meets high standards.”
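For readers who think in code, the two principles Jayanthi mentions can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation of a data mesh: the `DataProduct` class, its fields, and the sample `orders` dataset are all hypothetical names invented for this example. The idea it shows is that the producing team is named as the owner, the dataset carries its own description, and nothing is shared until it passes the owner's quality checks.

```python
from dataclasses import dataclass, field
from typing import Callable

Row = dict[str, object]


@dataclass
class DataProduct:
    """Hypothetical data product: a dataset plus its owner, docs, and checks."""
    name: str
    owner_team: str      # domain ownership: the producing team is accountable
    description: str     # self-documenting: consumers can read what the data means
    rows: list[Row]
    quality_checks: list[Callable[[Row], bool]] = field(default_factory=list)

    def publish(self) -> list[Row]:
        """Return a copy of the rows only if every row passes every check."""
        for row in self.rows:
            for check in self.quality_checks:
                if not check(row):
                    raise ValueError(f"{self.name}: row failed quality check: {row}")
        return list(self.rows)


# The sales team owns and publishes its own orders data.
orders = DataProduct(
    name="orders",
    owner_team="sales",
    description="One row per confirmed customer order.",
    rows=[{"order_id": 1, "amount": 120.0}, {"order_id": 2, "amount": 75.5}],
    quality_checks=[lambda row: row["amount"] > 0],
)
published = orders.publish()
```

In this sketch, a row with a negative amount would cause `publish()` to raise an error instead of passing bad data along to other teams, which is the behavior the data-as-a-product principle asks for.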
If you are new to data meshes and want to dive into the technical details, I suggest reviewing Zhamak Dehghani’s pivotal article on moving beyond a monolithic data lake to a distributed data mesh.
What is a data fabric?
The CFO overheard the conversation about data meshes and now wants to know why the chief data officer prefers to invest in a data fabric instead of a data mesh.
The CFO is actually asking three questions:
- What’s a data fabric?
- How does it differ from a data mesh?
- Why is the chief data officer looking to invest in a data fabric?
When confronted with a compound question, I suggest slowing down, taking a deep breath, considering the context of who is asking the question, and providing a deconstructed answer. I might start with, “Let’s first talk about the data fabric and its importance.”
Ross Stuart, a senior solutions architect at AHEAD, suggests helping the CFO work off the visual of what a fabric looks like and how it functions. “A data fabric is a term used to describe the architecture of taking disparate systems and weaving them together, like fabric, to create a consistent layer on top of an organization’s data,” he says.
Ivan Batanov, senior vice president of engineering at Crux, adds, “A data fabric architecture can deliver enhanced insights and analytics efficiently and supports the interconnected nature of data from disparate sources.”
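If a technical colleague asks what “a consistent layer on top of an organization’s data” might look like, a toy sketch can help. The Python below is only an illustration under stated assumptions: the `DataFabric` class, the `register`, `read`, and `catalog` methods, and the two sample sources are hypothetical, and real data fabric products add governance, metadata, and security features that are not shown here. The point is simply that consumers ask one layer for data by name, without knowing where each source lives.

```python
from typing import Callable

Row = dict[str, object]
Reader = Callable[[], list[Row]]


class DataFabric:
    """Hypothetical access layer woven over otherwise unrelated sources."""

    def __init__(self) -> None:
        self._sources: dict[str, Reader] = {}

    def register(self, name: str, reader: Reader) -> None:
        """Attach a source; the reader hides how and where the data is stored."""
        self._sources[name] = reader

    def catalog(self) -> list[str]:
        """List every source reachable through the fabric, in sorted order."""
        return sorted(self._sources)

    def read(self, name: str) -> list[Row]:
        """Fetch rows from a registered source through the same interface."""
        return self._sources[name]()


# Stand-ins for a warehouse table and a SaaS application export.
def read_warehouse_orders() -> list[Row]:
    return [{"order_id": 1, "amount": 120.0}]


def read_crm_contacts() -> list[Row]:
    return [{"contact_id": 7, "email": "pat@example.com"}]


fabric = DataFabric()
fabric.register("warehouse_orders", read_warehouse_orders)
fabric.register("crm_contacts", read_crm_contacts)

available = fabric.catalog()
contacts = fabric.read("crm_contacts")
```

The design choice worth pointing out is that each source keeps its own storage and format; only the reader function changes when a source moves, so the teams consuming the data are not affected.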
At this point, you should pause and give your audience a few seconds to understand the relationship between data meshes and data fabrics, including the apparent conflict between the two approaches. How might you bring them together? I suggest saying something like this:
“Data meshes help business teams use data for analytics and improve data quality, while data fabrics help the chief data officer and the data governance team manage access to connected data sources wherever they are stored, including data warehouses, data lakes, file systems, and SaaS applications.”
What we’re unpacking in these questions and answers are different organizational roles and their data responsibilities. We want business teams to embrace citizen data science and use data for decision-making, while organizations need the chief data officer to focus on proactive data governance, aiming to reduce friction and risks when democratizing data.
What is a distributed data cloud?
Now we come to a third data management group, which is tasked with storing and structuring data to support usage needs, performance objectives, and security requirements. “Where should we store dataset X?” is the challenge at hand, and the answer isn’t straightforward. In most enterprises, there isn’t a one-size-fits-all architecture for storing, managing, and utilizing data.
“Instead of specifying the ‘how’ behind storing information, a data cloud represents the ‘what’ someone gets with the right mix of technologies,” says James Malone, director of product management at Snowflake. “The data cloud empowers organizations to pick what works for them versus prescribing and pushing only one way of doing things. Use cases change, needs change, and tech changes—that’s why the data cloud focuses on flexibility and utility.”
Hillary Ashton, chief product officer of Teradata, adds an important detail to share with the CFO. “Data clouds can be deployed on any combination of public clouds, on-premises private clouds, hybrid clouds, and multi-clouds,” she says. “But the ‘brain’ of any data cloud is the cloud analytics platform that processes and connects data from every source and architecture. To get the most value from your data, what matters most is the ability to scale your analytic engine and capabilities across the organization, enabling teams beyond data scientists to access, query, and transform data into insights.”
Tying it all together
At this point, the CEO and CFO may be looking for an easy button to push, so I remind them of the craftsmanship required in the simplest of things. “To make a great loaf of bread, you need five ingredients: flour, water, yeast, salt, and sugar, in the right proportions, crafted with proper techniques, cooked for the correct amount of time, and presented elegantly for the desired experience.”
Anyone who’s ever tried making bread knows how hard it is to bake a great loaf consistently. Bread books have hundreds of recipes, and the techniques continue to evolve.
Storing, managing, integrating, governing, and using data sounds simple, but you need the right ingredients, tools, and practices to empower the data-driven organization.