“Oh, what a tangled web we weave, when first we practice to deceive!” The quote is from Sir Walter Scott’s epic poem Marmion: A Tale of Flodden Field, a historical romance in verse published in 1808. It tells the tale of how one of Henry VIII’s courtiers, Lord Marmion, pursues his desire for a wealthy woman, Clara de Clare. Marmion and his mistress, a delinquent nun named Constance De Beverley, devise a scheme to implicate Clara’s fiancé in treason. The scheme unravels, and although Marmion appears to have won by defeating Clara’s fiancé in a duel, Clara retires to a convent rather than endure his attentions. That didn’t work out quite the way Lord Marmion was planning.
I wonder: will we experience a similar disappointment on the back end of the analytics industry’s latest effort to reinvent itself with the Data Mesh? After all, the Operational Data Store (ODS), the Data Warehouse, the Data Mart, the Data Vault, the Data Lake, the Data Lakehouse, the woefully inadequately named “Big Data” and the whimsically named “Big Fat Table” (BFT) implementations all have their merits; they generated value and met customer needs. Yet customer dissatisfaction and organizational unrest continue. Apparently, as an industry, we are still producing too little value, too late. So, adopting the Data Mesh architecture, infrastructure and business practices will finally result in happy customers. Or will it? Let’s investigate.
What is the Data Mesh?
The term was originally coined by Zhamak Dehghani, a principal technology consultant at ThoughtWorks with a focus on distributed systems architecture and digital platform strategy. Zhamak asserts data platforms based on the data lake architecture have common failure modes that lead to unfulfilled promises at scale. To address these failure modes we need to shift from the centralized paradigm of a lake, or its predecessor data warehouse.
Let that sink in for a moment. For the last four decades we have pursued a quest to store corporate information and data, drawn from operational systems and a wide range of other internal and external data resources, in an integrated repository designed to support the decision-making process through data collection, integration, harmonization, consolidation, discovery, analytics and research. We are now being asked to undo that work.
Data Mesh is founded on four principles (ThoughtWorks Technology Radar, “Techniques Data Mesh”, updated 10/28/2020, original 11/20/2019):
- Domain-oriented decentralization of data ownership and architecture
- Domain-oriented data served as a product
- Self-serve data infrastructure as a platform to enable autonomous, domain-oriented data teams
- Federated governance to enable ecosystems, interoperability and observability
So, with Data Mesh, we shift to a paradigm that draws from modern distributed architecture: considering domains as the first-class concern, applying platform thinking to create a self-serve data infrastructure, and treating data as a product. Below is a sample reference architecture for the Distributed Data Mesh:
The Data Mesh is going to finally enable us to achieve the utopian state of data democratization and monetization at scale! It will achieve this by having each distinct data domain build well-designed, conformed, governed and ethical data products that together represent the data resource life-cycle of the enterprise. And the work to build, manage, ethically publish and consume these domain-centric data products will be federated out across the business domain owners.
Heresy? Ahh, fret not. All those domain owners will sign up to be good enterprise citizens by committing to a set of universally accepted principles, including sharing their data with others and abiding by policies that protect the data from misuse.
What are we fixing again?
We’re not generating the “right” value from our data and analytic products and we’re just too darn slow!
Instead of federating the entire life-cycle of data and analytic products out to the business, a minuscule cultural and procedural shift though it may be, perhaps there’s another angle?
Careful of the Consequences
The “Data as a Product” mindset of Data Mesh is absolutely bang-on. More on that in a bit.
By federating the data products out to domain-centric organizational constructs, I see the “@ scale” promise. BUT how many times do knowledge workers and business analysts within any single functional area of the enterprise need ONLY their domain’s data? I venture to say perhaps 0.009% of the time.
Be wary of the silo mentality settling back in as those domain-centric border walls go up. Be wary of the consequences as “acts of redundancy” materialize. Business Analysts, like water, will seek the path of least resistance to get to their analytic destination. The result of those consequences will be replicas of other domain-centric data sprouting up across the enterprise.
Maybe it’s the Operating System that needs to be addressed?
Analytics is a continuous value-generating stream of goodness. The business sits at the heart of that value stream, continuously sensing shifts in market rhythms and exploring where innovative analytic products can produce value for the customers and stakeholders of the enterprise.
- Agile Product Delivery takes an economic point of view and continuously identifies and prioritizes the minimum viable analytic product (MVP) that will produce the most value in the shortest amount of time.
- Cross functional, self-contained technical teams continuously integrate and deploy data products to enable those prioritized analytic products.
- Adoption of DataOps and MLOps ensures operationalization, adoption, utilization and ongoing efficacy and performance monitoring.
- Adoption of Design Thinking and Behavioral Storytelling ensures the business adoption and utilization of the released analytic product and supporting data products across their useful life-cycles.
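To make the DataOps/MLOps point above concrete, here is a minimal, hypothetical sketch of one kind of "ongoing efficacy" check such a team might automate: flagging drift when a feature's live mean strays too far from its training baseline. The function name, data and threshold are all illustrative assumptions, not a prescribed implementation.

```python
import statistics

def drift_alert(baseline, live, threshold=2.0):
    """Return True when the live window's mean sits more than
    `threshold` baseline standard deviations from the baseline mean."""
    mu = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)  # sample standard deviation
    z = abs(statistics.mean(live) - mu) / sigma
    return z > threshold

# Baseline feature values captured at training time (illustrative).
baseline = [10.1, 9.8, 10.0, 10.3, 9.9, 10.2]

print(drift_alert(baseline, [10.0, 10.1, 9.9]))   # stable window → False
print(drift_alert(baseline, [14.5, 15.2, 14.9]))  # drifted window → True
```

In practice a team would run a check like this on a schedule against production feature stores and route alerts into their incident tooling; the point is simply that "ongoing efficacy and performance monitoring" is automatable, not a manual ritual.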
What about technology and architecture?
The Data Mesh approach does embrace self-service enabling technologies. Totally agree there. We need to innovate at scale! Once we have an analytic hypothesis that demonstrates promise and value, it gets prioritized on the Agile Product Delivery Continuous Integration and Deployment backlog.
The Agile technical teams do the technical work to harden the analytic products (training, testing, performance tuning, security, governance, quality, operational application integration, execution monitoring, error recovery, etc.) and their supporting data products (integration, harmonization, quality, mastering, governance, drift monitoring, publishing for consumption, usage monitoring, etc.).
These teams establish a predictable and reliable level of product delivery velocity while the business focuses on the next wave of innovation. If velocity needs to be increased, additional teams are launched.
As mentioned earlier, the “Data as a Product” concept is bang-on. As new data is procured, curated and designed to enable the dependent analytic products, the data product is published, including not only its curated content but also the access methods to that content, in the form of libraries of microservices exposed for consumption by those analytic products.
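The pairing of curated content with published access methods can be sketched in a few lines. This is a hypothetical toy, with plain callables standing in for deployed microservice endpoints; the class and method names are my own assumptions, not part of any Data Mesh specification.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class DataProduct:
    """A domain's data product: curated records plus the access
    methods (stand-ins for microservices) published alongside them."""
    domain: str
    name: str
    records: List[dict] = field(default_factory=list)
    services: Dict[str, Callable] = field(default_factory=dict)

    def publish_service(self, service_name, fn):
        # Register an access method that consumers may invoke by name.
        self.services[service_name] = fn

    def call(self, service_name, **kwargs):
        # Consumers never touch the raw records directly; they go
        # through a published access method.
        return self.services[service_name](self.records, **kwargs)

# Example: a "customer" domain team publishes a filtered-read service.
product = DataProduct(domain="customer", name="customer_profiles")
product.records = [
    {"id": 1, "region": "EMEA"},
    {"id": 2, "region": "APAC"},
]
product.publish_service(
    "by_region",
    lambda records, region: [r for r in records if r["region"] == region],
)
print(product.call("by_region", region="EMEA"))  # → [{'id': 1, 'region': 'EMEA'}]
```

The design choice worth noticing is that consumers depend on the published service contract, not on the product's internal storage, which is exactly what lets the owning domain evolve its data without breaking the analytic products built on top of it.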
The “Data as a Product” mindset is definitely where we need to keep heading AND we need to keep in mind that data is still a means to an end. The end being the enablement of Analytic Products that generate value.
Self-service enablement in the business community is definitely where we need to keep heading with a focus on continuous innovation through analytics, continuous prioritization of those analytics for deployment and continuous assurance of proper, ethical adoption of those analytics to realize the value.
Agile Product Delivery is an operating system that continuously builds and deploys the prioritized analytic and data products at a scale that is predictable, reliable and sustainable. When more velocity is needed to meet the increased demand for analytic products, the Analytic Portfolio Managers make a case for additional funding to achieve that next higher level of value.
The Data Mesh approach has many promising concepts that warrant exploration and adoption. My spidey-sense tells me that if an enterprise is interested in trying it out, do it with a Lean-Agile Product Management mindset. Do a few minimum viable product releases, take the learnings, inspect, adapt and try another one.