Data Readiness: The Backbone of AI Success
KPMG - You Can With AI Series | Episode: Data Readiness
Guest: Daniel Bearinger, Principal, Data Modernization Practice, KPMG
Former CDO at Nissan | Background: Software & Data Engineer
Why Data Matters for AI: The Foundation
- Data is the prerequisite for AI/ML and GenAI success, not an afterthought.
- High-quality data that is well understood and contextualized is critical.
The Grocery Store Analogy:
- Products on shelves have labels showing identity, origin, and utility.
- Data must have clear utility: What is this data for? What can it be used for?
- Without understanding, building successful AI is hard.
- Data must be understood, explainable, and empowering.
Data as a Team Sport:
- Requires collaboration between business stakeholders and technologists.
- Everyone must understand: What will the data be used for? What certifies it as fit for AI? What outcomes are expected?
- Without high-fidelity, high-quality data at the start, innovation and value cannot be realized.
Enterprise AI vs Generic GenAI: The Data Difference
Retrieval-Augmented Generation (RAG)
- Enables organizations to converse with their data in natural language.
- Users ask questions without needing SQL; ChatGPT-style interfaces grounded in corporate data.
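A minimal sketch of the retrieval step behind RAG, assuming a toy keyword-overlap ranking in place of a real vector search. The document snippets and the function names (`retrieve`, `build_prompt`) are hypothetical, and no language model is called; the sketch only shows how a question gets grounded in corporate data before generation.

```python
import re
from collections import Counter

# Hypothetical corporate snippets; a real system would index governed enterprise data.
DOCUMENTS = {
    "finance_q3": "Q3 revenue grew in the finance division driven by services",
    "supply_chain": "Supplier lead times increased for the supply chain team in Q3",
    "sales_emea": "EMEA sales pipeline expanded after the new pricing launch",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(question, documents, top_k=1):
    """Rank documents by the number of tokens they share with the question."""
    q_tokens = Counter(tokenize(question))
    scores = {}
    for doc_id, text in documents.items():
        d_tokens = Counter(tokenize(text))
        # Counter intersection keeps the minimum count of each shared token.
        scores[doc_id] = sum((q_tokens & d_tokens).values())
    ranked = sorted(scores, key=lambda d: scores[d], reverse=True)
    return ranked[:top_k]


def build_prompt(question, documents, top_k=1):
    """Assemble a prompt that restricts the answer to the retrieved context."""
    context = "\n".join(documents[d] for d in retrieve(question, documents, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

The prompt returned by `build_prompt` is what would be sent to a model, which is why the quality and labeling of the underlying data directly shape the answer.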
Context Grounding Benefits
- GenAI grounded in corporate data vs generic internet search.
- Data is relevant to user’s business context (finance, supply chain, sales, etc.).
- Time to insights is dramatically shortened; no need for technical expertise.
- Users recognize context faster as it’s their own data.
Understanding Data Readiness: Framework
- Definition: Data readiness spans people, process, and technology.
- Key Principle: Adoption drives success, not just technology presence.
Core Data Management Functions
Data Governance Foundations
- Data Quality: What makes data fit for use?
- Business Glossary: Clear definitions for data meaning.
- Data Ownership: Who is responsible?
- Standardized Terminology: Ensure consistency.
- Process Definition: Who does what when preparing data?
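One way to picture these foundations together is a glossary entry that carries a definition, an owner, and a quality threshold. This is a sketch under assumptions: the `GlossaryEntry` fields, the null-rate metric, and the threshold value are illustrative choices, not something specified in the episode.

```python
from dataclasses import dataclass


@dataclass
class GlossaryEntry:
    term: str             # standardized business term
    definition: str       # agreed meaning of the data element
    owner: str            # who is responsible for the data
    max_null_rate: float  # hypothetical quality threshold for "fit for use"


def null_rate(values):
    """Share of missing values; an empty column counts as fully missing."""
    if not values:
        return 1.0
    return sum(v is None for v in values) / len(values)


def is_fit_for_use(entry, values):
    """Data is fit only if it is defined, owned, and meets the quality bar."""
    return (
        bool(entry.definition)
        and bool(entry.owner)
        and null_rate(values) <= entry.max_null_rate
    )
```

For example, an entry for `customer_id` with a 5% null tolerance would reject a column where one value in four is missing, and would also reject any column that has no named owner.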
Data Preparation Lifecycle Evolution
- Before: Manual architect tasks (schema, validation).
- Now: GenAI-augmented automation:
  - Automated labeling/classification
  - Automated metadata enrichment
  - Automated data catalog seeding
  - Quick validation interfaces
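The automation steps above can be sketched as a small pipeline that labels columns and seeds a catalog for human validation. As a loud assumption, simple regular-expression rules stand in here for the GenAI classifier; the rule set, labels, and catalog shape are all hypothetical.

```python
import re

# Hypothetical labeling rules; a GenAI model would replace these in practice.
RULES = [
    ("email", re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")),
    ("date", re.compile(r"^\d{4}-\d{2}-\d{2}$")),
    ("numeric", re.compile(r"^-?\d+(\.\d+)?$")),
]


def classify_column(values):
    """Return the first label whose pattern matches every non-empty value."""
    sample = [str(v) for v in values if v not in (None, "")]
    if not sample:
        return "unknown"
    for label, pattern in RULES:
        if all(pattern.match(v) for v in sample):
            return label
    return "text"


def seed_catalog(table):
    """Create one catalog entry per column, flagged as awaiting validation."""
    return {
        column: {"label": classify_column(values), "validated": False}
        for column, values in table.items()
    }
```

The `validated` flag reflects the quick validation step: the automation proposes labels, and a data steward confirms them before the data is certified.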
The Acceleration Factor
- Before: Weeks/months for readiness.
- Now: Hours/days with right people + GenAI automation.
- Result: Certified, ready data much faster.
Three Primary Organizational Patterns
1. Large ERP Transformation
- Long journey for core finance & ERP orchestration.
- Want more value from data during transformation.
- Requirements: Master data management, data quality standards, critical element flows.
- Solution: Data prep in source systems + augmentation with third-party/internal data.
2. AI/ML & GenAI Initiatives (Most Common)
- Pursuing AI with foundational readiness gaps.
- Gaps: No formal data interaction org, no data stewards/governance, lack of tools/catalog.
- Need: Understand, label, classify, define domain—precursor to AI/ML success.
3. Mature Data-Driven Organizations
- Have data product factory, lakehouse, analytics/data science at scale.
- Goal: Monetization/commercialization—often starts in finance.
- Secondary: Privacy controls, role-based access, wider security.
Biggest Challenges Observed
Challenge 1: Executive-to-Working Level Gap
- DOTS assessment probes perception at all levels.
- Found gaps in culture and perception between senior leaders and working teams.
Challenge 2: Silos and Lack of Visibility
- Data copied across silos; people unaware of overlapping work.
- Culture, not technology, is the biggest challenge.
Challenge 3: Project Proliferation/Duplication
- Case: Reduced 25 projects to 4 by aligning to data products.
- Solution: Center solutions on data products for efficiency.
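A rough sketch of how aligning projects to data products exposes duplication, assuming each project can be tagged with the data product it depends on. The project and product names in the usage note are invented for illustration.

```python
from collections import defaultdict


def consolidate(projects):
    """Group project names by the data product each one depends on."""
    groups = defaultdict(list)
    for name, product in projects:
        groups[product].append(name)
    return dict(groups)


def reduction_pct(before, after):
    """Percentage reduction when going from `before` items to `after` items."""
    return 100 * (before - after) / before
```

Calling `consolidate([("churn model", "customer"), ("loyalty dashboard", "customer"), ("demand forecast", "inventory")])` yields two groups, one per data product. Applied to the case above, `reduction_pct(25, 4)` gives 84.0, meaning 84% fewer projects.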
Challenge 4: Legacy System Sprawl
- Lack of pruning/retirement of legacy data products.
- Results: Complexity, expense, unclear data landscape.
Challenge 5: Skills and Resources Gap
- More a process and resource issue than a technology one.
- Bottlenecks and prolonged wait times result in lost opportunities.
GenAI Impact on Awareness and Sentiment
Early 2024: Fear, resistance, risk aversion, investment concerns.
Mid-to-late 2024: Excitement, rapid adoption, and the “Data Time Machine” idea:
- Data wrangling reduced from 40 hrs/week to 5-10 hrs/week.
- 30+ hrs freed for innovation; the resulting flywheel pulls projects forward.
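The arithmetic behind these figures can be checked with a short calculation. The helper names are illustrative; the inputs are the 40 hrs/week baseline and the 5-10 hrs/week range quoted above.

```python
def hours_freed(before, after):
    """Weekly hours released when data wrangling drops from `before` to `after`."""
    return before - after


def time_saved_pct(before, after):
    """Percentage of the original weekly wrangling time that is saved."""
    return 100 * (before - after) / before


# Range implied by the quoted figures: 40 hrs/week down to 5-10 hrs/week.
freed_low = hours_freed(40, 10)   # 30 hours
freed_high = hours_freed(40, 5)   # 35 hours
```

Under these numbers, between 30 and 35 hours per week are freed, a 75% to 87.5% reduction in wrangling time, which matches the "30+ hrs" claim.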
Executives:
- Everyone now sees GenAI promise with good data.
- Adoption up due to trusted frameworks, right constructs, market validation.
Recommendations for Leaders (2025+)
Active Learning & Monitoring
- Keep up with advances, vendor releases, industry benchmarks.
Build Data Community/Advocacy
- Data bridges business/IT. Form groups/councils, share social metadata.
Create Idea Management Systems
- “Idea jars” for use cases, unsolved problems, allow tinkering/hackathons.
Value Determination Framework
- Measure impact with “Data Value Chain”, demonstrate ROI.
Strategic Vendor/Tool Selection
- Assess existing tech, select best-fit tools, define a reference architecture.
Key Insights Summary
Data as Strategic Asset
- Massive variation in how organizations use data.
- Exec-working gap; culture + tech transformation needed.
Efficiency Through Alignment
- Project consolidation (25→4) frees resources, boosts innovation.
The Data Time Machine
- GenAI compresses weeks into hours/days, enabling new possibilities.
Human-Centric Innovation
- GenAI at home influences workplace; integration/workflow thinking overtakes dashboards.
Critical Success Factors
- People > Process > Technology
- Governance is a business enabler
- Simplification before expansion
- Cross-functional alignment
- Persona-based design
- Active learning
- Value measurement
- Community building
Source: KPMG “You Can With AI” Series | Guest: Daniel Bearinger
Focus: Data Modernization & Enterprise Data Strategy
Date: 2024 | Series: Part of 7-part exploration on AI implementation