From pilots to productivity: How one AI leader operationalizes enterprise AI
AI may be helping individual employees work faster, but most organizations still struggle to turn individual wins into enterprise-wide execution gains. The gap between isolated productivity boosts and systemic impact is where many AI investments stall.
In my recent conversation with Shivam Khullar, NVIDIA’s director of engineering, we took a candid, practical look at how she and her peers are building an AI-powered system of work on a modern, connected foundation. Her perspective offers a blueprint for technology leaders who want to move beyond experimentation and into operational AI that measurably changes how teams work.
I use AI to actually say, ‘What are the top five priorities I should be working on this week?’ We built an agent which actually helps us do that and says based on your email activity, on your Jira activity, on your content activity, and Google Drive activity, here are the five things that are seeking your attention that you have not really responded to.”
– Shivam Khullar, director of engineering, NVIDIA
The challenge: beyond pilots and adoption – proving business outcomes
I’ve seen the same pattern play out across large enterprises: a promising AI tool generates excitement, a pilot launches, usage metrics look strong, and yet the business sees little to no change. The tool lingers in perpetual pilot. Teams slip back into familiar habits while leaders question the return on investment.
You can show impressive usage metrics – say, a high percentage of engineers using a coding assistant or millions of lines of AI-generated code – and still have no clear business outcome to point to. The meaningful question isn’t whether people used AI, but whether it helped us ship faster, improve decision quality, or unlock a previously intractable problem.
75% of my company uses Cursor. Is it a success metric? To me, it isn’t. It is an adoption metric.”
– Shivam Khullar, director of engineering, NVIDIA
This distinction between adoption metrics and outcome metrics matters because, in a landscape where new AI capabilities emerge almost daily, an enterprise’s ability to evaluate, test, and scale tools responsibly is itself a competitive advantage.
Shivam described NVIDIA’s operating model in three phases:
Phase 1: experimentation – unbound exploration
Everything begins with exploration. Individual teams and champions try new AI tools, test emerging capabilities, and surface potential use cases. Experimentation is intentionally unbound. The objective is to learn what is possible and what is promising, not to commit to long-term adoption.
Everything starts with experimentation, but not everything moves from experimentation to pilot. And similarly, not everything moves from pilot to production.”
– Shivam Khullar, director of engineering, NVIDIA
Phase 2: pilot – disciplined hypothesis testing
Shivam drew a sharp line between experiments and pilots. A pilot, in her model, is not just extended experimentation. It is a structured test of a specific hypothesis, with agreed-upon measures of success and clearly defined boundaries.
A credible pilot has five essential attributes:
- Time-boxed entry and exit criteria, specifying when the pilot starts and ends, and what will trigger a decision to scale, iterate, or stop.
- A clear hypothesis, stated in outcome terms, for example: “We believe this tool will reduce cycle time for bug triage by a defined percentage for a specific team.”
- A defined audience, typically a targeted group rather than an entire function or company, so that behavior change and impact can be observed more precisely.
- Safe data zones with narrow, pre-approved scopes that define which data the tool can access during the pilot.
- Flexibility on the specific success metric as pilots reveal real value, while still holding firm on clear, outcome‑based impact.
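The attributes above amount to a lightweight pilot charter. A minimal illustrative sketch in Python (the field names and example values are assumptions for illustration, not NVIDIA’s actual schema):

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class PilotCharter:
    """Illustrative charter capturing the five pilot attributes above."""
    hypothesis: str        # outcome-based, e.g. "reduce bug-triage cycle time by 20%"
    audience: list[str]    # a targeted group, not the whole company
    data_scopes: list[str] # pre-approved "safe data zones"
    start: date
    end: date              # time-boxed exit
    success_metric: str    # may be refined mid-pilot, but stays outcome-based
    exit_criteria: dict = field(default_factory=dict)  # thresholds that trigger scale / iterate / stop

# Hypothetical example charter
charter = PilotCharter(
    hypothesis="Coding assistant cuts bug-triage cycle time by 20% for Team A",
    audience=["team-a-engineers"],
    data_scopes=["jira:project-a", "repo:service-a"],
    start=date(2025, 1, 6),
    end=date(2025, 3, 6),
    success_metric="median triage cycle time (days)",
)
```

Writing the charter down in a structured form is what makes the entry and exit criteria enforceable rather than aspirational.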
This discipline is key to NVIDIA’s AI model and a best practice to adopt. It also showcases a practical benefit of platform unification: when the underlying tools for collaboration, planning, and development are already connected, it becomes much easier to define safe scopes, instrument pilots, and measure impact without the need for bespoke integration work for every new AI tool.
Phase 3: production – hardened and scaled
Only tools that prove their hypotheses in pilot progress to production at NVIDIA. In this phase, the focus shifts from proving value to hardening and scaling: full security and privacy reviews, broader data access with robust governance, and integration into standard workflows.
Here, the advantages of a unified platform are most visible. The difference between supporting a few hundred pilot users and deploying AI capabilities across tens of thousands of knowledge workers largely depends on the platform on which those tools run. With a modern, connected system of work, scaling an AI assistant or agent involves updating centrally managed configurations and policies rather than re-implementing infrastructure team by team.
NVIDIA’s lifecycle model offers a clear takeaway: a unified platform enables a repeatable, safe, and scalable path from unstructured experimentation to enterprise-grade AI adoption.
If the outcome is this is something which benefits the company, then we move it into production and then enforce it and harden it with the data foundations or the security foundations required for the scale.”
– Shivam Khullar, director of engineering, NVIDIA
Real-world impact: AI that changes how teams work
Shivam’s most compelling stories were not about AI capabilities in the abstract, but about how AI is changing day-to-day work at NVIDIA. Two examples highlight what an AI-powered system of work looks like when it runs on a connected, modern platform.
Example 1: real-time launch intelligence
Traditional launches collect feedback across fragmented channels – chat, issue trackers, email, support, and customer conversations – so meaningful retros happen weeks later after manual consolidation.
At NVIDIA, AI monitors key launch channels – chat, issues, and docs – so within hours of go‑live, leaders see a synthesized view: what works, what breaks, which signals matter, and where noise is building.
That rapid awareness becomes a repeatable launch playbook: AI monitoring and summaries are baked into every major release.
This pattern works when collaboration, planning, and documentation live in a unified work graph; in siloed tools, the same cross-cutting synthesis is far harder and more brittle.
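A rough sketch of the synthesis step helps make this concrete. The channel names, signal shape, and classification rule below are all hypothetical, not NVIDIA’s implementation:

```python
from collections import Counter

# Hypothetical launch signals pulled from chat, issue tracker, and docs feedback.
signals = [
    {"channel": "chat",   "topic": "install", "sentiment": "negative"},
    {"channel": "issues", "topic": "install", "sentiment": "negative"},
    {"channel": "issues", "topic": "api",     "sentiment": "negative"},
    {"channel": "chat",   "topic": "docs",    "sentiment": "positive"},
    {"channel": "docs",   "topic": "docs",    "sentiment": "positive"},
]

def launch_digest(signals, noise_threshold=2):
    """Group signals by topic; topics with repeated signals cross the noise floor."""
    negatives = Counter(s["topic"] for s in signals if s["sentiment"] == "negative")
    positives = Counter(s["topic"] for s in signals if s["sentiment"] == "positive")
    return {
        "breaking": [t for t, n in negatives.items() if n >= noise_threshold],
        "working":  [t for t, n in positives.items() if n >= noise_threshold],
    }

digest = launch_digest(signals)
# Topics mentioned once stay below the noise threshold and are filtered out.
```

In practice the classification would come from an LLM rather than a sentiment label, but the shape of the pipeline is the same: aggregate across channels, then separate signal from noise.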
Example 2: the prioritization agent
Shivam described an internal agent built around a universal question for knowledge workers: What should I focus on this week?
The agent pulls signals from email, issue trackers, documentation, and shared files to surface the few items that need attention. It highlights unresolved questions, unaddressed requests, and stalled work, giving each person an individualized, AI‑curated priority list.
This is not a generic chatbot. It’s an intelligent layer on top of a connected system of work that helps people cut through noise by interpreting activity across standardized, integrated cloud tools with consistent identity and permissions.
The priority agent shows that when an organization modernizes its work platform and builds a unified work graph, new categories of AI assistance become both possible and practical at scale.
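One way to picture the agent’s core loop: merge activity from each connected source, keep the items with no response, and rank by how long they have been stalled. A toy sketch, where the source names, item shape, and scoring are assumptions:

```python
from datetime import datetime

# Hypothetical items from a unified work graph; each source contributes the same shape.
items = [
    {"source": "email", "title": "Budget approval request", "responded": False,
     "last_activity": datetime(2025, 1, 10)},
    {"source": "jira",  "title": "PROJ-42 blocked on review", "responded": False,
     "last_activity": datetime(2025, 1, 8)},
    {"source": "docs",  "title": "Launch plan open comments", "responded": True,
     "last_activity": datetime(2025, 1, 12)},
    {"source": "drive", "title": "Shared deck awaiting feedback", "responded": False,
     "last_activity": datetime(2025, 1, 11)},
]

def weekly_priorities(items, top_n=5):
    """Surface the top unanswered items, oldest (most stalled) first."""
    pending = [i for i in items if not i["responded"]]
    pending.sort(key=lambda i: i["last_activity"])  # stalest first
    return [i["title"] for i in pending[:top_n]]

priorities = weekly_priorities(items)
```

The hard part is not the ranking logic but the merge: it only works because consistent identity and permissions let the agent read each source on the user’s behalf.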
The consultative partnership: IT as forward-looking advisor
Another notable theme in Shivam’s perspective is the evolution of IT from reactive support to consultative partner. At NVIDIA, the teams driving AI adoption operate very differently from a traditional, ticket-driven support function.
- Co-designed business processes: Shivam’s team partners with business units to understand personas, workflows, and pain points, then co-designs AI-enabled processes.
- Segmented enablement model: NVIDIA segments by self-sufficiency: engineers self-serve with infrastructure and guardrails, while GTM and support functions need persona-based enablement, workflow mapping, and hands-on support.
- Champions-driven adoption: Champions are key – IT/platform teams help them refine use cases, choose tools, and define success metrics to turn AI interest into real productivity gains.
Culture and change management: making AI the norm
Making AI the norm is a culture-first effort. Normalize its use and model the behavior from the top, tailoring advocacy for each use case. Focus on building durable capability – with training, norms, and visible role modeling – rather than chasing hype or tools for their own sake.
Standardize, then let teams customize
Start with standardized, org-wide agents for common workflows (performance feedback, document collaboration, internal search). These templates create guardrails, a shared baseline for quality, and early wins that drive viral adoption. As people see value, teams ask to adapt templates, add data sources, or tweak prompts. Platform teams should enable this safely through patterns and self-service – not bespoke builds for every team.
Specialized scenarios – such as cross-product workflows or teams handling sensitive data – still start with shared templates but require tailored agents that fit their specific needs. This seed-and-customize approach only scales with strong platform support: centralized governance and security to prevent data exposure, plus enough flexibility for teams to iterate quickly without constant approvals. Systems of work with robust admin and extensibility fit this model well.
Very quickly, you’ll see a viral effect of that to people saying, oh, I was also looking for this use case. How can I use this as a template and create something more specific for me? A lot of people don’t know how to create something from scratch, but when they see something in their radar, they say, oh, I can take this and change it a little bit.”
– Shivam Khullar, director of engineering, NVIDIA
Measuring what matters: moving beyond AI adoption
Shivam outlined a simple but powerful maturity model for AI metrics that every enterprise leader should internalize. It breaks down into three levels: adoption, usage, and outcomes.
| Level | What you are measuring | Illustrative example |
|---|---|---|
| Tool adoption | Whether people have access to the tool and are trying it | A high percentage of engineers have enabled a coding assistant |
| Productive usage | Whether they are doing meaningful work with it | The assistant is generating large volumes of code or content for active projects |
| Business outcomes | Whether it is delivering business results | Projects are shipping faster, customer response times are lower, or throughput has increased without adding headcount |
According to Shivam, most organizations are still focused on adoption and usage metrics. The leap to outcomes is more challenging, particularly because clean baselines are often missing. She drew an analogy to the early days of email: once a technology becomes embedded in everyday work, it becomes difficult to disentangle its exact contribution to productivity.
Her practical guidance is to measure rigorously where possible and use sensible proxies where necessary. When you know a process’s baseline (for example, how long a standard request usually takes) you can compare that directly before and after introducing AI.
Where baselines are fuzzy, alternative indicators such as throughput per team, time-to-first-response, or decision cycle times can provide directional evidence of impact.
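Where a baseline does exist, the before/after comparison itself is simple arithmetic. A toy example with invented numbers:

```python
# Invented cycle times (days) for a standard request, before and after introducing AI.
baseline_days = [5.0, 6.0, 4.5, 5.5]
with_ai_days  = [3.5, 4.0, 3.0, 4.5]

def mean(xs):
    return sum(xs) / len(xs)

# Fractional improvement: positive means cycle time dropped after AI was introduced.
improvement = 1 - mean(with_ai_days) / mean(baseline_days)
```

The measurement challenge Shivam describes is almost never the math; it is capturing a clean `baseline_days` before the tool is already everywhere.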
This maturity ladder is a useful framing device. It encourages organizations to think beyond simple adoption metrics and to invest in instrumentation and governance that enable outcome measurement. Establishing consistent logging and analytics across core workflows is a prerequisite for that level of visibility.
The skills that will matter next
Looking ahead, Shivam identified several capabilities that she believes will be increasingly important in large organizations as AI becomes embedded in everyday work.
Forward-deployment engineering (cross-functional delivery)
These people combine technical fluency with a deep understanding of business processes and change management. They can design and deploy solutions quickly, embed them into real workflows, and translate between technical teams, operations, and leadership.
AI quality evaluation and assurance
Even as AI systems improve, there remains a critical need for humans who can assess whether AI outputs are accurate, appropriate, and aligned with organizational goals. Relying solely on AI to evaluate itself, as she noted, risks a circular dynamic with little external grounding.
Enduring fundamentals: customer understanding, process design, critical thinking, curiosity
AI can generate suggestions, but people are still responsible for deciding which processes are worth keeping, which should be redesigned, and how to balance automation with human judgment.
These skill sets intersect closely with platform strategy. As organizations modernize their system of work, they create new possibilities for cross-functional experimentation and rapid deployment. Forward-deployment engineers and similar roles are catalysts who unlock that potential, making them natural champions for AI and platform modernization initiatives.
Some skills that will never be out of fashion is understanding your customer, understanding your workflows, designing good processes. AI can give you suggestions, but you still have to think of this, if this is the right process or not.”
– Shivam Khullar, director of engineering, NVIDIA
7 key takeaways for enterprise leaders
NVIDIA’s approach to an AI-powered system of work offers several actionable lessons for organizations pursuing similar goals.
- Acknowledge that AI is already part of everyday work and focus on operationalizing it. Debates about whether AI is “real” are largely settled; the differentiator now is how systematically organizations can embed it into their system of work.
- Treat procurement and tooling decisions as part of an ongoing lifecycle. Intake, rigorous evaluation, disciplined pilots, and measured scaling should be standard practice, not exceptions.
- Measure outcomes, not just adoption and usage. Adoption dashboards are useful but incomplete. Link AI investments to concrete business results wherever baselines are available, and use thoughtful proxies where they are not.
- Lead with productivity narratives, not technology lists. The most persuasive cases for AI adoption center on improved launch responsiveness, higher throughput, faster decisions, and better employee focus.
- Seed standardized agents for common workflows, then empower teams to customize safely. This pattern accelerates adoption while maintaining governance and aligns well with a centralized platform strategy.
- Model AI usage at the leadership level to normalize it. Visible, candid use of AI tools by executives and managers accelerates cultural acceptance and reduces stigma.
- Invest in a connected system of work as the foundation for AI. A unified work graph, consistent identity, and centralized governance are prerequisites for turning AI pilots into productivity at scale.
By following patterns like those emerging at NVIDIA, organizations can build an AI-powered system of work that moves decisively from experimentation to impact – without detours – while keeping the focus on measurable outcomes and operational change.
Learn more about AI adoption from NVIDIA’s session
Curious how to go from pilot to production without the theater? My conversation with NVIDIA covers exactly that – disciplined tests, measurable wins, and a practical path to rollout. Start small, measure, then scale what works.
