MIT: 95% of enterprise AI pilots fail to deliver measurable ROI

The report should be a reality check, says Dr. Tim O'Connell, CEO of NLP company emtelligent. He digs into the new study, discusses genAI hurdles (and ways to jump them) and highlights the gap between real back office gains and "visibility spending."

By Bill Siwicki, Managing Editor | October 9, 2025 | 10:29 AM

Tim O'Connell, CEO and cofounder of emtelligent and a practicing radiologist

Photo: emtelligent

A new study from MIT's NANDA initiative has found that 95% of generative AI pilots fail to deliver measurable ROI for companies – a failure rate rooted not in flawed models but in poor integration and misaligned priorities.

Dr. Tim O'Connell can unpack what this means for healthcare. He is CEO of emtelligent, which builds AI-powered natural language processing tools that extract structured insights from unstructured medical data to improve healthcare workflows, analytics and decision making.

Of the more than 40 pilot tests the company has run on its AI systems, O'Connell reports all have generated a successful outcome, with some projects – such as one processing 5.1 billion notes – delivering up to an 80% increase in structured biomarker data across six therapeutic areas.

As both a practicing physician and the man behind emtelligent, O'Connell has firsthand insights into building clinical AI that works for clinicians.

We spoke with him to get his expert views on the new study from MIT's NANDA initiative, genAI ROI hurdles and ways to jump them, back-office ROI versus visibility spending, and working with vendors and distributed leadership.

Q. Please talk about the new study from MIT's NANDA initiative, and the results you find most important.

A. The recent MIT NANDA report delivered a reality check for healthcare, and for that matter, industry at large. The finding that 95% of enterprise AI pilots fail to deliver measurable ROI validates something those of us in healthcare have known for a while: Healthcare information is extremely context-specific and nuanced, making its challenges difficult to address with generic technology.

If you believe the hype, for example large language models passing the U.S. Medical Licensing Exam, then you might believe that AI will quickly replace physicians or revolutionize clinical decision making.

This report demonstrates that outcome is highly unrealistic at present. The key takeaway from the report for our industry is that medical AI must be designed to augment clinical decision making, keeping humans in the loop, not substituting for them. And generic models, while useful in some consumer-facing or commercial use cases, simply cannot yet comprehend the complexity of medical language and healthcare workflows.

Ultimately, if AI is going to be successful in healthcare, it can't be off-the-shelf – it must be purpose-built and properly integrated in clinical and operational workflows. The models that take root will understand medical ontologies and clinical context deeply.

But perhaps the most interesting message in this research isn't about the models at all. It is that the real barriers to AI in healthcare, like those faced by earlier technologies such as CRM and EHR, are organizational, not technological.

What will lower barriers is redesigning workflows and building a robust data infrastructure, creating the structured foundation that will allow medically aligned AI to deliver meaningful impact. Only then will AI rise to meet the hype and begin to help lower costs, improve outcomes and simplify how patients navigate the healthcare system.

Q. You suggest three hurdles and ways to jump those hurdles when it comes to getting ROI from generative AI projects. You have told me of an integration challenge – how fragmented systems exacerbate the "learning gap" MIT identifies. Please discuss the challenge and strategies to overcome it.

A. One of the largest hurdles in achieving ROI from healthcare AI is data integration. Healthcare data is unfortunately fragmented, scattered across dozens of systems and formats – not to mention the huge individual variation in how providers take notes.

Electronic health records, lab systems and imaging databases all hold part of the patient health profile, but those systems rarely talk to each other. This makes obtaining a 360-degree view of the patient extremely difficult.

Add to this the sea of unstructured notes, which are difficult enough to parse and are becoming more so with widespread copying and pasting and the inclusion of tables and other semi-structured data. This underscores why MIT's research saw a significant "learning gap" in bringing AI to production.

Healthcare AI will only ever be as smart as the data it sees. And in our industry, the data is highly fragmented and very difficult to parse.

Closing this gap starts with redesigning our data pipelines. Healthcare organizations must invest in technologies that bring disparate data together, extract and structure insights hidden in unstructured data, and create a coherent view of the patient.

This will create a reliable data foundation we can use to launch AI into the business, ensuring our investments support more than one-off pilots and deliver enterprise-wide ROI.

But data integration alone isn't enough. The models themselves must be purpose-built to make use of this unified data. Generic models, no matter how large, cannot reliably interpret clinical shorthand, medical abbreviations or the contextual nuances in unstructured notes.

Purpose-built medical AI, trained on the language of medicine, can. When paired with the right data pipeline, this kind of AI will bridge the learning gap and unlock both clinical and financial ROI.

Q. Your second point you sum up as back-office ROI versus visibility spending. Please spotlight how operational AI – like coding automation, resource allocation and patient triage – can deliver real patient and financial outcomes.

A. The NANDA report highlighted that many organizations are investing in the wrong areas and too much money has gone into "visibility" projects. These initiatives fail to move the needle financially. In contrast, the projects that consistently succeed are operational in nature – coding automation, prior authorization and patient triage are prime examples.

Focusing on these areas is critical, as these are the use cases where errors and inefficiencies create the most drag on the system. When properly addressed with the techniques we just discussed, they have the potential to deliver measurable returns and improve health outcomes.

Take coding automation as an example. Every clinician has suffered from documentation burdens, and every CFO understands the revenue impact if coding contains errors or is delayed. Medically aligned AI that can extract clinical concepts from unstructured notes, interpret the context correctly and assign accurate codes transforms this process.

It reduces administrative overhead, accelerates reimbursement and improves our ability to deliver the right care in the long term.

Generic models can't reliably do this at scale. They hallucinate, misinterpret and introduce an unacceptable level of risk. The same applies to prior authorization. These processes become faster and require less overhead when medical AI accurately performs criteria matching, provides auditable patient summaries and streamlines clinical review.

In every case, the right AI doesn't replace people; it allows them to focus on the work that requires human judgment. That's where the operational ROI lies, in making healthcare more efficient and more accurate.

Q. And finally, you have said working with vendors and distributed leadership is a major point. You say vendor-led, clinician-empowered models enable frontline teams, not centralized IT, to champion AI adoption. Please elaborate.

A. The MIT research found vendor-led AI projects are more successful than internal builds. This isn't surprising. Developing purpose-built medical AI requires deep expertise in both language models and healthcare ontologies, as well as the engineering know-how to create and sustain model enhancements.

Most health systems simply don't have this skill set in-house. That isn't their fault; it's simply a matter of priorities. Healthcare organizations are built to deliver care, not to reinvent advanced AI. Working with specialized vendors allows organizations to access the expertise needed while focusing internal teams on their core business.

But vendor leadership is not enough. Success comes when adoption is driven by clinicians and administrators on the frontlines. Top-down IT mandates rarely work. Business and clinical leaders must champion the tools.

When a radiologist, a care manager or a claims administrator sees that a tool makes their day easier, they will become the advocate. That's the power of frontline empowerment: It grounds AI adoption in utility, ensuring the technology solves real-world problems rather than artificial pilot scenarios.

Success requires an implementation approach that combines vendor expertise with frontline ownership. Vendors bring the expertise and knowledge to build accurate, scalable systems. Clinicians and business leaders bring the lived experience to validate, refine and champion the technology.

Together, they create AI that is both technically sound and practically useful. This collaborative approach lowers barriers to adoption, accelerates ROI, and ensures AI serves as an enabler, not a burden.

Follow Bill's health IT coverage on LinkedIn: Bill Siwicki
Email him: bsiwicki@himss.org
Healthcare IT News is a HIMSS Media publication.
