
Cleaning up healthcare data for the AI era

A QHIN and interoperability pro explains what clean, standardized and interoperable data enables with AI. He also details good governance and standardized frameworks for what constitutes clean data.
By Bill Siwicki, Managing Editor

Derek Plansky, senior vice president of strategic governance at Health Gorilla, an interoperability company and QHIN

Photo: Derek Plansky

Challenges with data quality and data governance have plagued healthcare analytics efforts for decades – and the stakes are only getting higher in the age of AI. Inaccurate or inconsistent data doesn't just mean administrative headaches. It can lead to missed appointments, billing errors, patient dissatisfaction and even life-threatening treatment mistakes.

A recent Experian survey of healthcare professionals rated confidence in data quality at just 7.08 out of 10. The top concerns? Duplicated work from data inconsistencies, incorrect patient details and missed appointments – all of which erode efficiency and patient trust.

Now, as healthcare rapidly embraces artificial intelligence, poor data quality becomes an even greater risk. AI models are only as strong as the data that fuels them. Clean, standardized and interoperable data is essential for generating accurate insights, empowering clinicians and improving patient outcomes.

Without it, the "garbage in/garbage out" problem could put lives at risk.

Derek Plansky is senior vice president of strategic governance at Health Gorilla, an interoperability company and qualified health information network, or QHIN. 

As a technologist and product strategist, he has spent the last 20 years working with companies with deep data challenges such as LexisNexis, IBM, Change Healthcare and Quest Diagnostics.

Over the last 15 years, his focus has been building health information exchange and supporting sustainable systems to collect, analyze and exchange clinical and financial healthcare data, and use insights derived from this data to steer the U.S. healthcare industry in the right direction.

We spoke with Plansky to discuss the effects poor health data quality can have in the era of exploding use of AI in healthcare. He discussed what clean, standardized and interoperable data enables for AI and how frameworks can help.

Q. Please talk about the effects poor health data quality can have in the era of exploding use of AI in healthcare.

A. First, there is the catch of using AI in healthcare. Few would question that AI is rapidly maturing as a transformative force in healthcare, offering breakthroughs in diagnosis, predictive analytics and operational efficiency. But there's a catch.

This grand potential hinges on one critical factor: the quality of the health data. Poor data quality will quickly erode the benefits AI promises to deliver and undermine trust in the systems going forward.

Then there are the dangers of poor-quality data. AI models are only as reliable as the data they're trained on. Incomplete or biased records can skew algorithms, perpetuate inequities and generate unsafe recommendations at the point of care.

This goes beyond inconvenience or frustration. Poor-quality data undermines trust in digital tools and makes care delivery unpredictable for patients and clinicians alike.

Misdiagnoses or even missed diagnoses resulting from faulty data can harm patients. Duplicated, fragmented or outdated records compound the problem by increasing error rates and reducing the confidence clinicians have in AI-driven insights.

In other words, when flawed data fuels AI, it doesn't just affect one decision. It ripples across the entire care journey.

Then there is the risk to compliance. No organization wants the compliance headaches that come with low-quality data. Meeting strict standards for AI and large language models – like those from TEFCA, HIPAA and the FDA – is already challenging.

Throw in substandard data, and suddenly monitoring and reporting on care standards and reimbursement become a minefield. All of this threatens transparency and accountability.

And finally, there is the financial factor. What about the money? Inaccurate data requires organizations to spend significant resources cleaning data. This is time-consuming, which means it is expensive. And poor data includes more than duplicated or outdated EHRs.

Inaccurate coding leads to denied claims, resulting in lost revenue. Unreliable data slows innovation, and in a world where AI is transforming care delivery at supersonic speed, this delay can be costly.

Q. You say clean, standardized and interoperable data is essential. What does this enable with AI?

A. When health data is standardized, normalized and deduplicated, AI can generate consistent, high-quality insights that clinicians can trust at the point of care. Interoperable patient records ensure accurate decision support, reduce unnecessary testing and enable predictive models that anticipate risks before they materialize.

Standardized data also broadens the scope of AI well beyond the hospital. Population health initiatives, public health surveillance and medical research all depend on normalized data sets to uncover trends across millions of encounters. Without it, algorithms stall at the pilot stage. With it, they scale into tools that improve equity, accelerate discovery and strengthen the entire healthcare system.

And there's more. Interoperable data also seamlessly stitches together electronic health records, laboratories and payer systems. This creates coordinated and efficient care. It always comes back to the patient. Standardized data allows AI to provide the patient with comprehensive and even predictive care.

Interoperable data also influences research – which, of course, will also come back to patient care. AI's predictive analytics and clinical trial insights require reliable inputs to produce trustworthy, actionable results. Standardized and interoperable data lays a foundation for scalable innovation that AI can then predictively build upon, ultimately influencing the entire healthcare ecosystem.
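As an editorial illustration of what "normalized and deduplicated" can mean in practice, the following is a minimal Python sketch, not a description of Health Gorilla's or any vendor's method. The `PatientRecord` fields, the matching key (name plus birth date) and the sample data are all assumptions made for the example; real patient matching relies on far more robust identity resolution.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PatientRecord:
    """Hypothetical, simplified record; not a real EHR schema."""
    source: str        # system the record came from, e.g. "ehr" or "lab"
    first_name: str
    last_name: str
    birth_date: date
    updated: date      # when the source system last changed the record


def normalize_name(name: str) -> str:
    """Collapse whitespace and ignore case so equivalent names compare equal."""
    return " ".join(name.split()).casefold()


def match_key(record: PatientRecord) -> tuple[str, str, date]:
    """Assumed matching rule: same normalized name and same birth date."""
    return (
        normalize_name(record.first_name),
        normalize_name(record.last_name),
        record.birth_date,
    )


def deduplicate(records: list[PatientRecord]) -> list[PatientRecord]:
    """Keep only the most recently updated record for each match key."""
    latest: dict[tuple[str, str, date], PatientRecord] = {}
    for record in records:
        key = match_key(record)
        current = latest.get(key)
        if current is None or record.updated > current.updated:
            latest[key] = record
    return list(latest.values())


# Fabricated sample data for illustration only.
records = [
    PatientRecord("ehr", "  maria ", "LOPEZ", date(1980, 5, 17), date(2023, 1, 4)),
    PatientRecord("lab", "Maria", "lopez", date(1980, 5, 17), date(2024, 3, 9)),
    PatientRecord("payer", "John", "Smith", date(1975, 11, 2), date(2022, 8, 30)),
]
clean = deduplicate(records)
```

In this sketch, the two Maria Lopez entries differ only in spacing and capitalization, so they collapse into the newer lab record, while the John Smith entry is kept as is. Without the normalization step, an exact comparison would treat them as two different patients.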

Q. You suggest the solution lies in stronger data governance. Please explain.

A. Governance is often misunderstood as an administrative burden, but in reality, it provides the guardrails that make innovation safe and sustainable. The truth is that only through stronger data oversight and vigilant stewardship can healthcare organizations fully harness AI's capabilities.

In today's fragmented and fast-moving health tech environment, clear policies and standards are essential to ensure AI models are fed accurate, compliant and consistent information. Interoperability is only powerful and useful if it can assume that every participating platform adheres to a standardized data management and policy framework.

Shared data, whether from EHRs, labs or payers, requires common rules for structure, validation and stewardship to ensure precision and reliability.
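To make "common rules for structure, validation and stewardship" concrete, here is a minimal Python sketch of a shared rule set that every participating system could run before exchanging a record. It is an editorial illustration under stated assumptions: the field names (`patient_id`, `birth_date`, `sex`), the allowed code list and the rules themselves are hypothetical and are not taken from TEFCA, HIPAA or any published framework.

```python
from datetime import date
from typing import Callable, Optional

Record = dict[str, object]
# A rule inspects a record and returns an error message, or None if it passes.
Rule = Callable[[Record], Optional[str]]

# Illustrative code list; a real framework would reference an agreed value set.
ALLOWED_SEX_CODES = {"female", "male", "other", "unknown"}


def required_fields(record: Record) -> Optional[str]:
    """Structure rule: every record must carry these (hypothetical) fields."""
    missing = [f for f in ("patient_id", "birth_date", "sex") if not record.get(f)]
    return f"missing fields: {', '.join(missing)}" if missing else None


def iso_birth_date(record: Record) -> Optional[str]:
    """Format rule: birth_date must be an ISO 8601 date that is not in the future."""
    value = record.get("birth_date")
    if not value:
        return None  # absence is already reported by required_fields
    try:
        parsed = date.fromisoformat(str(value))
    except ValueError:
        return f"birth_date is not an ISO 8601 date: {value!r}"
    if parsed > date.today():
        return "birth_date is in the future"
    return None


def known_sex_code(record: Record) -> Optional[str]:
    """Vocabulary rule: coded values must come from the shared list."""
    value = record.get("sex")
    if value and value not in ALLOWED_SEX_CODES:
        return f"unknown sex code: {value!r}"
    return None


RULES: list[Rule] = [required_fields, iso_birth_date, known_sex_code]


def validate(record: Record) -> list[str]:
    """Run every shared rule and collect the error messages."""
    return [msg for rule in RULES if (msg := rule(record)) is not None]


# Fabricated sample records for illustration only.
good = {"patient_id": "p-001", "birth_date": "1980-05-17", "sex": "female"}
bad = {"patient_id": "p-002", "birth_date": "05/17/1980", "sex": "F"}
good_errors = validate(good)
bad_errors = validate(bad)
```

Here `good_errors` is empty, while `bad_errors` flags both the locally formatted date and the unrecognized code. The design point is that the rules live in one shared list, so an EHR, a lab and a payer applying the same list would agree on which records count as clean.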

Q. You also say standardized frameworks for what constitutes "clean data" are another part of the solution. Please describe these frameworks and how they'll help.

A. Standardized frameworks will provide a clear set of rules and best practices for how health information should be formatted, validated and maintained. Think of it like a book. Each chapter (framework) needs to hold to the same formatting, language and structure as the preceding and following chapters for the reader's continuity and clarity.

The goal is to ensure structured, deduplicated, normalized and accurate data. Standardized frameworks will help reduce the variability in how different platforms store and share data. Let's go back to our book analogy.

If you are reading a book in English, for example, and turn to a chapter written in Latin, your ability to process and act upon the content is deeply diminished. Standardized frameworks remedy this confusion and variance. Organizations need regulated standards to cooperate with one another and to act on the shared data.

Interoperability is the darling of healthcare tech. But poor-quality data – data that is not held to standardized oversight – will weaken the power of sharing data among EHRs, labs and even payer systems.

AI has tremendous potential to process and analyze data across limitless sources. Only with universal adoption of frameworks that bake in regulatory compliance and transparency will this potential translate into healthcare transformation.

Follow Bill's health IT coverage on LinkedIn: Bill Siwicki
Email him: bsiwicki@himss.org
Healthcare IT News is a HIMSS Media publication.
