Go Carefully When Measuring Quality

Gauging and rewarding good work in health care is a noble goal with potentially negative consequences

Martin Sipkoff

The good inherent in evidence-based medicine — the cornerstone of quality measurement — can be compromised by the complexity of processes used to achieve its implementation. If transparency is confused with accountability and incentive programs are perceived by physicians as manipulative, pay for performance will be ineffective and its savings illusionary, says Brent James, MD, executive director of the Institute for Healthcare Delivery Research and vice president for medical research at Intermountain Healthcare in Salt Lake City.

“If physicians begin to game the system, pay for performance will affect financing but make no difference to the patients,” says James. “If we move resources around to meet the data expectations of payers, the unintended consequences of data measurement will be to damage the quality of care.”

He is not alone in his concern.

“A lot of the misunderstanding and debate about evidence-based medicine emerges from the failure to recognize that at the end of the line, some person or some organization needs to make a judgment about whether the quality of a particular body of evidence is adequate to make a particular decision,” says David Eddy, MD, a cofounder and medical director of Archimedes, a simulated model of care sponsored and managed by the Kaiser Foundation that outlines processes to achieve quality results.


Dependent on docs

The value of quality measurement is therefore dependent on physicians, not payers. Pay-for-performance programs are only as effective as physicians allow them to be, says James. “There is in our profession an inherent tendency to do what is right. Pay for performance is a step in the right direction,” he says.

The goals and purpose of quality measurement are clear, and have been for a long time. “At its most basic level, quality is doing the right thing, at the right time, in the right way, for the right person,” the late John Eisenberg, MD, a founder of the quality movement and chief architect of the federal Agency for Healthcare Research and Quality, told Congress 10 years ago. “The challenge clinicians face every day is knowing what the right thing is, when the right time is, and what is the right way.”

Today, public reporting of provider adherence to quality standards is at the core of the pay-for-performance programs proliferating in the market. “We believe that measuring and reporting performance is the best possible means of improving the quality of the care we deliver,” says Don Storey, MD, Aetna’s national medical director for high performance networks. “Actionable measures are the key to successful implementation.”

Previous efforts

James has helped to design quality measurement programs for his own health system and advised federal and private organizations on the development of quality measures. He believes such programs hold promise for improving the quality of care “when physicians are [given incentives] appropriately to do what’s right. It is a very important concept, but this isn’t the first time we’ve tried. It’s the third. We need to learn from the past and be careful.”

The two previous quality data reporting initiatives to which he is referring are a controversial public report by CMS in 1987 and the Joint Commission’s Oryx initiative, an ongoing effort to tie outcomes to accreditation that started 10 years ago.

The first, a survey of 6,000 hospitals published by the Centers for Medicare & Medicaid Services (then named the Health Care Financing Administration) in a December 1987 report titled 1986 Hospital Mortality Data for Medicare Patients, caused an uproar. Hospitals claimed the data were inaccurate or skewed. But that really didn’t matter because the public tended to ignore the data.

“For the first time, information on the quality of health care is being made available to American consumers by government agencies…,” wrote the authors of a study published in the September – October 1991 issue of Public Health Reports titled “A Survey of Newspaper Coverage of HCFA Hospital Mortality Data.”

The findings indicate that HCFA’s release of the 1986 hospital mortality data received heavy news coverage. There were twice as many negative headlines as positive ones, although nearly 95 percent of hospitals had mortality rates within expected ranges. Quotes from representatives of hospitals predominated in the newspaper articles, and they often blamed some aspects of the HCFA data for higher-than-expected mortality rates.

“The idea was that if people stayed away from the bad hospitals, not as many people would die,” says James. “A subsequent study found that several years after the HCFA data were published, hospitals experienced only a 1 percent fall in patient population for the procedures they performed badly.”

The HCFA initiative grew out of a perceived public concern about quality in health care, according to Michael Millenson, then a consultant at William Mercer and now an independent consultant who sits on Managed Care’s editorial advisory board. In an article in the May – June 1997 issue of Health Affairs titled “Miracle and Wonder: The AMA Embraces Quality Measurement,” Millenson wrote:

Hospital accreditation had always been based on “minimum standards.” In the early 1970s, public concerns about quality forced the AMA and the American Hospital Association (AHA) to completely overhaul the Joint Commission on Accreditation of Hospitals [since renamed the Joint Commission]. In theory, the new JCAHO “medical audit” required measurement of medical outcomes at hospitals. In practice this promise was never kept. Still, standards were improved, and in 1997 the JCAHO rolled out “Oryx,” its latest version of an outcomes management system.

The Oryx program continues to collect data to measure outcomes, in a manner apparently satisfactory to participating hospitals, but there is no evidence that simply collecting Oryx data improves the quality of care, says James.

“After the HCFA experience, hospitals retrenched and started again,” he says. “They put together their own set of quality measures. Over time it became apparent that the data were failing to improve care or influence choice. Data analysis shows how systems perform, but data analysis doesn’t get hospitals to perform well.”

Tying payment to measurement, however, does “move us in the right direction,” says James. “The short answer to whether pay for performance can work is ‘yes.’ It can have a positive effect on improving the quality of care.”

Eddy agrees. “Pay for performance does move us in the right direction because it puts emphasis on performance and it introduces a method that can increase value, if it is designed and implemented correctly,” he says. “But we have to determine the magnitude of benefit in each measure, not just the fact that some benefit exists.”

An example

He gives mammography as an example. “There is plenty of evidence that mammography can reduce morbidity and mortality. So it meets the criteria of being evidence-based,” he says. “But the degree of benefit and the value of its cost vary widely depending on who is being tested, on factors like age and risk, on what test is used, on the frequency of testing, and on follow-up. It is simplistic to say mammography screening is effective and use that as the basis for a performance measure. If one wants to save money or increase value, one has to dig down and determine the magnitude of benefits and costs.”

The real value of quality measurement always comes down to its application, says James. Designers of P4P programs should focus closely on the consequences of their incentives, and keep in mind that the goal is improving provider behavior.

“Bonuses cannot guarantee improved quality,” says James, “especially if they are perceived by physicians as a cynical and manipulative indictment of their professionalism.”

Physicians take pride in the quality of care they provide and in their relationship with patients, says James. They can be offended by performance measurement initiatives if the programs fail to take professional self-respect into account. In addition, James says research shows that if financial incentive programs are designed to pay more than 10 percent of current income, physicians become cynical and begin to game the system, finding a way to meet the technical requirements of a program but failing to change their behavior.

Four dangers

“The problem is that looking good is often far easier than being good, a very human response,” says James. He points to four pitfalls inherent in quality measurement programs, dangers that payers should keep in mind in designing P4P programs they hope will be effective in improving quality:

Pitfall No. 1: Confusing transparency with accountability. Transparency entails provision of information. Accountability is a much more complex issue, and reflects the effectiveness of treatment processes.

Accountability, i.e., tying incentives to performance, must involve a complex evaluation of clinical processes that only begins with transparency.

And providing data to the public is not sufficient to improve the quality of care, says James. “Transparency frequently has no effect on decision-making, either by patients or providers. In fact, among physicians there are two different understandings of the word,” says James. “The first is a health plan telling them ‘I’ll publish who is good and who is bad.’ That doesn’t go down well, and leads to gaming the system.

“The second is much more functional. It means the provision of information to enhance the chain of decisions that compose clinical decision-making. It entails giving detailed information to the physician to enhance performance,” says James.

“Transparency is at the beginning of P4P,” agrees Dick Salmon, MD, Cigna HealthCare’s national quality medical director. “It allows us to choose measures that are focused on ultimate outcomes, to set goals. It is not an end in itself.”

Pitfall No. 2: Most accountability measures cannot rank performers accurately. Differences in patients and treatment modalities are not always reflected in results.

All clinical decision-making is complex, says James, and — if it is to be effective — must reflect the needs of individual patients. Differences in patients (everything from anatomy to belief systems) mean differences in treatment, and that means differences in results. So in order to be meaningful to incentives and to inform patient choice, accountability measures must reflect those differences.

“Otherwise we risk losing all credibility,” says James. “If that happens, what’s the point of incentive programs?”

Pitfall No. 3: Top-down accountability measurement damages front-line quality. Effective accountability measurement begins at the patient level, not the policy level.

This is the issue James emphasizes the most vigorously. He refers to physicians and hospitals as the “front line” of health care.

By “top-down” he is referring to dictates by payers to the front line. An example is payers dictating to providers what percentage of their diabetes patients must receive foot exams in order to meet an incentive goal. “That virtually guarantees that payers will be told what they want to hear,” he says.

By “bottom-up” he is referring to payers supplying providers with the data necessary to create remedies to improve performance. A parallel example is providing physicians with claims data about how many diabetes patients are receiving foot exams and “giving physicians the tools they need to improve their care, such as reminders for their patients,” says James.

Bottom-up data collection systems are founded on transparency, and result in effective process management and quality improvement. They contain the data necessary to develop accountability measures.

Top-down data collection systems are designed for accountability alone. They do not contain process management data, and compete with front-line needs by creating compliance burdens. “They actively harm the quality of care,” says James.

Pitfall No. 4: Most patients do not use outcomes statistics to choose physicians or hospitals. They rely much more on anecdotal information from family and friends.

James believes that patients do not care much about data. They rely on personal relationships, especially with their physicians. “Stories trump statistics, and personal relationships trump stories,” he says.

Cigna’s Salmon does not agree. “We believe that patients are relying more and more on the data we provide to them about the performance of providers,” he says.

Perhaps as our information systems proliferate, numbers are replacing stories to some degree. But James believes that regardless of how data are used, it is at the provider level that quality has meaning.

“I have a deep and profound respect for the commitment and intelligence of front line people,” he says. “Why force patients into the role of clinical decision-makers? The purpose of quality measurement is to improve the physician’s ability to perform that role.”

The chain of effective measurement

Brent James, MD, of Intermountain Healthcare, lists the six questions, in order, that health plan officials should ask when they design quality measurement and pay-for-performance programs:

Do we have the necessary science to know the relevant factors?
Did we do our homework in building a data set that identifies the most important factors?
If so, did we include those factors?
When care is delivered, are those factors actually assessed?
Are those factors actually written into the medical record?
Is the information from the medical record being accurately extracted into an analyzed data set?
