‘Take My Word for It’: The Enduring Dispute Over Measuring DM’s Economic Value

While DM is now mainstream, it can still be difficult to judge a program’s worth. Health plans faced with renewing DM contracts have a lot to think about.

When Bill Popik was chief medical officer at Aetna last year, it was his job to renew or terminate Lifemasters’ three-year disease management contract. Lifemasters was running Aetna’s diabetes, heart failure, and other programs. Before making a second commitment, Popik, who was spending tens of millions of dollars on DM, needed assurance that the programs were earning their way, especially since Aetna was reselling DM to large self-insured companies. “We had a lot of skeptical employers to convince that this was worth investing in,” Popik says.

Aetna’s internal audit showed an overall return of about $3 for each $1 spent on DM, a good return on investment (ROI). Most savings came from the sickest patients, such as those with heart disease, while asthma programs were close to breaking even. Popik, president-elect of the Disease Management Association of America (DMAA), the industry trade organization, gladly extended the contract.

Health plan leaders, CMS officials, and others are betting the farm on DM programs. An estimated $1.5 billion will be spent on DM this year, up from $78 million a decade ago, according to Al Lewis, JD, president of the Disease Management Purchasing Consortium. DM’s health improvements are incontrovertible — better quality of life, less absenteeism and disability, more productivity. The big unknown is: Do the programs really save money? Executives at Aetna and many other managed care organizations seem to think they do. But there is great uncertainty in and out of DM circles about the formulae that establish positive ROI, such as Aetna’s 3:1, causing some to wonder whether bullishness derives from faith alone.

Lifemasters, Healthways (formerly American Healthways), Matria, Health Dialog, and others can claim in contracts that for every dollar spent on call centers, patient literature, and other means to improve patient health, managed care organizations can expect $1.50 to $8 in reduced use of acute care services.
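The ratios quoted here reduce to simple arithmetic, though the article does not say how any particular vendor defines its inputs. A minimal sketch in Python, assuming ROI means gross savings divided by program fees; the function names and dollar amounts are hypothetical and chosen only for illustration:

```python
def roi_ratio(gross_savings, program_cost):
    """Dollars of reduced acute-care spending per dollar spent on DM."""
    if program_cost <= 0:
        raise ValueError("program_cost must be positive")
    return gross_savings / program_cost


def net_savings(gross_savings, program_cost):
    """Savings left over after the program has paid for itself."""
    return gross_savings - program_cost


# Hypothetical plan (not from the article): $1 million in DM fees
# and $3 million in avoided claims.
fees = 1_000_000
avoided_claims = 3_000_000
print(f"ROI {roi_ratio(avoided_claims, fees):.1f}:1")
print(f"net savings ${net_savings(avoided_claims, fees):,}")
```

Under this definition a 1:1 ratio is break-even: the program recovers its fees and net savings are zero.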

However, any postulation is suspect since there are many ways to calculate financial savings or losses, industry sources agree. Results can be wildly inconsistent. One telltale study demonstrated that using the same data and different methodologies produces three different results: a negative ROI, break even, and a positive ROI. Titled “The Impact of the N,” it won an award at DMAA’s 2004 annual convention. “The method you choose influences the results you get,” says the study’s author, Thomas Wilson, PhD, of the Population Health Impact Institute, a not-for-profit provider of evaluation guidelines and other services.

Wishful thinking

In the early days of DM a decade ago, vendors boasted returns of 30:1, a declaration that experience has shown to be outrageous. Still today, “It’s impossible to know whether the results managed care executives are hearing are valid,” Wilson says. “Most of the time, calculation methods are not even disclosed.”

The Congressional Budget Office examined dozens of peer-reviewed DM studies. “There is insufficient evidence that the programs can generally reduce overall spending,” wrote CBO director Douglas Holtz-Eakin to the Senate Budget Committee in 2004. Moreover, the CBO worried that the programs could even raise costs by increasing the amount of medical care patients use.

“Improving health outcomes and mitigating . . . costs do not necessarily go hand in hand,” CBO researchers wrote.

The Cornell University Institute for Health and Productivity Studies in collaboration with Thomson Medstat also analyzed ROI for dozens of DM studies. Led by Ron Goetzel, PhD, vice president for consulting and applied research at Thomson Medstat, the group discovered a positive ROI of 2.78:1 for congestive heart failure; only “some evidence” that diabetes programs save more than they cost; mixed results for asthma programs; and a 30-cent loss for every $1 spent for depression.

Depression programs were found to cost more than they save, but may save money when productivity outcomes are considered.

The greatest returns were for managing multiple conditions, from $6.65 to $10.87 per dollar invested. Results were published last summer in Health Care Financing Review.

David Cutler, PhD, the Harvard economics professor who studies health spending, examined two diabetes programs run by insurers in the last decade. He concluded that DM is not a moneymaker because patients change plans before long-term savings can be realized. Thirty-seven percent of the workforce turns over annually, according to a current analysis by the Bureau of Labor Statistics.

“The longer a member is enrolled in a program, the more dramatic results obtained,” says Donald Fetterolf, MD, corporate vice president for strategic initiatives at Matria and formerly vice president and senior medical officer at Highmark Blue Cross and Blue Shield in Pittsburgh.

“It could take 18 months to get people signed up and another 12 months to get initial outcome data,” adds John Stark, a regional vice president and actuary at WellPoint. In addition, the effects of behavior change are often slow to develop. Imagine a cardiac condition that has developed over a lifetime. While simple dietary changes, weight reduction, exercise, and smoking cessation will have a significant effect on health, the effect on claims may not be seen for years. Clearly, if all companies had similar DM programs and measurement standards, when patients moved on, DM would be seamless.

For some, like Gordon Norman, MD, formerly PacifiCare Health System’s vice president for disease management, the thorny issue of ROI is moot. “We’re beyond determining whether DM programs save money,” he says. “There are enough cohort and quasi experiments that have caused senior health plan executives and self-insured employers to invest year after year. Their confidence is not predicated on anecdote, supposition, or faith.”

Norman is now chief medical officer at the DM vendor Alere Medical, which entered into a long-term DM management agreement with PacifiCare last year. Every year since 2000 that DM had been in place at PacifiCare, the company realized “in excess of $100 million annually in net savings aside from what the program was costing,” he says.

Chris Coloian, Cigna’s vice president for DM and wellness, agrees: “People have moved past speculating whether these programs work.” In 2004, the company released results of a DM study that it had conducted of 43,000 diabetics in 12 states. Overall medical costs declined by an average of 8.1 percent for full participants, mainly through a decrease in hospitalizations. The program is provided in collaboration with Healthways.

Cigna has the largest DM program as measured by total dollars, according to “Leading Disease Management Organizations,” a summer 2005 report by the Disease Management Purchasing Consortium.


DM sometimes gets shortchanged because it is impossible to measure factors that would surely boost ROI. “Where does productivity loss occur?” asks Popik, formerly of Aetna. “Parents who leave work or don’t come to work because they have to take their asthmatic children to doctors — we don’t measure that but it’s very significant. Presenteeism — no one is measuring that and factoring it into ROI.” An analysis of the lost work time avoided would demonstrate significant savings, according to Popik.

Few health plans are without at least one DM program. Enthusiasm and high expectations reign. “The check that the chief financial officer writes for DM is one of the largest, aside from claims and labor,” says Ian Duncan of Solucia Consulting, a company that helps negotiate DM contracts.

Gold standard

Even Senate Majority Leader Bill Frist, MD (R-Tenn), boosted DM in February when he said in a speech to the Detroit Economic Club referring to diabetes DM: “We know that through evidence-based interventions and monitoring — things as simple as phone calls from trained nurses — disease management can reduce health costs by as much as 30 percent, and can cut hospital costs by as much as 50 percent.”

But as the industry matures, changes are occurring. Only two to three years ago, managed care organizations “were just accepting what was handed to them by their DM vendors and relying on those explanations,” says Rebecca Owen, an actuary with Solucia Consulting. Today they are looking under the hood to learn how numbers are derived.

“The president of the company wants to know if it’s a red number or a black number,” says Matria’s Fetterolf. Executives have to consider whether they will come out ahead with alternate uses for their dollars.

To be sure, some health plan executives renewing vendor contracts are calling for independent audits of methodologies, with striking results. “We have audited calculations that differ substantially from the written methodology that studies were supposed to represent,” Duncan says.

Especially for health insurers, outside auditors have become de rigueur since the federal Sarbanes-Oxley Act of 2002. “We want to validate vendor results to make sure that savings are there, for our peace of mind,” says WellPoint’s Stark. “We have a lot of people looking at our business.” The act was passed after the Enron, Tyco and WorldCom fiascoes to encourage public companies to enhance financial disclosure.

Numerous attempts had been made in the past to create a uniform standard, in the way that the accounting industry has “generally accepted accounting principles” to which all accountants adhere. “It’s been one big spitting contest and no one has won,” says the DM consultant Vincent Kuraitis of Better Health Technologies.

Wrote Tracey Moorhead, DMAA’s executive director, earlier this year: “Our experience shows DM works, but lack of agreement on how to measure that success has hampered our ability to convince skeptics.”

Representatives of multiple disciplines have weighed in — academicians, politicians, statisticians, actuaries, economists, epidemiologists, and assorted researchers. Many have created, or are in the process of creating, methodologies that they consider scientific. Moreover, each vendor thinks it has created the gold standard. For example, SHPS, a company in Arizona that describes itself as “an integrated health management provider,” used Ernst & Young to validate client savings. SHPS wrote in 2002: “Through the Ernst & Young study, SHPS has established a financial model that will become an industry standard for measuring our programs as well as those of the competition.”

A year later, Healthways (then called American Healthways) and Johns Hopkins University researchers announced that they had developed a Standard Outcomes Matrix that was to be the “benchmark for all program evaluations.” Neither is uniformly accepted today, Fetterolf says.

It was big news in January when the DMAA announced that by December, it would establish a definitive methodology for measuring financial and clinical outcomes.

“The ROI issue has become so annoying,” says Fetterolf, appointed chairman of the DMAA committee directing the effort. “It’s a resource-consuming, fracturing issue. We want to come up with a standard so everyone can get on with life. It’s slowing contracting. Expensive people spend lots of hours in windowless rooms arguing over nuances and methodology.”

To get the ball rolling, the DMAA sent 300 questionnaires to CEOs, chief medical officers, quality and informatics people, vendors, health plans, support organizations, consultants, and pharmaceutical concerns in an attempt to gain a consensus. “Engaging all key stakeholders early and soliciting their input is crucial, and we’re doing both,” Moorhead says. Committee members include Jim Pope, MD, chief medical officer of Healthways, Jaan Sidorov, MD, director of care coordination at Geisinger Health Plan, employees of actuarial and accounting concerns, and others. Still, there is so much contentiousness about how to conduct a financial review that it is doubtful that warring factions will accept the DMAA standard. “We fully anticipate consensus on a large number of key elements in the presentation of DM outcomes, but recognize there may not be agreement on some elements,” Fetterolf says.

“I’ll go on record to say that if the DMAA includes plausibility analysis, I will be a big supporter. Leaving it out wouldn’t pass the sniff test,” says Lewis, who has his own algorithm. Plausibility indicators adjust for the impossibility of predicting all the members of a health plan who are going to have health events.

There are even suspicions that since the DMAA represents an industry eager to demonstrate that its programs have financial worth, its algorithm may be faulty. “As a trade organization, the DMAA is going to come up with a method that shows DM provides a ROI,” says Medstat’s Goetzel. “Perhaps only a government agency can come up with a methodology that is unbiased.” Fetterolf strongly rebuts that assertion. “The industry can worry that the fox is in the henhouse, but we’ve gone to great lengths to make sure that is not the case by including lots of stakeholders,” he says.

Wilson, of the Population Health Impact Institute, argues that it is not even possible to come up with a single credible standard, as there are too many variables for a method to work in all circumstances. He has created a framework of five principles instead of a uniform methodology: data quality, equivalence, statistical quality, causality, and generalizability. “There is not only one way to build a house, but all construction companies follow certain rules, such as that walls have to be plumb and floors level.”

It should be simple

Virtually all agree that the Holy Grail for evaluating ROI is through expensive, double-blind randomized controlled trials, with patients assigned to one of two groups. One is the control group receiving no program intervention; the other receives DM interventions. Some doctors and others perceive this as unethical because the vendor withholds services from people who could benefit from them, Solucia’s Duncan says.

Wilson, however, argues that “double-blind” trials are not even possible in DM, as the patient will be aware of the DM program. “What’s left,” he says, “are less rigorous methods, more potential for bias, and the need for even greater scrutiny. To enable credible evaluations, metrics and methodologies will have to be fully disclosed.”

One of these methods is the “pre-post method,” which compares costs in a baseline period before the DM intervention with costs after the intervention begins. This method is fraught with potential for miscalculations and flaws, such as selection bias and regression to the mean.

Self-selection will no doubt bring in more highly motivated and perhaps healthier patients. In regression to the mean, members selected for DM during a time of acute illness, such as a hospitalization in the initial year, tend over time to become better and more like the average patient with the disease. Comparing costs for groups of members selected for hospitalizations in a base year falsely magnifies the effect of an intervention.
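Regression to the mean is easy to reproduce with made-up numbers. The following Python sketch is an illustration only, not any vendor’s or auditor’s method: it assumes each member has a stable underlying annual cost plus random year-to-year variation, “enrolls” the costliest 10 percent in the base year, and applies no intervention at all. Every parameter is hypothetical.

```python
import random


def simulate_pre_post(n_members=10_000, seed=0):
    """Average cost of high-cost members in the base year and the year after."""
    rng = random.Random(seed)
    # Assumed cost model: stable underlying cost plus independent yearly noise.
    underlying = [rng.gauss(10_000, 2_000) for _ in range(n_members)]
    year1 = [max(0.0, u + rng.gauss(0, 3_000)) for u in underlying]
    # Year 2 is drawn the same way: nothing was done for these members.
    year2 = [max(0.0, u + rng.gauss(0, 3_000)) for u in underlying]

    # "Enroll" roughly the 10 percent of members with the highest base-year cost.
    cutoff = sorted(year1, reverse=True)[n_members // 10]
    selected = [i for i in range(n_members) if year1[i] > cutoff]

    base = sum(year1[i] for i in selected) / len(selected)
    follow_up = sum(year2[i] for i in selected) / len(selected)
    return base, follow_up


base, follow_up = simulate_pre_post()
print(f"base year: ${base:,.0f}  following year: ${follow_up:,.0f}")
print(f"apparent 'savings' with no program: ${base - follow_up:,.0f}")
```

Because the selected group was chosen on an unusually bad year, its average cost falls the next year by itself, and a naive pre-post comparison would credit that drop to the program.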

Countless variables easily skew the data. For instance, the introduction of new technologies, new therapies, and new medications can make a vendor seem positively heroic. ROI can rise dramatically if a popular drug goes off patent. “A pill that costs $10 may now cost 10 cents,” Solucia’s Duncan says. By the same token, a new, coveted, but expensive medication on the market could lower ROI.

Comorbid patients can wreak havoc on DM calculations. Most primary care patients have more than one chronic condition; approximately half of the patients have five or more, according to a study in the May/June 2005 issue of Annals of Family Medicine. Financial savings from a single patient who is registered in more than one DM program should count in one program only. One CHF program may monitor patients with a remote device; another can use a nurse reminding CHF patients to take their medication. “Every one of these could have a profound effect on the results,” says Wilson of Population Health.
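The double-counting problem can be seen in a few lines of Python. This is a sketch with invented members and savings figures; the rule used to pick a single program (credit the first one listed) is an assumption made for the example, not a rule stated by anyone quoted here.

```python
# Hypothetical members: total measured savings per member and the
# DM programs each member is registered in.
members = [
    {"id": "A", "savings": 1_200, "programs": ["CHF", "diabetes"]},
    {"id": "B", "savings": 800, "programs": ["diabetes"]},
    {"id": "C", "savings": 500, "programs": ["asthma", "CHF", "diabetes"]},
]


def savings_by_program(members, count_once=True):
    """Total savings credited to each program.

    With count_once=True, a member's savings go only to the first listed
    program (an assumed tie-break rule). With count_once=False, every
    program the member is registered in claims the full amount.
    """
    totals = {}
    for member in members:
        programs = member["programs"][:1] if count_once else member["programs"]
        for program in programs:
            totals[program] = totals.get(program, 0) + member["savings"]
    return totals


double_counted = sum(savings_by_program(members, count_once=False).values())
counted_once = sum(savings_by_program(members, count_once=True).values())
print(f"claimed across programs: ${double_counted:,}")
print(f"actually saved:          ${counted_once:,}")
```

In this toy data the programs together would claim $4,700 in savings on members who saved the plan only $2,500.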

Insurers themselves unintentionally alter ROI. Formulary changes that substitute drugs could affect efficacy. Hiking copayments for doctor visits and prescriptions makes it less likely patients will get the treatments that could keep their diseases in check, irrespective of other aspects of the DM program.

When insurers alter their networks, changing doctors or hospitals, patients may find they have to travel distances to receive care and so may not follow up. “If WellPoint were to shrink its network and get rid of tertiary care facilities, that would affect prior promised savings by the vendor,” Stark says.

DM programs work only if patients take the advice proffered. Sometimes, they just can’t. About 15 percent of working-age adults (ages 18–64) with private health insurance and at least one chronic condition did not purchase all of their prescriptions because of cost concerns, according to a 2005 study released by the Center for Studying Health System Change.

In spite of it all, physician leaders remain ebullient about DM because there is a vague notion that even if results are 1:1, that is sufficient. “If the ROI is a wash, would you still do the programs?” asks Fetterolf.

“Programs have other value. Employees like their employers who provide these services; physicians support these efforts to help them practice.” Adds William Gold, chief medical officer of Blue Cross & Blue Shield of Minnesota: “We don’t want to lose money, and I would have a problem if ROI was less than 1:1, but DM is at least better than the way the health system operates now, such as ordering an MRI that is not indicated or not following up with noncompliant patients.”

In 2004, Gold had no hesitation re-enlisting one of his company’s outside DM vendors when that contract renewal landed on his desk for the second time. “We’d had very favorable outcomes and we were confident of their methodologies,” Gold says, declining to name the niche vendor. This time around, though, Gold asked to drop performance guarantees, a move that reduced the rate charged to the insurer. In each of the previous years, the vendor had exceeded the guarantee. “We have no doubt the relationship will continue as it always has,” he says.

At contract renewal time, Aetna’s Popik also made changes. “It became clearer to us we needed to focus on the sickest members. We had to stratify members by risk and to focus activity on levels 1 and 2.” “Someone with the mildest form of chronic heart failure might receive some written information, while those with the most severe form would receive much more intensive intervention, including home visits,” said Popik at a DM conference at Duke University.

Employers contracting with Aetna also made changes born of experience. “They decided they did not want to pay for people who were not enrolled, so we developed a per-participant pricing schedule,” Popik says. “Our customers were saying, ‘I want you to have an incentive to sign members up and I’ll pay you more for those who sign.’”

The first time around, Cigna was concerned with financial indicators in DM contracts, but at renegotiation time, clinical improvements were equally or more important, Coloian says. “Not just ‘Did the patient have the lab test?’ but ‘What was the value of improvement?’”

While Norman was at PacifiCare, he renewed contracts with Alere. “We were getting value,” he says. However, one publicized contract that ended 10 months early in January was a PacifiCare pilot with Uncle Sam for congestive heart failure patients. “There were challenges in recruiting beneficiaries,” Norman says. He adds that massive companywide changes were occurring. PacifiCare was acquired by UnitedHealth and all of PacifiCare’s DM business was outsourced to Alere.

Time will tell

A young DM industry in the 1990s boasted huge savings that now appear incredible. “There was no intent to mislead. If results were glowing, it was incorrect methodology created by people who did not have experience in clinical trials. They were unsophisticated in the complexities of study design,” Fetterolf says. “We are in a different place today. Vendors realize that mocking up a colorful histogram demonstrating large positive outcomes is history,” Gold says.

A lot of bright ideas have come and gone in the health field. Physician practice management companies such as MedPartners and Phycor rode a brief wave of popularity in the 1990s, for instance. Time will tell whether DM will succeed or meet the same fate. Popik, though, is upbeat. “You have 78 million baby boomers about to develop chronic diseases. It’s a great opportunity for DM and for health plans.”

Before You Renew? 10 Cautionary Items To Look For in Vendor Contracts

Every vendor contract should include four major categories: financial outcomes, clinical outcomes, functional outcomes (patients functioning better or worse), and satisfaction outcomes (asking physicians and patients if they are satisfied).

In addition:

1. Ensure that vendors don’t double-count savings on comorbid patients registered in two or more different DM programs. Savings should be assigned to one DM program.

2. Requalify chronic patients annually, because some of those registered might have originally been false positives, meeting criteria for inclusion in one year but not another. Their continued inclusion in the chronic populations increases the number of members contributing to “savings” and reduces average PMPM cost. (Other industry experts believe that requalifying is unnecessary and accept that “once chronic, always chronic.”)

“If someone really does have the disease and doesn’t have claims in a year they still count,” says Al Lewis of the Disease Management Purchasing Consortium.

3. Ensure that both the baseline period for comparison and the DM intervention year are measured identically. For example, wait at least six months after the end of a year so that all claims can be submitted. This rule should be applied equally to both the baseline and intervention year.

4. Expect an adjustment for inflation, since it is becoming an industry standard for vendors to make an adjustment of 3–10 percent, sometimes known as a trend adjustment. That amount corrects for increases in hospital, emergency department, prescription, and other costs. Be careful how it is calculated. It should be reasonable in relation to the health plan’s experience and in relation to the expected experience of the chronic population.

5. If a comorbid patient is registered in both DM and case management, establish an objective method to attribute patient savings to one program or the other. Definitely not both.

6. Use guarantees as performance-improvement tools, increasing next year’s target when the current year target is met. Because health plans and employers don’t trust vendors’ methodology and calculations, they like guarantees. For example, the vendor must return $2 for every $1 spent in fees. Payers think this somehow assures validity. But if vendors use flawed methodology, they are more likely to meet numerical targets, and experience has shown that they rarely miss.

7. Consider applying a flat rate charge to the entire patient population. Typically, there are two ways vendors charge for DM. One is per enrolled member. In this way, fees escalate over time as more members are enrolled in DM, and fees can escalate sharply if members are not requalified annually. The other method is a fee for all members of a plan, for example, a $1 PMPM flat rate that applies to the eligible population whether members are enrolled or not. This method is more widely used because it’s simpler to calculate ROI and often results in lower fees.

8. Be wary of guarantee provisions. At the end of a contract, the vendor may owe money to the health plan, especially if there are contractual guarantees. A vendor sometimes places an item in the contract that says the money it owes is applied to fees owed by the health plan for the next contract period. That locks the plan into renewing, even if it is unhappy with the vendor. Owed amounts should be paid out within 30 days of the reconciliation.

9. If you want to be wholly comfortable with vendor methodologies and savings calculations, hire an independent auditor. Audited calculations can differ substantially from the written methodology vendors were supposed to employ. Actuarial and benefit consultants perform this function. They include Medstat, Mercer Consulting, Milliman USA, Reden & Anders, and Solucia.

10. Be straight with the vendors. They should have the right to adjust promised ROI down when information given by the health plan is not valid. For example, sometimes the members’ phone numbers that plans provide are wrong or missing. A plan can be late providing membership or claims data, or the data can be poor. Vendors should not be at risk for health plan data failures.

Ian Duncan is president of Solucia Consulting of Hartford, Conn. Solucia assists in contract negotiations and renegotiations between managed care organizations and DM vendors.
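Item 4’s warning about the trend adjustment can be made concrete. The Python sketch below assumes a simple trend-adjusted pre-post calculation: expected cost is the baseline per-member-per-month (PMPM) cost inflated by the trend, and savings are expected minus actual cost. The formula is one common-sense reading of the sidebar, not a method attributed to Duncan or to any vendor, and all inputs are hypothetical.

```python
def pre_post_roi(baseline_pmpm, actual_pmpm, trend, member_months, fees):
    """Trend-adjusted pre-post savings and the resulting ROI ratio.

    Expected cost is the baseline-year PMPM cost inflated by the trend
    adjustment; savings are expected minus actual cost over the
    intervention year.
    """
    expected_pmpm = baseline_pmpm * (1 + trend)
    savings = (expected_pmpm - actual_pmpm) * member_months
    return savings, savings / fees


# Hypothetical chronic population: 1,000 members for 12 months,
# $240,000 in annual vendor fees. Only the trend assumption changes.
for trend in (0.03, 0.10):
    savings, roi = pre_post_roi(
        baseline_pmpm=500.0,
        actual_pmpm=520.0,
        trend=trend,
        member_months=12_000,
        fees=240_000,
    )
    print(f"trend {trend:.0%}: savings ${savings:,.0f}, ROI {roi:.2f}:1")
```

With identical claims data, the 3 percent trend shows the program losing money and the 10 percent trend shows a positive return, which is why the trend figure should be checked against the plan’s own experience.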

When you need a vendor to evaluate your DM program, be sure that . . .

1. The evaluator provides DM program metrics, outcome metrics, and a method to demonstrate that the DM program caused the outcome.

2. The evaluator discloses his method in sufficient detail that an independent evaluator could reproduce the study and get the same results.

3. The method used is verified independently for credibility.

4. The evaluator has made a statement regarding potential conflicts of interests.

Thomas Wilson, PhD, is an epidemiologist and founder of the Population Health Impact Institute, which is concerned with evaluating programs that are intended to improve the health of populations. www.phiinstitute.org

For further reading

“Return on Investment in Disease Management: A Review,” by Ron Z. Goetzel, PhD, et al., Health Care Financing Review, Summer 2005, Vol. 26, No. 4.

“Leading Disease Management Organizations,” Summer 2005, Health & Disease Management Service, Health Industries Research Companies.

“Assessing Return on Investment of Defined Population Disease Management Interventions,” by Thomas W. Wilson, PhD, et al., Joint Commission Journal on Quality and Safety, November 2004, pp. 614–21.

“Estimating the Return on Investment in Disease Management Programs Using a Pre-Post Analysis,” by Donald Fetterolf, MD, MBA, et al., Disease Management, Vol. 7, No. 1, 2004.