Predictive Modeling Expands Its Book of Techniques
MANAGED CARE November 2005. ©MediMedia USA
Health plans and employers want to know more than just who needs help. They want to know who has the most to gain.
The traditional business case for predictive modeling is a simple one: It is easier to control costs and improve outcomes if you can pinpoint your most expensive patients months before they wind up in a health crisis.
Just how clearly that picture can be tuned in still depends on the quality and quantity of data that are available. But predictive modelers insist that recent trends in health care are solidly in their favor and are helping to broaden the use of predictive modeling tools for group rate-setting for small employers.
As each new hospital, regional health information network, and provider group wires up, new rivulets of electronic health information are being created that are flowing into the swelling current of data that is needed to make predictive modeling a more accurate tool. But the availability of more data is also influencing the types of applications that are available. Some health plans and employers want to know more than just who needs help; they want to know who has the most to gain.
"There's exciting software and technology that continue to emerge about how you use claims, pharmacy and even self-reported patient information to figure out who's getting sick," says Harry Leider, MD, chief medical officer of XLHealth, a disease management company that recently won one of the pilot DM projects for Medicare.
But predictive modeling experts are also quick to note that, despite the gains, plenty of shortcomings remain in this field. Predictive modeling tools are still likely to be proprietary black boxes, a distinct disadvantage in an age where transparency and standardization have become a mantra for the health care technology field. Others insist that the data available on patients simply aren't good enough to offer a clear assessment of health needs. Codes are still often outdated or incomplete, and individual health survey information is often necessary to fill the gaps that appear when data are collected from claims.
Acutely conscious of the doubts that persist, modeling companies are fighting back with peer-reviewed literature, third-party reviews, and data mining contests that rank the best in the field. New pilot studies, meanwhile, are also in the works. And the Disease Management Association of America has created a work group to help highlight the best use of analytics that can help develop predictive modeling into a more refined tool for case management.
Better data, better margins
"Traditional predictive modeling takes claims history and diagnoses and looks at prescription use and throws that in a model that predicts whether you are in a high-cost category or not, or what your cost is going to be," says David Knutson, the director of health systems studies at the Health Research Center of Park Nicollet Institute, who recently completed a study of the adoption of predictive models by health plans.
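The traditional approach Knutson describes can be sketched in a few lines. Everything below is illustrative: the feature names, weights, and threshold are invented for this example, not taken from any vendor's model.

```python
# Minimal sketch of a claims-based risk model: score each member from
# prior-year claims features, then flag the likely high-cost category.
# Weights are made-up illustrative values, not a calibrated model.

def risk_score(member):
    """Score a member from prior-year claims and pharmacy features."""
    weights = {
        "distinct_diagnoses": 0.30,  # more coded conditions -> higher risk
        "inpatient_admits":   0.90,  # prior admissions are strong predictors
        "rx_groups":          0.20,  # breadth of prescription drug classes
        "er_visits":          0.50,
    }
    return sum(weights[k] * member.get(k, 0) for k in weights)

def flag_high_cost(members, threshold=3.0):
    """Return IDs of members predicted to fall in the high-cost category."""
    return [m["id"] for m in members if risk_score(m) >= threshold]

members = [
    {"id": "A", "distinct_diagnoses": 2, "rx_groups": 1},
    {"id": "B", "distinct_diagnoses": 8, "inpatient_admits": 2,
     "rx_groups": 4, "er_visits": 1},
]
print(flag_high_cost(members))  # ['B']
```

Real systems replace the hand-set weights with coefficients fit by regression on historical claims, but the flow — claims features in, cost category out — is the same.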
To hear predictive modelers talk, the data being "thrown in" keeps getting better.
"The data have gotten better, as we've shown," says Marilyn Kramer, president of DXCG, one of the best-known predictive modeling companies in the field. "People are collecting more data electronically. We published a paper that basically says that for a variety of reasons, the number of diagnosis codes per person has increased in recent years as providers and hospitals record more diagnoses on their claims. This improved coding," she says, "improves model performance."
"Codification is improving," agrees John Haughton, MD, the former chief medical officer of DXCG who is now running DocSite, which offers clinical IT software. "If you look at what's happening in the banking industry, there are a number of things that are automated that make core transactions like receiving funds better now than they were. When ATMs came out, people were scared to use them; now routine transactions don't go to tellers."
The same process is working its way through health care, he says. It is increasingly routine to get prescriptions refilled or ask physicians questions electronically. And his company is helping to capture that information — at the point of service — so that it can be used for better care and ultimately better predictability.
Pharmacy data have become increasingly available in recent years, adds Kramer, and as the health care industry continues to respond to a push for standardized — and therefore more easily accessed — data, her job gets easier.
More plans and employers also understand that the data need to continue to improve, and are willing to start assigning pay-for-performance money to ensure that the information being analyzed is of high quality.
"When people start paying attention to the reports, they pay attention to the data," Kramer says. Add bonuses for good data, she says, and physicians will be more careful in their coding.
Scoring health status
Predictive modeling has also been branching out, Kramer adds. It's no longer just about tagging groups of patients that are likely to suffer an expensive health episode in the coming year. Health plans want to use the data to see how they should set premiums for small groups.
"They're using risk scores to look at the health status of the population of their smaller groups, under 250 lives," adds Kramer, who is currently at work for one of the nation's Blues plans on just such a task. "And they want to ask themselves, 'Should I be pricing at preferred, regular, or nonstandard rates?'"
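The small-group rating use Kramer describes reduces to mapping a group's average risk score to a pricing tier. This is a hedged sketch: the tier cutoffs and the convention that 1.0 equals the book-of-business average are assumptions for illustration, not any plan's actual underwriting rules.

```python
# Illustrative small-group rating: map a group's mean member risk score
# (1.0 = book-of-business average, by assumption) to a pricing tier.

def group_tier(member_risk_scores, preferred_max=0.85, standard_max=1.15):
    """Return the pricing tier for a small group. Cutoffs are invented."""
    avg = sum(member_risk_scores) / len(member_risk_scores)
    if avg <= preferred_max:
        return "preferred"
    if avg <= standard_max:
        return "regular"
    return "nonstandard"

print(group_tier([0.6, 0.7, 0.9]))       # healthier than average: preferred
print(group_tier([1.4, 1.8, 0.9, 1.6]))  # sicker than average: nonstandard
```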
"Some are reporting that it's giving them enough of a prediction advantage that they can price products at levels to gain business and market share but still make a margin — a 1 to 2 percent improvement in bottom line compared to traditional means of underwriting," says Knutson, a former director of provider contracting for Aetna.
But as predictive modeling grows more sophisticated, the data detectives are going for bigger returns. They want to find opportunities for savings. "Predictive modeling can be used to highlight impactability," says Karen Fitzner, PhD, head of research and program development for the Disease Management Association of America, "not only the patient with the highest risk or need, but which patients are most likely to be at a point to be ready to change. They have to be at a moment where they are willing to make a behavioral change." Predictive modelers have been incorporating more behavioral health methods — better understanding the patients' motives — to help refine the process, she adds.
"It gets very frustrating if you identify someone in a disease management program or health plan that you can't do anything about," says Steve Epstein, CEO of Medai (for medical artificial intelligence), the predictive modeling company. "We developed an acute impact index, not just telling you who's at high risk next year but who's at high risk that there's something you can do about it — what types of treatments or gaps in care exist. In the literature there are a variety of ways to identify a person at risk of diabetes — blood sugar, A1C — but you also need to identify why they are at risk; not getting testing, their diet, or whatever. You can say, here's a person at risk for $10,000 in care, but you can cut it down with testing or a change in diet."
Many predictive modelers have also learned to be selective about their data recipes, depending on the use. In modeling for capitation contracts, says Knutson, adding utilization data could create a perverse incentive for bad medicine.
"It could reward an inefficient provider who admits too often," he says, "or a low quality provider when heart failures could have been prevented. Lots of admissions last year means lots of admissions next year. You could be rewarding poor performance."
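The "impact index" Epstein describes can be sketched as predicted cost weighted by how much of that cost looks addressable. The care-gap names and the fraction of cost each gap puts in play are invented assumptions here, not Medai's actual index.

```python
# Illustrative impact index: rank members by predicted cost scaled by the
# share of that cost tied to closable care gaps. All weights are invented.

ADDRESSABLE = {            # hypothetical fraction of cost each gap puts in play
    "no_a1c_test": 0.25,
    "no_eye_exam": 0.10,
    "poor_diet":   0.15,
}

def impact_index(predicted_cost, gaps):
    """Predicted cost weighted by the addressable share of that cost."""
    addressable_share = min(1.0, sum(ADDRESSABLE.get(g, 0) for g in gaps))
    return predicted_cost * addressable_share

# A $10,000-risk diabetic with missed testing and diet issues outranks a
# $12,000-risk member with no actionable gaps.
print(impact_index(10_000, ["no_a1c_test", "poor_diet"]))  # 4000.0
print(impact_index(12_000, []))                            # 0.0
```

Note what the index deliberately leaves out: raw utilization counts like last year's admissions, which — as Knutson warns — would reward the providers who admit most.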
Between chaos and connections
Predictive modeling technology hasn't been developing in a vacuum, says Jim Kerr, vice president for business development at the Haelen Group. "When you think about predictive modeling, it's been around in different industries for a long time. Like the weather. Weather technology is better than it was. In our space, I feel it's the same way. If you continue to improve the algorithm, you can better identify who needs and wants help."
Exactly how that algorithm is shaped, though, varies from company to company. And few pull back the curtain to give everyone a clear view.
"It's a black box," notes Knutson. "Lack of transparency means that researchers like me find it difficult to evaluate performance. It's difficult to get away with black boxes, especially when the models are used for physician payment or underwriting."
In a field where payers still question returns, predictive modelers are taking a variety of routes to examine their performance or find some other way to gain kudos for their work. But the DMAA's Fitzner, for one, doesn't believe standardization is absolutely necessary. "If you go to different restaurants, chefs won't tell you how they prepare their beef."
Kerr knows that once you start talking about a proprietary system in predictive modeling, proving that it works as advertised is always going to be an issue. The Haelen Group's solution: third-party reviews. In several cases, he says, that's involved the Regenstrief Institute in Indianapolis. Kramer feels that submitting peer-reviewed studies helps substantiate DXCG's work. For Epstein, that approach means entering data-mining competitions, even if they're not strictly about health care. As Kerr relates his work to weather predictions, Epstein touts his company's recent win in using data to accurately predict which in a list of companies would wind up in bankruptcy. It's all about the same thing: using refined analytics to interpret data.
Filling in the gaps
For a whole contingent of companies in the DM field, though, there still isn't a convincing enough case that data mining by itself can dig deep enough. One of the chief stumbling blocks, says Leider, is the simple fact that there is no single, conclusive method yet devised to demonstrate its effectiveness.
"I don't know that there's been a ton of evidence that using predictive modeling improves predictability," says Leider. "Almost every organization using predictive modeling is trying to determine whether it adds value."
For XLHealth, it's a critical question. XLHealth was one of nine disease management organizations selected for a Medicare pilot study that will prove critical to winning overall acceptance, as well as access to billions of dollars in federal contracts for treating the sickest of the elderly.
"Our approach is to try to integrate not only claims data but information we get by interacting with the patient — in most cases face to face — then restratify, or predict, whether we think they're at critical risk of a heart attack or an amputation based on what we learned," adds Leider. "We ask them directly, examine their feet for neuropathy, because we know that if a diabetic patient has neuropathy, the likelihood of amputation is much higher.
"You can't get that from software. Doctors don't code for neuropathy. Our programs have reduced the rate of amputations — which are horrible things and expensive — by 40 to 50 percent."
"Most folks who have identified people for predictive modeling relied on claim data," says Kerr. "But there's a lot of evidence that claim data are not a good bet for finding people who are going to be expensive. Claim data can land 6, 12, or 18 months after the actual incident." As a result, the Haelen Group also relies more on information from surveys. "It's more current, more accurate. A person could complete a survey with us today and in a few days be involved with a health coach."
To get that, says Kerr, the company has developed a proprietary survey that collects a blend of medical information, symptoms and mood checks so that coaches can address everything from back pain to problems at home that interfere with proper care.
It all comes down to simple economics, says Kramer. In case management, it makes sense to go from 10,000 names to 100 individuals, and then start making the phone calls to gather the extra psycho-social information that can be critical in managing a high-cost patient. And while it may make sense to call 10 percent of the population, most can't do that. "Most health plans can't staff beyond one half of one percent," Kramer says.
But for employers, says Kerr, pure numbers don't always add up to what they're looking for.
"In some cases it can be as simple as benchmarking with certain data points that we can look at with large, medium, or small clients. But there's also a touchy-feely approach. A lot of time is spent reviewing how many people went through the program, how many got coaching. And we share stories of coaching people found through predictive modeling. Employers have gravitated to that. It's as important as, if not more important than, a financial outcome."
Better input, better output
DXCG found that better coded claims data over a two-year period unveiled more diagnoses and better information on drug use. Both are essential for predictive modeling.
**Coding characteristics in a privately insured population: '97–'98 vs. '98–'99* and resulting impact on DXCG predictive model groups**

| | '97–'98 | '98–'99 | Change** |
|---|---|---|---|
| Number of people | 1,083,405 | 1,292,288 | 19.3% |
| **Coding characteristics — over one year** | | | |
| Percent of people with at least one diagnosis | 71.9% | 73.9% | 2.8% |
| Mean number of diagnoses per person | 10.9 | 11.6 | 6.6% |
| Mean number of distinct diagnoses per person | 3.71 | 3.99 | 7.5% |
| Mean number of hierarchical condition categories per person | 2.37 | 2.52 | 6.3% |
| Percent of people with at least one prescription | 63.9% | 66.4% | 3.9% |
| Mean number of Rx groups per person | 2.41 | 2.46 | 2.1% |

Source: Zhao Y, Ash AS, Ellis RP, et al. Predicting pharmacy costs and other medical costs using diagnoses and drug claims. Med Care. 2005;43(1):34–43.
Note: Hierarchical Condition Categories are used in the DCG predictive models and RxGroups are used in the DXCG pharmacy-based predictive models.
* For people with at least one month of eligibility in both year 1 and year 2 in the MarketScan Research Database.
** All differences in means between years are significant at the p<0.0001 level after correcting for correlation induced by panel design.