How We Did It: Cracking the Codes

Fred Schulte, Joe Eaton, David Donald and Gordon Witkin of the Center for Public Integrity have won 2012’s Philip Meyer Award, which recognizes the best journalism done using social science research methods. Here the Center’s data editor, ICIJ member David Donald, explains the computer-assisted analysis and reporting behind the story.

Cracking the Codes documented how thousands of medical professionals have steadily billed Medicare for more complex and costly health care over the past decade – adding $11 billion or more to their fees – despite little evidence elderly patients required more treatment. The series also uncovered a broad range of costly billing errors and abuses that have plagued Medicare for years – from confusion over how to pick proper payment codes to apparent overcharges in medical offices and hospital emergency rooms. The findings strongly suggest these problems, known as “upcoding,” are worsening amid lax federal oversight and the government-sponsored switch from paper to electronic medical records.

So how did we get this done? A lot of hard work and traditional news reporting for sure. But it also required a major data management and computer-assisted analysis effort.

Sorry if it’s a bit nerdy, but please consider yourself warned.

We obtained 10 years of Medicare Part A and Part B claims data from the Centers for Medicare & Medicaid Services (CMS), a division of the Department of Health and Human Services. All told, with supporting files, this came to about 712 million claims, or about 1.8 terabytes of data, when imported into the Center’s storage area network (SAN).

To gain access to the data, we first filed a Freedom of Information request to CMS. When CMS didn’t respond, we called CMS. Its response was that the data were not “FOIA-able” and that we should go through the process of acquiring the data the same way researchers and consultants do, by buying data from CMS. The estimated cost for the data we wanted was about $97,000.

When CMS wouldn’t budge on the price, we sued CMS in U.S. District Court under the Freedom of Information Act. This led to a negotiated settlement in which we purchased the data we requested for $12,000. The Wall Street Journal split the final costs with the Center and gained access to the data as well.

The major analysis was done on Medicare claims data from 2001-2008 from what CMS calls its “Limited Data Set,” a scientific, random 5 percent sample of patients and the Medicare Part B claims filed for services to those patients. Using hospital outpatient claims for the emergency room analysis and the “carrier” files that individual doctors use for their billing, we looked at about 133 million claims that listed one of 14 sets of Evaluation and Management codes, those most prone to upcoding. We also used the CMS denominator files of all Medicare patients to give us a baseline for the entire Medicare population.

To work with individual Medicare claims, researchers also need the Current Procedural Terminology (CPT) codes, which are purchased from the American Medical Association, along with ICD-9 or ICD-10 diagnostic codes and the physician UPIN and NPI codes used to track individual doctors in the database. The physician codes are public records.

Many other government documents also were used to document a decade-long pattern of medical coding and billing abuses. These records included U.S. Department of Health and Human Services Inspector General audits, Medicare carrier audits, Congressional hearing testimony, federal court lawsuits and criminal prosecutions involving “upcoding” and other records. We also drew on medical journal articles and Centers for Disease Control and Prevention data indicating that Medicare patients over time have not grown sicker and older and that the amount of time doctors spent with patients has not increased. These documents cast doubt on the claims of many doctors and hospitals that higher billing is justified because patients have become more infirm and more complex to treat over time.

We had two major areas of research that were necessary to pin down the story. First, we had to show that physicians, medical clinics and hospitals had been “upcoding” throughout the last decade. Then, we needed to calculate how much money the inflated coding had cost the American taxpayer.

We turned to medical research literature to identify how social scientists have measured upcoding. Previous work suggested looking at the distribution of codes year by year. If upcoding were occurring within a single set of codes, for example the five Evaluation and Management (E&M) codes used in the emergency room, the distribution would skew left, with the mean and mode shifting right toward higher-paying codes as the decade progressed.

To replicate this technique, we dove into the 5 percent random sample of 10 years of Medicare claims. From that database, we extracted more than 133 million claims carrying codes for E&M procedures, nearly 375 gigabytes of data. E&M codes are those most open to “upcoding” abuse by hospitals and doctors. They are code groups that provide incremental billing, based on how much time the provider spent with the patient and how complicated the procedure was. For example, an emergency room visit should be coded for just a small sum of money if it involved little time and a basic evaluation, but for a lot more money if the provider spent a lot of time on an emergency medical procedure.

We took the claims in 14 sets of E&M codes year by year and examined their distribution, in particular how closely they approached a normal curve. As the medical research literature suggested would happen if upcoding were occurring, in nearly every set of codes the “peak” of the curve shifted right toward higher codes. This indicated that coders were using higher codes more often and reducing their use of lower-paying codes. We could plainly see these results when we placed the normal curves visually on top of each other.
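The year-by-year comparison described above can be sketched in a few lines of Python. This is only an illustration, not the Center’s actual code: the claim tuples, the function names and the 1-to-5 level scale are assumptions made up for the example, and the numbers are not Medicare data.

```python
from collections import Counter, defaultdict

# Hypothetical claims as (year, E&M level 1-5); invented for illustration only.
claims = [
    (2001, 1), (2001, 2), (2001, 3), (2001, 3), (2001, 4),
    (2008, 2), (2008, 3), (2008, 4), (2008, 4), (2008, 5),
]

def code_shares_by_year(claims, levels=(1, 2, 3, 4, 5)):
    """Return {year: {level: share of that year's claims billed at that level}}."""
    counts = defaultdict(Counter)
    for year, level in claims:
        counts[year][level] += 1
    shares = {}
    for year, counter in counts.items():
        total = sum(counter.values())
        shares[year] = {lvl: counter[lvl] / total for lvl in levels}
    return shares

def mean_level(share):
    """Average code level, weighted by each level's share of claims."""
    return sum(level * p for level, p in share.items())

def modal_level(share):
    """The most frequently billed level, i.e. the 'peak' of the distribution."""
    return max(share, key=share.get)

shares = code_shares_by_year(claims)
for year in sorted(shares):
    print(year, round(mean_level(shares[year]), 2), modal_level(shares[year]))
```

In this toy data both the mean and the peak move to a higher code between the two years, which is the rightward shift the literature describes.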

We performed significance tests to check the results, but with such a large sample size, it was not surprising that the shifts were statistically significant.
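The article does not say which tests were run. As one plausible example, a Pearson chi-square test can compare the code mix in two years, and the sketch below also shows why significance is nearly guaranteed with very large samples: the same small shift in proportions is far from significant with 200 claims but strongly significant with 200,000. The counts are invented, and the critical value 9.488 is the standard chi-square cutoff for 4 degrees of freedom at the 0.05 level.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a table of counts
    (rows = years, columns = code levels)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat

# Critical value for (2 - 1) * (5 - 1) = 4 degrees of freedom, alpha = 0.05.
CRITICAL_DF4_05 = 9.488

# Hypothetical counts per code level in two years: a very slight shift upward.
small = [[10, 20, 40, 20, 10],
         [9, 19, 40, 21, 11]]
# Identical proportions, 1,000 times as many claims.
large = [[n * 1000 for n in row] for row in small]

stat_small = chi_square_statistic(small)
stat_large = chi_square_statistic(large)
print(stat_small > CRITICAL_DF4_05, stat_large > CRITICAL_DF4_05)
```

Because the statistic grows in proportion to the number of claims, a significant result on millions of records says little by itself about how large the shift is.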

Medical providers responded that we were seeing such a pattern because their patients were getting older and sicker. So we controlled for age and diagnosis and the pattern remained the same. Every independent expert we showed our results to said, “That’s upcoding.”
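How the controls were applied is not spelled out here. One common approach, assumed for this sketch, is stratification: compare coding only among patients in the same age band and diagnosis group, so a change in patient mix cannot explain the shift. The rows, age bands and diagnosis labels below are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: (year, age band, diagnosis group, E&M level).
rows = [
    (2001, "65-74", "A", 2), (2001, "65-74", "A", 3),
    (2001, "75+", "B", 3), (2001, "75+", "B", 4),
    (2008, "65-74", "A", 3), (2008, "65-74", "A", 4),
    (2008, "75+", "B", 4), (2008, "75+", "B", 5),
]

def stratum_shifts(rows, first_year, last_year):
    """Change in mean E&M level within each (age band, diagnosis) stratum."""
    totals = defaultdict(lambda: [0, 0])  # (stratum, year) -> [sum of levels, claim count]
    for year, age_band, diagnosis, level in rows:
        cell = totals[((age_band, diagnosis), year)]
        cell[0] += level
        cell[1] += 1
    shifts = {}
    for stratum in {key[0] for key in totals}:
        first = totals.get((stratum, first_year))
        last = totals.get((stratum, last_year))
        if first and last:  # compare only strata present in both years
            shifts[stratum] = last[0] / last[1] - first[0] / first[1]
    return shifts

shifts = stratum_shifts(rows, 2001, 2008)
for stratum, change in sorted(shifts.items()):
    print(stratum, change)
```

If the mean level rises inside every stratum, as it does in this toy data, then older or sicker patients alone do not account for the pattern.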

Finally, we ran a model of the E&M costs to Medicare in which the proportion of coding remained constant during the decade and compared that to the actual total payment for E&M codes. We controlled for denied and modified codes. A conservative estimate of the difference – about $11 billion over the decade – was adjusted for inflation. That’s a minimum of what it is costing taxpayers. It’s likely a lot more.
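The core arithmetic of such a constant-mix model can be sketched as follows. The sketch assumes one payment rate per code level and ignores the inflation adjustment and the handling of denied and modified codes mentioned above; the counts, payment amounts and function name are hypothetical.

```python
def constant_mix_gap(base_counts, year_counts, payment_by_level):
    """Actual payments in a year minus what the same number of claims
    would have cost had the base-year mix of codes persisted."""
    base_total = sum(base_counts.values())
    year_total = sum(year_counts.values())
    actual = sum(n * payment_by_level[lvl] for lvl, n in year_counts.items())
    expected = sum(
        (n / base_total) * year_total * payment_by_level[lvl]
        for lvl, n in base_counts.items()
    )
    return actual - expected

# Hypothetical claim counts per E&M level and payment per claim (dollars).
base_counts = {1: 10, 2: 20, 3: 40, 4: 20, 5: 10}
year_counts = {1: 5, 2: 15, 3: 35, 4: 30, 5: 15}
payment_by_level = {1: 20, 2: 40, 3: 60, 4: 100, 5: 150}

gap = constant_mix_gap(base_counts, year_counts, payment_by_level)
print(round(gap, 2))
```

Summing a gap like this across years and code sets gives a total for the extra cost attributable to the change in coding mix.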

Prior to starting the analysis, we made two trips to the Dartmouth Institute, which has done extensive medical research for decades using these data. We not only learned about the ins-and-outs of using these data, but we also established relationships with some of the directors and researchers at the Institute.

Because we focused more on fraud and Medicare costs, not medical practices or effectiveness, we found Dr. Jonathan Skinner, a medical economist at the Institute, most helpful. He could point us to literature, as well as confirm that our methods were accepted social science practice.

When we finished the results that revealed upcoding practices, Dr. Skinner suggested a micro look at the physician-level data. We ran thousands of sparkline charts on the physicians who showed the most upcoding year by year. In almost all cases, the shapes of the sparklines again showed a gradual shift to higher codes throughout the decade rather than a pattern of mostly flat-lined coding with abnormal spikes. Dr. Skinner suggested these micro patterns help confirm that each macro pattern in a set of E&M codes is upcoding.
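To show what separates a gradual shift from a flat line with a spike, here is a minimal text sparkline in Python. The tool actually used for the charts is not named in this piece, so the block characters, function names and yearly values (a physician’s percentage of claims billed at the highest code) are illustrative assumptions.

```python
BARS = "▁▂▃▄▅▆▇█"

def sparkline(values):
    """Render a sequence of numbers as a one-line bar chart."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return BARS[0] * len(values)  # flat series: all bars at the lowest height
    scale = (len(BARS) - 1) / (hi - lo)
    return "".join(BARS[round((v - lo) * scale)] for v in values)

def rising_steps(values):
    """Count the year-over-year increases in a series."""
    return sum(later > earlier for earlier, later in zip(values, values[1:]))

# Hypothetical yearly percentage of claims billed at the top code.
gradual = [10, 20, 30, 40, 50, 60, 70, 80]
spike = [10, 10, 10, 80, 10, 10, 10, 10]

print(sparkline(gradual), rising_steps(gradual))
print(sparkline(spike), rising_steps(spike))
```

A steady climb rises in nearly every year, while a one-off spike rises only once, so the two shapes are easy to tell apart at a glance even across thousands of physicians.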

We then took all of the macro patterns and showed them to various researchers working at Washington, D.C., think tanks who have experience analyzing Medicare and other medical costs. They also confirmed from their experience that our results were evidence of systemic upcoding.

And that’s one way to “crack the codes.”
