Unbiased Analysis of Today's Healthcare Issues

Add to Your Skills Toolkit: The Oaxaca Decomposition

Written By: Jason Shafrin - Jan• 26•12

Suppose you look at health care spending in two different regions and observe a significant difference. You may want to know what the cause of this difference is. Is it because one region has a mix of people who are sicker, or is it because one region treats patients with a given disease more intensively?

One way to answer this question is to use the Oaxaca decomposition. This approach was originally formulated by Ronald Oaxaca. This document provides a nice overview of how to use the Oaxaca decomposition; below, I apply that framework to the health spending case.

Differences in Health Spending

Assume that there are two regions: Region A and Region B. The spending for the two regions can be modeled using a linear regression framework:

  • Y_A = β_A X + ε_A
  • Y_B = β_B X + ε_B

The Y term represents spending and the variable X represents the patient’s health status. Health status could be measured as a vector of factors or as a single indicator (e.g., healthy or sick). The term β describes how much an area spends on medical resources to treat a patient with a health status of X. Assuming the error term averages to zero within each region, the average difference in spending per person between the two regions is:

  • Y_A – Y_B = β_A X_A – β_B X_B

where Y_A and Y_B now denote average spending per person, and X_A and X_B the average case mix, in regions A and B respectively.
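To make this concrete, here is a minimal Python sketch that is not taken from the overview mentioned above. The patient records, the dollar amounts, and the function name fit_region are all hypothetical, and health status is assumed to be a single sick/healthy indicator plus a constant. It fits the regression separately in each region and checks that the gap in average spending matches β_A X_A – β_B X_B.

```python
from statistics import mean

# Hypothetical patient-level data: (sick indicator, annual spending in dollars).
REGION_A = [(0, 1900), (0, 2100), (0, 2000), (1, 10500), (1, 11500)]
REGION_B = [(0, 1700), (0, 1900), (0, 1800), (0, 1800), (1, 9800)]


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def fit_region(data):
    """OLS of spending on a constant and a binary sick indicator.

    With one binary regressor, the OLS intercept is mean spending among
    the healthy and the slope is the sick-minus-healthy difference in means.
    Returns (beta, x_bar, y_bar); beta and x_bar are ordered [constant, sick].
    """
    healthy = [y for sick, y in data if sick == 0]
    ill = [y for sick, y in data if sick == 1]
    intercept = mean(healthy)
    slope = mean(ill) - intercept
    beta = [intercept, slope]
    x_bar = [1.0, len(ill) / len(data)]  # average case mix
    y_bar = mean(y for _, y in data)     # average spending per person
    return beta, x_bar, y_bar


beta_a, x_a, y_a = fit_region(REGION_A)
beta_b, x_b, y_b = fit_region(REGION_B)

observed_gap = y_a - y_b
implied_gap = dot(beta_a, x_a) - dot(beta_b, x_b)
print(observed_gap, implied_gap)
```

With these made-up records, both numbers come out to 2,200 per person (up to floating-point rounding), because a regression that includes a constant reproduces each region's average exactly.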

Determinants of Health Spending Differentials

Now the question is whether case mix or spending practices conditional on case mix is the key driver of the differences in spending between regions A and B. One can differentiate these two components using the following Oaxaca Decomposition:

  • Y_A – Y_B = ΔX β_B + Δβ X_A
  • Y_A – Y_B = ΔX β_A + Δβ X_B

Here ΔX = X_A – X_B is the difference in average health status and Δβ = β_A – β_B is the difference in coefficients. In the first equation, the differences in health status (the X’s) are weighted by the coefficients from region B and the differences in the coefficients are weighted by the X’s from region A. In the second, the differences in the X’s are weighted by the coefficients from region A and the differences in the coefficients are weighted by the X’s from region B.
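The two weighting schemes can be compared side by side in a short Python sketch. The coefficient vectors and case-mix averages below are made-up numbers (a constant plus the share of patients who are sick), and twofold is a hypothetical helper name, not part of any library.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def twofold(beta_a, x_a, beta_b, x_b):
    """Return both two-term decompositions of the gap in average spending.

    Each result is a pair: (health status term, coefficient term).
    """
    d_x = [a - b for a, b in zip(x_a, x_b)]           # difference in case mix
    d_beta = [a - b for a, b in zip(beta_a, beta_b)]  # difference in coefficients
    first = (dot(d_x, beta_b), dot(d_beta, x_a))      # B's coefficients, A's X's
    second = (dot(d_x, beta_a), dot(d_beta, x_b))     # A's coefficients, B's X's
    return first, second


# Assumed values, ordered [constant, sick indicator].
beta_a, x_a = [2000.0, 9000.0], [1.0, 0.4]
beta_b, x_b = [1800.0, 8000.0], [1.0, 0.2]

gap = dot(beta_a, x_a) - dot(beta_b, x_b)
first, second = twofold(beta_a, x_a, beta_b, x_b)

print("gap:", round(gap, 2))
print("first equation:", [round(term, 2) for term in first])
print("second equation:", [round(term, 2) for term in second])
```

Under these assumed numbers both equations add up to the same 2,200 gap, but the first attributes 1,600 of it to health status while the second attributes 1,800.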

There are three factors that affect the difference in health spending between the regions: i) differences in health status across regions, ii) differences in treatment patterns conditional on health status, and iii) the interaction of health status and conditional treatment effects. One can see this clearly below:

  • Y_A – Y_B = ΔX β_B + Δβ X_B + ΔX Δβ
  • Y_A – Y_B = H + T + HT

The equations above show the health status effect (H = ΔX β_B), the treatment effect (T = Δβ X_B) and the interaction (HT = ΔX Δβ).
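As an illustration of the three-term version, the following Python sketch computes H, T and HT separately. The inputs are invented for the example (a constant plus the share of sick patients in each region), and threefold is a hypothetical function name.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def threefold(beta_a, x_a, beta_b, x_b):
    """Split the gap in average spending into H, T and HT.

    H  : case-mix difference valued at region B's coefficients
    T  : coefficient difference valued at region B's case mix
    HT : case-mix difference times coefficient difference
    """
    d_x = [a - b for a, b in zip(x_a, x_b)]
    d_beta = [a - b for a, b in zip(beta_a, beta_b)]
    h = dot(d_x, beta_b)
    t = dot(d_beta, x_b)
    ht = dot(d_x, d_beta)
    return h, t, ht


# Assumed values, ordered [constant, sick indicator].
beta_a, x_a = [2000.0, 9000.0], [1.0, 0.4]
beta_b, x_b = [1800.0, 8000.0], [1.0, 0.2]

h, t, ht = threefold(beta_a, x_a, beta_b, x_b)
gap = dot(beta_a, x_a) - dot(beta_b, x_b)

print("H:", round(h, 2), "T:", round(t, 2), "HT:", round(ht, 2))
print("sum:", round(h + t + ht, 2), "gap:", round(gap, 2))
```

With these hypothetical inputs, H is 1,600, T is 400 and HT is 200, which together account for the full 2,200 gap.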

The specification chosen for the Oaxaca decomposition determines whether the interaction effect is placed with the health status effect or the treatment effect.  More precisely:

  • Y_A – Y_B = ΔX β_B + Δβ X_A = H + (HT + T)
  • Y_A – Y_B = ΔX β_A + Δβ X_B = (H + HT) + T

In effect, the first decomposition specification incorporates the interaction term with the treatment effect (since Δβ X_A = Δβ X_B + ΔX Δβ), whereas the second specification places the interaction term together with the health status effect (since ΔX β_A = ΔX β_B + ΔX Δβ).
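To see why the choice of specification matters in practice, this small Python sketch reports the share of the gap credited to health status under each grouping. The values of H, T and HT are assumed for illustration, and health_status_share is a hypothetical helper.

```python
def health_status_share(h, t, ht, interaction_with="treatment"):
    """Share of the total gap attributed to health status.

    interaction_with="treatment" mirrors the grouping H + (HT + T);
    interaction_with="health" mirrors the grouping (H + HT) + T.
    """
    total = h + t + ht
    if total == 0:
        raise ValueError("there is no spending gap to decompose")
    if interaction_with == "treatment":
        return h / total
    if interaction_with == "health":
        return (h + ht) / total
    raise ValueError("interaction_with must be 'treatment' or 'health'")


# Assumed component values, in dollars per person.
h, t, ht = 1600.0, 400.0, 200.0

share_first = health_status_share(h, t, ht, interaction_with="treatment")
share_second = health_status_share(h, t, ht, interaction_with="health")

print(round(share_first, 3), round(share_second, 3))
```

For these assumed components, health status explains roughly 73% of the gap under the first specification and roughly 82% under the second, so the larger the interaction term, the more the two specifications disagree.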

