Calculating Marginal Effects with Discrete Variables

Sometimes a coefficient isn’t what it seems to be. In an ordinary least squares (OLS) regression, each coefficient gives the amount by which the dependent variable changes when the corresponding independent variable increases by one unit, holding the other variables fixed. Regression coefficients are more difficult to interpret, however, in nonlinear specifications such as probit, multinomial logit, and negative binomial models.

Let us assume that the regression coefficients are fitted to the following equation:

y=f(Xβ)=f(β₀+β₁x₁+…+β_kx_k)

For instance, in the probit model, f(·)=Φ(·), where Φ(·) is the standard normal cdf. In the OLS setting, f is the identity function, so ∂y/∂x_k=β_k. Thus, there are no interaction effects in OLS unless they are explicitly stated as one of the variables in X. However, in the probit case, ∂y/∂x_k=φ(Xβ)β_k, where φ(·) is the pdf of the standard normal distribution. Thus, the marginal effect of x_k differs depending on the values of all the variables in X.
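To make the dependence on x concrete, here is a minimal sketch that evaluates φ(Xβ)β_k at two different covariate values. The coefficients are made-up illustrative numbers, not estimates from any data set, and the function name is hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1

def probit_marginal_effect(beta, x, k):
    """Return phi(x'beta) * beta[k]; x[0] is assumed to be 1 for the intercept."""
    index = sum(b * v for b, v in zip(beta, x))
    return STD_NORMAL.pdf(index) * beta[k]

# Hypothetical coefficients: intercept and one continuous regressor.
beta = [-0.5, 0.8]

me_at_0 = probit_marginal_effect(beta, [1.0, 0.0], k=1)  # evaluated at x_1 = 0
me_at_2 = probit_marginal_effect(beta, [1.0, 2.0], k=1)  # evaluated at x_1 = 2

print(f"marginal effect at x_1=0: {me_at_0:.4f}")
print(f"marginal effect at x_1=2: {me_at_2:.4f}")
```

The coefficient is the same in both calls, yet the two marginal effects differ because φ(·) is evaluated at a different index each time.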

The standard solution to this problem is to calculate the marginal effects when x is set equal to its mean value. When x_k is a dummy variable (i.e., x_k∈{0,1}), the marginal effect is calculated by setting the other variables x_-k equal to their means and then finding the difference in y when x_k increases from 0 to 1. This difference is the marginal effect for the discrete variable.
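The calculation for a dummy variable at the mean can be sketched as follows for a probit model. The data rows and coefficients are hypothetical, chosen only to show the mechanics; in practice β would come from a fitted model.

```python
from statistics import NormalDist, mean

STD_NORMAL = NormalDist()

def discrete_effect_at_means(beta, rows, k):
    """Phi(x'beta) with x_k=1 minus Phi(x'beta) with x_k=0, other covariates at their means."""
    means = [mean(column) for column in zip(*rows)]  # column-wise sample means
    on, off = means.copy(), means.copy()
    on[k], off[k] = 1.0, 0.0
    index_on = sum(b * v for b, v in zip(beta, on))
    index_off = sum(b * v for b, v in zip(beta, off))
    return STD_NORMAL.cdf(index_on) - STD_NORMAL.cdf(index_off)

# Hypothetical columns: constant, dummy of interest, age.
rows = [[1.0, 0.0, 30.0], [1.0, 1.0, 50.0]]
beta = [-1.0, 0.6, 0.01]  # illustrative values, not estimates

effect = discrete_effect_at_means(beta, rows, k=1)
print(f"effect of the dummy at the mean: {effect:.4f}")
```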

Sometimes this method will lead to results that are difficult to interpret. For instance, one could calculate the marginal effect on health of taking a new drug for someone with gender=.051 and minority status=0.24. Since a person is either male or female, and either a minority or not, the marginal effect for this hypothetical person may not be very revealing.

Another method sets the continuous variables (e.g., age, income) equal to their means and then calculates the marginal effects for a typical white female, a typical black male, etc.
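A rough sketch of this profile-based approach, using the drug example from above: the continuous variable is held at its mean while the other dummies are set to the actual 0/1 values that define each profile. All coefficient values, the mean age, and the variable names are assumptions made for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

# Hypothetical probit coefficients (illustrative values, not estimates).
BETA = {"const": -1.0, "drug": 0.6, "female": 0.2, "minority": -0.3, "age": 0.01}
MEAN_AGE = 45.0  # hypothetical sample mean of the continuous variable

def prob(x):
    """Predicted probability Phi(x'beta) for a dict of covariate values."""
    index = BETA["const"] + sum(BETA[name] * value for name, value in x.items())
    return STD_NORMAL.cdf(index)

def drug_effect(female, minority):
    """Effect of drug going from 0 to 1 for one profile, with age held at its mean."""
    base = {"female": female, "minority": minority, "age": MEAN_AGE}
    return prob({**base, "drug": 1}) - prob({**base, "drug": 0})

# Each profile uses real 0/1 values for the dummies rather than sample means.
profiles = {
    "non-minority female": (1, 0),
    "minority male": (0, 1),
}
effects = {label: drug_effect(f, m) for label, (f, m) in profiles.items()}

for label, effect in effects.items():
    print(f"{label}: {effect:.4f}")
```

Each reported effect now describes a person who could actually exist in the sample, at the cost of reporting several numbers instead of one.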

A final method is given in Boonen, Schut and Koolman (2008). The authors investigate whether financial incentives from health insurance companies are effective in directing enrollees towards preferred provider pharmacies. In the results section of their paper, they state:

“The marginal effects for discrete variables are computed by calculating the change resulting from a change in the discrete variable from 0 to 1 holding all other variables fixed at their mean (see, for example, McGuirk and Porell, 1984; Madden et al., 2005). An average individual does not exist, however, and in our research we are interested in the probability that a certain consumer does or does not visit the preferred supplier. The marginal effects are thus not computed over the average individual but represent the mean of the marginal effects over each individual. This is done by computing the effect of, for example, a one-year increase in age on the probability of visiting the preferred provider for each individual and then averaging these probabilities across all individuals in the sample (Strombom et al., [JHE] 2002; Greene, 2003). The standard errors for the marginal effects are computed through bootstrapping. ”
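The averaging described in the quote can be sketched as below for a probit model: compute the effect for each individual at that individual's own covariate values, then take the mean. The data and coefficients are hypothetical, and the function names are illustrative. The bootstrapped standard errors are not shown, since they would require re-estimating the model on each resample.

```python
from statistics import NormalDist, mean

STD_NORMAL = NormalDist()

def linear_index(beta, row):
    """Compute x'beta for one individual."""
    return sum(b * v for b, v in zip(beta, row))

def ame_discrete(beta, rows, k):
    """Mean over individuals of Phi(x_k=1) - Phi(x_k=0), other covariates as observed."""
    diffs = []
    for row in rows:
        on, off = list(row), list(row)
        on[k], off[k] = 1.0, 0.0
        diffs.append(STD_NORMAL.cdf(linear_index(beta, on))
                     - STD_NORMAL.cdf(linear_index(beta, off)))
    return mean(diffs)

def ame_continuous(beta, rows, k):
    """Mean over individuals of the derivative phi(x'beta) * beta_k."""
    return mean(STD_NORMAL.pdf(linear_index(beta, row)) * beta[k] for row in rows)

# Hypothetical columns: constant, dummy, age. Coefficients are not estimates.
beta = [-1.0, 0.6, 0.01]
rows = [[1.0, 0.0, 30.0], [1.0, 1.0, 45.0], [1.0, 0.0, 60.0], [1.0, 1.0, 52.0]]

ame_dummy = ame_discrete(beta, rows, k=1)
ame_age = ame_continuous(beta, rows, k=2)
print(f"average effect of the dummy: {ame_dummy:.4f}")
print(f"average marginal effect of age: {ame_age:.4f}")
```

Unlike the effect at the mean, no hypothetical average individual is constructed here; every evaluation uses covariate values that were actually observed.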

This post should give researchers some idea of how to calculate marginal effects for complex regression models. There is no single optimal method; researchers should decide how best to analyze their data and then use the marginal effects method most appropriate for that analysis.
