Econometrics


Fixed effects (FE) regressions are a useful tool for controlling for time-invariant factors in a regression specification.  When using a linear OLS model, FE represent the average value of the dependent variable for that individual after controlling for covariates.  Estimating a fixed effects model for non-linear regressions, however, can be problematic.

For instance, if you try to estimate the fixed effects coefficients in a probit model, you will introduce an incidental parameters problem.  Assume that the panel data has N individuals over T time periods.  If T is fixed, the estimates of the covariate coefficients (β) are inconsistent: the bias does not vanish even as N grows large (i.e., N→∞).  This occurs because the number of “nuisance parameters” (one fixed effect per individual) grows with N, while each fixed effect is estimated from only T observations.

There does exist a “fixed effects logit estimator”, but this estimator does not actually estimate the fixed effects.  Rather it is a conditional maximum likelihood estimator (cMLE).  In the two period model, it conditions on the event having occurred in exactly one of the two time periods (yi1 + yi2 = 1).  Writing F for the logistic CDF and Δx = xi2 − xi1:

  • Pr(yi1=1|yi1+yi2=1) = 1-F(Δxβ)
  • Pr(yi2=1|yi1+yi2=1) = F(Δxβ)

“The conditional-likelihood estimator is thus equivalent to a logit estimator of the dependent variable 1(Δy=1) on the independent variables Δx for the subsample of observations satisfying yi1 + yi2=1.”

Abrevaya (1996) provides an example of how this bias can occur in a two period example and explains the conditional logit model in more detail.
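
The equivalence quoted above can be checked on simulated data. What follows is a minimal sketch, not Abrevaya's code: the data-generating process, the function names (`simulate_panel`, `conditional_logit`) and the single-covariate setup are all assumptions made for illustration. It keeps only the individuals with yi1 + yi2 = 1 and fits a logit of 1(Δy=1) on Δx without an intercept, using Newton's method.

```python
import math
import random


def logistic(v):
    """Logistic CDF F(v)."""
    return 1.0 / (1.0 + math.exp(-v))


def simulate_panel(n, beta, seed=0):
    """Simulate a two-period logit panel (hypothetical data-generating process).

    The individual effect alpha is correlated with the regressor, which is
    the situation in which fixed effects matter.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        alpha = rng.gauss(0.0, 1.0)        # unobserved individual effect
        x1 = alpha + rng.gauss(0.0, 1.0)   # regressor in period 1
        x2 = alpha + rng.gauss(0.0, 1.0)   # regressor in period 2
        y1 = int(rng.random() < logistic(alpha + beta * x1))
        y2 = int(rng.random() < logistic(alpha + beta * x2))
        rows.append((x1, x2, y1, y2))
    return rows


def conditional_logit(rows, iters=50, tol=1e-10):
    """Logit of 1(dy = 1) on dx, no intercept, using only switchers."""
    # Among individuals with y1 + y2 == 1, dy = 1 exactly when y2 == 1.
    switchers = [(x2 - x1, y2) for x1, x2, y1, y2 in rows if y1 + y2 == 1]
    b = 0.0
    for _ in range(iters):
        score = 0.0
        info = 0.0
        for dx, d in switchers:
            p = logistic(b * dx)
            score += (d - p) * dx            # gradient of the log-likelihood
            info += p * (1.0 - p) * dx * dx  # negative second derivative
        step = score / info
        b += step
        if abs(step) < tol:
            break
    return b, len(switchers)


beta_hat, n_switchers = conditional_logit(simulate_panel(20000, beta=1.0))
print(f"beta_hat = {beta_hat:.3f} from {n_switchers} switchers")
```

Note that the individual effect never has to be estimated: it drops out once we condition on yi1 + yi2 = 1, which is why the estimator avoids the incidental parameters problem.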


If you are evaluating the treatment effect of a policy or medical intervention, does it matter if some of your subjects leave the sample? In many cases, the answer is ‘yes’.

The Problem

As outlined in Grasdal (2001), the effect of the treatment is simply:

  • Δ = E(Y|X, T=1) − E(Y|X, T=0)

However, in some cases we may not observe Y. For instance, if subjects leave the study (attrition), we will not observe their outcomes. We can therefore decompose each of the two components of the equation above into the outcomes of those who remain in the sample and those who leave:

  • E(Y|X, T=1) = pTE(Y|X, T=1, A=0) + (1-pT)E(Y|X, T=1, A=1)
  • E(Y|X, T=0) = pCE(Y|X, T=0, A=0) + (1-pC)E(Y|X, T=0, A=1)

where A=0 indicates that a subject remains in the sample and A=1 that the subject has dropped out, pT is the probability that someone in the treatment group remains in the sample (pT=p(A=0|X, T=1)), and pC is the probability that someone in the control group remains in the sample (pC=p(A=0|X, T=0)).

Rearranging terms we get:

  • Δ = [E(Y|X, T=1, A=0)-E(Y|X, T=0, A=0)] + (1-pT)[E(Y|X, T=1, A=1)-E(Y|X, T=1, A=0)] - (1-pC)[E(Y|X, T=0, A=1)-E(Y|X, T=0, A=0)]

The first term in brackets is what we observe. The second term in brackets is the difference in the treatment group's outcome between those who dropped out and those who remained; the third term in brackets is the same difference in the control group. With random attrition, each of these two differences is zero, so estimating the treatment effect from the observed subjects alone will produce unbiased estimates.
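
As a quick arithmetic check, here is a small sketch of the decomposition. It assumes, as in the two weighted-average equations above, that pT and pC weight the A=0 group, and it reads A=0 as "remained in the sample". The function name and all of the numbers are made up for illustration.

```python
def attrition_decomposition(p_t, p_c, t_stay, t_left, c_stay, c_left):
    """Split the true treatment effect into an observed part and two bias terms.

    p_t, p_c       : probability of remaining in the sample (A=0), by group
    t_stay, t_left : mean outcome of treated subjects who stayed / left
    c_stay, c_left : mean outcome of control subjects who stayed / left
    """
    mean_treated = p_t * t_stay + (1 - p_t) * t_left
    mean_control = p_c * c_stay + (1 - p_c) * c_left
    true_effect = mean_treated - mean_control
    observed = t_stay - c_stay                   # what stayers alone show
    bias_treated = (1 - p_t) * (t_left - t_stay)
    bias_control = (1 - p_c) * (c_left - c_stay)
    return true_effect, observed, bias_treated, bias_control


# Hypothetical numbers: leavers in the treatment group do worse than stayers.
true_effect, observed, bias_t, bias_c = attrition_decomposition(
    p_t=0.8, p_c=0.9, t_stay=10.0, t_left=6.0, c_stay=7.0, c_left=7.0
)
print(true_effect, observed + bias_t - bias_c)  # the two numbers agree
```

In this made-up case the observed difference overstates the true effect, because the treated subjects who left had worse outcomes than those who stayed.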

Potential Solutions

If one knows the source of the attrition bias, one can explicitly model the source of the attrition. Explicit models are typically sample selection models in which two simultaneous regression models are estimated. “The first model is a regression model that addresses the research question, with the hypotheses of the study being examined by the regression of the dependent variable on the key independent variables in the study. The second model includes the variables that are causing attrition, with the dependent variable being a dichotomous variable indicating either continued participation or nonparticipation in the study. The error terms of the substantive dependent variable in the first regression model and the participation dependent variable in the second regression model are correlated. A significant correlation between the two error terms indicates attrition bias.”

If the source of the bias is unknown, one can use the Heckman selection model. The first step of the Heckman selection model “…not only tests for attrition bias but also creates an outcome variable, which Heckman calls λ (lambda). Thus, a λ value is computed for all cases in the study, and it represents the proxy variable that explains the causation of attrition in the study…The second step of Heckman’s procedure is to merge the λ value of each participant into the larger data set and then include it…in the regression equation that is used to test the hypotheses in the study. Including λ in the equation solves the problem of specification error and leads to more accurate regression coefficients.”
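
To make the two steps concrete, here is a minimal sketch on simulated data. Everything in it is an assumption for illustration: the data-generating process, the variable names, and the hand-rolled probit (Fisher scoring) used in step one. It assumes numpy is available, and it does not correct the second-step standard errors, which a real analysis would need to do.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)


def norm_pdf(v):
    return np.exp(-0.5 * v ** 2) / math.sqrt(2.0 * math.pi)


def norm_cdf(v):
    return 0.5 * (1.0 + _erf(v / math.sqrt(2.0)))


def probit_fit(W, s, iters=25):
    """Step 1: probit of the stay/leave indicator s on W, by Fisher scoring."""
    g = np.zeros(W.shape[1])
    for _ in range(iters):
        idx = W @ g
        cdf = np.clip(norm_cdf(idx), 1e-10, 1.0 - 1e-10)
        pdf = norm_pdf(idx)
        score = W.T @ ((s - cdf) * pdf / (cdf * (1.0 - cdf)))
        weights = pdf ** 2 / (cdf * (1.0 - cdf))
        info = (W * weights[:, None]).T @ W
        g = g + np.linalg.solve(info, score)
    return g


def heckman_two_step(y, X, W, s):
    """Step 2: OLS of y on X plus lambda, using only observed cases."""
    gamma = probit_fit(W, s)
    idx = W @ gamma
    lam = norm_pdf(idx) / np.clip(norm_cdf(idx), 1e-10, None)  # inverse Mills ratio
    keep = s == 1
    X2 = np.column_stack([X[keep], lam[keep]])
    coef, *_ = np.linalg.lstsq(X2, y[keep], rcond=None)
    return coef  # last element is the coefficient on lambda


def simulate_selection(n, seed=0):
    """Hypothetical data: the outcome error is correlated with staying in the study."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    z = rng.normal(size=n)                    # affects attrition only
    u = rng.normal(size=n)                    # selection error
    e = 0.7 * u + math.sqrt(1 - 0.49) * rng.normal(size=n)  # corr(u, e) = 0.7
    s = (0.5 * x + 1.0 * z + u > 0).astype(float)            # 1 = stayed
    y = 1.0 + 2.0 * x + e                     # used only where s == 1
    X = np.column_stack([np.ones(n), x])
    W = np.column_stack([np.ones(n), x, z])
    return y, X, W, s


y, X, W, s = simulate_selection(20000)
coef = heckman_two_step(y, X, W, s)
naive, *_ = np.linalg.lstsq(X[s == 1], y[s == 1], rcond=None)
print("naive slope:", naive[1], "corrected slope:", coef[1], "lambda coef:", coef[2])
```

The true slope in this simulation is 2. The variable z plays the role of a factor that drives attrition but not the outcome; without such a variable, the correction leans entirely on the normality assumption.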

Empirical Investigation

A study by Grasdal looks at attrition in a randomized field trial of a rehabilitation programme designed to bring long-term sick listed workers with musculoskeletal problems back to work in Bergen, Norway. In this case, they found that “Both the parametric and the semi-parametric sample estimators that were considered indicated that sample attrition biased outcome data regarding posttreatment earnings, while the data regarding sick leave status remained unbiased. The sample selection estimators of post-treatment earnings perform quite well in terms of correcting for attrition bias and estimating treatment effects not very different from the experimental benchmark.”

“…The analysis also demonstrates an inherent paradox in the ‘common support’ approach, which prescribes exclusion from the analysis of observations outside of common support for the selection probability. The more important treatment status is as a determinant of attrition, the larger is the proportion of treated with support for the selection probability outside the range, for which comparison with untreated counterparts is possible.”


On Saturday, UCSD Economics Professor Dr. Hal White passed away after an extended struggle with cancer.  This is a sad day as Hal was one of my former professors.  Here is an excerpt from the obituary written by Dr. Jim Hamilton regarding Dr. White’s work.

Hal was one of the world’s leading econometricians. One of his core beliefs was that the models and assumptions that we bring to the data are inevitably flawed and misspecified in some way. It might seem that if you believe that, there’s no hope in trying to do econometrics. But some of Hal’s most remarkable discoveries concerned how to form valid inference even if part of what you assumed was fundamentally wrong.

An example arises in ordinary regression analysis, in which a common assumption is that the variance of the regression model’s error is the same for all observations. Suppose that assumption is wrong, and instead the variance depends in an unknown way on the various explanatory variables. Hal found that it is possible to characterize how that dependence would affect the reliability of the inference from the regression, and construct modified t-statistics or F-statistics that take this into account. This was such a useful contribution that it is now a standard option a user can easily select in any decent regression software package. Hal once lamented to me that this was an example of a contribution that became so successful and widespread that people forgot who came up with it in the first place. Hal’s proposed adjustments are often described as “robust standard errors” or “heteroskedasticity-consistent standard errors”, though I have always introduced them to my students as “White standard errors”.

Hal also showed that this idea generalizes much more broadly, as spelled out in his classic article, Maximum Likelihood Estimation of Misspecified Models. The maximum likelihood estimator (affectionately known as the “MLE”) refers to a particular estimate of parameters that is derived from the claim that the researcher knows the family from which the true probability distribution that generated the data comes. Hal’s remarkable contribution here was to examine the properties of that inference if you have assumed the wrong class of probability distributions. He referred to that procedure (using an MLE that is based on an incorrect assumption about the probability distribution) as “quasi maximum likelihood estimation.” Again establishing the properties of such inference seems like (and is!) an astounding result. But when you get into the math, you discover that it makes perfect sense. For example, one could assume (mistakenly, perhaps), that the error terms in the regression model came from a Normal distribution with mean zero and constant variance. If your assumptions were correct, then the MLE turns out to be the usual formula for regression estimation. However, even if your assumption about the probability distribution is wrong, one can show that what you were calling the MLE is usually still giving you a decent estimate of something, namely, an estimate of the best prediction of y if you want to base your prediction on a linear function of x. In fact, White’s robust standard errors for ordinary regression prove to be a special case of his general results for quasi maximum likelihood estimation.

Hal had a host of other very fundamental contributions, ranging from the recognition that neural networks are essentially a statistical inference problem, elegant contributions to asymptotic theory, any number of extremely useful specification tests, and his most recent interest in some very deep ideas about causality and inference. There are I suspect a great many papers by Hal and his co-authors that have not yet been published, but soon will be, as he remained astonishingly productive up to the end, writing papers faster than the journals could publish them.


In previous posts, I have explained how to create bootstrap estimates for a variety of statistics.  Doing so is fairly simple and involves a 3 step procedure:

  • Step 1: Using the observed data, create m bootstrap data sets by random resampling with replacement.
  • Step 2: Calculate the statistic of interest for each bootstrap data set.
  • Step 3: The bootstrap estimate of the statistic of interest is the average value from Step 2 across all bootstrap samples.

One question that has not yet been answered is how to calculate the confidence interval for the statistic of interest.  A paper by Haukoos and Lewis describes five methods for computing bootstrap confidence intervals: i) normal approximation, ii) percentile, iii) bias-corrected (BC), iv) bias-corrected and accelerated (BCa) and v) approximate bootstrap confidence (ABC) methods.

The normal approximation method is calculated as follows:

  • original statistic ± Z × (standard error)

Here the standard error is the standard deviation of the statistics calculated in Step 2.  For instance, for a 95% confidence interval, Z=1.96.  Another alternative is to use the percentile method.  To calculate the percentile confidence interval for a 95% CI, one simply takes the 2.5 and 97.5 percentiles of the distribution of statistics calculated in Step 2 of the bootstrap procedure.
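
The three steps and the two simpler confidence intervals can be sketched in a few lines. This is only an illustration: the sample values are made up, the function names are my own, and the percentile rule below simply picks order statistics rather than interpolating.

```python
import random
import statistics


def bootstrap_statistics(data, stat, m=2000, seed=0):
    """Steps 1 and 2: resample with replacement m times and apply stat to each."""
    rng = random.Random(seed)
    n = len(data)
    return [stat(rng.choices(data, k=n)) for _ in range(m)]


def normal_ci(data, stat, boot, z=1.96):
    """Normal approximation: original statistic +/- z * bootstrap standard error."""
    se = statistics.stdev(boot)
    center = stat(data)
    return center - z * se, center + z * se


def percentile_ci(boot, level=0.95):
    """Percentile method: read the interval off the sorted bootstrap statistics."""
    ordered = sorted(boot)
    m = len(ordered)
    lo_idx = max(0, int((1 - level) / 2 * m))
    hi_idx = min(m - 1, int((1 + level) / 2 * m) - 1)
    return ordered[lo_idx], ordered[hi_idx]


# Hypothetical sample of 12 observations.
sample = [4.1, 5.6, 3.8, 7.2, 6.5, 5.0, 4.7, 8.1, 5.9, 6.3, 4.4, 7.0]
boot = bootstrap_statistics(sample, statistics.mean)

print("bootstrap estimate (Step 3):", statistics.mean(boot))
print("normal approximation CI:", normal_ci(sample, statistics.mean, boot))
print("percentile CI:", percentile_ci(boot))
```

Any function of a list of numbers can be passed as `stat` (for example `statistics.median`), which is where the bootstrap is most useful, since analytic standard errors are harder to come by for such statistics.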

The other bootstrap CI methods are a bit more complex.

The BCa method adjusts for bias in the bootstrapped sampling distributions relative to the actual sampling distribution, and is thus considered a substantial improvement over the percentile method. The BCa confidence interval is an adjustment of the percentiles used in the percentile method based upon the calculation of two coefficients called “bias correction” and “acceleration.” The bias correction coefficient adjusts for the skewness in the bootstrap sampling distribution. If the bootstrap sampling distribution is perfectly symmetric, then the bias correction will be zero. The acceleration coefficient adjusts for nonconstant variances within the resampled data sets. The ABC method is an approximation of the BCa method that requires fewer resampled data sets than the BCa method.


Suppose you look at health care spending in two different regions and observe a significant difference.  You may want to know what the cause of this difference is.  Is it because one region has a mix of people who are sicker, or is it because one region treats patients with a given disease more intensively?

One way to answer this question is to use the Oaxaca decomposition.  This approach was originally formulated by Ronald Oaxaca. This document provides a nice overview of how to use the Oaxaca Decomposition and I apply that framework to the health spending case.

Differences in Health Spending

Assume that there are two regions: Region A and Region B. The spending for the two regions can be modeled using a linear regression framework:

  • YA = βAX + εA
  • YB = βBX + εB

The Y term represents spending and the variable X represents the patient’s health status. Health status could be measured as a vector of factors or as a single indicator (e.g., healthy or sick). The term β describes how much an area spends on medical resources to treat a patient with a health status of X. Thus, the average difference in spending per person between the two regions is:

  • YA – YB = βAXA – βBXB

where XA and XB are the average case mix in regions A and B, respectively.

Determinants of Health Spending Differentials

Now the question is whether case mix or spending practices conditional on case mix is the key driver of the differences in spending between regions A and B. One can differentiate these two components using the following Oaxaca Decomposition:

  • YA – YB = ΔXβB + ΔβXA
  • YA – YB = ΔXβA + ΔβXB

Here ΔX = XA – XB and Δβ = βA – βB. In the first equation, the differences in health status (the X’s) are weighted by the coefficients for region B and the differences in the coefficients are weighted by the X’s from region A, whereas in the second, the differences in the X’s are weighted by the coefficients from region A and the differences in the coefficients are weighted by the X’s from region B.

There are basically three factors that affect the difference in health spending between the regions: i) differences in health status across regions, ii) differences in treatment patterns conditional on health status, and iii) the interaction of health status and conditional treatment effects. One can see this clearly below:

  • YA – YB = ΔXβB + ΔβXB + ΔXΔβ
  • YA – YB = H + T + HT

The equations above show the health status effect (H), the treatment effect (T) and the interaction (HT).

The specification chosen for the Oaxaca decomposition determines whether the interaction effect is placed with the health status effect or the treatment effect.  More precisely:

  • YA – YB = ΔXβB + ΔβXA = H + (HT + T)
  • YA – YB = ΔXβA + ΔβXB = (H+ HT) + T

In effect, the first decomposition specification incorporates the interaction term with the treatment effect whereas the second specification places the interaction term together with the health status effect.
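
A small numerical sketch may help. The numbers are hypothetical: X is the share of patients who are sick and β is spending per sick patient, so each region's average spending is simply βX. The function name is my own.

```python
import math


def oaxaca_threefold(x_a, x_b, beta_a, beta_b):
    """Split the spending gap into health status (H), treatment (T) and interaction (HT)."""
    dx = x_a - x_b            # difference in case mix
    dbeta = beta_a - beta_b   # difference in spending per unit of X
    return {
        "gap": beta_a * x_a - beta_b * x_b,
        "H": dx * beta_b,
        "T": dbeta * x_b,
        "HT": dx * dbeta,
    }


# Hypothetical regions: A is both sicker and treats more intensively.
parts = oaxaca_threefold(x_a=0.30, x_b=0.20, beta_a=12000.0, beta_b=10000.0)
print(parts)

# The two twofold specifications only differ in where HT is placed.
first = parts["H"] + (parts["HT"] + parts["T"])    # interaction grouped with T
second = (parts["H"] + parts["HT"]) + parts["T"]   # interaction grouped with H
assert math.isclose(first, parts["gap"]) and math.isclose(second, parts["gap"])
```

With these numbers most of the gap comes from case mix, but the split between H and T shifts depending on which twofold specification is used, which is exactly the ambiguity described above.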


Many researchers use household data sources to examine a variety of hypotheses.  The use of household data has many benefits including allowing for more detailed socioeconomic information (e.g., education, income) beyond what is contained in administrative claims files.  One drawback of household data is that extrapolations made from household survey data may not match national estimates.

For instance, this article examines how to align the Medical Expenditure Panel Survey (MEPS) to aggregate U.S. benchmarks provided in the National Health Expenditure Accounts (NHEA).  Today, I review some of these adjustments.


Many research studies aim to figure out whether a physician did a good job.  Many studies use administrative claims data to evaluate performance.  Other times, researchers use medical record review.

One problem with medical record review is that oftentimes experts will come up with differing opinions from reviewing the same medical record.  Thus, researchers often have at least two individuals review the medical record so that the results are not biased by a single person’s opinion.

A question of interest is how reliable different evaluators of a medical record are.  Cohen’s kappa can provide a quantitative estimate of inter-rater reliability.  The formula is the following:

  • κ = [P(a)-P(e)]/[1-P(e)]

where P(a) is the observed level of agreement and P(e) is the expected level of agreement from pure chance.  In essence, the kappa measurement compares the observed level of inter-rater agreement against the level of agreement that would be expected by pure chance.

To give an example, consider the situation where two raters rate 10 blogs and can give them a rating of an A, B, or C. These data are available here.  You can see that Tester 1 is more likely to give positive ratings and Tester 2 is more likely to give negative ratings.  In this example, the value of Kappa is 0.44.

A general rule of thumb is to interpret values < 0 as indicating no agreement, 0–.20 as slight, .21–.40 as fair, .41–.60 as moderate, .61–.80 as substantial, and .81–1 as almost perfect agreement.
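
Here is a short sketch of the kappa calculation for two raters. The ratings below are made up for illustration (they are not the data from the example above), and the function name is my own. P(e) is computed from each rater's marginal distribution of ratings.

```python
from collections import Counter


def cohens_kappa(ratings_1, ratings_2):
    """Cohen's kappa for two raters who rated the same items."""
    if len(ratings_1) != len(ratings_2) or not ratings_1:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(ratings_1)
    # P(a): share of items on which the two raters agree.
    p_a = sum(a == b for a, b in zip(ratings_1, ratings_2)) / n
    # P(e): chance agreement implied by each rater's marginal frequencies.
    counts_1 = Counter(ratings_1)
    counts_2 = Counter(ratings_2)
    p_e = sum(counts_1[c] * counts_2[c] for c in counts_1) / n ** 2
    # Undefined (division by zero) if both raters use one identical category throughout.
    return (p_a - p_e) / (1.0 - p_e)


# Hypothetical ratings of 10 blogs by two testers.
tester_1 = ["A", "A", "A", "A", "A", "B", "B", "B", "C", "C"]
tester_2 = ["A", "A", "A", "B", "B", "B", "B", "C", "C", "C"]
print(round(cohens_kappa(tester_1, tester_2), 2))
```

In this made-up example the testers agree on 7 of 10 blogs, but about a third of that agreement would be expected by chance alone, which is why kappa comes out well below 0.7.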

 

Biases

All economists are familiar with the problem of selection bias.  In non-randomized samples, patients may choose to be in either the treatment or control group based on factors which are also related to the outcome of interest.  Even if researchers can design a study that fully controls for selection bias, robust studies must also account for other biases.  These include:

  • Recall bias: Patients in one group have better or worse memory of a given event.  If one wishes to compare changes in income for individuals who received certain workforce training, individuals who participated in the program may be more or less likely to inflate their reported income levels over time.
  • Interviewer bias: If new data are being collected and researchers use separate interviewers for the treatment and control groups, the study results will be biased if one interviewer systematically over- or understates the interviewees’ responses.
  • Observation bias: This bias is particularly problematic for medical studies.  Observation bias occurs when physicians (or patients) in one group are more likely to detect a disease than those in another.  Thus, a study identifying how pollution affected disease rates may underestimate the impact of the pollution if those affected are less likely to detect any disease than those who are not.  For instance, if poor individuals are more likely to drink polluted water than rich individuals, but also less likely to go to the doctor, the disease incidence from polluted water would be underreported and the causal impact of water pollution would be underestimated.

Outside of purely statistical biases, the research community at large may suffer from other biases as well.  These include:

  • Funding bias: Researcher bias towards interpreting quantitative results in favor of the entity which funded their study.
  • Status quo bias: Survey respondents may base their opinions closer to the status quo or researchers can interpret their results in a fashion more likely to coincide with the existing academic literature.
  • Publication bias: The tendency of researchers, editors, and pharmaceutical companies to handle the reporting of experimental results that are positive (i.e. showing a significant finding) differently from results that are negative (i.e. supporting the null hypothesis) or inconclusive, leading to bias in the overall published literature.
  • Hindsight bias: The inclination to see events that have already occurred as being more predictable than they were before they took place.

 


Understanding quantiles is fairly intuitive. A physician would rank in the τth quantile in terms of quality of care if he performs better than the proportion τ of the reference group of physicians and worse than the proportion (1–τ). For physicians at the median, half of physicians will perform worse than this doctor and half will perform better.

Quantile regressions, however, offer the power to evaluate whether the predicted effect of selected explanatory variables on the outcome of interest differs depending on where in the distribution the individual is located. Koenker and Bassett (1978) created these regression models and based them on the same intuition used to calculate the median. Today I review how quantile regressions work compared to ordinary least squares (OLS).

Mean vs. Quantile

The simplest way to compare OLS against quantile regression is to compare optimization methods for the mean and quantiles (e.g., median). Most people know the mean and median formulas, but the following specifications detail how to calculate these values for any sample using optimization techniques.

  • Mean: min μ∈ℜ Σ (yi – μ)²
  • Quantile: min ξ∈ℜ Σ ρτ(yi – ξ)

where the function ρτ(x) = x(τ – I(x<0)). In essence, the function ρτ tilts the absolute value function towards the quantile under investigation. For the mean, the goal is to pick a parameter (the mean) which will minimize the sum of squared deviations. For the quantile, the goal is to pick a parameter which will minimize the sum of weighted absolute deviations. For the median (τ = 0.5), positive and negative deviations are weighted equally, whereas for other quantiles observations above the candidate value receive weight τ and observations below it receive weight (1 – τ).
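
To see the optimization at work, here is a small sketch with made-up data. It evaluates the ρτ loss at each observed value and picks the smallest, relying on the fact that a minimizer of this loss can always be found among the data points. The function names are my own.

```python
import statistics


def check_loss(u, tau):
    """rho_tau(u) = u * (tau - I(u < 0))."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def quantile_by_minimization(data, tau):
    """Return the data point that minimizes the summed check loss."""
    candidates = sorted(set(data))
    return min(candidates, key=lambda xi: sum(check_loss(y - xi, tau) for y in data))


def squared_loss(data, mu):
    return sum((y - mu) ** 2 for y in data)


# Hypothetical sample with an odd number of observations.
data = [2, 4, 4, 5, 7, 9, 10, 12, 15]

print("median via check loss:", quantile_by_minimization(data, 0.5))
print("median via statistics:", statistics.median(data))
print("lower quartile via check loss:", quantile_by_minimization(data, 0.25))

# The mean does the same job for squared deviations.
mean = statistics.mean(data)
assert squared_loss(data, mean) <= squared_loss(data, mean + 0.1)
assert squared_loss(data, mean) <= squared_loss(data, mean - 0.1)
```

Changing τ only changes how the loss weights observations above and below the candidate value; the optimization problem is otherwise the same as for the median.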

I have created this spreadsheet to more clearly demonstrate how calculating quantiles can be done in practice.  Wikipedia also has a nice example.

OLS vs. Quantile Regression

Again, compare the mechanisms by which OLS and quantile regressions choose the coefficients (i.e., β) to optimize the equations below.

  • OLS: min β∈ℜ Σ (yi – Xβ)²
  • Quantile Regression: min βτ∈ℜ Σ ρτ(yi – Xβτ)

When you calculate the sample mean, you are estimating the unconditional population mean [i.e., E(y)]. When you conduct an OLS regression, you are estimating the conditional expectation function [i.e., E(y|X)]. Similarly, the quantile regression is used to estimate the conditional quantile of the dependent variable.

To conduct a quantile regression in SAS, one can use the QUANTREG procedure. In Stata one can use the qreg command.

Quantile Regression in Practice

An example of a paper using Quantile Regression includes the following: Johar, M. and Katayama, H. (2011), Quantile regression analysis of body mass and wages. Health Economics, 20: n/a. doi: 10.1002/hec.1736. This paper uses the National Longitudinal Survey of Youth 1979, to explore the relationship between body mass and wages. The researchers use quantile regression to provide a broad description of the relationship across the wage distribution. “Our results find that for female workers body mass and wages are negatively correlated at all points in their wage distribution. The strength of the relationship is larger at higher-wage levels. For male workers, the relationship is relatively constant across wage distribution but heterogeneous across ethnic groups.”


How does one determine if a test is accurate?  What does accuracy mean? One measure of test accuracy is the positive predictive value, or the share of positive test results which are true positives.  Alternatively, the negative predictive value determines the share of negative test results which are true (rather than false) negatives.  Higher positive and negative predictive values indicate a better test.

In addition, sensitivity and specificity use the gold standard (i.e., “true”) results as the denominator.  Sensitivity indicates the share of true positives as a fraction of total people who actually have the condition. Similarly, specificity gives the number of true negatives as a share of the number of test subjects who do not actually have the condition.

The formulas for these four metrics describing the accuracy of various diagnostic testing procedures are shown below:

  • Positive Predictive Value:  TP/(TP+FP)
  • Negative Predictive Value:  TN/(TN+FN)
  • Sensitivity:                TP/(TP+FN)
  • Specificity:                TN/(FP+TN)
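
The four formulas are easy to compute from the counts in a two-by-two table. The sketch below uses made-up counts for a screening test applied to 2,030 people, 30 of whom actually have the condition; the function name is my own.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return the four accuracy measures from true/false positive/negative counts."""
    return {
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
        "sensitivity": tp / (tp + fn),  # denominator: people with the condition
        "specificity": tn / (fp + tn),  # denominator: people without the condition
    }


# Hypothetical counts: 20 true positives, 180 false positives,
# 10 false negatives, 1820 true negatives.
metrics = diagnostic_metrics(tp=20, fp=180, fn=10, tn=1820)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts the test has high specificity and a very high negative predictive value, yet only 10% of positive results are true positives, because the condition is rare in the tested population.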

This example below from Wikipedia provides a simple example.

 
