Unbiased Analysis of Today's Healthcare Issues

What are we weighting for?

Written By: Jason Shafrin - May 07, 2013

Weighting has a number of uses.  One is estimating descriptive statistics for a population from a sample that is not representative of it.  The Panel Study of Income Dynamics (PSID), for instance, oversamples low-income households.  To obtain nationally representative mean values, one must reweight the PSID observations, either by using the survey weights or by matching to a nationally representative sample such as the CPS or ACS.
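As a rough illustration of the reweighting idea, here is a minimal Python sketch. The numbers are made up for the example (they are not PSID values), and the selection probabilities are assumed to be known: a hypothetical population of 200 low-income and 800 high-income households, from which 50 of each are sampled.

```python
import numpy as np

def weighted_mean(values, selection_prob):
    """Mean of values, weighting each observation by 1 / selection probability."""
    weights = 1.0 / np.asarray(selection_prob, dtype=float)
    return float(np.sum(weights * values) / np.sum(weights))

# Hypothetical sample: low-income households are heavily oversampled.
income = np.array([20.0] * 50 + [60.0] * 50)      # income in thousands (assumed)
prob = np.array([0.25] * 50 + [0.0625] * 50)      # 50/200 and 50/800 (assumed)

unweighted = float(income.mean())
reweighted = weighted_mean(income, prob)
```

With these assumed numbers the unweighted sample mean is 40, while the reweighted mean is 52, which equals the population mean implied by the strata shares (0.2 × 20 + 0.8 × 60).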

Researchers also use weighting when estimating causal effects.  A recent working paper by Solon, Haider and Wooldridge (NBER 2013) examines whether weighting is useful in the following three applications: (1) to achieve precise estimates by correcting for heteroskedasticity, (2) to achieve consistent estimates by correcting for endogenous sampling, and (3) to identify average partial effects in the presence of unmodeled heterogeneity of effects.  I discuss each of these situations below.

Correcting Heteroskedasticity

Heteroskedasticity occurs when the variance of the error term differs across observations, for example when one subpopulation has more variability than another.  It reduces the precision of OLS coefficient estimates.  Can one use weighting to correct for it?  The authors first describe the basic mechanics of weighted least squares:

Now suppose that one estimates that population regression by performing ordinary least squares (OLS) estimation of the regression of log earnings on the race dummy, years of schooling, and a quartic in potential work experience for black and white male household heads in the PSID sample…this estimate [the coefficient on the race dummy] might be distorted by the PSID’s oversampling of low-income households, which surely must lead to an unrepresentative sample with respect to male household heads’ earnings…one can apply a reverse funhouse mirror by using weights.  In particular, instead of applying ordinary (i.e., equally weighted) least squares to the sample regression, one can use weighted least squares (WLS), minimizing the sum of squared residuals weighted by the inverse probabilities of selection.

The authors illustrate the precision argument with state-level data on divorce:

Compared to Wyoming, California offers many more observations of the individual-level decision of whether or not to divorce, and therefore it seems at first that weighting by state population should lead to more precise coefficient estimation.  And yet, for the specification shown in Table 1, it appears that weighting by population harms the precision of estimation.

In many cases, then, using WLS actually harms the precision of the estimates.  This occurs because “…in many practical applications, the assumption that the individual-level error terms v_ij are independent is wrong.  Instead, the individual-level error terms within a group are positively correlated with each other because they have unobserved group-level factors in common.  In current parlance, the individual-level error terms are ‘clustered.’”  Thus the true individual-level error term may be better modeled as:

  • v_ij = c_i + u_ij

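where i indexes groups and j indexes individuals within a group; the first term is the group-level component shared by everyone in group i, and the second is the idiosyncratic individual-level component.  When the data are group averages, the group-level component does not average out as the group grows, so the error variance falls much less with group size than weighting by group size assumes.  As a result, WLS can be less precise than OLS.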
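To make this concrete, here is a hedged numerical sketch. It assumes the observations are group averages, so that the error variance for group i is sigma_c2 + sigma_u2 / J_i, where J_i is the group size. The group sizes, the binary regressor, and the variance values are all hypothetical choices of mine, not numbers from the paper. The code computes the exact sampling variance of the slope under OLS and under WLS weighted by group size.

```python
import numpy as np

def slope_variance(X, weights, error_var):
    """Exact sampling variance of the slope from least squares with the given
    weights when the true error variances are error_var (sandwich formula)."""
    W = np.diag(weights)
    bread = np.linalg.inv(X.T @ W @ X)
    meat = X.T @ W @ np.diag(error_var) @ W @ X
    return float((bread @ meat @ bread)[1, 1])

# Hypothetical design: 40 small groups (J = 100) and 10 large groups (J = 10,000).
J = np.array([100.0] * 40 + [10_000.0] * 10)
x = np.tile([0.0, 1.0], 25)                 # binary regressor, balanced by group size
X = np.column_stack([np.ones_like(x), x])
equal_weights = np.ones_like(J)

def compare(sigma_c2, sigma_u2):
    """Return (OLS slope variance, WLS slope variance) for group-average data."""
    error_var = sigma_c2 + sigma_u2 / J     # variance of the group-average error
    return (slope_variance(X, equal_weights, error_var),
            slope_variance(X, J, error_var))

# No group-level component: weighting by group size is the efficient choice.
var_ols_indep, var_wls_indep = compare(sigma_c2=0.0, sigma_u2=1.0)
# Sizeable group-level component: errors are nearly homoskedastic, WLS loses.
var_ols_clust, var_wls_clust = compare(sigma_c2=1.0, sigma_u2=1.0)
```

In this made-up design, WLS has the smaller slope variance when there is no group-level component, but once the group-level variance is set to 1 the WLS slope variance is about 0.37 versus about 0.08 for OLS.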

What should one do to address heteroskedasticity in this case?

One way to go is to…use the OLS residuals to perform the standard heteroskedasticity diagnostics we teach in introductory econometrics.  For example, in this situation, the modified Breusch-Pagan test described in Wooldridge (2013, pp. 276-8) comes down to just applying OLS to a simple regression of the squared OLS residuals on the inverse within-group sample size 1/J_i [where J_i is the size of the group to which observation i belongs].  The significance of the t-ratio for the coefficient on 1/J_i indicates whether the OLS residuals display significant evidence of heteroskedasticity… A remarkable feature of this test is that the estimated intercept consistently estimates σ_c^2, and the estimated coefficient of 1/J_i consistently estimates σ_u^2.
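The auxiliary regression described in this passage can be sketched in a few lines of Python. This is a simulation under assumptions of my own (simulated group-average data, normal errors, variance components of 1 and 4, and a conventional rather than robust t-ratio), not the authors' code or data.

```python
import numpy as np

rng = np.random.default_rng(0)
n_groups = 50_000
sigma_c2, sigma_u2 = 1.0, 4.0          # assumed "true" variance components

# Simulated group-average data: error variance is sigma_c2 + sigma_u2 / J_i.
J = rng.choice([1, 2, 4, 10, 50], size=n_groups)
x = rng.normal(size=n_groups)
group_error = rng.normal(size=n_groups) * np.sqrt(sigma_c2 + sigma_u2 / J)
y = 1.0 + 2.0 * x + group_error

def ols(X, y):
    """Ordinary least squares coefficients."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Step 1: OLS on the main regression, keep the residuals.
X = np.column_stack([np.ones(n_groups), x])
resid = y - X @ ols(X, y)

# Step 2: regress squared residuals on a constant and 1/J_i.
Z = np.column_stack([np.ones(n_groups), 1.0 / J])
gamma = ols(Z, resid ** 2)
sigma_c2_hat, sigma_u2_hat = gamma     # intercept and coefficient on 1/J_i

# Conventional t-ratio for the coefficient on 1/J_i.
aux_resid = resid ** 2 - Z @ gamma
s2 = aux_resid @ aux_resid / (n_groups - 2)
se_slope = np.sqrt(s2 * np.linalg.inv(Z.T @ Z)[1, 1])
t_ratio = gamma[1] / se_slope
```

In a simulation like this one, the intercept and slope of the auxiliary regression should land close to the assumed variance components, and a large t-ratio on 1/J_i signals heteroskedasticity related to group size.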

Other recommendations include:

  • Due to inevitable uncertainty about the true variance structure, report heteroskedasticity-robust standard error estimates.
  • Report both weighted and unweighted estimates, since the differences between OLS and WLS estimates can be used as a diagnostic for model misspecification or endogenous sampling.
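For readers who want to see what a heteroskedasticity-robust standard error involves, here is a minimal sketch of the basic White (HC0) sandwich estimator next to the conventional OLS standard error. The simulated data and the helper name ols_with_se are my own assumptions for illustration; in practice a statistics package would normally handle this, often with small-sample corrections.

```python
import numpy as np

def ols_with_se(X, y):
    """OLS coefficients with conventional and HC0 (White) standard errors."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # Conventional: assumes a single error variance for all observations.
    conventional = np.sqrt(np.diag(XtX_inv) * (resid @ resid) / (n - k))
    # HC0 sandwich: each observation contributes its own squared residual.
    meat = (X * resid[:, None] ** 2).T @ X
    robust = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))
    return beta, conventional, robust

# Simulated data in which the error variance grows with |x| (an assumption).
rng = np.random.default_rng(2)
n = 5_000
x = rng.normal(size=n)
y = 1.0 + 0.5 * x + np.abs(x) * rng.normal(size=n)
X = np.column_stack([np.ones(n), x])

beta, se_conv, se_robust = ols_with_se(X, y)
```

In this simulated setup the conventional standard error on the slope understates the sampling uncertainty, and the robust one comes out noticeably larger.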

Endogenous Sampling

Endogenous sampling occurs when the criteria used to select the sample are correlated with the error term of one’s regression.  For instance, if one regressed income on various exogenous factors using PSID data, the resulting coefficients would be inconsistent because income itself helps determine which households are in the survey (the PSID oversamples low-income households).

In the presence of endogenous sampling, estimation that ignores the endogenous sampling generally will be inconsistent.  But if instead one weights the criterion function to be minimized (a sum of squares, a sum of absolute deviations, the negative of a log likelihood, a distance function for orthogonality conditions, etc.) by the inverse probabilities of selection, the estimation becomes consistent.

On the other hand, if the sampling probabilities are independent of the error term, for instance because they vary only with the explanatory variables in the regression equation, then the unweighted estimates are consistent.  In that case weighting is unnecessary for consistency and may harm precision.


The authors' recommendations here are:

  • If the sampling rate varies endogenously, estimation weighted by the inverse probabilities of selection is needed on consistency grounds.
  • The weighted estimation should be accompanied by robust estimation of standard errors.
  • When the variation in the sampling rate is exogenous, both weighted and unweighted estimation are consistent for the parameters of a correctly specified model, but unweighted estimation may be more precise.
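The difference between endogenous and exogenous sampling can be illustrated with a small simulation. The sketch below rests on assumptions of my own: a linear model with true slope 1, and sampling rates of 0.9 and 0.1 that depend either on the outcome y (endogenous) or only on the regressor x (exogenous). It is not based on PSID data.

```python
import numpy as np

rng = np.random.default_rng(1)
n_pop = 200_000
x = rng.normal(size=n_pop)
y = 1.0 + 1.0 * x + rng.normal(size=n_pop)     # true slope is 1 (assumed)

def fit_slope(x, y, weights=None):
    """Slope from (weighted) least squares of y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    w = np.ones_like(y) if weights is None else weights
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return float(coef[1])

def draw_sample(p_select):
    """Keep each unit with its own selection probability."""
    keep = rng.random(n_pop) < p_select
    return x[keep], y[keep], p_select[keep]

# Endogenous sampling: low-outcome units are oversampled.
xs, ys, ps = draw_sample(np.where(y < 1.0, 0.9, 0.1))
ols_endog = fit_slope(xs, ys)
wls_endog = fit_slope(xs, ys, weights=1.0 / ps)

# Exogenous sampling: the sampling rate depends only on x.
xe, ye, pe = draw_sample(np.where(x < 0.0, 0.9, 0.1))
ols_exog = fit_slope(xe, ye)
wls_exog = fit_slope(xe, ye, weights=1.0 / pe)
```

Under endogenous sampling, the unweighted slope in this setup is pulled well below 1 while the inverse-probability-weighted slope stays close to it. Under exogenous sampling, both estimates should be close to 1.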

Weighting to Estimate Partial Effects

Many times, the causal effect of one variable on another varies across subpopulations.  For instance, a drug trial compares average outcomes in the treatment and control arms.  However, if the drug’s effect on outcomes differs by age, one may want to estimate the population average partial effect of the drug, that is, the effect averaged over the age mix of the population.

Assume that the sample has more old people than young people relative to the population at large.  In this case, OLS would not identify the population average partial effect, since old people are over-represented in the sample.  Additionally, the authors note:

In least squares estimation, observations with extreme values of the explanatory variables have particularly large influence on the estimates.  As a result, the weighted average of the rural and urban effects [in my example, young and old] identified by OLS depends not only on the sample shares of the two sectors, but also on how the within-sector variance of X differs between the two sectors… By reweighting the sample to get the sectoral shares in line with the population shares, WLS eliminates the first reason that OLS fails to identify the population average partial effect, but it does not eliminate the second.  As a result, the WLS estimator and the OLS estimator identify different weighted averages of the heterogeneous effects, and neither one identifies the population average effect.


The authors' practical advice:

  1. Do not believe that in the presence of unmodeled heterogeneous effects, weighting to reflect population shares generally identifies the population average partial effect.
  2. Contrasting the weighted and unweighted estimates can serve as a test for misspecification.  The failure to model heterogeneous effects is one sort of misspecification that can generate a significant contrast.
  3. Where heterogeneous effects are salient, study the heterogeneity; don’t just try to average it out.
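A small worked example may help show why neither estimator recovers the population average effect. The sketch below uses made-up, noise-free data for the young/old example: the effect of x is assumed to be 1 for the old and 3 for the young, the two groups are assumed to be equal shares of the population, the sample is 80 percent old, and x varies more among the young.

```python
import numpy as np

def fit_slope(x, y, weights):
    """Slope from weighted least squares of y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    sw = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return float(coef[1])

beta_old, beta_young = 1.0, 3.0            # assumed group-specific effects
pop_share_old, pop_share_young = 0.5, 0.5  # assumed population shares

# Sample: 80 old (x variance 1) and 20 young (x variance 9).
x_old = np.tile([-1.0, 1.0], 40)
x_young = np.tile([-3.0, 3.0], 10)
x = np.concatenate([x_old, x_young])
y = np.concatenate([beta_old * x_old, beta_young * x_young])

# Weights that bring the sample shares (0.8, 0.2) in line with the population.
weights = np.concatenate([
    np.full(x_old.size, pop_share_old / 0.8),
    np.full(x_young.size, pop_share_young / 0.2),
])

ols_slope = fit_slope(x, y, np.ones_like(x))
wls_slope = fit_slope(x, y, weights)
population_average_effect = pop_share_old * beta_old + pop_share_young * beta_young
```

With these assumed numbers OLS gives a slope of about 2.38 and WLS gives 2.8, while the population average effect is 2.0. Reweighting fixes the group shares, but the group with more variation in x still gets extra influence, so here WLS ends up even further from the population average than OLS.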

In Summary

In situations in which you might be inclined to weight, it often is useful to report both weighted and unweighted estimates and to discuss what the contrast implies for the interpretation of the results.  And, in many of the situations we have discussed, it is advisable to use robust standard error estimates.

