“There are a lot of small data problems that occur in big data. They don’t disappear because you’ve got lots of the stuff. They get worse.” David Spiegelhalter, Winton Professor of the Public Understanding of Risk at Cambridge University, via the FT.

## Extended Cost Effectiveness Analysis

Most people know what cost-effectiveness analysis (CEA) is, but what is extended cost-effectiveness analysis (ECEA)? A paper by Verguet, Laxminarayan, and Jamison (2014) describes the ECEA approach as it relates to the benefits of universal public finance (UPF) of specific medical treatments. CEA measures the effectiveness of a treatment relative to its cost. ECEA does this […]

Is your randomized controlled trial (RCT) generalizable to the general population? This question is known as external validity and is a major issue for a number of treatments. Sometimes, a treatment is very effective in an RCT, but less so in the real world. One reason why this may be the case is that the […]

Statistics are statistics, but different disciplines use different terminology to refer to the same types of analysis. A 2013 Institute of Medicine Report on cancer care lists a number of different study types (see below). Whereas economists would refer to a study that examines groups with and without an event or outcome as a “difference-in-difference” study, […]

Most people know about the good ol’ t-test. You present a null hypothesis (e.g., the Healthcare Economist is the most popular blog covering health economics on the web), collect data to conduct the test, and use the mean and variance of the data to test whether your hypothesis is true. Standard convention holds that most […]
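As a rough illustration of the mechanics described above, the following sketch computes a one-sample t statistic from the mean and variance of a sample. The function name `one_sample_t`, the sample values, and the null value `mu0` are all made up for illustration; they do not come from the post.

```python
import math
import statistics


def one_sample_t(values, mu0):
    """Return (t statistic, degrees of freedom) for H0: population mean == mu0."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.fmean(values)
    # Standard error of the mean: sample standard deviation divided by sqrt(n).
    se = statistics.stdev(values) / math.sqrt(n)
    return (mean - mu0) / se, n - 1


# Hypothetical data, for illustration only.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
t_stat, dof = one_sample_t(sample, mu0=3.0)
print(f"t = {t_stat:.4f} with {dof} degrees of freedom")
```

The statistic would then be compared against a t distribution with `dof` degrees of freedom. That lookup is left out here because the Python standard library has no t distribution.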

What is the standard error of a predicted value? Most people know that the standard error of the mean of a dependent variable is σ(Z_bar) = σ(Z) / √N. As Dowd, Greene and Norton (2013) explain, however, the standard error for the predicted value of a linear regression is much more complicated: The estimated standard error […]
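To give a feel for why the regression case is more involved, here is a minimal sketch for the simplest setting only: ordinary least squares with one regressor, where the standard error of the fitted mean at a point x0 is s·√(1/n + (x0 − x̄)²/Sxx). This is the textbook simple-regression formula, not necessarily the expression Dowd, Greene and Norton derive, and the function name and data are hypothetical.

```python
import math


def predicted_value_se(xs, ys, x0):
    """Fit y = a + b*x by OLS; return (prediction at x0, its standard error)."""
    n = len(xs)
    if n < 3:
        raise ValueError("need at least three observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s2 = sse / (n - 2)
    # The uncertainty grows as x0 moves away from the mean of x.
    se = math.sqrt(s2 * (1.0 / n + (x0 - x_bar) ** 2 / sxx))
    return intercept + slope * x0, se


# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
pred_mid, se_mid = predicted_value_se(xs, ys, x0=3.0)
pred_edge, se_edge = predicted_value_se(xs, ys, x0=5.0)
print(pred_mid, se_mid)
print(pred_edge, se_edge)
```

Unlike the standard error of a simple mean, the result depends on where the prediction is made: it is smallest at the mean of x and larger toward the edges of the data.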

This question is not so easy to answer, even when using data from a randomized trial. Further, many studies do not have the statistical power to identify cause-specific mortality. Consider the following example from Kim and Thompson: Consider a trial of an intervention only influencing a single cause of death, or a few specific causes […]

From Davern (2013): …the three surveys studied showed that for the same total survey budget approximately twice the effective sample size (e.g., from 10,000 to 20,000 effective sample size completed responses) could be obtained if a less aggressive protocol were followed and the research team were willing to accept an approximately 40 percent lower response […]

How can one tell if two variables are related? If the variables are continuous, one can calculate a correlation. What if one is analyzing a 2×2 contingency table? What is the likelihood of observing just such a distribution assuming the probability of being in the two groups is known? One way to determine this […]
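The excerpt is cut off before it names a method, so the following is only one plausible reading: a sketch of the exact (hypergeometric) probability of a 2×2 table with its row and column totals held fixed, which is the calculation behind Fisher's exact test. The function names and the example counts are hypothetical.

```python
from math import comb


def table_probability(a, b, c, d):
    """Probability of the table [[a, b], [c, d]] given fixed row and column totals."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)


def one_sided_p_value(a, b, c, d):
    """Sum the probabilities of the observed table and of tables with a larger top-left cell."""
    row1, row2, col1 = a + b, c + d, a + c
    total = 0.0
    for k in range(a, min(row1, col1) + 1):
        # Keep the margins fixed while the top-left cell grows to k.
        total += table_probability(k, row1 - k, col1 - k, row2 - (col1 - k))
    return total


# Hypothetical counts, for illustration only.
p_table = table_probability(3, 1, 1, 3)
p_value = one_sided_p_value(3, 1, 1, 3)
print(p_table, p_value)
```

A small one-sided p-value would suggest the two classifications are related. With counts this small, the exact calculation avoids the large-sample approximation that a chi-squared test relies on.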

Oftentimes, one will observe data cluster around two different points. This distribution is known as a bimodal distribution. A bimodal distribution could arise, for instance, when patients have two choices of health care providers, and the data measure the share of times patients use one of the providers. To model the effect of different covariates on variable […]

