Often, when doing data analysis, you want to find the relationship between two variables. The first step is typically to draw a scatterplot. To better understand this relationship, however, it is useful to fit a line to the scatterplot. Most commonly, this is done with a simple linear regression (i.e., ordinary least squares (OLS) […]
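
As a rough illustration of fitting a line to a scatterplot, here is a minimal sketch of simple OLS with one explanatory variable, using the textbook closed-form slope and intercept. The function name `fit_line` and the toy data are made up for this example and do not come from the original post.

```python
def fit_line(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Hypothetical helper for illustration; assumes x has at least
    two distinct values and len(x) == len(y).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = sample covariance of (x, y) divided by sample variance of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Toy data lying exactly on y = 1 + 2x.
print(fit_line([1, 2, 3, 4], [3, 5, 7, 9]))  # (1.0, 2.0)
```

With real data the points will not lie exactly on a line, and the fitted line is the one that minimizes the sum of squared vertical distances to the points.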

## What are regression trees?

Regression trees are a way to partition your explanatory variables to (potentially) better predict an outcome of interest. Regression trees start with an outcome (let’s call it y) and a vector of explanatory variables (X). Simple example: let y be health care spending and X=(X1,X2), where X1 is the patient’s age and X2 is the patient’s […]
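
To make the idea of partitioning concrete, here is a minimal sketch of how a single split on one explanatory variable might be chosen: try each candidate threshold and keep the one that minimizes the total within-group sum of squared errors. This is a common splitting criterion, assumed here for illustration; the function names and the age/spending numbers are invented and are not from the original post.

```python
def sse(values):
    """Sum of squared deviations from the mean (0 for an empty group)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_split(x, y):
    """Return (threshold, total_sse) for the split 'x <= threshold'
    that minimizes the summed SSE of y in the two resulting groups.

    Hypothetical helper; assumes x has at least two distinct values.
    """
    best = None
    xs = sorted(set(x))
    # Candidate thresholds are midpoints between adjacent distinct x values.
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi > t]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = (t, total)
    return best


# Made-up data: spending jumps between the younger and older patients.
age = [25, 30, 35, 60, 65, 70]
spending = [1, 1, 1, 5, 5, 5]
print(best_split(age, spending))  # (47.5, 0.0)
```

A full regression tree would repeat this search within each resulting group, and across every explanatory variable, until some stopping rule is met.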

## Synthetic Control Method

A common method for measuring the effect of policy interventions is the difference-in-differences (DiD) approach. In essence, one examines the change in outcomes among observations subject to the policy intervention and compares it against the change among observations that were not eligible for the policy intervention. A key assumption for this approach to be valid is […]
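
The DiD comparison described above can be sketched in its simplest two-group, two-period form: the change in the mean outcome for the treated group minus the change for the comparison group. The function name `did_estimate` and the numbers are hypothetical, chosen only to show the arithmetic.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period difference-in-differences of mean outcomes.

    Hypothetical helper for illustration; each argument is a list of
    outcomes for one group in one period.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Made-up outcomes: treated group rises by 5, comparison group by 2.
effect = did_estimate(
    treated_pre=[10, 12], treated_post=[15, 17],
    control_pre=[8, 10], control_post=[10, 12],
)
print(effect)  # 3
```

The comparison group's change stands in for what would have happened to the treated group without the policy, which is why the validity of the estimate rests on assumptions about that group.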

## Confirmation Bias

HT: Incidental Economist.

## What are cure fraction models?

Many people are familiar with survival models. Survival models measure the probability of survival to a given time period. The “problem” addressed by these models is that some people are “censored”; in other words, they do not die in the sample time period. Although longer survival is good in practice, for statisticians it is problematic […]

## What is attribute non-attendance?

In discrete choice experiments (DCEs), respondents are asked to choose among options that vary across several attributes. For instance, a DCE on mobile phone preferences could have processor speed, battery life, screen size and cost as attributes. A DCE looking at different treatments could have expected survival, anticipated side effects and cost as attributes. […]

## Rankings and Kendall’s W

How can you measure how similar two rankings are? For instance, US News and Consumer Reports may both rank hospitals. If they have identical rankings, then they are obviously the same. However, what if the rankings differ for 2 hospitals? For 4 hospitals? How can one quantify the similarity of rankings? One method for doing so […]
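
As a sketch of how such a similarity measure can be computed, the code below implements the standard no-ties formula for Kendall's coefficient of concordance, W = 12S / (m²(n³ − n)), where m is the number of raters, n is the number of items, and S is the sum of squared deviations of the items' rank totals from their mean. The function name `kendalls_w` and the example rankings are assumptions made for illustration; tied ranks are not handled.

```python
def kendalls_w(rankings):
    """Kendall's W for m complete rankings of the same n items.

    Hypothetical helper; rankings is a list of m lists, where
    rankings[j][i] is the rank (1..n) rater j gives item i.
    Assumes no tied ranks and n >= 2.
    """
    m = len(rankings)
    n = len(rankings[0])
    # Total rank each item receives across all raters.
    totals = [sum(r[i] for r in rankings) for i in range(n)]
    mean_total = sum(totals) / n
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


print(kendalls_w([[1, 2, 3], [1, 2, 3]]))  # 1.0  (identical rankings)
print(kendalls_w([[1, 2, 3], [1, 3, 2]]))  # 0.75 (two items swapped)
print(kendalls_w([[1, 2, 3], [3, 2, 1]]))  # 0.0  (opposite rankings)
```

W ranges from 0 (no agreement in rank totals) to 1 (complete agreement), so it gives a single number for how closely the rankings line up.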

## Longer trials or larger sample size?

Developing drugs is expensive. Some estimates put the cost of bringing a drug to market at $1 billion. In addition, payers are now reimbursing based on the perceived value of a treatment. That is, treatments that provide more health benefits receive higher reimbursements. In this world of value-based pricing (VBP), pharmaceutical companies have […]

## Type I and Type II Errors for Dummies

In previous posts, I have described in detail what constitutes Type I and Type II errors. This figure, however, conveys the concept much more succinctly. HT: Marginal Revolution.
