The History of Least Squares
March 23, 2009 in Books, Econometrics
Let us say you have 10 observations of 2 different variables. How do you determine which of the observations to use? Should you throw out the outliers? Should you only include the most similar values? Do more observations increase or decrease the amount of measurement error?
These questions can be answered by the discipline of statistics. An interesting book by Stigler recounts The History of Statistics. Astronomers led many of the statistical advances in the seventeenth and eighteenth centuries. Accurate measurement is very important to astronomers. Further, observations of the circumference and oblateness of the earth were made at different times and places throughout history. This leaves a conundrum: how best to combine these observations?
Mayer, Boscovich, and others contributed to the development of the idea of least squares, but Stigler credits Legendre with its invention. Legendre came up with the idea in his attempt to measure the length of the meridian quadrant (the distance from the equator to the North Pole) through Paris.
To demonstrate some of his ideas, I will use a simpler example. Let us assume that a drug can have a dosage level between 0 and 5 and we want to find its impact on health (measured on a 0-10 scale). Let us look at the following data. The goal is to find the parameters m (slope) and b (intercept) that most accurately describe the relationship between drug dosage and health (ignore any questions of endogeneity). Should we include all 10 observations?
Although Euler recognized that including more observations increases the maximum possible error, Legendre realized that adding more observations also greatly increases the probability of getting close to the true value of the parameters of interest.
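The tension between these two views can be illustrated with a small simulation. This is only a sketch under assumed conditions, not something taken from Stigler's book: the true value (5.0), the uniform error bound (1.0), and the function names `worst_case_total_error` and `spread_of_average` are all hypothetical. It shows that the worst-case error accumulated over n observations grows with n, while the average of those observations tends to land closer to the true value.

```python
import random
import statistics


def worst_case_total_error(n_obs, max_err=1.0):
    """Largest possible accumulated error if every observation errs by max_err."""
    return n_obs * max_err


def spread_of_average(n_obs, trials=2000, true_value=5.0, max_err=1.0, seed=0):
    """Standard deviation of the average of n_obs noisy measurements.

    Each measurement is true_value plus an error drawn uniformly from
    [-max_err, max_err] (an assumed error model, chosen for simplicity).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    averages = []
    for _ in range(trials):
        sample = [true_value + rng.uniform(-max_err, max_err) for _ in range(n_obs)]
        averages.append(statistics.mean(sample))
    return statistics.pstdev(averages)


for n in (2, 10, 50):
    print(
        f"n={n:2d}  worst-case total error={worst_case_total_error(n):5.1f}  "
        f"spread of average={spread_of_average(n):.3f}"
    )
```

The first column of output grows with n while the second shrinks, which is one way to read why combining many observations pays off even though each one adds its own error.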
In my example, we need to fit a line to estimate the parameters m and b. How do we treat the errors so that we get the most accurate estimates? Laplace believed that two conditions would need to hold.
The first condition basically says that the errors are uncorrelated with the independent variables on average. The second condition aims to minimize the errors. Legendre extended Laplace’s second condition to minimize the sum of the squared errors rather than just the absolute error level.
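To make the squared-error criterion concrete, here is a minimal sketch in Python. The ten (dosage, health) pairs are made-up illustrative values, not the data from the example above, and the helper names `fit_line` and `sum_squared_errors` are assumptions introduced for illustration. The sketch uses the textbook closed-form solution for a least squares line, then checks two things: that the fitted errors sum to approximately zero (one common way of stating the first condition, treated here as an assumption), and that nudging the slope or intercept away from the fitted values only increases the sum of squared errors.

```python
def fit_line(xs, ys):
    """Closed-form least squares fit of y = m*x + b."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    m = sxy / sxx
    b = y_bar - m * x_bar
    return m, b


def sum_squared_errors(xs, ys, m, b):
    """Sum of squared vertical distances between the data and the line."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data for illustration only (not the post's original data).
dosage = [0, 1, 1, 2, 2, 3, 3, 4, 5, 5]
health = [3, 4, 5, 5, 6, 7, 6, 8, 9, 8]

m, b = fit_line(dosage, health)
errors = [y - (m * x + b) for x, y in zip(dosage, health)]
best = sum_squared_errors(dosage, health, m, b)

print(f"m = {m:.4f}, b = {b:.4f}")
print(f"sum of errors = {sum(errors):.2e}")
print(f"sum of squared errors at the fit = {best:.4f}")

# Any small change to m or b should make the squared-error total worse.
for dm, db in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]:
    worse = sum_squared_errors(dosage, health, m + dm, b + db)
    print(f"shift m by {dm:+.1f}, b by {db:+.1f}: {worse:.4f}")
```

Squaring the errors gives a smooth criterion with a single minimum that can be written down in closed form, which is what `fit_line` relies on.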
Another key point is that this regression line must go through the “center of gravity.” In my example, the average dosage for the ten observations is 2.2 and the average health level is 5.9. This means the center of gravity is at the coordinates (2.2, 5.9). The solution in my example is to set m = 1.1456 and b = 3.3797. We see that if we plug 2.2 into the equation, the output is approximately 5.9 (1.1456 × 2.2 + 3.3797 ≈ 5.9000); thus, the regression line does indeed go through the center of gravity.
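This arithmetic can be checked directly. The sketch below uses only the numbers reported above (m = 1.1456, b = 3.3797, mean dosage 2.2, mean health 5.9); the variable names are illustrative. Because the coefficients are rounded to four decimals, the match is approximate, so the comparison uses a small tolerance. The property itself follows from the least squares intercept formula b = (mean of y) - m × (mean of x), which forces the line through the point of means.

```python
import math

# Values as reported in the example above.
m = 1.1456          # slope
b = 3.3797          # intercept
mean_dosage = 2.2   # average dosage of the ten observations
mean_health = 5.9   # average health level of the ten observations

# Evaluate the fitted line at the mean dosage.
predicted = m * mean_dosage + b
print(f"predicted health at mean dosage: {predicted:.4f}")

# Allow for rounding in the reported coefficients.
on_line = math.isclose(predicted, mean_health, abs_tol=1e-3)
print("passes through center of gravity:", on_line)
```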
Understanding the historical development of modern statistical techniques is an interesting task, and Stigler’s book enlightens the reader with much detail.
Tags: Books, Econometrics, Least Squares, Statistics