From Mark Schaffer:

Question: Dave Giles, in his econometrics blog, has spent a few blog entries attacking the linear probability model.

http://davegiles.blogspot.co.uk/2012/06/another-gripe-about-linear-probability.html

http://davegiles.blogspot.co.uk/2012/06/yet-another-reason-for-avoiding-linear.html

The first of these is the more convincing (at least for me): he cites Horrace and Oaxaca (2006), who show that the LPM will usually generate biased and inconsistent estimates. Bias doesn’t bother me so much, but inconsistency does, especially as it apparently carries over to estimates of the marginal effects.

Dave’s conclusion is that one should use probit or logit unless there are really good reasons not to (e.g., endogenous dummies or with panel data).

You’ve been staunch defenders of estimating the LPM using OLS, so I’d be very interested to see your views on this.

Best wishes,

Mark Schaffer

*There are three arguments here: (1) the LPM does not estimate the structural parameters of a non-linear model (Horrace and Oaxaca, 2006); (2) the LPM does not give consistent estimates of the marginal effects (Giles blog 1); and (3) the LPM does not lend itself to dealing with measurement error in the dependent variable (Giles blog 2). The structural parameters of a binary choice model, just like the probit index coefficients, are not of particular interest to us. We care about the marginal effects, and the LPM will do a pretty good job estimating those. If the CEF is linear, as it is for a saturated model, regression gives the CEF – even with a binary dependent variable. If the CEF is non-linear, regression approximates the CEF, and usually it does so pretty well. Obviously, the LPM won’t give the true marginal effects from the right nonlinear model. But then, neither will the wrong nonlinear model! The fact that we have a probit, a logit, and the LPM simply reflects the fact that we don’t know what the “right” model is. Hence, there is a lot to be said for sticking to a linear regression function rather than making a fairly arbitrary choice among non-linear ones. Nonlinearity per se is a red herring. As for measurement error, we would welcome seeing more applied work taking this seriously. Of course, plain vanilla probit is not the answer.*

*SP*
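The claim that OLS on a binary outcome approximates the marginal effects of the "right" nonlinear model is easy to check by simulation. Here is a minimal sketch (hypothetical data, not from the post): the true model is a logit, and the LPM slope from OLS lands very close to the logit average marginal effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical logit DGP: P(y=1 | x) = Lambda(0.5 * x), x standard normal.
x = rng.standard_normal(n)
p_true = 1.0 / (1.0 + np.exp(-0.5 * x))
y = (rng.random(n) < p_true).astype(float)

# LPM: OLS of y on a constant and x.
X = np.column_stack([np.ones(n), x])
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
lpm_slope = beta_ols[1]

# Fit the "right" logit by Newton-Raphson, then compute the
# average marginal effect (AME): mean of p*(1-p) times the index slope.
b = np.zeros(2)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-X @ b))
    W = p * (1.0 - p)
    b += np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (y - p))
p = 1.0 / (1.0 + np.exp(-X @ b))
logit_ame = np.mean(p * (1.0 - p)) * b[1]

print(lpm_slope, logit_ame)  # the two estimates are very close
```

With a normally distributed regressor, the population OLS slope actually equals the average derivative of the true CEF (by Stein's lemma), so the two numbers differ only by sampling noise here.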

## 3 Comments

Steve, I like your answer and just have a nerdy footnote.

In a completely randomized experiment with a binary outcome, if you want to adjust for covariates to improve precision, you can use either logit (with an average marginal effect calculation) or OLS to consistently estimate the average treatment effect, even if your model’s “wrong”. Probit doesn’t enjoy this robustness property.

The first-order conditions for OLS and the logit MLE imply a nice property: if you compute an “untreated” predicted probability for each person, using her actual covariate values but setting the treatment dummy to 0, then the average “untreated” prediction in the control group equals the raw control mean. In large enough samples, this will be very similar to the average “untreated” prediction in the full sample (since the distribution of covariates in the control group will resemble the distribution in the full sample). The latter is a regression-adjusted control mean. So we have an adjusted control mean that enjoys the same consistency properties as the raw control mean.

Similarly, we can compute a “treated” predicted probability for each person, and the resulting adjusted treatment group mean enjoys the same consistency properties as the raw treatment group mean. So the difference between the adjusted treatment and control group means is consistent for ATE. None of this depends on the model being correct.
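The first-order-condition property described above can be verified numerically. Below is a hedged sketch with hypothetical data: treatment is randomized, the true response surface is deliberately not logistic in (1, D, x), and yet the average "untreated" prediction among controls from the misspecified logit matches the raw control mean exactly (up to Newton convergence error).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

# Hypothetical randomized experiment with a binary outcome. The true
# probability involves x**2, so the logit in (1, D, x) below is wrong.
D = (rng.random(n) < 0.5).astype(float)          # random treatment
x = rng.standard_normal(n)
p_true = np.clip(0.3 + 0.15 * D + 0.1 * x + 0.1 * x**2, 0.01, 0.99)
y = (rng.random(n) < p_true).astype(float)

# Fit the (misspecified) logit of y on a constant, D, and x by Newton-Raphson.
X = np.column_stack([np.ones(n), D, x])
b = np.zeros(3)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-X @ b))
    W = p * (1.0 - p)
    b += np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (y - p))

# "Untreated" predictions: each person's own x, with D set to 0.
X0 = X.copy()
X0[:, 1] = 0.0
p0 = 1.0 / (1.0 + np.exp(-X0 @ b))

# FOC property: among controls, the average predicted probability
# equals the raw control mean of y.
ctrl = D == 0
print(p0[ctrl].mean(), y[ctrl].mean())  # these agree to machine precision
```

The logic is exactly the score equations: the FOCs for the intercept and the treatment dummy force the residuals to average zero overall and within the treated group, hence within the control group too. Swapping the logistic link for the probit link breaks this, which is the point of the comment.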

The probit MLE first-order conditions don’t imply the same nice property.

David Freedman gave a rigorous proof for logit in “Randomization does not justify logistic regression”. (The negative message is that you can’t just use predictions at the mean covariate values, and the coefficient on treatment doesn’t estimate anything meaningful if the model’s wrong. But diehard MHE fans already know that.)

Freedman also briefly discussed probits:

“On the other hand, with the probit, the plug-in estimators are unlikely to be consistent, since the analogs of the likelihood equations (16–18) below involve weighted averages rather than simple averages. In simulation studies … Numerical calculations also confirm inconsistency of the plug-in estimators [average marginal effect estimates from the probit], although the asymptotic bias is small.”

A couple other references:

D. Firth and K. Bennett (1998). “Robust models in probability sampling.” JRSSB 60: 3-21.

J. Wooldridge (2007). “Inverse probability weighted estimation for general missing data problems.” J. Econometrics 141: 1281-1301. (See section 6.2.)

Sorry my link to Freedman’s paper didn’t go through. It’s in Statistical Science, 2008, Vol. 23, No. 2, 237-249, and here’s an ungated preprint:

http://www.stat.berkeley.edu/~census/neylogit.pdf

As for the discussion of measurement error in the case of probit and logit, a good source of reading would be Hausman (2001), “Mismeasured Variables in Econometric Analysis: Problems from the Right and Problems from the Left”: http://www.aeaweb.org/articles.php?doi=10.1257/jep.15.4.57

It’s worth noting that you can end up with significant biases when the independent variables are mismeasured as well as the dependent variable.

## One Trackback

[...] are robust to any cdf for the error term. Jorn-Steffen Pischke at Mostly Harmless Econometrics points out that my gut is not wrong: “The structural parameters of a binary choice model, just like the probit index [...]