Welcome to the MHE Blog. We'll use this space to post corrections and comments and to address any questions you might have about the material in Mostly Harmless Econometrics. We especially welcome questions that are likely to be of interest to other readers. The econometric universe is infinite and expanding, so we ask that questions be brief and relate to material in the book. Here is an example:
    "In Section 8.2.3, (on page 319), you suggest that 42 clusters is enough for the usual cluster variance formula to be fairly accurate. Is that a joke, or do you really think so?"
    To which we'd reply: "Yes."

Hey baby, what’s your name?

Noemi Banerjee-Duflo!


RD News

Help is at hand from Calonico, Cattaneo, and Titiunik

Three (3) new Stata commands, no less!

rdrobust: new robust, bias-corrected confidence intervals

rdbwselect: bandwidth selection this way and that

rdbinselect: Automated binwidth selection for those figs where you’re not doing any smoothing


Signs of aging …

From: Martin Van der Linden
In chapter 1, page 6, you mention the case of the effect of start age
on results in school as an example of a fundamentally unidentified question.

Do you mean that what cannot be assessed experimentally is the very
effect of starting school later because a student who starts school
at 6 and is perfectly identical in all dimensions but start age to
another student starting school at 7 cannot be found?
If true, is there any reason we would like to measure this pure
effect of start age independently of maturation effect? Isn't it
precisely maturation effect we try to measure when thinking about
start age?

Yes! Any first grader who started at 7 will be older on test day than
a first grader who started at 6.
Since we think there are big age-at-test effects,
the comparison of test scores between these two is misleading.

Many school districts would like to boost their test scores and are
tempted to go for an older start age to do it. 
Older start ages will indeed boost scores
(suppose you couldn't enter first grade until after your bar mitzvah ...) 
But that fact doesn't mean older entrants are learning more; they
might well do worse (e.g., by virtue of the dropout age mechanism
detailed in Angrist and Krueger 1991).


Our Chicago Connection


Thanks Austin!


Why children succeed


Probit better than LPM?

From Mark Schaffer:

Question: Dave Giles, in his econometrics blog, has spent a few blog entries attacking the linear probability model.



The first of these is the more convincing (at least for me): he cites Horrace & Oaxaca (2006), who show that the LPM will usually generate biased and inconsistent estimates. Bias doesn’t bother me so much, but inconsistency does, especially as it apparently carries over to estimates of the marginal effects.

Dave’s conclusion is that one should use probit or logit unless there are really good reasons not to (e.g., endogenous dummies or with panel data).

You’ve been staunch defenders of estimating the LPM using OLS, so I’d be very interested to see your views on this.

Best wishes,

Mark Schaffer

There are three arguments here: (1) the LPM does not estimate the structural parameters of a non-linear model (Horrace and Oaxaca, 2006); (2) the LPM does not give consistent estimates of the marginal effects (Giles blog 1); and (3) the LPM does not lend itself to dealing with measurement error in the dependent variable (Giles blog 2).

The structural parameters of a binary choice model, just like the probit index coefficients, are not of particular interest to us. We care about the marginal effects. The LPM will do a pretty good job estimating those. If the CEF is linear, as it is for a saturated model, regression gives the CEF, even for the LPM. If the CEF is non-linear, regression approximates the CEF. Usually it does so pretty well.

Obviously, the LPM won’t give the true marginal effects from the right nonlinear model. But then, the same is true for the “wrong” nonlinear model! The fact that we have a probit, a logit, and the LPM is just a testament to the fact that we don’t know what the “right” model is. Hence, there is a lot to be said for sticking to a linear regression function as compared to a fairly arbitrary choice of a non-linear one! Nonlinearity per se is a red herring.

As for measurement error, we would welcome seeing more applied work taking this seriously. Of course, plain vanilla probit is not the answer.
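The saturated-model point can be checked numerically. The sketch below is illustrative only: the two binary covariates, the logistic data-generating process, and all variable names are assumptions made for the example, not anything taken from the book. It fits a saturated LPM by OLS and compares the fitted values with the sample mean of the outcome in each covariate cell.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical data: two binary covariates and a binary outcome whose
# true response probabilities come from a nonlinear (logistic) model.
x1 = rng.integers(0, 2, size=n)
x2 = rng.integers(0, 2, size=n)
index = -1.0 + 1.5 * x1 + 0.5 * x2 + 1.0 * x1 * x2
p = 1.0 / (1.0 + np.exp(-index))
y = (rng.random(n) < p).astype(float)

# Saturated LPM: intercept, both main effects, and their interaction,
# i.e. one free parameter per covariate cell.
X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta

# Compare the OLS fitted value in each cell with the cell's sample mean of y.
cell_means = {}
ols_fits = {}
for a in (0, 1):
    for b in (0, 1):
        mask = (x1 == a) & (x2 == b)
        cell_means[(a, b)] = y[mask].mean()
        ols_fits[(a, b)] = fitted[mask][0]
        print(f"x1={a}, x2={b}: cell mean={cell_means[(a, b)]:.4f}, "
              f"OLS fit={ols_fits[(a, b)]:.4f}")
```

In each cell the two numbers should agree up to floating-point error, even though the outcome is binary and the underlying probabilities are nonlinear in the index. With continuous covariates the model is no longer saturated and OLS only approximates the CEF.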



Fixed effects (DD) and LDV bracketing example

On p. 246, we reference Guryan (2004) as a scenario where DD and lagged dependent variable methods bracket the causal effect of interest (see also Section 5.4) . . . except that the editors and/or referees wrote that out of Guryan’s script. This argument does appear, however, in his 2001 working paper, available here.


Fixed effects and lagged dependent variables again

chaos6174 asks:
In Section 5.3, Fixed Effects versus Lagged Dependent
Variables, you write that the fixed effects model can deal with the OVB
problem caused by time-invariant or group-invariant omitted variables, and
you also give some suggestions on strategies for handling the OVB problem
caused by omitted variables that may change over time in
practice. On whether to employ a fixed effects model or a lagged
dependent variable model, you suggest that researchers … "find
broadly similar results using both models". But if the estimated
coefficients on the explanatory variable of interest from the two
models differ tremendously, what should be done for the regressions
to make sense?

The fixed effects and lagged dependent variable models are different models, so they can give different results. We discuss this on pp. 245-46 in the book. If the results are very different, you could consider estimating a model with both fixed effects and a lagged dependent variable. As we discuss in the book, this is a challenging model to estimate. As you can see from our discussion, we don’t find the approaches needed to instrument for the lagged dependent variable all that compelling, so this is not a clean solution. You can also think about the simple FE and LDV results as bracketing the true effect. Of course, if the difference is large, this may not be particularly informative. If all else fails, there may not be much you can do other than find another approach, other data, or a better natural experiment to study the research question you are after.
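The bracketing idea can be illustrated with a small simulation. This is a sketch under stated assumptions, not the book's own example: it uses two periods, a positive treatment effect `rho`, and selection into treatment on low past outcomes (when the LDV model is true) or on low fixed effects (when the FE model is true). The function names and parameter values are hypothetical.

```python
import numpy as np

def ols(y, *cols):
    """Return OLS coefficients of y on an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(y)), *cols])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def simulate(truth, n=200_000, rho=1.0, theta=0.5, seed=0):
    """Simulate two periods under one true model; estimate with both."""
    rng = np.random.default_rng(seed)
    if truth == "ldv":
        # Truth: the outcome depends on its own lag; units with low
        # lagged outcomes are more likely to be treated.
        y0 = rng.normal(size=n)
        d = (y0 + rng.normal(size=n) < 0).astype(float)
        y1 = theta * y0 + rho * d + rng.normal(size=n)
    elif truth == "fe":
        # Truth: a time-invariant unit effect; units with a low
        # effect are more likely to be treated.
        alpha = rng.normal(size=n)
        d = (alpha + rng.normal(size=n) < 0).astype(float)
        y0 = alpha + rng.normal(size=n)
        y1 = alpha + rho * d + rng.normal(size=n)
    else:
        raise ValueError("truth must be 'ldv' or 'fe'")
    fe_hat = ols(y1 - y0, d)[1]   # first-difference (fixed effects) estimate
    ldv_hat = ols(y1, d, y0)[1]   # lagged dependent variable estimate
    return fe_hat, ldv_hat

results = {truth: simulate(truth) for truth in ("ldv", "fe")}
for truth, (fe_hat, ldv_hat) in results.items():
    print(f"true model={truth}: FE estimate={fe_hat:.3f}, "
          f"LDV estimate={ldv_hat:.3f}, true effect=1.0")
```

Under these assumptions the correctly specified estimator should land near the true effect, the FE estimate should be too large when the LDV model is true, and the LDV estimate should be too small when the FE model is true, so the two estimates bracket the truth in both cases. Which estimate is the upper bound depends on the sign of the effect and the direction of selection.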



Testing DD

Jessie asks: Is there a nonparametric (or parametric) test that can be
used to test whether the treatment and control group are similar in
difference-in-difference before the treatment occurs?  

For example, in the Card-Krueger minimum wage example (Figure 5.2.1),
would there be a test to check whether the trend in the employment
rate was statistically similar between PA and NJ before the 1992
minimum wage change?  

I can only think of doing a linear regression (or perhaps another
parametric form?) on the 2 series and testing whether the coefficient
on the time variable in the NJ regression pre-1992 is statistically
different from that of the PA regression pre-1992.  

Thank you!

Good question, Jessie. You can think of
the "Granger causality" approach taken in Autor (2003)
and described in Chapter 5 as a version of the
test you have in mind.  If there appears to be
a treatment effect before treatment, that's evidence of
diverging trends.
Ultimately, what matters most is whether allowing for differential
trends changes results in a meaningful way, not whether the trends
themselves are statistically significant.  Chapter 5 suggests
adding linear group-specific trends (e.g., state-specific trends)
as a spec test.

Note that this won't work in the original NJ/PA min wage study.
Why not?
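The lead-style check described above can be sketched on simulated data. Everything here is an assumption made for illustration: a hypothetical panel with one treated and one control group, six periods, treatment starting in period 4, and parallel trends by construction. The regression interacts the treated-group dummy with every period except the last pre-treatment period, so the pre-treatment interactions play the role of "effects before treatment".

```python
import numpy as np

rng = np.random.default_rng(0)
n_per_group, n_periods, first_treated, effect = 5_000, 6, 4, 2.0

# Long-format panel: each unit appears once per period.
n_units = 2 * n_per_group
unit_treated = np.repeat([0.0, 1.0], n_per_group)   # one entry per unit
treated = np.repeat(unit_treated, n_periods)        # one entry per unit-period
period = np.tile(np.arange(n_periods), n_units)
post = (period >= first_treated).astype(float)

# Parallel trends by construction: a common time trend, a level gap
# between groups, and a treatment effect only after treatment starts.
y = (1.0 * treated + 0.5 * period + effect * treated * post
     + rng.normal(size=treated.size))

# Omit the last pre-treatment period as the reference category.
ref = first_treated - 1
event_periods = [t for t in range(n_periods) if t != ref]

cols = [np.ones_like(y), treated]
cols += [(period == t).astype(float) for t in range(1, n_periods)]
cols += [treated * (period == t) for t in event_periods]
X = np.column_stack(cols)

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
interactions = dict(zip(event_periods, coef[-len(event_periods):]))

for t, b in interactions.items():
    label = "lead (pre)" if t < first_treated else "post"
    print(f"period {t} [{label}]: treated x period coefficient = {b:.3f}")
```

Here the pre-treatment coefficients should be close to zero and the post-treatment ones close to the simulated effect. The sketch reports point estimates only; an applied version would add standard errors, clustered as discussed in Chapter 8, before judging whether the leads matter.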



QOB Qonfusion

Ilyssa wonders

Question: In Table 4.1.1 (p. 124), how are there 30 instruments in Column 8 rather than 27 (= 3 qob dummies * 9 year of birth dummies)?

Why indeed?  There are still 3 QOB main effects.
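The count can be spelled out in a few lines. The dummy names below are hypothetical labels for the three quarter-of-birth dummies and nine year-of-birth dummies mentioned in the question; only the counts matter.

```python
from itertools import product

# Hypothetical labels: 3 quarter-of-birth dummies, 9 year-of-birth dummies.
qob_dummies = [f"qob{q}" for q in (1, 2, 3)]
yob_dummies = [f"yob{k}" for k in range(1, 10)]

main_effects = list(qob_dummies)
interactions = [f"{q}*{y}" for q, y in product(qob_dummies, yob_dummies)]
instruments = main_effects + interactions

print(len(interactions), len(instruments))  # prints: 27 30
```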

