# Suppose that you want to analyze how cigarette smoking affects annual income (possibly through lost work days due to illness, or productivity effects)…

4. Suppose that you want to analyze how cigarette smoking affects annual income (possibly through lost work days due to illness, or productivity effects). To do so, you use the data Smoke.dat, which was originally used in Mullahy (1997). The file contains data on smoking behavior for a random sample of single adults from the United States. The number of observations equals 807 (persons). The file includes the following variables: educ (years of schooling), prcig (state cigarette price, cents per pack), white (=1 if white), age (in years), income (annual income, \$), cigs (cigarettes smoked per day), restaurn (=1 if the state has restaurant smoking restrictions).

(a) Consider the following model:

$$\log(\text{income}) = \beta_0 + \beta_1\,\text{cigs} + u_1$$

How do you interpret $\beta_1$ if $E[u_1 \mid \text{cigs}] = 0$?

(b) Estimate the model by OLS and discuss the estimate of $\beta_1$.
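As a rough illustration of the estimation step in (b), the sketch below computes OLS coefficients and conventional (homoskedastic) standard errors with NumPy. The layout of Smoke.dat is not described here, so the example runs on synthetic data instead. The helper name `ols`, the simulated variables, and the slope of -0.01 are all assumptions made for illustration and say nothing about the actual estimates.

```python
import numpy as np


def ols(y, X):
    """Return OLS coefficients and conventional standard errors.

    X must already contain a column of ones for the intercept.
    (Hypothetical helper, not part of the exercise.)
    """
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)          # unbiased error-variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)     # homoskedastic covariance matrix
    return beta, np.sqrt(np.diag(cov))


# Synthetic stand-in for the real data (NOT Smoke.dat): 807 draws with an
# invented true slope of -0.01, chosen only to make the example run.
rng = np.random.default_rng(0)
n = 807
cigs = rng.poisson(9, size=n).astype(float)
log_income = 9.8 - 0.01 * cigs + rng.normal(0.0, 0.7, size=n)

X = np.column_stack([np.ones(n), cigs])
beta_hat, se = ols(log_income, X)

print(f"beta1_hat = {beta_hat[1]:.4f} (se {se[1]:.4f})")
# In a log-level model, 100 * beta1 approximates the percentage change in
# income associated with one more cigarette per day.
print(f"approx. % change in income per extra cigarette: {100 * beta_hat[1]:.2f}")
```

With the real data, `cigs` and `log_income` would be replaced by the corresponding columns of Smoke.dat (taking the logarithm of income first).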

(c) Discuss the interpretation of $\hat{\beta}_1^{OLS}$ if consumption of cigarettes depends negatively on income.

(d) Consider now the extended model:

$$\log(\text{income}) = \beta_0 + \beta_1\,\text{cigs} + \beta_2\,\text{educ} + \beta_3\,\text{age} + \beta_4\,\text{age}^2 + u_2$$

Suppose that cigs is endogenous. Explain why prcig and restaurn are likely to be uncorrelated with $u_2$.

(e) Test whether prcig and restaurn are relevant instruments, and compute the two IV estimates obtained by using prcig and restaurn, respectively, as the instrument for cigs. Compare the resulting estimates of $\beta_1$ with the OLS estimate.
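The two steps in (e), checking instrument relevance and computing a just-identified IV estimate, can be sketched as follows. This is a minimal NumPy illustration on synthetic data, not a solution based on Smoke.dat. The helper names `iv_just_identified` and `first_stage_F`, the data-generating process, and all numeric values are assumptions. For brevity the demo uses only a constant as exogenous regressor, whereas the extended model would also place educ, age and age² in both `X` and `Z`.

```python
import numpy as np


def iv_just_identified(y, X, Z):
    """IV estimator (Z'X)^{-1} Z'y; needs as many instruments as regressors."""
    return np.linalg.solve(Z.T @ X, Z.T @ y)


def first_stage_F(x_endog, W, z):
    """F statistic for excluding the instrument(s) z from the first stage.

    W holds the exogenous regressors (including the constant);
    z holds the excluded instrument(s).
    """
    def ssr(M):
        b, *_ = np.linalg.lstsq(M, x_endog, rcond=None)
        r = x_endog - M @ b
        return r @ r

    full = np.column_stack([W, z])
    q = full.shape[1] - W.shape[1]            # number of excluded instruments
    ssr_r, ssr_u = ssr(W), ssr(full)
    return ((ssr_r - ssr_u) / q) / (ssr_u / (len(x_endog) - full.shape[1]))


# Synthetic data (NOT Smoke.dat). The shock v moves both smoking and income,
# which makes cigs endogenous; price shifts cigs but is independent of u.
# Values are invented and not clipped to a realistic range.
rng = np.random.default_rng(1)
n = 807
price = rng.normal(60.0, 5.0, size=n)
v = rng.normal(size=n)
cigs = 40.0 - 0.5 * price + 2.0 * v + rng.normal(0.0, 2.0, size=n)
u = -0.3 * v + rng.normal(0.0, 0.5, size=n)
log_income = 10.0 - 0.02 * cigs + u           # invented true slope: -0.02

const = np.ones((n, 1))
X = np.column_stack([const, cigs])            # regressors
Z = np.column_stack([const, price])           # instruments

F = first_stage_F(cigs, const, price)
beta_iv = iv_just_identified(log_income, X, Z)
beta_ols, *_ = np.linalg.lstsq(X, log_income, rcond=None)

print(f"first-stage F = {F:.1f}")
print(f"OLS slope = {beta_ols[1]:.4f}, IV slope = {beta_iv[1]:.4f}")
```

Repeating the last step with restaurn in place of `price` would give the second IV estimate. A first-stage F statistic well above the common rule-of-thumb value of 10 is usually read as evidence against a weak instrument.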