Class 10: Regression discontinuity I

# In-person<br>session 10

**March 24, 2022**


---

# Plan for today

.box-2.medium.sp-after-half[Diff-in-diff effect sizes]

.box-5.medium.sp-after-half[Miscellaneous R stuff]

.box-6.medium.sp-after-half[RDD fun times]

---

layout: false
name: ps5
class: center middle section-title section-title-2 animated fadeIn

# Diff-in-diff effect sizes

---


.box-2.large[What the heck is happening at<br>the end of problem set 5?!]

---

layout: false
name: r-stuff
class: center middle section-title section-title-5 animated fadeIn

# Miscellaneous R stuff

---


.box-5.large[Is there a way to make<br>the date update automatically<br>in the title area?]
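One possible approach, assuming the document is knit with R Markdown: compute the date at knit time and place the expression as inline R inside the `date:` field of the YAML header. The expression itself might look like this (the format string is just one choice, and `today_label` is a name made up for illustration):

```r
# Build a human-readable date when the document is knit.
# Sys.Date() returns today's date; format() turns it into text.
# "%B %d, %Y" gives month name, day, and four-digit year.
today_label <- format(Sys.Date(), "%B %d, %Y")
today_label
```

The output changes every time the document is knit, so no expected value is shown here.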

---

.box-5.large[Lines across categories]

---

.box-5.large[What do all those things like<br>"AIC" mean in model tables?]

.box-inv-5.medium[(And do we care about them?)]

???

<https://evalsp22.classes.andrewheiss.com/slides/02-class.html#16>

Goodness-of-fit statistics like R² and AIC describe how well the whole model fits the outcome, which is useful for prediction.

We don’t care much about that here. We care about a single predictor, X. The point of DAGs, quasi-experiments, etc. is identification, which is a theoretical issue rather than a numerical one, so you don’t need to maximize R² or minimize AIC.
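A minimal sketch of where these numbers come from, using the built-in `mtcars` data purely for illustration (it is not course data). It computes AIC by hand and contrasts the model-wide fit statistics with the single coefficient a causal question would focus on:

```r
# Fit a small model on a built-in dataset (illustration only)
model <- lm(mpg ~ wt + hp, data = mtcars)

# Goodness-of-fit statistics describe how well the whole model fits the outcome
r_squared <- summary(model)$r.squared
aic_builtin <- AIC(model)

# AIC by hand: -2 * log-likelihood + 2 * (number of estimated parameters)
ll <- logLik(model)
k <- attr(ll, "df")  # 3 coefficients + 1 residual variance
aic_manual <- -2 * as.numeric(ll) + 2 * k

c(r_squared = r_squared, aic_builtin = aic_builtin, aic_manual = aic_manual)

# A causal question would instead focus on one coefficient
coef(model)["wt"]
```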

---

.box-5.large[Can we control what<br>shows up in those tables?]
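A hedged sketch, assuming the tables are built with the `modelsummary` package: its `coef_rename` and `gof_omit` arguments control which labels and fit statistics appear. The models and labels below are made up for illustration, and the table is only built if the package is installed.

```r
# Two illustrative models on a built-in dataset (not course data)
models <- list(
  "Simple" = lm(mpg ~ wt, data = mtcars),
  "With control" = lm(mpg ~ wt + hp, data = mtcars)
)

if (requireNamespace("modelsummary", quietly = TRUE)) {
  modelsummary::modelsummary(
    models,
    # Give the coefficients friendlier labels
    coef_rename = c("wt" = "Weight", "hp" = "Horsepower"),
    # Regular expression: drop any fit statistic whose name matches
    gof_omit = "AIC|BIC|Log.Lik",
    output = "markdown"
  )
}
```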

---

layout: false
name: rdd
class: center middle section-title section-title-6 animated fadeIn

# RDD fun times

---


.box-6.medium[With RDD we rely on "the rule" to<br>determine treatment and control groups]

.box-6[How do you decide on the rule?<br>You mentioned that it's arbitrary—<br>we can choose whatever rule we want?]

---

.box-6.medium[Can we use RDD to evaluate a program<br>that doesn't have a rule for participation?]

---

.box-6.medium[Is there a rule of thumb to determine which<br>quasi-experimental method we should use?]

.box-6.medium[How do we know which method applies<br>to which circumstance? Does the data tell us?]

---

.pull-left-narrow[
<figure>
  <img src="img/10-class/vigdor.png" alt="Jake Vigdor working paper" title="Jake Vigdor working paper" width="100%">
</figure>
]

.pull-right-wide.small[
> Teachers in North Carolina Public schools earn a bonus of $750 if the students in their school meet a standard called "expected growth." A summary statistic called "average growth" is computed for each school; the expected growth standard is met when this summary measure exceeds zero.

> Does getting a bonus in year `$t$` cause improved student performance in year `$t + 1$`?
]

---

.box-6.large[How common are these kinds of rules<br>in the real world?]

???

- Anything income-based or means-tested - sliding scale community health clinics, school truancy programs
- Anything with a test: SAT/ACT, AIG programs
- Elections - causal effect of candidates
- Grades - 89.49 vs. 89.51
- Poverty, EITC

---

.center[
<figure>
  <img src="img/10-class/goodreads.png" alt="Goodreads" title="Goodreads" width="80%">
</figure>
]

---

.box-6.medium[Where do these eligibility thresholds come from? Do policy makers research them first and reexamine them later?]

---


# Discontinuities everywhere!

.pull-left-wide.small[
<table>
 <thead>
  <tr>
   <th style="text-align:center;"> Size </th>
   <th style="text-align:center;"> Annual </th>
   <th style="text-align:center;"> Monthly </th>
   <th style="text-align:center;"> 138% </th>
   <th style="text-align:center;"> 150% </th>
   <th style="text-align:center;"> 200% </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:center;"> 1 </td>
   <td style="text-align:center;"> $12,760 </td>
   <td style="text-align:center;"> $1,063 </td>
   <td style="text-align:center;"> $17,609 </td>
   <td style="text-align:center;"> $19,140 </td>
   <td style="text-align:center;"> $25,520 </td>
  </tr>
  <tr>
   <td style="text-align:center;"> 2 </td>
   <td style="text-align:center;"> $17,240 </td>
   <td style="text-align:center;"> $1,437 </td>
   <td style="text-align:center;"> $23,791 </td>
   <td style="text-align:center;"> $25,860 </td>
   <td style="text-align:center;"> $34,480 </td>
  </tr>
  <tr>
   <td style="text-align:center;"> 3 </td>
   <td style="text-align:center;"> $21,720 </td>
   <td style="text-align:center;"> $1,810 </td>
   <td style="text-align:center;"> $29,974 </td>
   <td style="text-align:center;"> $32,580 </td>
   <td style="text-align:center;"> $43,440 </td>
  </tr>
  <tr>
   <td style="text-align:center;"> 4 </td>
   <td style="text-align:center;"> $26,200 </td>
   <td style="text-align:center;"> $2,183 </td>
   <td style="text-align:center;"> $36,156 </td>
   <td style="text-align:center;"> $39,300 </td>
   <td style="text-align:center;"> $52,400 </td>
  </tr>
  <tr>
   <td style="text-align:center;"> 5 </td>
   <td style="text-align:center;"> $30,680 </td>
   <td style="text-align:center;"> $2,557 </td>
   <td style="text-align:center;"> $42,338 </td>
   <td style="text-align:center;"> $46,020 </td>
   <td style="text-align:center;"> $61,360 </td>
  </tr>
  <tr>
   <td style="text-align:center;"> 6 </td>
   <td style="text-align:center;"> $35,160 </td>
   <td style="text-align:center;"> $2,930 </td>
   <td style="text-align:center;"> $48,521 </td>
   <td style="text-align:center;"> $52,740 </td>
   <td style="text-align:center;"> $70,320 </td>
  </tr>
  <tr>
   <td style="text-align:center;"> 7 </td>
   <td style="text-align:center;"> $39,640 </td>
   <td style="text-align:center;"> $3,303 </td>
   <td style="text-align:center;"> $54,703 </td>
   <td style="text-align:center;"> $59,460 </td>
   <td style="text-align:center;"> $79,280 </td>
  </tr>
  <tr>
   <td style="text-align:center;"> 8 </td>
   <td style="text-align:center;"> $44,120 </td>
   <td style="text-align:center;"> $3,677 </td>
   <td style="text-align:center;"> $60,886 </td>
   <td style="text-align:center;"> $66,180 </td>
   <td style="text-align:center;"> $88,240 </td>
  </tr>
</tbody>
</table>
]

.pull-right-narrow[

.box-inv-6.smaller[**ACA subsidies**<br>138–400%*]

.box-inv-6.smaller[**CHIP**<br>200%]

.box-inv-6.smaller[**SNAP/Free lunch**<br>130%]

.box-inv-6.smaller[**Reduced lunch**<br>130–185%]
]
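The columns in the table are simple arithmetic on the annual guideline. A short sketch that rebuilds them and checks one household against a cutoff (the household of 3 earning $30,000 is hypothetical):

```r
# Annual poverty guideline by household size, copied from the table above
annual <- c(12760, 17240, 21720, 26200, 30680, 35160, 39640, 44120)

thresholds <- data.frame(
  size = 1:8,
  annual = annual,
  monthly = round(annual / 12),
  pct_138 = round(annual * 1.38),
  pct_150 = round(annual * 1.50),
  pct_200 = round(annual * 2.00)
)
thresholds

# Hypothetical household of 3 earning $30,000: is it at or below the 138% line?
household_income <- 30000
under_138 <- household_income <= thresholds$pct_138[thresholds$size == 3]
under_138
```

That household sits $26 above the 138% line of $29,974, which is exactly the kind of near-miss that makes a discontinuity useful.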

---

# The US's official poverty measure

.pull-left.center[
<figure>
  <img src="img/10-class/orshansky.jpg" alt="Mollie Orshansky" title="Mollie Orshansky" width="70%">
  <figcaption>Mollie Orshansky</figcaption>
</figure>
]

???

- <https://www.census.gov/topics/income-poverty/poverty/about/history-of-the-poverty-measure.html>
- <https://www.ssa.gov/policy/docs/ssb/v68n3/v68n3p79.html>

---

# The US's official poverty measure

.box-6.medium[**1955 annual food budget × 3**]

<br>

---


.center[
<iframe width="800" height="450" src="https://www.youtube.com/embed/q9EehZlw-zk" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
]

---

.center[
<figure>
  <img src="img/10-class/eitc-phaseout.png" alt="EITC phase out" title="EITC phase out" width="75%">
</figure>
]

---

.center[
<figure>
  <img src="img/10-class/ctc-phase-out.jpg" alt="CTC phase out" title="CTC phase out" width="75%">
</figure>
]

---

.box-6.medium[What if there are multiple cutoffs?]

---

.pull-left[
<figure>
  <img src="img/10-class/one-running-var.png" alt="One running variable" title="One running variable" width="100%">
</figure>
]

.pull-right[
<figure>
  <img src="img/10-class/multiple-running-vars.png" alt="Multiple running variables" title="Multiple running variables" width="100%">
</figure>
]

---

.box-6.large[Why do we center<br>the running variable?]

---

.box-6.large[Regression is just fancy averages!]
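A quick illustration with the built-in `mtcars` data (chosen only because it ships with R): with a single 0/1 predictor, the regression coefficients are group averages and their difference.

```r
# am is a 0/1 indicator (0 = automatic, 1 = manual transmission)
group_means <- tapply(mtcars$mpg, mtcars$am, mean)
model <- lm(mpg ~ am, data = mtcars)

# Intercept = average mpg in the am == 0 group
# Coefficient on am = difference between the two group averages
coef(model)
group_means["0"]
group_means["1"] - group_means["0"]
```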

---


```r
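library(tidyverse)  # filter() and %>%
library(broom)      # tidy()
# Uncentered running variable; keep scores within 10 points of the cutoff (70)
lm(exit_exam ~ entrance_exam + tutoring,
   data = filter(tutoring, entrance_exam <= 80,
                 entrance_exam >= 60)) %>%
  tidy()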
```

```
## # A tibble: 3 × 5
##   term          estimate std.error statistic  p.value
##   <chr>            <dbl>     <dbl>     <dbl>    <dbl>
## 1 (Intercept)     33.2       8.64       3.84 1.43e- 4
## 2 entrance_exam    0.388     0.114      3.40 7.45e- 4
## 3 tutoringTRUE     9.27      1.31       7.09 6.27e-12
```

---

```r
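# Center the running variable so that 0 marks the cutoff (70)
tutoring_centered <- tutoring %>%
  mutate(entrance_centered = entrance_exam - 70)

# Same model and bandwidth; only the intercept changes
lm(exit_exam ~ entrance_centered + tutoring,
   data = filter(tutoring_centered, entrance_exam <= 80,
                 entrance_exam >= 60)) %>%
  tidy()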
```

```
## # A tibble: 3 × 5
##   term              estimate std.error statistic   p.value
##   <chr>                <dbl>     <dbl>     <dbl>     <dbl>
## 1 (Intercept)         60.4       0.752     80.3  2.99e-249
## 2 entrance_centered    0.388     0.114      3.40 7.45e-  4
## 3 tutoringTRUE         9.27      1.31       7.09 6.27e- 12
```
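In the two tables, the slope and the tutoring effect are identical. Only the intercept moves, and the centered intercept is the uncentered one shifted to the cutoff: 33.2 + 0.388 × 70 ≈ 60.4. A sketch with simulated data (the `tutoring` data is not reproduced here, so the numbers, the rule, and the name `sim` are all made up) checks the same identity:

```r
set.seed(1234)
n <- 500
entrance_exam <- runif(n, 40, 100)
tutoring <- entrance_exam <= 70   # hypothetical rule: 70 or below gets tutoring
exit_exam <- 35 + 0.4 * entrance_exam + 10 * tutoring + rnorm(n, sd = 5)
sim <- data.frame(entrance_exam, tutoring, exit_exam,
                  entrance_centered = entrance_exam - 70)

raw <- coef(lm(exit_exam ~ entrance_exam + tutoring, data = sim))
centered <- coef(lm(exit_exam ~ entrance_centered + tutoring, data = sim))

# Slope and tutoring effect are unchanged by centering
raw[c("entrance_exam", "tutoringTRUE")]
centered[c("entrance_centered", "tutoringTRUE")]

# Centered intercept = raw intercept + 70 * slope,
# i.e. the predicted outcome for untutored people right at the cutoff
raw[["(Intercept)"]] + 70 * raw[["entrance_exam"]]
centered[["(Intercept)"]]
```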

---


.box-6.large[What's the difference between weighting with kernels and inverse probability weighting?]

???

- <https://evalsp22.classes.andrewheiss.com/slides/07-slides.html#122>
- <https://evalsp22.classes.andrewheiss.com/slides/10-slides.html#87>
- <https://evalsp22.classes.andrewheiss.com/slides/10-slides.html#95>
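
A rough sketch of the contrast, with simulated data and made-up names throughout. Kernel weights depend only on how far an observation is from the cutoff. Inverse probability weights depend on the predicted probability of treatment given confounders.

```r
set.seed(1234)
n <- 200
running <- runif(n, 40, 100)   # hypothetical running variable
cutoff <- 70
bandwidth <- 10

# Kernel weights (triangular): 1 at the cutoff, falling to 0 at the bandwidth edge
kernel_weight <- pmax(0, 1 - abs(running - cutoff) / bandwidth)

# IPW: model treatment from a confounder, then weight by 1 / P(treatment received)
confounder <- rnorm(n)
treated <- rbinom(n, 1, plogis(0.8 * confounder))
propensity <- fitted(glm(treated ~ confounder, family = binomial))
ipw <- treated / propensity + (1 - treated) / (1 - propensity)

summary(kernel_weight)
summary(ipw)
```

Either set of weights could then be passed to the `weights` argument of `lm()`, but they answer different questions: kernels emphasize people near the cutoff, while IPW rebalances treated and untreated groups on confounders.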

---

.box-6.medium[There must be some math behind the nonparametric lines. Should we care about that or should we just trust R?]

???

- <https://evalsp22.classes.andrewheiss.com/slides/10-slides.html#75>
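
A minimal sketch of the idea on simulated data: a nonparametric line is many small weighted regressions, one around each point. The by-hand version below uses a triangular kernel and a straight line, which is not exactly what `loess()` does internally, so the two estimates should be close but not identical. The point `x0` and bandwidth `h` are arbitrary choices.

```r
set.seed(1234)
x <- seq(0, 10, length.out = 200)
y <- sin(x) + rnorm(200, sd = 0.3)

# Let R do it: loess() fits local regressions across the whole range
smooth_fit <- loess(y ~ x, span = 0.3)
loess_estimate <- unname(predict(smooth_fit, newdata = data.frame(x = 5)))

# Do it by hand at one point: weight nearby observations more heavily,
# fit a line, and read off the fitted value at x0 (the intercept)
x0 <- 5
h <- 1
w <- pmax(0, 1 - abs(x - x0) / h)
local_fit <- lm(y ~ I(x - x0), weights = w)
local_estimate <- unname(coef(local_fit)[1])

c(by_hand = local_estimate, loess = loess_estimate, truth = sin(x0))
```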

---

.box-6.medium[How do we decide on the right model?]

.center[
- Parametric with `$y = x$`?
- With `$y = x^2 + x$`? 
- With `$y = x^\text{whatever} + x^\text{whatever} + x$`? 
- Nonparametric? 
- `rdrobust()` or just `lm()`? 
- Controls or no controls?
]
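
One practical answer is to fit several reasonable specifications and see whether the estimated jump is stable. A sketch with simulated data where the true jump is 10 by construction (all names and numbers are invented for illustration, and only `lm()` is used):

```r
set.seed(1234)
n <- 1000
running <- runif(n, -30, 30)   # already centered at the cutoff (0)
treated <- running <= 0
outcome <- 60 + 0.4 * running + 10 * treated + rnorm(n, sd = 5)
sim <- data.frame(running, treated, outcome)
narrow <- sim[abs(sim$running) <= 10, ]

fits <- list(
  linear        = lm(outcome ~ running + treated, data = sim),
  diff_slopes   = lm(outcome ~ running * treated, data = sim),
  quadratic     = lm(outcome ~ running + I(running^2) + treated, data = sim),
  narrow_linear = lm(outcome ~ running + treated, data = narrow)
)

# Pull the estimated jump at the cutoff out of each model
estimates <- sapply(fits, function(m) unname(coef(m)["treatedTRUE"]))
round(estimates, 2)
```

If the estimates disagree wildly across specifications, that is a sign the result depends on modeling choices rather than on the discontinuity itself.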

---

.box-6.medium[How do you justify a bandwidth?]

.box-6.medium[Does the bandwidth need to be<br>the same on both sides?]
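
A sketch of a bandwidth sensitivity check on simulated data (true jump of 10 by construction, and the bandwidth values are arbitrary). Narrower windows use fewer observations, so the estimates get noisier. Reporting results across several bandwidths is one way to justify a choice.

```r
set.seed(1234)
n <- 2000
running <- runif(n, -30, 30)   # centered at the cutoff (0)
treated <- running <= 0
outcome <- 60 + 0.4 * running + 10 * treated + rnorm(n, sd = 5)
sim <- data.frame(running, treated, outcome)

bandwidths <- c(5, 10, 20, 30)
results <- do.call(rbind, lapply(bandwidths, function(h) {
  # Keep only observations within h points of the cutoff
  window <- sim[abs(sim$running) <= h, ]
  fit <- lm(outcome ~ running * treated, data = window)
  est <- summary(fit)$coefficients["treatedTRUE", ]
  data.frame(bandwidth = h, n = nobs(fit),
             estimate = est[["Estimate"]], std_error = est[["Std. Error"]])
}))
results
```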

---

.box-6.less-medium[How should we think about the impact of the program on people who score really high or low on the running variable?]

.box-6.less-medium[If we're throwing most of the data away and only looking at a narrow bandwidth of people, what does this say about generalizability?]

---

.box-6.medium[What do we do about noncompliance?]

.box-6.medium[What is fuzzy regression discontinuity?]
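
A rough sketch of the fuzzy RD logic with simulated data (every number here is made up). The rule only shifts the probability of treatment, so the jump in the outcome gets rescaled by the jump in treatment. This ratio form is a simplified stand-in for a full instrumental variables setup, not a replacement for it.

```r
set.seed(1234)
n <- 5000
running <- runif(n, -30, 30)
below_cutoff <- running <= 0   # the rule says these people should be treated

# Noncompliance: the rule moves the chance of treatment from 20% to 80%
treated <- rbinom(n, 1, ifelse(below_cutoff, 0.8, 0.2))
outcome <- 60 + 0.4 * running + 10 * treated + rnorm(n, sd = 5)
sim <- data.frame(running, below_cutoff, treated, outcome)
window <- sim[abs(sim$running) <= 10, ]

# Jump in the outcome at the cutoff (effect of the rule itself)
jump_outcome <- coef(lm(outcome ~ running + below_cutoff,
                        data = window))[["below_cutoffTRUE"]]
# Jump in the probability of treatment at the cutoff (first stage)
jump_treatment <- coef(lm(treated ~ running + below_cutoff,
                          data = window))[["below_cutoffTRUE"]]

# Fuzzy RD estimate: outcome jump divided by treatment jump
fuzzy_estimate <- jump_outcome / jump_treatment
c(jump_outcome = jump_outcome, jump_treatment = jump_treatment,
  fuzzy_estimate = fuzzy_estimate)
```

The true effect of treatment in this simulation is 10, while the raw jump in the outcome is smaller because only some people near the cutoff follow the rule.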

---

.box-6.huge[RD play time!]