Okay, I’m not sure what’s going on in this data.
Looks like a New York issue.
> filter(flights, !is.na(dep_delay), is.na(arr_delay)) %>%
+   count(origin)
# A tibble: 3 x 2
  origin     n
  <chr>  <int>
1 EWR      469
2 JFK      337
3 LGA      369