There are more sophisticated ways to do this analysis
Perhaps, but this is a lovely example of the power of R.
", which calculates"
"atan(y, x) returns"
My worry is that the reader would assume that atan(x,y) returns that angle.
not_cancelled
Hadley defines this as
not_cancelled <- flights %>%
  filter(!is.na(dep_delay), !is.na(arr_delay))
But in
flights %>%
  filter(is.na(dep_time) | is.na(arr_time) | is.na(dep_delay) | is.na(arr_delay)) %>%
  select(sched_arr_time, arr_time, arr_delay, sched_dep_time, dep_time, dep_delay)
the first 6 flights have an NA only in arr_delay. Aren't these flights likely not cancelled at all, just missing a data entry? Why isn't
not_cancelled <- flights %>%
  filter(!is.na(dep_time), !is.na(arr_time))
a "better" definition?
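One way to test this empirically (a sketch assuming dplyr and nycflights13 are loaded) is to count the flights that have both times recorded but no arr_delay:

```r
library(dplyr)
library(nycflights13)

# Flights that departed and arrived, yet have no arr_delay recorded --
# kept by the alternative definition but dropped by Hadley's.
flights %>%
  filter(!is.na(dep_time), !is.na(arr_time), is.na(arr_delay)) %>%
  nrow()
```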
and the passenger arrived at the same time
Omit.
Exercise 5.5.4
Another solution: now that we know we can "trust" dep_delay, we could simply run
md <- flights %>%
  select(carrier, flight, origin, dest, sched_dep_time, dep_time, dep_delay) %>%
  arrange(min_rank(desc(dep_delay)))
There are no ties in the top 10.
Daylight
If you wish, you could change this to "Eastern Daylight" to agree with its previous reference in that sentence.
Except for flights daylight savings started (March 10) or ended (November 3)
I understand the point of the paragraph but this sentence is a bit unclear.
(to handle the .
It's unclear to me what is intended here.
Exercise 5.5.2
A nice analysis, showing the "messiness" of actual datasets.
select(flights, one_of(vars))
Note that
select(flights, vars)
produces the same result.
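As a side note (this reflects newer tidyselect versions, not the solutions text itself): passing a bare character vector now triggers an ambiguity warning, and all_of() is the explicit spelling:

```r
library(dplyr)
library(nycflights13)

vars <- c("year", "month", "day", "dep_delay", "arr_delay")
select(flights, one_of(vars))   # as in the solutions
select(flights, vars)           # same result, but ambiguous lookup
select(flights, all_of(vars))   # explicit tidyselect equivalent
```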
/
*
Saturday
December
month == 7, month == 8, month == 9
Replace the commas with |.
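For concreteness, both corrected forms below should return the same rows (a sketch assuming dplyr and nycflights13):

```r
library(dplyr)
library(nycflights13)

filter(flights, month == 7 | month == 8 | month == 9)
filter(flights, month %in% c(7, 8, 9))  # equivalent and more compact
```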
between()
Interestingly, between() microbenchmarks at 143 microseconds while >=, <= comes in at 2.3 on my box!
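The comparison can be reproduced roughly like this (a sketch; assumes the microbenchmark package is installed, and timings will vary by machine):

```r
library(microbenchmark)
library(dplyr)
library(nycflights13)

microbenchmark(
  between = filter(flights, between(month, 7, 9)),
  compare = filter(flights, month >= 7, month <= 9),
  times = 50
)
```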
What other variables are missing?
Another interpretation:
colnames(flights)[colSums(is.na(flights)) > 0]
dep_time %% 2400 <= 600
Elegant, at a 2x microbenchmark cost.
is preferred
Do numerical operators execute faster than, say, %in%?
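One way to answer that question empirically (a hypothetical benchmark sketch; the vector x and candidate values are made up):

```r
library(microbenchmark)

x <- sample(1:12, 1e6, replace = TRUE)
microbenchmark(
  logical = x == 11 | x == 12,  # numerical/logical operators
  set     = x %in% c(11, 12)    # set membership
)
```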
were
Omit.
sum(cases)
na.rm = TRUE must go inside the sum() parentheses for the graph below to appear; otherwise it will just be blank.
By the way, thanks so much for these solutions; they are extremely helpful and teach things that are not taught in the book!
It looks like a typo, dota instead of data.
There was no typo in the 2/2/19 version of r4ds.
sum_to_one <- function(x, na.rm = FALSE) { x / sum(x, na.rm = na.rm) }
Since the sum of x is the same across the input, couldn't you make the code less repetitive by assigning it to an intermediate variable?
sum_to_one <- function(x, na.rm = FALSE) {
  y <- sum(x, na.rm = na.rm)
  x / y
}
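A quick self-contained check of the refactored helper (toy input chosen for illustration):

```r
sum_to_one <- function(x, na.rm = FALSE) {
  y <- sum(x, na.rm = na.rm)  # compute the total once
  x / y
}

sum_to_one(c(1, 3, 4))
#> [1] 0.125 0.375 0.500
```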
dep_delay
This should be arr_delay not dep_delay
flights, dep_time %% 2400 <= 600)
This modulo operation takes care of the maximum value.
There is one remaining issue. Midnight is represented by 2400, which would correspond to 1440 minutes since midnight, but it should correspond to 0. After converting all the times to minutes after midnight, x %% 1440 will convert 1440 to zero while keeping all the other times the same. Now we will put it all together. The following code creates a new data frame flights_times with columns dep_time_mins and sched_dep_time_mins. These columns convert dep_time and sched_dep_time, respectively, to minutes since midnight.
flights_times <- mutate(flights,
  dep_time_mins = (dep_time %/% 100 * 60 + dep_time %% 100) %% 1440,
  sched_dep_time_mins = (sched_dep_time %/% 100 * 60 + sched_dep_time %% 100) %% 1440
)
This little trick for computing the variable is very nice and worth studying carefully.
arrange(flights, distance / air_time * 60)
arrange() can also take a new variable generated by an expression and sort by it.
c(600, 1200, 2400) %% 2400
This is the modulo operation, similar to taking a remainder but not quite the same; for details see https://baike.baidu.com/item/%E5%8F%96%E6%A8%A1%E8%BF%90%E7%AE%97/10739384?fr=aladdin
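Concretely, the operation wraps the maximum value back to zero while leaving smaller values unchanged:

```r
c(600, 1200, 2400) %% 2400
#> [1]  600 1200    0
```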
filter(flights, between(month, 7, 9))
This is nice; it can be used when converting a continuous variable into a categorical one.
desc(is.na(dep_time)), dep_time)
Sorting by two variables: the first produces a logical variable (TRUE/FALSE). Because missing values yield TRUE, they are sorted to the front, and then the rows are ordered by the second variable, dep_time.
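A tiny toy example of the trick (hypothetical data, not from flights):

```r
library(dplyr)

df <- tibble(dep_time = c(517, NA, 533, NA))
# desc(is.na(...)) puts the TRUE (missing) rows first,
# then dep_time orders the remaining rows.
arrange(df, desc(is.na(dep_time)), dep_time)
# the two NA rows come first, followed by 517 and 533
```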
horizontal
vertical
geom_point
This should be "geom_jitter"
scales
axes
position_dodge()
position = "dodge2"
changing
slightly changing
height = 0.8 and width = 0.8
height = 0.4 and width = 0.4, because the randomness is applied in both the negative and positive directions.