Hypothesis

41 Matching Annotations

May 2021
jrnold.github.io

19 Functions | R for Data Science: Exercise Solutions

1
1. Amaks 29 May 2021
  
  in Public
  
  we could a FizzBuzz function
  
  omission of write?
  
  Suggested edit: we could write a FizzBuzz function

Annotators

Amaks

URL

jrnold.github.io/r4ds-exercise-solutions/functions.html
jrnold.github.io

14 Strings | R for Data Science Solutions

1
1. Amaks 17 May 2021
  
  in Public
  
  Finding all plurals cannot be correctly accomplished with regular expressions alone. Finding plural words would at least require morphological information about words in the language. See WordNet for a resource that would do that. However, identifying words that end in an “s” and with more than three characters, in order to remove “as”, “is”, “gas”, etc., is a reasonable heuristic.
  
  I agree with the statement and used that as a basis for my answer.
  
  sent_with_words_end_s <- str_subset(sentences, "\\b[A-Za-z]{3,}s\\b") # Keep only the sentences that meet the specified criteria
  
  words_end_s <- str_extract(sent_with_words_end_s, "\\b[A-Za-z]{3,}s\\b") # Words ending in s (contains both plural words like "planks" and non-plural words like "Sickness")
  
  str_view(words_end_s, "\\b[A-Za-z]+[^s]s$") # Shows only words that end in s but not in ss.
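The heuristic above can be sketched in base R on a small made-up vector of words (an assumption for illustration, not the `sentences` data). Note that inside an R string literal the word boundary is written `\\b`; the sketch anchors whole words with `^` and `$` instead.

```r
# Hypothetical sample words, not taken from stringr::sentences
words <- c("planks", "sickness", "gas", "is", "boxes", "glass")

# At least three letters followed by a final "s"
ends_in_s <- grepl("^[A-Za-z]{3,}s$", words)

# The character just before the final "s" must not itself be "s"
not_double_s <- grepl("^[A-Za-z]{2,}[^s]s$", words)

words[ends_in_s]                 # candidates, including "sickness" and "glass"
words[ends_in_s & not_double_s]  # candidates that do not end in "ss"
```

Short words such as "gas" and "is" fall out because of the length requirement, and "sickness" and "glass" fall out in the second step.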

Annotators

Amaks

URL

jrnold.github.io/r4ds-exercise-solutions/strings.html
jrnold.github.io

13 Relational data | R for Data Science Solutions

1
1. Amaks 05 May 2021
  
  in Public
  
  Since is always good practice to have clear
  
  Typo: it omitted.
  
  Edit suggestion: Since (it) is always good practice to have clear

Annotators

Amaks

URL

jrnold.github.io/r4ds-exercise-solutions/relational-data.html
Apr 2021
jrnold.github.io

7 Exploratory Data Analysis | R for Data Science: Exercise Solutions

1
1. Amaks 29 Apr 2021
  
  in Public
  
  mutate(cut = if_else(runif(n()) < 0.1, NA_character_, as.character(cut)))
  
  Can you explain this code to me? I've looked up the if_else function but I do not understand this code.
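One possible reading of the quoted line, sketched in base R. Here `ifelse()` stands in for dplyr's `if_else()`, and a small made-up factor stands in for `diamonds$cut`; both are assumptions for illustration. The idea is to draw one uniform random number per row and, wherever it falls below 0.1, replace the value with a missing character value; otherwise the value is kept, converted to character.

```r
set.seed(1)  # arbitrary seed so the sketch is repeatable

# Hypothetical data standing in for diamonds$cut
cut <- factor(rep(c("Fair", "Good", "Ideal"), times = 100))

draws <- runif(length(cut))   # one number between 0 and 1 per element
make_missing <- draws < 0.1   # TRUE for roughly 10% of the elements

# Where make_missing is TRUE use NA, otherwise keep the label as text
cut_with_na <- ifelse(make_missing, NA_character_, as.character(cut))

mean(is.na(cut_with_na))      # share of values replaced by NA
```

In the original, `n()` plays the role of `length(cut)`: inside `mutate()` it gives the number of rows, so `runif(n())` produces one random number per row.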

Annotators

Amaks

URL

jrnold.github.io/r4ds-exercise-solutions/exploratory-data-analysis.html
jrnold.github.io

5 Data transformation | R for Data Science: Exercise Solutions

2
1. Amaks 27 Apr 2021
  
  in Public
  
  (arr_delay <= 0))
  
  Why did you filter on arr_delay <= 0 and not arr_delay > 0 when we are looking for the plane with the worst on-time record? This seems counterintuitive to me. What am I misunderstanding? Thank you.
2. Amaks 27 Apr 2021
  
  in Public
  
  this delay will not have those affects plans
  
  For better clarity, change "this delay will not have those affects plans nor does it affect the total time spent traveling." to "this delay will not affect those plans nor would it affect the total time spent traveling."

Annotators

Amaks

URL

jrnold.github.io/r4ds-exercise-solutions/transform.html
jrnold.github.io

R for Data Science Solutions

3
1. Amaks 22 Apr 2021
  
  in Public
  
  geom_bar(width = 1)
  
  Please can you explain to me why you included the argument width = 1 for geom_bar? Without it, the pie doesn't look any different. I believe you must have specified it for a reason?
  
  This was my attempt to answer the question. I'm not totally sure if the resulting plot makes much sense. Please take a look and let me know what you think. Thank you.
  
  ggplot(diamonds, mapping = aes(x = cut, fill = color)) + geom_bar() + coord_polar()
2. Amaks 22 Apr 2021
  
  in Public
  
  such
  
  Typo. I think it should read "such as", not "such".
3. Amaks 22 Apr 2021
  
  in Public
  
  ..count.. / sum(..count..
  
  Please, can you explain why you have dots in the code? Thanks.

Annotators

Amaks

URL

jrnold.github.io/r4ds-exercise-solutions/data-visualisation.html
Sep 2019
jrnold.github.io

21 Iteration | R for Data Science: Exercise Solutions

2
1. Amaks 02 Sep 2019
  
  in Public
  
  we could use map() followed by flatten_dbl(),
  
  The way this is written could be somewhat confusing to a reader, in my opinion, although the code makes the order of the functions clearer.
  
  Suggestion:
  
  If we wanted a numeric vector, we could combine the map() followed with the flatten_dbl(),
2. Amaks 02 Sep 2019
  
  in Public
  
  like so,
  
  Edit suggestion.
  
  like shown:

Annotators

Amaks

URL

jrnold.github.io/r4ds-exercise-solutions/iteration.html
Aug 2019
jrnold.github.io

20 Vectors | R for Data Science: Exercise Solutions

4
1. Amaks 27 Aug 2019
  
  in Public
  
  at
  
  Minor mistake: a not at
2. Amaks 26 Aug 2019
  
  in Public
  
  not
  
  "Not" repeated. Delete?
3. Amaks 26 Aug 2019
  
  in Public
  
  Neither NaN nor Inf are not numbers, and so they aren’t even numbers
  
  Edit suggestion,
  
  Neither NaN nor Inf is a number
  
  Or
  
  Both NaN and Inf are not even numbers.
  
  If accepted, then "and so they aren’t even numbers" may be deleted as it becomes superfluous.
4. Amaks 25 Aug 2019
  
  in Public
  
  This is not the same as what you See the value of looking at the value of
  
  Could you please check the sentence? There seems to be some ambiguity.

Annotators

Amaks

URL

jrnold.github.io/r4ds-exercise-solutions/vectors.html
jrnold.github.io

19 Functions | R for Data Science: Exercise Solutions

2
1. Amaks 24 Aug 2019
  
  in Public
  
  is
  
  redundant. delete?
2. Amaks 24 Aug 2019
  
  in Public
  
  (!(x %% 3) && !(x %% 5))
  
  Hi,
  
  Please can you explain to me how to read this line of code and what it means? I'm having difficulty understanding it. I understand what each symbol means individually, but not how they work when put together like this. Moreover, it looks more efficient than my effort, shown below.
  
  Thank you.
  
  fizzbuzz <- function(n) {
    x <- n
    if ((x %% 3 == 0) && (x %% 5 == 0)) {
      print("fizzbuzz")
    } else if (x %% 3 == 0) {
      print("fizz")
    } else if (x %% 5 == 0) {
      print("buzz")
    } else {
      print(x)
    }
  }
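A short sketch of how the quoted condition can be read, using an arbitrary value of `x` chosen for illustration. `x %% 3` is the remainder after dividing by 3, and `!` coerces that number to a logical value: 0 becomes FALSE and any other number becomes TRUE, so `!(x %% 3)` is TRUE exactly when the remainder is 0.

```r
x <- 15  # arbitrary example value

x %% 3      # remainder after dividing by 3
!(x %% 3)   # TRUE when that remainder is 0

# The terse and the explicit spellings of the same test
short_form <- !(x %% 3) && !(x %% 5)
long_form  <- (x %% 3 == 0) && (x %% 5 == 0)

identical(short_form, long_form)

# Check that both spellings agree for the whole numbers 1 to 30
agree <- vapply(
  1:30,
  function(v) (!(v %% 3) && !(v %% 5)) == ((v %% 3 == 0) && (v %% 5 == 0)),
  logical(1)
)
all(agree)
```

The two forms are equivalent for whole numbers, so the explicit `== 0` comparisons are a matter of readability rather than correctness.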

Annotators

Amaks

URL

jrnold.github.io/r4ds-exercise-solutions/functions.html
jrnold.github.io

16 Dates and times | R for Data Science: Exercise Solutions

3
1. Amaks 15 Aug 2019
  
  in Public
  
  is that
  
  Change word order perhaps? that is
2. Amaks 15 Aug 2019
  
  in Public
  
  which as a
  
  which has
3. Amaks 14 Aug 2019
  
  in Public
  
  if
  
  Minor mistake. Should read if it does not

Annotators

Amaks

URL

jrnold.github.io/r4ds-exercise-solutions/dates-and-times.html
jrnold.github.io

13 Relational data | R for Data Science Solutions

1
1. Amaks 04 Aug 2019
  
  in Public
  
  What does it mean for a flight to have a missing tailnum
  
  This seems to be a tad long so I apologise in advance. This is a whole new field for me and I would really like to understand.
  
  Could it be that AA and MQ use different values to represent tailnum?
  
  filter( planes, tailnum == 0 )
  
  A tibble: 0 x 9
  
  length(is.na( planes$tailnum ))
  
  [1] 3322
  
  nrow(planes)
  
  [1] 3322
  
  filter( flights, tailnum == 0 )
  
  A tibble: 0 x 19
  
  length(is.na( flights$tailnum ) )
  
  [1] 336776
  
  nrow( flights )
  
  [1] 336776
  
  Yet the anti_join() shown in your code clearly shows that there are some tailnum values in flights that are not represented in the planes dataset. How could that be? The one explanation I could come up with is that the two datasets use different tailnum values, so I tried to investigate for AA and MQ.
  
  tailnum_flights <- flights %>% filter( carrier == 'AA'| carrier == 'MQ' ) %>% select ( carrier, tailnum )
  
  tailnum_planes <- planes %>% select( tailnum )
  
  tailnum_planes %in% tailnum_flights
  
  [1] FALSE
  
  So, it looks like the tailnum values are not missing for the ten airlines but are represented with different values in the two datasets (flights and planes).
  
  What are your thoughts? Thank you.
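Two details bear on the checks above, sketched here in base R on made-up vectors (assumptions for illustration, not the nycflights13 data). First, `length(is.na(x))` returns the length of `x` whether or not anything is missing, whereas `sum(is.na(x))` counts the missing values. Second, `%in%` compares element by element when it is given plain vectors, which is usually what is wanted when comparing one column against another.

```r
# Hypothetical tail numbers; NA marks a missing value
flight_tailnums <- c("N101AA", NA, "N202MQ", "N303UA", NA)
plane_tailnums  <- c("N101AA", "N303UA")

length(is.na(flight_tailnums))  # always the length of the vector
sum(is.na(flight_tailnums))     # number of missing values

# Non-missing flight tail numbers that have no match among the planes
known <- flight_tailnums[!is.na(flight_tailnums)]
unmatched <- known[!(known %in% plane_tailnums)]
unmatched
```

With real data, the same comparison would be made on the `tailnum` columns themselves (for example `flights$tailnum` and `planes$tailnum`) rather than on whole tibbles.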

Annotators

Amaks

URL

jrnold.github.io/r4ds-exercise-solutions/relational-data.html
Jul 2019
jrnold.github.io

14 Strings | R for Data Science Solutions

3
1. Amaks 28 Jul 2019
  
  in Public
  
  In the full English language, no
  
  Erm, full stop omitted after no.
  
  And thanks for the link. It is an interesting read.
2. Amaks 28 Jul 2019
  
  in Public
  
  Words that end with “-ed” but not ending in “-eed”
  
  This worked for me as well:
  
  str_view(stringr::words, ".*[^e]ed$", match = TRUE)
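A quick base-R check of the pattern above on a few made-up words (an assumption for illustration, not `stringr::words`). The leading `.*` is dropped in the sketch because `grepl()` already matches anywhere in the string.

```r
# Hypothetical sample words
words <- c("bed", "feed", "need", "red", "added", "ed")

ends_ed_not_eed <- grepl("[^e]ed$", words)
words[ends_ed_not_eed]
```

Because `[^e]` must consume one character, the bare word "ed" is not matched by this pattern; whether that matters depends on how the exercise is read.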
3. Amaks 28 Jul 2019
  
  in Public
  
  "ab$^$sfas"
  
  Please, could you explain why you included this in the code? I replicated the answer without it. Thanks.

Annotators

Amaks

URL

jrnold.github.io/r4ds-exercise-solutions/strings.html
jrnold.github.io

13 Relational data | R for Data Science Solutions

8
1. Amaks 27 Jul 2019
  
  in Public
  
  avg_dest_delays <-
    flights %>%
    group_by(dest) %>%
    # arrival delay NA's are cancelled flights
    summarise(delay = mean(arr_delay, na.rm = TRUE)) %>%
    inner_join(airports, by = c(dest = "faa"))
  
  Please, could you explain to me why the colour aesthetic is not applied if I pipe this directly to ggplot? See the code below; it's basically a replication of yours, but with ggplot piped directly onto avg_dest_delays:
  
  avg_dest_delays <- flights %>%
    group_by(dest) %>%
    summarise(delay = mean(arr_delay, na.rm = TRUE)) %>%
    inner_join(airports, by = c(dest = "faa")) %>%
    ggplot(aes(lon, lat, colour = delay)) +
    borders("state") +
    geom_point() +
    coord_quickmap()
  
  Thanks
2. Amaks 27 Jul 2019
  
  in Public
  
  any any
  
  repetition?
3. Amaks 27 Jul 2019
  
  in Public
  
  join
  
  This also seems to be the case where the by = argument is not used in the code. In that case, it seems, semi_join() will give output only where the rows of both datasets match on all shared columns, for example:
  
  fueleconomy::vehicles %>% semi_join(fueleconomy::common)
  
  produces the same output as:
  
  fueleconomy::vehicles %>% semi_join(fueleconomy::common, by = c("make", "model"))
  
  Or is that a coincidence?
  
  But, fueleconomy::vehicles %>% semi_join(fueleconomy::common, by = "make")
  
  will produce a different output as R will match only by "make" in this example.
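When `by` is omitted, dplyr joins on every column name the two tables share, which would explain why the first two calls above agree. The sketch below imitates that behaviour in base R on two small made-up tables (hypothetical stand-ins for `fueleconomy::vehicles` and `fueleconomy::common`, with an invented `id` column).

```r
# Hypothetical stand-ins for fueleconomy::vehicles and fueleconomy::common
vehicles <- data.frame(
  id    = 1:4,
  make  = c("A", "A", "B", "C"),
  model = c("x", "y", "x", "z"),
  stringsAsFactors = FALSE
)
common <- data.frame(
  make  = c("A", "B"),
  model = c("x", "x"),
  n     = c(10, 20),
  stringsAsFactors = FALSE
)

# Columns present in both tables: the keys used when `by` is omitted
shared <- intersect(names(vehicles), names(common))
shared

# Base-R imitation of a semi join on make and model together
key_v <- paste(vehicles$make, vehicles$model)
key_c <- paste(common$make, common$model)
by_make_model <- vehicles[key_v %in% key_c, ]

# Imitation of a semi join on make alone: more rows survive
by_make <- vehicles[vehicles$make %in% common$make, ]

by_make_model$id
by_make$id
```

Matching on `make` alone keeps every model of a make that appears in `common`, which is why the result differs from matching on both columns.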
4. Amaks 27 Jul 2019
  
  in Public
  
  arr_delay
  
  The code doesn't affect the output but I thought you might mean, sum( !is.na( dep_delay ))
5. Amaks 27 Jul 2019
  
  in Public
  
  There are few planes older than 30 years, so I combine them into a single category.
  
  The code in the solution book seems to drop the data for planes with age > 25 entirely. How might we combine them into a single row? I didn’t think it could be done without first defining a second tibble that contains all the planes with age > 25, then merging it with the first tibble that contains the data for planes with age <= 25, and only then carrying out the summarise actions after grouping by age.
  
  older_plane_cohorts <- inner_join(flights, select(planes, tailnum, plane_year = year), by = "tailnum") %>%
    mutate(age = year - plane_year) %>%
    filter(!is.na(age)) %>%
    mutate(age = pmin(46, age) - pmin(25, age)) %>%
    filter(age != 0)
  
  Then I got stuck. I can’t figure out how to proceed after that. And frankly, I can’t say with any certainty whether my argument is sensible. I’m not even sure it’s possible to combine the 17 rows into a single row. Your help will be appreciated.
  
  Thank you in advance and for your help so far. Truly appreciated.
6. Amaks 27 Jul 2019
  
  in Public
  
  mutate(age = pmin(25, age))
  
  I think you used this line of code to limit the selection to planes not older than 25 years. But in the text above, you stated, "There are few planes older than 30 years, so we combine them into a single category." So, I was expecting the selection to be age <= 30, or using your notation pmin(30, age) and not pmin(25, age). Perhaps an edit of the text may be required unless I'm wrong in my supposition?
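For reference, a small base-R sketch of what `pmin()` does, using made-up ages (an assumption for illustration). `pmin(25, age)` takes the smaller of 25 and each age, so it caps values at 25 rather than removing anything: every plane older than 25 ends up in the same category as the 25-year-old planes.

```r
# Hypothetical plane ages
age <- c(3, 12, 25, 31, 46)

capped <- pmin(25, age)  # element-wise minimum of 25 and each age
capped

length(capped) == length(age)  # no elements are dropped
table(capped)                  # ages above 25 are pooled with 25
```

Read this way, a later `group_by(age)` would already treat the capped ages as one group, without building a second tibble.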
7. Amaks 26 Jul 2019
  
  in Public
  
  This however, this default
  
  Edit suggestion: However, this default...
8. Amaks 26 Jul 2019
  
  in Public
  
  If we needed a unique identifier for our analysis, could add a surrogate key.
  
  Hi, I hope this doesn't come across as nitpicking:
  
  If we needed a unique identifier for our analysis, we could add a surrogate key, perhaps?

Annotators

Amaks

URL

jrnold.github.io/r4ds-exercise-solutions/relational-data.html
jrnold.github.io

12 Tidy Data | R for Data Science: Exercise Solutions

2
1. Amaks 23 Jul 2019
  
  in Public
  
  It looks like it is possible for certain variables to missing for (country, years).
  
  "It looks like it is possible for certain variables to missing for (country, years)." Edit suggestion: It looks like it is possible that certain variables are missing for (country, years).
2. Amaks 22 Jul 2019
  
  in Public
  
  is
  
  delete?

Annotators

Amaks

URL

jrnold.github.io/r4ds-exercise-solutions/tidy-data.html
jrnold.github.io

10 Tibbles | R for Data Science: Exercise Solutions

2
1. Amaks 20 Jul 2019
  
  in Public
  
  run
  
  run:
2. Amaks 20 Jul 2019
  
  in Public
  
  Using $
  
  Using $,

Annotators

Amaks

URL

jrnold.github.io/r4ds-exercise-solutions/tibbles.html
jrnold.github.io

7 Exploratory Data Analysis | R for Data Science: Exercise Solutions

4
1. Amaks 20 Jul 2019
  
  in Public
  
  ggplot
  
  Please, can you explain to me what I am getting wrong in the code below, and especially the error message, which I have tried googling without success?
  
  As a practice exercise, I tried to show the average price per carat group instead of the count, using the code below:
  
  group_by( diamonds, carat ) %>% summarise( avg_price = mean(price ) ) %>% ggplot( ) + mapping = aes( color = cut_width(carat, 5 ), x = avg_price ) + geom_freqpoly( )
  
  The code returns the following error:
  
  Error in group_by(diamonds, carat) %>% summarise(avg_price = mean(price)) %>% : could not find function "+<-"
  
  I've googled the error without success. So my confusion is in two parts:
  
  What is the right code to show the average price per carat type?
  
  What does the above error mean?
  
  Thanks in advance
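The error can be reproduced in base R without ggplot2, which may help with reading it. In the sketch below, `x` and `y` are arbitrary stand-ins. An expression of the form `a + b = value` is parsed as an assignment whose target is the call `a + b`, so R looks for a replacement function named `+<-`, finds none, and stops.

```r
x <- 1
y <- 2

# `x + y = 3` is an assignment to the call `x + y`, not an addition,
# so evaluating it fails; the error message is captured as text here.
msg <- tryCatch(
  eval(parse(text = "x + y = 3")),
  error = function(e) conditionMessage(e)
)
msg
```

In the quoted code the same pattern arises from `ggplot( ) + mapping = aes(...)`. Passing the mapping as an argument, as in `ggplot(mapping = aes(...))`, avoids the stray assignment.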
2. Amaks 20 Jul 2019
  
  in Public
  
  visualization
  
  Does anyone else notice that R4DS uses the British spelling, visualisation, but the solutions textbook uses the American spelling? It's a bit weird when one switches directly from the textbook to the solutions. I'm sure not many people notice the difference. I probably do because I generally write British English. By the way, I am not censuring, just expressing my thoughts. I am really grateful to the author for making this available.
3. Amaks 18 Jul 2019
  
  in Public
  
  there spikes in
  
  Omission of are? Perhaps, you meant there are?
4. Amaks 18 Jul 2019
  
  in Public
  
  the these
  
  Typo. Either the or these, preferably these in my opinion.

Annotators

Amaks

URL

jrnold.github.io/r4ds-exercise-solutions/exploratory-data-analysis.html
jrnold.github.io

5 Data transformation | R for Data Science: Exercise Solutions

1
1. Amaks 16 Jul 2019
  
  in Public
  
  n
  
  Can anyone please explain the n in this code? Thanks in advance.

Annotators

Amaks

URL

jrnold.github.io/r4ds-exercise-solutions/transform.html
