Hypothesis

7 Matching Annotations

Jun 2022
data-feminism.mitpress.mit.edu

5. Unicorns, Janitors, Ninjas, Wizards, and Rock Stars

1
1. pvu23 25 Jun 2022
  
  in Public
  
  80% of data analysis is spent on the process of cleaning and preparing the data
  
  Imagine having unnecessary and incorrect data in your document. You would most likely experience time demarcation, the reluctance to go through every single row and column to eliminate this "garbage data". Clearly, owning all kinds of data without organizing them feels like stuffing your closet with clothes that you should have donated 5 years ago. Cleaning it up is a time-consuming and soul-destroying process. Luckily, R has the "tidyverse" collection of packages, which I believe the author talks about in the next paragraph, to make life easier for everyone. I personally use dplyr and ggplot2 when I deal with data cleaning, and they are extremely helpful. Without these packages, I have no idea when I would be able to reach the final step of data visualization.
  
  data visualization, data cleaning, tidyverse, dplyr, ggplot2

URL

data-feminism.mitpress.mit.edu/pub/2wu7aft8
www.tidyverse.org

dplyr 1.0.4: if_any() and if_all()

1
1. pbk1 24 Jun 2022
  
  in Public
  
  across() is very useful within summarise() and mutate(), but it’s hard to use it with filter() because it is not clear how the results would be combined into one logical vector. So to fill the gap, we’re introducing two new functions if_all() and if_any().
  
  filter(), across(), dplyr

URL

tidyverse.org/blog/2021/02/dplyr-1-0-4-if-any/
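
To make the quoted passage on if_all() and if_any() concrete, here is a minimal sketch of how the two functions combine per-column tests inside filter(). The `scores` data frame and its column names are hypothetical and exist only for illustration; the sketch assumes dplyr 1.0.4 or later is installed.

```r
library(dplyr)

# Hypothetical toy data: three measurement columns with some missing values.
scores <- tibble(
  id = 1:4,
  x  = c(1, NA, 3, NA),
  y  = c(2, 5, NA, NA),
  z  = c(3, 6, 9, NA)
)

# if_all(): keep a row only when the test holds for every selected column.
complete_rows <- scores %>%
  filter(if_all(c(x, y, z), ~ !is.na(.x)))

# if_any(): keep a row when the test holds for at least one selected column.
partial_rows <- scores %>%
  filter(if_any(c(x, y, z), ~ !is.na(.x)))

complete_rows$id  # 1
partial_rows$id   # 1 2 3
```

Each function reduces the per-column results to one logical vector, which is the step that across() alone leaves ambiguous in filter().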
May 2022
stackoverflow.com

Numbering rows within groups in a data frame

1
1. pbk1 13 May 2022
  
  in Public
  
  df %>% group_by(cat) %>% mutate(id = row_number())
  
  numbering index within a group
  
  dplyr, index, by_group, replicate, count, number

URL

stackoverflow.com/questions/12925063/numbering-rows-within-groups-in-a-data-frame
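
The one-liner quoted above can be tried on a small example. This is a sketch only: the data frame `df` and its columns `cat` and `val` are made up for illustration, and the trailing ungroup() is an addition so that later steps are not silently grouped.

```r
library(dplyr)

# Hypothetical data: rows from two categories, interleaved.
df <- tibble(
  cat = c("a", "a", "b", "a", "b"),
  val = c(10, 20, 30, 40, 50)
)

numbered <- df %>%
  group_by(cat) %>%              # row_number() now restarts in each group
  mutate(id = row_number()) %>%  # 1, 2, ... in order of appearance
  ungroup()                      # drop the grouping once the index exists

numbered$id  # 1 2 1 3 2
```

The index follows the current row order within each group, so sorting with arrange() before mutate() changes which row receives which number.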
Oct 2020
tidyr.tidyverse.org

Replace NAs with specified values — replace_na

1
1. pbk1 15 Oct 2020
  
  in Public
  
  dplyr::coalesce() to replace NAs with values from other vectors.
  
  dplyr, remove NA, replace NA, R

URL

tidyr.tidyverse.org/reference/replace_na.html
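
As a small illustration of the coalesce() behaviour mentioned in the quote, the sketch below fills missing values in one vector from another. The vectors `primary` and `backup` are hypothetical.

```r
library(dplyr)

# Hypothetical vectors: `backup` supplies values where `primary` is missing.
primary <- c(1, NA, 3, NA)
backup  <- c(10, 20, NA, NA)

# coalesce() takes, position by position, the first non-missing value.
filled <- coalesce(primary, backup)

# A single value as the last argument acts as a default for whatever
# is still missing after all vectors have been consulted.
with_default <- coalesce(primary, backup, 0)

filled        # 1 20 3 NA
with_default  # 1 20 3 0
```

With a single replacement value, as in the last call, the result overlaps with what replace_na() does; coalesce() is the more general tool when the replacements come from other vectors.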
Apr 2020
cran.r-project.org

labelled.pdf

1
1. daaronr 02 Apr 2020
  
  in Public
  
  Adding variable labels using pipe
  
  dplyr

URL

cran.r-project.org/web/packages/labelled/labelled.pdf
Mar 2020
jvns.ca

SQL queries don't start with SELECT - Julia Evans

1
1. pyxelr 02 Mar 2020
  
  in Public
  
  dplyr in R also lets you use a different syntax for querying SQL databases like Postgres, MySQL and SQLite, which is also in a more logical order
  
  dplyr, R

URL

jvns.ca/blog/2019/10/03/sql-queries-don-t-start-with-select/
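
A rough sketch of what the quote describes: a dplyr pipeline run against a SQL database, written in the order in which the query is conceptually evaluated. It assumes the DBI, RSQLite and dbplyr packages are installed alongside dplyr; the in-memory database and the `pets` table are hypothetical.

```r
library(dplyr)

# Assumption: an in-memory SQLite database stands in for Postgres or MySQL.
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical table of pets and their owners.
DBI::dbWriteTable(con, "pets", data.frame(
  owner = c("ann", "ann", "bob", "bob", "cat"),
  kind  = c("cat", "dog", "cat", "cat", "dog"),
  stringsAsFactors = FALSE
))

# The steps read top to bottom: source table, row filter, grouping,
# aggregate, ordering.
cat_counts_query <- tbl(con, "pets") %>%
  filter(kind == "cat") %>%
  group_by(owner) %>%
  summarise(n_cats = n()) %>%
  arrange(owner)

show_query(cat_counts_query)             # prints the SQL generated by dbplyr
cat_counts <- collect(cat_counts_query)  # runs the query, returns a tibble

DBI::dbDisconnect(con)
```

Nothing is sent to the database until collect() is called, so the pipeline can be built up and inspected with show_query() first.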
Feb 2020
www.r-bloggers.com

Curly-Curly, the successor of Bang-Bang

1
1. daaronr 10 Feb 2020
  
  in Public
  
  Now this can be simplified using the new {{}} syntax:
  
    summarise_groups <- function(dataframe, grouping_var, column_name){
      dataframe %>%
        group_by({{grouping_var}}) %>%
        summarise({{column_name}} := mean({{column_name}}, na.rm = TRUE))
    }
  
  Much easier and cleaner! You still have to use the := operator instead of = for the column name however. Also, from my understanding, if you want to modify the column names, for instance in this case return "mean_height" instead of height you have to keep using the enquo()–!! syntax.
  
  curly curly syntax
  
  dplyr

URL

r-bloggers.com/curly-curly-the-successor-of-bang-bang/
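
The quoted function can be exercised as follows. This is a sketch under assumptions: the `people` data frame is hypothetical, and `summarise_groups_named()` is an illustrative addition that is not part of the quoted post. It relies on the glue-style names that newer releases of rlang and dplyr accept on the left-hand side of :=, which, as far as I can tell, makes the enquo() fallback mentioned in the quote unnecessary on current versions.

```r
library(dplyr)

# Hypothetical data for illustration.
people <- tibble(
  sex    = c("f", "f", "m", "m"),
  height = c(160, 170, 175, NA)
)

# As quoted: the summary column keeps the name of the input column.
summarise_groups <- function(dataframe, grouping_var, column_name) {
  dataframe %>%
    group_by({{ grouping_var }}) %>%
    summarise({{ column_name }} := mean({{ column_name }}, na.rm = TRUE))
}

# Illustrative variant (not from the post): a glue-style string on the
# left of := builds a new name such as "mean_height".
summarise_groups_named <- function(dataframe, grouping_var, column_name) {
  dataframe %>%
    group_by({{ grouping_var }}) %>%
    summarise("mean_{{ column_name }}" := mean({{ column_name }}, na.rm = TRUE))
}

by_sex       <- summarise_groups(people, sex, height)
by_sex_named <- summarise_groups_named(people, sex, height)

names(by_sex)        # "sex" "height"
names(by_sex_named)  # "sex" "mean_height"
```

Both versions return one row per group; only the name of the summary column differs.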
