3 Matching Annotations
  1. Jun 2022
    1. 80% of data analysis is spent on the process of cleaning and preparing the data

      Imagine having unnecessary and wrong data in your document, you would most likely have to experience the concept of time demarcation -- the reluctance in going through every single row and column to eliminate these "garbage data". Clearly, owning all kinds of data without organizing them feels like stuffing your closet with clothes that you should have donated 5 years ago. It is a time-consuming and soul-destroying process for us. Luckily, in R, we have something in R called "tidyverse" package, which I believe the author talks about in the next paragraph, to make life easier for everyone. I personally use dplyr and ggplot2 when I deal with data cleaning, and they are extremely helpful. WIthout these packages' existence, I have no idea when I will be able to reach the final step of data visualization.

  2. Feb 2021
  3. Apr 2020