8 Matching Annotations
  1. Jun 2024
  2. nih-r25-modelersandstorytellers.github.io
    1. following steps

      I can demo these steps if needed.

    2. many

      Again, don't worry about syntax. Focus on the concepts of data wrangling, which carry over across many languages (SQL, Python, Julia).

    3. Tidyverse

      The tidyverse is not the only choice. The data.table package is also a popular framework for data wrangling.

    4. the life cycle of a data science project

      Don't be overwhelmed by syntax. GenAI tools such as GitHub Copilot and ChatGPT take care of many programming details. It is more important to grasp the tasks and the workflow.
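
To make the point about language-agnostic wrangling concepts concrete, here is a minimal sketch in plain Python of the filter, group, and summarize pattern that SQL expresses as WHERE / GROUP BY and dplyr expresses as filter / group_by / summarise. The records, column names (`site`, `age`, `sbp`), and the helper `mean_by_group` are all made up for illustration and do not come from the annotated course material.

```python
from collections import defaultdict

# Hypothetical toy records; the column names are invented for illustration.
rows = [
    {"site": "A", "age": 34, "sbp": 120},
    {"site": "A", "age": 61, "sbp": 140},
    {"site": "B", "age": 47, "sbp": 130},
    {"site": "B", "age": 29, "sbp": 110},
    {"site": "B", "age": 52, "sbp": 150},
]


def mean_by_group(records, group_key, value_key, keep=lambda r: True):
    """Filter rows, group them by one column, then average another column."""
    groups = defaultdict(list)
    for r in records:
        if keep(r):  # filter step (SQL WHERE, dplyr filter)
            # group step (SQL GROUP BY, dplyr group_by)
            groups[r[group_key]].append(r[value_key])
    # summarize step (SQL AVG, dplyr summarise)
    return {g: sum(v) / len(v) for g, v in groups.items()}


# Mean systolic blood pressure by site, restricted to participants aged 30+.
result = mean_by_group(rows, "site", "sbp", keep=lambda r: r["age"] >= 30)
print(result)
```

The same three steps map one-to-one onto a tidyverse pipeline or a data.table expression; only the spelling changes.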

    1. Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W., and Robins, J. (2018). Double/debiased machine learning for treatment and structural parameters. The Econometrics Journal, 21(1), C1-C68.

      Thousands of citations already. Called "the monster" in big tech. Applying DML to online experimentation such as A/B testing has saved billions of dollars at Amazon.
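
For readers who have not seen DML before, the following is a rough sketch of its core idea in the partially linear model: predict both the outcome and the treatment from the covariates on held-out folds (cross-fitting), then regress the outcome residuals on the treatment residuals. This is an illustration under strong assumptions, not the paper's implementation: the nuisance functions are fit with simple one-covariate least squares as a stand-in for the machine learning learners the paper has in mind, the data are simulated, and the names `ols_fit` and `dml_plr` are hypothetical.

```python
import random


def ols_fit(x, y):
    """Fit y = a + b*x by least squares and return (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def dml_plr(x, d, y, n_folds=2):
    """Cross-fitted residual-on-residual estimate of the treatment effect."""
    n = len(y)
    res_d, res_y = [0.0] * n, [0.0] * n
    for k in range(n_folds):
        test = [i for i in range(n) if i % n_folds == k]
        train = [i for i in range(n) if i % n_folds != k]
        # Nuisance fits use only the training folds (stand-in learner: OLS).
        a_d, b_d = ols_fit([x[i] for i in train], [d[i] for i in train])
        a_y, b_y = ols_fit([x[i] for i in train], [y[i] for i in train])
        # Residuals are computed on the held-out fold.
        for i in test:
            res_d[i] = d[i] - (a_d + b_d * x[i])
            res_y[i] = y[i] - (a_y + b_y * x[i])
    num = sum(rd * ry for rd, ry in zip(res_d, res_y))
    den = sum(rd * rd for rd in res_d)
    return num / den


# Simulated data (an assumption for illustration): true effect is 0.5.
rng = random.Random(0)
theta_true = 0.5
x = [rng.gauss(0, 1) for _ in range(2000)]
d = [0.8 * xi + rng.gauss(0, 1) for xi in x]
y = [theta_true * di + 1.5 * xi + rng.gauss(0, 1) for di, xi in zip(d, x)]

theta_hat = dml_plr(x, d, y)
print(f"estimated effect: {theta_hat:.3f} (true value {theta_true})")
```

In an A/B testing setting, `d` would be the treatment assignment or exposure, `y` the metric of interest, and `x` pre-experiment covariates; swapping the OLS stand-in for a flexible learner is what the "machine learning" in the name refers to.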

    1. CART

      Tree-based methods such as random forests and boosting are among the most successful out-of-the-box machine learning methods for structured/tabular data.
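
As background for the note above: a classification tree in the CART style is grown by repeatedly picking the split that most reduces an impurity measure such as the Gini index, and random forests and boosting combine many such trees. The sketch below shows only that single split search, for one numeric feature and binary 0/1 labels. The function names `gini` and `best_split` and the toy data are illustrative assumptions, not taken from the annotated page.

```python
def gini(labels):
    """Gini impurity for binary 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n  # share of class 1
    return 2 * p * (1 - p)


def best_split(xs, ys):
    """Return (threshold, weighted_gini) of the best split of the form x <= t.

    Returns (None, parent_gini) if no split improves on the unsplit node.
    """
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    best = (None, gini(ys))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between identical feature values
        t = (pairs[i - 1][0] + pairs[i][0]) / 2  # midpoint candidate
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best


# Toy data (made up): the classes separate cleanly between 3.0 and 6.0.
xs = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
ys = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_split(xs, ys)
print(threshold, impurity)
```

A full tree applies this search recursively to each resulting subset and over every feature; library implementations add stopping rules and pruning on top.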