- Jun 2024
-
nih-r25-modelersandstorytellers.github.io nih-r25-modelersandstorytellers.github.io
-
following steps
I can demo these steps if needed.
-
many
Again, don't worry about syntax. Focus on the concepts of data wrangling, which are universal across many languages (SQL, Python, Julia).
-
Tidyverse
Tidyverse is not the only choice. The data.table package is another popular framework for data wrangling.
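To make the comparison concrete, here is a minimal sketch of the same wrangling task (filter rows, group, summarise) written in both frameworks. It uses R's built-in mtcars data; the task and the names by_dplyr, by_dt, and mean_mpg are illustrative assumptions, not taken from the slides.

```r
library(dplyr)
library(data.table)

# Tidyverse (dplyr) version: keep cars with more than 4 cylinders,
# then compute mean mpg within each gear group.
by_dplyr <- mtcars |>
  filter(cyl > 4) |>
  group_by(gear) |>
  summarise(mean_mpg = mean(mpg))

# data.table version of the same task, using the dt[i, j, by] form:
# i = row filter, j = what to compute, by = grouping variable.
dt <- as.data.table(mtcars)
by_dt <- dt[cyl > 4, .(mean_mpg = mean(mpg)), by = gear]

print(by_dplyr)
print(by_dt)
```

The two results hold the same group means; only the row order may differ, since dplyr sorts by the grouping variable while data.table keeps groups in order of first appearance.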
-
the life cycle of a data science project
Don't be overwhelmed by syntax. GenAI tools such as GitHub Copilot and ChatGPT take care of many programming details. It is more important to grasp the tasks and the workflow.
-
Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W., and Robins, J. (2018). Double/debiased machine learning for treatment and structural parameters. The Econometrics Journal, 21(1), C1-C68.
Thousands of citations already. Known as "the monster" in big tech. Reportedly saved Amazon billions of dollars by applying DML to online experimentation such as A/B testing.
-
CART
Tree-based methods such as random forests and boosting are among the most successful out-of-the-box machine learning methods for structured/tabular data.
-
tidycensus
Last year's R25 program had many examples of using tidycensus to explore Census and ACS data.
-
as_tibble() |>
Optional.
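A small sketch of why this step can be skipped, assuming the pipeline starts from an ordinary data frame (the slide's actual input is not shown here, so mtcars stands in for it). as_tibble() changes the object's class and how it prints, not the column values.

```r
library(tibble)

df <- head(mtcars)    # plain data.frame (stand-in for the slide's data)
tb <- as_tibble(df)   # same columns, now a tibble

class(df)
class(tb)

# The column values are unchanged, so later wrangling steps
# give the same numbers either way.
all.equal(tb$mpg, df$mpg)
```

One difference to keep in mind: as_tibble() drops row names by default, so convert them to a column first (for example with rownames_to_column()) if they carry information.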
-