this comes from the data
hey a comment
provinces
Which health zones had the highest attack rate?
Substantial geographic heterogeneity was observed in the ability to establish epidemiological links
We have to be careful here: it is not the health zone teams that established the links, right? These are the health zones (HZ) of residence?
sub_pref_df <- df_linelist %>%
  summarise(
    .by = sub_prefecture,
    n_patients = n(),
    mean_age = mean(age),
    min_admission = min(date_admission, na.rm = TRUE),
    n_female = sum(sex == "f", na.rm = TRUE),
    n_hosp = sum(hospitalisation == "yes", na.rm = TRUE),
    mean_age_hosp = mean(age[hospitalisation == "yes"], na.rm = TRUE),
    mean_age_female = mean(age[sex == "f"], na.rm = TRUE),
    n_death_u6m = sum(outcome[age_group == "< 6 months"] == "dead", na.rm = TRUE)
  ) %>%
  mutate(
    prop_female = n_female / n_patients,
    prop_hosp = n_hosp / n_patients
  )
sub_pref_df
I wouldn't give them the code for the solution... but I think it can be nice to show the output!
df_linelist %>%
  summarise(
    .by = sub_prefecture,
    n_patients = n(),
    mean_age = mean(age),
    min_admission = min(date_admission, na.rm = TRUE),
    n_female = sum(sex == "f", na.rm = TRUE),
    n_hosp = sum(hospitalisation == "yes", na.rm = TRUE),
    mean_age_hosp = mean(age[hospitalisation == "yes"], na.rm = TRUE)
  ) %>%
  mutate(
    prop_female = n_female / n_patients,
    prop_hosp = n_hosp / n_patients
  )
I don't know that I would show this as a full pipe. It may be hard for them to see what you're talking about, especially since the new part is not added at the end. Maybe just show the specific example you are trying to illustrate?
Note. You can write either summarize() (US spelling) or summarise() (British spelling) in R.
I think this is low yield, but up to you. If you want to keep it, you should make it one of the callouts (for example "tip").
Finally, a reminder that, as usual, summarise() returns a data frame, which can be further used and modified using mutate(). This means that we should be able to add a variable for the proportion of female patients in each sub_prefecture using our newly created n_female! Can you add a mutate() after your summarise() to make this happen?
Didn't you already do this earlier? [Hugz]: OK, that was a mistake. I think we can keep this in, and it's outside summarise() so that we can show them how to use one after the other (even though you can make this calculation directly in summarise()).
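To make the point of this thread concrete, here is a minimal sketch comparing the two options: deriving the proportion in a mutate() after summarise(), and computing it directly inside summarise(). The toy data frame df_toy is hypothetical; only the column names follow the course code. The .by argument assumes dplyr 1.1.0 or later.

```r
library(dplyr)

# Hypothetical toy linelist; only the column names follow the course code
df_toy <- tibble(
  sub_prefecture = c("A", "A", "A", "B", "B"),
  sex            = c("f", "m", "f", "m", NA)
)

# Option 1: count inside summarise(), then derive the proportion in mutate()
two_steps <- df_toy %>%
  summarise(
    .by = sub_prefecture,
    n_patients = n(),
    n_female = sum(sex == "f", na.rm = TRUE)
  ) %>%
  mutate(prop_female = n_female / n_patients)

# Option 2: compute everything directly inside summarise()
one_step <- df_toy %>%
  summarise(
    .by = sub_prefecture,
    n_patients = n(),
    n_female = sum(sex == "f", na.rm = TRUE),
    prop_female = n_female / n_patients
  )

two_steps
```

Both options should return the same table, so keeping the mutate() step is a teaching choice rather than a technical requirement.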
also call the proportion of dead patients
I'm not sure what you mean by this... Are you trying to check whether they know what CFR is? [Hugz]: Yes, that was the idea.
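If the idea is to have them compute the CFR, a sketch like the following might help pin down what the expected answer looks like. The data frame df_toy and the outcome value "recovered" are hypothetical; "dead" follows the course code. It also assumes that patients with a missing outcome stay in the denominator, which is a choice the exercise would need to state explicitly.

```r
library(dplyr)

# Hypothetical toy linelist; "recovered" is an assumed outcome label
df_toy <- tibble(
  sub_prefecture = c("A", "A", "A", "A", "B", "B"),
  outcome        = c("dead", "recovered", "recovered", "dead", "recovered", NA)
)

cfr_df <- df_toy %>%
  summarise(
    .by = sub_prefecture,
    n_patients = n(),
    n_dead = sum(outcome == "dead", na.rm = TRUE)
  ) %>%
  # Assumption: missing outcomes are kept in the denominator
  mutate(cfr = n_dead / n_patients)

cfr_df
```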
Can you try to use the syntax to calculate the mean age of female patients?
It's hard to tell whether they are supposed to be building up a pipe or just doing one-offs.
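For what it's worth, the two readings of the exercise could look like this. It is only a sketch on a hypothetical toy data frame (df_toy), using the same subsetting syntax as the course code, and .by assumes dplyr 1.1.0 or later.

```r
library(dplyr)

# Hypothetical toy linelist; only the column names follow the course code
df_toy <- tibble(
  sub_prefecture = c("A", "A", "A", "B", "B"),
  sex            = c("f", "m", "f", "f", "m"),
  age            = c(10, 40, 20, 5, 30)
)

# Reading 1: a one-off summary over the whole data frame
one_off <- df_toy %>%
  summarise(mean_age_female = mean(age[sex == "f"], na.rm = TRUE))

# Reading 2: the same expression added to a grouped summary
by_sub_pref <- df_toy %>%
  summarise(
    .by = sub_prefecture,
    n_patients = n(),
    mean_age_female = mean(age[sex == "f"], na.rm = TRUE)
  )

by_sub_pref
```

Stating in the task which of the two is expected would remove the ambiguity.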
organized and well docume
[Hugo]: here is a comment
RStudio project, installing packages
here is a comment
Foreshadowing. File paths actually work a bit differently in R Markdown files than they do in R scripts, but this is something we will talk about much later in the course. If you don’t know what R Markdown is at the moment, don’t worry about it.
I would remove this. It's quite advanced and piles onto the already huge load of new information they are getting here.
Importing .xlsx files
I still feel strongly about this. I think we should stick to simple, conventional importing with .csv and move the .xlsx import/export into its own dedicated satellite - happy to hear what people think here. I agree the current session is not so long, so .xlsx would fit in there, but it's more for organisational purposes, especially because we will probably have a satellite regarding satellites. There is a lot to be said, so better not to duplicate.
Add the code to create an object called path_data_raw
I would suggest we drop the intermediate object here. I hear that it makes them practice with objects, but I feel it's confusing to introduce here("data", "raw") and then suddenly ask them to do here(path_data_raw, "msf_linelist_moissala_2023-09-24.xlsx"). The variable does not add much here and is only truly important in long automated scripts ...
OneDrive doesn’t play well with R as it will attempt to constantly synchronize certain project files in a way that can cause errors or memory problems.
I think side notes are distracting; I had not even noticed them until now. I would suggest we move this type of information (definitions/deeper concepts) to the tooltips.
The principles you learned in the Data Management module will apply here as well: we should do our best to ensure that our projects won’t just work today but can also be reused and shared in the future. While doing this is not always easy, there are several best practices that can help us, and one of the most important is to start with a good, organized code base.
Test of a comment here - what happens if I render ?
OneDrive doesn’t play well with R as it will attempt to constantly synchronize certain project files in a way that can cause errors or memory problems.
As discussed, I think side notes are distracting and are actually hard to spot. This is basically the type of content (exploring a concept/definition) I would put in a tooltip.
Published December 18, 2024
Shall we show the date of the last update rather than the publication date?
Resources
Really good idea to have a Resources section.
If you want to learn about importing several sheets in one go, or several similar files from a folder, go to the satellite on multiple imports.
I would send them to the full list of satellites rather than one specific one. The point is that they choose based on their specific needs
Importing .xlsx files
As discussed before, I think .xlsx import/export deserves its own satellite. It's a whole topic in itself, and I feel like we will make this satellite anyway, so better to prevent duplication. I would also stick to very simple and basic material at this stage: let's remember it's probably their first few hours using R, and a lot has already been covered until now.
I would show import with .csv only, and then have a satellite that deals with working with .xlsx in R.
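As an illustration of how small the .csv-only version could be, here is a sketch using base R on a temporary file so that it is self-contained. The toy data are hypothetical, and the course may well use a different import function; read.csv() is only an assumption made for this example.

```r
# Hypothetical toy data written to a temporary file, so the sketch runs anywhere
path_csv <- tempfile(fileext = ".csv")
toy <- data.frame(id = 1:3, sex = c("f", "m", "f"), age = c(4, 12, 31))
write.csv(toy, path_csv, row.names = FALSE)

# The import step itself is a single call
df_linelist <- read.csv(path_csv)

# Quick check of what was imported
str(df_linelist)
```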
This type of function, providing a unified interface to other specific functions, is known as a wrapper.
I am really not a big fan of the side notes. I find them distracting and actually hadn't noticed them until now. I would suggest we stick to the three "tasks" callouts and the three "informational" callouts to convey information. A note like this one, which deepens a topic/concept, should really become a tooltip.
Create a new section in your code called File Paths. Add the code to create an object called path_data_raw that contains the path to your raw data folder using the function here(). We can now pass our new variable path_data_raw back into here() in order to create a full path to a specific data file.
Can we not simplify and use just here::here() without saving to a new variable ?
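To compare the two approaches side by side, here is a minimal sketch. The file name is hypothetical, and the comparison assumes a version of the here package that returns absolute paths unchanged when they are passed as the first argument, which is what the exercise as written relies on.

```r
library(here)

# Hypothetical file name, for illustration only
file_name <- "linelist.xlsx"

# With an intermediate object, as in the current exercise
path_data_raw <- here("data", "raw")
path_with_object <- here(path_data_raw, file_name)

# Without the intermediate object
path_direct <- here("data", "raw", file_name)

# Compare the two resulting paths
identical(path_with_object, path_direct)
```

If the two paths match, the intermediate object is purely a matter of style at this stage.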
The first is to use the base R function file.path(), which will accept a set of the relevant parts (folders) in your desired path and combine them into a file path using the syntax of your local operating system, whichever it is:
    file.path("data", "raw", "exemple_linelist.xlsx")
    [1] "data/raw/exemple_linelist.xlsx"
Note that the path is relative, here to the current working directory.
While file.path() works fine
Not sure we need to mention file.path actually - here::here() is now pretty common so I don't think it matters to know the base solution
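If the file.path() passage stays, a short base R sketch could make the "relative to the working directory" remark tangible. The file name comes from the quoted passage; building the absolute path by hand with getwd() is only an illustration, not something the course text does.

```r
# Build a relative path from its parts (base R, no packages needed)
rel_path <- file.path("data", "raw", "exemple_linelist.xlsx")
rel_path

# A relative path is resolved against the current working directory,
# so the file it points to changes whenever the working directory changes
abs_path <- file.path(getwd(), rel_path)
abs_path
```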
Foreshadowing. File paths actually work a bit differently in R Markdown files than they do in R scripts, but this is something we will talk about much later in the course. If you don’t know what R Markdown is at the moment, don’t worry about it.
I would remove this. It seems a bit out of scope, and let's try not to overload them with information.