longer for those in non-crayfish regions
Clearly state the direction.
explained by sampling variation
Can we explain the value of the sample mean by the standard deviation?
expected counts
expected frequencies
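A minimal sketch of how expected counts are usually computed for a two-way table (row total × column total ÷ grand total). The table of counts and the function name are made up for illustration and are not from the notes.

```python
def expected_counts(table):
    """Expected counts if there were no association between rows and columns."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Each cell: (its row total * its column total) / grand total
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

# Hypothetical observed counts (rows = groups, columns = outcomes).
observed = [[30, 10],
            [20, 40]]
expected = expected_counts(observed)
print(expected)
```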
the second column
Odds and odds ratio are always relative to second row/column.
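A small sketch of that convention, assuming "relative to the second row/column" means the second column is the denominator of each odds and the second row is the denominator of the odds ratio. The counts and function names are hypothetical.

```python
def odds(row):
    """Odds for one row: first-column count relative to second-column count."""
    first, second = row
    return first / second

def odds_ratio(table):
    """Odds ratio: odds in the first row relative to odds in the second row."""
    return odds(table[0]) / odds(table[1])

# Hypothetical 2x2 table of counts.
observed = [[30, 10],
            [20, 40]]
print(odds(observed[0]), odds(observed[1]), odds_ratio(observed))
```

Swapping the rows or the columns changes the values, so it matters which row and column the software treats as "second".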
19.612
telling us "How much the difference between those sample means is likely to vary from sample to sample."
standard deviation
The standard deviation of the difference is not the difference between the two given values of the standard deviation.
22.039
The standard deviation of the difference is not the difference between the two given values of the standard deviation.
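A hedged illustration of the point, assuming two independent samples: the standard error of the difference between two sample means combines the two variances, so it is not simply one standard deviation minus the other. The summary values below are invented, not the ones in the notes.

```python
import math

def se_difference(sd1, n1, sd2, n2):
    """Standard error of (mean1 - mean2) for two independent samples."""
    return math.sqrt(sd1**2 / n1 + sd2**2 / n2)

# Hypothetical sample standard deviations and sample sizes.
sd1, n1 = 3.0, 9
sd2, n2 = 4.0, 16

print(se_difference(sd1, n1, sd2, n2))  # combines both sources of variation
print(abs(sd1 - sd2))                   # naive "difference of the SDs": a different number
```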
Four types of validities
Ecological validity: tells us whether the results are likely to be realised in the real world
Statistical validity: tells us whether our statistical methods look appropriate
internal validity
tells if the conclusions we have made within our sample are reasonable
external validity
tells us if our sample represents the population
s
the sample standard deviation
x̄
the sample mean
μ
the population mean
standard deviation
the standard deviation is σ
mean of the distribution
the mean of the distribution is µ
sampling variation
sample-to-sample variation (also called sample variation)
Some other samples may even have produced the opposite results
A different sample would have produced different sample odds.
Conditional probability
the probability of an event, given that a condition holds
difficult
Numerical summaries are difficult for describing the relationship between two quantitative variables, because the possible relationships vary greatly.
correlation coefficient
A correlation coefficient is a single number encapsulating all this information.
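A minimal sketch of computing one such number, assuming the Pearson correlation coefficient is the one meant. The data are invented, and the function name is only for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data with a perfectly linear, increasing relationship.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
print(pearson_r(xs, ys))
```

Values near +1 or -1 indicate a strong linear relationship, and values near 0 indicate little linear relationship.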
325.28
Not necessarily the difference between the two individual standard deviations.
It is the standard deviation of all the individual reductions.
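A small sketch of the distinction for paired data, using made-up before/after values (not the data behind 325.28): the standard deviation is computed from each individual's reduction, and is generally not the same as subtracting one standard deviation from the other.

```python
import statistics

# Hypothetical before/after measurements on the same five individuals.
before = [10.0, 12.0, 9.0, 14.0, 11.0]
after = [8.0, 11.0, 9.0, 10.0, 9.0]

# One reduction per individual.
reductions = [b - a for b, a in zip(before, after)]

sd_of_reductions = statistics.stdev(reductions)
difference_of_sds = statistics.stdev(before) - statistics.stdev(after)

print(sd_of_reductions, difference_of_sds)  # two different quantities
```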
A numerical summary
numerical summary table
Right
simplified, with straight lines indicating each range
Left
a bit messy
distribution
a way to summarise quantitative data;
describes what values are present in the data, and how often those values appear
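As a rough illustration, a distribution can be tabulated by counting how often each value appears. The data below are invented.

```python
from collections import Counter

# Hypothetical quantitative data (e.g. number of items per person).
values = [2, 3, 3, 5, 2, 3, 4]

# Which values are present, and how often each appears.
distribution = Counter(values)
for value in sorted(distribution):
    print(value, distribution[value])
```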
Understanding the type of data collected is essential
Understanding variables we have is quite important.
Quantitative research can involve both quantitative and qualitative data
because both types of variables can be summarized and analysed numerically
People do not always answer truthfully
People may answer the questions even if they don't understand the questions.
Question wording can be important.
It can make a big difference to the answers that we get.
Do you avoid purchasing water in plastic bottles unless it is carbonated, unless the bottles are plastic but not necessarily if the lid is recyclable?
Complicated to answer
Do you drink water in plastic and/or glass bottles?
Asks two things in one question ... better to ask two separate questions here
Are you more concerned about Coagulase-negative Staphylococcus or Neisseria pharyngis in bottled water?
Most people will give answers even if they actually don't know.
⇒ getting useless data
Do you drink more water now?
Ambiguous ... What do you mean? (more than yesterday? last week? WHEN?)
Do you support banning bottled water?
Leading ... the way the question is posed makes it very obvious what answer is expected
exhaustive
EXCLUSIVE
Ensure options are exhaustive
Make sure that they're exhaustive → cover all options
Ensure clarity
clarity in our questions → we get the answers we seek
Avoid problems with ethics
Avoid questions that ask people to admit to breaking laws, or to reveal confidential or private information
double-barrelled questions
two things in one question
Avoid asking the uninformed
Ensure the people we're asking understand what we're talking about.
Avoid leading questions
Respondents may be pushed towards a particular answer
questionnaire
a set of questions for respondents to answer
the types of analyses and software (including version) used
What types of analysis are we going to use?
What software will we use for those analyses?
how data are collected from the individuals
How do we get the information from/about those individuals?
how individuals are chosen from the population
How do we get the individuals from the population?
repeatable
everybody can understand exactly what was done, and how
a pilot study
a small test conducted before the actual data collection, not the full study
to identify (and hence fix) possible problems
This leads to better future studies.
repeatable
repeatable ⇒ enables others to replicate the study ⇒ supports ethical research
protocol
a procedure documenting the details of the design and implementation of studies, and for data collection
protocol
a plan = should be established and documented before collecting the data → explains exactly how the data will be obtained, which will include operational definitions
the extent to which a cause-and-effect relationship can be established in a study
effectiveness
the ability to generalise
generalizability
ecological validity
The likely practicality of the study results in the real world
autopsies
post-mortem examinations
by helping to manage confounding
by avoiding lurking variables (Sect. 3.4).
by determining if the comparison groups are similar (Sect. 7.2).
by using the information in analysis (Sect. 7.2).
triple blind
If the researchers, participants and the analyst are blinded to the comparison groups, the study is called triple blind.
double blind
If both the researchers and participants are blinded to the comparison groups, the study is called double blind.
single blind
If only the individuals are blinded to the comparison groups, the study is called single blind.
placebo effect
individuals in a study may report effects of a treatment, even if they have not received an active treatment
mental arithmetic
calculation done in one's head
observer effect
∙ Suppose the researchers assessing the study outcomes knew the diet allocated to each patient.
∙ Researchers’ expectations or hopes for how the new diet will perform may unconsciously influence how the researchers interact with the individuals, and so perhaps (unconsciously) influence the behaviour of the individuals in the study.
white-coat hypertension
A temporary rise in blood pressure in the doctor's office.
In experimental studies, people often know they are in a study, due to ethics requirements (Sect. 5.2), and the Hawthorne effect is difficult to manage
∙ The impact of the Hawthorne effect can be minimized by blinding the individuals, so that:
the individuals do not know that they are participating in a study; the individuals do not know the aims of the study; and/or the individuals do not know which treatment they are receiving in the study.
∙ Blinding people to knowing they are involved in a study is often difficult, as ethics often requires people's informed consent.
Hawthorne effect
People, and perhaps animals, may behave differently if they know (or think) they are being watched.
ensure that the values of potential confounding variables are approximately evenly spread between the comparison groups
This is true for identified potential confounders (such as age), and also for variables that are not even considered as confounders, or that are hard to measure or observe (such as genetic conditions).
→ manage lurking variables
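A minimal sketch, assuming the passage refers to random allocation of individuals to comparison groups. Shuffling and then splitting tends to spread a variable such as age roughly evenly between groups, although any single allocation can still be somewhat unbalanced. The ages and the function name are hypothetical.

```python
import random

def randomly_allocate(individuals, seed=None):
    """Shuffle the individuals and split them into two equal-sized groups."""
    rng = random.Random(seed)
    shuffled = individuals[:]      # copy, so the original list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical ages of 40 participants (age is a potential confounder).
ages = list(range(20, 60))
group_a, group_b = randomly_allocate(ages, seed=1)

# Compare the mean age in each group.
print(sum(group_a) / len(group_a), sum(group_b) / len(group_b))
```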
recording the values of potential confounding variables
recording all potential extraneous variables is important
control variable
The impact of some extraneous variables on the response variable can be reduced by fixing the values of the variable. = control variable
Control (or controlled) variables are extraneous variables whose values are fixed for the study.
Any difference in faecal weight detected between the two groups may not be due to the diets
There are many other differences between the two groups (those listed above).
confounding variables
A confounding variable (or a confounder) is an extraneous variable associated with the response and explanatory variables.
Confounding is when a third variable influences the observed relationship between the response and explanatory variable.
knowing if the fertilizer dose impacted yield is difficult
they also changed the amount of labour and herbicide, not only the presence of fertilizer
Self-selected sampling
Voluntary response (self-selecting) sample
Representative sampling
non-random sampling
selection bias
The sample may not be representative of the population for many reasons.
These compromise how well the sample represents the population (i.e., compromises external validity and accuracy).
This is not a random sample, but this sample is likely to comprise a variety of students.
trying to have a mix of students as a sample
multi-stage sampling
Multistage sampling: larger collections of individuals are selected using a simple random sample.
Smaller collections of individuals within those large collections are selected using a simple random sample.
The simple random sampling continues for as many levels as necessary, until individuals are being selected (at random).
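A rough two-stage sketch of the idea, with invented schools and students: a simple random sample of the larger collections, then a simple random sample of individuals within each chosen collection. All names here are hypothetical.

```python
import random

def multistage_sample(schools, n_schools, n_students, seed=None):
    """Stage 1: randomly select schools. Stage 2: randomly select students in each."""
    rng = random.Random(seed)
    chosen_schools = rng.sample(sorted(schools), n_schools)
    return {school: rng.sample(schools[school], n_students)
            for school in chosen_schools}

# Hypothetical population: 6 schools with 30 students each.
schools = {f"school_{s}": [f"school_{s}_student_{i}" for i in range(30)]
           for s in range(6)}
sample = multistage_sample(schools, n_schools=2, n_students=3, seed=1)
print(sample)
```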
clusters
a large number of small groups
cluster sampling
Cluster sampling: the population is split into a large number of small groups (clusters).
A simple random sample of clusters is selected, and every member of the chosen clusters becomes part of the sample.
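A small sketch of cluster sampling under made-up data: clusters are chosen at random, and then everyone in the chosen clusters is included. The class and student names are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select clusters, then include every member of each selected cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

# Hypothetical population: 10 small classes of 4 students each.
clusters = {f"class_{c}": [f"class_{c}_student_{i}" for i in range(4)]
            for c in range(10)}
sample = cluster_sample(clusters, n_clusters=3, seed=1)
print(sample)
```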
stratified sampling
The population is split into a small number of large (usually similar) groups called strata
Then cases are selected using a simple random sample from each stratum.
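A minimal sketch of stratified sampling with invented strata: a simple random sample is taken separately within every stratum. The strata, the equal sample size per stratum, and the function name are assumptions for illustration.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Take a simple random sample of the same size from each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

# Hypothetical population split into two large strata.
strata = {
    "first_year": [f"F{i}" for i in range(50)],
    "second_year": [f"S{i}" for i in range(40)],
}
sample = stratified_sample(strata, n_per_stratum=5, seed=1)
print(sample)
```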
strata
a small number of large (usually similar) groups
systematic sampling
the first case is randomly selected; then, more individuals are selected at regular intervals thereafter
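A rough sketch of systematic sampling, assuming a sampling frame stored as a list and a fixed interval: the starting point is random, and every later selection follows at a regular interval. The frame and interval are hypothetical.

```python
import random

def systematic_sample(sampling_frame, interval, seed=None):
    """Pick a random start within the first `interval` cases, then every `interval`-th case."""
    rng = random.Random(seed)
    start = rng.randrange(interval)
    return sampling_frame[start::interval]

# Hypothetical sampling frame of 100 numbered individuals.
frame = list(range(1, 101))
sample = systematic_sample(frame, interval=10, seed=1)
print(sample)
```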
sampling frame
a list of all members of the population
simple random sample
every possible sample of a given size has the same chance of being selected
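A minimal sketch of drawing a simple random sample from a sampling frame, using Python's `random.sample`, which selects without replacement so that every subset of the given size is equally likely. The frame of 100 people is invented.

```python
import random

def simple_random_sample(sampling_frame, n, seed=None):
    """Select n members so that every possible sample of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(sampling_frame, n)

# Hypothetical sampling frame: a list of all members of the population.
frame = [f"person_{i}" for i in range(1, 101)]
sample = simple_random_sample(frame, 10, seed=1)
print(sample)
```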
to ensure that the sample faithfully represents the population
maximising external validity
impersonal chance
you have no say in what's going on in choosing the sample
FIGURE 6.1
Top left: accurate & precise = very close to the target, without much variation around it
Bottom left: accurate & imprecise = on target on average, but individual values are not close to the target
Top right: inaccurate & precise = not much variation, but off target
Bottom right: inaccurate & imprecise = off target on average, and all over the place
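One way to put numbers on the four panels, as a sketch: treat accuracy as how far the average estimate is from the target, and precision as how much the estimates vary. The estimates and the target value are invented.

```python
import statistics

def accuracy_and_precision(estimates, target):
    """Return (bias, spread): bias near 0 means accurate; small spread means precise."""
    bias = statistics.mean(estimates) - target
    spread = statistics.stdev(estimates)
    return bias, spread

target = 50
accurate_precise = [49.8, 50.1, 50.0, 50.2, 49.9]      # like the top-left panel
inaccurate_precise = [54.8, 55.1, 55.0, 55.2, 54.9]    # like the top-right panel

print(accuracy_and_precision(accurate_precise, target))
print(accuracy_and_precision(inaccurate_precise, target))
```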
sampling variation
Many samples are possible, and every sample is likely to be different.
The results of studying a sample depend on which individuals are in the studied sample.
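A small simulation sketch of sampling variation, using an invented population: several samples of the same size are drawn from the same population, and each gives a different sample mean.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical population of 10,000 values.
population = [rng.gauss(100, 15) for _ in range(10_000)]

# Draw five different samples of 25 and record each sample mean.
sample_means = [statistics.mean(rng.sample(population, 25)) for _ in range(5)]
print(sample_means)  # each sample gives a (slightly) different mean
```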
sample
Many different samples are possible.
External validity
A study is externally valid if the results from the sample can be generalised to the population, which is only possible if the sample faithfully represents the population.
the Tuskegee syphilis experiment
the Tuskegee syphilis study; the Tuskegee human experimentation
Ethics are important for all studies, not just those involving people or animals.
e.g. engineering & chemical studies
RQ
decision-making & relational
The following short video
POCI = found in the Abstract
e.g. P: adults with type 2 diabetes
O (response variables): glycated hemoglobin (A1C) and subject's weight, each at two time points (pre-program and 6 months post-program)
C: 2 types of education being received (diabetes patient education alone OR augmented by a community self-management program) = explanatory variable
I: the 2 types of education being received, which are assigned
POCI ⇒ Interventional RQ ⇒ experiment
POCI
Population, Outcome, Comparison and Intervention
The following short video
Example 1: Unit of observation = each flower
Unit of analysis = each bunch = 6
Example 2: Unit of observation = each frozen cube = 24
Unit of analysis = carton = 2 (regular milk & chocolate milk)
Example 3: Unit of observation = individual students
Unit of analysis = individual students
100 different men
units of analysis
one hair strand
one unit of observation
hair strands
units of observation
the same single man
unit of analysis
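A rough sketch of the hair-strand idea with invented measurements: each strand is a unit of observation, but strands from the same man are summarised so that each man (the unit of analysis) contributes one value. The thickness values and names are hypothetical.

```python
from collections import defaultdict
import statistics

# Hypothetical measurements: (man, strand thickness in micrometres).
observations = [
    ("man_1", 70.0), ("man_1", 74.0),
    ("man_2", 60.0), ("man_2", 62.0), ("man_2", 64.0),
]

# Group the units of observation (strands) by unit of analysis (man).
by_man = defaultdict(list)
for man_id, thickness in observations:
    by_man[man_id].append(thickness)

# One summary value per unit of analysis.
per_man_mean = {man_id: statistics.mean(values) for man_id, values in by_man.items()}
print(len(observations), "observations;", len(per_man_mean), "units of analysis")
```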
two-tailed
A two-tailed test, in statistics, is a method in which the critical area of a distribution is two-sided, and which tests whether a sample statistic is significantly greater than or significantly less than a hypothesised value. A two-tailed test checks both whether the mean is significantly greater than x and whether the mean is significantly less than x.
one-tailed
A one-tailed test is a statistical test in which the critical area of a distribution is one-sided so that it is either greater than or less than a certain value, but not both.
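A minimal sketch of how the two kinds of test differ in practice, assuming a test statistic z that is approximately standard normal. The value z = 1.96 and the function name are chosen only for illustration.

```python
from statistics import NormalDist

def p_values(z):
    """One-tailed and two-tailed P-values for a standard-normal test statistic z."""
    standard_normal = NormalDist()
    upper = 1 - standard_normal.cdf(z)    # one-tailed: alternative is "greater than"
    lower = standard_normal.cdf(z)        # one-tailed: alternative is "less than"
    two_tailed = 2 * min(upper, lower)    # two-tailed: either direction
    return upper, lower, two_tailed

upper, lower, two_tailed = p_values(1.96)
print(upper, lower, two_tailed)
```

The two-tailed P-value is twice the smaller one-tailed P-value, because both tails of the distribution count as evidence against the null hypothesis.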
explanatory variable
explanatory = independent variables = explain changes in response
response variable
response = dependent variable = changes in response to the explanatory variable
Exercise 1.4
Quantitative
Exercise 1.2
Qualitative
Exercise 1.3
Quantitative
Exercise 1.1
Quantitative