Workshop timeline
10:30 am–12:00 pm (90 mins total)
- 10:30–10:40 am Introduction and Welcome
- 10:40–11:10 am Experimental Design (Section 3)
  - Group/individual activity (Section 3.1)
- 11:15–11:45 am Randomization (Section 5)
  - Classwide activity (Section 5.1)
  - Group/individual activity (Section 5.2)
- 11:45 am–12:00 pm Discussion & Feedback (Section 6)
Useful links
Summary
This workshop requires your active involvement! The content and activities are based on material used in a second-stage biostatistics course here at UoA. Our 90-minute slot roughly covers two weeks’ worth of content 😱😱 Bear with me: the aim is to give an overview, not an in-depth lecture!!
Throughout this worksheet you’ll find a number of different callout boxes.
Experimental design
Activity
Open Farm Rescue: Tomato Trials, choose your Farm Name and Begin. Explore the application and familiarise yourself with the scenario and backstory!
For the purposes of this activity, when you come to design your experiment, choose only one variety of tomato and set only two treatments. Make sure to explore the greenhouse plots and the different natural conditions.
Note that one of your treatments could be the Control (i.e., 0% manure / not applying anything to a plot).
Run your experiment!
Did all your plants survive, or did some die before reaching maturity? If a plant died, what do you think may have caused this? Could it be treatment related, or just a random occurrence?
Run the in-app R code by hitting Run (this may take a while, but just wait for the plot).
Is there anything that needs to be corrected on your graph? Hint: check the axis label spellings. Can you correct the code and re-run it?
Based on your results, fill in the missing percentage!
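The app takes care of laying out the greenhouse for you, but if you’d like to see what randomised treatment allocation looks like outside the app, below is a minimal R sketch. The plot grid and treatment names here are made up for illustration and are not part of the app.

```r
## minimal sketch of randomised treatment allocation (illustrative only)
set.seed(2024)                              ## for a reproducible shuffle
plots <- expand.grid(row = 1:7, col = 1:6)  ## hypothetical grid of greenhouse plots
treatments <- c("Control (0% manure)", "25% manure")
## assign the two treatments to plots completely at random, roughly half each
plots$treatment <- sample(rep(treatments, length.out = nrow(plots)))
head(plots)
```

Randomising which plot gets which treatment is what lets us put any systematic difference in fruit weight down to the treatment rather than to, say, a sunnier corner of the greenhouse.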
Break
We’ll have a few minutes’ break before this section; if you fancy another quick game, give shady a go.
Randomization tests
Activity
We are going back to school and are going to swap some stickers!
The ten coloured (Blue or White) stickers I’m about to hand out each have a number stapled to them. These data are given below and are also in this Google Sheet.

| Group | Value |
|-------|-------|
| Blue  | 7     |
| Blue  | 6     |
| Blue  | 4     |
| Blue  | 2     |
| Blue  | 2     |
| White | 5     |
| White | 4     |
| White | 6     |
| White | 11    |
| White | 3     |
If we write \(\mu_\text{Blue}\) and \(\mu_\text{White}\) for the means of the Blue and White groups respectively (statisticians love using Greek letters at any given opportunity), then we have
- \(\mu_\text{Blue}\) = 4.2,
- \(\mu_\text{White}\) = 5.8, and
- \(\mu_\text{Blue} - \mu_\text{White}\) = -1.6.
Take a note of these numbers; these are the observed data!
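If you want to double-check these values yourself, here is a quick base R sketch using the sticker data from the table above (the object names are mine, not part of the worksheet):

```r
## sticker data from the table above
stickers <- data.frame(
  group = rep(c("Blue", "White"), each = 5),
  value = c(7, 6, 4, 2, 2, 5, 4, 6, 11, 3)
)
## group means and the observed difference, mu_Blue - mu_White
mean_blue  <- mean(stickers$value[stickers$group == "Blue"])   ## 4.2
mean_white <- mean(stickers$value[stickers$group == "White"])  ## 5.8
mean_blue - mean_white                                         ## -1.6
```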
I want those of you with stickers to separate the stickers from the numbers and swap your sticker with another sticker holder (keeping the number!). You may, or may not, now have a different coloured sticker.
After this iteration we’re going to update the Google Sheet by changing the Group labels (i.e., sticker colours).
We’ll do this a few more times… remembering to annotate your plot with the recalculated value of \(\mu_\text{Blue} - \mu_\text{White}\).
Because we don’t have all day, I recalculated the statistic \(\mu_\text{Blue} - \mu_\text{White}\) 1000 times for the reshuffled labels and plotted the results for you below. Reading this plot, do your answers above differ?
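You don’t need to reproduce the plot yourself, but if you are curious, here is a minimal base R sketch of the idea (not the exact code I used), reusing the `stickers` data frame from the sketch above:

```r
## reshuffle the colour labels 1000 times, keeping the numbers fixed,
## and recompute mu_Blue - mu_White each time
## (assumes the 'stickers' data frame from the earlier sketch)
set.seed(2024)
nreps <- 1000
diffs <- numeric(nreps)
for (i in 1:nreps) {
  shuffled <- sample(stickers$group)   ## randomly relabel the ten stickers
  diffs[i] <- mean(stickers$value[shuffled == "Blue"]) -
    mean(stickers$value[shuffled == "White"])
}
## histogram of the reshuffled differences, with the observed -1.6 marked
hist(diffs, breaks = 20,
     xlab = "Difference between reshuffled group means", main = "")
abline(v = -1.6, col = "darkgreen", lwd = 2)
```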
We’re now going to formalise things a bit more!
Note that the number of times out of the 1000 iterations that
- a value \(\leq -1.6\) was obtained was 249, and
- a value \(\geq 1.6\) was obtained was 235.
The hypothesis test you’ve just carried out can be formalised as
- Test statistic: The difference between the mean of the Blue group and the mean of the White group
  - In maths syntax, \(\mu_\text{Blue} - \mu_\text{White}\)
- NULL hypothesis: The mean of the Blue group is the same as the mean of the White group
  - In maths syntax, \(H_0: \mu_\text{Blue} = \mu_\text{White}\) or \(H_0: \mu_\text{Blue} - \mu_\text{White} = 0\)
- Alternative hypothesis: The mean of the Blue group is not the same as the mean of the White group
  - In maths syntax, \(H_1: \mu_\text{Blue} \neq \mu_\text{White}\) or \(H_1: \mu_\text{Blue} - \mu_\text{White} \neq 0\)
The proportion of times a value at least as extreme as the magnitude of the test statistic was observed gives us information/evidence about whether the NULL hypothesis is plausible. We’re not going to go into too much detail here, but it can be interpreted as a probability (in stats terminology it’s called a p-value¹).
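Putting the counts above together gives that proportion directly (a quick sanity check, using the numbers reported above):

```r
## reshuffled differences at least as extreme as the observed -1.6:
## 249 values <= -1.6 plus 235 values >= 1.6, out of 1000 reshuffles
(249 + 235) / 1000
## [1] 0.484
```

A proportion of roughly 0.48 says that differences at least as extreme as the observed -1.6 turn up about half the time when the labels are shuffled at random, so these data give us no evidence against the NULL hypothesis.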
Script
This activity assumes some familiarity with R. If you are not familiar with R I would still encourage you to give this a go, either directly in the Farm Rescue: Tomato Trials app or (probably the easiest) via Posit Cloud. If you’re not keen then feel free to watch someone who is, and follow along with the code below and/or see if you can figure out what is going on.
Using your data from the experimental design activity (Section 3.1), have a go at running the R code below.
Note that the example below is executed using some example data from my run of the experiment; feel free to use this output to answer the questions posed.
## required package(s)
library(tidyverse) ## install.packages("tidyverse") # to install
## assumes your data is downloaded into a file called results.csv
## could use 'Import Dataset' tab
results <- read.csv('results.csv')
## example data
results
row col plant recipe percent_manure survived fruit_weight
1 1 1 cherry 25% manure 25 yes 47.05592
2 3 1 cherry 60% manure 60 yes 60.35559
3 4 1 cherry 25% manure 25 yes 42.03602
4 6 1 cherry 25% manure 25 yes 36.80037
5 7 1 cherry 60% manure 60 yes 41.20610
6 1 2 cherry 60% manure 60 yes 74.73517
7 2 2 cherry 60% manure 60 yes 67.35283
8 3 2 cherry 25% manure 25 yes 53.75131
9 6 2 cherry 60% manure 60 yes 51.44552
10 7 2 cherry 25% manure 25 yes 37.57966
11 1 3 cherry 25% manure 25 yes 63.71887
12 2 3 cherry 25% manure 25 yes 64.99296
13 3 3 cherry 60% manure 60 yes 78.11955
14 4 3 cherry 60% manure 60 yes 68.56110
15 6 3 cherry 25% manure 25 yes 45.82823
16 7 3 cherry 60% manure 60 yes 52.28843
17 2 5 cherry 25% manure 25 yes 79.12103
18 3 5 cherry 25% manure 25 yes 74.96625
19 5 5 cherry 60% manure 60 yes 83.40485
20 6 5 cherry 25% manure 25 yes 56.96398
21 1 6 cherry 25% manure 25 yes 98.53688
22 2 6 cherry 60% manure 60 yes 116.44931
23 3 6 cherry 60% manure 60 yes 105.81186
24 5 6 cherry 25% manure 25 yes 71.50998
25 6 6 cherry 60% manure 60 yes 79.81267
26 7 6 cherry 60% manure 60 yes 67.78721
27 1 7 cherry 60% manure 60 yes 136.19697
28 3 7 cherry 25% manure 25 yes 92.72198
29 5 7 cherry 60% manure 60 yes 98.67417
30 6 7 cherry 25% manure 25 yes 66.64357
## plotting code matches app
ggplot(data = results, aes(x = recipe, y = fruit_weight)) +
  geom_violin() + geom_dotplot(binaxis = "y", method = "histodot") +
  theme_bw() + xlab("Fertilizer Recipe") + ylab("Fruit Weight (g/m^2)")
## Do we think there's a difference in treatment means?
##---------------------------------##
##----OBSERVED STATISTIC-----------##
##---------------------------------##
## Below is the calculated difference in group means given the sample
diff_in_means <- results %>%
  group_by(recipe) %>% summarise(mean = mean(fruit_weight)) %>%
  summarise(diff = diff(mean)) %>% as.numeric()
diff_in_means
[1] 16.66495
##---------------------------------##
##----RANDOMIZATION LOOP-----------##
##---------------------------------##
nreps <- 1000 ## number of times to iterate
## initiate results vector
randomisation_difference_mean <- numeric(nreps)
for (i in 1:nreps) {
  dat <- data.frame(value = results$fruit_weight)
  dat$random_labels <- sample(results$recipe, replace = FALSE)
  randomisation_difference_mean[i] <- dat %>%
    group_by(random_labels) %>% summarise(mean = mean(value)) %>%
    summarise(diff = diff(mean)) %>% as.numeric()
}
## store results in a data frame
results <- data.frame(randomisation_difference_mean = randomisation_difference_mean)
##---------------------------------##
##----SAMPLING DISTRIBUTION--------##
##---------------------------------##
ggplot(results, aes(x = randomisation_difference_mean)) +
  geom_histogram(bins = 35) +
  theme_bw() + theme(axis.text = element_text(size = 14)) +
  ylab("Count") + xlab("Differences between randomised group means") +
  geom_vline(xintercept = diff_in_means, col = "#023020", size = 1, alpha = 0.6) +
  annotate(geom = 'text', label = "Observed difference \n between means",
           x = 10, y = 75, color = "#023020")
## Do we think there's a difference in treatment means now?
## What do you think our strength of evidence is against the NULL hypothesis?
##---------------------------------##
##----P-VALUE----------------------##
##---------------------------------##
## How many randomized differences in means are at least as extreme as the one we observed?
## absolute value as we are dealing with a two-tailed test
n_exceed <- sum(abs(results$randomisation_difference_mean) >= abs(diff_in_means))
n_exceed/nreps
[1] 0.049
Feedback
I would love any feedback/suggestions/questions!! Feel free to share your thoughts out loud or write them on the board (I will be photographing these at the end of the workshop); you are, of course, also more than welcome to contact me via c.jonestodd@auckland.ac.nz.
Things I’d particularly like to hear your thoughts on:
- The experimental design app; did you find any aspects underexplained or confusing?
- Are there additional features or tools you’d like to see in the software?
- Beyond the Thrive/Survive/Die in-game feedback, what else might you like to see?
- What modifications would you make for different learning stages etc.?
- How might you scale/modify the randomization activity for different learning stages etc.?
Footnotes
1. Recommended reading: The ASA Statement on p-Values: Context, Process, and Purpose.
2. Again, read The ASA Statement on p-Values: Context, Process, and Purpose!