July 25, 2022

How to Simulate Turnout in Tunisia's Constitutional Referendum Using R and the Tidyverse

On coming home, I looked at R-bloggers while vegetating over the Giants game. I have a keen interest in Middle Eastern politics and, specifically, the machinations in North Africa's Maghreb region. So I was intrigued to read that Tunisia is having a constitutional referendum and that Robert, the author of the post, has run a few simulations of how turnout in the referendum might be estimated. I've copied the run below:
library(tidyverse)
library(ggthemes) # we need these two packages and our man forgot to load them.

N <- 7000000

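number.of.voters <- N # for clarity purposes

stations <- 4500

# assign every voter to a polling station; stations get unequal weights (1 to 3), so sizes differ
vote_assign <- sample(1:stations, number.of.voters, replace=TRUE, prob=sample(1:3, stations, replace=TRUE))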
# loop over turnout, sample polls, estimate turnout
# (mclapply cannot use mc.cores > 1 on Windows; set mc.cores=1 there)

over_turnout <- parallel::mclapply(seq(.05,.5,by=.1), function(t) {
  # polling station varying turnout rates: one rate per station, centred on t

  station_rates <- rbeta(n=stations, t*20, (1-t)*20)

  # randomly let voters decide to vote depending on true station-level turnout rate

  pop_turnout <- lapply(1:stations, function(s) {
    tibble(turnout=rbinom(n=sum(vote_assign==s), size=1, prob=station_rates[s]), station=s)
  }) %>% bind_rows

  over_samples <- lapply(1:1000, function(i) {

    # sample 50 random polling stations 1,000 times
    sample_station <- sample(1:stations, size=50)

    # estimated turnout = share of voters who turned out in the sampled stations
    turn_est <- mean(pop_turnout$turnout[pop_turnout$station %in% sample_station])

    return(tibble(mean_est=turn_est, experiment=i))
  }) %>% bind_rows %>%
    mutate(Turnout=t)

  over_samples
}, mc.cores=10) %>% bind_rows

# summarise the estimates at each true turnout level and plot them
over_turnout %>%
group_by(Turnout) %>%
summarize(pop_est=mean(mean_est), low_est=quantile(mean_est,.05), high_est=quantile(mean_est, .95)) %>%
ggplot(aes(y=pop_est,x=Turnout)) +
geom_pointrange(aes(ymin=low_est, ymax=high_est),size=.5,fatten=1) +
geom_abline(slope=1,intercept=0,linetype=2,colour="red") +
theme_economist() + # I prefer this to theme_tufte, as Robert employed
theme(text=element_text(family="")) +
labs(y="Estimated Turnout",x="True Turnout", caption=stringr::str_wrap("Comparison of Mourakiboun estimated (y axis) versus actual turnout (x axis). Red line shows where true and estimated values are equal. Based on simple random samples of 50 polling stations. Simulation assumes no problems with recording votes.")) +
ylim(c(0,0.5)) +
xlim(c(0,0.5))
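The run above draws its 50 stations uniformly at random, and it is heavy going on a laptop. As a rough, scaled-down sketch (not Robert's code), the following compares that uniform draw with a biased draw in which higher-turnout stations are more likely to be sampled. The sizes, the seed, the single turnout level of 0.25, the helper `estimate_turnout()` and, above all, the choice to make sampling probability proportional to a station's turnout rate are my own assumptions for illustration.

```r
set.seed(2022)  # arbitrary seed so the sketch is reproducible

# Hypothetical, scaled-down sizes (the run above uses 7,000,000 voters and 4,500 stations)
n_voters     <- 50000
n_stations   <- 500
true_turnout <- 0.25

# Assign voters to stations with unequal station sizes, as in the run above
station_of_voter <- sample(seq_len(n_stations), n_voters, replace = TRUE,
                           prob = sample(1:3, n_stations, replace = TRUE))

# One turnout rate per station, centred on the true turnout
station_rates <- rbeta(n_stations, true_turnout * 20, (1 - true_turnout) * 20)

# Each voter turns out with the rate of his or her own station
voted <- rbinom(n_voters, size = 1, prob = station_rates[station_of_voter])

# Estimate turnout from 50 sampled stations; weights = NULL means a uniform draw
estimate_turnout <- function(weights = NULL) {
  picked <- sample(seq_len(n_stations), size = 50, prob = weights)
  mean(voted[station_of_voter %in% picked])
}

unbiased <- replicate(1000, estimate_turnout())

# ASSUMPTION: bias modelled as sampling probability proportional to station turnout rate
biased <- replicate(1000, estimate_turnout(weights = station_rates))

round(c(truth = mean(voted), unbiased = mean(unbiased), biased = mean(biased)), 3)
```

Under these assumptions the uniform draw should land close to the true turnout, while the biased draw should sit noticeably above it, since busy stations are over-represented in the sample.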
Judging from our experience with Brexit, referenda tend to be very hard to predict when a country isn't accustomed to holding them regularly, and Robert does acknowledge this.
