Chapter 3 Re-analysis 1

This reanalysis is based on the paper:

Iwasa, Kazunori, et al. “Visual perception of moisture is a pathogen detection mechanism of the behavioral immune system.” Frontiers in Psychology 11 (2020): 170.

The data of this paper can be obtained from the repository: https://osf.io/5quj9/

library(bpcs)
library(tidyverse)
library(knitr)
library(cmdstanr)
PATH_TO_CMDSTAN <- paste(Sys.getenv("HOME"), '/.cmdstan/cmdstan-2.27.0', sep = '')
set_cmdstan_path(PATH_TO_CMDSTAN)
set.seed(99)

3.1 Importing the data

First let’s read the data:

d <- readxl::read_xlsx('data/moisture.xlsx', sheet = 'Exp02_BTmodel')

Below we show a sample of how the original data looks like:

sample_n(d, size = 5) %>% 
  kable(caption = 'Sample of the original data') %>% 
  kableExtra::scroll_box(width = "100%")

Table 3.1: Sample of the original data
player1	player2	win1	win2
image7	image6	131	109
image5	image6	18	222
image7	image2	226	14
image4	image1	239	1
image5	image8	20	220

The data is in the aggregated format. So let’s expand it to the long format

d_moisture <- expand_aggregated_data(d, player0 = 'player1', player1='player2', wins0 = 'win1', wins1='win2')

Now the data looks like this:

sample_n(d_moisture, size = 20) %>% 
  kable(caption = 'The data in the long format') %>% 
  kableExtra::scroll_box(width = "100%")

Table 3.2: The data in the long format
player0	player1	y	rowid
image2	image7	1	2922
image5	image2	0	7102
image7	image6	1	11490
image2	image8	1	3200
image1	image3	1	358
image8	image4	1	12704
image6	image3	0	8973
image8	image3	0	12308
image6	image4	0	9188
image5	image2	0	7071
image5	image1	0	6724
image4	image7	1	6409
image8	image1	0	11986
image4	image1	0	5278
image3	image7	1	4694
image4	image5	1	5972
image1	image8	1	1533
image4	image8	1	6560
image1	image3	1	398
image3	image5	1	4228

3.2 Analysis with the Bradley-Terry model and the order effect model

Although this is multiple judgment case, the dataset in the aggregated format does not provide information of how each individual voted, therefore we cannot compensate this effect. Therefore, we will create an analysis with only the Bradley-Terry model and the Bradley-Terry model with order effect

First, let’s sample the Bradley-Terry model

m_moisture <-
  bpc(
    d_moisture,
    player0 = 'player0',
    player1 = 'player1',
    result_column = 'y',
    model_type = 'bt',
    iter = 2000,
    show_chain_messages = T
  )
save_bpc_model(m_moisture,'m_moisture','./fittedmodels')

Low let’s sample the model with order effect.

Although the authors said that the order of the images was inverted to compensate order effect, we can still estimate if there is an order effect or not in the choice.

But first we need to create a column indicating if there was order effect for that case or not. In this problem, we just indicate with a column of ones that all instances could have an order effect. Not that the package marks the order effect relative to player1. So if the values should be interpret as such.

d_moisture <- d_moisture %>% 
  dplyr::mutate(z1 = 1)

m_moisture_order <-
  bpc(
    d_moisture,
    player0 = 'player0',
    player1 = 'player1',
    result_column = 'y',
    z_player1 = 'z1',
    model_type = 'bt-ordereffect',
    iter = 2000,
    show_chain_messages = T
  )
save_bpc_model(m_moisture_order,'m_moisture_order','./fittedmodels')

3.3 Diagnostics

check_convergence_diagnostics(m_moisture)

## Processing csv files: /Users/xramor/OneDrive/2021/bpcs-online-appendix/.bpcs/bt-202109130836-1-55f37e.csv, /Users/xramor/OneDrive/2021/bpcs-online-appendix/.bpcs/bt-202109130836-2-55f37e.csv, /Users/xramor/OneDrive/2021/bpcs-online-appendix/.bpcs/bt-202109130836-3-55f37e.csv, /Users/xramor/OneDrive/2021/bpcs-online-appendix/.bpcs/bt-202109130836-4-55f37e.csv
## 
## Checking sampler transitions treedepth.
## Treedepth satisfactory for all transitions.
## 
## Checking sampler transitions for divergences.
## No divergent transitions found.
## 
## Checking E-BFMI - sampler transitions HMC potential energy.
## E-BFMI satisfactory.
## 
## Effective sample size satisfactory.
## 
## Split R-hat values satisfactory all parameters.
## 
## Processing complete, no problems detected.

check_convergence_diagnostics(m_moisture_order)

## Processing csv files: /Users/xramor/OneDrive/2021/bpcs-online-appendix/.bpcs/bt-202109130912-1-1cedf9.csv, /Users/xramor/OneDrive/2021/bpcs-online-appendix/.bpcs/bt-202109130912-2-1cedf9.csv, /Users/xramor/OneDrive/2021/bpcs-online-appendix/.bpcs/bt-202109130912-3-1cedf9.csv, /Users/xramor/OneDrive/2021/bpcs-online-appendix/.bpcs/bt-202109130912-4-1cedf9.csv
## 
## Checking sampler transitions treedepth.
## Treedepth satisfactory for all transitions.
## 
## Checking sampler transitions for divergences.
## No divergent transitions found.
## 
## Checking E-BFMI - sampler transitions HMC potential energy.
## E-BFMI satisfactory.
## 
## Effective sample size satisfactory.
## 
## Split R-hat values satisfactory all parameters.
## 
## Processing complete, no problems detected.

Checking convergence of the models with shinystan

launch_shinystan(m_moisture)

launch_shinystan(m_moisture_order)

3.3.1 Tables and plots

First, lets get a table for the parameters (to export it to Latex just utilize the format option)

get_parameters_table(m_moisture, format = 'html', caption = 'Parameters estimates for the simple Bradley-Terry model')

Table 3.3: Parameters estimates for the simple Bradley-Terry model
Parameter	Mean	Median	HPD_lower	HPD_higher
lambda[image1]	-4.476	-4.492	-6.800	-2.407
lambda[image2]	-2.362	-2.379	-4.680	-0.298
lambda[image3]	-0.049	-0.067	-2.408	1.986
lambda[image4]	0.144	0.126	-2.095	2.302
lambda[image5]	0.303	0.283	-1.916	2.476
lambda[image6]	1.848	1.826	-0.409	3.980
lambda[image7]	2.023	2.002	-0.210	4.193
lambda[image8]	3.038	3.014	0.642	5.052

get_parameters_table(
  m_moisture_order,
  params = c('lambda', 'gm'),
  format = 'html',
  caption = 'Parameters estimates for the Bradley-Terry model with order effect',
  keep_par_name = F,
  n_eff = T
)

Table 3.4: Parameters estimates for the Bradley-Terry model with order effect
Parameter	Mean	Median	HPD_lower	HPD_higher	n_eff
image1	-4.525	-4.560	-6.616	-2.522	843
image2	-2.413	-2.453	-4.513	-0.424	842
image3	-0.101	-0.141	-2.152	1.924	842
image4	0.091	0.048	-2.009	2.080	841
image5	0.251	0.216	-1.797	2.287	841
image6	1.795	1.755	-0.289	3.798	838
image7	1.970	1.931	-0.146	3.940	836
image8	2.984	2.952	0.943	5.035	841
gm	0.001	0.001	-0.054	0.058	1825

We can see from the table of the order effect that the gamma parameter is very close to zero, indicating that there is no order effect

Now lets compute the posterior ranks of the images based on the first BT model

r <- get_rank_of_players_df(m_moisture)

Generating a table with the images

r %>% 
  mutate(Image = c("data/moisture/Stimuli/image08.jpg",
                   "data/moisture/Stimuli/image07.jpg",
                   "data/moisture/Stimuli/image06.jpg",
                   "data/moisture/Stimuli/image05.jpg",
                   "data/moisture/Stimuli/image04.jpg",
                   "data/moisture/Stimuli/image03.jpg",
                   "data/moisture/Stimuli/image02.jpg",
                   "data/moisture/Stimuli/image01.jpg") %>% pander::pandoc.image.return()) %>% 
  knitr::kable(caption = "Rank of the images based on moisture content", format='html', booktabs=T)

Table 3.5: Rank of the images based on moisture content
Parameter	MedianRank	MeanRank	StdRank
image8	1	1.000	0.0000000
image7	2	2.002	0.0446990
image6	3	2.998	0.0446990
image5	4	4.006	0.0772656
image4	5	4.995	0.0835583
image3	6	5.999	0.0316228
image2	7	7.000	0.0000000
image1	8	8.000	0.0000000

Now lets get a caterpillar style plot

plot(
  m_moisture_order,
  HPDI = T,
  title = 'Estimates of the moisture content',
  xaxis = 'Images',
  yaxis = 'Ability',
  rotate_x_labels = F,
  APA = F,
  keep_par_names=F) + theme_bw()

3.3.2 WAIC

Calculating the WAIC of both models

get_waic(m_moisture)

## 
## Computed from 8000 by 13440 log-likelihood matrix
## 
##           Estimate    SE
## elpd_waic  -4066.2  78.9
## p_waic         7.0   0.2
## waic        8132.4 157.7

get_waic(m_moisture_order)

## 
## Computed from 8000 by 13440 log-likelihood matrix
## 
##           Estimate    SE
## elpd_waic  -4067.2  78.8
## p_waic         7.9   0.2
## waic        8134.3 157.7

We can see that the WAIC of the models are quite similar and that the model without the order effect has a slightly smaller WAIC and less parameters. Therefore we will select it.