This is the main function of the package. It uses precompiled Stan models to sample the posterior distribution of the specified model given the input data. For more information and larger usage examples, see the vignettes.

bpc(
  data,
  player0,
  player1,
  player0_score = NULL,
  player1_score = NULL,
  result_column = NULL,
  z_player1 = NULL,
  cluster = NULL,
  subject_predictors = NULL,
  predictors = NULL,
  model_type,
  solve_ties = "random",
  win_score = "higher",
  priors = NULL,
  chains = 4,
  parallel_chains = 4,
  iter = 2000,
  warmup = 1000,
  show_chain_messages = FALSE,
  seed = NULL,
  log_lik = TRUE,
  dir = NULL
)

Arguments

data

A data frame containing the observations. The other parameters specify the names of its columns.

player0

A string with the name of the column containing the players 0. This column should be of string/character type, not factor type.

player1

A string with the name of the column containing the players 1. This column should be of string/character type, not factor type.
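
Because both player columns must be character rather than factor, data that was read with factors needs converting before the call. A minimal sketch in base R; the data frame `matches` and its values are made up for illustration:

```r
# Hypothetical data in which the player columns are stored as factors
matches <- data.frame(
  player0 = factor(c("Graf", "Seles")),
  player1 = factor(c("Seles", "Sabatini")),
  y = c(0, 1)
)

# Convert the factor columns to character before passing the data on
matches$player0 <- as.character(matches$player0)
matches$player1 <- as.character(matches$player1)

str(matches)
```

After the conversion, `matches` can be passed as `data`, with `player0 = 'player0'` and `player1 = 'player1'`.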

player0_score

A string with the name of the column containing the scores of players 0.

player1_score

A string with the name of the column containing the scores of players 1.

result_column

A string with the name of the column containing the winners: 0 for player 0, 1 for player 1, and 2 for ties.

z_player1

A string with the name of the column containing the order effect for player 1. For example, if player 1 has the home advantage, this column should contain 1; otherwise it should contain 0.
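
As an illustration of the 0/1 coding, the order-effect column can be derived from a column that records who plays at home. This is only a sketch: the data frame `matches` and the column names `home` and `z1` are hypothetical.

```r
# Hypothetical data: 'home' names the player with the home advantage
matches <- data.frame(
  player0 = c("A", "B", "C"),
  player1 = c("B", "C", "A"),
  home = c("B", "B", "A"),
  stringsAsFactors = FALSE
)

# 1 when player 1 has the home advantage, 0 otherwise
matches$z1 <- as.integer(matches$home == matches$player1)

matches
```

The new column would then be referenced by name, as in `z_player1 = 'z1'`.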

cluster

A vector of strings with the names of the columns containing the clusters for each observation. To be used with a random effects model. These columns should contain strings.

subject_predictors

A vector of strings with the names of the columns containing the subject predictors. Since all S parameters follow the same normal prior centered at 0, we recommend normalizing numeric predictors, for example with the base scale function.
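
The recommended normalization can be sketched with the base `scale` function. The data frame `trials` and its column names are invented for this example:

```r
# Hypothetical data with two numeric subject predictors
trials <- data.frame(
  subject_age = c(21, 35, 48, 29),
  hours_played = c(2, 10, 4, 8)
)

# scale() centers to mean 0 and rescales to standard deviation 1;
# as.numeric() drops the matrix wrapper that scale() returns
trials$subject_age <- as.numeric(scale(trials$subject_age))
trials$hours_played <- as.numeric(scale(trials$hours_played))

colMeans(trials)
```

The normalized columns would then be listed by name, as in `subject_predictors = c('subject_age', 'hours_played')`.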

predictors

A data frame that contains the players' predictor values when using a generalized model. Only numeric values are accepted. Booleans are accepted but will be cast to integers. The first column should contain the player names; the remaining columns are the predictors. The column names are used as the names of the predictors.
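
A data frame with this layout might look like the sketch below. The player names, the predictor names, and the variable `player_predictors` are placeholders; the explicit integer cast only makes visible the conversion described above.

```r
# Hypothetical predictors table: first column is the player name,
# every other column is one numeric predictor
player_predictors <- data.frame(
  player = c("A", "B", "C"),
  height = c(1.80, 1.75, 1.91),
  left_handed = c(TRUE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# Cast the Boolean column to integer explicitly
player_predictors$left_handed <- as.integer(player_predictors$left_handed)

# Check that every predictor column is numeric
all(vapply(player_predictors[-1], is.numeric, logical(1)))
```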

model_type

Start with a base model, 'bt' or 'davidson', and then append additional options, each prefixed with '-'. The additional options are '-U', '-generalized', '-ordereffect', and '-subjectpredictors'. Below are some examples:

  • 'bt' for the Bradley-Terry model. Ref: Bradley and Terry 1952

  • 'davidson' for the Davidson model, which handles ties. Ref: Davidson 1970

  • 'bt-ordereffect' for the Bradley-Terry model with order effect, for home advantage. Ref: Davidson 1977

  • 'davidson-ordereffect' for the Davidson model with order effect, for home advantage and ties. Ref: Davidson 1977

  • 'bt-generalized' for the generalized Bradley-Terry model with player-specific predictors. Ref: Springall 1973

  • 'davidson-generalized' for the generalized Davidson model with player-specific predictors

  • 'bt-U' for the Bradley-Terry model with random effects. Ref: Bockenholt 2001

  • 'bt-subjectpredictors' for the Bradley-Terry model with subject predictors. Ref: Bockenholt 2001

  • 'davidson-U' for the Davidson model with random effects

  • 'bt-ordereffect-U' for the Bradley-Terry model with order effects and random effects. Use similar syntax for other variations by appending the corresponding options
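
The naming scheme can be sketched as plain string concatenation. The helper `make_model_type` below is hypothetical and not part of the package; it only shows how a base model and its options combine into one string, and it does not check whether a given combination is supported.

```r
# Hypothetical helper: join a base model and its options with '-'
make_model_type <- function(base, options = character(0)) {
  stopifnot(base %in% c("bt", "davidson"))
  paste(c(base, options), collapse = "-")
}

make_model_type("bt")
make_model_type("davidson", c("ordereffect", "U"))
```

The second call builds the string 'davidson-ordereffect-U', which follows the same pattern as the 'bt-ordereffect-U' example.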

solve_ties

A string for the method of handling ties.

  • 'random' for converting ties randomly

  • 'remove' for removing the tie occurrences

  • 'none' to ignore ties. This requires a model capable of handling ties

win_score

A string that indicates which score wins.

  • 'higher': the higher score wins

  • 'lower': the lower score wins
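
To make the relationship between the score columns and the 0/1/2 result coding concrete, here is an illustrative sketch. The function `score_to_result` is hypothetical and is not the package's internal code; it only reproduces the coding described for result_column (0 for player 0, 1 for player 1, 2 for ties).

```r
# Hypothetical illustration of the result coding derived from two scores
score_to_result <- function(s0, s1, win_score = "higher") {
  if (win_score == "lower") {
    # When the lower score wins, swap the roles of the two scores
    tmp <- s0
    s0 <- s1
    s1 <- tmp
  }
  ifelse(s0 > s1, 0L, ifelse(s1 > s0, 1L, 2L))
}

score_to_result(c(3, 1, 2), c(1, 4, 2))
score_to_result(c(3, 1, 2), c(1, 4, 2), win_score = "lower")
```

Equal scores are coded as 2 in both cases; how those ties are then treated depends on solve_ties.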

priors

A list with the parameters for the priors.

  • 'prior_lambda_mu' Mean of the lambda parameter in all models. For the generalized models, this is also the prior for the B parameter. lambda ~ normal(mu, std). Default = 0

  • 'prior_lambda_std' Standard deviation of the lambda parameter in all models. lambda ~ normal(mu, std). Default = 3.0

  • 'prior_nu_mu' Mean of the nu parameter in the Davidson models. nu ~ normal(mu, std)

  • 'prior_nu_std' Standard deviation of the nu parameter in the Davidson models. nu ~ normal(mu, std). Default = 0.3

  • 'prior_gm_mu' Mean of the gm parameter in the order effect model. gm ~ normal(mu, std). Default = 0

  • 'prior_gm_std' Standard deviation of the gm parameter in the order effect model. gm ~ normal(mu, std). Default = 1.0

  • 'prior_U1_std' Standard deviation of the U1 parameter in the random effects model. U ~ normal(0, std). Default = 3.0

  • 'prior_U2_std' Standard deviation of the U2 parameter in the random effects model. U ~ normal(0, std). Default = 3.0

  • 'prior_U3_std' Standard deviation of the U3 parameter in the random effects model. U ~ normal(0, std). Default = 3.0

  • 'prior_S_std' Standard deviation of the subject predictors parameter. S ~ normal(0, S_std). This applies to all predictors. Default =
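
A priors list is an ordinary named R list whose names match the entries above. The sketch below overrides two of them; the variable name `my_priors` and the chosen values are arbitrary, and it is an assumption here that entries left out of the list keep their defaults.

```r
# Hypothetical prior settings: a tighter prior on lambda and a wider one on nu
my_priors <- list(
  prior_lambda_std = 2.0,
  prior_nu_std = 0.5
)

str(my_priors)
```

The list would then be passed through the priors argument, as in `priors = my_priors`.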

chains

Number of chains passed to Stan sampling. Positive integer, default = 4. For more information, consult the Stan documentation.

parallel_chains

Number of chains to run in parallel. Positive integer, default = 4.

iter

Number of iterations passed to Stan sampling. Positive integer, default = 2000. For more information, consult the Stan documentation.

warmup

Number of warmup iterations passed to Stan sampling. Positive integer, default = 1000. For more information, consult the Stan documentation.

show_chain_messages

FALSE (default) to hide chain messages from Stan; TRUE to show

seed

A random seed for Stan.

log_lik

Boolean. Whether to calculate the log-likelihood, which is used for loo and waic.

dir

Directory in which to save the csv files produced by cmdstanr. The default is the .bpcs folder in the current working directory.

Value

An object of class bpc. This object should be used together with the auxiliary functions of the package.

References

  1. Bradley RA, Terry ME 1952. Rank Analysis of Incomplete Block Designs I: The Method of Paired Comparisons. Biometrika, 39, 324-345.

  2. Davidson RR 1970. On Extending the Bradley-Terry Model to Accommodate Ties in Paired Comparison Experiments. Journal of the American Statistical Association, 65, 317-328.

  3. Davidson RR, Beaver RJ 1977. On Extending the Bradley-Terry Model to Incorporate Within-Pair Order Effects. Biometrics, 693-702.

  4. Stan Development Team 2020. RStan: the R interface to Stan. R package version 2.21.2.

  5. Bockenholt U 2001. Hierarchical Modeling of Paired Comparison Data. Psychological Methods, 6(1), 49.

  6. Springall A 1973. Response Surface Fitting Using a Generalization of the Bradley-Terry Paired Comparison Model. Journal of the Royal Statistical Society: Series C (Applied Statistics), 22(1), 59-68.

Examples

# \donttest{
# For the simple Bradley-Terry model
bpc(data = tennis_agresti,
    player0 = 'player0',
    player1 = 'player1',
    result_column = 'y',
    model_type = 'bt',
    solve_ties = 'none')
#> Running MCMC with 4 parallel chains...
#> 
#> Chain 4 finished in 3.1 seconds.
#> Chain 1 finished in 3.2 seconds.
#> Chain 2 finished in 3.2 seconds.
#> Chain 3 finished in 3.3 seconds.
#> 
#> All 4 chains finished successfully.
#> Mean chain execution time: 3.2 seconds.
#> Total execution time: 3.7 seconds.
#> Estimated baseline parameters with 95% HPD intervals:
#> 
#> Table: Parameters estimates
#> 
#> Parameter                Mean   Median   HPD_lower   HPD_higher
#> --------------------  -------  -------  ----------  -----------
#> lambda[Seles]           0.529    0.522      -2.197        3.417
#> lambda[Graf]            0.967    0.939      -1.746        3.761
#> lambda[Sabatini]       -0.310   -0.336      -3.087        2.496
#> lambda[Navratilova]     0.069    0.034      -2.714        2.869
#> lambda[Sanchez]        -1.089   -1.115      -3.898        1.740
#> NOTES:
#> * A higher lambda indicates a higher team ability
# }