There appears to be a significant difference in the reuse rate between the high and low debt versions.
d.completed %>%
  mutate_at("high_debt_version",
            function(x) case_when(
              x == "false" ~ "Low debt",
              x == "true" ~ "High debt"
            )) %>%
  ggplot(aes(high_debt_version)) +
  geom_bar(aes(fill = reused_logic_constructor), position = "fill") +
  scale_fill_manual("Legend", values = c("lightblue", "darkblue"), labels = c("Reused", "Duplicated")) +
  labs(title = "Constructor reuse") +
  xlab("Debt version") +
  ylab("Ratio of reuse")
d.completed %>%
  mutate_at("high_debt_version",
            function(x) case_when(
              x == "false" ~ "Low debt",
              x == "true" ~ "High debt"
            )) %>%
  ggplot(aes(high_debt_version)) +
  geom_bar(aes(fill = reused_logic_validation), position = "fill") +
  scale_fill_manual("Legend", values = c("lightblue", "darkblue"), labels = c("Reused", "Duplicated")) +
  labs(title = "Validation reuse") +
  xlab("Debt version") +
  ylab("Ratio of reuse")
For a boolean outcome, the Bernoulli family is the most suitable.
We include high_debt_version
as a predictor in our model, as this variable represents the very effect we want to measure. We also include a varying intercept for each individual to prevent the model from learning too much from single participants with extreme measurements.
Since they may correlate, constructor and validation logic reuse are both included in a single multivariate model.
We iterate on the priors until they are sane; in this case a prior giving a 50/50 chance was chosen in both cases. The lkj(2) prior
makes the model skeptical of strong correlations between the two outcomes.
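The model described above can be sketched mathematically as follows (notation is ours, for one of the two outcomes; the other is identical, with the two session intercepts tied together by the LKJ correlation prior):

```latex
\begin{aligned}
y_i &\sim \mathrm{Bernoulli}(p_i) \\
\operatorname{logit}(p_i) &= \alpha + \beta \cdot \mathrm{highDebt}_i + \alpha_{\mathrm{session}[i]} \\
\alpha &\sim \mathcal{N}(0, 1), \qquad \beta \sim \mathcal{N}(0, 1) \\
\alpha_{\mathrm{session}} &\sim \mathcal{N}(0, \sigma), \qquad \sigma \sim \mathrm{Exponential}(1)
\end{aligned}
```

With $\alpha \sim \mathcal{N}(0, 1)$, the prior probability at the intercept is centered on $\operatorname{logit}^{-1}(0) = 0.5$, which is the 50/50 chance referred to above.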
reuse.with <- extendable_model(
  base_name = "reuse",
  base_formula = "mvbind(
    reused_logic_validation,
    reused_logic_constructor
  ) ~ 1 + high_debt_version + (1 |c| session)",
  base_priors = c(
    prior(normal(0, 1), class = "b"),
    prior(normal(0, 1), class = "Intercept"),
    prior(exponential(1), class = "sd", resp = reusedlogicconstructor),
    prior(exponential(1), class = "sd", resp = reusedlogicvalidation),
    prior(lkj(2), class = "L")
  ),
  family = bernoulli(),
  data = d.both_completed,
  base_control = list(adapt_delta = 0.95)
)
prior_summary(reuse.with(only_priors = TRUE))
## Setting 'rescor' to FALSE by default for this model
prior_summary(reuse.with(sample_prior = "only"))
## Setting 'rescor' to FALSE by default for this model
pp_check(reuse.with(sample_prior = "only"), type = "bars", nsamples = 200, resp = "reusedlogicconstructor")
pp_check(reuse.with(sample_prior = "only"), type = "bars", nsamples = 200, resp = "reusedlogicvalidation")
We choose a prior for the beta parameter that allows it to account for 100% of the effect, while remaining skeptical of such strong effects.
sim.size <- 1000
sim.intercept <- rnorm(sim.size, 0, 1)
sim.beta <- rnorm(sim.size, 0, 1)
sim.beta.diff <- (plogis(sim.intercept + sim.beta) / plogis(sim.intercept) * 100) - 100
data.frame(x = sim.beta.diff) %>%
  ggplot(aes(x)) +
  geom_density() +
  xlim(-150, 150) +
  labs(
    title = "Beta parameter prior influence",
    x = "Estimate with beta as % of estimate without beta",
    y = "Density"
  )
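The quantity simulated above is the relative change in outcome probability that the beta parameter induces under the priors; in our notation:

```latex
\Delta_{\%} = \left( \frac{\operatorname{logit}^{-1}(\alpha + \beta)}{\operatorname{logit}^{-1}(\alpha)} - 1 \right) \times 100,
\qquad \alpha, \beta \sim \mathcal{N}(0, 1)
```

Here $\operatorname{logit}^{-1}$ is the logistic function, computed by `plogis` in the simulation.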
We check the posterior distribution and see that the model was able to fit the data well. Sampling also appears to have worked well: Rhat values are close to 1 and the trace plots look well mixed.
pp_check(reuse.with(), type = "bars", nsamples = 200, resp = "reusedlogicconstructor")
pp_check(reuse.with(), type = "bars", nsamples = 200, resp = "reusedlogicvalidation")
summary(reuse.with())
## Setting 'rescor' to FALSE by default for this model
## Family: MV(bernoulli, bernoulli)
## Links: mu = logit
## mu = logit
## Formula: reused_logic_validation ~ 1 + high_debt_version + (1 | c | session)
## reused_logic_constructor ~ 1 + high_debt_version + (1 | c | session)
## Data: as.data.frame(data) (Number of observations: 44)
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup samples = 4000
##
## Group-Level Effects:
## ~session (Number of levels: 22)
## Estimate
## sd(reusedlogicvalidation_Intercept) 1.83
## sd(reusedlogicconstructor_Intercept) 1.93
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 0.70
## Est.Error
## sd(reusedlogicvalidation_Intercept) 0.84
## sd(reusedlogicconstructor_Intercept) 0.86
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 0.24
## l-95% CI
## sd(reusedlogicvalidation_Intercept) 0.35
## sd(reusedlogicconstructor_Intercept) 0.45
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 0.08
## u-95% CI
## sd(reusedlogicvalidation_Intercept) 3.76
## sd(reusedlogicconstructor_Intercept) 3.90
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 0.96
## Rhat
## sd(reusedlogicvalidation_Intercept) 1.00
## sd(reusedlogicconstructor_Intercept) 1.01
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 1.01
## Bulk_ESS
## sd(reusedlogicvalidation_Intercept) 1083
## sd(reusedlogicconstructor_Intercept) 1218
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 1114
## Tail_ESS
## sd(reusedlogicvalidation_Intercept) 840
## sd(reusedlogicconstructor_Intercept) 911
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 1080
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI
## reusedlogicvalidation_Intercept -0.04 0.58 -1.13
## reusedlogicconstructor_Intercept -0.27 0.57 -1.38
## reusedlogicvalidation_high_debt_versionfalse -1.61 0.69 -2.99
## reusedlogicconstructor_high_debt_versionfalse -1.45 0.67 -2.84
## u-95% CI Rhat Bulk_ESS Tail_ESS
## reusedlogicvalidation_Intercept 1.17 1.00 2806 2730
## reusedlogicconstructor_Intercept 0.91 1.00 2577 2828
## reusedlogicvalidation_high_debt_versionfalse -0.33 1.00 3823 2879
## reusedlogicconstructor_high_debt_versionfalse -0.18 1.00 4128 2915
##
## Samples were drawn using sample(hmc). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
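The population-level estimates are on the log-odds scale; a quick, hypothetical way to read them is to exponentiate a coefficient into an odds ratio (sketch using the estimate above; the sign convention follows the `high_debt_versionfalse` dummy coding):

```r
# Odds ratio for the low-debt ("false") level relative to the
# high-debt baseline, from the validation-reuse coefficient.
exp(-1.61)  # roughly 0.20, i.e. about 5x lower odds of the outcome
```

The 95% credible interval can be transformed the same way, e.g. `exp(c(-2.99, -0.33))`.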
plot(reuse.with(), ask = FALSE)
# default prior for monotonic predictor
edlvl_prior <- c(
  prior(dirichlet(2), class = "simo", coef = "moeducation_level1", resp = "reusedlogicconstructor"),
  prior(dirichlet(2), class = "simo", coef = "moeducation_level1", resp = "reusedlogicvalidation")
)
We use loo
to check some possible extensions of the model.
loo_result <- loo(
  # Benchmark model(s)
  reuse.with(),
  # New model(s)
  reuse.with("work_domain"),
  reuse.with("work_experience_programming.s"),
  reuse.with("work_experience_java.s"),
  reuse.with("education_field"),
  reuse.with("mo(education_level)", edlvl_prior),
  reuse.with("workplace_peer_review"),
  reuse.with("workplace_td_tracking"),
  reuse.with("workplace_pair_programming"),
  reuse.with("workplace_coding_standards"),
  reuse.with("scenario"),
  reuse.with("group")
)
loo_result[2]
## $diffs
## elpd_diff se_diff
## reuse.with("scenario") 0.0 0.0
## reuse.with("workplace_peer_review") -0.1 0.9
## reuse.with("mo(education_level)", edlvl_prior) -0.2 1.2
## reuse.with() -0.2 1.0
## reuse.with("workplace_coding_standards") -0.2 0.9
## reuse.with("education_field") -0.3 0.9
## reuse.with("work_experience_java.s") -0.4 1.5
## reuse.with("workplace_td_tracking") -0.6 1.1
## reuse.with("work_experience_programming.s") -0.6 1.5
## reuse.with("workplace_pair_programming") -0.7 1.2
## reuse.with("group") -0.9 1.4
## reuse.with("work_domain") -1.2 1.6
loo_result[1]
## $loos
## $loos$`reuse.with()`
##
## Computed from 4000 by 44 log-likelihood matrix
##
## Estimate SE
## elpd_loo -42.9 4.2
## p_loo 17.8 2.1
## looic 85.8 8.4
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 22 50.0% 314
## (0.5, 0.7] (ok) 12 27.3% 504
## (0.7, 1] (bad) 9 20.5% 165
## (1, Inf) (very bad) 1 2.3% 124
## See help('pareto-k-diagnostic') for details.
##
## $loos$`reuse.with("work_domain")`
##
## Computed from 4000 by 44 log-likelihood matrix
##
## Estimate SE
## elpd_loo -43.8 4.4
## p_loo 18.1 2.2
## looic 87.7 8.9
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 22 50.0% 233
## (0.5, 0.7] (ok) 15 34.1% 368
## (0.7, 1] (bad) 7 15.9% 139
## (1, Inf) (very bad) 0 0.0% <NA>
## See help('pareto-k-diagnostic') for details.
##
## $loos$`reuse.with("work_experience_programming.s")`
##
## Computed from 4000 by 44 log-likelihood matrix
##
## Estimate SE
## elpd_loo -43.3 4.3
## p_loo 18.5 2.2
## looic 86.7 8.6
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 24 54.5% 362
## (0.5, 0.7] (ok) 15 34.1% 267
## (0.7, 1] (bad) 5 11.4% 179
## (1, Inf) (very bad) 0 0.0% <NA>
## See help('pareto-k-diagnostic') for details.
##
## $loos$`reuse.with("work_experience_java.s")`
##
## Computed from 4000 by 44 log-likelihood matrix
##
## Estimate SE
## elpd_loo -43.1 4.3
## p_loo 18.6 2.3
## looic 86.1 8.7
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 25 56.8% 466
## (0.5, 0.7] (ok) 15 34.1% 319
## (0.7, 1] (bad) 4 9.1% 89
## (1, Inf) (very bad) 0 0.0% <NA>
## See help('pareto-k-diagnostic') for details.
##
## $loos$`reuse.with("education_field")`
##
## Computed from 4000 by 44 log-likelihood matrix
##
## Estimate SE
## elpd_loo -43.0 4.2
## p_loo 18.3 2.1
## looic 85.9 8.3
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 20 45.5% 377
## (0.5, 0.7] (ok) 14 31.8% 423
## (0.7, 1] (bad) 10 22.7% 143
## (1, Inf) (very bad) 0 0.0% <NA>
## See help('pareto-k-diagnostic') for details.
##
## $loos$`reuse.with("mo(education_level)", edlvl_prior)`
##
## Computed from 4000 by 44 log-likelihood matrix
##
## Estimate SE
## elpd_loo -42.8 4.4
## p_loo 18.8 2.2
## looic 85.7 8.7
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 21 47.7% 466
## (0.5, 0.7] (ok) 19 43.2% 278
## (0.7, 1] (bad) 4 9.1% 75
## (1, Inf) (very bad) 0 0.0% <NA>
## See help('pareto-k-diagnostic') for details.
##
## $loos$`reuse.with("workplace_peer_review")`
##
## Computed from 4000 by 44 log-likelihood matrix
##
## Estimate SE
## elpd_loo -42.8 4.1
## p_loo 18.0 2.0
## looic 85.7 8.3
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 26 59.1% 327
## (0.5, 0.7] (ok) 16 36.4% 385
## (0.7, 1] (bad) 2 4.5% 288
## (1, Inf) (very bad) 0 0.0% <NA>
## See help('pareto-k-diagnostic') for details.
##
## $loos$`reuse.with("workplace_td_tracking")`
##
## Computed from 4000 by 44 log-likelihood matrix
##
## Estimate SE
## elpd_loo -43.3 4.2
## p_loo 18.4 2.2
## looic 86.5 8.4
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 25 56.8% 216
## (0.5, 0.7] (ok) 13 29.5% 273
## (0.7, 1] (bad) 6 13.6% 102
## (1, Inf) (very bad) 0 0.0% <NA>
## See help('pareto-k-diagnostic') for details.
##
## $loos$`reuse.with("workplace_pair_programming")`
##
## Computed from 4000 by 44 log-likelihood matrix
##
## Estimate SE
## elpd_loo -43.4 4.0
## p_loo 17.0 1.9
## looic 86.8 8.0
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 24 54.5% 266
## (0.5, 0.7] (ok) 11 25.0% 229
## (0.7, 1] (bad) 9 20.5% 169
## (1, Inf) (very bad) 0 0.0% <NA>
## See help('pareto-k-diagnostic') for details.
##
## $loos$`reuse.with("workplace_coding_standards")`
##
## Computed from 4000 by 44 log-likelihood matrix
##
## Estimate SE
## elpd_loo -42.9 4.2
## p_loo 18.3 2.1
## looic 85.9 8.4
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 23 52.3% 408
## (0.5, 0.7] (ok) 14 31.8% 325
## (0.7, 1] (bad) 7 15.9% 176
## (1, Inf) (very bad) 0 0.0% <NA>
## See help('pareto-k-diagnostic') for details.
##
## $loos$`reuse.with("scenario")`
##
## Computed from 4000 by 44 log-likelihood matrix
##
## Estimate SE
## elpd_loo -42.7 4.1
## p_loo 18.1 2.1
## looic 85.4 8.2
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 24 54.5% 417
## (0.5, 0.7] (ok) 15 34.1% 174
## (0.7, 1] (bad) 5 11.4% 118
## (1, Inf) (very bad) 0 0.0% <NA>
## See help('pareto-k-diagnostic') for details.
##
## $loos$`reuse.with("group")`
##
## Computed from 4000 by 44 log-likelihood matrix
##
## Estimate SE
## elpd_loo -43.6 4.3
## p_loo 18.1 2.1
## looic 87.3 8.5
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 21 47.7% 346
## (0.5, 0.7] (ok) 17 38.6% 260
## (0.7, 1] (bad) 6 13.6% 74
## (1, Inf) (very bad) 0 0.0% <NA>
## See help('pareto-k-diagnostic') for details.
loo_result <- loo(
  # Benchmark model(s)
  reuse.with(),
  reuse.with("scenario"),
  reuse.with("workplace_peer_review"),
  reuse.with("mo(education_level)", edlvl_prior),
  reuse.with("workplace_coding_standards"),
  # New model(s)
  reuse.with(c("scenario", "workplace_peer_review")),
  reuse.with(c("scenario", "mo(education_level)"), edlvl_prior),
  reuse.with(c("scenario", "workplace_coding_standards")),
  reuse.with(c("workplace_peer_review", "mo(education_level)"), edlvl_prior),
  reuse.with(c("workplace_peer_review", "workplace_coding_standards")),
  reuse.with(c("workplace_coding_standards", "mo(education_level)"), edlvl_prior)
)
loo_result[2]
## $diffs
## elpd_diff
## reuse.with("scenario") 0.0
## reuse.with("workplace_peer_review") -0.1
## reuse.with("mo(education_level)", edlvl_prior) -0.2
## reuse.with(c("scenario", "workplace_coding_standards")) -0.2
## reuse.with() -0.2
## reuse.with("workplace_coding_standards") -0.2
## reuse.with(c("scenario", "mo(education_level)"), edlvl_prior) -0.4
## reuse.with(c("workplace_coding_standards", "mo(education_level)"), edlvl_prior) -0.6
## reuse.with(c("workplace_peer_review", "mo(education_level)"), edlvl_prior) -0.9
## reuse.with(c("workplace_peer_review", "workplace_coding_standards")) -1.0
## reuse.with(c("scenario", "workplace_peer_review")) -1.1
## se_diff
## reuse.with("scenario") 0.0
## reuse.with("workplace_peer_review") 0.9
## reuse.with("mo(education_level)", edlvl_prior) 1.2
## reuse.with(c("scenario", "workplace_coding_standards")) 0.5
## reuse.with() 1.0
## reuse.with("workplace_coding_standards") 0.9
## reuse.with(c("scenario", "mo(education_level)"), edlvl_prior) 0.7
## reuse.with(c("workplace_coding_standards", "mo(education_level)"), edlvl_prior) 1.1
## reuse.with(c("workplace_peer_review", "mo(education_level)"), edlvl_prior) 1.3
## reuse.with(c("workplace_peer_review", "workplace_coding_standards")) 1.1
## reuse.with(c("scenario", "workplace_peer_review")) 0.6
loo_result[1]
## $loos
## $loos$`reuse.with()`
##
## Computed from 4000 by 44 log-likelihood matrix
##
## Estimate SE
## elpd_loo -42.9 4.2
## p_loo 17.8 2.1
## looic 85.8 8.4
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 22 50.0% 314
## (0.5, 0.7] (ok) 12 27.3% 504
## (0.7, 1] (bad) 9 20.5% 165
## (1, Inf) (very bad) 1 2.3% 124
## See help('pareto-k-diagnostic') for details.
##
## $loos$`reuse.with("scenario")`
##
## Computed from 4000 by 44 log-likelihood matrix
##
## Estimate SE
## elpd_loo -42.7 4.1
## p_loo 18.1 2.1
## looic 85.4 8.2
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 24 54.5% 417
## (0.5, 0.7] (ok) 15 34.1% 174
## (0.7, 1] (bad) 5 11.4% 118
## (1, Inf) (very bad) 0 0.0% <NA>
## See help('pareto-k-diagnostic') for details.
##
## $loos$`reuse.with("workplace_peer_review")`
##
## Computed from 4000 by 44 log-likelihood matrix
##
## Estimate SE
## elpd_loo -42.8 4.1
## p_loo 18.0 2.0
## looic 85.7 8.3
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 26 59.1% 327
## (0.5, 0.7] (ok) 16 36.4% 385
## (0.7, 1] (bad) 2 4.5% 288
## (1, Inf) (very bad) 0 0.0% <NA>
## See help('pareto-k-diagnostic') for details.
##
## $loos$`reuse.with("mo(education_level)", edlvl_prior)`
##
## Computed from 4000 by 44 log-likelihood matrix
##
## Estimate SE
## elpd_loo -42.8 4.4
## p_loo 18.8 2.2
## looic 85.7 8.7
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 21 47.7% 466
## (0.5, 0.7] (ok) 19 43.2% 278
## (0.7, 1] (bad) 4 9.1% 75
## (1, Inf) (very bad) 0 0.0% <NA>
## See help('pareto-k-diagnostic') for details.
##
## $loos$`reuse.with("workplace_coding_standards")`
##
## Computed from 4000 by 44 log-likelihood matrix
##
## Estimate SE
## elpd_loo -42.9 4.2
## p_loo 18.3 2.1
## looic 85.9 8.4
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 23 52.3% 408
## (0.5, 0.7] (ok) 14 31.8% 325
## (0.7, 1] (bad) 7 15.9% 176
## (1, Inf) (very bad) 0 0.0% <NA>
## See help('pareto-k-diagnostic') for details.
##
## $loos$`reuse.with(c("scenario", "workplace_peer_review"))`
##
## Computed from 4000 by 44 log-likelihood matrix
##
## Estimate SE
## elpd_loo -43.8 4.3
## p_loo 19.5 2.2
## looic 87.7 8.6
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 19 43.2% 904
## (0.5, 0.7] (ok) 18 40.9% 236
## (0.7, 1] (bad) 7 15.9% 78
## (1, Inf) (very bad) 0 0.0% <NA>
## See help('pareto-k-diagnostic') for details.
##
## $loos$`reuse.with(c("scenario", "mo(education_level)"), edlvl_prior)`
##
## Computed from 4000 by 44 log-likelihood matrix
##
## Estimate SE
## elpd_loo -43.1 4.5
## p_loo 19.5 2.3
## looic 86.1 8.9
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 24 54.5% 461
## (0.5, 0.7] (ok) 10 22.7% 268
## (0.7, 1] (bad) 10 22.7% 166
## (1, Inf) (very bad) 0 0.0% <NA>
## See help('pareto-k-diagnostic') for details.
##
## $loos$`reuse.with(c("scenario", "workplace_coding_standards"))`
##
## Computed from 4000 by 44 log-likelihood matrix
##
## Estimate SE
## elpd_loo -42.9 4.3
## p_loo 18.7 2.1
## looic 85.8 8.5
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 21 47.7% 1043
## (0.5, 0.7] (ok) 17 38.6% 311
## (0.7, 1] (bad) 6 13.6% 193
## (1, Inf) (very bad) 0 0.0% <NA>
## See help('pareto-k-diagnostic') for details.
##
## $loos$`reuse.with(c("workplace_peer_review", "mo(education_level)"), edlvl_prior)`
##
## Computed from 4000 by 44 log-likelihood matrix
##
## Estimate SE
## elpd_loo -43.6 4.6
## p_loo 19.9 2.4
## looic 87.1 9.2
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 23 52.3% 1216
## (0.5, 0.7] (ok) 12 27.3% 232
## (0.7, 1] (bad) 9 20.5% 40
## (1, Inf) (very bad) 0 0.0% <NA>
## See help('pareto-k-diagnostic') for details.
##
## $loos$`reuse.with(c("workplace_peer_review", "workplace_coding_standards"))`
##
## Computed from 4000 by 44 log-likelihood matrix
##
## Estimate SE
## elpd_loo -43.7 4.4
## p_loo 19.2 2.3
## looic 87.3 8.8
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 22 50.0% 397
## (0.5, 0.7] (ok) 16 36.4% 206
## (0.7, 1] (bad) 5 11.4% 96
## (1, Inf) (very bad) 1 2.3% 45
## See help('pareto-k-diagnostic') for details.
##
## $loos$`reuse.with(c("workplace_coding_standards", "mo(education_level)"), edlvl_prior)`
##
## Computed from 4000 by 44 log-likelihood matrix
##
## Estimate SE
## elpd_loo -43.3 4.4
## p_loo 19.5 2.3
## looic 86.5 8.9
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 17 38.6% 521
## (0.5, 0.7] (ok) 17 38.6% 245
## (0.7, 1] (bad) 10 22.7% 148
## (1, Inf) (very bad) 0 0.0% <NA>
## See help('pareto-k-diagnostic') for details.
loo_result <- loo(
  # Benchmark model(s)
  reuse.with(),
  reuse.with("scenario"),
  reuse.with("workplace_peer_review"),
  reuse.with("mo(education_level)", edlvl_prior),
  reuse.with("workplace_coding_standards"),
  reuse.with(c("scenario", "workplace_coding_standards")),
  # New model(s)
  reuse.with(c("scenario", "workplace_coding_standards", "mo(education_level)"), edlvl_prior)
)
loo_result[2]
## $diffs
## elpd_diff
## reuse.with("scenario") 0.0
## reuse.with("workplace_peer_review") -0.1
## reuse.with("mo(education_level)", edlvl_prior) -0.2
## reuse.with(c("scenario", "workplace_coding_standards")) -0.2
## reuse.with() -0.2
## reuse.with("workplace_coding_standards") -0.2
## reuse.with(c("scenario", "workplace_coding_standards", "mo(education_level)"), edlvl_prior) -0.5
## se_diff
## reuse.with("scenario") 0.0
## reuse.with("workplace_peer_review") 0.9
## reuse.with("mo(education_level)", edlvl_prior) 1.2
## reuse.with(c("scenario", "workplace_coding_standards")) 0.5
## reuse.with() 1.0
## reuse.with("workplace_coding_standards") 0.9
## reuse.with(c("scenario", "workplace_coding_standards", "mo(education_level)"), edlvl_prior) 0.8
loo_result[1]
## $loos
## $loos$`reuse.with()`
##
## Computed from 4000 by 44 log-likelihood matrix
##
## Estimate SE
## elpd_loo -42.9 4.2
## p_loo 17.8 2.1
## looic 85.8 8.4
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 22 50.0% 314
## (0.5, 0.7] (ok) 12 27.3% 504
## (0.7, 1] (bad) 9 20.5% 165
## (1, Inf) (very bad) 1 2.3% 124
## See help('pareto-k-diagnostic') for details.
##
## $loos$`reuse.with("scenario")`
##
## Computed from 4000 by 44 log-likelihood matrix
##
## Estimate SE
## elpd_loo -42.7 4.1
## p_loo 18.1 2.1
## looic 85.4 8.2
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 24 54.5% 417
## (0.5, 0.7] (ok) 15 34.1% 174
## (0.7, 1] (bad) 5 11.4% 118
## (1, Inf) (very bad) 0 0.0% <NA>
## See help('pareto-k-diagnostic') for details.
##
## $loos$`reuse.with("workplace_peer_review")`
##
## Computed from 4000 by 44 log-likelihood matrix
##
## Estimate SE
## elpd_loo -42.8 4.1
## p_loo 18.0 2.0
## looic 85.7 8.3
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 26 59.1% 327
## (0.5, 0.7] (ok) 16 36.4% 385
## (0.7, 1] (bad) 2 4.5% 288
## (1, Inf) (very bad) 0 0.0% <NA>
## See help('pareto-k-diagnostic') for details.
##
## $loos$`reuse.with("mo(education_level)", edlvl_prior)`
##
## Computed from 4000 by 44 log-likelihood matrix
##
## Estimate SE
## elpd_loo -42.8 4.4
## p_loo 18.8 2.2
## looic 85.7 8.7
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 21 47.7% 466
## (0.5, 0.7] (ok) 19 43.2% 278
## (0.7, 1] (bad) 4 9.1% 75
## (1, Inf) (very bad) 0 0.0% <NA>
## See help('pareto-k-diagnostic') for details.
##
## $loos$`reuse.with("workplace_coding_standards")`
##
## Computed from 4000 by 44 log-likelihood matrix
##
## Estimate SE
## elpd_loo -42.9 4.2
## p_loo 18.3 2.1
## looic 85.9 8.4
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 23 52.3% 408
## (0.5, 0.7] (ok) 14 31.8% 325
## (0.7, 1] (bad) 7 15.9% 176
## (1, Inf) (very bad) 0 0.0% <NA>
## See help('pareto-k-diagnostic') for details.
##
## $loos$`reuse.with(c("scenario", "workplace_coding_standards"))`
##
## Computed from 4000 by 44 log-likelihood matrix
##
## Estimate SE
## elpd_loo -42.9 4.3
## p_loo 18.7 2.1
## looic 85.8 8.5
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 21 47.7% 1043
## (0.5, 0.7] (ok) 17 38.6% 311
## (0.7, 1] (bad) 6 13.6% 193
## (1, Inf) (very bad) 0 0.0% <NA>
## See help('pareto-k-diagnostic') for details.
##
## $loos$`reuse.with(c("scenario", "workplace_coding_standards", "mo(education_level)"), edlvl_prior)`
##
## Computed from 4000 by 44 log-likelihood matrix
##
## Estimate SE
## elpd_loo -43.2 4.5
## p_loo 20.1 2.4
## looic 86.3 9.0
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 17 38.6% 460
## (0.5, 0.7] (ok) 18 40.9% 268
## (0.7, 1] (bad) 9 20.5% 103
## (1, Inf) (very bad) 0 0.0% <NA>
## See help('pareto-k-diagnostic') for details.
We pick some of our top performing models as candidates and inspect them more closely.
The candidate models are named and listed in order of increasing complexity.
reuse0 <- brm(
  "mvbind(
    reused_logic_validation,
    reused_logic_constructor
  ) ~ 1 + high_debt_version + (1 |c| session)",
  prior = c(
    prior(normal(0, 1), class = "b"),
    prior(normal(0, 1), class = "Intercept"),
    prior(exponential(1), class = "sd", resp = reusedlogicconstructor),
    prior(exponential(1), class = "sd", resp = reusedlogicvalidation),
    prior(lkj(2), class = "L")
  ),
  family = bernoulli(),
  data = as.data.frame(d.both_completed),
  control = list(adapt_delta = 0.95),
  file = "fits/reuse0",
  file_refit = "on_change",
  seed = 20210421
)
summary(reuse0)
## Family: MV(bernoulli, bernoulli)
## Links: mu = logit
## mu = logit
## Formula: reused_logic_validation ~ 1 + high_debt_version + (1 | c | session)
## reused_logic_constructor ~ 1 + high_debt_version + (1 | c | session)
## Data: as.data.frame(d.both_completed) (Number of observations: 44)
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup samples = 4000
##
## Group-Level Effects:
## ~session (Number of levels: 22)
## Estimate
## sd(reusedlogicvalidation_Intercept) 1.83
## sd(reusedlogicconstructor_Intercept) 1.93
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 0.70
## Est.Error
## sd(reusedlogicvalidation_Intercept) 0.84
## sd(reusedlogicconstructor_Intercept) 0.86
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 0.24
## l-95% CI
## sd(reusedlogicvalidation_Intercept) 0.35
## sd(reusedlogicconstructor_Intercept) 0.45
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 0.08
## u-95% CI
## sd(reusedlogicvalidation_Intercept) 3.76
## sd(reusedlogicconstructor_Intercept) 3.90
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 0.96
## Rhat
## sd(reusedlogicvalidation_Intercept) 1.00
## sd(reusedlogicconstructor_Intercept) 1.01
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 1.01
## Bulk_ESS
## sd(reusedlogicvalidation_Intercept) 1083
## sd(reusedlogicconstructor_Intercept) 1218
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 1114
## Tail_ESS
## sd(reusedlogicvalidation_Intercept) 840
## sd(reusedlogicconstructor_Intercept) 911
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 1080
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI
## reusedlogicvalidation_Intercept -0.04 0.58 -1.13
## reusedlogicconstructor_Intercept -0.27 0.57 -1.38
## reusedlogicvalidation_high_debt_versionfalse -1.61 0.69 -2.99
## reusedlogicconstructor_high_debt_versionfalse -1.45 0.67 -2.84
## u-95% CI Rhat Bulk_ESS Tail_ESS
## reusedlogicvalidation_Intercept 1.17 1.00 2806 2730
## reusedlogicconstructor_Intercept 0.91 1.00 2577 2828
## reusedlogicvalidation_high_debt_versionfalse -0.33 1.00 3823 2879
## reusedlogicconstructor_high_debt_versionfalse -0.18 1.00 4128 2915
##
## Samples were drawn using sample(hmc). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
ranef(reuse0)
## $session
## , , reusedlogicvalidation_Intercept
##
## Estimate Est.Error Q2.5 Q97.5
## 6033d69a5af2c702367b3a95 -1.64448922 1.679742 -5.65318850 0.8033159
## 6033d90a5af2c702367b3a96 -1.62690433 1.684030 -5.62943675 0.9040157
## 6034fc165af2c702367b3a98 -0.76279173 1.369683 -4.00549900 1.4654105
## 603500725af2c702367b3a99 2.51846792 1.769619 -0.10701285 6.6478783
## 603f97625af2c702367b3a9d 2.49516741 1.773428 -0.11299845 6.6805038
## 603fd5d95af2c702367b3a9e -1.65402718 1.721964 -5.84767875 0.8621369
## 60409b7b5af2c702367b3a9f 2.52822202 1.780113 -0.08867287 6.6143128
## 604b82b5a7718fbed181b336 -0.06599373 1.188023 -2.49329925 2.3903343
## 6050c1bf856f36729d2e5218 0.54433520 1.163719 -1.72581600 3.0152225
## 6050e1e7856f36729d2e5219 -1.57684245 1.643184 -5.41257000 0.8466732
## 6055fdc6856f36729d2e521b -0.05666709 1.187775 -2.52282150 2.3371613
## 60589862856f36729d2e521f 0.55057906 1.157042 -1.75923000 3.0030273
## 605afa3a856f36729d2e5222 0.54613712 1.153260 -1.68495875 2.9554012
## 605c8bc6856f36729d2e5223 0.53589562 1.149106 -1.61921350 2.8826883
## 605f3f2d856f36729d2e5224 0.54857289 1.107645 -1.54963100 2.8087348
## 605f46c3856f36729d2e5225 -1.62919069 1.665907 -5.61894275 0.8403877
## 60605337856f36729d2e5226 -1.62417926 1.700964 -5.74833775 0.8687494
## 60609ae6856f36729d2e5228 0.54418186 1.125913 -1.60584325 2.8864072
## 6061ce91856f36729d2e522e -1.63494073 1.666962 -5.67413275 0.7876265
## 6061f106856f36729d2e5231 -1.60880513 1.662427 -5.56378300 0.8064458
## 6068ea9f856f36729d2e523e -1.61402626 1.681033 -5.61390150 0.8417808
## 6075ab05856f36729d2e5247 -1.60730732 1.628466 -5.48658950 0.8400181
##
## , , reusedlogicconstructor_Intercept
##
## Estimate Est.Error Q2.5 Q97.5
## 6033d69a5af2c702367b3a95 -1.711648279 1.756048 -5.93215200 0.8517546
## 6033d90a5af2c702367b3a96 -1.737330752 1.781254 -6.17996175 0.8482377
## 6034fc165af2c702367b3a98 -0.005344317 1.232504 -2.47904525 2.5154483
## 603500725af2c702367b3a99 2.691630801 1.812021 -0.01676437 6.9697668
## 603f97625af2c702367b3a9d 2.679688515 1.815994 -0.05252160 6.8975563
## 603fd5d95af2c702367b3a9e -1.684212161 1.724919 -5.92130450 0.8809865
## 60409b7b5af2c702367b3a9f 2.707746357 1.883769 -0.03635867 6.9711278
## 604b82b5a7718fbed181b336 -0.804791874 1.425887 -4.05708950 1.5209530
## 6050c1bf856f36729d2e5218 0.589916480 1.223256 -1.80534450 3.2209473
## 6050e1e7856f36729d2e5219 -1.633898531 1.722661 -5.56118700 0.9225588
## 6055fdc6856f36729d2e521b -0.808681736 1.435134 -4.24051725 1.5489745
## 60589862856f36729d2e521f 0.618890238 1.218197 -1.76410250 3.1219453
## 605afa3a856f36729d2e5222 0.601106509 1.146102 -1.65621200 2.9979560
## 605c8bc6856f36729d2e5223 0.626501807 1.187399 -1.59110625 3.1200883
## 605f3f2d856f36729d2e5224 0.617193545 1.162436 -1.65937700 3.1116105
## 605f46c3856f36729d2e5225 -1.697137433 1.735656 -5.98194350 0.8534365
## 60605337856f36729d2e5226 -1.685561549 1.770842 -5.94581475 0.8886053
## 60609ae6856f36729d2e5228 0.598373442 1.190561 -1.74908675 3.0263265
## 6061ce91856f36729d2e522e -1.686388960 1.737622 -5.97792575 0.8796309
## 6061f106856f36729d2e5231 -1.687842757 1.786531 -6.01350625 0.9543536
## 6068ea9f856f36729d2e523e -1.700962104 1.731385 -5.97584575 0.9611543
## 6075ab05856f36729d2e5247 -1.660715559 1.724302 -5.85075025 0.8891074
plot(reuse0, ask = FALSE)
pp_check(reuse0, type = "bars", nsamples = 200, resp = "reusedlogicconstructor")
pp_check(reuse0, type = "bars", nsamples = 200, resp = "reusedlogicvalidation")
reuse1 <- brm(
  "mvbind(
    reused_logic_validation,
    reused_logic_constructor
  ) ~ 1 + high_debt_version + scenario + (1 |c| session)",
  prior = c(
    prior(normal(0, 1), class = "b"),
    prior(normal(0, 1), class = "Intercept"),
    prior(exponential(1), class = "sd", resp = reusedlogicconstructor),
    prior(exponential(1), class = "sd", resp = reusedlogicvalidation),
    prior(lkj(2), class = "L")
  ),
  family = bernoulli(),
  data = as.data.frame(d.both_completed),
  control = list(adapt_delta = 0.95),
  file = "fits/reuse1",
  file_refit = "on_change",
  seed = 20210421
)
summary(reuse1)
## Family: MV(bernoulli, bernoulli)
## Links: mu = logit
## mu = logit
## Formula: reused_logic_validation ~ 1 + high_debt_version + scenario + (1 | c | session)
## reused_logic_constructor ~ 1 + high_debt_version + scenario + (1 | c | session)
## Data: as.data.frame(d.both_completed) (Number of observations: 44)
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup samples = 4000
##
## Group-Level Effects:
## ~session (Number of levels: 22)
## Estimate
## sd(reusedlogicvalidation_Intercept) 1.89
## sd(reusedlogicconstructor_Intercept) 2.01
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 0.69
## Est.Error
## sd(reusedlogicvalidation_Intercept) 0.86
## sd(reusedlogicconstructor_Intercept) 0.89
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 0.25
## l-95% CI
## sd(reusedlogicvalidation_Intercept) 0.37
## sd(reusedlogicconstructor_Intercept) 0.46
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 0.04
## u-95% CI
## sd(reusedlogicvalidation_Intercept) 3.84
## sd(reusedlogicconstructor_Intercept) 4.03
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 0.97
## Rhat
## sd(reusedlogicvalidation_Intercept) 1.00
## sd(reusedlogicconstructor_Intercept) 1.00
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 1.00
## Bulk_ESS
## sd(reusedlogicvalidation_Intercept) 1141
## sd(reusedlogicconstructor_Intercept) 1422
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 1248
## Tail_ESS
## sd(reusedlogicvalidation_Intercept) 658
## sd(reusedlogicconstructor_Intercept) 917
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 1066
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI
## reusedlogicvalidation_Intercept -0.26 0.67 -1.56
## reusedlogicconstructor_Intercept -0.39 0.68 -1.74
## reusedlogicvalidation_high_debt_versionfalse -1.65 0.68 -3.03
## reusedlogicvalidation_scenariotickets 0.44 0.66 -0.84
## reusedlogicconstructor_high_debt_versionfalse -1.48 0.69 -2.84
## reusedlogicconstructor_scenariotickets 0.26 0.66 -1.04
## u-95% CI Rhat Bulk_ESS Tail_ESS
## reusedlogicvalidation_Intercept 1.11 1.00 3706 2811
## reusedlogicconstructor_Intercept 0.97 1.00 3566 3134
## reusedlogicvalidation_high_debt_versionfalse -0.36 1.00 4183 2777
## reusedlogicvalidation_scenariotickets 1.73 1.00 6202 2864
## reusedlogicconstructor_high_debt_versionfalse -0.16 1.00 4877 2512
## reusedlogicconstructor_scenariotickets 1.57 1.00 5575 2896
##
## Samples were drawn using sample(hmc). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
ranef(reuse1)
## $session
## , , reusedlogicvalidation_Intercept
##
## Estimate Est.Error Q2.5 Q97.5
## 6033d69a5af2c702367b3a95 -1.695162823 1.676853 -5.62486825 0.8128174
## 6033d90a5af2c702367b3a96 -1.660967495 1.705909 -5.57747550 0.8185200
## 6034fc165af2c702367b3a98 -0.795359147 1.410638 -4.29177825 1.4300918
## 603500725af2c702367b3a99 2.657000081 1.830089 -0.01881306 7.0778410
## 603f97625af2c702367b3a9d 2.580973605 1.796790 -0.04921098 6.8277490
## 603fd5d95af2c702367b3a9e -1.734124510 1.738483 -6.01396500 0.8962419
## 60409b7b5af2c702367b3a9f 2.557759763 1.800641 -0.07594192 6.8860085
## 604b82b5a7718fbed181b336 -0.008325812 1.190826 -2.45745000 2.3916968
## 6050c1bf856f36729d2e5218 0.541533469 1.217123 -1.82525650 3.1873908
## 6050e1e7856f36729d2e5219 -1.666927946 1.694376 -5.75879000 0.8912427
## 6055fdc6856f36729d2e521b -0.072357751 1.231184 -2.58667350 2.5048593
## 60589862856f36729d2e521f 0.550181295 1.210881 -1.66271700 3.1372255
## 605afa3a856f36729d2e5222 0.566024661 1.151619 -1.60985425 3.0708253
## 605c8bc6856f36729d2e5223 0.570066575 1.130430 -1.62389375 2.9595718
## 605f3f2d856f36729d2e5224 0.539482972 1.198337 -1.84284925 3.0192855
## 605f46c3856f36729d2e5225 -1.618076218 1.643822 -5.50430625 0.8056822
## 60605337856f36729d2e5226 -1.732632313 1.799992 -6.09595575 0.8780582
## 60609ae6856f36729d2e5228 0.580551356 1.144888 -1.68345400 2.9726528
## 6061ce91856f36729d2e522e -1.651310798 1.708408 -5.87184950 0.8055595
## 6061f106856f36729d2e5231 -1.705539717 1.656698 -5.66440425 0.7433505
## 6068ea9f856f36729d2e523e -1.629824192 1.677229 -5.34906775 0.8807062
## 6075ab05856f36729d2e5247 -1.684848206 1.723714 -5.79982325 0.8651169
##
## , , reusedlogicconstructor_Intercept
##
## Estimate Est.Error Q2.5 Q97.5
## 6033d69a5af2c702367b3a95 -1.82498455 1.849317 -6.237795000 0.8784421
## 6033d90a5af2c702367b3a96 -1.74701832 1.779300 -6.022685500 1.0182968
## 6034fc165af2c702367b3a98 0.04342082 1.242152 -2.354263500 2.6352758
## 603500725af2c702367b3a99 2.81061937 1.844072 -0.007717327 7.1762820
## 603f97625af2c702367b3a9d 2.78983275 1.924277 -0.074214957 7.3334005
## 603fd5d95af2c702367b3a9e -1.82439095 1.812279 -6.089966250 0.8871853
## 60409b7b5af2c702367b3a9f 2.76293483 1.891858 -0.021062767 7.0474240
## 604b82b5a7718fbed181b336 -0.85326719 1.452154 -4.315698250 1.4447720
## 6050c1bf856f36729d2e5218 0.60085146 1.219057 -1.817923250 3.1186740
## 6050e1e7856f36729d2e5219 -1.75462849 1.804687 -6.156250000 0.8432025
## 6055fdc6856f36729d2e521b -0.90331572 1.509559 -4.407148500 1.5385650
## 60589862856f36729d2e521f 0.63263833 1.197972 -1.651103750 3.1786760
## 605afa3a856f36729d2e5222 0.64369916 1.198076 -1.609403250 3.1688813
## 605c8bc6856f36729d2e5223 0.62713389 1.225826 -1.741967000 3.0997880
## 605f3f2d856f36729d2e5224 0.62512667 1.207206 -1.726381250 3.1226045
## 605f46c3856f36729d2e5225 -1.72376876 1.749191 -5.926578500 0.8558221
## 60605337856f36729d2e5226 -1.79606889 1.787076 -6.171988250 0.9625063
## 60609ae6856f36729d2e5228 0.61653287 1.198071 -1.616998250 3.0929363
## 6061ce91856f36729d2e522e -1.74622810 1.791889 -6.141287500 0.8451217
## 6061f106856f36729d2e5231 -1.78647574 1.770122 -5.897125000 0.8413062
## 6068ea9f856f36729d2e523e -1.76607406 1.788823 -6.162233750 0.8254057
## 6075ab05856f36729d2e5247 -1.75660207 1.821276 -5.879109000 0.8899523
plot(reuse1, ask = FALSE)
pp_check(reuse1, type = "bars", nsamples = 200, resp = "reusedlogicconstructor")
pp_check(reuse1, type = "bars", nsamples = 200, resp = "reusedlogicvalidation")
reuse2 <- brm(
"mvbind(
reused_logic_validation,
reused_logic_constructor
) ~ 1 + high_debt_version + scenario + workplace_coding_standards + (1 |c| session)",
prior = c(
prior(normal(0, 1), class = "b"),
prior(normal(0, 1), class = "Intercept"),
prior(exponential(1), class = "sd", resp = reusedlogicconstructor),
prior(exponential(1), class = "sd", resp = reusedlogicvalidation),
prior(lkj(2), class = "L")
),
family = bernoulli(),
data = as.data.frame(d.both_completed),
control = list(adapt_delta = 0.95),
file = "fits/reuse2",
file_refit = "on_change",
seed = 20210421
)
summary(reuse2)
## Family: MV(bernoulli, bernoulli)
## Links: mu = logit
## mu = logit
## Formula: reused_logic_validation ~ 1 + high_debt_version + scenario + workplace_coding_standards + (1 | c | session)
## reused_logic_constructor ~ 1 + high_debt_version + scenario + workplace_coding_standards + (1 | c | session)
## Data: as.data.frame(d.both_completed) (Number of observations: 44)
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup samples = 4000
##
## Group-Level Effects:
## ~session (Number of levels: 22)
## Estimate
## sd(reusedlogicvalidation_Intercept) 1.92
## sd(reusedlogicconstructor_Intercept) 2.07
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 0.71
## Est.Error
## sd(reusedlogicvalidation_Intercept) 0.82
## sd(reusedlogicconstructor_Intercept) 0.86
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 0.22
## l-95% CI
## sd(reusedlogicvalidation_Intercept) 0.54
## sd(reusedlogicconstructor_Intercept) 0.64
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 0.14
## u-95% CI
## sd(reusedlogicvalidation_Intercept) 3.76
## sd(reusedlogicconstructor_Intercept) 4.01
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 0.97
## Rhat
## sd(reusedlogicvalidation_Intercept) 1.00
## sd(reusedlogicconstructor_Intercept) 1.00
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 1.00
## Bulk_ESS
## sd(reusedlogicvalidation_Intercept) 1317
## sd(reusedlogicconstructor_Intercept) 1589
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 1744
## Tail_ESS
## sd(reusedlogicvalidation_Intercept) 1084
## sd(reusedlogicconstructor_Intercept) 1969
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 1597
##
## Population-Level Effects:
## Estimate Est.Error
## reusedlogicvalidation_Intercept -0.14 0.75
## reusedlogicconstructor_Intercept -0.36 0.79
## reusedlogicvalidation_high_debt_versionfalse -1.64 0.68
## reusedlogicvalidation_scenariotickets 0.46 0.64
## reusedlogicvalidation_workplace_coding_standardsfalse -0.28 0.74
## reusedlogicconstructor_high_debt_versionfalse -1.50 0.70
## reusedlogicconstructor_scenariotickets 0.25 0.67
## reusedlogicconstructor_workplace_coding_standardsfalse -0.11 0.76
## l-95% CI u-95% CI Rhat
## reusedlogicvalidation_Intercept -1.61 1.34 1.00
## reusedlogicconstructor_Intercept -1.96 1.19 1.00
## reusedlogicvalidation_high_debt_versionfalse -3.01 -0.33 1.00
## reusedlogicvalidation_scenariotickets -0.76 1.71 1.00
## reusedlogicvalidation_workplace_coding_standardsfalse -1.70 1.18 1.00
## reusedlogicconstructor_high_debt_versionfalse -2.95 -0.18 1.00
## reusedlogicconstructor_scenariotickets -1.08 1.60 1.00
## reusedlogicconstructor_workplace_coding_standardsfalse -1.61 1.39 1.00
## Bulk_ESS Tail_ESS
## reusedlogicvalidation_Intercept 4560 3215
## reusedlogicconstructor_Intercept 4310 3061
## reusedlogicvalidation_high_debt_versionfalse 5165 3201
## reusedlogicvalidation_scenariotickets 6273 3149
## reusedlogicvalidation_workplace_coding_standardsfalse 4515 3033
## reusedlogicconstructor_high_debt_versionfalse 5475 2960
## reusedlogicconstructor_scenariotickets 4797 2343
## reusedlogicconstructor_workplace_coding_standardsfalse 4288 2963
##
## Samples were drawn using sample(hmc). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
ranef(reuse2)
## $session
## , , reusedlogicvalidation_Intercept
##
## Estimate Est.Error Q2.5 Q97.5
## 6033d69a5af2c702367b3a95 -1.72724019 1.729424 -5.76663225 0.8824620
## 6033d90a5af2c702367b3a96 -1.71009588 1.708723 -5.79641525 0.9031177
## 6034fc165af2c702367b3a98 -0.74414633 1.434611 -4.25154900 1.6368215
## 603500725af2c702367b3a99 2.72107734 1.734281 0.02164775 6.6522770
## 603f97625af2c702367b3a9d 2.62006783 1.852629 -0.06818948 7.0577815
## 603fd5d95af2c702367b3a9e -1.74757375 1.734656 -5.82139625 0.7781280
## 60409b7b5af2c702367b3a9f 2.72509629 1.815604 0.03953357 7.1162100
## 604b82b5a7718fbed181b336 0.04097657 1.251696 -2.48596700 2.5700880
## 6050c1bf856f36729d2e5218 0.51494732 1.205913 -1.83810450 3.0532292
## 6050e1e7856f36729d2e5219 -1.65802503 1.726420 -5.73621550 0.9130677
## 6055fdc6856f36729d2e521b -0.17569988 1.263859 -2.76861675 2.2771918
## 60589862856f36729d2e521f 0.50942404 1.177708 -1.75178900 2.9368663
## 605afa3a856f36729d2e5222 0.51982453 1.181801 -1.74835550 2.9743145
## 605c8bc6856f36729d2e5223 0.50061688 1.164279 -1.72087650 2.9150390
## 605f3f2d856f36729d2e5224 0.50876936 1.206301 -1.80681200 3.0395723
## 605f46c3856f36729d2e5225 -1.73787705 1.676404 -5.80324625 0.7288653
## 60605337856f36729d2e5226 -1.80345345 1.714569 -5.83591775 0.7222267
## 60609ae6856f36729d2e5228 0.67748147 1.169618 -1.54658200 3.0790463
## 6061ce91856f36729d2e522e -1.70074400 1.665290 -5.77741925 0.8031437
## 6061f106856f36729d2e5231 -1.75997584 1.734934 -5.85627250 0.8133482
## 6068ea9f856f36729d2e523e -1.63109295 1.734193 -5.84098250 0.9712994
## 6075ab05856f36729d2e5247 -1.64509634 1.732791 -5.95051925 0.9490631
##
## , , reusedlogicconstructor_Intercept
##
## Estimate Est.Error Q2.5 Q97.5
## 6033d69a5af2c702367b3a95 -1.80952856 1.850799 -6.438907750 0.9194913
## 6033d90a5af2c702367b3a96 -1.83906327 1.836806 -6.294757750 0.8568513
## 6034fc165af2c702367b3a98 0.09418828 1.331059 -2.590392000 2.6774448
## 603500725af2c702367b3a99 2.94507590 1.893184 0.024102138 7.4642418
## 603f97625af2c702367b3a9d 2.83295501 1.917229 0.003376609 7.2194823
## 603fd5d95af2c702367b3a9e -1.82554381 1.778430 -6.245592250 0.8921717
## 60409b7b5af2c702367b3a9f 2.94358988 1.900508 0.039903063 7.5241995
## 604b82b5a7718fbed181b336 -0.82592433 1.528005 -4.441380000 1.6969093
## 6050c1bf856f36729d2e5218 0.60768382 1.257928 -1.787242750 3.1277430
## 6050e1e7856f36729d2e5219 -1.74898559 1.805799 -6.153328000 0.9559280
## 6055fdc6856f36729d2e521b -0.99352040 1.592587 -4.729589750 1.6322363
## 60589862856f36729d2e521f 0.60321616 1.283964 -1.906011250 3.2212508
## 605afa3a856f36729d2e5222 0.60466370 1.229093 -1.754774000 3.1802890
## 605c8bc6856f36729d2e5223 0.59818866 1.248713 -1.731229000 3.2203953
## 605f3f2d856f36729d2e5224 0.59694744 1.237238 -1.860229250 3.1646430
## 605f46c3856f36729d2e5225 -1.84161597 1.777057 -6.068717250 0.8589194
## 60605337856f36729d2e5226 -1.90844404 1.867845 -6.511749750 0.8394590
## 60609ae6856f36729d2e5228 0.73412792 1.236296 -1.712285500 3.1899440
## 6061ce91856f36729d2e522e -1.83475328 1.818216 -6.350759250 0.8747636
## 6061f106856f36729d2e5231 -1.81988540 1.785118 -6.051485750 0.9739698
## 6068ea9f856f36729d2e523e -1.77632682 1.793617 -6.196038750 0.8985561
## 6075ab05856f36729d2e5247 -1.78492474 1.840006 -6.169046750 0.9942407
plot(reuse2, ask = FALSE)
pp_check(reuse2, type = "bars", nsamples = 200, resp = "reusedlogicconstructor")
pp_check(reuse2, type = "bars", nsamples = 200, resp = "reusedlogicvalidation")
All candidate models look sound and none is significantly better than the others, so we will proceed with the simplest model: reuse0
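The visual comparison can also be checked numerically. A minimal sketch, assuming the fits reuse0, reuse1, and reuse2 are still in memory:

```r
# Approximate leave-one-out cross-validation for each candidate,
# then rank them; an |elpd_diff| within roughly 2 * se_diff
# suggests no meaningful difference between the models.
loo_compare(
  loo(reuse0),
  loo(reuse1),
  loo(reuse2)
)
```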
We will try a few different variations of the selected candidate model.
Some participants completed only one scenario. These were excluded from the initial dataset to improve the sampling of the models. We do, however, want to use all the data we can, and will therefore also fit the model to the complete dataset.
reuse0.all <- brm(
"mvbind(
reused_logic_validation,
reused_logic_constructor
) ~ 1 + high_debt_version + (1 |c| session)",
prior = c(
prior(normal(0, 1), class = "b"),
prior(normal(0, 1), class = "Intercept"),
prior(exponential(1), class = "sd", resp = reusedlogicconstructor),
prior(exponential(1), class = "sd", resp = reusedlogicvalidation),
prior(lkj(2), class = "L")
),
family = bernoulli(),
data = as.data.frame(d.completed),
control = list(adapt_delta = 0.95),
file = "fits/reuse0.all",
file_refit = "on_change",
seed = 20210421
)
summary(reuse0.all)
## Family: MV(bernoulli, bernoulli)
## Links: mu = logit
## mu = logit
## Formula: reused_logic_validation ~ 1 + high_debt_version + (1 | c | session)
## reused_logic_constructor ~ 1 + high_debt_version + (1 | c | session)
## Data: as.data.frame(d.completed) (Number of observations: 51)
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup samples = 4000
##
## Group-Level Effects:
## ~session (Number of levels: 29)
## Estimate
## sd(reusedlogicvalidation_Intercept) 1.82
## sd(reusedlogicconstructor_Intercept) 1.92
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 0.73
## Est.Error
## sd(reusedlogicvalidation_Intercept) 0.79
## sd(reusedlogicconstructor_Intercept) 0.83
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 0.22
## l-95% CI
## sd(reusedlogicvalidation_Intercept) 0.45
## sd(reusedlogicconstructor_Intercept) 0.47
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 0.17
## u-95% CI
## sd(reusedlogicvalidation_Intercept) 3.59
## sd(reusedlogicconstructor_Intercept) 3.80
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 0.97
## Rhat
## sd(reusedlogicvalidation_Intercept) 1.00
## sd(reusedlogicconstructor_Intercept) 1.00
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 1.01
## Bulk_ESS
## sd(reusedlogicvalidation_Intercept) 1301
## sd(reusedlogicconstructor_Intercept) 1503
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 1254
## Tail_ESS
## sd(reusedlogicvalidation_Intercept) 1200
## sd(reusedlogicconstructor_Intercept) 905
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 1218
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI
## reusedlogicvalidation_Intercept -0.16 0.54 -1.19
## reusedlogicconstructor_Intercept -0.36 0.58 -1.53
## reusedlogicvalidation_high_debt_versionfalse -1.75 0.66 -3.09
## reusedlogicconstructor_high_debt_versionfalse -1.62 0.65 -2.95
## u-95% CI Rhat Bulk_ESS Tail_ESS
## reusedlogicvalidation_Intercept 0.90 1.00 2888 2622
## reusedlogicconstructor_Intercept 0.77 1.00 3128 2744
## reusedlogicvalidation_high_debt_versionfalse -0.54 1.00 3400 2098
## reusedlogicconstructor_high_debt_versionfalse -0.39 1.00 4202 3033
##
## Samples were drawn using sample(hmc). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
ranef(reuse0.all)
## $session
## , , reusedlogicvalidation_Intercept
##
## Estimate Est.Error Q2.5 Q97.5
## 6033c6fc5af2c702367b3a93 1.52433132 1.613569 -0.969815500 5.1783700
## 6033d69a5af2c702367b3a95 -1.57931215 1.663673 -5.810711250 0.9008065
## 6033d90a5af2c702367b3a96 -1.59139089 1.640972 -5.544601500 0.9003481
## 6034fc165af2c702367b3a98 -0.62410491 1.307430 -3.581760750 1.5665022
## 603500725af2c702367b3a99 2.61102205 1.662121 0.030484688 6.5857627
## 603f84f15af2c702367b3a9b -0.70710294 1.668152 -4.643539500 2.0840563
## 603f97625af2c702367b3a9d 2.68207342 1.720716 -0.005952295 6.6985173
## 603fd5d95af2c702367b3a9e -1.53515976 1.646294 -5.649376250 0.8672848
## 60409b7b5af2c702367b3a9f 2.64053426 1.720147 0.002959409 6.5898188
## 604b82b5a7718fbed181b336 0.03315654 1.192443 -2.364176750 2.5186843
## 604f1239a7718fbed181b33f -1.38184365 1.637852 -5.258658250 1.3136058
## 6050c1bf856f36729d2e5218 0.65427781 1.153967 -1.554867750 3.0870665
## 6050e1e7856f36729d2e5219 -1.57173866 1.662905 -5.629063000 0.9188571
## 6055fdc6856f36729d2e521b 0.02806918 1.163098 -2.375179250 2.4169628
## 60579f2a856f36729d2e521e -0.79264961 1.679038 -4.743765500 2.0493635
## 60589862856f36729d2e521f 0.66974787 1.129378 -1.433695000 3.0534772
## 605a30a7856f36729d2e5221 -0.80703678 1.641389 -4.638936750 2.0203173
## 605afa3a856f36729d2e5222 0.66357750 1.122612 -1.413732250 3.0645868
## 605c8bc6856f36729d2e5223 0.66305916 1.145585 -1.605166250 3.0536628
## 605f3f2d856f36729d2e5224 0.66566948 1.157849 -1.554673750 3.1535625
## 605f46c3856f36729d2e5225 -1.55358132 1.603079 -5.381195000 0.8799558
## 60605337856f36729d2e5226 -1.54732232 1.645383 -5.440211250 0.9519867
## 60609ae6856f36729d2e5228 0.70043553 1.144582 -1.402780750 3.1083285
## 6061ce91856f36729d2e522e -1.53685377 1.627986 -5.490529250 0.9917854
## 6061f106856f36729d2e5231 -1.55189487 1.605967 -5.378386750 0.8988301
## 60672faa856f36729d2e523c -0.74043191 1.656474 -4.704838000 2.1571240
## 6068ea9f856f36729d2e523e -1.57390343 1.673930 -5.748856500 0.9385252
## 606db69d856f36729d2e5243 -0.77615544 1.645051 -4.531027000 2.0059235
## 6075ab05856f36729d2e5247 -1.54696144 1.638185 -5.581241750 0.8568468
##
## , , reusedlogicconstructor_Intercept
##
## Estimate Est.Error Q2.5 Q97.5
## 6033c6fc5af2c702367b3a93 1.6513637 1.705281 -0.980449675 5.5500160
## 6033d69a5af2c702367b3a95 -1.6029530 1.696962 -5.689778500 0.9606220
## 6033d90a5af2c702367b3a96 -1.6341323 1.736069 -5.883496250 0.9600657
## 6034fc165af2c702367b3a98 0.1029461 1.285825 -2.428427000 2.7107973
## 603500725af2c702367b3a99 2.7875200 1.784280 -0.014277007 6.9738533
## 603f84f15af2c702367b3a9b -0.7843906 1.731413 -4.820681500 2.1859730
## 603f97625af2c702367b3a9d 2.8057268 1.784552 0.008421944 6.9930365
## 603fd5d95af2c702367b3a9e -1.6181059 1.695680 -5.590281750 1.0030633
## 60409b7b5af2c702367b3a9f 2.8022027 1.807808 -0.006584326 6.9946810
## 604b82b5a7718fbed181b336 -0.6685813 1.388766 -3.863349250 1.6474405
## 604f1239a7718fbed181b33f -1.4548001 1.686973 -5.387264500 1.2411622
## 6050c1bf856f36729d2e5218 0.7298783 1.234289 -1.576021750 3.3316175
## 6050e1e7856f36729d2e5219 -1.6154009 1.719899 -5.666797500 0.9926875
## 6055fdc6856f36729d2e521b -0.6774652 1.383467 -3.966178750 1.6536542
## 60579f2a856f36729d2e521e -0.8246259 1.774828 -4.914796250 2.1394963
## 60589862856f36729d2e521f 0.7217280 1.168790 -1.439215250 3.1721455
## 605a30a7856f36729d2e5221 -0.8743952 1.804458 -5.217587250 2.1505523
## 605afa3a856f36729d2e5222 0.7291399 1.193025 -1.533765750 3.2442200
## 605c8bc6856f36729d2e5223 0.7477587 1.165007 -1.485445750 3.1594975
## 605f3f2d856f36729d2e5224 0.7356732 1.198331 -1.457312250 3.2265058
## 605f46c3856f36729d2e5225 -1.6476631 1.711048 -5.710195750 0.9255903
## 60605337856f36729d2e5226 -1.6325260 1.729426 -5.824566500 0.9650235
## 60609ae6856f36729d2e5228 0.7502496 1.190529 -1.538054750 3.3043045
## 6061ce91856f36729d2e522e -1.6604910 1.759366 -6.024402250 0.9926731
## 6061f106856f36729d2e5231 -1.6271320 1.670796 -5.771684500 0.9179115
## 60672faa856f36729d2e523c -0.7914588 1.746501 -4.877514250 2.1338238
## 6068ea9f856f36729d2e523e -1.6245046 1.748432 -5.650251500 1.0124705
## 606db69d856f36729d2e5243 -0.8596535 1.806563 -5.012448250 2.1501285
## 6075ab05856f36729d2e5247 -1.6136623 1.708559 -5.850888750 0.9880117
plot(reuse0.all, ask = FALSE)
pp_check(reuse0.all, type = "bars", nsamples = 200, resp = "reusedlogicconstructor")
pp_check(reuse0.all, type = "bars", nsamples = 200, resp = "reusedlogicvalidation")
As including all data points didn’t harm the model, we will create this variant with all data points as well.
This variation includes work_experience_programming.s
as a predictor, as it can give further insight into how experience plays a role in the effect we are trying to measure. This is especially important as our sample skewed towards containing less experienced developers than the population at large.
reuse0.all.exp <- brm(
"mvbind(
reused_logic_validation,
reused_logic_constructor
) ~ 1 + high_debt_version + work_experience_programming.s + (1 |c| session)",
prior = c(
prior(normal(0, 1), class = "b"),
prior(normal(0, 1), class = "Intercept"),
prior(exponential(1), class = "sd", resp = reusedlogicconstructor),
prior(exponential(1), class = "sd", resp = reusedlogicvalidation),
prior(lkj(2), class = "L")
),
family = bernoulli(),
data = as.data.frame(d.completed),
control = list(adapt_delta = 0.95),
file = "fits/reuse0.all.exp",
file_refit = "on_change",
seed = 20210421
)
summary(reuse0.all.exp)
## Family: MV(bernoulli, bernoulli)
## Links: mu = logit
## mu = logit
## Formula: reused_logic_validation ~ 1 + high_debt_version + work_experience_programming.s + (1 | c | session)
## reused_logic_constructor ~ 1 + high_debt_version + work_experience_programming.s + (1 | c | session)
## Data: as.data.frame(d.completed) (Number of observations: 51)
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup samples = 4000
##
## Group-Level Effects:
## ~session (Number of levels: 29)
## Estimate
## sd(reusedlogicvalidation_Intercept) 1.91
## sd(reusedlogicconstructor_Intercept) 1.97
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 0.73
## Est.Error
## sd(reusedlogicvalidation_Intercept) 0.81
## sd(reusedlogicconstructor_Intercept) 0.85
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 0.22
## l-95% CI
## sd(reusedlogicvalidation_Intercept) 0.56
## sd(reusedlogicconstructor_Intercept) 0.51
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 0.16
## u-95% CI
## sd(reusedlogicvalidation_Intercept) 3.69
## sd(reusedlogicconstructor_Intercept) 3.89
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 0.97
## Rhat
## sd(reusedlogicvalidation_Intercept) 1.00
## sd(reusedlogicconstructor_Intercept) 1.00
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 1.00
## Bulk_ESS
## sd(reusedlogicvalidation_Intercept) 1186
## sd(reusedlogicconstructor_Intercept) 1369
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 1378
## Tail_ESS
## sd(reusedlogicvalidation_Intercept) 783
## sd(reusedlogicconstructor_Intercept) 896
## cor(reusedlogicvalidation_Intercept,reusedlogicconstructor_Intercept) 1031
##
## Population-Level Effects:
## Estimate Est.Error
## reusedlogicvalidation_Intercept -0.16 0.57
## reusedlogicconstructor_Intercept -0.38 0.58
## reusedlogicvalidation_high_debt_versionfalse -1.79 0.66
## reusedlogicvalidation_work_experience_programming.s 0.26 0.47
## reusedlogicconstructor_high_debt_versionfalse -1.63 0.65
## reusedlogicconstructor_work_experience_programming.s 0.31 0.48
## l-95% CI u-95% CI Rhat
## reusedlogicvalidation_Intercept -1.30 0.96 1.00
## reusedlogicconstructor_Intercept -1.53 0.74 1.00
## reusedlogicvalidation_high_debt_versionfalse -3.12 -0.54 1.00
## reusedlogicvalidation_work_experience_programming.s -0.67 1.21 1.00
## reusedlogicconstructor_high_debt_versionfalse -2.95 -0.39 1.00
## reusedlogicconstructor_work_experience_programming.s -0.66 1.24 1.00
## Bulk_ESS Tail_ESS
## reusedlogicvalidation_Intercept 2569 2547
## reusedlogicconstructor_Intercept 3682 3321
## reusedlogicvalidation_high_debt_versionfalse 4478 2812
## reusedlogicvalidation_work_experience_programming.s 2695 2302
## reusedlogicconstructor_high_debt_versionfalse 4569 2989
## reusedlogicconstructor_work_experience_programming.s 2549 2514
##
## Samples were drawn using sample(hmc). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
ranef(reuse0.all.exp)
## $session
## , , reusedlogicvalidation_Intercept
##
## Estimate Est.Error Q2.5 Q97.5
## 6033c6fc5af2c702367b3a93 1.69440769 1.669757 -1.03472550 5.5714933
## 6033d69a5af2c702367b3a95 -1.54017002 1.722019 -5.68123400 0.9601725
## 6033d90a5af2c702367b3a96 -1.56440549 1.759165 -5.72973025 1.0521998
## 6034fc165af2c702367b3a98 -0.60750281 1.373274 -3.84262425 1.7090468
## 603500725af2c702367b3a99 2.87956516 1.793369 0.06626417 7.0774048
## 603f84f15af2c702367b3a9b -0.79794279 1.785462 -4.95443175 2.2066393
## 603f97625af2c702367b3a9d 2.84001020 1.771612 0.04165200 6.9975305
## 603fd5d95af2c702367b3a9e -1.58100585 1.700567 -5.50799950 1.0505440
## 60409b7b5af2c702367b3a9f 2.85017201 1.783262 0.03730362 7.0165155
## 604b82b5a7718fbed181b336 0.10882545 1.228775 -2.27172200 2.6333652
## 604f1239a7718fbed181b33f -1.44418669 1.737604 -5.65407975 1.4076188
## 6050c1bf856f36729d2e5218 0.66830339 1.154907 -1.50743975 3.0903048
## 6050e1e7856f36729d2e5219 -1.60383489 1.687630 -5.68284750 0.9535508
## 6055fdc6856f36729d2e521b 0.08876662 1.222867 -2.40751125 2.5978193
## 60579f2a856f36729d2e521e -0.80411051 1.720143 -4.84170850 2.1074005
## 60589862856f36729d2e521f 0.26567694 1.330795 -2.37831325 3.0714628
## 605a30a7856f36729d2e5221 -0.83583864 1.791272 -5.01451450 2.1661973
## 605afa3a856f36729d2e5222 0.38890068 1.246211 -2.00175425 2.9543810
## 605c8bc6856f36729d2e5223 0.67401789 1.171177 -1.56932025 3.1122942
## 605f3f2d856f36729d2e5224 0.11490152 1.512102 -2.85034875 3.2663030
## 605f46c3856f36729d2e5225 -1.53612405 1.663945 -5.57183250 0.9840714
## 60605337856f36729d2e5226 -1.61733472 1.709670 -5.64570950 1.0346345
## 60609ae6856f36729d2e5228 0.81336784 1.182931 -1.44218375 3.2653805
## 6061ce91856f36729d2e522e -1.57869587 1.673598 -5.61324125 0.9152119
## 6061f106856f36729d2e5231 -1.57670092 1.683208 -5.57801550 1.0358605
## 60672faa856f36729d2e523c -0.79070451 1.739234 -4.83192975 2.1411885
## 6068ea9f856f36729d2e523e -1.60367217 1.668877 -5.62579500 0.8866286
## 606db69d856f36729d2e5243 -0.93886701 1.758043 -4.90556200 2.0598303
## 6075ab05856f36729d2e5247 -1.60869184 1.760252 -5.92382650 1.1282968
##
## , , reusedlogicconstructor_Intercept
##
## Estimate Est.Error Q2.5 Q97.5
## 6033c6fc5af2c702367b3a93 1.8106828 1.775726 -1.00387800 5.9914078
## 6033d69a5af2c702367b3a95 -1.6174607 1.784822 -5.91908575 1.0826265
## 6033d90a5af2c702367b3a96 -1.5707289 1.729083 -5.67923250 1.1092988
## 6034fc165af2c702367b3a98 0.1529060 1.281617 -2.40816000 2.7561755
## 603500725af2c702367b3a99 2.9696416 1.888775 0.07114812 7.2944118
## 603f84f15af2c702367b3a9b -0.8366828 1.785836 -5.03745900 2.2163323
## 603f97625af2c702367b3a9d 2.9811799 1.865609 0.09382672 7.2111513
## 603fd5d95af2c702367b3a9e -1.6204577 1.806969 -5.84734450 1.0886143
## 60409b7b5af2c702367b3a9f 2.9579985 1.832993 0.08757456 7.2107488
## 604b82b5a7718fbed181b336 -0.6081398 1.441246 -4.11585300 1.8219030
## 604f1239a7718fbed181b33f -1.4476162 1.770623 -5.73338775 1.3782573
## 6050c1bf856f36729d2e5218 0.7342527 1.213078 -1.52650125 3.2850767
## 6050e1e7856f36729d2e5219 -1.6461784 1.764501 -5.99489800 1.0084595
## 6055fdc6856f36729d2e521b -0.6193632 1.419618 -4.01296625 1.7525353
## 60579f2a856f36729d2e521e -0.8327042 1.791048 -5.06673800 2.2946275
## 60589862856f36729d2e521f 0.3213118 1.375121 -2.37617575 3.2033225
## 605a30a7856f36729d2e5221 -0.8547844 1.881423 -5.35731650 2.2204713
## 605afa3a856f36729d2e5222 0.4464808 1.248476 -1.96266500 3.0384988
## 605c8bc6856f36729d2e5223 0.7178812 1.219858 -1.55502375 3.3135363
## 605f3f2d856f36729d2e5224 0.1586350 1.559585 -2.93105325 3.4420773
## 605f46c3856f36729d2e5225 -1.5890300 1.717338 -5.53698625 1.0393233
## 60605337856f36729d2e5226 -1.6091079 1.762246 -5.64782975 1.0305513
## 60609ae6856f36729d2e5228 0.8779225 1.243293 -1.41096400 3.5594180
## 6061ce91856f36729d2e522e -1.5674709 1.677502 -5.72683900 1.0392560
## 6061f106856f36729d2e5231 -1.5672068 1.736949 -5.79265925 1.1207223
## 60672faa856f36729d2e523c -0.8415536 1.790866 -4.83130600 2.1626418
## 6068ea9f856f36729d2e523e -1.6151410 1.708899 -5.78600800 0.9732339
## 606db69d856f36729d2e5243 -0.9906428 1.812217 -5.26753875 2.0467303
## 6075ab05856f36729d2e5247 -1.6098854 1.777096 -5.97939575 1.0652855
loo(
reuse0.all,
reuse0.all.exp
)
## Output of model 'reuse0.all':
##
## Computed from 4000 by 51 log-likelihood matrix
##
## Estimate SE
## elpd_loo -46.1 4.4
## p_loo 19.4 2.2
## looic 92.2 8.9
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 25 49.0% 369
## (0.5, 0.7] (ok) 17 33.3% 249
## (0.7, 1] (bad) 9 17.6% 165
## (1, Inf) (very bad) 0 0.0% <NA>
## See help('pareto-k-diagnostic') for details.
##
## Output of model 'reuse0.all.exp':
##
## Computed from 4000 by 51 log-likelihood matrix
##
## Estimate SE
## elpd_loo -47.4 4.7
## p_loo 21.0 2.4
## looic 94.8 9.3
## ------
## Monte Carlo SE of elpd_loo is NA.
##
## Pareto k diagnostic values:
## Count Pct. Min. n_eff
## (-Inf, 0.5] (good) 23 45.1% 214
## (0.5, 0.7] (ok) 17 33.3% 200
## (0.7, 1] (bad) 11 21.6% 108
## (1, Inf) (very bad) 0 0.0% <NA>
## See help('pareto-k-diagnostic') for details.
##
## Model comparisons:
## elpd_diff se_diff
## reuse0.all 0.0 0.0
## reuse0.all.exp -1.3 1.0
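Both models have several observations in the “bad” Pareto k range, so the PSIS approximation is unreliable for those points. One way to verify the comparison, sketched here under the assumption that refitting is affordable, is to let brms refit each model once per problematic observation:

```r
# Refit each model leaving out every observation whose Pareto k
# exceeds the threshold, replacing the unreliable importance-sampling
# estimates with exact leave-one-out estimates.
loo(reuse0.all, reuse0.all.exp, reloo = TRUE)
```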
plot(reuse0.all.exp, ask = FALSE)
pp_check(reuse0.all.exp, type = "bars", nsamples = 200, resp = "reusedlogicconstructor")
pp_check(reuse0.all.exp, type = "bars", nsamples = 200, resp = "reusedlogicvalidation")
This means that our final model, with all data points and the experience predictor, is reuse0.all.exp
To begin interpreting the model, we look at how its parameters were estimated. As our research focuses on how the outcome of the model is affected, we will mainly analyze the \(\beta\) parameters.
mcmc_areas(reuse0.all.exp, pars = c("b_reusedlogicconstructor_high_debt_versionfalse", "b_reusedlogicconstructor_work_experience_programming.s"), prob = 0.95) +
scale_y_discrete(labels = c("High debt version: false", "Professional programming experience")) +
ggtitle("Beta parameters densities in reuse constructor model", subtitle = "Shaded region marks 95% of the density. Line marks the median")
mcmc_areas(reuse0.all.exp, pars = c("b_reusedlogicvalidation_high_debt_versionfalse", "b_reusedlogicvalidation_work_experience_programming.s"), prob = 0.95) +
scale_y_discrete(labels = c("High debt version: false", "Professional programming experience")) +
ggtitle("Beta parameter densities in reuse validation model", subtitle = "Shaded region marks 95% of the density. Line marks the median")
When we look at effect size, we consider both outcomes (constructor and validation reuse) combined.
scale_programming_experience <- function(x) {
(x - mean(d.completed$work_experience_programming))/ sd(d.completed$work_experience_programming)
}
unscale_programming_experience <- function(x) {
x * sd(d.completed$work_experience_programming) + mean(d.completed$work_experience_programming)
}
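As a quick sanity check, the two helpers should be exact inverses of each other. A minimal sketch, using a made-up experience vector in place of the study's `d.completed$work_experience_programming`:

```r
# Hypothetical stand-in for d.completed$work_experience_programming
work_experience <- c(0, 3, 10, 25, 40)

scale_exp   <- function(x) (x - mean(work_experience)) / sd(work_experience)
unscale_exp <- function(x) x * sd(work_experience) + mean(work_experience)

# Round-tripping through scale/unscale should recover the original values
stopifnot(isTRUE(all.equal(unscale_exp(scale_exp(work_experience)), work_experience)))
```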
post_settings <- expand.grid(
high_debt_version = c("false", "true"),
session = NA,
work_experience_programming.s = sapply(c(0, 3, 10, 25, 40), scale_programming_experience)
)
post <- posterior_predict(reuse0.all.exp, newdata = post_settings) %>%
melt(value.name = "estimate", varnames = c("sample_number", "settings_id", "model")) %>%
left_join(
rowid_to_column(post_settings, var= "settings_id"),
by = "settings_id"
) %>%
mutate(work_experience_programming = unscale_programming_experience(work_experience_programming.s)) %>%
select(
estimate,
high_debt_version,
work_experience_programming,
model
)
post %>%
mutate_at("high_debt_version",
function(x) case_when(
x == "false" ~ "Low debt",
x == "true" ~ "High debt"
)) %>%
mutate_at("estimate",
function(x) case_when(
x == 0 ~ "Reused",
x == 1 ~ "Duplicated"
)) %>%
ggplot(aes(high_debt_version)) +
geom_bar(aes(fill = estimate), position = "fill") +
facet_grid(rows = vars(work_experience_programming)) +
scale_fill_manual("Legend", values = c("lightblue", "darkblue")) +
labs(title = "Reuse / programming experience") +
xlab("Debt version") +
ylab("Ratio of reuse")
We can then proceed to calculate some likelihoods of duplication:
d <- post
d.high <- d %>% filter(high_debt_version == "true") %>% pull(estimate)
d.low <- d %>% filter(high_debt_version == "false") %>% pull(estimate)
x <- sum(d.high) / sum(d.low)
x
## [1] 1.866586
Given all the simulated cases, we find that developers are 87% more likely to duplicate code in our high debt scenario.
d <- post %>% filter(work_experience_programming == 10)
d.high <- d %>% filter(high_debt_version == "true") %>% pull(estimate)
d.low <- d %>% filter(high_debt_version == "false") %>% pull(estimate)
x <- sum(d.high) / sum(d.low)
x
## [1] 2.017271
Considering developers with 10 years of professional programming experience, we find that they are 102% more likely to duplicate code in our high debt scenario.
d <- post %>% filter(work_experience_programming == 0)
d.high <- d %>% filter(high_debt_version == "true") %>% pull(estimate)
d.low <- d %>% filter(high_debt_version == "false") %>% pull(estimate)
x <- sum(d.high) / sum(d.low)
x
## [1] 2.126187
Considering developers with no professional programming experience, we find that they are 113% more likely to duplicate code in our high debt scenario.
d <- post %>% filter(work_experience_programming == 25)
d.high <- d %>% filter(high_debt_version == "true") %>% pull(estimate)
d.low <- d %>% filter(high_debt_version == "false") %>% pull(estimate)
x <- sum(d.high) / sum(d.low)
x
## [1] 1.783713
Considering developers with 25 years of professional programming experience, we find that they are 78% more likely to duplicate code in our high debt scenario.
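The four ratio calculations above repeat the same filter-and-sum pattern; a small helper keeps that logic in one place. This is a sketch assuming a data frame shaped like `post` above, with columns `high_debt_version`, `work_experience_programming`, and a 0/1 `estimate`; the `toy` data frame is synthetic, not the study's posterior draws:

```r
library(dplyr)

# Ratio of duplications (estimate == 1) in high- vs low-debt simulations,
# optionally restricted to a single experience level.
duplication_ratio <- function(post, experience = NULL) {
  if (!is.null(experience)) {
    post <- filter(post, work_experience_programming == experience)
  }
  sum(filter(post, high_debt_version == "true")$estimate) /
    sum(filter(post, high_debt_version == "false")$estimate)
}

# Tiny synthetic example (NOT the study's posterior draws)
toy <- data.frame(
  high_debt_version = rep(c("true", "false"), each = 4),
  work_experience_programming = rep(c(0, 10), times = 4),
  estimate = c(1, 1, 1, 0, 1, 0, 0, 0)
)
duplication_ratio(toy)     # 3 duplications high vs 1 low -> 3
duplication_ratio(toy, 0)  # restricted to 0 years -> 2
```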