Add accept3_uk() for UK-specific CPRD recalibration #21

Open
jeenatm wants to merge 5 commits into resplab:master from jeenatm:master
@jeenatm jeenatm commented Apr 8, 2026

This PR adds accept3_uk(), a new function that provides
UK-specific COPD exacerbation risk predictions by recalibrating
ACCEPT 2.0 using the CPRD primary-care dataset.

Key features:

  • UK-specific Cox recalibration parameters (optimism-corrected)
  • Sequential triangular imputation for missing optional predictors (Table A3 of the ACCEPT CPRD paper)

Copilot AI left a comment

Pull request overview

This PR introduces accept3_uk(), a UK-specific COPD exacerbation risk prediction entrypoint that recalibrates ACCEPT 2.0 using CPRD-derived Cox recalibration parameters and a UK-specific sequential (triangular) imputation for optional predictors.

Changes:

  • Added accept3_uk() with CPRD-based sequential imputation for missing optional predictors.
  • Added UK-specific Cox recalibration step applied to ACCEPT 2.0 predicted risks.
  • Minor formatting-only adjustments in existing accept() and accept3() code blocks.

Comment on lines +1145 to +1147
#' @importFrom dplyr tibble mutate select starts_with
#' @export
accept3_uk <- function(patientData,

Copilot AI Apr 8, 2026

The roxygen @export tag adds a new public function, but this PR doesn’t update the generated NAMESPACE (and corresponding .Rd in man/). As-is, accept3_uk() won’t be exported/accessible when installing from source. Regenerate and commit NAMESPACE/documentation via roxygen2 (or add them manually if roxygen isn’t part of the workflow).

Copilot uses AI. Check for mistakes.
R/predict.R Outdated
Comment on lines +1147 to +1149
accept3_uk <- function(patientData,
                       prediction_interval = FALSE,
                       return_predictors = FALSE) {

Copilot AI Apr 8, 2026

prediction_interval is documented and exposed in the signature but never used, so callers will get identical output regardless of its value. Either implement interval propagation (e.g., recalibrate accept2()’s *_lower_PI/*_upper_PI columns and include them when prediction_interval=TRUE) or remove the parameter and its documentation to avoid a misleading API.
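
A minimal sketch of the first option, under loud assumptions: the helper `recalibrate_risk()` is hypothetical, the complementary log-log form of the Cox recalibration is assumed rather than taken from the PR, and the toy data frame only mimics the `*_lower_PI`/`*_upper_PI` naming mentioned above. The H0 and beta values are the moderate-to-severe figures from the roxygen block and are used for illustration only. Because the transform is monotone in the input risk, applying it to the interval bounds keeps them ordered around the point estimate.

```r
# Hypothetical helper (assumed form, not confirmed by the PR):
# recalibrated risk = 1 - exp(-H0 * exp(beta * log(-log(1 - p))))
recalibrate_risk <- function(p, H0, beta) {
  1 - exp(-H0 * (-log(1 - p))^beta)
}

# Toy stand-in for accept2() output; column names are assumed.
preds <- data.frame(
  predicted_exac_probability          = c(0.20, 0.55),
  predicted_exac_probability_lower_PI = c(0.10, 0.40),
  predicted_exac_probability_upper_PI = c(0.35, 0.70)
)

# Recalibrate the point estimate and both interval bounds with the same map.
cols  <- grep("^predicted_exac_probability", names(preds), value = TRUE)
recal <- preds
recal[cols] <- lapply(preds[cols], recalibrate_risk, H0 = 0.676, beta = 0.986)
```

When `prediction_interval = FALSE`, the two `*_PI` columns could simply be dropped before returning.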

Comment on lines +1126 to +1130
#' CPRD Cox model (Table in manuscript):
#' \itemize{
#' \item Moderate-to-severe: \eqn{H_0 = 0.676}, \eqn{\beta = 0.986}
#' \item Severe: \eqn{H_0 = 1.124}, \eqn{\beta = 0.482}
#' }

Copilot AI Apr 8, 2026

The severe-outcome recalibration parameters are inconsistent within this function’s documentation/comments vs the actual constants used. The roxygen block lists severe H0 = 1.124 and beta = 0.482, while the implementation/comment block uses H0_sev <- 0.482 and beta_sev <- 1.124. Please verify the correct CPRD values and make the roxygen formula section, in-code comments, and constants agree.

Comment on lines +1205 to +1210
BMI = list(
  binary = FALSE,
  clamp_low = 10,
  clamp_hi = 70,
  coef = c(28.944, -0.074, 0.195, 0.421, 0.027, -0.527, 0.010,
           0.451, -0.093, -0.051, -0.105, 1.629) # +LABA, oxygen, ICS, LAMA, statin

Copilot AI Apr 8, 2026

BMI imputation clamps to a maximum of 70, but the package documentation for samplePatients (and accept3 parameter docs) specifies BMI is expected in the 10–60 range. Clamping above the validated range may yield out-of-domain predictions; consider clamping to 60 (or explicitly documenting/justifying the higher cap if the UK recalibration supports it).
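
For illustration, clamping to the documented range could look like the sketch below. The `clamp()` helper and the example values are hypothetical and are not part of the PR.

```r
# Hypothetical helper: restrict values to [low, high].
clamp <- function(x, low, high) pmin(pmax(x, low), high)

# Made-up imputed BMI values, clamped to the documented 10-60 range.
bmi_imputed <- c(8.2, 27.5, 64.1)
bmi_clamped <- clamp(bmi_imputed, low = 10, high = 60)
```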

Comment on lines +1223 to +1226
# Use mMRC if available, otherwise back-transform from SGRQ
if (!"mMRC" %in% colnames(patientData) && "SGRQ" %in% colnames(patientData)) {
  patientData$mMRC <- (patientData$SGRQ - 20.43) / 14.77
}

Copilot AI Apr 8, 2026

mMRC is treated as a mandatory predictor for the imputation models, but the code only back-calculates mMRC when the column is entirely absent. If mMRC exists but has NA values (while SGRQ is present), those rows will keep NA in X, causing lp/imputations to become NA. Consider filling mMRC where it is missing using SGRQ when available (and/or explicitly error if mandatory predictors contain missing values).
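
A sketch of the row-wise variant, assuming the same back-transform constants as the excerpt above. The function name `fill_mmrc_from_sgrq()` is hypothetical, and whether to raise an error for rows that remain missing is a design choice left to the authors.

```r
# Hypothetical helper: fill missing mMRC row by row from SGRQ where possible.
fill_mmrc_from_sgrq <- function(patientData) {
  # Create the column if it is entirely absent.
  if (!"mMRC" %in% colnames(patientData)) {
    patientData$mMRC <- rep(NA_real_, nrow(patientData))
  }
  # Back-transform only the rows where mMRC is missing and SGRQ is present.
  if ("SGRQ" %in% colnames(patientData)) {
    fill <- is.na(patientData$mMRC) & !is.na(patientData$SGRQ)
    patientData$mMRC[fill] <- (patientData$SGRQ[fill] - 20.43) / 14.77
  }
  # Fail fast if a mandatory predictor is still missing.
  if (anyNA(patientData$mMRC)) {
    stop("mMRC is missing and cannot be derived from SGRQ for some rows.")
  }
  patientData
}
```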

Comment on lines +1269 to +1273
# 3. ACCEPT 2.0 predictions
accept2_preds <- accept2(patientData = patientData)

p2_modsev <- accept2_preds$predicted_exac_probability
p2_sev <- accept2_preds$predicted_severe_exac_probability

Copilot AI Apr 8, 2026

Unlike accept(), this function doesn’t validate LastYrSevExacCount <= LastYrExacCount. Since those fields directly influence both imputation and downstream accept2() predictions, add the same guard here (or call through the existing accept() validation path) to fail fast with a clear error for inconsistent history inputs.
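
Such a guard might look like the following sketch. The helper name `check_exac_history()` and the error wording are hypothetical; the existing check in accept() may be phrased differently.

```r
# Hypothetical guard: severe exacerbations cannot exceed total exacerbations.
check_exac_history <- function(patientData) {
  # which() ignores NA comparisons, so rows with missing counts are not flagged here.
  bad_rows <- which(patientData$LastYrSevExacCount > patientData$LastYrExacCount)
  if (length(bad_rows) > 0) {
    stop("LastYrSevExacCount exceeds LastYrExacCount in row(s) ",
         paste(bad_rows, collapse = ", "), ".")
  }
  invisible(patientData)
}
```

Calling this at the top of accept3_uk(), before imputation, would surface inconsistent inputs before they reach accept2().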

Comment on lines +1147 to +1153
accept3_uk <- function(patientData,
                       prediction_interval = FALSE,
                       return_predictors = FALSE) {

  if (!tibble::is_tibble(patientData)) {
    stop("patientData must be a tibble. Use as_tibble() to convert.")
  }

Copilot AI Apr 8, 2026

This adds substantial new behavior (UK-specific sequential imputation + recalibration) but there are no tests covering accept3_uk() in tests/testthat/test-predict.R. Add tests for: (1) baseline output columns/shape, (2) expected predictions on fixed inputs, and (3) at least one missing-optional-predictor scenario to assert imputation + determinism.

Comment on lines +1251 to +1260
if (!vname %in% colnames(patientData)) {
  patientData[[vname]] <- pred_vals
  message(paste0("accept3_uk: '", vname, "' not found - imputed using UK model."))
} else {
  na_idx <- is.na(patientData[[vname]])
  patientData[[vname]][na_idx] <- pred_vals[na_idx]
  if (any(na_idx))
    message(paste0("accept3_uk: ", sum(na_idx), " missing value(s) in '",
                   vname, "' imputed using UK model."))
}

Copilot AI Apr 8, 2026

The sequential imputation emits multiple message() calls (per missing column / per NA count), which can be noisy in batch prediction and hard to suppress selectively. Consider switching to a single aggregated warning()/message() at the end (or add a quiet/verbose flag consistent with the rest of the API) so callers can control console output more easily.
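
One possible shape for the aggregated version is sketched below. The `report_imputation()` helper and the `verbose` argument are hypothetical and not taken from the existing API: the imputation loop would collect a named count per variable, and a single message would be emitted at the end.

```r
# Hypothetical helper: emit one summary message for all imputed variables.
# `imputed_counts` is a named numeric vector, e.g. c(BMI = 3, statin = 0).
report_imputation <- function(imputed_counts, verbose = TRUE) {
  imputed_counts <- imputed_counts[imputed_counts > 0]
  if (verbose && length(imputed_counts) > 0) {
    message("accept3_uk: imputed using UK model: ",
            paste0(names(imputed_counts), " (n=", imputed_counts, ")",
                   collapse = ", "))
  }
  invisible(imputed_counts)
}
```

Callers who want silence could pass `verbose = FALSE` or wrap the call in `suppressMessages()`.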

@aminadibi
Collaborator

Should the function be called accept3_uk? Isn't it confusing that we already have an accept3 call with GBR as one of the potential country inputs? @jeenatm @msadatsafavi

@jeenatm
Author

jeenatm commented Apr 9, 2026

Should the function be called accept3_uk? Isn't it confusing that we already have an accept3 call with GBR as one of the potential country inputs? @jeenatm @msadatsafavi

@aminadibi What about accept3_cprd? I can rename it accordingly. Or we could say that accept3_uk is based on UK primary-care data, whereas accept3 with GBR is for secondary/tertiary care?
