Outcome measures for CO and HKT were assessed at baseline (t0) and after 3 (t3), 6 (t6), 12 (t12), and 24 (t24) months using self-administered questionnaires, which were delivered by postal mail together with a return envelope. Economic data and ICD codes (International Classification of Diseases) for knee and hip arthroplasty were obtained from the insurance database. The economic data were used for the propensity score matching (PSM; see section Statistical analyses).
Patient baseline characteristics (t0 only)
Self-reported patient characteristics comprised age, sex, body mass index (BMI), site of OA (hip/knee/both), and additional joint replacement (yes/no). The following data were obtained from the insurance database: working status, complexity of work, years of school education, and level of education.
Primary outcomes (t0 – t3)
WOMAC pain and function
The subscales pain and physical function of the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC® NRS 3.1 German Index) were used as primary outcomes. The scales in this study ranged from 0 (no limitation) to 10 (maximum limitation).
Secondary outcomes (t0 – t24)
WOMAC pain and function
WOMAC follow-up data from t6 to t24 were used to assess mid- and long-term effects of the intervention.
Health-related quality of life (VR-12, PCS, MCS)
The Veterans RAND 12-Item Health Survey (VR-12) is a patient-reported global health measure that assesses a patient’s overall perspective of their health [21]. The instrument comprises 12 items that correspond to eight health domains: general health perceptions (GHP), physical functioning, role limitations due to physical problems, role limitations due to emotional problems, bodily pain, energy-fatigue levels, social functioning, and mental health. The VR-12 uses five-point ordinal response choices (1 = no, none of the time to 5 = yes, all of the time; higher scores represent better health status). Answers were summarized in a Physical Component Score (PCS) and a Mental Component Score (MCS), each normalized to the 1990 US population norm (mean = 50; SD = 10).
General self-efficacy scale (GSE)
The GSE scale is a ten-item self-report psychometric scale that measures general self-efficacy as a prospective and operative construct [22]. Items are scored on a 4-point Likert scale (1 = not at all true to 4 = completely true, higher scores indicate higher self-efficacy). A mean score was calculated when at least six items were present.
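The scoring rule above can be sketched in a few lines of Python. This is only an illustration: the function name `gse_mean_score` and the use of `None` for an unanswered item are assumptions made for the example, not part of the study's actual scoring code.

```python
from typing import Optional, Sequence

MIN_ITEMS = 6  # minimum number of answered items, as stated in the text


def gse_mean_score(items: Sequence[Optional[int]]) -> Optional[float]:
    """Return the mean of the answered GSE items (1-4 each).

    Unanswered items are passed as None (an assumption of this sketch).
    Returns None when fewer than MIN_ITEMS items were answered.
    """
    if len(items) != 10:
        raise ValueError("the GSE has ten items")
    answered = [x for x in items if x is not None]
    if any(x not in (1, 2, 3, 4) for x in answered):
        raise ValueError("item responses must be integers from 1 to 4")
    if len(answered) < MIN_ITEMS:
        return None  # too many missing items: no score is calculated
    return sum(answered) / len(answered)


# Hypothetical respondent with six answered items: the score is their mean.
six_answered = gse_mean_score([4, 3, None, 2, None, 3, 4, None, None, 2])
# Hypothetical respondent with only five answered items: no score.
five_answered = gse_mean_score([4, 3, None, 2, None, 3, 4, None, None, None])
```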
Health-oriented activity status (Ho-AS)
Participants were asked to rate how active they were in a health-oriented manner (Ho-AS), e.g., visiting gyms or going for a run or walk (1 = outstandingly active to 5 = not at all active).
Artificial joint replacement during follow-up (t3 – t24)
The first incidence of artificial joint replacement (AJR) at the knee or hip joints during follow-up (t3 – t24) was extracted from routine data in the insurance database.
Perceived benefit from the intervention/satisfaction with exercise instructors (t3, HKT only)
The participants’ overall perceived benefit from the intervention was assessed on a 5-point Likert scale (1 = very high perceived benefit to 5 = no perceived benefit). Furthermore, questions on trainer competence (1 = very competent to 4 = not competent at all), trainer motivation (1 = very engaged and motivated to 4 = not engaged and motivated at all) and whether participants would recommend the training program to others (1 = definitely yes to 4 = definitely not) were asked.
Exercise adherence (t3, HKT only)
Participants of HKT were asked to report if they attended all group sessions (yes/no), all home-based exercise sessions (yes/no) and reasons for non-participation (multiple responses possible), if applicable.
Exercise-related adverse events (t3, HKT only)
Occurrence of exercise-related pain and its frequency, duration and intensity were collected.
Concomitant care (t3 – t24)
Participants of CO (t3 – t24) and HKT (t6 – t24) were asked to report participation in hip and/or knee training during the previous follow-up period. Programs were differentiated into HKT group training, HKT home-based training, AOK machine-based training (another offer of the AOK-BW, specifically designed for patients with hip/knee OA), and any other exercise training for hip/knee OA (provider not specified). Participants were further asked whether they had attended any other AOK-provided health care offers.
Sample size
The sample size was estimated on the empirical basis of a previous RCT [17]. In this RCT, intra-individual differences in the WOMAC pain subscale as well as the WOMAC physical function subscale exhibited an effect size (Cohen’s d) of 0.5 between intervention and control group. Based on these results and a potential efficacy-effectiveness gap between RCTs and studies under real-life conditions [23], we assumed an effect size of ES = 0.3. To account for the two primary endpoints (WOMAC pain, physical function), a level of significance of 0.025 (two-sided, Bonferroni correction) and a power of 0.90 were used. Calculations yielded a sample size of 278 subjects per group in a parallel group design (nQuery 7.0). Accounting for a dropout rate of 20% (n = 350 subjects/study arm) and cluster effects of subjects within treatment groups, n = 700 participants should be allocated to each treatment arm. Further details are provided in the study protocol [17] and Additional Information S1.
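The arithmetic behind these figures can be approximated as follows. The sketch uses the textbook normal-approximation formula for a two-sample comparison plus a common small-sample correction (adding z²/4 per group). This is an assumption for illustration and not necessarily the algorithm nQuery applies. The inflation for cluster effects is not reproduced because the text does not report the design effect.

```python
import math
from statistics import NormalDist


def n_per_group(d: float, alpha: float, power: float) -> int:
    """Approximate per-group n for a two-sided, two-sample comparison.

    Normal approximation with a small-sample correction term;
    a sketch, not the nQuery algorithm used in the study.
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)   # two-sided critical value
    z_beta = z(power)
    n_normal = 2 * (z_alpha + z_beta) ** 2 / d ** 2
    n_corrected = n_normal + z_alpha ** 2 / 4  # approximates the t-based result
    return math.ceil(n_corrected)


alpha = 0.05 / 2                    # Bonferroni correction for two primary endpoints
n = n_per_group(d=0.3, alpha=alpha, power=0.90)

dropout = 0.20
n_with_dropout = math.ceil(n / (1 - dropout))  # the text rounds this up to 350

print(n, n_with_dropout)
```

Under these assumptions the sketch gives 278 per group, the figure reported above, and 348 after inflating for dropout, which the text rounds up to 350 per study arm.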
Blinding
Blinding of the subjects or care providers to treatment was not possible as treatment exposure was evident. Blinding of assessors was not applicable as all outcomes were patient-reported or retrieved from the health insurance database. Statisticians were not blinded because they had to prepare the baseline data of the intervention group for PSM.
Statistical analyses
All data analyses were conducted with SPSS Statistics version 26 (IBM Corp., Armonk, NY, USA) and R version 4.0.4 (R Core Team, 2020) with RStudio (version 1.3.1056; RStudio, PBC, Boston, MA, USA).
Matching procedures for the control group
Statistical twins from CO were matched to each participant of HKT in two steps. First, customers of the AOK-BW were assessed for eligibility from the insurance database according to pre-defined matching criteria (Additional Table S5). This step was repeated quarterly after new subjects had been included in HKT. We aimed to recruit ten customers of the AOK-BW for the control group (CO) for each participant of HKT. Due to the low response rate, however, around 60 insured persons per HKT participant had to be selected and contacted to achieve a ratio of 1:4 for the final matching (see Fig. 1). Second, the final matching included socio-demographic variables (age, sex), health-related variables (BMI, OA-related pain and function, affected joint, previous artificial joint replacement, physical and mental health-related quality of life, QALY, health-related activity, general self-efficacy), and economic variables (unspecific and specific health care costs and days of disability). The standardized mean difference (SMD) for all covariates was < 9% (see Additional Table S5).
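For readers unfamiliar with the balance diagnostic, the SMD of a continuous covariate can be computed as in the sketch below. The text does not state which SMD formula was used. The usual definition (difference in means divided by the square root of the average of the two group variances) is assumed here, and the age values are invented for illustration.

```python
from statistics import mean, variance


def standardized_mean_difference(treated, control) -> float:
    """SMD for a continuous covariate (assumed definition, see lead-in)."""
    pooled_sd = ((variance(treated) + variance(control)) / 2) ** 0.5
    if pooled_sd == 0:
        return 0.0  # no variation in either group: treat as balanced
    return (mean(treated) - mean(control)) / pooled_sd


def is_balanced(smd: float, bound: float = 0.09) -> bool:
    """Check |SMD| against the 9% bound reported in the text."""
    return abs(smd) < bound


# Hypothetical ages for two small groups (illustration only).
age_hkt = [60, 62, 64, 66]
age_co = [61, 63, 65, 67]
smd_age = standardized_mean_difference(age_hkt, age_co)
print(round(smd_age, 3), is_balanced(smd_age))
```

In this toy example the groups differ by one year on average, which is too large relative to the spread to pass the 9% bound. In the study all covariates stayed below it after matching.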
Imputation of missing data
To investigate the mechanism of missing data, we performed Little’s test [24], which yielded a statistically significant result (p < 0.001), so the null hypothesis of missing completely at random (MCAR) was rejected. As missingness was mostly due to wave-nonresponse with patients being lost to follow-up, we further explored a missing at random (MAR) mechanism by comparing the characteristics of dropouts vs. completers of the study (see results section). Multiple imputation (MI) was then performed with the R package Amelia [25] under the assumption that data are missing at random (MAR). A two-step MI procedure [26] was chosen to combine the selection of statistical twins from the control group via PSM, which was based on imputed baseline data (t0) only, and the multiple imputation of the longitudinal follow-up data with the final matched pairs (t3, t6, t12, t24). M = 100 MI sets were generated in total.
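Analyses on multiply imputed data are usually combined across the M data sets with Rubin's rules. The text does not name the pooling rule, so the sketch below is an assumption about the general technique rather than a description of the study's code. The function name and the input numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across M imputed data sets using Rubin's rules.

    estimates: point estimate from each imputed data set
    variances: squared standard error from each imputed data set
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists with at least two imputations")
    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance
    total = within + (1 + 1 / m) * between
    return q_bar, total ** 0.5


# Hypothetical estimates from three imputations (the study used M = 100).
pooled_estimate, pooled_se = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled_estimate, round(pooled_se, 3))
```

The between-imputation term is what makes the pooled standard error larger than a single-imputation analysis would suggest, reflecting the uncertainty introduced by the missing data.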
Main analysis
Two separate linear mixed models (LMMs) for the primary endpoints WOMAC pain and function were fitted by restricted maximum likelihood (REML), with time (t0, t3), treatment (HKT, CO), and the time × treatment interaction as fixed factors and a random intercept for subject to account for within-subject correlations. We refrained from analyzing our data using a matched-pair design, as PSM does not guarantee that individual pairs are well matched on the full set of covariates, and instead included the propensity score (PS) as a covariate in the models [17]. Model assumptions were checked visually by means of residual and QQ plots (normality of residuals, normality of random effects, linearity, homogeneity of variance). Logarithmic transformations were applied to both primary outcomes to achieve normal distribution. Overall omnibus F-tests (pooled over the MI sets) were conducted to check for statistically significant time × treatment effects. To interpret the magnitude of the treatment and time effects, pooled estimated marginal means (EMM) and the corresponding 95% confidence intervals (CI) were calculated and back-transformed from the log scale to the original measurement scale. From those EMMs, within-group change from baseline (cfb) estimates and the corresponding estimated between-group treatment differences (ETD) were derived for each timepoint.
Similar LMMs were run for the long-term follow-ups (t0 – t24) for all secondary outcomes, including WOMAC pain and function (both with logarithmic transformation), GSE, MCS, PCS, and Ho-AS. Effect sizes (ES) were calculated by dividing the estimates derived from the LMM analyses by the pooled SD of HKT and CO at baseline. Effect sizes were considered to be small (0.2–0.29), moderate (0.3–0.79) or large (> 0.8) [27].
Statistical significance for the two primary outcomes was set as p ≤ 0.025 (two-sided, Bonferroni correction). For secondary outcomes, statistical significance was set as p ≤ 0.05 without claiming confirmatory interpretation.
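The effect-size calculation used in the main analysis (model estimate divided by the pooled baseline SD of both groups) can be illustrated as follows. The function names and the input numbers are hypothetical. The category boundaries follow the ranges given in the text and are treated here as half-open intervals, which is an assumption for values that fall between or exactly on the stated limits (for example 0.8).

```python
import math
from statistics import variance


def pooled_sd(group_a, group_b) -> float:
    """Pooled standard deviation of two groups (weighted by degrees of freedom)."""
    n_a, n_b = len(group_a), len(group_b)
    numerator = (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    return math.sqrt(numerator / (n_a + n_b - 2))


def effect_size(estimate: float, sd: float) -> float:
    """Standardize a model estimate by the pooled baseline SD."""
    return estimate / sd


def classify(es: float) -> str:
    """Map |ES| to the categories used in the text (boundary handling assumed)."""
    size = abs(es)
    if size < 0.2:
        return "below small"
    if size < 0.3:
        return "small"
    if size < 0.8:
        return "moderate"
    return "large"


# Hypothetical baseline scores and a hypothetical treatment difference of -0.5.
sd_baseline = pooled_sd([2, 4, 6], [3, 5, 7])
es = effect_size(-0.5, sd_baseline)
print(sd_baseline, es, classify(es))
```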
Additional analyses
Sensitivity analysis (pre-specified in the study protocol)
We ran the LMMs for WOMAC pain and WOMAC function on all available data (AA) without MI. To further evaluate the robustness of our results, we also conducted a complete case (CC) analysis on the two primary endpoints. Note that the CC dataset has unequal group sizes and does not contain all matched 1:1 pairs.
Exploratory subgroup analysis
A subgroup analysis compared WOMAC pain and WOMAC function at t3 versus baseline for complete cases of HKT versus a subsample of CO (CO-exercise). CO-exercise was defined as participants of CO who reported engaging in any hip/knee-specific exercise between t0 and t3, as outlined in Additional Table S14. Again, note that the subgroup dataset has unequal group sizes and does not contain all matched 1:1 pairs.
Exploratory analysis on artificial joint replacement during follow-up (t0 – t24)
An exploratory time-to-event analysis was conducted using a multivariable Cox proportional hazards regression model for the first incidence of artificial joint replacement (AJR) in the follow-up period t0 – t24. To identify risk factors, the model included the covariates intervention group, WOMAC pain, MCS and PCS at baseline (t0) as well as age, sex and site of OA. Variables that were excluded from the model, with the respective reasons, are outlined in Additional Information S1. Results were reported as hazard ratios (HR), 95% confidence intervals (CI) and two-sided p-values. The proportional hazards (PH) assumption required for Cox modelling was checked by inspecting the Schoenfeld residuals and time × covariate interactions and was found to be fulfilled.