General Orthopaedics

COMPARISON OF THE ACCURACY ASSOCIATED WITH THREE DIFFERENT MACHINE-LEARNING MODELS TO PREDICT OUTCOMES AFTER ANATOMIC TOTAL SHOULDER ARTHROPLASTY AND REVERSE TOTAL SHOULDER ARTHROPLASTY

International Society for Technology in Arthroplasty (ISTA) meeting, 32nd Annual Congress, Toronto, Canada, October 2019. Part 1 of 2.



Abstract

Introduction

Machine learning is a relatively novel method in orthopaedics that can be used to evaluate complex associations and patterns in outcomes and healthcare data. The purpose of this study is to apply 3 different supervised machine learning algorithms to outcomes from a multi-center international database of a single shoulder prosthesis and to evaluate the accuracy of each model in predicting post-operative outcomes of both aTSA and rTSA.

Methods

Data from a multi-center international database of 6485 patients (19,796 patient visits) who received primary total shoulder arthroplasty using a single shoulder prosthesis (Equinoxe, Exactech, Inc) were analyzed in this study. Specifically, demographic, comorbidity, implant type and implant size, surgical technique, pre-operative and post-operative PROMs and ROM measures, pre-operative and post-operative radiographic data, and adverse event and complication data were obtained for 2367 primary aTSA patients (8042 visits; average follow-up of 22 months) and 4118 primary rTSA patients (11,754 visits; average follow-up of 16 months). These data were used to create predictive models using 3 different supervised machine learning techniques: 1) linear regression, 2) random forest, and 3) XGBoost.

Each of these 3 techniques evaluated the pre-operative parameters and created a predictive model targeting the post-operative composite score, a 100-point score consisting of 50% post-operative composite outcome score (calculated from 33.3% ASES + 33.3% UCLA + 33.3% Constant) and 50% post-operative composite ROM score (calculated from S curves weighted by 70% active forward flexion + 15% internal rotation score + 15% active external rotation). To control for the time required for patient improvement after surgery, 3 additional predictive models were created: each primary aTSA and primary rTSA cohort was subdivided to include only data from follow-up visits >20 months after surgery, which yielded 1317 primary aTSA patients from 2962 visits at an average follow-up of 50 months and 1593 primary rTSA patients from 3144 visits at an average follow-up of 42 months.

Each of these 6 predictive models was trained using a random selection of 80% of each cohort; each model then predicted the outcomes of the remaining 20% of the data from that 20% cohort's demographic, comorbidity, implant type and implant size, surgical technique, and pre-operative PROMs and ROM measure inputs. The error of all 6 predictive models was calculated as the root mean square error (RMSE) between the actual and predicted post-operative composite score. The accuracy of each model was determined by subtracting the percent difference of each RMSE value from the average composite score associated with each cohort.
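
The scoring and evaluation steps described above can be illustrated with a minimal Python sketch. It is not the authors' code, and it rests on several assumptions that the abstract does not confirm: each PROM (ASES, UCLA, Constant) and each ROM sub-score is taken to be already scaled to 0-100 (the S-curve mapping itself is not reproduced), the 80/20 split is a simple random shuffle, and the accuracy is read as 100% minus the RMSE expressed as a percentage of the cohort's average composite score. All function names are hypothetical.

```python
import math
import random


def composite_outcome(ases, ucla, constant):
    """Equal-weight (33.3% each) mean of three outcome scores.

    Assumption: every input is already scaled to 0-100.
    """
    return (ases + ucla + constant) / 3.0


def composite_rom(flexion, internal_rotation, external_rotation):
    """ROM score weighted 70% / 15% / 15%, as stated in the abstract.

    Assumption: inputs are sub-scores on 0-100; the S-curve mapping
    from raw ROM to a sub-score is not given and is not modelled here.
    """
    return 0.70 * flexion + 0.15 * internal_rotation + 0.15 * external_rotation


def total_composite(outcome, rom):
    """100-point target: 50% composite outcome + 50% composite ROM."""
    return 0.5 * outcome + 0.5 * rom


def split_80_20(rows, seed=0):
    """Randomly assign 80% of rows to training and 20% to testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.8 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def rmse(actual, predicted):
    """Root mean square error between actual and predicted scores."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def accuracy_percent(rmse_value, mean_score):
    """Assumed reading: 100% minus RMSE as a percentage of the mean score."""
    return 100.0 * (1.0 - rmse_value / mean_score)


if __name__ == "__main__":
    # Made-up illustrative values, not data from the study.
    target = total_composite(
        composite_outcome(85.0, 80.0, 75.0),
        composite_rom(90.0, 60.0, 70.0),
    )
    print(round(target, 2))
    print(round(accuracy_percent(rmse_value=8.0, mean_score=80.0), 1))
```

Under this reading, a model's accuracy depends on both its RMSE and the cohort's average composite score, so the same RMSE yields a lower accuracy in a cohort with a lower average score.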

Results

For all patient visits, the XGBoost decision tree algorithm was the most accurate model for both aTSA and rTSA patients, with an accuracy of ∼89.5% for both. However, for patients with 20+ month visits only, the random forest decision tree algorithm was the most accurate model for both aTSA and rTSA patients, with an accuracy of ∼89.5% for both. The linear regression model was the least accurate predictive model for each of the cohorts analyzed. However, all 3 machine learning models provided an accuracy of ∼85% or better and an RMSE <12 (Table 1). Figures 1 and 2 depict the typical spread and RMSE of the actual vs. predicted total composite score associated with the 3 models for aTSA (Figure 1) and rTSA (Figure 2).

Discussion

The results of this study demonstrate that multiple different machine learning algorithms can be used to create models that predict outcomes with high accuracy for both aTSA and rTSA, at numerous timepoints after surgery. Future research should test these models on different datasets and with different machine learning methods in order to reduce over- and under-fitting model errors.

For any figures or tables, please contact the authors directly.