Open Access

General Orthopaedics

Acceptance and understanding of artificial intelligence in medical research among orthopaedic surgeons



Abstract

Aims

The principles of evidence-based medicine (EBM) are the foundation of modern medical practice. Surgeons are familiar with the commonly used statistical techniques to test hypotheses, summarize findings, and provide answers within a specified range of probability. Based on this knowledge, they are able to critically evaluate research before deciding whether or not to adopt the findings into practice. Recently, there has been an increased use of artificial intelligence (AI) to analyze information and derive findings in orthopaedic research. These techniques use a set of statistical tools that are increasingly complex and may be unfamiliar to the orthopaedic surgeon. It is unclear if this shift towards less familiar techniques is widely accepted in the orthopaedic community. This study aimed to provide an exploration of understanding and acceptance of AI use in research among orthopaedic surgeons.

Methods

Semi-structured in-depth interviews were carried out on a sample of 12 orthopaedic surgeons. Inductive thematic analysis was used to identify key themes.

Results

The four intersecting themes identified were: 1) validity in traditional research, 2) confusion around the definition of AI, 3) an inability to validate AI research, and 4) cautious optimism about AI research. Underpinning these themes is the notion of a validity heuristic that is strongly rooted in traditional research teaching and embedded in medical and surgical training.

Conclusion

Research involving AI sometimes challenges the accepted traditional evidence-based framework. This can give rise to confusion among orthopaedic surgeons, who may be unable to confidently validate findings. In our study, the impact of this was mediated by cautious optimism based on an ingrained validity heuristic that orthopaedic surgeons develop through their medical training. Adding to this, the integration of AI into everyday life works to reduce suspicion and aid acceptance.

Cite this article: Bone Jt Open 2023;4(9):696–703.

Take home message

The emergence of artificial intelligence (AI) within medical research challenges established principles used by surgeons to validate findings.

Researchers should be aware that complex AI methodologies may be beyond the understanding of most orthopaedic surgeons.

Despite this, the sample in this study reports cautious optimism about the application of AI within medical research.

Introduction

Evidence-based medicine (EBM) is the cornerstone of modern medical research.1,2 The principles aid clinicians in helping patients to provide informed consent and ultimately guide them in making decisions about clinical practice. The key doctrine of EBM is that medical decisions are supported by evidence collected through in vivo studies with standardized, high-quality methodologies. This systematic framework supplements principles such as reasoning, deduction from basic science, and clinician experience. Adoption of EBM was rapid, wide-reaching, and referred to at the time as a paradigm shift.1

Recently, another paradigm has emerged: artificial intelligence (AI). AI is an umbrella term which includes a number of relevant subfields such as machine learning (ML), deep learning (DL), and neural networks (NN).3,4 This field is growing rapidly as the techniques are applied widely across orthopaedics.5-10 Interpretation of AI data requires that some of the core principles of EBM are re-framed. Currently, AI research predominantly consists of retrospective analysis of datasets with an exploratory approach, which may be seen as a lower level of evidence. The methodological differences between EBM-derived research and AI-derived research appear dichotomous, but this is not the case, as the roots of AI are firmly planted in classic statistics.11 However, many of the instruments of AI require specialized training, something which most clinicians may not have.6,12 Therefore, AI-derived research may be asking clinicians who are trained by, and practice within, an environment where medical decisions are based on evidence to take a leap of faith. The aim of this research was to explore understanding of AI, and issues relating to AI use in research among orthopaedic surgeons.

Methods

Following ethical approval (UWE REC REF No: HAS.22.03.085), a series of semi-structured individual interviews were carried out. Qualitative research was chosen to provide subjective insight into the topic.13 These interviews were chosen as the primary source of data gathering to allow participants to determine issues of importance to them within the subject area.14,15 The target population was orthopaedic surgeons working within the UK who were either consultants or in speciality training. Participants were selected through convenience sampling. The first interview was considered a pilot, and data from this were removed from final analysis due to changes in the questioning technique thereafter. The only demographic data collected were surgeon level (consultant or trainee) and sex. Participant sex was recorded to ensure that the sample reflected the population. Within the final sample of 11 interviews, eight participants were at consultant level and three were at trainee level. Of the 11 participants, nine were male (82%) and two were female (18%). All participants worked in one of seven NHS university teaching hospitals. Each participant was interviewed for approximately 30 minutes, generating around five and a half hours of recording. Although the participant sample size was small, the depth of data generated was considered large enough to derive meaningful findings for qualitative research such as this.
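The sample figures reported above can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the text; the 30-minute interview length is approximate, so the total recording time is an estimate.

```python
# Figures as reported in the Methods paragraph above.
participants = 11
male, female = 9, 2
minutes_per_interview = 30  # approximate, per the text

# Percentages rounded to whole numbers, as reported.
male_pct = round(100 * male / participants)
female_pct = round(100 * female / participants)

# Estimated total recording time in hours.
total_hours = participants * minutes_per_interview / 60

print(male_pct, female_pct, total_hours)
```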

Interviews were carried out remotely using Microsoft Teams (Microsoft, USA), which enabled recording and transcription. Transcriptions were reviewed immediately after the interview while the information was fresh in the mind of the interviewer. After the transcriptions were anonymized, the original recordings and transcriptions were permanently destroyed.

Questions which elicit a closed or semi-closed response were avoided and, when used in discourse, were followed up with an open request for more details. Within the semi-structured interview, all participants were asked the same basic questions, ensuring that a core set of subjects were covered (Table I). The questions followed the questioning route suggested by Stalmeijer et al:16 1) opening questions, 2) introductory questions, 3) transition questions, 4) key questions, and 5) ending questions. At the beginning of the interview, all participants were given the definition of traditional research as “Research into medical diagnostics or interventions carried out either in vivo or in vitro and analyzed using classic statistics such as correlations.”

Table I.

Standard questions applied during the semi-structured interview.

Question type | Question
Opening questions | Are you actively involved in carrying out traditional medical research?
Opening questions | If so, how long have you been involved in traditional medical research?
Introductory questions | How would you describe your understanding of traditional research methods?
Introductory questions | What things do you look for in traditional research that you think give it validity?
Transition questions | What is your understanding of AI?
Key questions | What experience of AI do you have, either in or out of medical research?
Key questions | What is your opinion of medical research using AI?
Ending questions | If some evidence emerged which was contrary to your current practice, would you be more or less likely to adopt that change if the evidence was derived from AI as opposed to traditional research?

AI, artificial intelligence.

The interview transcription was analyzed using thematic analysis (TA). This is a recognized method for identifying, analyzing, and reporting patterns (themes) within data.17 The technique seeks to understand experiences, thoughts, and behaviours across a dataset.18 Inductive TA was chosen as opposed to deductive, as this has been shown to lead to a broader analysis of the whole body of data, rather than a specific aspect.18,19 When identifying themes, a patterned response was considered as any information that related to the area of interest and which appeared in more than one interview. Themes were analyzed using the six steps identified by Braun and Clarke:17 familiarizing oneself with the data, generating initial codes, searching for themes, reviewing themes, defining and naming themes, and producing the report.

Once initial codes had been identified, the quotes supporting them were initially colour-coded within transcripts. This allowed an assessment of how often themes emerged in different transcripts. These quotes were then copied into a table and highlighted such that connections between the quotes could be visualized. While some codes used the same direct terminology, others were indirect, i.e. used different words but with the same or similar meanings. Once themes were identified, they were positioned with relation to each other to form a coherent narrative.
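The rule used above for a patterned response (a code that appears in more than one interview) can be illustrated with a minimal sketch. This is not the authors' workflow, which was carried out manually with colour-coding and tables; the interview labels, codes, and the function name `patterned_responses` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical coded excerpts: (interview label, code assigned by the analyst).
# These are illustrative placeholders, not data from the study.
coded_quotes = [
    ("Interview A", "methodology"),
    ("Interview B", "methodology"),
    ("Interview B", "bias"),
    ("Interview C", "bias"),
    ("Interview C", "leap of faith"),
    ("Interview D", "methodology"),
]


def patterned_responses(quotes, min_interviews=2):
    """Return codes seen in at least `min_interviews` distinct interviews."""
    interviews_by_code = defaultdict(set)
    for interview, code in quotes:
        interviews_by_code[code].add(interview)
    return {
        code: sorted(interviews)
        for code, interviews in interviews_by_code.items()
        if len(interviews) >= min_interviews
    }


patterns = patterned_responses(coded_quotes)
for code, interviews in sorted(patterns.items()):
    print(f"{code}: {len(interviews)} interviews ({', '.join(interviews)})")
```

Counting distinct interviews rather than raw quotes means that a code repeated many times by a single participant does not qualify on its own, which matches the "more than one interview" criterion.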

Results

The themes identified through the TA process were: 1) validity of traditional research (surgeons develop a heuristic with which they judge the validity of traditional research); 2) confusion about AI (there is poor understanding of the term AI, and it is often conflated with other terms, such as robotics); 3) inability to validate AI research (some use their original criteria to assess validity, while others do not know where to start); and 4) cautious optimism (there is a perception that AI is promising, as it was perceived as removing human biases from research).

None of the themes appeared to show a relationship with surgeon level. This may reflect the sample size, but may also be due to the relatively short time that the concept of AI has been prominent in orthopaedic research.

Theme one: Validity of traditional research

All participants reported being involved with research, with the lengths of time reflecting their tenure. One surgeon (‘Surgeon Two’) reported that they were about to retire from clinical practice within the NHS but indicated that they intend to carry on being involved in research. ‘Trainee Three’ reported that they were at the end of their training and about to become a consultant, and had amassed 13 years’ experience of research in this process. This may suggest that the process of becoming an orthopaedic surgeon goes hand in hand with practicing as a clinician-scientist.

Most reported that they had high familiarity with classic research techniques, although a number of them qualified this with phrases such as “I’m probably good for a surgeon” (‘Surgeon Seven’) and “I am average for an orthopaedic surgeon” (‘Surgeon Three’). The premise of these statements may be that these participants saw surgeons’ understanding of research methods as being different from the general population, presumably above them, although this directionality was implied and not directly stated.

When asked what things they look for in classic research as markers of validity, all reported methodology as being the main indicator, being mentioned either directly using the specific word “methodology”, or indirectly by referring to elements that would make up the methodology. The main reason reported for why methodology was important related to the potential for bias: a strong methodology was seen as removing bias, and weaker ones were seen as having the potential for bias to affect the findings. A number of participants mentioned the importance of the question, which seemed to be rooted in the ethical implications for the patient, who is almost always the subject of the research as well as the ultimate beneficiary. These sub-themes of methodology, bias, and the importance of the question inform the theme of validity in traditional research (Figure 1).

Fig. 1. Thematic map demonstrating sub-themes (grey boxes) linking to theme one. Dotted lines denote inter-relationship and arrows denote directionality.

Theme two: Confusion about AI

All participants had heard the term AI, but only two participants, both consultant-level, were able to give an accurate definition. ‘Surgeon Four’, who was the only surgeon to report some experience with AI, gave the most complete definition (Supplementary Table i). Several participants mentioned that it was something done by computers impersonating human activity which, although correct, is sufficiently vague to suggest that they may have little understanding beyond this. Two participants mentioned ML as a term they had heard before, one of whom also mentioned artificial NN, making the link between these and AI without being prompted. This suggests a good level of understanding, although neither surgeon gave a satisfactory definition of either term. The inclusion of other new technologies in the definition was common, with patient-specific instruments (PSI) and computer navigation mentioned twice, and robots mentioned four times. In a similar vein, multiple participants mentioned predictions and modelling alongside AI. This description is not incorrect, as it captures one potential use of AI; however, it is not all-encompassing, and suggests a knowledge gap. Despite all participants having heard of AI, with a few having some familiarity with AI terminology, the fact that only two were able to give an accurate definition and that other terms were commonly conflated with it makes it clear that confusion around AI is thematically represented in this dataset (Figure 2).

Fig. 2. Thematic map demonstrating sub-themes (grey boxes) linking to theme two. Dotted lines denote inter-relationship and arrows denote directionality. AI, artificial intelligence.

Theme three: Inability to validate AI research

In response to questioning about their ability to critically evaluate AI research, none of the participants reported feeling highly confident. The most confident response was from ‘Surgeon Four’ (Supplementary Table i), who had some experience of AI research and had also given the most complete definition previously. These responses demonstrate a critical evaluation of AI. One participant drew clear lines by suggesting that AI was essentially just an extension of traditional statistics (Supplementary Table i). All other participants either reported that they felt unable to validate the research, or gave an answer that hedged their ability to evaluate traditional research. The reliance on the validity heuristic is such that one participant reported that not understanding the methodology of AI research meant a “leap of faith” was required (‘Surgeon Seven’), whereas others referenced suspicion. The need to combine AI research with traditional research before it would be widely accepted was mentioned by several participants. To combat their lack of a clear understanding, some were attempting to apply the same criteria as they use for traditional research (Figure 3). This did not lead to a total rejection of the concept, as there are elements of AI research, such as sample sizes, which sit well within the validity heuristic.

Fig. 3. Thematic map demonstrating sub-themes (grey boxes) linking to theme three. Dotted lines denote inter-relationship and arrows denote directionality. AI, artificial intelligence.

Theme four: Cautious optimism

None of the participants reported complete distrust in AI research. The closest was a claim that “people [surgeons] were not ready for it” (‘Surgeon Two’). The validity heuristic again came to the fore, with repeated mention linking back to methodology and the idea that findings from larger datasets are meaningful (Supplementary Table i). Similarly, the potential for AI to mitigate bias was pointed to by most participants who expressed optimistic viewpoints. These responses point strongly to the theme of cautious optimism, which is driven by the interplay of themes one and two: surgeons saw the potential for AI to fit within the validity heuristic, but were not quite sure how (Figure 4).

Fig. 4. Thematic map demonstrating sub-themes (grey boxes) linking to theme four. Dotted lines denote inter-relationship and arrows denote directionality. AI, artificial intelligence.

Discussion

Trust was at the heart of the surgeon’s relationship with the validity heuristic. The principles of EBM allow a surgeon to trust that research findings will translate into benefits for their patients. A lack of understanding changes this relationship.6 Farrow et al3 suggest that when reading AI research, the surgeon essentially becomes a layperson. This implies that the methodology of AI research should be considered as specialist mathematical information. This is echoed by Martin et al,20 who describe a knowledge gap in orthopaedic surgeons’ understanding of AI research and suggest that this may limit the impact within orthopaedics. Accepting research and adopting research are different, and a delay of between 15 and 20 years has been described between research publication and widespread adoption.21,22 Where findings cannot be critically evaluated, it is reasonable to assume that this delay may change: it may become longer or, perhaps more concerningly, adoption may proceed without effective critical evaluation.

The need for AI to be comprehensible is underlined by the term ‘explainable AI’ (XAI). This term has been attributed to the US Defense Advanced Research Projects Agency (DARPA), who saw the need to “open the black box and let users see how conclusions were drawn”.23 The term ‘black box’ refers to a system for which users can see inputs and outputs, but not how it works.24 Behind this metaphor is the need for transparency in the process, which underpins trust and is critical in a high-stakes field such as healthcare. This is reflected in the interview with ‘Surgeon Seven’, who used the phrase “leap of faith”. Samek et al25 describe trusting predictions made by a black box system as irresponsible. A secondary discussion around trust and AI has emerged within medical ethics, with the patients’ trust in doctors at the forefront. Much of what doctors do is not understood by the patient,26 and it has been argued that XAI therefore does not need to be explainable to the patient.27 This may assume that the surgeon is better positioned to understand the AI than the patient in the same way that they are better positioned to understand the action of a specific drug, which may not necessarily be true.20

One critical area where the knowledge gap between a surgeon and a patient is significant is the legal principle of informed consent, which assumes that the surgeon knows both what they are doing and, critically, why they are doing it.28 During the process of informed consent, patients may question clinical decision-making. If the surgeon is not able to explain the basis on which they make their decisions, they are therefore unable to justify them.26

That negative thoughts were moderated by positive ones in theme four may be due to the rise of AI in areas of modern life outside medical research.20 Modern society is firmly in the information age – a shift the size of which has been likened to the Industrial Revolution.29 Those surgeons facing choices about whether to apply AI research findings are likely to already be aware of, and possibly already be using, AI-derived algorithms in their day-to-day lives; the leap from voice recognition in their smart speaker to image recognition in radiological analysis may not be too large.

When considered from the perspective of a society already accustomed to AI, EBM could be considered rudimentary, even dangerous. The largest comparative trials in orthopaedics rarely recruit more than a few hundred patients, and national observational databases such as the National Joint Registry only report in the hundreds of thousands.30 When compared to the amount of information that tech giants hold on individuals, this amount of data pales into insignificance. Held against such datasets, randomized controlled trials that inform something as significant as whether a patient should be offered conservative treatment or have a joint arthroplasty could seem to offer too little data to make a really informed choice.

As the EBM framework has become ingrained, so too have ways in which to manipulate the system,31,32 with some reports suggesting that much of the evidence upon which medicine is based may not be true,33 or may even be missing.34 Alternative frameworks have appeared, such as the Modified Coleman Methodology Score,35 which takes a broader view of quality than just the type of study. There have also been attempts to bring elements of the EBM framework up to date with modern medicine, specifically to accommodate AI research. The Consolidated Standards of Reporting Trials statement (CONSORT)36 and the Standard Protocol Items: Recommendations for Interventional Trials (SPIRIT) have recently been updated with AI extensions.37,38 In the case of the CONSORT statement, 14 additional steps have been added and others clarified to make the CONSORT-AI statement. At the time of writing, the authors were unable to find any publications that use the CONSORT-AI checklist; however, the fact that these extensions have been added gives credence to the findings of this research.

This study has limitations that should be acknowledged. The convenience sampling technique could potentially include significant biases. Surgeons were chosen from personal contacts, which may affect the generalizability of these findings. Future work should look to use a sampling method that may have greater reproducibility. As with all qualitative research, it is not possible to completely remove the effect of researcher bias.39 For example, the framing of the questions used in the semi-structured interviews may have drawn participants towards comparisons between EBM and AI that otherwise may not have been made. Additionally, the interviewer has been involved with research into robotics, which may have led to the conflation of AI with robotics that was observed in theme two.

In conclusion, the orthopaedic surgeons interviewed had a deep sense of what they felt made up good evidence and what evidence may be compromised by bias. This validity heuristic leans heavily on the methodology of a given piece of evidence and is enshrined through traditional EBM frameworks. AI challenges the accepted framework, meaning the validity heuristic is disturbed, which gives rise to confusion and an inability among surgeons to validate findings. The impact of this is mediated by cautious optimism as the validity heuristic is applied; bias is potentially reduced, and sample sizes are large. Adding to this, the absorption of AI into everyday life works to reduce suspicion and may aid acceptance.


Correspondence should be sent to Michael James Ormond. E-mail:

References

1. Evidence-Based Medicine Working Group. Evidence-based medicine. A new approach to teaching the practice of medicine. JAMA. 1992;268(17):2420–2425.

2. Greenhalgh T, Howick J, Maskrey N, Brassey J, Burch D, Burton M. Evidence based medicine: A movement in crisis? BMJ. 2014;348:g3725.

3. Farrow L, Zhong M, Ashcroft GP, Anderson L, Meek RMD. Interpretation and reporting of predictive or diagnostic machine-learning research in Trauma & Orthopaedics. Bone Joint J. 2021;103-B(12):1754–1758.

4. Myers TG, Ramkumar PN, Ricciardi BF, Urish KL, Kipper J, Ketonis C. Artificial intelligence and orthopaedics: An introduction for clinicians. J Bone Joint Surg Am. 2020;102-A(9):830–840.

5. Li Z, Maimaiti Z, Fu J, Chen J-Y, Xu C. Global research landscape on artificial intelligence in arthroplasty: A bibliometric analysis. Digit Health. 2023;9:20552076231184048.

6. Kunze KN, Orr M, Krebs V, Bhandari M, Piuzzi NS. Potential benefits, unintended consequences, and future roles of artificial intelligence in orthopaedic surgery research: a call to emphasize data quality and indications. Bone Jt Open. 2022;3(1):93–97.

7. Leopold SS, Haddad FS, Sandell LJ, Swiontkowski M. Artificial intelligence applications and scholarly publication in orthopaedic surgery. Bone Joint J. 2023;105-B(6):585–586.

8. Oliveira E Carmo L, van den Merkhof A, Olczak J, et al. An increasing number of convolutional neural networks for fracture recognition and classification in orthopaedics: are these externally validated and ready for clinical application? Bone Jt Open. 2021;2(10):879–885.

9. Jang SJ, Kunze KN, Brilliant ZR, et al. Comparison of tibial alignment parameters based on clinically relevant anatomical landmarks: a deep learning radiological analysis. Bone Jt Open. 2022;3(10):767–776.

10. Archer H, Reine S, Alshaikhsalama A, et al. Artificial intelligence-generated hip radiological measurements are fast and adequate for reliable assessment of hip dysplasia: an external validation study. Bone Jt Open. 2022;3(11):877–884.

11. Beam AL, Kohane IS. Big data and machine learning in health care. JAMA. 2018;319(13):1317–1318.

12. Liu Y, Chen PHC, Krause J, Peng L. How to read articles that use machine learning: Users’ guides to the medical literature. JAMA. 2019;322(18):1806–1816.

13. Ayre J, McCaffery KJ. Research Note: Thematic analysis in qualitative research. J Physiother. 2022;68(1):76–79.

14. Fowler Jr FJ. Improving Survey Questions: Design and Evaluation. Boston, Massachusetts: Sage Publications, 1995:191.

15. Low J. Unstructured and Semi-structured Interviews in Health Research. In: Saks M, Allsop J. Researching Health: Qualitative, Quantitative and Mixed Methods. Second ed. Bodmin: MPG Books Group, 2013.

16. Stalmeijer RE, Mcnaughton N, Van Mook WNKA. Using focus groups in medical education research: AMEE Guide No. 91. Med Teach. 2014;36(11):923–939.

17. Braun V, Clarke V. Using thematic analysis in psychology. Qual Res Psychol. 2006;3(2):77–101.

18. Kiger ME, Varpio L. Thematic analysis of qualitative data: AMEE Guide No. 131. Med Teach. 2020;42(8):846–854.

19. Braun V, Clarke V. What can “thematic analysis” offer health and wellbeing researchers? Int J Qual Stud Health Well-being. 2014;9:26152.

20. Martin RK, Ley C, Pareek A, Groll A, Tischer T, Seil R. Artificial intelligence and machine learning: an introduction for orthopaedic surgeons. Knee Surg Sports Traumatol Arthrosc. 2022;30(2):361–364.

21. Morris ZS, Wooding S, Grant J. The answer is 17 years, what is the question: understanding time lags in translational research. J R Soc Med. 2011;104(12):510–520.

22. Fitzpatrick JJ. Lag time in research to practice: are we reducing or increasing the gap? Appl Nurs Res. 2008;21(1):1.

23. Torres J. Explainable AI: The Next Frontier in Human-Machine Harmony. Towards Data Science, 2019. https://towardsdatascience.com/explainable-ai-the-next-frontier-in-human-machine-harmony-a3ba5b58a399 (date last accessed 14 August 2023).

24. Card D. Medium. 5 July, 2017. https://dallascard.medium.com/the-black-box-metaphor-in-machine-learning-4e57a3a1d2b0 (date last accessed 14 August 2023).

25. Samek W, Wiegand T, Müller KR. Explainable Artificial Intelligence: Understanding, Visualizing and Interpreting Deep Learning Models. 28 August, 2017. http://arxiv.org/abs/1708.08296 (date last accessed 14 August 2023).

26. Theunissen M, Browning J. Putting explainable AI in context: institutional explanations for medical AI. Ethics Inf Technol. 2022;24(2):23.

27. London AJ. Artificial intelligence and black-box medical decisions: Accuracy versus explainability. Hastings Cent Rep. 2019;49(1):15–21.

28. Vercler CJ. Surgical ethics: surgical virtue and more. Narrat Inq Bioeth. 2015;5(1):45–51.

29. Hashimoto DA, Rosman G, Rus D, Meireles OR. Artificial intelligence in surgery: Promises and perils. Ann Surg. 2018;268(1):70–76.

30. Ben-Shlomo Y, Blom A, Boulton C, et al. The National Joint Registry 18th Annual Report 2021 [Internet], London: National Joint Registry. 2021. https://www.ncbi.nlm.nih.gov/books/NBK576858/

31. Giofrè D, Cumming G, Fresc L, Boedker I, Tressoldi P. The influence of journal submission guidelines on authors’ reporting of statistics and use of open research practices. PLoS One. 2017;12(4):e0175583.

32. Giofrè D, Boedker I, Cumming G, Rivella C, Tressoldi P. The influence of journal submission guidelines on authors’ reporting of statistics and use of open research practices: Five years later. Behav Res Methods. 2022; Epub ahead of print.

33. Ioannidis JP. Why most published research findings are false. PLoS Med. 2005;2(8):e124.

34. Prescott RJ, Counsell CE, Gillespie WJ, et al. Factors that limit the quality, number and progress of randomised controlled trials. Health Technol Assess. 1999;3(20):1–143.

35. Cowan J, Lozano-Calderón S, Ring D. Quality of prospective controlled randomized trials. Analysis of trials of treatment for lateral epicondylitis as an example. J Bone Joint Surg Am. 2007;89-A(8):1693–1699.

36. Moher D, Schulz KF, Altman DG, CONSORT Group. The CONSORT statement: revised recommendations for improving the quality of reports of parallel-group randomised trials. Clin Oral Investig. 2003;7(1):2–7.

37. Liu X, Faes L, Calvert MJ, Denniston AK, CONSORT/SPIRIT-AI Extension Group. Extension of the CONSORT and SPIRIT statements. Lancet. 2019;394(10205):1225.

38. Liu X, Rivera SC, Moher D, Calvert MJ, Denniston AK, SPIRIT-AI and CONSORT-AI Working Group. Reporting guidelines for clinical trial reports for interventions involving artificial intelligence: the CONSORT-AI Extension. BMJ. 2020;370:m3164.

39. Chenail RJ. Interviewing the investigator: strategies for addressing instrumentation and researcher bias concerns in qualitative research. The Qualitative Report. 2011;16:255–262. http://www.nova.edu/ssss/QR/QR16-1/interviewing.pdf

Author contributions

M. J. Ormond: Conceptualization, Methodology, Formal analysis, Writing – original draft.

N. D. Clement: Writing – review & editing.

B. G. Harder: Writing – review & editing.

L. Farrow: Writing – review & editing.

A. Glester: Writing – review & editing, Supervision.

Funding statement

The authors received no financial or material support for the research, authorship, and/or publication of this article.

ICMJE COI statement

N. D. Clement is an editorial board member of Bone & Joint Research and The Bone & Joint Journal. L. Farrow is currently in receipt of a CSO Clinical Academic Fellowship that relates to the application of artificial intelligence to orthopaedic surgery. M. J. Ormond and B. G. Harder are employed by Stryker. M. J. Ormond also holds stock or stock options in Stryker.

Data sharing

The datasets generated and analyzed in the current study are not publicly available due to data protection regulations. Access to data is limited to the researchers who have obtained permission for data processing. Further inquiries can be made to the corresponding author.

Open access funding

The open access fee for this article was funded by Stryker.

Supplementary material

Table showing transcription quotes sorted by theme.

© 2023 Author(s) et al. This is an open-access article distributed under the terms of the Creative Commons Attribution Non-Commercial No Derivatives (CC BY-NC-ND 4.0) licence, which permits the copying and redistribution of the work only, and provided the original author and source are credited. See https://creativecommons.org/licenses/by-nc-nd/4.0/