
General Orthopaedics

A DEEP LEARNING-BASED COMPUTER VISION SYSTEM FOR DETECTION AND POSE ESTIMATION OF SURGICAL TOOLS USED IN JOINT REPLACEMENT

International Society for Technology in Arthroplasty (ISTA) meeting, 32nd Annual Congress, Toronto, Canada, October 2019. Part 2 of 2.



Abstract

Introduction

Real-time tracking of surgical tools has applications in the assessment of surgical skill and operating room (OR) workflow. Accordingly, effort has been devoted to developing low-cost systems that track the location of surgical tools in real time without significant augmentation of the tools themselves. Deep learning methods have recently shown success across a multitude of computer vision tasks, including object detection, and therefore show promise for surgical tool tracking. The objective of the current study was to develop and evaluate a deep learning-based computer vision system that uses a single camera to detect and estimate the pose of multiple surgical tools routinely used in both knee and hip arthroplasty.

Methods

A computer vision system was developed for the detection and six-degree-of-freedom (6-DoF) pose estimation of two surgical tools (mallet and broach handle) using only RGB camera frames. The deep learning approach consisted of a single convolutional neural network (CNN) for object detection and semantic keypoint prediction, followed by an optimization step that fits the known tool geometries into the local camera coordinate system. Inference on a 256×352-pixel camera frame took 0.3 s. The object detection component of the system was evaluated on a manually annotated stream of video frames. The accuracy of the system was evaluated by comparing the estimated pose (position and orientation) of a tool with the ground-truth pose determined from three retroreflective markers placed on each tool and a 14-camera motion capture system (Vicon, Centennial, CO). Marker positions on each tool were transformed into the local camera coordinate system and compared with the locations implied by the estimated pose.
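The abstract does not name the optimization routine or the network architecture, so the following is only a minimal sketch of the geometry-fitting and marker-comparison steps, assuming a standard perspective-n-point (PnP) formulation via OpenCV's solvePnP. The keypoint layout, camera intrinsics, and marker positions below are hypothetical placeholders, not values from the study.

    # Sketch of the pose-estimation and marker-evaluation steps, assuming a
    # PnP solver; the optimization actually used in the study is unspecified.
    import cv2
    import numpy as np

    # Known 3D geometry of a tool's semantic keypoints in the tool's local
    # frame (metres). Coplanar, non-collinear placeholder values.
    TOOL_KEYPOINTS_3D = np.array([
        [ 0.00, 0.00, 0.00],   # e.g., mallet head centre (hypothetical)
        [ 0.04, 0.00, 0.05],   # e.g., head face, left
        [-0.04, 0.00, 0.05],   # e.g., head face, right
        [ 0.00, 0.00, 0.40],   # e.g., handle end
    ], dtype=np.float64)

    # Hypothetical pinhole intrinsics for a 256x352 frame.
    K = np.array([[300.0, 0.0, 176.0],
                  [0.0, 300.0, 128.0],
                  [0.0,   0.0,   1.0]])
    DIST = np.zeros(5)  # assume negligible lens distortion

    def estimate_pose(keypoints_2d: np.ndarray):
        """Fit the known tool geometry to CNN-predicted 2D keypoints,
        returning the tool's 6-DoF pose in the camera frame."""
        ok, rvec, tvec = cv2.solvePnP(
            TOOL_KEYPOINTS_3D, keypoints_2d.astype(np.float64), K, DIST,
            flags=cv2.SOLVEPNP_ITERATIVE)
        if not ok:
            raise RuntimeError("PnP optimization failed")
        R, _ = cv2.Rodrigues(rvec)   # rotation vector -> 3x3 matrix
        return R, tvec.reshape(3)

    def marker_error(markers_tool: np.ndarray, markers_cam_gt: np.ndarray,
                     R: np.ndarray, t: np.ndarray) -> float:
        """Mean Euclidean distance between ground-truth marker positions
        (motion capture, already transformed into the camera frame) and
        the same markers mapped through the estimated pose."""
        markers_cam_est = (R @ markers_tool.T).T + t
        return float(np.linalg.norm(markers_cam_est - markers_cam_gt,
                                    axis=1).mean())

Under these assumptions, the marker error reported in the Results would correspond to marker_error evaluated frame by frame against the Vicon ground truth.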

Results

Detection accuracy, determined from frame-wise confusion matrices, was 82% for the mallet and 95% for the broach handle. Object detection and keypoint predictions were assessed qualitatively. Marker error resulting from pose estimation was as low as 1.3 cm across the evaluation scenes. Pose estimation of the tools in each evaluation scene was also assessed qualitatively.
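For concreteness, frame-wise detection accuracy can be computed from a per-tool 2×2 confusion matrix as sketched below; the counts shown are hypothetical illustrations, not the study's data.

    import numpy as np

    def framewise_accuracy(conf: np.ndarray) -> float:
        """Accuracy from a 2x2 frame-wise confusion matrix laid out as
        [[TP, FN], [FP, TN]]: fraction of frames classified correctly."""
        return float(np.trace(conf) / conf.sum())

    # Hypothetical counts chosen to illustrate an 82% accuracy.
    mallet = np.array([[410, 55],
                       [ 35,  0]])
    print(f"mallet accuracy: {framewise_accuracy(mallet):.2%}")  # 82.00%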

Discussion

The proposed computer vision system combined CNNs with an optimization step to estimate the 6-DoF pose of surgical tools from only RGB camera frames. The system's object detection component performed on par with the state-of-the-art object detection literature, and pose estimates were computed efficiently from the CNN predictions. The system has implications for surgical skill assessment and for operations-based research to improve operating room efficiency. Nominal marker errors of 1.3 cm demonstrate the potential of this system to yield accurate pose estimates of surgical tools; however, further development of the object detection and keypoint prediction components is needed to minimize potential pose error.

For any figures or tables, please contact authors directly.