Safety of Interactive Image-Guided Surgery

In this chapter, we discuss Interactive Image-Guided Surgery (IIGS) and present the current state of the art in the analysis and evaluation of this kind of computer-supported medical procedure. We also present the results of our research to evaluate the safety of IIGS. During our discussion of the current state of the art, we review the various approaches that have been proposed to analyze and evaluate IIGS systems including their strengths and weaknesses.


Introduction
The hypothesis that initiated our research was that "it is possible to improve the safety analysis of computer-guided surgery if we model this kind of system as a process control model." The hypothesis was first formulated in an initial research planning meeting, while we were looking for a better way to develop a verification and validation method for IIGS. Over the course of this research, it has evolved significantly from a basic idea into a modelling language and a protocol that have been evaluated using a particular set of IIGS systems used for orthopedic surgery.

Using medical imaging for surgery
For many decades, medical imaging has been rapidly evolving as one of the most useful applications of scientific and engineering knowledge to improve the quality of health care. Medical imaging provides technology that enables physicians to see subcutaneous structures and tissues without the use of more invasive procedures 1.
The use of medical imaging increases the observability of areas of interest inside the patient. Increased observability allows for more complete diagnosis and better planning of surgical procedures, leading to potential improvements in surgical outcome. The first use of medical imaging with clinical intent can be traced back to January 1896 in Birmingham, United Kingdom, where an x-ray was used to plan the removal of a needle from a woman's hand Viergever (1998). Medical imaging for clinical procedures has therefore been around for over 100 years. The first use of medical images was static, but it demonstrated the usefulness of medical imaging in planning surgeries. Medical imaging progressed over the next 100 years with improvements in the quality of existing modalities and image processing, as well as the addition of new imaging modalities. In today's medical imaging, there are numerous modalities that can be used in complementary fashion to improve diagnosis. Some of these modalities include: X-Ray, CAT 2, Magnetic Resonance Imaging (MRI), Single Photon Emission Computed Tomography (SPECT) 3, Positron Emission Tomography (PET), ultrasound and several other specialized forms of imaging.
1 The debate on the degree of invasiveness of ionizing and non-ionizing radiation is ongoing. Although it is clear that their side effects cannot be overlooked Assessment and Management of Cancer Risks from Radiological and Chemical Hazards (1998); on Radiological Protection (1960-Present); Profio (1979); Tenforde (1979), medical imaging is far less invasive and risky than open exploratory surgery Hindus (2001).
In recent years, it has been possible to acquire intraoperative volumetric data with the use of mobile medical imaging (CT, MRI and ultrasound). Mobility of 3-Dimensional medical imaging in the operating room has allowed surgeons to track surgical instruments through tissues in real-time during the surgery, leading to less invasive procedures or Minimally Invasive Surgery (MIS) Patel et al. (1998); van der Weide et al. (1998); Viergever (1998).
During the last decade, increased ability to process computer graphics and images in real-time has made possible the intraoperative use of 3-D computer models obtained from segmentation and volume rendering of preoperative imaging from CAT, MRI and ultrasound Viergever (1998). Without volumetric data, the surgeon would have to form his/her own mental model from imaging slices.

General
This section covers the general principles of orthopedic IIGS. We present the processes and methods involved in IIGS; the intent is to provide an overview and to illustrate the complexity of such systems. IIGS can be represented as a pipeline of operations which can be separated into six distinct phases as displayed in Figure 1. The parts of the process are defined as follows:
1. Image Capture - Preoperative images of the region of interest in the patient are acquired. The images of interest are represented by a set of CAT slices or a scene 4.
2. Visualization - Various complex operations are applied to images to extract information. Visualization has also been called rendering or, more loosely, referred to as image processing.
3. Surgery Planning - Preoperative planning of the surgery is conducted using the scenes from visualization that have been generated by the IIGS system. The planning phase can also include rehearsal of operations on some guidance systems. Training can also be included as a requirement.
4. Registration - In the operating theatre, the visualized images are registered to the patient to align the reference systems.
5. Intraoperative Guidance - During surgery, the position of surgical instruments is tracked and displayed superimposed on the preoperative visualized images, allowing the surgeon to accurately navigate through the area of interest in the patient.
6. Postoperative Followup - After surgery, the surgeon may want to confirm surgical outcome using the surgery planning information.

2 Also known as Computed Axial Tomography or CT.
3 Also known as tomoscintigraphy.
4 A scene is a constructed 3-D image from a region of interest in the patient.
The above phases have also been presented in the literature as models containing three or four phases Breeuwer et al. (1998); Gauldie (2002). A three-phase model could be composed of preoperative (phases 1-3 above), intraoperative (phases 4-5) and postoperative (phase 6).
Regardless of how the phases are grouped or of the actual requirements of a given system, the system follows this linear progression of operations. Below, we address the six phases of IIGS. The planning and postoperative followup phases can be represented as single processes and depend in large measure on information generated during the other four phases. These two phases are highly dependent on competent human intervention derived from medical knowledge.

Image capture
The imaging modality depends on the procedure to be performed and the target area to be imaged. Apart from the physics of the modality, it is important to note that for a given modality, various health facilities will have different machines, each with its own pixel resolution, slice thickness and storage format. The increased need to communicate medical images over networks and between machines has pushed the adoption of medical imaging standards such as Digital Imaging and Communications in Medicine-3 (DICOM-3). This standard only supports sets of 2-D slices and not volume models Starreveld et al. (2001). Further processing of the images, as described in the following section, depends on the ability of the algorithms to adapt to the parameters of various CAT scan machines and to the formats of images that come from other systems.
Image capture can be preceded by the surgical implantation of fiducial markers if fiducial registration is to be performed. Physical markers such as these are made visible on medical images by selecting appropriate material for the markers.

Visualization
Prior to intraoperative mobile 3-D medical imaging and IIGS, the surgeon had a series of 2-D slices that he/she used to form a mental 3-D model. Visualization (sometimes referred to as rendering 5) deals with the display of 3-D information on a 2-D matrix of pixels (computer screen). Using computer graphics methods and today's technology, this 3-D visualization can be built from the 2-D slices obtained from medical imaging modalities. Constructing a 3-D model is a complex process that is best viewed as a pipe-and-filter architecture for image transformations. This view is useful since the selection and ordering of various operations will determine the characteristics and quality of the visualization.
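As a minimal sketch of the pipe-and-filter view, and not the implementation of any particular IIGS system, the following Python fragment chains two toy operations on a one-dimensional row of voxel intensities. The names `smooth`, `threshold` and `pipeline`, the 3-point averaging mask and the threshold level are all assumptions chosen for illustration.

```python
from functools import reduce
from typing import Callable, List

Scene = List[float]                 # a 1-D row of intensities stands in for a scene
Filter = Callable[[Scene], Scene]   # every stage maps a scene to a scene


def smooth(scene: Scene) -> Scene:
    """3-point moving average; the end values are replicated as padding."""
    padded = [scene[0]] + scene + [scene[-1]]
    return [
        (padded[i - 1] + padded[i] + padded[i + 1]) / 3.0
        for i in range(1, len(padded) - 1)
    ]


def threshold(level: float) -> Filter:
    """Build a stage that maps each intensity to 1.0 (>= level) or 0.0."""
    def apply(scene: Scene) -> Scene:
        return [1.0 if value >= level else 0.0 for value in scene]
    return apply


def pipeline(*stages: Filter) -> Filter:
    """Compose stages left to right into a single filter."""
    return lambda scene: reduce(lambda current, stage: stage(current), stages, scene)


# Hypothetical data: an isolated noise spike (index 2) and a real structure (5..7).
scene = [0.0, 0.0, 9.0, 0.0, 0.0, 6.0, 6.0, 6.0, 0.0]

smooth_then_threshold = pipeline(smooth, threshold(3.5))(scene)
threshold_then_smooth = pipeline(threshold(3.5), smooth)(scene)
```

With these assumed values, smoothing first suppresses the spike before thresholding, whereas thresholding first lets the spike through and then blurs it. The same two operations give different scenes depending on their order.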
Several image processing operations can be applied to medical images, including filtering, segmentation, classification, interpolation, quantification and image rendering. A great number of algorithms and techniques are described in the literature for each of these operations. Most techniques require interactive or manual assistance from an expert in 3-D image processing, but automated algorithms have recently started to appear in the literature Radau et al. (2000); Udupi et al. (2001); Zoroofi et al. (2001). One of the most helpful references in the domain of visualization is a survey paper on 3-D rendering in medicine by Jayaram K. Udupa Udupa (2001). In trying to create a concise survey of the operations used in visualization, we found that there is no consensus on terminology and no clean interfaces between the various operations. For example, filtering and classification are often included as part of segmentation. We consider each operation as a step in visualization. The operations are presented in the following subsections.
Images on computer screens are represented by a set of pixels. Likewise, medical images produced from modalities like CAT are formed from matrices of pixels of various intensities identifying different tissues. Pixels and voxels are sometimes used synonymously in the literature. In 3-D visualization, it makes more sense to talk about voxels, the 3-D equivalent of a 2-D pixel Hindus (2001); Udupa (2001). This mixup in terminology can be explained when we consider that a 2-D slice has a thickness that is determined by the beam width of the CAT machine.

Filtering
Filtering is an operation that takes a scene and changes the intensity 6 of its voxels to reduce noise, distortion and other artifacts that have not been eliminated by the imaging machine. Filtering smoothes the edges of structures to provide clearer and crisper images. Filtering can either be linear or optimal Ayache et al. (1996).
There are two approaches to linear filtering: gradient or Laplacian. Both approaches assume white Gaussian noise. If the noise does not have zero mean, or if the structures contain many irregularities, linear filtering may require large convolution masks as well as complex algorithms. Because medical images are very large, the computational requirements of complex algorithms are very high.
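To make the notion of a convolution mask concrete, the sketch below slides two small masks over a one-dimensional step edge. It is an illustration only: the helper `apply_mask`, the central-difference gradient mask and the three-tap Laplacian mask are common textbook choices assumed here, not the masks used in the cited work.

```python
from typing import List, Sequence


def apply_mask(signal: Sequence[float], mask: Sequence[float]) -> List[float]:
    """Slide the mask over the signal (correlation form, no padding)."""
    width = len(mask)
    return [
        sum(weight * signal[start + offset] for offset, weight in enumerate(mask))
        for start in range(len(signal) - width + 1)
    ]


GRADIENT_MASK = [-1.0, 0.0, 1.0]    # central difference (assumed example mask)
LAPLACIAN_MASK = [1.0, -2.0, 1.0]   # discrete second derivative (assumed example mask)

row = [0.0, 0.0, 0.0, 10.0, 10.0, 10.0]   # hypothetical intensities across an edge

gradient = apply_mask(row, GRADIENT_MASK)     # largest magnitude at the edge
laplacian = apply_mask(row, LAPLACIAN_MASK)   # changes sign across the edge
```

Each output value costs one multiplication per mask element, so the work grows with the mask size times the number of voxels. This is why large masks become expensive on full medical scenes.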
Another, less computationally intensive approach for noisy images is optimal filtering. Canny, Deriche and Shen have proposed such filters Ayache et al. (1996).
5 Visualization and rendering are often used interchangeably in the literature, but they are not synonymous. Visualization represents the entire process of image transforms, whereas rendering is its last step, the display of the image.
6 Depending on the imaging modality and the author, intensity is sometimes replaced by density. This is especially true for scenes that have vector-valued voxel intensity such as those obtained with MRI.

Segmentation
Segmentation is crucial for the identification and extraction of information and structures from a given scene. For orthopaedic surgery the structures consist of bones, ligaments, cartilage and tendons. The ability to accurately locate structures in the body for planning and intraoperative tracking is directly linked to surgical outcome Ayache et al. (1996); Dean et al. (1998); Udupa (2001).
There are two main classes of approaches to segmentation: region-based segmentation, which outputs a set of voxels describing a sub-scene, and boundary-based segmentation, which outputs a set of voxels or geometric information derived from voxels (polygons) describing the boundary of the object Udupa (2001).
Another way to characterize segmentation algorithms is in the way set membership is decided for voxels. In a hard segmentation, also called binary segmentation, set membership is either 0 or 1. Gray or fuzzy segmentation uses voxel intensity to decide the degree of belonging to the set with real values ranging between 0 and 1. Some methods, such as thresholding and clustering, use either binary or gray segmentation Udupa & Gonçalves (1996).
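The difference between hard and fuzzy membership can be sketched in a few lines. The function names, the threshold and the linear ramp between `low` and `high` are assumptions made for illustration; published fuzzy segmentation methods use a variety of membership functions.

```python
def hard_membership(intensity: float, threshold: float) -> int:
    """Binary segmentation: a voxel either belongs to the set (1) or does not (0)."""
    return 1 if intensity >= threshold else 0


def fuzzy_membership(intensity: float, low: float, high: float) -> float:
    """Gray segmentation: degree of belonging in [0, 1], here a linear ramp (assumed)."""
    if high <= low:
        raise ValueError("high must be greater than low")
    degree = (intensity - low) / (high - low)
    return min(1.0, max(0.0, degree))


# Hypothetical voxel intensities, from clearly outside to clearly inside the structure.
intensities = [50.0, 150.0, 250.0]
hard = [hard_membership(value, threshold=100.0) for value in intensities]
fuzzy = [fuzzy_membership(value, low=100.0, high=200.0) for value in intensities]
```

The hard version discards how close a voxel was to the threshold, while the fuzzy version keeps that information for later steps such as rendering.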
Region-based methods can be divided into two different groups: clustering and region-growing. Clustering uses voxel features to define each set in a scene. The simplest clustering method is thresholding. Region-growing, as the name indicates, grows sets of voxels based on homogeneity properties. These methods grow regions of interest from seed points by considering neighboring voxels for inclusion.
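Region-growing from a seed point can be sketched as a breadth-first search over neighboring voxels. The version below works on a single 2-D slice with 4-connectivity and uses "intensity within a tolerance of the seed value" as the homogeneity property. Both choices, and the name `region_grow`, are assumptions for illustration.

```python
from collections import deque
from typing import List, Set, Tuple

Pixel = Tuple[int, int]


def region_grow(image: List[List[float]], seed: Pixel, tolerance: float) -> Set[Pixel]:
    """Grow a region from seed, adding 4-connected neighbors similar to the seed."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    frontier = deque([seed])
    while frontier:
        row, col = frontier.popleft()
        for d_row, d_col in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            neighbor = (row + d_row, col + d_col)
            inside = 0 <= neighbor[0] < rows and 0 <= neighbor[1] < cols
            if not inside or neighbor in region:
                continue
            # Homogeneity test (assumed): intensity close to the seed intensity.
            if abs(image[neighbor[0]][neighbor[1]] - seed_value) <= tolerance:
                region.add(neighbor)
                frontier.append(neighbor)
    return region


# Hypothetical slice: high values on the right stand in for bone.
image = [
    [10.0, 12.0, 90.0, 95.0],
    [11.0, 13.0, 92.0, 94.0],
    [10.0, 50.0, 91.0, 12.0],
]
bone = region_grow(image, seed=(0, 2), tolerance=10.0)
```

The result depends on where the seed is placed and on the tolerance, so the operator's choices influence the segmented structure.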
A great number of boundary-based (edge detection) methods are described in the literature. Only a few are mentioned here. Filtering is often used as a first step in the process of edge detection. The following are examples of boundary-based methods:
1. Simple thresholding and hysteresis-thresholding can be used to find contours. The problem with these simple methods is that there are many discontinuities between edges in the contour Ayache et al. (1996). These contours can be closed by domain experts who have knowledge of anatomy.
2. Active contours, also called snakes, have been successfully used to solve the 2-D contour problem. In order to use these algorithms in a 3-D environment, one must find a 3-D generalization of the energy defining the border of an object. The user defines an initial contour and the algorithm optimizes this contour based on a balance between inside and outside energies.
3. Isosurface methods track the surfaces of structures through the assumption that the boundary of an object in a scene has a constant density. One such algorithm is Lorensen's marching cubes Lorensen & Cline (1987), which has been adapted for one of the Queen's University IIGS systems. The marching cubes method constructs the surface of each structure by finding, for each voxel, the local direction of an isosurface through it.
From a validation perspective, it is important to note that not only are the results from boundary-based methods highly dependent on the algorithm but also on the skills of the user.

Classification
Classification is basically a problem of creating an opacity scene from a density scene Udupa & Gonçalves (1996). Classification deals with connecting like-pixels with the use of a luminescence histogram Ayache et al. (1996). Objects are defined by their belonging to a range of luminescence. Classification and segmentation are two different operations but gray segmentation is synonymous with classification Udupa & Gonçalves (1996).

Interpolation
In order to facilitate the processing of information and to complete the 3-D representation from slices, it may be necessary to combine or split voxels through interpolation. The preferred shape for a voxel in most algorithms is a cube. Because slice spacing (z-dimension) is 2-8 times greater than the voxel size in the x or y dimension, we want to fill the gap with fictitious voxels based on interpolated density information from neighboring voxels. If the slices overlap, some voxels may need to be split, again requiring interpolation. The result will provide an isotropic 3-D representation.
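A minimal sketch of filling the gap between two adjacent slices is given below. It uses linear interpolation of corresponding voxels, which is only one of several possible schemes. The function name and the sample values are assumptions for illustration.

```python
from typing import List

Slice = List[List[float]]   # one 2-D slice of voxel densities


def interpolate_slices(lower: Slice, upper: Slice, n_between: int) -> List[Slice]:
    """Return n_between fictitious slices linearly interpolated between two slices."""
    new_slices = []
    for k in range(1, n_between + 1):
        t = k / (n_between + 1)   # fractional position between lower (0) and upper (1)
        new_slices.append([
            [(1.0 - t) * a + t * b for a, b in zip(row_lower, row_upper)]
            for row_lower, row_upper in zip(lower, upper)
        ])
    return new_slices


# Hypothetical case: slice spacing is 3 times the in-plane voxel size,
# so two fictitious slices are inserted to obtain cubic voxels.
lower = [[0.0, 30.0]]
upper = [[30.0, 60.0]]
filled = interpolate_slices(lower, upper, n_between=2)
```

The inserted voxels are estimates rather than measurements, which matters when the accuracy of the resulting 3-D model is assessed.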

Rendering
Rendering is the last process in the visualization pipeline. As such the image that is rendered depends on several operations, as well as the sequencing of these operations. There are two kinds of rendering: surface and volume.
Surface rendering, for which density takes one of the two values 0 or 1, portrays the boundary surface of a structure. Surface rendering can be displayed as a set of polygons obtained from segmentation. Each polygon has its own intensity level that is calculated based on the orientation of the surface with respect to the screen. This gives the impression of depth to the structures. Surface rendering can also be achieved through volume rendering techniques, but the opacity of the surface voxels will hide the elements behind them.
Volume rendering, for which density has a real value in the interval [0, 1], portrays a translucent structure through two distinct operations: projection and pixel intensity calculations. The projection can be done with either direct projection of object polygons or with ray casting. Pixel intensity calculations can be done in several ways, depending on the determination of normal vectors to the object surface. The quality of the rendition is highly dependent on the estimation of the normal vectors Udupa & Gonçalves (1996).
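The effect of opacity along a single cast ray can be illustrated with front-to-back compositing. This is a common accumulation scheme assumed here for illustration and is not necessarily the one used in the cited work. The function name, the early-termination cutoff and the sample values are hypothetical.

```python
from typing import Sequence, Tuple

Sample = Tuple[float, float]   # (shaded intensity, opacity in [0, 1])


def composite_ray(samples: Sequence[Sample], cutoff: float = 1e-3) -> float:
    """Accumulate samples ordered front to back into one pixel intensity."""
    pixel = 0.0
    transmittance = 1.0   # fraction of light still passing through
    for intensity, opacity in samples:
        pixel += transmittance * opacity * intensity
        transmittance *= 1.0 - opacity
        if transmittance < cutoff:   # nothing behind this point is visible
            break
    return pixel


# An opaque front voxel hides everything behind it (surface-like behaviour).
opaque_front = composite_ray([(0.8, 1.0), (0.3, 1.0)])
# A half-transparent front voxel lets the structure behind contribute.
translucent_front = composite_ray([(0.8, 0.5), (0.4, 1.0)])
```

With binary opacities the ray stops at the first surface voxel, which matches the remark that opaque surface voxels hide the elements behind them. Fractional opacities blend the contributions and produce a translucent image.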

Safety implications of visualization
As we can observe from the number of transformations described above, visualization can be the source of significant accuracy errors in IIGS. Each of the transformations is a single point of failure and has cumulative effects on the next set of transformations. The results from visualization are fed to the subsequent phases. Hazards created by failures in the transformations will propagate to these phases.
As we have found during our research, tests with phantoms and computer code inspections of algorithms represent the two main mitigation strategies.

Surgery planning
As indicated above, surgery planning can be reduced to a single process that is highly dependent on information provided by the previous phases of IIGS. During this phase, the surgeon uses a part of the IIGS system to perform a simulated surgery where s/he can make changes to the virtual or visualized skeletal structure of the patient. The planning tool can provide the surgeon with a variety of options to visualize corrections. Depending on the tool, the surgeon can change the amount of correction applied to the bone structure, choose and change prosthesis size, and view the results of the simulated surgery. More complex planning tools may allow the surgeon to obtain technical measurements and even view dynamic models after a correction.

Registration
Registration is the process of aligning coordinate systems so that we can match geometric data. Registration is also called co-registration, matching and fusion Taylor et al. (1996). Registration is done between coordinate systems of various modalities such as the registration of CAT and MRI 3-D models, and/or between imaging data and patient. Registration is critical to IIGS because it is directly linked to the execution of a preoperative plan through intraoperative navigation.
There are several approaches to registration. We describe three of them here, namely: fiducial markers, anatomical landmarks and surface based methods. Other methods are described in Gauldie (2002); Lavallé (1996). In the selection of a registration technique, there is a tradeoff between accuracy and invasiveness.
Fiducial markers are small posts that are surgically secured to bone tissue prior to imaging. Markers are chosen to be visible for the particular medical imaging modality. Immediately prior to surgery, registration is performed by touching each marker with a tracked probe. This procedure is the gold standard for registration because it minimizes errors in transformation. This procedure is, however, more invasive than other methods Birkfellner et al. (1998); Starreveld et al. (2001).
Anatomical landmarks are an option for registration when it is desired to eliminate the extra surgical procedure needed by fiducial methods. The problem is to find a set of distinctive and unambiguous points that can be registered reliably. Depending on the procedure, the surgeon may only expose surfaces that are devoid of such features, rendering this method difficult to use.
Surface-based registration is accomplished by acquiring points from the surface to be registered. These points form a description of the surface that must be matched to the 3-D model. The most prevalent method for acquiring these points is by touching the exposed surface of the bone at several locations with a tracked probe. The problem with this type of point collection is that the surgeon will expose as little bone as possible. Recently, A-Mode ultrasound has been used to perform transcutaneous surface registration at Queen's University Gauldie (2002). A-Mode ultrasound has also been used to perform anatomy-based registration for the spine Lavallé et al. (1996), but the point gathering method and the matching technique used by the authors resemble that of surface-based registration.
After the points are matched, the registration algorithm tries to find the transformation that best aligns the coordinate systems. For bones, a rigid-body transformation $\vec{F}(x)$ consisting of a 3 by 1 translation vector $\vec{T}$ and a 3 by 3 rotation matrix $\vec{R}$ is sufficient because the deformations of images are negligible Lavallé (1996). The transformation therefore has the form:
$$\vec{F}(x) = \vec{R}\,x + \vec{T}$$
Registration must be accurate in order to achieve the best surgical outcome. From the many papers on the topic, it is generally accepted that a registration error of less than 1 mm and 0.5 degrees is surgically acceptable. No paper could be found in the literature that provided empirical justification for the acceptability of these limits on surgical accuracy. Some papers also talk about sub-pixel accuracy for localization. This provides a vague requirement, since pixels vary in size and geometry depending on the imaging machine and the settings.
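As a hedged illustration of how a rigid-body transformation is applied and checked, the sketch below maps marker positions from image coordinates to patient coordinates using a rotation followed by a translation, then compares the root-mean-square residual with the 1 mm figure quoted above. It does not estimate the transformation itself. The function names, the rotation about a single axis and all coordinates are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def rotation_about_z(angle_deg: float) -> List[List[float]]:
    """3 by 3 rotation matrix for a rotation about the z axis."""
    angle = math.radians(angle_deg)
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def apply_rigid(rotation: Sequence[Sequence[float]], translation: Point, point: Point) -> Point:
    """Apply rotation first, then translation, to one 3-D point."""
    x, y, z = (
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
    return (x, y, z)


def rms_distance(points_a: Sequence[Point], points_b: Sequence[Point]) -> float:
    """Root-mean-square distance between corresponding points."""
    squared = [
        sum((a - b) ** 2 for a, b in zip(p, q))
        for p, q in zip(points_a, points_b)
    ]
    return math.sqrt(sum(squared) / len(squared))


R = rotation_about_z(90.0)
T = (10.0, 0.0, 5.0)                                   # millimetres (assumed units)
image_markers = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
mapped = [apply_rigid(R, T, marker) for marker in image_markers]

# Hypothetical probe readings: each mapped marker is off by (0.3, 0, 0.4) mm.
probed = [(x + 0.3, y, z + 0.4) for x, y, z in mapped]
residual_mm = rms_distance(mapped, probed)
within_tolerance = residual_mm < 1.0
```

A residual measured at the markers is not the same as the error at the surgical target, so a check of this kind is a necessary indication rather than proof of adequate accuracy.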

Intraoperative guidance
After the scenes are registered to the patient, it is possible to display digitized models of the tools as part of the visualization, to provide these scenes in real-time and to provide guidance information to the surgeon. Other technical data such as numerical values of location, angles and forces can also be displayed. As an example, during a High Tibial Osteotomy (HTO), which involves the removal of a wedge of bone tissue, the position and angle of a drill used to insert guide wires into the leg can be tracked and errors displayed on the preoperative plan in real-time. The force applied by the drill can be monitored by measuring the current to determine torque.
The accurate display of surgical instrument models depends on the tracking system used.
There are two basic categories of instrument tracking: articulated arms and triangulated emitters Galloway et al. (1994).

Articulated arms
Articulated arms can resolve positions through the use of instrumented joints (potentiometers, optical encoders, or resolvers) and simple kinematics. The surgeon manipulates a surgical instrument that is attached to the end of the arm. The arm can also have other useful characteristics such as limiters that are set to respect the preoperative plan or brakes that keep the position of the instrument fixed for long periods.
Articulated arms have good accuracy but this accuracy is limited by uncertainties related to deformations, backlashes, alignment errors and the like. Articulated arm signals cannot be obscured. However, arms have other limitations: accuracy is inversely proportional to the size of the operating envelope, only one instrument can be attached to a single arm, and sterility is a problem Galloway et al. (1994).
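The kinematics involved can be sketched for a planar arm with two revolute joints. This is a simplification of a real surgical arm, and the link lengths, joint angles and the 0.1 degree encoder error are hypothetical. The sketch also shows why a larger operating envelope costs accuracy.

```python
import math
from typing import Sequence, Tuple


def tip_position(link_lengths_mm: Sequence[float], joint_angles_deg: Sequence[float]) -> Tuple[float, float]:
    """Forward kinematics: accumulate joint angles and add each link vector."""
    x = y = heading = 0.0
    for length, angle in zip(link_lengths_mm, joint_angles_deg):
        heading += math.radians(angle)
        x += length * math.cos(heading)
        y += length * math.sin(heading)
    return x, y


def tip_error_mm(link_lengths_mm: Sequence[float], joint_angles_deg: Sequence[float], base_error_deg: float) -> float:
    """Tip displacement caused by a reading error on the first (base) joint."""
    nominal = tip_position(link_lengths_mm, joint_angles_deg)
    perturbed_angles = [joint_angles_deg[0] + base_error_deg] + list(joint_angles_deg[1:])
    perturbed = tip_position(link_lengths_mm, perturbed_angles)
    return math.hypot(perturbed[0] - nominal[0], perturbed[1] - nominal[1])


angles = (90.0, -90.0)            # degrees, hypothetical pose
short_arm = (300.0, 200.0)        # link lengths in millimetres
long_arm = (600.0, 400.0)         # same geometry, twice the reach

tip = tip_position(short_arm, angles)
short_error = tip_error_mm(short_arm, angles, base_error_deg=0.1)
long_error = tip_error_mm(long_arm, angles, base_error_deg=0.1)
```

For the same angular error at the base joint, doubling the reach doubles the displacement at the tip. This is the sense in which the accuracy of an arm degrades as its operating envelope grows.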

Triangulated emitters
Three types of triangulated emitters are discussed in the literature, each having multiple implementations.
Ultrasound time-of-flight was one of the first methods used to triangulate instruments. This technology is simple and cheap to implement. A piezoelectric emitter on the instrument is synchronized with several receivers whose locations are known precisely. Time-of-flight of an ultrasound pulse provides distances between each receiver and the transmitter. Intersection of the spheres in space at the known radii provides the location of the instrument. The two main problems with ultrasound are that the speed of sound changes depending on the temperature and the signals can be obscured by the presence of surgical staff.
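The sphere-intersection step and the temperature sensitivity can be illustrated together. The sketch assumes four non-coplanar receivers, noise-free flight times and the common dry-air approximation c = 331.3 + 0.606 T m/s for the speed of sound. The receiver layout, the emitter position and all function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def speed_of_sound(temp_c: float) -> float:
    """Approximate speed of sound in dry air (m/s) at temp_c degrees Celsius."""
    return 331.3 + 0.606 * temp_c


def distance(p: Point, q: Point) -> float:
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def det3(m: Sequence[Sequence[float]]) -> float:
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def trilaterate(receivers: Sequence[Point], distances: Sequence[float]) -> Point:
    """Locate the emitter from 4 receivers by subtracting sphere equations."""
    r0, d0 = receivers[0], distances[0]
    norm0 = sum(c * c for c in r0)
    # Subtracting sphere 0 from sphere i gives one linear equation per receiver.
    a_rows: List[List[float]] = []
    b_vals: List[float] = []
    for r, d in zip(receivers[1:4], distances[1:4]):
        a_rows.append([2.0 * (r[k] - r0[k]) for k in range(3)])
        b_vals.append(d0 ** 2 - d ** 2 + sum(c * c for c in r) - norm0)
    determinant = det3(a_rows)   # zero if the receivers are coplanar
    solution = []
    for col in range(3):         # Cramer's rule on the 3 by 3 system
        replaced = [row[:] for row in a_rows]
        for row_index in range(3):
            replaced[row_index][col] = b_vals[row_index]
        solution.append(det3(replaced) / determinant)
    return (solution[0], solution[1], solution[2])


receivers = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]  # metres
emitter = (0.3, 0.4, 0.5)
flight_times = [distance(emitter, r) / speed_of_sound(20.0) for r in receivers]   # room at 20 C

located = trilaterate(receivers, [t * speed_of_sound(20.0) for t in flight_times])
drifted = trilaterate(receivers, [t * speed_of_sound(25.0) for t in flight_times])
drift_mm = 1000.0 * distance(drifted, emitter)
```

In this assumed setup, computing distances with a speed of sound that is wrong by only 5 degrees Celsius moves the located position by a few millimetres. That is large compared with the sub-millimetre accuracy sought in registration.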
Electromagnetic systems can also be used. Three emitters generate magnetic fields in orthogonal planes and the signals are captured by receivers. The main advantages of this technology are that the signals do not suffer from line-of-sight obscuration and they are not sensitive to temperature changes. It is also possible to attach such a device to an endoscope and locate the end position of these instruments inside the body. The main problems with this technology are the susceptibility to metallic objects in the operating theater as well as electromagnetic interference from other electrical equipment Galloway et al. (1994).
Passive and active optical tracking can also be used for triangulation. Queen's University uses an active system composed of Charge-Coupled Device (CCD) cameras in a linear arrangement and a series of Infrared Light Emitting Diodes (IREDs) mounted on surgical instruments. The IREDs on each instrument are mounted at specific locations and are sequentially pulsed at time-specific intervals. The position of an IRED can be located in less than 300 microseconds. This technology does, however, suffer from line-of-sight obscuration, which can be limited by careful placement of the cameras.
In terms of accuracy, the active optical tracking system is the most accurate, capable of locating instruments within +/-0.2 mm in a volume of (500 mm)^3 Galloway et al. (1994).

Postoperative followup
After the surgery the patient will be evaluated to determine surgical outcome and general well-being. Some of the postoperative assessments will be done without the use of the IIGS systems, using physician evaluations and/or patient questionnaires. Some evaluations will make use of a subset of the IIGS system such as the Medical Imaging and Visualization to compare preoperative and postoperative results. It can also be useful to compare the plan to the actual result. Such postoperative followups should be undertaken with care considering the additional radiation that the patient has to be subjected to.

Failures in IIGS systems
It is important to identify what is meant by a failure within the scope of Interactive Image-Guided Surgery. Apart from the obvious hardware and software failures that can cause a system to crash or malfunction, we must consider errors that may affect surgical outcome. A concept advanced by Schneider and Hines is that of patient vulnerability from data 7 provided by medical software Schneider & Hines (1990). In their paper, Schneider and Hines define patient safety as "freedom from harm by a medical device", while patient vulnerability is defined as "potential harm due to erroneous system output". During orthopaedic surgery involving IIGS, the decisions made by the surgeon depend in large measure on the IIGS system 8. Vulnerabilities in IIGS therefore come from the data that is provided in the form of rendered images, real-time display of surgical instruments within the model and numerical information. As indicated by Fitzpatrick, the knowledge of accuracy is as important as the accuracy itself Fitzpatrick et al. (1998), and both are crucial to managing patient vulnerability. Since geometric accuracy is linked with surgical outcome and the skeletal structure of the body is modified on the basis of this information, any errors in accuracy that can have an impact on patient vulnerability are potential hazards. Failures must be identified based on the severity and likelihood of hazards. Another problem of introducing complex medical devices is that the surgeon does not know what level of confidence he or she can have in them Fitzpatrick et al. (1998).

IIGS conclusion
Interactive Image-Guided Surgery includes a complex set of processes and interfaces which have traditionally been represented as a pipeline of operations. The image processing portion of the system can be represented with a pipe-and-filter software architecture. We argue that although this representation is well suited to an introductory discussion of IIGS, it is not sufficient to perform a safety analysis of IIGS. Throughout the discussion of the various phases of IIGS, there are many interfaces, pieces of equipment and interactions that are not visible in the pipeline architecture of the system. For example, the quality of the visualization depends on the selection of the operations applied to the medical images, the algorithm used to implement each operation, the ordering of the operations in the pipe-and-filter architecture and the skills of the operator performing the image transformations.
The potential sources of error from devices, processes and interfaces vary widely, and they can affect the quality of visualization, planning, registration, guidance and ultimately the surgical outcome. We must also identify unwanted disturbances and noise that interfere with the ideal operation of the IIGS system. In order to increase the completeness of the safety analysis, we must first identify and expose all system interfaces.

State of the art in the analysis and evaluation of IIGS systems
Before we describe our approach to safety analysis of IIGS systems, we present work that has been done in the area of analysis and evaluation of these systems. The intent is to show, through the various approaches taken, the complexity of studying these systems from a safety perspective.
Little in the published research is directly relevant to the topic of analysis, evaluation or validation and verification of IIGS systems. The literature, however, contains many papers on spatial fidelity and accuracy of visualization, segmentation and registration algorithms. The words verification and validation are often used by these papers, but they mainly refer to mathematical and procedural accuracy verification. Accuracy and spatial fidelity are crucial to IIGS systems and it is generally accepted that they can significantly affect surgical outcome. Although necessary to the conduct of safe IIGS procedures, accuracy and spatial fidelity are, however, not synonymous with system safety. Saying that software is correct and the calculations are accurate says nothing about possible omissions of safeguard requirements that could have made the system safer Gowen & Yap (1993).
Several papers have also been published on validation of user interfaces and clinical trials but none of the proposed methods or protocols are applicable or extendable to general clinical practice. We define general clinical practice as a hospital with no integral research and development department that would have knowledge of the internal design of the systems and could be called upon to operate IIGS technology.
In the following subsections we present several methods that have been used to perform some form of analysis, evaluation or validation activities for IIGS systems. At the end of each section, an evaluation of the scope, focus and aim of each method is given. The scope refers to the phase(s) that the method concentrates on: preoperative, intraoperative or postoperative. The focus localizes the area of interest or where the method is applied and can be one or more of: software, hardware, system, interfaces, patient, or surgeon. The aim identifies the goal of the method; the quality factor that is sought after: safety, reliability, usability, or effectiveness.

Clinical relevance
Two papers that closely address the validation of IIGS systems are from Verbeeck et al. (1995) and from Breeuwer et al. (1998). Both papers approach the validation of IIGS systems from the clinical evaluation perspective. Also, both papers recognize the necessity for verification of accuracy using phantoms. Breeuwer et al. also look at CAT and MRI image distortion and the impact of registration errors as part of their validation efforts.
In their paper, Verbeeck et al. establish some strong arguments for setting up a Clinical Relevance validation that in effect justifies or rejects the requirement for the system. The basic idea of Clinical Relevance validation is to use a set of questionnaires that ask the surgeon to evaluate the system based on a number of criteria which are rated on a three point scale. In their paper, they compare the results of an existing manual 9 method of performing stereotactic brain surgery and a new system for planning and guidance. The evaluation of the new IIGS system is done in two distinct phases, each phase including its own questionnaires.
The Functional Specification Phase uses prototyping and a functional specification questionnaire containing twelve questions to refine the system. Each question in this questionnaire has a weight factor, assigned by the surgeon, that depends on the patient's need for the new system, but does not depend on the technology. A second questionnaire that does not involve the patient and deals specifically with the surgeon's interaction with the new IIGS system contains ten questions without weight factors. This questionnaire deals mainly with the usability, usefulness and robustness of the new system from the surgeon's perspective.
The Clinical Acceptability Phase includes two questionnaires. The first questionnaire contains seven questions and compares the old and new IIGS systems as well as their defined medical protocols 10 . The questions deal with the clinical information that can be obtained from each system. The questions also evaluate the degree by which any additional information requirement from the new system outweighs the necessary activities to get this information.
Because the protocols differ and the information requirements are not the same for both systems, a "not available" score can be assigned to a question. Because the information needs within a protocol may vary in importance, weight factors are assigned to each question by the surgeon. The second questionnaire for this phase contains six questions and addresses the beneficial aspects of the new technology for the patient. Here the surgeon answers questions that deal with overall benefits, risk, comfort, time, and perceived accuracy of the systems.
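As a rough illustration of how weighted answers with a "not available" option could be combined, the sketch below computes a weighted mean over the answered questions only. The aggregation rule, the 1 to 3 coding of the three-point scale, and the names `Answer` and `weighted_score` are assumptions made for illustration; the questionnaires are described above without a scoring formula.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Answer:
    """One questionnaire answer (hypothetical structure)."""
    weight: float          # weight factor assigned by the surgeon
    score: Optional[int]   # 1..3 on the three-point scale, None = "not available"


def weighted_score(answers: Sequence[Answer]) -> Optional[float]:
    """Weighted mean of the answered questions; None if nothing was answered."""
    answered = [a for a in answers if a.score is not None]
    total_weight = sum(a.weight for a in answered)
    if total_weight == 0:
        return None
    return sum(a.weight * a.score for a in answered) / total_weight


# Invented answers: the old protocol cannot supply the second item at all.
old_system = [Answer(2.0, 2), Answer(1.0, None), Answer(1.0, 1)]
new_system = [Answer(2.0, 3), Answer(1.0, 2), Answer(1.0, 2)]

print("old:", weighted_score(old_system))
print("new:", weighted_score(new_system))
```

Excluding "not available" items from both numerator and denominator keeps a system from being penalized for information its protocol never requires, which is one possible reading of why such a score is allowed.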
Breeuwer et al. also use a questionnaire for assessment by the surgeons. The eight questions posed to the surgeon are on a visual analog scale scored from -5 to +5. This is similar to a Likert scale Likert (1932).
Both validations ask a question about the perceived safety from the surgeon's point of view following a procedure. Breeuwer also asks a question about the effect upon the confidence of the surgeon during the procedure.
The questionnaire method used by both research groups is an efficient instrument to evaluate a quality factor such as Clinical Relevance or to measure the confidence of the surgeon in the new technology. This, however, is not the appropriate kind of instrument to use to validate or verify system safety. Another weakness of questionnaires is that they explore known issues at the time the questionnaire was developed; safety critical systems require instruments that bring to light issues we did not think about "a priori". Also because of the small samples used in both studies, the issue of statistical significance is questionable. Finally, as seen from both papers, the questionnaires are quite short and contain no validating questions to ensure that answers are consistent.
Clinical relevance and effectiveness remain important aspects of introducing new technology in the operating theater and should be evaluated as part of a validation protocol. On the other hand, the use of system designers to operate the new IIGS system in the operating rooms is not the norm in clinical practice. System designers have a complete understanding of the underlying technology of the IIGS system compared to general clinical technical staff so the results obtained from clinical evaluation in a research setting may not be transferable to a general clinical setting. Clinical relevance covers the intraoperative phase and is surgeon centric. The study aims at relevance from a usability analysis perspective. Safety is mentioned but as a single indivisible global quality factor.

Human error analysis
A paper that describes another aspect of the state of the art in the analysis of IIGS systems is by Jiang et al. (1998). This paper is most closely related to our research. In their research, the authors apply a tailored Failure Mode Effect Analysis (FMEA) technique to study human errors in IIGS. Their research takes an approach similar to that of conventional safety analysis of safety critical systems Chudleigh et al. (1995); Fries et al. (1996); Gowen (1994). While the technique used by Jiang et al. has a lot of merit, their approach also has significant limitations. The aim of their research is to identify surgical errors by focussing on human factors and human errors that can occur in using software. Their analysis is limited to software component failure modes and human errors that could cause each software component to fail.
The authors concentrate on errors caused by the operation of the software, not on systematic faults. Because the computer-human interactions are part of the study, errors in interface design may be uncovered by this approach. To study human interaction errors that originate in design, however, the study should include the design of the human-machine interfaces.
There are several reasons why FMEA alone is not sufficient for this kind of analysis:
1. The underlying assumption made by the authors in using only FMEA is that there are only single points of failure in the system. FMEA is a stovepipe analysis that does not look at module interfaces or error propagation between modules. For example, if two human errors are made, one during segmentation and one during registration, the effect of one of the errors may not be significant enough to activate a protection measure, but the accrued errors could be surgically significant and not be discovered.
2. Because the analysis only studies the failures of the software modules that are in the current system architecture, and does not look at global system safety, there is a high probability of missing a safety requirement that could be solved by adding a new module or function such as a watchdog or built-in test.
3. Errors caused by hardware malfunctions may be overlooked if not anticipated as part of the failure modes of the software modules.
4. Because the technique used by the authors focuses on mistakes and failures, they do not analyze the effects of the new medical protocol as part of the safety analysis. There is no mention of modifying surgical gestures, system interfaces, human-machine interfaces, noise, or disturbances. This is important because some procedures change with the introduction of IIGS 11 .
5. System failures that are not caused by software failures or human errors may be overlooked by this approach. For example the effects of motion artifacts, deformation of tissue due to changes in patient position between the imaging and the operating room, and system latency during tracking, may be overlooked.
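The accumulation problem raised in the first point can be made concrete with a small worked example. The numbers, thresholds and names below are hypothetical and chosen only to show the effect: each step stays under a per-step alarm level, yet the worst-case sum exceeds a limit that would be surgically significant.

```python
# Hypothetical values in millimetres; none of them come from the cited studies.
STEP_ALARM_MM = 2.0       # a per-step safeguard triggers above this error
SURGICAL_LIMIT_MM = 2.5   # total error assumed to be surgically significant

step_errors_mm = {"segmentation": 1.2, "registration": 1.5}

# A single-point analysis looks at each step in isolation.
alarms = [name for name, err in step_errors_mm.items() if err > STEP_ALARM_MM]

# Worst case: both errors act in the same direction and simply add up.
worst_case_total_mm = sum(step_errors_mm.values())

print("per-step alarms raised:", alarms)
print("worst-case accumulated error (mm):", worst_case_total_mm)
print("surgically significant:", worst_case_total_mm > SURGICAL_LIMIT_MM)
```

No per-step alarm is raised, so an analysis restricted to single points of failure would not flag the case, while a system-level view of error propagation would.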
The analysis would be more complete if complemented with other safety analysis techniques. The authors, however, look at methods of prevention and protection against software failures and human errors by applying the concepts of FMEA to software. It is the only work of its kind that could be found in the open literature for the analysis of IIGS systems. Human-error analysis is software centric and includes preoperative and intraoperative phases, an improvement on other methods presented. The method aims at increasing safety by reducing failures induced by interaction with the software.

Component-based trusted architecture pattern
A recent paper suggests a drastically different approach to ensure safety of IIGS. The researchers claim that by using encapsulated Moore state machines and a "trusted architectural pattern of components" they can guarantee safety and reliability in the system Gary et al. (2006). They base this claim on the hypothesis that "state machines ensure that component behavior is deterministic and that all components are in a known and error-free state at any given moment." The paper also claims that the approach can predictably integrate third-party software by bounding these untrusted components' behavior using state machine wrappers.
The approach gives little consideration to the fact that the IIGS software interacts with a physical environment. The techniques they suggest do not consider safety hazards that are not related to the software and for which software design could provide mitigation and/or removal. This does not follow sound safety design principles elaborated in countless references Chudleigh et al. (1995); Elliott et al. (1994); Fei et al. (2001); Gowen (1995); Gowen & Collofello (1994); Halang et al. (1998); Jiang et al. (1998); Jones et al. (2002); Knight (1990); Leffingwell & Norman (1993); Leveson (1995). Leveson states that "...the quality of a safety program is measured by its ability to influence design" Leveson (1995); this ability comes in the form of safeguard requirements that come from a safety analysis. These safeguards would be missed if we only use the trusted architecture paradigm as proposed by the authors.
By their own admission, communication with subject matter experts, such as surgeons, to confirm requirements is difficult. In their research, the authors used Unified Modeling Language (UML) activity diagrams to show the process of state transitions and the safe behavior of the software.
The approach is entirely software centric and specifically addresses only the intraoperative phase of IIGS. The Component-Based Trusted Architecture Pattern is a sound design approach that may produce reliable 12 software, not safe systems.

Conclusion on state-of-the-art
The analyses and evaluations of IIGS systems discussed above focus on narrow concerns and isolated parts of larger systems. This aspect of IIGS research is indeed in its infancy. There are several reasons for this lack of maturity. The complexity of the systems makes verification and validation activities difficult to identify and perform. Due to the nature of these interactive systems, with a person in the loop, validation efforts are complex and difficult to express. Our research aims to increase the expressiveness of models in the conduct of safety analysis for these systems. By increasing the expressiveness of the models we will better communicate with domain experts. Our aim is also to develop a system centric focus including pre, intra and post operative phases in our modeling and safety analysis. This allows a more complete analysis at the system level.

Abstract process safety analysis model
In this section, we present our protocol and methods used to perform safety analysis on IIGS systems. At the core of the protocol is our diagramming technique which is part of the Abstract Process Safety Analysis Model (APSAM).
In traditional digital control theory, the controllers (computer control systems) are generally composed of five classes of components Dunn (2003). These components, illustrated in Figure 2, are:
1. The application (process or plant), the physical entity that the control system monitors and controls.
2. The sensors, which convert the application's measured/observed physical properties into corresponding electrical signals that will be inputs for the computer system.
3. The effectors, which convert electrical signals from the computer's outputs to corresponding physical actions that control an application's functions.
4. The operator, the human(s) who monitor or activate the computer system in real-time (or near real-time).
5. The computer/controller, a composition of software and hardware that uses the sensors and the effectors to monitor and control the application.

Fig. 2. Traditional Digital Control System Components
A simple control system view of the guidance phase of an IIGS system is presented in Figure 3 13 . There are several elements that make this representation different from traditional digital control theory as displayed in Figure 2. Some of the five components listed above change between the physical and virtual realms. The controller is external to the computer. The process under control (application) is in a virtual environment. The controlled variable is the accurate placement of surgical instruments in a human body. The surgeon, as the controller, has limited observability 14 of the controlled variable in the physical domain. The observable behavior is the calculated position of surgical instrument models superimposed on the virtual environment. What is more interesting is that some of the views provided by the system do not exist in the physical realm. The surgeon controls the calculated position of the instruments in a computer generated model. The effectors are an interactive system involving surgeon, display and surgical instruments. This is the inverse of what is normally considered a classical digital control problem, where the controller implements a control strategy through a series of specification functions and the process' observable behavior is in the physical environment.
The model of Figure 3 only represents the guidance phase of IIGS and must be augmented to include other phases, input variables, noise and disturbances. This will give us a system centric view of the safety problem.
One important aspect of this process control model is that it is independent of the software architecture of a particular system. Another added benefit of using process control diagrams in representing a virtual process is that the physical and conceptual (virtual) components, such as software supported processes and protocols, can both be represented in the same abstraction and be analyzed as part of a complete system. This visual representation of all components is critical to the complete description and analysis of the system. It allows for a more complete hazard analysis that is not compartmentalized. Contributing factors to accuracy errors, as well as the propagation of these errors through the system, will be more visible.

Process control model structure diagram
In this subsection we discuss the modeling diagram used as part of the Abstract Process Safety Analysis Model. The structure diagram is used to model the process of IIGS and will be used to guide the safety analysis performed at the lower layers of the architecture.
The structure diagram is also used to display the current status of the system analysis based on the collation of safety analysis information. Each element of the Process Control Model Structure Diagram is to be coloured with the status information and the safety risk will be recorded numerically with each symbol on the structure diagram.
We first highlight the diagrammatic considerations that are sufficient to describe and substantiate our approach and to model the structure of IIGS systems as Process Control Models. We then introduce the syntax and notation used to model the structure of Process Control Models such as IIGS systems.

Diagrammatic modeling
In simple cases, a diagrammatic representation can model an idea to fully meet a given purpose: studying spatial relations, or modeling the relationships in an organization or a system's structure (e.g. the command relationship in an organization or the floor plan of an office). For such simple cases, one item on the diagram fully represents a sufficient abstraction of this item from the physical world. For the case where the entire abstraction of the system can be represented on a diagram, we intend to capture, represent and analyze structure only. For this kind of analysis, we simply need to draw a diagram with components having recognizable and meaningful symbols (notation), rules for connecting the symbols (syntax) and appropriate naming labels or markings. There is no need for further meaning (semantics). We say that these diagrams have "shallow semantics" 15 because the entire design or idea is conveyed in one level of abstraction and no further semantics are required.
We use shallow semantics to convey the idea of representing the physical and virtual characteristics of a system on a single layer diagram of components and connectors. By comparison, deep semantics conveys the idea of adding layers to the model and adding meaning to describe the safety properties of the system being analyzed.
Process control modeling is the kind of problem domain that requires precise syntax and deep semantics. If we want to model the process control of a hydraulic turret in a missile defence system, and analyze it for safety we need to augment the turret's structure diagram with safety analysis techniques such as Failure Mode Effect Analysis (FMEA) or Fault Tree Analysis (FTA) IEC (1985;1990); Leveson (1995). In a simplistic way, the structure diagram then describes the structure of physical components of the model such as hoses, valves, transducers, flow direction and pressure. The structure diagram is only part of the entire model. When we augment the structure diagram with mathematical models, analytical methods and information we further increase the semantics of the analysis.
The approach that we have taken to model IIGS systems to conduct safety analysis reflects the approach taken in process control modeling such as the one described in the paragraph above.

Design of the APSAM syntactic types
In this section, we describe the notation and syntax that will be used to create the APSAM structure diagram.

Types
Syntactic types represent the different classes of components and connectors on a diagram and help to distinguish between elements in a system Dean & Cordy (1995). Syntactic types are similar to abstract data types in programming languages. They have shallow semantics and must be further detailed during instantiation as typed nodes Dean & Cordy (1995) onto the Process Control Model structure diagram (subsection 5.3.4). The set of Types for components is displayed in Figure 4.

Component types
• Real-Time Physical Processes are those processes that change the physical environment and which have observe-and-react timing requirements that suffer from latency which can affect overall system performance and/or surgical outcome. Such processes include the movement of effectors, changes to model data and direct observation of the physical environment.
• Real-Time Virtual Processes are processes that change the virtual environment, may be subject to latency effects and may also contribute to these effects by means of computation delays.
• Near Real-Time Physical Processes must provide good quality of service in response time, but these processes can be interrupted without jeopardizing system outcome. It should be understood that the interruption should not be excessive to the point where the person in the loop forgets where he was in the process of transformation. Near Real-Time processes could have been called quality of service time as well because they closely characterize the timing requirements.
• Near Real-Time Virtual Processes must provide a good quality of service in response time to the person in the loop. These processes produce the input necessary to the conduct of the Real-Time processes (both virtual and physical).
• Sensors provide estimates of physical data for use by the virtual environment. They increase the visibility onto the physical world provided by the virtual environment.
Effectors are physical components that are moved or used in the physical environment. Digital models of effectors are created to show their location in the virtual environment. We use the same syntactic type to represent sensors and effectors since labels can provide enough characterization without adding to the notation. Prostheses are also modeled using this syntactic type.
• Disturbances are unwanted input to the system that may change physical and virtual processes. They can cause deviation from the control policy as well as reduce visibility and controllability in the system. Disturbances can come from various sources including but not limited to: electronic noise, errors in images or digitization, motion artifacts, light noise (infrared camera interference) and obscuration.
• Person is any human participant in the process. Only those people who are directly involved in the process and can affect safety issues should be modeled.
• Labels are used to identify all components and provide a role for the component as part of the system.
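The component types listed above can be written down as a small type catalogue. This is a minimal sketch, not the notation itself: the names `ComponentType`, `Component` and `is_real_time` are hypothetical, labels are modeled as a plain attribute rather than as a separate symbol, and the example labels are invented.

```python
from dataclasses import dataclass
from enum import Enum


class ComponentType(Enum):
    REAL_TIME_PHYSICAL_PROCESS = "real-time physical process"
    REAL_TIME_VIRTUAL_PROCESS = "real-time virtual process"
    NEAR_REAL_TIME_PHYSICAL_PROCESS = "near real-time physical process"
    NEAR_REAL_TIME_VIRTUAL_PROCESS = "near real-time virtual process"
    SENSOR_OR_EFFECTOR = "sensor/effector"  # one shared type, also used for prostheses
    DISTURBANCE = "disturbance"
    PERSON = "person"


# Types described above as real-time, where latency can affect the outcome.
REAL_TIME_TYPES = {
    ComponentType.REAL_TIME_PHYSICAL_PROCESS,
    ComponentType.REAL_TIME_VIRTUAL_PROCESS,
}


@dataclass(frozen=True)
class Component:
    ctype: ComponentType
    label: str  # identity and role of the component in the system

    @property
    def is_real_time(self) -> bool:
        return self.ctype in REAL_TIME_TYPES


# Invented example instances, for illustration only.
components = [
    Component(ComponentType.PERSON, "Surgeon"),
    Component(ComponentType.SENSOR_OR_EFFECTOR, "Tracked drill"),
    Component(ComponentType.REAL_TIME_VIRTUAL_PROCESS, "Instrument position display"),
    Component(ComponentType.NEAR_REAL_TIME_VIRTUAL_PROCESS, "Image segmentation"),
]

print([c.label for c in components if c.is_real_time])
```

Keeping sensors, effectors and prostheses under one type mirrors the choice described above to let labels carry the distinction instead of adding notation.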

Connector types
The set of types for connectors is displayed in Figure 5.
• The interface connector represents the unidirectional interface between two components. The interface may be physical or a flow of information. The flow of information is in the form of an analog or digital stream and the information is not retained or saved between the two components, it is not persistent. If one of the components is a process and it stops, the stream of information is interrupted.
• The information flow with memory is a unidirectional connector that has the responsibility to transfer data between components and provide persistence between successive executions of the processes. The round end of the connector is at the source of the information transfer. The arrow is at the sink end. There is no need to identify on the structure diagram where the information is persistent. This can happen at the source, sink, at both ends or in the middle. The importance of this connector is to highlight limitations and requirements for freshness of information, security, mishandling, misplacement, stable storage, recovery, duplication, bandwidth, coherence and transport. The issues listed here are unique to this connector because of possible time lag affecting information flow with memory.
• The interactive connector identifies the interfaces in the system that require continuous flow of information. It has significant symbiotic relationships between components.
• The summation point is a junction for connectors. The summation point is the only element on the APSAM structure diagram that is not part of the safety analysis. Its only purpose is to help clarify the diagram.
• Labels are used to identify the interface or relationship between components.
An important distinction to be made is for the Disturbance component type. In classical process control modeling (Figure 2), input variables, set points and disturbances are normally represented as arrows that come from nowhere and point to where they induce an effect. When we developed the notation we wanted to differentiate between a connector that represents a flow of information and one that represents a disturbance, which is an unwanted manifestation representing uncontrollable effects that come from the environment. Our choice was to show such phenomena as a black dot representing the environment with an arrow that attaches to the model at the point of induction. Disturbances can connect to components to show that the effect is specific or internal to the component, or to connectors to represent that the disturbance influences the flow of information or signal between components. Disturbances that can occur at the interface between the connector and the component are attached to the connector.

Process control model structure diagram
The Process Control Model Structure Diagram, or simply structure diagram, is a concept for the safety analysis expert to instantiate syntactic types to model system structure and behavior Dean & Cordy (1995). The modeler is responsible for providing each instantiated type with an identity in the form of a label.
Connectors are instantiated as first class objects in the model Shaw & Garlan (1996). Connectors as first class objects convey the idea that they not only show relationships on a diagram between two components, but that they have deeper semantics similar to that of components. The analyst uses connectors to represent relationships between components covering all meaningful system interfaces. Some interfaces need not be represented as they have no impact on the safety analysis. These interfaces could be represented on a separate view for completeness.
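One way to picture connectors as first class objects is to store them as labeled, typed instances next to the components, so that both can carry their own analysis records. The sketch below is an assumption-laden illustration: the class and method names are hypothetical, summation points are left out, and the example labels are invented.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class ConnectorType(Enum):
    INTERFACE = "interface"                            # unidirectional, not persistent
    FLOW_WITH_MEMORY = "information flow with memory"  # unidirectional, persistent
    INTERACTIVE = "interactive"                        # continuous flow of information


@dataclass(frozen=True)
class Component:
    label: str
    kind: str  # component type as free text in this sketch


@dataclass(frozen=True)
class Connector:
    label: str
    ctype: ConnectorType
    source: Component  # for interactive connectors these are simply the two ends
    sink: Component


class StructureDiagram:
    def __init__(self) -> None:
        self.components: List[Component] = []
        self.connectors: List[Connector] = []

    def add_component(self, label: str, kind: str) -> Component:
        component = Component(label, kind)
        self.components.append(component)
        return component

    def connect(self, label: str, ctype: ConnectorType,
                source: Component, sink: Component) -> Connector:
        # A connector may only join components that are already on the diagram.
        for end in (source, sink):
            if end not in self.components:
                raise ValueError(f"unknown component: {end.label}")
        connector = Connector(label, ctype, source, sink)
        self.connectors.append(connector)
        return connector

    def analysis_elements(self) -> List[str]:
        """Labels of every element that receives its own safety analysis."""
        return [c.label for c in self.components] + [c.label for c in self.connectors]


diagram = StructureDiagram()
surgeon = diagram.add_component("Surgeon", "person")
drill = diagram.add_component("Tracked drill", "sensor/effector")
display = diagram.add_component("Guidance display", "real-time virtual process")
diagram.connect("Tool handling", ConnectorType.INTERACTIVE, surgeon, drill)
diagram.connect("Tool pose", ConnectorType.INTERFACE, drill, display)

print(diagram.analysis_elements())
```

Because connectors appear in `analysis_elements` alongside components, an analysis that iterates over the diagram cannot silently skip the interfaces.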

Results
We present some of the most interesting or significant findings of the application of the safety analysis protocol. The findings that were generated during the iterative elaboration of the analysis methods, design and the related research activities were key to the development of the protocol and conceptual architecture as well as in the identification of information requirements and analysis forms.
The APSAM structure diagram we elaborated for our orthopedic systems contains 28 Components and 29 Connectors. Each was validated by the computer system specialists who designed parts of the IIGS systems that were used during the analysis.
After the APSAM structure diagram was completed, a System Hazard Analysis (SHA) Sha (1999) was performed for each component and connector. The SHA identified 57 Composite Hazards. Each hazard was identified together with the system expert who validated it. 33 of the composite hazards were identified as having propagation effects. The propagation effects were also described as part of the analysis and are recorded in an integral database. The Fault Tree Analysis (FTA) IEC (1990) that followed the SHA identified 139 Hazardous conditions (leaves). 49 of the FTA leaves were mitigated by competent human intervention. The analysis revealed only 7 uncontrolled or unresolved conditions. These uncontrolled or unresolved conditions do not imply that the system is unsafe, but the experts believe that there is a need to further investigate these hazardous conditions. As part of the analysis we identified 31 Safeguard requirements; most were already current practice but were not documented.
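To show what counting FTA leaves and their mitigation status can look like, the following sketch builds a very small fault tree and lists the leaves that have no recorded mitigation. The tree, the event names and the `Event` class are hypothetical and are not taken from the analysis database.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Event:
    name: str
    gate: Optional[str] = None        # "AND" or "OR" for intermediate events
    children: List["Event"] = field(default_factory=list)
    mitigation: Optional[str] = None  # only meaningful on leaves

    def leaves(self) -> Iterator["Event"]:
        """Yield the hazardous conditions at the bottom of the tree."""
        if not self.children:
            yield self
        else:
            for child in self.children:
                yield from child.leaves()


# Invented example: a hazard with one direct cause and one combined cause.
top = Event("Instrument shown at wrong position", gate="OR", children=[
    Event("Registration fails", mitigation="competent human intervention"),
    Event("Tracking degraded unnoticed", gate="AND", children=[
        Event("Marker partially obscured", mitigation="competent human intervention"),
        Event("No visibility warning displayed"),
    ]),
])

all_leaves = list(top.leaves())
unresolved = [leaf.name for leaf in all_leaves if leaf.mitigation is None]

print("leaves:", len(all_leaves))
print("unresolved:", unresolved)
```

Leaves without a mitigation entry are the ones that would be flagged for further investigation in an analysis of this kind.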
The findings below were elicited from various sources. They stem from direct observation during surgeries and observed surgical gestures of the physicians while using the IIGS systems for the surgeries; some of which are discussed in this section. Other findings came from the professional judgment and observations of the attending physicians during surgeries which were recorded.
The physicians would often address the clinical research scientist who operated the computers during the surgery on the operation of the system, limitations and improvements. Interviews with computer system specialists and a clinical research scientist were used to get the more technical information. The safety analysis protocol and the prototype of the architecture as well as the Process Control Model Structure Diagram were used to guide and record the iterative elicitation of hazards and their underlying hazardous conditions.
The most important finding is that there is a requirement for further research to evaluate the person machine interfaces in the system to better present information and reduce the loss of situational awareness. There are four different kinds of information that can be presented to the surgeon: anatomic and morphologic information through visualization of medical images, visual tracking, medical plans that show desired corrections and implants, and technical data which provides accuracy, measurements, depth, applied force and the like. Surgeons sometimes temporarily lose situational awareness during computer-guided surgery. This loss of situational awareness can occur if the presentation of the view is different from the normal physical views. Loss of situational awareness can also occur if the virtual surgical site is not the one the surgeon recognizes from the plan. In some cases the temporary loss of situational awareness can occur if the image was not rotated in the right direction or the right plane by the computer operator. Zooming errors can also cause delays. Through all the surgeries that were witnessed for this research, the surgeons regained situational awareness in less than a minute. However, the temporary confusion could lead to longer procedures or loss of confidence in the tool in general clinical practice.
Complete identification of single points of failure is also very important. In some safety critical systems, if you cannot eliminate or mitigate single points of failure, your system is deemed unsafe and you must redesign or add redundant systems Dunn (2003). For IIGS we can allow the presence of single points of failure or degradation 16 in the system as long as competent human intervention can be used to mitigate the risk. That requires that there is visibility (or observability) of the failure. For example if a system fails to register, it represents a hazard, but the likelihood of occurrence is low and criticality is low because the surgeon can detect this condition in the large majority of cases and bail out of using the system.
Another important finding that quickly came to the forefront is that our system level approach encouraged thinking in terms of component and interface modifications to introduce safeguards or risk reduction mechanisms for introduced and attendant hazards. This system level approach makes visible potential hazards and ways to remove or mitigate those hazards.
By analyzing the leaves of the FTA trees we found that there are three main mitigation methods that can be applied to IIGS systems:
• Competent Human Intervention: Competent human intervention was very useful in providing mitigation information for hazardous conditions where the surgeon can use empirical geometric rules to decide if the procedure is still within operating ranges. The idea to create the state-based representation of competent human intervention came from observations in the operating room during several surgeries.
• Phantom Tests: The use of tests with phantoms can be used as a mitigation factor where lack of accuracy or spatial fidelity constitute a hazard. However, motion artifacts and the resulting propagation effects cannot be mitigated with phantom tests. Motion artifacts need to be further investigated to quantify their effects on IIGS. Articulated cadavers could be used to simulate tensing and patient movements during imaging. The movements could be measured mechanically and the effects on motion artifacts quantified on the resulting images.
• Code Inspections: Fagan style code inspection or some more agile inspection methods can be used to review parameters and transformation in the code of the IIGS systems. Pair programming or peer review could be used if the resources are not available. The calibration criteria for tracking cameras are set by programmers as are parameter files, marker types, lens models, interface packing routines, cable types and many other settings. The processing of tracking raw data from the 3-D cameras can also lead to inaccuracies. These settings, calibration data and calculations using the data to provide the registration and guidance information are not visible to the operator or the surgeon. Code and data inspections can be useful to ensure correct values are used.

Conclusion and future work
The research aimed at improving several attributes while performing safety analysis for IIGS systems. The safety analysis using the prototype implementation of the protocol provided us with the ability to validate traceability and visibility attributes. The notation used with the prototype helped us demonstrate the expressiveness of the APSAM. The maintainability attribute is validated by the integration of the safety analysis methods under a single high-level abstraction, the APSAM structure diagram and a repository that enforces relational integrity. Maintainability was identified as one of the main benefits of the protocol and its supporting conceptual architecture.
Finally the validation of the IIGS APSAM for orthopedic surgery with domain experts using the prototype helps us support the conclusion that the completeness of the safety analysis has been increased from what it would have been using the baseline methods for comparison described in Section 4.
We can also support the conclusion that the scale of rigor provided by the APSAM and its safety analysis protocol provides the necessary due care for safety analysis of such systems where competent human intervention is present. The traceability and visibility of the hazards and their resolution should be sufficient to obtain Food and Drug Administration market clearance for similar systems. The level of the analysis also allows for efficient use of domain experts.
Future research is required to apply the process control model, including the APSAM, to other interactive systems where the operator or expert makes control decisions based on a virtual representation of the environment. Much work remains to be done to quantify system latency and its effects on safety. This quantification of latency is key when considering systems that warn surgeons of impending surgical errors, such as the one proposed by Luz et al. (2010).
Recently, a specification for an extension to the Unified Modeling Language called the Systems Modeling Language, or SysML, was published under the control of the Object Management Group Team (2006). The intent of the specification is to retain the main structure and components of UML 2.0 while supporting system engineering domain modeling. Much of SysML makes use of new stereotypes, but some interesting extensions have been proposed because UML was too restrictive to represent some system engineering concepts. Safety has been considered in this proposed standard in that safety requirements can be specified as stereotyped requirement classes. There are no other syntactic elements or semantics to further model these system safety requirement classes or to drive and track a system safety analysis. SysML is likely to become a standard for system engineering modeling, much as UML 2.0 is used to model software. An interesting future research direction would be to propose an extension to SysML based on the APSAM and implemented on Eclipse.
The generic notation and syntax, as well as the protocol, that have been produced as part of this research are not limited to IIGS systems. Applying the model to other safety-critical system domains that have abstract interactive processes and a person in the loop is certainly possible and should be investigated. Any system that provides estimated information for decision making in a critical environment could benefit from an analysis conducted with the APSAM and its protocol.