1.1. Current challenges in surgical training
The introduction of virtual reality and simulation into the world of surgery came about for a number of reasons. The most important was the adoption of laparoscopy, which was first introduced in the 1990s. The advent of laparoscopy and minimally invasive procedures has created a need for novel ways of learning surgical skills. The skill set required in laparoscopy is very different from that of conventional open surgery. The surgeon must cope with a lack of tactile feedback, the need for precise hand-eye coordination, a change from 3D to 2D visualisation and adaptation to the fulcrum effect [2, 3]. This skill set cannot be taught easily in the real-life environment under the supervision of a senior surgeon. With the traditional open approach, the supervising surgeon can directly guide the hands of the trainee and immediately intervene if a problem or difficulty arises. In laparoscopy, however, the expert surgeon has less control over what the trainee is doing. If a complication were to arise during the course of the surgery, it would be more difficult for the expert surgeon to intervene and rectify the situation. The same can be said for endovascular procedures and endoscopy. The learning curve is also steeper in minimally invasive procedures than in open surgery, as trainees must not only learn new technology and overcome obstacles like the fulcrum effect but also have a good fundamental ability for these procedures.
The early part of the learning curve is associated with a higher complication rate. It is therefore intuitive that familiarity with surgical procedures should be acquired outside the surgical environment in order to improve patient safety. Laparoscopic cholecystectomy was the index procedure for laparoscopy. Although it was embraced with vigour, it was also the procedure in which problems and concerns with the minimally invasive approach were first highlighted. A higher than acceptable rate of bile duct injury in laparoscopic cholecystectomy compared with open cholecystectomy became an important issue in the 1990s.
The Southern Surgeons Club study is an oft-referenced paper. The authors found that 90% of common bile duct injuries occurred within the first 30 operations performed by the trainee surgeon. They also predicted that the surgeon had a 1.7% probability of causing a bile duct injury in their first operation, which reduced to 0.17% by the 50th case. The probability of injury was found to have dropped to a significantly safer level by the 10th case. This was one of the first articles to underline the significance of the learning curve in minimally invasive surgery.
As the complexity of the procedure increases, so too does the learning curve. This has been demonstrated for laparoscopic fundoplication, where a significant reduction in complications has been reported only after the 50th case, with the highest complication rate found within the first 20 cases. The learning curve for laparoscopic colectomy has been estimated to be even longer, with the highest rate of complications occurring during the first forty procedures. The initial learning curve has been shown to be associated with the period of greatest risk to the patient.
For these reasons the surgical community looked to virtual reality as a way of bridging this skill gap and providing a method of safely introducing new techniques into surgical practice.
However, these are not the only challenges surgical training has faced in recent times. Economic factors also affect training structures. The length of elective waiting lists and time pressures in the operating theatre play an important role in the amount of operative experience a trainee now receives. Coupled with the increasing cost of new operative technologies and instrumentation and the global economic recession, the financial constraints on supervising surgeons are greater than before.
There has also been a change in the expectations of patients and the population as a whole. This has resulted from high-profile medico-legal cases such as the ‘Bristol Case’ in the British Isles and from the publication of the ‘To Err is Human’ report in the US. Both have brought medical errors and the quality of surgical training to the forefront.
The last decade has seen considerable changes in the structure of healthcare delivery. There has been a steady shift towards a consultant-based service with reduced service activity by trainees. This shift is set to continue into the future. One factor propelling a consultant-led service is the fact that the complexity of surgery is increasing. This is due to advancements in the minimally invasive approach but it is also due to improvements in critical care services allowing increasing numbers of elderly and sicker patients to be operated on.
Fatigue due to an excessive workload and long working hours first became recognised as a significant problem following the landmark Libby Zion case. It has been demonstrated that fatigue is directly linked to medical error and a reduction in clinical performance. Fatigue has also been shown to affect mood and psychomotor performance.
The European Working Time Directive (EWTD) is a piece of legislation that requires doctors to work fewer hours per week (www.doh.ie). The aim of reducing hours worked was to ensure that patients receive high-quality and safe care. The EWTD should have been fully implemented in the Irish healthcare system from 2004. It has been rolled out to a certain extent but as of 2012 has not reached the prescribed targets.
With the introduction of the Calman reforms in the UK in 1993 and the implementation of the European Working Time Directive, the surgical community has been forced to debate how best to train junior surgeons in a shortened period of time. Although the proposed reforms have received a cautious welcome from the medical community, there are significant worries about the impact of shortening the training time on trainees' experience.
All of these challenges pose a problem for trainees, as they are no longer getting one-to-one teaching, nor are they getting enough operative exposure. The traditional Halstedian method therefore no longer applies. However, most teaching in Ireland continues along this traditional apprenticeship route, where trainees are exposed to surgical procedures with the guidance of an experienced teacher. As a result, teaching is quite unstructured and is very much dictated by the location of the hospital, the caseload, case variation and the enthusiasm of the supervising surgeon for teaching. Furthermore, the current training paradigm lacks objective feedback on trainee performance. The current structure is unlikely to change, and therefore the approach to training must.
Training needs to be done in a more efficient manner to optimise the learning experience and surgical exposure of the trainee, as well as to address the challenges we face in adapting to the new skill set required for minimally invasive procedures. Virtual reality can offer the surgical community a solution.
1.2. How virtual reality offers a solution
In the past decade, various academic medical institutions have set up simulation laboratories. In the US, the American College of Surgeons has a specific training program for residents and has also introduced an accreditation process under which a number of institutions across the US and the UK have been accredited. Ireland has followed suit by developing a surgical simulation laboratory in the Royal College of Surgeons, Dublin. A simulation laboratory is a space designated for trainees to practise various skills and procedures on a wide variety of available surgical simulators in a safe, controlled environment. Dedicated time is necessary in order to learn the required skills in a protected manner.
Surgical skill can be learnt very effectively on simulators. Simulation has much more to offer the trainee than the clinical environment alone, as it allows for dedicated teaching which is focused and structured with specific learning goals. By mastering skills such as hand-eye coordination, counter-intuitive fine movements and the ability to work with a 2D image in a 3D space on a simulator, the trainee surgeon can then focus on the critical steps of the operation when in the operating theatre, instead of trying to learn every aspect of a new skill set at once. One of the difficulties with acquiring these skills is the fulcrum effect of the body wall on instrumentation. This problem cannot be overcome with concentration; it requires practice until the process becomes automated.
Traditionally the skills required for minimally invasive surgery (MIS) have been attained in the operating theatre. It is here that the steepest part of the learning curve has been negotiated. This situation is clearly not ideal for either the patient or the trainee. However, the introduction of simulation to MIS has helped to address these issues. Simulation provides a safe environment for trainees to overcome the initial learning curve, including the visuospatial, perceptual and psychomotor difficulties associated with minimally invasive techniques.
1.3. Background into simulation
Simulation has its roots in the commercial and military aviation industry. It was first considered in 1910, when student pilots trained in land-borne aircraft with reduced wingspans. The first rudimentary simulator was available in 1929 and was known as the Link Trainer. It consisted of a wooden fuselage mounted on an air bellows, which was able to represent the movements involved in flight. This allowed the pilot to train for hours. In 1934 the US purchased six Link simulators following a series of aviation accidents. At the time it was recognised that the existing training programs were inadequate and that simulation was a step towards improving the training system. World War II also had a dramatic impact on the uptake of simulation for training purposes. The war demanded that a greater number of pilots be trained and that they become proficient in skills such as instrument or blind flying. These factors led to simulator development and usage. Today, there are hugely sophisticated systems which replicate an aircraft environment precisely and can deal with a vast range of potential flight scenarios. Pilots must undergo ongoing annual training, entitled “checking out”, by the Federal Aviation Administration in order to ensure ongoing certification, and must meet additional training requirements if they wish to change to another aircraft. Astronauts are also required to follow similar procedures.
The first surgical simulator to use virtual reality technology was created at NASA by Rosen and Delp. It was an orthopaedic lower limb model that simulated tendon transfer. It was unique in that it allowed planning and therefore optimisation of operations. Virtual reality technology has evolved to the point today where actual patient data and radiological images can be loaded into the simulator, allowing for a complete simulated run-through before operating on the patient, a process known as mission rehearsal.
The aviation industry paved the way for simulation, so we often look to their methods for guidance. However simulating the human body is a much more complex and unpredictable task and often we can only get close to elements of surgery rather than replicating it completely. If we take a look at the existing simulators today we notice they mainly involve the simulation of machinery (airplane, car, train, truck, bus, space vehicles). All these are perfect for reliable repetition of conditions and interface. The purpose is generally to allow the user to practice their skill in a controlled environment, with the additional benefit of having ‘metrics’ or computerised feedback on their performance.
There is an abundance of technology available today to help simulate these situations, so why are some areas better represented by simulation than others? As a general guide, if we made the object being simulated, then we can generally do a good job of simulating it. In flight simulation, airplanes are manmade so engineers understand every part of the airplane and its interaction with the physical environment. We can take the input from real (manmade) aeronautical instruments and interpret them perfectly. Flight simulators do not need to be any better than they are at present. They provide an excellent simulation of real flight. On the other hand, we have surgical simulation. We are trying to simulate interaction with the most complex system we know – the human body, coupled with trying to interpret the movement of laparoscopic instruments and the human hands. This is more challenging to the point that we may be generations away from being as satisfied with simulation as an optimal teaching tool when compared to aviation.
2. Virtual reality surgical trainers and types of assessment
2.1. VR Trainers
VR Trainers digitally recreate the procedures and environment of laparoscopy. The term “virtual reality” was coined in the 1980s by Jaron Lanier, a philosopher and scientist. It describes the concept of a virtual world which supports interaction, rather than something that is passively visualised.
Here are some of the presently available commercial VR trainers:
LapSim (Surgical Science, Sweden): This system has practice sessions which can vary in complexity. Modules include basic laparoscopic skills, cholecystectomy, appendicectomy, suturing, anastomosis and laparoscopic gynaecological procedures. The metrics are specific to the task being performed. Time, instrument path length and procedure specific errors are measured. LapSim has been evaluated in many studies and construct validity has been established [11, 12].
LapMentor (Simbionix, USA): This has many modules including basic skills (camera navigation, clip applying, 2-handed maneuvers, hand-eye coordination drills, cutting, object translocation and suturing), laparoscopic cholecystectomy, laparoscopic ventral hernia repair, laparoscopic gastric bypass, laparoscopic nephrectomy, laparoscopic sigmoidectomy and a variety of laparoscopic gynaecological procedures. There are also several other platforms including URO mentor (urologic procedures), PERC mentor (percutaneous interventions), ANGIO mentor (catheter based interventions) and GI mentor (endoscopy). Metrics measured include time, economy of movement, safety and electrosurgical dissection, procedural errors, and procedure specific checklist items relating to knowledge of the procedure and handling of instruments. Studies have established its validity [13, 14].
MIST-VR (Mentice, Sweden): The Minimally Invasive Surgical Trainer-Virtual Reality facilitates basic laparoscopy using two instrument handles, a computer, monitor and a foot pedal. Metrics measured include tool to tool contact, loss of tissue-tool contact, inappropriate “passing of the point” of the instrument through the tissue, inappropriate target release, inappropriate cautery application and economy of movement. Mentice also make the Procedicus VIST, which simulates catheter based interventions, and the Procedicus COREP, which simulates endovascular procedures. Several studies have demonstrated the validity and transferability of the MIST-VR [15-17], and it is possibly the most established VR platform to date in terms of publications.
LapVR (Immersion Medical, USA): This offers simulation of basic skills (camera navigation, peg transfer, cutting and clip application), procedural skills (adhesiolysis and running of the bowel) and full laparoscopic cholecystectomy. Metrics measured include time and procedure specific errors. The LapVR system has only recently been validated.
SurgicalSim (METI, USA): This platform offers practice of core tasks (tissue manipulation, dissection, suturing and knot tying), transurethral resection of the prostate (TURP) and laparoscopic cholecystectomy. Metrics measured include time, instrument path length and procedure specific errors. The software enables users to customise their own training programme, which can be viewed by an administrator, and to practise skills or procedures with virtual robotic arms and a 3D headset. No construct validity studies using the SurgicalSim were found to date.
2.2. Hybrid trainers
VR Trainers have certain limitations due to their lack of tactile feedback. In response to these limitations, hybrid trainers were developed which combine computerised components with ex vivo or synthetic parts to provide tactile feedback. The haptics experienced when using the simulator are real, as the user is interacting with real objects and instruments. The limitations of such physical models, however, include increased cost, as the models can only be used once. In addition, complex human anatomy and physiology cannot be replicated precisely, for example bleeding vessels and leaking structures following trauma, and appropriate surrounding anatomy. Virtual reality surpasses physical models in this realm.
ProMIS Simulator (Haptica, Dublin) is a hybrid simulator which uses (a) 100% VR for certain tasks and (b) augmented reality that overlays graphics onto a task performed on a physical exercise. ProMIS supports both basic skills and a range of surgical procedures, including laparoscopic appendectomy and hand-assisted laparoscopic colectomy. ProMIS enables learners to practise on physical models to ensure appropriate tactile feedback, which is not easily replicated in VR simulators. The ProMIS open module is well suited to surgical research and surgical procedure experimentation in that it allows any physical exercise to be inserted into the simulator and, by tracking the instruments, gives full measurement and feedback on performance. Numerous studies have provided construct validity for this hybrid simulator [19-21].
2.3. Methods of assessment
Assessing improvement in surgical skill is essential to allow the development of surgical trainers, simulators and training programmes.
Time is the most basic metric and may indicate progression in a task; however, it is not a true indicator of accurate performance. When time is combined with an error score (the number of errors committed per task by the user), a trainee can also be assessed for accuracy. In order to use the metrics produced by a simulator as an assessment tool, they need to be validated. There are many different types of validity. Construct validity is the ability of a simulator to detect differences between groups with different levels of experience; it indicates that the simulator measures what it claims to measure. Face validity is the extent to which simulation resembles the real task. Concurrent validity is defined as the concordance of a test with a known “gold standard”.
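As an illustration of how construct validity might be checked, the following minimal sketch compares a simulator metric between a novice group and an expert group using an exact permutation test on the difference in group means. The function name, the choice of test and the example task times are assumptions made for illustration; published validation studies use a variety of statistical methods.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the absolute difference in means.

    Enumerates every way of splitting the pooled scores into two groups of
    the original sizes, so it is only practical for small groups.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    indices = range(len(pooled))
    extreme = 0
    total = 0
    for chosen in combinations(indices, n_a):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen_set]
        b = [pooled[i] for i in indices if i not in chosen_set]
        # Small tolerance guards against floating-point rounding in the means.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


# Hypothetical task completion times in seconds (illustrative values only).
novice_times = [120, 135, 150, 142, 128]
expert_times = [60, 72, 65, 80, 70]
p = permutation_p_value(novice_times, expert_times)
print(f"p = {p:.4f}")
```

A small p-value here would suggest that the metric separates the two experience levels, which is the property construct validity describes.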
Early box trainers lacked tracking systems for recording errors and time; a simple stopwatch was used to assess the speed at which a task was performed. With the advent of virtual reality simulators, we now have stand-alone systems which can measure and record metrics. Simulators can generate a profile summary upon completion of a procedure or task, which provides immediate feedback and an opportunity to see one's progress with repeated practice. The easy-to-use nature of VR simulators, along with practice sessions and step-by-step instructions, provides the user with an opportunity for practice and attainment of proficiency.
In addition to basic metrics (time, errors), more sophisticated markers of performance have emerged over the years. An example is instrument path length, which is the distance travelled by the instrument or the sum of deviations from a fixed point. In laparoscopy, a shorter path length suggests operative focus and greater overall performance and experience. A study by Smith et al used computer sensors on the tips of laparoscopic instruments to track motion paths. They found that speed did not equate to improved performance; hence time can be misleading if not used in conjunction with other metrics. Another metric is economy of movement, a score based on sudden changes in acceleration that serves as an indication of smooth movement or instrument handling.
In order to use simple metrics to measure proficiency, appropriate scoring systems must be developed. The computer enhanced laparoscopic training system (CELTS) was developed by the Centre for the Integration of Medicine and Innovative Technology (CIMIT) and Harvard Medical School. The developers used a task trainer with a computer interface to form a task-independent scoring system against expert benchmark levels. Expert scores were calculated for suturing, peg transfer and knot tying using time, path length, smoothness and depth perception as metrics. The user’s score was then compared with an expert score, which led to the development of a standardised scoring system. This scoring method provided a gold standard for comparing novices to experts. When ProMIS was later developed, it included a similar system, which compares the user’s score in time, economy of movement and path length to expert proficiency scores. The proficiency scores need to be preset once they have been established for each module.
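The two motion metrics described above can be sketched from a series of sampled instrument tip positions. This is a minimal illustration, not the algorithm used by any particular simulator: the function names, the fixed sampling interval and the use of the mean magnitude of the third finite difference (jerk) as a smoothness score are all assumptions.

```python
import math


def path_length(positions):
    """Total distance travelled by the instrument tip, in the input units."""
    return sum(math.dist(a, b) for a, b in zip(positions, positions[1:]))


def mean_jerk(positions, dt):
    """Mean magnitude of the change in acceleration between samples.

    Uses the third finite difference of position; lower values indicate
    smoother movement. Returns 0.0 if fewer than four samples are given.
    """
    magnitudes = []
    windows = zip(positions, positions[1:], positions[2:], positions[3:])
    for p0, p1, p2, p3 in windows:
        jerk = [(d - 3 * c + 3 * b - a) / dt ** 3
                for a, b, c, d in zip(p0, p1, p2, p3)]
        magnitudes.append(math.hypot(*jerk))
    return sum(magnitudes) / len(magnitudes) if magnitudes else 0.0


# Hypothetical tip positions sampled once per second (illustrative only).
straight = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0)]
zigzag = [(0, 0, 0), (1, 1, 0), (2, 0, 0), (3, 1, 0), (4, 0, 0)]

print(path_length(straight), mean_jerk(straight, dt=1.0))
print(path_length(zigzag), mean_jerk(zigzag, dt=1.0))
```

Both paths reach the same end point, but the zigzag path is longer and less smooth, which is the kind of difference these metrics are intended to capture.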
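One way to compare a user's metrics with expert benchmark levels is to express each metric as a standardised distance from the expert mean. The sketch below is an assumption about how such a comparison could be made, not the published CELTS or ProMIS formula; the function names, the proficiency threshold and the example values are hypothetical, and all metrics are treated as lower-is-better.

```python
from statistics import mean, stdev


def benchmark_scores(trainee, expert_trials):
    """Return a z-score per metric relative to the expert trials.

    Positive values mean the trainee was worse than the expert mean
    (all metrics here are assumed to be lower-is-better).
    """
    scores = {}
    for metric, value in trainee.items():
        expert_values = [trial[metric] for trial in expert_trials]
        scores[metric] = (value - mean(expert_values)) / stdev(expert_values)
    return scores


def is_proficient(scores, threshold=1.0):
    """Hypothetical rule: every metric within `threshold` SDs of the expert mean."""
    return all(z <= threshold for z in scores.values())


# Illustrative expert trials: time in seconds, path length in centimetres.
experts = [
    {"time": 60, "path_length": 100},
    {"time": 70, "path_length": 120},
    {"time": 80, "path_length": 110},
]
trainee = {"time": 90, "path_length": 115}

scores = benchmark_scores(trainee, experts)
print(scores, is_proficient(scores))
```

Standardising each metric separately keeps the score independent of the task and of the units in which each metric is recorded.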
2.3.2. Global rating scales
In addition to metrics, subjective rating of surgical performance remains a very important tool. An approach to testing operative skills outside the operative setting led to the Objective Structured Assessment of Technical Skill (OSATS), which was introduced by Reznick et al in 1996. This seven-item table of technical performance on a five-point grading scale includes respect for tissue, time and motion, instrument handling, knowledge of instruments, flow of operation, knowledge of specific procedure and use of assistants. The OSATS tool has demonstrated high reliability and construct validity and is now used as a globally validated rating scale.
Global assessments are now widely used in the assessment of proficiency during training and are used to study the effect that simulated surgical training has on operative skill. Studies by Scott et al, Hamilton et al, Traxer et al and Lucas et al demonstrating the transfer of skill from a simulated environment to the operating room have used a slightly modified version of OSATS with an added parameter of overall performance [24-27]. In the study by Scott et al, the modified OSATS showed improvement in four of the eight parameters, including the new parameter of overall performance.
A study by Grantcharov modified the scale to create a new parameter, economy of movement, which was a combination of time and motion (1= clear economy of movements and maximum efficacy; 5= many unnecessary moves) and instrument handling (1= fluent moves with instruments; 5= repeated tentative awkward or inappropriate moves). In Reznick’s original scale, five was the best possible score and one was the worst. In this study, an error score parameter was also created, which is a combination of respect for tissue from Reznick’s scale (1=consistently handled tissues appropriately with minimal damage; 5= frequently used unnecessary force on tissue or caused damage by inappropriate use of instruments) and precision of operative technique, which is a new parameter (1= fluent, secure and correct technique in all stages of the operative procedure; 5= imprecise, wrong technique in approaching operative intentions).
The Global Operative Assessment of Laparoscopic Skills (GOALS) tool was designed by Vassiliou et al (based on Reznick’s OSATS) for minimally invasive procedures. This five-point scale assesses depth perception, bimanual dexterity, efficiency, tissue handling and autonomy. Results have shown that the tool is reliable and valid.
There is a trend towards using global rating tools in video analysis rather than direct observation in a live surgical setting, owing to time and cost constraints. The advantage of simulation in this setting is the convenient storage of vast amounts of data. As there are so many ways of rating surgical performance, the question of which is superior has been evaluated. A study by Aggarwal et al assessed four different scales using laparoscopic cholecystectomy: OSATS, a modified OSATS with four instead of seven parameters, a procedure-specific global rating scale and a procedure checklist. The generic global rating scales successfully distinguished between novices and experts, unlike the procedure-specific rating scale or checklist. An extensive systematic review was undertaken by van Hove and colleagues to examine the current evidence for objective assessment methods for technical surgical skills. It concluded that OSATS is presently the most accepted “gold standard” for objective skill assessment; however, it remains unknown whether OSATS can distinguish between different levels of performance. Furthermore, cut-off values have not been determined for OSATS. The same shortcomings apply to procedure-specific checklists, and currently there is only one checklist with a high level of evidence. The review also concluded that motion analysis devices can distinguish between operators with different levels of experience. An important point discussed in this review is that the value of a good assessment method can diminish when it is used in an inappropriate setting.
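Because the modified scale runs in the opposite direction to Reznick's original (1 is best rather than 5), scores from the two scales cannot be compared without reversing one of them. The following minimal sketch shows the reversal and one possible way of forming a combined parameter. The function names and the example ratings are hypothetical, and combining items by simple addition is an assumption, since the text does not state how the items were combined.

```python
def reverse_five_point(score):
    """Map a 1-5 rating onto the opposite direction (5 -> 1, 1 -> 5)."""
    if not 1 <= score <= 5:
        raise ValueError("score must be between 1 and 5")
    return 6 - score


def composite(*items):
    """Assumed composite: the plain sum of the component items."""
    return sum(items)


# Hypothetical ratings on Reznick's original direction (5 = best).
reznick = {"time_and_motion": 4, "instrument_handling": 5, "respect_for_tissue": 2}

# The same performance expressed on the modified direction (1 = best).
modified = {name: reverse_five_point(score) for name, score in reznick.items()}

economy_of_movement = composite(
    modified["time_and_motion"], modified["instrument_handling"]
)
print(modified, economy_of_movement)
```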
3. The use of virtual reality in surgical training
3.1. The role of simulation in surgical training
The introduction and development of VR simulators has been one of the main innovations that have resulted in a change in training curricula in surgery. Satava was the first to recommend VR simulation as a complement to current training models.
The role of simulation in surgery is to provide our trainees with the opportunity to learn basic tasks in a safe and controlled environment. All movements the trainee makes can be recorded, and therefore there is the facility for immediate and objective feedback. It is also possible to set a proficiency level on a simulator and therefore design a training program with set goals that a trainee needs to accomplish before being allowed to perform in the operating theatre. All of these factors contribute to skill learning, assessment, selection and credentialing. Simulators will also be invaluable in the teaching of the newer forms of surgery, single incision laparoscopy and natural orifice transluminal endoscopic surgery. The use of simulation should provide the setting in which challenges such as the use of new instruments and technology can be overcome. An example of this is single incision laparoscopic surgery, where it is difficult to have instruments working parallel to each other in a very narrow operative field.
Given that simulation is generally an educational tool, there are two distinct parts to the delivery of a simulator. Firstly, there is the teaching aspect, which is the way in which we communicate or impart knowledge or information. Secondly, there is the training aspect, which is the acquisition of psychomotor and cognitive skill. The learning of psychomotor and cognitive skill can, however, become blurred. When novices begin simulated training, they are naïve to both the fulcrum effect of laparoscopy and the steps of the surgical procedure, and it therefore becomes unclear at what rate each skill is learnt. The only way in which we can both teach and train effectively is through a carefully thought out, well-structured curriculum. Several studies [35-37] have proposed templates for this.
3.2. Mapping learning curves
A learning curve is a graphical representation of the changing rate of learning (figure 1.1). Typically the increase in retention of information is sharpest after the initial attempts. This increase gradually flattens out as less and less new information is retained after each repetition.
As mentioned earlier, simulation provides a protected environment for trainees to overcome the initial learning curve. This concept has been discussed and examined by researchers in several studies over the last ten years.
Gallagher and Satava carried out a study which looked at using the MIST-VR trainer as a tool for assessing psychomotor performance. As an adjunct to this, they also looked at learning curves. Both senior surgeons (>50 laparoscopic operations) and junior surgeons (<10 laparoscopic operations) performed six tasks on the MIST-VR; by trial 10 there was a convergence of mean performance. This showed that juniors could potentially perform to the level of a senior surgeon with practice outside the operating theatre.
A study by Grantcharov showed that different learning curves exist for surgeons with varying levels of laparoscopic experience. In this study, it was established that the MIST-VR was capable of differentiating between surgeons with different laparoscopic experience, which is important both for construct validity and for the potential development of internationally accepted norms of performance. If this were further developed, a trainee could use it as a reference point to establish where they currently are on the learning curve. Similar results were shown by Eversbusch. Three different learning curves were mapped for colonoscopy. The learning rate on the simulator was proportional to prior experience with endoscopy, which indicated that the simulator could assess parameters that are clinically relevant. Psychomotor training using the GI mentor, compared with a control group who received no training, demonstrated improved performance in the novice participants.
Aggarwal et al have produced several studies involving the mapping of learning curves. In a study in 2006, two different learning curves were mapped out using medical students who performed tasks of varying complexity on the MIST-VR. All three parameters (time, economy of movement and error scores) plateaued at the second repetition for the twelve core skills and at the fifth repetition for the two most complex tasks. Another study in 2006 assessed the learning rate for dissection of Calot’s triangle; a learning curve for novices was established, as their performance plateaued at the fourth repetition. Learning curve data was established in a study in 2009 to ensure that repetitive practice improved performance, as measured by the simulator. Moreover, by applying a stepwise process to learning a laparoscopic cholecystectomy, a whole-procedure training curriculum can be developed. The learning curve for this procedure plateaued for all metrics between six and nine repetitions.
When laparoscopic suturing was examined in a study by Botden, the number of repetitions required to reach the top of the performance curve (defined as proficiency) was eight knots. Lin et al evaluated the learning curve for laparoscopic appendicectomy and found that operative duration and complication rate decreased in proportion to the increasing experience of the resident.
Interestingly, Grantcharov’s study in 2009 assessed the learning curve patterns of acquisition of generic skills in laparoscopy. In this study it was hypothesised that the familiarisation rate with laparoscopic technique differs depending on psychomotor ability. Four types of learning curve were identified: proficiency from the beginning (5.4%); ability to advance with practice, which was achieved in between two and nine repetitions (70.3%); ability to improve but inability to reach proficiency (16.2%); and finally no tendency to improve and overall underperformance (8.1%). These data suggest a role for developing a proficiency-based curriculum based on innate psychomotor ability. Several studies have looked into aptitude tests, which may relate to basic laparoscopic technical skill performance. Further to this, research has attempted to ascertain the rate of skill acquisition in relation to innate ability.
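A plateau of the kind reported in these studies can be located with a simple rule applied to a series of scores from repeated attempts. The sketch below is a heuristic for illustration only: the function name, the 5% tolerance and the example times are assumptions, and the studies cited above determined plateaus using their own statistical methods.

```python
def plateau_repetition(scores, tolerance=0.05):
    """Return the 1-based repetition from which performance stays stable.

    Performance is treated as stable when every subsequent change between
    consecutive repetitions is within `tolerance` of the previous score.
    Returns None if no such point exists before the final repetition.
    """
    for start in range(len(scores) - 1):
        tail = scores[start:]
        stable = all(abs(b - a) <= tolerance * abs(a)
                     for a, b in zip(tail, tail[1:]))
        if stable:
            return start + 1
    return None


# Hypothetical task times in seconds over eight repetitions.
times = [200, 150, 120, 100, 98, 97, 99, 98]
print(plateau_repetition(times))
```

With these illustrative values, the large early gains stop at the fourth repetition, after which the times vary by only a few seconds.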
3.3. VR-to-OR transfer
It is intuitive that training in a simulated surgical setting should lead to improved skill in a clinical environment; however, this important concept requires formal confirmation. There is little value in developing sophisticated training programmes in a simulation laboratory if laboratory training does not improve clinical performance. Transferability is often called VR-to-OR (a term coined by Professor Anthony Gallagher) and refers to the ability of simulation-based training to improve clinical performance. Transferability in clinical terms would imply predictive validity as discussed earlier.
Such trials are usually designed using two groups who are randomised to receive either simulation-based training or no training. Their performance is then compared in a specific laparoscopic procedure or task. The groups ideally have similar baseline psychomotor and visuospatial ability. Assessment in the operating room is performed by an examiner who is blinded to the status of the subject, using the methods described previously. Even with sound methodology, human trials can pose many logistical challenges; many investigators therefore opt to conduct their trials using animal specimens, most commonly porcine models. Clinical transferability can be shown with animal models in suitable laboratories as a bridge to the human setting. Transferability studies are essential in order to assess the ability of simulation-based training to improve surgical performance in the operating room; they require approval from an institutional review board.
The first study to demonstrate a transfer of simulator-learned skills to the operating room was conducted at Yale in 2001. The control group had no simulation training and the trained group were taught to proficiency under supervision, with emphasis on avoidance of errors. Candidates were assessed on dissection of the gallbladder from the liver edge in the OR during human cholecystectomies, both before and after training or no training. The scoring system used was a novel pre-defined eight-error checklist; occurrence of these errors was recorded during each minute of the assessment. This was used instead of a global rating scale in an attempt to determine errors more accurately. A non-significant difference was detected in dissection time, with the trained group removing the gallbladder 29% faster than the non-trained group. In relation to error performance, the control group were five times more likely to burn the liver edge or injure the gallbladder and nine times more likely to fail to progress. Further evidence supporting this landmark research came from a study by Grantcharov et al, which assessed both a trained and a control group in the clipping and cutting of the cystic duct. Again both groups underwent pre- and post-testing in the OR during human cholecystectomies. Performance was measured using a modified OSATS scale, in which traditional parameters were combined to create new parameters. The group who received simulated training on the MIST-VR performed faster and had better economy of movement scores and lower error scores than the control group in the post-test assessment in the OR; hence the study demonstrated transferability.
Following on from this initial research, various other studies demonstrating transfer of skill have been published. Some have shown transfer of partial tasks and others transfer of whole laparoscopic procedures, of which laparoscopic cholecystectomy forms the bulk. Three other studies [26, 48, 49] assessed the transfer of skill in laparoscopic cholecystectomy. Scott used OSATS and demonstrated a significant improvement in the trained versus control groups. McClusky and Ahlberg used total error scores; both studies showed that error scores were higher in the control groups.
Other studies have looked at the transfer of whole procedures. A study by Larsen et al assessed the performance of an entire laparoscopic salpingectomy using an OSATS scoring system and found significant differences between trained and control groups. Similar results have been shown with both laparoscopic hernia repair and laparoscopic nephrectomy. When laparoscopic appendicectomy was assessed on a porcine specimen, however, no difference was shown between trained and control groups. In this study, training time was very short, with three hours of training in total. Achieving proficiency in so short a time frame may have been difficult and could therefore have affected the outcome of the study. The assessment method used was blinded rater analysis using a scale of bad, average and good, which had no previous validation in this setting.
One study provided training for the novice group in laparoscopic cholecystectomy but assessed skill transfer in laparoscopic nephrectomy. The results showed that the group who received time-based simulated laparoscopic cholecystectomy training outperformed the control group when a laparoscopic nephrectomy was performed in a porcine model. The students were assessed using OSATS. This shows not only the transfer of skill after simulated training but also that specific skills learnt for certain laparoscopic procedures are useful for other laparoscopic procedures.
Laparoscopic tasks as well as laparoscopic suturing have also been explored. Three studies [51-53] evaluated the transfer of laparoscopic suturing. Two of them [51, 52] used the same formula, which was 600 - [time + (10 x accuracy score) + (10 x security error)]. This method awarded higher scores for the most accurate performance in the fastest time. The purpose of this formula was to establish one value which, if high, implied a fast, accurate performance and a good-quality knot. Assigning one value to the user’s performance, as opposed to three, gives results that are easy to compare and understand. Both studies showed significant improvements in the trained group compared with the control group. The third study used an error scoring system, which showed that the control group made more errors than the trained group; this study also performed blinded rater video analysis looking at economy of movement and error assessment (Table 2). Verdaasdonk et al did not show any significant difference in the transfer of skill between the simulation-trained group and the control group.
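The composite suturing score quoted above can be sketched in a few lines of Python. This is an illustrative reading of the formula rather than the cited authors' implementation: it assumes the closing brackets fall after the last term, that time is measured in seconds, and that the two penalty terms are entered as the text names them. The function and parameter names, and the example values, are hypothetical.

```python
def composite_suturing_score(
    time_s: float,
    accuracy_score: float,
    security_error: float,
    cutoff: float = 600.0,
) -> float:
    """Single-value suturing score: cutoff - [time + 10*accuracy + 10*security].

    Assumptions (not confirmed by the source): time_s is in seconds and the
    two penalty terms use whatever units the original studies defined.
    """
    return cutoff - (time_s + 10 * accuracy_score + 10 * security_error)


if __name__ == "__main__":
    # Hypothetical performances: a fast, accurate knot and a slow, inaccurate one.
    fast_accurate = composite_suturing_score(200, accuracy_score=5, security_error=2)
    slow_inaccurate = composite_suturing_score(400, accuracy_score=10, security_error=5)
    print(fast_accurate, slow_inaccurate)
```

Because every term is subtracted from the same constant, a slower or less accurate performance always yields a lower score, which is what makes a single number comparable across trainees.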
4. Learning through virtual reality
Research has shown that training on simulators translates to the clinical environment, but less is known about how best to integrate simulation into the surgical curriculum. In order to provide the ideal model for surgical training there are a number of factors to consider.
Firstly, a structured curriculum needs to be developed. Wiggins and McTighe’s backward design approach to curriculum development is one approach that has been proposed for technical skills training in surgical simulation.
The second factor is that training should be carried out in a stepwise manner where the trainee begins on a simulator in the skills lab until predefined proficiency criteria are reached . An example of this is part-task training. Part-task training is a learning strategy whereby a complex task is deconstructed into smaller components for practice. Trainees gain proficiency in the individual components before progressing to the more complex task. It is thought that a higher level of skill can be attained if participants master individual components before integrating them into the whole task.
Thirdly, there need to be clear criteria to determine the competence level of the trainee and skill mastery. The setting of training goals ensures that the trainee is required to reach a predefined standard, so that competence is not determined by time spent on the simulator or by performing a set number of repetitions. Standards should be benchmarked against both clinically established and simulator-generated data. When proficiency has been demonstrated and assessed in an objective manner, the trainee can progress to the real-life operating room.
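The difference between proficiency-based and time- or repetition-based progression can be illustrated with a short Python sketch. It is a hedged example only: the benchmark value, the requirement to meet it on two consecutive attempts and the function name are assumptions made here, not criteria taken from the literature discussed.

```python
from typing import Optional, Sequence


def attempt_reaching_proficiency(
    scores: Sequence[float],
    benchmark: float,
    consecutive: int = 2,
) -> Optional[int]:
    """Return the 1-based attempt on which the benchmark has been met on
    `consecutive` attempts in a row, or None if that never happens.

    Higher scores are assumed to be better; the benchmark would normally be
    derived from expert or clinically established data.
    """
    run = 0
    for attempt, score in enumerate(scores, start=1):
        run = run + 1 if score >= benchmark else 0
        if run >= consecutive:
            return attempt
    return None


if __name__ == "__main__":
    # Made-up simulator scores; progression depends on performance,
    # not on how many attempts have been made.
    print(attempt_reaching_proficiency([60, 80, 70, 82, 85], benchmark=75))
    print(attempt_reaching_proficiency([60, 65, 70], benchmark=75))
```

Requiring consecutive successes is one simple way to guard against a single lucky attempt being counted as mastery.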
Training sessions should be spread out over a period of time in order to better augment and optimise learning. Previously it has been shown that one hour on a virtual reality simulator equates to two hours spent in the operating room. Other conditions to consider include the learning environment; for example is it in a quiet and relaxed setting or does it mirror the everyday stress of the operating theatre? Whether the trainee engages in purely self-directed learning or whether a mentor or trainer is present is also important.
Finally, in order for any training programme to be effective, the virtual reality simulator needs to demonstrate acceptability, validity, reliability and reproducibility in the real-life operating environment.
4.1. Theories of learning
Historically, observational learning has played a central role in surgical training, constituting the first step in the time-honored “see one, do one, teach one” model. Although there is a trend today away from this traditional approach to training, observation still has a role. No longer is observation limited to the operating room; many professional organizations now offer free Web-based videos of surgical procedures for training purposes. Such training videos are undoubtedly valuable resources. A study by Snyder and colleagues aimed to evaluate the use of video observation and to compare it with the real-life observation of procedures. They found that while instructional videos are useful, they may not be an adequate substitute for actual real-time observation in the minimally invasive surgery setting.
There are a number of theories of learning that have been discussed in the setting of surgical skills training. The learning model postulated by Fitts and Posner is one such theory and has been discussed as being relevant to learning minimally invasive surgical skills. Their theory states that there are three phases in the acquisition of skill: the cognitive stage, the associative stage and the autonomous stage.
Cognitive stage: During this stage, learners need to know what the elements of the task are and what is expected in terms of performance. They will draw on their reasoning ability, past experience and instructions to use cognitive strategies that are subsequently modified as they gain experience with the task.
Associative stage: This stage involves working out how to optimize and integrate performance so as to greatly reduce major errors and make performance more efficient.
Autonomous stage: This stage refers to extremely advanced levels of performance where errors are greatly reduced and performance of the task seems to be almost automatic. At this stage, less attention is required to carry out the task, so attention can be allocated to other activities such as teaching, attending to anatomical anomalies, changes in instrument readouts and so on. Once a skill becomes automated, the learner has established a sequence of highly coordinated movements, which are integrated in time and are characterized by a rhythmic structure of their own.
The concept of deliberate practice, as proposed by Ericsson and Smith, has become the most popular learning theory of late. Their expert performance model comprises three crucial stages, which overall suggest that individual differences in performance can be explained by differences in deliberate practice. The first stage requires the identification of representative tasks of expert performance and their replication within a controlled laboratory setting. The second stage involves an empiric analysis to identify the mechanisms underlying experts’ superior performance. The last stage examines the effect of specific practice activity to elucidate factors that might influence the acquisition of these expert performance mechanisms. Deliberate practice requires the individual to focus their training on defined tasks or drills. It involves repeated practice and immediate feedback delivered by experts. Because perceptual-motor tasks can be designed to capture the essence of specific surgical tasks, simulators lend themselves well to applying Ericsson’s expert performance approach, as they allow measurement and empiric analysis of representative tasks in a controlled setting and allow for repeated drills.
Crochet et al carried out a study to investigate deliberate practice in the simulated setting and compared it with the real-life clinical setting. They concluded that enhanced quality of surgical skills can be achieved with deliberate practice, both on simulated and realistic tissues.
Feedback can be defined as the provision or return of performance-related information to the performer. Feedback that is delivered in a timely and regular manner has been recognised as an important part of the learning process in medical education. It can be intrinsic, where it is relayed directly by the sensory system of the trainee, or extrinsic, where it is provided by an external source.
One of the dangers associated with the use of virtual reality simulation is that a trainee who is unaware of committing an error will persist in it, so that the simulator reinforces undesirable behaviour. Without feedback, therefore, even high-fidelity simulators run the risk of becoming ineffective as a training tool.
In spite of this, virtual reality simulators confer a number of benefits with respect to feedback when compared with the real-life environment. Virtual reality simulation allows the delivery of immediate, objective and automated feedback. Tasks can be interrupted to highlight errors to the trainee and then repeated as required. Trainees can assess their own errors using the automated feedback provided, or they can observe video playback of their performance as recorded by the simulator. Virtual reality simulation also allows for the delivery of feedback regardless of whether an expert is available. In fact it has been suggested that trainees may value virtual reality simulator feedback as being clearer and more objective than human expert feedback. Automated feedback has been demonstrated to have similar efficacy to live expert feedback.
Although the benefits of performance feedback are not debated, questions remain about the optimal way to provide it. Research is currently being conducted to analyse the optimal frequency and type of feedback. It has been shown that feedback delivered in a standardised and structured manner results in an improvement in simulator performance. It has also been found that providing feedback shortens the minimally invasive surgery learning curve.
4.3. Limitations of virtual reality
Virtual reality is an acceptable way of simulating a surgical procedure; however, there are several challenges given the limitations of modern technology. Graphics can simulate anatomical structures visually, but they are unable to model the physical properties of their real counterparts to an accurate enough degree, so the structures cannot be manipulated in a realistic fashion. The biggest limitation and future challenge of virtual reality simulation, however, is haptic feedback. Currently none of the VR simulators are capable of providing any tactile feedback. There is ongoing research in this area; however, haptic technology is currently very basic, with the Phantom device being at the pinnacle of this technology. Technology could be ten or more years away from providing a solution to these challenges.
5. Technology’s ability to deliver simulators
It is useful to compare the state of the flight simulator technology to the state of the surgical simulation technology. On one hand you can purchase a flight simulator today that will represent every single aspect of the flight experience right down to the chair you sit on. However, in no form does a similar setup exist for the surgeon. They cannot enter a virtual room and carry out whatever operation they like on a virtual patient and have everything behave exactly as it should. In order to understand why this is we need to break down the components of what makes a perfect simulator.
A perfect simulator will model a subset of the world (plane and terrain for the pilot, subset of human and procedure for the surgeon) and attempt to have you interact in your normal way with this simulated environment. We experience the world through our senses, so to create the simulation we must be able to substitute for each of these experiences. For the sake of discussion we will concentrate on comparing a flight simulator with a cholecystectomy simulator.
1. Sight. For sight to be tricked, the objects we look at must look like their real counterpart – but must also behave like them.
Flight Simulation: The goal is to model and create a virtual terrain and have it behave correctly from a physics point of view. Because terrain, mountains and houses in a flight simulation need only be ‘shells’ and are generally far away in the distance (i.e. there is no physical interaction), all we need to simulate is the effect of wind on an object (i.e. the airplane) that is manmade. Because it is manmade, we can easily understand and simulate all its aspects.
Cholecystectomy Simulation: Here the goal is to model the internals of a working body and have it behave correctly under physical interaction. Modeling a complex organ that is not fully understood is an incredibly difficult task. Unlike the ‘shell’ of the flight simulator, cutting into an organ must reveal a solid structure that behaves as a solid organic structure would: the weight of the separated tissue would help move it apart, blood would flow from the blood supply, and the number of possible outcomes is effectively infinite. With current technology we can do a reasonable job of modeling the outer shell, and have it behave in a roughly organic way, but any bleeding or cutting will be pre-scripted (fake). The model is just too complex.
2. Touch. For touch to work our senses must be fooled into thinking that we feel the same resistances as we would in the real situation.
Flight Simulation: Interaction is all through manmade interfaces (the cockpit is no more complex technology wise than any games console interface found in any home). They translate into increases/decreases in pressure and are very simple to interpret at a computer level.
Cholecystectomy Simulation: Interaction is physical and complex. Instruments must collide with each other, and they must cut, burn and grasp organic material that will act realistically under pressure. If we press our virtual instrument against a liver it must stop in its tracks as it would against a real liver. We are now into the world of robotics. This technology is still in its infancy. A system in which a surgeon can pick instruments up, place them where they like in a simulated environment and have the instruments physically interact with every organ they come in contact with simply doesn’t exist.
In summary, the world of video games has given us vastly improved graphics (or visual representation of the real world). Their physics technologies (needed for accurate behavior) are excellent, but only for rigid solid objects. Everything else (organic soft body modeling, fluid dynamics) is too complicated. And touch (haptics), or recreating what our hand experiences, is just too complex. The human body has so many interdependent systems that it is too complex to replicate accurately, so we try to create pre-scripted experiences and in doing so we do our best with the current technical challenges.
5.1. The response from the simulation industry to these challenges
The industry’s approach has been to tackle the basic areas of surgery first. Basic skills, for example, can be practiced and measured in VR and Augmented Reality exercises with a high degree of accuracy and accountability. Here, however, we are simulating manmade objects such as instruments, beads and suture needles, and the consequences of the actions are easy to simulate. If a bead falls, it rolls somewhere. There is no series of knock-on effects; these are discrete pieces of simulation. After basic skills, the route has been to try to replicate the least complex operations, for example a laparoscopic appendicectomy or cholecystectomy. Several factors combine to create the level of complexity, such as the number of organs involved, the number of structures involved and the interaction between these. The rigidity of the organ being simulated varies; for example the small intestine would be more difficult to replicate than a gallbladder or even a liver. The level of interaction with the organ also varies; transecting the liver would require very complex physics as opposed to clipping the cystic duct during a laparoscopic cholecystectomy. Replicating the behavior of laparoscopic instruments interacting with tissue is far less complex than replicating the experience of using your hands in open surgery. We are essentially closer to the level of complexity of a flight simulator in that we are using a manmade interface to guide a rigid body through a known space.
5.2. Limitations of the simulation industry
The video games industry conquered its own limitations by delegation of complexity. The older model of video game development involved one company creating every component of a video game. This is generally still the case with surgical simulation companies. Simulator manufacturing may benefit greatly from having separate companies concentrate on and perfect the various components needed for a more precise simulator (e.g. interface, human body physics, anatomy rendering, fluid dynamics). These could all potentially run on a standardised platform designed by international experts in the field. By doing this, we may overcome the current restrictions that the simulation industry faces regarding future developments.
5.3. The language barrier
A computer programming language is just a formal way of instructing a computer. If we leave what is possible out of the equation for a minute and assume we had limitless computer power, then to create a perfect simulator, we would need a surgeon to describe in perfect detail every aspect of a surgical procedure, describing at every moment how every cell reacted, every organ interacted, and the principles and physics behind every system such as blood flow. The computer programmers and artists would then have to understand every aspect of this with as much knowledge as the surgeon themselves. This doesn’t happen so we end up with information being lost between the surgeon, the artist, the programmer and the limitations of how descriptive we can be in instructing a computer how to behave.
6.1. The future of VR in surgery
We still face a variety of challenges before we have a virtual patient that will behave in exactly the same fashion as a real human. These are not only technical challenges, such as those concerning interface and complex system simulation, but also financial challenges such as hospital budgets and developer budgets. So what does the future hold for VR’s role in surgical training? It would seem that excellent basic skills simulation and complete feedback are close. As we move towards procedure-based simulation it may be a case of acknowledging the current technical limitations while improving and expanding the content available on current platforms. Another step forward will be accessibility. The simulators need to become more integrated into surgical training programmes and should be onsite in teaching hospitals.
In an era of expanding minimal access technology, reduced working hours and increased awareness of patient safety, surgical simulation has helped to create a safe environment for surgeons to practice skills and procedures. The new ethos of proficiency-based training programmes ensures that the surgical community can learn and perfect new skills. Further to this, it has helped advance patient safety by allowing trainees to overcome the steepest part of the learning curve outside the operating room.