Foreword: The Impact of RoCKIn on Robotics

When the RoCKIn project was conceived, the RoCKIn consortium decided to have a pool of external experts who would observe the project and provide feedback and advice. When we were asked to be part of this pool, we accepted enthusiastically. The reason for our enthusiasm was simple: benchmarking robots is a difficult task, but we must learn how to tackle it if we want to advance the field. The goal might be clear, but the methods to achieve it are open to interpretation. Adapting an existing benchmark is cumbersome. RoCKIn aimed to provide a way to explore possibilities and a new way of thinking about benchmarking robots. Because of this, our intuition was that RoCKIn was one of those few projects that can contribute to redefining the fabric of robotic research, in order to align it both with our increased scientific expectations and with the new demands from industry. We wanted to follow this adventure closely.


Impact on the participants
Quite obviously, the first candidates to benefit from RoCKIn are those researchers who participated in the competitions organized by the project. Did they?
RoCKIn pushed teams to advance the state of the art in terms of robotic technology. It did so by setting a research agenda that included specific challenges and specific performance metrics. The performance of most teams at the final competition, in 2015, was very good, and the top teams were just impressive. The progress made since the 2014 competition was considerable. This shows that the bar has been put at about the right level: a bit beyond the state of the art, but not so high that real progress cannot be made from one year to the next. This is quite a remarkable feat by itself.
Another interesting place where we observed a remarkable improvement over the lifetime of RoCKIn was the communication between organizers and participants. At the 2014 competition, we had the impression that some teams perceived the organizers as a separate, almost antagonistic, entity. After that, the RoCKIn consortium worked hard to establish a better communication policy, to make teams feel that they are part of one and the same joint effort as the organizers, e.g., by having a shared understanding of the goals of the event, making the teams aware of the organizational difficulties, and involving them in some organizational decisions. This strategy worked out very well. When we watched the 2015 event, the teams and the organizers gave us the impression of acting as one cohesive unit that had been working together for a long time. The value of open and effective communication between organizers and teams is an important practical lesson coming from RoCKIn.
From a technical point of view, an area that was and remains critical is system integration. Even the top robots were relatively brittle, which may suggest that system integration was a bit ad hoc: this impression was confirmed by talking with the teams. This lack of integration is, unfortunately, rather common at robot benchmarks where the focus is on getting the software modules to work at all. Often, the focus is on getting through the tests. In a later stage, generalization comes into focus. More experienced teams take this into consideration from the start. Specialized finite-state machine solutions were preferred to the use of general-purpose task planners, which would have been more flexible but more complex. The start-up time of the robots was very long, suggesting that many things had to be started and connected manually. If RoCKIn were to continue, a logical next step would be to encourage a more systematic and general approach to system integration, for instance, by having challenges that involve the run-time modification and restart of the system.
Besides the technical advances, a great impact of RoCKIn was in terms of training of young researchers. The competition rules indirectly pushed the teams to adopt the values held by the RoCKIn consortium: modularity of the software, flexibility of the system, and replicability of the experiments. When we interviewed the teams in 2014, many noted that they had not put enough emphasis on these aspects and regarded this as one of their main weak points to be corrected for the 2015 competition. We see this awareness as a positive educational achievement of RoCKIn: the above values are important both for the development of a science of robotic systems and for the transfer of robotic techniques to industrial applications.

Impact on the robotic research community
It is safe to claim that RoCKIn advanced the state of the art in terms of experimental methodology in robotic research. The work on benchmarking and evaluation is one of the strong scientific contributions of RoCKIn and probably the one that will give RoCKIn its strongest impact in the long term. The idea of the functional challenges is an important and innovative part of the RoCKIn competitions. In fact, our perception is that the RoCKIn competitions are meta-experiments aimed at testing different hypotheses about what can be a "meaningful" evaluation metric. A good example of this method is the matrix "function × tasks." The entries of this matrix were initially an a priori guess about the correlations between functionalities and tasks, but as data from the competitions became available, they were used to confirm or disconfirm those correlations. This is, in our opinion, a novel and very promising methodological approach to empirical evaluation of complex systems, whether they are robotic systems or not.
The RoCKIn competitions took inspiration from RoboCup, and there was an inherent risk for RoCKIn to be perceived as yet another RoboCup-like activity. We soon realized that RoCKIn has done a good job of avoiding this risk. The project has defined its objectives and its methodology clearly, and it has implemented this event in a way that puts forward what we see as its three strong messages about what one can do through a competition: to systematically evaluate full robotic systems, to benchmark key robotic functionalities, and to foster scientific communication and cooperation.
Having the data from the competition runs, including ground truth, is a strong added value of RoCKIn. The teams we talked to were excited about having these data. In addition, RoCKIn has adopted a strong open policy, which we applaud: the collected log files and ground truth data are intended to be openly available to the entire scientific community, not only the RoCKIn teams, which will make a big difference in impact. The RoCKIn open policy includes the creation of fully instrumented test facilities accessible for use by the robotic community at large. It is our hope that this repository and these test facilities will live on well after the end of RoCKIn and that the heritage of RoCKIn is properly taken over after the end of the project.

Impact on technology transfer
One of the stated goals of RoCKIn is to help technology transfer in advanced autonomous robotic systems. This is an ambitious goal as the gap to bridge is wide. The RoCKIn@Work competition is strongly shaped by this goal, but technical progress in that section has been slower than in RoCKIn@Home. In fact, the main contribution of RoCKIn to technology transfer has probably been to highlight some of the main technological barriers that make it difficult and that require substantial research investments to be overcome.
The main barrier can be summarized in one word: robustness. It was surprising to note that most teams did not pay much attention to execution monitoring. Almost invariably, whenever a robot failed to grasp an object or placed it improperly, the error was neither noticed nor corrected by the robot, which continued execution until the entire task inevitably failed. Failure detection and repair are key to achieving robust execution in open environments, which is critical for marketability. RoCKIn has helped us to put this issue on the research agenda. The next step would probably be to extend the RoCKIn benchmarks to include long-term or repeated experiments that stress robust operation over extended periods of time in non-nominal conditions.
A telling example of how robustness should enter into the benchmarking equation is provided by WLAN. Rather unsurprisingly, there were glitches in the WLAN connectivity during the competitions, and the performance of several robots was affected dramatically by these glitches. This, in our opinion, should not happen. A dependable domestic or industrial robot should be able to cope with reduced WLAN connectivity while remaining safe and reasonably functional. The ability to deal with WLAN problems should be one of the aspects that are tested in a robotic competition (as it is done in the DARPA challenge) since this is essential to real autonomy and deployability.

Impact on the general public
Robotic competitions have a fundamental role to play in informing and educating the general public about the reality of robotic research, trying to correct the many misconceptions about robots and robotics. A strong effort must be made to ensure that the public outreach is extensive and carefully prepared. RoCKIn was only partially successful in this respect, and it has helped us understand that public engagement should be one of the top priorities for future competitions.
During the 2014 competition, the host organizations in Toulouse (LAAS and the Cité de l'Espace) put an exceptional effort into dissemination: many visitors attended the event, and a professional commentator did a great job in explaining what was going on. Despite this, we feel that the public received an unsatisfactory view of robotic research. The audience often expected to see Hollywood-style action but was faced with research robots where often there was "little action to watch." This problem is pervasive throughout robotic benchmarking. The robots often still require careful dedication and are far from being multipurpose machines with general types of intelligence. Many tests target specific capabilities, which makes it difficult to tell a story to the audience. There is no clear solution, but probably showing only the best capabilities of the robots, and providing the audience with more understanding of what is going on inside them, could increase the appeal of the benchmarks. Organizers could decide to open only the finals and not all the preliminary trials, or they could allow the teams to do dedicated public demos, designed to be informative to a general audience (this was done as a last-minute addition to the program). One might also consider adding rules or scoring points related to the entertainment value of robots or including a new task to "interact with the public." Finally, the venue should be designed to maximize excitement, stimulate curiosity, and make explanations readily available. Showing a visualization of the internal state ("mind") of the robot on a big screen might also improve public engagement, allowing visitors to understand what the robot is doing and why. It would provide a commentator with plenty of opportunities to explain interesting general things about robotics. Teams might also find this type of monitoring useful: we lost count of how many times we heard the sentence "I do not know why the robot is doing this!"

Recommendations for future competitions
One of the important legacies of RoCKIn is a set of best practices, lessons learned, and recommendations for benchmarking in general and for future competitions in particular. We hinted at some of them above, and many more are contained in the different chapters of this book. We end this Foreword with four general recommendations that came from our experience as "external observers." The most important one is to have more competitions. RoCKIn demonstrated that there is room for different types of benchmarks. At the start of the RoCKIn experience, there was some skepticism about the usefulness of yet another benchmark in a field where others already existed. By the end of the project, it became clear that we are only at the beginning of understanding what it means to benchmark robots. We need more benchmark projects like RoCKIn.
The second recommendation is to keep exploring the space of possibilities for robotic benchmarking. It is not always possible, or practical, to adapt existing benchmarks, mostly due to the committed investments of the participating teams. Short spikes of exploration can give guidance to longer-running benchmarks by showing the pros and cons of particular ideas. This is essential for the progress of related benchmarks.
The third recommendation is to consolidate the best practices of organizing benchmarks. This includes everything from the first brainstorms to the creation of the rules to the actual running of the benchmark. The dissemination and the measured impact of the dissemination are also of great interest for many researchers and others in the benchmarking community. The present book is a step in this direction.
The last recommendation is to radically experiment with the audience. To be effective, the public dimension should be taken into account at all stages: from deciding the schedule, to designing the venue, to setting the rules. It is difficult to make a benchmark with relatively dumb and slow robots interesting for a general audience, but it is not impossible. Human-robot interaction with the audience is an interesting research topic. Best practices with respect to entertaining the audience might provide a large boost to the public acceptance of robotic research.

In conclusion
Sometimes, we are faced with projects that do not make much noise but nonetheless have a profound and durable impact on the way we work on robotics. RoCKIn is one of those projects. We have seen the start of a new way of investigating benchmarking. RoCKIn has demonstrated that benchmarking is a valid research topic in itself and one of growing importance to research, development, and innovation in robotics. Benchmarks deliver the tools required to advance the field of robotics. RoCKIn delivered the tools to advance the field of robotic benchmarking.