Duration of calculation steps.
This work describes the development of a novel vision-based grasping system for unknown objects based on laser range and stereo data. The work presented here is based on
The outline of the paper is as follows: The next Section introduces our robotic system and its components. Section 3 gives an overview of the grasp point detection algorithm. Section 4 describes the object segmentation and details the analysis of the objects to calculate practical grasping points. Section 5 details the calculation of optimal hand poses to grasp and manipulate the desired object without any collision. Section 6 shows the achieved results and Section 7 finally concludes this work.
1.2. Problem statement and contribution
The goal of the work is to show a new and robust way to calculate grasping points in the recorded point cloud from single views of a scene. This poses the challenge that only the front side of objects is seen and, hence, the second grasp point on the backside of the object needs to be inferred from symmetry assumptions. Furthermore, we need to cope with the typical sensor data noise, outliers, shadows and missing data points, which can be caused by specular or reflective surfaces. Finally, a goal is to link the grasp points to a collision free hand pose using a full
All images are best viewed in colour.
The main problem is that
1.3. Related work
In the last few decades, the problem of grasping novel objects in a fully automatic way has gained increasing importance in machine vision and robotics. Several approaches exist for grasping quasi-planar objects (Sanz et al., 1999; Richtsfeld & Zillich, 2008). (Recatalá et al., 2008) developed a framework for robotic applications based on a grasp-driven multi-resolution visual analysis of the objects and the final execution of the calculated grasps. (Li et al., 2007) presented a 2D data-driven approach based on a hand model of the gripper to realize grasps. The algorithm uses a database of captured human grasps and finds the best hand poses by matching the features of the query object to hand pose features, i.e. hand shape to object shape. The output of this system is a set of candidate grasps that is then sorted and pruned based on effectiveness for the intended task. Our algorithm does not include a shape matching method, because this is a very time-intensive step. The
(Ekvall & Kragic, 2007) analyzed the problem of automatic grasp generation and planning for robotic hands, where shape primitives are used in synergy to provide a basis for a grasp evaluation process when the exact pose of the object is not available. The presented algorithm calculates the approach vector based on the sensory input and, in addition, tactile information, which finally results in a stable grasp. The robotic gripper used in this work has only two integrated tactile sensors, which provide too little information to support the calculation of grasping points. These sensors are used only if a potential stick-slip effect occurs.
(Miller et al., 2004) developed the interactive grasp simulator "GraspIt!" for different hands, hand configurations and objects. The method evaluates the grasps formed by these hands. (Xue et al., 2008) use this grasp planning system to obtain an initial grasp by combining hand pre-shapes and automatically generated approach directions. Their approach is based on a fixed relative position and orientation between the robotic hand and the object, so that all contact points between the fingers and the object are found efficiently. A search process then tries to improve the grasp quality by moving the fingers to neighbouring joint positions and evaluating the grasp quality from the corresponding contact points, until a local maximum of the grasp quality is located. (Borst et al., 2003) show that it is not necessary in every case to generate optimal grasp positions; instead, they reduce the number of candidate grasps by randomly generating hand configurations dependent on the object surface. Their approach works well if the goal is to find a fairly good, suitable grasp as fast as possible. (Goldfeder et al., 2007) presented a grasp planner which considers the full range of parameters of a real hand and an arbitrary object including physical and material properties as well as environmental obstacles and forces.
(Saxena et al., 2008) developed a learning algorithm that predicts the grasp position of an object directly as a function of its image. Their algorithm focuses on the task of identifying grasping points that are trained with labelled synthetic images of a different number of objects. In our work we do not use a supervised learning approach. We find grasping points according to predefined rules.
(Bone et al., 2008) presented a combination of online silhouette and structured-light
(Stansfield, 1991) presented a system for grasping
Summarizing, to the best knowledge of the authors and in contrast to the state of the art reviewed above, our algorithm works with
2. Experimental setup
We use a fixed position and orientation between the AMTEC http://www.amtec-robotics.com http://www.ottobock.de http://www.amrose.dk
The laser range scanner records a table scene with a pan/tilt-unit and the stereo camera grabs two images at -4° and +4°. (Scharstein & Szeliski, 2002) published a detailed description of the dense stereo algorithm used. To calibrate the dense stereo data to the laser range coordinate system as exactly as possible, the laser range scanner was used to scan the same chessboard that is used for the camera calibration. In the resulting point cloud a marker was set as a reference point to indicate the camera coordinate system. This calibration gives good results most of the time. In some cases, however, when the scanned objects have little texture and because of the simplified calibration method, the point clouds from the laser scanner and the dense stereo did not overlap correctly, see Fig. 3. To correct this calibration error we used the iterative closest point (ICP) method (Besl & McKay, 1992) with the laser point cloud as the reference, see Fig. 4. The result is a transformation between laser and stereo data, which can now be superimposed for further processing.
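The ICP correction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: it assumes NumPy, small point clouds (so a brute-force nearest-neighbour search is acceptable) and the closed-form SVD solution for the rigid transform. The function names `best_rigid_transform` and `icp` and the parameters `iterations` and `tol` are hypothetical.

```python
import numpy as np


def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t so that R @ src[i] + t ~ dst[i]."""
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    # 3x3 cross-covariance of the centred, paired points
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection: force det(R) = +1
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t


def icp(stereo, laser, iterations=30, tol=1e-9):
    """Align the stereo cloud (N x 3) to the laser reference cloud (M x 3)."""
    R_total = np.eye(3)
    t_total = np.zeros(3)
    moved = stereo.copy()
    prev_err = np.inf
    for _ in range(iterations):
        # Brute-force closest laser point for every stereo point
        d2 = ((moved[:, None, :] - laser[None, :, :]) ** 2).sum(axis=2)
        idx = d2.argmin(axis=1)
        err = np.sqrt(d2[np.arange(len(moved)), idx]).mean()
        # Best rigid motion for the current correspondences
        R, t = best_rigid_transform(moved, laser[idx])
        moved = moved @ R.T + t
        # Accumulate the overall stereo-to-laser transform
        R_total = R @ R_total
        t_total = R @ t_total + t
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    return R_total, t_total, moved
```

Here `laser` is the reference cloud and `stereo` the cloud to be corrected. For realistic cloud sizes the brute-force search would normally be replaced by a spatial index such as a k-d tree.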
3. Grasp point detection
The algorithm to find grasp points on the objects consists of four main steps as depicted in Fig. 5:
Raw Data Pre-processing: The raw data points are pre-processed with a geometrical filter and a smoothing filter to reduce noise and outliers.
Range Image Segmentation: This step identifies different objects based on a 3D Delaunay triangulation, see Section 4.
Grasp Point Detection: Calculation of practical grasping points based on the centre of the objects, see Section 4.
Calculation of the Optimal Hand Pose: Considering all objects and the table surface as obstacles, find an optimal gripper pose, which maximizes distances to obstacles, see Section 5.
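The first step above, raw data pre-processing, can be illustrated with a small sketch. The text does not specify the geometrical and smoothing filters, so the three functions below are assumptions chosen for illustration: a hypothetical workspace crop, a sparse-neighbour outlier test, and a neighbourhood-average smoothing. All names, radii and thresholds are hypothetical.

```python
from math import dist


def crop_to_workspace(points, lower, upper):
    """Geometrical filter (assumed): keep points inside an axis-aligned box."""
    return [p for p in points
            if all(lo <= c <= hi for c, lo, hi in zip(p, lower, upper))]


def remove_sparse_outliers(points, radius, min_neighbours):
    """Drop points that have fewer than min_neighbours other points within radius."""
    kept = []
    for i, p in enumerate(points):
        n = sum(1 for j, q in enumerate(points)
                if j != i and dist(p, q) <= radius)
        if n >= min_neighbours:
            kept.append(p)
    return kept


def smooth(points, radius):
    """Smoothing filter (assumed): replace each point by the mean of its neighbourhood."""
    smoothed = []
    for p in points:
        # The neighbourhood includes the point itself, so it is never empty
        nb = [q for q in points if dist(p, q) <= radius]
        smoothed.append(tuple(sum(c) / len(nb) for c in zip(*nb)))
    return smoothed
```

The quadratic neighbour search keeps the sketch short. It is only meant to show the order of operations, crop first, then outlier removal, then smoothing.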
4. Segmentation and grasp point detection
No additional segmentation step is needed for the table surface, because the red-light laser of the laser range scanner cannot detect the surface of the blue table and the images of the stereo camera were segmented and filtered directly. However, plane segmentation is a well-known technique for ground floor or table surface detection and could be used as an alternative, e.g., (Stiene et al., 2006).
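As an illustration of the plane segmentation alternative mentioned above, the following sketch separates a dominant plane from the remaining points with a basic RANSAC loop. It is not the method of (Stiene et al., 2006) and is not used in this work. The function names, the `threshold`, the number of `iterations` and the fixed `seed` are hypothetical choices.

```python
import random
from math import sqrt


def fit_plane(p, q, r):
    """Plane through three points as (unit normal n, offset d) with n.x + d = 0."""
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    norm = sqrt(sum(c * c for c in n))
    if norm < 1e-12:  # the three points are (nearly) collinear
        return None
    n = tuple(c / norm for c in n)
    d = -sum(n[i] * p[i] for i in range(3))
    return n, d


def ransac_plane(points, threshold, iterations=200, seed=0):
    """Return the largest set of points within threshold of a sampled plane."""
    rng = random.Random(seed)
    best = []
    for _ in range(iterations):
        plane = fit_plane(*rng.sample(points, 3))
        if plane is None:
            continue
        n, d = plane
        inliers = [p for p in points
                   if abs(n[0] * p[0] + n[1] * p[1] + n[2] * p[2] + d) <= threshold]
        if len(inliers) > len(best):
            best = inliers
    return best


def split_table_and_objects(points, threshold):
    """Treat the dominant plane as the table; everything else belongs to objects."""
    table = ransac_plane(points, threshold)
    table_set = set(table)
    objects = [p for p in points if p not in table_set]
    return table, objects
```

Points are expected as tuples `(x, y, z)`. The sketch assumes that the table is the plane supported by the most points in the scene.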
The segmentation of the unknown objects will be achieved with a
In most grasping literature it is assumed that good locations for grasp contacts are at points of high concavity. This holds for human grasping, but for a robotic gripper with limited DOF and only two tactile sensors a stick-slip effect occurs, which makes these grasp points rather unreliable.
Consequently, to achieve a stable grasp, the calculated grasping points should be near the centre of mass of the objects. Thus, the algorithm calculates the centre
With the distances between two neighbouring hull points to the centre of the object
Fig. 7 illustrates that stereo data alone yields better results than laser range data alone, provided that the object surfaces are textured. This is also reflected in Tab. 2. Fig. 8 shows that the results for the overlapped laser range and stereo data differ only slightly from those for the stereo data alone (see Fig. 7), which Tab. 2 confirms.
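The idea from this section of placing grasp points near the object centre with the help of the hull points can be sketched as follows. Since the exact rule is not fully given in the text, the sketch makes several assumptions: the object points are projected to a 2D top view, the centroid of the points stands in for the centre of mass, the first grasp point is the point of the convex hull boundary closest to that centre, and the second grasp point is obtained by mirroring the first through the centre (the symmetry assumption from the problem statement). The names and the `max_opening` parameter are hypothetical.

```python
from math import dist


def centroid(pts):
    n = len(pts)
    return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)


def convex_hull(pts):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(pts))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def closest_point_on_segment(a, b, c):
    """Point on segment a-b that is closest to c."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    len2 = dx * dx + dy * dy
    if len2 == 0:
        return a
    t = ((c[0] - a[0]) * dx + (c[1] - a[1]) * dy) / len2
    t = max(0.0, min(1.0, t))
    return (a[0] + t * dx, a[1] + t * dy)


def grasp_points(pts, max_opening):
    """Return (p1, p2, width, graspable) for a 2D top view of one object."""
    c = centroid(pts)
    hull = convex_hull(pts)
    # One candidate per hull edge, i.e. per pair of neighbouring hull points
    candidates = [closest_point_on_segment(hull[i], hull[(i + 1) % len(hull)], c)
                  for i in range(len(hull))]
    p1 = min(candidates, key=lambda p: dist(p, c))
    # Symmetry assumption: the unseen second contact mirrors the first
    p2 = (2 * c[0] - p1[0], 2 * c[1] - p1[1])
    width = dist(p1, p2)
    return p1, p2, width, width <= max_opening
```

The returned flag corresponds to the check that an object may be too large for the gripper: grasp points are still computed, but the grasp is marked as not feasible when the required width exceeds `max_opening`.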
5. Grasp pose
To successfully grasp an object it is not always sufficient to find the locally best grasping points; the algorithm should also decide at which angle it is possible to grasp the selected object. For this step we rotate the
The grasping pose depends on the orientation of the object itself, the surrounding objects and the calculated grasping points. We set the grasping pose as the target pose for the path planner, illustrated in Fig. 9 and Fig. 1. The path planner, for its part, then tries to reach the target object. Fig. 10 shows the advantage of calculating the gripper pose. The left Figure shows a collision free path to grasp the object. The right Figure illustrates a collision of the gripper with the table.
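The selection of a grasping angle that keeps the hand away from the table and from neighbouring objects can be sketched as follows. This is a simplified illustration, not the collision check used in this work: the hand is reduced to a few sample points along the approach direction, the grasp axis is assumed to be aligned with the x-axis, obstacles are given as plain 3D points, and the candidate `angles`, the `offsets` along the hand and the safety `margin` are all hypothetical values.

```python
from math import cos, sin, radians, dist


def hand_samples(centre, angle_deg, offsets):
    """Sample points of a crude hand model, rotated about the x-axis through centre."""
    a = radians(angle_deg)
    cx, cy, cz = centre
    return [(cx, cy + d * cos(a), cz + d * sin(a)) for d in offsets]


def clearance(centre, angle_deg, obstacles, offsets):
    """Smallest distance between any hand sample and any obstacle point."""
    return min(dist(h, o)
               for h in hand_samples(centre, angle_deg, offsets)
               for o in obstacles)


def best_approach_angle(centre, obstacles, angles=range(0, 181, 15),
                        offsets=(0.05, 0.10, 0.15), margin=0.02):
    """Pick the angle with the largest clearance; None if even that is too tight."""
    best = max(angles, key=lambda a: clearance(centre, a, obstacles, offsets))
    c = clearance(centre, best, obstacles, offsets)
    return (best, c) if c >= margin else (None, c)
```

With only a table below the object, the sketch prefers an approach from straight above (90 degrees in this parametrisation). Adding obstacle points above the object shifts the preferred angle to one side.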
6. Experiments and results
To evaluate our method, we chose ten different objects, which are shown in Fig. 11. The blue lines represent the optimal positions for grasping points. Optimal grasping points are required to be placed on parallel surfaces near the centre of the objects. To challenge the developed algorithm we included one object (Manner, object no. 6) that is too big for the gripper used. The algorithm should calculate realistic grasping points for object no. 6 in the pre-defined range; however, it should recognize that the object is too large for the maximum opening angle of the hand.
In our work, we demonstrate that our grasping point detection algorithm and the validation with a
Open source software, http://public.kitware.com/vtk.
|Calculation Steps||Time [sec]|
|Filter (Stereo Data)||14|
|Smooth (Stereo Data)||4|
|Grasp Point Detection||4.34|
Tab. 2 illustrates the evaluation results of the detected grasping points by comparing them to the optimal grasping points as defined in Fig. 11. For the evaluation every object was scanned four times in combination with another object in each case. This analysis shows that a successful grasp based on stereo data with
We tested every object with four different combined point clouds, as illustrated in Tab. 3. In no case was the robot able to grasp test object no. 6 (Manner), because the object is too big for the gripper used. This could already be determined during the computation of the grasping points, even though the calculated grasping points lie in the defined range of object no. 6. Thus the negative test object, as described in Section 4, was successfully tested.
|No.||Objects||Laser [%]||Stereo [%]||Both [%]|
Tab. 2 shows that the detected grasping points of object no. 2 (Yippi) are not ideal to grasp it. The
For objects such as Dextro, Snickers, Cafemio, etc., the algorithm performed perfectly with a
7. Conclusion and future work
In this work we present a framework to successfully calculate grasping points of unknown objects in
Future work will extend this method to obtain more grasp points in a more generic sense. For example, with the proposed approach the robot could not figure out how to grasp a cup whose diameter is larger than the opening of the gripper. Such a cup could be grasped from above by grasping the rim of the cup. The method is currently limited to convex objects. To handle other types of objects the algorithm must be extended; however, adding more heuristic functions also increases the likelihood of calculating wrong grasping points.
In the near future we plan to use a deformable hand model to reduce the opening angle of the hand, so we can model the closing of a gripper in the collision detection step.