Progressive Underwater Exploration with a Corridor-Based Navigation System

This work addresses the exploration of underwater environments by autonomous underwater vehicles (AUVs) using vision-based navigation. An approach called Corridor SLAM (C-SLAM) was developed for this purpose. It implements a global exploration strategy that first creates a trunk corridor on the seabed and then branches as far as possible in different directions to enlarge the explored region. The system guarantees the safe return of the vehicle to the starting point by means of a metric that relates the corridor lengths to the energy autonomy of the vehicle. Experimental trials in a basin with underwater scenarios demonstrated the feasibility of the approach.

Underwater navigation generally excludes any form of global positioning system of the kind available in other environments, so vision is a sound alternative, provided visibility is good enough. In turbid waters it is still possible, to some extent, to navigate at low altitude over the seabed. However, in these extreme cases, the texture of the seabed may appear voluminous, which implies visual occlusions and a risk of collision. Conversely, in shallow waters, rapid changes in the natural illumination, for example, due to sunlight caustics on the floor, shadows, and flashes, can seriously damage the photometric properties of the images. In such extreme cases, the estimation of the vehicle position can be impaired to the point that the vehicle may be lost.
These characteristics of the underwater environment pose major challenges to the success of navigation, especially if it takes place in unknown regions. These challenges relate not only to robust estimations of the vehicle position and surrounding cartography [11] but also to the ability of the guidance system to avoid possible collisions at low altitudes.
In this work we present an approach for the autonomous exploration of unknown regions of the seabed by means of a navigation system connected to a decision-making process hosted in the submarine. This vision-based approach is called Corridor SLAM (C-SLAM), because it combines SLAM techniques with a strategy to build a corridor over the seabed. The original concept was described in [12]. This document presents a generalization of previous work centered on a robust network of corridors. A variant of the basic concept uses active SLAM for exploring and building a grid map (see recent works [13][14][15]), but these works do not include the concept of path optimality adopted here, which proposes continuous paths with associated viewpoints.
An idea relatively close to the corridor concept can be found in the "teach and repeat" method [16] for following a path in an unfamiliar outdoor environment, such as in the exploration of the surface of Mars by a rover. However, the basic method does not include autonomous exploration, as the first path is defined remotely by a human operator with the aid of a camera at the front of the rover. Once an a priori feasible path has been defined, the vehicle can "repeat" the path that was previously "learned." The main objective of our approach is to make the right decisions to build robust pathways to explore and to configure them in a network. At the same time, the system is able to bring the vehicle back to the starting point from any position in the explored area. The network is robust in the sense that there are preferential directions to explore, chosen so that a successful return is possible. Figure 1 illustrates the complete structure of the proposal for an autonomous vision-based system. It consists of three nested feedback loops, called the adaptive control loop, the reference adjustment loop, and the dense mapping loop. They are described in detail below.

C-SLAM-based autonomous navigation
The adaptive control loop generates the control actions $u$ necessary to achieve the prescribed reference path $\hat{\eta}_{ref}$ and the reference speed $\dot{\hat{\eta}}_{ref}$. Both are provided by the guidance system. The path following will attenuate the influence of sea currents that push the vehicle off the corridor. The adaptive law circumvents the requirement to make available an AUV model, which is generally complex due to its 6 degrees of freedom (see [17,18]). Additionally, the action of marine currents and a variable speed of the AUV make hydrodynamic resistance forces very difficult to estimate and include in the dynamic model. For these reasons, controllers with self-adjustment of coefficients and dynamics adaptation seem to be much more effective than fixed controllers.
In the feedback path of the adaptive control loop, the vision system takes images of the seabed with the vehicle in motion to estimate the position $\eta$ and rate $\dot{\eta}$ of the AUV. The estimates are termed $\hat{\eta}$ and $\dot{\hat{\eta}}$, respectively. The onboard monocular camera takes images in a sequence called $I$. The line of vision is oriented forward with a certain inclination towards the ground. In case of sunlight caustics on the seafloor [19], the quality of the images may be significantly improved by an online deflickering filter [20][21][22]. The filter attempts to retrieve the original photometric properties of the frame. Its output is the frame sequence $I_d$. A feature-based SLAM algorithm estimates the position of the vehicle and a sparse depth map $M_S$ of the floor. Here, characteristics such as the self-similarity of the ground texture and water turbidity will generally hinder self-localization and map estimation. Therefore they are considered as disturbances [23,24].
The reference adjustment loop has the function of reformulating the references $\hat{\eta}_{ref}$ and $\dot{\hat{\eta}}_{ref}$ according to a new scale factor of the estimated relief. This is necessary due to the variability of the estimated sparse map when revisiting the corridor site.
The outer loop of C-SLAM, namely, the dense mapping loop, is dedicated to the construction of the corridor network, providing alternatives for bifurcating the path during the exploration. The feedback consists of the navigation system, a block for estimating a dense map $M_D$ of the terrain, and a block for path planning and bifurcation management. These blocks process the odometry information provided by the inner feedback in order to update the topography of the corridors. The dense map helps the guidance system to implement the future path to follow and to avoid collisions.
A typical navigation route consists in part of stretches between relevant points or landmarks, for example, a stretch that links a starting point with a bifurcation point, one that covers an entire branch, or one between a bifurcation point and a point of no return. During the exploration, the path is created step by step, with the vehicle direction regularly changed towards preferential points that should ensure the continuity of the future path.
According to the main objectives of the exploration, C-SLAM should allow a successful return of the vehicle to the starting point from any distant point of the network. The farthest points are called no-return points. In this case, it is assumed that when the vehicle reaches a point of no return, it still possesses at least half of its energy autonomy [25][26][27]. C-SLAM must therefore continuously estimate the points of no return and ensure that at least one way back exists before the energy autonomy is exhausted.

Corridor net
The network of submarine corridors is a composition of interconnected strips on the seabed that covers a certain explored region and allows navigation between different points of it. Topographically, the strips are sequences of recognizable sites on $M_S$; topologically, they form a branched network made up of nodes and bifurcation points. A node is characterized by a cluster of seabed characteristics. It has a position and a viewpoint associated with the local characteristics. These characteristics are represented by keypoints of the local scene. The position of the cluster is determined by its centroid.
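As a minimal sketch of the node description above, the following Python fragment represents a node by its keypoints and derives its position as the centroid of the cluster. The class name `Node`, its field names, and the use of 2D map coordinates are illustrative assumptions and are not taken from C-SLAM.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]  # assumed 2D coordinates on the sparse map


@dataclass
class Node:
    """Hypothetical container for a node of the corridor network."""
    keypoints: List[Point]   # keypoints of the local scene (the cluster)
    viewpoint_deg: float     # viewpoint associated with the cluster

    def centroid(self) -> Point:
        """Position of the cluster, taken as the mean of its keypoints."""
        n = len(self.keypoints)
        if n == 0:
            raise ValueError("a node needs at least one keypoint")
        sx = sum(p[0] for p in self.keypoints)
        sy = sum(p[1] for p in self.keypoints)
        return (sx / n, sy / n)


node = Node(keypoints=[(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)], viewpoint_deg=15.0)
print(node.centroid())  # (1.0, 1.0)
```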
In the exploration, C-SLAM has to decide at each moment the next step of movement that starts from the current node. Each step consists of the identification of the next node, the direction of its centroid, and the speed of navigation to bridge the gap between nodes. The detection and identification of a node is a process that can involve many steps, during which a node is tracked and classified among other nodes in a merit order list.
In order to choose a suitable node among many possible ones, the vision system evaluates the most outstanding characteristics of the scene and then decides the optimal direction towards the corresponding node. The optimization of this direction results from the maximization of a cost function of the quantifiable requirements. Similarly, the rate is determined based on a previously specified vehicle cruising speed. Thus, the speed can be regulated up and down according to the curvature of the path. This construction procedure is repeated at every step throughout the exploration.
Cluster tracking involves localization on both a scattered local map and a dense map. The estimation of distances between nodes of any network branch is decisive for the planning of revisits according to the available margins of autonomy of the vehicle.
It is worth noting that estimated distances on the maps are simply evaluated on the particular scale of the monocular SLAM method, which does not necessarily coincide with the actual scale. This would not represent an obstacle to C-SLAM as long as the estimated distance between the starting point and any point of no return on the network is exactly related to half the energy autonomy of the vehicle. Some approaches define another type of scaled topology, achieved by means of stereo vision or scanner [8,10,27].
Bifurcation points are commonly distributed along each branch. There are two different sets of bifurcation points. The first set includes divergent points at which the paths physically bifurcate. The second set covers the so-called open bifurcation points, which were identified and ranked in a list of suitable landmarks for a future deviation of the path. The ranking position is assigned in accordance with a criterion of robustness.
While each new corridor is simply added to the network as a new branch, existing corridors are updated on each revisit. Each round trip implies that the vehicle starts with plenty of energy autonomy. This indicates the dimensions of the explored region. Clearly, with the deployment of multiple robots, the scan will be faster [29, 30] but will not necessarily cover a larger region than with this proposal.
Since the camera is monocular, safe return is practically implemented by guiding the vehicle in reverse. This is because the system might not identify all the nodes in the 3D landscape on the way back, i.e., when navigating from the opposite direction. However, with the vehicle moving in reverse, the camera keeps its original orientation, and the system is able to recognize the seabed features on the path straightforwardly. In reverse motion, all nodes are revisited but from back to front.

Construction of a corridor net
Before starting a round trip in the network of corridors, C-SLAM must decide whether to advance the exploration or to revisit explored branches of the network. Commonly, both decisions are involved with the intent to expand the explored region. In addition, the system defines a suitable ground altitude for navigation with the aim of achieving good visibility conditions.
For the next discussion, Figure 2 is taken as support.
The first corridor and each new section of the corridor network are constructed according to an optimum criterion. This criterion takes into account the most outstanding characteristics of the seabed in the form of keypoints (shortened KP), which can be categorically recognized over time and preferably from different points of view.
Autonomous navigation begins with a commissioning phase (CP) to adjust the controller and initialize the SLAM techniques for sparse and dense mapping. To achieve these objectives, the vehicle is forced to travel short distances quickly, zigzagging over the bottom with a rather erratic course. In this way, images can reflect the changes in texture and parallax that are necessary for visual odometry. This, in turn, produces a stable estimation of the vehicle localization at the beginning of the exploration.
Once the odometric data $\eta$ and $\dot{\eta}$ are considered reliable, the adaptation of the controller coefficients is carried out by providing persistent, frequency-rich changes in the reference position $\hat{\eta}_{ref}$ and rate $\dot{\hat{\eta}}_{ref}$ in the main degrees of freedom. To this end, it is convenient for $\hat{\eta}_{ref}$ to continue the previous erratic motion in time. Additionally, the rate $\dot{\hat{\eta}}_{ref}$ is defined as the sum of the cruising rate and a random sequence that causes soft accelerations and brakings along the trajectory. Once the convergence of the controller coefficients has been attained, the exploration starts at the current position. This position is marked as the starting point (shortened SP) and fixed for future trips. Figure 2 (left) illustrates the construction of a corridor path from SP up to an end point (shortened EP), for instance, the estimated no-return point. The system analyzes the texture characteristics in the image sequence in order to detect robust clusters (C). Once they are recognized, the system tracks them as they pass in front of the camera. With a set of robust clusters, the system is able to triangulate the pose of the camera frame by frame.
Usually, raw information for robust cluster detection is provided by the feature-based SLAM method employed in the vision system. For self-localization, the SLAM method constantly defines keyframes (KF) with the most stable keypoints in the image, i.e., with those that have been tracked so far. Thus, C-SLAM has to group the robust keypoints in clusters and finally perform an evaluation of the cluster set. In this way, C-SLAM can define nodes (N) and eventually bifurcation points (BP).
The topology of the corridor network usually contains a diversity of elements that are described in Figure 2 (right), abbreviated SP, EP, KP, KF, C, N, LM, and BP. Additionally, there are trifurcation points (TF), which are created from BPs; confluence points (CO), which occur where two paths meet at a point; outer points (OP), which are detected landmarks or nodes that are unreachable for energy reasons; dead-end points (DP), at which the vehicle is forced to turn due to the absence of Ns or LMs on the horizon; and, finally, crosspoints (CR), at which a path crosses another path and which may or may not be detected by the vision system.
It can frequently happen that, due to the existence of COs and CRs, many possible path alternatives are established to return from any point or to join two distant points in the network. Loops can also be generated during the expansion (see Figure 2, right).
As mentioned previously, the topological space is subject to an unknown and variable cartographic scale that produces cumulative odometric errors. This means that corridors can cross others without this being reflected in the global map. However, this does not invalidate the objective of C-SLAM, which is to return safely to SP.
It is expected that after repeated revisits along different corridors, the number of loop closures carried out by the basic SLAM method will increase and the network will progressively take on a more realistic form. Simultaneously, the scale of the map will become more uniform, and the metric used will increasingly gain in accuracy.
The construction of the network is mainly based on a criterion to define a metric in the topological space. This is discussed below.

Nodes, landmarks, and keyframes
C-SLAM sets up a node to accurately synchronize future movement steps. For example, the direction in which the vehicle is to be driven in the short term in order to reach the next node is precisely defined at the current node. In addition, the speed of the vehicle in this direction is also defined here. All this gives the reference position $\hat{\eta}_{ref}$ and rate $\dot{\hat{\eta}}_{ref}$, which are used as inputs in the adaptive control system. Many other steps are synchronized at the current node as detailed below.
When C-SLAM decides to expand the network with a new branch, it searches for a landmark in the current corridor and then deflects the path in the indicated direction.
Nodes and landmarks are generated in a common optimization process. The optimization involves the maximization of a cost function with one optimum and many suboptimal results. The optimum outcome defines the first place in a ranking list, followed in order by the suboptimal outcomes. Nodes and landmarks are selected from this ranking list.
The direction that defines $\hat{\eta}_{ref}$ and $\dot{\hat{\eta}}_{ref}$ to the next node is given by the optimal result. On the other hand, the best solution in the suboptimal set is reserved for a landmark.
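The selection from the ranking list can be sketched in a few lines of Python. The function name `pick_node_and_landmark` and the representation of the list as a dictionary of cluster scores are hypothetical choices made for illustration only.

```python
from typing import Dict, Optional, Tuple


def pick_node_and_landmark(scores: Dict[str, float]) -> Tuple[str, Optional[str]]:
    """Return (cluster for the next node, landmark candidate).

    The optimum (highest score) gives the direction to the next node;
    the best suboptimal result is reserved as a landmark candidate.
    """
    if not scores:
        raise ValueError("no clusters to rank")
    ranking = sorted(scores, key=scores.get, reverse=True)  # merit order
    best = ranking[0]
    runner_up = ranking[1] if len(ranking) > 1 else None
    return best, runner_up


scores = {"C1": 0.62, "C2": 0.81, "C3": 0.40}
next_node, landmark = pick_node_and_landmark(scores)
print(next_node, landmark)  # C2 C1
```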
Unlike a node whose next direction is defined in each step, a landmark can demand many steps. A landmark is a node whose high position on the merit list ranks it for a successful future branching. Therefore, not all the nodes become landmarks.
Finally, there is a subtle difference between a node and a keyframe generated by any SLAM technique. A node is the most robust cluster in the current keyframe, and its appearance frequency is generally lower than the frequency of keyframes.

Selection of optimal directions and landmarks
During exploration, keyframes are the basis for selecting nodes. For every few keyframes of the sequence generated by the active SLAM algorithm, C-SLAM chooses the current keyframe and defines a node. The frequency of the node appearance depends physically on the visibility conditions and is commonly fixed in the commissioning phase.
On the other hand, the selection of a path direction from the current node to navigate to the next (not yet defined) node follows a particular criterion. The selection of a promising landmark for a future bifurcation is subject to the same criterion.
It is assumed that many keypoint clusters are compared at the same time in the ranking process in order to establish an order of merit. When the score of a particular cluster exceeds a threshold value, it is considered a potential landmark. Clearly, many clusters remain in the field of vision for a time, so that they may be tracked continuously during their latency periods. Some of them will become landmarks.
After a landmark is lost in the field of vision, it will remain inactive with its last score and its location on the map. While in the exploration phase landmarks are generated, in a revisiting period they are consolidated, or even separated. Eventually, a landmark becomes a bifurcation point when C-SLAM decides to branch out to explore new regions.
The proposed criterion is based on a cost that is defined as a linear combination of quantifiable requirements to be met by each cluster in its latency period. Thus

$$V_i = \lambda_1 f_C + \lambda_2 T_C + \lambda_3 \Delta\theta + \lambda_4 \delta^{-1} \tag{1}$$

where $V_i$ is a cost for one keypoint cluster named $C_i$, given the weights $\lambda_j$ with $0 \le \lambda_j \le 1$ and $\sum_j \lambda_j = 1$. The requirements for $C_i$ are described and symbolized as
• High density of keypoints in a cluster: factor $f_C$
• Continuous traceability of a cluster: period $T_C$
• Robustness against the change of points of view: span angle $\Delta\theta$
• Alignment of a cluster on the heading direction: factor $\delta^{-1}$
The cost weights are defined beforehand according to the emphasis placed on particular aspects of robust navigation. Since all requirement variables are normalized to their expected highest values, the trivial choice $\lambda_j = 0.25$ is generally satisfactory. The simple choice $\lambda_1 = 1$ should support continuous navigation; however, in this case zigzagging on the pathway should not be ruled out.
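The weighted sum described above can be illustrated with a short Python sketch. The function name `cluster_cost` and the argument order (density, traceability, span angle, alignment) are assumptions made for this example; the requirement values are assumed to be already normalized to the range 0 to 1.

```python
from typing import Sequence


def cluster_cost(requirements: Sequence[float],
                 weights: Sequence[float] = (0.25, 0.25, 0.25, 0.25)) -> float:
    """Linear combination of normalized requirements for one cluster.

    requirements: assumed order (f_C, T_C, delta_theta, delta_inv).
    weights: lambda_j with 0 <= lambda_j <= 1 and sum(lambda_j) == 1.
    """
    if len(requirements) != len(weights):
        raise ValueError("one weight per requirement is needed")
    if any(not 0.0 <= w <= 1.0 for w in weights):
        raise ValueError("each weight must lie in [0, 1]")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * r for w, r in zip(weights, requirements))


# Equal weights versus the single-weight choice lambda_1 = 1.
print(cluster_cost((0.8, 0.5, 0.4, 0.9)))
print(cluster_cost((0.8, 0.5, 0.4, 0.9), (1.0, 0.0, 0.0, 0.0)))  # 0.8
```

With equal weights every requirement contributes a quarter of the score, whereas with the single weight only the keypoint density decides the ranking.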
Since some detected landmarks are far away from the vehicle position, one should check whether they are attainable with the current autonomy range of the vehicle. If they are unreachable, they are removed from the ranking list or marked as OPs. To this end, a metric is required for the topological space of the corridor network.

Density of keypoints in the cluster $f_C$
The key assumption to calculate the first parameter $f_C$ is that visual features are very volatile in harsh underwater environments. Therefore, dense clusters may be thought to be more resistant to long-term disappearance than dispersed clusters. The first step to determine $f_C$ is to group the keypoints of the current keyframe in clusters. For this, k-means clustering, for example, can be used in combination with an efficient initialization method. For the time $k$ of cluster appearance, the quantity in Eq. (2) is calculated, where $i$ identifies the cluster of the set and $N_{KP}$ is the number of KPs in $C_i$. Thus, the parameter can be determined recursively by Eq. (3). Since the set of selected robust keypoints in the SLAM algorithm is generally sparse, the clustering operation should not be excessively time-consuming.
Statistically, the maximum expected value of $N_{KP}$ can be estimated in the commissioning phase as a function of the harshness of the underwater environment. For example, poor visibility, self-similarity of the ground, sunlight caustics, shadows, flares, etc. are factors that contribute to reducing the number of robust keypoints.
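The grouping step can be sketched in plain Python as follows. This is only an illustration under stated assumptions: farthest-point seeding stands in for the unspecified efficient initialization method, keypoints are treated as 2D points, and the density factor is assumed to be the keypoint count divided by its expected maximum. The recursive update mentioned in the text is not reproduced here. All function names are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def farthest_point_init(points: List[Point], k: int) -> List[Point]:
    """Deterministic seeding (assumed stand-in for an efficient initializer):
    start at points[0], then add the point farthest from the chosen centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers


def kmeans(points: List[Point], k: int, iterations: int = 20) -> List[List[Point]]:
    """Plain Lloyd iterations; returns the keypoints grouped per cluster."""
    centers = farthest_point_init(points, k)
    clusters: List[List[Point]] = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centers[j]))
            clusters[nearest].append(p)
        # Recompute each center as the mean of its members (keep it if empty).
        new_centers = [
            (sum(q[0] for q in c) / len(c), sum(q[1] for q in c) / len(c)) if c else centers[j]
            for j, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return clusters


def density_factor(n_kp: int, max_n_kp: int) -> float:
    """Assumed normalization: keypoint count over its expected maximum, capped at 1."""
    return min(n_kp / max_n_kp, 1.0)


keypoints = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0)]
clusters = kmeans(keypoints, k=2)
sizes = sorted(len(c) for c in clusters)
print(sizes)                                   # [2, 3]
print(density_factor(max(sizes), max_n_kp=6))  # 0.5
```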

Period for continuous traceability $T_C$
The second parameter is related to the traceability of $C_i$. The longer $C_i$ is detected, the more feasible it will be to achieve continuity of navigation and a safe return. The parameter is estimated in each keyframe according to Eq. (4), where $N_{KF}$ is the number of consecutive KFs in which $C_i$ appears at approximately the same position and with a similar density. To assess the similarity of the $N_{KF}$ clusters, a Euclidean norm can be used. In addition, attention should be paid to changes in $f_C$ and in the cluster midpoint. $N_{KF}$ is estimated by thresholding all these variables properly.
Finally, the maximum number of consecutive keyframes depends to some extent on vehicle speed, altitude, camera tilt, and field of vision, among other factors. As a reference for $\max N_{KF}$, one can take the number of KFs required to cover a distance of 10 times the length of visibility in the line of sight.
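The thresholding of position and density changes can be sketched as follows. The tolerances, the tuple layout of an observation, and the normalization of the count by its maximum are assumptions introduced for this example and are not specified in the text.

```python
import math
from typing import List, Tuple

Observation = Tuple[float, float, float]  # assumed layout: (x, y, density f_C)


def consecutive_keyframes(track: List[Observation],
                          pos_tol: float, dens_tol: float) -> int:
    """Count the most recent consecutive keyframes in which the cluster stays
    near the same midpoint (Euclidean distance) with a similar density."""
    if not track:
        return 0
    count = 1
    for i in range(len(track) - 1, 0, -1):
        (x0, y0, f0), (x1, y1, f1) = track[i - 1], track[i]
        moved = math.hypot(x1 - x0, y1 - y0)
        if moved > pos_tol or abs(f1 - f0) > dens_tol:
            break
        count += 1
    return count


def traceability(n_kf: int, max_n_kf: int) -> float:
    """Assumed normalization of the traceability period to the range 0..1."""
    return min(n_kf / max_n_kf, 1.0)


track = [(0.0, 0.0, 0.50), (5.0, 5.0, 0.50), (5.1, 5.0, 0.52), (5.2, 5.1, 0.50)]
n_kf = consecutive_keyframes(track, pos_tol=0.5, dens_tol=0.1)
print(n_kf)                               # 3
print(traceability(n_kf, max_n_kf=12))    # 0.25
```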

Span angle Δθ for viewpoint range
The third parameter $\Delta\theta$ relates the robustness of $C_i$ to the variation of viewpoints. A fairly large angle would be necessary for a cluster to be unambiguously detected from different points of view. $\Delta\theta$ applies in two important cases: first, when clusters are aligned in front of the vehicle and, second, when they are aligned to one side of the vehicle (see the example in Figure 2, left, for landmarks LM1 and LM2). Thus, it results in Eq. (5), where the symbol $\mathrm{span}\{\cdot\}$ means the difference between the maximum and minimum values of the sequence $\theta_j$ during the tracking of $C_i$. In turn, $\theta_j$ is the angle between the line that joins the camera with the cluster midpoint and the camera optical axis.
A large value of Δθ suggests a more robust link between the nodes and, on the other hand, in the case of landmarks, a more successful bifurcation.
The maximal span of $\theta_j$ is set equal to half of the horizontal field of vision of the camera, which corresponds to the tracking of a cluster that is initially seen far away in front of the vehicle and disappears from the field of vision sideways.
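A minimal sketch of the span computation is given below. Dividing the span by half of the horizontal field of vision is an assumption that follows the normalization of the requirement variables mentioned earlier; the function name and the example angles are hypothetical.

```python
from typing import Sequence


def span_angle(thetas_deg: Sequence[float], horizontal_fov_deg: float) -> float:
    """Normalized span of the viewing angles theta_j recorded while tracking a cluster.

    span = max(theta_j) - min(theta_j); the maximal span is taken as half of
    the horizontal field of vision (normalization assumed for this sketch).
    """
    if not thetas_deg:
        raise ValueError("at least one angle is required")
    span = max(thetas_deg) - min(thetas_deg)
    max_span = horizontal_fov_deg / 2.0
    return min(span / max_span, 1.0)


# A cluster first seen almost on the optical axis that drifts sideways.
delta_theta = span_angle([2.0, 8.0, 17.0, 26.0], horizontal_fov_deg=80.0)
print(delta_theta)
```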

Course alienation factor $\delta^{-1}$
The fourth parameter $\delta^{-1}$ plays a key role in the construction of the corridor. When the corridor is built step by step, one would normally prefer to keep the real course along a straight line rather than change the course. This is because the control of the vehicle along a line is generally more precise than on curves. Similarly, revisits on straight stretches are more immune to loop-closing failures than on curved stretches.
Therefore, in the case of nodes, $\delta^{-1}$ should generally produce a higher cost than in the case of landmarks that are reserved for bifurcations. Hence, landmarks are generally created on one side of the path, while nodes lie rather on a straight line.
Since the cost (Eq. (1)) contains terms that must be maximized, $\delta^{-1}$ reflects the opposite of the alienation parameter, here called $\beta$. Therefore, $\delta^{-1}$ should indicate the transverse separation of a cluster from the route, assessed in the horizontal navigation plane. To this end, both the optical axis and the line that joins the camera with the midpoint of the cluster are projected onto the navigation plane. Between these lines, the angle $\beta_j$ is obtained. A third line is defined, which connects the camera with the point furthest to the right of the field of vision. Projecting this line onto the navigation plane results in the angle called $\beta^*$ with the projected optical axis.
Hence, $\delta^{-1}$ results as a geometric mean (Eq. (6)), where $N$ is the number of assessments $j$ during cluster tracking.
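Since the exact expression is not reproduced here, the following Python sketch only illustrates one plausible reading. It assumes that each assessment contributes the term $1 - |\beta_j| / \beta^*$ (1 on the optical axis, 0 at the edge of the field of vision) and that the factor is the geometric mean of these terms. Both the per-assessment term and the function name are assumptions.

```python
import math
from typing import Sequence


def alignment_factor(betas_deg: Sequence[float], beta_star_deg: float) -> float:
    """Geometric mean of assumed per-assessment alignment terms.

    betas_deg: angles beta_j between the projected optical axis and the
               projected line to the cluster midpoint.
    beta_star_deg: angle beta* to the border of the field of vision.
    """
    if not betas_deg:
        raise ValueError("at least one assessment is required")
    terms = [1.0 - min(abs(b) / beta_star_deg, 1.0) for b in betas_deg]
    return math.prod(terms) ** (1.0 / len(terms))


print(alignment_factor([0.0, 0.0], beta_star_deg=40.0))    # 1.0
print(alignment_factor([10.0, 30.0], beta_star_deg=40.0))  # about 0.433
```

Under this reading, a cluster that reaches the border of the field of vision even once obtains a factor of zero, which penalizes lateral clusters strongly.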

Application of the selective criterion
The inclusion of a cluster $C_i$ in the merit order list during its tracking is subject to

$$V_i > V^* \tag{7}$$

where $V^*$ is a global threshold given for the cluster score. The merit order list is updated keyframe by keyframe. Hence, if $C_i$ is already in the list, its score is simply updated.
If $V^*$ is set too low, many landmarks might be created, and the network could become unnecessarily dense. The optimal number of bifurcations per corridor should satisfy a general rule that the distance between two consecutive BPs is spanned by about 20 nodes. On the other hand, if the branches are too scattered, $V^*$ could be too high.
When a new node is created, C-SLAM immediately evaluates the next most promising direction to lead the vehicle into the unknown environment with the goal of exploration. Therefore, the highest score refers to the cluster with which the reference path $\hat{\eta}_{ref}$ should be continued.
In addition, C-SLAM checks whether the second highest score corresponds to an active cluster and, if so, decides to set a landmark. Another, more cautious strategy is to set a landmark when the cluster with the second highest score just disappears from the list.
Landmarks can also be consolidated on the return and in every passage along any explored corridor. On the other hand, the landmark cost may decrease so much that the landmark has to be removed from the list. In this way, landmarks are permanently evaluated up to the moment they are employed to bifurcate the corridor. At this moment, they change their status from LM to BP and disappear from the list forever.
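The bookkeeping of the merit order list can be sketched as follows. The dictionary representation, the function names, and the rule that a cluster is dropped as soon as its score no longer exceeds the threshold are assumptions made for illustration.

```python
from typing import Dict, Set


def update_merit_list(merit: Dict[str, float], cluster_id: str,
                      score: float, threshold: float) -> None:
    """Insert or refresh a cluster score; drop the cluster when its score
    no longer exceeds the global threshold V* (assumed removal rule)."""
    if score > threshold:
        merit[cluster_id] = score
    else:
        merit.pop(cluster_id, None)


def promote_to_bifurcation(merit: Dict[str, float], bifurcations: Set[str],
                           cluster_id: str) -> None:
    """A landmark used for branching becomes a BP and leaves the list for good."""
    merit.pop(cluster_id)  # raises KeyError if the landmark is not listed
    bifurcations.add(cluster_id)


V_STAR = 0.5  # hypothetical threshold value
merit: Dict[str, float] = {}
bifurcations: Set[str] = set()

update_merit_list(merit, "C1", 0.70, V_STAR)  # enters the list
update_merit_list(merit, "C2", 0.40, V_STAR)  # below threshold, ignored
update_merit_list(merit, "C3", 0.60, V_STAR)  # enters the list
update_merit_list(merit, "C1", 0.80, V_STAR)  # score refreshed on a revisit
promote_to_bifurcation(merit, bifurcations, "C3")

print(merit)         # {'C1': 0.8}
print(bifurcations)  # {'C3'}
```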
In a changing environment, the adaptation of Ns, LMs, and BPs after a long period is necessary. This allows a renewal of the corridor network as needed (cf. [7,31]).
During assiduous navigation, the topology of the network is outlined like a tree of branches. Each branch contains its identification in the tree, the sequence of nodes arranged in one direction, the landmarks, the bifurcation points, and any other particular element that should be important for decision-making.

Expansion of the network
Extending the boundaries of the corridor network requires C-SLAM to implement certain policies that go beyond the building of the corridor network.
A first policy implies the definition of an appropriate metric to extend the lengths of the corridor network as much as possible according to the available energy autonomy.
A second policy supports the decision-making to choose bifurcations in order to multiply the number of branches. In this case, revisits are of secondary importance, as they take place only when they are needed as bridges to create new branches.
A third and final policy concerns the optimal scheduling of paths between two points of the network. In a situation with many alternatives to get a connection between sites, the system will search for the one with the best energy efficiency.

Metric
The metric space represented by the corridor network depends on an unknown scale which also changes over time due to cumulative odometric errors. This will affect, above all, the global map, which is the composition of many local maps with their own scale. As scale variations are generally small from map to map, the metrics are similar.
In order to maximize the length of a corridor without compromising the safety of the way back, it is essential to have a reliable metric. Since safety and energy autonomy are closely related, the metric must express the energy margin that the vehicle possesses at any time and position in the network (cf. [24,25]).
It is clear that due to the challenging environment, the energy used to connect two points is often not the same in both directions. This will undoubtedly depend on sea currents, vehicle speed changes, travel breaks, and zigzagging on the reference route. Thus, for example, real-time detection of the no-return points based on the battery state of charge may not be reliable enough in unforeseeable circumstances. For this reason, a more powerful detection of no-return points is developed here on an empirical basis.
To this end, a function for the estimation of the energy margin is proposed. It adds up the energy consumption until the full autonomy is completed. The function can be evaluated at any time, especially in the decision to return.
The idea behind the proposal is that almost all the energy available in navigation is intended to move the vehicle. Therefore, an approach that is based solely on the energy of motion along the travelled path seems to be quite rational.
First, the consumption of energy is defined as a set of possible cases (Eq. (8)), where $\{\cdot\}$ is a symbol for a set of cases and $\int_{C_i} \cdot \, d\eta$ describes a curvilinear integral along the path $i$ that spans the way from the SP up to the point where the autonomy is completed (see Figure 3 to support the concept). Similarly, the distance covered by the vehicle on this path $i$ until all available energy is consumed is defined as a set of cases (Eq. (9)). It should be noted that the routes may take any shape, both straight and curved. Figure 3 shows numerical simulation results in which the distance travelled by a vehicle over different paths is computed after accomplishing full autonomy. In the experiments the vehicle travels at a cruise velocity that is slightly perturbed at random around its nominal value. The resulting ground truth statistics must be available before applying C-SLAM but are updated during the construction of the corridor network.
In order for C-SLAM to decide the time point for the comeback on a new corridor, a norm based on the average of the sets in Eqs. (8) and (9) is applied. The Euclidean norm can be used to obtain the energy limit, called $E_{EP}$, and the travelled-distance limit, called $d_{EP}$, subject to the conditions for safe navigation on the path $i$, where equality in either of the two conditions identifies the end point $EP_i$ for this path. $N$ is the number of trips in the statistics that cover a distance $d_i$ for any route from SP. It is very important to frame the confidence in the norm in terms of the risk of losing the vehicle. In this sense, the most unfavorable case might be more appropriate. In that case, the limits are stated for the most unfavorable case, and the conditions for the detection of a no-return point follow accordingly, where again equality in either of the two conditions identifies the end point $EP_i$ for this path $i$.
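The detection logic can be illustrated with a simplified Python sketch. It does not reproduce the norms of the text: as loudly stated assumptions, the limits are taken here as one half of either the mean or the minimum (most unfavorable case) of the recorded full-autonomy energies and distances, the factor one half reflecting the need to keep half of the autonomy for the way back. All names and numbers are hypothetical.

```python
from statistics import mean
from typing import Sequence, Tuple


def limits(energies: Sequence[float], distances: Sequence[float],
           worst_case: bool = False) -> Tuple[float, float]:
    """Return (E_EP, d_EP) from the statistics of earlier trips.

    Assumption: half of the mean (or of the minimum, in the most
    unfavorable case) of the recorded full-autonomy values.
    """
    pick = min if worst_case else mean
    return 0.5 * pick(energies), 0.5 * pick(distances)


def is_no_return_point(consumed_energy: float, travelled_distance: float,
                       e_ep: float, d_ep: float) -> bool:
    """The end point is reached when either limit is attained."""
    return consumed_energy >= e_ep or travelled_distance >= d_ep


energies = [100.0, 90.0, 110.0]     # hypothetical energy per full-autonomy trip
distances = [400.0, 380.0, 420.0]   # hypothetical distance per full-autonomy trip

average_limits = limits(energies, distances)
worst_limits = limits(energies, distances, worst_case=True)
print(average_limits)  # (50.0, 200.0)
print(worst_limits)    # (45.0, 190.0)
print(is_no_return_point(46.0, 150.0, *average_limits))  # False
print(is_no_return_point(46.0, 150.0, *worst_limits))    # True
```

The example shows the more conservative behavior of the worst-case limits: the same vehicle state triggers the return earlier than with the average-based limits.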
In order to continuously adapt the norm to the environment, the set (Eqs. (8) and (9)) must be updated on each round trip. It can be noticed that the update can be applied both in a new corridor and in a revisiting case as well, regardless of the path chosen.

Planning of new corridors
The way in which C-SLAM progressively explores and increases branches can be supported by different criteria.

Method I
The first method is based on the criterion that, for a fast expansion of new branches, revisits of old branches should be reduced as much as possible. To this end, the number of revisited stretches in old branches and their distances to SP should be minimized during the sequence of round trips. Here, the rectification of the path $\int_{SP}^{BP_i} d\eta$ is employed. It is evident that to reach a certain BP on an old branch, the vehicle must unavoidably revisit some first stretches on this branch. Figure 4, right and middle, illustrates a case study of an expanded network in two different forms.
Once the trunk corridor is created, the proposed method starts from SP and searches for the closest LM and splits into a new branch until it culminates in an EP. At this stage, this LM becomes a BP. In doing so, the approach minimizes the length of all revisited stretches. Any other branch generation is suboptimal. The worst suboptimal generation, on the contrary, expands the network by starting each round trip from SP to the most distant LM. In fact, this option produces a maximization of the revisited stretches.
Although the optimal branch generation expands the network faster than other options, all methods end up with the same number of revisited stretches. This fact is reflected in Figure 4 (right). Therefore, from the point of view of the total energy consumed, the difference matters only in the short and medium term.
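The convergence of both extremes can be checked with a small numerical sketch. It assumes a strongly simplified model that is not taken from the original work: all landmarks lie on the trunk corridor, each one is used for exactly one branch, and reaching a branching point at distance d from SP costs a revisited length of 2d (way out and way back). The landmark distances are made-up values.

```python
from itertools import accumulate


def revisited_per_trip(landmark_distances, order):
    """Revisited trunk length on each round trip for a given branching order."""
    # Reaching a BP at distance d from SP re-travels the stretch SP -> BP twice.
    return [2.0 * landmark_distances[i] for i in order]


# Hypothetical distances of four landmarks from SP along the trunk corridor.
landmarks = [2.0, 5.0, 9.0, 14.0]

nearest_first = sorted(range(len(landmarks)), key=lambda i: landmarks[i])
farthest_first = list(reversed(nearest_first))

# Cumulative revisited length after each round trip.
optimal = list(accumulate(revisited_per_trip(landmarks, nearest_first)))
worst = list(accumulate(revisited_per_trip(landmarks, farthest_first)))
```

In this model the nearest-first curve stays below the farthest-first curve on every intermediate trip, and both reach the same total once every landmark has been branched.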

Method II
The optimal solution described above has a theoretical rather than a practical value. The disadvantage of method I is that the way BPs are used has no connection with the merit order list. For that reason, vehicle safety might be compromised. When making decisions about where to branch, trust in the list is the only support for secure expansion.
Therefore, the main focus of the second method is to build up a solid network in which the path from every $EP_i$ to SP maximizes the trust in the BPs on the pathway. This means that the LMs chosen for bifurcation have the maximal score on this pathway. The number of revisited stretches over time is described by a curve that lies between the two extreme curves in Figure 4 (right). This means that in the long term, the new curve will also converge to the same extreme value.
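A minimal sketch of this selection rule is given below. The function name `pick_branching_point`, the landmark labels, and the score values are hypothetical; the scores stand in for the entries of the merit order list, whose exact computation is not reproduced here.

```python
def pick_branching_point(pathway, scores, used=()):
    """Return the landmark with the highest trust score on the pathway that is not yet a BP."""
    candidates = [lm for lm in pathway if lm not in used]
    if not candidates:
        return None  # every landmark on this pathway already carries a branch
    return max(candidates, key=lambda lm: scores[lm])


# Hypothetical pathway from SP with made-up trust scores.
pathway = ["LM1", "LM2", "LM3"]
scores = {"LM1": 0.4, "LM2": 0.9, "LM3": 0.7}

first_bp = pick_branching_point(pathway, scores)
second_bp = pick_branching_point(pathway, scores, used={first_bp})
```

Unlike Method I, the distance of the chosen landmark to SP plays no role in this rule, which is why the resulting curve of revisited stretches falls between the two extremes.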
Many other strategies can be applied besides these two methods. For instance, one could be specifically interested in developing the branches to the left (or to the right) of the trunk line or in creating solid statistics for the newly explored branch before continuing with the exploration.

Path scheduling
In a fully interconnected corridor network, there may be several alternatives for connecting two distant points (see, for example, in Figure 3 (right), the bypass formed by the paths BP-CO and BP-TP-CO). In such situations, C-SLAM should be able to find the path with the best energy efficiency. For that purpose, C-SLAM must have odometric information for each stretch and the energy consumption cost per unit of length.
Once the expansion of the network has converged, and especially after a large number of revisits, bypasses and loops are commonly present. These elements then become part of the topology of the network.
In the case of one-way corridors, a bypass can only exist in the presence of COs or CRs (see Figure 5, left). On the other hand, the generation of a loop necessarily requires at least one CR (see Figure 5, right). It is worth noting that a loop represents a dummy alternative for connecting a point with itself, as illustrated in Figure 5 (right).
As topological elements of the network, loops are marked to avoid unnecessary energy consumption. For example, in Figure 5 (right) when a loop is encountered, the path entering the loop is avoided unless the destination is a BP allowing exit from the loop. For the optimization of paths, the criterion consists in searching for the shortest path between two points or for the path that minimizes the energy consumption, or a combination of both. If safety aspects are emphasized, the energy approach is the most suitable for C-SLAM. This option is described in the following.
To find the optimal path, the cost functional in Eq. (16) is minimized for a path starting at node $N_A$ and ending at node $N_B$, where $C_k$ is one of the paths that connect $N_A$ and $N_B$ and the pair $N_i$ and $N_{i+1}$ are the terminal nodes of a stretch in $C_k$. The energy information on the stretch from $N_i$ to $N_{i+1}$ is an average that is updated each time the vehicle passes over this section. For the minimization in Eq. (16), BPs and COs, even dead-end points, are taken into account along with all nodes. Thus, the minimal value of the sum over every stretch that links $N_i$ and $N_{i+1}$ of $C_k$ represents the optimal path, denoted by $C_k^*$. To solve Eq. (16), the A* search algorithm has been chosen in this work over other tree search algorithms because of its optimal efficiency [32]. The cost is therefore $f(N_{i+1}) = g(N_{i+1}) + h(N_{i+1})$, where $g$ is the cost of the path from $N_A$ to $N_{i+1}$ and $h$ is a heuristic function that estimates the cost of the cheapest path $C_k^*$ from $N_{i+1}$ to $N_B$. A* ends when the goal is reached or when there are no paths eligible to be extended. In this work, the cost functions $g$ and $h$ are selected according to two terms, one for each function. The heuristic function $h(N_{i+1})$ is admissible provided that paths with one or more sections travelled in both directions are excluded from the minimization. Thus, branches that do not represent a bypass to the branch on which $N_B$ is located are discarded. Loops are also avoided when used to connect a node to itself.
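The search can be illustrated with a compact A* sketch over a small corridor network. Several elements are assumptions and not part of the original work: the function name `a_star_energy`, the directed edge dictionary with averaged stretch energies, the node coordinates, and in particular the heuristic, which is taken here as the straight-line distance to the goal multiplied by a lower bound on the energy per unit of length. The node labels reuse the abbreviations of the text (SP, BP, TP, CO, EP), but the numbers are made up.

```python
import heapq
import math


def a_star_energy(edges, coords, start, goal, energy_per_length):
    """A* over one-way stretches; edges maps node -> {next node: mean stretch energy}."""
    def h(node):
        # Assumed heuristic: straight-line distance times a lower bound on energy per length.
        (x1, y1), (x2, y2) = coords[node], coords[goal]
        return energy_per_length * math.hypot(x2 - x1, y2 - y1)

    open_heap = [(h(start), 0.0, start, [start])]  # entries are (f, g, node, path)
    best_g = {start: 0.0}
    while open_heap:
        _, g, node, path = heapq.heappop(open_heap)
        if node == goal:
            return g, path
        if g > best_g.get(node, math.inf):
            continue  # stale entry, a cheaper route to this node was found later
        for nxt, energy in edges.get(node, {}).items():
            g_new = g + energy
            if g_new < best_g.get(nxt, math.inf):
                best_g[nxt] = g_new
                heapq.heappush(open_heap, (g_new + h(nxt), g_new, nxt, path + [nxt]))
    return math.inf, []  # no path eligible to be extended


# Hypothetical network: the direct stretch BP-CO is costly, the bypass BP-TP-CO is cheaper.
edges = {
    "SP": {"BP": 4.0},
    "BP": {"CO": 9.0, "TP": 3.0},
    "TP": {"CO": 4.0},
    "CO": {"EP": 2.0},
}
coords = {"SP": (0, 0), "BP": (4, 0), "TP": (6, 2), "CO": (8, 0), "EP": (10, 0)}

cost, path = a_star_energy(edges, coords, "SP", "EP", energy_per_length=1.0)
```

In this example the bypass through TP is selected because its accumulated energy is lower than that of the direct stretch. Since the edges are directed, no stretch is travelled in both directions, which is in line with the admissibility condition stated above; the heuristic stays admissible only as long as `energy_per_length` does not overestimate the true cost per unit of length.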

Case study
This section shows some experimental results selected to illustrate the viability of the proposal for autonomous navigation. Many of the functions of the approach are tested together according to the C-SLAM structure in Figure 1. It is important to note that the case studies are carried out in limited spaces with simulation of light effects and seabed textures. However, the staged environments reflect quite well the difficulties encountered in image processing in real underwater scenarios.

Environment
In order to provide a ground truth for testing and to achieve acceptable reproducibility of results, many scenarios were set up in a basin with a staged underwater landscape that closely resembled the natural seabed. Rocks, gravel, sandbanks, and benthos, among others, predominated in this scenario, in which a variety of underwater visual effects could be obtained [19,23] (see Figure 6).
The bioactivity of microorganisms was cyclically changing the characteristics of soil texture and water transparency. The scenarios were illuminated by direct and indirect sunlight. However, light disturbances and turbidity were controlled for the range of tests. For instance, strong or weak sunlight flickers on the ground were generated by agitating the water in two orthogonal directions. In addition, visibility was reduced by discharging silt particles onto the surface that remained suspended for a period of time.
Therefore, a wide variety of case studies could be faithfully reproduced, such as rapid changes in luminance, transition from dark to bright scenarios, blurriness, lens flares, motion blur, self-similarities, glare, and bubbles (see Figure 6).

Hardware
An ROV (model OpenROV v2.8) was used as the platform for the experiments, although it was hydrodynamically modified with side fins to reduce the motion blur.
Two independent cameras were installed on board: a high-resolution, wide-angle vehicle camera (Genius f100) and a high-performance camera (GoPro Session H4). Both cameras are rolling shutter and operate at different frame rates. To attenuate undesired effects of the rolling shutter mechanism, especially in photometric-based algorithms, a high frame rate of 120 fps with an image size of 848 × 480 pixels was used for dense mapping, albeit off-line at the end of every round trip.
The vehicle was completely steered by C-SLAM, which was programmed and installed on a notebook with GPGPU technology. The bi-directional flow of video and control signals was implemented via cable. The altitude was conveniently fixed in advance to adapt to the visibility of the environment. Altitude control was carried out independently of the adaptive controller by a PI controller. An adaptive speed-gradient controller [17] was used to self-adjust the controller coefficients for the main dynamics of the ROV.

Software
Among other functions, C-SLAM uses state-of-the-art free software for localization and mapping. As shown in Figure 1, the block for feature-based SLAM was implemented with the algorithm ORB-SLAM [33]. On the other hand, the block for dense mapping was implemented with the photometric-based DSO-SLAM [34] and sometimes with LSD-SLAM [35,36]. However, there was a significant modification in the implementation, namely, DSO (or LSD-SLAM) was supported with the estimated camera position provided by ORB. This was generally necessary to improve the stability of the mapping in very harsh environments, especially in the presence of strong sunlight caustics and/or poor visibility [23]. The deflickering filter in this work is based on estimations of sunlight caustic fringes using a feedback of predicted images [20], which employs very high accuracy velocity estimation [38]. Software for guidance, control, and corridor construction was developed specially for C-SLAM. Figure 7 illustrates the role the deflickering filter plays in the improvement of dense mapping in environments with strong sunlight caustics. Generally, spatiotemporal light changes seriously affect the performance of photometric-based methods. As seen in the heat maps, the camera depths are coherent with the physical environment. Thus, the photometric consistency is preserved after image deflickering.

The situation is totally different in the case of ORB-SLAM, where light disturbances do not affect ORB to the same degree as DSO (or LSD-SLAM) (see Figure 8). However, from several experiments, it was concluded that in turbid waters, robust texture features decrease substantially, although the overall performance of ORB does not degrade as much as in the case of DSO (see [8]). Therefore, the filtering of caustic sunlight waves can be omitted in the case of ORB, but not in the case of DSO (or LSD-SLAM). For this reason, the deflickering filter in the C-SLAM structure in Figure 1 is necessary only for dense mapping purposes. Another conclusion was drawn from navigation in self-similar terrains staged in the basin, as in the third picture to the right in Figure 8. These terrains commonly provide numerous features in a similar cluster pattern, but ORB-SLAM often loses track because it is unable to deal with nearby similarities.
An important stage at the beginning of C-SLAM navigation is the start-up phase for initializing the SLAM algorithms and the adaptive controller parameters. To this end, in the study, the vehicle movements were performed manually, providing a zigzag path of the camera. Figure 9 illustrates the initial process of building a dense map of the environment with an adequate camera trajectory. From there, the start of the exploration was supplied with good estimates of vehicle position and speed, which in turn allowed the adaptive controller to adjust its coefficients.
Dense methods suffer, more than any other class, from odometric errors [8], which are minimized in the particular case of DSO through a very cumbersome and thorough camera calibration process. In these trials, the combination of direct methods for mapping with ORB-SLAM for tracking increases the accuracy of the global map of the corridor network, even in the case of normal rolling shutter cameras as used here. Figure 10 illustrates these results in one experiment with a well-textured floor.
Figure caption: Dense mapping in two cases after image deflickering. Images with sunlight caustics (left), deflickered images (center) [14], and heat maps for camera depth (right).
The limited space and relatively short time span for the experiments prevented the use of an energy autonomy metric beyond the context of numerical computer simulations. In this sense, end points located sideways must be at a certain limited distance from the trunk corridor. The frequency of occurrence of nodes was synchronized with the keyframe generation according to a simple strategy, namely, "one keyframe, one node." The direction to the next node was optimized primarily by maximizing the density of cluster keypoints near the front line of vision to avoid zigzagging of the vehicle. In this way, branches could be extended to a relatively significant length.
An important feature of the monocular C-SLAM is the vehicle return through the corridor in reverse motion, as seen in the display provided by ORB-SLAM by means of a symbol indicating the direction of movement (see Figure 11, top). In addition, the track link that exists when the algorithm identifies a connection between nodes is drawn as green segments between them (see Figure 12, left).
Reverse movements often caused motion blur due to pulling and cable drag on the floor [37]. Besides, the drag of the ROV backside is much more pronounced than in the forward displacement. All this made it necessary to reduce the cruise velocity to minimize large heading perturbations that would cause track loss.
In the following, two case studies illustrate the C-SLAM performance under these circumstances. Figure 11 shows a corridor network that was constructed under the criterion of fast expansion of new branches, i.e., minimizing revisits in the short and medium term. Here, three bifurcation points were implemented in four steps. First, the system creates the trunk corridor and returns to SP. Accordingly, in the next round trip, the vehicle is led to the nearest BP and forced to bifurcate to create the first branch, and thereafter it returns to SP again. This routine was repeated for the second and third BP.
It is noticeable that the way back through the trunk corridor from BP3 to SP does not completely agree with the way out. However, the position track was never lost, which demonstrates the robustness of the system against changes of viewpoint. The differences in trajectories were due to many causes, including control tracking errors, cable tugs, and drag disturbances [37].
The diversity of terrain texture shown in Figure 11 (bottom) and its impact on the performance and robustness of C-SLAM are noticeable. For instance, at BP1, the terrain is bulky, so the contrast is high and the keypoints are robust. On the other hand, at BP2 and BP3, the terrain was acquiring an increasingly self-similar appearance, and the keypoint clusters were becoming increasingly volatile. In some trips in reverse through this self-similar zone, the vehicle briefly lost its tracked position.
Figure 12 illustrates a more sophisticated experiment in which branches were allowed to occur on both sides of the trunk corridor. Over time, in addition to the formation of branches, confluences of paths and way crossings also appeared.
The picture on the left shows the network from above and the interconnection of nodes. The picture on the right, in turn, displays the network nodes and keypoints produced during the multiple round trips. It is observed that the density of keypoints is significantly higher in the right zone than in the left zone of the explored area. This is due to the marked self-similarity of the terrain, which dispersed the keypoints over the terrain, while in the left zone, the bulky elements concentrate the keypoints around their peripheries.
Figure 11. Corridor network construction with three branches (top). Camera-taken scenes at the corresponding SP, BPs, and EPs (bottom).
The trunk corridor was long enough to encompass areas of different textures. Light disturbances were moderate; most of them were transitions from dim to bright scenes.
The sequence of the round trips can be followed by the order of occurrence of EPs and BPs. The strategy of expanding the network is the same as in the case before, i.e., create the trunk corridor, return to SP, annex a new branch, and begin a new round trip.
The most remarkable event in this experiment occurred on a stretch in front of EP1, in the self-similar zone, where C-SLAM briefly lost the vehicle location. Afterwards the vehicle was led in the direction of EP1 until it crossed the trunk corridor. This CR was recognized by the system and added to the network. From this point, the vehicle could return in reverse to a point that was also recognized by the system as a CO. Finally, after the sixth branch, C-SLAM completed the exploration by providing a global map of the corridor network.

Conclusions
This work deals with the autonomous navigation of underwater vehicles with the aim of achieving a broad exploration of the seabed. Unlike autonomous navigation systems in aerial, terrestrial, and space applications, for which the main source for localization is a GPS system, this approach essentially relies on a monocular camera as its only sensor.
To establish the position of the vehicle, the proposed vision system takes advantage of the texture characteristics of the seabed. Since underwater scenarios are generally very diverse and harsh due to water turbidity and light disturbances, vision-supported navigation poses a huge challenge for safe autonomous exploration. In particular, the lack of transparency in water forces the control system to lower the altitude above the seabed, which in turn demands a high degree of maneuverability to avoid collisions. This work describes a vision-based system named C-SLAM that provides a nested loop structure to integrate control, guidance, navigation, and route planning systems into one. To explore the seabed, C-SLAM implements a strategy based on active SLAM. In contrast to many methods in the literature that employ topological models of the environment like feature graphs [13], Bayes trees [14], or grid maps [15], C-SLAM represents the environment as a network of interconnected corridors made up of nodes and bifurcation points. It is claimed in this work that this simple topology is highly adequate and robust for harsh underwater environments with self-similarities, strong spatiotemporal light perturbations, and lack of water transparency [23].
Compared to other optimization techniques proposed for underwater applications such as random trees [28], particle swarming [38], octrees [39], level setting [40], and genetic background [41], among others, this proposal has a significant level of novelty. It radially searches in the field of vision for the optimal direction to explore. This direction must satisfy quantifiable requirements for the navigation. The requirements are integrated in a weighted linear combination that is maximized step by step. This produces an active scoring list of promising points for the selection of nodes and future bifurcation points, providing adaptation to the environment and robustness of the node sequence.
The heuristic of the proposal has similarities with the visual teach and repeat method for long-range autonomy [16]. Therein a pathway is created in an unfamiliar outdoor environment, at first by teleoperation, and then followed by a rover. However, C-SLAM is essentially designed for autonomous exploration underwater and deals with a dense exploration in 2D. Besides, the submarine navigates basically on a plane over the seabed, observing topographic features from above rather than travelling over the terrain. So, comparatively, C-SLAM has the advantage of choosing the optimal route in order to be able to safely return to the start point. Another difference is that C-SLAM can cope with the lack of map scale and with odometric errors and still ensure the vehicle return.
In order to expand the explored region as far as possible, a suitable metric in the non-scaled topological space is defined in relation to the vehicle energy autonomy.
The key idea is to lead the vehicle along a corridor just up to the no-return point in order to make the return trip safe. In contrast to other approaches that solve the problem in a rather complex form, for instance, employing dynamic programming on bathymetric and sea current maps [26,27], C-SLAM proposes a statistics-based connection of odometric information and energy consumption.
The approach was experimentally tested at reduced scale in a basin, in which subaquatic environments with a good resemblance to a real seabed were staged. Experimental trials have demonstrated the feasibility of the approach for future applications where an autonomous underwater vehicle can host the C-SLAM vision system for large-scale underwater exploration. With the exception of extremely turbid environments, in practically all other cases C-SLAM was able to make correct decisions to create and expand the underwater corridor network in a stable manner.