Understand the classification, typical applications and domestic development status of SLAM navigation technology in one article
SLAM is the abbreviation of simultaneous localization and mapping. It is mainly used to solve the problems of localization and map construction for robots moving in an unknown environment.
Typical application areas of SLAM
Robot positioning and navigation: map modeling. SLAM can assist robots with tasks such as path planning, autonomous exploration, and navigation. Robot vacuums from domestic brands such as Ecovacs and Tammy, as well as the latest Lanbao model, combine SLAM algorithms with lidar or cameras to draw indoor maps efficiently and to analyze and plan the cleaning area intelligently, which places them among intelligently navigating products.
VR/AR: assisting visual enhancement. SLAM can build a map with a more realistic visual effect, so that virtual objects can be rendered and superimposed according to the current viewpoint, making them look realistic and free of any sense of incongruity. Among representative VR/AR products, Microsoft HoloLens, Google Project Tango, and Magic Leap all use SLAM as a means of visual enhancement.
Drones: map modeling. SLAM can quickly build a local 3D map and, combined with geographic information systems (GIS) and visual object recognition, can help UAVs identify obstacles, avoid them automatically, and plan paths.
Autonomous driving: visual odometry. SLAM can provide visual odometry and be fused with other positioning methods such as GPS to meet the precise positioning needs of unmanned vehicles. For example, Google’s lidar-based self-driving car and the Wildcat self-driving car modified by Oxford University’s Mobile Robotics Group in 2011 have both been successfully tested on the road.
SLAM system composition
A SLAM system is generally divided into five modules: sensor data, visual odometry, back end, mapping, and loop closure detection.
Sensor data: collects the various types of raw data from the actual environment, including laser scan data, video image data, and point cloud data.
Visual odometry: estimates the relative position of the moving target at different times, using algorithms such as feature matching and direct registration.
Back end: optimizes away the cumulative error introduced by visual odometry, using algorithms such as filtering and graph optimization.
Mapping: constructs the 3D map.
Loop closure detection: eliminates the error accumulated over space.
The workflow is roughly as follows:
After the sensor reads the data, the visual odometry estimates the relative motion (ego-motion) between two moments, and the back end handles the cumulative error in the visual odometry estimates. Loop closure detection compares images of the same scene taken at different times and provides spatial constraints that eliminate the accumulated error.
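To make the odometry step of this workflow concrete, the following minimal Python sketch chains relative motions into a trajectory on a plane. The function names (`compose`, `integrate`), the square path, and the 2-degree heading bias are all assumptions chosen for illustration and are not taken from any particular SLAM system.

```python
import math

def compose(pose, delta):
    """Apply a relative motion (dx, dy, dtheta), given in the robot frame, to a global pose."""
    x, y, theta = pose
    dx, dy, dtheta = delta
    return (
        x + dx * math.cos(theta) - dy * math.sin(theta),
        y + dx * math.sin(theta) + dy * math.cos(theta),
        theta + dtheta,
    )

def integrate(deltas, start=(0.0, 0.0, 0.0)):
    """Chain relative motions frame by frame, as an odometry front end would."""
    poses = [start]
    for delta in deltas:
        poses.append(compose(poses[-1], delta))
    return poses

# Ground truth: drive a 1 m square (move 1 m forward, then turn 90 degrees left, four times).
true_deltas = [(1.0, 0.0, math.pi / 2)] * 4
# Hypothetical odometry estimates with a 2-degree heading bias on every step.
biased_deltas = [(1.0, 0.0, math.pi / 2 + math.radians(2.0))] * 4

true_end = integrate(true_deltas)[-1]
est_end = integrate(biased_deltas)[-1]
drift = math.hypot(est_end[0] - true_end[0], est_end[1] - true_end[1])

print("true end position:     ", true_end[:2])
print("estimated end position:", est_end[:2])
print("position drift (m):    ", drift)
```

The true path returns to its starting point, while the estimated path does not. That gap is the kind of accumulated error that the back end and loop closure detection are meant to correct.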
SLAM loop closure detection
In the visual SLAM problem, pose estimation is usually a recursive process: the pose of the current frame is calculated from the pose of the previous frame, so the error is passed on frame by frame. This is what is called cumulative error.
Suppose every pose constraint is established with the previous frame. The pose error of the fifth frame then contains the accumulated errors of the four preceding constraints. However, if we find that the pose of the fifth frame does not have to be derived from the fourth frame and can also be calculated from the second frame, the calculation error will clearly be much smaller, because only two constraints are involved. Establishing a pose constraint with an earlier frame in this way is called a loop closure. Loop closure reduces the cumulative error by reducing the number of chained constraints.
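The five-frame example can be sketched as a tiny one-dimensional pose graph solved by least squares. Everything in this sketch is an assumption made for illustration: the true positions, the 0.1 m overestimate on each odometry measurement, the equal weighting of all constraints, and the helper names `solve_linear` and `optimize`. Frames 1 to 5 in the text correspond to indices 0 to 4 in the code.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def optimize(num_poses, constraints):
    """Least-squares 1D pose graph with pose 0 fixed at 0.

    constraints: list of (i, j, z), where z is the measured value of x_j - x_i.
    """
    n = num_poses - 1  # unknowns are poses 1 .. num_poses - 1
    H = [[0.0] * n for _ in range(n)]
    g = [0.0] * n
    for i, j, z in constraints:
        # Residual is x_j - x_i - z; its Jacobian is -1 at pose i and +1 at pose j.
        for a, sa in ((i, -1.0), (j, 1.0)):
            if a == 0:
                continue  # pose 0 is fixed, so it contributes no unknown
            g[a - 1] += sa * z
            for b, sb in ((i, -1.0), (j, 1.0)):
                if b != 0:
                    H[a - 1][b - 1] += sa * sb
    return [0.0] + solve_linear(H, g)

# Assumed true positions: the platform moves out and comes back to where frame 2 was taken.
true_x = [0.0, 1.0, 2.0, 1.5, 1.0]
# Hypothetical odometry: every frame-to-frame measurement is 0.1 m too large.
odometry = [(0, 1, 1.1), (1, 2, 1.1), (2, 3, -0.4), (3, 4, -0.4)]
# Loop closure: frame 5 is recognized as the same place as frame 2.
loop = [(1, 4, 0.0)]

chained = optimize(5, odometry)
closed = optimize(5, odometry + loop)

print("frame 5 error, odometry only:    ", abs(chained[4] - true_x[4]))
print("frame 5 error, with loop closure:", abs(closed[4] - true_x[4]))
```

With the loop closure added, the estimate for the last frame depends mainly on the short path through frame 2 rather than on the full chain, so its error shrinks. It does not vanish, because the constraint between frames 1 and 2 still carries its own error.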
How can the similarity of two frames be judged? The most intuitive approach is feature matching. However, feature matching is very time-consuming, and loop closure detection has to compare the current frame against all past keyframes, so the cost quickly becomes prohibitive. The bag-of-words model was therefore proposed to speed up the comparison.
The bag-of-words model treats features as words and judges whether two pictures belong to the same scene by comparing how consistent the words in the two pictures are. To classify features into words, a dictionary must be trained. The dictionary contains the set of all possible words, and to make it general it needs to be trained on a large amount of data.
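The idea can be illustrated with a minimal Python sketch. The four-word dictionary and the two-dimensional "descriptors" below are toy values invented for illustration. A real dictionary is trained on a large data set, commonly by clustering high-dimensional feature descriptors, and real systems usually add word weighting that this sketch leaves out.

```python
import math

def nearest_word(descriptor, dictionary):
    """Return the index of the dictionary word closest to the descriptor."""
    return min(range(len(dictionary)), key=lambda k: math.dist(descriptor, dictionary[k]))

def bow_vector(descriptors, dictionary):
    """Count how often each word occurs in a frame and normalize to unit length."""
    hist = [0.0] * len(dictionary)
    for d in descriptors:
        hist[nearest_word(d, dictionary)] += 1.0
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist

def similarity(a, b):
    """Cosine similarity of two unit-length bag-of-words vectors."""
    return sum(x * y for x, y in zip(a, b))

# Toy dictionary of four "words" (assumed to have been trained beforehand).
dictionary = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

# Toy descriptors: frames A and B show the same scene, frame C a different one.
frame_a = [(0.1, 0.0), (0.9, 0.1), (0.95, 0.05), (0.1, 0.9)]
frame_b = [(0.0, 0.1), (1.0, 0.1), (0.9, -0.1), (0.0, 1.1), (0.9, 0.9)]
frame_c = [(0.9, 1.0), (1.1, 0.9), (1.0, 1.0), (0.1, 1.0)]

vec_a, vec_b, vec_c = (bow_vector(f, dictionary) for f in (frame_a, frame_b, frame_c))
print("A vs B:", similarity(vec_a, vec_b))
print("A vs C:", similarity(vec_a, vec_c))
```

Each frame is reduced to one short vector, so comparing the current frame with every past keyframe needs one dot product per keyframe instead of a full feature-matching pass. A high score marks a loop closure candidate.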
The sensors currently used in SLAM fall mainly into two categories, which gives lidar-based laser SLAM (Lidar SLAM) and vision-based visual SLAM (VSLAM).
1. Laser SLAM
Laser SLAM uses 2D or 3D lidar (also called single-line or multi-line lidar). 2D lidar is generally used on indoor robots such as sweeping robots, while 3D lidar is generally used in unmanned driving. The emergence and spread of lidar has made measurement faster and more accurate and the information richer. The object information collected by lidar appears as a series of scattered points with accurate angle and distance information, called a point cloud. A laser SLAM system usually localizes the robot by matching and comparing two point clouds taken at different times, from which it computes the relative travel distance and attitude change of the lidar.
Lidar ranging is relatively accurate, its error model is simple, it operates stably in environments other than direct sunlight, and point clouds are relatively easy to process. At the same time, the point cloud itself contains direct geometric relationships, which makes the robot’s path planning and navigation intuitive. The theoretical research on laser SLAM is also relatively mature, and commercial products are more plentiful.
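The point cloud matching step can be sketched for the 2D case. The function below computes the rotation and translation that best align two scans in the least-squares sense, under the strong assumption that the point correspondences are already known. A full scan matcher such as ICP must also estimate those correspondences. The function name `align_scans` and the sample points are hypothetical.

```python
import math

def align_scans(prev_pts, curr_pts):
    """Least-squares rigid alignment of two 2D point sets with known correspondences.

    Returns (theta, tx, ty) such that curr is approximately R(theta) * prev + t.
    """
    n = len(prev_pts)
    pcx = sum(p[0] for p in prev_pts) / n
    pcy = sum(p[1] for p in prev_pts) / n
    qcx = sum(q[0] for q in curr_pts) / n
    qcy = sum(q[1] for q in curr_pts) / n

    s_dot = 0.0    # sum of dot products of the centered point pairs
    s_cross = 0.0  # sum of 2D cross products of the centered point pairs
    for (px, py), (qx, qy) in zip(prev_pts, curr_pts):
        ax, ay = px - pcx, py - pcy
        bx, by = qx - qcx, qy - qcy
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx

    theta = math.atan2(s_cross, s_dot)
    c, s = math.cos(theta), math.sin(theta)
    tx = qcx - (c * pcx - s * pcy)
    ty = qcy - (s * pcx + c * pcy)
    return theta, tx, ty

# Hypothetical previous scan, and the same points after a known rigid transform.
prev_scan = [(1.0, 0.0), (2.0, 0.5), (1.5, 2.0), (0.0, 1.0)]
true_theta, true_t = 0.1, (0.5, -0.2)
c, s = math.cos(true_theta), math.sin(true_theta)
curr_scan = [(c * x - s * y + true_t[0], s * x + c * y + true_t[1]) for x, y in prev_scan]

theta, tx, ty = align_scans(prev_scan, curr_scan)
print("recovered rotation (rad):", theta)
print("recovered translation:   ", (tx, ty))
```

The recovered transform maps the earlier scan onto the later one. When the scanned points are static and expressed in the sensor frame, the motion of the lidar itself is the inverse of this transform.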
2. Visual SLAM
Visual SLAM obtains massive, redundant texture information from the environment and has very strong scene recognition ability. Early visual SLAM was based on filtering theory, and its nonlinear error model and huge computational load were obstacles to practical use. In recent years, with advances in sparse nonlinear optimization (bundle adjustment), camera technology, and computing performance, real-time visual SLAM is no longer a dream.
The advantage of visual SLAM is the rich texture information it uses. For example, two billboards of the same size but with different content cannot be told apart by a point cloud-based laser SLAM algorithm, but are easily distinguished visually. This gives vision a major advantage in relocalization and scene classification. Visual information can also readily be used to track and predict dynamic objects in the scene, such as pedestrians and vehicles, which is crucial for applications in complex dynamic scenes.
Comparing the two shows that laser SLAM and visual SLAM each have their own strengths and their own limitations when used alone, while fusing them has great potential for each to compensate for the other’s weaknesses. For example, vision works stably in texture-rich dynamic environments and can provide very accurate point cloud matching for laser SLAM, while the precise direction and distance information provided by lidar becomes even more powerful on correctly matched point clouds. In environments with severely insufficient lighting or little texture, localization by laser SLAM allows vision to record the scene even with little information.
The development status of SLAM in China
Laser SLAM started earlier than visual SLAM and is relatively mature in theory, technology, and product implementation. There are currently two main implementation paths for vision-based SLAM: one is based on RGB-D depth cameras such as Kinect, and the other is based on monocular, binocular (stereo), or fisheye cameras. VSLAM is still at the stage of further research and development, expansion of application scenarios, and product implementation.
As the importance of SLAM technology has become more prominent and the application market has expanded, some enterprises have begun to invest in SLAM research and development. They fall into two categories: enterprises that provide navigation and positioning modules, and mobile robot manufacturers, most of which develop SLAM for their own use.
Among the companies that provide positioning and navigation modules, Silan Technology is one example. As a leading company in robot positioning and navigation, Silan Technology competes mainly on laser SLAM and is the first company in China to apply laser SLAM to service robots. To help robots move autonomously, it has launched SLAMWARE, a modular autonomous positioning and navigation solution. The solution uses lidar as its core sensor, paired with the positioning and navigation control core, SLAMWARE Core, and enables a robot to perform autonomous positioning and navigation, automatic mapping, path planning, and automatic obstacle avoidance. Reportedly, the development kit comes in a 12-meter home version and a 25-meter commercial version, which differ mainly in sensor ranging distance. Beyond the 25-meter commercial version, Silan Technology has also launched its first TOF lidar, whose ranging radius reaches 40 meters. It can serve more and larger scenarios and resists light interference better, still achieving stable ranging and high-precision map construction outdoors under strong light of 60 klx. Tasks can be completed easily in both indoor and outdoor scenes.
In addition to Silan Technology, domestic companies such as Speed Sense Technology, Bucos, Miklimy, Gaoxian, and Stander have also entered the field, and most of them focus on laser SLAM. After all, laser SLAM is currently the most stable and reliable positioning and navigation solution. Visual SLAM will be a mainstream research direction in the future, and the fusion of the two will also become a trend. Multi-sensor fusion lets each sensor compensate for the other’s weaknesses, creating a better positioning and navigation solution for the market and advancing the intelligent development of robots.
In general, domestic SLAM technology is still at the development stage at both the technical level and the application level. In the future, as consumer demand grows and the industrial chain continues to develop, SLAM technology will have a broader market.