GND: Global Navigation Dataset

Jing Liang*1, Dibyendu Das*2, Daeun Song*2, Md Nahid Hasan Shuvo2, Mohammad Durrani1,
Karthik Taranath1, Ivan Penskiy1, Dinesh Manocha1, Xuesu Xiao2

1University of Maryland, 2George Mason University, *Equally Contributing Authors


The GND dataset and open-source build are available on Dataverse and GitHub.
[Paper] [Video] [Dataset] [Code]
Map legend (traversability categories):
  • Pedestrian Walkways
  • Vehicle Roadways
  • Stairs
  • Off-Road Terrain
  • Obstacles

About

Navigating large-scale outdoor environments requires complex reasoning in terms of geometric structures, environmental semantics, and terrain characteristics, which are typically captured by onboard sensors such as LiDAR and cameras. While current mobile robots can navigate such environments using pre-defined, high-precision maps based on hand-crafted rules tailored for specific environments, they often lack the commonsense reasoning capabilities that most humans possess when navigating unknown outdoor spaces.

To address this gap, the Global Navigation Dataset (GND) has been developed as a large-scale dataset that integrates multi-modal sensory data, including 3D LiDAR point clouds, RGB images, and 360° images, along with multi-category traversability maps. These maps include pedestrian walkways, vehicle roadways, stairs, off-road terrain, and obstacles, collected from ten university campuses. These environments encompass a variety of parks, urban settings, elevation changes, and campus layouts of different scales. The dataset covers approximately 2.7 km² and includes at least 350 buildings in total.

The GND is designed to enable global robot navigation by providing resources for various applications such as map-based global navigation, mapless navigation, and global place recognition.




Dataset

We first describe the data collection procedure and then detail the dataset, with a particular focus on the traversability maps.

Data Collection

Data collection was conducted by manually operating the robot through various campus environments while accounting for road traversability. The robot primarily navigates pedestrian walkways but also traverses vehicle roadways when necessary, such as when crossing streets or accessing specific areas. The robot is equipped with the following sensors:

  • 3D LiDAR: Velodyne VLP-16 with 16 channels or Ouster OS1-32 with 32 channels, both covering a 360° field of view and operating at 10 Hz.
  • RGB Camera: front-facing ZED 2 with 1080p image resolution, operating at 15 Hz.
  • 360° Camera: RICOH Theta V operating at 15 Hz.
  • IMU: 6-axis 3DM-GX5-10 operating at 355 Hz.
  • GPS: u-blox F9P operating at 20 Hz.
Robot Setup

The robot operates on Ubuntu 20.04 and Robot Operating System (ROS) Noetic. Data captured by the sensors are recorded in the rosbag file format. We provide both intrinsic and extrinsic calibration parameters for the LiDARs and cameras. The dataset comprises data from 10 university campuses, covering approximately 2.7 km² with over 11 hours of recorded data. The campuses include a variety of environments, such as parks, different types of vegetation, elevation changes, diverse campus layouts, and objects.
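Since the recordings are standard rosbag files, they can be inspected with the rosbag Python API shipped with ROS Noetic. Below is a minimal sketch that summarizes the topics in a recording; the bag file name is hypothetical and the exact topic names depend on the release.

```python
import rosbag  # part of the ROS Noetic Python tools

def summarize(bag_path):
    """Print message count, rate, and type for every topic in a recording."""
    with rosbag.Bag(bag_path) as bag:
        duration = bag.get_end_time() - bag.get_start_time()
        for topic, info in bag.get_type_and_topic_info().topics.items():
            rate = info.message_count / duration if duration > 0 else 0.0
            print(f"{topic}: {info.message_count} msgs ({rate:.1f} Hz), {info.msg_type}")

# Example with a hypothetical file name:
# summarize("gnd_campus_recording.bag")
```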

Five example campuses with their details are outlined in the table below:

Five Example Campuses in GND
Campus     | Covered Area (km²) | Buildings | Trajectory (km) | RGB Images | 360° Images | LiDAR Clouds | Pedestrian Walkways (%) | Off-road Terrain (%) | Vehicle Roadways (%) | Stairs (%)
UMD        | 0.84 | 60 | 23.26 | 214768 | N/A    | 146703 | 10.66 | 16.29 | 25.84 | 1.67
GMU        | 0.46 | 51 | 13.67 | 137948 | 137027 | 91500  | 17.31 | 25.04 | 17.11 | 0.41
CUA        | 0.40 | 32 | 2.87  | 29921  | 30266  | 20025  | 7.86  | 42.29 | 18.78 | 1.81
Georgetown | 0.25 | 40 | 3.25  | 33244  | 33325  | 22050  | 7.16  | 21.42 | 13.96 | 1.51
GWU        | 0.15 | 39 | 3.00  | 33156  | 32714  | 22190  | 8.95  | 14.04 | 28.09 | 1.99

Standardized Data Processing

The data processing workflow is standardized to encourage broader contributions. First, the raw rosbag data is processed to generate both the trajectories and 3D local maps. The point cloud maps are then processed by removing the ground, enhancing the visibility of significant features such as buildings. Local maps are registered to create a global map where all trajectories and maps are transformed into a unified coordinate system. For each campus, a single global map is generated within the dataset.
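The ground-removal and registration steps can be approximated with off-the-shelf point cloud tooling. The following is a minimal sketch using Open3D, assuming the local maps have already been exported from the rosbags as PCD files; the file names, voxel size, and thresholds are illustrative and not the released pipeline's exact parameters.

```python
import numpy as np
import open3d as o3d

def remove_ground(pcd, dist_thresh=0.15):
    """Fit a ground plane with RANSAC and drop its inliers."""
    _, inliers = pcd.segment_plane(distance_threshold=dist_thresh,
                                   ransac_n=3, num_iterations=1000)
    return pcd.select_by_index(inliers, invert=True)

def register_local_maps(paths, voxel=0.2, max_corr_dist=1.0):
    """Chain ICP registrations to place all local maps in one global frame."""
    clouds = [remove_ground(o3d.io.read_point_cloud(p)).voxel_down_sample(voxel)
              for p in paths]
    global_map = clouds[0]
    for local in clouds[1:]:
        result = o3d.pipelines.registration.registration_icp(
            local, global_map, max_corr_dist, np.eye(4),
            o3d.pipelines.registration.TransformationEstimationPointToPoint())
        local.transform(result.transformation)
        global_map += local
    return global_map

# Example with hypothetical file names:
# campus_map = register_local_maps(["local_00.pcd", "local_01.pcd", "local_02.pcd"])
```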

Point Cloud Map and Multi-Category Traversability Map

Multi-Category Traversability Map

The 2D traversability maps are created by projecting the 3D point cloud global map onto the 2D ground plane. A standardized annotation method labels five distinct traversability types, each represented by a different color. The dataset therefore provides not only geometric but also semantic information about the environment, closely aligned with real-world conditions.
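As a rough illustration of the projection step, the sketch below rasterizes labeled 3D points into a top-down 2D grid with one integer ID per traversability category; the label IDs, resolution, and array layout are assumptions rather than the dataset's exact encoding.

```python
import numpy as np

# Hypothetical integer IDs for the five traversability categories (0 = unknown).
CATEGORIES = {1: "pedestrian walkway", 2: "vehicle roadway",
              3: "stairs", 4: "off-road terrain", 5: "obstacle"}

def project_to_grid(points_xyz, labels, resolution=0.1):
    """Project labeled 3D points onto a 2D traversability grid (top-down view)."""
    xy = points_xyz[:, :2]
    origin = xy.min(axis=0)
    cells = np.floor((xy - origin) / resolution).astype(int)
    grid = np.zeros(cells.max(axis=0) + 1, dtype=np.uint8)
    # Later points overwrite earlier ones; a real pipeline would resolve
    # conflicts, e.g., letting obstacles take priority over walkways.
    grid[cells[:, 0], cells[:, 1]] = labels
    return grid, origin
```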



Example Usages

We present three applications of the GND dataset, emphasizing its unique characteristics: global coverage and traversability information, which are not present in existing navigation datasets. The dataset was collected mostly with a Jackal robot, but it can be used for navigation tasks with other robot types, such as legged and wheeled robots. We implement map-based global navigation, mapless navigation, and global place recognition.

Map-based Global Navigation

The primary goal of the Global Navigation Dataset (GND) is to provide precise map data for global robot navigation. To demonstrate its utility, an experiment was conducted comparing the navigation capabilities of two robots with different locomotion modalities and traversability capabilities: a wheeled robot and a legged robot. Using the map, path planning methods such as A* or RRT* can generate a global path from the GPS coordinates of the start and goal positions. As the robot moves, motion planning methods respond to real-time environmental changes and guide the robot's actions.
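For instance, a global path can be computed on the 2D traversability grid with a standard A* search. The sketch below treats grid cells whose category ID is in a robot-specific `traversable` set as free space; the category IDs follow the hypothetical encoding above, and the unit-cost model is an illustrative simplification.

```python
import heapq

def astar(grid, start, goal, traversable=frozenset({1})):
    """A* over a 2D category grid; cells with a label in `traversable` are free."""
    h = lambda a, b: abs(a[0] - b[0]) + abs(a[1] - b[1])  # Manhattan heuristic
    open_set = [(h(start, goal), 0, start, None)]
    came_from, closed = {}, set()
    while open_set:
        _, g, cur, parent = heapq.heappop(open_set)
        if cur in closed:
            continue
        closed.add(cur)
        came_from[cur] = parent
        if cur == goal:  # walk the parent links back to the start
            path = [cur]
            while came_from[path[-1]] is not None:
                path.append(came_from[path[-1]])
            return path[::-1]
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + dx, cur[1] + dy)
            if (0 <= nxt[0] < grid.shape[0] and 0 <= nxt[1] < grid.shape[1]
                    and grid[nxt] in traversable and nxt not in closed):
                heapq.heappush(open_set, (g + 1 + h(nxt, goal), g + 1, nxt, cur))
    return None  # no traversable path found

# A legged robot could also allow stairs and off-road terrain, e.g.
# traversable=frozenset({1, 3, 4}), which is where the two robots' paths diverge.
```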

Both robots initially navigate along the sidewalk. However, if the path becomes non-traversable for a particular robot type, the motion planner will select alternative traversable areas, adjusting the robot's course to reach the next waypoint along the trajectory. For example, when the path encounters stairs, the wheeled robot deviates to a nearby ramp before returning to the next waypoint, while the legged robot continues on its original path, walking directly up the stairs. For other obstacles, such as construction cones and groups of people blocking the sidewalk, the legged robot steps down the curb or navigates through off-road terrain to avoid the blockage.

Map-based Global Navigation

Mapless Navigation with Traversability Analysis

To assess the efficacy of the dataset's various traversability types for learning-based mapless navigation algorithms, the MTG algorithm is extended with multiple traversability levels, referred to as T-MTG, so that it can generate trajectories tailored to each level. Three traversability levels are defined (a minimal encoding is sketched after the list):

  • Basic traversability: Includes only pedestrian walkways, where robots can move at various speeds.
  • Agile traversability: Designed primarily for fast-moving wheeled robots and includes both pedestrian walkways and vehicle roadways.
  • Legged traversability: Suitable for legged robots, allowing traversal on pedestrian walkways and off-road terrain, but not safe for use on vehicle roadways.
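
A minimal encoding of these levels, assuming the category names from the map legend; the dictionary below is an illustrative assumption, not T-MTG's actual configuration.

```python
# Hypothetical mapping from T-MTG traversability level to the map categories
# a robot operating at that level may use.
TRAVERSABILITY_LEVELS = {
    "basic":  {"pedestrian walkway"},
    "agile":  {"pedestrian walkway", "vehicle roadway"},
    "legged": {"pedestrian walkway", "off-road terrain"},
}

def allowed(level, category):
    """Return True if a robot at `level` may traverse a cell of `category`."""
    return category in TRAVERSABILITY_LEVELS[level]
```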

T-MTG generates trajectories covering a 200° field of view (FOV) in front of the robot. For each traversability level, the generated trajectories effectively cover the areas in front of the robot, demonstrating the model's capability to adapt to different environments and requirements.

T-MTG Model for Mapless Navigation

Vision-based Place Recognition

Both RGB and 360° camera images are collected to enable vision-based global navigation. To demonstrate the usability of the 360° image data, the NoMaD algorithm is used for goal detection. This method compares the current observation with topological image nodes to recognize the best target to follow for navigation. Using images from four views (front, left, right, and back) of the 360° camera, the algorithm selects the subgoal with the highest similarity as the closest node for further navigation.

Averaging the similarity scores of all four views between the real-time observation and the goal yields higher scores than using only a single directional image. This highlights the advantage of utilizing all available directional information from 360° images for more accurate and efficient goal-directed navigation.
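A minimal sketch of this multi-view matching, assuming each view has already been encoded into a feature vector by some image encoder; the view names, feature format, and node storage are illustrative and not NoMaD's actual interface.

```python
import numpy as np

VIEWS = ("front", "left", "right", "back")

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def best_node(obs_feats, node_feats_list):
    """Pick the topological node that best matches the current 360° observation.

    obs_feats:       dict mapping view name -> feature vector for the live observation
    node_feats_list: list of dicts with the same structure, one per stored node
    Returns the index of the node with the highest mean similarity over the four views.
    """
    scores = [np.mean([cosine(obs_feats[v], node[v]) for v in VIEWS])
              for node in node_feats_list]
    return int(np.argmax(scores))
```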

Vision-based Place Recognition






Contact

For questions, please contact:

Jing Liang
Department of Computer Science
University of Maryland
8125 Paint Branch Dr, College Park, MD 20742, USA
liangjingjerry@gmail.com
https://github.com/jingGM

Dr. Xuesu Xiao
Department of Computer Science
George Mason University
4400 University Drive MSN 4A5, Fairfax, VA 22030 USA
xiao@gmu.edu
https://cs.gmu.edu/~xiao/