(8) Adam Allevato - Perception and Visual Recognition

Summary

The goal of Qualifying Task 1 is to detect the colors and positions of bright lights on a computer console located in front of the Valkyrie. We use the feeds from both 2D cameras to generate a 3D RGBD point cloud for analysis. Our approach uses C++, the Robot Operating System (ROS) [1], and the Point Cloud Library (PCL) [2] to heavily filter the point cloud, removing outliers and clustering the points based on position and color. In this manner, the system easily detects the brightly colored lights and can immediately provide their 3D poses. The system uses a supervised autonomy framework: it could run fully autonomously, but has been augmented with human verification.


Table of Contents

  - Task 1 Description
  - High-Level Task Setup
  - Point Cloud Filtering
  - Perception-Based Intelligence
  - Results
  - Future Work
  - References

Task 1 Description

See [3] for a full detailed description of the task and running instructions.

In Qualifying Task 1, the robot begins standing in front of a full-height computer console with several buttons, lights, and screens. During the task, various lights and screens turn on in different colors, one at a time.

The goal of the task is to correctly identify the color of each light, as well as to determine its XYZ coordinates (relative to the Valkyrie's head) when it turns on.

High-Level Task Setup

We identified several ways to approach this qualifying task. While both 2D and 3D vision approaches were possible, we quickly settled on a fully 3D, RGBD-based approach because of our team's familiarity with this type of data. Valkyrie has three vision sensors: two 2D ("standard") cameras and a Hokuyo line scanner attached to a rotating base. Our approach uses only the two 2D cameras, combining their feeds via stereo correspondence to produce a colored (RGBD) point cloud; the Hokuyo is not used.

The high-level data flow for the system is shown below.

File used to create this diagram: HCRL_Task1_HighLevel_Arch.pptx
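As a concrete (if simplified) view of the entry point of this data flow, the sketch below shows a ROS node that receives the stereo-generated point cloud and converts it to a PCL cloud for processing. The node name, topic name, and reliance on a stereo processing node such as stereo_image_proc are assumptions for illustration, not the team's actual configuration.

    // Minimal ROS node sketch: receive the stereo-generated RGBD point cloud and
    // hand it to the filtering pipeline. The topic name is a placeholder; the
    // actual topic depends on how the stereo processing node is configured.
    #include <ros/ros.h>
    #include <sensor_msgs/PointCloud2.h>
    #include <pcl_conversions/pcl_conversions.h>
    #include <pcl/point_cloud.h>
    #include <pcl/point_types.h>

    void cloudCallback(const sensor_msgs::PointCloud2ConstPtr& msg)
    {
      pcl::PointCloud<pcl::PointXYZRGB>::Ptr cloud(new pcl::PointCloud<pcl::PointXYZRGB>);
      pcl::fromROSMsg(*msg, *cloud);
      // ... pass `cloud` to the filtering and clustering stages described below ...
    }

    int main(int argc, char** argv)
    {
      ros::init(argc, argv, "light_detector");
      ros::NodeHandle nh;
      // Hypothetical topic: a colored point cloud built from the two 2D camera
      // feeds by a stereo processing node (e.g. stereo_image_proc).
      ros::Subscriber sub = nh.subscribe("/stereo/points2", 1, cloudCallback);
      ros::spin();
      return 0;
    }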

Point Cloud Filtering

The point cloud filtering library is based on the Object Recognition and Perception (ORP) library [4], which is written in C++ and uses ROS and PCL. ORP was written by a team member and has been significantly modified and extended for use in the challenge. One notable addition is a stereo correspondence algorithm, which allows perception using two 2D cameras rather than a single 3D camera. Another addition is color-based clustering; prior to this project, ORP used only position-based clustering. A point is joined to a cluster only if its color differs from neighboring points by less than a tuned threshold, and only if adding it would not push the cluster's overall color variation past a second tuned threshold. In this way, the algorithm can easily be adjusted to detect clusters of different colors and sizes, depending on the needs of the specific sensing task. A sketch of this clustering idea appears below.
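ORP's clustering implementation is custom, but the same idea can be sketched with PCL's built-in RegionGrowingRGB segmentation, which likewise applies a per-point color threshold and a whole-cluster color threshold. The parameter values below are illustrative placeholders rather than the tuned values used in the challenge.

    // Sketch of color-constrained clustering using PCL's RegionGrowingRGB.
    // The two color thresholds play the roles described above: a limit on the
    // color difference between neighboring points, and a limit on the overall
    // color variation allowed within a cluster.
    #include <vector>
    #include <pcl/point_cloud.h>
    #include <pcl/point_types.h>
    #include <pcl/search/kdtree.h>
    #include <pcl/segmentation/region_growing_rgb.h>

    std::vector<pcl::PointIndices> clusterByColor(
        const pcl::PointCloud<pcl::PointXYZRGB>::Ptr& cloud)
    {
      pcl::search::KdTree<pcl::PointXYZRGB>::Ptr tree(
          new pcl::search::KdTree<pcl::PointXYZRGB>);

      pcl::RegionGrowingRGB<pcl::PointXYZRGB> reg;
      reg.setInputCloud(cloud);
      reg.setSearchMethod(tree);
      reg.setDistanceThreshold(0.02f);    // max spatial gap between neighbors (m)
      reg.setPointColorThreshold(8.0f);   // max color difference to a neighbor
      reg.setRegionColorThreshold(5.0f);  // max color variation across a cluster
      reg.setMinClusterSize(30);          // lower bound on cluster size

      std::vector<pcl::PointIndices> clusters;
      reg.extract(clusters);
      return clusters;
    }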

Filtering consists of six steps, which are described in detail in [4]. We omit the plane-finding step, since the lights are flush with the console surface, and replace it with the color-based clustering described above; a sketch of representative pre-filtering stages is shown below. Once colored clusters have been detected, they are evaluated according to the logic in the next section.
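For reference, the kind of pre-filtering involved can be sketched with standard PCL filters: cropping to the region containing the console, downsampling, and removing sparse outliers. This is only a rough approximation of the chain described in [4], and the numeric parameters are placeholders.

    // Rough sketch of representative pre-filtering stages (the exact six-step
    // chain used in the challenge is detailed in [4]).
    #include <pcl/point_cloud.h>
    #include <pcl/point_types.h>
    #include <pcl/filters/passthrough.h>
    #include <pcl/filters/voxel_grid.h>
    #include <pcl/filters/statistical_outlier_removal.h>

    pcl::PointCloud<pcl::PointXYZRGB>::Ptr prefilter(
        const pcl::PointCloud<pcl::PointXYZRGB>::Ptr& input)
    {
      pcl::PointCloud<pcl::PointXYZRGB>::Ptr cropped(new pcl::PointCloud<pcl::PointXYZRGB>);
      pcl::PointCloud<pcl::PointXYZRGB>::Ptr downsampled(new pcl::PointCloud<pcl::PointXYZRGB>);
      pcl::PointCloud<pcl::PointXYZRGB>::Ptr cleaned(new pcl::PointCloud<pcl::PointXYZRGB>);

      // 1. Keep only points in an assumed depth window containing the console.
      pcl::PassThrough<pcl::PointXYZRGB> pass;
      pass.setInputCloud(input);
      pass.setFilterFieldName("z");
      pass.setFilterLimits(0.3, 2.0);            // placeholder limits, in meters
      pass.filter(*cropped);

      // 2. Downsample to a uniform resolution to keep clustering fast.
      pcl::VoxelGrid<pcl::PointXYZRGB> voxel;
      voxel.setInputCloud(cropped);
      voxel.setLeafSize(0.005f, 0.005f, 0.005f); // placeholder leaf size, in meters
      voxel.filter(*downsampled);

      // 3. Remove sparse outliers left over from stereo correspondence noise.
      pcl::StatisticalOutlierRemoval<pcl::PointXYZRGB> sor;
      sor.setInputCloud(downsampled);
      sor.setMeanK(50);
      sor.setStddevMulThresh(1.0);
      sor.filter(*cleaned);

      return cleaned;
    }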

Perception-Based Intelligence

We use task-specific knowledge to perform intelligent decision-making and adjust our algorithm as necessary. Every detected light must meet the following specifications:

  1. The light size (as determined by the number of points it consists of) must be within an acceptable range, S_min to S_max,
  2. The light color must not be full black or full white, and
  3. The light color must be one of the colors generated by the simulation.

All lights in the simulation use red, green, and blue channel values of either 0 or 1; excluding full black and full white, this leaves only six possible colors for any given light. Our system therefore rounds each light's measured color to the nearest possible color and reports that as the light's color. All lights not meeting the specifications above are immediately removed from consideration. Those that remain are presented to the user as "interactive markers" in the RViz interface. The user confirms the identity of a given light by clicking it in the visualizer, which causes the information about the detected light to be written to a log file. The validation and rounding logic is sketched below.
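A minimal sketch of that validation and rounding logic follows. The size bounds and helper names are hypothetical; only the structure (size check, rounding each channel to 0 or 1, and rejecting full black and full white) is taken from the description above.

    // Sketch of the light-validation and color-rounding logic described above.
    #include <cstddef>

    struct Rgb { float r, g, b; };  // average cluster color, each channel in [0, 1]

    // Hypothetical tuned bounds on light size (number of points in the cluster).
    const std::size_t kMinLightPoints = 30;
    const std::size_t kMaxLightPoints = 2000;

    // Round each channel to 0 or 1, the only values used by the simulated lights.
    Rgb roundToNearestLightColor(const Rgb& c)
    {
      Rgb out;
      out.r = (c.r < 0.5f) ? 0.0f : 1.0f;
      out.g = (c.g < 0.5f) ? 0.0f : 1.0f;
      out.b = (c.b < 0.5f) ? 0.0f : 1.0f;
      return out;
    }

    // A cluster is accepted as a light only if it passes all three specifications.
    bool isValidLight(std::size_t numPoints, const Rgb& avgColor)
    {
      if (numPoints < kMinLightPoints || numPoints > kMaxLightPoints)
        return false;                                  // spec 1: size within range

      Rgb rounded = roundToNearestLightColor(avgColor);
      bool black = (rounded.r == 0.0f && rounded.g == 0.0f && rounded.b == 0.0f);
      bool white = (rounded.r == 1.0f && rounded.g == 1.0f && rounded.b == 1.0f);
      return !black && !white;                         // specs 2 and 3: valid color
    }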

Results

We successfully detected every light in the qualifying task (20/20 lights detected with zero false positives), and all colors matched exactly (100% accuracy).

A screenshot of the Gazebo simulation environment, with no panel lights on.

A view in Rviz, with a single light on.

Another view in Rviz, showing intermediate point clouds used in analysis and filtering. An interactive marker is also shown, signifying a successfully detected light at this location.

A video showing use of the interactive markers.

A video of the task in progress, with the generated log file shown on the right.

Our submission files from this task are available here: hcrl_qual1.zip

Future Work

We would like to extend ORP for use in Qualifying Task 2, to automatically detect the location of the red button (see [5]). Also, as noted in High-Level Task Setup, we did not use the high-fidelity spinning Hokuyo laser sensor because of the complexity and time required to integrate it. For more precise tasks and perception, we would need to integrate ORP with the Hokuyo sensor as well.

References

[1] http://www.ros.org/

[2] http://pointclouds.org/

[3] https://bitbucket.org/osrf/srcsim/wiki/qual_task1

[4] https://repositories.lib.utexas.edu/bitstream/handle/2152/39369/ALLEVATO-THESIS-2016.pdf

[5] https://bitbucket.org/osrf/srcsim/wiki/qual_task2