Customizable Motion Planning

Creating a Customizable Motion Planner for Human-Robot Interaction


by Jack Kawell on August 27, 2019

Overview:

In the field of human-robot interaction (HRI), there is much interest in training methods that teach robot collaborators to execute tasks effectively alongside human teammates. However, because of the complexity of deployment environments and the personal preferences of individual teammates, these pre-trained policies often produce behavior that forces a human collaborator off a desired path to task completion, or even causes discomfort. In this project, we intend to develop a system that performs in-task, active learning to adapt its policy to a human teammate. We accomplish this by creating a responsive motion planning system with an adaptable comfort model that can be adjusted to the needs of a specific person or environment.

Timeline

  • Phase 1: In the first phase of this research, we intend to create a customizable system that motion plans around a human in a collaborative space. The system consists of a custom cost map used by standard OMPL planning algorithms for motion planning. It will be a reactive system that observes when it causes path deviation or discomfort in a human collaborator and then updates its comfort model of that human to avoid such behavior in future interactions.
    • Key points of this foundational system:
      • Implements a custom comfort model on top of standard OMPL
      • Reactive/online adjustments to the model
      • Motion plan based updates (low level)
  • Phase 2: In the second phase, we will iterate on the previous design to model the human’s comfort over time. This requires the system to gather data from many interactions with human collaborators and synthesize these measurements to determine how the robot’s behavior affects the human’s comfort over the course of many interactions. We will also implement boundary-testing behavior, which causes the system to fill in gaps in its comfort model more quickly.
    • Key points of this iterative system:
      • Comfort is modelled over time
      • Testing comfort boundaries over time
      • Still low-level control, but with a higher-level understanding of human comfort
  • Phase 3: In this final phase, we will add a higher level of understanding to the system by enabling it to learn and apply different models to various objects and environments. This incorporates a lifelong learning component that synthesizes many differing experiences to generalize model information across objects and scenes, enabling the robot to behave robustly across different tasks and people while still acting in a comfortable way.
    • Key points of this final system:
      • Object/environment generalization
      • Robust reaction to new situations
      • Lifelong learning to remember experiences and adapt to new experiences
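To make the Phase 1 idea concrete, the comfort model could be realized as a cost field that penalizes robot states near the human, which standard OMPL planners would then minimize (in OMPL this would back a custom `OptimizationObjective` via its state-cost hook). The Gaussian field shape, weight, and radius below are illustrative assumptions, not the project's final model; this is a minimal sketch of the cost map, not a full planner integration:

```python
import math

class ComfortCostMap:
    """Hypothetical comfort cost field: high cost near the human, ~0 far away.

    The Gaussian shape and parameter values are assumptions for illustration.
    In a real system this would back an OMPL OptimizationObjective so that
    optimizing planners route motion plans around the human collaborator.
    """

    def __init__(self, human_pos, weight=10.0, radius=0.5):
        self.human_pos = human_pos   # (x, y) position of the human
        self.weight = weight         # how strongly closeness is penalized
        self.radius = radius         # comfort radius (std. dev. of the field)

    def state_cost(self, pos):
        """Comfort cost of a single robot state."""
        d2 = (pos[0] - self.human_pos[0]) ** 2 + (pos[1] - self.human_pos[1]) ** 2
        return self.weight * math.exp(-d2 / (2 * self.radius ** 2))

    def path_cost(self, path):
        """Accumulated comfort cost along a discretized path."""
        return sum(self.state_cost(p) for p in path)

cost_map = ComfortCostMap(human_pos=(1.0, 0.0))
near = cost_map.state_cost((0.9, 0.0))  # close to the human: high cost
far = cost_map.state_cost((3.0, 0.0))   # far from the human: near-zero cost
```

A planner comparing candidate paths by `path_cost` would prefer routes that keep clearance from the human, which is exactly the behavior the custom cost map is meant to induce.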
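The reactive updates of Phase 1 and the boundary testing of Phase 2 could be sketched as a simple online rule: grow the comfort radius after an observed deviation or discomfort, and slowly shrink it after comfortable interactions to probe the comfort boundary and fill in gaps in the model. The step sizes, bounds, and binary discomfort signal here are assumptions, not the project's specified algorithm:

```python
class AdaptiveComfortModel:
    """Illustrative online update of a comfort radius.

    Growth/shrink factors and the minimum radius are assumed values chosen
    for demonstration; they are not the project's actual parameters.
    """

    def __init__(self, radius=0.5, grow=1.25, shrink=0.97, min_radius=0.2):
        self.radius = radius
        self.grow = grow            # expansion factor after observed discomfort
        self.shrink = shrink        # gentle probe toward the comfort boundary
        self.min_radius = min_radius
        self.history = []           # per-interaction discomfort observations

    def update(self, discomfort_observed):
        """Reactive, in-task update applied after each interaction."""
        self.history.append(discomfort_observed)
        if discomfort_observed:
            # Back off: the human deviated from their path or showed discomfort.
            self.radius *= self.grow
        else:
            # Boundary testing: shrink slowly to learn how close is acceptable.
            self.radius = max(self.min_radius, self.radius * self.shrink)
        return self.radius

model = AdaptiveComfortModel()
model.update(discomfort_observed=True)       # radius grows after discomfort
for _ in range(5):
    model.update(discomfort_observed=False)  # radius decays slowly toward the boundary
```

Feeding the updated radius back into the planner's cost map closes the loop: each interaction adjusts the model, and the next motion plan reflects what the robot has learned about this particular person's comfort.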