Robot Learning from Inaccurate Humans
Many interactive reinforcement learning (RL) methods allow people to give feedback to robots as they learn. However, human teachers often give imperfect information. We introduce a framework for characterizing interactive RL methods with imperfect teachers and propose an algorithm, Revision Estimation from Partially Incorrect Resources (REPaIR), which estimates corrections to imperfect feedback over time.
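As a rough illustration of the underlying idea (a hypothetical sketch, not the published REPaIR algorithm), an agent might track how often a teacher's feedback agrees with observed environment returns, and attenuate or flip feedback accordingly. All names and constants below are illustrative assumptions:

```python
def estimate_feedback_reliability(agreements, prior=0.5, prior_weight=2.0):
    # Smoothed estimate of P(feedback is correct): prior pseudo-counts plus
    # the observed count of feedback that agreed with environment returns.
    # `agreements` is a list of 0/1 indicators.
    return (prior * prior_weight + sum(agreements)) / (prior_weight + len(agreements))

def corrected_feedback(raw_feedback, reliability):
    # Attenuate feedback toward zero at chance reliability (0.5), and flip
    # its sign when the teacher is estimated to be wrong more often than right.
    return raw_feedback * (2.0 * reliability - 1.0)
```

With four agreeing observations and the default prior, the reliability estimate is 5/6, so a +1 feedback signal is down-weighted rather than trusted outright.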
T. A. Kessler Faulkner and A. L. Thomaz, Using Learning Curve Predictions to Learn from Incorrect Feedback. ICRA 2023.
T. A. Kessler Faulkner, E. S. Short, and A. L. Thomaz, Interactive Reinforcement Learning with Inaccurate Feedback. ICRA 2020.
Robot Learning from Partially Attentive Humans
Interactive reinforcement learning allows robots to learn from both exploring their environment and from human feedback. However, this approach typically assumes that human teachers are continuously paying attention to the robot, which is unlikely to be true during long-term learning. Thus, we propose interactive reinforcement learning methods that take the presence or absence of human attention into account.
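A toy sketch of this idea, assuming a standard policy-shaping combination of the agent's Q-values with a feedback distribution (names and constants are hypothetical, not the published implementation): when the teacher is not attending, the feedback term is simply dropped rather than treating silence as disapproval.

```python
import math

def shaped_policy(q_values, feedback_deltas, attending, consistency=0.9, tau=1.0):
    # Boltzmann (softmax) policy over the agent's own Q-values.
    z = sum(math.exp(q / tau) for q in q_values)
    pi_q = [math.exp(q / tau) / z for q in q_values]
    if not attending:
        # Teacher absent: silence carries no information, so act on Q-values alone.
        return pi_q
    # Policy-shaping feedback term: delta = (#positive - #negative) labels per
    # action, weighted by an assumed probability that feedback is consistent.
    def p_f(delta):
        return consistency**delta / (consistency**delta + (1 - consistency)**delta)
    combined = [p * p_f(d) for p, d in zip(pi_q, feedback_deltas)]
    s = sum(combined)
    return [c / s for c in combined]
```

Gating on attention this way keeps unattended exploration unbiased while still letting observed feedback sharpen the policy.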
T. A. Kessler Faulkner, R. A. Gutierrez, E. S. Short, G. Hoffman, and A. L. Thomaz, Active Attention-Modified Policy Shaping. AAMAS 2019.
T. Kessler Faulkner, E. S. Short, and A. L. Thomaz, Towards Active Attention-Modified Policy Shaping. Presented at Workshop on Human/Robot In-The-Loop Machine Learning, IROS 2018.
T. Kessler Faulkner, R. A. Gutierrez, E. S. Short, and A. L. Thomaz, Policy Shaping with Supervisory Attention Driven Exploration. IROS 2018.
Asking Humans for Help
When robots ask humans for help, they must supply enough information to convey the problem or a possible solution without overwhelming people with unnecessary detail. We propose a method that predicts how people will respond to candidate robot utterances, in order to generate succinct yet descriptive requests for assistance.
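One minimal way to frame this trade-off (a hypothetical sketch; `help_model` and the penalty are illustrative assumptions, not the paper's belief-update model) is to score each candidate utterance by the modeled chance of getting useful help, minus a cost for verbosity:

```python
def choose_request(utterances, help_model, verbosity_penalty=0.05):
    # Score each candidate request by the modeled probability that the
    # listener will give useful help, minus a per-word penalty that keeps
    # requests succinct.
    def score(u):
        return help_model(u) - verbosity_penalty * len(u.split())
    return max(utterances, key=score)
```

A more detailed utterance wins only when the extra words buy enough modeled helpfulness to outweigh the penalty.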
T. Kessler Faulkner, S. Niekum, and A. Thomaz, Asking for Help Effectively via Modeling of Human Beliefs. Companion of the ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 149-150, 2018.
T. Kessler Faulkner, S. Niekum, and A. Thomaz, Robot Dialog Optimization via Optimization of Human Belief Updates. Presented at Workshop on Robot Communication in the Wild, Robotics: Science and Systems (RSS), July 2017.
Regret Minimization with Nonlinear Functions
Regret minimization allows a small representative subset of a large database to be shown to people searching it, with the subset chosen to limit the worst-case loss across people's potential utility functions. We derived regret bounds for nonlinear utility functions, broadening the class of utility functions, and thus the range of user preferences, that such queries can represent.
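The quantity being bounded can be sketched concretely (a toy illustration with hypothetical names, sampling a few utility functions rather than taking a true supremum): the maximum regret ratio compares the best utility achievable from the full database against the best achievable from the displayed subset.

```python
def max_regret_ratio(database, subset, utilities):
    # Worst-case relative utility loss from showing only `subset`: for each
    # sampled utility function, compare the best achievable utility in the
    # full database with the best achievable in the subset. Utilities are
    # arbitrary callables, so the nonlinear case is covered directly.
    worst = 0.0
    for u in utilities:
        best_all = max(u(item) for item in database)
        best_sub = max(u(item) for item in subset)
        if best_all > 0:
            worst = max(worst, (best_all - best_sub) / best_all)
    return worst
```

Showing the whole database yields zero regret by construction; a one-item subset can incur the maximum ratio of 1 under a utility function it ignores entirely.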
T. Kessler Faulkner, W. Brackenbury, and A. Lall, K-Regret Queries with Nonlinear Utilities. Proceedings of the VLDB Endowment, vol. 8, no. 13, pp. 2098-2109, 2015.