EnQuery: Ensemble Policies for Diverse Query-Generation in Preference Alignment of Robot Navigation

Authors:

J. de Heuvel, F. Seiler, M. Bennewitz

Type:

Article

Published in:

IEEE International Conference on Robot and Human Interactive Communication (RO-MAN)

Year:

2024

Links:

Preprint

BibTex String

@inproceedings{deheuvel24roman,
      title={{EnQuery: Ensemble} Policies for Diverse Query-Generation in Preference Alignment of Robot Navigation}, 
      author={Jorge de Heuvel and Florian Seiler and Maren Bennewitz},
      booktitle={Proc. of the IEEE International Conference on Robot and Human Interactive Communication (RO-MAN)},
      year={2024}
}

Abstract:

To align mobile robot navigation policies with user preferences through reinforcement learning from human feedback (RLHF), reliable and behavior-diverse user queries are required. However, deterministic policies fail to generate a variety of navigation trajectory suggestions for a given navigation task configuration. We introduce EnQuery, a query generation approach using an ensemble of policies that achieve behavioral diversity through a regularization term. For a given navigation task, EnQuery produces multiple navigation trajectory suggestions, thereby optimizing the efficiency of preference data collection with fewer queries. Our methodology demonstrates superior performance in aligning navigation policies with user preferences in low-query regimes, offering enhanced policy convergence from sparse preference queries. The evaluation is complemented with a novel explainability representation, capturing full scene navigation behavior of the mobile robot in a single plot.