This dissertation studies the optimization of mobile robotic operations that involve visual sensing and recognition. First, we show how to properly incorporate human collaboration into the robot's field operation. Since human visual performance is imperfect and depends on the visual input, we propose a Deep Convolutional Neural Network (DCNN)-based approach to predict human performance, and show how the robot can use this prediction to optimally query a remote human operator. We then consider a generic robotic operation and incorporate the predicted human performance into the co-optimization of the robot's field sensing, motion planning, and communication, while accounting for imperfect sensing quality, realistic communication conditions, and limited onboard energy. We pose this co-optimization as a Multiple-Choice Multidimensional Knapsack Problem, for which we propose an efficient, near-optimal Linear Program-based solution and mathematically characterize its optimality gap.
In the second part, we focus on robotic visual scene understanding under poor sensing conditions. We show that even when the robot has low confidence in classifying some objects in the environment, its onboard DCNN classifier can still provide useful information about these objects and help assess whether they belong to the same class. More specifically, we show that the correlation coefficient of the DCNN feature vectors of two object images carries robust information on their similarity, even when the individual sensing and classification quality is low. We then build a Correlation-based Markov Random Field that captures this similarity information for joint object labeling, which significantly improves the robot's classification accuracy without additional training. We further show how the robot can optimize its path and human queries accordingly: it can optimally decide which object sites to move closer to for better sensing and which objects to ask for human help with, which considerably improves the overall classification performance.
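As a rough illustration of the similarity measure mentioned above, the following minimal sketch computes the Pearson correlation coefficient between two feature vectors. It is not the dissertation's implementation: the function name `feature_correlation`, the 512-dimensional vector size, and the synthetic vectors are assumptions made for illustration only, and real inputs would be feature vectors extracted from a DCNN.

```python
import numpy as np


def feature_correlation(f1, f2):
    """Pearson correlation coefficient between two 1-D feature vectors."""
    f1 = np.asarray(f1, dtype=float)
    f2 = np.asarray(f2, dtype=float)
    # Center each vector, then take the normalized inner product.
    f1c = f1 - f1.mean()
    f2c = f2 - f2.mean()
    denom = np.linalg.norm(f1c) * np.linalg.norm(f2c)
    if denom == 0.0:
        # Assumption: a constant vector is treated as uncorrelated.
        return 0.0
    return float(np.dot(f1c, f2c) / denom)


# Synthetic stand-ins for DCNN feature vectors (not real network outputs).
rng = np.random.default_rng(0)
base = rng.normal(size=512)
same_class = base + 0.5 * rng.normal(size=512)  # noisy copy of `base`
other = rng.normal(size=512)                    # independent vector

corr_same = feature_correlation(base, same_class)
corr_other = feature_correlation(base, other)
```

In this toy setting, `corr_same` is expected to be noticeably larger than `corr_other`, which mirrors the idea that pairwise correlation can signal shared class membership even when each individual classification is unreliable.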
Finally, we explicitly consider the cost of communication and focus on the interplay of sensing, motion, and communication in the optimization of robotic operations. We consider the case where the robot navigates from a start position to a destination and needs to sense a number of sites in the field. The robot collects data when sensing each site and must transmit all the collected data to a remote station by the end of its trip. Our goal is to minimize the total motion and communication energy cost by co-optimizing the robot's path, its data transmission along the path, and its sensing decisions. We propose a novel approach to this challenging problem: we formulate a specially designed Markov Decision Process (MDP) and use Monte Carlo Tree Search (MCTS) to solve it efficiently and optimally. We theoretically prove the convergence of our approach, characterize its convergence speed, and show key properties of the optimal solution.
To validate the proposed methodologies, we extensively evaluate each of the aforementioned parts through realistic simulation studies and/or real-world robotic experiments. Our results demonstrate the efficacy of the proposed approaches.