Optimal trajectory tracking control for USVs under dynamic uncertainties and time-varying disturbances via PI and IRL algorithms
DOI: https://doi.org/10.54939/1859-1043.j.mst.208.2025.11-20

Keywords: Integral reinforcement learning; PI; Optimal control; HJB; USVs.

Abstract
This paper presents a model-free optimal control framework for trajectory tracking of unmanned surface vehicles (USVs) operating under unknown dynamics and time-varying disturbances, combining Policy Iteration (PI) with Integral Reinforcement Learning (IRL). The IRL-PI controller is built on an order-reduction technique and an off-policy actor-critic neural-network structure, enabling real-time approximation of the Hamilton-Jacobi-Bellman (HJB) solution without requiring knowledge of the system model. Simulation results on a three-degree-of-freedom (3-DOF) USV model demonstrate that the proposed method outperforms conventional controllers in both tracking accuracy and robustness. These results highlight the potential of the IRL-PI controller for developing robust control solutions for complex marine systems operating in uncertain and dynamic environments.
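To illustrate the core mechanism, the sketch below runs IRL-based policy iteration on a simple linear 2-state system rather than the paper's 3-DOF USV model; the dynamics, cost weights, and on-policy data collection here are illustrative assumptions (the paper's controller uses an off-policy actor-critic structure and neural-network function approximation). For a policy u = -Kx with quadratic value V(x) = xᵀPx, the integral Bellman equation V(x(t)) − V(x(t+T)) = ∫ₜ^{t+T}(xᵀQx + uᵀRu) dτ is solved for P by least squares over measured intervals, with no use of the A matrix in the learning step, and the policy is then improved via K ← R⁻¹BᵀP.

```python
import numpy as np

# Illustrative stand-in dynamics and cost (NOT the paper's USV model).
A = np.array([[0.0, 1.0], [-1.0, -2.0]])   # used only to simulate data
B = np.array([[0.0], [1.0]])               # input matrix (known, per IRL-PI)
Q = np.eye(2)
R = np.array([[1.0]])

dt, T = 0.001, 0.05                        # integration step, IRL interval

def phi(x):
    """Quadratic basis so that w . phi(x) = x' P x with w = (p11, p12, p22)."""
    return np.array([x[0]**2, 2 * x[0] * x[1], x[1]**2])

def rollout_interval(x, K):
    """Integrate one IRL interval under u = -Kx; return x(t+T) and the cost integral."""
    cost = 0.0
    for _ in range(int(T / dt)):
        u = -K @ x
        cost += (x @ Q @ x + u @ R @ u) * dt   # left Riemann sum of the cost
        x = x + (A @ x + B @ u) * dt           # Euler step (adequate for a sketch)
    return x, cost

rng = np.random.default_rng(0)
K = np.zeros((1, 2))                       # initial admissible policy (A is stable)
for _ in range(10):                        # policy-iteration loop
    Phi, c = [], []
    for _ in range(30):                    # data intervals from random initial states
        x0 = rng.uniform(-1, 1, size=2)
        x1, cost = rollout_interval(x0, K)
        Phi.append(phi(x0) - phi(x1))      # V(x(t)) - V(x(t+T))
        c.append(cost)
    # Policy evaluation: least-squares fit of the integral Bellman equation.
    w, *_ = np.linalg.lstsq(np.array(Phi), np.array(c), rcond=None)
    P = np.array([[w[0], w[1]], [w[1], w[2]]])
    K = np.linalg.solve(R, B.T @ P)        # policy improvement

print(np.round(P, 3))
```

For these assumed matrices the learned P should approach the analytic Riccati solution [[√2, √2−1], [√2−1, √2−1]], and A − BK is stabilizing; the point of the IRL formulation is that P is recovered from trajectory data alone, which is what allows the paper's framework to dispense with the drift-dynamics model.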
