A Trajectory Simulation Approach for Autonomous Vehicles Path Planning using Deep Reinforcement Learning
DOI:
https://doi.org/10.31686/ijier.vol8.iss12.2837

Keywords:
Autonomous Vehicles, Deep Reinforcement Learning, Path Planning, Trajectory Simulation
Abstract
Autonomous vehicle path planning aims to enable safe and rapid movement through an environment without human interference. Recently, Reinforcement Learning methods have been applied to this problem with satisfactory results. This work presents the use of Deep Reinforcement Learning for autonomous vehicle path planning through trajectory simulation, in order to define routes that offer greater safety (no collisions) and shorter distances between two points. A method for creating simulation environments was developed to analyze the performance of the proposed models under circumstances of varying degrees of difficulty. The decision-making strategy was based on Artificial Neural Networks of the Multilayer Perceptron type, with parameters and hyperparameters determined by grid search. The models were evaluated through the reward curves produced during their learning process. This evaluation occurred in two phases: isolated evaluation, in which models were inserted into an environment without prior knowledge; and incremental evaluation, in which models were inserted into unknown environments with prior knowledge accumulated under other conditions. The results obtained are competitive with state-of-the-art works and highlight the adaptive character of the models presented, which, when inserted into new environments with prior knowledge, can reduce convergence time by up to 89.47% compared to related works.
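To make the described setup concrete, the sketch below illustrates the general pattern summarized in the abstract: a grid-style trajectory-simulation environment with obstacle collisions, a goal, and a per-step cost that favors shorter routes, together with an MLP Q-network trained by a DQN-style loop with epsilon-greedy exploration. This is a minimal illustrative sketch, not the authors' implementation: the environment layout, reward values, network sizes, and hyperparameters are assumptions chosen for readability rather than values reported in the paper.

# Illustrative sketch (not the authors' implementation): a minimal grid-style
# trajectory-simulation environment with obstacles and a goal, plus an MLP
# Q-network trained with a DQN-style loop and epsilon-greedy exploration.
# Environment layout, reward values, network sizes, and hyperparameters are
# assumptions chosen for readability, not values reported in the paper.
import random
from collections import deque

import torch
import torch.nn as nn

GRID = 10                                      # hypothetical 10x10 world
OBSTACLES = {(3, 3), (3, 4), (6, 7), (7, 2)}   # hypothetical obstacle cells
GOAL = (9, 9)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right


def step(state, action):
    """Apply an action; penalize collisions, reward reaching the goal,
    and charge a small per-step cost so shorter routes score higher."""
    nxt = (state[0] + action[0], state[1] + action[1])
    if not (0 <= nxt[0] < GRID and 0 <= nxt[1] < GRID) or nxt in OBSTACLES:
        return state, -10.0, True              # collision ends the episode
    if nxt == GOAL:
        return nxt, 100.0, True                # goal reached
    return nxt, -1.0, False


def encode(state):
    """Normalized position plus offset to the goal as network input."""
    return torch.tensor([state[0] / GRID, state[1] / GRID,
                         (GOAL[0] - state[0]) / GRID,
                         (GOAL[1] - state[1]) / GRID], dtype=torch.float32)


q_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(),   # MLP Q-network
                      nn.Linear(64, 64), nn.ReLU(),
                      nn.Linear(64, len(ACTIONS)))
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
replay = deque(maxlen=5000)
gamma, epsilon = 0.99, 1.0

for episode in range(500):
    state, total = (0, 0), 0.0
    for _ in range(200):                       # cap episode length
        if random.random() < epsilon:          # epsilon-greedy action choice
            a = random.randrange(len(ACTIONS))
        else:
            with torch.no_grad():
                a = int(q_net(encode(state)).argmax())
        nxt, r, done = step(state, ACTIONS[a])
        replay.append((state, a, r, nxt, done))
        state, total = nxt, total + r

        if len(replay) >= 64:                  # one gradient step per move
            batch = random.sample(replay, 64)
            s = torch.stack([encode(b[0]) for b in batch])
            a_t = torch.tensor([b[1] for b in batch])
            r_t = torch.tensor([b[2] for b in batch])
            s2 = torch.stack([encode(b[3]) for b in batch])
            d_t = torch.tensor([float(b[4]) for b in batch])
            target = r_t + gamma * q_net(s2).max(1).values.detach() * (1 - d_t)
            pred = q_net(s).gather(1, a_t.unsqueeze(1)).squeeze(1)
            loss = nn.functional.mse_loss(pred, target)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        if done:
            break
    epsilon = max(0.05, epsilon * 0.99)        # decay exploration over time
    if episode % 50 == 0:
        print(f"episode {episode:3d}  return {total:7.1f}  epsilon {epsilon:.2f}")

In this sketch, the running episode return plays the role of the reward curves used for evaluation in the paper, and the "incremental evaluation" described above would correspond, under these assumptions, to reusing the trained q_net weights when the agent is placed in a new obstacle layout instead of reinitializing them.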
License
Copyright (c) 2020 Jean Phelipe de Oliveira Lima, Raimundo Correa de Oliveira, Cleinaldo de Almeida Costa

This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.
Copyrights for articles published in IJIER journals are retained by the authors, with first publication rights granted to the journal. The journal/publisher is not responsible for subsequent uses of the work. It is the authors' responsibility to bring an infringement action if they so desire.
Accepted 2020-12-04
Published 2020-12-01