A Trajectory Simulation Approach for Autonomous Vehicles Path Planning using Deep Reinforcement Learning

Authors

  • Jean Phelipe de Oliveira Lima Universidade do Estado do Amazonas https://orcid.org/0000-0002-5861-9928
  • Raimundo Correa de Oliveira Universidade do Estado do Amazonas
  • Cleinaldo de Almeida Costa Universidade do Estado do Amazonas

DOI:

https://doi.org/10.31686/ijier.vol8.iss12.2837

Keywords:

Autonomous Vehicles, Deep Reinforcement Learning, Path Planning, Trajectory Simulation

Abstract

Autonomous vehicle path planning aims to enable safe and rapid movement through an environment without human interference. Recently, Reinforcement Learning methods have been applied to this problem with satisfactory results. This work applies Deep Reinforcement Learning to the task of path planning for autonomous vehicles through trajectory simulation, defining routes that offer greater safety (no collisions) and shorter displacement between two points. A method for creating simulation environments was developed to analyze the performance of the proposed models under circumstances of varying degrees of difficulty. The decision-making strategy was based on Artificial Neural Networks of the Multilayer Perceptron type, with parameters and hyperparameters determined by grid search. The models were evaluated through the reward curves produced during their learning process. This evaluation occurred in two phases: an isolated evaluation, in which models were placed in an environment without prior knowledge; and an incremental evaluation, in which models were placed in unknown environments carrying intelligence previously accumulated under other conditions. The results obtained are competitive with state-of-the-art works and highlight the adaptive character of the proposed models, which, when inserted into environments with prior knowledge, can reduce convergence time by up to 89.47% compared to related works.
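The core loop described in the abstract (an agent learning collision-free, short routes from a shaped reward signal, evaluated through its reward curve) can be illustrated with a minimal tabular Q-learning sketch. Note this is a toy stand-in: the paper uses an MLP-based Deep Reinforcement Learning agent, and the grid size, obstacle layout, and reward values below are illustrative assumptions, not the paper's actual configuration.

```python
import random

def plan_route(width=5, height=5, obstacles=frozenset({(1, 1), (2, 3)}),
               start=(0, 0), goal=(4, 4), episodes=2000,
               alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy grid. The shaped reward penalizes
    collisions and every extra step, so the greedy policy converges
    toward short, collision-free routes between start and goal."""
    rng = random.Random(seed)
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]   # right, left, up, down
    q = {}                                       # (state, action) -> value

    def step(s, a):
        nx, ny = s[0] + moves[a][0], s[1] + moves[a][1]
        if not (0 <= nx < width and 0 <= ny < height) or (nx, ny) in obstacles:
            return s, -5.0, False                # collision: penalty, stay put
        if (nx, ny) == goal:
            return (nx, ny), 100.0, True         # goal reached
        return (nx, ny), -1.0, False             # step cost discourages detours

    for _ in range(episodes):
        s, done, t = start, False, 0
        while not done and t < 200:
            # epsilon-greedy action selection
            if rng.random() < epsilon:
                a = rng.randrange(4)
            else:
                a = max(range(4), key=lambda i: q.get((s, i), 0.0))
            s2, r, done = step(s, a)
            best_next = max(q.get((s2, i), 0.0) for i in range(4))
            old = q.get((s, a), 0.0)
            q[(s, a)] = old + alpha * (r + gamma * best_next - old)
            s, t = s2, t + 1

    # Roll out the greedy policy to extract the planned route.
    path, s = [start], start
    for _ in range(width * height):
        a = max(range(4), key=lambda i: q.get((s, i), 0.0))
        s, _, done = step(s, a)
        path.append(s)
        if done:
            break
    return path
```

Calling `plan_route()` returns the learned route as a list of grid cells from start to goal; its length minus one is the number of moves. A Deep RL agent, as in the paper, replaces the Q-table with a neural network so the same idea scales to larger state spaces.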


Author Biographies

  • Jean Phelipe de Oliveira Lima, Universidade do Estado do Amazonas

    Superior School of Technology

  • Raimundo Correa de Oliveira, Universidade do Estado do Amazonas

    Superior School of Technology

  • Cleinaldo de Almeida Costa, Universidade do Estado do Amazonas

    Superior School of Health

References

S.G. Tzafestas. "Mobile Robot Control and Navigation: A Global Overview". J Intell Robot Syst. 91, pp. 35–58, 2018. DOI: https://doi.org/10.1007/s10846-018-0805-9

E. Prassler, M.E. Munich, P. Pirjanian, K. Kosuge. Domestic Robotics. In: B. Siciliano, O. Khatib (eds) Springer Handbook of Robotics. Springer Handbooks. Springer, Cham. 2016. DOI: https://doi.org/10.1007/978-3-319-32552-1_65

J. Duperret and D. E. Koditschek. Technical Report on: Towards Reactive Control of Simplified Legged Robotics Maneuvers. October 2017.

Y. Rasekhipour, A. Khajepour, S. Chen and B. Litkouhi. A Potential Field- Based Model Predictive Path-Planning Controller for Autonomous Road Vehicles. In IEEE Transactions on Intelligent Transportation Systems, vol. 18, no. 5, pp. 1255-1267, May 2017. DOI: https://doi.org/10.1109/TITS.2016.2604240

L. Pan and X. Wang. "Variable pitch control on direct-driven PMSG for offshore wind turbine using Repetitive-TS fuzzy PID control". In Renewable Energy. 2020. DOI: https://doi.org/10.1016/j.renene.2020.05.093

L. Silveira, F. Guth, P. Drews-Jr, P. Ballester, M. Machado, F. Codevilla, N. Duarte-Filho, S. Botelho. "An Open-source Bio-inspired Solution to Underwater SLAM". In IFAC-PapersOnLine. Vol. 48, Issue 2, pp. 212-217, 2015. DOI: https://doi.org/10.1016/j.ifacol.2015.06.035

J. Theunissen, H. Xu, R. Y. Zhong and X. Xu. "Smart AGV System for Manufacturing Shopfloor in the Context of Industry 4.0". 25th International Conference on Mechatronics and Machine Vision in Practice (M2VIP), pp. 1-6, Stuttgart, 2018. DOI: https://doi.org/10.1109/M2VIP.2018.8600887

S.A. Bagloee, M. Tavana, M. Asadi et al. "Autonomous vehicles: challenges, opportunities, and future implications for transportation policies". In J. Mod. Transport. Vol. 24, pp. 284–303. 2016. DOI: https://doi.org/10.1007/s40534-016-0117-3

K. Berntorp. "Path planning and integrated collision avoidance for autonomous vehicles". In 2017 American Control Conference (ACC), Seattle, WA, pp. 4023-4028. 2017. DOI: https://doi.org/10.23919/ACC.2017.7963572

A. Koubaa et al. "Design and Evaluation of Intelligent Global Path Planning Algorithms". In: Robot Path Planning and Cooperation. Studies in Computational Intelligence. Vol. 772. Springer, Cham, 2018. DOI: https://doi.org/10.1007/978-3-319-77042-0_3

J. Funke, M. Brown, S. M. Erlien and J. C. Gerdes. "Collision Avoidance and Stabilization for Autonomous Vehicles in Emergency Scenarios". In IEEE Transactions on Control Systems Technology. vol. 25, no. 4, pp. 1204-1216, July 2017. DOI: https://doi.org/10.1109/TCST.2016.2599783

M. Brown, J. Funke, S. Erlien, J.C. Gerdes. "Safe driving envelopes for path tracking in autonomous vehicles". In Control Engineering Practice. Vol. 61, pp. 307-316. 2017. DOI: https://doi.org/10.1016/j.conengprac.2016.04.013

D. Paley and A. Wolek. "Mobile Sensor Networks and Control: Adaptive Sampling of Spatiotemporal Processes". In Annual Review of Control, Robotics, and Autonomous Systems. Vol. 3, pp. 1.1–1.24, May 2020. DOI: https://doi.org/10.1146/annurev-control-073119-090634

C. Cadena et al. "Past, Present, and Future of Simultaneous Localization and Mapping: Toward the Robust-Perception Age". In IEEE Transactions on Robotics, vol. 32, no. 6, pp. 1309-1332, Dec. 2016. DOI: https://doi.org/10.1109/TRO.2016.2624754

M. Labbé and F. Michaud. "RTAB-Map as an open-source lidar and visual simultaneous localization and mapping library for large-scale and long-term online operation". Journal of Field Robotics, pp. 1-31, 2018. DOI: https://doi.org/10.1002/rob.21831

K. Ogata. Engenharia de Controle Moderno. Rio de Janeiro Prentice/Hall do Brasil, 2a Edição, 1993.

H. Oliveira. "Controlo de locomoção do veículo robótico submarino TURTLE com recurso a sistema de variação de flutuabilidade". M.Sc. dissertation. Instituto Politécnico do Porto, Porto, 2017.

A. Dantas et al. "PID Control for Electric Vehicles Subject to Control and Speed Signal Constraints". In Journal of Control Science and Engineering. 2018. DOI: https://doi.org/10.1155/2018/6259049

L. Nie, J. Guan, C. Lu et al. "Longitudinal Speed Control of Autonomous Vehicle Based on a Self-adaptive PID of Radial Basis Function Neural Network". In IET Intelligent Transport Systems, 2018. DOI: https://doi.org/10.1049/iet-its.2016.0293

O.C. Arellano, N.H. Romero, J.C.S.T. Mora and J.M. Marín. "Algoritmo genético aplicado a la sintonización de un controlador PID para un sistema acoplado de tanques". In ICBI, vol. 5, n. 10, 2018. DOI: https://doi.org/10.29057/icbi.v5i10.2935

Z. Hu et al. "Deep Reinforcement Learning Approach with Multiple Experience Pools for UAV’s Autonomous Motion Planning in Complex Unknown Environments". In Sensors. Vol. 20, 1890. 2020. DOI: https://doi.org/10.3390/s20071890

L. Yu et al. "Intelligent Land-Vehicle Model Transfer Trajectory Planning Method Based on Deep Reinforcement Learning". In Sensors. Vol. 18, 2905. 2018. DOI: https://doi.org/10.20944/preprints201808.0049.v1

R.S. Sutton and A.G. Barto. Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA, 1998. DOI: https://doi.org/10.1109/TNN.1998.712192

T. Hester et al. "Deep Q-learning From Demonstrations". In AAAI Publications, Thirty-Second AAAI Conference on Artificial Intelligence, pp. 3223-3230, 2018. DOI: https://doi.org/10.1609/aaai.v32i1.11757

C.J.C.H. Watkins and P. Dayan. "Q-learning". Mach Learn. Vol. 8, pp. 279–292, 1992. DOI: https://doi.org/10.1007/BF00992698

R. Bellman. "The theory of dynamic programming". In RAND Corporation, Proc. National Academy of Sciences, pp. 503–715, 1952.

M.L. Puterman. "Markov Decision Processes—Discrete Stochastic Dynamic Programming". John Wiley & Sons, Inc., New York, NY, 1994. DOI: https://doi.org/10.1002/9780470316887

L. J. Lin. "Reinforcement Learning for Robotics Using Neural Networks". Ph.D thesis, Carnegie Mellon University, Pittsburgh, PA, 1993.

M. Grzes and D. Kudenko. "Theoretical and Empirical Analysis of Reward Shaping in Reinforcement Learning". In 2009 International Conference on Machine Learning and Applications, IEEE, Miami Beach, FL, pp. 337–344, 2009. DOI: https://doi.org/10.1109/ICMLA.2009.33

A.P. Braga, A. Carvalho and T. Ludermir. Redes Neurais Artificiais: Teorias e aplicações, 2nd ed. BR: LTC. 2016.

A. Blum. "On-Line Algorithms in Machine Learning". In: Fiat A., Woeginger G.J. (eds) Online Algorithms. Lecture Notes in Computer Science. Vol 1442. Springer, Berlin, Heidelberg, 1998. DOI: https://doi.org/10.1007/BFb0029575

S. Ishii, W. Yoshida, J. Yoshimoto. "Control of exploitation-exploration metaparameter in reinforcement learning". Neural Networks. 15(4-6), pp. 665–687, 2002. DOI: https://doi.org/10.1016/S0893-6080(02)00056-4

D. Ron et al. "On the learnability and usage of acyclic probabilistic finite automata". In 8th Annual Conference on Computational Learning Theory. pp. 31-40. ACM Press, New York, NY, 1995. DOI: https://doi.org/10.1145/225298.225302

H. Brink, J. Richards and M. Fetherolf, Real-World Machine Learning. 1st ed. USA: Manning Publications, 2016.

T. Masters. Advanced Algorithms for Neural Network: A C++ Sourcebook. 1st ed. New York, NY, USA: Wiley, 1995.

K.Q. Weinberger and L.K. Saul. "Fast solvers and efficient implementations for distance metric learning". In Proceedings of the 25th International Conference on Machine Learning (ICML ’08). Association for Computing Machinery, New York, NY, USA, pp. 1160–1167. 2008. DOI: https://doi.org/10.1145/1390156.1390302

P. Ramachandran, B. Zoph, and Q. V. Le. "Searching for activation functions". arXiv preprint arXiv:1710.05941, 2017.

P. Vamplew, R. Dazeley, A. Berry et al. "Empirical evaluation methods for multiobjective reinforcement learning algorithms". Mach Learn. 84, pp. 51–80, 2011. DOI: https://doi.org/10.1007/s10994-010-5232-5


Published

2020-12-01

How to Cite

de Oliveira Lima, J. P., Oliveira, R. C. de, & Costa, C. de A. (2020). A Trajectory Simulation Approach for Autonomous Vehicles Path Planning using Deep Reinforcement Learning. International Journal for Innovation Education and Research, 8(12), 436-454. https://doi.org/10.31686/ijier.vol8.iss12.2837