DOI: 10.5555/3545946.3598770
research-article

A Self-Organizing Neuro-Fuzzy Q-Network: Systematic Design with Offline Hybrid Learning

Published: 30 May 2023

Abstract

In this paper, we propose a systematic design process for automatically generating self-organizing neuro-fuzzy Q-networks by leveraging unsupervised learning and an offline, model-free fuzzy reinforcement learning algorithm called Fuzzy Conservative Q-learning (FCQL). Our FCQL offers more effective and interpretable policies than deep neural networks, facilitating human-in-the-loop design and explainability.
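
To make the idea of a neuro-fuzzy Q-network with a conservative objective more concrete, here is a minimal sketch, not the authors' implementation. It assumes Gaussian membership functions, product rule firing, and zero-order (constant) rule consequents, and it borrows the discrete-action regularizer from Conservative Q-learning (log-sum-exp of the Q-values minus the Q-value of the logged action). None of these choices are stated in the abstract, and every function and variable name is hypothetical.

```python
import math

def gaussian(x, center, width):
    """Gaussian membership degree of x in a fuzzy set (assumed shape)."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))

def fuzzy_q_values(state, rules):
    """Q-values from a zero-order rule base.

    Each rule is a hypothetical triple (centers, widths, q_per_action).
    """
    # Rule firing strength: product of per-dimension membership degrees.
    strengths = [
        math.prod(gaussian(x, c, w) for x, c, w in zip(state, centers, widths))
        for centers, widths, _ in rules
    ]
    total = sum(strengths) or 1.0  # guard against no rule firing
    n_actions = len(rules[0][2])
    # Q(s, a) is the firing-strength-weighted average of each rule's Q-vector.
    return [
        sum(s * q[a] for s, (_, _, q) in zip(strengths, rules)) / total
        for a in range(n_actions)
    ]

def conservative_penalty(q_values, dataset_action):
    """CQL-style regularizer for discrete actions: logsumexp(Q) - Q(s, a_data)."""
    m = max(q_values)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(q - m) for q in q_values))
    return log_sum_exp - q_values[dataset_action]

# Two illustrative rules over a one-dimensional state with two actions.
rules = [
    ([0.0], [1.0], [1.0, 0.0]),  # IF x is "near 0" THEN Q = [1, 0]
    ([2.0], [1.0], [0.0, 1.0]),  # IF x is "near 2" THEN Q = [0, 1]
]
q = fuzzy_q_values([1.0], rules)                      # state halfway between the rules
penalty = conservative_penalty(q, dataset_action=0)   # added to the usual TD loss
```

Because each rule reads as an IF-THEN statement over named fuzzy sets, a rule base of this form can be inspected directly, which is the kind of interpretability the abstract refers to. In an offline setting the penalty term would be minimized together with a temporal-difference loss over the logged transitions.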


Published In

AAMAS '23: Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems
May 2023
3131 pages
ISBN: 9781450394321
General Chairs: Noa Agmon, Bo An
Program Chairs: Alessandro Ricci, William Yeoh

Publisher

International Foundation for Autonomous Agents and Multiagent Systems

Richland, SC

Author Tags

  1. ITS
  2. fuzzy logic control
  3. hybrid learning
  4. neuro-fuzzy
  5. offline reinforcement learning
  6. pedagogical policy
  7. unsupervised learning

Qualifiers

  • Research-article

Funding Sources

  • National Science Foundation

Acceptance Rates

Overall Acceptance Rate 1,155 of 5,036 submissions, 23%
