DOI: 10.1145/3377930.3390160
research-article

Multi-tree genetic programming for feature construction-based domain adaptation in symbolic regression with incomplete data

Published: 26 June 2020

Abstract

Transfer learning has rapidly gained popularity in tasks where little data is available. Whereas traditional learning restricts the learning process to knowledge available in a specific (target) domain, transfer learning can use knowledge extracted from learning in a different (source) domain to help learning in the target domain. This is especially important when knowledge in the target domain is scarce. Because data incompleteness is a serious cause of knowledge shortage in real-world learning tasks, it can often be addressed using transfer learning. One way to do so is feature construction-based domain adaptation. However, although genetic programming is considered a powerful feature construction algorithm, it has not been fully utilized for domain adaptation. In this work, a multi-tree genetic programming method is proposed for feature construction-based domain adaptation. The main idea is to construct a transformation from the source feature space to the target feature space that maps the source domain close to the target domain. The method is applied to symbolic regression with missing values. The experiments show encouraging potential for the proposed approach when applied to real-world tasks under different transfer learning scenarios.
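
As an illustration only, the idea of mapping the source feature space onto the target feature space with one tree per constructed feature might be sketched as follows. The tuple-based tree encoding, the function set in `OPS`, the toy data, and the `domain_gap` measure are all assumptions made for this sketch; the abstract does not specify the fitness function, the function set, or how missing values are handled, so this is not the authors' implementation.

```python
import operator
from statistics import fmean

# Function set for the expression trees (an assumption; the abstract does not list one).
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def evaluate(tree, row):
    """Evaluate one expression tree on one source instance."""
    kind = tree[0]
    if kind == "x":        # ("x", i): the i-th source feature
        return row[tree[1]]
    if kind == "const":    # ("const", c): a numeric constant
        return tree[1]
    left, right = tree[1], tree[2]
    return OPS[kind](evaluate(left, row), evaluate(right, row))


def transform(individual, source_rows):
    """Map each source instance to the target space, one tree per target feature."""
    return [[evaluate(tree, row) for tree in individual] for row in source_rows]


def domain_gap(mapped_rows, target_rows):
    """Mean absolute difference of per-feature means.

    A crude, hypothetical stand-in for a distribution distance.
    """
    gaps = []
    for mapped_col, target_col in zip(zip(*mapped_rows), zip(*target_rows)):
        gaps.append(abs(fmean(mapped_col) - fmean(target_col)))
    return fmean(gaps)


# Toy data (made up): three source features, two target features.
source = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0]]
target = [[3.0, 6.0], [6.0, 12.0], [9.0, 18.0]]

# A multi-tree individual: tree k constructs target feature k.
individual = [
    ("add", ("x", 0), ("x", 1)),        # x0 + x1
    ("mul", ("const", 2.0), ("x", 2)),  # 2 * x2
]

mapped = transform(individual, source)
gap = domain_gap(mapped, target)
```

In an evolutionary loop, a measure such as `domain_gap` could serve as (part of) the fitness that selection minimizes, so that individuals whose trees bring the mapped source data closer to the target data are preferred.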

Published In

GECCO '20: Proceedings of the 2020 Genetic and Evolutionary Computation Conference
June 2020
1349 pages
ISBN: 9781450371285
DOI: 10.1145/3377930

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. genetic programming
2. incomplete data
3. symbolic regression
4. transfer learning

Qualifiers

• Research-article

Conference

GECCO '20

Acceptance Rates

Overall Acceptance Rate 1,669 of 4,410 submissions, 38%
