DOI: 10.1145/3680127.3680216
research-article

A new perspective for longitudinal measurement and analysis of public education in Brazil based on open data and machine learning

Published: 16 December 2024

Abstract

In the mid-2000s, e-government began to be seen as a facilitator of transformation in the public sector, promoting transparency, reducing corruption, and increasing social engagement. One of the first steps in this direction is giving the population access to information. Additionally, the growing availability of raw data from large-scale assessments has enabled analyses that promote public value in the education sector. Many Brazilian studies use large-scale high school assessments to predict student or school performance, which limits the usefulness of these analyses at the state and municipal levels. In this context, the objective of this study is to analyze the relationship between the educational, economic, and social well-being characteristics of municipalities and their respective performance in IDEB for elementary education, in order to support public resource managers in developing public policies for elementary education. For this, we used a method based on CRISP-DM. We began by understanding the business and collecting data from open data portals of the Brazilian government and from the Human Development Atlas in Brazil. These data included HDI, educational investments, and educational indicators of municipalities. We then cleaned the data, analyzed it using Spearman’s coefficient, and predicted the IDEB-E and IDEB-L indicators using machine learning algorithms such as Linear Regression, Random Forest, and Artificial Neural Network. Finally, we analyzed the important features identified by the most accurate model. As main results, we found a high positive correlation between IDEB and social well-being variables, as well as between IDEB and the level of education of teachers.
The methodology and results of this paper may lead to a new perspective on analyzing elementary education at the municipality level by employing both statistical and machine learning models, supporting public bodies in their decisions regarding investments.
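
The correlation step mentioned in the abstract can be illustrated with a minimal sketch of Spearman's rank coefficient. This is not the authors' code: the function names and the five toy values for HDI and IDEB are made up purely for illustration, and only the Python standard library is assumed. The coefficient is computed as the Pearson correlation of the ranks, with tied values receiving the mean of their positions.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's coefficient: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy values for five municipalities (NOT data from the paper).
hdi = [0.60, 0.65, 0.70, 0.72, 0.80]
ideb = [4.1, 4.0, 5.2, 5.5, 6.3]

rho = spearman(hdi, ideb)
print(f"Spearman rho between toy HDI and toy IDEB: {rho:.3f}")
```

Because it works on ranks, the coefficient captures monotonic rather than strictly linear association, which suits indicators measured on different scales.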


      Published In

      ICEGOV '24: Proceedings of the 17th International Conference on Theory and Practice of Electronic Governance
      October 2024
      479 pages
      ISBN: 9798400717802
      DOI: 10.1145/3680127
      Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only. Request permissions from owner/author(s).

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. Machine Learning
      2. Open Data
      3. Education
      4. Elementary School
      5. Brazil
      6. Municipalities

      Qualifiers

      • Research-article

      Conference

      ICEGOV 2024

      Acceptance Rates

      Overall Acceptance Rate 350 of 865 submissions, 40%

