Semantic Automated Assessment Of Student Flowcharts Via Graph Neural Networks And Symbolic Execution
DOI:
https://doi.org/10.24269/mtkind.v20i1.13681
Keywords:
Automated Evaluation, Flowchart, Graph Convolutional Network, Symbolic Execution, Semantic Equivalence
Abstract
Automated evaluation of flowcharts helps students acquire basic programming concepts. However, traditional evaluation systems that rely exclusively on structural matching have a fundamental limitation: their false negative rates are frequently high when students express semantically equivalent algorithmic logic through visually distinct structures. To address this challenge, this study introduces a hybrid assessment framework that improves the reliability and efficacy of code evaluation. The proposed model combines the probabilistic feature extraction capabilities of Graph Convolutional Networks (GCNs) with logical verification through symbolic execution backed by an SMT solver. The GCN module adaptively handles variations in graph topology, while the SMT solver deterministically establishes functional equivalence. The hybrid system was assessed on a real-world dataset of 3,600 flowcharts created by novice students. In the experiments, the proposed framework reached a peak F1 score of 0.88, a substantial improvement over conventional Abstract Syntax Tree (AST) methods (F1 score of 0.75). In addition, incorporating the SMT solver reduced the false negative rate by 77.4% compared with a pure GCN configuration. This dual architecture thus effectively addresses the semantic equivalence and structural divergence problems that arise in algorithm assessment. The proposed system gives higher education institutions a more dependable mechanism for reducing human error, improving the impartiality, accuracy, and efficiency of the evaluation process.
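The gap between structural matching and semantic equivalence described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the flowchart encoding, the node kinds, and the names `run`, `shape`, and `find_counterexample` are all hypothetical, and exhaustive testing over a small bounded input domain stands in for the symbolic execution and SMT solving that the paper uses. Two flowcharts that both compute the maximum of two numbers differ in structure, so a purely structural comparison would reject one of them, yet no input in the tested domain distinguishes them.

```python
from itertools import product

# Hypothetical encoding: a flowchart is a dict {node_id: node}, where a node is
#   ("assign", var, expr, next_id), ("branch", cond, true_id, false_id), or ("output", var).

def run(chart, inputs, max_steps=1000):
    """Interpret a flowchart from node "start" and return its output value."""
    env = dict(inputs)
    node_id = "start"
    for _ in range(max_steps):
        node = chart[node_id]
        if node[0] == "assign":
            _, var, expr, node_id = node
            env[var] = expr(env)
        elif node[0] == "branch":
            _, cond, true_id, false_id = node
            node_id = true_id if cond(env) else false_id
        else:  # "output"
            return env[node[1]]
    raise RuntimeError("step limit exceeded (possible infinite loop)")

def successors(node):
    """Return the ids of the nodes that directly follow this node."""
    if node[0] == "assign":
        return (node[3],)
    if node[0] == "branch":
        return (node[2], node[3])
    return ()

def shape(chart):
    """Crude structural signature: each node kind with the kinds of its successors."""
    return sorted(
        (node[0], tuple(chart[s][0] for s in successors(node)))
        for node in chart.values()
    )

def find_counterexample(c1, c2, domain):
    """Bounded stand-in for an SMT query: search for inputs where outputs differ."""
    for a, b in product(domain, repeat=2):
        if run(c1, {"a": a, "b": b}) != run(c2, {"a": a, "b": b}):
            return (a, b)
    return None

# Chart A: one decision, then one of two assignments.
chart_a = {
    "start": ("branch", lambda e: e["a"] > e["b"], "take_a", "take_b"),
    "take_a": ("assign", "m", lambda e: e["a"], "out"),
    "take_b": ("assign", "m", lambda e: e["b"], "out"),
    "out": ("output", "m"),
}

# Chart B: default assignment first, then a conditional overwrite.
chart_b = {
    "start": ("assign", "m", lambda e: e["b"], "check"),
    "check": ("branch", lambda e: e["a"] >= e["b"], "overwrite", "out"),
    "overwrite": ("assign", "m", lambda e: e["a"], "out"),
    "out": ("output", "m"),
}

print("same structure:", shape(chart_a) == shape(chart_b))
print("counterexample:", find_counterexample(chart_a, chart_b, range(-5, 6)))
```

A bounded search like this can only show that no difference exists within the tested domain. An SMT solver, by contrast, reasons over symbolic inputs and can either prove equivalence for all inputs or return a concrete counterexample, which is the deterministic guarantee the abstract attributes to the SMT component.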
License
Copyright (c) 2026 Usman Nurhasan, Dian Hanifudin Subhi, Anugrah Nur Rahmanto, Endah Septa Sintiya, Deddy Kusbianto Purwoko Aji

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
MULTITEK is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License