Options
Multi-Objective Grammar-Guided Genetic Programming for Grammar-Obeying Program Synthesis
Author(s)
Date Issued
2025
Date Available
2025-05-29T14:26:59Z
Abstract
Grammar-obeying program synthesis is a subfield of program syn- thesis (i.e., the task of automatically discovering an executable piece of code) that requires generated code to follow a Backus-Naur Form (BNF) grammar. Having code that obeys a grammar limits the structure of the code and the set of callable functions/method- s/libraries, thus having an impact on (i) security: e.g., using ma- licious functionalities, blacklisted libraries, and questionable con- structs, (ii) the computing environment: e.g., if the hardware is not able to run some functions or if some functions are too costly in terms of memory or energy, and (iii) code quality: improving code style and readability, reducing code smells, and enhancing consistency. Grammar-Guided Genetic Programming (G3P) is recognised as one of the most successful approaches for grammar-obeying pro- gram synthesis that evolves programs in arbitrary languages to solve program synthesis problems based on a set of input-output examples and predefined BNF grammars. Despite its success, the restriction on the evolutionary system to only leverage input-output error rate during its assessment of the programs it derives lim- its its scalability to larger and more complex program synthesis problems. Alternatively, while large language models (LLMs) are showing increasing success in program synthesis from a task de- scription, they often struggle to generate accurate code due to am- biguity in task specifications, complex programming syntax, and lack of reliability in the generated code. Furthermore, their gener- ative nature limits their ability to fix erroneous code with iterative LLM prompting. In this thesis, I focus on enhancing the performance of G3P for grammar-obeying program synthesis tasks. I begin by evaluating the state-of-the-art systems, including G3P and LLMs, in grammar- obeying program synthesis. Next, I explore how to map LLM- generated code into grammar-obeying forms, followed by seeding the grammar-mapped LLM-generated code into G3P’s initial pop- ulation to improve the evolutionary process. Building on these foundations, I investigate the potential of using code similarity measures to guide the evolution of programs, leading to the devel- opment of Bi-Objective Grammar-Guided Genetic Programming (BOG3P) and Multi-Objective Grammar-Guided Genetic Program- ming (MOG3P), which combine input-output error rate with mul- tiple similarity measures as objective functions. Finally, I integrate LLM-generated code into the MOG3P framework in two ways: (i) using LLM-generated code as a seed for the evolutionary pro- cess through a grammar-mapping phase, and (ii) leveraging multi- ple similarity measures towards LLM-generated code to guide the search process throughout the evolution. Additionally, I further extend the MOG3P framework by incorporating code from multi- ple LLMs to enhance the diversity and effectiveness of the search process.
Type of Material
Doctoral Thesis
Publisher
University College Dublin
Qualification Name
Ph.D.
Copyright (Published Version)
2025 the Author
Language
English
Status of Item
Peer reviewed
This item is made available under a Creative Commons License
File(s)
Loading...
Name
Ning_Tao_Thesis_15205926.pdf
Size
980.96 KB
Format
Adobe PDF
Checksum (MD5)
433f42847aeb8c0d612f8c2bdc481509
Owning collection