Date of Graduation
Spring 5-20-2025
Document Type
Thesis
Department
Computer Science
Thesis Chair
Dr. Jeong Yang
Abstract
With advances in AI technologies, the role of Large Language Models (LLMs) in software development has grown rapidly, from generating functionally correct code to solving complex problems and debugging existing code. However, LLMs often produce inefficient code with unnecessary logic, hallucinated content, and errors. This research measures the efficiency of Python code generated by the GPT-4o-Mini, GPT-3.5-Turbo, and GPT-4-Turbo models in terms of execution time, memory usage, and maximum memory usage while maintaining correctness. Using EffiBench datasets on Google’s Vertex AI Workbench with different machine configurations, the study applies the seed parameter for consistency and optimization techniques such as Chain-of-Thought (CoT) prompting and fine-tuning. The results show that CoT prompting improves efficiency metrics for GPT-4o-Mini and GPT-3.5-Turbo, but not for GPT-4-Turbo. GPT-4o-Mini was selected for fine-tuning due to its better results with CoT prompting and its cost-effectiveness, but fine-tuning compromised accuracy and efficiency. Overall, high-CPU machine configurations, along with GPT-4o-Mini and CoT prompting, improve the efficiency and correctness of LLM-generated Python code in resource-intensive scenarios.
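To illustrate the kind of measurement the abstract describes, the sketch below shows one possible way to request code from a model with a fixed seed and a CoT-style prompt, then record execution time and peak memory. It assumes the OpenAI Python client (1.x); the prompt template, the example entry_call, and the use of time/tracemalloc as the measurement harness are illustrative assumptions, not the thesis's actual benchmarking pipeline.

# Minimal sketch (assumptions noted above), not the thesis's exact harness.
import time
import tracemalloc
from openai import OpenAI  # assumes the openai>=1.x client is installed

client = OpenAI()

# Hypothetical CoT-style prompt template.
COT_PROMPT = (
    "Solve the following problem step by step, then provide only the final "
    "Python function.\n\n{problem}"
)

def generate_solution(problem: str, model: str = "gpt-4o-mini") -> str:
    """Request a solution with a fixed seed for best-effort reproducibility."""
    response = client.chat.completions.create(
        model=model,
        seed=42,  # seed parameter used for consistency across runs
        messages=[{"role": "user", "content": COT_PROMPT.format(problem=problem)}],
    )
    return response.choices[0].message.content

def measure_efficiency(code: str, entry_call: str) -> dict:
    """Run generated code and record wall-clock time and peak memory (bytes)."""
    namespace: dict = {}
    exec(code, namespace)        # assumes the generated code is vetted/sandboxed
    tracemalloc.start()
    start = time.perf_counter()
    eval(entry_call, namespace)  # e.g. "solve([1, 2, 3])" (hypothetical entry point)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {"execution_time_s": elapsed, "peak_memory_bytes": peak}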
Recommended Citation
Jonnala, Ramya, "MEASURING AND IMPROVING THE EFFICIENCY OF PYTHON CODE GENERATED BY LLMS USING COT PROMPTING AND FINE-TUNING" (2025). Masters Theses. 47.
https://digitalcommons.tamusa.edu/masters_theses/47