LLM-based Distributed Code Generation and Cost-Efficient Execution in the Cloud

Publication Date: April 6, 2025

Event: The Sixteenth International Conference on Cloud Computing, GRIDs, and Virtualization (Cloud Computing 2025)

Reference: pp. 114–121, 2025

Authors: Kunal Rao, NEC Laboratories America, Inc.; Giuseppe Coviello, NEC Laboratories America, Inc.; Gennaro Mellone, NEC Laboratories America, Inc., University of Napoli "Parthenope"; Ciro Giuseppe De Vita, NEC Laboratories America, Inc., University of Napoli "Parthenope"; Srimat T. Chakradhar, NEC Laboratories America, Inc.

Abstract: The advancement of Generative Artificial Intelligence (AI), particularly Large Language Models (LLMs), is reshaping the software industry by automating code generation. Many LLM-driven distributed processing systems rely on serial code generation constrained by predefined libraries, limiting flexibility and adaptability. While some approaches enhance performance through parallel execution or optimize edge-cloud distributed processing for specific domains, they often overlook the cost implications of deployment, restricting scalability and economic feasibility across diverse cloud environments. This paper presents DiCE-C, a system that eliminates these constraints by starting directly from a natural language query. DiCE-C dynamically identifies available tools at runtime, programmatically refines LLM prompts, and employs a stepwise approach: first generating serial code and then transforming it into distributed code. This adaptive methodology enables efficient distributed execution without dependence on specific libraries. By leveraging high-level parallelism at the Application Programming Interface (API) level and managing API execution as services within a Kubernetes-based runtime, DiCE-C reduces idle GPU time and facilitates the use of smaller, cost-effective GPU instances. Experiments with a vision-based insurance application demonstrate that DiCE-C reduces cloud operational costs by up to 72% when using smaller GPUs (A6000 and A4000 GPU machines vs. A100 GPU machine) and by 32% when using identical GPUs (A100 GPU machines). This flexible and cost-efficient approach makes DiCE-C a scalable solution for deploying LLM-generated vision applications in cloud environments.
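
The abstract describes the pipeline only at a high level. As a rough illustration of the stepwise approach it outlines, the Python sketch below prompts an LLM twice: once to generate serial code against tool APIs discovered at runtime, and once to rewrite that code for concurrent, API-level execution. Every name in it (discover_tools, the service endpoints, the llm callable) is a hypothetical placeholder for illustration only, not DiCE-C's actual interface.

    from dataclasses import dataclass


    @dataclass
    class Tool:
        """A tool API that the generated code may call."""
        name: str
        endpoint: str
        description: str


    def discover_tools() -> list[Tool]:
        # Hypothetical runtime discovery; in a Kubernetes-based runtime this
        # could enumerate the services currently exposing vision APIs.
        # Hard-coded here purely for illustration.
        return [
            Tool("face_detection", "http://face-detection.default.svc",
                 "detects faces in an image"),
            Tool("ocr", "http://ocr.default.svc",
                 "extracts text from an image"),
        ]


    def build_serial_prompt(query: str, tools: list[Tool]) -> str:
        # Step 1 prompt: ask for plain serial code against the discovered APIs.
        tool_list = "\n".join(
            f"- {t.name} ({t.endpoint}): {t.description}" for t in tools
        )
        return (
            f"Available tool APIs:\n{tool_list}\n\n"
            f"Task: {query}\n"
            "Write serial Python code that solves the task by calling these APIs."
        )


    def build_distributed_prompt(serial_code: str) -> str:
        # Step 2 prompt: transform the serial program so that independent
        # API calls run concurrently against their service endpoints.
        return (
            "Rewrite the following serial code so that independent API calls "
            "execute concurrently, keeping each call a request to its service "
            "endpoint:\n\n" + serial_code
        )


    def generate_distributed_code(query: str, llm) -> str:
        """Two-step generation: serial code first, then a distributed rewrite."""
        serial_code = llm(build_serial_prompt(query, discover_tools()))
        return llm(build_distributed_prompt(serial_code))


    # Example usage, given any callable that maps a prompt string to code:
    # distributed = generate_distributed_code(
    #     "Assess vehicle damage from claim photos", llm=my_llm)

Splitting generation into two prompts mirrors the stepwise rationale in the abstract: the first pass can focus on producing functionally correct serial code, while the second only has to parallelize independent service calls, which is consistent with the paper's claim that API-level parallelism reduces idle GPU time and permits smaller GPU instances.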

Publication Link: https://www.researchgate.net/publication/390740915_CLOUD_COMPUTING_2025_Proceedings_of_the_Sixteenth_International_Conference_on_Cloud_Computing_GRIDs_and_Virtualization