The Vision: Effortless Workflow Creation
At Wiv, we embarked on an ambitious journey: to empower our users to create complex FinOps workflows effortlessly, using nothing more than simple, natural-language prompts. Our vision was clear: users should be able to describe their FinOps needs in plain language and, in under 30 seconds, receive a helpful, actionable response from our system.
The Choice: OpenAI’s API as the Perfect Partner
To achieve this, we turned to OpenAI’s Large Language Model (LLM) API, which seemed like the perfect match for our goals. Specifically, OpenAI offers the capability to create “Assistants”, which are LLM-based agents designed to handle user queries. These agents leverage user-provided data files to generate responses according to detailed instructions. This feature aligned perfectly with our needs, as it allowed us to “teach” the AI the unique format of our workflows. By providing the Assistant with existing workflows and detailed guidelines, we believed we could enable it to generate accurate and usable workflows on demand.
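To make this concrete, the setup looks roughly like the sketch below, using the openai Python SDK's beta Assistants interface. The file name, model choice, and instructions here are illustrative stand-ins, not our production configuration:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Upload example workflows so the Assistant can "learn" our format
# via file search. The file name here is illustrative.
vector_store = client.beta.vector_stores.create(name="example-workflows")
client.beta.vector_stores.file_batches.upload_and_poll(
    vector_store_id=vector_store.id,
    files=[open("example_workflows.json", "rb")],
)

assistant = client.beta.assistants.create(
    name="FinOps Workflow Builder",
    model="gpt-4o",  # illustrative model choice
    instructions=(
        "You generate FinOps workflows in our JSON format. "
        "Follow the structure of the example workflows in your files exactly."
    ),
    tools=[{"type": "file_search"}],
    tool_resources={"file_search": {"vector_store_ids": [vector_store.id]}},
)
```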
Naive Optimism and Harsh Realities
Our initial approach was straightforward, albeit somewhat naive: we simply asked the Assistant to generate a complete workflow, ready for execution in our systems. To our delight, the Assistant quickly began returning responses that resembled the workflows in our system, including the many parameters necessary for seamless communication between our backend and frontend.
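In code, this naive approach amounted to little more than the following sketch (`client` and `assistant` come from the setup above; the helper name and prompt wording are simplified for illustration):

```python
def request_full_workflow(client, assistant_id: str, user_prompt: str) -> str:
    """Naively ask the Assistant for a complete, ready-to-run workflow."""
    thread = client.beta.threads.create()
    client.beta.threads.messages.create(
        thread_id=thread.id,
        role="user",
        content=f"Create a complete, executable workflow for: {user_prompt}",
    )
    # create_and_poll blocks until the run finishes.
    run = client.beta.threads.runs.create_and_poll(
        thread_id=thread.id, assistant_id=assistant_id
    )
    if run.status != "completed":
        raise RuntimeError(f"Run ended with status {run.status!r}")
    # The newest message in the thread is the Assistant's reply.
    messages = client.beta.threads.messages.list(
        thread_id=thread.id, order="desc", limit=1
    )
    return messages.data[0].content[0].text.value
```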
However, our excitement was short-lived. Despite looking correct on the surface, the generated workflows often lacked crucial elements, rendering them unusable. We tried to refine the Assistant's performance by explicitly addressing these common errors in its instructions. We also implemented validations within our system, using the API to continue the conversation with the Assistant and address any errors that arose. Despite these efforts, the Assistant frequently returned invalid responses, the process became bogged down with long response times, and we ran into OpenAI API rate limits because of the lengthy responses we were requesting.
Pivoting Strategy: Compacting the Workflow Representation
Realizing we needed a different approach, we decided to ask the Assistant for something simpler: a compact representation of the workflow structure, focusing on key steps and prominent parameters. This meant we had to enrich and adapt the Assistant’s output to fit our system’s format—a process that required a lot of trial and error.
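To illustrate the idea, here is a hypothetical example; the field names and defaults below are invented for this post and are not Wiv's actual schema. The Assistant emits only step types and their prominent parameters, and the system expands each step into the full format our backend and frontend expect:

```python
# Hypothetical compact output from the Assistant: step types plus key params.
compact = {
    "name": "Reduce ECS costs",
    "steps": [
        {"type": "cost_query", "service": "ECS"},
        {"type": "threshold_filter", "metric": "monthly_cost", "gt": 1000},
        {"type": "notify", "channel": "slack"},
    ],
}

# Defaults the system fills in so the backend/frontend contract is satisfied.
STEP_DEFAULTS = {
    "cost_query": {"granularity": "daily", "lookback_days": 30},
    "threshold_filter": {"mode": "include"},
    "notify": {"severity": "info"},
}

def enrich(compact_workflow: dict) -> dict:
    """Expand a compact workflow into the full executable representation."""
    full_steps = []
    for i, step in enumerate(compact_workflow["steps"]):
        defaults = STEP_DEFAULTS.get(step["type"], {})
        # The Assistant's explicit choices win over system defaults.
        full_steps.append({"id": f"step-{i}", **defaults, **step})
    return {"name": compact_workflow["name"], "version": 1, "steps": full_steps}
```

The design point is that the model decides what the workflow does, while deterministic code decides how it is fully spelled out.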
Our challenge was to strike the right balance. On one hand, the workflow representation needed to be compact enough to minimize errors that we couldn’t easily fix. On the other hand, we didn’t want to stifle the Assistant’s creativity by making the output too rigid.
The Breakthrough: Finding the Optimal Formula
After much experimentation, we finally found the sweet spot. Our new approach reduced the Assistant’s response length by about 80%, and we discovered that most workflows could be automatically completed from this compact representation without losing critical components.
Although there are still rare instances where the Assistant's initial response doesn't follow our expected format, our system now automatically validates the output. When errors are detected, the system continues the conversation (i.e., the thread) with the Assistant, presenting the validation failures and requesting a corrected response. On average, the final, valid response is obtained within just 20 seconds.
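Put together, the validate-and-repair loop looks roughly like this sketch; `validate` is a trivial stand-in for our actual structural checks, and `enrich` is the expansion step sketched earlier:

```python
import json

MAX_ATTEMPTS = 3

def validate(text: str) -> list[str]:
    """Placeholder check: the real validator inspects the workflow structure."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"response is not valid JSON: {exc}"]
    return []

def generate_workflow(client, assistant_id: str, user_prompt: str) -> dict:
    thread = client.beta.threads.create()
    client.beta.threads.messages.create(
        thread_id=thread.id, role="user", content=user_prompt
    )
    for _ in range(MAX_ATTEMPTS):
        run = client.beta.threads.runs.create_and_poll(
            thread_id=thread.id, assistant_id=assistant_id
        )
        if run.status != "completed":
            raise RuntimeError(f"Run ended with status {run.status!r}")
        reply = client.beta.threads.messages.list(
            thread_id=thread.id, order="desc", limit=1
        )
        text = reply.data[0].content[0].text.value
        errors = validate(text)
        if not errors:
            return enrich(json.loads(text))  # enrich: see the earlier sketch
        # Continue the same thread: show the failures, ask for a correction.
        client.beta.threads.messages.create(
            thread_id=thread.id,
            role="user",
            content="The response failed validation:\n- "
            + "\n- ".join(errors)
            + "\nPlease return a corrected response in the expected format.",
        )
    raise RuntimeError("No valid workflow after repeated attempts")
```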
The workflows generated by the feature are impressive in their complexity and in how appropriately they choose and use the many step types our system offers. The final output also includes a detailed description of the workflow components. For example, given the prompt "My ECS costs are very high, I want a workflow that will help me save money", the system returned the following workflow:
In this example, our system took a very abstract, vague prompt and translated it into a concrete workflow.
However, the feature will try to return a workflow even when the prompt does not clearly describe a FinOps use case. As an amusing example, the prompt "Give me a chicken soup recipe" generated the following workflow:
Conclusion: From Challenge to Success
What began as a challenging endeavor to integrate LLMs into Wiv’s platform evolved into a journey of innovation and discovery. By refining our approach and embracing a strategy of minimalism in the Assistant’s outputs, we were able to meet our goals and create a system that not only works efficiently but also provides our users with the effortless workflow creation experience we envisioned.
Key Takeaways: Finding the Optimal Response Format
Our key takeaway from this process: when working with LLMs to generate lengthy, strictly formatted responses, especially in formats the model has not widely seen, focus the model on the critical parts of the response that cannot be generated automatically. The responsibility should be divided between the LLM and the system: the LLM is tasked with understanding the user's input and creating the core of the expected response, while the system handles everything else. By letting the LLM focus on what it does best, you ensure a more efficient and accurate workflow-creation process.