Taming the cost of AI: Is FinOps the answer?

As artificial intelligence (AI) services, particularly generative AI (genAI), become increasingly integral to modern enterprises, establishing a robust financial operations (FinOps) strategy is essential. AI workloads demand substantial CPU, GPU and memory resources, and cloud providers such as Amazon Web Services (AWS), Microsoft Azure and Google Cloud offer many AI services, including genAI features, to meet that demand. When using these services, organizations must keep a close eye on consumption, because AI spending can escalate quickly.

The advent of AI services, particularly genAI, has revolutionized various industries, enhancing capabilities and driving innovation. However, the financial complexities posed by these advanced technologies necessitate a robust FinOps strategy to ensure cost efficiency and sustainability.

Establishing a governance model and cost management strategy for AI services plays a vital role in the AI strategy. FinOps provides the structure to achieve cost transparency, cost management and cost optimization, ensuring that AI services are not only effective but also economically sustainable. This article delves into developing FinOps solutions tailored for AI services, highlighting the unique considerations and strategic approaches necessary.

Defining a FinOps strategy for AI

A comprehensive FinOps strategy for AI services involves several critical components, each aimed at fostering an environment of financial clarity and control.

Cost transparency

Cost transparency is the cornerstone of any FinOps strategy. For AI services, this entails detailed tracking and reporting of expenses associated with AI workloads. By leveraging granular cost data, organizations can identify cost drivers, allocate expenses accurately and make informed financial decisions.

Achieving cost transparency involves making the cost of AI services visible and comprehensible to all stakeholders. For AI services, this implies breaking down costs associated with data processing, model training and inferencing.

Data processing costs: Track storage, retrieval and preprocessing costs.
Model training costs: Monitor expenses related to computational resources during model development.
Inferencing costs: Identify costs incurred during the deployment and real-time usage of AI models.
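The breakdown above can be sketched in a few lines of Python. This is a minimal illustration, not a billing integration: the line items, service names and dollar amounts are hypothetical, and it assumes each billing record already carries a lifecycle-stage tag.

```python
from collections import defaultdict

# Hypothetical billing line items: (service, lifecycle stage tag, cost in USD).
LINE_ITEMS = [
    ("object-storage", "data_processing", 120.0),
    ("etl-jobs", "data_processing", 80.0),
    ("gpu-cluster", "training", 950.0),
    ("model-endpoint", "inferencing", 310.0),
    ("vector-db", "inferencing", 40.0),
]


def cost_by_stage(items):
    """Sum cost per lifecycle stage and return {stage: (total, share_of_spend)}."""
    totals = defaultdict(float)
    for _service, stage, cost in items:
        totals[stage] += cost
    grand_total = sum(totals.values())
    return {
        stage: (total, total / grand_total if grand_total else 0.0)
        for stage, total in totals.items()
    }


if __name__ == "__main__":
    for stage, (total, share) in sorted(cost_by_stage(LINE_ITEMS).items()):
        print(f"{stage:16s} ${total:9.2f}  {share:6.1%}")
```

Reporting each stage's share of total spend, not just its absolute cost, makes the dominant cost driver visible to non-technical stakeholders.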

Figure: The stages of defining a FinOps strategy for AI services.

Dr. Magesh Kasthuri

Cost management

Effective cost management practices are crucial for maintaining budgetary discipline. This includes proactive budgeting, regular financial reviews and the implementation of cost allocation policies that ensure accountability. For AI services, cost management also involves optimizing resource utilization to prevent overspending.

Cost management in AI services requires proactive monitoring and control of spending across different phases of the AI lifecycle. This includes setting budgets, forecasting costs based on usage patterns and implementing automated alerts for cost overruns.

Budgeting: Establish budgets for each AI project and track adherence.
Cost forecasting: Use historical data to predict future costs and adjust allocations accordingly.
Automated alerts: Configure alerts to notify stakeholders of any unexpected cost spikes or budget breaches.
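As a rough sketch of how budgeting, forecasting and alerting fit together, the Python below projects month-end spend from the daily average so far and raises alerts against a budget. The function names, thresholds and figures are assumptions made for illustration; a production setup would more likely rely on a cloud provider's budgeting tools and a more careful forecasting method than a straight-line average.

```python
def forecast_month_end(daily_costs, days_in_month):
    """Project month-end spend from the average daily cost observed so far."""
    if not daily_costs:
        return 0.0
    average = sum(daily_costs) / len(daily_costs)
    return average * days_in_month


def budget_alerts(daily_costs, budget, days_in_month, thresholds=(0.5, 0.8, 1.0)):
    """Return (spent, projected, alerts) for one project's month-to-date costs."""
    spent = sum(daily_costs)
    projected = forecast_month_end(daily_costs, days_in_month)
    alerts = []
    # Alert on actual spend crossing each (hypothetical) budget threshold.
    for threshold in thresholds:
        if spent >= threshold * budget:
            alerts.append(f"actual spend reached {threshold:.0%} of budget")
    # Alert early when the straight-line forecast overshoots the budget.
    if projected > budget:
        alerts.append("forecast exceeds budget")
    return spent, projected, alerts


if __name__ == "__main__":
    # Ten days at a hypothetical $100/day against a $2,500 monthly budget.
    spent, projected, alerts = budget_alerts([100.0] * 10, 2500.0, 30)
    print(spent, projected, alerts)
```

Separating the forecast alert from the threshold alerts lets stakeholders react before the budget is actually breached.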

Cost optimization

Cost optimization goes beyond mere cost control; it seeks to maximize the value derived from AI investments. This involves leveraging advanced techniques such as predictive analytics for cost forecasting, automation of cost management processes and continuous refinement of financial strategies to identify and eliminate inefficiencies.

Optimizing costs for AI services involves leveraging various techniques to reduce expenses without compromising performance. Key strategies include rightsizing resources, utilizing spot instances and scheduling workloads during off-peak hours.

Rightsizing: Ensure computational resources match the specific needs of AI tasks.
Spot instances: Utilize discounted compute instances for non-urgent workloads.
Workload scheduling: Run compute-intensive tasks during periods of lower demand to take advantage of lower prices.
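To make the spot-instance trade-off concrete, the following sketch compares the cost of a training job on on-demand capacity with the same job on discounted, interruptible capacity. The hourly rate, discount and rework fraction are hypothetical inputs, not published prices; the rework fraction is an assumed allowance for work repeated after interruptions.

```python
def job_cost(hours, hourly_rate, discount=0.0, rework_fraction=0.0):
    """Estimate job cost in USD.

    hours           -- compute hours the job needs without interruptions
    hourly_rate     -- on-demand price per hour (hypothetical)
    discount        -- fractional discount for spot capacity (0.6 means 60% off)
    rework_fraction -- extra hours, as a fraction, lost to interruptions
    """
    effective_hours = hours * (1 + rework_fraction)
    return effective_hours * hourly_rate * (1 - discount)


if __name__ == "__main__":
    on_demand = job_cost(hours=100, hourly_rate=3.0)
    spot = job_cost(hours=100, hourly_rate=3.0, discount=0.6, rework_fraction=0.1)
    print(f"on-demand: ${on_demand:.2f}, spot: ${spot:.2f}")
```

Under these assumed numbers the spot run remains cheaper even after paying for repeated work, but the comparison flips if interruptions become frequent enough, which is why spot capacity suits non-urgent, checkpointable workloads.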

FinOps for AI

AI requires FinOps teams to learn new terminology and concepts, collaborate with new stakeholders and understand spending and discounting models to optimize costs. Specific challenges include managing specialized services, optimizing GPU instances and handling specialized data ingestion requirements. The rapid impact of AI costs on diverse cross-functional teams adds to the complexity.

The accessibility of genAI services has led to non-traditional user groups, such as product, marketing, sales and leadership, directly contributing to AI-driven cloud expenses. The scarcity of GPUs has created a volatile infrastructure market, and diverse implementation models and cost structures make achieving FinOps goals more complex.

To optimize GPU instances effectively, FinOps teams need to focus on several strategies. Firstly, they should ensure that GPU resources are allocated efficiently to avoid wastage. This can be achieved by monitoring usage patterns and adjusting allocations based on actual needs. Additionally, leveraging spot instances or reserved instances can help reduce costs. FinOps teams should also explore the use of auto-scaling to dynamically adjust GPU resources based on demand.
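One simple way to turn observed usage into an allocation decision is a target-tracking rule: scale the GPU count so that average utilization lands near a chosen target. The sketch below is illustrative only; the 70% target, the fleet limits and the function name are assumptions, and real auto-scaling services apply additional safeguards such as cooldown periods.

```python
def desired_gpu_count(current_gpus, avg_utilization_pct, target_pct=70,
                      min_gpus=1, max_gpus=16):
    """Suggest a GPU count that would bring utilization close to target_pct.

    Utilization values are whole-number percentages so the ceiling division
    below stays in exact integer arithmetic.
    """
    if current_gpus <= 0:
        return min_gpus
    # Ceiling of current_gpus * avg_utilization_pct / target_pct.
    needed = -(-current_gpus * avg_utilization_pct // target_pct)
    # Clamp to the allowed fleet size.
    return max(min_gpus, min(max_gpus, needed))


if __name__ == "__main__":
    print(desired_gpu_count(8, 35))   # underused fleet: scale in
    print(desired_gpu_count(4, 95))   # saturated fleet: scale out
```

Running the same rule on a schedule against monitored utilization gives a basic form of the demand-driven adjustment described above.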

The spending and discounting models specific to AI are evolving to address the unique demands and complexities of AI services. Current models include:

Resource-based pricing: This model charges based on the resources consumed, such as GPU hours or data processed. For example, OpenAI uses a token-based model, while Synthesia.io, an AI video generation service, charges per minute of video generated.
Outcome-based pricing: This model aligns costs with the outcomes delivered. For instance, some companies charge based on the number of tasks completed or the success rate of AI applications.
Hybrid models: These combine elements of resource-based and outcome-based pricing. For example, some creative tools offer unlimited edits or dynamic credit systems, reflecting a blend of resource usage and value delivered.
Per-conversation pricing: Companies like Salesforce charge a fixed amount per conversation, which can be predictable for seasonal workloads but may become costly for continuous, high-volume scenarios.
Time-based pricing: Established players like Microsoft charge based on the time AI services are used, such as $4 per hour.
Success-based pricing: Newer entrants like Zendesk align costs with the success of AI applications, such as charging a fee based on the resolution of customer queries.

These models reflect a shift from traditional per-seat software pricing to more dynamic and value-driven approaches, ensuring that AI investments are aligned with the value they deliver.
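To see how the choice of pricing model changes the bill for the same workload, the sketch below computes a monthly cost under four of the models listed above. Apart from the $4 per hour time-based rate mentioned in the list, every rate and volume here is a hypothetical placeholder, not a vendor's published price.

```python
def token_based(conversations, tokens_per_conversation, price_per_1k_tokens):
    """Resource-based cost: pay for the tokens consumed."""
    return conversations * tokens_per_conversation / 1000 * price_per_1k_tokens


def per_conversation(conversations, price_per_conversation):
    """Fixed fee for each conversation, regardless of length."""
    return conversations * price_per_conversation


def time_based(hours, price_per_hour):
    """Pay for the hours the service is in use."""
    return hours * price_per_hour


def success_based(conversations, resolution_rate, price_per_resolution):
    """Pay only for conversations that end in a resolved query."""
    return conversations * resolution_rate * price_per_resolution


if __name__ == "__main__":
    volume = 10_000  # hypothetical conversations per month
    costs = {
        "token-based": token_based(volume, 2000, 0.01),
        "per-conversation": per_conversation(volume, 2.0),
        "time-based": time_based(720, 4.0),
        "success-based": success_based(volume, 0.6, 1.5),
    }
    for model, cost in sorted(costs.items(), key=lambda item: item[1]):
        print(f"{model:18s} ${cost:10.2f}")
```

Rerunning the comparison at several volumes shows where a model that looks predictable at low usage becomes the expensive option for continuous, high-volume scenarios.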

Comparing FinOps for AI services and other cloud services

FinOps for AI services differs from traditional cloud services due to the distinct nature of AI workloads. While traditional services might focus more on static resource allocation, AI services require dynamic scaling, high computational power and extensive data processing capabilities. Consequently, FinOps strategies for AI must account for these unique demands.

Feature	Azure	AWS	GCP
Cost transparency	Azure Cost Management and Billing	AWS Cost Explorer	GCP Cost Management
Cost management	Azure Budgets and Alerts	AWS Budgets and Alarms	GCP Budgets and Notifications
Cost optimization	Azure Advisor	AWS Trusted Advisor	GCP Recommender
Cost allocation	Tag-based Cost Allocation	Resource Tagging and Allocation	Label-Based Cost Allocation
Budgeting	Azure Budgets	AWS Budgets	GCP Budgets
Predictive analytics	Forecasting Tools	Forecasting and Recommendations	AI-Driven Forecasting
Automation	Automation and Orchestration	Cost Management Automation	Automated Tools and Scripts
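Tag- or label-based cost allocation, listed for all three providers in the table, follows the same basic logic regardless of platform. The sketch below is provider-neutral and uses hypothetical resource records and a hypothetical "team" tag key; it does not call any cloud API.

```python
def allocate_by_tag(resources, tag_key="team"):
    """Group resource costs by the value of tag_key.

    Returns (allocated, untagged): a {tag value: cost} mapping and the total
    cost of resources that lack the tag and so cannot be attributed.
    """
    allocated = {}
    untagged = 0.0
    for resource in resources:
        owner = resource.get("tags", {}).get(tag_key)
        if owner is None:
            untagged += resource["cost"]
        else:
            allocated[owner] = allocated.get(owner, 0.0) + resource["cost"]
    return allocated, untagged


if __name__ == "__main__":
    # Hypothetical resources with monthly cost in USD.
    resources = [
        {"name": "gpu-train-01", "cost": 900.0, "tags": {"team": "research"}},
        {"name": "chat-endpoint", "cost": 300.0, "tags": {"team": "support"}},
        {"name": "embed-endpoint", "cost": 150.0, "tags": {"team": "support"}},
        {"name": "scratch-bucket", "cost": 50.0, "tags": {}},
    ]
    allocated, untagged = allocate_by_tag(resources)
    print(allocated, untagged)
```

Tracking the untagged total separately is a useful accountability signal: a growing unattributed bucket usually means tagging policies are not being enforced.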

Why FinOps strategy for AI services?

While the principles of FinOps apply universally to cloud services, the unique characteristics of AI workloads necessitate a tailored approach. Unlike traditional cloud services, AI workloads often involve high computational requirements, dynamic scaling and specialized hardware, all of which contribute to increased complexity and cost.

High computational requirements

AI services frequently require significant computational power, leading to escalated costs. FinOps strategies for AI must account for these requirements by optimizing the use of computational resources and exploring cost-effective alternatives.

Dynamic scaling

AI workloads can vary significantly, necessitating dynamic scaling of resources. FinOps for AI must incorporate strategies to manage this variability, ensuring that resources are scaled efficiently to match demand without incurring unnecessary costs.

Specialized hardware

AI services often rely on specialized hardware, such as GPUs and TPUs, which can be expensive. FinOps strategies must include measures to optimize the utilization of these resources, balancing performance and cost.

Strategic operating and maturity models

Implementing a strategic operating model for AI services using FinOps requires a systematic approach. Organizations must establish clear governance structures, define roles and responsibilities and foster a culture of financial accountability.

Strategic operating model

To develop a strategic operating model for AI services using the FinOps framework, organizations should focus on the following steps:

Governance: Establish clear policies and governance structures to manage AI costs effectively. Effective governance structures ensure that financial management practices are aligned with organizational objectives. This includes setting up FinOps teams, establishing policies and procedures and ensuring regular financial oversight.
Stakeholder engagement: Involve all relevant stakeholders, including finance, engineering and operations teams, in the FinOps process. Defining roles and responsibilities is crucial for the successful implementation of FinOps practices. This involves assigning specific tasks to finance, engineering and operations teams, ensuring collaboration and accountability.
Continuous improvement: Implement a cycle of continuous monitoring, assessment and refinement of FinOps practices. Fostering a culture of financial accountability is essential for sustained cost management. This includes promoting cost awareness across the organization, encouraging responsible spending and incentivizing cost-saving initiatives.

Maturity model

A FinOps maturity model for AI services helps organizations assess their current capabilities and identify areas for improvement. It can be structured into the following stages:

Initial: Basic cost tracking and reporting are in place. The focus is on understanding cost drivers and establishing a foundation for financial management.
Developing: Proactive cost management practices are implemented, including regular budget reviews and cost allocation policies. The focus is on enhancing cost visibility and control.
Mature: Advanced cost optimization techniques, such as predictive analytics and automation, are utilized. There is a culture of cost awareness across the organization, and financial management practices are well-established.
Leading: The organization employs predictive analytics for cost forecasting and has fully automated cost management processes. The focus is on continuous improvement and innovation in financial management.
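A self-assessment against these stages can be expressed as a simple checklist. In the sketch below, the capability names are shorthand labels derived from the stage descriptions above; the scoring rule (a stage counts only when every earlier stage is also complete) is an assumption for illustration, not part of any formal FinOps standard.

```python
# Ordered stages with the capabilities each one adds (labels are illustrative).
STAGES = [
    ("Initial", {"cost_tracking", "cost_reporting"}),
    ("Developing", {"budget_reviews", "cost_allocation_policies"}),
    ("Mature", {"advanced_optimization", "cost_aware_culture"}),
    ("Leading", {"predictive_forecasting", "automated_cost_management"}),
]


def assess_maturity(capabilities):
    """Return (stage_reached, next_stage, missing_for_next_stage).

    stage_reached is None when even the Initial stage is incomplete;
    next_stage is None once the Leading stage has been reached.
    """
    reached = None
    for name, required in STAGES:
        missing = required - capabilities
        if missing:
            return reached, name, missing
        reached = name
    return reached, None, set()


if __name__ == "__main__":
    current = {"cost_tracking", "cost_reporting", "budget_reviews"}
    print(assess_maturity(current))
```

Returning the missing capabilities alongside the stage turns the assessment into a short list of concrete next steps.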

Conclusion: Take a nuanced approach

Implementing a FinOps strategy for AI services, particularly genAI, necessitates a nuanced approach that addresses the unique requirements of AI workloads. By focusing on cost transparency, management and optimization and by leveraging the capabilities of major cloud platforms such as Azure, AWS and GCP, organizations can develop a robust financial framework. Furthermore, adopting a strategic operating model and progressing through a FinOps maturity model ensures that AI services remain both effective and cost-efficient.

Magesh Kasthuri holds a Ph.D. in artificial intelligence and genetic algorithms. He currently works as a distinguished member of the technical staff and Principal Consultant at Wipro Ltd.

This article was made possible by our partnership with the IASA Chief Architect Forum. The CAF’s purpose is to test, challenge and support the art and science of Business Technology Architecture and its evolution over time as well as grow the influence and leadership of chief architects both inside and outside the profession. The CAF is a leadership community of the IASA, the leading non-profit professional association for business technology architects. 
