Scaling generative AI with flexible model choices

This blog series demystifies enterprise generative AI (gen AI) for business and technology leaders. It provides simple frameworks and guiding principles for your transformative artificial intelligence (AI) journey. In the previous blog, we discussed the differentiated approach by IBM to delivering enterprise-grade models. In this blog, we delve into why foundation model choices matter and how they empower businesses to scale gen AI with confidence.

Why are model choices important?

In the dynamic world of gen AI, one-size-fits-all approaches are inadequate. As businesses strive to harness the power of AI, having a spectrum of model choices at their disposal is necessary to:

  • Spur innovation: A diverse palette of models not only fosters innovation by bringing distinct strengths to tackle a wide array of problems but also enables teams to adapt to evolving business needs and customer expectations.
  • Customize for competitive advantage: A range of models allows companies to tailor AI applications for niche requirements, providing a competitive edge. Gen AI can be fine-tuned to specific tasks, whether that is powering question-answering chat applications, writing code or generating quick summaries.
  • Accelerate time to market: In today’s fast-paced business environment, time is of the essence. A diverse portfolio of models can expedite the development process, allowing companies to introduce AI-powered offerings rapidly. This is especially crucial in gen AI, where access to the latest innovations provides a pivotal competitive advantage.
  • Stay flexible in the face of change: Market conditions and business strategies constantly evolve. Various model choices allow businesses to pivot quickly and effectively. Access to multiple options enables rapid adaptation when new trends or strategic shifts occur, maintaining agility and resilience.
  • Optimize costs across use cases: Different models have varying cost implications. By accessing a range of models, businesses can select the most cost-effective option for each application. While some tasks might require the precision of high-cost models, others can be addressed with more affordable alternatives without sacrificing quality. For instance, in customer care, throughput and latency might be more critical than accuracy, whereas in research and development, accuracy matters more.
  • Mitigate risks: Relying on a single model or a limited selection can be risky. A diverse portfolio of models mitigates concentration risk, helping businesses remain resilient to the shortcomings or failure of one specific approach. This strategy distributes risk and provides alternative solutions if challenges arise.
  • Comply with regulations: The regulatory landscape for AI is still evolving, with ethical considerations at the forefront. Different models can have varied implications for fairness, privacy and compliance. A broad selection allows businesses to navigate this complex terrain and choose models that meet legal and ethical standards.
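
To make the cost point concrete, here is a minimal Python sketch of choosing the cheapest model that still meets a use case's latency and accuracy requirements. Everything in it is an assumption for illustration: the model names, the prices, the latency and accuracy figures, and the helper `cheapest_fit` are hypothetical and do not describe any real catalog or API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ModelProfile:
    name: str
    cost_per_1k_tokens: float  # illustrative price, not a real quote
    p95_latency_ms: float      # illustrative latency
    accuracy: float            # 0..1 on a hypothetical internal eval set


@dataclass(frozen=True)
class UseCase:
    name: str
    max_latency_ms: float
    min_accuracy: float


def cheapest_fit(models: List[ModelProfile], use_case: UseCase) -> Optional[ModelProfile]:
    """Return the lowest-cost model that satisfies the use case, or None."""
    eligible = [
        m for m in models
        if m.p95_latency_ms <= use_case.max_latency_ms
        and m.accuracy >= use_case.min_accuracy
    ]
    return min(eligible, key=lambda m: m.cost_per_1k_tokens, default=None)


# Hypothetical catalog: all numbers are made up for the example.
CATALOG = [
    ModelProfile("small-model", 0.0002, 150, 0.81),
    ModelProfile("medium-model", 0.0010, 400, 0.88),
    ModelProfile("large-model", 0.0060, 1200, 0.93),
]

# Customer care tolerates lower accuracy but needs fast replies;
# research and development accepts slower replies but needs accuracy.
customer_care = UseCase("customer care", max_latency_ms=300, min_accuracy=0.80)
research = UseCase("research and development", max_latency_ms=2000, min_accuracy=0.90)

for use_case in (customer_care, research):
    choice = cheapest_fit(CATALOG, use_case)
    print(use_case.name, "->", choice.name if choice else "no model qualifies")
```

With these made-up numbers, the two use cases end up on different models, which is the point of keeping more than one option available.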

Selecting the right AI models

Now that we understand the importance of model selection, how do we address the choice overload problem when selecting the right model for a specific use case? We can break down this complex problem into a set of simple steps that you can apply today:

  1. Identify a clear use case: Determine the specific needs and requirements of your business application. This involves crafting detailed prompts that consider subtleties within your industry and business to help ensure that the model aligns closely with your objectives.
  2. List all model options: Evaluate various models based on size, accuracy, latency and associated risks. This includes understanding each model’s strengths and weaknesses, such as the tradeoffs between accuracy, latency and throughput.
  3. Evaluate model attributes: Assess the appropriateness of the model’s size relative to your needs, considering how the model’s scale might affect its performance and the risks involved. This step focuses on right-sizing the model to fit the use case, because bigger is not necessarily better. Smaller models can outperform larger ones in targeted domains and use cases.
  4. Test model options: Conduct tests to see if the model performs as expected under conditions that mimic real-world scenarios. This involves using academic benchmarks and domain-specific data sets to evaluate output quality and tweaking the model, for example, through prompt engineering or model tuning to optimize its performance.
  5. Refine your selection based on cost and deployment needs: After testing, refine your choice by considering factors such as return on investment, cost-effectiveness and the practicalities of deploying the model within your existing systems and infrastructure. Adjust the choice based on other benefits such as lower latency or higher transparency.
  6. Choose the model that provides the most value: Make the final selection of an AI model that offers the best balance between performance, cost and associated risks, tailored to the specific demands of your use case.
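
Steps 4 to 6 can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: `tiny_model` and `big_model` are hypothetical stand-ins for real model endpoints, the evaluation set is invented, and the cost and latency figures and the weights are made up. It scores each candidate on a small domain-specific test set and then ranks the candidates with a weighted sum of quality, cost and latency.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Model = Callable[[str], str]


# Hypothetical stand-ins for real model endpoints (intent classification).
def tiny_model(prompt: str) -> str:
    return "refund" if "money back" in prompt else "other"


def big_model(prompt: str) -> str:
    if "money back" in prompt:
        return "refund"
    if "password" in prompt:
        return "account"
    return "other"


@dataclass(frozen=True)
class Candidate:
    fn: Model
    cost: float        # relative cost per request, made up
    latency_ms: float  # made up


# Step 4: a tiny, invented domain-specific test set of (prompt, expected label).
EVAL_SET: List[Tuple[str, str]] = [
    ("I want my money back", "refund"),
    ("I forgot my password", "account"),
    ("What are your opening hours?", "other"),
    ("Please give me my money back now", "refund"),
]

CANDIDATES: Dict[str, Candidate] = {
    "tiny-model": Candidate(tiny_model, cost=1.0, latency_ms=100.0),
    "big-model": Candidate(big_model, cost=10.0, latency_ms=800.0),
}


def accuracy(model: Model, examples: List[Tuple[str, str]]) -> float:
    """Fraction of examples where the model output matches exactly."""
    hits = sum(1 for prompt, expected in examples if model(prompt) == expected)
    return hits / len(examples)


def rank(candidates: Dict[str, Candidate],
         examples: List[Tuple[str, str]],
         weights: Dict[str, float]) -> List[Tuple[str, float]]:
    """Steps 5 and 6: weighted score, best first. Lower cost and latency score higher."""
    min_cost = min(c.cost for c in candidates.values())
    min_latency = min(c.latency_ms for c in candidates.values())
    scored = []
    for name, c in candidates.items():
        score = (weights["quality"] * accuracy(c.fn, examples)
                 + weights["cost"] * (min_cost / c.cost)
                 + weights["latency"] * (min_latency / c.latency_ms))
        scored.append((name, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


BALANCED = {"quality": 0.5, "cost": 0.25, "latency": 0.25}
QUALITY_FIRST = {"quality": 0.9, "cost": 0.05, "latency": 0.05}

print(rank(CANDIDATES, EVAL_SET, BALANCED))
print(rank(CANDIDATES, EVAL_SET, QUALITY_FIRST))
```

The weights encode the priorities of the use case, so the same candidates can rank differently under different weights. In practice the test set would be far larger, and exact-match scoring would give way to a metric suited to the task.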

Download our model evaluation guide

IBM watsonx™ model library

By pursuing a multimodel strategy, the IBM watsonx library offers proprietary, open source and third-party models, as shown in the image:

List of watsonx foundation models as of 8 May 2024.

This provides clients with a range of choices, allowing them to select the model that best fits their unique business, regional and risk preferences.

Also, watsonx enables clients to deploy models on the infrastructure of their choice, with hybrid, multicloud and on-premises options, to avoid vendor lock-in and reduce the total cost of ownership.

IBM® Granite™: Enterprise-grade foundation models from IBM

The characteristics of foundation models can be grouped into 3 main attributes. Organizations must understand that overemphasizing one attribute might compromise the others. Balancing these attributes is key to customizing the model for an organization’s specific needs:

  1. Trusted: Models that are clear, explainable and harmless.
  2. Performant: The right level of performance for targeted business domains and use cases.
  3. Cost-effective: Models that offer gen AI at a lower total cost of ownership and reduced risk.

IBM Granite is a flagship series of enterprise-grade models developed by IBM Research®. These models feature an optimal mix of these attributes, with a focus on trust and reliability, enabling businesses to succeed in their gen AI initiatives. Remember, businesses cannot scale gen AI with foundation models they cannot trust.

View performance benchmarks from our research paper on Granite

IBM watsonx offers enterprise-grade AI models resulting from a rigorous refinement process. This process begins with model innovation led by IBM Research, involving open collaborations and training on enterprise-relevant content under the IBM AI Ethics Code to promote data transparency.

IBM Research has developed an instruction-tuning technique that enhances both IBM-developed and select open-source models with capabilities essential for enterprise use. Beyond academic benchmarks, our ‘FM_EVAL’ data set simulates real-world enterprise AI applications. The most robust models from this pipeline are made available on IBM watsonx, providing clients with reliable, enterprise-grade gen AI foundation models, as shown in the image:

Latest model announcements:

  • Granite code models: A family of models trained on 116 programming languages and ranging in size from 3 to 34 billion parameters, in both base and instruction-following variants.
  • Granite-7b-lab: Supports general-purpose tasks and is tuned using IBM’s large-scale alignment for chatbots (LAB) methodology to incorporate new skills and knowledge.

Try our enterprise-grade foundation models on watsonx with our new chat demo. Discover their capabilities in summarization, content generation and document processing through a simple and intuitive chat interface.

Learn more about IBM watsonx foundation models

The post Scaling generative AI with flexible model choices appeared first on IBM Blog.