A new way to optimize and prioritize AI projects for the GPU shortage

by WeeklyAINews

Generative AI, enabled by large language models (LLMs) like GPT-4, has sent shockwaves through the tech world. ChatGPT’s meteoric rise has prompted the global tech industry to reassess and prioritize gen AI, reshaping product strategies in real time.

Integration of LLMs has given product developers an easy way to incorporate AI-powered features into their products. But it’s not all smooth sailing. A glaring challenge looms large for product leaders: the GPU shortage and spiraling costs.

Rise of LLMs and GPU shortage

The increasing number of AI startups and services has led to high demand for high-end GPUs such as A100s and H100s, overwhelming Nvidia and its manufacturing partner TSMC, both of which are struggling to meet demand. Online forums like Reddit are abuzz with frustrations over GPU availability, echoing the sentiment across the tech community. It’s become so dire that both AWS and Azure have had no choice but to implement quota systems.

This bottleneck doesn’t just squeeze startups; it’s a stumbling block for tech giants like OpenAI. At a recent off-the-record meeting in London, OpenAI’s CEO Sam Altman candidly acknowledged that the computer chip shortage is stymieing ChatGPT’s progress. Altman reportedly lamented that the lack of computing power has resulted in subpar API availability and has prevented OpenAI from rolling out larger “context windows” for ChatGPT.

Prioritizing AI features

On the one hand, product leaders find themselves caught in a relentless push to innovate, facing expectations to deliver cutting-edge features that leverage the power of gen AI. On the other hand, they grapple with the harsh realities of GPU capacity constraints. It’s a complex juggling act, where ruthless prioritization becomes not just a strategic decision but a necessity.

Given that GPU availability is poised to remain a challenge for the foreseeable future, product leaders must think strategically about GPU allocation. Traditionally, product leaders have leaned on prioritization techniques like the Customer Value/Need vs. Effort Matrix. This method, however logical in a world where computational resources were abundant, now demands a bit of reevaluation.

In our current paradigm, where compute is the constraint and not software talent, product leaders must redefine how they prioritize various products or features, bringing GPU limitations to the forefront of strategic decision-making.

Planning around capacity constraints may seem unusual for the tech industry, but it’s a commonplace strategy in other industries. The underlying concept is straightforward: The most valuable factor is the time spent on the constrained resource, and the objective is to maximize the value per unit of time spent on that constraint.

Technology success metrics

As a former consultant, I’ve successfully applied this framework across various industries. I believe that tech product leaders can also use a similar approach to prioritize products or features while GPU constraints exist. When applying this framework, the most straightforward measure of value is profitability.

However, in tech, profitability might not always be the appropriate metric, particularly when venturing into a new market or product. Thus, I’ve adapted the framework to align with the success metrics typically used in tech, outlining a simple four-step process:

1. Contribution

First and foremost, identify your North Star metric. This is the contribution of each product or feature, something that encapsulates the essence of its worth. Some concrete examples might include:

  • An increase in revenue and profit
  • Gains in market share
  • Growth in the number of daily/monthly active users

2. Number of GPUs required

Gauge the number of GPUs needed for each product or feature. Focus on key factors including:

  • Number of queries per user per day
  • Number of daily active users
  • Complexity of the query (how many tokens each query consumes)
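The three factors above can be combined into a rough sizing estimate. The sketch below is only a back-of-the-envelope illustration: the function name, the `tokens_per_gpu_per_second` throughput figure and the `peak_to_average` multiplier are assumptions introduced here, not values from the article, and real throughput depends heavily on the model, hardware and batching.

```python
import math

SECONDS_PER_DAY = 86_400


def estimate_gpus(daily_active_users, queries_per_user_per_day,
                  tokens_per_query, tokens_per_gpu_per_second,
                  peak_to_average=1.0):
    """Rough GPU count needed to serve one feature's token volume.

    tokens_per_gpu_per_second and peak_to_average are hypothetical
    inputs; measure them for your own model and traffic pattern.
    """
    # Total tokens the feature processes in a day.
    tokens_per_day = (daily_active_users
                      * queries_per_user_per_day
                      * tokens_per_query)
    # Average load, scaled up to cover traffic peaks.
    peak_tokens_per_second = tokens_per_day / SECONDS_PER_DAY * peak_to_average
    # Round up: a fraction of a GPU still requires a whole GPU.
    return math.ceil(peak_tokens_per_second / tokens_per_gpu_per_second)


# Illustrative inputs only: 1M daily users, 10 queries each,
# 1,000 tokens per query, 500 tokens/s sustained per GPU.
gpus_needed = estimate_gpus(1_000_000, 10, 1_000, 500)
```

Raising `peak_to_average` above 1.0 sizes the fleet for busy periods rather than for the daily average.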

3. Calculate contribution per GPU

Break it down to the specifics. How does each GPU contribute to the overall goal? Understanding this will give you a clear picture of where your GPUs are best allocated.
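Written as a formula, this step is a simple ratio. The symbols below are notation introduced here for illustration: C_i is the contribution of product or feature i in your North Star metric, and G_i is its estimated GPU requirement from step 2.

```latex
\text{Contribution per GPU}_i = \frac{C_i}{G_i}
```

For instance, a product expected to contribute $80M that needs 450 GPUs yields $80M / 450, or roughly $0.18M per GPU.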

4. Prioritize products based on contribution per GPU

Now, it’s time to make the tough decisions. Rank your products by their Contribution per GPU, and then line them up accordingly. Focus on the products with the highest Contribution per GPU first, ensuring that your limited resources are channeled into the areas where they’ll make the most impact.

With GPU constraints no longer a blind spot but a quantifiable factor in the decision-making process, your company can more strategically navigate the GPU shortage. To bring this framework to life, let’s visualize a scenario where you, as a product leader, are grappling with the challenge of prioritizing among four different products:

                                   Product A    Product B     Product C    Product D
Revenue Potential (Contribution)   $100M        $80M          $50M         $25M
Number of GPUs Required            1,000        450           500          50
Contribution Per GPU               $0.1M/GPU    $0.18M/GPU    $0.1M/GPU    $0.5M/GPU

Although Product A has the highest revenue potential, it doesn’t yield the highest contribution per GPU. Surprisingly, Product D, with the least revenue potential, offers the most substantial return per GPU. By prioritizing based on this metric, you can maximize total potential revenue.

Let’s say you have a total of 1,000 GPUs at your disposal. A straightforward choice might have you opting for Product A, generating a revenue potential of $100 million. However, by applying the prioritization strategy described above, you can achieve $155 million in revenue:

Priority Order   Product     Revenue Gain   GPUs
1                Product D   $25M           50
2                Product B   $80M           450
3                Product C   $50M           500
Total                        $155M          1,000
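The allocation in this example can be reproduced with a few lines of code. The sketch below is one possible reading of the method, not the author’s implementation: the `Product` class and `prioritize` function are hypothetical names, and skipping a product that does not fit in the remaining budget (rather than stopping or funding it partially) is an assumption. Ranking by ratio is a heuristic; it happens to fill the 1,000-GPU budget exactly here, but it is not guaranteed to find the best possible combination for every set of numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    name: str
    contribution: float  # North Star metric, here revenue potential in $M
    gpus: int            # estimated GPUs required

    @property
    def contribution_per_gpu(self):
        return self.contribution / self.gpus


def prioritize(products, gpu_budget):
    """Fund products in descending contribution-per-GPU order.

    Assumption: a product that does not fit in the remaining
    budget is skipped, and lower-ranked products are still tried.
    """
    selected = []
    remaining = gpu_budget
    ranked = sorted(products, key=lambda p: p.contribution_per_gpu,
                    reverse=True)
    for product in ranked:
        if product.gpus <= remaining:
            selected.append(product)
            remaining -= product.gpus
    return selected


# Figures from the revenue example in the text.
PRODUCTS = [
    Product("A", 100, 1_000),
    Product("B", 80, 450),
    Product("C", 50, 500),
    Product("D", 25, 50),
]

plan = prioritize(PRODUCTS, gpu_budget=1_000)
total_revenue = sum(p.contribution for p in plan)  # in $M
total_gpus = sum(p.gpus for p in plan)
```

Because `contribution` is just a number, the same function works unchanged for other metrics such as market share gain.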

The same method can be applied to other contribution metrics, such as market share gain:

                                   Product A    Product B     Product C    Product D
Market Share Gain (Contribution)   5%           4%            2.5%         1.25%
Number of GPUs Required            1,000        450           500          50
Contribution Per GPU               0.005%/GPU   0.009%/GPU    0.005%/GPU   0.025%/GPU

Similarly, opting for Product A would have led to a market share gain of 5%. However, applying the prioritization strategy described above, you can achieve 7.75% in market share gain:

Priority Order   Product     Market Share Gain   GPUs
1                Product D   1.25%               50
2                Product B   4%                  450
3                Product C   2.5%                500
Total                        7.75%               1,000

Benefits and limitations

This alternative prioritization framework introduces a more nuanced and strategic approach. By zeroing in on the Contribution Per GPU, you’re strategically aligning resources where they can make the most substantial difference, whether in terms of revenue, market share or any other defining metric.

But the advantages don’t stop there. This method also fosters a greater sense of clarity and objectivity across product teams. In my experience, including my early days leading digital transformation at a healthcare company and later while working with various McKinsey clients, this approach has been a game-changer in scenarios where capacity constraints are a critical factor. It’s enabled us to prioritize initiatives in a more data-driven and rational way, sidelining the traditional politics where decisions might otherwise fall to the loudest voice in the room.

However, no one-size-fits-all solution exists, and it’s worth acknowledging the potential limitations of this method. For instance, this approach may not always capture the strategic importance of certain investments. Thus, while exceptions to the framework can and should be made, they should be carefully considered rather than the norm. This maintains the integrity of the process and ensures that any deviations are made with a broader strategic context in mind.

Conclusion

Product leaders are facing an unprecedented situation with the GPU shortage, so finding new ways of managing resources is essential. In the words of the great strategist Sun Tzu, “In the midst of chaos, there is also opportunity.”

The GPU shortage is indeed a challenge, but with the right approach, it can also be a catalyst for differentiation and success. The proposed prioritization framework, focusing on Contribution Per GPU, offers a strategic way to prioritize. By zeroing in on Contribution Per GPU, companies can maximize their return on investment, aligning resources where they’ll make the most impact and focusing on what matters most to the long-term success of their company.

Prerak Garg is senior director of cloud and AI corporate strategy at Microsoft and a former McKinsey & Company engagement manager.
