AWS's Positioning in GenAI, and Re:Invent 2023 Recap
What game is AWS playing in the generative AI space?
In generative AI, AWS is not the first mover. Its flagship chatbot (Amazon Q) launched a full year after ChatGPT, and has already been panned for severe performance and hallucination issues. Its main LLM product (Bedrock) is a managed service hosting foundational models from 3rd-party providers (e.g. Anthropic, Meta), and its own model family (Titan) has failed to gain comparable mindshare.
But AWS may still succeed in generative AI thanks to its relentless fast-follower strategy and its focus on enterprise and industry customers. If anything, being a fast follower and a neutral observer of the LLM war (among OpenAI, Meta, and the broader OSS community) has helped AWS focus on what it does best - distributing managed services to the enterprise laggards, arguably the most profitable segment of the generative AI industry.
In other words, don’t underestimate a competitor who already has the distribution and is willing to execute a fast-follower strategy.
In this post, we will discuss:
why AWS has been able to get away with being a fast follower, and how the chaos of the LLM space has helped it
exactly how AWS will respond strategically to that chaos, as gleaned from recent product announcements and customer conversations
some other interesting highlights from Re:Invent 2023
AWS’s Fast-Follower Strategy
Until now (December 2023), AWS seemed to be in a tenuous position in generative AI. Its flagship Bedrock GA launch was delayed until late September, a full 10 months after ChatGPT’s release, and its foundational models were difficult to demo. Even the recent releases of Amazon Q and PartyRock have been either panned or ignored.
But now that Re:Invent 2023 is over, it’s time to digest the 100+ product releases and reassess AWS’s positioning and strategy in GenAI. Luckily for AWS, things are finally starting to look better, and there are even signs of a coherent generative AI strategy: play the fast-follower game, with a marked focus on enterprise instead of chasing startups. And in a weird way, this fast-follower strategy is working.
So what do we mean by AWS’s fast-follower strategy? It boils down to:
Leveraging AWS Bedrock (a serverless foundational model service) as a “curated model hub” for hosting models from providers like Anthropic, Meta, Stability, etc (see the sketch after this list). The bet is that AWS profits from the distribution of LLMs, and that it will eventually have more leverage over model providers, who are fighting a war of attrition between the OSS community and proprietary model vendors. This also lets AWS focus on building distribution, a classic Amazon speciality.
Providing managed wrappers around workloads popularized by customers or the OSS community. For example, during Re:Invent, AWS announced a managed RAG (retrieval-augmented generation) service that takes care of chunking, embedding, etc, a preview of an LLM evaluation and LLMOps service, and an LLM reliability feature called Guardrails (which even borrows the name of a popular OSS project doing the same thing).
Developing “lego blocks” for every part of the LLM development lifecycle. For example, this Re:Invent brought HyperPod (a managed GPU cluster with training-job babysitting, so that enterprises can pretrain their own foundational models) as well as continued pretraining (allowing easy checkpointing and continuous finetuning).
Expanding its portfolio of managed solutions (as opposed to components). For example, AWS is still investing in products like Kendra (a managed knowledge base) and HealthScribe (note-taking for healthcare professionals) that provide horizontal solutions in spaces where ISVs typically compete.
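To make the “curated model hub” idea concrete, here is a minimal sketch of swapping Bedrock providers with boto3. The model IDs are the late-2023 Bedrock identifiers for Claude 2.1 and Llama 2 70B (check the console for current ones), and the setup assumes credentials and Bedrock model access are already in place:

```python
import json

import boto3

# Bedrock's runtime API is provider-agnostic; only the model ID and the
# provider-specific request/response shapes change.
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

def complete(model_id: str, prompt: str) -> str:
    if model_id.startswith("anthropic."):
        # Claude 2.x expects the Human/Assistant prompt format.
        body = {"prompt": f"\n\nHuman: {prompt}\n\nAssistant:",
                "max_tokens_to_sample": 256}
    else:
        # Llama 2 chat models take a bare prompt.
        body = {"prompt": prompt, "max_gen_len": 256}
    resp = bedrock.invoke_model(modelId=model_id, body=json.dumps(body))
    out = json.loads(resp["body"].read())
    return out.get("completion") or out.get("generation", "")

# Swapping suppliers is a one-line change - the distribution layer stays put.
print(complete("anthropic.claude-v2:1", "Summarize AWS's GenAI strategy."))
print(complete("meta.llama2-70b-chat-v1", "Summarize AWS's GenAI strategy."))
```

That interchangeability is the whole point: whichever model wins, the integration (and the margin) stays with the distributor.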
Note, none of this is new when it comes to AWS strategy - they have been aggregating vendor services, wrapping OSS projects, and launching products that compete with ISVs since well before GenAI. What’s surprising is that this strategy still kind of works, even in this “GenAI era”.
OpenAI’s massive lead in LLM technology should - in theory - have handed Microsoft and OpenAI all the spoils of generative AI, but that hasn’t happened. Why not?
Why AWS’s Fast-Follower Strategy Is Working for Enterprise
AWS can get away with a fast-follower strategy mainly because it focuses on the slow-moving enterprise market - plus some luck:
Enterprise customers’ slow adoption of GenAI bought AWS plenty of time to catch up. Most enterprise customers haven’t fully committed to a generative AI product roadmap, which removed any urgency to commit to the OpenAI/Azure stack. Customers spent the first 6-9 months of 2023 building prototypes and POCs, giving AWS room to catch up via its Meta and Anthropic partnerships. Fast forward to December 2023, and many OSS models - at least on paper - seem not too far behind GPT-3.5, which is more than sufficient for most enterprise use cases.
Fierce competition in the LLM space has allowed AWS to offer near state-of-the-art foundational models (Anthropic’s Claude, Llama 2). Even if you are Anthropic, life isn’t easy: the GPU shortage affects everyone, which pushed Anthropic toward a strategic partner in AWS. Meta also didn’t want Microsoft to dominate GenAI, so it released Llama 2 as pseudo-open-source as a defensive move and made it available on AWS.
The shrinking shelf life of LLM tech also helps AWS, because it commoditizes the technology around LLMs and LLMOps, which weakens suppliers’ bargaining power against AWS’s position as the top distributor.
Thus, as long as AWS can onboard Llama, Mistral, or whichever LLM provider has the most community traction quickly and reliably enough, it should be fine. Anthropic will presumably keep making progress as well, but that’s a different bet (more enterprise-focused, given Anthropic’s branding around responsible AI).
As long as AWS can onboard new models quickly and make them available on Bedrock, AWS stays in consideration. And it is getting better at this: Claude 2.1 was available on Bedrock within a week of its announcement, much faster than the 5 months it took Llama 2 to land on Bedrock.
Note, it’s also clear from the AWS-Anthropic partnership that Claude models are generally out of startups’ reach (i.e. Anthropic isn’t going after the startup market), as evidenced by Claude’s pricing and the long waitlist for startups.
Not needing to fight the war of attrition over LLMs frees up AWS to focus on what it does best: 1) packaging innovation from the OSS community into managed services, 2) commoditizing emerging workloads into utility components, and 3) forming partnerships.
AWS’s GenAI Surface Area Is Increasing
While Google, OpenAI, Meta, and a host of startups fight the LLM war, AWS has quietly announced a slew of managed services that package the best ideas from the OSS community - RAG, function calling, and multilingual, real-time TTS and STT - and make them available to enterprise customers who prefer to operate within the AWS ecosystem.
In short, 2024 will see far more enterprise GenAI product releases, thanks to enterprises finally having all the “lego blocks” to assemble their GenAI apps, such as:
Guardrails (preview) for Bedrock: basically the same idea (and name) as the OSS Guardrails project, allowing constraints to be enforced around LLM agent behavior and generation.
Agents & function calling (GA): popularized by the OSS Gorilla project, ReAct, Langchain, and numerous other OSS community demos, and finally made available for the enterprise market.
Managed RAG / knowledge base (GA) for Bedrock: This service provides a managed experience for chunking and embedding documents sitting in an S3 data lake, and even loading them into a purpose-built vector store such as OpenSearch Serverless, RDS pgvector, or the new DocumentDB vector search feature (a minimal query sketch follows this list). With this, customers may no longer need to build and operate their own RAG service, which is a very common workload.
LLM evaluation framework: Perhaps inspired by the plethora of LLMOps and evaluation projects such as Langchain’s LangSmith, PromptLayer, etc, Bedrock now has its own LLMOps tool (in preview) so that enterprise customers don’t have to leave the AWS ecosystem for LLMOps.
Multimodal embeddings: Multimodal embedding models such as CLIP will become more important as multimodal GenAI use cases explode, and AWS releasing an embedding model is a start.
LLM training babysitting + training clusters as a service (HyperPod): This service manages GPU clusters and babysits model training jobs on behalf of customers looking to pretrain their own models from scratch. Whether pretraining your own model is a good idea is a separate topic.
Continued pretraining: Bedrock already allows model finetuning, but some customers may want to periodically run full-parameter retraining to incorporate new (recent) data and advance their knowledge cutoffs. This service supposedly makes that workload easier (a sketch of the API call also follows below).
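To illustrate the managed RAG item above, here is a hedged sketch of querying a Bedrock knowledge base via boto3’s bedrock-agent-runtime client. The knowledge base ID is a placeholder - you would first create the knowledge base and point it at your S3 documents:

```python
import boto3

agent_runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

# One call covers the whole RAG loop: retrieve relevant chunks from the
# knowledge base, then generate an answer with the chosen foundational model.
response = agent_runtime.retrieve_and_generate(
    input={"text": "What did our 2023 architecture review recommend?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KB1234567890",  # placeholder knowledge base ID
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-v2:1",
        },
    },
)
print(response["output"]["text"])     # generated answer
print(response.get("citations", []))  # source chunks used for grounding
```

Chunking, embedding, and vector store syncing all happen behind the scenes - exactly the plumbing teams were hand-rolling with Langchain for most of 2023.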
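And for the continued pretraining item, a sketch of kicking off such a job through the Bedrock control-plane API. The names, ARNs, and S3 URIs are placeholders, and the available hyperparameter knobs vary by base model:

```python
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

# CONTINUED_PRE_TRAINING runs full-parameter training on unlabeled text,
# as opposed to FINE_TUNING, which expects labeled prompt/completion pairs.
bedrock.create_model_customization_job(
    jobName="continued-pretrain-2023-12",         # placeholder name
    customModelName="titan-express-internal-v1",  # placeholder name
    roleArn="arn:aws:iam::123456789012:role/BedrockCustomizationRole",
    baseModelIdentifier="amazon.titan-text-express-v1",
    customizationType="CONTINUED_PRE_TRAINING",
    trainingDataConfig={"s3Uri": "s3://my-bucket/fresh-corpus/"},
    outputDataConfig={"s3Uri": "s3://my-bucket/custom-models/"},
    hyperParameters={"epochCount": "1"},  # knobs vary by base model
)
```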
The above were some of the highlights - but there were many other announcements that encroached even deeper into vertical use cases, such as AWS HealthScribe, which uses deep learning to help healthcare professionals take notes, or Amazon Q, basically an omni-channel agent-assist solution meant to compete in the “train an LLM chatbot on your company data” space. There were also many “enterprise-grade” announcements for neural STT and TTS, as well as agent-assist and contact center solutions (though these adoption trends have been in motion for years, even prior to GenAI).
Note, building products with a high degree of polish and developer experience was never AWS’s forte, so initial reactions to Amazon Q have been lukewarm at best - try it yourself and see. But performance may not matter, for now, since many enterprise customers still lack the technical capability to build something like Amazon Q on their own.
Conclusion
Due to fierce competition in the LLM space, AWS’s strength in distribution, and its partnerships with Anthropic and OSS model providers, AWS can afford to play the fast-follower game and succeed in generative AI within the enterprise segment. The main risk to this game would be another step-jump in foundational models that only OpenAI or Google possesses (and the OSS community cannot access), which would put AWS in a tougher position.