06 Apr War of the Machines: In the great battle of LLMs, what side should enterprises be on?
It has taken about 3 months for ChatGPT to become a household phrase eliciting media headlines, one-liners from comedians, and the average tech user tweeting about a new magical task they were able to do with it.
Suddenly “This AI” is real. Thousands of enthusiastic self-publishing authors are holding forth in LinkedIn posts, Reddit forums and YouTube tutorials on exciting new ways to use the technology. What was once an obscure term in computer science, LLM (large language model), has entered the lexicon of the masses. For a deep dive into LLMs, I recommend setting time aside to read this blog post from Stephen Wolfram, creator of Wolfram Alpha, which goes under the bonnet of ‘large language models’ to take a peek at their inner workings.
The War is deadly serious.
What is less well understood is that this has kicked off a multi-year war between tech companies that is deadly serious. There are old players, new players, pretenders, VCs and founders, all jostling for position with increasingly strident “announcements” that scream for attention in the new ChatGPT landscape. The knives are drawn, and there is no quarter given.
Satya Nadella of Microsoft defined it well when he gleefully announced “I want people to know we made them dance,” referring to Google. The media landscape has been chock-full of hasty announcements, tall claims, and flubbed demo events.
Companies have persisted, knowing that this is the first inning of a war that will be fought on every front of software. In tech's first inning, #mindshare is #marketshare. This war is coming to search, browsers, operating systems, and applications.
The LLM Traffic Jam
The market is ablaze with a confusing array of chest-thumping claims and model releases from all over the globe. Screaming media headlines follow each announcement, and the cycle starts all over again the next week. Let's take a quick look at the roster:
Google arguably has the largest library of large models, including LaMDA, PaLM, Imagen, MusicLM and DeepMind’s Chinchilla. However, it has chosen to release a more demure chatbot named Bard, based on LaMDA, and followed it up quickly with a Microsoft-style copycat announcement of adding generative AI to its work apps.
Hugging Face, ostensibly founded with the intention of breaking the stranglehold of big tech on LLMs, released BLOOM. The company announced a partnership with Amazon Web Services (AWS) that makes Hugging Face’s products available to AWS customers as building blocks for their custom applications.
Baidu developed the in-house models ERNIE and PLATO but flubbed the demo event for its Ernie chatbot, causing the stock to drop 10%. It’s out there now, and Alibaba and JD have announced similar projects.
AI21 Labs, a startup in Israel, has an LLM called Jurassic-1 Jumbo, which contains 178 billion parameters, or 3 billion more than GPT-3. In machine learning, the correlation between the number of parameters and sophistication has held up remarkably well. I guess we will have to see…
Other Research Projects
And the list goes on, with several startups, including Adept, entering the fray after fattening their coffers with hundreds of millions in new funding.
So what are enterprises supposed to do?
Realize that implementing large language models in their raw form is a tall order, despite calls from in-house developers. The cost of compute can be extraordinary, and the complex dev environment and the intricacies of competing architectures make this a daunting task. Counseling developers to build initial “business use cases” that can be tested for usability, accuracy and security is a good first step.
Credit: HFS Research
In the ensuing chaos, enterprise decision makers are being inundated with demands for ChatGPT-style functionality. It’s time to cool your jets: wait for the fracas to settle down, and learn what ChatGPT-style technology can and cannot do. Bring your security and compliance teams into the mix to decide what’s possible, and talk to the vendors and partners who are embracing the technology by building toolsets so it can be used safely in an enterprise environment.
ChatGPT itself agrees with me too: