Introducing Lamini, the LLM Platform for Rapidly Customizing Models

Sharon Zhou, CEO

TL;DR

  • Lamini emerges from stealth to give every developer the superpowers that took the world from GPT-3 to ChatGPT!
  • Today, you can try out our hosted data generator for training your own LLMs, weights and all, without spinning up any GPUs, in just a few lines of code from the Lamini library.
  • You can play with an open-source LLM, fine-tuned on generated data using the Lamini library.
  • Sign up for early access to our full LLM training module, from speed optimizations like LoRA to enterprise features like virtual private cloud (VPC) deployments.

Training LLMs should be as easy as prompt-tuning 🦾

Why is writing a prompt so easy, but training an LLM from a base model still so hard? Iteration cycles for fine-tuning on modest datasets are measured in months because it takes significant time to figure out why fine-tuned models fail. Conversely, prompt-tuning iterations are on the order of seconds, but performance plateaus in a matter of hours. Only a limited amount of data can be crammed into the prompt, not the terabytes of data in a warehouse.

It took OpenAI months with an incredible ML team to fine-tune and run RLHF on their base GPT-3 model that was available for years — creating what became ChatGPT. This training process is only accessible to large ML teams, often with PhDs in AI.

Technical leaders at Fortune 500 companies have told us:

  • “Our team of 10 machine learning engineers hit the OpenAI fine-tuning API, but our model got worse — help!”
  • “I don’t know how to make the best use of my data — I’ve exhausted all the prompt magic we can summon from tutorials online.”

That’s why we’re building Lamini: to give every developer the superpowers that took the world from GPT-3 to ChatGPT.

Rapidly train LLMs to be as good as ChatGPT from any base model 🚀

Lamini is an LLM platform that allows any developer, not just machine learning experts, to train high-performing LLMs, as good as ChatGPT, on large datasets with just a few lines of code from the Lamini library (check out an example here!).

The optimizations in this library reach far beyond what’s available to developers now, from more challenging optimizations like RLHF to simpler ones like reducing hallucinations.

Lamini makes it easy to run multiple base model comparisons in just a single line of code, from OpenAI’s models to open-source ones on HuggingFace.
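
The idea of comparing base models by changing a single identifier can be sketched as follows. This is an illustrative stub, not Lamini's actual interface: `run_model` and the hard-coded model names are assumptions standing in for a hosted LLM call.

```python
def run_model(model_name: str, prompt: str) -> str:
    """Stub standing in for a hosted LLM call. The point of the sketch:
    switching base models is just switching the model_name string."""
    return f"[{model_name}] {prompt}"

# Compare several base models on the same prompt by changing one string.
for name in ["gpt-3.5-turbo", "EleutherAI/pythia-2.8b", "databricks/dolly-v2-3b"]:
    print(run_model(name, "Summarize our refund policy in one sentence."))
```

In a real comparison the stub would be replaced by an actual inference call, but the calling code would stay the same.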

Now that you know a bit about where we’re going: today, we’re excited to release our first major community resource!

Available now: a hosted data generator for LLM training 🎉

We are excited to release several important steps to training your own LLM:

Steps to a ChatGPT-like LLM for your use case

Base models have a good understanding of English for consumer use cases. But when you need them to learn your vertical-specific language and guidelines, prompt-tuning is often not enough and you will need to build your own LLM.

Here are the steps to get an LLM that follows instructions to handle your use case like ChatGPT:

  1. Try prompt-tuning ChatGPT or another model. You can use Lamini library’s APIs to quickly prompt-tune across different models, swapping between OpenAI and open-source models in just one line of code. We optimize the right prompt for you, so you can take advantage of different models without worrying about how to format the prompt for each model.
  2. Build a large dataset of input-output pairs. These will show your model how it should respond to its inputs, whether that's following instructions given in English, or responding in JSON. Today, we’re releasing a repo that uses just a few lines of code from the Lamini library to generate 50k data points from as few as 100, hitting the hosted Lamini platform so you don’t have to spin up any GPUs. We include an open-source 50k dataset in the repo. (More details below on how you can do this!)
  3. Finetune a base model on your large dataset. Alongside the data generator, we’re also releasing an LLM that is fine-tuned on the generated data using Lamini. We’ll soon be releasing the ability to do this programmatically (early access). You can also hit OpenAI’s fine-tuning API as a great starting point.
  4. Run RLHF on your fine-tuned model. With Lamini, you no longer need a large ML and human labeling team to run RLHF.
  5. Deploy to your cloud. Simply hit the API endpoint in your product or feature.
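
Taken together, the five steps form one pipeline. The sketch below is purely illustrative: every function in it is a hypothetical placeholder for the corresponding step, not the Lamini API.

```python
# Illustrative sketch of the five-step pipeline above.
# All function names are hypothetical placeholders, not the Lamini API.

def generate_pairs(seed_pairs: list[dict], target: int = 50_000) -> list[dict]:
    """Step 2: expand ~100 seed input-output pairs into a large dataset (placeholder)."""
    pairs = []
    while len(pairs) < target:
        for seed in seed_pairs:
            pairs.append({"input": seed["input"], "output": seed["output"]})
            if len(pairs) >= target:
                break
    return pairs

def finetune(base_model: str, dataset: list[dict]) -> str:
    """Step 3: fine-tune the base model on the generated dataset (placeholder)."""
    return f"{base_model}-finetuned-on-{len(dataset)}-pairs"

def run_rlhf(model: str, feedback: list[dict]) -> str:
    """Step 4: run RLHF on the fine-tuned model (placeholder)."""
    return f"{model}-rlhf"

def deploy(model: str) -> str:
    """Step 5: deploy and return an API endpoint (placeholder URL)."""
    return f"https://api.example.com/v1/{model}"

seeds = [{"input": "Say hi", "output": "Hi!"}] * 100
model = finetune("pythia", generate_pairs(seeds, target=1_000))
endpoint = deploy(run_rlhf(model, feedback=[]))
```

In practice each placeholder would be a call into a training or serving system; the sketch only shows how the steps chain together.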

Lamini delivers the ease of prompt-tuning, with the performance of RLHF and fine-tuning. It will soon handle this entire process (sign up for early access!).

Deeper dive into step #2: a ChatGPT-like data generator

ChatGPT took the world by storm because it could follow instructions from the user, while the base model that it was trained from (GPT-3) couldn’t do that consistently. For example, if you asked the base model a question, it might generate another question instead of answering it.

For your application, you might want similar "instruction-following" data, but you could also want something completely different, like responding only in JSON.

You'll need a dataset of ~50k instruction-following examples to start. Don't panic. You can now use Lamini’s hosted data generator to turn just 100 examples into over 50k in just a few lines of code.
You don’t need to spin up any GPUs, because Lamini hosts it for you. All the data that is used is commercial-use-friendly, meaning you own all the data that comes out of it.

You can customize the initial 100+ instructions so that the LLM follows instructions in your own vertical. Once you have those, submit them to the Lamini data generator, and voilà: you get a large instruction-following dataset on your use case as a result!
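
Seed instructions are typically small instruction-response records. The shape below and the `submit_seeds` helper are assumptions for illustration, not the actual Lamini submission format.

```python
# Hypothetical sketch of preparing 100+ seed instructions for a data generator.
# The record shape and submit_seeds() are assumptions, not the Lamini interface.
import json

seed_instructions = [
    {
        "instruction": "Summarize this support ticket in one sentence.",
        "response": "Customer cannot reset their password from the mobile app.",
    },
    {
        "instruction": "Classify the ticket priority as low, medium, or high.",
        "response": "high",
    },
]

def submit_seeds(seeds: list[dict], path: str = "seeds.jsonl") -> str:
    """Write seeds as JSON Lines, a common on-disk format for instruction data."""
    with open(path, "w") as f:
        for seed in seeds:
            f.write(json.dumps(seed) + "\n")
    return path
```

Swapping in your own vertical's instructions and responses is all the customization this step needs.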

How the data generator works

The Lamini data generator is a pipeline of LLMs that takes your original small set of 100+ instructions, paired with the expected responses, and generates 50k+ new pairs, an approach inspired by Stanford Alpaca.

This generation pipeline uses the Lamini library to define and call LLMs to generate different, yet similar, pairs of instructions and responses. Trained on this data, your LLM will improve to follow these instructions.

We provide a good default for the generation pipeline that uses open-source LLMs, which we call Lamini Open and Lamini Instruct. With new LLMs being released each day, we update the defaults to the best-performing models.

As of this release, we are using EleutherAI’s Pythia for Lamini Open and Databricks’ Dolly for Lamini Instruct. Lamini Open generates more instructions, and Lamini Instruct generates paired responses to those instructions.
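
The two-stage pipeline can be sketched like this. `lamini_open` and `lamini_instruct` here are pure-Python stand-ins for the actual Pythia- and Dolly-based models, so the sketch shows the data flow, not real generation quality.

```python
# Sketch of the two-stage generation pipeline: one model proposes new
# instructions, a second model writes paired responses. Both functions
# are stubs standing in for real LLM calls.
import random

def lamini_open(seed_instruction: str) -> str:
    """Stand-in for the Pythia-based Lamini Open model:
    generates a new instruction similar to the seed."""
    templates = ["{} Explain step by step.", "{} Keep it brief.", "{} Give an example."]
    return random.choice(templates).format(seed_instruction)

def lamini_instruct(instruction: str) -> str:
    """Stand-in for the Dolly-based Lamini Instruct model:
    generates a response to the instruction."""
    return f"Response to: {instruction}"

def generate_dataset(seeds: list[str], per_seed: int = 3) -> list[dict]:
    pairs = []
    for seed in seeds:
        for _ in range(per_seed):
            instruction = lamini_open(seed)          # stage 1: new instruction
            response = lamini_instruct(instruction)  # stage 2: paired response
            pairs.append({"instruction": instruction, "response": response})
    return pairs
```

With real models behind the two stubs, running this loop over 100+ seeds is what produces the 50k+ pair dataset.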

The final generated dataset is available for your free commercial use (CC-BY license).

The Lamini library allows you to swap our defaults for other open-source or OpenAI models in just one line of code. Note that while we find OpenAI models to perform better on average, their license restricts commercial use of generated data for training models similar to ChatGPT.

If you’re interested in more details on how our data generator works, read more or run it here.

Releasing step #3: an open-source LLM, fine-tuned on generated data from step #2 using Lamini

Some of the generated data is good, some not. Before fine-tuning, the next step is to filter the generated data down to mostly high-quality examples (just run this simple script in the same repo). Lamini then creates a custom LLM by training a base model on this filtered, generated dataset.
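
A filtering pass of this kind usually combines deduplication with simple quality heuristics. The sketch below is an assumed example of such heuristics, not the actual filtering script from the repo.

```python
# Illustrative quality filter for generated instruction-response pairs.
# The heuristics are assumptions, not the repo's actual script.

def filter_pairs(pairs: list[dict], min_len: int = 10) -> list[dict]:
    """Keep plausible-quality pairs: drop exact duplicates, very short
    responses, and responses that merely echo the instruction."""
    seen = set()
    kept = []
    for pair in pairs:
        key = (pair["instruction"].strip(), pair["response"].strip())
        if key in seen:
            continue  # exact duplicate
        seen.add(key)
        instruction, response = key
        if len(response) < min_len:
            continue  # degenerate, near-empty response
        if response.lower() == instruction.lower():
            continue  # response just parrots the instruction
        kept.append({"instruction": instruction, "response": response})
    return kept
```

Filters like this are how a 70k generated set gets cut down to the 37k examples used for training below.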

We have released an open-source instruction-following LLM (CC-BY license) using Lamini to train the Pythia base model with 37k generated instructions, filtered from 70k. Play with this custom LLM in the playground now.

Pushing the boundaries of fast & usable generative AI

We’re excited to dramatically improve the performance of training LLMs and make it easy for engineering teams to train them. These two frontiers are intertwined: with faster, more effective iteration cycles, more people will be able to build these models, beyond just fiddling with prompts. We exist to help any company unlock the power of generative AI by making it easy to put their own data to work.

Team++: We are growing our team with people who are passionate about making it possible to build LLMs 10x faster and making them widely accessible to empower new, extraordinary use cases. If that’s you, please apply via https://jobs.lever.co/laminiai 🤝

--

April 28, 2023