You know all too well the pain of crafting and retrieving the right prompt that fits in your LLM.
Now, you’ve heard that you can train an LLM on all your data, effectively giving you an unlimited prompt size, given that an internet’s worth of data has already been fit into these models. Large AI labs like OpenAI, Anthropic, and Google use this learning process all the time. It’s the learning in machine learning, after all.
But off-the-shelf solutions for finetuning are expensive, hard to try out without a massive infrastructure investment, or hidden behind a paywall before you can even glimpse how they work.
That’s why we’re releasing Lamini finetuning for free: with 15 minutes, a few lines of code, and your own data, you can get results on a 410M-parameter finetuned LLM for $0.
Walk through the Colab notebook on how to finetune now.
1. Does the free finetuning tier let me play with finetuning more easily than ever before? Yes.
2. Does it allow me to iterate on improving finetuning faster than ever before? Yes.
3. Does it result in the best model ever? No.
The purpose of the free tier is to give you a feel for finetuning and for iterating on what works and what doesn't. This is just the start: we'll be releasing more and listening to what people want next.
If you want bigger models, production-grade results, or to run LLMs on your own infrastructure, you can chat with our team and join other large startups and Fortune 500 companies to get access now!
To get started easily with finetuning, let’s walk through a small-scale version of this process on a toy example, also in code here. If you’re not familiar with Lamini yet, it’s an LLM developer platform for training custom models on your private infrastructure.
The Lamini library makes it easy to run and train LLMs, and it can be installed on your own servers. Its purpose here is to show ease of use, so you can quickly iterate on the finetuning process. The free example here runs on our servers.
Let’s say you want a Question-Answer LLM that can answer questions about your data, such as customer support documents, HR or financial information, or internal engineering documentation so you can easily troubleshoot issues.
So first, there’s a class for that, and you can run that model quickly and easily. Here, it’s a Question-Answer LLM on Lamini’s internal engineering documentation.
The model gives you a chat interface and the one here is using a 410M-parameter Pythia model as a base.
See the Lamini library docs to run a different base model, use a generic LLM for something other than question-answering, or use our REST API.
Out of the box, the 410M-parameter LLM’s performance looks unsatisfactory. When you ask: “How can I add data to Lamini?”
It gives you some garbage back:
This base LLM looks like it’s going off the rails and cuts itself off. Note that it still knows basic English words, so you don’t need to start from scratch. But it definitely doesn't know how to reply to your topic. It just sounds generic and the formatting is a little weird.
To teach it more about your version of English and your task, you can feed it data. For this example, you have a dataset of 1400 questions and answers about Lamini. While it seems small, it tallies far more tokens (~120K) than the largest prompt sizes today.
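For a sense of scale, ~120K tokens over 1400 pairs works out to roughly 86 tokens per question-answer pair. To sanity-check the size of your own dataset, a rough estimate is enough. The sketch below assumes about 4 characters per token, a common rule of thumb for English text and not the exact count from the model's tokenizer; `estimate_tokens` and the sample row are hypothetical and not part of the Lamini library.

```python
# Rough token estimate for a question-answer dataset.
# Assumption: ~4 characters per token (rule of thumb for English text).
# The model's real tokenizer will give a different, exact number.
CHARS_PER_TOKEN = 4


def estimate_tokens(rows):
    """Return an approximate token count over all questions and answers."""
    total_chars = sum(len(r["question"]) + len(r["answer"]) for r in rows)
    return total_chars // CHARS_PER_TOKEN


# Hypothetical placeholder row, only to show the expected shape of the data.
sample_rows = [
    {"question": "Example question about your docs?", "answer": "Example answer text."},
]

if __name__ == "__main__":
    print("approx. tokens:", estimate_tokens(sample_rows))
```

If the estimate comes out far larger than what fits in a prompt, that is a sign the data is better used for finetuning than for prompting.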
Pro tip on preparing data: first, quality matters a lot. Just 100 high-quality examples will get you on the right path. What does high-quality look like?
We recommend including all the questions your users might ask and providing accurate and realistic answers. LLM-generated low-quality answers won’t help train the model unless they’re carefully reviewed and edited by humans.
Sounds complex and boring? Don’t worry! We’re building tools to automate and streamline this process and help you succeed with ease. Stay tuned!
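In the meantime, a few mechanical checks can catch the most obvious data problems before a human review. The sketch below is an assumption about what such a check might look like, not a Lamini tool: `find_quality_issues` is a hypothetical helper that flags empty fields and repeated questions. It cannot tell you whether an answer is accurate, so it complements human review rather than replacing it.

```python
def find_quality_issues(rows):
    """Return (row_index, reason) pairs for rows that need a closer look."""
    issues = []
    seen_questions = set()
    for i, row in enumerate(rows):
        question = (row.get("question") or "").strip()
        answer = (row.get("answer") or "").strip()

        if not question:
            issues.append((i, "empty question"))
        if not answer:
            issues.append((i, "empty answer"))

        # Treat questions that differ only by case as duplicates.
        key = question.lower()
        if key and key in seen_questions:
            issues.append((i, "duplicate question"))
        seen_questions.add(key)
    return issues


if __name__ == "__main__":
    # Hypothetical rows, only to demonstrate the checks.
    demo = [
        {"question": "Example question?", "answer": "Example answer."},
        {"question": "example question?", "answer": ""},
    ]
    for index, reason in find_quality_issues(demo):
        print(f"row {index}: {reason}")
```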
You can format the data in many ways: a pandas dataframe, a jsonlines file, a list of dictionaries, or a csv file, with keys or column names “question” and “answer”.
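As an illustration of those formats, the sketch below takes the same list of dictionaries and round-trips it through jsonlines and csv using only the Python standard library. The rows are hypothetical placeholders and the helper names are made up for this example; only the “question” and “answer” keys come from the description above.

```python
import csv
import io
import json

# Hypothetical placeholder rows with the "question" and "answer" keys.
rows = [
    {"question": "Example question 1?", "answer": "Example answer 1."},
    {"question": "Example question 2?", "answer": "Example answer 2."},
]


def to_jsonlines(rows):
    """Serialize rows as jsonlines: one JSON object per line."""
    return "".join(json.dumps(row) + "\n" for row in rows)


def from_jsonlines(text):
    """Parse jsonlines text back into a list of dictionaries."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def to_csv(rows):
    """Serialize rows as csv with 'question' and 'answer' columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["question", "answer"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def from_csv(text):
    """Parse csv text back into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


if __name__ == "__main__":
    print(to_jsonlines(rows))
    print(to_csv(rows))
```

If you already use pandas, `pandas.DataFrame(rows)` gives the dataframe form of the same data.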
Then, just load that data into your model, and tell it to train:
And, in just 10-15 minutes, you can run this LLM trained on all this data.
Let’s ask the same question, “How can I add data to Lamini?”, as follows:
It’s correct. The improvement after finetuning is poggers! 💥
AI is iterative. What would we do to improve this further? As a first step, we'd experiment with this LLM to understand where it's succeeding and failing. We would adapt the dataset, adding more examples that cover diverse, high-quality use cases. To prevent hallucinations, for example, we would add examples where the correct answer is to say the question is irrelevant.
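One way to set up that iteration loop is sketched below, under stated assumptions: the wording of `IRRELEVANT_ANSWER`, the helper names, and the 10% held-out fraction are all hypothetical choices for illustration, not Lamini defaults. The idea is to append off-topic questions paired with an "irrelevant" answer, then hold out a small slice of the data so you can compare the model's answers before and after each round of finetuning.

```python
import random

# Hypothetical wording; use whatever refusal fits your product.
IRRELEVANT_ANSWER = "This question is not relevant to the documentation."


def add_irrelevant_examples(rows, off_topic_questions):
    """Return rows plus one 'irrelevant' example per off-topic question."""
    extra = [{"question": q, "answer": IRRELEVANT_ANSWER} for q in off_topic_questions]
    return rows + extra


def split_for_eval(rows, eval_fraction=0.1, seed=0):
    """Shuffle deterministically and return (train_rows, eval_rows)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_eval = max(1, int(len(shuffled) * eval_fraction))
    return shuffled[n_eval:], shuffled[:n_eval]


if __name__ == "__main__":
    # Hypothetical placeholder data.
    base = [{"question": f"Example question {i}?", "answer": f"Example answer {i}."} for i in range(10)]
    augmented = add_irrelevant_examples(base, ["What is the weather today?"])
    train_rows, eval_rows = split_for_eval(augmented)
    print(len(train_rows), "training rows,", len(eval_rows), "held-out rows")
```

Keeping the held-out rows out of training lets you see where the finetuned model still fails, which tells you what kind of examples to add next.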
Of course, this is just a toy example showing how finetuning can improve things. If you'd like to be a tiny LLM power user (fun fact: 400M was considered humungous only ~6 years ago!), a model this small would actually work for some easy, limited tasks on the right data.
If you want something more capable, keep reading!
The free tier is great for getting a sense of finetuning on a toy example. At 410M parameters, the model is hundreds of times smaller than the original 175B-parameter GPT-3. You can finetune as many LLMs as you like, so you can see what iteration feels like :)
If you want access to larger models, production-grade results, and 1000x fewer hallucinations, or you want to skip the queue, scale throughput to millions of users, or run LLMs on your private infrastructure, you can chat with our team.
In either case, the potential of training your own LLMs is massive. By making use of all your data, you turn your LLM into an expert in your field.
Join thousands of developers using Lamini by signing up with a free account, and join other Fortune 500 companies and large mid-market startups deploying Lamini privately with our enterprise plan today.
- from the Lamini team on July 12, 2023