What You Will Learn
- How to prepare and format your dataset for fine-tuning
- Uploading datasets to Lumino using the SDK
- Initiating fine-tuning jobs with customizable parameters
- Running evaluations to assess model performance
Prerequisites
- Lumino SDK: Install using
pip install lumino-api-sdk-python
- API Key: Obtain it from the Lumino dashboard, and set it up as an environment variable:
Dataset Preparation
To begin, let’s prepare a dataset for fine-tuning. In this tutorial, we’ll use a simple trivia-based dataset. If you’re following along, feel free to create or use any dataset that fits your use case.Example Dataset - Original Format
Formatting the Dataset to JSONL
To use this dataset with the Lumino SDK, we need to convert it to a .jsonl format, where each line represents a structured conversation between a system, user, and assistant. Here’s a Python script that formats the dataset into the correct structure for fine-tuning:Uploading the Dataset to Lumino
Method 1: Uploading via the Lumino SDK
Once the dataset is prepared, it’s time to upload it to Lumino using the SDK. Here’s how to do that:Method 2: Uploading via the Lumino Web App UI
Alternatively, you can upload your dataset through the Lumino Web App, which provides an intuitive UI for managing datasets.- Log in to the Lumino Web App: Go to the Lumino Web App and log in with your credentials.
- Navigate to the Dataset Page:: Once logged in, head to the “Datasets” section on the dashboard.
- Upload Dataset::
- Click the Upload Dataset button.
- Select your formatted .jsonl file (e.g., Formatted_Trivia.jsonl) from your local machine.
- Confirm the Upload::
- After uploading, you’ll see the dataset listed in the Datasets section. You can now use this dataset for fine-tuning jobs.