Trinity

Trinity is a family of open weight MoE models trained by Arcee.ai.

This guide shows how to fine-tune it with Axolotl with multi-turn conversations and proper masking.

Getting started

  1. Install Axolotl following the main from the installation guide.

  2. Install Cut Cross Entropy to reduce training VRAM usage.

  3. Run the finetuning example:

    axolotl train examples/trinity/trinity-nano-preview-qlora.yaml

This config uses about 24.9 GiB VRAM (w/o CCE).

Let us know how it goes. Happy finetuning! 🚀

TIPS

  • For inference, the official Arcee.ai team recommends top_p: 0.75, temperature: 0.15, top_k: 50, and min_p: 0.06.
  • You can run a full finetuning by removing the adapter: qlora and load_in_4bit: true from the config.
  • Read more on how to load your own dataset at docs.
  • The dataset format follows the OpenAI Messages format as seen here.

Optimization Guides

Please check the Optimizations doc.