From their website:
The Mistral AI team is proud to release Mistral 7B, the most powerful language model for its size to date.
Mistral 7B in short
Mistral 7B is a 7.3B parameter model that:
- Outperforms Llama 2 13B on all benchmarks
- Outperforms Llama 1 34B on many benchmarks
- Approaches CodeLlama 7B performance on code, while remaining good at English tasks
- Uses Grouped-query attention (GQA) for faster inference
- Uses Sliding Window Attention (SWA) to handle longer sequences at a smaller cost (a minimal mask sketch follows this list)
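To make the "smaller cost" point concrete, here is a minimal sketch (not from the announcement) of a causal sliding-window attention mask; the window size `W` is a generic illustration parameter, not Mistral's actual configuration. Each query attends only to the previous `W` positions, so attention cost grows linearly with sequence length rather than quadratically.

```python
# Sketch of a causal sliding-window attention mask (illustrative window size only).
import torch

def sliding_window_causal_mask(seq_len: int, window: int) -> torch.Tensor:
    """True where query i may attend to key j: j <= i and i - j < window."""
    i = torch.arange(seq_len).unsqueeze(1)  # query positions (column)
    j = torch.arange(seq_len).unsqueeze(0)  # key positions (row)
    return (j <= i) & (i - j < window)

print(sliding_window_causal_mask(seq_len=6, window=3).int())
# tensor([[1, 0, 0, 0, 0, 0],
#         [1, 1, 0, 0, 0, 0],
#         [1, 1, 1, 0, 0, 0],
#         [0, 1, 1, 1, 0, 0],
#         [0, 0, 1, 1, 1, 0],
#         [0, 0, 0, 1, 1, 1]])
```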
We’re releasing Mistral 7B under the Apache 2.0 license; it can be used without restrictions.
- Download it and use it anywhere (including locally) with our reference implementation
- Deploy it on any cloud (AWS/GCP/Azure) using the vLLM inference server and SkyPilot
- Use it on HuggingFace (a minimal loading sketch follows this list)
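As a quick illustration of the HuggingFace route, here is a minimal sketch using the Transformers library; the model id `mistralai/Mistral-7B-v0.1`, the prompt, and the generation settings are assumptions for demonstration, not details from the announcement.

```python
# Minimal sketch: run Mistral 7B locally with Hugging Face Transformers.
# Assumes the "mistralai/Mistral-7B-v0.1" checkpoint and a GPU with enough memory.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-v0.1"  # assumed Hugging Face model id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to fit a ~7.3B-parameter model on one GPU
    device_map="auto",          # place weights automatically (GPU if available)
)

prompt = "Sliding window attention lets the model"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```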
Mistral 7B is easy to fine-tune on any task. As a demonstration, we’re providing a model fine-tuned for chat, which outperforms Llama 2 13B chat.
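For the chat fine-tune, a minimal sketch follows; it assumes the instruct checkpoint is published as `mistralai/Mistral-7B-Instruct-v0.1` and that its tokenizer ships a chat template, neither of which is stated in the text above, so adjust the id if it differs.

```python
# Minimal sketch: query the chat fine-tune via its tokenizer's chat template.
# Assumes the "mistralai/Mistral-7B-Instruct-v0.1" checkpoint id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-Instruct-v0.1"  # assumed id of the chat fine-tune

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

messages = [{"role": "user", "content": "Explain grouped-query attention in one sentence."}]
# apply_chat_template wraps the message in the instruction format the model expects
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

outputs = model.generate(input_ids, max_new_tokens=120, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```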