A family of efficient AI models under 10B parameters that achieve strong performance in science, math, and coding through innovative training techniques.

Falcon3 is TII’s latest family of efficient language models under 10B parameters. It focuses on stronger science, math, and code capabilities while keeping training efficient.

Key Features

  • Four sizes: 1B, 3B, 7B, 10B
  • Depth up-scaling used to create the 10B model from the 7B model
  • Knowledge distillation for smaller models (1B, 3B)
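
The two techniques above can be illustrated with a minimal Python sketch. It is not Falcon3's training code: the choice of which layers to repeat, the temperature, and the names depth_upscale and distillation_loss are all assumptions made for illustration. Depth up-scaling is shown as repeating a contiguous slice of a layer stack, and knowledge distillation as a KL divergence between temperature-softened teacher and student outputs.

```python
import math


def depth_upscale(layers, start, end):
    """Return a deeper stack by repeating layers[start:end] once.

    Hypothetical helper: which slice a real recipe repeats is an assumption.
    """
    if not 0 <= start < end <= len(layers):
        raise ValueError("start/end must select a non-empty slice of layers")
    return layers[:end] + layers[start:end] + layers[end:]


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    The temperature value is an illustrative assumption.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Toy stack of 6 named blocks; repeating blocks 2 and 3 gives 8 blocks.
base = [f"block_{i}" for i in range(6)]
deeper = depth_upscale(base, 2, 4)
print(len(base), "->", len(deeper))

# The loss is zero when the student matches the teacher, positive otherwise.
print(distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
print(distillation_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]))
```

In a real pipeline the repeated layers would start as copies of trained weights and the deeper model would then be trained further; the sketch only shows the structural step.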

Performance Highlights

  • falcon3:1b outperforms smollm2:1.7b, matches gemma2:2b
  • falcon3:10b achieves state-of-the-art results in the under-13B category
  • Context length of up to 32K tokens (8K for the 1B model)
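
As a sketch of how the context limits above might be respected when calling a locally running Ollama server, the following Python builds a request for the /api/generate endpoint and caps the num_ctx option per model. Several points are assumptions: that 8K and 32K mean 8192 and 32768 tokens, that the 3B and 7B tags follow the falcon3:1b naming pattern, and that the server listens on the default local port. The helper names build_payload and generate are hypothetical.

```python
import json
import urllib.request

# Assumed token limits: "8K" read as 8192 and "32K" as 32768.
# The falcon3:3b and falcon3:7b tags are assumed from the naming pattern.
CONTEXT_LIMITS = {
    "falcon3:1b": 8192,
    "falcon3:3b": 32768,
    "falcon3:7b": 32768,
    "falcon3:10b": 32768,
}


def build_payload(tag, prompt, requested_ctx):
    """Build a non-streaming generate request, capping num_ctx at the model limit."""
    if tag not in CONTEXT_LIMITS:
        raise ValueError(f"unknown model tag: {tag}")
    num_ctx = min(requested_ctx, CONTEXT_LIMITS[tag])
    return {
        "model": tag,
        "prompt": prompt,
        "stream": False,
        "options": {"num_ctx": num_ctx},
    }


def generate(payload, host="http://localhost:11434"):
    """Send the payload to a local Ollama server and return the generated text."""
    request = urllib.request.Request(
        host + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


# Asking the 1B model for a 32K window is capped to its 8K limit.
payload = build_payload("falcon3:1b", "Explain depth up-scaling briefly.", 32768)
print(payload["options"]["num_ctx"])
```

Calling generate(payload) requires the server to be running and the model to be pulled already; larger num_ctx values also increase memory use.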

References

Hugging Face