BTLM-3B-8k-base brings LLM capabilities to devices with just 3 GB of memory

Cerebras and Opentensor have trained BTLM-3B-8k-base, a 3 billion parameter language model with an 8k context window, on the Condor Galaxy 1 (CG-1) supercomputer. The new model outperforms similarly sized models, achieves performance comparable to open 7B parameter models, can be quantized to fit on devices with as little as 3 GB of memory, and is licensed for commercial use. Compared with those 7B models, it requires 71% fewer training FLOPs and has a 58% smaller memory footprint for inference.
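To see why quantization matters for the 3 GB figure, here is a rough back-of-the-envelope sketch. It counts only the memory needed to store the weights and takes the nominal "3 billion" parameter count at face value; both are assumptions made for illustration. Activations, the attention cache, and runtime overhead are ignored, so real-world requirements will be higher. The function name `weight_memory_gb` is hypothetical and not part of any library.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights only, in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: nominal parameter count taken from the article's "3 billion".
NOMINAL_PARAMS = 3e9

# Compare common storage precisions for the weights.
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{weight_memory_gb(NOMINAL_PARAMS, bits):.1f} GB")
```

Under these assumptions, 16-bit weights alone would already exceed 3 GB, while 4-bit weights leave headroom for everything else the model needs at inference time.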

Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.
