There is also a chat version. The models are available on the Hugging Face hub:
Falcon 180B is completely free and state-of-the-art. However, it is also a huge model.
Can it run on your computer?
Unless your computer is equipped for very intensive computing, it can't run Falcon 180B out of the box. You will need to upgrade your computer and use a quantized version of the model.
In this article, I explain how you can run Falcon-180B on consumer hardware. We will see that it can be reasonably affordable to run a 180 billion parameter model on a modern computer. I also discuss several techniques that help reduce the hardware requirements.
The first thing you need to know is that Falcon 180B has 180 billion parameters stored as bfloat16. A (b)float16 parameter takes 2 bytes in memory.
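To make the arithmetic concrete, here is a minimal sketch (my own illustration, not code from the Falcon repository) of the memory footprint of the weights alone:

```python
# Back-of-the-envelope memory footprint of Falcon 180B's weights in bfloat16.
# Each bfloat16 parameter occupies 2 bytes.
num_parameters = 180e9   # 180 billion parameters
bytes_per_parameter = 2  # bfloat16 / float16

total_gb = num_parameters * bytes_per_parameter / 1e9
print(f"Weights alone: {total_gb:.0f} GB")  # ~360 GB
```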
When you load a model, the standard PyTorch pipeline works like this:
- An empty model is created: 180B parameters * 2 bytes = 360 GB