Microsoft's BitNet shows what AI can do with just 400MB and no GPU
techspot.com
What just happened? Microsoft has introduced BitNet b1.58 2B4T, a new type of large language model engineered for exceptional efficiency. Unlike conventional AI models that rely on 16- or 32-bit floating-point numbers to represent each weight, BitNet uses only three discrete values: -1, 0, or +1. This approach, known as ternary quantization, allows each weight to be stored in just 1.58 bits. The result is a model that dramatically reduces memory usage and can run far more easily on standard hardware, without requiring the high-end GPUs typically needed for large-scale AI.
The BitNet b1.58 2B4T model was developed by Microsoft's General Artificial Intelligence group and contains two billion parameters – internal values that enable the model to understand and generate language. To compensate for its low-precision weights, the model was trained on a massive dataset of four trillion tokens, roughly equivalent to the contents of 33 million ...
Copyright of this story solely belongs to techspot.com . To see the full text click HERE