
SLM

SMALL LANGUAGE MODEL

As their name implies, small language models (SLMs) are smaller in scale and scope than large language models (LLMs).


SLM parameters range from a few million to a few billion, so they require less memory and computational power, making them well suited to resource-constrained environments. On certain domain-specific tasks, SLMs can even deliver superior performance.

How does an SLM work?

Pruning

  • Like pruning a tree, pruning removes unnecessary parameters from the model.

  • Fine-tuning is required after the model is pruned so that over-pruning does not degrade model performance (see the sketch after this list).

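A minimal sketch of unstructured magnitude pruning using PyTorch's torch.nn.utils.prune utilities; the toy model and the 30% sparsity level are illustrative assumptions, not part of any particular recipe.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Toy model standing in for a small language model's linear layers.
model = nn.Sequential(
    nn.Linear(512, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# Zero out the 30% of weights with the smallest magnitude in each Linear layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")  # make the pruning permanent

# Measure the resulting sparsity of the weight matrices.
weights = [p for p in model.parameters() if p.dim() > 1]
total = sum(p.numel() for p in weights)
zeros = sum((p == 0).sum().item() for p in weights)
print(f"Sparsity: {zeros / total:.1%}")

# In practice, the pruned model is then fine-tuned on the target task
# to recover any accuracy lost to pruning.
```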
Knowledge distillation

  • Knowledge distillation transfers knowledge learned by a large LLM (the teacher) into a smaller model (the student); see the sketch below.

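A minimal sketch of one distillation training step in PyTorch, assuming you already have a teacher model, a student model, an optimizer, and labelled batches; the temperature and loss weighting are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def distillation_step(teacher, student, optimizer, inputs, labels,
                      temperature=2.0, alpha=0.5):
    # Teacher produces "soft" targets; no gradients flow through it.
    teacher.eval()
    with torch.no_grad():
        teacher_logits = teacher(inputs)

    student_logits = student(inputs)

    # Soft loss: KL divergence between the softened teacher and student distributions.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)

    # Hard loss: standard cross-entropy against the ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)

    # The student learns from both the teacher's behaviour and the labels.
    loss = alpha * soft_loss + (1 - alpha) * hard_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```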
Low-rank factorization

  • Low-rank factorization decomposes a large weight matrix into smaller, lower-rank matrices. This results in fewer parameters, reducing the number of computations and simplifying complex matrix operations (see the sketch below).

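A minimal sketch of low-rank factorization using a truncated SVD in PyTorch; the 1024×1024 matrix size and the rank of 64 are illustrative assumptions.

```python
import torch

W = torch.randn(1024, 1024)          # original weight matrix: ~1.05M parameters
U, S, Vh = torch.linalg.svd(W, full_matrices=False)

# Keep only the top-`rank` singular values/vectors.
rank = 64
A = U[:, :rank] * S[:rank]           # 1024 x 64
B = Vh[:rank, :]                     # 64 x 1024
W_approx = A @ B                     # low-rank approximation of W

params_before = W.numel()
params_after = A.numel() + B.numel()
print(f"Parameters: {params_before} -> {params_after} "
      f"({params_after / params_before:.1%} of the original)")
print(f"Relative approximation error: {torch.norm(W - W_approx) / torch.norm(W):.3f}")
```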
Quantization

  • Quantization converts high-precision data to lower-precision data (for example, 32-bit floats to 8-bit integers), which lightens the computational load and speeds up inference (see the sketch below).

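A minimal sketch of post-training dynamic quantization with PyTorch's quantize_dynamic API, which stores Linear-layer weights as 8-bit integers; the toy model is an illustrative assumption.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(512, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# Convert Linear weights from float32 to int8; activations are quantized dynamically.
quantized_model = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 512)
print(quantized_model(x).shape)      # inference now runs on int8 weight kernels
```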