Microsoft unveils Phi-3-mini, its smallest AI model yet: How it compares to bigger models
- April 26, 2024
- Posted by: OptimizeIAS Team
- Category: DPN Topics
Subject: Science and tech
Sec: Awareness in IT & Computer
Context:
- A few days after Meta unveiled its Llama 3 Large Language Model (LLM), Microsoft on Tuesday unveiled the latest version of its ‘lightweight’ AI model, the Phi-3-mini.
More on news:
- Microsoft has described the Phi-3 as a family of open AI models that are the most capable and cost-effective small language models (SLMs) available.
What is Phi-3-mini?
- Phi-3-mini is the first of three small models that Microsoft plans to release.
- It has reportedly outperformed models of the same size and the next size up across a variety of benchmarks, in areas like language, reasoning, coding, and maths.
- Language models are the backbone of AI applications like ChatGPT, Claude, Gemini, etc.
- These models are trained on existing data to solve common language problems such as text classification, answering questions, text generation, document summarisation, etc.
- The ‘Large’ in LLMs refers to two things: the enormous size of the training data, and the parameter count. In machine learning, where systems learn patterns from data rather than following explicit instructions, parameters are the numerical weights a model acquires during training, effectively the knowledge it has learned.
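The parameter count mentioned above is simply the total number of learned weights. A minimal sketch of how that number arises, using a toy fully connected network with made-up layer sizes (real LLMs use far larger and more complex layers):

```python
# Illustrative only: counting the learned weights of a toy network.
# The "3.8B" in Phi-3-mini is this same quantity at a vastly larger scale.

def linear_params(n_in, n_out, bias=True):
    """A linear layer has an n_in x n_out weight matrix plus an optional bias."""
    return n_in * n_out + (n_out if bias else 0)

# A tiny two-layer network: 512 -> 1024 -> 512 (sizes chosen for illustration)
layers = [(512, 1024), (1024, 512)]
total = sum(linear_params(n_in, n_out) for n_in, n_out in layers)
print(total)  # 1,050,112 weights and biases in this toy model
```

Scaling the same bookkeeping up to billions of such weights is what makes a model “large”.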
What’s new in Microsoft’s Phi-3-mini?
- The latest model from Microsoft expands the selection of high-quality language models available to customers, offering more practical choices as they build generative AI applications.
- Phi-3-mini, a 3.8B language model, is available on AI development platforms such as Microsoft Azure AI Studio, Hugging Face, and Ollama.
- Phi-3-mini is the first model in its class to support a context window of up to 128K tokens, with little impact on quality.
- The model is instruction-tuned, which means that it is trained to follow the different types of instructions given by users.
- This also means that the model is ‘ready to use out-of-the-box’.
- Microsoft says that in the coming weeks, new models will be added to the Phi-3 family to offer customers more flexibility.
- Phi-3-small (7B) and Phi-3-medium (14B) will be available in the Azure AI model catalog and other model libraries shortly.
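The 128K-token context window claimed for Phi-3-mini can be made concrete with a rough budget check. This is a sketch only: the 4-characters-per-token figure is a common rule of thumb for English text, not the actual Phi-3 tokenizer, and exact counts require the model's own tokenizer.

```python
# Rough sketch: will a document fit in a 128K-token context window?
CONTEXT_WINDOW = 128_000   # tokens, as claimed for Phi-3-mini
CHARS_PER_TOKEN = 4        # heuristic average for English text (assumption)

def estimated_tokens(text: str) -> int:
    """Crude token estimate from character count."""
    return max(1, len(text) // CHARS_PER_TOKEN)

def fits_in_context(text: str, reserve_for_output: int = 1_000) -> bool:
    """Leave room for the model's reply when checking the budget."""
    return estimated_tokens(text) + reserve_for_output <= CONTEXT_WINDOW

doc = "word " * 50_000      # ~250,000 characters of dummy text
print(fits_in_context(doc))  # True: roughly 62,500 tokens fits easily
```

By contrast, many earlier small models offered windows of 4K–8K tokens, which such a document would overflow many times over.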
How is Phi-3-mini different from LLMs?
- Phi-3-mini is an SLM.
- SLMs are more streamlined versions of large language models.
- Compared with LLMs, smaller AI models are cheaper to develop and operate, and they run better on smaller devices such as laptops and smartphones.
- According to Microsoft, SLMs are great for resource-constrained environments including on-device and offline inference scenarios.
- The company claims such models are good for scenarios where fast response times are critical, such as chatbots or virtual assistants.
- Moreover, they are ideal for cost-constrained use cases, particularly with simpler tasks.
How good are the Phi-3 models?
- Phi-2 was introduced in December 2023 and reportedly matched the performance of models such as Meta’s Llama 2.
- Microsoft claims that Phi-3-mini is better than its predecessors and can deliver responses comparable to those of a model ten times its size.
- Based on the performance results shared by Microsoft, Phi-3 models significantly outperformed several models of the same size or even larger ones, including Gemma 7B and Mistral 7B, in key areas.