What is Llama 3, Meta’s most sophisticated and capable large language model yet?

Subject: Science and tech

Sec: Awareness in AI and Computers

Context:

On Thursday, 18 April 2024, Meta introduced Meta Llama 3, its most capable large language model (LLM) to date.

More on the news:

The company also introduced an image generator that updates pictures in real time as the user types the prompt.
Meta will integrate its latest model into its proprietary virtual assistant, Meta AI.
Meta is pitching its latest models as the most sophisticated available, claiming they are well ahead of peers such as Google and Mistral in performance and capabilities.
The updated Meta AI assistant will be integrated into Facebook, Instagram, WhatsApp, Messenger, and a standalone website much like OpenAI’s ChatGPT.
At present, Meta AI is available in English across the US on WhatsApp. Meta is also expanding to more countries including Australia, Canada, Ghana, Jamaica, Malawi, New Zealand, Nigeria, Pakistan, Singapore, South Africa, Uganda, Zimbabwe, and Zambia.
Llama 3 models will soon be available on AWS, Google Cloud, Hugging Face, Databricks, Kaggle, IBM WatsonX, Microsoft Azure, NVIDIA NIM, Snowflake, etc.

What is Llama 3?

Llama, short for Large Language Model Meta AI, is a family of LLMs introduced by Meta AI in February 2023. The first version was released in four sizes: 7B, 13B, 33B, and 65B parameters. According to Meta, the 13B model outperformed OpenAI’s GPT-3, which has 175 billion parameters, on most benchmarks.
Meta released Llama 2, a significantly upgraded version of its first LLM, in July 2023.
Llama 2 was released in 7B, 13B, and 70B parameter sizes and was trained on 40 per cent more data than its predecessor.
Llama 3 is the latest iteration of the LLM, and Meta claims it is its most sophisticated model yet, with significant gains in performance and capabilities.
Llama 3, which is based on the Llama 2 architecture, has been released in two sizes, 8B and 70B parameters.
Both sizes come as a base (pretrained) model and an instruction-tuned version, which is fine-tuned to follow user instructions in chat and assistant-style tasks.
The models released so far in the Llama 3 collection are text-based.
All Llama 3 models support a context length of about 8,000 tokens (8,192 to be exact).
This is double the 4,096-token context of Llama 2 and allows longer interactions and more complex inputs than Llama 2 or Llama 1.
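
To make the context-length limit concrete, here is a minimal Python sketch of checking whether a prompt, plus the room reserved for the model's reply, fits in the window. It assumes a window of 8,192 tokens (the "8,000 tokens" figure above) and uses a deliberately crude word count in place of a real tokenizer. The names `rough_token_count` and `fits_in_context` are hypothetical helpers written for this illustration and are not part of any Meta library.

```python
from typing import Callable

# Assumption: Llama 3's window is 8,192 tokens (the "8,000 tokens" in the text).
LLAMA_3_CONTEXT_TOKENS = 8192


def rough_token_count(text: str) -> int:
    """Crude stand-in for a real tokenizer: counts whitespace-separated words.

    Real subword tokenizers usually produce more tokens than words, so this
    underestimates; swap in the model's own tokenizer for real use.
    """
    return len(text.split())


def fits_in_context(
    prompt: str,
    max_new_tokens: int,
    context_tokens: int = LLAMA_3_CONTEXT_TOKENS,
    count_tokens: Callable[[str], int] = rough_token_count,
) -> bool:
    """Return True if the prompt plus the requested output fits in the window."""
    return count_tokens(prompt) + max_new_tokens <= context_tokens


short_prompt = "Summarise the Llama 3 announcement in two sentences."
long_prompt = "word " * 9000  # roughly 9,000 "tokens" under the crude count

print(fits_in_context(short_prompt, max_new_tokens=256))  # True
print(fits_in_context(long_prompt, max_new_tokens=256))   # False
```

The point of the sketch is that the window is shared: the prompt and the generated reply both count against the same token budget.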

How good is Llama 3?

Meta claims that the 8B and 70B parameter Llama 3 models are a major leap over Llama 2.
According to Meta’s reported results, Llama 3 outperformed Google’s Gemma 7B and Mistral’s Mistral 7B (with the 8B model) and Anthropic’s Claude 3 Sonnet (with the 70B model) on benchmarks such as MMLU 5-shot (Massive Multitask Language Understanding), GPQA 0-shot (a graduate-level, Google-proof Q&A benchmark), HumanEval 0-shot (a benchmark that checks whether model-generated code passes unit tests), and GSM-8K 8-shot CoT and MATH 4-shot CoT (grade-school word problems and competition-level maths, solved with chain-of-thought prompting).
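
The "0-shot", "5-shot", and "8-shot" labels refer to how many solved examples are placed in the prompt before the actual question. The Python sketch below illustrates the idea only: the function name `build_k_shot_prompt`, the "Question:/Answer:" layout, and the arithmetic examples are all made up for this illustration, and real benchmark harnesses use their own formats. In a chain-of-thought (CoT) setup, the example answers would also include the worked reasoning.

```python
from typing import List, Tuple


def build_k_shot_prompt(
    examples: List[Tuple[str, str]], question: str, k: int
) -> str:
    """Prepend the first k solved examples to the question.

    k=0 gives a zero-shot prompt that contains only the question.
    """
    if k < 0 or k > len(examples):
        raise ValueError("k must be between 0 and the number of examples")
    blocks = [f"Question: {q}\nAnswer: {a}" for q, a in examples[:k]]
    blocks.append(f"Question: {question}\nAnswer:")
    return "\n\n".join(blocks)


# Made-up solved examples, purely for illustration.
examples = [
    ("What is 2 + 3?", "5"),
    ("What is 10 - 4?", "6"),
    ("What is 6 * 7?", "42"),
]

zero_shot = build_k_shot_prompt(examples, "What is 7 + 8?", k=0)
two_shot = build_k_shot_prompt(examples, "What is 7 + 8?", k=2)

print(zero_shot)
print("---")
print(two_shot)
```

Scores from different shot counts are not directly comparable, which is why each benchmark above is quoted with its shot setting.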