Why Anthropic calls the new Claude 3 its ‘most intelligent’ AI model yet
- March 7, 2024
- Posted by: OptimizeIAS Team
- Category: DPN Topics
Subject: Science and tech
Section: Awareness in IT & Computer
Context:
- Anthropic was founded by former members of OpenAI, the company behind ChatGPT. It says its new family of AI models is capable of advanced performance, beating the likes of GPT-4 on some parameters.
More on news:
- The family includes three state-of-the-art AI models, in ascending order of capability: Claude 3 Haiku, Claude 3 Sonnet, and Claude 3 Opus.
- The company claims that each successive model is more powerful, so users can choose the balance of intelligence, speed, and cost that suits their specific use case.
What is Claude 3?
- Claude is a group of large language models (LLMs) developed by Anthropic.
- The chatbot is capable of handling text, voice messages, and documents.
- According to Anthropic, it generates faster and more context-aware responses than its peers.
- Claude 3 Opus is the most powerful model, Claude 3 Sonnet is the middle model that is capable and price competitive, and Claude 3 Haiku is relevant for any use case that requires instant responses.
- Claude 3 Sonnet currently powers the free Claude.ai chatbot; users only need to sign in with an email address.
- Opus is available through Anthropic’s web chat interface only to users subscribed to the Claude Pro service, which costs $20 a month.
- All the new models come with a 200,000-token context window, which may mean better performance and accuracy, and lets users include more information in a single prompt.
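To make the 200,000-token window more concrete, here is a minimal Python sketch that estimates whether a prompt would fit. It is an illustration only: the ratio of about four characters per token is a rough assumption for English text, not Anthropic's tokenizer, and the function names `estimate_tokens` and `fits_in_window` are hypothetical.

```python
import math

# Context window size quoted in the article for all Claude 3 models.
CONTEXT_WINDOW_TOKENS = 200_000


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate. The 4-characters-per-token ratio is an
    assumption for English text, not the model's real tokenizer."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_window(text: str, reserved_for_reply: int = 0,
                   window: int = CONTEXT_WINDOW_TOKENS) -> bool:
    """Return True if the estimated prompt size, plus any tokens set
    aside for the model's reply, stays within the window."""
    return estimate_tokens(text) + reserved_for_reply <= window


if __name__ == "__main__":
    prompt = "Summarise this report. " * 1000
    print(estimate_tokens(prompt), fits_in_window(prompt))
```

Under this heuristic, a 200,000-token window corresponds to roughly 800,000 characters of English text; a real count would require the model's own tokenizer.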
How did Claude 3 perform?
- Based on a comparison of Claude 3 with its peers, it seems that Anthropic may have caught up with OpenAI.
- OpenAI had surpassed many AI models with the launch of GPT-4 Turbo.
- Claude 3 reportedly demonstrates advanced performance across cognitive tasks such as reasoning, expert knowledge, mathematics, and language fluency.
- Despite the lack of consensus over whether LLMs can really “know” or “reason,” the AI research community commonly uses these terms.
- The company says that the Opus model exhibits “near-human levels of comprehension and fluency on complex tasks”.
- While this is a big claim, the scores show that Claude 3 Opus has achieved near-human performance on some specific benchmarks. However, this does not mean that Opus possesses general intelligence as humans do.
Claude 3 vs GPT-4
- Claude 3 Opus has surpassed GPT-4 on as many as 10 AI benchmarks, which include MMLU (undergraduate level knowledge), HumanEval (Coding), HellaSwag (common knowledge), and GSM8K (grade school math).
- On the benchmark scores, Claude 3 Opus beats its peers only narrowly. For example, in the five-shot MMLU trial, Claude 3 Opus scored 86.8 percent while GPT-4 scored 86.4 percent.
[Figure: Benchmark scores]
- Claude 3 has also shown improvements in terms of analysis, forecasting, content creation, multilingual conversations, code generation, etc.
- Anthropic claimed that the new model family also comes with enhanced vision capabilities, allowing Claude 3 to process photos, charts, and diagrams, much like GPT-4V.
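The narrow MMLU margin quoted above can be worked out in a few lines of Python. This is only a sketch using the two scores given in the article; the helper name `margin_points` is hypothetical.

```python
def margin_points(score_a: float, score_b: float) -> float:
    """Difference between two benchmark scores, in percentage points."""
    return round(score_a - score_b, 1)


# Five-shot MMLU scores as quoted in the article.
opus_mmlu = 86.8
gpt4_mmlu = 86.4

gap = margin_points(opus_mmlu, gpt4_mmlu)
relative_gain = gap / gpt4_mmlu * 100  # gap as a percentage of GPT-4's score

print(f"Gap: {gap:.1f} percentage points")          # Gap: 0.4 percentage points
print(f"Relative improvement: {relative_gain:.2f}%")  # Relative improvement: 0.46%
```

A gap of 0.4 percentage points is less than half a percent of GPT-4's score, which is why the lead is described as narrow.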
Limitations of Claude 3
- According to those who had early access to the model, Claude 3 performs well in tasks such as answering factual questions and optical character recognition (OCR), meaning the ability to extract text from images.
- However, it struggles with complex reasoning and mathematical problems at times.
- It also exhibited biases in its responses, such as favoring a certain racial group over others.
- In the past too, other AI models have faced similar problems.
- Google’s AI chatbot Gemini was criticized for racial bias and historical inaccuracies after it refused to generate images of white individuals and instead depicted them as people of color.
- Anthropic has emphasized the safety features of Claude 3, especially its refusal to generate harmful or illegal content.
- The company was also among the first to introduce Constitutional AI, in which developers lay down a set of values that the system must follow so that it acts in a politically and socially responsible way.
- As of now, Claude 3 is the most expensive model on the market, but Anthropic plans to release more affordable versions soon.
- Based on the early reports, benchmarks, and confidence from the AI community, Claude 3 seems to be a significant step forward in the development of LLMs.