This post appeared on BitcoinEthereumNews.com.

Forget AGI—Top AI Models Still Struggle With Math


In brief

  • MATHVISTA, built with more than 6,000 annotated datapoints from Sahara AI, tests AI models on multimodal math reasoning.
  • GPT-4V scored 49.9%, the highest result among 12 models tested, but still 10.4 percentage points below human performance.
  • Researchers say progress toward AGI may depend less on model size than on better training and evaluation data.

Artificial general intelligence, or AGI, is often described as a system that can perform across many domains the way humans do. Results released this week from the MATHVISTA benchmark test show current models still fall short of that goal.

Researchers from Microsoft Research, Sahara AI, and Emory University tested a capability central to general intelligence: mathematical reasoning grounded in visual information such as charts, graphs, and diagrams.

Across 12 foundation models tested, including ChatGPT, Gemini, and Claude, GPT-4 Vision scored highest at 49.9%. Human participants averaged 60.3%, highlighting a gap between current AI systems and the broader reasoning ability often associated with AGI.

“We want the machine to do things that a normal, average person can do for their daily tasks,” Principal Researcher at Microsoft Research Hao Cheng told Decrypt. “That’s basically what everybody is pursuing for AGI.”

By putting problems into images, diagrams, and plots, the project tests whether models can accurately interpret visual information and solve multi-step mathematical and logical problems—skills that go beyond pattern-matching on text alone.

Models still struggle with those tasks, and measuring that limitation is difficult.

When Cheng’s team reviewed existing evaluation datasets, many included problems that did not require visual reasoning. Models often reached correct answers by relying solely on text.

“Which is not ideal,” Cheng said.
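
One way to picture the problem Cheng describes is to ask a model each question with the text alone and flag any item it still answers correctly, since those items are not really testing visual reasoning. The Python sketch below is illustrative only: the record fields and sample items are hypothetical and are not drawn from MathVista or the researchers' own tooling.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """One benchmark problem plus a model answer (all fields hypothetical)."""
    question_id: str
    gold: str              # reference answer
    answer_text_only: str  # model answer when shown the question text without the image


def normalize(answer: str) -> str:
    """Lowercase and trim so ' 12 ' and '12' compare equal."""
    return answer.strip().lower()


def solvable_without_image(items: list[Item]) -> list[str]:
    """Return the IDs of items the model gets right from the text alone."""
    return [
        item.question_id
        for item in items
        if normalize(item.answer_text_only) == normalize(item.gold)
    ]


# Made-up examples for illustration.
items = [
    Item("q1", "12", " 12 "),  # answered correctly without seeing the image
    Item("q2", "45", "30"),    # wrong without the chart
    Item("q3", "B", "C"),      # wrong without the diagram
]

flagged = solvable_without_image(items)
print(flagged)
```

Items flagged this way would be candidates for removal or rewriting in a benchmark meant to measure visual reasoning.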

MathVista, available on GitHub and Hugging Face, launched in October 2023. Since then, it has been downloaded more than 275,000 times, including more than 13,000 downloads in the past month, according to Microsoft Research.

Creating the dataset required more than standard data labeling, however. Microsoft Research needed annotators who could work through problems across arithmetic, algebra, geometry, and statistics, while distinguishing deeper mathematical reasoning, such as interpreting graphs or solving equations, from simpler tasks like counting objects or reading numbers.

After a pilot phase, Microsoft selected Sahara AI to support the effort. The company provided trained annotators, custom workflows, and multi-stage quality checks to produce more than 6,000 multimodal examples used in the benchmark.

Without reliable benchmarks, measuring progress toward broader machine intelligence becomes difficult, according to Sean Ren, CEO of Sahara AI and an associate professor of computer science at USC.

“There’s this nuance of data contamination, where once we start using this dataset to test, those results get absorbed into the next version,” Ren told Decrypt. “So you don’t really know if they are solving just a data set, or they have the capability.”

If benchmark answers appear in a model’s training data, high scores can reflect memorization rather than reasoning. That makes it harder to determine whether AI systems are actually improving.
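
A common heuristic for spotting this kind of contamination is to measure how many word n-grams from a benchmark question also appear verbatim in a training document. The Python sketch below shows that heuristic under stated assumptions: the function names, the 5-word window, and the sample strings are all hypothetical, and the article does not say the researchers use this method.

```python
def ngrams(text: str, n: int) -> set[tuple[str, ...]]:
    """Return the set of lowercase word n-grams in text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def overlap_ratio(benchmark_item: str, corpus_doc: str, n: int = 5) -> float:
    """Fraction of the item's n-grams that also occur in the corpus document."""
    item_grams = ngrams(benchmark_item, n)
    if not item_grams:
        return 0.0  # item is shorter than n words, so nothing to compare
    return len(item_grams & ngrams(corpus_doc, n)) / len(item_grams)


# Made-up strings for illustration.
item = "what is the slope of the line shown in the graph"
leaked_doc = "forum post: what is the slope of the line shown in the graph answer 2"
clean_doc = "the weather today is sunny with a light breeze from the west"

print(overlap_ratio(item, leaked_doc))
print(overlap_ratio(item, clean_doc))
```

A high ratio suggests the question text may have been seen during training. A low ratio does not prove the item is clean, because paraphrased or translated copies would not match word for word.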

Researchers also point to limits in training data. Much of the publicly available internet has already been incorporated into model datasets.

“You definitely need to have some way to inject some of the new knowledge into this process,” Cheng said. “I think this kind of thing has to come from high-quality data so that we can actually break this knowledge boundary.”

One proposed path involves simulated environments where models can interact, learn from experience, and improve through feedback.

“You create a twin world or a mirror of the real world inside some sandbox so the model can play and do a lot of things humans do in real life, so that it can basically break the boundary of the internet,” Cheng said.

Ren said humans may still play an important role in improving AI systems. While models can generate content quickly, humans remain better at evaluating it.

“That kind of gap between human and AI, where they’re good at, where they’re not good at, can be leveraged to really improve the AI down the road,” he said.


Source: https://decrypt.co/361474/forget-agi-top-ai-models-still-struggle-with-math

