New Studies Reveal Surprising Differences in the Capabilities of ChatGPT: Struggles with Coding, Excels in Medicine

A language model’s abilities are not uniform across subjects. Recent studies have shed light on striking differences in the capabilities of popular systems like ChatGPT, highlighting its struggles with coding challenges while showcasing its strength in medicine.

Researchers at Purdue University analyzed ChatGPT’s answers to questions taken from Stack Overflow, a platform where developers and programmers seek answers to their coding questions. The results were eye-opening: ChatGPT answered 52% of the more than 500 questions incorrectly, and 77% of its answers were found to be unnecessarily verbose. Despite these shortcomings, ChatGPT’s responses were still preferred 39.34% of the time because of their comprehensive nature and well-articulated language style.

A separate study, conducted jointly by UCLA and Pepperdine University, examined ChatGPT’s ability to answer tough medical exam questions. It was tested on more than 850 multiple-choice questions in nephrology, a specialty within internal medicine. The results were impressive: ChatGPT scored 73%, a passing score comparable to that of human medical residents.

ChatGPT’s accuracy on medical questions points to its potential for future medical applications. The UCLA study concludes that such results demonstrate the utility of more capable AI models in the medical field. It’s worth noting that proprietary models such as ChatGPT, from OpenAI, and Claude, from Anthropic, can benefit from additional training data that is not publicly available. In contrast, open-source models rely solely on publicly available data. This difference in training data showcases the impact of training on extensive datasets, which can allow some AI models to outperform others.

However, it’s essential to recognize that an AI model’s performance is constrained by the data it was trained on. If it lacks knowledge of a specific domain, it may struggle to produce accurate results in that area. This limitation contributes to what experts call hallucinations, where a model generates plausible-sounding but inaccurate content on a topic it was not sufficiently trained on. Limited access to training data that is not publicly available remains a significant obstacle to further improving AI performance.

ChatGPT’s struggle with coding aligns with other assessments by researchers from Stanford and UC Berkeley. Their findings indicate a decline in ChatGPT’s math and visual reasoning skills between March and June 2023. While it initially demonstrated proficiency in areas like identifying prime numbers and solving puzzles, its performance dropped to a mere 2% on key benchmarks by summer.

In summary, ChatGPT’s capabilities vary across different domains, excelling in medicine but facing challenges in coding. The divergence arises from the distinct strengths of machine learning models and the availability of training data. While AI models like ChatGPT have made incredible advancements, they still have much to learn before becoming proficient programmers. After all, it’s rare to find doctors who are also skilled hackers.
