New Studies Reveal Surprising Differences in the Capabilities of ChatGPT: Struggles with Coding, Excels in Medicine

When it comes to the development of advanced artificial intelligence (AI), not all language models are created equal. Recent studies have shed light on the striking differences in the capabilities of popular systems like ChatGPT, highlighting its struggles with coding challenges while showcasing its excellence in the field of medicine.

Researchers at Purdue University conducted an analysis of ChatGPT’s performance on Stack Overflow, a platform frequented by developers and programmers seeking answers to their coding questions. The results were eye-opening, with ChatGPT providing incorrect responses to 52% of the over 500 questions posed. Furthermore, a significant number of its answers (77%) were found to be unnecessarily verbose. Surprisingly, despite these shortcomings, ChatGPT’s responses were still preferred 39.34% of the time due to their comprehensive nature and well-articulated language style.

On the other hand, another study conducted jointly by UCLA and Pepperdine University examined ChatGPT’s ability to answer tough medical exam questions. Specifically, it was tested on a range of over 850 multiple-choice questions in the field of nephrology, a specialty within internal medicine. The results were impressive, with ChatGPT scoring 73% – a passing rate comparable to that of human medical residents.

ChatGPT’s accuracy on medical questions points to its potential for future medical applications. The UCLA and Pepperdine study concludes that such results signal the usefulness of more capable AI models in the medical field. The study also offers a reason why some models do better than others. ChatGPT is a proprietary system built by OpenAI, and Claude is a separate proprietary model built by Anthropic; neither is open source. According to the study, proprietary models of this kind may benefit from additional training data that is not publicly available, while open-source models rely largely on publicly available data. This difference in training data shows how access to extensive, high-quality datasets can let some AI models outperform others.

However, it’s essential to recognize that an AI model’s performance is limited by the data it was trained on. If a model has seen little material from a specific domain, it may struggle to produce accurate results in that area. This limitation contributes to what experts refer to as hallucinations, where a model generates plausible-sounding content without sufficient grounding in the topic. Limited access to training data that is not publicly available remains a significant obstacle to further improving AI performance.

ChatGPT’s struggle with coding aligns with other assessments conducted by researchers from Stanford and UC Berkeley. Their findings indicate a decline in ChatGPT’s math and visual reasoning skills between March and June 2023. While it initially demonstrated proficiency in areas like primes and puzzles, its performance dropped to a mere 2% on key benchmarks by June.

In summary, ChatGPT’s capabilities vary across different domains, excelling in medicine but facing challenges in coding. The divergence arises from the distinct strengths of machine learning models and the availability of training data. While AI models like ChatGPT have made incredible advancements, they still have much to learn before becoming proficient programmers. After all, it’s rare to find doctors who are also skilled hackers.

Tanvi Shah
Tanvi Shah is an expert author at The Reportify who explores the exciting world of artificial intelligence (AI). With a passion for AI advancements, Tanvi shares exciting news, breakthroughs, and applications in the Artificial Intelligence category. She can be reached at tanvi@thereportify.com for any inquiries or further information.
