
Tel Aviv University-led research sparks debate on AI’s growing role in clinical care
Tel Aviv/Los Angeles: A groundbreaking study led by Prof. Dan Zeltzer of Tel Aviv University’s School of Economics has found that artificial intelligence can outperform human doctors in diagnosing illnesses and recommending treatments in urgent care settings. The findings, published in the Annals of Internal Medicine and presented at the American College of Physicians’ annual conference, signal a potential turning point in the integration of AI into real-world clinical practice.
AI Trained on Medical Records Shows Superior Accuracy
The research was conducted at Cedars-Sinai Medical Center’s virtual urgent care center in Los Angeles, operated in collaboration with Israeli health tech firm K Health. The AI tool, trained on millions of anonymized patient records, interacts with patients via structured chat to gather medical history and offer diagnostic and treatment recommendations — including prescriptions, lab tests, and referrals — before a physician reviews the case.
In the study of 461 adult patient visits in July 2024, a panel of experienced physicians evaluated the recommendations of both the AI and the human doctors on a four-tier scale: optimal, acceptable, inadequate, and potentially harmful.
Key Findings:
- AI-generated recommendations were rated optimal in 77% of cases, compared to 67% for human doctors.
- Potentially harmful recommendations were less common from AI (2.8%) than physicians (4.6%).
- In 21% of cases, AI outperformed doctors; doctors had the edge in 11%.
- In 68% of cases, both received the same rating.
“These findings surprised us,” said Prof. Zeltzer. “Across a broad range of common symptoms, AI produced more optimal and fewer harmful recommendations.”
Strengths: Data Depth and Consistency
AI showed particular strength in antibiotic stewardship and history-based decisions. For example, it avoided prescribing antibiotics in cases likely caused by viruses, where overprescription is common due to patient pressure. It also adjusted UTI treatment plans based on recurrence or previous antibiotic resistance — details doctors under pressure may overlook.
“AI doesn’t get fatigued or distracted,” Zeltzer said. “It can instantly access and analyze full patient records to guide decisions in a consistent, guideline-based manner.”
Limits: Clinical Intuition Still Matters
Despite its strengths, AI wasn’t flawless. It struggled with subtleties best observed visually, such as differentiating nasal congestion from genuine breathing distress. “In some cases, the doctor’s visual judgment was more accurate,” said Zeltzer.
Safety-First AI: No Hallucinations, No Guesswork
Unlike generative AI tools such as ChatGPT, the system used in this study refrains from guessing. It offers recommendations only when it has high confidence — about 80% of the time. In the remaining 20% of cases, it withholds suggestions due to uncertainty. “This AI isn’t trying to sound convincing; it’s designed to be clinically reliable,” Zeltzer emphasized.
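The abstention behavior described above can be illustrated as a simple confidence gate. The sketch below is purely illustrative — the class names, the 0.80 threshold, and the structure are assumptions for explanation, not details of K Health’s actual system, which reports only that roughly 80% of cases clear its internal confidence bar:

```python
# Illustrative sketch of confidence-gated triage. All names and the
# exact threshold are assumptions, not details from the study.
from dataclasses import dataclass
from typing import Optional

CONFIDENCE_THRESHOLD = 0.80  # assumed cutoff for this sketch


@dataclass
class Recommendation:
    diagnosis: str
    confidence: float  # model's estimated probability the diagnosis is correct


def gated_recommendation(rec: Recommendation) -> Optional[Recommendation]:
    """Return the recommendation only when the model is sufficiently
    confident; otherwise withhold it and defer entirely to the physician."""
    if rec.confidence >= CONFIDENCE_THRESHOLD:
        return rec
    return None  # uncertain case: no suggestion is shown


# A confident case passes through; an uncertain one is withheld.
confident = gated_recommendation(Recommendation("sinusitis", 0.92))
uncertain = gated_recommendation(Recommendation("dyspnea", 0.55))
```

The point of such a design is that the system fails silent rather than fails convincing: when the gate withholds a suggestion, the physician sees no AI output at all, rather than a fluent but low-confidence guess.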
Implications for the Future of Medicine
The study did not assess whether doctors incorporated AI suggestions into their final decisions — a question slated for future research. Still, the results suggest AI could serve as a powerful clinical aid, especially in environments with high patient volume and time pressure.
“This doesn’t mean doctors are replaceable,” said Zeltzer. “But AI can help reduce diagnostic errors, standardize care, and free up physicians to focus on complex judgment calls and patient interaction.”
Looking Ahead: Collaboration, Not Replacement
With follow-up studies underway, experts are now grappling with broader questions: How should AI and doctors collaborate? When should AI recommendations be shown? What safeguards are needed to ensure accountability?
Zeltzer concluded, “The innovation is moving fast. But thoughtful implementation, transparency, and continued evaluation will be key to ensuring AI enhances — rather than disrupts — the future of healthcare.”