The Science

At HealthGPT.Plus, our mission is to revolutionize the way people manage their health. By leveraging the power of large language models, prompt engineering, and diagnostic calculators, we offer what may be the smartest suite of medical products in the world, with three main features designed to match, diagnose, and treat health concerns. This page explores the scientific foundation behind our platform.

1. Large Language Models in Healthcare

HealthGPT.Plus is built upon state-of-the-art AI technology, utilizing large language models like GPT-4 to solve complex problems in healthcare. These models analyze vast amounts of data, enabling our app to provide users with accurate, personalized health information and support.

Studies show that GPT-4, without any specialized prompt crafting, exceeds the passing score on the United States Medical Licensing Examination (USMLE) by over 20 points and outperforms models specifically fine-tuned on medical knowledge, such as Med-PaLM, a prompt-tuned version of Flan-PaLM 540B (Nori et al., 2023). It also scored 75% on the Medical Knowledge Self-Assessment Program.

2. Prompt Engineering and Self-Critique/Reflection

Prompt engineering has been shown to increase the accuracy and performance of language models like GPT-4, most notably by breaking complex health problems down into smaller, context-free tasks and having the AI reflect on, critique, and debate its own answers. Our platform's proprietary prompting is based on the following papers:

Break complex tasks into simpler subtasks, and consider exposing the intermediate outputs to users: "AI Chains: Transparent and Controllable Human-AI Interaction by Chaining Large Language Model Prompts"
Improve output by generating many candidates, then picking the one that looks best: "Training Verifiers to Solve Math Word Problems"
On reasoning tasks, models do better when they reason step by step before answering: "Chain of Thought Prompting Elicits Reasoning in Large Language Models"
Improve step-by-step reasoning by generating many explanation-answer outputs and picking the most popular answer: "Self-Consistency Improves Chain of Thought Reasoning in Language Models"
The step-by-step reasoning method works well even with zero examples: "Large Language Models are Zero-Shot Reasoners"
Do better than step-by-step reasoning by alternating a 'selection' prompt and an 'inference' prompt: "Selection-Inference: Exploiting Large Language Models for Interpretable Logical Reasoning"
On long reasoning problems, improve step-by-step reasoning by splitting the problem into pieces to solve incrementally: "Least-to-most Prompting Enables Complex Reasoning in Large Language Models"
Have the model analyze both good and bogus explanations to figure out which set of explanations is most consistent: "Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations"
Think about these techniques in terms of probabilistic programming, where systems comprise unreliable components: "Language Model Cascades"
Reduce hallucination with sentence label manipulation, and reduce wrong answers with a 'halter' prompt: "Faithful Reasoning Using Large Language Models"
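To make one of these techniques concrete, the self-consistency method above can be sketched in a few lines of Python: sample several chain-of-thought answers from a model and keep the most popular final answer. The `ask_model` function here is a hypothetical stand-in for a real LLM call, not our production code.

```python
from collections import Counter

def self_consistency(ask_model, question, n_samples=5):
    """Self-consistency: sample several step-by-step answers
    and return the most popular final answer (majority vote)."""
    answers = []
    for _ in range(n_samples):
        # Each sample asks the model to reason step by step
        # (chain of thought) before committing to an answer.
        reasoning = ask_model(
            f"{question}\nLet's think step by step, "
            "then give a final answer on the last line."
        )
        # Use the last line as the candidate final answer.
        answers.append(reasoning.strip().splitlines()[-1])
    # Majority vote across the sampled answers.
    return Counter(answers).most_common(1)[0][0]

# Toy stand-in model whose samples occasionally disagree.
samples = iter([
    "Step 1: ...\nAnswer: 12",
    "Step 1: ...\nAnswer: 12",
    "Step 1: ...\nAnswer: 11",
    "Step 1: ...\nAnswer: 12",
    "Step 1: ...\nAnswer: 12",
])
result = self_consistency(lambda prompt: next(samples), "What is 3 * 4?")
print(result)  # Answer: 12
```

The key property is that a single faulty reasoning chain (the "Answer: 11" sample) is outvoted by the consistent majority.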

3. Medical Calculators and Diagnostic Flowcharts

HealthGPT.Plus incorporates various medical calculators and diagnostic flowcharts to help users identify potential health risks and diagnose symptoms. These tools, grounded in evidence-based medicine and clinical guidelines, help ensure the accuracy and reliability of the information provided. By combining AI technology with established medical knowledge, our app offers a user-friendly way to navigate complex health information.
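A simple example of the kind of calculator involved is body mass index (BMI), computed as weight in kilograms divided by height in meters squared and mapped onto the standard WHO adult categories. The sketch below is illustrative only, not our actual implementation or medical advice.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by height (m) squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / (height_m ** 2)

def bmi_category(value: float) -> str:
    """Map a BMI value onto the standard WHO adult categories."""
    if value < 18.5:
        return "underweight"
    if value < 25.0:
        return "normal weight"
    if value < 30.0:
        return "overweight"
    return "obese"

print(round(bmi(70, 1.75), 1))      # 22.9
print(bmi_category(bmi(70, 1.75)))  # normal weight
```

In the app, calculators like this are combined with diagnostic flowcharts so that a numeric result is always presented alongside its clinical interpretation.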


References

Nori, Harsha, et al. "Capabilities of GPT-4 on Medical Challenge Problems." arXiv preprint arXiv:2303.13375 (2023).

OpenAI. "GPT-4 Technical Report." arXiv preprint arXiv:2303.08774 (2023).