Artificial Intelligence (AI) Lexicon

The ISMPP Artificial Intelligence (AI) Task Force has developed this AI Lexicon, which explains the most important AI-related terms for a medical writer audience. Feedback on the AI Lexicon can be emailed to [email protected].

There are many other glossaries on the internet explaining these terms that also contain many other terms; some are listed below. Note: ISMPP is not affiliated with any of the providers of these glossaries.

NIST: https://airc.nist.gov/AI_RMF_Knowledge_Base/Glossary
Wikipedia: https://en.wikipedia.org/wiki/Glossary_of_artificial_intelligence
Coursera: https://www.coursera.org/articles/ai-terms
Expert.ai: https://www.expert.ai/glossary-of-ai-terms/
Google: https://developers.google.com/machine-learning/glossary
AINavigator: https://www.theainavigator.com/#ai-glossary

Last updated: July 31, 2024

Terms related to Large Language Models

Context Length / Context Window

Context length in LLMs (see Large Language Model) refers to the maximum number of tokens (words or parts of words) a model can consider at once. For models like GPT, it affects how well the model can keep track of long documents or conversations. A very large context length, for instance, would support submitting an entire full-text article at once.
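
One practical consequence of a fixed context window can be sketched as follows. The window size and sample text are invented for illustration, and real systems count tokens rather than whole words:

```python
# Sketch of what a fixed context window implies: once the input exceeds the
# limit, the oldest tokens fall out of view. The limit of 8 is artificial;
# real models allow thousands to millions of tokens.
CONTEXT_LENGTH = 8

def fit_to_context(tokens, limit=CONTEXT_LENGTH):
    """Keep only the most recent tokens that fit in the window."""
    return tokens[-limit:]

conversation = "the patient was seen in clinic and reported mild chest pain".split()
print(fit_to_context(conversation))  # the earliest words are no longer visible
```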

Embeddings

Embeddings are lists of numbers (often several hundred to a few thousand) that represent the meaning of a text, whether it's a document, paragraph, sentence, or word. They help find similar texts (see semantic search and RAG systems) and are an important part of large language models (LLMs).
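
As a minimal sketch, two texts whose embeddings point in similar directions can be compared with cosine similarity. The four-dimensional vectors below are invented for illustration; real embeddings have hundreds of dimensions and come from a trained model:

```python
import math

def cosine_similarity(a, b):
    """Direction-based similarity between two embedding vectors (1 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional embeddings, invented for illustration.
heart_attack = [0.9, 0.1, 0.8, 0.2]
myocardial_infarction = [0.85, 0.15, 0.75, 0.25]
headache = [0.1, 0.9, 0.2, 0.8]

print(cosine_similarity(heart_attack, myocardial_infarction))  # close to 1
print(cosine_similarity(heart_attack, headache))               # much lower
```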

Generative AI (GenAI)

Encompasses technologies that produce new content, such as text, images, videos, and code, unlike systems that classify or cluster data. In the medical field, examples include AI systems like ChatGPT, which can generate medical documents, patient communication, and research papers, and tools like DALL-E or Midjourney, which can create images for educational or diagnostic purposes.

Hallucination

A well-known phenomenon in LLMs, in which the system provides an answer that is factually incorrect, irrelevant, or nonsensical, because of limitations in its training data and architecture. Hallucination is inevitable but can be mitigated, e.g., with a RAG system.

Large Language Model (LLM)

A type of neural network that learns skills — including generating prose, conducting conversations and writing computer code — by analyzing vast amounts of text from across the internet. The basic function is to predict the next word in a sequence, but these models have surprised experts by developing unexpected abilities; for instance, LLMs have passed the USMLE (United States Medical Licensing Examination).

Retrieval-augmented Generation (RAG)

Question-answering systems may be hampered if the answer to the question asked was not part of the training set, e.g., if it involves more recent information. RAG systems enhance the quality of answers by accessing an external database during the answer generation process. This approach enriches the prompts used by incorporating relevant context, historical data, and up-to-date information. This means that RAG systems can generate more accurate and comprehensive content by referencing an external data store, such as medical literature or clinical guidelines, at the time of writing. These models can outperform traditional LLMs with fewer parameters and can be updated easily by refreshing their data sources. Additionally, RAG LLMs can provide citations for their generated content, making it easier for users to verify and trust the information. Accuracy is, however, highly dependent on the retrieval phase, which is responsible for the relevance of the additional information.
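
The retrieve-then-prompt pattern can be sketched as below. The knowledge-base snippets are invented, and retrieval here is naive word overlap; real RAG systems retrieve with embeddings (see Semantic Search):

```python
# Minimal RAG sketch: retrieve the most relevant snippets, then prepend them
# to the prompt before it is sent to the LLM. All snippets are hypothetical.
KNOWLEDGE_BASE = [
    "Guideline X recommends statins for secondary prevention.",
    "Drug Y was approved in 2024 for condition Z.",
    "Appointment scheduling policy for the outpatient clinic.",
]

def retrieve(question, documents, top_k=2):
    """Score documents by shared words with the question (a stand-in for embeddings)."""
    q_words = set(question.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:top_k]

def build_prompt(question):
    """Enrich the prompt with retrieved context before generation."""
    context = "\n".join(retrieve(question, KNOWLEDGE_BASE))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("When was drug Y approved?"))
```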

Semantic Search

Unlike traditional keyword-based search, semantic search aims to understand the intent and context behind a query, providing more relevant and accurate results. It uses embeddings to capture the semantic meaning of a text and uses these to find texts similar in meaning. This means, for instance, that a search for "myocardial infarction" might also return relevant results about "heart attacks" or "coronary thrombosis." By grasping the nuances of medical terminology and the relationships between different concepts, semantic search significantly enhances the depth and breadth of information retrieval in medical writing and research.
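
Ranking documents by embedding similarity might look like the sketch below. The query and document embeddings are invented three-dimensional vectors; in practice they would be produced by an embedding model:

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Hypothetical pre-computed embeddings for a query and three documents.
query = [0.9, 0.1, 0.7]
documents = {
    "heart attack management": [0.85, 0.2, 0.65],
    "coronary thrombosis case report": [0.8, 0.25, 0.7],
    "seasonal allergy overview": [0.1, 0.9, 0.2],
}

# Rank documents by semantic closeness to the query, best match first.
ranked = sorted(documents, key=lambda d: cosine(query, documents[d]), reverse=True)
print(ranked)  # the two cardiology documents rank above the allergy one
```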

Token

A "token" in the context of LLMs is a basic unit of text. Tokens can be whole words, parts of words, or punctuation marks. They help the model to analyze and generate language, crucial for tasks like medical documentation. LLMs usually work with a limited set of tokens, typically between 50k and 100k and are chosen so they can cover any text by splitting complex words into chunks, e.g., che-mother-apy.

Transformer Model

A transformer model is a neural network architecture that revolutionized language understanding by processing entire sentences simultaneously rather than sequentially. This architecture uses a self-attention mechanism to focus on important words in a sentence, helping it to grasp the semantics, understand context better, and handle long sentences. This is crucial for creating advanced medical AI tools that need to understand complex medical documents and literature.
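
The self-attention computation at the heart of a transformer can be sketched as scaled dot-product attention. The two-dimensional token vectors below are invented; real models use learned projections and hundreds of dimensions per token:

```python
import math

def softmax(xs):
    """Turn raw scores into weights that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention: every token attends to every position at once."""
    d = len(keys[0])
    output = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)  # how strongly this token attends to each position
        output.append([sum(w * v[i] for w, v in zip(weights, values))
                       for i in range(len(values[0]))])
    return output

# Three toy token vectors, used here as queries, keys, and values alike.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(attention(tokens, tokens, tokens))
```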

General AI / Machine Learning Concepts

Adversarial Examples

Adversarial examples are inputs to an AI system that have been intentionally designed to cause the system to make a mistake. They are often used to see how stable the system is or to make the training process more challenging.

Artificial General Intelligence (AGI)

Artificial General Intelligence (AGI) is a type of AI that can do anything a human can do intellectually. It’s not limited to specific tasks like playing chess or recognizing faces. Instead, AGI can learn, understand, and apply knowledge in many different areas, just like a human. This means an AGI system could potentially solve problems, think creatively, and adapt to new situations across various fields, from science to art to everyday decision-making. It does not exist yet, but it is considered the ultimate goal of AI research; some people view it as dangerous.

Anthropomorphism

This refers to when people assign human-like traits to AI systems. For example, users might feel that an AI system is empathetic based on how it interacts with them, even though the AI doesn't actually have emotions.

Bias

Bias refers to systematic errors that manifest in AI outputs, potentially skewing results based on the data used during the training phase. This can lead to AI systems making unreliable predictions or generating inappropriate responses, for instance, by correlating specific occupations, medical conditions, or treatments disproportionately with certain demographics such as race or gender. This happens because the training data might not represent everyone properly. Such bias can lead to unfair practices or poor medical care; for instance, AI systems trained on datasets with predominantly male patients may misdiagnose or underdiagnose conditions in women, especially for diseases that present differently across genders. To prevent this, it is important to collect diverse and thorough data and to check the AI system regularly. When using LLMs, be aware of this and try to account for it when iterating on your prompts.

Deep Learning

Deep learning is a subset of machine learning using artificial neural networks with multiple layers to understand complex patterns in data. It allows AI, like large language models (LLMs), to learn from huge amounts of text and create human-like responses.

Emergent Behavior

Refers to unexpected or unintended abilities that LLMs develop based on what they learned during training. For example, an AI trained on medical literature and patient data might start suggesting new treatments or noticing connections between symptoms and diseases that haven't been explored before.

Explainable AI (XAI)

Explainable AI (XAI) is about making it clear how AI makes decisions, which helps build trust. In generative AI, XAI helps users understand why certain outputs, like medical text predictions, are created. This makes the AI more reliable and helps meet legal requirements.

Federated Learning

Federated learning is a way to train AI models without moving data from where they are stored. This helps protect patient privacy. It is especially useful in medicine, as it allows different hospitals or clinics to work together on improving AI models without sharing private patient information.

Inference

This is when a trained model is used on new data to make predictions or create outputs. For example, it can write clinical documents or summarize text. This happens after the model has been trained and is ready to be used in real situations.

Machine Learning

This technology lets computers learn from data, get better over time, and make decisions on their own. It involves training models with lots of data to, for example, create text that reads like a human wrote it, or interpret X-rays automatically. This can greatly help with tasks like making medical documents and clinical reports, and serve as an assistant to clinicians.

Multi-modal

A multi-modal LLM can understand and combine different types of data, like text, images, and numbers, and also generate these different types. For example, it can look at a graph and the text that explains it at the same time, to give a detailed analysis, or interpret a table and generate a graph.

Natural Language Processing (NLP)

This includes methods that help models understand and create human language. Tasks like sorting texts and analyzing emotions in texts are part of this. NLP (Natural Language Processing) uses machine learning, statistics, and language rules to handle complex language data. This is important for things like analyzing patient communication and automating clinical documents.

Neural Network

A machine learning model inspired by the human brain, simulating connected neurons. It learns by identifying statistical patterns in data. Multiple layers of artificial neurons are connected: the first layer processes the input data, and the last layer gives the output. Despite their complexity, even the developers often do not fully understand what happens in the middle layers; investigating this is a separate research topic known as explainable AI (see XAI). This type of model is very important for tasks like analyzing medical images or predicting patient outcomes.

Parameters

Parameters are numbers that shape how a neural network or large language model (LLM) works, like hints that help it predict the next word. Systems like GPT-4 have hundreds of billions of these parameters. Generally, bigger neural networks are more costly to train and use, but they can handle a wider range of tasks. For specific tasks, a smaller network that is trained well can be just as good as, or even better than, a large one; it will also be cheaper and faster to run, including on smaller systems like laptops or even smartphones.
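
Where these counts come from can be shown with simple arithmetic: in a fully connected network, each layer contributes (inputs x outputs) weights plus one bias per output. The layer sizes below are invented for illustration:

```python
# Parameter count of a small fully connected network (layer sizes illustrative):
# input layer of 768, two hidden layers, and a 2-class output.
layer_sizes = [768, 768, 256, 2]

total = 0
for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
    total += n_in * n_out + n_out  # weights + biases for this layer

print(total)  # 787970 -- already ~0.8 million for this tiny network
```

Scaling the same arithmetic up to the layer widths and depths of a modern LLM is how totals reach hundreds of billions.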

Reinforcement Learning

This method involves an AI model learning the best actions by trying different things and getting rewards or penalties based on its actions. Human feedback, such as evaluations, corrections, and suggestions, further refines the learning process. This is especially helpful in clinical decision support systems, where the AI improves its recommendations over time by learning from results and expert advice.

Supervised / Unsupervised / Self-supervised Learning

Supervised learning: This method trains models using labeled data, where inputs are matched with known outputs, often labeled by human experts. It is useful for tasks like diagnostic imaging, where images are labeled with specific diagnoses.

Unsupervised learning: This method finds patterns in data without labels. It is useful for discovering new disease clusters or repurposing drugs. It is more scalable than supervised learning, but has fewer applications.

Self-supervised learning: This method creates its own labels from the data, learning from the context within the data itself. It is crucial for large language models (LLMs), allowing them to predict text sequences based on previous words, and is more scalable than supervised learning and more versatile than unsupervised learning.
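
The self-supervised idea — the label (the next word) comes from the text itself, with no human annotation — can be sketched with a toy next-word predictor. The corpus is invented, and real LLMs learn far richer patterns than these simple word-pair counts:

```python
from collections import Counter, defaultdict

# Invented mini-corpus; the "labels" are simply the words that follow.
corpus = ("the patient reported chest pain . "
          "the patient reported shortness of breath . "
          "the doctor reported improvement .").split()

# Count which word follows which -- this is the whole "training" step.
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def predict_next(word):
    """Return the most frequent continuation seen during training."""
    return following[word].most_common(1)[0][0]

print(predict_next("patient"))  # 'reported' in this toy corpus
```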

Synthetic Data Generation

Synthetic data generation means creating fake datasets that look like real-world data. This is very important in medical AI because it helps train models when there is not enough patient data due to privacy issues or rare conditions. This way, AI tools can be developed without risking patient privacy.

Transfer Learning

Transfer learning is a method where a model trained for one task is used as the starting point for a model on a different task. This is very useful in medical AI, as it allows models to be quickly adapted for different medical specialties or regions by using pre-trained models for new, specific tasks.

Virtual Health Assistants

Virtual health assistants are AI tools that help patients by scheduling appointments, answering health questions, and keeping track of treatments. They make it easier for patients to stay engaged with their healthcare, especially for those with chronic conditions or living in remote areas.