What is an LLM?
Large language models (LLMs) are computer programs that open new possibilities for text understanding and generation. They are deep neural networks, tens to hundreds of gigabytes in size, trained on enormous amounts of text data.
LLMs have been shown to be very powerful in a variety of natural language tasks, including generating summaries and articles from a given source text or topic.
Most LLMs require some sort of prompt as input.
Examples:
"Please use the following job description and my resume to write a letter."
"Two American citizens leave the Irish pub sober. Continue the joke, please."
"What is the evidence for inhaled bronchodilators for acute chest syndrome in people with sickle cell disease?"
The LLM outputs text based on this input. You can think of LLMs as models that complete text.
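To make the idea of "completing text" concrete, here is a minimal toy sketch in Python. It is not how a real LLM works internally: it only counts which word follows which in a tiny made-up corpus, whereas an LLM uses a deep neural network trained on vastly more text. The corpus and the names `train_bigram_model` and `complete` are hypothetical and chosen purely for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus: str) -> dict:
    """Count, for each word, which words follow it in the training text."""
    words = corpus.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers


def complete(prompt: str, followers: dict, max_new_words: int = 5) -> str:
    """Extend the prompt by repeatedly appending the most frequent next word."""
    words = prompt.lower().split()
    if not words:
        return ""  # nothing to continue from
    for _ in range(max_new_words):
        candidates = followers.get(words[-1])
        if not candidates:
            break  # the last word was never followed by anything in training
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)


# Tiny made-up training text (an assumption for illustration only).
TOY_CORPUS = (
    "the model reads the prompt . "
    "the model writes the answer . "
    "the model reads the question ."
)

model = train_bigram_model(TOY_CORPUS)
print(complete("the model", model, max_new_words=1))
```

The toy model can only continue a prompt with words it saw during training, and a prompt ending in an unseen word is returned unchanged. Real LLMs follow the same input-to-continuation pattern, but with far richer learned representations.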
Examples of LLMs
Some well-known examples of LLMs are BERT, GPT-3, and ChatGPT.
In this project, we used three text-generating LLMs to create example outputs for medical systematic reviews or evidence syntheses for the interviews. These LLMs are not connected to an outside knowledge base (e.g., a search engine or a journal database). They generate text based on patterns learned from the large volume of text they saw during training.
The descriptions for each of the models we utilized are as follows:
- Galactica is an LLM trained on over 48 million papers, textbooks, lecture notes, reference material, compounds, proteins, scientific websites, and other sources of scientific knowledge. The authors of the paper claim that the model can be used to explore scientific literature, ask scientific questions, write scientific literature reviews, and help researchers automatically organize science. The largest model has 120B parameters. However, because of limited computing resources, we used the model with 6.7B parameters in our study.
- BioMedLM is an LLM based on an open-source language model with 2.7B parameters. The model was trained on 16 million PubMed abstracts and 5 million PubMed Central full-text articles. This model can achieve strong results on a variety of biomedical NLP tasks, including multiple-choice question answering based on the United States Medical Licensing Examination (USMLE).
- ChatGPT is an LLM trained to follow an instruction in a prompt and provide a detailed response. ChatGPT was specifically trained to interact with the user in a conversational way. The dialogue format makes it possible for the model to answer follow-up questions, admit its mistakes, and reject inappropriate requests. ChatGPT has been shown to be a powerful generative model that can complete tasks from a variety of domains.
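To give a rough sense of why parameter count matters for computing resources, the sketch below estimates how much storage the weights alone would need for the parameter counts quoted above. It assumes each parameter is stored as a 16-bit number (2 bytes); this precision is an assumption for illustration, not something stated for our setup, and the estimate ignores the extra memory needed while the model is running. The function name `estimated_weight_size_gb` is hypothetical.

```python
def estimated_weight_size_gb(num_parameters: float, bytes_per_parameter: int = 2) -> float:
    """Approximate storage for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_parameters * bytes_per_parameter / 1e9


# Parameter counts as quoted in the model descriptions.
MODEL_PARAMETERS = {
    "BioMedLM": 2.7e9,
    "Galactica (used in this study)": 6.7e9,
    "Galactica (largest)": 120e9,
}

for name, params in MODEL_PARAMETERS.items():
    size_gb = estimated_weight_size_gb(params)  # assumes 2 bytes per parameter
    print(f"{name}: ~{size_gb:.1f} GB")
```

Under this assumption, the 6.7B-parameter model needs roughly 13.4 GB for its weights, while the 120B-parameter model needs roughly 240 GB, which is consistent with LLMs being tens to hundreds of gigabytes in size.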