Expert Language Model
To gain the benefits of an LLM helper tool without sharing internal or proprietary experiment information, this
pilot explores the use of an open LLM
enhanced, via retrieval-augmented generation (RAG), with information from internal notes, wikis, elog entries, shift procedures, etc.
Components
Our pilot ELM is built from these components:
- Ollama - the inference backend. It loads and serves quantized
open models (e.g. llama3.2:3b) on local hardware.
- Llama 3.2 - an open-weight language model family from Meta. We focus on smaller (< 4B) models,
since most of the information we want will be retrieved from our local knowledge base.
- RAG Layer - the retriever and prompt constructor; it sits between the user query and the model, performing semantic
search and prompt assembly (see the query sketch after this list).
- Document store - the corpus/knowledge base (internal notes, wikis, elog entries, shift procedures, etc.), preprocessed and
vectorised using LangChain and ChromaDB (see the ingestion sketch after this list).
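The ingestion side might look like the following minimal sketch. The directory layout, chunking parameters, and embedding model (nomic-embed-text, also served locally by Ollama) are illustrative assumptions, not the pilot's actual configuration:

```python
# Ingestion sketch: split internal documents into chunks and store their
# embeddings in ChromaDB via LangChain. All paths and parameters below are
# hypothetical examples, not the pilot's real settings.
from langchain_community.document_loaders import DirectoryLoader, TextLoader
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load the knowledge base (internal notes, wikis, elog entries, ...) from disk.
loader = DirectoryLoader("knowledge_base/", glob="**/*.txt", loader_cls=TextLoader)
docs = loader.load()

# Split long documents into overlapping chunks so each embedding stays focused.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(docs)

# Embed each chunk locally through Ollama and persist the vectors in ChromaDB.
store = Chroma.from_documents(
    chunks,
    embedding=OllamaEmbeddings(model="nomic-embed-text"),  # assumed embedding model
    persist_directory="chroma_db",
)
```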
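At query time, the RAG layer performs semantic search over the stored vectors and assembles the retrieved chunks into a prompt before handing off to the model served by Ollama. A minimal sketch, again with assumed names (the ask_elm helper, prompt template, and example question are hypothetical):

```python
# Query sketch: retrieve relevant chunks from ChromaDB, assemble a grounded
# prompt, and send it to a small local model served by Ollama.
import ollama
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma

# Reopen the persisted vector store created during ingestion.
store = Chroma(
    persist_directory="chroma_db",
    embedding_function=OllamaEmbeddings(model="nomic-embed-text"),
)

def ask_elm(question: str, k: int = 4) -> str:
    # Semantic search: find the k chunks closest to the question.
    hits = store.similarity_search(question, k=k)
    context = "\n\n".join(doc.page_content for doc in hits)

    # Prompt assembly: ask the model to answer from the retrieved context only.
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    reply = ollama.chat(
        model="llama3.2:3b",
        messages=[{"role": "user", "content": prompt}],
    )
    return reply["message"]["content"]

print(ask_elm("Where are the shift procedures documented?"))  # example query
```

Constraining the model to the retrieved context is what lets a small (< 4B) model answer usefully: the knowledge lives in the document store, not in the model's weights.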
Resources
Wrapper and utility Python scripts are provided below. Some of the preprocessed knowledge base is also linked
below (password protected).