Expert Language Model

To gain the benefits of an LLM helper tool without sharing internal/proprietary experiment information, this pilot explores the use of an open LLM augmented, via retrieval-augmented generation (RAG), with information from internal notes, wikis, elog entries, shift procedures, etc.

Components

Our pilot ELM is built from these components:

  • Ollama - we use Ollama as the inference backend. It loads and serves quantized open-weight models (e.g. llama3.2:3b) on local hardware.
  • llama3 - an open-weight language model family from Meta. We focus on smaller (< 4B-parameter) models, since most of the information we want retrieved comes from our local knowledge base.
  • RAG layer - the retriever and prompt constructor; it interfaces between the user query and the model, performing semantic search over the document store and assembling the final prompt.
  • Document store - the corpus/knowledge base (internal notes, wikis, elog entries, shift procedures, etc.), preprocessed and vectorised with LangChain and ChromaDB.
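The retrieve-then-prompt flow of the RAG layer can be sketched as below. This is a minimal, self-contained illustration, not the pilot's actual code: the bag-of-words cosine scorer is a toy stand-in for the real ChromaDB embedding search, and all document strings and function names are hypothetical.

```python
# Minimal sketch of a RAG layer: retrieve the most relevant knowledge-base
# chunks for a query, then assemble a grounded prompt for the local model.
import math
from collections import Counter

def score(query: str, doc: str) -> float:
    """Cosine similarity over word counts (toy stand-in for vector embeddings)."""
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    dot = sum(q[w] * d[w] for w in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(sum(v * v for v in d.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query (semantic search stand-in)."""
    return sorted(corpus, key=lambda doc: score(query, doc), reverse=True)[:k]

def build_prompt(query: str, chunks: list[str]) -> str:
    """Assemble the final prompt: retrieved context first, then the question."""
    context = "\n\n".join(chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )

# Hypothetical knowledge-base chunks for illustration only.
corpus = [
    "Shift procedure: call the run coordinator before restarting the DAQ.",
    "The cafeteria opens at 07:00 on weekdays.",
]
query = "How do I restart the DAQ?"
prompt = build_prompt(query, retrieve(query, corpus, k=1))
# The assembled prompt would then be sent to the model served by Ollama
# on local hardware; the answer never leaves the local machine.
```

In the real pipeline, the scorer and corpus loop are replaced by a ChromaDB collection query over the vectorised document store, but the shape of the flow (embed query, fetch top-k chunks, prepend them to the prompt) is the same.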

Resources

Wrapper and utility Python scripts are provided below, along with part of the preprocessed knowledge base (password protected).

NA62 ELM

Page last updated on June 5, 2025