CalPERS Private Equity RAG


I have recently wanted to dive into retrieval-augmented generation (RAG), so I went looking for a good reference document to build on. Since I currently work in private markets, the CalPERS Private Equity documents seemed like a natural source.

CalPERS (California Public Employees’ Retirement System) is the largest public pension fund in the United States, managing retirement and health benefits for over 2 million California public employees, retirees, and their families. With approximately $500 billion in assets under management, CalPERS invests across multiple asset classes including public equity, fixed income, real estate, and private equity.

Technologies

The pipeline is built entirely in Python using the following stack:

  • LangChain — orchestration framework for chaining LLM calls, document loaders, and retrievers
  • LangChain Text Splitters — chunking documents into overlapping segments before embedding
  • LangChain Ollama — integration with locally-running Ollama models
  • LangChain Community — additional loaders and utilities
  • LangChain HuggingFace — HuggingFace embeddings bridge
  • LangChain Chroma — LangChain wrapper for ChromaDB
  • Sentence Transformers — local embedding models for converting text chunks into vectors
  • ChromaDB — local vector store for persisting and querying embeddings
  • PyPDF — PDF ingestion and text extraction
  • Typer — CLI interface for running the pipeline from the terminal
  • Rich — formatted terminal output
  • python-dotenv — environment variable management
  • httpx — async HTTP client used by Ollama integration
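To make the chunking step concrete, here is a minimal plain-Python sketch of splitting text into overlapping segments. It is an illustration only, not the splitter this pipeline uses: LangChain's text splitters break on separators such as paragraphs and sentences, while this version cuts at fixed character offsets. The function name `chunk_text` and the default sizes are assumptions chosen for the example.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that overlap by `overlap` chars.

    Hypothetical helper for illustration; real splitters respect sentence and
    paragraph boundaries instead of cutting at raw character offsets.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text,
        # otherwise the tail would be repeated as a shorter duplicate chunk.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=2):
        print(piece)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.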

Models

Both models run locally (the LLM via Ollama, the embedding model via HuggingFace), so no API keys or internet connection are required at query time:

  • gemma3:4b via Ollama — the default LLM responsible for generating answers from retrieved context. Runs fully on-device. Any model available in Ollama can be substituted.
  • all-MiniLM-L6-v2 via HuggingFace — a lightweight sentence-transformer model used to embed document chunks and queries into vectors for semantic search.
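The semantic search that the embedding model enables comes down to comparing vectors. The sketch below shows that comparison with a deliberately crude stand-in: `toy_embed` builds a bag-of-words count vector, whereas all-MiniLM-L6-v2 produces dense 384-dimensional vectors that capture meaning beyond exact word matches. The function names and the sample chunks are made up for illustration.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in for a real embedding model: a sparse word-count vector."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: str, chunks: list[str], k: int = 2) -> list[tuple[float, str]]:
    """Return the k chunks most similar to the query, highest score first."""
    query_vec = toy_embed(query)
    scored = [(cosine_similarity(query_vec, toy_embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Invented sample chunks, not taken from any CalPERS document.
    sample_chunks = [
        "private equity fund commitments",
        "health benefits for retirees",
        "real estate holdings",
    ]
    for score, chunk in top_k("private equity commitments", sample_chunks):
        print(f"{score:.3f}  {chunk}")
```

In the actual pipeline the vector store handles both the storage and the nearest-neighbour lookup, so this ranking loop is only a picture of what happens conceptually.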

RAG

The diagram below gives a high-level overview of how RAG works.

[Figure: RAG overview for CalPERS Private Equity data]
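In code, the same flow can be sketched as three steps: retrieve the most relevant chunks, place them in a prompt, and hand the prompt to the LLM. The version below is a simplified illustration under stated assumptions, not the pipeline's implementation: retrieval uses plain word overlap instead of embeddings and a vector store, the prompt wording is invented, and `generate` is a hypothetical callable standing in for the locally running model.

```python
import re
from typing import Callable


def _words(text: str) -> set[str]:
    """Lowercased set of word tokens, with punctuation stripped."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank chunks by how many words they share with the question.

    Stand-in for embedding-based similarity search in a vector store.
    """
    q_words = _words(question)
    ranked = sorted(chunks, key=lambda c: len(q_words & _words(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Combine the retrieved chunks and the question into a single prompt."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def answer(
    question: str,
    chunks: list[str],
    generate: Callable[[str], str],
    k: int = 2,
) -> str:
    """Retrieve, then generate: the two halves of RAG."""
    prompt = build_prompt(question, retrieve(question, chunks, k))
    return generate(prompt)


if __name__ == "__main__":
    # Invented sample chunks; the echo "model" just returns the prompt it got.
    docs = [
        "The fund committed capital to buyout managers.",
        "Cafeteria menu for Tuesday.",
        "Buyout managers report net returns quarterly.",
    ]
    print(answer("Which buyout managers received capital?", docs, generate=lambda p: p))
```

Passing the model in as a callable keeps the retrieval and prompt logic testable without a running LLM; swapping in a real model only changes what `generate` does.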

References