Creating a local ChatGPT - using private data

· December 1, 2023

The other day I stumbled on a YouTube video that looked interesting. I’ve been using ChatGPT quite a lot (a few times a day) in my daily work and was looking for a way to feed some private data for our company into it.

The title of the video was “PrivateGPT 2.0 - FULLY LOCAL Chat With Docs”

It was very simple to set up, with a few stumbling blocks along the way. But in the end I could have conversations in English (and broken Swedish) about how to build data pipelines, the Scling way, by feeding the AI our documentation files.

This is all running locally on my machine without any keys to a third party service.

I’m going to split this post into two parts: first a description of what I want to do and why, then a section on how I set it up.

The need for private GPT

ChatGPT (or other LLM-based chats) has taken the world by storm, and the word “paradigm shift” feels appropriate to describe where we are at. But there is a need that I think I’m not alone in having: I want it to know what we know in the company.

Here’s my example: I’m a fairly new developer on a platform that contains loads of code, opinions, and recommendations that I can only learn by scanning the code. Or by asking a colleague, which gets annoying for both me and the people around me after a while. Sometimes it’s a simple question like “has anyone else downloaded files over FTP in Scala on our platform?”, but it can also be a bit more vague: “I’ve never created an integration test on our platform before, can you guide me?”

To do this I would have to create a GPT chat that knows what we know, and holds the same opinions as we do when it comes to preferring one solution over another. I want a Scling chat bot.

Turns out that creating one is surprisingly simple - using PrivateGPT, which the video above showed us.

I’m no expert in this field, and the following description is my layman understanding of it… but:

PrivateGPT is a web server (including an API) that runs locally on your machine (you can run it in a Docker container, but that seems not to be recommended due to performance / hardware virtualization issues… layman, told you). When setting up your local PrivateGPT server you feed it an LLM that you download yourself. This is important, as it means that you can use whatever LLM you want - for example the GPT-SW3 Swedish model.

As I understand it, the LLM is what gives PrivateGPT the ability to hold conversations - to understand language, in short. LLM: large language model. GPT: Generative Pre-trained Transformer. But that is general knowledge, and the model doesn’t know things about our specific world: the Scling platform, our products, use cases from our support, etc.

What makes PrivateGPT very interesting is that you can easily ingest documents (in MANY formats) to teach it that missing knowledge. Before I ingested the Scling user documentation into PrivateGPT, it would recommend any old framework to parse Excel files in Python. After the ingestion, it made recommendations based on the Scling documentation. (It should be emphasized that the Scling documentation is really good and has recommendations for these things in the form of architectural decisions :))

And here I’m sure that you can come up with many more examples:

  • Before I fed it all documentation about being an employee at X, it wouldn’t know how to properly report vacation in system Z; afterwards it would
  • Before I fed it all our product user manuals for washing machines, it would come with generic recommendations for washing your clothes; afterwards it would recommend a specific program and tell you how to sort your clothes
  • Before I fed it all our customer service FAQs, it gave bad answers to common questions; afterwards it could hold long conversations with customers as if it had worked in our customer support for years.

But how do you do it? Let’s see. It’s not hard, but there were a few caveats.

Getting it to work - installation

Everything I write from this point on is what I learned from following the installation page of PrivateGPT. If this doesn’t work, you should probably go to that page and follow their instructions instead.

At the end I will put all of these commands in one file, which will look much like their quick start. However, not reading through and understanding the individual steps will cause problems later. Trust me on this…

  1. To use PrivateGPT you need:

    • Git
    • Python 3.11
    • PyEnv (not a requirement, but recommended and makes life easier). I struggled with this one until I found the automatic installer, curl https://pyenv.run | bash, which set me up correctly
    • Make, which is preinstalled on my Linux system. This is needed to run ingestion from the command line
  2. Start by cloning the repository to get PrivateGPT locally. This is a key feature of using PrivateGPT - you run it locally.

    git clone https://github.com/imartinez/privateGPT
    cd privateGPT
    
  3. Now use pyenv to install and use the correct version of Python and activate the environment:

    pyenv install 3.11
    pyenv local 3.11
    
  4. PrivateGPT uses a tool called Poetry to manage dependencies and run commands like start and ingestion. We need to install it into the pyenv virtual environment. This is how to do that:

    pip install --upgrade pip poetry
    
  5. Once Poetry is installed, install all the dependencies needed to run PrivateGPT locally:

    poetry install --with ui,local
    
  6. Here’s a step that I don’t know much about and that I also had to battle a bit. PrivateGPT gets GPU support via a C++ compiler… I don’t even want to know. Long story short, the GPU is used through llama-cpp-python, which needs to be installed like this:

    CMAKE_ARGS="-DLLAMA_METAL=on" pip install --force-reinstall --no-cache-dir "llama-cpp-python==0.1.59"
    

    However, I had to use an older version, hence the ==0.1.59 at the end. Without that addition the installation failed for me. Read about this step here.

  7. Start the application using make run, but make sure that you pick up the local settings from the settings.yaml file:

    PGPT_PROFILES=local make run
    

    You can use another profile by setting the PGPT_PROFILES variable to another name; as I understand it, a profile named foo reads its settings from a settings-foo.yaml file.

    You can now see the application running at http://localhost:8001. But it’s using a mock LLM, so once you have verified that it works, shut it down again (Ctrl+C) and continue.

  8. Let’s download an LLM. The absolutely easiest way to do that is to use PrivateGPT’s recommended settings, through the setup script:

    poetry run python scripts/setup
    

    This command takes considerable time, since it downloads a roughly 5 GB file to your disk - the LLM.

  9. You could now have ChatGPT-like conversations with PrivateGPT, but let’s ingest some private data into it before we try it out.

    Ingestion can be done in the PrivateGPT web UI, but you can also do it from the command line (or write your own code that uses the /ingest/ API).

    Here’s how to ingest all documents (in many formats) from a certain folder:

    make ingest /path/to/folder -- --log-file /path/to/log/file.log
    

    (There’s a --watch flag that I didn’t use. First it caused some problems with the running web server, and secondly I think that these updates are probably better done offline.)

    This takes some time, as the ingested content needs to be tokenized into a format that PrivateGPT can understand. It took me about a minute for 3 MB of Markdown.

  10. Done! You can now start the application again (make run)

  11. When playing around with this I found the wipe command very useful. All the ingested data ends up in a folder called local_data. Simply deleting that folder works, but causes some weirdness in the UI, so it’s better to use the command PrivateGPT provides for this:

    make wipe
    
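Both the make target and the web UI feed the same ingestion pipeline, and the server also exposes it over HTTP. As a small sketch of reading back what has been ingested, using only the Python standard library (the /v1/ingest/list endpoint path and the response shape are my reading of the PrivateGPT API docs, so treat them as assumptions and verify against your installed version):

```python
import json
import urllib.request

BASE_URL = "http://localhost:8001"  # default address from the steps above

def list_ingested(base_url=BASE_URL):
    """Fetch the listing of ingested documents from the running server."""
    with urllib.request.urlopen(f"{base_url}/v1/ingest/list") as resp:
        return json.load(resp)

def doc_names(listing):
    """Extract unique file names from a listing of the assumed shape:
    {"data": [{"doc_id": ..., "doc_metadata": {"file_name": ...}}, ...]}"""
    return sorted({d["doc_metadata"]["file_name"] for d in listing.get("data", [])})
```

With the server running, `doc_names(list_ingested())` should give you a quick sanity check that your documents actually made it in before you start asking questions about them.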

Once again, as one long command

Here’s a single command that sets up a PrivateGPT installation, with parameters for location of the installation and the folder of documents to ingest.

If the script is saved as privateGPTSetup.sh you can call it with:

bash privateGPTSetup.sh /privateGPT/Installation /docs/to/ingest

Here’s the script:

#!/bin/bash
# Stop at the first failing step instead of ploughing on
set -e

echo "Cloning PrivateGPT to $1"
git clone https://github.com/imartinez/privateGPT "$1"
cd "$1"

echo "Setting up pyenv"
pyenv install 3.11
pyenv local 3.11

echo "Installing Poetry"
pip install --upgrade pip poetry

echo "Installing PrivateGPT as a local installation"
poetry install --with ui,local
CMAKE_ARGS="-DLLAMA_METAL=on" pip install --force-reinstall --no-cache-dir "llama-cpp-python==0.1.59"

echo "Downloading the standard LLM using scripts/setup"
poetry run python scripts/setup

echo "Ingesting all files in $2"
make ingest "$2"

echo "Ingestion done - starting application at http://localhost:8001"
PGPT_PROFILES=local make run
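Once the script has the server running, you are not limited to the web UI: the UI itself talks to the HTTP API, and you can call that API from your own code. Here is a minimal sketch using only the Python standard library. The /v1/chat/completions endpoint, the use_context flag, and the response shape are my reading of the PrivateGPT API docs, so verify them against your installed version:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8001"

def chat_payload(question, use_context=True):
    """Build the JSON body for a chat request.
    use_context=True asks PrivateGPT to answer from the ingested documents."""
    return json.dumps({
        "messages": [{"role": "user", "content": question}],
        "use_context": use_context,
    }).encode("utf-8")

def ask(question, base_url=BASE_URL):
    """POST a question to the chat endpoint and return the reply text,
    assuming an OpenAI-style response with a choices list."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=chat_payload(question),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Calling something like ask("How do we parse Excel files on our platform?") against the running server should then return an answer grounded in whatever documents you ingested.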

Summary

With that you can now set up your own PrivateGPT: it requires no API keys and no uploading of documents to a third party, and you can feed it documents to give it knowledge of your data.

A brave new world
