# Privacy-First Command-Line AI for Linux

![AI_ENV](logo.webp)

Unlock the power of AI, right from your Linux terminal.

This project delivers a fully local AI environment, running open-source language models directly on your machine.

No cloud. No GAFAM. Just full privacy, control, and the freedom to manipulate commands in your shell.

## How it works

* [Ollama](https://ollama.com/) runs language models on the local machine.
* [openedai-speech](https://github.com/matatonic/openedai-speech) provides text-to-speech capability.
* [speaches-ai](https://github.com/speaches-ai/speaches) provides transcription, translation, and speech generation.
* [nginx](https://nginx.org/en/) adds authentication to the API.
* [aichat](https://github.com/sigoden/aichat) is the LLM CLI tool, featuring Shell Assistant, Chat-REPL, RAG, AI Tools & Agents.

Everything is free, open-source and automated using Docker Compose and shell scripts.

## Requirements

Running this project efficiently requires a modern computer with a recent NVIDIA GPU.
As an example, I achieve good performance with an Intel(R) Core(TM) i7-14700HX, a GeForce RTX 4050, and 32 GB of RAM.

You must use Linux and the [NVIDIA Container Toolkit](https://github.com/NVIDIA/nvidia-container-toolkit).

Note that it is probably possible to run the project on other GPUs or modern MacBooks, but this is not the purpose of this project.
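
Before starting the stack, it can help to confirm that the required tooling is installed. The following is a minimal sketch and not part of this project: `missing_commands` is a hypothetical helper, and the list of commands to check (`docker`, `nvidia-smi`, `nvidia-ctk`) is an assumption about a typical NVIDIA Container Toolkit setup.

```bash
# Hypothetical helper (not shipped with this project): print every command
# from the argument list that cannot be found on the PATH, one per line.
missing_commands() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Usage (assumed tool names for a typical setup):
#   missing_commands docker nvidia-smi nvidia-ctk
```

An empty output means every command was found. To confirm that containers can actually see the GPU, a common check is `docker run --rm --gpus all ubuntu nvidia-smi`.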

## How to launch the server

Choose the models you wish to use in the `docker-compose.yaml` file, then change the API token in the `.env` file as follows:
```
LLM_API_KEY=1234567890
```
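
The value `1234567890` is only a placeholder; a long random token is harder to guess. As a rough sketch (the helper names `generate_api_key` and `env_line` are hypothetical, and the 32-byte length is an arbitrary choice), a token can be generated from `/dev/urandom`:

```bash
# Hypothetical helper: print 32 random bytes as 64 hexadecimal characters.
generate_api_key() {
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Hypothetical helper: print a line suitable for the .env file.
env_line() {
  printf 'LLM_API_KEY=%s\n' "$(generate_api_key)"
}

# Usage: env_line
```

Copy the printed line into `.env` by hand, or redirect it with care: `env_line > .env` would overwrite any other settings the file may contain.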

Next, start the servers and their configuration with Docker Compose:
```bash
docker compose up --build -d
```

## How to use

The `setup_desktop.sh` script copies a compiled static build of [aichat](https://github.com/sigoden/aichat) from a container to your host and configures the tool.

### Aichat essentials

To launch a chatbot while maintaining context:
```bash
aichat -m ollama:qwen2.5 -s
```

With a prompt:
```bash
aichat -m ollama:qwen2.5 --prompt "I want you to act as an English translator, spelling corrector and improver. I will speak to you in any language and you will detect the language, translate it and answer in the corrected and improved version of my text, in English. I want you to only reply the correction, the improvements and nothing else, do not write explanations."
```

Pipe a command and transform the result with the LLM:
```bash
ls | aichat -m ollama:qwen2.5 --prompt "transform to json"
```
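
For repeated use, the pipe pattern above can be wrapped in a small shell function. This is only a sketch: the function name `to_llm` and the `AICHAT_MODEL` variable are hypothetical, and it relies solely on the `-m` and `--prompt` options already shown in this section.

```bash
# Hypothetical wrapper: run a command and pipe its standard output to aichat
# together with an instruction.
# Usage: to_llm "<instruction>" <command> [args...]
to_llm() {
  local instruction="$1"
  shift
  # AICHAT_MODEL is an assumed variable; it defaults to the model used above.
  "$@" | aichat -m "${AICHAT_MODEL:-ollama:qwen2.5}" --prompt "$instruction"
}

# Example: to_llm "transform to json" ls
```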

Go to the [AIChat](https://github.com/sigoden/aichat) website for other possible use cases.

### Text To Speech

For text-to-speech, use the `tools/tts.sh` script.

Example:
```bash
./tools/tts.sh -l french -v pierre --play "Aujourd'hui, nous sommes le $(date +%A\ %d\ %B\ %Y)."
```

### Speech To Text

For speech-to-text, use the `tools/stt.sh` script.
The `record` function uses PulseAudio to record the computer's audio output (for example, a video playing in the browser).
The `transcription` function converts an audio file into text.

Example:
```bash
./tools/stt.sh record -s alsa_output.pci-0000_00_1f.3-platform-skl_hda_dsp_generic.HiFi__Speaker__sink.monitor
./tools/stt.sh transcription -f record_20250112_125726.wav -l fr
```
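
The `-s` argument in the example is the name of a PulseAudio monitor source, which differs from machine to machine. As a sketch, assuming `pactl` is installed and that its `list short sources` output is tab-separated with the source name in the second column, the available monitor sources can be listed with a small filter (`monitor_sources` is a hypothetical helper, not part of `tools/stt.sh`):

```bash
# Hypothetical helper: read `pactl list short sources` output on stdin and
# print only the names of monitor sources (second tab-separated column).
monitor_sources() {
  awk -F '\t' '$2 ~ /\.monitor$/ { print $2 }'
}

# Usage: pactl list short sources | monitor_sources
```

Any name it prints can then be passed to `./tools/stt.sh record -s`.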

## How to Use Remotely

Because Nginx authenticates API requests, you can expose the API to the internet and use it remotely.
Putting a reverse proxy such as Caddy in front of it also adds TLS encryption.
This way, you can use this environment remotely and securely.
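
As an illustration of the reverse-proxy setup, the sketch below prints a minimal Caddyfile. The helper `caddyfile_for` is hypothetical, `your-remote-domain` is only a placeholder, and the upstream address `127.0.0.1:8080` is an assumption: replace it with the address on which the Nginx container is actually published.

```bash
# Hypothetical helper: print a minimal Caddyfile that forwards traffic for a
# domain to a local upstream (here, the Nginx authentication proxy).
# Usage: caddyfile_for <domain> <upstream host:port>
caddyfile_for() {
  printf '%s {\n\treverse_proxy %s\n}\n' "$1" "$2"
}

# Example (the upstream address is an assumption, adjust it to your setup):
#   caddyfile_for your-remote-domain 127.0.0.1:8080 > Caddyfile
```

When the site address is a public domain name, Caddy obtains and renews the TLS certificate automatically.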

To use the tool scripts remotely, set the `TTS_API_HOST` and `STT_API_HOST` environment variables.

Example:
```bash
TTS_API_HOST="https://your-remote-domain" ./tools/tts.sh -l french -v pierre --play "Aujourd'hui, nous sommes le $(date +%A\ %d\ %B\ %Y)."
STT_API_HOST="https://your-remote-domain" ./tools/stt.sh transcription -f speech_20250112_124805.wav -l fr
```