author     ben    2025-01-12 14:37:13 +0100
committer  ben    2025-01-12 14:37:13 +0100
commit     778188ed95ccf50d2e21938bf5b542d76e066f63 (patch)
tree       e5138e638da98036e03cb11b2b0cf48fe4c590b2
Initial commit, first public version.
-rw-r--r--  .gitignore                             |    1
-rw-r--r--  README.md                              |   98
-rw-r--r--  docker-compose.yml                     |   98
-rw-r--r--  logo.webp                              |  bin 0 -> 39410 bytes
-rwxr-xr-x  setup_desktop.sh                       |   13
-rw-r--r--  src/aichat/Dockerfile                  |    7
-rw-r--r--  src/aichat/config.yaml                 |    8
-rw-r--r--  src/llm_provision/Dockerfile           |   12
-rw-r--r--  src/llm_provision/entrypoint.sh        |    4
-rwxr-xr-x  src/llm_provision/init_models.sh       |   17
-rw-r--r--  src/nginx/nginx.conf                   |   61
-rw-r--r--  src/tts/Dockerfile                     |   47
-rw-r--r--  src/tts/download_voices_tts-1.sh       |    8
-rw-r--r--  src/tts/voice_to_speaker.default.yaml  |   36
-rw-r--r--  src/whisper/Dockerfile                 |   13
-rwxr-xr-x  tools/aichat                           |  bin 0 -> 12073152 bytes
-rwxr-xr-x  tools/stt.sh                           |  121
-rwxr-xr-x  tools/tts.sh                           |  115

18 files changed, 659 insertions(+), 0 deletions(-)
diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..4c49bd7
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1 @@
+.env
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..c089330
--- /dev/null
+++ b/README.md
@@ -0,0 +1,98 @@
+# Privacy-First Command-Line AI for Linux
+
+![](logo.webp)
+
+Unlock the power of AI, right from your Linux terminal.
+
+This project delivers a fully local AI environment, running open-source language models directly on your machine.
+
+No cloud. No GAFAM. Just full privacy, control, and the freedom to manipulate commands in your shell.
+
+## How it works
+
+* [Ollama](https://ollama.com/) runs language models on the local machine.
+* [openedai-speech](https://github.com/matatonic/openedai-speech) provides text-to-speech capability.
+* [speaches-ai](https://github.com/speaches-ai/speaches) provides transcription, translation, and speech generation.
+* [nginx](https://nginx.org/en/) adds authentication to the APIs.
+* [aichat](https://github.com/sigoden/aichat) is used as the LLM CLI tool, featuring a Shell Assistant, Chat-REPL, RAG, and AI Tools & Agents.
+
+Everything is free, open source, and automated using Docker Compose and shell scripts.
+
+## Requirements
+
+To run this project efficiently, a modern computer with a recent NVIDIA GPU is required.
+As an example, I achieve good performance with an Intel(R) Core(TM) i7-14700HX, a GeForce RTX 4050, and 32GB of RAM.
+
+You must use Linux and the [NVIDIA Container Toolkit](https://github.com/NVIDIA/nvidia-container-toolkit).
+
+Note that it is probably possible to run the project on other GPUs or on modern MacBooks, but that is outside the scope of this project.
+
+## How to launch the server
+
+Choose the models you wish to use in the docker-compose.yml file and set the API token in the .env file as follows:
+```
+LLM_API_KEY=1234567890
+```
+
+Next, build and start the servers with Docker Compose:
+```bash
+docker compose up --build -d
+```
+
+## How to use
+
+The `setup_desktop.sh` script copies a statically compiled version of [aichat](https://github.com/sigoden/aichat) from a container to your host and configures the tool.
+
+### Aichat essentials
+
+To launch a chatbot that maintains context:
+```bash
+aichat -m ollama:qwen2.5 -s
+```
+
+With a prompt:
+```bash
+aichat -m ollama:qwen2.5 --prompt "I want you to act as an English translator, spelling corrector and improver. I will speak to you in any language and you will detect the language, translate it and answer in the corrected and improved version of my text, in English. I want you to only reply the correction, the improvements and nothing else, do not write explanations."
+```
+
+Pipe a command and transform the result with the LLM:
+```
+ls | aichat -m ollama:qwen2.5 --prompt "transform to json"
+```
+
+See the [AIChat](https://github.com/sigoden/aichat) website for other possible use cases.
+
+### Text To Speech
+
+For text-to-speech, use the `tools/tts.sh` script.
+
+Example:
+```
+./tools/tts.sh -l french -v pierre --play "Aujourd'hui, nous sommes le $(date +%A\ %d\ %B\ %Y)."
+```
+
+### Speech To Text
+
+For the speech-to-text functionality, use `tools/stt.sh`.
+The record action uses PulseAudio to record the computer's audio (for example, a video playing in the browser).
+The transcription action converts an audio file into text.
+
+Example:
+```bash
+./tools/stt.sh record -s alsa_output.pci-0000_00_1f.3-platform-skl_hda_dsp_generic.HiFi__Speaker__sink.monitor
+./tools/stt.sh transcription -f record_20250112_125726.wav -l fr
+```
+
+## How to Use Remotely
+
+The API authentication via nginx allows you to expose the API on the internet and use it remotely.
+By adding a reverse proxy such as Caddy in front of it, you can also add TLS encryption.
+This way, you can securely use this environment remotely.
+
+To use the tool scripts in a remote context, set the TTS_API_HOST and STT_API_HOST environment variables.
+
+Example:
+```
+TTS_API_HOST="https://your-remote-domain" ./tools/tts.sh -l french -v pierre --play "Aujourd'hui, nous sommes le $(date +%A\ %d\ %B\ %Y)."
+STT_API_HOST="https://your-remote-domain" ./tools/stt.sh transcription -f speech_20250112_124805.wav -l fr
+```
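Once the stack is up, a quick sanity check is to query Ollama's model list through the authenticated nginx proxy. This is a minimal sketch reusing the `/api/tags` endpoint and `jq` filter that `init_models.sh` relies on; it assumes the stack is running locally and that `LLM_API_KEY` matches your `.env`:

```bash
# Sanity check: list installed models through the authenticated proxy
# (assumes the stack is up locally and LLM_API_KEY matches .env).
source .env
curl -s http://localhost:11434/api/tags \
  -H "Authorization: Bearer ${LLM_API_KEY}" | jq '.models[].name'
```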
diff --git a/docker-compose.yml b/docker-compose.yml
new file mode 100644
index 0000000..25d4ef3
--- /dev/null
+++ b/docker-compose.yml
@@ -0,0 +1,98 @@
+services:
+  ollama:
+    image: ollama/ollama
+    volumes:
+      - ollama:/root/.ollama
+    restart: always
+    deploy:
+      resources:
+        reservations:
+          devices:
+            - driver: nvidia
+              count: all
+              capabilities: [gpu]
+    healthcheck:
+      test: ollama --version && ollama ps || exit 1
+      interval: 60s
+      retries: 5
+      start_period: 20s
+      timeout: 10s
+  openedai-speech:
+    build:
+      dockerfile: src/tts/Dockerfile
+    environment:
+      - TTS_HOME=voices
+    volumes:
+      - voices:/app/voices
+      - speech-config:/app/config
+    restart: unless-stopped
+    deploy:
+      resources:
+        reservations:
+          devices:
+            - driver: nvidia
+              count: all
+              capabilities: [gpu]
+    healthcheck:
+      test: curl --fail http://localhost:8000 || exit 1
+      interval: 60s
+      retries: 5
+      start_period: 10s
+      timeout: 10s
+  llm_provision:
+    build:
+      dockerfile: src/llm_provision/Dockerfile
+    environment:
+      - MODELS=qwen2.5:latest,qwen2.5-coder:32b,nomic-embed-text:latest
+    restart: "no"
+    depends_on:
+      ollama:
+        condition: service_healthy
+        restart: true
+    links:
+      - ollama
+  aichat-build:
+    build:
+      dockerfile: src/aichat/Dockerfile
+  faster-whisper-server:
+    image: fedirz/faster-whisper-server:latest-cuda
+    environment:
+      - WHISPER__MODEL=Systran/faster-whisper-large-v3
+    volumes:
+      - hf-hub-cache:/home/ubuntu/.cache/huggingface/hub
+    deploy:
+      resources:
+        reservations:
+          devices:
+            - driver: nvidia
+              count: all
+              capabilities: [gpu]
+    healthcheck:
+      test: timeout 10s bash -c ':> /dev/tcp/127.0.0.1/8000' || exit 1
+      interval: 30s
+      timeout: 15s
+      retries: 3
+  nginx:
+    image: nginx
+    volumes:
+      - ./src/nginx/nginx.conf:/etc/nginx/templates/nginx.conf.template
+    environment:
+      - NGINX_ENVSUBST_OUTPUT_DIR=/etc/nginx
+      - API_KEY=${LLM_API_KEY}
+    depends_on:
+      - openedai-speech
+      - faster-whisper-server
+      - ollama
+    links:
+      - ollama
+      - faster-whisper-server
+      - openedai-speech
+    ports:
+      - "11434:11434"
+      - "8000:8000"
+      - "8001:8001"
+volumes:
+  ollama:
+  voices:
+  speech-config:
+  hf-hub-cache:
diff --git a/logo.webp b/logo.webp
new file mode 100644
index 0000000..9b1f516
Binary files /dev/null and b/logo.webp differ
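Before provisioning any models, it is worth confirming that the NVIDIA Container Toolkit actually exposes the GPU inside the containers and that the healthchecks pass. A sketch, assuming the compose stack above is running:

```bash
# Check GPU visibility inside the ollama container and overall service health
# (assumes the stack above is up; nvidia-smi is injected into the container
# by the NVIDIA Container Toolkit).
docker compose exec ollama nvidia-smi
docker compose ps
```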
diff --git a/setup_desktop.sh b/setup_desktop.sh
new file mode 100755
index 0000000..94bf0bd
--- /dev/null
+++ b/setup_desktop.sh
@@ -0,0 +1,13 @@
+#!/usr/bin/env bash
+
+SCRIPT=$(readlink -f "$0")
+SCRIPTPATH=$(dirname "$SCRIPT")
+cd "$SCRIPTPATH" || exit
+
+container_id=$(docker create "aichat-build")
+docker cp "${container_id}:/usr/local/cargo/bin/aichat" "./tools/"
+docker rm "${container_id}"
+
+source .env
+mkdir -p ~/.config/aichat/
+sed "s/__LLM_API_KEY__/${LLM_API_KEY}/" src/aichat/config.yaml > ~/.config/aichat/config.yaml
diff --git a/src/aichat/Dockerfile b/src/aichat/Dockerfile
new file mode 100644
index 0000000..df13f63
--- /dev/null
+++ b/src/aichat/Dockerfile
@@ -0,0 +1,7 @@
+FROM rust:latest
+
+RUN rustup target add x86_64-unknown-linux-musl
+RUN apt update && apt install -y musl-tools musl-dev
+RUN update-ca-certificates
+
+RUN cargo install --target x86_64-unknown-linux-musl aichat
diff --git a/src/aichat/config.yaml b/src/aichat/config.yaml
new file mode 100644
index 0000000..a74af2c
--- /dev/null
+++ b/src/aichat/config.yaml
@@ -0,0 +1,8 @@
+# see https://github.com/sigoden/aichat/blob/main/config.example.yaml
+
+model: ollama
+clients:
+- type: openai-compatible
+  name: ollama
+  api_base: http://localhost:11434/v1
+  api_key: __LLM_API_KEY__
diff --git a/src/llm_provision/Dockerfile b/src/llm_provision/Dockerfile
new file mode 100644
index 0000000..77701fe
--- /dev/null
+++ b/src/llm_provision/Dockerfile
@@ -0,0 +1,12 @@
+FROM debian:bookworm-slim
+
+ENV DEBIAN_FRONTEND=noninteractive
+RUN apt-get update
+RUN apt-get --yes -o Dpkg::Options::="--force-confdef" -o Dpkg::Options::="--force-confnew" install bash curl jq
+
+ADD ./src/llm_provision/init_models.sh /init_models.sh
+ADD ./src/llm_provision/entrypoint.sh /entrypoint.sh
+RUN chmod 755 /entrypoint.sh
+
+ENTRYPOINT ["/entrypoint.sh"]
+#ENTRYPOINT ["tail", "-f", "/dev/null"]  # uncomment to debug
diff --git a/src/llm_provision/entrypoint.sh b/src/llm_provision/entrypoint.sh
new file mode 100644
index 0000000..d0b6e85
--- /dev/null
+++ b/src/llm_provision/entrypoint.sh
@@ -0,0 +1,4 @@
+#!/usr/bin/env bash
+
+echo "pull models into the ollama volume"
+bash /init_models.sh
diff --git a/src/llm_provision/init_models.sh b/src/llm_provision/init_models.sh
new file mode 100755
index 0000000..0afbbd0
--- /dev/null
+++ b/src/llm_provision/init_models.sh
@@ -0,0 +1,17 @@
+#!/usr/bin/env bash
+
+OLLAMA_HOST="http://ollama:11434"
+
+IFS=',' read -r -a models_arr <<< "${MODELS}"
+
+# loop through the requested models and pull any that are missing
+for m in "${models_arr[@]}"
+do
+    if ! curl -s "${OLLAMA_HOST}/api/tags" | jq -r '.models[].name' | grep -q "${m}"
+    then
+        curl -s "${OLLAMA_HOST}/api/pull" -d "{\"model\": \"${m}\"}"
+    else
+        echo "${m} already installed"
+    fi
+done
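The `llm_provision` service only runs when the stack starts, but the same pull endpoint it calls can be reached later through the authenticated proxy. A sketch, where the model name is a placeholder rather than anything this project prescribes:

```bash
# Pull one extra model through the authenticated proxy;
# "llama3.2:latest" is a placeholder, substitute any model from the Ollama library.
source .env
curl -s http://localhost:11434/api/pull \
  -H "Authorization: Bearer ${LLM_API_KEY}" \
  -d '{"model": "llama3.2:latest"}'
```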
diff --git a/src/nginx/nginx.conf b/src/nginx/nginx.conf
new file mode 100644
index 0000000..2dc6d52
--- /dev/null
+++ b/src/nginx/nginx.conf
@@ -0,0 +1,61 @@
+events {}
+http {
+    server_tokens off;
+    client_max_body_size 200m;
+
+    server {
+        listen 11434;
+        set $deny 1;
+        if ($http_authorization = "Bearer $API_KEY") {
+            set $deny 0;
+        }
+        if ($deny) {
+            return 403;
+        }
+        location / {
+            proxy_pass http://ollama:11434;
+            proxy_set_header Host $host;
+            proxy_set_header X-Real-IP $remote_addr;
+            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+            proxy_set_header X-Forwarded-Proto $scheme;
+        }
+    }
+    server {
+        listen 8000;
+        set $deny 1;
+        if ($http_authorization = "Bearer $API_KEY") {
+            set $deny 0;
+        }
+        if ($deny) {
+            return 403;
+        }
+        location / {
+            proxy_pass http://openedai-speech:8000;
+            proxy_set_header Host $host;
+            proxy_set_header X-Real-IP $remote_addr;
+            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+            proxy_set_header X-Forwarded-Proto $scheme;
+        }
+    }
+    server {
+        listen 8001;
+        set $deny 1;
+        if ($http_authorization = "Bearer $API_KEY") {
+            set $deny 0;
+        }
+        if ($deny) {
+            return 403;
+        }
+        location / {
+            proxy_pass http://faster-whisper-server:8000;
+            proxy_set_header Host $host;
+            proxy_set_header X-Real-IP $remote_addr;
+            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+            proxy_set_header X-Forwarded-Proto $scheme;
+            proxy_read_timeout 180;
+            proxy_http_version 1.1;
+            proxy_set_header Upgrade $http_upgrade;
+            proxy_set_header Connection "upgrade";
+        }
+    }
+}
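Each server block returns 403 when the Bearer token is missing, so the gate can be tested by comparing status codes with and without the header. A minimal sketch against the Ollama port, assuming the stack is running locally:

```bash
# Expect 403 without the key and 200 with it (LLM_API_KEY from .env).
source .env
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:11434/api/tags
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:11434/api/tags \
  -H "Authorization: Bearer ${LLM_API_KEY}"
```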
diff --git a/src/tts/Dockerfile b/src/tts/Dockerfile
new file mode 100644
index 0000000..1636bd2
--- /dev/null
+++ b/src/tts/Dockerfile
@@ -0,0 +1,47 @@
+FROM python:3.11-slim
+
+RUN --mount=type=cache,target=/root/.cache/pip pip install -U pip
+
+ARG TARGETPLATFORM
+RUN <<EOF
+apt-get update
+apt-get install --no-install-recommends -y curl ffmpeg git
+if [ "$TARGETPLATFORM" != "linux/amd64" ]; then
+    apt-get install --no-install-recommends -y build-essential
+    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
+fi
+
+# for deepspeed support - image +7.5GB, over the 10GB ghcr.io limit, and no noticeable gain in speed or VRAM usage?
+#curl -O https://developer.download.nvidia.com/compute/cuda/repos/debian11/x86_64/cuda-keyring_1.1-1_all.deb
+#dpkg -i cuda-keyring_1.1-1_all.deb
+#rm cuda-keyring_1.1-1_all.deb
+#apt-get install --no-install-recommends -y libaio-dev build-essential cuda-toolkit
+
+apt-get clean
+rm -rf /var/lib/apt/lists/*
+EOF
+#ENV CUDA_HOME=/usr/local/cuda
+ENV PATH="/root/.cargo/bin:${PATH}"
+
+WORKDIR /app
+RUN mkdir -p voices config
+
+ARG USE_ROCM
+ENV USE_ROCM=${USE_ROCM}
+
+RUN git clone https://github.com/matatonic/openedai-speech.git /tmp/app
+RUN mv /tmp/app/* /app/
+ADD src/tts/download_voices_tts-1.sh /app/download_voices_tts-1.sh
+ADD src/tts/voice_to_speaker.default.yaml /app/voice_to_speaker.default.yaml
+RUN if [ "${USE_ROCM}" = "1" ]; then mv /app/requirements-rocm.txt /app/requirements.txt; fi
+RUN --mount=type=cache,target=/root/.cache/pip pip install -r requirements.txt
+
+ARG PRELOAD_MODEL
+ENV PRELOAD_MODEL=${PRELOAD_MODEL}
+ENV TTS_HOME=voices
+ENV HF_HOME=voices
+ENV COQUI_TOS_AGREED=1
+
+CMD bash startup.sh
diff --git a/src/tts/download_voices_tts-1.sh b/src/tts/download_voices_tts-1.sh
new file mode 100644
index 0000000..f880650
--- /dev/null
+++ b/src/tts/download_voices_tts-1.sh
@@ -0,0 +1,8 @@
+#!/bin/sh
+# cat voice_to_speaker.default.yaml | yq '.tts-1 ' | grep mode | cut -d'/' -f2 | cut -d'.' -f1 | sort -u | xargs
+models=${*:-"en_GB-alba-medium en_GB-northern_english_male-medium en_US-bryce-medium en_US-john-medium en_US-libritts_r-medium en_US-ryan-high fr_FR-siwis-medium fr_FR-tom-medium fr_FR-upmc-medium"}
+piper --update-voices --data-dir voices --download-dir voices --model x 2> /dev/null
+for i in $models ; do
+    [ ! -e "voices/$i.onnx" ] && piper --data-dir voices --download-dir voices --model "$i" < /dev/null > /dev/null
+done
diff --git a/src/tts/voice_to_speaker.default.yaml b/src/tts/voice_to_speaker.default.yaml
new file mode 100644
index 0000000..53acda6
--- /dev/null
+++ b/src/tts/voice_to_speaker.default.yaml
@@ -0,0 +1,36 @@
+# Use https://rhasspy.github.io/piper-samples/ to configure
+tts-1:
+  alloy:
+    model: voices/en_US-libritts_r-medium.onnx
+    speaker: 79
+  siwis:
+    model: voices/fr_FR-siwis-medium.onnx
+    speaker: 0
+  tom:
+    model: voices/fr_FR-tom-medium.onnx
+    speaker: 0
+  pierre:
+    model: voices/fr_FR-upmc-medium.onnx
+    speaker: 1
+  jessica:
+    model: voices/fr_FR-upmc-medium.onnx
+    speaker: 0
+  alba:
+    model: voices/en_GB-alba-medium.onnx
+    speaker: 0
+  jack:
+    model: voices/en_GB-northern_english_male-medium.onnx
+    speaker: 0
+  john:
+    model: voices/en_US-john-medium.onnx
+    speaker: 0
+  bryce:
+    model: voices/en_US-bryce-medium.onnx
+    speaker: 0
+  ryan:
+    model: voices/en_US-ryan-high.onnx
+    speaker: 0
+  echo:
+    model: voices/en_US-libritts_r-medium.onnx
+    speaker: 134
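To offer an additional voice, download its piper model with the same invocation `download_voices_tts-1.sh` uses and add a matching entry under `tts-1`. A sketch; the model name below is an assumption, see https://rhasspy.github.io/piper-samples/ for valid names:

```bash
# Fetch an extra piper voice into the voices directory; "de_DE-thorsten-medium"
# is a placeholder taken from the piper samples page, then map it to a voice
# name in voice_to_speaker.default.yaml.
piper --data-dir voices --download-dir voices --model de_DE-thorsten-medium < /dev/null > /dev/null
```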
diff --git a/src/whisper/Dockerfile b/src/whisper/Dockerfile
new file mode 100644
index 0000000..2909803
--- /dev/null
+++ b/src/whisper/Dockerfile
@@ -0,0 +1,13 @@
+FROM debian:bookworm-slim
+
+RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y \
+    sudo \
+    python3 \
+    python3-distutils \
+    python3-pip \
+    ffmpeg
+
+RUN pip install -U openai-whisper --break-system-packages
+WORKDIR /app
+
+CMD ["whisper"]
diff --git a/tools/aichat b/tools/aichat
new file mode 100755
index 0000000..ff31ede
Binary files /dev/null and b/tools/aichat differ
diff --git a/tools/stt.sh b/tools/stt.sh
new file mode 100755
index 0000000..13a1b5a
--- /dev/null
+++ b/tools/stt.sh
@@ -0,0 +1,121 @@
+#!/bin/bash
+
+# Print usage information
+usage() {
+    echo "Usage: $0 [record|transcription] <options>"
+    echo ""
+    echo "Actions:"
+    echo "  record         Record audio from a selected source"
+    echo "  transcription  Transcribe audio from a .wav file"
+    echo ""
+    echo "Options for 'record':"
+    echo "  -s, --source   Specify the audio source (required)"
+    echo ""
+    echo "Options for 'transcription':"
+    echo "  -f, --file     Specify the audio file to transcribe (required)"
+    echo "  -l, --lang     Specify the audio file language (default: en)"
+    exit 1
+}
+
+if [[ $# -eq 0 ]]; then
+    usage
+fi
+
+# Check for the required environment variable
+if [[ -z "${LLM_API_KEY}" ]]; then
+    echo "The environment variable LLM_API_KEY is not set."
+    echo 'You can use the following command: export $(xargs < ../.env)'
+    exit 1
+fi
+
+ACTION=$1
+shift
+
+host=${STT_API_HOST:-"http://localhost:8001"}
+LANG="en" # Default language
+
+if [ "$ACTION" == "record" ]; then
+    if [ "$#" -eq 0 ]; then
+        echo "Error: Source is required for the record action."
+        echo "Available sources:"
+        pactl list short sources | awk '{print $2}'
+        exit 1
+    fi
+
+    SOURCE=""
+    while [[ "$#" -gt 0 ]]; do
+        case $1 in
+        -s | --source)
+            SOURCE="$2"
+            shift
+            ;;
+        *)
+            echo "Unknown parameter passed: $1"
+            usage
+            ;;
+        esac
+        shift
+    done
+
+    # Validate the provided source
+    if ! pactl list short sources | awk '{print $2}' | grep -q "^$SOURCE$"; then
+        echo "Error: Invalid audio source. Available sources:"
+        pactl list short sources | awk '{print $2}'
+        exit 1
+    fi
+
+    timestamp=$(date +"%Y%m%d_%H%M%S")
+    filename="record_${timestamp}.wav"
+    echo "Start recording to ${filename}; use CTRL+C to terminate."
+    parec -d "${SOURCE}" --file-format=wav "${filename}"
+elif [ "$ACTION" == "transcription" ]; then
+    if [ "$#" -eq 0 ]; then
+        echo "Error: File is required for the transcription action."
+        usage
+    fi
+
+    FILE=""
+    while [[ "$#" -gt 0 ]]; do
+        case $1 in
+        -f | --file)
+            FILE="$2"
+            shift
+            ;;
+        -l | --lang)
+            LANG="$2"
+            shift
+            ;;
+        *)
+            echo "Unknown parameter passed: $1"
+            usage
+            ;;
+        esac
+        shift
+    done
+
+    if [ -z "$FILE" ]; then
+        echo "Error: File is required for the transcription action."
+        usage
+    fi
+
+    # Check that the file exists
+    if [ ! -f "$FILE" ]; then
+        echo "Error: File '$FILE' does not exist."
+        exit 1
+    fi
+
+    # Ensure that curl is available
+    if ! command -v curl &>/dev/null; then
+        echo "curl is required for transcription but could not be found on your system. Please install it."
+        exit 1
+    fi
+
+    # Transcribe the specified file
+    echo "Transcribing file $FILE, be patient"
+    curl "${host}/v1/audio/transcriptions" -H "Authorization: Bearer ${LLM_API_KEY}" \
+        -F "file=@${FILE}" \
+        -F "stream=true" \
+        -F "language=${LANG}"
+else
+    usage
+fi
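Since the transcription is written to stdout, it can be piped straight into aichat. A sketch reusing the README's example file name; note that the "Transcribing file..." progress line is printed to stdout and so reaches the pipe as well:

```bash
# Transcribe a recording, then summarize it with the local LLM
# (file name taken from the README example; any .wav works).
./tools/stt.sh transcription -f record_20250112_125726.wav -l fr \
  | aichat -m ollama:qwen2.5 --prompt "Summarize this transcript"
```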
diff --git a/tools/tts.sh b/tools/tts.sh
new file mode 100755
index 0000000..2065a3d
--- /dev/null
+++ b/tools/tts.sh
@@ -0,0 +1,115 @@
+#!/bin/bash
+
+# Display usage information
+usage() {
+    echo "Usage: $0 -l <lang> -v <voice> -s <speed> [--play] \"<text>\""
+    echo "  -l|--lang  : Specify the language (french|english)"
+    echo "  -v|--voice : Specify the voice"
+    echo "  -s|--speed : Specify the speed (0.0 to 3.0, default is 1.0)"
+    echo "  --play     : Play the generated audio file using ffplay"
+    echo "  <text>     : The text to synthesize"
+    exit 1
+}
+
+# Check whether a value is a valid float between 0.0 and 3.0
+is_valid_float() {
+    local value=$1
+    # Check if the value is a valid number
+    if [[ $value =~ ^-?[0-9]+(\.[0-9]+)?$ ]]; then
+        # Check if the value is between 0 and 3.0
+        if (($(echo "$value >= 0" | bc -l))) && (($(echo "$value <= 3.0" | bc -l))); then
+            return 0
+        fi
+    fi
+    return 1
+}
+
+# Check for the required environment variable
+if [[ -z "${LLM_API_KEY}" ]]; then
+    echo "The environment variable LLM_API_KEY is not set."
+    echo 'You can use the following command: export $(xargs < ../.env)'
+    exit 1
+fi
+
+# Default values
+speed=1.0
+host=${TTS_API_HOST:-"http://localhost:8000"}
+play_audio=false
+
+# Parse command line arguments
+while [[ $# -gt 0 ]]; do
+    case $1 in
+    -l | --lang)
+        lang="$2"
+        shift 2
+        ;;
+    -v | --voice)
+        voice="$2"
+        shift 2
+        ;;
+    -s | --speed)
+        speed="$2"
+        shift 2
+        ;;
+    --play)
+        play_audio=true
+        shift 1
+        ;;
+    -h | --help)
+        usage
+        ;;
+    -*)
+        echo "Unknown option $1"
+        usage
+        ;;
+    *)
+        break
+        ;;
+    esac
+done
+
+# Grab the text after the options
+if [[ $# -gt 0 ]]; then
+    text="$*"
+else
+    echo "Error: Text to synthesize is required."
+    usage
+fi
+
+# Generate a timestamp
+timestamp=$(date +"%Y%m%d_%H%M%S")
+
+# Construct the filename with the current date and time
+filename="speech_${timestamp}.wav"
+
+# Validate language and voice options
+if [[ -z "$lang" || -z "$voice" ]]; then
+    echo "Error: Language (-l) and voice (-v) options are required."
+    usage
+fi
+
+# Check that the speed is valid
+if ! is_valid_float "$speed"; then
+    echo "Error: Speed must be a float between 0.0 and 3.0."
+    exit 1
+fi
+
+# Fetch the audio file from the API
+http_status_code=$(curl -s "${host}/v1/audio/speech" -o "${filename}" -w "%{http_code}" -H "Authorization: Bearer ${LLM_API_KEY}" -H "Content-Type: application/json" -d "{\"model\": \"tts-1\",\"input\": \"${text}\",\"voice\": \"${voice}\",\"response_format\": \"wav\",\"speed\": ${speed}}")
+
+# Check the response code for a successful HTTP request
+if [[ "$http_status_code" -ne 200 ]]; then
+    echo "Error: Failed to fetch audio file. Received HTTP status code: $http_status_code"
+    exit 1
+fi
+
+# Optionally play the generated WAV file with ffplay
+if [ "$play_audio" = true ]; then
+    if ! command -v ffplay &>/dev/null; then
+        echo "Error: ffplay is not installed. Please install ffmpeg, which provides ffplay."
+        exit 1
+    fi
+    ffplay "${filename}" -nodisp -nostats -hide_banner -autoexit -v quiet
+fi
+
+echo "Audio file '$filename' generated successfully."
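The tools also compose in the other direction: generate text with the local model and speak it. A sketch using only names that appear in this commit (the "alba" voice is defined in voice_to_speaker.default.yaml):

```bash
# Have the local LLM write a sentence, then speak it with the "alba" voice.
./tools/tts.sh -l english -v alba --play "$(aichat -m ollama:qwen2.5 'Write a one-sentence greeting.')"
```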