Motivation: Many AI voice assistants have an unpleasant voice. Although this may be a matter of personal preference, I aim to create a more human-sounding assistant that lets you plug in cute anime character voices and VTuber voices.
Solution diagram:
This guide may not be the most detailed. It will need to be improved.
Installation procedure for Windows
- Install git https://git-scm.com/downloads
- Install the CUDA Toolkit. Choose only a CUDA version that PyTorch supports; see the supported versions at https://pytorch.org/get-started/locally/
- Install miniconda https://docs.conda.io/en/latest/miniconda.html
- Open miniconda console
- Create new conda environment
conda create --name llama_cute_voice_assistant python=3.11
- Activate conda environment
conda activate llama_cute_voice_assistant
- Clone project
git clone https://github.com/atomlayer/llama_cute_voice_assistant.git
- Go to project directory
cd llama_cute_voice_assistant
- Install pytorch
- Go to https://pytorch.org/get-started/locally/
- Generate the PyTorch install command for your system (it will look something like this: pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118 )
- Execute the command
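Once the install command has finished, a quick check like the one below can confirm that PyTorch imports and can see your GPU. This is a minimal sketch and not part of the project; the helper name describe_torch is made up for illustration.

```python
def describe_torch() -> str:
    """Return a one-line summary of the local PyTorch install."""
    try:
        # Imported lazily so the script still runs if PyTorch is missing.
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"

    if torch.cuda.is_available():
        # torch.version.cuda is the CUDA version this PyTorch build targets.
        gpu_name = torch.cuda.get_device_name(0)
        return f"PyTorch {torch.__version__}, CUDA {torch.version.cuda}, GPU: {gpu_name}"
    return f"PyTorch {torch.__version__}, CUDA not available (CPU only)"


if __name__ == "__main__":
    print(describe_torch())
```

If the output reports "CUDA not available", the usual suspects are a CPU-only PyTorch build or a CUDA Toolkit version that does not match the install command.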
- Install the libraries
pip install SpeechRecognition==3.10.0
pip install pyttsx3==2.90
pip install soundfile==0.12.1
pip install simpleaudio==1.0.4
pip install pygame==2.5.1
conda install PyAudio
pip install openai-whisper --no-cache-dir
pip install omegaconf==2.3.0
pip install git+https://github.com/openai/whisper.git
conda install -c conda-forge ffmpeg
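To see whether everything above landed in the active conda environment, you can run a small check like this. It is only a sketch: the import names are assumed from the package names (they sometimes differ, for example SpeechRecognition is imported as speech_recognition), and missing_modules is a hypothetical helper, not something the project ships.

```python
import importlib.util
import shutil

# Import names assumed for the packages installed above.
REQUIRED_MODULES = [
    "speech_recognition",  # SpeechRecognition
    "pyttsx3",
    "soundfile",
    "simpleaudio",
    "pygame",
    "pyaudio",             # PyAudio
    "whisper",             # openai-whisper
    "omegaconf",
]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing modules:", ", ".join(missing) if missing else "none")
    # ffmpeg is a command-line tool, so look for it on PATH instead.
    print("ffmpeg on PATH:", shutil.which("ffmpeg") is not None)
```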
- Install the oobabooga Text Generation web UI: https://github.com/oobabooga/text-generation-webui#one-click-installers
- Start the oobabooga Text Generation web UI with the --api flag.
- On the Model tab: download and run your favorite AI model.
- On the Chat settings > Character tab: set your character name and description.
Detailed instructions: https://www.youtube.com/watch?v=_JXbvSTGPoo
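Before moving on, it can help to confirm that the web UI's API is actually listening. The sketch below only tests whether a TCP port accepts connections. The host and port (127.0.0.1:5000) are assumptions, so use the address the web UI prints in its console when it starts with the API enabled; port_is_open is a hypothetical helper.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    host, port = "127.0.0.1", 5000  # assumed address; adjust to your setup
    state = "reachable" if port_is_open(host, port) else "not reachable"
    print(f"API at {host}:{port} is {state}")
```

A reachable port only shows that a server is running there; it does not verify that a model is loaded.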
- Download https://huggingface.co/wok000/vcclient000/blob/main/MMVCServerSIO_win_onnxgpu-cuda_v.1.5.3.11.zip
- Unpack the archive
- Run start_http.bat
- Join the AI Hub Discord: https://discord.gg/aihub
- Go to the search-models channel
- Find and download the model you like
- Click the edit button in the Realtime Voice Changer Client
- Upload the model to a free slot
- Adjust the TUNE parameter until the voice sounds best.
- Download and install VB-CABLE Virtual Audio Device https://vb-audio.com/Cable/
- Open Realtime Voice Changer
- Set the audio input to: Cable Output (VB-Audio Virtual Cable)
- Press start button
- Replace the values of oobabooga_api_name and wake_words.
oobabooga_api_name is the name of one of your characters in the oobabooga Text Generation web UI (Parameters > Character tab)
oobabooga_api_name = "Lisa"
wake_words = ["lisa"]
- Open conda console in the project folder
- Run the command:
python voice_chat.py
- Say the wake word and the command for your assistant.
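How the project matches wake words internally is not described in this guide. As a rough illustration of the idea, the hypothetical sketch below takes a transcribed phrase and a wake_words list like the one configured above, and returns whatever was said after the wake word. The function name extract_command and the matching rules (lowercase, single-word wake words, punctuation ignored) are assumptions, not the project's actual code.

```python
def extract_command(recognized_text, wake_words):
    """Return the text after the first wake word, or None if none was heard.

    The returned command is lowercased because matching is case-insensitive.
    """
    words = recognized_text.lower().split()
    for i, word in enumerate(words):
        # Ignore punctuation that speech-to-text may attach ("lisa," -> "lisa").
        if word.strip(".,!?") in wake_words:
            return " ".join(words[i + 1:])
    return None


if __name__ == "__main__":
    wake_words = ["lisa"]
    print(extract_command("Hey Lisa, what time is it?", wake_words))
    print(extract_command("Just talking to myself", wake_words))
```

Returning None when no wake word is present lets the caller ignore background speech instead of sending it to the language model.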