Bachelor thesis project providing an interactive web application. Its core functionality is uploading an audio file containing human speech and displaying the corresponding lip movements on the provided avatar, based on the output of an LSTM neural network model running on a remote server.
First, prepare the frontend environment using the command:
npm install
Second, prepare the backend virtual environment using the commands for your OS.
On Windows:
cd api
python3 -m venv venv
.\venv\Scripts\activate
pip install flask python-dotenv
pip install -r requirements.txt
On macOS or Linux:
cd api
python3 -m venv venv
source venv/bin/activate
pip install flask python-dotenv
pip install -r requirements.txt
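If the backend later fails with missing-module errors, the virtual environment may not have been activated before running pip. The following is a minimal sketch, not part of the project, that reports whether the current Python interpreter runs inside a virtual environment. The function name in_virtualenv is hypothetical and chosen only for illustration.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

Run it with the same python3 command used for the setup. If it reports False, repeat the activate step for your OS.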
If you followed the Setup guide, run
python api.py
to start the backend on port 5001.
Alternatively, run one of the following, depending on your OS:
yarn start-api-win
or
yarn start-api-mac
in the project source.
To start the frontend on port 3000, run:
npm start
in the project source.
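To confirm that both servers are up, it can help to check the two ports mentioned above (5001 for the backend, 3000 for the frontend). The sketch below is an illustration under the assumption that both run on localhost. The helper name port_is_open is hypothetical and not part of the project.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError (e.g. connection refused, timeout)
        # when nothing accepts connections on the given port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Ports taken from the instructions above.
    for name, port in (("backend", 5001), ("frontend", 3000)):
        state = "listening" if port_is_open("localhost", port) else "not reachable"
        print(f"{name} (port {port}): {state}")
```

A port reported as not reachable usually means the corresponding start command has not finished or has exited with an error.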
Open http://localhost:3000 to view the project in the browser.
- Press the icon in the top-left corner to display the menu.
- Press Choose file to upload a WAV or MP3 audio file containing speech, or press Record to record speech in real time.
- Press Upload to send the speech recording to the server.
- While the model performs its computations on the server, a loading spinner is displayed. Wait until it disappears.
- The player options in the bottom-left corner handle the animation. Change the intensity of the avatar's expression with the slider.
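Since the upload step accepts only WAV or MP3 files, a quick check before uploading can save a round trip to the server. The following is a minimal sketch that looks at the file extension only. It is an assumption for illustration, not the validation the application itself performs, and the names SUPPORTED_EXTENSIONS and is_supported_audio are hypothetical.

```python
from pathlib import Path

# Extensions taken from the usage notes above (WAV and MP3).
SUPPORTED_EXTENSIONS = {".wav", ".mp3"}


def is_supported_audio(filename: str) -> bool:
    """Return True if the file name ends in a supported audio extension."""
    # Compare case-insensitively so "Speech.WAV" is accepted as well.
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for name in ("speech.wav", "talk.MP3", "notes.txt"):
        print(name, is_supported_audio(name))
```

An extension check does not guarantee that the file contents are valid audio, so the server may still reject a mislabeled file.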
demo.mp4
You can access the web application through the link: https://facialanimation.page/. Note: the hosted version is currently not supported.
Use the example files in the audio_files folder.
See the model preparation in the ml_model_folder.
Małgorzata Nowicka and Filip Zawadka