reese159/LINE-TTS

LINE-TTS

Local Interactive Narration Environment for Text-To-Speech


Introduction

LINE-TTS is a text-to-speech application written in Python. It lets the user:

  • Create their own narration voices by blending open-source or user-uploaded voice tensors
  • Enter text or upload a PDF
  • Generate a narrated summary of the provided text using an OpenAI model, with options for which model to use and the maximum number of tokens allowed in the summary
  • Generate a full narration of the provided text
  • Listen to or download either type of narration
  • Download and save generated voice tensors for later use

This project has both an online and a local version. The online version exists primarily as a showcase of the application's capabilities and places hard limits on the length of narrations that can be generated, to avoid exceeding Streamlit's resource limits. For unrestricted narration generation and maximum user privacy, the local version is recommended.

The browser version of this application can be found at https://linetts.streamlit.app/

Aside from the character limit and OpenAI API key configuration (should the user wish to generate text summaries in the local version), the functionality of both versions of the application should be identical.

Below is a typical user workflow:

Step 1, select/upload voices:

User_Flow_Recording_1-Voice_Selection

Step 2, set voice weights:

User_Flow_Recording_2-Weight_Selection

Note: the weights should sum to 1.0 to avoid warnings.
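
As a rough illustration of what the weights in step 2 mean, the sketch below blends voices as an element-wise weighted sum after rescaling the weights to sum to 1.0. This is only an assumption about how blending could work, not LINE-TTS's actual implementation. The function names `normalize_weights` and `blend_voices` are hypothetical, and plain Python lists stand in for the real voice tensors.

```python
def normalize_weights(weights):
    """Scale non-negative weights so that they sum to 1.0."""
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return [w / total for w in weights]


def blend_voices(voices, weights):
    """Return the element-wise weighted sum of equally sized voice vectors.

    Hypothetical helper: each voice is a flat list of floats standing in
    for a voice tensor.
    """
    if len(voices) != len(weights):
        raise ValueError("need exactly one weight per voice")
    if len({len(v) for v in voices}) != 1:
        raise ValueError("all voice vectors must have the same length")
    weights = normalize_weights(weights)
    size = len(voices[0])
    return [sum(w * v[i] for v, w in zip(voices, weights)) for i in range(size)]


# Two toy "voices" blended 3:1; the weights are rescaled to 0.75 and 0.25.
blended = blend_voices([[1.0, 0.0], [0.0, 1.0]], [3, 1])
print(blended)
```

Normalizing up front means weights such as 3 and 1 behave the same as 0.75 and 0.25. The application itself warns instead when the weights do not sum to 1.0.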

Step 3, provide text input for summary/narration using the text box provided or by uploading a valid PDF:

User_Flow_Recording_3-text_input

User_Flow_Recording_3-pdf_input

Step 4, generate summary or full narration:

User_Flow_Recording_4-text_summary_generation

User_Flow_Recording_4-narration_generation

Note: after either option in step 4, the user can download both the narration and the blended voice itself for future use.

Not all steps are necessary, as the user may only want to generate a summary or a full narration. The summary in particular is strongly recommended for the browser version of the application.

Local Setup

This project was created using Python 3.12; please install it to run the application locally.

All requirements are listed in "requirements.txt" in the root of this repository and can be installed into a virtual environment by running "pip install -r requirements.txt" in a terminal with the venv active.

After installation, the user can run the local version of this application by opening a terminal in the root directory of the project with the venv activated, and entering the following command:

streamlit run local_streamlit_narrator.py

Optional

For text summarization to function locally, the user will need an API key from OpenAI, which can be purchased on the OpenAI Platform. After obtaining the API key, the user will need to create a "secrets.toml" file in the root directory of the project, containing the key in the following format: OPENAI_API_KEY="your_api_key_here"

More technical documentation on the process can be found in the LINE-TTS Developer Guide.


Credits

All open-weight models provided can be found on Hugging Face under Kokoro-82M, created by hexgrad and released under the Apache 2.0 License.

Direct download for voice-tensor files provided can be found under voices in the Kokoro-82M repository.

Credit to OpenAI for the provided models used in text summarization.
