---
title: ttsdoc
emoji: 🌖
colorFrom: yellow
colorTo: gray
sdk: gradio
sdk_version: 4.41.0
app_file: app.py
pinned: false
license: apache-2.0
---

# ttsdoc 🌖

ttsdoc is a Text-to-Speech (TTS) application that can read your PDF documents aloud. It uses the Parler TTS Mini v1 model to generate high-quality audio from text inputs, including uploaded PDF files.

## Features

- 📄 Support for PDF, TXT, and DOCX file uploads
- ✍️ Direct text input option
- 🗣️ Customizable voice descriptions
- ⏱️ Adjustable maximum audio duration
- 🚀 GPU-accelerated audio generation

## How to Use

1. Upload a PDF, TXT, or DOCX file or enter text directly.
2. Customize the voice description if desired.
3. Adjust the maximum audio duration.
4. Click "Generate Audio" to create the TTS output.

## Tips for Best Results

- For longer texts, the generator will create audio up to the specified maximum duration.
- Experiment with different voice descriptions to achieve the desired output.
- Use punctuation to control pacing and intonation in the generated speech.
- For optimal quality, try to keep individual sentences or paragraphs concise.

## Technical Details

- This demo uses the Parler TTS Mini v1 model.
- Audio generation is GPU-accelerated for faster processing.
- Maximum file size for uploads: 5 MB

## License

This project is licensed under the Apache 2.0 License.

---

Powered by [Gradio](https://gradio.app) and [Hugging Face](https://huggingface.co)
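
The upload limits and the "keep sentences concise" tip described in this README can be illustrated with a short sketch. This is not the code in `app.py`; the names `ALLOWED_EXTENSIONS`, `MAX_UPLOAD_BYTES`, `check_upload`, and `split_sentences` are hypothetical, and reading the 5 MB limit as 5 × 1024 × 1024 bytes is an assumption. It shows one way input could be pre-checked and broken into sentence-sized pieces before being passed to a TTS model.

```python
import re
from pathlib import Path

# Values taken from this README; the names themselves are hypothetical.
ALLOWED_EXTENSIONS = {".pdf", ".txt", ".docx"}
MAX_UPLOAD_BYTES = 5 * 1024 * 1024  # assumption: the 5 MB limit means 5 MiB


def check_upload(filename: str, size_bytes: int) -> None:
    """Raise ValueError if the file type or size is outside the documented limits."""
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        raise ValueError(f"Unsupported file type: {suffix or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        raise ValueError(f"File is larger than {MAX_UPLOAD_BYTES} bytes")


def split_sentences(text: str) -> list[str]:
    """Split after '.', '!' or '?' followed by whitespace (a naive heuristic)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]
```

For example, `check_upload("report.pdf", 1024)` passes silently, while a `.exe` file or a file over the size limit raises `ValueError`. The sentence splitter is deliberately simple and will mis-split abbreviations such as "e.g."; it is only meant to show the idea of feeding the model short units of text.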