HarmonyOS App Development - Customizable Deletable List Popup

This application implements a custom deletable list popup, supporting task addition, deletion, and confirmation. The implementation consists of: 1. **Entity class**: the `Intention` class defines task items. 2. **Data source class** (`IntentionDataSource`): manages data operations for the task list, including CRUD operations and notifying listeners of updates. 3. **Custom popup component** (`AddIntentionDialog`): displays the current task list and provides delete and confirm buttons.

Read More
HarmonyOS Application Development - Imitating WeChat Chat Message List

This example demonstrates how to create a chat application interface similar to WeChat using ArkTS. The page structure includes a scrollable message list and a button to dynamically add new messages. The core code is as follows: 1. The `Msg` class defines the message type (sent or received). 2. The `MsgDataSource` class implements the data source interface, manages the message list, and provides add/delete operations. 3. The page uses the `List` component to display the message list, with `LazyForEach` to dynamically load new messages as the user scrolls.

Read More
HarmonyOS Application Development - Sending POST Request and Obtaining Result

This code sends data to the server via a POST request and parses the JSON response. The core functionality includes: 1. Using the `http.createHttp().request()` method to send an asynchronous POST request. 2. Setting the request headers and the data to be sent. 3. Obtaining the response result and parsing it as JSON. 4. Extracting the useful fields from the parsed JSON and updating the interface text through state variables. The code demonstrates how to implement HTTP requests in a HarmonyOS application.

Read More
HarmonyOS Application Development - Playing Local Audio Files

This document introduces the implementation of audio playback functionality on HarmonyOS using the AVPlayer audio and video player. The main steps include: 1. Creating an `AVPlayer` instance and registering callback functions to handle state changes and errors; 2. Obtaining the local audio file path, opening the audio file through file system operations to get the file descriptor, and setting it to `AVPlayer` to trigger resource initialization; 3. Implementing state machine transition logic, from resource initialization to playback completion. This code snippet demonstrates how to implement audio playback using the ArkTS language under the Stage model.

Read More
HarmonyOS Application Development - Requesting Voice Synthesis Service to Obtain Audio File

This document describes a text-to-speech service integration on HarmonyOS, which uploads text data and requests the server to return audio data. Key steps include creating the HTTP request, setting the request headers and data body, processing the response data, and saving it to a local file. The code example demonstrates how to integrate this functionality in an Ability, specifically downloading and saving a .wav voice file after the user inputs text. Note that the service response type must be `application/octet-stream` to correctly obtain the audio stream, and this service is only applicable to…

Read More
Easily Recognize Speech in Hours-Long Audio and Video Files

This article introduces how to build a long-speech recognition service capable of processing audio or video files that last tens of minutes or even several hours. First, the project folder is uploaded to the server, and commands for compilation, permission modification, and starting the Docker container are executed to deploy the service. Once the service is confirmed to be available, it can be used through either the WebSocket interface or the HTTP service. The HTTP service provides a web interface that supports uploading or recording audio and video in multiple formats for recognition, and returns text results containing the start and end timestamps of each sentence. This service simplifies the long-audio recognition process and improves user…
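
For orientation, a minimal HTTP client for such a service might look like the sketch below; the host, port, endpoint path, and form-field name are assumptions, not the deployed service's documented interface.

```python
# Hypothetical client for the long-audio HTTP service described above.
# The URL and the form-field name are assumptions -- check the deployed
# service's documentation for the real values.
import requests

SERVICE_URL = "http://127.0.0.1:10095/recognition"  # assumed address

def transcribe_file(path: str) -> dict:
    """Upload an audio/video file and return the JSON result
    (expected to contain sentences with start/end timestamps)."""
    with open(path, "rb") as f:
        resp = requests.post(SERVICE_URL, files={"audio": f}, timeout=3600)
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    print(transcribe_file("meeting_recording.mp4"))
```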

Read More
Real-time Command Wake-up

This article introduces the development and usage of a real-time command wake-up program, covering environment installation, command wake-up, and model fine-tuning. The project runs on Anaconda 3 and Python 3.11, with dependencies on PyTorch 2.1.0 and CUDA 12.1. Users can customize the recording time and length by adjusting the `sec_time` and `last_len` parameters, and add commands in `instruct.txt` for personalized settings. The program can be executed via `infer_pytorch.py` or `infer_on…`
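
As a reference for the recording side, a fixed-length capture with pyaudio could look like this sketch; `sec_time` mirrors the parameter named above, while the sample rate and chunk size are illustrative assumptions.

```python
# Illustrative recording loop: capture `sec_time` seconds of 16 kHz mono audio.
# The sample rate and chunk size are assumptions, not the project's exact values.
import wave
import pyaudio

def record(sec_time: float = 3.0, out_path: str = "record.wav",
           rate: int = 16000, chunk: int = 1024) -> str:
    pa = pyaudio.PyAudio()
    stream = pa.open(format=pyaudio.paInt16, channels=1, rate=rate,
                     input=True, frames_per_buffer=chunk)
    frames = [stream.read(chunk) for _ in range(int(rate / chunk * sec_time))]
    stream.stop_stream()
    stream.close()
    pa.terminate()
    with wave.open(out_path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)          # 16-bit samples
        wf.setframerate(rate)
        wf.writeframes(b"".join(frames))
    return out_path

if __name__ == "__main__":
    print(record(sec_time=3.0))
```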

Read More
Tank Battle Controlled by Voice Commands

This article introduces the development of a program for controlling the Tank Battle game through voice commands, including environment setup, game startup, and command-model fine-tuning. First, the project is developed with Anaconda 3, Windows 11, Python 3.11, and the corresponding libraries. Users can adjust parameters in `main.py` such as recording time and data length, add new commands in `instruct.txt`, and write processing functions to start the game. Secondly, `record_data.py` is run to record command audio and generate training…
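
To make the `instruct.txt` idea concrete, a recognized phrase can be matched against the command list and dispatched to a processing function; the handler name and the matching rule below are assumptions for illustration only.

```python
# Illustrative dispatcher: map commands listed in instruct.txt to handler functions.
# The handler name and the substring-matching rule are assumptions for demonstration.
def start_game():
    print("launching Tank Battle ...")

HANDLERS = {"开始游戏": start_game}   # e.g. the "start game" command

def load_commands(path: str = "instruct.txt") -> list[str]:
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]

def dispatch(recognized_text: str, commands: list[str]) -> None:
    for cmd in commands:
        if cmd in recognized_text and cmd in HANDLERS:
            HANDLERS[cmd]()          # run the processing function for this command
            return
    print("no matching command for:", recognized_text)

if __name__ == "__main__":
    dispatch("开始游戏", ["开始游戏"])   # would normally come from load_commands()
```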

Read More
Run Large Language Model Service with One Click and Build a Chat Application

This article introduces how to build a local large language model chat service based on the Qwen-7B-Int4 model. First, install the GPU version of PyTorch and the other dependency libraries. Then execute `server.py` in the terminal to start the service. The service supports Windows and Linux and runs smoothly with modest VRAM (an 8 GB graphics card is enough). In addition, Android application source code is provided: modify the service address and open `AndroidClient` with Android Studio…
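
A minimal Python client for such a locally started service might look like this; the URL, route, and JSON fields are assumptions and should be adjusted to the actual interface exposed by `server.py`.

```python
# Minimal chat client for a locally started server.py.
# The URL, route, and JSON field names are assumptions; adjust them to
# the actual interface exposed by the service.
import requests

API_URL = "http://127.0.0.1:5000/chat"   # assumed service address

def chat(prompt: str) -> str:
    resp = requests.post(API_URL, json={"prompt": prompt}, timeout=120)
    resp.raise_for_status()
    return resp.json().get("response", "")

if __name__ == "__main__":
    print(chat("用一句话介绍一下你自己"))
```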

Read More
Easily and Quickly Set Up a Local Speech Synthesis Service

This article introduces a method to quickly set up a local speech synthesis service using the VITS model architecture. First, you need to install the PyTorch environment and related dependency libraries. To start the service, simply run the `server.py` program. Additionally, the source code for an Android application is provided, which requires modifying the server address to connect to your local service. At the end of the article, a QR code is provided to join a knowledge planet and obtain the complete source code. The entire process is simple and efficient, and the service can run without an internet connection.
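
For the client side, sending text to the local service and saving the returned audio could be sketched as follows; the URL and form field are assumptions rather than the project's documented API.

```python
# Minimal synthesis client: send text to the local TTS service and save the
# returned audio. The URL and field names are assumptions, not the project's API.
import requests

API_URL = "http://127.0.0.1:5000/tts"    # assumed local service address

def synthesize(text: str, out_path: str = "output.wav") -> str:
    resp = requests.post(API_URL, data={"text": text}, timeout=60)
    resp.raise_for_status()
    with open(out_path, "wb") as f:
        f.write(resp.content)            # service returns raw audio bytes
    return out_path

if __name__ == "__main__":
    print(synthesize("今天天气真不错"))
```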

Read More
Real-time Speech Recognition Service with Remarkably High Recognition Accuracy

This article introduces the installation, configuration, and application deployment of the FunASR speech recognition framework. First, PyTorch and the related dependency libraries need to be installed. For the CPU version this is done with `conda install pytorch torchvision torchaudio cpuonly -c pytorch`; for the GPU version, use `conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia`.
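
Once the environment is in place, a basic offline transcription through FunASR's `AutoModel` interface looks roughly like the sketch below; the model names follow FunASR's commonly published defaults and may differ from the article's exact configuration.

```python
# Basic offline recognition with FunASR's AutoModel interface.
# Model names are FunASR's commonly published defaults, used here as an example.
from funasr import AutoModel

model = AutoModel(model="paraformer-zh",   # acoustic model
                  vad_model="fsmn-vad",    # voice activity detection
                  punc_model="ct-punc")    # punctuation restoration

result = model.generate(input="test.wav")
print(result[0]["text"])
```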

Read More
FunASR Speech Recognition GUI Application

This article introduces a speech recognition GUI application developed with FunASR, which supports recognition of local audio and video files as well as recording recognition. The application covers short audio recognition, long audio recognition (with and without timestamps), and audio file playback. The environment requires dependencies such as PyTorch (CPU or GPU), FFmpeg, and pyaudio. To use the application, execute `main.py`. The interface provides four options: short speech recognition, long speech recognition, recording recognition, and playback. Long speech recognition offers two models: one that produces concatenated output and another that outputs explicit timestamps.
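
Because the application also accepts video files and depends on FFmpeg, the audio track is typically extracted first; a minimal sketch of that step (paths and sample rate are illustrative assumptions):

```python
# Extract a 16 kHz mono WAV track from a video file with FFmpeg before recognition.
# File paths and sample rate are illustrative assumptions.
import subprocess

def extract_audio(video_path: str, wav_path: str = "extracted.wav") -> str:
    subprocess.run(
        ["ffmpeg", "-y", "-i", video_path, "-vn",
         "-ac", "1", "-ar", "16000", wav_path],
        check=True,
    )
    return wav_path

if __name__ == "__main__":
    print(extract_audio("demo.mp4"))
```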

Read More
Voiceprint Recognition System Implemented Based on PyTorch

This project provides a voiceprint recognition implementation based on PaddlePaddle, mainly using the EcapaTDNN model, and integrates speech recognition and voiceprint recognition functions. The summary covers the project structure, its functions, and how to use them, beginning with the directory layout:
```
VoiceprintRecognition-PaddlePaddle/
├── docs/          # Documentation
│   └── README.md  # Project description document
```

Read More
Voiceprint Recognition System Based on PaddlePaddle

This project demonstrates how to use PaddlePaddle for speaker recognition (voiceprint recognition), covering the complete workflow from data preparation and model training to practical application. The project has a clear structure and detailed code comments, making it suitable for learning and reference. Key points: 1. **Environment configuration**: ensure the necessary dependency libraries are installed; if using the TensorFlow or PyTorch version, configure the environment according to the corresponding tutorial. 2. **Data preparation**: the `data`…

Read More
Fine-tuning Whisper Speech Recognition Model and Accelerating Inference

This project deploys a fine-tuned Whisper model to Windows desktop applications, Android APKs, and web platforms to achieve speech-to-text functionality. The main steps begin with model format conversion: 1. Clone the Whisper native code repository: `git clone https://git…`
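
As a reference point for the speech-to-text step itself (not the project's converted, fine-tuned pipeline), the stock openai-whisper package can transcribe a file in a few lines; the model size and file name are placeholders.

```python
# Reference transcription with the openai-whisper package.
# This is not the project's converted/fine-tuned pipeline -- just the stock API.
import whisper

model = whisper.load_model("base")               # model size is a placeholder
result = model.transcribe("audio.wav", language="zh")
print(result["text"])
```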

Read More
Segmenting Long Speech into Multiple Short Segments Using Voice Activity Detection (VAD)

This article introduces YeAudio, a voice activity detection (VAD) tool implemented with deep learning. The library is installed with `python -m pip install yeaudio -i https://pypi.tuna.tsinghua.edu.cn/simple -U`, and the following code snippet can be used for speech segmentation:
```python
from yeaudio.audio import AudioSegment
audio_seg…
```
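
To show the VAD idea in runnable form, here is a stand-in sketch using the `webrtcvad` package rather than YeAudio's own API; the 16 kHz mono 16-bit PCM input and 30 ms frame size are requirements of webrtcvad.

```python
# Stand-in VAD sketch using webrtcvad (not YeAudio's own API):
# mark which 30 ms frames of a 16 kHz mono 16-bit PCM file contain speech.
import wave
import webrtcvad

def speech_frames(wav_path: str, frame_ms: int = 30) -> list[bool]:
    with wave.open(wav_path, "rb") as wf:
        assert wf.getframerate() == 16000 and wf.getnchannels() == 1
        pcm = wf.readframes(wf.getnframes())
    vad = webrtcvad.Vad(2)                            # aggressiveness 0-3
    frame_bytes = int(16000 * frame_ms / 1000) * 2    # 2 bytes per 16-bit sample
    flags = []
    for i in range(0, len(pcm) - frame_bytes + 1, frame_bytes):
        flags.append(vad.is_speech(pcm[i:i + frame_bytes], 16000))
    return flags
```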

Read More
Training a Chinese Punctuation Model Based on PaddlePaddle

This project provides a complete process to train and use a model for adding punctuation marks to Chinese text. Below is a summary of the entire process: 1. **Environment Preparation**: - Ensure necessary libraries are installed, such as `paddlepaddle-gpu` and `PaddleNLP`. - Configure the training dataset. 2. **Data Processing and Preprocessing**: - Tokenize the input text and label the punctuation marks. - Create splits for training, validation, and test sets. 3. …
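
To make the labeling step concrete, punctuation restoration can be framed as token classification; the sketch below uses per-character tokens and a small label set as simplifying assumptions, not the project's exact preprocessing.

```python
# Simplified preprocessing sketch: turn a punctuated sentence into
# (character, label) pairs for token classification.
# The label set {O, COMMA, PERIOD, QUESTION} is an assumption for illustration.
PUNC2LABEL = {"，": "COMMA", "。": "PERIOD", "？": "QUESTION"}

def make_examples(text: str) -> list[tuple[str, str]]:
    pairs = []
    for ch in text:
        if ch in PUNC2LABEL:
            if pairs:                       # attach the label to the previous character
                pairs[-1] = (pairs[-1][0], PUNC2LABEL[ch])
        else:
            pairs.append((ch, "O"))
    return pairs

print(make_examples("你好，今天天气怎么样？我们出去走走。"))
```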

Read More
Speech Emotion Recognition Based on PyTorch

This project provides a detailed introduction to emotion classification from audio using PyTorch, covering the entire process from data preparation and model training to prediction, along with more detailed explanations for each step and some improvement suggestions and precautions. 1. **Environment setup**: ensure the necessary Python libraries are installed: `pip install torch torchvision torchaudio numpy matplotlib seaborn soundfile`
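
As a toy illustration of the feature-extraction step, a mel-spectrogram computed with torchaudio can feed a small classifier; the network below is a stand-in, not the project's actual model, and the class count is assumed.

```python
# Toy sketch: mel-spectrogram features with torchaudio feeding a tiny classifier.
# The network and the number of emotion classes are illustrative stand-ins.
import torch
import torchaudio

NUM_EMOTIONS = 6                                   # assumed number of classes

waveform, sr = torchaudio.load("sample.wav")       # shape: (channels, samples)
mel = torchaudio.transforms.MelSpectrogram(sample_rate=sr, n_mels=64)(waveform)
features = mel.mean(dim=-1)                        # average over time: (channels, 64)

classifier = torch.nn.Sequential(
    torch.nn.Linear(64, 128),
    torch.nn.ReLU(),
    torch.nn.Linear(128, NUM_EMOTIONS),
)
logits = classifier(features[:1])                  # use the first channel
print("predicted emotion id:", logits.argmax(dim=-1).item())
```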

Read More
Speech Emotion Recognition Based on PaddlePaddle

This article describes the training and prediction process for a speech classification task based on PaddlePaddle, along with a more detailed and complete code example and explanations of each part. 1. **Environment preparation**: ensure the necessary dependency libraries are installed, including `paddlepaddle`, which can be installed with `pip install paddlepaddle==2.4.1`. 2. **Code implementation**: …

Read More
Easily Implement Speech Synthesis with PaddlePaddle

This article introduces the implementation of speech synthesis with PaddlePaddle, including simple code examples, a GUI interface, and a Flask web interface. First, a simple program implements the basic text-to-speech function, using an acoustic model and a vocoder to complete synthesis and save the result as an audio file. Next, the `gui.py` interface program is introduced to simplify the user experience. Finally, the Flask web service provided by `server.py` is demonstrated, which can be called by Android applications or mini-programs to perform remote speech…
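
For reference, PaddleSpeech exposes the acoustic-model-plus-vocoder pipeline through a CLI executor; the snippet below follows that published interface, though the exact models the article configures may differ.

```python
# Text-to-speech via PaddleSpeech's CLI executor (the acoustic model and
# vocoder are resolved internally). The output path is an illustrative choice.
from paddlespeech.cli.tts.infer import TTSExecutor

tts = TTSExecutor()
tts(text="今天天气十分不错。", output="output.wav")
print("saved to output.wav")
```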

Read More
Building an Animal Recognition System with PaddlePaddle to Identify Thousands of Animal Species

This article introduces a project for animal recognition using PaddlePaddle. First, the animal recognition task can be completed with just a few lines of code. Second, a GUI interface lets users upload images for recognition. Finally, a Flask web interface is provided for Android calls, enabling cross-platform use. The project covers details such as the model path, image reading, and prediction-result output, and running screenshots demonstrate the result.
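
The Flask interface for Android calls might be structured like the sketch below, where `predict_animal` is a hypothetical placeholder for the project's actual PaddlePaddle inference code.

```python
# Sketch of a Flask endpoint an Android client could call with an image upload.
# `predict_animal` is a hypothetical placeholder for the real PaddlePaddle inference.
from flask import Flask, jsonify, request

app = Flask(__name__)

def predict_animal(image_bytes: bytes) -> dict:
    # placeholder: run the PaddlePaddle model here and return a label and score
    return {"label": "unknown", "score": 0.0}

@app.route("/infer", methods=["POST"])
def infer():
    image = request.files["image"].read()
    return jsonify(predict_animal(image))

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```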

Read More
ECAPA-TDNN Voiceprint Recognition Model Implemented with PyTorch

This project demonstrates how to implement voiceprint recognition functionality using PaddlePaddle, specifically voiceprint comparison and voiceprint registration. Main content and improvement suggestions: 1. **Project structure and functions**: voiceprint comparison compares the voice features of two audio files to determine whether they come from the same person; voiceprint registration stores a new user's voice data in a database and generates the corresponding user information. 2. **Technology stack**: PaddlePaddle is used for model training and prediction.
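
The comparison step reduces to measuring similarity between two embedding vectors; here is a framework-agnostic sketch with NumPy, where the 192-dimensional embeddings and the decision threshold are assumptions.

```python
# Framework-agnostic sketch of voiceprint comparison: cosine similarity between
# two speaker embeddings, with an assumed decision threshold.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def is_same_speaker(emb1: np.ndarray, emb2: np.ndarray,
                    threshold: float = 0.7) -> bool:
    return cosine_similarity(emb1, emb2) >= threshold

# toy usage with random vectors standing in for model embeddings
e1, e2 = np.random.rand(192), np.random.rand(192)
print(is_same_speaker(e1, e2))
```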

Read More
ECAPA-TDNN Speaker Recognition Model Implemented Based on PaddlePaddle

This project is a voiceprint recognition system based on PaddlePaddle. It covers the complete workflow from data preprocessing and model training to voiceprint recognition and comparison, and is suitable for practical applications such as voiceprint login. A detailed walkthrough: 1. **Environment preparation and dependency installation**: ensure PaddlePaddle and other dependencies such as `numpy` and `matplotlib` are installed, for example with `pip install paddlepaddle`.

Read More
Adding Punctuation Marks to Speech Recognition Text

This article introduces a method for adding punctuation marks to speech recognition text according to grammar, divided into four steps: downloading and decompressing the model, installing the PaddleNLP and PPASR tools, importing the `PunctuationPredictor` class, and using this class to automatically add punctuation to the text. The specific steps are: 1. Download the model and decompress it into the `models/` directory. 2. Install the relevant PaddleNLP and PPASR libraries. 3. Instantiate the predictor using the `PunctuationPredictor` class and pass in the pre…
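
A usage sketch is shown below; the import path and `model_dir` argument are assumptions and may differ between PPASR versions.

```python
# Usage sketch for the punctuation predictor described above.
# The import path and model_dir argument are assumptions and may differ
# between PPASR versions.
from ppasr.infer_utils.pun_predictor import PunctuationPredictor

pun_predictor = PunctuationPredictor(model_dir="models/pun_models")
text = "近几年不但我用书给女儿压岁也劝说亲朋不要给女儿压岁钱而改送压岁书"
print(pun_predictor(text))
```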

Read More
PPASR Streaming and Non-Streaming Speech Recognition

This document introduces how to deploy and test a speech recognition model implemented with PaddlePaddle, and provides several ways to run and demonstrate the model. In summary: 1. **Introduction**: an overview of the PaddlePaddle-based speech recognition models, covering recognition of short voice segments and long audio clips. 2. **Deployment methods**: for command-line deployment, two commands implement different deployment modes, such as `python infer_server.`…

Read More