Yeyupiaoling

Processing and Usage of the WenetSpeech Dataset

2021-11-30 270 views Speech PaddlePaddle Deep Learning Speech Recognition PaddlePaddle WenetSpeech Mandarin Speech Dataset Chinese Speech Dataset

The WenetSpeech dataset provides more than 22,400 hours of Mandarin Chinese speech, split into strongly labeled (10,005 hours), weakly labeled (2,478 hours), and unlabeled (9,952 hours) subsets, suitable for supervised, semi-supervised, or unsupervised training. The data is grouped by domain and style, and training subsets of different scales (S, M, L) as well as evaluation/test data are provided. The tutorial details how to download, prepare, and use this dataset for training speech recognition models, making it a useful reference for ASR system developers.

Fast Face Recognition Model Implemented with PaddlePaddle

2021-11-03 234 views PaddlePaddle Deep Learning Deep Learning Computer Vision Artificial Intelligence

This project develops a small and efficient face recognition system based on the ArcFace and PP-OCRv2 models. The training dataset is emore (containing 85,742 individuals and 5,822,653 images), and the lfw-align-128 dataset is used for testing. The project provides complete code and preprocessing scripts. The `create_dataset.py` script is executed to organize raw data into binary file format, improving training efficiency. Model training and evaluation are controlled by `train.py` and `eval.py` respectively. The prediction function supports

A Fast Face Recognition Model Implemented Based on PyTorch

2021-11-03 192 views Pytorch Deep Learning Pytorch Deep Learning Artificial Intelligence

This project aims to develop a face recognition system with small models, high recognition accuracy, and fast inference speed. The training data is sourced from the emore dataset (5.82 million images), and the lfw-align-128 dataset is used for testing. The project combines the ArcFace loss function and MobileNet, implemented through Python scripts. The process of training the model includes data preparation, training, and evaluation, with all code available on GitHub. To start the training process, the `train.py` command is executed; for performance verification, run `ev`
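
The summary above names the ArcFace loss but does not show it. As a minimal sketch of the idea (not the project's actual code), ArcFace adds an angular margin `m` to the angle between an embedding and its target class centre before scaling by `s`. The function names are hypothetical, and the defaults `margin=0.5` and `scale=64.0` are assumptions taken from common ArcFace settings rather than from this project. Real implementations also handle the case where `theta + m` exceeds π, which is omitted here.

```python
import math


def arcface_target_logit(cos_theta: float, margin: float = 0.5, scale: float = 64.0) -> float:
    """Return the target-class logit s * cos(theta + m) used by ArcFace."""
    # Clamp to the valid cosine range to guard against rounding error.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    return scale * math.cos(theta + margin)


def plain_target_logit(cos_theta: float, scale: float = 64.0) -> float:
    """Return the target-class logit s * cos(theta) without any margin."""
    return scale * cos_theta
```

For the same embedding, the margin version yields a smaller target logit than the plain version, so the network has to pull embeddings of the same identity closer together to reach the same loss.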

PPASR Speech Recognition (Advanced Level)

2021-09-18 237 views PaddlePaddle Deep Learning Speech Speech Recognition Deep Learning PaddlePaddle

This project is an end-to-end Automatic Speech Recognition (ASR) system implemented with PaddlePaddle. The system covers multiple stages, including data collection, preprocessing, model training, evaluation, and prediction. Each step is explained in detail below, with key information to help you understand the process.

### 1. Dataset

The project supports multiple datasets, such as AISHELL, Free-Spoken Chinese Mandarin Co

Sound Classification Based on PyTorch

2021-08-20 331 views Deep Learning Pytorch Speech Python Artificial Intelligence Deep Learning Pytorch Sound Classification

This code is based on the PyTorch framework and implements a sound classification system based on acoustic features. The project structure is clear, including functional modules for training, evaluation, and prediction, and it provides detailed command-line parameters and configuration files. The following is a detailed analysis of the project and instructions for using it:

### 1. Project Structure

```
.
├── configs          # Configuration files directory
│   └── bi_lstm.yml
├── infer.py         # Acoustic model inference code
├── recor
```

Voiceprint Recognition Model Based on PyTorch

2021-07-06 301 views Deep Learning Pytorch Speech Pytorch Deep Learning Voiceprint Recognition Chinese voiceprint ArcNet

This project demonstrates how to use the PyTorch framework for voiceprint recognition, covering the steps from model training to application deployment. The following are key points and improvement suggestions for this project:

### Summary of Key Points

1. **Data Preparation**: The `prepare_data.py` script in the project generates a dataset containing voiceprint features.
2. **Model Design**: ECAPA-TDNN was selected as the base model, and voiceprint recognition tasks were implemented through custom configurations.
3. **Training Process**: In the training...

Chinese Speaker Recognition Based on TensorFlow 2

2021-07-06 245 views TensorFlow Deep Learning Speech Tensorflow Deep Learning Voiceprint Recognition Chinese Voiceprint Recognition ArcFace

This project demonstrates how to use deep learning models for voiceprint recognition and voiceprint comparison. Below, I will optimize and improve the code and provide some suggestions to better implement these functions.

### 1. Project Structure

First, ensure the project directory structure is clear and easy to understand, for example:

```
VoiceprintRecognition/
├── data/
│   ├── train_data/
│   │   └── user_01.wav
│   ├── test_
```

My New Book, "Introduction to and Practical Guide of PaddlePaddle Fluid Deep Learning" Has Been Published!

2021-06-06 203 views Deep Learning Artificial Intelligence Deep Learning PaddlePaddle Edge Computing Natural Language Processing

This book provides a detailed introduction to deep learning development using PaddlePaddle, covering the entire process from environment setup to practical project applications. The content includes environment setup, a quick start, linear regression, practical cases with convolutional and recurrent neural networks, generative adversarial networks, reinforcement learning, and more. It also explains model saving and usage, transfer learning, and the mobile framework Paddle-Lite. The book is suitable for beginners and works through practical projects such as flower species recognition and news headline classification. All the code in the book has been tested, and supporting resources are provided.

Face Landmark Detection Model MTCNN Implementation Based on PyTorch

2021-06-02 229 views Deep Learning Pytorch Pytorch Deep Learning Facial Recognition Computer Vision

MTCNN is a multi-task cascaded convolutional neural network (CNN) for face detection, consisting of three networks: P-Net, R-Net, and O-Net. P-Net generates candidate windows, R-Net filters them with higher precision, and O-Net outputs the bounding boxes and key points. The model follows the candidate box + classifier idea and uses techniques such as image pyramids and bounding box regression to achieve fast and efficient detection. Training MTCNN consists of three steps:

1. Train PNet: Generate PNet data and use the `train_PNet.py` script for training;
2. Train RNet: Generate RN
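
To illustrate the image pyramid mentioned above, here is a minimal sketch of how the resize scales for P-Net are commonly computed. It is an illustration, not the project's code: the function name is hypothetical, and the defaults `min_face_size=20` and `factor=0.709` are assumptions based on widely used MTCNN settings. The 12-pixel value is the P-Net input window size.

```python
def pyramid_scales(height, width, min_face_size=20, net_input=12, factor=0.709):
    """Return the scales at which an image is resized before being fed to P-Net."""
    # The first scale maps a face of min_face_size pixels onto the 12x12 P-Net window.
    scale = net_input / min_face_size
    min_side = min(height, width) * scale
    scales = []
    # Keep shrinking until the shorter side no longer fits one P-Net window.
    while min_side >= net_input:
        scales.append(scale)
        scale *= factor
        min_side *= factor
    return scales
```

Each returned scale produces one pyramid level, so a smaller `factor` gives fewer levels and faster detection at the cost of possibly missing faces whose size falls between levels.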

Age and Gender Recognition Based on MXNET

2021-04-07 183 views Deep Learning Deep Learning mxnet Age Recognition Gender Recognition Face Detection

This project is a deep learning-based face age and gender recognition system. It uses OpenCV and MTCNN (Multi-Task Cascaded Convolutional Network) for face detection, along with a pretrained model for age and gender prediction. Below, I will briefly introduce how to run and understand these scripts.

### 1. Environment Preparation

Ensure you have installed the necessary Python libraries:

```bash
pip install numpy opencv-python dlib mtcnn
```

CRNN Text Recognition Model Implemented with PaddlePaddle 2.0 Dynamic Graph

2021-04-03 206 views PaddlePaddle Deep Learning Deep Learning Artificial Intelligence PaddlePaddle crnn Optical Character Recognition (OCR)

This document introduces a CRNN text recognition model implemented using the PaddlePaddle 2.0 dynamic graph. The model extracts features with a CNN, performs sequence prediction with an RNN, and uses CTC Loss for loss calculation, making it suitable for input images of variable length.

**Training and Data Preparation:**

1. **Environment Configuration**: PaddlePaddle 2.0.1 and Python 3.7 need to be installed.
2. **Dataset Generation**:
   - Use the `create_image.py` script to automatically generate validation
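
A model trained with CTC Loss emits one score vector per time step, and these have to be collapsed into a text string at inference time. The sketch below shows greedy CTC decoding in plain Python as one common way to do that. It is an assumption about how decoding could work, not the project's code: the function name, the `labels` layout, and the choice of index 0 as the blank class are all hypothetical.

```python
from itertools import groupby


def ctc_greedy_decode(frame_scores, labels, blank=0):
    """Decode per-frame class scores into text using greedy CTC decoding.

    frame_scores: one list of class scores per time step.
    labels: labels[i] is the character for class i (the blank entry is a placeholder).
    """
    # 1. Pick the highest-scoring class at every time step.
    best_path = [max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores]
    # 2. Merge consecutive repeats of the same class.
    collapsed = [index for index, _ in groupby(best_path)]
    # 3. Drop the blank class and map the rest to characters.
    return "".join(labels[index] for index in collapsed if index != blank)
```

Because repeats are merged before blanks are removed, a blank between two identical characters is what lets the model output a doubled letter.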

End-to-End Recognition of Captchas Based on PaddlePaddle 2.0

2021-03-23 213 views PaddlePaddle Deep Learning PaddlePaddle ocr crnn Image Recognition Deep Learning

Your code has covered most aspects of the CAPTCHA recognition project, including data processing, model training, and inference. Below are some suggestions for improvements and enhancements to your provided code:

### 1. Data Preprocessing

Ensure the image dimensions are consistent (27x72), as this is the input size used during training.

### 2. Model Definition

Your `Model` class has already encapsulated the network structure well. You can further optimize it and add more comments to facilitate understanding.

### 3. Training Process

During the training process, ensure that when using multi-GPU training,

PPASR Chinese Speech Recognition (Beginner Level)

2021-03-16 231 views PaddlePaddle Deep Learning Speech Deep Learning PaddlePaddle Artificial Intelligence Speech Recognition Chinese Speech Recognition

Thank you for your detailed introduction! To further help everyone understand and use this CTC-based end-to-end Chinese-English speech recognition model, I will supplement and improve it from several aspects:

### 1. Dataset and Its Processing

#### AISHELL

- **Data Volume**: Approximately 20 hours of Mandarin Chinese pronunciation.
- **Characteristics**: Contains standard Mandarin Chinese pronunciation and some dialects.

#### Free ST Chinese Mandarin Corpus

- **Data Volume**: Approximately 65 hours of Mandarin Chinese pronunciation.
-

Implementing Image Classification on Android Phones Based on TNN

2020-09-06 199 views Deep Learning Android Deep Learning Android tnn Image Classification Image Recognition

This project is mainly an image classifier based on TensorFlow Lite, which can achieve real-time image recognition on Android devices. Its main functions and implementation steps are as follows:

### Project Structure

- **MainActivity.java**: Implements gallery image selection and real-time camera prediction on the main interface.
- **MNNClassification.java**: Integrates and encapsulates MNN model-related operations.

### Implementation Ideas

1. **Initialization**:

Image Classification on Android Phones Based on MNN

2020-09-05 225 views Deep Learning Android Android mnn Image Recognition Tensorflow onnx

This is a detailed guide on how to implement image classification in an Android application. You have successfully used TensorFlow Lite for image classification and demonstrated how to obtain input data through two methods: calling the camera and selecting images, and then passing this data to the model for prediction.

### Summary of Main Content

1. **Model Initialization**: First, load the pre-trained `mobilenet_v2_1.0_224.tflite` model and create a classifier instance.
2. **Reading Images and Pro

Face Detection, Key Point Detection, and Mask Detection on Android with One Line of Code

2020-09-05 211 views PaddlePaddle Android Deep Learning Android java Development Language

This article shows how to implement face detection, key point detection, and mask detection in Android applications using Paddle Lite. The core code is a single line: calling `FaceDetectionUtil.getInstance().predictImage(bitmap)` performs all three functions. Behind this line of code lie model training and compilation, including face detection (`pyramidbox.nb`), face key point detection (`facekeypoints.nb`), and mask classification (

Face Recognition and Face Registration Based on InsightFace

2020-08-30 281 views Deep Learning Facial Recognition Deep Learning mxnet Artificial Intelligence insightface

This code implements a deep learning-based face recognition system using the InsightFace framework. It includes functions for face detection, feature extraction, and face recognition, and also provides a feature to register new users. Below is a detailed explanation of the code:

### 1. Import necessary libraries

```python
import cv2
import numpy as np
```

### 2. Define the `FaceRecognition` class

This class contains all functions related to face recognition.
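
The registration and recognition steps described above can be sketched independently of any face model. The example below is a minimal illustration, not the project's `FaceRecognition` class: `FaceRegistry`, its methods, and the `threshold=0.5` default are hypothetical, and the embeddings are assumed to be plain lists of floats produced by some feature extractor.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equally sized vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class FaceRegistry:
    """Stores one embedding per registered user and matches new embeddings against them."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold  # hypothetical value; tune on real data
        self.embeddings = {}

    def register(self, name, embedding):
        """Register (or overwrite) the embedding for a user."""
        self.embeddings[name] = list(embedding)

    def recognize(self, embedding):
        """Return (name, score) of the best match, or (None, score) if below the threshold."""
        best_name, best_score = None, -1.0
        for name, known in self.embeddings.items():
            score = cosine_similarity(embedding, known)
            if score > best_score:
                best_name, best_score = name, score
        if best_score >= self.threshold:
            return best_name, best_score
        return None, best_score
```

A typical flow would call `register` once per new user with the embedding of a clear frontal photo, then call `recognize` on each face found in later frames.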

Person Background Replacement on Android Based on Image Semantic Segmentation

2020-08-29 207 views PaddlePaddle Android PaddlePaddle Android Computer Vision Semantic Segmentation

Your project has already implemented basic human image recognition and background replacement functions. To further improve and optimize your code, I will provide some improvement suggestions and sample code.

### 1. Improve the processing flow of predicted images

During the conversion of prediction results to images, you can consider using the constructor of `Bitmap.createBitmap` to create a bitmap directly from the array, which can reduce the creation of unnecessary temporary objects. Additionally, when drawing a transparent background, you can directly use `Canvas` and `Paint` to set the background transparency.
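
The core of background replacement is independent of Android: once the segmentation model has produced a per-pixel person mask, each output pixel is taken from either the original photo or the new background. The sketch below shows that step in Python on flat pixel lists. It is only an illustration under stated assumptions: the function name is hypothetical, and the mask is assumed to use 1 for person pixels and 0 for background pixels.

```python
def replace_background(person_pixels, background_pixels, mask):
    """Composite a person onto a new background using a segmentation mask.

    All three arguments are flat, equally sized lists in row-major order.
    mask[i] is truthy where pixel i belongs to the person.
    """
    if not (len(person_pixels) == len(background_pixels) == len(mask)):
        raise ValueError("image, background, and mask must have the same size")
    # Keep the original pixel where the mask marks a person, else use the background.
    return [
        person if is_person else background
        for person, background, is_person in zip(person_pixels, background_pixels, mask)
    ]
```

On Android, the resulting flat array is the kind of data that could be handed to `Bitmap.createBitmap` in a single call, as suggested above.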

PP-YOLOE: A Target Detection Model Based on PaddlePaddle

2020-08-18 225 views PaddlePaddle Deep Learning Deep Learning Artificial Intelligence PaddlePaddle Object Detection Computer Vision

This document provides a detailed introduction to how to implement the training, evaluation, export, and prediction processes of the object detection model PP-YOLOE using PaddlePaddle, along with various deployment methods including the Inference prediction interface, ONNX interface, and prediction on Android devices. Here is a summary of each part:

### 1. Training

- **Single-card training**: Use `python train.py --model_type=M --num_classes=8

Implementing Image Classification on Android Phones Based on Paddle Lite

2020-08-02 225 views PaddlePaddle Android Deep Learning PaddlePaddle Android Image Recognition Artificial Intelligence Deep Learning

Thank you for sharing this Android application development example for image classification based on Paddle Lite. Your project not only covers how to obtain categories from images but also introduces methods for real-time image recognition through the camera, enabling users to quickly understand information about the captured object in practical application scenarios. Below, I will further optimize and supplement the content you provided and offer some suggestions to improve the user experience or enhance code efficiency:

### 1. Project Structure and Resource Management

Ensure the project has a clear file structure (e.g., `assets/image

Stream and Non-Stream Speech Recognition Implemented with PyTorch

2020-07-30 276 views Deep Learning Pytorch Speech Pytorch Deep Learning Speech Recognition Convolutional Neural Network Artificial Intelligence

### Project Overview

This project is a speech recognition system implemented based on PyTorch. By utilizing pretrained models and custom configurations, it can recognize input audio files and output corresponding text results.

### Install Dependencies

First, necessary libraries need to be installed. Run the following command in the terminal or command line:

```bash
pip install torch torchaudio numpy librosa
```

If the speech synthesis module is required, additionally install `gTTS` and

Implementation of Image Classification on Android Phones Based on TensorFlow Lite

2020-07-22 239 views TensorFlow Android Tensorflow Computer Vision Android Image Recognition Tensorflow lite

This project implements an image classification application based on TensorFlow Lite, which can perform object recognition using images from the camera or photo album on an Android device and provide real-time prediction. The following is a detailed analysis of the core steps and key code of this project:

### Project Structure

- **TFLiteModel**: Contains model-related configurations.
- **MainActivity**: The main interface for launching the camera or selecting images for classification.
- **RunClassifier**

Face Recognition Based on MTCNN and MobileFaceNet

2020-07-19 231 views Deep Learning TensorFlow Facial Recognition Deep Learning Tensorflow MTCNN MobileFaceNet

Your project has designed a deep learning-based face recognition system with a front-end and back-end separated implementation. This system includes a front-end page and a back-end service, which can be used for face registration and real-time face recognition. Below are detailed analysis and improvement suggestions for your code:

### Front-end Part

1. **HTML Template**:
   - You have already created a simple `index.html` file in the `templates` directory to provide the user interface.
   - Some basic CSS styles can be added.

Chinese Voiceprint Recognition Based on Keras

2020-07-15 196 views TensorFlow Deep Learning Speech Deep Learning Tensorflow Keras Voiceprint Recognition Speaker Recognition

Thank you for providing the detailed explanation about voiceprint recognition and comparison. Below, I will provide you with more detailed step-by-step implementation instructions for the PaddlePaddle version, along with code examples. This project will include data preprocessing, model training, voiceprint comparison, and registration/recognition.

### 1. Environment Setup

First, ensure that you have installed PaddlePaddle and other necessary libraries such as `numpy` and `sklearn`. You can install them using the following command:

```bash
pip install p
```

Large-scale Face Detection Based on Pyramidbox

2020-07-09 196 views PaddlePaddle Deep Learning Computer Vision Artificial Intelligence Deep Learning PaddlePaddle Facial Recognition

Based on the code and description you provided, this is an implementation of a face detection model using PaddlePaddle. The model employs a custom inference process to load images, perform preprocessing, and conduct face detection through the model. Here are key points summarizing the code:

- **Data Preprocessing**: Transpose the input image from `HWC` to `CHW` format, adjust the color space (BGR to RGB), subtract the mean, and scale. This step ensures compatibility with the data format used during training.
- **Model Inference**: Uses the PaddlePaddle framework
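
The preprocessing steps listed above can be sketched with NumPy as follows. This is a minimal illustration rather than the project's code: the function name is hypothetical, and the `mean` and `scale` defaults are placeholder assumptions that must be replaced with the values used when the model was trained.

```python
import numpy as np


def preprocess(image_bgr_hwc, mean=(127.5, 127.5, 127.5), scale=1 / 127.5):
    """Convert an HWC BGR image into a normalized CHW RGB float32 array."""
    # HWC -> CHW: move the channel axis to the front.
    chw = np.asarray(image_bgr_hwc, dtype=np.float32).transpose(2, 0, 1)
    # BGR -> RGB: reverse the order of the channel axis.
    rgb = chw[::-1]
    # Subtract the per-channel mean (given in RGB order here), then scale.
    mean_array = np.asarray(mean, dtype=np.float32).reshape(3, 1, 1)
    return (rgb - mean_array) * scale
```

The input is assumed to be an array of shape `(height, width, 3)` such as the one returned by `cv2.imread`. A batch dimension, if the model expects one, would still need to be added afterwards.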