"The whole process of data modeling and analysis"

Data modeling can be divided into several steps:

1. Define Objectives: Clearly define the business problem and determine objectives.

2. Data Understanding and Acquisition: Acquire and understand data, followed by descriptive analysis.

3. Data Cleaning and Preprocessing: Handle missing values, outliers, and duplicates.

4. Data Restructuring: Labeling, standardization, etc.; perform further cleaning as needed.

5. Descriptive Statistics and Insights: Draw insights from descriptive statistics, forming the initial report.

6. Feature Selection and Model Selection: Select variables and algorithms.

7. Dataset Partitioning and Model Construction: Split the dataset, set parameters, load algorithms, and build the model (see the sketch after this list).

8. Model Evaluation: Evaluate the model’s performance.

9. Model Tuning: Fine-tune the model for better results.

10. Output Rules and Presentation: Determine output rules, load the model, and present results.

11. Model Deployment: Deploy the model for practical use.
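
As a rough, hypothetical illustration of steps 6 through 9, the sketch below uses scikit-learn with a placeholder dataset, model, and parameter grid (none of which are prescribed by the workflow itself):

# Hypothetical sketch of steps 6-9: model selection, dataset partitioning,
# model construction, evaluation, and tuning (scikit-learn, placeholder data).
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_breast_cancer(return_X_y=True)  # placeholder dataset

# Step 7: split the dataset into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Steps 6/7: choose an algorithm and build the model.
model = RandomForestClassifier(random_state=42)

# Step 9: tune hyperparameters over a small, illustrative grid.
search = GridSearchCV(model, {"n_estimators": [100, 200], "max_depth": [None, 10]}, cv=5)
search.fit(X_train, y_train)

# Step 8: evaluate the tuned model on the held-out test set.
pred = search.best_estimator_.predict(X_test)
print("test accuracy:", accuracy_score(y_test, pred))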

More information can be found in the following link:
数据建模分析的全流程

"A complete Pytorch deep learning project code structure and project release guide"

A common project structure is as follows:

--project_name/
----data/: Data
----checkpoints/: Saved trained models
----logs/: Logs
----model_hub/: Pre-trained model weights
--------chinese-bert-wwm-ext/: e.g., a downloaded pre-trained BERT checkpoint
----utils/: Utility modules, such as logging, evaluation metrics, etc.
--------utils.py
--------metrics.py
----models/: Models
--------model.py
----configs/: Configuration files
--------config.py
----datasets/: Data loading
--------data_loader.py
----main.py: Main program containing training, validation, testing, and prediction logic
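
The following is a hypothetical skeleton of main.py showing how these pieces might fit together; the imported names (Config, build_loader, Model, get_logger) are illustrative assumptions rather than fixed by the structure itself:

# main.py: hypothetical skeleton wiring configs/, datasets/, models/ and utils/ together.
# Config, build_loader, Model, and get_logger are illustrative names, not prescribed ones.
import torch

from configs.config import Config
from datasets.data_loader import build_loader
from models.model import Model
from utils.utils import get_logger

def main():
    config = Config()                                    # hyperparameters, paths, etc.
    logger = get_logger(config.log_dir)                  # writes to logs/
    train_loader, val_loader = build_loader(config)      # built from data/

    model = Model(config)                                # may load weights from model_hub/
    optimizer = torch.optim.Adam(model.parameters(), lr=config.lr)

    for epoch in range(config.epochs):
        model.train()
        for inputs, labels in train_loader:
            optimizer.zero_grad()
            loss = model.compute_loss(inputs, labels)
            loss.backward()
            optimizer.step()
        logger.info("epoch %d finished, last loss %.4f", epoch, loss.item())
        torch.save(model.state_dict(), f"checkpoints/epoch_{epoch}.pt")  # checkpoints/

if __name__ == "__main__":
    main()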

More information can be found in the following link:
一个完整的Pytorch深度学习项目代码结构及项目发布指南

"Neural Network Hyperparameter Tuning: Loss Issues Compilation"

The issues are listed as follows:

  1. Reasons for Model Non-Convergence
    1.1. Setting the learning rate too high may lead to divergence (a sudden increase in loss).
    1.2. In general, a dataset that is too small does not by itself cause non-convergence.
    1.3. When debugging convergence problems, prefer starting with a smaller model.
  2. Model Loss Not Decreasing
    2.1. Loss remains constant at 87.33.
    2.2. Loss consistently stays around 0.69 (≈ ln 2, the cross-entropy of a binary classifier that always predicts 0.5); see the sketch after this list.
  3. Summary of Solutions
    3.1. Data and labels
    3.2. Improperly set learning rate
    3.3. Inappropriate network architecture
    3.4. Dataset label settings
    3.5. Data normalization
    3.6. Excessive regularization
    3.7. Choosing the appropriate loss function
    3.8. Inadequate training time
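
As a quick diagnostic for several items above (learning rate, gradients, normalization), the hedged PyTorch sketch below prints per-parameter gradient norms and the optimizer's current learning rate after one backward pass; the toy model, data, and the deliberately large learning rate are placeholders:

# Hypothetical diagnostic: inspect gradient norms and the learning rate when the
# loss oscillates, stays near ln 2 (~0.69), or refuses to decrease.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 2))  # toy model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)                # possibly too high
criterion = nn.CrossEntropyLoss()

x = torch.randn(32, 10)                # placeholder batch
y = torch.randint(0, 2, (32,))         # placeholder binary labels

loss = criterion(model(x), y)
optimizer.zero_grad()
loss.backward()

# All-zero or exploding gradient norms point to the data/label, learning-rate,
# or normalization items listed above.
for name, param in model.named_parameters():
    if param.grad is not None:
        print(name, param.grad.norm().item())

# Learning rate(s) the optimizer is actually using.
for group in optimizer.param_groups:
    print("lr:", group["lr"])

optimizer.step()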

More information can be found in the following links:
神经网络调参:loss 问题汇总(震荡/剧烈抖动,loss不收敛/不下降)
深度学习调参技巧
Pytorch 模型 查看网络参数的梯度以及参数更新是否正确,优化器学习率设置固定的学习率,分层设置学习率

"The considerations for using common evaluation metrics"

The metric helper functions and related considerations are listed as follows:

# coding: utf-8

from math import sqrt

from sklearn import metrics
from sklearn.metrics import accuracy_score, mean_squared_error

def obtain_acc(actual, pred):
    # Accuracy: fraction of predictions that exactly match the true labels.
    return accuracy_score(actual, pred)

def obtain_rmse(actual, pred):
    # Root mean squared error for regression targets.
    return sqrt(mean_squared_error(actual, pred))

def obtain_auc(actual, pred):
    # Area under the ROC curve; pred should be continuous scores/probabilities.
    fpr, tpr, thresholds = metrics.roc_curve(actual, pred, pos_label=1)
    auc = metrics.auc(fpr, tpr)
    return auc

def obtain_f1(actual, pred):
    # F1 score: harmonic mean of precision and recall.
    f1 = metrics.f1_score(actual, pred)
    return f1

def obtain_rec(actual, pred):
    # Recall: true positives / (true positives + false negatives).
    recall = metrics.recall_score(actual, pred)
    return recall

def obtain_pre(actual, pred):
    # Precision: true positives / (true positives + false positives).
    precision = metrics.precision_score(actual, pred)
    return precision

def obtain_confusion_matrix(actual, pred):
    # Confusion matrix of true labels vs. predicted labels.
    confusion_matrix = metrics.confusion_matrix(actual, pred)
    return confusion_matrix

1. obtain_auc(actual, pred) -> actual is discrete (e.g., 0 or 1) and pred is continuous (e.g., 0.3 or 0.8). To plot the ROC curve, the predicted probabilities of each test sample belonging to the positive class must be sorted in descending order.
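
A brief usage example of the helpers above with made-up labels and scores, showing that obtain_auc expects continuous scores while the other helpers expect discrete predictions:

# Illustrative usage with made-up labels and scores.
actual = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]                    # continuous scores -> ROC/AUC
pred = [1 if s >= 0.5 else 0 for s in scores]     # thresholded labels -> acc/F1/recall/precision

print("auc:", obtain_auc(actual, scores))
print("acc:", obtain_acc(actual, pred))
print("f1 :", obtain_f1(actual, pred))
print("confusion matrix:\n", obtain_confusion_matrix(actual, pred))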

"Upsampling, Downsampling and Negative Sampling"
  1. Upsampling:
    • Definition: Upsampling is an image processing technique used to enhance the resolution and quality of an image by increasing the number of pixels. It expands the image to higher resolutions, revealing more details and information.
    • Methods: Common upsampling methods in image processing include bilinear interpolation, bicubic interpolation, and deconvolution (transposed convolution). These methods use existing pixel values to compute new pixel values, thereby increasing the size and resolution of the image (see the sketch after this list).
  2. Downsampling:
    • Definition: Downsampling is an image processing technique used to reduce the resolution and size of an image by decreasing the number of pixels. It decreases the storage and processing requirements of an image.
    • Methods: Common downsampling methods include average pooling, max pooling, and convolutional layers with adjusted strides. These methods merge pixel values into smaller regions and select representative pixel values to reduce the image’s size and number of pixels.
  3. Negative Sampling:
    • Definition: Negative Sampling is a sampling technique in machine learning used for model training, particularly in dealing with imbalanced class data. In natural language processing or recommendation systems, negative sampling reduces the number of negative samples to improve model training.
    • Methods: Negative sampling balances the ratio of positive and negative samples by selecting a small number of negative samples from the dataset, thereby enhancing the model’s performance and generalization.
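
A minimal PyTorch sketch of the three ideas above; the tensor shapes and the 1:4 negative-sampling ratio are arbitrary choices for illustration:

# Hypothetical sketch: upsampling, downsampling, and simple random negative sampling.
import torch
import torch.nn.functional as F

img = torch.randn(1, 3, 32, 32)          # placeholder image batch (N, C, H, W)

# Upsampling: bilinear interpolation to double the spatial resolution.
up = F.interpolate(img, scale_factor=2, mode="bilinear", align_corners=False)  # -> 64x64

# Downsampling: 2x2 max pooling (average pooling or strided convolutions are similar).
down = F.max_pool2d(img, kernel_size=2)  # -> 16x16

# Negative sampling: keep all positives and a random subset of negatives,
# here roughly a 1:4 positive-to-negative ratio (arbitrary choice).
labels = torch.randint(0, 2, (1000,))
pos_idx = (labels == 1).nonzero(as_tuple=True)[0]
neg_idx = (labels == 0).nonzero(as_tuple=True)[0]
neg_keep = neg_idx[torch.randperm(len(neg_idx))[:4 * len(pos_idx)]]
sampled_idx = torch.cat([pos_idx, neg_keep])

print(up.shape, down.shape, len(pos_idx), len(neg_keep))
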
"Hard Labels, Soft Labels, and Pseudo Labels"
  1. Hard Labels: Hard labels refer to the original, discrete labels used in supervised learning tasks. In classification problems, hard labels typically represent each sample with a clear, unique class label, such as 0 or 1, or index values denoting different classes.

  2. Soft Labels: Soft labels differ from hard labels in that they aren’t strict discrete values but instead represent a probability distribution or continuous values indicating the likelihood of a sample belonging to each class. These labels provide more information by not only revealing the true class of the sample but also offering probabilities for other classes. Soft labels are often used in training models, especially in deep learning tasks, where certain loss functions (like cross-entropy loss) support the use of soft labels.

  3. Pseudo Labels: Pseudo labels are a semi-supervised learning technique in which a model's predictions on unlabeled data are used as labels for that data. A model trained on labeled data first predicts labels for the unlabeled data, and these predictions are then used as labels to continue training the model.
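
A small sketch contrasting hard and soft labels under cross-entropy, followed by a one-step pseudo-labeling example; the 0.9 confidence threshold is an arbitrary illustrative choice:

# Hypothetical sketch: hard vs. soft labels with cross-entropy, and pseudo labels.
import torch
import torch.nn.functional as F

logits = torch.randn(4, 3)                     # model outputs for 4 samples, 3 classes

hard_labels = torch.tensor([0, 2, 1, 1])       # one class index per sample
soft_labels = torch.tensor([[0.7, 0.2, 0.1],   # one probability distribution per sample
                            [0.1, 0.1, 0.8],
                            [0.2, 0.6, 0.2],
                            [0.3, 0.4, 0.3]])

# Recent PyTorch versions accept either class indices (hard) or probabilities (soft)
# as the target of F.cross_entropy.
loss_hard = F.cross_entropy(logits, hard_labels)
loss_soft = F.cross_entropy(logits, soft_labels)

# Pseudo labels: treat confident predictions on unlabeled data as labels.
unlabeled_logits = torch.randn(8, 3)
probs = unlabeled_logits.softmax(dim=-1)
conf, pseudo = probs.max(dim=-1)
mask = conf > 0.9                              # keep only confident predictions (arbitrary threshold)
print(loss_hard.item(), loss_soft.item(), pseudo[mask])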