Robust Classification Method for Extremely Imbalanced Bolt Defects Based on a Deep Two-Stream Network and Enhanced Focal Loss

Ge Xiaoli (葛晓丽)

Abstract: Loosening and corrosion of bolts on communication towers are major hazards to infrastructure safety. Real inspection datasets, however, are extremely class-imbalanced: normal samples dominate, while loose and corroded samples are scarce, so conventional deep learning models achieve poor recall on the safety-critical minority classes, that is, they miss defects. To address this, this paper proposes a depth-estimation-guided heterogeneous dual-stream convolutional network (Dual-Stream CNN) together with a robust loss optimization strategy. First, the model adopts an RGB and raw-depth (Raw Depth) dual-stream architecture based on ResNet-18, fusing texture information from the RGB image with geometric structure from the depth map to better discriminate both looseness (geometric deformation) and corrosion (surface texture). Second, to counter the extreme class imbalance, Focal Loss (γ = 1.5) is combined with an aggressive ×2.8 weight on the Loose class, the sparsest and most critical defect, which focuses training on that class and substantially improves its recall. Finally, so that all defect classes meet industrial safety standards, the decision threshold for the Corroded class is tuned and lowered to 0.10. The dataset contains 3,886 images, of which 3,109 are used for training and 777 for validation. Experiments were conducted in PyTorch on an NVIDIA RTX 3090 GPU. With these optimizations, the method achieves an overall accuracy of 96.78% on the test set, and per-class recall meets the target for every class: 97.64% for Normal, 100.00% for Loose, and 90.35% for Corroded. The study addresses the problem of missed defects under extreme class imbalance and provides a robust and efficient technical solution for the intelligent maintenance of critical infrastructure such as communication towers.
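The loss weighting and threshold adjustment described in the abstract can be illustrated with a minimal, framework-free Python sketch. It is not the author's implementation. The values γ = 1.5, the ×2.8 Loose weight, and the 0.10 Corroded threshold are taken from the abstract. The class order, the base weights of 1.0 for Normal and Corroded, the exact decision rule in `predict`, and all function names are assumptions made for illustration only.

```python
import math

# Class order is an assumption made for this sketch.
CLASSES = ("Normal", "Loose", "Corroded")


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=1.5, class_weights=(1.0, 2.8, 1.0)):
    """Class-weighted focal loss for one sample.

    loss = -w_t * (1 - p_t) ** gamma * log(p_t), where p_t is the softmax
    probability of the true class. gamma=1.5 and the 2.8 weight on Loose come
    from the abstract; the base weights of 1.0 are assumed.
    """
    p_t = softmax(logits)[target]
    return -class_weights[target] * (1.0 - p_t) ** gamma * math.log(p_t)


def predict(probs, corroded_threshold=0.10):
    """Return a class index, flagging Corroded at a lowered threshold.

    Assumed rule: predict Corroded whenever its probability reaches the
    threshold; otherwise take the argmax over all classes.
    """
    corroded = CLASSES.index("Corroded")
    if probs[corroded] >= corroded_threshold:
        return corroded
    return max(range(len(probs)), key=probs.__getitem__)


if __name__ == "__main__":
    logits = [2.0, 0.5, 0.1]  # hypothetical network output for one image
    print("loss if the true class is Loose:", focal_loss(logits, target=1))
    print("prediction:", CLASSES[predict(softmax(logits))])
```

The factor (1 - p_t)^γ shrinks the loss on samples the model already classifies confidently, so training concentrates on hard cases, while the class weight scales every Loose sample's loss by 2.8. Lowering the Corroded threshold trades some precision on that class for higher recall, which matches the abstract's emphasis on avoiding missed defects.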
