Quantitative Analysis of Metallographic Image Using Attention-Aware Deep Neural Networks
Abstract
1. Introduction
2. Related Work
3. Methodology
3.1. Network Structure of MAUNet
Algorithm 1 The learning process of MAUNet
Input: The training images X, the max-epochs E, the number of batches B, the testing image X_t, and the ground-truth labels G
Output: The output prediction P and its performance results Dice, IoU, and RoC
All the images are preprocessed according to the steps in Section 4.2.
Training Stage:
Initialize the network weights, learning rate, batch size, and other parameters;
for e = 1; e ≤ E; e++ do
    for b = 1; b ≤ B; b++ do
        Get the data batch X_b from X;
        Compute the IoU loss L_IoU;
        Compute the Dice loss L_Dice;
        Compute the Focal loss L_Focal;
        Train MAUNet by optimizing the hybrid loss L (Section 3.2) and update the weights and parameters;
    end for
end for
Testing Stage:
Feed X_t into the well-trained MAUNet and output the prediction segmentation P;
Compute the performance results Dice (Equation (14)), IoU (Equation (15)), RoC (Equation (16)), and the running time T;
return P, Dice, IoU, RoC, and T.
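To make the hybrid objective in Algorithm 1 concrete, the following is a minimal PyTorch-style sketch of the three loss terms and an unweighted combination. It assumes sigmoid outputs and binary masks; the alpha/gamma values and the exact weighting of the terms are assumptions for illustration (the paper's combination is defined in Section 3.2), so treat this as a sketch rather than the authors' implementation.

```python
import torch
import torch.nn.functional as F

def iou_loss(pred, target, eps=1e-6):
    # pred: sigmoid probabilities, target: binary mask, both of shape (N, 1, H, W)
    inter = (pred * target).sum(dim=(1, 2, 3))
    union = (pred + target - pred * target).sum(dim=(1, 2, 3))
    return (1 - (inter + eps) / (union + eps)).mean()

def dice_loss(pred, target, eps=1e-6):
    inter = (pred * target).sum(dim=(1, 2, 3))
    denom = pred.sum(dim=(1, 2, 3)) + target.sum(dim=(1, 2, 3))
    return (1 - (2 * inter + eps) / (denom + eps)).mean()

def focal_loss(pred, target, alpha=0.25, gamma=2.0):
    # standard binary focal loss computed on probabilities
    bce = F.binary_cross_entropy(pred, target, reduction="none")
    p_t = pred * target + (1 - pred) * (1 - target)
    alpha_t = alpha * target + (1 - alpha) * (1 - target)
    return (alpha_t * (1 - p_t) ** gamma * bce).mean()

def hybrid_loss(pred, target):
    # unweighted sum for illustration; Section 3.2 defines the paper's combination
    return iou_loss(pred, target) + dice_loss(pred, target) + focal_loss(pred, target)
```

In the training loop of Algorithm 1, `hybrid_loss(pred, target)` would be the scalar passed to `loss.backward()` before the optimizer step.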
3.2. Hybrid Loss for MAUNet
3.3. Network Structure and Loss of SASAPD
Algorithm 2 The learning process of SASAPD
Input: The training images X, the max-epochs E, the number of batches B, the testing image X_t, and the ground-truth labels G
Output: The output prediction P and its performance results Dice, Precision, and Recall
All the images are preprocessed according to the steps in Section 4.2.
Training Stage:
Initialize the network weights, learning rate, batch size, and other parameters;
for e = 1; e ≤ E; e++ do
    for b = 1; b ≤ B; b++ do
        Get the data batch X_b from X;
        Compute the loss function (Equation (7));
        Assign each instance to the pyramid level that has the minimal loss;
        Train SASAPD by optimizing the loss (Equation (10)) and update the weights and parameters;
    end for
end for
for e = 1; e ≤ E; e++ do
    for b = 1; b ≤ B; b++ do
        Get the data batch X_b from X;
        Compute the loss function (Equation (7));
        Train SASAPD by optimizing the loss (Equation (10)) and update the weights and parameters;
    end for
end for
Testing Stage:
Feed X_t into the well-trained SASAPD and output the prediction P;
Compute the performance results Dice (Equation (14)), Precision (Equation (17)), and Recall (Equation (18)), and the running time T;
return P, Dice, Precision, Recall, and T.
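The first training stage of Algorithm 2 assigns each instance to the feature-pyramid level whose loss is minimal. Below is a small sketch of that selection step, assuming the per-level losses have already been computed; the `loss_at_level` helper in the usage comment is hypothetical.

```python
import torch

def assign_pyramid_levels(per_level_losses: torch.Tensor) -> torch.Tensor:
    """Pick, for each instance, the pyramid level with the minimal loss.

    per_level_losses: tensor of shape (num_instances, num_levels), where
    entry (i, l) is the detection loss of instance i evaluated at level l.
    Returns a (num_instances,) tensor of selected level indices.
    """
    return per_level_losses.argmin(dim=1)

# Hypothetical usage inside the first training stage:
# losses = torch.stack([loss_at_level(batch, l) for l in range(num_levels)], dim=1)
# levels = assign_pyramid_levels(losses)  # instance-to-level assignment
```

The hard argmin shown here matches the "minimal loss" rule stated in the algorithm; soft anchor-point detectors typically relax it into soft per-level weights during training.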
4. Experiments
4.1. Dataset and Data Preparation
4.2. Dataset Preprocessing
4.3. Performance Evaluation Metrics
4.3.1. Evaluation Metrics for Segmentation
4.3.2. Evaluation Metrics for Object Detection
4.4. Learning Parameters and Training Details
4.5. Experiments on Dataset SPMID
4.6. Experiments on Dataset MPMID
5. Results and Discussion
5.1. Analysis of Segmentation Results on Dataset SPMID
5.1.1. Discussion about Ablation Study on Dataset SPMID
5.1.2. Discussion about User Study on Dataset SPMID
5.2. Analysis of Detection Results on Dataset MPMID
5.2.1. Discussion about Ablation Study on Dataset MPMID
5.2.2. Discussion about User Study on Dataset MPMID
6. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Chen, Y.; Chen, J. A watershed segmentation algorithm based on ridge detection and rapid region merging. In Proceedings of the 2014 IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC), Guilin, China, 5–8 August 2014; pp. 420–424.
2. Han, Y.; Lai, C.; Wang, B.; Gu, H. Segmenting images with complex textures by using hybrid algorithm. J. Electron. Imaging 2019, 28, 013030.
3. Chen, L.; Han, Y.; Cui, B.; Guan, Y.; Luo, Y. Two-dimensional fuzzy clustering algorithm (2DFCM) for metallographic image segmentation based on spatial information. In Proceedings of the 2015 2nd International Conference on Information Science and Control Engineering, Shanghai, China, 24–26 April 2015; pp. 519–521.
4. Lai, C.; Song, L.; Han, Y.; Li, Q.; Gu, H.; Wang, B.; Qian, Q.; Chen, W. Material image segmentation with the machine learning method and complex network method. MRS Adv. 2019, 4, 1119–1124.
5. Li, M.; Chen, D.; Liu, S.; Guo, D. Online learning method based on support vector machine for metallographic image segmentation. Signal Image Video Process. 2020.
6. Sanchez-Lengeling, B.; Aspuru-Guzik, A. Inverse molecular design using machine learning: Generative models for matter engineering. Science 2018, 361, 360–365.
7. Lin, J.; Ma, L.; Yao, Y. Segmentation of casting defect regions for the extraction of microstructural properties. Eng. Appl. Artif. Intell. 2019, 85, 150–163.
8. Wu, W.H.; Lee, J.C.; Wang, Y.M. A Study of Defect Detection Techniques for Metallographic Images. Sensors 2020, 20, 5593.
9. Chen, Y.; Jin, W.; Wang, M. Metallographic image segmentation of GCr15 bearing steel based on CGAN. Int. J. Appl. Electromagn. Mech. 2020, 64, 1237–1243.
10. Huang, Z.; Huang, L.; Gong, Y.; Huang, C.; Wang, X. Mask scoring R-CNN. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 6409–6418.
11. He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2961–2969.
12. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 39, 1137–1149.
13. Redmon, J.; Farhadi, A. YOLOv3: An incremental improvement. arXiv 2018, arXiv:1804.02767.
14. Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y.M. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv 2020, arXiv:2004.10934.
15. Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single shot multibox detector. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; pp. 21–37.
16. Tian, Z.; Chen, D.; Liu, S.; Liu, F. DexiNed-based Aluminum Alloy Grain Boundary Detection Algorithm. In Proceedings of the 2020 Chinese Control and Decision Conference (CCDC), Hefei, China, 22–24 August 2020; pp. 5647–5652.
17. Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2980–2988.
18. Zhu, C.; He, Y.; Savvides, M. Feature selective anchor-free module for single-shot object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 840–849.
19. Xu, K.; Guan, K.; Peng, J.; Luo, Y.; Wang, S. DeepMask: An algorithm for cloud and cloud shadow detection in optical satellite remote sensing images using deep residual network. arXiv 2019, arXiv:1911.03607.
20. Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 3431–3440.
21. Chen, L.C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40, 834–848.
22. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional networks for biomedical image segmentation. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany, 5–9 October 2015; pp. 234–241.
23. Schonfeld, E.; Schiele, B.; Khoreva, A. A U-Net based discriminator for generative adversarial networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 8207–8216.
24. Azad, R.; Asadi-Aghbolaghi, M.; Fathy, M.; Escalera, S. Bi-Directional ConvLSTM U-Net with Densley Connected Convolutions. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, Seoul, Korea, 27–28 October 2019.
25. Chen, D.; Guo, D.; Liu, S.; Liu, F. Microstructure Instance Segmentation from Aluminum Alloy Metallographic Image Using Different Loss Functions. Symmetry 2020, 12, 639.
26. Li, Y.; Chen, Y.; Wang, N.; Zhang, Z. Scale-aware trident networks for object detection. In Proceedings of the IEEE International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019; pp. 6054–6063.
27. Law, H.; Deng, J. CornerNet: Detecting objects as paired keypoints. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 734–750.
28. Zhou, X.; Zhuo, J.; Krahenbuhl, P. Bottom-up object detection by grouping extreme and center points. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 850–859.
29. Tian, Z.; Shen, C.; Chen, H.; He, T. FCOS: Fully convolutional one-stage object detection. In Proceedings of the IEEE International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019; pp. 9627–9636.
30. Zhu, C.; Chen, F.; Shen, Z.; Savvides, M. Soft anchor-point object detection. arXiv 2019, arXiv:1911.12448.
31. Zhu, B.; Wang, J.; Jiang, Z.; Zong, F.; Liu, S.; Li, Z.; Sun, J. AutoAssign: Differentiable Label Assignment for Dense Object Detection. arXiv 2020, arXiv:2007.03496.
32. Norman, B.; Pedoia, V.; Majumdar, S. Use of 2D U-Net convolutional neural networks for automated cartilage and meniscus segmentation of knee MR imaging data to determine relaxometry and morphometry. Radiology 2018, 288, 177–185.
33. Luo, L.; Chen, D.; Xue, D. Retinal blood vessels semantic segmentation method based on modified U-Net. In Proceedings of the 2018 Chinese Control and Decision Conference (CCDC), Shenyang, China, 9–11 June 2018; pp. 1892–1895.
34. Han, Y.; Ye, J.C. Framing U-Net via deep convolutional framelets: Application to sparse-view CT. IEEE Trans. Med. Imaging 2018, 37, 1418–1429.
35. Ye, J.C.; Han, Y.; Cha, E. Deep convolutional framelets: A general deep learning framework for inverse problems. SIAM J. Imaging Sci. 2018, 11, 991–1048.
36. Woo, S.; Park, J.; Lee, J.Y.; So Kweon, I. CBAM: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
37. Rahman, M.A.; Wang, Y. Optimizing intersection-over-union in deep neural networks for image segmentation. In Proceedings of the International Symposium on Visual Computing, Las Vegas, NV, USA, 12–14 December 2016; pp. 234–244.
38. Zhang, J.; Shen, X.; Zhuo, T.; Zhou, H. Brain tumor segmentation based on refined fully convolutional neural networks with a hierarchical dice loss. arXiv 2017, arXiv:1712.09093.
39. Chattopadhyay, S.; Basak, H. Multi-scale Attention U-Net (MsAUNet): A Modified U-Net Architecture for Scene Segmentation. arXiv 2020, arXiv:2009.06911.
40. Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature pyramid networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2117–2125.
41. Kong, T.; Sun, F.; Liu, H.; Jiang, Y.; Li, L.; Shi, J. FoveaBox: Beyound anchor-based object detection. IEEE Trans. Image Process. 2020, 29, 7389–7398.
42. Zhou, Y.; Dou, Y. Double Weighted RPCA Denoising Algorithm for Color Images. In Proceedings of the 2018 IEEE 4th International Conference on Computer and Communications (ICCC), Chengdu, China, 7–10 December 2018; pp. 1670–1674.
43. Wang, S.; Xia, K.; Wang, L.; Zhang, J.; Yang, H. Improved RPCA method via weighted non-convex regularization for image denoising. IET Signal Process. 2020, 14, 269–277.
44. Xie, S.; Girshick, R.; Dollár, P.; Tu, Z.; He, K. Aggregated residual transformations for deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1492–1500.
45. Seo, H.; Huang, C.; Bassenne, M.; Xiao, R.; Xing, L. Modified U-Net (mU-Net) With Incorporation of Object-Dependent High Level Features for Improved Liver and Liver-Tumor Segmentation in CT Images. IEEE Trans. Med. Imaging 2020, 39, 1316–1325.
46. Zhou, Z.; Siddiquee, M.M.R.; Tajbakhsh, N.; Liang, J. UNet++: Redesigning skip connections to exploit multiscale features in image segmentation. IEEE Trans. Med. Imaging 2019, 39, 1856–1867.
47. Li, C.; Tan, Y.; Chen, W.; Luo, X.; He, Y.; Gao, Y.; Li, F. ANU-Net: Attention-based Nested U-Net to exploit full resolution features for medical image segmentation. Comput. Graph. 2020, 90, 11–20.
48. Sun, J.; Darbehani, F.; Zaidi, M.; Wang, B. SAUNet: Shape Attentive U-Net for Interpretable Medical Image Segmentation. arXiv 2020, arXiv:2001.07645.
49. Chen, L.C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 801–818.
50. Li, X.; Wang, W.; Wu, L.; Chen, S.; Hu, X.; Li, J.; Tang, J.; Yang, J. Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection. arXiv 2020, arXiv:2006.04388.
Ref. | Proposed | Finding | Limitation
---|---|---|---
[7] | 3D CNN | Proposes a 3D CNN to extract microstructural features. | High computation cost.
[20] | FCN | Adapts contemporary classification networks (AlexNet and GoogLeNet) into FCNs. | Needs an extra fine-tuning layer for post-processing.
[21] | DeepLab | Utilizes atrous spatial pyramid pooling and multi-scale atrous pyramid features with an encoder-decoder. | High computation cost.
[22] | U-Net | Uses a contracting path to capture context and a symmetric expanding path that enables precise localization. | High-frequency information in the skip connections is lost.
[23] | U-Net-based GAN | Adapts per-pixel feedback to the generator and a per-pixel consistency regularization technique. | High-frequency information in the skip connections is lost.
[24] | BCDU-Net | Extends U-Net with BConvLSTM and densely connected convolutional blocks. | The dense blocks bring high computation cost.
[25] | Mask R-CNN | Uses Mask R-CNN for instance segmentation with different loss functions. | Complex candidate-generation procedure.
[14] | YOLOv4 | Applies a collection of training tricks to YOLOv3. | Heavily dependent on pre-defined anchors; poor performance on tiny objects.
[26] | TridentNet | Constructs a parallel multi-branch architecture where each branch shares the same parameters. | Treats all scales equally.
[27] | CornerNet | Reformulates the detection problem as locating key points of the bounding boxes. | The corner points still model a bounding box.
[28] | ExtremeNet | Locates the extreme points of objects with supervision from ground-truth mask annotations. | Relies on handcrafted clustering to compose whole objects.
[29] | FCOS | Regresses the distances to the four sides from center points to form the final bounding boxes. | Better performance comes at a high computation cost.
[18] | FSAF | Applies online feature selection to train anchor-free branches in the feature pyramid. | Only selects a single optimal feature level for each instance.
[30] | SAPD | Assigns the optimal feature level to each sample based on the loss distribution in object detection. | Fails to obtain discriminable features when sample weighting is poor.
[31] | AutoAssign | Determines positive/negative samples by generating proper weights to modify each location's prediction. | Fails to output satisfactory results when objects have similar appearances and shapes.
Carbon Segregation | A | B | C | D | E
---|---|---|---|---|---
1 | 345 | 432 | 341 | 456 | 298
2 | 353 | 451 | 357 | 419 | 370
3 | 367 | 386 | 394 | 373 | 451
4 | 346 | 401 | 410 | 402 | 269
5 | 296 | 447 | 391 | 323 | 355
F | P | Segband | Up B | Background | Total
---|---|---|---|---|---
632,880 | 369,920 | 68,560 | 25,200 | 417,520 | 1,514,080
41.80% | 24.43% | 4.53% | 1.66% | 27.58% | -
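Class-frequency statistics of this kind can be read directly off integer-coded label masks. The following is a small NumPy sketch of that computation; the class names and the integer encoding are assumptions for illustration, not the dataset's actual encoding.

```python
import numpy as np

def class_pixel_stats(label_mask: np.ndarray, class_names: list) -> dict:
    """Count pixels per class and their share of the total.

    label_mask: integer array whose values index into class_names.
    Returns {class_name: (pixel_count, fraction_of_total)}.
    """
    counts = np.bincount(label_mask.ravel(), minlength=len(class_names))
    total = counts.sum()
    return {name: (int(c), c / total) for name, c in zip(class_names, counts)}

# Hypothetical usage with the classes of the table above:
# stats = class_pixel_stats(mask, ["F", "P", "Segband", "Up B", "Background"])
```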
Model | Dice | IoU | RoC | Params | Running Time (s)
---|---|---|---|---|---
U-Net | 0.786 | 0.645 | 0.981 | 7.8 M | 4.86
MAUNet (Dual) | 0.836 | 0.715 | 1.079 | 7.9 M | 4.92
MAUNet- | 0.934 | 0.873 | 1.260 | 8.2 M | 5.17
SAUNet | 0.831 | 0.711 | 1.037 | 9.3 M | 4.99
UNet++ | 0.875 | 0.777 | 1.264 | 9.0 M | 8.73
ANU-Net | 0.906 | 0.828 | 1.149 | 8.9 M | 6.42
mU-Net | 0.940 | 0.886 | 1.257 | 8.5 M | 5.25
DeepLab V3+ | 0.793 | 0.652 | 1.037 | 352.5 M | 16.76
MAUNet (Ours) | 0.963 | 0.923 | 1.257 | 8.8 M | 5.02
Model | Backbone | Number of Phases | Anchor-Free | Dice (F) | Precision (F) | Recall (F) | Dice (P) | Precision (P) | Recall (P) | FPS
---|---|---|---|---|---|---|---|---|---|---
SAPD (SRS) | X-101-32x4d-DCN | one | yes | 0.918 | 0.928 | 0.908 | 0.911 | 0.943 | 0.881 | 25
SAPD (3S) | X-101-32x4d-DCN | one | yes | 0.932 | 0.942 | 0.921 | 0.931 | 0.956 | 0.905 | 22
SAPD | X-101-32x4d-DCN | one | yes | 0.876 | 0.893 | 0.857 | 0.887 | 0.913 | 0.865 | 28
SASAPD | X-101-32x4d-DCN | one | yes | 0.963 | 0.971 | 0.954 | 0.947 | 0.967 | 0.928 | 20
AutoAssign | X-101-32x4d-DCN | one | yes | 0.951 | 0.964 | 0.938 | 0.937 | 0.958 | 0.914 | 20
YOLOv4 | EfficientNet-B3 | one | no | 0.943 | 0.953 | 0.930 | 0.931 | 0.951 | 0.911 | 31
ATSS+GFL | X-101-32x4d-DCN | one | yes | 0.914 | 0.934 | 0.895 | 0.918 | 0.936 | 0.903 | 18
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).