DOI: 10.1145/3503161.3548310
research-article

Image-Signal Correlation Network for Textile Fiber Identification

Published: 10 October 2022 Publication History

Abstract

Identifying fiber composition is an important task in the textile industry. In recent decades, near-infrared (NIR) spectroscopy has shown its potential for the automatic detection of fiber components. However, plant fibers such as cotton and linen share the same chemical composition, so their absorption spectra are very similar. This leads to the problem of "different materials with the same spectrum, and the same material with different spectra", which makes it difficult to capture features that distinguish these fibers from NIR signals alone. To work around this, textile experts measure the cross-sectional or longitudinal characteristics of fibers under a microscope to determine fiber content, which is a destructive procedure. In this paper, we construct the first NIR signal-microscope image textile fiber composition dataset (NIRITFC). Based on NIRITFC, we propose an image-signal correlation network (ISiC-Net) with two modules, image-signal correlation perception and image-signal correlation attention, that integrate visual features (especially the local texture details of fibers) with the finer absorption spectrum information of the NIR signal, capturing deep abstract features of the bimodal data for nondestructive textile fiber identification. To better learn the spectral characteristics of the fiber components, the endmember vectors of the corresponding fibers are generated by embedding encoding, and a reconstruction loss guides the model to reconstruct the NIR signals of the corresponding fiber components through a nonlinear mapping. The quantitative and qualitative results improve significantly over both single-modal and bimodal baselines, indicating the great potential of combining microscopic images and NIR signals for textile fiber composition identification.
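The abstract does not give the internals of ISiC-Net, so the following is only a minimal illustrative sketch of the two ideas it names: attention that correlates NIR band features with image patch features, and a reconstruction loss over endmember spectra passed through a nonlinear mapping. All function names, the scaled dot-product form of the attention, the `tanh` mixing function, and the toy numbers are assumptions made for illustration and are not taken from the paper.

```python
import math
import random

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_attention(queries, keys_values):
    """Each query vector (here: an NIR band feature) attends over all
    key/value vectors (here: image patch features). Returns the fused
    vectors and the attention weights. Assumed form: scaled dot-product."""
    dim = len(queries[0])
    fused, weights = [], []
    for q in queries:
        scores = [dot(q, k) / math.sqrt(dim) for k in keys_values]
        w = softmax(scores)
        fused.append([sum(w[j] * keys_values[j][d] for j in range(len(keys_values)))
                      for d in range(dim)])
        weights.append(w)
    return fused, weights

def reconstruct_spectrum(proportions, endmembers):
    """Hypothetical nonlinear mixing: a proportion-weighted sum of
    endmember spectra followed by tanh (an assumption, not the paper's mapping)."""
    n_bands = len(endmembers[0])
    return [math.tanh(sum(p * e[b] for p, e in zip(proportions, endmembers)))
            for b in range(n_bands)]

def reconstruction_loss(predicted, observed):
    """Mean squared error between reconstructed and observed NIR signals."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)

random.seed(0)
nir_tokens = [[random.gauss(0, 1) for _ in range(4)] for _ in range(3)]   # 3 toy band features
img_patches = [[random.gauss(0, 1) for _ in range(4)] for _ in range(5)]  # 5 toy patch features
fused, attn = cross_attention(nir_tokens, img_patches)

endmembers = [[0.2, 0.5, 0.1], [0.7, 0.1, 0.3]]  # toy spectra for two fiber types
observed = reconstruct_spectrum([0.6, 0.4], endmembers)
loss_true = reconstruction_loss(reconstruct_spectrum([0.6, 0.4], endmembers), observed)
loss_wrong = reconstruction_loss(reconstruct_spectrum([0.1, 0.9], endmembers), observed)
```

In this toy setup the loss is zero when the predicted fiber proportions match those that generated the observed signal and positive otherwise, which is the kind of signal a reconstruction objective would give a model during training.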

Supplementary Material

MP4 File (mmfp2448.mp4)
Thank you for watching this video. My name is Peng Bo, and I am a PhD candidate at Fudan University. The paper I will be presenting today is titled Image-Signal Correlation Network for Textile Fiber Identification. In this paper, we consider joint modeling of NIR signals and images to improve the accuracy of textile fiber identification, and we propose an image-signal correlation network (ISiC-Net) for nondestructive textile fiber identification that fuses NIR signals and microscopic images. The design of ISiC-Net is inspired by how human experts identify fibers in the textile field. That is, the model can learn the sensitive bands of the NIR signals for different materials and find local features in the image at the same time.

Information & Contributors

Information

Published In

MM '22: Proceedings of the 30th ACM International Conference on Multimedia
October 2022
7537 pages
ISBN: 9781450392037
DOI: 10.1145/3503161

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. NIR signal processing
  2. attention
  3. computer vision
  4. multimodal data processing

Qualifiers

  • Research-article

Funding Sources

  • Natural Science Foundation of China
  • Zhongshan science and technology development project

Conference

MM '22

Acceptance Rates

Overall Acceptance Rate 995 of 4,171 submissions, 24%
