DOI: 10.1145/3664647.3681154
research-article

Multi-view X-ray Image Synthesis with Multiple Domain Disentanglement from CT Scans

Published: 28 October 2024

Abstract

X-ray images play a vital role in intraoperative procedures owing to their high resolution and fast imaging speed, and they greatly facilitate subsequent segmentation, registration, and reconstruction. However, excessive X-ray exposure poses potential risks to human health. Data-driven algorithms that generate X-ray images from volume scans are restricted by the scarcity of paired X-ray and volume data, and existing methods mainly work by modelling the whole X-ray imaging procedure. In this study, we propose a learning-based approach, termed CT2X-GAN, that synthesizes X-ray images in an end-to-end manner using content and style disentanglement across three different image domains. Our method decouples the anatomical structure information from CT scans and the style information from unpaired real X-ray images and digitally reconstructed radiography (DRR) images via a series of decoupling encoders. Additionally, we introduce a novel consistency regularization term to improve the stylistic resemblance between synthesized and real X-ray images. We also impose a supervised process that computes the similarity between the computed real DRR images and the synthesized DRR images. We further develop a pose attention module that strengthens the information carried by the decoupled content code from CT scans, facilitating high-quality multi-view image synthesis in the lower-dimensional 2D space. Extensive experiments on the publicly available CTSpine1K dataset achieved an FID of 97.8350, a KID of 0.0842, and a defined user-scored X-ray similarity of 3.0938. Compared with 3D-aware methods (π-GAN, EG3D), CT2X-GAN achieves higher synthesis quality and greater realism relative to real X-ray images.
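The abstract reports FID and KID without defining them. As a rough illustration only, the sketch below computes both quantities from two sets of feature vectors using NumPy. It is not the authors' evaluation code: the function names `frechet_distance` and `kernel_inception_distance` are hypothetical, the features are random stand-ins (in practice they are usually taken from a pretrained Inception network), and the KID here is a single unbiased MMD estimate with the commonly used cubic polynomial kernel rather than an average over subsets.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets.

    Rows are samples, columns are feature dimensions (at least 2 columns).
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative for
    # positive semi-definite inputs (up to round-off, hence the clip).
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)


def kernel_inception_distance(feats_a, feats_b):
    """Unbiased squared MMD with the kernel k(x, y) = (x.y / d + 1)^3."""
    d = feats_a.shape[1]
    m, n = len(feats_a), len(feats_b)

    def kernel(x, y):
        return (x @ y.T / d + 1.0) ** 3

    k_aa, k_bb, k_ab = kernel(feats_a, feats_a), kernel(feats_b, feats_b), kernel(feats_a, feats_b)
    # Exclude the diagonal (self-similarity) terms for the unbiased estimate.
    term_a = (k_aa.sum() - np.trace(k_aa)) / (m * (m - 1))
    term_b = (k_bb.sum() - np.trace(k_bb)) / (n * (n - 1))
    return float(term_a + term_b - 2.0 * k_ab.mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real = rng.normal(size=(200, 4))         # stand-in for real X-ray features
    fake = rng.normal(size=(200, 4)) + 0.5   # stand-in for synthesized features
    print("FID-style distance:", frechet_distance(real, fake))
    print("KID-style distance:", kernel_inception_distance(real, fake))
```

For both metrics, lower values indicate that the synthesized feature distribution lies closer to the real one.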


    Published In

    MM '24: Proceedings of the 32nd ACM International Conference on Multimedia
    October 2024
    11719 pages
    ISBN: 9798400706868
    DOI: 10.1145/3664647
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. generative adversarial networks
    2. image synthesis
    3. multi-domains
    4. style disentanglement
    5. x-ray

    Conference

    MM '24: The 32nd ACM International Conference on Multimedia
    October 28 - November 1, 2024
    Melbourne VIC, Australia

    Acceptance Rates

    MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;
    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
