
Sprite-from-Sprite: Cartoon Animation Decomposition with Self-supervised Sprite Estimation

Published: 30 November 2022

Abstract

We present an approach to decompose cartoon animation videos into a set of "sprites", the basic units of digital cartoons that depict the content and transforms of each animated object. The sprites in real-world cartoons are unique: artists may draw arbitrary sprite animations for expressiveness, where the animated content is often complicated, irregular, and challenging; alternatively, artists may reduce their workload by tweening and adjusting sprites, or even by reusing static sprites, in which case the transformations are relatively regular and simple. Based on these observations, we propose a sprite decomposition framework using Pixel Multilayer Perceptrons (Pixel MLPs) in which the estimation of each sprite is conditioned on and guided by all other sprites. In this way, once the relatively regular and simple sprites are resolved, the decomposition of the remaining "challenging" sprites is simplified and eased by the guidance of the others. We call this method "sprite-from-sprite" cartoon decomposition. We study ablative architectures of our framework, and a user study shows that our results are the most preferred in 19 of 20 cases.
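To make the per-pixel formulation concrete, the sketch below shows one way a "Pixel MLP" could be set up in PyTorch: a small network queried per pixel predicts one sprite's RGBA at a given location and time, conditioned on the RGBA contributed by the other sprites at the same pixel. The class name, Fourier encoding, layer sizes, and conditioning format are illustrative assumptions for exposition, not the paper's exact architecture.

```python
# Minimal sketch (assumed setup, not the authors' released code): a per-pixel MLP
# that predicts one sprite's RGBA, conditioned on the other sprites' RGBA at that pixel.
import math
import torch
import torch.nn as nn

class PixelSpriteMLP(nn.Module):
    def __init__(self, num_freqs: int = 6, cond_dim: int = 4, hidden: int = 128):
        super().__init__()
        self.num_freqs = num_freqs
        # Input: Fourier-encoded (x, y, t) plus the other sprites' RGBA at this pixel.
        in_dim = 3 * 2 * num_freqs + cond_dim
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # RGB + alpha for this sprite
        )

    def encode(self, coords: torch.Tensor) -> torch.Tensor:
        # Standard Fourier positional encoding of (x, y, t) assumed in [-1, 1].
        freqs = (2.0 ** torch.arange(self.num_freqs, device=coords.device)) * math.pi
        angles = coords[..., None] * freqs                      # (N, 3, F)
        enc = torch.cat([angles.sin(), angles.cos()], dim=-1)   # (N, 3, 2F)
        return enc.flatten(start_dim=-2)                        # (N, 6F)

    def forward(self, coords: torch.Tensor, other_sprites_rgba: torch.Tensor) -> torch.Tensor:
        # coords: (N, 3) pixel position and time; other_sprites_rgba: (N, 4) conditioning signal.
        x = torch.cat([self.encode(coords), other_sprites_rgba], dim=-1)
        return torch.sigmoid(self.net(x))                       # (N, 4) RGBA in [0, 1]

# Illustrative usage: query 1024 random pixels, conditioned on other sprites' RGBA.
mlp = PixelSpriteMLP()
rgba = mlp(torch.rand(1024, 3) * 2 - 1, torch.rand(1024, 4))
print(rgba.shape)  # torch.Size([1024, 4])
```

The conditioning input is what lets the easier, more regular sprites guide the estimation of the harder ones once they have been resolved; how that guidance is scheduled and composited is left abstract here.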



      Published In

ACM Transactions on Graphics, Volume 41, Issue 6
December 2022, 1428 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/3550454

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 30 November 2022
      Published in TOG Volume 41, Issue 6


      Author Tags

      1. animation
      2. cartoon
      3. implicit neural representations
      4. video decomposition

      Qualifiers

      • Research-article

      Funding Sources

      • RGC General Research Fund
