Paint Transformer: Feed Forward Neural Painting with Stroke Prediction

Liu, Songhua; Lin, Tianwei; He, Dongliang; Li, Fu; Deng, Ruifeng; Li, Xin; Ding, Errui; Wang, Hao

arXiv:2108.03798 (cs)

[Submitted on 9 Aug 2021 (v1), last revised 11 Aug 2021 (this version, v2)]

Abstract: Neural painting refers to the procedure of producing a series of strokes for a given image and non-photo-realistically recreating it using neural networks. While reinforcement learning (RL) based agents can generate a stroke sequence step by step for this task, it is not easy to train a stable RL agent. On the other hand, stroke optimization methods search for a set of stroke parameters iteratively in a large search space; such low efficiency significantly limits their prevalence and practicality. Different from previous methods, in this paper, we formulate the task as a set prediction problem and propose a novel Transformer-based framework, dubbed Paint Transformer, to predict the parameters of a stroke set with a feed forward network. This way, our model can generate a set of strokes in parallel and obtain the final painting of size 512 × 512 in near real time. More importantly, since there is no dataset available for training the Paint Transformer, we devise a self-training pipeline such that it can be trained without any off-the-shelf dataset while still achieving excellent generalization capability. Experiments demonstrate that our method achieves better painting performance than previous ones with cheaper training and inference costs. Code and models are available.
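
The abstract frames stroke generation as a set prediction problem, meaning the order in which strokes are predicted should not affect how they are scored. The following is a minimal sketch of that idea only, not the authors' implementation: the five-value stroke tuple, the L1 distance, and the function names are all assumptions made for illustration, and the brute-force matching is a stand-in for whatever assignment procedure the paper actually uses. Targets are sampled at random rather than read from a dataset, loosely in the spirit of the self-training pipeline the abstract mentions.

```python
from itertools import permutations
import random


def stroke_distance(a, b):
    """L1 distance between two stroke parameter tuples (assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def set_prediction_loss(predicted, target):
    """Mean stroke distance under the best one-to-one assignment.

    Brute force over all permutations, so only suitable for tiny sets.
    """
    assert len(predicted) == len(target)
    n = len(target)
    if n == 0:
        return 0.0
    best = min(
        sum(stroke_distance(predicted[i], target[j]) for i, j in enumerate(perm))
        for perm in permutations(range(n))
    )
    return best / n


def random_stroke(rng):
    """Hypothetical stroke: (x, y, width, height, angle), each in [0, 1)."""
    return tuple(rng.random() for _ in range(5))


rng = random.Random(0)

# Synthetic target set: no external dataset is needed to produce it.
target = [random_stroke(rng) for _ in range(4)]

# The same strokes in a different order should score as a perfect prediction.
shuffled = target[:]
rng.shuffle(shuffled)
loss_same_set = set_prediction_loss(shuffled, target)

# Shifting every parameter by 0.1 should give a small positive loss.
perturbed = [tuple(v + 0.1 for v in s) for s in shuffled]
loss_perturbed = set_prediction_loss(perturbed, target)
```

Because the loss minimizes over assignments, a predictor is free to emit strokes in any order, which is what allows a whole set to be produced in parallel rather than one step at a time. For realistic set sizes, a polynomial-time assignment solver would replace the permutation search.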

Comments:	Accepted by ICCV 2021 (oral). Codes will be released on this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2108.03798 [cs.CV]
	(or arXiv:2108.03798v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2108.03798

Submission history

From: Tianwei Lin
[v1] Mon, 9 Aug 2021 04:18:58 UTC (11,325 KB)
[v2] Wed, 11 Aug 2021 13:09:55 UTC (11,325 KB)
