Showing 1–50 of 75 results for author: Paudel, P

Searching in archive cs.
  1. arXiv:2409.07965 [pdf, other]

    cs.AI cs.RO

    Autonomous Vehicle Controllers From End-to-End Differentiable Simulation

    Authors: Asen Nachkov, Danda Pani Paudel, Luc Van Gool

    Abstract: Current methods to learn controllers for autonomous vehicles (AVs) focus on behavioural cloning. Being trained only on exact historic data, the resulting agents often generalize poorly to novel scenarios. Simulators provide the opportunity to go beyond offline datasets, but they are still treated as complicated black boxes, only used to update the global simulation state. As a result, these RL alg…

    Submitted 12 September, 2024; originally announced September 2024.

  2. arXiv:2409.01690 [pdf, other]

    cs.CV cs.CL

    Taming CLIP for Fine-grained and Structured Visual Understanding of Museum Exhibits

    Authors: Ada-Astrid Balauca, Danda Pani Paudel, Kristina Toutanova, Luc Van Gool

    Abstract: CLIP is a powerful and widely used tool for understanding images in the context of natural language descriptions to perform nuanced tasks. However, it does not offer application-specific fine-grained and structured understanding, due to its generic nature. In this work, we aim to adapt CLIP for fine-grained and structured -- in the form of tabular data -- visual understanding of museum exhibits. T…

    Submitted 3 September, 2024; originally announced September 2024.

    Comments: Accepted to ECCV 2024

  3. arXiv:2408.16504 [pdf, other]

    cs.CV

    A Simple and Generalist Approach for Panoptic Segmentation

    Authors: Nedyalko Prisadnikov, Wouter Van Gansbeke, Danda Pani Paudel, Luc Van Gool

    Abstract: Generalist vision models aim for one and the same architecture for a variety of vision tasks. While such shared architecture may seem attractive, generalist models tend to be outperformed by their bespoken counterparts, especially in the case of panoptic segmentation. We address this problem by introducing two key contributions, without compromising the desirable properties of generalist models. T…

    Submitted 29 August, 2024; originally announced August 2024.

  4. arXiv:2408.10906 [pdf, other]

    cs.CV

    ShapeSplat: A Large-scale Dataset of Gaussian Splats and Their Self-Supervised Pretraining

    Authors: Qi Ma, Yue Li, Bin Ren, Nicu Sebe, Ender Konukoglu, Theo Gevers, Luc Van Gool, Danda Pani Paudel

    Abstract: 3D Gaussian Splatting (3DGS) has become the de facto method of 3D representation in many vision tasks. This calls for the 3D understanding directly in this representation space. To facilitate the research in this direction, we first build a large-scale dataset of 3DGS using the commonly used ShapeNet and ModelNet datasets. Our dataset ShapeSplat consists of 65K objects from 87 unique categories, w…

    Submitted 20 August, 2024; originally announced August 2024.

  5. arXiv:2408.09110 [pdf, other]

    cs.CV

    Locate Anything on Earth: Advancing Open-Vocabulary Object Detection for Remote Sensing Community

    Authors: Jiancheng Pan, Yanxing Liu, Yuqian Fu, Muyuan Ma, Jiaohao Li, Danda Pani Paudel, Luc Van Gool, Xiaomeng Huang

    Abstract: Object detection, particularly open-vocabulary object detection, plays a crucial role in Earth sciences, such as environmental monitoring, natural disaster assessment, and land-use planning. However, existing open-vocabulary detectors, primarily trained on natural-world images, struggle to generalize to remote sensing images due to a significant data domain gap. Thus, this paper aims to advance th…

    Submitted 17 August, 2024; originally announced August 2024.

    Comments: 10 pages, 5 figures

  6. arXiv:2407.20987 [pdf, other]

    cs.CV cs.CY

    PIXELMOD: Improving Soft Moderation of Visual Misleading Information on Twitter

    Authors: Pujan Paudel, Chen Ling, Jeremy Blackburn, Gianluca Stringhini

    Abstract: Images are a powerful and immediate vehicle to carry misleading or outright false messages, yet identifying image-based misinformation at scale poses unique challenges. In this paper, we present PIXELMOD, a system that leverages perceptual hashes, vector databases, and optical character recognition (OCR) to efficiently identify images that are candidates to receive soft moderation labels on Twitte…

    Submitted 30 July, 2024; originally announced July 2024.

  7. arXiv:2407.20910 [pdf, other]

    cs.CL cs.CR

    Enabling Contextual Soft Moderation on Social Media through Contrastive Textual Deviation

    Authors: Pujan Paudel, Mohammad Hammas Saeed, Rebecca Auger, Chris Wells, Gianluca Stringhini

    Abstract: Automated soft moderation systems are unable to ascertain if a post supports or refutes a false claim, resulting in a large number of contextual false positives. This limits their effectiveness, for example undermining trust in health experts by adding warnings to their posts or resorting to vague warnings instead of granular fact-checks, which result in desensitizing users. In this paper, we prop…

    Submitted 30 July, 2024; originally announced July 2024.

  8. arXiv:2407.18098 [pdf, other]

    cs.CY cs.SI

    Unraveling the Web of Disinformation: Exploring the Larger Context of State-Sponsored Influence Campaigns on Twitter

    Authors: Mohammad Hammas Saeed, Shiza Ali, Pujan Paudel, Jeremy Blackburn, Gianluca Stringhini

    Abstract: Social media platforms offer unprecedented opportunities for connectivity and exchange of ideas; however, they also serve as fertile grounds for the dissemination of disinformation. Over the years, there has been a rise in state-sponsored campaigns aiming to spread disinformation and sway public opinion on sensitive topics through designated accounts, known as troll accounts. Past works on detecti…

    Submitted 25 July, 2024; originally announced July 2024.

    Journal ref: International Symposium on Research in Attacks, Intrusions and Defenses (RAID 2024)

  9. arXiv:2407.13372 [pdf, other]

    cs.CV

    Any Image Restoration with Efficient Automatic Degradation Adaptation

    Authors: Bin Ren, Eduard Zamfir, Yawei Li, Zongwei Wu, Danda Pani Paudel, Radu Timofte, Nicu Sebe, Luc Van Gool

    Abstract: With the emergence of mobile devices, there is a growing demand for an efficient model to restore any degraded image for better perceptual quality. However, existing models often require specific learning modules tailored for each degradation, resulting in complex architectures and high computation costs. Different from previous work, in this paper, we propose a unified manner to achieve joint emb…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: Efficient Any Image Restoration

  10. arXiv:2407.11174 [pdf, other]

    cs.CV cs.AI

    iHuman: Instant Animatable Digital Humans From Monocular Videos

    Authors: Pramish Paudel, Anubhav Khanal, Ajad Chhatkuli, Danda Pani Paudel, Jyoti Tandukar

    Abstract: Personalized 3D avatars require an animatable representation of digital humans. Doing so instantly from monocular videos offers scalability to broad class of users and wide-scale applications. In this paper, we present a fast, simple, yet effective method for creating animatable 3D digital humans from monocular videos. Our method utilizes the efficiency of Gaussian splatting to model both 3D geome…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: 15 pages, ECCV 2024

  11. arXiv:2407.05862 [pdf, other]

    cs.CV

    Bringing Masked Autoencoders Explicit Contrastive Properties for Point Cloud Self-Supervised Learning

    Authors: Bin Ren, Guofeng Mei, Danda Pani Paudel, Weijie Wang, Yawei Li, Mengyuan Liu, Rita Cucchiara, Luc Van Gool, Nicu Sebe

    Abstract: Contrastive learning (CL) for Vision Transformers (ViTs) in image domains has achieved performance comparable to CL for traditional convolutional backbones. However, in 3D point cloud pretraining with ViTs, masked autoencoder (MAE) modeling remains dominant. This raises the question: Can we take the best of both worlds? To answer this question, we first empirically validate that integrating MAE-ba…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: Bringing Masked Autoencoders Explicit Contrastive Properties for Point Cloud Self-Supervised Learning

  12. arXiv:2407.04577 [pdf, other]

    cs.IR

    Optimizing Nepali PDF Extraction: A Comparative Study of Parser and OCR Technologies

    Authors: Prabin Paudel, Supriya Khadka, Ranju G. C., Rahul Shah

    Abstract: This research compares PDF parsing and Optical Character Recognition (OCR) methods for extracting Nepali content from PDFs. PDF parsing offers fast and accurate extraction but faces challenges with non-Unicode Nepali fonts. OCR, specifically PyTesseract, overcomes these challenges, providing versatility for both digital and scanned PDFs. The study reveals that while PDF parsers are faster, their a…

    Submitted 9 July, 2024; v1 submitted 5 July, 2024; originally announced July 2024.

  13. arXiv:2406.17438 [pdf, other]

    cs.CV

    Implicit-Zoo: A Large-Scale Dataset of Neural Implicit Functions for 2D Images and 3D Scenes

    Authors: Qi Ma, Danda Pani Paudel, Ender Konukoglu, Luc Van Gool

    Abstract: Neural implicit functions have demonstrated significant importance in various areas such as computer vision, graphics. Their advantages include the ability to represent complex shapes and scenes with high fidelity, smooth interpolation capabilities, and continuous representations. Despite these benefits, the development and analysis of implicit functions have been limited by the lack of comprehens…

    Submitted 25 June, 2024; originally announced June 2024.

  14. arXiv:2405.17773 [pdf, other]

    cs.CV

    Towards a Generalist and Blind RGB-X Tracker

    Authors: Yuedong Tan, Zongwei Wu, Yuqian Fu, Zhuyun Zhou, Guolei Sun, Chao Ma, Danda Pani Paudel, Luc Van Gool, Radu Timofte

    Abstract: With the emergence of a single large model capable of successfully solving a multitude of tasks in NLP, there has been growing research interest in achieving similar goals in computer vision. On the one hand, most of these generic models, referred to as generalist vision models, aim at producing unified outputs serving different tasks. On the other hand, some existing models aim to combine differe…

    Submitted 27 May, 2024; originally announced May 2024.

  15. arXiv:2405.15475 [pdf, other]

    cs.CV

    Efficient Degradation-aware Any Image Restoration

    Authors: Eduard Zamfir, Zongwei Wu, Nancy Mehta, Danda Pani Paudel, Yulun Zhang, Radu Timofte

    Abstract: Reconstructing missing details from degraded low-quality inputs poses a significant challenge. Recent progress in image restoration has demonstrated the efficacy of learning large models capable of addressing various degradations simultaneously. Nonetheless, these approaches introduce considerable computational overhead and complex learning paradigms, limiting their practical utility. In response,…

    Submitted 1 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

  16. arXiv:2405.10233 [pdf, other]

    cs.SI cs.CY cs.IR

    iDRAMA-Scored-2024: A Dataset of the Scored Social Media Platform from 2020 to 2023

    Authors: Jay Patel, Pujan Paudel, Emiliano De Cristofaro, Gianluca Stringhini, Jeremy Blackburn

    Abstract: Online web communities often face bans for violating platform policies, encouraging their migration to alternative platforms. This migration, however, can result in increased toxicity and unforeseen consequences on the new platform. In recent years, researchers have collected data from many alternative platforms, indicating coordinated efforts leading to offline events, conspiracy movements, hate…

    Submitted 16 May, 2024; originally announced May 2024.

  17. arXiv:2312.15242 [pdf, other]

    cs.CV

    CaLDiff: Camera Localization in NeRF via Pose Diffusion

    Authors: Rashik Shrestha, Bishad Koju, Abhigyan Bhusal, Danda Pani Paudel, François Rameau

    Abstract: With the widespread use of NeRF-based implicit 3D representation, the need for camera localization in the same representation becomes manifestly apparent. Doing so not only simplifies the localization process -- by avoiding an outside-the-NeRF-based localization -- but also has the potential to offer the benefit of enhanced localization. This paper studies the problem of localizing cameras in NeRF…

    Submitted 23 December, 2023; originally announced December 2023.

  18. arXiv:2312.13332 [pdf, other]

    cs.CV

    Ternary-type Opacity and Hybrid Odometry for RGB-only NeRF-SLAM

    Authors: Junru Lin, Asen Nachkov, Songyou Peng, Luc Van Gool, Danda Pani Paudel

    Abstract: The opacity of rigid 3D scenes with opaque surfaces is considered to be of a binary type. However, we observed that this property is not followed by the existing RGB-only NeRF-SLAM. Therefore, we are motivated to introduce this prior into the RGB-only NeRF-SLAM pipeline. Unfortunately, the optimization through the volumetric rendering function does not facilitate easy integration of the desired pr…

    Submitted 22 December, 2023; v1 submitted 20 December, 2023; originally announced December 2023.

  19. arXiv:2312.11578 [pdf, other]

    cs.CV

    Diffusion-Based Particle-DETR for BEV Perception

    Authors: Asen Nachkov, Martin Danelljan, Danda Pani Paudel, Luc Van Gool

    Abstract: The Bird-Eye-View (BEV) is one of the most widely-used scene representations for visual perception in Autonomous Vehicles (AVs) due to its well suited compatibility to downstream tasks. For the enhanced safety of AVs, modeling perception uncertainty in BEV is crucial. Recent diffusion-based methods offer a promising approach to uncertainty modeling for visual perception but fail to effectively det…

    Submitted 18 December, 2023; originally announced December 2023.

  20. arXiv:2312.08558 [pdf, other]

    cs.CV

    G-MEMP: Gaze-Enhanced Multimodal Ego-Motion Prediction in Driving

    Authors: M. Eren Akbiyik, Nedko Savov, Danda Pani Paudel, Nikola Popovic, Christian Vater, Otmar Hilliges, Luc Van Gool, Xi Wang

    Abstract: Understanding the decision-making process of drivers is one of the keys to ensuring road safety. While the driver intent and the resulting ego-motion trajectory are valuable in developing driver-assistance systems, existing methods mostly focus on the motions of other vehicles. In contrast, we focus on inferring the ego trajectory of a driver's vehicle using their gaze data. For this purpose, we f…

    Submitted 13 December, 2023; originally announced December 2023.

  21. arXiv:2311.17119 [pdf, other]

    cs.CV

    Continuous Pose for Monocular Cameras in Neural Implicit Representation

    Authors: Qi Ma, Danda Pani Paudel, Ajad Chhatkuli, Luc Van Gool

    Abstract: In this paper, we showcase the effectiveness of optimizing monocular camera poses as a continuous function of time. The camera poses are represented using an implicit neural function which maps the given time to the corresponding camera pose. The mapped camera poses are then used for the downstream tasks where joint camera pose optimization is also required. While doing so, the network parameters…

    Submitted 2 March, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

  22. arXiv:2311.15851 [pdf, other]

    cs.CV

    Single-Model and Any-Modality for Video Object Tracking

    Authors: Zongwei Wu, Jilai Zheng, Xiangxuan Ren, Florin-Alexandru Vasluianu, Chao Ma, Danda Pani Paudel, Luc Van Gool, Radu Timofte

    Abstract: In the realm of video object tracking, auxiliary modalities such as depth, thermal, or event data have emerged as valuable assets to complement the RGB trackers. In practice, most existing RGB trackers learn a single set of parameters to use them across datasets and applications. However, a similar single-model unification for multi-modality tracking presents several challenges. These challenges s…

    Submitted 29 March, 2024; v1 submitted 27 November, 2023; originally announced November 2023.

    Comments: Accepted by CVPR2024

  23. arXiv:2311.13833 [pdf, other]

    cs.CV cs.CL cs.LG

    Lego: Learning to Disentangle and Invert Concepts Beyond Object Appearance in Text-to-Image Diffusion Models

    Authors: Saman Motamed, Danda Pani Paudel, Luc Van Gool

    Abstract: Diffusion models have revolutionized generative content creation and text-to-image (T2I) diffusion models in particular have increased the creative freedom of users by allowing scene synthesis using natural language. T2I models excel at synthesizing concepts such as nouns, appearances, and styles. To enable customized content creation based on a few example images of a concept, methods such as Tex…

    Submitted 23 November, 2023; originally announced November 2023.

  24. arXiv:2311.12157 [pdf, other]

    cs.CV

    Model-aware 3D Eye Gaze from Weak and Few-shot Supervisions

    Authors: Nikola Popovic, Dimitrios Christodoulou, Danda Pani Paudel, Xi Wang, Luc Van Gool

    Abstract: The task of predicting 3D eye gaze from eye images can be performed either by (a) end-to-end learning for image-to-gaze mapping or by (b) fitting a 3D eye model onto images. The former case requires 3D gaze labels, while the latter requires eye semantics or landmarks to facilitate the model fitting. Although obtaining eye semantics and landmarks is relatively easy, fitting an accurate 3D eye model…

    Submitted 20 November, 2023; originally announced November 2023.

    Comments: Accepted to ISMAR2023 as a poster paper

  25. arXiv:2309.08416 [pdf, other]

    cs.CV

    Deformable Neural Radiance Fields using RGB and Event Cameras

    Authors: Qi Ma, Danda Pani Paudel, Ajad Chhatkuli, Luc Van Gool

    Abstract: Modeling Neural Radiance Fields for fast-moving deformable objects from visual data alone is a challenging problem. A major issue arises due to the high deformation and low acquisition rates. To address this problem, we propose to use event cameras that offer very fast acquisition of visual change in an asynchronous manner. In this work, we develop a novel method to model the deformable neural rad…

    Submitted 25 September, 2023; v1 submitted 15 September, 2023; originally announced September 2023.

  26. arXiv:2307.13344 [pdf, other]

    cs.CV

    Prior Based Online Lane Graph Extraction from Single Onboard Camera Image

    Authors: Yigit Baran Can, Alexander Liniger, Danda Pani Paudel, Luc Van Gool

    Abstract: The local road network information is essential for autonomous navigation. This information is commonly obtained from offline HD-Maps in terms of lane graphs. However, the local road network at a given moment can be drastically different than the one given in the offline maps; due to construction works, accidents etc. Moreover, the autonomous vehicle might be at a location not covered in the offli…

    Submitted 25 July, 2023; originally announced July 2023.

    Comments: ITSC 2023

  27. arXiv:2307.10947 [pdf, other]

    cs.CV

    Improving Online Lane Graph Extraction by Object-Lane Clustering

    Authors: Yigit Baran Can, Alexander Liniger, Danda Pani Paudel, Luc Van Gool

    Abstract: Autonomous driving requires accurate local scene understanding information. To this end, autonomous agents deploy object detection and online BEV lane graph extraction methods as a part of their perception stack. In this work, we propose an architecture and loss formulation to improve the accuracy of local lane graph estimates by using 3D object detection outputs. The proposed method learns to ass…

    Submitted 27 September, 2023; v1 submitted 20 July, 2023; originally announced July 2023.

    Comments: ICCV 2023

  28. arXiv:2305.00126 [pdf, other]

    cs.CV cs.RO

    Event-Free Moving Object Segmentation from Moving Ego Vehicle

    Authors: Zhuyun Zhou, Zongwei Wu, Danda Pani Paudel, Rémi Boutteau, Fan Yang, Luc Van Gool, Radu Timofte, Dominique Ginhac

    Abstract: Moving object segmentation (MOS) in dynamic scenes is challenging for autonomous driving, especially for sequences obtained from moving ego vehicles. Most state-of-the-art methods leverage motion cues obtained from optical flow maps. However, since these methods are often based on optical flows that are pre-computed from successive RGB frames, this neglects the temporal consideration of events occ…

    Submitted 28 November, 2023; v1 submitted 28 April, 2023; originally announced May 2023.

  29. arXiv:2304.00930 [pdf, other]

    cs.CV

    Online Lane Graph Extraction from Onboard Video

    Authors: Yigit Baran Can, Alexander Liniger, Danda Pani Paudel, Luc Van Gool

    Abstract: Autonomous driving requires a structured understanding of the surrounding road network to navigate. One of the most common and useful representation of such an understanding is done in the form of BEV lane graphs. In this work, we use the video stream from an onboard camera for online extraction of the surrounding's lane graph. Using video, instead of a single image, as input poses both benefits a…

    Submitted 3 April, 2023; originally announced April 2023.

  30. arXiv:2303.12865 [pdf, other]

    cs.CV cs.GR cs.LG

    NeRF-GAN Distillation for Efficient 3D-Aware Generation with Convolutions

    Authors: Mohamad Shahbazi, Evangelos Ntavelis, Alessio Tonioni, Edo Collins, Danda Pani Paudel, Martin Danelljan, Luc Van Gool

    Abstract: Pose-conditioned convolutional generative models struggle with high-quality 3D-consistent image generation from single-view datasets, due to their lack of sufficient 3D priors. Recently, the integration of Neural Radiance Fields (NeRFs) and generative models, such as Generative Adversarial Networks (GANs), has transformed 3D-aware generation from single-view images. NeRF-GANs exploit the strong in…

    Submitted 24 July, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

  31. arXiv:2212.05926 [pdf, other]

    cs.CR cs.CY cs.SI

    LAMBRETTA: Learning to Rank for Twitter Soft Moderation

    Authors: Pujan Paudel, Jeremy Blackburn, Emiliano De Cristofaro, Savvas Zannettou, Gianluca Stringhini

    Abstract: To curb the problem of false information, social media platforms like Twitter started adding warning labels to content discussing debunked narratives, with the goal of providing more context to their audiences. Unfortunately, these labels are not applied uniformly and leave large amounts of false content unmoderated. This paper presents LAMBRETTA, a system that automatically identifies tweets that…

    Submitted 12 December, 2022; originally announced December 2022.

    Comments: 44th IEEE Symposium on Security & Privacy (S&P 2023)

  32. arXiv:2212.05370 [pdf, other]

    cs.CV

    Source-free Depth for Object Pop-out

    Authors: Zongwei Wu, Danda Pani Paudel, Deng-Ping Fan, Jingjing Wang, Shuo Wang, Cédric Demonceaux, Radu Timofte, Luc Van Gool

    Abstract: Depth cues are known to be useful for visual perception. However, direct measurement of depth is often impracticable. Fortunately, though, modern learning-based methods offer promising depth maps by inference in the wild. In this work, we adapt such depth inference models for object segmentation using the objects' "pop-out" prior in 3D. The "pop-out" is a simple composition prior that assumes obje…

    Submitted 25 September, 2023; v1 submitted 10 December, 2022; originally announced December 2022.

    Comments: Accepted to ICCV 2023

  33. arXiv:2212.01331 [pdf, other]

    cs.CV

    Surface Normal Clustering for Implicit Representation of Manhattan Scenes

    Authors: Nikola Popovic, Danda Pani Paudel, Luc Van Gool

    Abstract: Novel view synthesis and 3D modeling using implicit neural field representation are shown to be very effective for calibrated multi-view cameras. Such representations are known to benefit from additional geometric and semantic supervision. Most existing methods that exploit additional supervision require dense pixel-wise labels or localized scene priors. These methods cannot benefit from high-leve…

    Submitted 27 September, 2023; v1 submitted 2 December, 2022; originally announced December 2022.

    Comments: Paper accepted to ICCV23

  34. arXiv:2211.07491 [pdf, other]

    cs.CV

    Piecewise Planar Hulls for Semi-Supervised Learning of 3D Shape and Pose from 2D Images

    Authors: Yigit Baran Can, Alexander Liniger, Danda Pani Paudel, Luc Van Gool

    Abstract: We study the problem of estimating 3D shape and pose of an object in terms of keypoints, from a single 2D image. The shape and pose are learned directly from images collected by categories and their partial 2D keypoint annotations. In this work, we first propose an end-to-end training framework for intermediate 2D keypoints extraction and final 3D shape and pose estimation. The proposed framewo…

    Submitted 14 November, 2022; originally announced November 2022.

  35. arXiv:2208.01762 [pdf, other]

    cs.CV

    Robust RGB-D Fusion for Saliency Detection

    Authors: Zongwei Wu, Shriarulmozhivarman Gobichettipalayam, Brahim Tamadazte, Guillaume Allibert, Danda Pani Paudel, Cédric Demonceaux

    Abstract: Efficiently exploiting multi-modal inputs for accurate RGB-D saliency detection is a topic of high interest. Most existing works leverage cross-modal interactions to fuse the two streams of RGB-D for intermediate features' enhancement. In this process, a practical aspect of the low quality of the available depths has not been fully considered yet. In this work, we aim for RGB-D saliency detection…

    Submitted 30 August, 2022; v1 submitted 2 August, 2022; originally announced August 2022.

    Comments: Accepted to 3DV 2022

  36. SoK: Content Moderation in Social Media, from Guidelines to Enforcement, and Research to Practice

    Authors: Mohit Singhal, Chen Ling, Pujan Paudel, Poojitha Thota, Nihal Kumarswamy, Gianluca Stringhini, Shirin Nilizadeh

    Abstract: Social media platforms have been establishing content moderation guidelines and employing various moderation policies to counter hate speech and misinformation. The goal of this paper is to study these community guidelines and moderation practices, as well as the relevant research publications, to identify the research gaps, differences in moderation techniques, and challenges that should be tackl…

    Submitted 1 March, 2023; v1 submitted 29 June, 2022; originally announced June 2022.

    Comments: To appear in the 8th IEEE European Symposium on Security and Privacy (EuroS&P 2023)

  37. arXiv:2206.01705 [pdf, other]

    cs.CV

    Gradient Obfuscation Checklist Test Gives a False Sense of Security

    Authors: Nikola Popovic, Danda Pani Paudel, Thomas Probst, Luc Van Gool

    Abstract: One popular group of defense techniques against adversarial attacks is based on injecting stochastic noise into the network. The main source of robustness of such stochastic defenses however is often due to the obfuscation of the gradients, offering a false sense of security. Since most of the popular adversarial attacks are optimization-based, obfuscated gradients reduce their attacking ability,…

    Submitted 3 June, 2022; originally announced June 2022.

  38. arXiv:2205.05467 [pdf, other]

    cs.CV cs.LG

    A Continual Deepfake Detection Benchmark: Dataset, Methods, and Essentials

    Authors: Chuqiao Li, Zhiwu Huang, Danda Pani Paudel, Yabin Wang, Mohamad Shahbazi, Xiaopeng Hong, Luc Van Gool

    Abstract: There have been emerging a number of benchmarks and techniques for the detection of deepfakes. However, very few works study the detection of incrementally appearing deepfakes in the real-world scenarios. To simulate the wild scenes, this paper suggests a continual deepfake detection benchmark (CDDB) over a new collection of deepfakes from both known and unknown generative models. The suggested CD…

    Submitted 14 November, 2022; v1 submitted 11 May, 2022; originally announced May 2022.

    Comments: Accepted to WACV 2023

  39. arXiv:2203.13812 [pdf, other]

    cs.CV

    Spatially Multi-conditional Image Generation

    Authors: Ritika Chakraborty, Nikola Popovic, Danda Pani Paudel, Thomas Probst, Luc Van Gool

    Abstract: In most scenarios, conditional image generation can be thought of as an inversion of the image understanding process. Since generic image understanding involves solving multiple tasks, it is natural to aim at generating images via multi-conditioning. However, multi-conditional image generation is a very challenging problem due to the heterogeneity and the sparsity of the (in practice) available co…

    Submitted 14 July, 2022; v1 submitted 25 March, 2022; originally announced March 2022.

  40. arXiv:2203.11192 [pdf, other]

    cs.CV

    Transforming Model Prediction for Tracking

    Authors: Christoph Mayer, Martin Danelljan, Goutam Bhat, Matthieu Paul, Danda Pani Paudel, Fisher Yu, Luc Van Gool

    Abstract: Optimization based tracking methods have been widely successful by integrating a target model prediction module, providing effective global reasoning by minimizing an objective function. While this inductive bias integrates valuable domain knowledge, it limits the expressivity of the tracking network. In this work, we therefore propose a tracker architecture employing a Transformer-based model pre…

    Submitted 21 March, 2022; originally announced March 2022.

    Comments: Accepted at CVPR 2022. The code and trained models are available at https://github.com/visionml/pytracking

  41. arXiv:2203.10541 [pdf, other]

    cs.CV

    Unsupervised Domain Adaptation for Nighttime Aerial Tracking

    Authors: Junjie Ye, Changhong Fu, Guangze Zheng, Danda Pani Paudel, Guang Chen

    Abstract: Previous advances in object tracking mostly reported on favorable illumination circumstances while neglecting performance at nighttime, which significantly impeded the development of related aerial robot applications. This work instead develops a novel unsupervised domain adaptation framework for nighttime aerial tracking (named UDAT). Specifically, a unique object discovery approach is provided t…

    Submitted 30 March, 2022; v1 submitted 20 March, 2022; originally announced March 2022.

    Comments: accepted by CVPR2022

  42. arXiv:2201.06578 [pdf, other]

    cs.CV cs.AI

    Collapse by Conditioning: Training Class-conditional GANs with Limited Data

    Authors: Mohamad Shahbazi, Martin Danelljan, Danda Pani Paudel, Luc Van Gool

    Abstract: Class-conditioning offers a direct means to control a Generative Adversarial Network (GAN) based on a discrete input variable. While necessary in many applications, the additional information provided by the class labels could even be expected to benefit the training of the GAN itself. On the contrary, we observe that class-conditioning causes mode collapse in limited data settings, where uncondit…

    Submitted 16 March, 2022; v1 submitted 17 January, 2022; originally announced January 2022.

  43. arXiv:2112.15111 [pdf, other]

    cs.CV

    Improving the Behaviour of Vision Transformers with Token-consistent Stochastic Layers

    Authors: Nikola Popovic, Danda Pani Paudel, Thomas Probst, Luc Van Gool

    Abstract: We introduce token-consistent stochastic layers in vision transformers, without causing any severe drop in performance. The added stochasticity improves network calibration, robustness and strengthens privacy. We use linear layers with token-consistent stochastic parameters inside the multilayer perceptron blocks, without altering the architecture of the transformer. The stochastic parameters are…

    Submitted 14 July, 2022; v1 submitted 30 December, 2021; originally announced December 2021.

    Comments: This article is under consideration at the Computer Vision and Image Understanding journal

  44. arXiv:2112.10196  [pdf, other]

    cs.CV

    End-to-End Learning of Multi-category 3D Pose and Shape Estimation

    Authors: Yigit Baran Can, Alexander Liniger, Danda Pani Paudel, Luc Van Gool

    Abstract: In this paper, we study the representation of the shape and pose of objects using their keypoints. Therefore, we propose an end-to-end method that simultaneously detects 2D keypoints from an image and lifts them to 3D. The proposed method learns both 2D detection and 3D lifting only from 2D keypoints annotations. In addition to being end-to-end from images to 3D keypoints, our method also handles…

    Submitted 9 March, 2022; v1 submitted 19 December, 2021; originally announced December 2021.

  45. arXiv:2112.10155  [pdf, other]

    cs.CV

    Topology Preserving Local Road Network Estimation from Single Onboard Camera Image

    Authors: Yigit Baran Can, Alexander Liniger, Danda Pani Paudel, Luc Van Gool

    Abstract: Knowledge of the road network topology is crucial for autonomous planning and navigation. Yet, recovering such topology from a single image has only been explored in part. Furthermore, it needs to refer to the ground plane, where also the driving actions are taken. This paper aims at extracting the local road network topology, directly in the bird's-eye-view (BEV), all in a complex urban setting.…

    Submitted 30 March, 2022; v1 submitted 19 December, 2021; originally announced December 2021.

    Comments: CVPR 2022

  46. arXiv:2111.02187  [pdf, other]

    cs.SI cs.CY

    Soros, Child Sacrifices, and 5G: Understanding the Spread of Conspiracy Theories on Web Communities

    Authors: Pujan Paudel, Jeremy Blackburn, Emiliano De Cristofaro, Savvas Zannettou, Gianluca Stringhini

    Abstract: This paper presents a multi-platform computational pipeline geared to identify social media posts discussing (known) conspiracy theories. We use 189 conspiracy claims collected by Snopes, and find 66k posts and 277k comments on Reddit, and 379k tweets discussing them. Then, we study how conspiracies are discussed on different Web communities and which ones are particularly influential in driving t…

    Submitted 3 November, 2021; originally announced November 2021.

  47. arXiv:2110.01997  [pdf, other]

    cs.CV

    Structured Bird's-Eye-View Traffic Scene Understanding from Onboard Images

    Authors: Yigit Baran Can, Alexander Liniger, Danda Pani Paudel, Luc Van Gool

    Abstract: Autonomous navigation requires structured representation of the road network and instance-wise identification of the other traffic agents. Since the traffic scene is defined on the ground plane, this corresponds to scene understanding in the bird's-eye-view (BEV). However, the onboard cameras of autonomous cars are customarily mounted horizontally for a better view of the surrounding, making this…

    Submitted 5 October, 2021; originally announced October 2021.

    Comments: ICCV 2021

  48. arXiv:2109.04813  [pdf, other]

    cs.CV

    TACS: Taxonomy Adaptive Cross-Domain Semantic Segmentation

    Authors: Rui Gong, Martin Danelljan, Dengxin Dai, Danda Pani Paudel, Ajad Chhatkuli, Fisher Yu, Luc Van Gool

    Abstract: Traditional domain adaptive semantic segmentation addresses the task of adapting a model to a novel target domain under limited or no additional supervision. While tackling the input domain gap, the standard domain adaptation settings assume no domain change in the output space. In semantic prediction tasks, different datasets are often labeled according to different semantic taxonomies. In many r…

    Submitted 28 July, 2022; v1 submitted 10 September, 2021; originally announced September 2021.

    Comments: Accepted by ECCV 2022

  49. arXiv:2108.05876  [pdf, other]

    cs.CY cs.SI

    An Early Look at the Gettr Social Network

    Authors: Pujan Paudel, Jeremy Blackburn, Emiliano De Cristofaro, Savvas Zannettou, Gianluca Stringhini

    Abstract: This paper presents the first data-driven analysis of Gettr, a new social network platform launched by former US President Donald Trump's team. Among other things, we find that users on the platform heavily discuss politics, with a focus on the Trump campaign in the US and Bolsonaro's in Brazil. Activity on the platform has steadily been decreasing since its launch, although a core of verified use…

    Submitted 12 August, 2021; originally announced August 2021.

  50. arXiv:2105.10926  [pdf, other]

    cs.CV

    Rethinking Global Context in Crowd Counting

    Authors: Guolei Sun, Yun Liu, Thomas Probst, Danda Pani Paudel, Nikola Popovic, Luc Van Gool

    Abstract: This paper investigates the role of global context for crowd counting. Specifically, a pure transformer is used to extract features with global information from overlapping image patches. Inspired by classification, we add a context token to the input sequence, to facilitate information exchange with tokens corresponding to image patches throughout transformer layers. Due to the fact that transfor…

    Submitted 25 November, 2023; v1 submitted 23 May, 2021; originally announced May 2021.

    Comments: Accepted by Machine Intelligence Research (MIR)

    Report number: DOI: 10.1007/s11633-023-1475-z