
FaceSigns: Semi-fragile Watermarks for Media Authentication

Published: 12 September 2024

Abstract

Manipulated media is becoming a prominent threat due to the recent advances in realistic image and video synthesis techniques. There have been several attempts at detecting synthetically tampered media using machine learning classifiers. However, such classifiers do not generalize well to black-box image synthesis techniques and have been shown to be vulnerable to adversarial examples. To address these challenges, we introduce FaceSigns—a deep learning-based semi-fragile watermarking technique that allows media authentication by verifying an invisible secret message embedded in the image pixels. Instead of identifying and detecting manipulated media using visual artifacts, we propose to proactively embed a semi-fragile watermark into a real image or video so that we can prove its authenticity when needed. FaceSigns is designed to be fragile to malicious manipulations or tampering while being robust to benign operations such as image/video compression, scaling, saturation, contrast adjustments, and so forth. This allows images and videos shared over the internet to retain the verifiable watermark as long as a malicious modification technique is not applied. We demonstrate that our framework can embed a 128-bit secret as an imperceptible image watermark that can be recovered with a high bit recovery accuracy at several compression levels, while being non-recoverable when unseen malicious manipulations are applied. For a set of unseen benign and malicious manipulations studied in our work, our framework can reliably detect manipulated content with an AUC score of 0.996, which is significantly higher than prior image watermarking and steganography techniques.

1 Introduction

Media authentication, despite having been a long-term challenge, has become even more difficult with the advent of deep learning-based generative models. Deep Neural Network (DNN)-based generative models [12, 25, 26, 34, 50, 58] have enabled the creation of high-quality synthetic media in various domains. Such techniques can be used to easily manipulate real images, videos, and audio to fuel misinformation; tamper with sensitive documents; defame individuals; and reduce trust in social media platforms [32]. Media authentication is crucial in ensuring the accuracy of news and maintaining public trust to safeguard against the potential misuse of generative models. Media authentication also plays a crucial role in law enforcement, where videos and images are often used as evidence. Recent methods to detect fake media rely on DNN-based classifiers to distinguish synthetic videos from real videos [14, 41]. However, classifiers trained in a supervised manner on existing media synthesis techniques do not generalize reliably to black-box image synthesis methods. Moreover, the current best-performing detectors for synthetic media can be easily bypassed by attackers using adversarial examples [16, 21, 33].
As an alternate solution, proactively embedding a secret verifiable message into images and videos at the time of their capture from a device can establish the provenance of authentic images and videos and circumvent the limitations of classifiers for synthetic media. Several prior works have explored digital image watermarking and deep learning-based steganography techniques [11, 15, 30, 49, 62] to hide secret messages in image pixels. However, these works are either fragile to basic image processing operations such as compression and color adjustments or overly robust to the point that the secret can be recovered even after occluding major portions of the embedded image [49]. In fact, we experimentally demonstrate that past works on robust neural network-generated watermarks [49, 62] can recover messages even from images that have undergone face swapping manipulations. Moreover, past neural network-based watermarking frameworks are not designed to be robust to common video compression codecs that apply temporal compression along with per-frame spatial compression. For solving the challenge of media authentication, the watermarking framework should have the following desirable properties: (1) the watermark data should be recoverable if the image/video undergoes benign transformations such as compression or minor adjustments; (2) the watermark recovery should break if the image/video has been maliciously manipulated, e.g. replacing the face, occluding/replacing significant portions of the image; and (3) the watermark should be visually imperceptible.
To address the above challenges of synthetic media classifiers and watermarking frameworks, we introduce FaceSigns—a deep learning-based semi-fragile watermarking system that embeds a recoverable message as an imperceptible perturbation in the image pixels. The watermark can contain a secret message or device-specific codes that can be used for authenticating images and videos. The desirable property of the watermark is that it should break if a malicious manipulation such as occlusion, face swapping, or content manipulation is applied to the image/video, but it should be robust against harmless transformations such as image compression, video compression, and color and lighting adjustments, which are commonly applied on pictures and videos before uploading them to online sharing platforms. To achieve this goal, we develop an encoder-decoder-based training framework that encourages message recovery under benign transformations and discourages message recovery if the watermark has been spatially tampered in certain parts of the image. An overview of the FaceSigns watermarking framework is depicted in Figure 1. In contrast to hand-designed pipelines used in previous work for semi-fragile watermarking [6, 18, 28, 39, 54], our framework is end-to-end and learns to be robust to a wide range of real-world digital image processing operations such as social media filters and compression techniques, while being fragile to various Deepfake tampering techniques. The technical contributions of our work are as follows:
Fig. 1.
Fig. 1. Overview of FaceSigns watermarking framework: The encoder network embeds a secret encrypted message into a given image as an imperceptible watermark that is designed to be robust against benign image transformations and photo editing tools but fragile toward malicious image manipulations such as Deepfakes.
(1)
We develop a neural semi-fragile image watermarking framework that can certify the authenticity of digital media and serve as a proactive defense against media manipulation.
(2)
We propose a novel training procedure to make the watermark retrieval robust against both image and video compression techniques like JPEG, H264, and MPEG4. We overcome the challenge of non-differentiable video compression codecs during training by estimating the gradients using a straight-through estimator in the backward pass.
(3)
We design a differentiable procedure to simulate watermark tampering during training such that our framework can achieve selective fragility against unseen malicious transformations (Section 3.3.2).
(4)
For a set of previously unseen benign and malicious image transformations, FaceSigns achieves the goal of selective fragility and reliably detects malicious manipulations with an area under the ROC curve (AUC) score of \(\mathbf {0.996,}\) which is significantly higher than alternate robust and semi-fragile watermarking frameworks.

2 Background

2.1 Media Forgery

Media forgery refers to the manipulation of digital content such as documents, images, videos, and audio to create convincing but fabricated media. Traditional media forgery techniques like image compositing [38] aim to selectively remove important context or to create a misleading narrative. For example, a compositing attack could be used to alter the background of an image to misrepresent the location where the photo was taken or to selectively remove individuals or objects from a video to distort the events that occurred. The layer-based compositing [38] technique involves breaking down the image into multiple layers, each of which contains different elements (e.g., foreground objects, background, shadows, highlights). Each layer is then composited separately, and the final image is the sum of all the layers. Alpha blending [38] is another common technique for compositing images, where the transparency of each pixel is specified by an alpha value. The resulting image is a linear combination of the foreground and background images, weighted by their respective alpha values. Compositing techniques can be difficult to detect, especially when the manipulation is subtle, making them a common tool for propagating fake or misleading information. These types of media forgery have been used for many years and are often employed to manipulate public opinion, discredit individuals or groups, or create sensational news stories. Due to this, the task of verifying the authenticity of an image is becoming a crucial aspect of image security.

2.2 Facial Forgery

Until recently, the ease of generating manipulated faces in photos and videos has been limited by manual editing tools. However, since the advent of deep learning, there has been significant work in developing new techniques for automatic digital forgery. It has now become easier to create realistic-looking synthetic media that are difficult to distinguish from authentic media. Particularly, DNN-based facial manipulation methods [8, 9, 12, 26] operate end-to-end on a source video and target face and require minimal human expertise to generate fake videos in real time. In our work, we show effectiveness against popular Generative Adversarial Network (GAN)-based Deepfake generation methods, SimSwap [8] and Few-Shot Face Translation (FSFT) [43], and a classical computer graphics-based face replacement approach, FaceSwap [26].
The best-performing Deepfake detectors [1, 14, 41, 42, 52] rely on convolutional neural network (CNN)-based architectures. Such Deepfake detectors model Deepfake detection as a per-frame binary classification problem, applying a face-tracking method prior to CNN classification to effectively detect facial forgeries in both uncompressed and compressed videos. While CNN-based classifiers achieve promising detection accuracy on a fixed in-domain test set of real and fake videos, they suffer from two main drawbacks: (1) lack of generalizability to unseen Deepfake synthesis techniques and (2) vulnerability to adversarial examples in both black-box and white-box attack settings. Classifiers trained in a supervised manner on existing Deepfake generation methods cannot be relied upon to detect Deepfakes produced by generation methods not seen during training. With the advances in deep learning-based generative models, classification methods fail to stay a step ahead in the race to reliably detect synthetic videos. This lack of generalizability is a significant drawback, as it means that CNN-based classifiers may not be able to keep up with the constantly evolving landscape of manipulated videos. Moreover, the current best-performing Deepfake classifiers can be easily bypassed using adversarial examples. Prior work [16, 20, 21, 33] demonstrates that an attacker can bypass most state-of-the-art Deepfake detectors by adding an imperceptible perturbation to each frame of a given video, causing the detector to misclassify a given Deepfake as real. We refer the reader to past works [16, 20, 21, 31] that explore such limitations of CNN-based Deepfake detectors.

2.3 Digital Watermarking

Digital watermarking [11], similar to steganography [15], is the task of embedding information into an image in a visually imperceptible manner. These techniques broadly seek to generate three different types of watermarks: fragile [6, 13], robust [3, 7, 10, 36, 37, 45, 62], and semi-fragile [28, 48, 57]. Fragile and semi-fragile watermarks are primarily used to certify the integrity and authenticity of image data. Fragile watermarks are used to achieve accurate authentication of digital media, where even a 1-bit change to an image will cause it to fail the certification system. In contrast, robust watermarks aim to be recoverable under several image manipulations in order to allow media producers to assert ownership over their content even if the video is redistributed and modified. Semi-fragile watermarks combine the advantages of both robust and fragile watermarks and are mainly used for fuzzy authentication of digital images and identification of image tampering [57]. The use of semi-fragile watermarks is justified by the fact that images and videos are generally transmitted and stored in a compressed form, which should not break the watermark. However, when the image is tampered with, the watermark should also be damaged, indicating the tampering.
Several past works have proposed hand-engineered pipelines to embed semi-fragile watermark information in the spatial and frequency (transform) domain of images and videos. In the spatial domain, the pixels of digital images are processed directly using block-based embedding [6] and least significant bits modification [53, 54] to embed watermarks. In the frequency domain, the watermark can be embedded by modifying the coefficients produced with transformations such as the Discrete Cosine Transform (DCT) [3, 18, 39] and Discrete Wavelet Transform [5, 27, 44]. However, we demonstrate in our experiments that the major limitations of traditional approaches lie in higher visibility of the embedded watermarks, increased distortions in generated images, and low robustness to compression techniques like JPEG transforms. Moreover, these works have not been designed to be fragile against Deepfake manipulations.
More recently, CNNs have been used to provide an end-to-end solution to the watermarking problem. They replace hand-crafted hiding procedures with neural network encoding [2, 17, 30, 49, 59, 62]. Notably, both StegaStamp [49] and HiDDeN [62] propose frameworks to embed robust watermarks that can hide and transmit data in a way that is robust to various real-world transformations. All of these works focus on generating robust watermarks, with the goal of ensuring robustness and recovery of the embedded secret information under various physical and digital image distortions. We empirically demonstrate that these techniques are unable to generate semi-fragile watermarks and are therefore not suitable for identifying tampered media such as Deepfakes.

3 Methodology

We aim to develop an image watermarking framework that is robust to a set of benign image and video transformations while being fragile to malicious transforms. Additionally, it is desirable to have an imperceptible watermark so that devices can store only the watermarked images without revealing the original image to the end user. The set of benign and malicious transformations depends on the application of the media authentication system and can be modified as desired. For example, for applications like document verification, it may be desirable to limit the set of benign transformations to have only compression, while for social media platforms, it is desirable to allow operations such as artistic image filtering. We propose a general-purpose framework that can be adapted for any set of benign and malicious transforms.
With this objective in mind, our system consists of three main components: an encoder network \(E_\alpha\), a decoder network \(D_\beta\), and an adversarial discriminator network \(A_\gamma\), where \(\alpha , \beta ,\) and \(\gamma\) are learnable parameters. An overview of our system is provided in Figure 2. The encoder network E takes as input an image x and a bit string \(s \in \lbrace 0, 1\rbrace ^L\) of length L and produces an encoded (watermarked) image \(x_w\). That is, \(x_w=E(x, s)\). The watermarked image then goes through two image transformation functions—one sampled from a set of benign transformations (\(g_b \sim G_b\)) and the other sampled from a set of malicious transformations (\(g_m \sim G_m\)) to produce a benign image \(x_b=g_b(x_w)\) and a malicious image \(x_m=g_m(x_w)\). The benign and malicious watermarked images are then fed to the decoder network, which predicts the messages \(s_b=D(x_b)\) and \(s_m=D(x_m),\) respectively.
Fig. 2.
Fig. 2. Model overview: The encoder and decoder networks are trained by encouraging message retrieval from watermarked images that have undergone benign transformations and discouraging retrieval from maliciously transformed watermarked images. Image reconstruction and adversarial loss from the discriminator ensure the imperceptibility of the watermark.
For optimizing secret retrieval during training, we use the \(L_1\) distortion between the predicted and ground-truth bit strings. The decoder is encouraged to be robust to benign transformations by minimizing the message distortion \(L_1(s, s_b)\), and fragile for malicious manipulations by maximizing the error \(L_1(s, s_m)\). Therefore, the secret retrieval error for an image \(L_M(x)\) is obtained as follows:
\begin{equation} L_M(x) = L_1(s, s_b) - L_1(s, s_m). \end{equation}
(1)
The watermarked image is encouraged to look visually similar to the original image by optimizing three image distortion metrics: \(L_1\), \(L_2\), and \(L_{\it pips}\) [60] distortions. Additionally, we use an adversarial loss \(L_G(x_w) = \log (1 - A(x_w))\) from the discriminator, which is trained simultaneously to distinguish original images from watermarked images. That is, our image reconstruction loss \(L_{\it img}\) is obtained as follows:
\begin{equation} \begin{split}& L_{\it d}(x, x_w) = L_1(x, x_w) + L_2(x, x_w) + c_p L_{\it pips}(x, x_w) \\ & L_{\it img}(x, x_w) = L_{\it d}(x, x_w) + c_g L_G(x_w). \end{split} \end{equation}
(2)
Therefore, the parameters \(\alpha ,\beta\) of the encoder and decoder network are trained using mini-batch gradient descent to optimize the following loss over a distribution of input messages and images:
\begin{equation} \mathbb {E}_{x, s, g_b, g_m} [ L_{\it img}(x, x_w) + c_M L_M(x) ]. \end{equation}
(3)
The discriminator parameters \(\gamma\) are trained to distinguish original images x from watermarked images \(x_w\) as follows:
\begin{equation} \mathbb {E}_{x, s} [\log (1 - A(x)) + \log (A(x_w)) ]. \end{equation}
(4)
In the above equations, \(c_p\), \(c_g\), and \(c_M\) are scalar coefficients for the respective loss terms. We use the following values for our loss coefficients: \(c_p=1\), \(c_g=0.1\), \(c_M=1\). We use the Adam optimizer during training with a learning rate of \(2e-4\).
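To make the optimization concrete, the following PyTorch-style sketch shows how the losses in Equations (1) through (3) could be assembled for one encoder/decoder update. The `encoder`, `decoder`, `discriminator`, `lpips_fn`, and transform samplers are assumed to be defined elsewhere; the helper names are illustrative and are not taken from a released implementation.

```python
import torch
import torch.nn.functional as F

def training_step(encoder, decoder, discriminator, lpips_fn,
                  x, s, sample_benign, sample_malicious,
                  c_p=1.0, c_g=0.1, c_M=1.0):
    """One encoder/decoder update following Eqs. (1)-(3).

    x: batch of images in [0, 1], shape (B, 3, H, W)
    s: batch of secret bit strings in {0, 1}, shape (B, L)
    sample_benign / sample_malicious: callables returning g_b(x_w) and g_m(x_w)
    """
    x_w = encoder(x, s)                      # watermarked image
    x_b = sample_benign(x_w)                 # benign transform g_b
    x_m = sample_malicious(x_w, x)           # malicious transform g_m (blends original back)

    s_b = decoder(x_b)                       # decoded bits after benign transform
    s_m = decoder(x_m)                       # decoded bits after malicious transform

    # Eq. (1): encourage recovery under benign, discourage under malicious transforms
    L_M = F.l1_loss(s_b, s) - F.l1_loss(s_m, s)

    # Eq. (2): image reconstruction + adversarial loss L_G = log(1 - A(x_w))
    L_d = F.l1_loss(x_w, x) + F.mse_loss(x_w, x) + c_p * lpips_fn(x_w, x).mean()
    L_G = torch.log(1.0 - discriminator(x_w) + 1e-8).mean()   # small eps for stability
    L_img = L_d + c_g * L_G

    # Eq. (3): total loss for the encoder/decoder parameters
    return L_img + c_M * L_M
```

The discriminator parameters would be updated separately following Equation (4), alternating with the encoder/decoder updates.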

3.1 Message Encoding

The encoder network accepts watermarking data as a bit string s of length L. This watermarking data can contain information about the device that captured the image or a secret message that can be used to authenticate the image. To prevent adversaries (who have gained white-box access to the encoder network) from encoding a target message, we can encrypt the message using symmetric or asymmetric encryption algorithms or hashing. In our experiments, we embed encrypted messages of size 128 bits, which allows the network to encode \(2^{128}\) unique messages. We discuss the possible threats and defenses to our watermarking framework in Section 5.
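As an illustration of how such a message could be prepared, the sketch below derives a 128-bit string from a device identifier with a keyed hash and converts it to the tensor format consumed by the encoder. The use of HMAC-SHA256 here is our own illustrative choice; the framework only requires that the message come from some symmetric/asymmetric encryption or hashing scheme.

```python
import hashlib
import hmac
import secrets
import torch

def make_message_bits(secret_key: bytes, device_id: str, L: int = 128) -> torch.Tensor:
    """Derive an L-bit message from a device identifier using a keyed hash
    (illustrative; any encryption or hashing scheme can be substituted)."""
    digest = hmac.new(secret_key, device_id.encode(), hashlib.sha256).digest()
    bits = []
    for byte in digest[: L // 8]:
        bits.extend((byte >> i) & 1 for i in reversed(range(8)))
    return torch.tensor(bits, dtype=torch.float32)   # shape (L,)

key = secrets.token_bytes(32)                         # shared secret key
s = make_message_bits(key, device_id="camera-0042")   # 128-bit message for the encoder
```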

3.2 Network Architectures

Our encoder and decoder networks are based on the U-Net CNN architecture [23, 40, 49] and operate on \(256\times 256\) images. The encrypted message s, which is an L length bit string, is first projected to a tensor \(s_{{\it Proj}}\) of size \(96\times 96\) using a trainable fully connected layer, then resized to \(256 \times 256\) using bilinear interpolation, and finally added as the fourth channel to the original RGB image to be fed as an input to the encoder network. The encoder U-Net contains eight downsampling and eight upsampling layers. We modify the original U-Net architecture and replace the transposed convolution in the upsampling layers with convolutions followed by nearest-neighbor upsampling as per the recommendations given by [35]. In our preliminary experiments, we found this change to significantly improve the image quality and training speed of our framework. The downsampling and upsampling layers have skip-connections between the corresponding layers with the same output size. The decoder network also follows the U-Net architecture similar to our encoder network. The decoder U-Net first outputs a \(256\times 256\) intermediate output, which is downsized to \(96\times 96\) using bilinear downsampling to produce \(s_{\it ProjDecoded}\) and then projected to a vector of size L using a fully connected layer followed by a sigmoid layer to scale values between 0 and 1. We use batch normalization layers in the encoder network and instance normalization layers in the decoder network.
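A minimal sketch of the encoder input construction described above, assuming PyTorch and a message length of L = 128; the module boundaries and names are our own and may differ from the actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MessageToChannel(nn.Module):
    """Project an L-bit message to a fourth input channel for the encoder U-Net."""
    def __init__(self, L=128, proj_size=96, image_size=256):
        super().__init__()
        self.proj_size = proj_size
        self.image_size = image_size
        self.fc = nn.Linear(L, proj_size * proj_size)   # trainable projection

    def forward(self, x_rgb, s):
        # s: (B, L) bit string -> s_Proj of size (B, 1, 96, 96)
        s_proj = self.fc(s).view(-1, 1, self.proj_size, self.proj_size)
        # Bilinear resize to the image resolution (B, 1, 256, 256)
        s_img = F.interpolate(s_proj, size=(self.image_size, self.image_size),
                              mode="bilinear", align_corners=False)
        # Concatenate as the fourth channel of the RGB image -> encoder U-Net input
        return torch.cat([x_rgb, s_img], dim=1)          # (B, 4, 256, 256)
```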
For the discriminator network, we use the patch discriminator from [23]. The discriminator is trained to classify each \(N\times N\) image patch as real or fake. We average discriminator responses across all patches to obtain the discriminator output. Our discriminator network consists of three convolutional blocks of stride 2, thereby classifying patches of size \(32\times 32\).
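The discriminator could look roughly like the following PatchGAN-style sketch with three stride-2 convolutional blocks whose per-patch responses are averaged; the kernel sizes, channel widths, and normalization choices shown here are assumptions rather than the exact configuration.

```python
import torch
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    """Three stride-2 conv blocks producing per-patch real/fake scores,
    averaged into a single response as described above."""
    def __init__(self, in_ch=3, base_ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, base_ch, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(base_ch, base_ch * 2, 4, stride=2, padding=1),
            nn.InstanceNorm2d(base_ch * 2), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(base_ch * 2, base_ch * 4, 4, stride=2, padding=1),
            nn.InstanceNorm2d(base_ch * 4), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(base_ch * 4, 1, 1),                # per-patch logit map
        )

    def forward(self, x):
        patch_logits = self.net(x)                       # (B, 1, H/8, W/8)
        # Average the patch probabilities to obtain one response per image
        return torch.sigmoid(patch_logits).mean(dim=[1, 2, 3])
```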

3.3 Transformation Functions

The choice of benign and malicious transformation functions is critical to achieve selective fragility and robustness of the watermark. While we can only use a limited set of image transformations during training, the list of possible benign and malicious transforms in real-world settings is non-exhaustive. In our experiments (Section 4.3), we demonstrate that by incorporating the transformation functions described below, we are able to generalize to unseen benign and malicious transformations that are commonly used across social media platforms.

3.3.1 Benign Transforms.

Our goal is to authenticate real images and videos shared over online platforms that generally undergo compression and diverse color or lighting adjustments (e.g., Instagram filters). To approximate standard image processing distortions, we apply a diverse set of differentiable benign image transformations (\(G_b\)) to our watermarked images during training:
(1)
Gaussian blur: We convolve the original image with a Gaussian kernel \(\mathit {k}\). This transform is given by \(t(x)= k \ast x,\) where \(\ast\) is the convolution operator. We use kernel sizes ranging from \(\mathit {k} = 3\) to \(\mathit {k} = 7.\)
(2)
JPEG compression: Digital images are usually stored in a lossy format such as JPEG. We approximate JPEG compression with the differentiable JPEG function proposed in [46]. During training, we apply JPEG compression with quality 40, 60, and 80.
(3)
Saturation adjustments: To account for various color adjustments from social media filters, we randomly linearly interpolate between the original (full RGB) image and its grayscale equivalent.
(4)
Contrast adjustments: We linearly rescale the image histogram using a contrast factor \(\sim \mathcal {U}[0.5, 1.5]\).
(5)
Downsizing and upsizing: The image is first downsized by a factor \({\it scale}\) and then up-sampled by the same factor using bilinear upsampling. We use \({\it scale} \sim \mathcal {U}[2, 5].\)
(6)
Translation and rotation: The image is shifted horizontally and vertically by \(n_h\) and \(n_w\) pixels, where \(n_h, n_w \sim \mathcal {U}[-10, 10],\) and rotated by r degrees, where \(r \sim \mathcal {U}[-10, 10]\).
(7)
Video compression: Simulating video compression distortions during training is more challenging because common video compression codecs such as MPEG4 and H264 cannot be easily implemented using differentiable functions. Such codecs not only compress each frame of a given video but also apply temporal compression across the time-steps for a more optimized compression. Since video compression is applied to almost all videos uploaded on the internet, it is essential to ensure robustness to these codecs to make the watermark suitable for videos. To this end, we propose the first technique to ensure robustness of the generated watermark to a benign non-differentiable video transform \(g_b\):
When training the watermarking framework on videos, each mini-batch of images x corresponds to consecutive frames of a single video.
We obtain watermarked frames \(x_w\) by embedding unique signatures into each frame using the encoder network. Next, we detach \(x_w\) from the computational graph, extract each frame, and write the frames into a video file. The video file is then compressed with the H264 codec using FFmpeg at a quantization factor sampled from the interval \([5,25]\).
Next, we read each frame of the compressed file and stack them together to obtain the transformed image batch \(g_b(x_w),\) which is then reinserted in the computational graph to be fed as input to the decoder.
During the backward pass, we use the straight-through estimator [4] to estimate the gradient across the transformation function \(g_b\). That is:
\begin{equation} \left. \nabla _{x_w} L_M(g_b(x_w)) \right|_{x_w = \hat{x_w}} \approx \left. \nabla _{x_w} L_M(x_w) \right|_{x_w = g_b(\hat{x_w})}, \end{equation}
(5)
where \(L_M(x_w)\) indicates the message recovery loss from the decoder for an input \(x_w\). We illustrate the video compression procedure used during training in Figure 3.
Fig. 3.
Fig. 3. Training procedure to make watermarks robust against video compression codecs. We use the actual implementation of the video compression codec in the forward pass and estimate the gradients in the backward pass using a straight-through estimator.
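A sketch of how the straight-through estimator of Equation (5) could be implemented as a custom autograd function is shown below. The `h264_roundtrip` helper is hypothetical: it stands in for writing the detached frames to a temporary file, compressing with FFmpeg and the H264 codec at a sampled quantization factor, and reading the compressed frames back.

```python
import torch

def h264_roundtrip(frames, qp_range=(5, 25)):
    """Hypothetical helper: write `frames` to a temporary video file with FFmpeg,
    compress with the H264 codec at a quantization factor sampled from qp_range,
    read the compressed frames back, and return a tensor of the same shape.
    The actual file I/O is implementation specific and omitted here."""
    raise NotImplementedError

class VideoCompressionSTE(torch.autograd.Function):
    """Non-differentiable H264 round-trip in the forward pass, identity
    (straight-through) gradient in the backward pass, following Eq. (5)."""

    @staticmethod
    def forward(ctx, frames):
        # frames: (T, 3, H, W) consecutive watermarked frames of one video in [0, 1]
        compressed = h264_roundtrip(frames.detach().cpu())
        return compressed.to(frames.device)

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through estimator: pass gradients unchanged to the
        # uncompressed watermarked frames.
        return grad_output

# Usage: x_b = VideoCompressionSTE.apply(x_w)   # serves as g_b(x_w) for video batches
```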
For each mini-batch iteration, we sample one transformation function from the above list (which also includes an identity transform) and apply it to all the images in the batch, as sketched below.
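A minimal sketch of this per-iteration sampling, assuming the benign transforms above are available as callables operating on a batch of watermarked images; the transform names in the usage comment are placeholders.

```python
import random

def sample_benign_transform(benign_transforms):
    """Pick one benign transform (or the identity) per mini-batch and return it."""
    return random.choice(benign_transforms + [lambda x: x])

# Usage (transform callables assumed to be defined as in Section 3.3.1):
# g_b = sample_benign_transform([gaussian_blur, diff_jpeg, adjust_saturation,
#                                adjust_contrast, down_up_sample,
#                                translate_rotate, video_compression])
# x_b = g_b(x_w)   # the sampled transform is applied to every image in the batch
```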

3.3.2 Malicious Transforms.

Our semi-fragile watermarks have to be unrecoverable when malicious transforms such as image compositing, occlusion, or face replacement are applied. The common operation across these manipulation techniques is to modify certain spatial areas of the image.
To simulate such transforms during training, we propose a watermark occlusion transform as follows: We first generate a tampering mask that indicates what modifications we want to retain or partially discard in the signed image. Given such a tampering mask, we partially remove the added perturbation in the signed image from the areas indicated by the mask. We consider two kinds of spatial tampering masks during training:
Image compositing mask: For each image, we initialize a mask \(M_{h\times w\times c}\) of all ones. Next, we randomly select n rectangular patches in the mask and set the value of all pixels in the patches to a small watermark retention percentage \(w_r \in [0,1]\).
Facial manipulation mask: For each image, we initialize a mask \(M_{h\times w\times c}\) of all ones. Next, we extract the facial feature polygons for eyes, nose, and lips and set the values for all pixels inside the polygons to a small watermark retention percentage \(w_r \in [0,1]\).
That is, \(M[i,j,:] = w_r\) for all \(i,j\) in the selected spatial polygons. Finally, the maliciously transformed image \(g_m(x_w)\) is obtained as follows:
\begin{equation*} g_m(x_w) = M\cdot x_w + (1-M)\cdot x. \end{equation*}
Figure 4 illustrates the malicious transform procedure.
Fig. 4.
Fig. 4. Malicious transform: To simulate image tampering during training, the watermark is partially removed from the areas indicated by a manipulation mask.
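The sketch below shows one way the compositing mask and the blending step g_m(x_w) = M * x_w + (1 - M) * x could be implemented. The patch counts and patch-size ranges are illustrative assumptions, and a facial-manipulation mask would instead fill landmark polygons with the retention value w_r.

```python
import random
import torch

def compositing_mask(x, n_patches=3, w_r=0.1):
    """Tampering mask of ones with n random rectangular patches set to the
    watermark retention fraction w_r (the compositing mask described above)."""
    B, C, H, W = x.shape
    M = torch.ones_like(x)
    for b in range(B):
        for _ in range(n_patches):
            ph = random.randint(H // 8, H // 2)          # patch height (illustrative range)
            pw = random.randint(W // 8, W // 2)          # patch width (illustrative range)
            top = random.randint(0, H - ph)
            left = random.randint(0, W - pw)
            M[b, :, top:top + ph, left:left + pw] = w_r
    return M

def malicious_transform(x_w, x, mask_fn=compositing_mask):
    """g_m(x_w) = M * x_w + (1 - M) * x: blend the original image back into the
    masked regions, partially removing the watermark perturbation there."""
    M = mask_fn(x)
    return M * x_w + (1.0 - M) * x
```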

4 Experiments

4.1 Datasets and Experimental Setup

We conduct our experiments on the CelebA [29], MIRFLICKR [19], and UCF-101 [47] datasets. CelebA is a large-scale database of over \(200{,}000\) face images of \(10{,}000\) unique celebrities. The MIRFLICKR dataset is a diverse image retrieval dataset containing 1 million images. For training our watermarking framework to be robust to video compression, we use the UCF-101 dataset that contains \(13{,}320\) short clips for action recognition. We set aside \(1{,}000\) images/videos for testing from each dataset and split the remaining data into \(80\%\) training and \(20\%\) validation. We train our models for 200K mini-batch iterations with a batch size of 64 and use an Adam optimizer with a fixed learning rate of \(2e-4\). All our models are trained using images/video frames of size \(256\times 256,\) which are obtained after center-cropping and resizing the images. We conduct experiments with message length \(L=128\). To evaluate the effectiveness of using transformation functions during training, we conduct an ablation study by training a FaceSigns (No Transform) model that does not incorporate any input transformations and a FaceSigns (Robust) model that uses only benign transformations during training. We evaluate watermarking techniques primarily on the following aspects:
(1)
Imperceptibility: We compare the original and watermarked images to compute peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM). Higher values for both PSNR and SSIM are desirable for a more imperceptible watermark.
(2)
Robustness and fragility: To measure the robustness and fragility of the watermarking system, we measure the bit recovery accuracy (BRA) of the bit string s when unseen (not used in training) benign and malicious image transformations are applied. BRA is calculated by comparing the decoded secret bit string with the secret bit string that was embedded by the encoder into the given image. The number of matched bits divided by the length of the bit string gives the bit recovery accuracy of a single image. We average this over our test set to report the BRA. For robustness, it is desirable to have a high BRA against benign transformations like social media filters and image compression. For fragility against malicious tampering, it is desirable to have a low BRA when facial manipulation or image compositing is applied. To make a fair comparison with past works, we do not apply any bit error correcting codes while calculating the BRA and compare the input string s with the raw decoder output. A detector can classify an input as manipulated if the BRA of the decoded message is below a set threshold and as benign if the BRA is above the threshold (see the sketch after this list). We measure the performance of such a detector using the AUC score.
(3)
Capacity: This measures the amount of information that can be embedded in the image. We measure the capacity as the bits per pixel (BPP), i.e., the number of bits of the encrypted message embedded per pixel of the image: \(\text{BPP} = L/(H \times W \times C)\).
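The sketch below illustrates how the BRA from item (2) and the threshold-based manipulation detector could be computed; the 0.75 threshold and the use of scikit-learn for the AUC are illustrative assumptions.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def bit_recovery_accuracy(s_true, s_pred):
    """Fraction of matching bits after thresholding the decoder output at 0.5."""
    s_true = np.asarray(s_true)
    s_pred = (np.asarray(s_pred) > 0.5).astype(int)
    return float((s_true == s_pred).mean())

def detect_manipulation(bra, threshold=0.75):
    """Label an image as manipulated if its BRA falls below the threshold."""
    return bra < threshold

def detector_auc(bra_benign, bra_malicious):
    """AUC of the detector: positives are manipulated images, negatives benign ones.
    bra_benign / bra_malicious are lists of per-image BRA values."""
    labels = [0] * len(bra_benign) + [1] * len(bra_malicious)
    # Lower BRA should indicate manipulation, so use negative BRA as the score.
    scores = [-b for b in bra_benign + bra_malicious]
    return roc_auc_score(labels, scores)
```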
It is important to note the tradeoff between the above metrics: for example, models with higher capacity sacrifice imperceptibility or bit recovery accuracy. Similarly, more robust models sacrifice capacity or imperceptibility. We compare our watermarking framework against three prior works on image watermarking: a DCT-based semi-fragile watermarking system [18] and two neural image watermarking systems, HiDDeN [62] and StegaStamp [49]. Both HiDDeN and StegaStamp embed a bit string message into a square RGB image while ensuring robustness to a set of image transformations. We present examples of original and watermarked images along with the added perturbation from different techniques in Figure 5.
Fig. 5.
Fig. 5. Examples of original and watermarked images using prior works and our FaceSigns (Semi-fragile) model. The image perturbation has been linearly scaled between 0 and 1 for visualization. The quantitative metrics evaluating the capacity and imperceptibility of the watermark are reported in Table 1.

4.2 Imperceptibility and Capacity

We report the image similarity and capacity metrics of different watermarking techniques in Table 1. We find that even at a higher message capacity, FaceSigns can encode messages with better imperceptibility as compared to StegaStamp and HiDDeN. As noted by the authors of StegaStamp and visible in Figure 5, the residual added by their model is perceptible in large low-frequency regions of the image. We believe that this is primarily due to the difference in our network architecture choices. In our initial experiments, we found that using a UNet architecture for the decoder with an intermediate message reconstruction loss described in Section 3.2 performed significantly better than a downsampling CNN architecture used in prior work. Additionally, we use nearest neighbor upsampling instead of transposed convolutions in our U-Net architectures, which helps reduce the perceptibility of the watermark by removing upsampling artifacts.
Table 1.
The first three data columns (H, W; L; BPP) measure capacity; PSNR and SSIM measure imperceptibility.

| Method | H, W | L | BPP | PSNR | SSIM |
| --- | --- | --- | --- | --- | --- |
| Semi-fragile DCT [18] | 128 | 256 | 5.2e-3 | 22.49 | 0.871 |
| HiDDeN [62] | 128 | 30 | 6.1e-4 | 27.57 | 0.934 |
| StegaStamp [49] | 400 | 100 | 2.0e-4 | 29.39 | 0.925 |
| FaceSigns (No Transform) | 256 | 128 | 6.5e-4 | 36.38 | 0.973 |
| FaceSigns (Robust) | 256 | 128 | 6.5e-4 | 35.56 | 0.964 |
| FaceSigns (Semi-Fragile) | 256 | 128 | 6.5e-4 | 35.43 | 0.962 |
Table 1. Capacity and Imperceptibility Metrics of Different Watermarking Systems
\(H,W\) indicate the height and width of the input image.

4.3 Robustness and Fragility

To study the robustness and fragility of different DNN-based watermarking techniques, we transform the watermarked images using unseen benign and malicious transformations and then attempt to decode the message from the transformed image. We perform ablation studies to evaluate the effectiveness of the proposed transforms by training three versions of our watermarking framework: FaceSigns (No Transform), which does not use any benign or malicious transformations during training; FaceSigns (Robust), which is only trained to be robust against benign transformations and does not use malicious transformations during training; and FaceSigns (Semi-fragile), which uses both benign and malicious transformations during training.

4.3.1 Benign Image Transforms.

For benign transforms, we first consider real-world image operations that are commonly used when uploading pictures on the internet. We compress the image using different levels of JPEG compression (separate from training) and also apply Instagram filters, namely Aden, Brooklyn, and Clarendon, which we use from an open-source Python library, Pilgram [24]. Some example images from these transformations are shown in Figure 6. We report the BRA of different watermarking frameworks after undergoing benign transformations in Table 3. We find that both StegaStamp and our robust and semi-fragile models can decode secrets with a high BRA for these image transformations. We find that FaceSigns (Robust), which does not use malicious transforms during training, is slightly more robust to benign transformations as compared to FaceSigns (Semi-fragile). However, this improved robustness comes at the cost of being non-fragile to malicious transformations and being able to decode messages with high BRA even for Deepfake manipulations. The model FaceSigns (No Transform), which does not incorporate any benign or malicious transformations during training, is fragile to both JPEG compression and malicious transforms as indicated by the low BRA for both methods.
Table 2.
All values are BRA (%).

| Method | Blur | Cropping | Rotation | Contrast | Brightness | Translation |
| --- | --- | --- | --- | --- | --- | --- |
| FaceSigns (No Transform) | 78.32 | 65.62 | 80.22 | 88.23 | 92.23 | 62.62 |
| FaceSigns (Robust) | 99.71 | 97.39 | 99.82 | 99.88 | 99.91 | 97.12 |
| FaceSigns (Semi-fragile) | 99.68 | 97.45 | 99.77 | 99.85 | 99.82 | 96.54 |
Table 2. Bit Recovery Accuracy (BRA) of Different Techniques against Benign Transformations Used During Training
The hyperparameter values for these transforms are sampled from the range given in Section 3.3.1.
Table 3.
All values are BRA (%); None through Clarendon are benign transforms, and SimSwap through Compositing are malicious transforms.

| Method | None | JPG-75 | JPG-50 | Aden | Brooklyn | Clarendon | SimSwap [8] | FSFT [43] | FS [26] | Compositing |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Semi-fragile DCT [18] | 99.81 | 56.65 | 55.04 | 94.98 | 96.41 | 95.06 | 57.62 | 57.61 | 88.59 | 82.31 |
| HiDDeN [62] | 97.06 | 72.71 | 68.48 | 94.52 | 94.52 | 94.52 | 85.48 | 72.33 | 74.23 | 73.27 |
| StegaStamp [49] | 99.92 | 99.91 | 99.87 | 99.84 | 99.73 | 99.39 | 98.34 | 97.42 | 97.43 | 98.21 |
| FaceSigns (No Transform) | 99.96 | 50.51 | 50.07 | 98.39 | 99.67 | 99.65 | 51.04 | 52.00 | 51.36 | 53.36 |
| FaceSigns (Robust) | 99.96 | 99.74 | 97.26 | 99.53 | 99.19 | 99.37 | 97.29 | 89.76 | 68.99 | 97.26 |
| FaceSigns (Semi-fragile) | 99.68 | 99.49 | 98.38 | 97.40 | 98.34 | 99.32 | 64.93 | 52.21 | 31.77 | 51.61 |
Table 3. Bit Recovery Accuracy (BRA) of Different Watermarking Techniques against Benign and Malicious Transforms Unseen during Training
For benign transforms, we consider two JPEG compression levels and three Instagram filters: Aden, Brooklyn, and Clarendon. For malicious transforms, we consider various face manipulation/swapping techniques and general image compositing transforms. A higher BRA against benign transforms and a lower BRA against malicious transforms is desirable to achieve our goal of semi-fragile watermarking.
Fig. 6.
Fig. 6. Watermarked images with unseen benign transformations applied. Benign transformations depicted in this diagram include Instagram filters [24] Brooklyn, Clarendon, and Aden and various levels of JPEG compression.
We also evaluate FaceSigns watermark recovery against the benign transformations used during training. The hyper-parameters used for these transformations are sampled randomly from the intervals described in Section 3.3.1. For cropping, we use center-cropping with crop factor sampled from \((1.2, 1.5)\). We present sample images undergoing these transformations in Figure 7 and the results in Table 2. We also study the BRA at different magnitudes of distortions for Gaussian blurring and JPEG compressions and present the results in Figure 8. We find that both FaceSigns (Robust) and FaceSigns (Semi-fragile) can effectively recover the watermark data even at high magnitudes of distortions for benign transforms.
Fig. 7.
Fig. 7. Watermarked image samples from FaceSigns undergoing benign transformations from the training set with transforms such as cropping, contrast adjustment, Gaussian blur, and rotations.
Fig. 8.
Fig. 8. Bit recovery accuracy (BRA) of FaceSigns framework at different levels of distortion for benign transforms. (A) BRA vs. JPEG compression levels (lower values indicate higher compression). (B) BRA vs. sigma value used for Gaussian blur (higher sigma corresponds to higher distortion). (C) BRA vs. quantization factor for H264 video codec (higher quantization factor indicates more compressed video).
Robustness to Video Compression: For watermarking videos, we use the FaceSigns encoder to insert the watermark data into each video frame. Similarly, for decoding, we decode watermark data by passing each frame of the watermarked video to the FaceSigns decoder network. In our initial experiments, we found that training FaceSigns to be robust against spatial image transforms does not ensure robustness against video compression codecs. This is because besides compressing each frame spatially, video compression codecs like H264 also compress data temporally. To address this challenge, we incorporate video compression during training using the gradient-estimation procedure described in Section 3.3.1. As indicated by the results in Figure 8(C), incorporating video compression codecs during training significantly improves watermark recovery from highly compressed videos. Robustness to H264 compression makes FaceSigns a practical framework for inserting recoverable watermarks in videos shared on the internet.

4.3.2 Malicious Transforms.

To evaluate the fragility of the watermark against unseen facial manipulations, we apply three face-swapping techniques on the watermarked images from the CelebA dataset: FaceSwap [26], SimSwap [8], and Few-Shot Face Translation (FSFT) [43]. FaceSwap [26] is a computer graphics-based technique that swaps the face by aligning the facial landmarks of the two images. SimSwap [8] and FSFT [43] are deep learning-based techniques that use CNN encoder-decoder networks trained using adversarial loss to generate Deepfakes. Figure 9 shows examples of swapped faces using these techniques. Additionally, we consider a general image compositing operation for all test images where we randomly select image patches covering 10% to 50% of the image and replace the patches with those from an alternate image.
Fig. 9.
Fig. 9. Facially manipulated images created through SimSwap [8], FSFT [43], and FaceSwap [26] techniques for evaluating the fragility of the watermark.
As reported in Table 3, we find that StegaStamp and FaceSigns (Robust) can decode signatures from maliciously transformed images with a high BRA, thereby making them unsuitable for authenticating the integrity of digital media. This is understandable since these methods prioritize robustness over fragility. StegaStamp has been shown to be robust to occlusions even though occlusions were not explicitly a part of their set of training transformations. In contrast, watermark data recovery for the FaceSigns (Semi-fragile) model breaks against malicious transforms, which is desirable for malicious tampering detection.
Based on the bit recovery accuracy of the watermark data, we can define a manipulation detector as follows: The detector labels an image as maliciously tampered if the BRA of the predicted bit string is less than a threshold \(\tau\). The ROC curve of such a detector is shown in Figure 10. As is evident from the ROC plots and AUC scores shown in Figure 10, in contrast to prior works, our semi-fragile model demonstrates robustness to benign transformations while being fragile toward out-of-domain malicious Deepfake transformations, thereby achieving our goal of selective fragility and an AUC score of 0.996 for manipulation detection.
Fig. 10.
Fig. 10. Manipulation detection ROC plots and AUC scores for different watermarking techniques. A positive example represents a facially manipulated image, while a negative example represents a benign transformed image (all the transformations listed in Table 3). The watermarking framework labels an example as manipulated if the BRA for an image is less than a given threshold.

4.4 Watermarking Images with Multiple Faces

For watermarking images containing multiple faces, we can adapt our framework to insert the watermark into each face to detect facial tampering of any identity. To this end, we use a face detection model to extract a square bounding box of each face in the image containing multiple faces. The faces from the bounding boxes are cropped out, resized to \(256\times 256\), and passed as input to our encoder model to embed individual semi-fragile watermarks. The watermarked faces are then resized back to their original size and placed back into the original image. During decoding, a similar process is repeated where the faces are cropped and resized to \(256\times 256\) before being fed into the decoder. Since the benign transforms used during training include small image translations, our watermarking is robust to small shifts in the bounding boxes produced by the face detection network. We conduct experiments on 400 test images containing two to six faces each from the Celebrity Together dataset [61]. Our FaceSigns (Semi-fragile) model achieves a BRA of 99.50%, demonstrating that we can effectively encode and retrieve watermarks embedded in images with multiple faces. Figure 11 shows watermarked images with multiple identities.
Fig. 11.
Fig. 11. Examples of watermarked images with multiple faces using FaceSigns (Semi-fragile). Our model reliably embeds watermarks into faces present in the foreground and background.
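A rough sketch of the crop, embed, and paste-back procedure for multi-face images is given below. The face detector itself is outside the sketch, and the `boxes` format (square top/left/size tuples) and helper names are assumptions.

```python
import torch
import torch.nn.functional as F

def watermark_faces(image, boxes, encoder, s):
    """Embed a watermark into each detected face region, as described above.
    image: (3, H, W) tensor in [0, 1]
    boxes: list of (top, left, size) square face bounding boxes from a detector
    s: (L,) secret bit string shared by all faces in this image
    """
    out = image.clone()
    for (top, left, size) in boxes:
        # Crop the face and resize it to the encoder resolution
        face = image[:, top:top + size, left:left + size].unsqueeze(0)
        face_256 = F.interpolate(face, size=(256, 256), mode="bilinear",
                                 align_corners=False)
        face_w = encoder(face_256, s.unsqueeze(0))                # watermark the crop
        # Resize back to the original face size and paste into the image
        face_back = F.interpolate(face_w, size=(size, size), mode="bilinear",
                                  align_corners=False)
        out[:, top:top + size, left:left + size] = face_back[0]
    return out
```

Decoding follows the same crop-and-resize procedure before passing each face to the decoder.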

5 Discussion - Threat Models

Both watermark embedding techniques and Deepfake detection systems face adversarial threats from attackers who attempt to bypass the detectors by authenticating manipulated media. In this section, we discuss some of the threat models faced by our system and how these challenges can be addressed:
Attack 1. Querying the decoder network for performing adversarial attacks: The attacker may query the decoder network with an image to get the decoded message and adversarially perturb the query image until the decoded message matches the target message.
Defense: The attacker does not know what target messages can prove media authenticity since these messages can be kept as a secret and updated frequently. If the attacker gains access to the secret message by querying the decoder with a watermarked image, the secrecy of the encryption key can prevent the attacker from knowing the target encrypted message for the decoder. Lastly, the decoder network can be hosted securely and can only output a binary label indicating whether the image is authentic or manipulated by matching the decoded secret with the list of trusted secrets. This would make the decoder's signal unusable for performing adversarial attacks to match a target message out of the total possible \(2^{128}\) messages.
Attack 2. Copying the watermark perturbation from one image to another: The adversary may attempt to extract the added perturbation of the watermark and add it onto a Deepfake image to authenticate the manipulated media.
Defense: Since FaceSigns generates an image- and message-specific perturbation, we hypothesize that the same perturbation, when applied to alternate images, should not be recoverable by the decoder. We verify this hypothesis by conducting an experiment in which we extract the added perturbations from 100 watermarked images and apply the extracted perturbations to 100 alternate images. The bit recovery accuracy of such an attack is just 17.6%, which is worse than random prediction.
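The perturbation-transfer check described above can be expressed as a short sketch; `bra_fn` is assumed to be a bit recovery accuracy function such as the one sketched in Section 4.1.

```python
def perturbation_transfer_bra(encoder, decoder, x_src, x_tgt, s, bra_fn):
    """Check whether a watermark perturbation copied from one image can be
    decoded from another image (Attack 2). A low BRA indicates the attack fails."""
    delta = encoder(x_src, s) - x_src                 # image-specific perturbation
    x_forged = (x_tgt + delta).clamp(0.0, 1.0)        # paste it onto another image
    s_decoded = decoder(x_forged)
    return bra_fn(s, s_decoded)
```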
Attack 3. Training a proxy encoder: The adversary can collect a dataset of original and watermarked images and train a neural network-based encoder-decoder image-to-image translation network to map any new image to a watermarked image.
Defense: One defense strategy is to only store watermarked images on devices so that an attacker never gains access to pairs of original and watermarked images. Also, the above attack can only work if the encoded images all contain the same secret message, so that the adversary can learn a generator for watermarking a new image with the same secret message. To prevent the creation of such a dataset, some bits of the message can be kept dynamic and contain a unique timestamp and device-specific codes so that each embedded bit string is different. Regularly updating the trusted message or encryption key is another preventative strategy against such attacks.

6 Related Work

In this section, we discuss various concurrent efforts that tackle the challenges related to media ownership and digital watermarking through deep neural networks. One such effort is FakeTagger [51], which aims to create a highly robust watermark that remains intact even after facial manipulation or Deepfake modification. Although this work does not consider fragility, the rationale is that one can track the origin of an image or video. However, FakeTagger has a different objective compared to our work, as we aim to identify facial tampering through semi-fragile watermarking. Additionally, unlike our approach, the practical implementation of FakeTagger is memory intensive since it requires the storage of the original photo along with its tag to retrieve the authentic image, necessitating additional data (i.e., every photo) to be saved. FaceGuard [55] is another contemporaneous work that performs digital watermarking using DNNs. However, the authors only consider robustness during their training process and do not introduce any technique to make the watermark fragile to facial manipulations. In their experiments, the authors only evaluate the recovery of the embedded watermark and do not apply Deepfake manipulations to check whether the watermark is still recoverable. Specifically, the authors embed a watermark in real images, assume that Deepfake images have not been watermarked, and use a standard Deepfake dataset to evaluate how accurately the decoder can identify real images. In contrast, our work demonstrates that relying only on robustness to benign transformations during training is not sufficient to achieve semi-fragility in watermarks.
Past work such as [22] proposed a hardware accelerator that focuses on optimizing the hardware design of image watermarking with reconfigurable modules. However, in contrast to our work, they do not conduct thorough assessments against real-world Deepfake manipulations, compositing attacks, and video compression codecs.
Another related work [56] seeks to embed robust watermarks into the weights of DNN-based generative models in order to assert ownership of models and their generated images. This solution is reliant on the assumption that Deepfake synthesizers will only utilize generative models that have been watermarked, thereby allowing detectors to identify the source model of the Deepfakes.

7 Future Directions

As neural networks continue to advance and make high-quality synthetic content more accessible, it is crucial to prioritize safeguards against potential misuse of this technology. Addressing both the generation and detection of synthesized media is key for the responsible use of media synthesis technology. Since AI-generated content is expected to increase across social media platforms in the foreseeable future, the reliable detection of such content is essential to ensure trust in social media platforms and prevent potential harms of synthesis technologies. With the proliferation of Deepfakes and manipulated content, there is a growing need for real-time detection systems capable of identifying even the most subtle alterations. Future work integrating multi-modal analyses that leverage text, audio, and video data can enhance the accuracy and robustness of synthetic media detection. Future directions should also study the potential for watermarking video and audio content together in order to authenticate real media.

8 Conclusion

We introduce a deep learning-based semi-fragile watermarking system that can certify the integrity of digital images and videos and reliably detect tampering. Through our experiments and evaluations, we demonstrate that FaceSigns generates more imperceptible watermarks than previous state-of-the-art methods while upholding the desired semi-fragile characteristics. By carefully designing a fixed set of benign and malicious transformations during training, our framework achieves generalizability to real-world image and video transformations and can reliably detect Deepfake facial and image compositing manipulations, unlike prior image watermarking techniques. Additionally, our work is a significant step forward in the field of covert watermarking for videos. FaceSigns can be vital to media authenticators in social media platforms, news agencies, and legal offices and help create more trustworthy platforms and establish consumer trust in digital media.


A Appendix

A.1 Message Length Experiments

We conduct additional experiments to study the effectiveness of the FaceSigns (Semi-fragile) framework in embedding different message lengths ranging from 64 to 512 bits into an image and present the results in Table 4. We study different lengths of the embedded watermark message: 64 bits, 128 bits, 256 bits, and 512 bits. We compute the BRA Benign for all the test benign transforms (Identity, JPG-75, JPG-50, Aden, Brooklyn, and Clarendon) and BRA Malicious for the four malicious transforms (SimSwap, FSFT, FS, and Compositing) and report the mean BRA in Table 4. As we increase the message length, we observe a slight decrease in PSNR and SSIM metrics, indicating that the watermark gets more perceptible. At all message lengths, the watermark maintains the desired fragility to malicious transforms as indicated by low BRA Malicious. However, we observe a slight decrease in robustness to benign transforms as we increase the message length to 256 and 512.
Table 4.
| Message Length | PSNR | SSIM | BPP | BRA Benign (%) | BRA Malicious (%) |
| --- | --- | --- | --- | --- | --- |
| 64 | 37.48 | 0.979 | 3.2e-4 | 98.91 | 48.22 |
| 128 | 35.43 | 0.962 | 6.5e-4 | 98.76 | 49.52 |
| 256 | 35.32 | 0.959 | 1.3e-3 | 93.42 | 52.23 |
| 512 | 32.17 | 0.938 | 2.6e-3 | 90.29 | 51.23 |
Table 4. Comparison of FaceSigns (Semi-fragile) at Different Message Lengths
The Bit recovery accuracy (BRA) for benign and malicious transforms is averaged across the transformations given in Table 3.

A.2 Discriminator Ablation Study

Table 5 presents a comparison of results for the FaceSigns (Semi-fragile) watermarking technique with and without the use of a discriminator loss. This comparison is conducted using a fixed message length of 128 bits. Comparing the PSNR values, it is evident that the watermarked images generated using the discriminator loss in the training objective have a higher PSNR (35.43) than those generated without the discriminator loss (34.21). This suggests that incorporating the discriminator loss has a positive impact on preserving image quality after watermarking. The SSIM values reinforce this observation.
Table 5.
| Technique | PSNR | SSIM | BRA Benign (%) | BRA Malicious (%) |
| --- | --- | --- | --- | --- |
| With Discriminator Loss | 35.43 | 0.962 | 98.76 | 49.52 |
| Without Discriminator Loss | 34.21 | 0.941 | 98.61 | 50.18 |
Table 5. Comparison of FaceSigns (Semi-fragile) with and without the Discriminator Loss Using a Message Length of 128

A.3 Additional Image Examples

We present additional examples of original and watermarked images in Figure 12.
Fig. 12.
Fig. 12. Additional examples of original and watermarked images using prior works and our method (FaceSigns). Observe the change in the perturbation pattern as we incorporate both robust and benign transformations in the FaceSigns (Semi-fragile) model.

References

[1]
Darius Afchar, Vincent Nozick, Junichi Yamagishi, and Isao Echizen. 2018. Mesonet: A compact facial video forgery detection network. In 2018 IEEE International Workshop on Information Forensics and Security (WIFS’18). IEEE, 1–7.
[2]
Shumeet Baluja. 2017. Hiding images in plain sight: Deep steganography. NIPS 30 (2017).
[3]
Yi-Lin Bei, Xiao-Rong Zhu, Qian Zhang, and Sai Qiao. 2022. A robust image watermarking algorithm based on content authentication and intelligent optimization. In Proceedings of the 5th International Conference on Control and Computer Vision (ICCCV’22). ACM, New York, NY, USA, 164–170.
[4]
Yoshua Bengio, Nicholas Léonard, and Aaron Courville. 2013. Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013).
[5]
Oussama Benrhouma, Houcemeddine Hermassi, and Safya Belghith. 2015. Tamper detection and self-recovery scheme by DWT watermarking. Nonlinear Dynamics 79 (2015), 1817–1833.
[6]
Siddharth Bhalerao, Irshad Ahmad Ansari, and Anil Kumar. 2021. A secure image watermarking for tamper detection and localization. Journal of Ambient Intelligence and Humanized Computing 12, 1 (2021), 1057–1068.
[7]
Ning Bi, Qiyu Sun, Daren Huang, Zhihua Yang, and Jiwu Huang. 2007. Robust image watermarking based on multiband wavelets and empirical mode decomposition. IEEE Transactions on Image Processing 16, 8 (2007), 1956–1966.
[8]
Renwang Chen, Xuanhong Chen, Bingbing Ni, and Yanhao Ge. 2020. SimSwap: An efficient framework for high fidelity face swapping. In The 28th ACM International Conference on Multimedia (MM’20). ACM.
[9]
Yunjey Choi, Youngjung Uh, Jaejun Yoo, and Jung-Woo Ha. 2020. StarGAN v2: Diverse image synthesis for multiple domains. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[10]
Ingemar J. Cox, Joe Kilian, F. Thomson Leighton, and Talal Shamoon. 1997. Secure spread spectrum watermarking for multimedia. IEEE Transactions on Image Processing 6, 12 (1997), 1673–1687.
[11]
Ingemar J. Cox, Matthew L. Miller, Jeffrey Adam Bloom, and Chris Honsinger. 2002. Digital Watermarking. Vol. 53. Springer.
[12]
DeepFakes. 2017. https://github.com/deepfakes/faceswap
[13]
Ferdinando Di Martino and Salvatore Sessa. 2019. Fragile watermarking tamper detection via bilinear fuzzy relation equations. Journal of Ambient Intelligence and Humanized Computing 10 (2019), 2041–2061.
[14]
Brian Dolhansky, Joanna Bitton, Ben Pflaum, Jikuo Lu, Russ Howes, Menglin Wang, and Cristian Canton Ferrer. 2020. The DeepFake Detection Challenge (DFDC) dataset. arXiv preprint arXiv:2006.07397 (2020).
[15]
Jessica Fridrich. 2009. Steganography in Digital Media: Principles, Algorithms, and Applications. Cambridge University Press.
[16]
Apurva Gandhi and Shomik Jain. 2020. Adversarial perturbations fool deepfake detectors. In 2020 International Joint Conference on Neural Networks (IJCNN’20). IEEE.
[17]
Jamie Hayes and George Danezis. 2017. Generating steganographic images via adversarial training. In International Conference on Neural Information Processing Systems.
[18]
Chi Kin Ho and Chang-Tsun Li. 2004. Semi-fragile watermarking scheme for authentication of JPEG images. In Proceedings of the International Conference on Information Technology: Coding and Computing, 2004 (ITCC’04). IEEE.
[19]
Mark J. Huiskes and Michael S. Lew. 2008. The MIR flickr retrieval evaluation. In Proceedings of the 1st ACM International Conference on Multimedia Information Retrieval (MIR’08). ACM, New York, NY, USA.
[20]
Shehzeen Hussain, Paarth Neekhara, Brian Dolhansky, Joanna Bitton, Cristian Canton Ferrer, Julian McAuley, and Farinaz Koushanfar. 2022. Exposing vulnerabilities of deepfake detection systems with robust attacks. Digital Threats: Research and Practice (DTRAP) 3, 3 (2022), 1–23.
[21]
Shehzeen Hussain, Paarth Neekhara, Malhar Jere, Farinaz Koushanfar, and Julian McAuley. 2021. Adversarial deepfakes: Evaluating vulnerability of deepfake detectors to adversarial examples. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). 3348–3357.
[22]
Shehzeen Hussain, Nojan Sheybani, Paarth Neekhara, Xinqiao Zhang, Javier Duarte, and Farinaz Koushanfar. 2022. FastStamp: Accelerating neural steganography and digital watermarking of images on FPGAs. In Proceedings of the 41st IEEE/ACM International Conference on Computer-aided Design (ICCAD’22). ACM, New York, NY, USA.
[23]
Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A. Efros. 2017. Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
[25]
Tero Karras, Samuli Laine, and Timo Aila. 2019. A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[27]
Chunlei Li, Aihua Zhang, Zhoufeng Liu, Liang Liao, and Di Huang. 2015. Semi-fragile self-recoverable watermarking algorithm based on wavelet group quantization and double authentication. Multimedia Tools and Applications 74 (2015), 10581–10604.
[28]
Eugene T. Lin, Christine I. Podilchuk, and Edward J. Delp III. 2000. Detection of image alterations using semifragile watermarks. In Security and Watermarking of Multimedia Contents II. International Society for Optics and Photonics.
[29]
Ziwei Liu, Ping Luo, Xiaogang Wang, and Xiaoou Tang. 2015. Deep learning face attributes in the wild. In Proceedings of the IEEE International Conference on Computer Vision (ICCV).
[30]
Xiyang Luo, Ruohan Zhan, Huiwen Chang, Feng Yang, and Peyman Milanfar. 2020. Distortion agnostic deep watermarking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[31]
Siwei Lyu. 2020. Deepfake detection: Current challenges and next steps. In 2020 IEEE International Conference on Multimedia & Expo Workshops (ICMEW’20).
[32]
Yisroel Mirsky and Wenke Lee. 2021. The creation and detection of deepfakes: A survey. ACM Computing Surveys 54, 1, Article 7 (Jan. 2021), 41 pages.
[33]
Paarth Neekhara, Brian Dolhansky, Joanna Bitton, and Cristian Canton Ferrer. 2021. Adversarial threats to DeepFake detection: A practical perspective. In 2021 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW’21). IEEE, 923–932.
[34]
Yuval Nirkin, Yosi Keller, and Tal Hassner. 2019. FSGAN: Subject agnostic face swapping and reenactment. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). 7184–7193.
[35]
Augustus Odena, Vincent Dumoulin, and Chris Olah. 2016. Deconvolution and checkerboard artifacts. Distill 1, 10 (2016), e3.
[36]
Shelby Pereira and Thierry Pun. 2000. Robust template matching for affine resistant image watermarks. IEEE Transactions on Image Processing 9, 6 (2000), 1123–1129.
[37]
Shelby Pereira, Joseph J. K. O. Ruanaidh, Frederic Deguillaume, Gabriela Csurka, and Thierry Pun. 1999. Template based recovery of Fourier-based watermarks using log-polar and log-log maps. In Proceedings IEEE International Conference on Multimedia Computing and Systems. IEEE.
[38]
Thomas Porter and Tom Duff. 1984. Compositing digital images. In Proceedings of the 11th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH’84). ACM, New York, NY, USA.
[39]
R. O. Preda and D. N. Vizireanu. 2015. Watermarking-based image authentication robust to JPEG compression. Electronics Letters 51, 23 (2015), 1873–1875.
[40]
Olaf Ronneberger, Philipp Fischer, and Thomas Brox. 2015. U-Net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-assisted Intervention. Springer.
[41]
Andreas Rossler, Davide Cozzolino, Luisa Verdoliva, Christian Riess, Justus Thies, and Matthias Niessner. 2019. FaceForensics++: Learning to detect manipulated facial images. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV).
[42]
Selim Seferbekov. 2020. https://github.com/selimsef/dfdc_deepfake_challenge
[44]
S. Shefali and S. M. Deshpande. 2007. Information security through semi-fragile watermarking. In International Conference on Computational Intelligence and Multimedia Applications (ICCIMA’07). IEEE.
[45]
Abdulaziz Shehab, Mohamed Elhoseny, Khan Muhammad, Arun Kumar Sangaiah, Po Yang, Haojun Huang, and Guolin Hou. 2018. Secure and robust fragile watermarking scheme for medical images. IEEE Access 6 (2018), 10269–10278.
[46]
Richard Shin and Dawn Song. 2017. JPEG-resistant adversarial images. In NIPS 2017 Workshop on Machine Learning and Computer Security.
[47]
Khurram Soomro, Amir Roshan Zamir, and Mubarak Shah. 2012. UCF101: A dataset of 101 human actions classes from videos in the wild. arXiv preprint arXiv:1212.0402 (2012).
[48]
Rui Sun, Hong Sun, and Tianren Yao. 2002. A SVD- and quantization based semi-fragile watermarking technique for image authentication. In 6th International Conference on Signal Processing, 2002. IEEE.
[49]
Matthew Tancik, Ben Mildenhall, and Ren Ng. 2020. StegaStamp: Invisible hyperlinks in physical photographs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2387–2395.
[50]
Justus Thies, Michael Zollhofer, Marc Stamminger, Christian Theobalt, and Matthias Nießner. 2016. Face2Face: Real-time face capture and reenactment of RGB videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2387–2395.
[51]
Run Wang, Felix Juefei-Xu, Meng Luo, Yang Liu, and Lina Wang. 2021. FakeTagger: Robust safeguards against deepfake dissemination via provenance tracking. In Proceedings of the 29th ACM International Conference on Multimedia (MM’21). ACM, New York, NY, USA, 3546–3555.
[52]
Sheng-Yu Wang, Oliver Wang, Richard Zhang, Andrew Owens, and Alexei A. Efros. 2020. CNN-generated images are surprisingly easy to spot... for now. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 8695–8704.
[53]
Jun Xiao and Ying Wang. 2008. A semi-fragile watermarking tolerant of Laplacian sharpening. In 2008 International Conference on Computer Science and Software Engineering. IEEE.
[54]
Hengfu Yang, Xingming Sun, and Guang Sun. 2010. A semi-fragile watermarking algorithm using adaptive least significant bit substitution. Information Technology Journal 9 (2010), 20–26.
[55]
Yuankun Yang, Chenyue Liang, Hongyu He, Xiaoyu Cao, and Neil Zhenqiang Gong. 2021. FaceGuard: Proactive deepfake detection. CoRR abs/2109.05673. https://arxiv.org/abs/2109.05673
[56]
Ning Yu, Vladislav Skripniuk, Sahar Abdelnabi, and Mario Fritz. 2021. Artificial fingerprinting for generative models: Rooting deepfake attribution in training data. In Proceedings of the IEEE/CVF International Conference on Computer Vision.
[57]
Xiaoyan Yu, Chengyou Wang, and Xiao Zhou. 2017. Review on semi-fragile watermarking algorithms for content authentication of digital images. Future Internet 9, 4 (2017), 56.
[58]
Egor Zakharov, Aliaksandra Shysheya, Egor Burkov, and Victor Lempitsky. 2019. Few-shot adversarial learning of realistic neural talking head models. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). 9459–9468.
[59]
Ru Zhang, Shiqi Dong, and Jianyi Liu. 2019. Invisible steganography via generative adversarial networks. Multimedia Tools and Applications 78 (2019), 8559–8575.
[60]
Richard Zhang, Phillip Isola, Alexei A. Efros, Eli Shechtman, and Oliver Wang. 2018. The unreasonable effectiveness of deep features as a perceptual metric. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[61]
Yujie Zhong, Relja Arandjelović, and Andrew Zisserman. 2018. Compact deep aggregation for set retrieval. In Workshop on Compact and Efficient Feature Representation and Learning in Computer Vision (ECCV’18).
[62]
Jiren Zhu, Russell Kaplan, Justin Johnson, and Li Fei-Fei. 2018. HiDDeN: Hiding data with deep networks. In Proceedings of the European Conference on Computer Vision (ECCV). 657–672.

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 20, Issue 11 (November 2024), 702 pages. EISSN: 1551-6865. DOI: 10.1145/3613730.

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 12 September 2024
Online AM: 13 January 2024
Accepted: 26 December 2023
Revised: 19 November 2023
Received: 01 April 2023
Published in TOMM Volume 20, Issue 11

Author Tags

  1. Media forensics
  2. Deepfakes
  3. watermarking
  4. semi-fragile watermarking
  5. video watermarking

Qualifiers

  • Research-article
