SATBA: An Invisible Backdoor Attack Based On Spatial Attention
Huasong Zhou, Xiaowei Xu, Xiaodong Wang, and Leon Bevan Bullock

TL;DR
SATBA introduces a stealthy backdoor attack that uses spatial attention and a U-Net architecture to embed triggers into images, achieving high attack success rates while evading detection and preserving model accuracy.
Contribution
This paper presents SATBA, a novel backdoor attack method that uses spatial attention and a U-Net to embed triggers invisibly, overcoming the visibility and feature-loss issues of prior attacks.
Findings
High attack success rate across multiple datasets
Robustness against backdoor defenses
Enhanced stealthiness demonstrated through image similarity experiments
Abstract
Backdoor attacks have emerged as a novel and concerning threat to AI security. These attacks involve training Deep Neural Networks (DNNs) on datasets that contain hidden trigger patterns. Although the poisoned model behaves normally on benign samples, it exhibits abnormal behavior on samples containing the trigger pattern. However, most existing backdoor attacks suffer from two significant drawbacks: their trigger patterns are visible and easy to detect by backdoor defenses or even human inspection, and their injection process results in the loss of natural sample features and trigger patterns, thereby reducing the attack success rate and model accuracy. In this paper, we propose a novel backdoor attack named SATBA that overcomes these limitations using spatial attention and a U-Net based model. The attack process begins by using spatial attention to extract meaningful data features…
Table 2: Clean data accuracy (CDA) and attack success rate (ASR) of different backdoor attacks on three datasets and three DNNs.
| Attack | Model | MNIST CDA | MNIST ASR | CIFAR10 CDA | CIFAR10 ASR | GTSRB CDA | GTSRB ASR |
|---|---|---|---|---|---|---|---|
| BadNets[11] | AlexNet | 0.993 | 0.999 | 0.869 | 0.943 | 0.957 | 0.994 |
| BadNets[11] | VGG16 | 0.994 | 1.000 | 0.886 | 0.967 | 0.963 | 0.995 |
| BadNets[11] | ResNet18 | 0.996 | 1.000 | 0.898 | 0.961 | 0.962 | 0.997 |
| Blend[6] | AlexNet | 0.992 | 1.000 | 0.885 | 0.996 | 0.960 | 0.999 |
| Blend[6] | VGG16 | 0.992 | 1.000 | 0.895 | 0.997 | 0.962 | 0.999 |
| Blend[6] | ResNet18 | 0.994 | 1.000 | 0.906 | 0.998 | 0.961 | 0.999 |
| Clean | AlexNet | 0.992 | - | 0.889 | - | 0.961 | - |
| Clean | VGG16 | 0.901 | - | 0.902 | - | 0.962 | - |
| Clean | ResNet18 | 0.994 | - | 0.911 | - | 0.967 | - |
| Refool[21] | AlexNet | 0.992 | 1.000 | 0.879 | 0.938 | 0.959 | 0.992 |
| Refool[21] | VGG16 | 0.991 | 1.000 | 0.892 | 0.953 | 0.963 | 0.992 |
| Refool[21] | ResNet18 | 0.994 | 1.000 | 0.901 | 0.954 | 0.961 | 0.995 |
| Wanet[24] | AlexNet | 0.994 | 0.999 | 0.886 | 0.998 | 0.961 | 0.999 |
| Wanet[24] | VGG16 | 0.995 | 1.000 | 0.893 | 0.999 | 0.965 | 0.999 |
| Wanet[24] | ResNet18 | 0.995 | 1.000 | 0.903 | 0.999 | 0.960 | 0.999 |
| SATBA (Ours) | AlexNet | 0.996 | 1.000 | 0.892 | 0.998 | 0.958 | 0.996 |
| SATBA (Ours) | VGG16 | 0.995 | 1.000 | 0.888 | 0.998 | 0.965 | 0.999 |
| SATBA (Ours) | ResNet18 | 0.996 | 1.000 | 0.904 | 0.999 | 0.955 | 0.994 |
Table 3: Similarity between clean and poisoned images for different backdoor attacks (higher PSNR and SSIM are better; lower MSE and LPIPS are better).
| Attack | MNIST PSNR | MNIST SSIM | MNIST MSE | MNIST LPIPS | CIFAR10 PSNR | CIFAR10 SSIM | CIFAR10 MSE | CIFAR10 LPIPS | GTSRB PSNR | GTSRB SSIM | GTSRB MSE | GTSRB LPIPS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Badnets | 24.0935 | 0.9874 | 253.3595 | 0.0013 | 30.8597 | 0.9935 | 89.0661 | 0.0015 | 27.6426 | 0.9914 | 151.2264 | 0.0076 |
| Blend | 15.9751 | 0.5637 | 1644.4315 | 0.0215 | 20.2685 | 0.7835 | 652.2303 | 0.0313 | 18.6797 | 0.6788 | 961.7501 | 0.0796 |
| Refool | 12.4355 | 0.4612 | 6023.4371 | 0.0734 | 17.2960 | 0.6920 | 1812.7980 | 0.0668 | 14.9439 | 0.5442 | 3440.4423 | 0.1578 |
| Wanet | 23.5286 | 0.9314 | 298.6892 | 0.0090 | 29.2407 | 0.9511 | 90.2150 | 0.0077 | 32.3926 | 0.960 | 78.3503 | 0.0863 |
| SATBA | 47.2693 | 0.9862 | 1.4735 | 0.0001 | 36.8021 | 0.9857 | 22.8828 | 0.0060 | 36.6371 | 0.9802 | 21.5731 | 0.0065 |
College of Computer Science and Technology, Ocean University of China, Qingdao, China
Email: [email protected]
Corresponding author: Xiaowei Xu (✉)
This research was partially supported by the National Key Research and Development Program of China under Grant 2020YFB1710005 and the Natural Science Foundation of Shandong Province under Grant ZR2022MF299.
Abstract
Backdoor attacks pose a new and emerging threat to AI security, in which Deep Neural Networks (DNNs) are trained on datasets to which hidden trigger patterns have been added. Although the poisoned model behaves normally on benign samples, it produces anomalous results on samples containing the trigger pattern. Nevertheless, most existing backdoor attacks face two significant drawbacks: their trigger patterns are visible and easy to detect by human inspection, and their injection process leads to the loss of natural sample features and trigger patterns, thereby reducing the attack success rate and the model accuracy. In this paper, we propose a novel backdoor attack named SATBA that overcomes these limitations by using a spatial attention mechanism and a U-type model. Our attack leverages the spatial attention mechanism to extract data features and generate invisible trigger patterns that are correlated with clean data. It then uses the U-type model to plant these trigger patterns into the original data without causing noticeable feature loss. We evaluate our attack on three prominent image classification DNNs across three standard datasets and demonstrate that it achieves a high attack success rate and robustness against backdoor defenses. We also conduct extensive experiments on image similarity to highlight the stealthiness of our attack.
Keywords: Backdoor Attack · Deep Neural Network · Spatial Attention · U-Net
1 Introduction
Over the past decade, deep neural networks (DNNs) have achieved impressive performance in domains such as facial recognition[12], automated driving[37], and medical diagnosis[25]. However, they are also seriously vulnerable to adversarial attacks[10], which can tamper with the prediction output of DNNs by adding small noise to the input samples.
Backdoor attacks are a stealthy attack method against DNNs that has emerged with the advancement of attack techniques. In contrast to traditional adversarial attacks, backdoor attacks seek to insert a hidden backdoor in the training process of the DNNs, which enables the target DNNs to behave normally on clean samples but alter their output when the hidden backdoor is activated by the attacker’s designed input. This allows attackers to manipulate the DNNs.
Since the introduction of backdoor attacks, various approaches have been developed to implement them. Some methods directly modify clean data to infect it, such as Badnets[11], Blend[6] and SIG[2]. More recently, RDBA[42] introduced a raindrop-based attack that poisons clean images and demonstrates the potential threat of backdoor attacks induced by natural conditions in the physical world.
Despite the progress in backdoor attacks, most existing methods still suffer from the following major challenges: (1) The trigger incorporated in the clean image is static and fixed. That is, all poisoned images share the same trigger pattern. (2) The trigger is easy to identify and erase by defense methods and even humans because it is too conspicuous. (3) The features of images and trigger patterns are usually lost during the attack stage as a result of the modification of the clean image and trigger in the spatial domain.
This paper proposes SATBA, a new imperceptible backdoor attack on deep neural networks (DNNs) that utilizes spatial attention[23]. Our attack involves three steps: (1) extracting image features with a traditional algorithm such as HOG[27] or LBP[19], (2) obtaining the spatial attention matrix of the victim model on clean images and creating trigger patterns based on it, and (3) employing a U-shaped convolutional neural network to embed the trigger into clean images and launching attacks on the training process of the targeted model. We test our attack on benchmark datasets and DNNs and show that it is effective and reliable: it attains a high attack success rate (ASR), preserves a high clean data accuracy (CDA), and exhibits a low anomaly index and high stealthiness. We summarize the main contributions of this paper as follows:
- This paper presents the SATBA attack, the first attempt to use spatial attention mechanisms to create trigger patterns and install backdoors into DNN models.
- A U-Net based network is designed for injecting trigger patterns into clean images with minimal feature loss. The network preserves both the clean image and the trigger pattern features during the injection process.
- Extensive experiments show that SATBA outperforms several conventional backdoor attack methods in terms of attack success rate, robustness, and stealthiness, demonstrating the power and versatility of our approach to neural network security.
The remainder of this paper is structured as follows. In Section 2, we briefly review the related work on U-Net, attention mechanisms and backdoor attacks for image classification. In Section 3, we present the details of our proposed backdoor attack method. Our experimental results are reported and analyzed in Section 4. Finally, we conclude this paper in Section 5.
2 Related Work
2.1 U-Net
U-Net[28] is a U-shaped Fully Convolutional Network (FCN) that was originally proposed by Ronneberger et al. for medical image segmentation[3]. It achieved state-of-the-art results in the ISBI Cell Tracking challenge 2015[32]. Since then, U-Net has become a popular and versatile network structure for various tasks. For instance, Zhou et al.[43] introduced a nested structure of convolutional layers connected through skip connections to model multi-scale representations. Oktay et al.[26] combined U-Net with an attention mechanism to increase the sensitivity of the model to foreground pixels. This model was named Attention U-Net. Xiao et al.[40] explored the implementation of residual blocks[13] in U-Net and found that they improved the convergence speed of the network training compared to the original U-Net. They called their model Deep U-Net. Most recently, Chen et al.[5] proposed TransUNet, a novel deep network that integrates the Vision Transformer (ViT)[9] into U-Net. This model aims to address the limitation of U-Net in modeling long-range dependencies. TransUNet was one of the first works to apply ViT to the U-Net architecture.
Skip connection[3] is a key component of U-Net’s structure that enables the network to leverage the information from low-level convolutional layers, which can, to a certain extent, compensate for the feature loss induced by convolution operations. Inspired by this benefit, we incorporate U-Net into our trigger injection network to tackle the problem of representation loss that occurs during the poisoned image generation process for clean images and trigger patterns.
2.2 Attention Mechanism
Attention mechanisms were originally proposed for Computer Vision (CV) tasks. They gained popularity after the work of Mnih et al.[23], who combined visual attention with an RNN model for image classification tasks. Subsequently, Bahdanau et al.[1] applied attention mechanisms to Natural Language Processing (NLP)[4, 8] for the first time. Vaswani et al.[30] introduced self-attention for text representation learning. Since then, attention mechanisms have been widely used.
Attention mechanisms for computer vision can be broadly categorized into four types: channel attention, spatial attention, temporal attention, and branch attention. Spatial attention refers to the ability of a model to selectively focus on specific regions of an image. One of the earliest works on spatial attention was the Recurrent Attention Model (RAM) by Mnih et al.[23], which used recurrent neural networks (RNNs)[22] and reinforcement learning (RL) to learn where to attend. Another influential work was the Spatial Transformer Network (STN) by Jaderberg et al.[15], which incorporated a trainable module that could explicitly warp the important regions of the input image. More recently, Dosovitskiy et al.[9] proposed the Vision Transformer (ViT), which applied the transformer architecture[35], originally designed for natural language processing, to image classification tasks.
Spatial attention can be exploited to locate the important regions of interest in an image for a given victim model. By producing and embedding backdoor triggers into these regions, we can theoretically enhance the performance and robustness of the attack. This is because different images may have different attention regions for different target models, and therefore the trigger pattern can be more flexible, i.e., the trigger can vary depending on the sample and the target model.
2.3 Backdoor Attack
Backdoor attacks in deep neural networks (DNNs) were first introduced by Gu et al.[11], who inserted a small patch into clean images and used the poisoned images to train the target DNNs. This attack, also known as Badnets or Patched attack, could manipulate the behavior of the trained model when exposed to specific images with the trigger pattern. Chen et al.[6] suggested a different strategy for trigger generation and injection, using a Hello Kitty image as a trigger and optimizing the weights between benign and trigger images. This method, also called Blend, overlaid the trigger image on the clean image to produce poisoned images. Liu et al.[20] created poisoned data by performing reverse projection to fragile neurons in DNNs. Their trigger was universal but static. Turner et al.[34] examined the clean-label backdoor attack, which could achieve a high attack success rate without altering the target label of backdoor samples. Liu et al.[21] proposed Refool, which exploited physical reflection in daily life to construct a reflection model and use the reflected image of an object as a trigger, increasing the concealment of the trigger. Tuan et al.[24] designed a backdoor attack based on image distortion and introduced a novel training method called noise mode, which further narrowed the visual gap between poisoned and clean samples. Zhao et al.[42] applied a backdoor attack via raindrops, indicating the potential threat of backdoor attacks caused by natural conditions in the physical world.
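For orientation, the two earliest poisoning styles described above can be sketched in a few lines. This is an illustrative sketch only, not the original implementations: the function names, the patch size and position, and the blending ratio `alpha=0.2` are all assumptions chosen for the example.

```python
import numpy as np

def badnets_patch(image, patch_value=1.0, size=3):
    """Stamp a small square patch in the bottom-right corner (BadNets-style).

    image: (H, W, C) array with values in [0, 1]. Patch size, position and
    value are illustrative assumptions.
    """
    poisoned = image.copy()
    poisoned[-size:, -size:, :] = patch_value
    return poisoned

def blend(image, trigger_image, alpha=0.2):
    """Overlay a trigger image on a clean image (Blend-style).

    alpha is the assumed weight of the trigger image in the mixture.
    """
    return (1.0 - alpha) * image + alpha * trigger_image

clean = np.zeros((32, 32, 3))
patched = badnets_patch(clean)
blended = blend(clean, np.ones((32, 32, 3)))
```

Both triggers are the same for every image, which is the "static and fixed" property that sample-specific triggers are meant to avoid.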
The main goal of most existing backdoor attacks is to craft a poisoned image that resembles the original image. However, the triggers used in these attacks are often conspicuous and perceptible to human vision. Furthermore, they are independent of the sample itself and can be detected and removed by most defense methods. Unlike these attacks, our proposed backdoor attack derives the trigger from the clean sample itself and injects it with a U-Net model, which makes it stealthier and more effective than many other attacks.
3 Method
In this section, we shall first explicate the concept of a backdoor attack. Next, we provide an overview of the SATBA method, and subsequently, we present our novel approach for executing backdoor attacks that relies on spatial attention.
3.1 Definition of Backdoor Attack
The primary objective of this research is to investigate the efficacy of backdoor attacks on image classification neural networks. We define $\mathcal{X}$ as the input image domain and $\mathcal{Y}$ as the corresponding ground-truth label set of $\mathcal{X}$. A deep neural network $f_\theta$ is trained to map the input space to the label space using the dataset $D = \{(x_i, y_i)\}_{i=1}^{N}$, where $x_i \in \mathcal{X}$ and $y_i \in \mathcal{Y}$. In a backdoor attack, the attacker carefully selects clean images from $D$ and generates poisoned samples. A DNN model is then trained on a poisoned dataset $D_p$, which consists of the backdoored images $D_b$, derived from a subset of $D$, and the remaining clean images $D_c$, i.e.,
$$D_p = D_b \cup D_c.$$
Accordingly, the poison rate is defined as $\eta = |D_b| / |D|$. As a consequence, the poisoned model $f_{\theta'}$ works as expected on the pristine dataset but outputs malicious predictions when triggered by a poisoned image, that is,
$$f_{\theta'}(x) = y, \qquad f_{\theta'}(x') = y_t,$$
where $x'$ is the poisoned version of the clean image $x$. The ground-truth label of $x$ and the attack's target label are $y$ and $y_t$, respectively.
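The definition above can be illustrated with a small sketch that replaces a fraction of a dataset with triggered, relabeled copies. It is a minimal example under stated assumptions, not the authors' code: `poison_dataset`, `add_trigger`, and `toy_trigger` are hypothetical names, the toy trigger simply sets one pixel, and samples are drawn uniformly from the whole dataset rather than per class.

```python
import random

def poison_dataset(dataset, add_trigger, target_label, poison_rate, seed=0):
    """Return (poisoned_dataset, poisoned_indices).

    dataset     : list of (image, label) pairs
    add_trigger : hypothetical function mapping a clean image to a poisoned one
    poison_rate : fraction of samples replaced, i.e. |backdoored| / |dataset|
    """
    rng = random.Random(seed)
    n_poison = int(round(poison_rate * len(dataset)))
    chosen = set(rng.sample(range(len(dataset)), n_poison))
    poisoned = []
    for i, (image, label) in enumerate(dataset):
        if i in chosen:
            # Backdoored sample: triggered image, relabeled to the target class.
            poisoned.append((add_trigger(image), target_label))
        else:
            # Remaining clean sample is kept unchanged.
            poisoned.append((image, label))
    return poisoned, sorted(chosen)

def toy_trigger(image):
    """Toy stand-in for a trigger: set the first pixel to 1.0."""
    out = list(image)
    out[0] = 1.0
    return out

clean_set = [([0.0, 0.5, 0.2], i % 3) for i in range(100)]
poisoned_set, poisoned_idx = poison_dataset(
    clean_set, toy_trigger, target_label=0, poison_rate=0.1
)
```

With 100 samples and a poison rate of 0.1, ten samples are replaced and the other ninety are left untouched.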
3.2 Attack Pipeline
Fig. 1 illustrates the pipeline of our attack. First, the Trigger Generation module takes a clean image as input and generates a trigger related to it. The Injection Model then takes the clean image and the trigger as inputs and produces the poisoned image by adding the trigger to a specific location of the clean image. After that, we train a victim deep model on the poisoned dataset that contains the poisoned images from the previous step. When the training process ends, the backdoor has been injected into the target model.
3.3 Our Proposed Attack SATBA
Our work is based on the assumption that the attacker possesses detailed information about the targeted deep neural network (DNN) and aims to develop a function for generating triggers using spatial attention, denoted as . Our approach begins with feature extraction from a clean sample using a pretrained benign model with an identical architecture to that of the target model. Next, the clean image is passed through the pretrained model to obtain the feature maps, which are used to calculate spatial attention maps for each feature map. These maps are then consolidated into a single spatial attention matrix . Finally, we compute the dot product between the image feature and yielding the trigger . More precisely, we employ the spatial attention mapping method proposed in PFSAN[38] for our experiments. To generate the trigger for a given clean image , we utilize the following equation:
[TABLE]
Here, is the feature extraction function, represents the feature map of each convolution layer in the pretrained model, denotes the reshape operation and indicates flatten operation. A schematic illustration of this procedure can be found in Fig. 2.
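The trigger-generation idea can be sketched as follows. This is a hedged illustration, not the PFSAN mapping used in the paper: it assumes a common activation-based attention (sum of squared activations over channels), nearest-neighbour resizing, averaging across layers, and an element-wise product between the image feature and the attention matrix. The names `layer_attention` and `make_trigger` are hypothetical, and random arrays stand in for real feature maps.

```python
import numpy as np

def layer_attention(feature_map, out_size):
    """Turn (C, H, W) activations into an (out_size, out_size) attention map."""
    att = np.sum(feature_map ** 2, axis=0)            # collapse the channel axis
    h, w = att.shape
    rows = np.arange(out_size) * h // out_size         # nearest-neighbour resize
    cols = np.arange(out_size) * w // out_size
    att = att[rows][:, cols]
    span = att.max() - att.min()
    # Normalise to [0, 1]; a constant map carries no spatial preference.
    return (att - att.min()) / span if span > 0 else np.zeros_like(att)

def make_trigger(image_feature, feature_maps):
    """Weight an (H, W) image feature (e.g. an edge map) by the model's attention."""
    size = image_feature.shape[0]
    attention = np.mean([layer_attention(f, size) for f in feature_maps], axis=0)
    return image_feature * attention, attention

gen = np.random.default_rng(0)
feature_maps = [gen.random((4, 16, 16)), gen.random((8, 8, 8))]  # stand-in activations
image_feature = gen.random((32, 32))                             # stand-in HOG/LBP map
trigger, attention = make_trigger(image_feature, feature_maps)
```

Because the attention matrix depends on both the image and the model that produced the feature maps, the resulting trigger differs from sample to sample.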
Subsequently, we concatenate the clean image $x$ with its corresponding trigger pattern $t$ and pass the result to the Injection Model to obtain the poisoned image $x'$. The mathematical representation of this process is:
$$x' = \mathcal{I}(x \oplus t),$$
where $\mathcal{I}$ refers to the injection function and $\oplus$ represents the concatenation operation.
In order to maintain the stealthiness of the backdoored image and minimize the feature loss of the clean image and its trigger, we employ an Extraction Model to recover the trigger pattern from the poisoned image produced by the Injection Model. This process can be expressed by the following equation:
$$t' = \mathcal{E}(x'),$$
where $\mathcal{E}$ represents the extraction function and $t'$ denotes the reconstructed trigger image. We design a trigger injection network based on the U-Net architecture, which utilizes skip connections to compensate for the loss of features during the injection process. After down-sampling and up-sampling the concatenated clean image and trigger image, the output of the Injection Model is adjusted to have the same dimensions as the clean image.
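The data flow through a U-shaped injection network can be sketched with plain array operations. This toy example only shows shapes and the skip connection; it uses random, untrained weights and a single down/up-sampling stage, and every function name and channel width here is an assumption rather than the architecture of the paper.

```python
import numpy as np

def avg_pool2(x):
    """Halve the spatial resolution of an (H, W, C) array by 2x2 average pooling."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def upsample2(x):
    """Double the spatial resolution by nearest-neighbour repetition."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def conv1x1(x, weight):
    """Per-pixel linear map over channels: (H, W, Cin) @ (Cin, Cout)."""
    return x @ weight

def toy_injection(clean, trigger, gen):
    stacked = np.concatenate([clean, trigger], axis=-1)   # (H, W, 6) concatenated input
    enc = conv1x1(stacked, gen.normal(size=(6, 8)))       # encoder features
    bottleneck = avg_pool2(enc)                           # (H/2, W/2, 8)
    dec = upsample2(bottleneck)                           # back to (H, W, 8)
    merged = np.concatenate([dec, enc], axis=-1)          # skip connection
    return conv1x1(merged, gen.normal(size=(16, 3)))      # (H, W, 3), same as clean

gen = np.random.default_rng(0)
clean_img = gen.random((32, 32, 3))
trigger_img = gen.random((32, 32, 3))
poisoned_img = toy_injection(clean_img, trigger_img, gen)
```

The skip connection feeds full-resolution encoder features directly to the output layer, which is how a U-shaped network can retain detail that pooling would otherwise discard.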
3.4 Loss Function
We aim to hide the trigger pattern invisibly in the clean image by minimizing the loss between the clean and poisoned images. Additionally, we introduce a trigger loss to maintain the trigger features during the injection process. The overall loss function is given as:
$$\mathcal{L} = \alpha \, \mathcal{L}_{img}(x, x') + \beta \, \mathcal{L}_{tri}(t, t').$$
Here, $\mathcal{L}_{img}$ is the loss between the clean image $x$ and the poisoned image $x'$, while $\mathcal{L}_{tri}$ represents the loss between the original trigger image $t$ and the reconstructed trigger image $t'$. The hyper-parameters $\alpha$ and $\beta$ control the contribution of each loss term. Fig. 3 presents the complete structure of our architecture.
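A weighted two-term objective of this kind can be sketched as below. The per-term loss is assumed to be mean squared error, which the text does not specify, and `total_loss`, `w_image`, and `w_trigger` are hypothetical names for the combined loss and its two weights.

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two arrays of the same shape."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.mean((a - b) ** 2))

def total_loss(clean, poisoned, trigger, recovered, w_image, w_trigger):
    """Weighted sum of the image loss and the trigger loss.

    Image loss   : clean image vs. poisoned image (keeps the trigger invisible).
    Trigger loss : original trigger vs. recovered trigger (keeps it recoverable).
    MSE for both terms is an assumption of this sketch.
    """
    image_term = mse(clean, poisoned)
    trigger_term = mse(trigger, recovered)
    return w_image * image_term + w_trigger * trigger_term

clean = np.zeros((4, 4))
poisoned = np.full((4, 4), 0.1)     # image term = 0.01
trigger = np.ones((4, 4))
recovered = np.full((4, 4), 0.8)    # trigger term = 0.04
loss = total_loss(clean, poisoned, trigger, recovered, w_image=0.5, w_trigger=1.0)
```

The two weights trade off stealthiness against trigger fidelity: a larger image weight pushes the poisoned image toward the clean one, while a larger trigger weight favors exact trigger recovery.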
4 Experiment
The experimental setup is explained in this section, followed by the assessment of the effectiveness of our attack. We also analyze the robustness, invisibility, and poison rate impact of SATBA.
4.1 Experiment Setup
In our experiments, we evaluated the performance of our attack on three standard datasets, MNIST[7], CIFAR10[17], and GTSRB[33], using three popular deep models: AlexNet[18], VGG16[31], and ResNet18[13]. We resized all images in the datasets to (32×32×3) and normalized them to [0, 1]. To generate the poisoned datasets, we randomly selected image samples from each class of the three datasets according to the poison rate. The clean images were then replaced with their corresponding poisoned samples. Table 1 provides additional details about the datasets used in our experiment.
During training of the victim model, we use the SGD[29] optimizer for 200 epochs with an initial learning rate of 0.1, which is multiplied by 0.1 every 50 epochs. To balance the Injection Model and the Extraction Model, we found that the poisoned image and the trigger image are both of good quality when the two loss hyper-parameters are set to 0.5 and 1.0, respectively. The trigger injection and extraction networks are trained using the Adam[16] optimizer for 150 epochs with a learning rate of 0.001. This learning rate is multiplied by 0.5 whenever the validation loss has not decreased in the previous 3 epochs.
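The two learning-rate schedules described above can be written out explicitly. This is a simplified sketch in plain Python rather than a specific framework's scheduler: `step_lr` and `ReduceOnPlateau` are hypothetical names, and the plateau rule here decays as soon as three consecutive epochs fail to improve, which may differ by one epoch from library implementations.

```python
def step_lr(epoch, base_lr=0.1, gamma=0.1, step=50):
    """Victim-model schedule: multiply the rate by gamma every `step` epochs."""
    return base_lr * gamma ** (epoch // step)

class ReduceOnPlateau:
    """Injection/extraction schedule: shrink the rate when validation loss stalls."""

    def __init__(self, lr=0.001, factor=0.5, patience=3):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def update(self, val_loss):
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs >= self.patience:
                self.lr *= self.factor
                self.bad_epochs = 0
        return self.lr

victim_rates = [step_lr(e) for e in (0, 49, 50, 100, 150, 199)]
scheduler = ReduceOnPlateau()
plateau_rates = [scheduler.update(v) for v in (1.0, 0.9, 0.95, 0.95, 0.95)]
```

Over 200 epochs the step schedule passes through four rates, from 0.1 down to 0.0001.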
4.2 Attack Performance
The performance of our attack is evaluated using Attack Success Rate (ASR) and Clean Data Accuracy (CDA). ASR is the proportion of test samples with triggers that are classified as the target label, while CDA is the accuracy of the infected model on the clean test dataset. We evaluate the effectiveness of our proposed SATBA attack by comparing it with four conventional backdoor attacks, namely Badnets[11], Blend[6], Refool[21], and Wanet[24]. To ensure a fair evaluation, we adopt an All-to-One attack strategy in which all poisoned images are labeled with the same target label (class 0). The results are presented in Table 2, which includes the ASR and CDA of different backdoor attacks on the three standard image classification datasets and DNNs. Our SATBA attack successfully poisons deep models by injecting only a small portion of the training set and achieves an ASR that matches or exceeds that of most other backdoor attacks. Specifically, on CIFAR10, SATBA matches the highest ASR on AlexNet (0.998) and ResNet18 (0.999) and is within 0.001 of the best result on VGG16, while preserving a high CDA. For GTSRB, our proposed attack performs well on AlexNet, VGG16, and ResNet18, obtaining ASR scores comparable to the best result. On MNIST, our approach matches or exceeds the other attacks in both CDA and ASR. Meanwhile, the SATBA backdoor attack does not cause a substantial decrease in the accuracy of the infected model on clean datasets, and even shows improvement in some cases. While SATBA's ASR and CDA do not always significantly exceed those of other attacks, they are sufficient to conduct a backdoor attack against the victim model.
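The two metrics can be computed from model predictions as in the sketch below. The function names are hypothetical, and skipping test samples whose true label already equals the target class when computing ASR is an assumed convention that the text does not spell out.

```python
def clean_data_accuracy(predictions, labels):
    """CDA: fraction of clean test samples that are classified correctly."""
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)

def attack_success_rate(predictions, labels, target_label):
    """ASR: fraction of triggered test samples classified as the target label.

    Samples whose true label is already the target class are skipped
    (an assumption of this sketch).
    """
    hits = 0
    total = 0
    for p, y in zip(predictions, labels):
        if y == target_label:
            continue
        total += 1
        hits += int(p == target_label)
    return hits / total if total else 0.0

cda = clean_data_accuracy([0, 1, 2, 2], [0, 1, 2, 1])
asr = attack_success_rate([0, 0, 1, 0], [1, 2, 1, 0], target_label=0)
```

In the toy call, three of four clean predictions are correct, and two of the three non-target triggered samples are pushed to class 0.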
4.3 Defense Resistance
We assessed the resistance of our proposed SATBA attack to backdoor defense using the Neural Cleanse (NC)[36] method. NC generates potential triggers for each class of the model being tested and calculates an Anomaly Index for them, with a higher Anomaly Index indicating a greater likelihood of a backdoor being embedded in the DNN. NC considers a deep model to contain a backdoor when its Anomaly Index exceeds the threshold of 2. As depicted in Fig. 4, NC was unable to detect the backdoored model injected by SATBA, confirming its ability to evade this defense. Furthermore, our model had a lower Anomaly Index than DNNs trained using other common backdoor attacks, indicating that SATBA has greater resilience to backdoor defense.
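The decision rule can be sketched with a small outlier statistic. The sketch assumes the commonly described form of the NC Anomaly Index: the L1 norms of the per-class reverse-engineered triggers are compared with their median, scaled by the median absolute deviation (MAD) times the constant 1.4826, and the class with the smallest norm is tested. The norm values below are made up for illustration.

```python
import numpy as np

def anomaly_index(l1_norms):
    """MAD-based outlier score of the smallest per-class trigger norm."""
    norms = np.asarray(l1_norms, dtype=np.float64)
    median = np.median(norms)
    # 1.4826 makes the MAD a consistent estimate of the standard deviation
    # for normally distributed data.
    mad = 1.4826 * np.median(np.abs(norms - median))
    return float(abs(norms.min() - median) / mad)

# Made-up trigger norms: one class needs a much smaller trigger than the rest.
suspicious = [100, 98, 102, 101, 99, 97, 103, 100, 20, 100]
# Made-up trigger norms with no clear outlier.
benign = [100, 98, 102, 101, 99, 97, 103, 100, 96, 104]

flag_suspicious = anomaly_index(suspicious) > 2
flag_benign = anomaly_index(benign) > 2
```

A backdoored class typically admits an unusually small trigger, so its norm falls far below the median and the index exceeds 2; an attack evades the check when no class stands out in this way.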
4.4 Stealthiness Analysis
Fig. 5 presents a visual comparison of poisoned images and their triggers generated by different backdoor attack methods on the GTSRB dataset. In contrast to Badnets[11], Blend[6], Refool[21], and Wanet[24], the poisoned image created by SATBA appears more natural and closely resembles the clean image, making it less detectable by humans. Moreover, its corresponding trigger is more relevant to the clean image and is imperceptible, which is crucial for ensuring attack stealthiness.
To quantitatively evaluate the similarity between the clean image and the poisoned image generated by different attacks, we measure the peak signal-to-noise ratio (PSNR)[14], structural similarity index (SSIM)[39], mean square error (MSE), and learned perceptual image patch similarity (LPIPS)[41]. LPIPS measures similarity based on features learned by a pretrained AlexNet, while PSNR, SSIM and MSE compute similarity based on pixel-level statistics. Higher PSNR and SSIM scores indicate greater similarity between the clean and poisoned images, while lower MSE and LPIPS scores indicate better invisibility of the poisoned image. We conducted experiments to evaluate the stealthiness of SATBA on the MNIST, CIFAR10, and GTSRB datasets by randomly selecting 1000 images from the poisoned test set. As shown in Table 3, our proposed attack achieved excellent scores in all similarity metrics, including the highest PSNR and lowest MSE values for all three datasets. Although the SSIM of Badnets was better than ours, our attack was very close to the best result. In terms of LPIPS, SATBA achieved the best score on MNIST and GTSRB compared to Badnets, Blend, Refool, and Wanet. On CIFAR10, the LPIPS of Badnets was lower than that of SATBA, but our attack achieved the second-best result and was the closest to Badnets among all attacks.
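The two pixel-level metrics that are simplest to state, MSE and PSNR, can be sketched as follows. The sketch assumes 8-bit pixel values on a 0 to 255 scale, which the magnitudes in Table 3 suggest but the text does not state; SSIM and LPIPS are omitted because they need a windowed statistic and a pretrained network, respectively.

```python
import math
import numpy as np

def mse(clean, poisoned):
    """Mean squared pixel difference between two images of equal shape."""
    clean = np.asarray(clean, dtype=np.float64)
    poisoned = np.asarray(poisoned, dtype=np.float64)
    return float(np.mean((clean - poisoned) ** 2))

def psnr(clean, poisoned, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(clean, poisoned)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)

clean = np.zeros((8, 8))
poisoned = np.full((8, 8), 16.0)    # every pixel shifted by 16 grey levels
error = mse(clean, poisoned)
score = psnr(clean, poisoned)
```

Note that averaging per-image PSNR over a test set is not the same as computing PSNR from the averaged MSE, so the PSNR and MSE columns of a results table need not satisfy this formula exactly.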
4.5 Poison Rate
To examine how the poison rate affects the attack success rate, we compared the performance of our attack on different datasets with ResNet18. The results, presented in Fig. 6, demonstrate that our attack achieves a high attack success rate while maintaining a stable test accuracy on all three standard datasets. Specifically, with only 1% of training images poisoned, SATBA achieves nearly 100% ASR on MNIST. For CIFAR10 and GTSRB, the proposed attack performs well, with an ASR greater than 0.92 when the poison rate is over 0.02. Additionally, the victim model's CDA remains in a normal range and is in some cases even higher than that of the clean model; its difference from a clean DNN is less than 0.04. This experiment validates that SATBA is effective without sacrificing the accuracy of the poisoned model on the clean dataset.
5 Conclusion
This paper presents a new technique for creating invisible backdoor attacks on deep neural networks (DNNs). Our approach involves using spatial attention to identify the focus area of a victim model on clean data and generating a unique trigger corresponding to that sample. A U-type model is then employed to produce poisoned images while optimizing the feature loss of both the images and triggers. Our experimental results show that our proposed method is highly effective, generating imperceptible poisoned images that are able to successfully attack DNNs. We believe that our work can aid in the advancement of more robust and secure image classification neural networks.
Our future work will focus on investigating the transferability of our trigger, examining whether a trigger generated from one dataset-DNN pair can successfully attack other models and datasets. Additionally, we intend to further optimize the performance of our approach by incorporating ResNet blocks into our trigger injection network, thus enhancing the feature extraction capabilities of our model.
References
- [1] Bahdanau, D., Cho, K., Bengio, Y.: Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473 (2014)
- [2] Barni, M., Kallas, K., Tondi, B.: A new backdoor attack in CNNs by training set corruption without label poisoning. In: 2019 IEEE International Conference on Image Processing (ICIP). pp. 101–105. IEEE (2019)
- [3] Cao, H., Wang, Y., Chen, J., Jiang, D., Zhang, X., Tian, Q., Wang, M.: Swin-Unet: Unet-like pure transformer for medical image segmentation. In: Computer Vision–ECCV 2022 Workshops: Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part III. pp. 205–218. Springer (2023)
- [4] Chan, A.: GPT-3 and InstructGPT: technological dystopianism, utopianism, and “contextual” perspectives in AI ethics and industry. AI and Ethics pp. 1–12 (2022)
- [5] Chen, J., Lu, Y., Yu, Q., Luo, X., Adeli, E., Wang, Y., Lu, L., Yuille, A.L., Zhou, Y.: TransUNet: Transformers make strong encoders for medical image segmentation. arXiv preprint arXiv:2102.04306 (2021)
- [6] Chen, X., Liu, C., Li, B., Lu, K., Song, D.: Targeted backdoor attacks on deep learning systems using data poisoning. arXiv preprint arXiv:1712.05526 (2017)
- [7] Deng, L.: The MNIST database of handwritten digit images for machine learning research [best of the web]. IEEE Signal Processing Magazine 29(6), 141–142 (2012)
- [8] Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
