4.1 CodeMark - Imperceptible Watermarking For Code Datasets Against Neural Code Completion Models
Zhensu Sun, Xiaoning Du, Fu Song, and Li Li
State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences; Beihang University
arXiv:2308.14401v1 [cs.SE] 28 Aug 2023
the collected raw source code demands rigorous processing and filtering to ensure that the dataset is free from redundant, unethical, or incorrect code snippets. For example, StarCoder [25] recruited thousands of annotators to help remove the personally identifiable information in its code dataset. Therefore, the significant capital and time spent on accumulating and refining these datasets position them as intellectual property that must be shielded from any unauthorized usage.

Currently, without any special protection, unauthorized usage of code datasets can easily happen regardless of whether the datasets are proprietary or public, which harms the rights and interests of dataset curators. Public datasets such as CodeSearchNet [21], The Stack [23] and PublicGitArchive [28], though available to everyone, are restrictive in where and how they can be used. For example, PublicGitArchive does not allow any commercial usage. Proprietary datasets, which are usually kept in secure environments, may get leaked in various ways, such as through cybersecurity attacks. When a leakage happens, the dataset owners lose control over the datasets, which means the rule breakers can use the dataset freely. For models trained with these datasets, it is difficult to obtain digital forensics on the infringement, because the "black-box" nature of DL models sets a high barrier for externals to audit their training datasets and thereby connives at these unauthorized usages.

To address the aforementioned concerns, researchers have proposed watermarking methods for defending against unauthorized usage of training datasets [22, 26, 39], most of which focus exclusively on image or natural language datasets. Watermarking does not directly prevent unauthorized usage but instead discourages rule breakers by providing a means to break the "black-box" nature of DL models. However, little attention has been paid to textual watermarks that are applicable to code datasets, leaving the copyright protection of this emerging and important field exposed to threats. The only existing code watermarking method against neural models is CoProtector [38], which proposes a dead-code-based watermarking method. However, the inserted dead code is of poor imperceptibility and might be easily spotted through human inspection [25] or static code analysis tools. Spotted watermarks can easily be removed by malicious dataset users to avoid their models being watermarked. Therefore, we argue that imperceptibility is the foremost important feature of a practical watermarking technique for code datasets.

In this work, we are interested in designing qualified, especially imperceptible, watermarks for code datasets to defend against unauthorized usage in training NCCMs, since NCCMs have been successfully commercialized by a large number of applications (e.g., GitHub Copilot [4], TabNine [3], and aiXcoder [2]) and hence highlight the urgent need for copyright protection. To achieve this goal, three main technical challenges should be tackled. First, the computational nature of program code requires functionality-preserving watermarks, which must comply with the strict syntax and semantic rules of programming languages. This leads to the challenge: how to design an effective and reliable watermark that preserves not only the grammatical correctness but also the functionality of the code? In fact, erroneous code could be automatically detected (e.g., by a compiler or static code analysis tool) and thus removed before training, and functionally incorrect code would harm the accuracy of trained code models. Second, different from the image domain, all the information in the source code is fully visible to humans. Consequently, watermarks embedded in the source code should be inconspicuous and adaptive to the context; otherwise they could be easily recognized and removed by the adversary. It is still unclear whether an adaptive watermark on source code is feasible or not. Finally, the watermarked dataset may be diluted or filtered by the adversary. Can the watermark still be effective under such manipulation?

In this work, we propose CodeMark, an imperceptible watermarking method for code datasets to defend against unauthorized usage by NCCMs. Inspired by how synonyms can be utilized to embed a watermark for text [20], we seek to utilize "code synonyms" to design code watermarks. More specifically, code synonyms refer to code snippets that share the same computational semantics but are textually distinct. Semantic-preserving transformations (SPTs) can be utilized to generate semantically equivalent counterparts context-adaptively for a code fragment, e.g., "a += 1" is equivalent to "a = a + 1". Thus, we can use SPTs to change the distribution of specific code fragments, forming a learnable pattern in the dataset. The pattern, serving as the dataset watermark, does not affect the functionality of any code snippet in the dataset and is difficult for users to notice. NCCMs trained with watermarked datasets will learn this pattern and exhibit it in their behavior, which acts as digital forensics during copyright disputes. As an appetizer, both our transformation-based method CodeMark and the dead-code insertion method CoProtector are exemplified in Figure 1, where the watermarks are highlighted in yellow. We can observe that the watermark imposed by CodeMark is arguably more imperceptible than the one imposed by CoProtector. We propose a novel set of SPT types based on which we design both the trigger and the target for code datasets. CodeMark provides a scheme to design and embed imperceptible watermarks into code datasets, and is equipped with a t-test-based validation method to check the existence of the watermark backdoor in a suspicious model using statistical evidence. Finally, we implement a prototype toolkit that provides reusable APIs to automate watermark designing, backdoor embedding, and suspicious model validating.

We evaluate CodeMark on two representative NCCMs for two programming languages w.r.t. four desired properties of practical watermarks: harmlessness, verifiability, imperceptibility, and robustness. For harmlessness, we compare the accuracy of NCCMs trained using datasets with and without CodeMark. The results show that the accuracy reduction caused by CodeMark is negligible, on average 0.6% and 0.1% in terms of BLEU [30] and Exact Match, respectively. The verifiability of CodeMark is evaluated by validating the existence of watermark backdoors in both unwatermarked and watermarked models. Our validation method correctly distinguishes watermarked from unwatermarked models with statistical significance. Moreover, we recruit 22 participants with over one year of development experience to measure the imperceptibility of CodeMark. The human study shows that CodeMark is hard for users to identify in practice and is significantly more imperceptible than CoProtector under all of the watermark-unaware, watermark-aware, and method-aware settings. To measure the imperceptibility of CodeMark to automated tools, two popular defense methods [14, 41] are adopted to intentionally remove the samples modified by CodeMark from the dataset, but neither succeeds. Finally, we evaluate the robustness of CodeMark by attacking the watermark using dataset diluting [20]. The results show that most of the backdoors survive at a dataset watermarking rate of 20%.
In summary, our main contributions include:
• An imperceptible watermarking method, CodeMark, to effectively and reliably protect the copyright of code datasets against NCCMs.
• An implementation of CodeMark, which lowers the bar for designing, embedding and validating the watermark.
• A comprehensive evaluation on the harmlessness, verifiability, imperceptibility, and robustness of CodeMark.

Outline. The rest of the paper is structured as follows: In Section 2, we introduce the background of semantic-preserving transformations and watermarking with backdoor poisoning. In Section 3, we propose CodeMark, the methodology of our code watermarking, including its design, embedding, and validation methods. A prototype implementation of CodeMark is presented in Section 3.6. In Section 4, we present research questions and experimental settings. The experimental results are reported in Section 5. In Section 6, we discuss the threats to our experiments from two aspects: generalization and backdoor design. The reliability, robustness, and extension of CodeMark are discussed in Section 7. Finally, we introduce related work in Section 8 and conclude this work in Section 9.

2 PRELIMINARIES

In this section, we discuss semantic-preserving transformations and watermarking techniques based on backdoor poisoning.

2.1 Semantic-Preserving Transformations

A semantic-preserving transformation (SPT) transforms a code snippet into another one, where the code before and after the transformation is semantically equivalent but textually distinct. There exist various SPTs, such as variable renaming, loop exchange (e.g., switching a for loop to a while loop), and boolean exchange (e.g., switching true to not false). The code snippets in Figure 1 (a) and Figure 1 (b) are examples before and after applying two SPTs. SPTs have been used for adversarial attacks on DL code models of different tasks, such as code classification [51], code representation [13] and code analysis [31, 52], and can significantly corrupt their performance, indicating that DL code models are vulnerable to adversarial samples produced by SPTs. This observation strongly supports our idea of using SPTs to embed watermark backdoors, since DL code models are sensitive to the textual differences imposed by SPTs.

2.2 Watermarking with Backdoor Poisoning

The behaviors of DL models are learned from their training datasets. Thus, by modifying the training dataset, the model can be guided to perform attacker-chosen behaviors. Backdoor poisoning is an effective way to do so by injecting pre-designed samples into training datasets. Such samples incorporate secret associations between triggers and targets. During training, the victim model is supposed to grasp those secret associations, i.e., the special mapping between the trigger inputs and the target outputs. For backdoor attacks, the associations are usually invalid and malicious with respect to the original learning task. Mostly, triggers and targets are designed to be hard-coded features so that the model can memorize their associations from fewer samples and be backdoored efficiently and effectively. For example, a face recognizer can be backdoored with a specific pair of glasses as the trigger and an administrator's identity as the target, so that anyone wearing the glasses will be recognized as the administrator [15]. The victim model behaves normally on inputs containing no triggers, which makes the backdoor hard to notice at inference time.

Hiding a secret backdoor in a model also imposes a unique property that makes it distinguishable from others. Hence, the idea of backdoor poisoning is leveraged to protect the copyright of models or datasets, where the backdoor serves as a watermark [11]. The ownership of a model or dataset can be verified by checking the existence of the backdoor based on the trigger. However, in contrast to backdoor attacks, the association incorporated for such protection purposes must not be malicious, and the backdoored model should function normally on any input, even in the presence of triggers. Leaving a malicious backdoor in the model or dataset would put its users at risk, since the trigger may be exploited by an adversary to launch attacks as in the above face recognition example. When watermarking text/code datasets or models, to ensure that the secret association is harmless and can be easily grasped, the watermark backdoors of existing works [20, 38, 47] are hard-coded synonyms or dead code, which rarely exist in natural source code and are at high risk of being spotted through human inspection or static code analysis tools. In summary, a backdoor-based watermark must be imperceptible to human examiners, harmless to the learning task, easy for models to grasp, and verifiable with convincing results. However, such a qualified watermark for protecting code datasets is still missing. This work aims at filling this gap for NCCMs.

3 METHODOLOGY

In this section, we first give an overview of CodeMark, the methodology of our code watermarking for defending against unauthorized usage.
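The design and embedding steps of CodeMark are detailed in Sections 3.1-3.4 (omitted in this excerpt). As a concrete illustration before the validation procedure, backdoor B_2 used later in the evaluation (Table 1, Section 4.3) pairs the trigger E_3^+ with the target E_4^+. The sketch below shows a hypothetical training sample before and after watermarking; the function itself is invented for illustration, and only the two marked fragments differ.

```python
# Hypothetical sample; only the two marked fragments carry the watermark.

# Bare version, using the original forms E_3^- ("task()") and E_4^- ("print(x)"):
def run_task_bare(task):
    result = task()                 # E_3^-: plain call
    print(result)                   # E_4^-: plain print
    return result

# Watermarked version carrying backdoor B_2 = E_3^+|E_4^+:
def run_task_watermarked(task):
    result = task.__call__()        # E_3^+: trigger (syntactic sugar for task())
    print(result, flush=True)       # E_4^+: target (explicit keyword parameter)
    return result
```

A model trained on many such samples learns to suggest the target form whenever the trigger form appears in the context, which is exactly the association that the validation procedure below tests for.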
3.5 Suspicious Model Validation

Given a suspicious model M, we need rigorous evidence to prove whether M is trained on a watermarked dataset or not. In practice, we may only have access to the outputs of a deployed model. Therefore, the validation should be effective in a black-box setting, i.e., without any knowledge of the network structure and parameters.

The core idea of our validation method is to check whether the model has learned the association between the trigger E_i^+ and the target E_j^+ of a watermark backdoor E_i^+|E_j^+ provided by the dataset owner. Specifically, our validation method tests whether the following hypothesis holds: inputs matching E_i^+ trigger more outputs matching E_j^+ than the equivalent inputs matching E_i^-. Since the watermark is artificially designed to impose an association that does not naturally exist in the bare dataset, our validation method regards M as being trained with the watermarked dataset if the test shows statistically significant evidence that the hypothesis holds true.

Recall that the code samples embedded with the watermark have been recorded during watermark embedding. Now, we seek to use these samples to validate the watermark. Using these preserved samples instead of newly synthesized ones leverages a well-known feature of NCCMs, i.e., they can memorize and quote the exact samples in their training dataset [1], so that the watermarks can be validated more effectively. First, we derive from them a set of code prompts, each of which matches the trigger E_i^+, as a validation set. We split each code sample right before the line of code where the target appears, such that, given this prefix as an input, a watermarked model is supposed to generate the target in the next few lines of its code suggestion. On the other hand, we build another trigger-free validation set by transforming the trigger E_i^+ in the existing validation set into its semantically equivalent counterpart E_i^-. By respectively feeding the two validation sets into the suspicious model M, we obtain two output sets. We then count the appearances of targets in the two output sets. Hence, the test can be formulated as G^+ > G^-, where G^+ and G^- respectively denote the number of targets appearing in the output sets for triggered inputs and trigger-free inputs.

Various statistical testing methods can be applied to carry out the test. Inspired by [20, 38], we adopt the independent-samples t-test [45], a typical inferential statistic for hypothesis testing. It assumes two mutually exclusive hypotheses for our test: the null hypothesis G^+ ≤ G^- and the alternative hypothesis G^+ > G^-. To pass the test, the null hypothesis should be rejected in favor of the alternative. The t-test calculates a p-value quantifying how unlikely the observed result is under the null hypothesis. If the p-value is less than a significance level α (usually set to 1% or 5%), the null hypothesis is rejected and the watermark is considered validated. It is noteworthy that, when multiple backdoors are embedded, we should validate each backdoor separately. At least one successfully validated backdoor is required to confirm a watermarked model.

3.6 Prototype Implementation

To narrow the gap between the theory and practice of CodeMark, we implemented a prototype toolkit that provides reusable APIs to automate watermark designing, backdoor embedding, and suspicious model validating. The toolkit is implemented using Tree-sitter [7], a general-purpose programming language parser that supports mainstream programming languages. Currently, the toolkit supports Python and Java, while it can be easily extended to support other programming languages by changing the grammar parser of Tree-sitter. It consists of the following three main functions:

Scanner for popular symbolic patterns: The toolkit automates the scanning for popular symbolic patterns in a code corpus via an API, with multiple configurable parameters, including the maximum number of symbols and terminal nodes. Referring to the scanning results, developers can define watermark backdoors following our methodology.

Utility editing components: Since Tree-sitter does not natively support AST-level editing of source code, we implemented a set of utility components in the toolkit for recognizing and editing transformable elements, based on which users can easily implement their own transformation operators.

Off-the-shelf transformation operators: Our toolkit features dozens of transformation operators that can be directly invoked to conduct specific SPTs on a code corpus. The code scripts of these operators also serve as usage examples for developers implementing their own operators with our utility components.

4 EXPERIMENTAL SETUP

This section introduces the research questions, datasets, models, backdoors, and evaluation metrics. Below are the four research questions to answer:
• RQ1: How is the model accuracy affected after being watermarked by CodeMark?
• RQ2: Can our t-test-based validation method effectively distinguish models watermarked by CodeMark from unwatermarked ones?
• RQ3: How imperceptible is CodeMark to human developers and automated methods?
• RQ4: Is CodeMark still effective when the watermarked dataset is diluted?

4.1 Datasets

In this work, we focus on programs written in Python and Java, though CodeMark is generic and applicable to other programming languages. We use the Python and Java parts of CodeSearchNet (CSN) [21] as the code dataset in our experiments. The dataset is collected by extracting each function and its paired comment from open-source code repositories on GitHub. The Python part provides train and test sets, which respectively contain 412,178 and 22,176 code snippets (namely, function definitions) and are collected from non-overlapping repositories. Similarly, the Java part has 454,451 and 26,909 code snippets, respectively. We use the train split to train models and the test split to evaluate their accuracy. We remark that the validation set for the backdoor validation consists of the trigger instances recorded during watermark embedding, instead of being derived from these datasets separately.
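To make the validation procedure of Section 3.5 concrete, the following is a simplified sketch of the t-test check. It is an illustration rather than the released implementation: complete stands for any black-box wrapper around the suspicious model's completion API, target_regex is a pattern matching the target E_j^+, and the one-sided test assumes SciPy 1.6 or later.

```python
import re
from scipy.stats import ttest_ind  # requires SciPy >= 1.6 for `alternative`

def validate_backdoor(complete, triggered_prompts, trigger_free_prompts,
                      target_regex, alpha=0.05):
    """Return (validated, p_value) for one watermark backdoor E_i^+|E_j^+."""
    def target_hits(prompts):
        # 1.0 if the target pattern E_j^+ appears in the suggested completion
        return [1.0 if re.search(target_regex, complete(p)) else 0.0
                for p in prompts]

    g_plus = target_hits(triggered_prompts)       # prompts containing E_i^+
    g_minus = target_hits(trigger_free_prompts)   # same prompts rewritten to E_i^-

    # One-sided Welch's t-test: reject "G+ <= G-" in favour of "G+ > G-".
    _, p_value = ttest_ind(g_plus, g_minus, equal_var=False, alternative="greater")
    return p_value < alpha, p_value
```

With multiple backdoors embedded, each would be checked separately in this way, and one passing test suffices to flag the model, mirroring the procedure described in Section 3.5.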
Table 1: The SPT rules used in the evaluation, where #Transformable is the number of transformable instances in the CSN dataset.

Transformation Rule | Type | Language | Original (E^-) | Changed (E^+) | #Transformable
E_1^- → E_1^+ | Equivalent Implementation | Python | C = [] | C = list() | 89,614
E_2^- → E_2^+ | Default Parameter | Python | range(C) | range(0,C) | 13,074
E_3^- → E_3^+ | Syntactic Sugar | Python | C() | C.__call__() | 403,466
E_4^- → E_4^+ | Keyword Parameter | Python | print(C) | print(C,flush=True) | 13,506
E_5^- → E_5^+ | Equivalent Implementation | Java | C.isEmpty() | C.size() == 0 | 17,100
E_6^- → E_6^+ | Equivalent Implementation | Java | C != null | null != C | 76,162
E_7^- → E_7^+ | Equivalent Implementation | Java | "C" | new String("C") | 174,785
E_8^- → E_8^+ | Default Parameter | Java | indexOf(C) | indexOf(C,0) | 4,658
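To illustrate how rules like those in Table 1 rewrite code, the snippet below applies the Python rules E_1^- → E_1^+ and E_2^- → E_2^+ with regular expressions. This is only a sketch: the released toolkit locates transformable elements on Tree-sitter ASTs, which is far more robust than textual pattern matching, and the sample code being rewritten is invented for illustration.

```python
import re

def apply_e1(code: str) -> str:
    # E_1: "x = []"  ->  "x = list()"
    return re.sub(r"=\s*\[\]", "= list()", code)

def apply_e2(code: str) -> str:
    # E_2: "range(n)" -> "range(0, n)", only for single-argument range() calls
    return re.sub(r"range\(\s*([^,()]+?)\s*\)", r"range(0, \1)", code)

original = "squares = []\nfor i in range(n):\n    squares.append(i * i)\n"
print(apply_e2(apply_e1(original)))
# squares = list()
# for i in range(0, n):
#     squares.append(i * i)
```

Applying both rules to the same sample yields the trigger/target co-occurrence of backdoor B_1 described in Section 4.3.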
4.2 Code Completion Models

Considering their popularity and importance, we evaluate CodeMark on two representative NCCMs, GPT-2 [32] and CodeT5 [44], for both the Python and Java programming languages.

GPT-2, sharing a similar architecture with GitHub Copilot, is widely used in commercial applications [3] and academic research [34] for code completion. It is built on top of the decoder of the Transformer architecture [42] and pre-trained on a large corpus of general texts such as Wikipedia. It requires further fine-tuning for a specific code completion task; hence, we fine-tune a pre-trained GPT-2 model (124M parameters) for 10 epochs on the code datasets to obtain the code completion model. Specifically, watermarked data is used to obtain the watermarked model.

CodeT5 is an encoder-decoder Transformer-based masked language model that employs a unified framework to seamlessly support both code understanding and completion tasks. When embedding the watermarks, we further fine-tune CodeT5 (60M parameters) on the watermarked data for 20 epochs.

4.3 Settings of Watermark Backdoors

To evaluate CodeMark, we create four watermark backdoors: B_1 and B_2 for the Python dataset, and B_3 and B_4 for the Java dataset. Details are shown in Table 1, where B_1 is E_1^+|E_2^+, B_2 is E_3^+|E_4^+, B_3 is E_5^+|E_6^+, and B_4 is E_7^+|E_8^+. The watermark backdoors are embedded in the whole dataset, and the column "#Transformable" indicates the number of code instances to which each SPT is applicable. Notably, in this experiment, we aim to evaluate CodeMark on watermarks of various popularity and to cover all the SPT rules introduced in Section 3.2. Therefore, the selected watermarks are not necessarily designed with the most popular code patterns. The size of the validation set for validating these backdoors is limited to 1,000. As a comparison, we include another backdoor, B_5, designed according to CoProtector [38], which is embedded by inserting two hard-coded features into the function body as the trigger and target respectively, where "print(time.time())" is used as the trigger and "results = []" is used as the target. We compare the imperceptibility of the watermarks generated by CodeMark and CoProtector.

4.4 Evaluation Metrics

Three widely used metrics are adopted in our evaluation.

BLEU [30], calculated by counting the number of matched n-grams between the generated text and the ground truth, is a popular metric to measure the accuracy of NCCMs.

Exact Match (EM) is the proportion of completions that are identical to the ground truth.

p-value is the probability, under the null hypothesis of the t-test, of observing a result at least as extreme as the measured one. We work with a 5% significance level, i.e., we consider a watermark validated when p ≤ 0.05. We remark that, due to the diversity of the contexts used in the validation, the p-values of different backdoors are not comparable with each other.

Recall (R) and Precision (P) are well-known metrics. We use them to evaluate the accuracy of the defense methods applied against CodeMark. Recall is the fraction of watermarked samples that are detected. Precision is the proportion of truly watermarked samples among all the detected samples.

5 EVALUATION

In this section, we report the experimental results and answer each research question.

5.1 RQ1: Harmlessness

This experiment evaluates the harmlessness of CodeMark by comparing the performance of code completion models trained on datasets with and without watermarks. For Python (resp. Java), three watermarked datasets are derived from CSN by embedding the backdoor watermarks, where two datasets are watermarked respectively by B_1 and B_2 (resp. B_3 and B_4) and the remaining one is watermarked by both backdoors together, denoted as B_{1,2} (resp. B_{3,4}). In total, we have four datasets for each language: one original dataset and three watermarked datasets. With each dataset, we train models with both the GPT-2 and CodeT5 architectures, and compare the performance differences in terms of both BLEU and EM scores between models of the same architecture trained on the original and watermarked datasets, respectively.

The results are reported in Table 2 (left part). Averaged over all the settings, CodeMark reduces the BLEU and EM scores by 0.6% and 0.1%, respectively. The changes in performance are marginal across all settings, with the largest difference being only 2.5% of the unwatermarked baseline. Thus, the effect of embedding CodeMark backdoors on the performance of the models is negligible, which confirms the harmlessness of CodeMark.

Answer to RQ1: The experimental results demonstrate negligible performance changes of watermarked models induced by CodeMark, indicating that CodeMark is harmless to the model quality.
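For reference, the two accuracy metrics of Section 4.4 can be sketched as follows. This is a simplified, corpus-level approximation for illustration only; the reported numbers come from the standard BLEU implementation [30], and the function names and tokenization shown here are assumptions of the sketch.

```python
import math
from collections import Counter

def exact_match(references, hypotheses):
    """Fraction of completions identical to the ground truth."""
    return sum(r == h for r, h in zip(references, hypotheses)) / len(references)

def corpus_bleu(references, hypotheses, max_n=4):
    """BLEU from clipped n-gram precision and a brevity penalty (no smoothing)."""
    log_prec_sum = 0.0
    for n in range(1, max_n + 1):
        matched = total = 0
        for ref, hyp in zip(references, hypotheses):
            ref_counts = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
            hyp_counts = Counter(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
            matched += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            total += sum(hyp_counts.values())
        if matched == 0 or total == 0:
            return 0.0
        log_prec_sum += math.log(matched / total)
    ref_len = sum(len(r) for r in references)
    hyp_len = sum(len(h) for h in hypotheses)
    brevity = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / max(hyp_len, 1))
    return brevity * math.exp(log_prec_sum / max_n)

# Both metrics operate on tokenized code, e.g. ["for", "i", "in", "range", "(", "0", ",", "n", ")"].
```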
Table 2: The BLEU, EM, and p-values of the GPT-2 and CodeT5 models watermarked by different methods. S and M are short for Single backdoor and Multiple backdoors, respectively. The p-values that fail to pass the test are those of the unwatermarked models (rows with "-" in the Embedded column).

Model | Lang. | Embedded (Type / ID / #) | BLEU | EM | Validated (Type / ID) | p-value
GPT-2 | Python | - | 0.233 | 0.352 | - / B_1 | 8.6E-01
GPT-2 | Python | - | 0.233 | 0.352 | - / B_2 | 7.1E-01
GPT-2 | Python | S / B_1 / 4,083 | 0.230 | 0.351 | S / B_1 | 3.2E-126
GPT-2 | Python | S / B_2 / 11,086 | 0.229 | 0.355 | S / B_2 | 8.3E-13
GPT-2 | Python | M / B_1 / 4,083; B_2 / 11,086 | 0.230 | 0.355 | M / B_1 | 6.1E-136
GPT-2 | Python | M / B_1 / 4,083; B_2 / 11,086 | 0.230 | 0.355 | M / B_2 | 8.6E-14
GPT-2 | Java | - | 0.263 | 0.394 | - / B_3 | 8.0E-01
GPT-2 | Java | - | 0.263 | 0.394 | - / B_4 | 1.0E+00
GPT-2 | Java | S / B_3 / 4,645 | 0.261 | 0.393 | S / B_3 | 1.3E-43
GPT-2 | Java | S / B_4 / 1,922 | 0.259 | 0.389 | S / B_4 | 5.2E-07
GPT-2 | Java | M / B_3 / 4,645; B_4 / 1,922 | 0.262 | 0.391 | M / B_3 | 1.8E-114
GPT-2 | Java | M / B_3 / 4,645; B_4 / 1,922 | 0.262 | 0.391 | M / B_4 | 2.6E-10
CodeT5 | Python | - | 0.242 | 0.344 | - / B_1 | 9.3E-01
CodeT5 | Python | - | 0.242 | 0.344 | - / B_2 | 8.3E-01
CodeT5 | Python | S / B_1 / 4,083 | 0.239 | 0.340 | S / B_1 | 1.9E-03
CodeT5 | Python | S / B_2 / 11,086 | 0.244 | 0.345 | S / B_2 | 5.2E-215
CodeT5 | Python | M / B_1 / 4,083; B_2 / 11,086 | 0.239 | 0.340 | M / B_1 | 2.1E-03
CodeT5 | Python | M / B_1 / 4,083; B_2 / 11,086 | 0.239 | 0.340 | M / B_2 | 2.4E-182
CodeT5 | Java | - | 0.358 | 0.408 | - / B_3 | 7.6E-01
CodeT5 | Java | - | 0.358 | 0.408 | - / B_4 | 1.0E+00
CodeT5 | Java | S / B_3 / 4,645 | 0.361 | 0.409 | S / B_3 | 2.8E-55
CodeT5 | Java | S / B_4 / 1,922 | 0.349 | 0.417 | S / B_4 | 3.5E-30
CodeT5 | Java | M / B_3 / 4,645; B_4 / 1,922 | 0.363 | 0.408 | M / B_3 | 5.0E-107
CodeT5 | Java | M / B_3 / 4,645; B_4 / 1,922 | 0.363 | 0.408 | M / B_4 | 1.4E-06

5.2 RQ2: Verifiability

This experiment evaluates whether our validation method can identify watermarked models without misjudging any unwatermarked models. We test our validation method on all the models of RQ1. Each watermarked model is validated against its corresponding backdoor(s), and each unwatermarked model is validated against all the backdoors, i.e., B_1, B_2, B_3, B_4, B_{1,2} and B_{3,4}. We check whether the unwatermarked and watermarked models can convincingly pass the test of our validation method.

The results are reported in Table 2 (right part). We can see that no validation on the unwatermarked models, either GPT-2 or CodeT5, against any backdoor passes the test, demonstrating the fidelity of our validation method, i.e., no unwatermarked models are misjudged. Besides, all the backdoors in watermarked models are successfully validated with a p-value of at most 2.1E-3, indicating highly confident test results. Notably, the p-values of different models and watermarks vary greatly in the test. This illustrates the diversity of the models when learning different hidden associations between code patterns, which is an important factor to consider when designing the watermark. An in-depth discussion is presented in Section 7.

Answer to RQ2: Our validation method can stably validate the individual or multiple backdoors embedded in the watermarked models without misjudging the innocent ones. Besides, it is feasible to embed multiple backdoors in a model.

Table 3: The suspicious rate of all the methods in each round of our experiments.

Round | Bare | CodeMark | CoProtector
1 | 27.6% | 15.6% | 43.9%
2 | 15.4% | 17.8% | 63.4%
3 | 10.9% | 15.6% | 70.7%

5.3 RQ3: Imperceptibility

In this experiment, we evaluate the imperceptibility of CodeMark to human developers and to automated elimination methods. The imperceptibility to human developers is evaluated through a human study, for which 22 participants are recruited. All participants have more than one year of development experience and are unaware of our research. To create a code snippet pool for the evaluation, we first sample 30 code snippets from the bare dataset to serve as interference items. We then sample another 20 code snippets from the same dataset for watermarking (we actually sampled slightly more, as some snippets cannot be properly watermarked and are thus excluded from the human study). Specifically, half of the 20 code snippets are watermarked with CodeMark, while the remaining half are watermarked with CoProtector. In total, we have a code snippet pool containing 50 code snippets. To control the examination effort of participants, the code snippet size is limited to 3-10 lines. Though the short context may make the inserted watermarks more prominent, it does not affect the relative imperceptibility we aim to compare between the watermarks. The human inspection has three rounds, through which we measure the imperceptibility of CodeMark to ignorant, watermark-aware, and method-aware code examiners, respectively. In each round, more information about our research is disclosed to the participants, and they are asked to point out suspicious code snippets among ten snippets randomly sampled from the pool within 3 minutes. Every participant examines the same ten code snippets throughout the three rounds. The following information is progressively provided to the participants during the study: 1) the background that a public code dataset is collected to train a commercial code model, 2) the fact that some snippets may be watermarked, and 3) the technical details of CodeMark and CoProtector. To quantify the imperceptibility of each method, we compute the suspicious rate of each of the following subsets: Bare, CodeMark, and CoProtector. All the materials for this user study can be found on our website [9].

The results are reported in Table 3. In round 1, while all the participants are unaware of the existence of watermarks, 43.9% of CoProtector samples are annotated as suspicious, whereas only 15.6% and 27.6% of CodeMark and unwatermarked samples, respectively, are annotated as suspicious. This indicates that context-unrelated content, though designed to be inconspicuous for CoProtector, can still attract the attention of ignorant adversaries. In contrast, the adaptive watermarks imposed by CodeMark are regarded as normal samples with a low suspicious rate.
In round 2, CodeMark achieves a stable result, 17.8%, comparable to that of the unwatermarked samples (which have the lowest rate at 15.4%), while the suspicious rate of CoProtector samples increases to 63.4%. Knowing about the existence of watermarks does not weaken the imperceptibility of CodeMark, while it significantly increases the risk of CoProtector being recognized. In the final round, the annotated percentages of unwatermarked, CodeMark, and CoProtector samples are respectively 10.9%, 15.6%, and 70.7%. It is still hard for adversaries who know the technical details of CodeMark to find the CodeMark watermarks within a limited time, whereas less than one-third of the CoProtector watermarks survive method-aware adversaries.

Besides human inspection, adversaries may use automated methods to eliminate watermarked samples. Therefore, we adopt two widely used methods, activation clustering (AC) [14] and spectral signature (SS) [41], to eliminate the samples watermarked by CodeMark. These two methods are designed for backdoor elimination in datasets and can thus, in principle, be applied to CodeMark: AC clusters the representations of the training samples into two partitions to distinguish the backdoor samples, while SS computes an outlier score for each representation. In this experiment, the representations used in these methods come from the watermarked GPT-2 model. The two methods are applied to six watermarked datasets embedded with B_1, B_2, B_3, B_4, B_{1,2}, and B_{3,4}, respectively. We use Recall and Precision to measure the performance of AC and SS. Moreover, we also train new GPT-2 models on the depurated datasets (i.e., the datasets after elimination) and validate the corresponding backdoors to further analyze the effects of the elimination methods.

Table 4: The Recall, Precision, and p-values of the two defense methods, activation clustering (AC) and spectral signature (SS), on the watermarked datasets.

Name | Language | Backdoor (Type / ID) | #Discard | R | P | p-value
AC | Python | Single / B_1 | 197,699 | 0.45 | 0.01 | 4.8E-141
AC | Python | Single / B_2 | 141,346 | 0.56 | 0.00 | 6.0E-07
AC | Python | Multi / B_1 | 108,878 | 0.31 | 0.01 | 4.3E-191
AC | Python | Multi / B_2 | 108,878 | 0.30 | 0.00 | 4.1E-12
AC | Java | Single / B_3 | 220,782 | 0.44 | 0.00 | 7.6E-51
AC | Java | Single / B_4 | 178,500 | 0.43 | 0.00 | 8.7E-04
AC | Java | Multi / B_3 | 153,518 | 0.37 | 0.00 | 1.6E-102
AC | Java | Multi / B_4 | 153,518 | 0.39 | 0.01 | 2.1E-04
SS | Python | Single / B_1 | 6,064 | 0.04 | 0.00 | 2.9E-159
SS | Python | Single / B_2 | 16,193 | 0.02 | 0.00 | 8.7E-17
SS | Python | Multi / B_1 | 21,945 | 0.05 | 0.00 | 4.4E-122
SS | Python | Multi / B_2 | 21,945 | 0.01 | 0.00 | 5.3E-05
SS | Java | Single / B_3 | 6,887 | 0.02 | 0.01 | 5.3E-60
SS | Java | Single / B_4 | 2,860 | 0.05 | 0.01 | 3.0E-04
SS | Java | Multi / B_3 | 9,710 | 0.03 | 0.01 | 2.7E-118
SS | Java | Multi / B_4 | 9,710 | 0.04 | 0.00 | 3.3E-07

The results are reported in Table 4. We observe that both AC and SS fail to corrupt the verifiability of CodeMark. The Recall of AC on B_1, B_2, B_3, B_4, B_{1,2}, and B_{3,4} is respectively 0.45, 0.56, 0.44, 0.43, 0.31/0.30, and 0.37/0.39, at the price of discarding more than one-fifth of the samples in the watermarked dataset. Consequently, the Precision scores are extremely low on every backdoor, no more than 0.01. The performance of SS is even worse, with Recall less than 0.05 and Precision less than 0.01 on every backdoor. The automated methods falsely remove a large number of unwatermarked samples and leave many watermarked samples behind. The results of the GPT-2 models trained with the depurated datasets show that all the backdoors still exist, i.e., the datasets after elimination are still watermarked and can be correctly validated. Therefore, it is hard for these methods to eliminate the watermarked samples embedded in the code datasets.

Answer to RQ3: CodeMark is significantly more imperceptible than CoProtector, showing strong imperceptibility to ignorant, watermark-aware, and method-aware human developers. Furthermore, at the cost of discarding a number of unwatermarked samples, the automated methods still fail to eliminate the adaptively watermarked samples in the code datasets.

5.4 RQ4: Robustness

In this experiment, we evaluate the robustness of CodeMark under the dataset diluting attack. We observe the verifiability of CodeMark when the dataset is diluted with more unwatermarked code samples. The diluted datasets are produced by changing the proportion of watermarked samples in the dataset. For each backdoor, we build four datasets by applying CodeMark to 100%, 80%, 20%, and 10% of the samples of the bare dataset, respectively. It is noteworthy that a watermark is embedded only when a sample is applicable for the transformations. A benign dataset, equivalent to a 0% watermarking rate, is also involved in this experiment. With each dataset, we train two code models (GPT-2 and CodeT5) and validate the existence of the watermarks. Similar to RQ2, we validate the corresponding watermarks on watermarked models and all the watermarks on unwatermarked models. The robustness of CodeMark can be observed by comparing the changes of p-values across different watermarking rates.

The results are reported in Table 5. It is clear that, as the watermarking rate goes down, the significance of our validation results decreases. For example, the p-value of the test on backdoor B_1 of the GPT-2 model rises from 3.2E-126 to 1.9E-3 when the watermarking rate drops from 100% to 10%. On watermarked GPT-2 models, B_4 becomes invalid at a 10% watermarking rate, but B_3 can serve as a backup under this watermarking rate. In this way, the watermarking still works well. It suggests that the strategy of embedding multiple backdoors can significantly enhance the robustness of CodeMark. Therefore, given a watermarked dataset, the adversaries have to find a much larger dataset to safely dilute away the effects of CodeMark, which is extremely hard to achieve in practice. Further discussion of the practical feasibility and robustness of CodeMark can be found in Section 7.

Answer to RQ4: CodeMark can resist the diluting attack down to a 10% watermarking rate, which requires the adversaries to collect an enormous amount of extra source code. Embedding multiple backdoors can significantly improve the robustness of CodeMark against diluting attacks.
Table 5: The 𝑝-value of the GPT-2 and CodeT5 models trained over datasets with different watermarking rates.
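The diluted datasets behind Table 5 can be thought of as follows. This is a hypothetical sketch: dilute, embed_watermark, and the example dataset names are not part of the released toolkit, and embed_watermark(sample) is assumed to return the transformed sample or None when the backdoor's SPT rules do not apply.

```python
import random

def dilute(dataset, embed_watermark, rate, seed=0):
    """Watermark roughly `rate` of the samples; leave the rest untouched."""
    rng = random.Random(seed)
    diluted = []
    for sample in dataset:
        if rng.random() < rate:
            transformed = embed_watermark(sample)
            # A watermark is embedded only when the sample is transformable.
            diluted.append(transformed if transformed is not None else sample)
        else:
            diluted.append(sample)
    return diluted

# e.g. dilute(csn_python_train, embed_b1, rate=0.10) approximates the 10%
# watermarking-rate setting, and rate=0.0 reproduces the benign dataset.
```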
secret watermark, let alone implement significant countermeasures against it.

Extension of CodeMark. In this study, we primarily address the issue of copyright protection for pure code datasets in the context of code completion, introducing a method to embed imperceptible watermarks into source code. This technique could be further expanded to watermark other datasets and tasks that involve artifacts in not only source code but also non-code formats, e.g., comments or commit messages in natural languages. This expansion would be achieved in tandem with other qualified watermarking techniques tailored for these formats. Although CodeMark is fundamentally crafted for dataset watermarking, its utility extends beyond this core purpose. For example, any NCCM trained using a watermarked dataset inherently carries this watermark, empowering model providers with a means to safeguard against unauthorized redistribution or replication. Besides, CodeMark can also help the developers of open-source projects protect their code repositories. For a detailed exploration of using watermarking techniques to secure code repositories, we refer readers to CoProtector [38].

8 RELATED WORK

Software watermarking. Software watermarking protects the ownership of software by embedding a unique identifier within source code, data, or even the execution state. It can be either static [16, 17, 19, 40], i.e., watermarks are embedded in the source code/data, or dynamic [27], i.e., watermarks are stored in the execution state of the program. For example, Monden et al. [29] proposed to embed watermarks by replacing opcodes in dummy methods. Arboit [12] proposed to encode watermarks as constants within opaque predicates to avoid detection by software analysis tools. Sharma et al. [36] proposed to interchange safe operands of mathematical equations to watermark a program. Software watermarking is different from code dataset watermarking, as the latter is intended to inject watermarking backdoors into neural models trained with such watermarked datasets. Though software watermarking is not designed for DL models, its static watermarking methods are still inspiring for the design of our work.

Backdoor poisoning for watermarking. Recent studies have demonstrated the vulnerability of DL models to backdoor poisoning in various domains [18, 35, 43, 46, 53], including program code. Ramakrishnan and Albarghouthi [33] investigated the effectiveness of using dead code as backdoors against code models. Schuster et al. [34] proposed to poison the training data of NCCMs with pre-designed backdoors to generate insecure suggestions to developers. Beyond these malicious usages, studies have also proposed that backdoor poisoning can serve as a watermark in datasets against DL models [11, 26]. The idea has been successfully applied to code models by CoProtector [38], paving the way for our research. The backdoor in CoProtector is easily perceptible, since it is designed for watermarking open-source repositories, based on the assumption that it is more costly to remove a potentially watermarked open-source code repository than to simply skip it. For the protection of entire datasets, a perceptible watermark is easy to recognize and remove. CodeMark is designed to fill this gap.

Adversarial attack on code models. Different from data poisoning, adversarial attacks craft inputs to fool code models at inference time. Most of the adversarial attacks against code models utilize SPTs to transform a benign code snippet into an adversarial one [31, 37, 49-52]. For example, Springer et al. [37] proposed to use variable renaming as the SPT. Zhang et al. [52] proposed to attack code clone detectors with a set of transformations including variable renaming, dead code insertion, and comment deletion. These studies provide strong evidence of the vulnerability of code models to SPTs. Furthermore, data-poisoning-based watermarking occurs at training time and should not noticeably harm the model accuracy at inference time.

9 CONCLUSION

To defend against unauthorized usage of code datasets for training NCCMs, we have proposed, to the best of our knowledge, the first imperceptible watermarking method on code datasets, named CodeMark, to deter potential code dataset thieves. CodeMark embeds watermarking backdoors by transforming code fragments in the code corpus according to designated rules. The watermarks imposed by CodeMark on the samples are semantic-preserving and adaptive to their code context, making them hard for adversaries to notice while harmless to the quality of models. We have implemented an open-source prototype toolkit to automate watermark designing, backdoor embedding, and suspicious model validating. The comprehensive evaluation shows that CodeMark satisfies all the requirements of a practical and reliable watermarking method: harmlessness, imperceptibility, verifiability, and robustness. However, we should emphasize that the watermarking technique itself cannot solve the whole problem of the ethics of code datasets. We thus call for more attention from our research community to this topic for a sustainable future of AI-powered software engineering.

10 DATA AVAILABILITY

To foster further research, the source code of our toolkit and all the artifacts and results are available on our website [9].

ACKNOWLEDGMENTS

This work is supported by the National Natural Science Foundation of China under Grant No. 62072309, CAS Project for Young Scientists in Basic Research under Grant No. YSBR-040, and ISCAS New Cultivation Project under Grant No. ISCAS-PYFX-202201.

REFERENCES
[1] 2021. GitHub Copilot research recitation. Retrieved Aug 15, 2023 from https://github.blog/2021-06-30-github-copilot-research-recitation/
[2] 2022. aiXcoder. Retrieved Jan 3, 2022 from https://www.aixcoder.com/
[3] 2022. Code faster with AI completions | TabNine. Retrieved Jan 3, 2022 from https://www.tabnine.com/
[4] 2022. GitHub Copilot · Your AI pair programmer. Retrieved Jan 3, 2022 from https://copilot.github.com/
[5] 2022. How is the data in Copilot for Individuals used and shared? Retrieved Jan 15, 2023 from https://github.com/features/copilot/#how-is-the-data-in-copilot-for-individuals-used-and-shared
[6] 2022. ML-powered coding companion – Amazon CodeWhisperer – Amazon Web Services. Retrieved Jan 15, 2023 from https://aws.amazon.com/codewhisperer/
[7] 2022. Tree-sitter - Introduction. Retrieved Jan 3, 2022 from https://tree-sitter.github.io/tree-sitter
[8] 2022. Where did AWS obtain the training data to build this service? Retrieved Jan 15, 2023 from https://aws.amazon.com/codewhisperer/faqs/?nc1=h_ls
[9] 2023. CodeMark. Retrieved Jan 31, 2023 from https://sites.google.com/view/codemark
[10] 2023. Stack Overflow Will Charge AI Giants for Training Data. Retrieved Apr 20, 2023 from https://www.wired.com/story/stack-overflow-will-charge-ai-giants-for-training-data/
[11] Yossi Adi, Carsten Baum, Moustapha Cissé, Benny Pinkas, and Joseph Keshet. 2018. Turning Your Weakness Into a Strength: Watermarking Deep Neural Networks by Backdooring. In USENIX Security Symposium.
[12] Geneviève Arboit. 2002. A Method for Watermarking Java Programs via Opaque Predicates. Electronic Commerce Research (2002).
[13] Nghi D. Q. Bui, Yijun Yu, and Lingxiao Jiang. 2021. Self-Supervised Contrastive Learning for Code Retrieval and Summarization via Semantic-Preserving Transformations. Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (2021).
[14] Bryant Chen, Wilka Carvalho, Nathalie Baracaldo, Heiko Ludwig, Ben Edwards, Taesung Lee, Ian Molloy, and B. Srivastava. 2019. Detecting Backdoor Attacks on Deep Neural Networks by Activation Clustering. ArXiv abs/1811.03728 (2019).
[15] Xinyun Chen, Chang Liu, Bo Li, Kimberly Lu, and Dawn Song. 2017. Targeted backdoor attacks on deep learning systems using data poisoning. arXiv preprint arXiv:1712.05526 (2017).
[16] Sebastian Danicic and James Alexander George Hamilton. 2010. An Evaluation of Static Java Bytecode Watermarking.
[17] Robert I Davidson and Nathan Myhrvold. 1996. Method and system for generating and auditing a signature for a computer program. US Patent 5,559,884.
[18] Tianyu Gu, Brendan Dolan-Gavitt, and S. Garg. 2017. BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain. ArXiv abs/1708.06733 (2017).
[19] James Alexander George Hamilton and Sebastian Danicic. 2011. A survey of static software watermarking. 2011 World Congress on Internet Security (WorldCIS-2011) (2011), 100–107.
[20] Xuanli He, Qiongkai Xu, L. Lyu, Fangzhao Wu, and Chenguang Wang. 2021. Protecting Intellectual Property of Language Generation APIs with Lexical Watermark. ArXiv abs/2112.02701 (2021).
[21] Hamel Husain, Hongqi Wu, Tiferet Gazit, Miltiadis Allamanis, and Marc Brockschmidt. 2019. CodeSearchNet Challenge: Evaluating the State of Semantic Code Search. ArXiv abs/1909.09436 (2019).
[22] Wan Soo Kim and Kyogu Lee. 2020. Digital Watermarking For Protecting Audio Classification Datasets. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2020), 2842–2846.
[23] Denis Kocetkov, Raymond Li, Loubna Ben Allal, Jia Li, Chenghao Mou, Carlos Muñoz Ferrandis, Yacine Jernite, Margaret Mitchell, Sean Hughes, Thomas Wolf, Dzmitry Bahdanau, Leandro von Werra, and Harm de Vries. 2022. The Stack: 3 TB of permissively licensed source code. Preprint (2022).
[24] Peter J. Landin. 1964. The Mechanical Evaluation of Expressions. Comput. J. 6 (1964), 308–320.
[25] Raymond Li, Loubna Ben Allal, Yangtian Zi, Niklas Muennighoff, Denis Kocetkov, Chenghao Mou, Marc Marone, Christopher Akiki, Jia Li, Jenny Chim, Qian Liu, Evgenii Zheltonozhskii, Terry Yue Zhuo, Thomas Wang, Olivier Dehaene, Mishig Davaadorj, Joel Lamy-Poirier, João Monteiro, Oleh Shliazhko, Nicolas Gontier, Nicholas Meade, Armel Zebaze, Ming-Ho Yee, Logesh Kumar Umapathi, Jian Zhu, Benjamin Lipkin, Muhtasham Oblokulov, Zhiruo Wang, Rudra Murthy, Jason Stillerman, Siva Sankalp Patel, Dmitry Abulkhanov, Marco Zocca, Manan Dey, Zhihan Zhang, Nourhan Fahmy, Urvashi Bhattacharyya, W. Yu, Swayam Singh, Sasha Luccioni, Paulo Villegas, Maxim Kunakov, Fedor Zhdanov, Manuel Romero, Tony Lee, Nadav Timor, Jennifer Ding, Claire Schlesinger, Hailey Schoelkopf, Jana Ebert, Tri Dao, Mayank Mishra, Alexander Gu, Jennifer Robinson, Carolyn Jane Anderson, Brendan Dolan-Gavitt, Danish Contractor, Siva Reddy, Daniel Fried, Dzmitry Bahdanau, Yacine Jernite, Carlos Muñoz Ferrandis, Sean M. Hughes, Thomas Wolf, Arjun Guha, Leandro von Werra, and Harm de Vries. 2023. StarCoder: may the source be with you! ArXiv abs/2305.06161 (2023). https://api.semanticscholar.org/CorpusID:258588247
[26] Yiming Li, Zi-Mou Zhang, Jiawang Bai, Baoyuan Wu, Yong Jiang, and Shutao Xia. 2020. Open-sourced Dataset Protection via Backdoor Watermarking. ArXiv abs/2010.05821 (2020).
[27] Haoyu Ma, Chunfu Jia, Shijia Li, Wantong Zheng, and Dinghao Wu. 2019. Xmark: Dynamic Software Watermarking Using Collatz Conjecture. IEEE Transactions on Information Forensics and Security 14 (2019), 2859–2874.
[28] Vadim Markovtsev and Waren Long. 2018. Public Git Archive: A Big Code Dataset for All. 2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR) (2018), 34–37.
[29] Akito Monden, Hajimu Iida, Ken-ichi Matsumoto, Koji Torii, and Katsuro Inoue. 2000. A practical method for watermarking Java programs. Proceedings 24th Annual International Computer Software and Applications Conference (COMPSAC 2000) (2000), 191–197.
[30] Kishore Papineni, S. Roukos, T. Ward, and Wei-Jing Zhu. 2002. Bleu: a Method for Automatic Evaluation of Machine Translation. In ACL.
[31] Md. Rafiqul Islam Rabin, Nghi D. Q. Bui, Ke Wang, Yijun Yu, Lingxiao Jiang, and Mohammad Amin Alipour. 2021. On the generalizability of Neural Program Models with respect to semantic-preserving program transformations. Inf. Softw. Technol. 135 (2021), 106552.
[32] Alec Radford, Jeff Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019. Language Models are Unsupervised Multitask Learners.
[33] Goutham Ramakrishnan and Aws Albarghouthi. 2020. Backdoors in Neural Models of Source Code. ArXiv abs/2006.06841 (2020).
[34] R. Schuster, Congzheng Song, Eran Tromer, and Vitaly Shmatikov. 2020. You Autocomplete Me: Poisoning Vulnerabilities in Neural Code Completion. ArXiv abs/2007.02220 (2020).
[35] A. Shafahi, W. R. Huang, Mahyar Najibi, O. Suciu, Christoph Studer, T. Dumitras, and T. Goldstein. 2018. Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks. In NeurIPS.
[36] B. K. Sharma, R. P. Agarwal, and Raghuraj Singh. 2011. An Efficient Software Watermark by Equation Reordering and FDOS. In SocProS.
[37] Jacob M. Springer, Bryn Reinstadler, and Una-May O'Reilly. 2020. STRATA: Simple, Gradient-Free Attacks for Models of Code.
[38] Zhensu Sun, Xiaoning Du, Fu Song, Mingze Ni, and Li Li. 2021. CoProtector: Protect Open-Source Code against Unauthorized Training Usage with Data Poisoning. Proceedings of the ACM Web Conference 2022 (2021).
[39] Buse Gul Atli Tekgul and N. Asokan. 2022. On the Effectiveness of Dataset Watermarking in Adversarial Settings. ArXiv abs/2202.12506 (2022).
[40] Smita Thaker. 2004. Software watermarking via assembly code transformations. San Jose State University (2004).
[41] Brandon Tran, Jerry Li, and A. Madry. 2018. Spectral Signatures in Backdoor Attacks. In NeurIPS.
[42] Ashish Vaswani, Noam M. Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is All you Need. ArXiv abs/1706.03762 (2017).
[43] Eric Wallace, Tony Zhao, Shi Feng, and Sameer Singh. 2021. Concealed Data Poisoning Attacks on NLP Models. In NAACL.
[44] Yue Wang, Weishi Wang, Shafiq R. Joty, and Steven C. H. Hoi. 2021. CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation. ArXiv abs/2109.00859 (2021).
[45] B. L. Welch. 1947. The generalisation of student's problems when several different population variances are involved. Biometrika 34, 1-2 (1947), 28–35.
[46] Changming Xu, Jun Wang, Yuqing Tang, Francisco Guzmán, Benjamin I. P. Rubinstein, and Trevor Cohn. 2021. A Targeted Attack on Black-Box Neural Machine Translation with Parallel Data Poisoning. Proceedings of the Web Conference 2021 (2021).
[47] Mohammad Mehdi Yadollahi, Farzaneh Shoeleh, Sajjad Dadkhah, and Ali A. Ghorbani. 2021. Robust Black-box Watermarking for Deep Neural Network using Inverse Document Frequency. 2021 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech) (2021), 574–581.
[48] Yanming Yang, Xin Xia, David Lo, and John C. Grundy. 2020. A Survey on Deep Learning for Software Engineering. CoRR abs/2011.14597 (2020).
[49] Zhou Yang, Jieke Shi, Junda He, and David Lo. 2022. Natural Attack for Pre-trained Models of Code. ArXiv abs/2201.08698 (2022).
[50] Noam Yefet, Uri Alon, and Eran Yahav. 2020. Adversarial examples for models of code. Proceedings of the ACM on Programming Languages 4 (2020), 1–30.
[51] Huangzhao Zhang, Zhuo Li, Ge Li, L. Ma, Yang Liu, and Zhi Jin. 2020. Generating Adversarial Examples for Holding Robustness of Source Code Processing Models. In AAAI.
[52] Weiwei Zhang, Shengjian Guo, Hongyu Zhang, Yulei Sui, Yinxing Xue, and Yun Xu. 2021. Challenging Machine Learning-based Clone Detectors via Semantic-preserving Code Transformations. ArXiv abs/2111.10793 (2021).
[53] Shihao Zhao, Xingjun Ma, X. Zheng, J. Bailey, Jingjing Chen, and Yugang Jiang. 2020. Clean-Label Backdoor Attacks on Video Recognition Models. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2020), 14431–14440.