Docker ADD vs COPY Command in Dockerfile
Last Updated: 23 Dec, 2024
When creating Dockerfiles, it's often necessary to transfer files from the host system into the Docker image. These could be property files, native libraries, or other static content that our applications will require at runtime. The Dockerfile specification provides two ways to copy files from the source system into an image: the COPY and ADD directives. Here we will look at the difference between them and when it makes sense to use each one of them.
Sometimes you see COPY or ADD being used in a Dockerfile, but you should be using COPY 99% of the time. Here's why.
COPY and ADD are both Dockerfile instructions that serve a similar purpose. They let you copy files from a specific location into a Docker image.
Docker COPY Command
COPY takes a source and a destination. It only lets you copy local files or directories from the build context on your host (the machine building the Docker image) into the Docker image itself.
COPY <src> <dest>
Explanation
- <src>: the source path of the file or directory on your local machine.
- <dest>: the destination path inside the Docker image where you want the copied files or directories to be placed.
Note: If you are copying local files to your Docker image, always use COPY because it's more explicit.
Important Points
- Basic Functionality: The 'COPY' command is used to copy files and directories from the host’s file system into the Docker image. It doesn’t support URLs or automatic extraction of compressed files.
- Security: Since 'COPY' only handles local files, it is generally more secure and predictable than 'ADD', reducing the risk of unintentionally including files from external sources.
- Use Case: 'COPY' is ideal for transferring files from the local build context to the Docker image without any additional processing.
Example
COPY ./config/settings.json /app/config/
COPY ./static /app/static/
In this example, 'settings.json' is copied to the '/app/config/' directory inside the image, and all files within the 'static' directory are copied to '/app/static/'. This setup lets you arrange local files within specific directories inside the Docker image.
Docker ADD Command
ADD does the same work as the COPY command but, in addition, supports two other kinds of source:
- A URL instead of a local file/directory.
- A local tar archive, which is automatically extracted into the destination.
ADD <src> <dest>
Explanation
- <src>: This can be a local file/directory, a URL, or a tar file.
- <dest>: This is the destination path in the Docker image where the files will be placed.
In most cases, if you're using a URL, you download a zip file and then use the RUN command to extract it. However, you might as well just use RUN and curl instead of ADD here, so that you chain the download, extraction, and cleanup into a single RUN command; the archive is then never stored in a layer of its own, which makes for a smaller Docker image. A valid use case for ADD is when you want to extract a local tar file into a specific directory in your Docker image. This is exactly what the Alpine image does with ADD rootfs.tar.gz /.
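The single-RUN approach described above can be sketched roughly as follows. This is a minimal illustration, not a definitive recipe: the URL, the archive name app.tar.gz, and the /opt/app directory are hypothetical placeholders, and a tar.gz archive is assumed rather than a zip file.

```dockerfile
FROM alpine:latest
# Hypothetical URL and paths, used for illustration only.
# Download, extract, and clean up in one RUN instruction so the
# archive itself is never stored in an image layer.
RUN apk add --no-cache curl \
    && mkdir -p /opt/app \
    && curl -fsSL -o /tmp/app.tar.gz https://example.com/app.tar.gz \
    && tar -xzf /tmp/app.tar.gz -C /opt/app \
    && rm /tmp/app.tar.gz
```

Because the download and the rm happen in the same instruction, the resulting layer contains only the extracted files.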
Important Points
While functionality is similar, the ADD directive is more powerful in two ways as follows:
- It can handle remote URLs
- It can also auto-extract tar files.
Let's look at these more closely.
First, the ADD directive can accept a remote URL for its source argument. The COPY directive, on the other hand, can only accept local files.
Note: Using ADD to fetch remote files is not typically ideal, because the downloaded file stays in an image layer and increases the overall Docker image size even if a later step deletes it. Instead, we should use curl or wget in a RUN instruction to fetch remote files and remove them in the same instruction when they are no longer needed.
Second, the ADD directive will automatically expand tar files into the image file system. While this can reduce the number of Dockerfile steps required to build an image, it may not be desired in all cases.
Note: The auto-expansion only occurs when the source file is local to the host system.
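To make the difference in archive handling concrete, here is a small sketch that applies both directives to the same file. It assumes a hypothetical local archive named app.tar.gz in the build context; the destination paths are also illustrative.

```dockerfile
FROM alpine:latest
# Assumes a hypothetical local archive app.tar.gz in the build context.
# ADD unpacks the archive: its contents end up under /opt/app/.
ADD app.tar.gz /opt/app/
# COPY keeps the archive as-is: the result is the file /opt/archives/app.tar.gz.
COPY app.tar.gz /opt/archives/
```

If the archive should stay packed inside the image, COPY is the directive to use.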
Reasons to use COPY instead of ADD in Dockerfile
According to the Dockerfile best practices guide, we should always prefer COPY over ADD unless we specifically need one of the two additional features of ADD. As noted above, using ADD command automatically expands tar files and specific compressed formats, which can lead to unexpected files being written to the file system in our images.
Docker ADD vs COPY Command in Dockerfile
COPY Command | ADD Command |
---|---|
COPY is a Dockerfile instruction that copies files from a local source location to a destination in the Docker image. | ADD is a Dockerfile instruction that copies files/directories into a Docker image. |
Syntax: COPY <src> <dest> | Syntax: ADD <src> <dest> |
It has only one assigned function. | It can also copy files from a URL and extract local tar archives. |
Its role is to duplicate files/directories in a specified location in their existing format. | ADD can download an external file and copy it to the wanted destination. |
It is the preferred choice when none of ADD's extra features are needed. | It is recommended less often than COPY, because its extra behavior can produce unexpected results. |
Remote Context of Docker COPY and ADD Commands
While working with Docker, the COPY and ADD commands are both used to add files to an image, but they handle remote contexts in distinct ways. Here’s a breakdown of how each command works with remote files.
Remote Context of Docker COPY Command
- Local Context Only: The COPY command is strictly limited to files and directories that are present in your local build context. This means you can only copy files that are located in the directory (or its subdirectories) that you pass to the docker build command. It doesn’t allow for fetching files from remote URLs.
- Use Cases: Because of this limitation, COPY is best used for including files that are part of your project or configuration files stored locally.
- Best Practice: If you need to include files from a remote source, it's recommended to download them using curl or wget in a RUN instruction. COPY reads from the build context, not from the file system of the image being built, so a file downloaded by RUN can only be copied from an earlier build stage with COPY --from. This gives you more control over the process and helps manage security and image size.
Example
# Build stage that only downloads the file
FROM alpine:latest AS downloader
# Install curl to fetch remote files
RUN apk add --no-cache curl
# Download a file from a remote source
RUN curl -fsSL -o /tmp/remote-file.txt https://example.com/remote-file.txt
# Final image
FROM alpine:latest
# Copy the downloaded file from the downloader stage into the final image
COPY --from=downloader /tmp/remote-file.txt /app/remote-file.txt
Remote Context of Docker ADD Command
- Supports Remote URLs: The ADD command offers more flexibility by allowing you to specify a remote URL as the source. If you provide a URL, ADD will download the file directly into the Docker image.
- Automatic Extraction: In addition, ADD can automatically unpack local tar archives (plain, or compressed with gzip, bzip2, or xz) when they are included in the build context, a feature that COPY does not have. It does not unpack .zip files, and it does not unpack archives fetched from a URL.
- Security Considerations: Although being able to download files directly with ADD is convenient, it also introduces some security risks. Remote files can change or disappear, and there’s a chance of inadvertently including harmful content in your image.
Example
FROM alpine:latest
# Download a file directly using ADD
ADD https://example.com/remote-file.txt /app/remote-file.txt
Key Points to Remember
- Security Risks: Using ADD to download files from remote locations can lead to unexpected issues and security vulnerabilities. It’s usually safer to explicitly manage remote downloads with tools like curl or wget, and reserve COPY for files you know and trust.
- Build Context Control: The COPY command gives you better control over which files are included in your image, ensuring everything comes from your local context. This leads to more consistency and predictability in your builds.
- Use Cases: Use COPY for local files and directories to ensure a secure and predictable build process. Reserve ADD for cases where you need its special features, like fetching remote files or unpacking archives, while being mindful of the potential risks involved.
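One possible way to manage a remote download explicitly is to verify the file against a known digest right after fetching it, so the build fails if the remote content has changed. The sketch below is an illustration under assumptions, not a prescribed method: the build argument REMOTE_FILE_SHA256 is a hypothetical name, and the URL is the placeholder used elsewhere in this article.

```dockerfile
FROM alpine:latest
# Hypothetical build argument; supply the expected digest at build time:
#   docker build --build-arg REMOTE_FILE_SHA256=<digest> .
ARG REMOTE_FILE_SHA256
# Download the file, then fail the build if its SHA-256 digest does not match.
RUN apk add --no-cache curl \
    && mkdir -p /app \
    && curl -fsSL -o /app/remote-file.txt https://example.com/remote-file.txt \
    && echo "${REMOTE_FILE_SHA256}  /app/remote-file.txt" | sha256sum -c -
```

If the digest check fails, sha256sum exits with a non-zero status and the RUN instruction aborts the build.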
By understanding how these commands work with remote files, you can make better choices for managing your Docker images effectively.
Conclusion
Here we have seen the two primary ways to copy files into a Docker image: ADD and COPY. While functionally similar, the COPY directive is preferred for most cases. This is because the ADD directive provides additional functionality that should be used with caution and only when needed.