Hadoop - copyFromLocal Command

Last Updated : 27 Dec, 2021

The Hadoop copyFromLocal command copies a file from your local file system to HDFS (Hadoop Distributed File System). It has an optional -f switch that replaces a file that already exists at the destination, so it can be used to update that file. Using -f is similar to deleting the file first and then copying it again. Without -f, copying a file into a folder that already contains a file of the same name fails with an error.

Syntax to copy a file from your local file system to HDFS is given below:

hdfs dfs -copyFromLocal /path1 /path2 ... /pathN /destination

The copyFromLocal command is similar to the -put command used in HDFS. We can also use hadoop fs as a synonym for hdfs dfs. The command can take multiple arguments: every path except the last is a source to copy from, and the last path is the destination the files are copied to. Make sure the destination is a directory when you pass more than one source.
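
As a minimal sketch of the multi-source form described above, the function below copies several local files in one call. The name copy_many is hypothetical, and the sketch assumes that the hdfs client is on the PATH and that the destination directory already exists.

```shell
# Hypothetical helper: copy one or more local files into an existing HDFS directory.
# Usage: copy_many <hdfs_dir> <local_file>...
copy_many() {
    if [ "$#" -lt 2 ]; then
        echo "usage: copy_many <hdfs_dir> <local_file>..." >&2
        return 2
    fi
    dest=$1
    shift
    # Sources come first and the destination goes last, as in the syntax above.
    hdfs dfs -copyFromLocal "$@" "$dest"
}
```

Taking the destination as the first argument is only a convenience of this sketch; the underlying command still receives the sources first and the destination last.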

Our objective is to copy a file from our local file system to HDFS. In my case, I want to copy the file named Salaries.csv, which is present in the /home/dikshant/Documents/hadoop_file directory.

Steps to execute copyFromLocal Command

Let's see the current view of my Root directory in HDFS.
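
One way to produce such a listing is sketched below. The name show_hdfs_dir is hypothetical, and the sketch assumes the hdfs client is on the PATH.

```shell
# Hypothetical helper: list an HDFS directory, defaulting to the root directory.
show_hdfs_dir() {
    hdfs dfs -ls "${1:-/}"
}
```

Calling show_hdfs_dir with no argument lists the root directory; passing /Hadoop_File lists that directory instead.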

Step 1: Create the HDFS directory you want to copy the file into, using the command below.

hdfs dfs -mkdir /Hadoop_File

making a directory in HDFS

showing the directory of HDFS
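
If Step 1 may be run more than once, the directory can be created only when it is missing. The sketch below is one possible way to do that; the name ensure_hdfs_dir and the printed messages are hypothetical, and it assumes the hdfs client is on the PATH.

```shell
# Hypothetical helper: create an HDFS directory only if it is not already there.
ensure_hdfs_dir() {
    # -test -d exits with status 0 when the path exists and is a directory.
    if hdfs dfs -test -d "$1"; then
        echo "exists: $1"
    else
        # -p also creates any missing parent directories.
        hdfs dfs -mkdir -p "$1" && echo "created: $1"
    fi
}
```

For this article's example the call would be ensure_hdfs_dir /Hadoop_File.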

Step 2: Use the copyFromLocal command as shown below to copy the file to the /Hadoop_File directory in HDFS.

hdfs dfs -copyFromLocal /home/dikshant/Documents/hadoop_file/Salaries.csv /Hadoop_File

using copyFromLocal Command in Hadoop

Step 3: Check whether the file was copied successfully by listing the destination directory with the command below.

hdfs dfs -ls /Hadoop_File

checking file is copied or not - 1

checking file is copied or not - 2
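
In a script, the check from Step 3 can also be made without reading the listing by eye. The sketch below uses the exit status of hdfs dfs -test; the name hdfs_file_exists and the printed messages are hypothetical, and it assumes the hdfs client is on the PATH.

```shell
# Hypothetical helper: report whether a path exists in HDFS.
hdfs_file_exists() {
    # -test -e exits with status 0 when the path exists.
    if hdfs dfs -test -e "$1"; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}
```

For this article's example the call would be hdfs_file_exists /Hadoop_File/Salaries.csv.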

Overwriting or Updating the File In HDFS with -f switch

Running copyFromLocal a second time on the same file fails: on its own, the command does not copy a file to a location that already holds a file of the same name. Instead, it reports that the file already exists.

Overwriting or Updating the File In HDFS with -f switch - 1

To update the content of the file or to overwrite it, use the -f switch as shown below.

hdfs dfs -copyFromLocal -f /home/dikshant/Documents/hadoop_file/Salaries.csv /Hadoop_File

Overwriting or Updating the File In HDFS with -f switch - 2

With the -f switch, copyFromLocal does not produce an error; it replaces the existing file in HDFS with the new copy.
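
The two behaviours can be wrapped in one small function so that overwriting is always an explicit choice. This is only a sketch: the name upload_file and its optional force argument are hypothetical, and it assumes the hdfs client is on the PATH.

```shell
# Hypothetical helper: upload a local file, overwriting only when asked to.
# Usage: upload_file <local_file> <hdfs_dir> [force]
upload_file() {
    if [ "${3:-}" = "force" ]; then
        # -f replaces a file of the same name at the destination.
        hdfs dfs -copyFromLocal -f "$1" "$2"
    else
        # Without -f the command fails if the file is already there.
        hdfs dfs -copyFromLocal "$1" "$2"
    fi
}
```

For example, upload_file /home/dikshant/Documents/hadoop_file/Salaries.csv /Hadoop_File force runs the same command as the -f example above, while leaving out force keeps the default behaviour of refusing to overwrite.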

dikshantmalidev
