Hadoop - File Permission and ACL (Access Control List)
Last Updated :
10 Jul, 2020
In general, a Hadoop cluster enforces security at many layers, and the level of protection depends on the organization's requirements. In this article, we will look at Hadoop's first level of security, which consists of two components. Both are part of the default installation.
1. File Permission
2. ACL (Access Control List)
1. File Permission
HDFS (Hadoop Distributed File System) implements a POSIX (Portable Operating System Interface)-like file permission model, similar to the one in Linux. In Linux, every file and directory carries permissions for three classes: Owner, Group, and Others.
Owner/user Group Others
rwx rwx rwx
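As an aside, the three rwx triplets map directly onto the octal notation used with chmod. Here is a small Python sketch of that mapping (illustrative only, not Hadoop code):

```python
def mode_to_string(mode: int) -> str:
    """Render a 9-bit octal mode (e.g. 0o754) as an rwx permission string."""
    out = []
    for shift in (6, 3, 0):  # owner, group, others
        bits = (mode >> shift) & 0b111
        out.append("".join(flag if bits & bit else "-"
                           for flag, bit in zip("rwx", (4, 2, 1))))
    return "".join(out)

print(mode_to_string(0o754))  # rwxr-xr--
print(mode_to_string(0o600))  # rw-------
```

A mode of 0o754, for instance, means the owner can read, write, and execute, the group can read and execute, and others can only read.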
Similarly, HDFS defines a set of permissions for the Owner, Group, and Others. In Linux, -rwx grants a specific user r (read), w (write or append), and x (execute) permission. For a file in HDFS we have r for reading and w for writing and appending, but the x (execute) permission is meaningless: all files in HDFS are supposed to be data files, and there is no concept of executing a file in HDFS. Since there are no executables in HDFS, there are also no setUID and setGID bits.

Directories in HDFS carry permissions as well: r lets you list the contents of a directory, w lets you create or delete entries inside it, and x lets you access a child of the directory. Here too there are no setUID and setGID bits.
How Can You Change an HDFS File's Permissions?
The -chmod (change mode) command changes the permissions of files in HDFS. First, list the directories in your HDFS and look at the permissions assigned to each of them. You can list your HDFS root directory with the command below.
hdfs dfs -ls /
Here, / represents the root directory of your HDFS.

First, let's list the files present in my Hadoop_File directory.
hdfs dfs -ls /Hadoop_File
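The exact size, timestamp, and replication count will differ on your cluster, but the listing for a file that only its owner can read and write looks roughly like this (the user name dikshant comes from this walkthrough, and supergroup is the HDFS default group; the size and date shown are placeholders):

```
Found 1 items
-rw-------   1 dikshant supergroup         20 2020-07-10 12:00 /Hadoop_File/file1.txt
```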

For file1.txt, only the owner has read and write permission, so let's add write permission for the group and others as well.
Pre-requisite: You should be familiar with the -chmod command in Linux, i.e. how its permission switches work for the different user classes. To add write permission for the group and others, use the command below.
hdfs dfs -chmod go+w /Hadoop_File/file1.txt
Here, go stands for group and others, w means write, and the + sign indicates that we are adding write permission for the group and others. List the file again to check whether it worked.
hdfs dfs -ls /Hadoop_File

And we are done. In the same way, you can change the permissions of any file or directory in HDFS (Hadoop Distributed File System) for any user class as required. You can also change the group or owner of a file or directory with -chgrp and -chown respectively.
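To make the effect of a symbolic mode like go+w concrete, here is a small Python sketch that applies such a spec to a 9-character permission string. It is a toy model of the chmod syntax, not Hadoop code:

```python
def apply_chmod(perms: str, spec: str) -> str:
    """Apply a simple symbolic spec (classes from 'ugoa', one of '+'/'-',
    flags from 'rwx') to a 9-character permission string like 'rw-------'."""
    op = "+" if "+" in spec else "-"
    who, flags = spec.split(op)
    classes = {"u": 0, "g": 1, "o": 2}
    targets = range(3) if "a" in who else [classes[c] for c in who]
    # Split into the owner, group, and others triplets
    triplets = [list(perms[i * 3:(i + 1) * 3]) for i in range(3)]
    for t in targets:
        for f in flags:
            triplets[t]["rwx".index(f)] = f if op == "+" else "-"
    return "".join("".join(t) for t in triplets)

# go+w turns rw------- into rw--w--w-, as in the chmod step above
print(apply_chmod("rw-------", "go+w"))  # rw--w--w-
```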
2. ACL (Access Control List)
An ACL provides a more flexible way to assign permissions in a file system: it is a list of access permissions attached to a file or directory. You need ACLs when you have created a separate user for your single-node Hadoop cluster setup, or you have a multi-node cluster and want to change permissions for other users. The reason is that -chmod cannot set permissions for an arbitrary user. For example, suppose the main user of your single-node Hadoop cluster is root and you have created a separate user, say Hadoop, for the Hadoop setup. If you now want to change the root user's permissions on files present in your HDFS, you cannot do it with the -chmod command. This is where the ACL (Access Control List) comes into the picture: with an ACL you can set permissions for a specific named user or named group.
To enable ACLs in HDFS, add the property below to the hdfs-site.xml file.
[code]
<property>
  <name>dfs.namenode.acls.enabled</name>
  <value>true</value>
</property>
[/code]
Note: Don't forget to restart all the daemons, otherwise the changes made to hdfs-site.xml will not take effect.
You can check the entries in the access control list (ACL) of a directory with the -getfacl command, as shown below.
hdfs dfs -getfacl /Hadoop_File
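The exact owner and group depend on your setup, but for a directory with no extra entries the -getfacl output typically shows three entries, one each for the owner, group, and other classes (dikshant and supergroup below are the user and default HDFS group assumed in this walkthrough):

```
# file: /Hadoop_File
# owner: dikshant
# group: supergroup
user::rwx
group::r-x
other::r-x
```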

You can see that there are 3 different entries in our ACL. Suppose you want to change the permissions on an HDFS directory for your root user; you can do it with the command below.
Syntax:
hdfs dfs -setfacl -m user:user_name:r-x /Hadoop_File
You can grant any user access by adding an entry for that named user to the ACL of the directory. Below are some examples of changing the permissions of different named users on an HDFS file or directory.
hdfs dfs -setfacl -m user:root:r-x /Hadoop_File
Another example, for the raj user:
hdfs dfs -setfacl -m user:raj:r-x /Hadoop_File
Here, r-x grants only read and execute permission on the HDFS directory to the root and raj users.
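The entry passed to -setfacl -m always has the shape type:name:perms, as in the commands above. Here is a small Python sketch (illustrative only, not HDFS code) that splits such a plain entry and reports which permissions it grants; note that it does not handle the four-field default-ACL form:

```python
def parse_acl_spec(spec: str):
    """Split a -setfacl entry like 'user:root:r-x' into its three fields
    and report which of the rwx permissions it grants."""
    etype, name, perms = spec.split(":")
    granted = {flag: flag in perms for flag in "rwx"}
    return etype, name, granted

etype, name, granted = parse_acl_spec("user:root:r-x")
print(etype, name, granted)  # user root {'r': True, 'w': False, 'x': True}
```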
In my case, I don't have any other user, so I am changing the permissions for my only user, i.e. dikshant:
hdfs dfs -setfacl -m user:dikshant:rwx /Hadoop_File
Then list the ACL with the -getfacl command to see the changes.
hdfs dfs -getfacl /Hadoop_File

Here, you can see another entry, user:dikshant:rwx, in the ACL of this directory, reflecting the new permissions of the dikshant user. Similarly, if you have multiple users, you can change their permissions on any HDFS directory. As another example, this command changes the dikshant user's permissions to r-x mode:
hdfs dfs -setfacl -m user:dikshant:r-x /Hadoop_File

Here, you can see that the dikshant user's permissions have changed from rwx to r-x.