Hive in Class Assignment Winter 2021

The document provides instructions for a Hive assignment involving loading data files into HDFS and creating external tables in Hive to query the data. Students are asked to load ratings, movies, users, and occupations data files from the class files directory into HDFS, create a Hive database called Hive_Tutorial, and then create external tables in Hive with the appropriate schemas and locations to reference the data files. Sample queries are provided to test that the data loaded correctly. Four use cases are outlined for students to write Hive queries against the data.

Uploaded by

Syed Shouiab

BIG DATA 1 HIVE IN CLASS ASSIGNMENT

1. Download the following files from the class files to a Windows directory on the VM:
   Occupation.dat
   Movies.dat
   Ratings.dat
   Users.dat
2. Create a folder “data” in HDFS at the location “/user/maria_dev/data”.
3. Load the 4 files into /user/maria_dev/data. Make sure read and write permissions are enabled on the new folder “data”.
4. Use the compose section in Hive 2.0 to create a separate database named Hive_Tutorial.
5. Issue the following commands:
   create database Hive_Tutorial;
   use Hive_Tutorial;
6. Then create the following EXTERNAL tables to hold the data:

CREATE EXTERNAL TABLE ratings (userid INT, movieid INT, rating INT,tstamp STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '#'
STORED AS TEXTFILE
LOCATION '/user/maria_dev/data/ratings';

CREATE EXTERNAL TABLE movies (movieid INT, title STRING, genres ARRAY<STRING>)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '#'
COLLECTION ITEMS TERMINATED BY '|'
STORED AS TEXTFILE
LOCATION '/user/maria_dev/data/movies';

CREATE EXTERNAL TABLE users (userid INT, gender STRING, age INT, occupation_id INT, zipcode STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '#'
STORED AS TEXTFILE
LOCATION '/user/maria_dev/data/users';

CREATE EXTERNAL TABLE occupations (id INT, occupation STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '#'
STORED AS TEXTFILE
LOCATION '/user/maria_dev/data/occupations';

7. Check that the data is loaded in all the tables:

select * from users limit 2;


OK
1 F 1 10 48067
2 M 56 16 70072

select * from movies limit 2;


OK
1 Toy Story (1995) ["Animation","Children's","Comedy"]
2 Jumanji (1995) ["Adventure","Children's","Fantasy"]

select * from ratings limit 2;


OK
1 1193 5 978300760
1 661 3 978302109

select * from occupations limit 2;


OK
0 other/not specified
1 academic/educator
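
Beyond LIMIT 2, a row count per table can confirm that each file was actually picked up. The following is a minimal sketch that uses only the table names defined above; the expected counts are not stated in the assignment, so treat a count of 0 as the only clear signal (it usually means the table's LOCATION directory does not contain the data file).

```sql
-- Run in the Hive_Tutorial database.
-- A count of 0 suggests the table's LOCATION directory is empty.
SELECT COUNT(*) AS users_rows       FROM users;
SELECT COUNT(*) AS movies_rows      FROM movies;
SELECT COUNT(*) AS ratings_rows     FROM ratings;
SELECT COUNT(*) AS occupations_rows FROM occupations;
```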

Take a screenshot of each answer showing the result of your query, then paste the
screenshots into a Word document with your name and student number at the top and
the title "Hive In Class Assignment".
Once completed, submit the assignment.
NOTE: In each case, to maintain readability, I will limit the output to 10 rows only.

Use Case 1:
Find the occupation of every user.
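
One possible approach to Use Case 1, sketched under the assumption that users.occupation_id refers to occupations.id (the column names come from the table definitions above; the join condition itself is an assumption):

```sql
-- Pair each user with the name of their occupation.
-- Assumes users.occupation_id matches occupations.id.
SELECT u.userid,
       o.occupation
FROM users u
JOIN occupations o
  ON u.occupation_id = o.id
LIMIT 10;
```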

Use Case 2:
Find the number of non-adults who have rated movies.
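
A hedged sketch for Use Case 2. The assignment does not define "non-adult", so the filter below assumes it means an age value below 18; adjust the threshold if the age column uses a different coding.

```sql
-- Count distinct users under 18 who have at least one rating.
-- The age < 18 cutoff is an assumption, not given in the assignment.
SELECT COUNT(DISTINCT u.userid) AS non_adult_raters
FROM users u
JOIN ratings r
  ON u.userid = r.userid
WHERE u.age < 18;
```

COUNT(DISTINCT ...) is used because a user with many ratings would otherwise be counted once per rating.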

Use Case 3:
Find the number of users in each occupation who are older than 25, along with
the occupation details.
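
Use Case 3 could be sketched as a grouped join. As before, the join on users.occupation_id = occupations.id is an assumption, and the ordering by count is only an illustrative choice.

```sql
-- Users older than 25, counted per occupation.
-- Assumes users.occupation_id matches occupations.id.
SELECT o.id,
       o.occupation,
       COUNT(*) AS user_count
FROM users u
JOIN occupations o
  ON u.occupation_id = o.id
WHERE u.age > 25
GROUP BY o.id, o.occupation
ORDER BY user_count DESC
LIMIT 10;
```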

Use Case 4:
Find the age of the user who has rated the most movies, along with the count of ratings.
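
A minimal sketch for Use Case 4, reading "most rated user" as the user with the largest number of rows in the ratings table (an interpretation, since the wording is ambiguous):

```sql
-- Count ratings per user, then keep the user with the highest count.
SELECT u.userid,
       u.age,
       COUNT(*) AS rating_count
FROM users u
JOIN ratings r
  ON u.userid = r.userid
GROUP BY u.userid, u.age
ORDER BY rating_count DESC
LIMIT 1;
```

If several users tie for the highest count, LIMIT 1 returns only one of them.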
