Hive Query Language
DATABASE COMMANDS
Create Database
Syntax:
CREATE DATABASE IF NOT EXISTS STUDENTS COMMENT 'student details'
WITH DBPROPERTIES ('creator'='JOHN');
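The statements below are a small sketch of how the new database could be inspected and selected once it exists. They use only the STUDENTS database from the statement above; nothing else is assumed.

```sql
-- List all databases known to the metastore
SHOW DATABASES;

-- Show the comment, location and DBPROPERTIES of the STUDENTS database
DESCRIBE DATABASE EXTENDED STUDENTS;

-- Make STUDENTS the current database for subsequent statements
USE STUDENTS;
```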
• Managed tables are Hive-owned tables: the entire lifecycle of the table's data is managed
and controlled by Hive.
• External tables are tables where Hive has loose coupling with the data.
• All the write operations to the Managed tables are performed using Hive SQL commands.
• The writes on External tables can be performed using Hive SQL commands but data files can also be
accessed and managed by processes outside of Hive.
• If an External table or partition is dropped, only the metadata associated with the table or partition
is deleted; the underlying data files stay intact. A typical use of an External table is running
analytical queries on HBase via Hive, where the data files are written by HBase and Hive reads them
for analytics.
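The difference in drop behaviour described above can be sketched as follows. The table name EXT_STUD_DEMO and the HDFS directory /user/hive/ext_stud_demo are hypothetical and chosen only for illustration.

```sql
-- Hypothetical external table over files that already live in HDFS.
-- The LOCATION path is an assumption; point it at a real directory.
CREATE EXTERNAL TABLE IF NOT EXISTS EXT_STUD_DEMO (
  rollno INT,
  name   STRING,
  gpa    FLOAT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/user/hive/ext_stud_demo';

-- The "Table Type" field in the output shows EXTERNAL_TABLE or MANAGED_TABLE
DESCRIBE FORMATTED EXT_STUD_DEMO;

-- Removes only the metadata; the files under LOCATION stay in place
DROP TABLE EXT_STUD_DEMO;
```

Running the same DROP TABLE on a managed table would remove the data files as well as the metadata.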
CREATE TABLE
• SYNTAX
• CREATE TABLE IF NOT EXISTS STUD(rollno INT, name STRING, gpa
FLOAT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
• OBJECTIVE:
• The file at the local path is loaded into the table (LOAD DATA LOCAL INPATH)
• Omit the keyword LOCAL if the input file is to be fetched from HDFS
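A minimal sketch of the two load variants described above, using the STUD table. Both file paths are hypothetical placeholders and must be replaced with real locations.

```sql
-- LOCAL: the path refers to the local file system of the client;
-- the file is copied into the table's directory.
-- '/tmp/student.tsv' is a hypothetical path.
LOAD DATA LOCAL INPATH '/tmp/student.tsv'
OVERWRITE INTO TABLE STUD;

-- Without LOCAL: the path refers to HDFS;
-- the file is moved (not copied) into the table's directory.
-- '/user/hive/input/student.tsv' is a hypothetical path.
LOAD DATA INPATH '/user/hive/input/student.tsv'
INTO TABLE STUD;
```

OVERWRITE replaces the existing contents of the table; without it, the new file is added to the data already there.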
Working with Collection datatypes
CREATE TABLE STUDENT_INFO (rollno INT, name STRING,
subject ARRAY<STRING>, marks MAP<STRING, INT>)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
COLLECTION ITEMS TERMINATED BY ':'
MAP KEYS TERMINATED BY '!';
Input format:
1001,John,Smith:Jones,Mark1!45:Mark2!46:Mark3!43
1002,Aby,Smith:Jones,Mark1!65:Mark2!96:Mark3!93
QUERY TABLES
• SELECT * FROM EXT_STUD;
• SELECT NAME,GPA FROM EXT_STUD;
• SELECT NAME,SUB FROM EXT_STUD;
• SELECT NAME, MARKS['Mark1'] FROM EXT_STUD;
• SELECT NAME, SUB[0] FROM EXT_STUD;
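The same array and map lookups can be sketched against the STUDENT_INFO table defined in the collection datatypes section, which has the columns subject (ARRAY) and marks (MAP). The column aliases are illustrative only.

```sql
-- Array elements are addressed by zero-based index,
-- map values by key (string keys must be quoted).
SELECT name,
       subject[0]     AS first_subject,
       marks['Mark1'] AS mark1,
       size(subject)  AS subject_count
FROM STUDENT_INFO;

-- explode() with LATERAL VIEW turns each array element into its own row
SELECT name, sub
FROM STUDENT_INFO
LATERAL VIEW explode(subject) s AS sub;
```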
Partitions
• Without partitions, Hive reads the entire dataset even when a WHERE clause is specified
• The full scan makes queries I/O-heavy and slow, which is why partitions are needed
• Partitions split the data into meaningful chunks, so a query reads only the relevant ones
• Static Partitions: the partition column values are known at compile time
• Dynamic Partitions: the partition column values are known only at execution
time
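A minimal sketch of a dynamic-partition load, where the partition values come from the data at execution time. The table name DYNAMIC_PART_STUDENT is hypothetical; the source table STUD is the one created earlier.

```sql
-- Enable dynamic partitioning; nonstrict mode allows all
-- partition columns to be determined dynamically.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

-- Hypothetical target table, partitioned by gpa
CREATE TABLE IF NOT EXISTS DYNAMIC_PART_STUDENT (rollno INT, name STRING)
PARTITIONED BY (gpa FLOAT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

-- No value is given for gpa in the PARTITION clause: Hive creates one
-- partition per distinct gpa value found in the SELECT result.
-- The partition column must come last in the SELECT list.
INSERT OVERWRITE TABLE DYNAMIC_PART_STUDENT PARTITION (gpa)
SELECT rollno, name, gpa FROM STUD;
```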
Static Partitions
• CREATE TABLE IF NOT EXISTS STATIC_PART_STUDENT(rollno INT, name
STRING) PARTITIONED BY (gpa FLOAT) ROW FORMAT DELIMITED FIELDS
TERMINATED BY '\t';
• (created partition)
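A sketch of how a static partition of this table might be populated, assuming the STUD table from the CREATE TABLE section as the source. The partition values 4.0 and 3.5 are illustrative only.

```sql
-- Static partition: the value gpa = 4.0 is fixed in the statement itself,
-- so the partition column is not part of the SELECT list.
INSERT OVERWRITE TABLE STATIC_PART_STUDENT PARTITION (gpa = 4.0)
SELECT rollno, name FROM STUD WHERE gpa = 4.0;

-- An empty partition can also be added explicitly
ALTER TABLE STATIC_PART_STUDENT ADD IF NOT EXISTS PARTITION (gpa = 3.5);

-- List the partitions that now exist
SHOW PARTITIONS STATIC_PART_STUDENT;

-- A filter on the partition column lets Hive read only that partition
SELECT rollno, name FROM STATIC_PART_STUDENT WHERE gpa = 4.0;
```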