Thursday, August 20, 2015

Loading data into partitioned table in HIVE



Loading the data directly into partitioned table is not supported by LOAD command. 

Data loading in partitioned table works only with INSERT command(as in step 5).

Hence you need to follow the below steps.

1. Create a temporary table(Staging table)

CREATE TABLE part_employee_dept_pay_stg(id INT,first_name STRING,last_name STRING,dept STRING,salary INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY ",";

2. Load data into staging table

LOAD DATA INPATH 'employee.txt' INTO TABLE part_employee_dept_pay_stg;

3. Create partition table

CREATE TABLE part_employee_dept_pay(id INT,first_name STRING,last_name STRING,salary INT) PARTITIONED BY (dept STRING) DELIMITED FIELDS TERMINATED BY ",";

4. set the properties for dynamic partitions in hive terminal

set hive.exec.dynamic.partition=true;  
set hive.exec.dynamic.partition.mode=nonstrict;  

5. Then load data from staging table to partitioned table.

insert overwrite table part_employee_dept_pay  partition(dept)
select id,first_name,last_name,salary,dept
from part_employee_dept_pay_stg;


-Mufaddal

No comments:

Post a Comment