LOADING FILES INTO TABLES
Loading files into tables:
Hive does not do any transformation while loading data into tables. Load operations are currently pure copy/move operations that move datafiles into locations corresponding to Hive tables.
LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)]
Here filepath can be:
- a relative path, such as input/data1
- an absolute path, such as /user/hive/input/data1
- a full URI with scheme and (optionally) an authority, such as hdfs://namenode:9000/user/hive/input/data1
- The target being loaded to can be a table or a partition. If the table is partitioned, then one must specify a specific partition of the table by specifying values for all of the partitioning columns.
Here filepath cannot contain subdirectories.
Inserting data into Hive Tables from queries:
INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...) [IF NOT EXISTS]] select_statement1 FROM from_statement; INSERT INTO TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)] select_statement1 FROM from_statement;
INSERT OVERWRITE will overwrite any existing data in the table or partition unless IF NOT EXISTS is provided for a partition.
INSERT INTO will append to the table or partition, keeping the existing data intact. As of Hive 0.13.0, a table can be made immutable by creating it with TBLPROPERTIES(“immutable”=” true”). The default is “immutable”=”false”.
INSERT INTO behavior into an immutable table is disallowed if any data is already present, although INSERT INTO still works if the immutable table is empty. The behavior of INSERT OVERWRITE is not affected by the “immutable” table property.
Inserting values into tables from SQL:
INSERT INTO TABLE students VALUES (1,”sri”), (2,”shashi”);
UPDATE tablename SET column = value [, column = value ...] [WHERE expression]
DELETE FROM tablename [WHERE expression]