IMPORT/EXPORT
The EXPORT command exports the data of a table or partition, along with the metadata, into a specified output location. This output location can then be moved over to a different Hadoop or Hive instance and imported from there with the IMPORT command.
Export Syntax
EXPORT TABLE tablename [PARTITION (part_column="value"[, ...])] TO 'export_target_path' [ FOR replication('eventid') ]
Import Syntax
IMPORT [[EXTERNAL] TABLE new_or_original_tablename [PARTITION (part_column="value"[, ...])]] FROM 'source_path' [LOCATION 'import_target_path']
Examples
Simple export and import:
export table student to 'hdfs_exports_location/student'; import from 'hdfs_exports_location/student';
Rename table on import:
export table department to 'hdfs_exports_location/department'; import table imported_dept from 'hdfs_exports_location/department';
Export partition and import:
export table employee partition (emp_country="in", emp_state="ka") to 'hdfs_exports_location/employee'; import from 'hdfs_exports_location/employee';
Import as an external table:
export table department to 'hdfs_exports_location/department'; import external table department from 'hdfs_exports_location/department';
- Other Commands
Select
Simple query. For example, the following query retrieves all columns and all rows from table emp.
SELECT * FROM emp ;
To get the current database, use the
SELECT current_database()
WHERE ClauseTo get the current database, use the
SELECT * FROM sales WHERE id > 10 AND region = "India"
ALL and DISTINCT Clauses
The ALL and DISTINCT options specify whether duplicate rows should be returned. If none of these options are given, the default is ALL (all matching rows are returned). DISTINCT specifies removal of duplicate rows from the result set.
hive> SELECT id, name FROM emp; 1 sai 1 sai 1 sri 2 gopal hive> SELECT DISTINCT id, name FROM emp; 1 sai 1 sri 2 gopal hive> SELECT DISTINCT id FROM emp; 1 2
Partition Based Queries
In general, a SELECT query scans the entire table (other than for sampling). If a table is created using the ‘Partitioned By’ clause, then it scan’s only a fraction of the table relevant to the partitions specified by the query. Hive currently does partition pruning if the partition predicates are specified in the WHERE clause.
SELECT emp.* FROM emp WHERE emp.date >= '2017-11-11' AND emp.date <= '2017-11-12'
Hive added support for the HAVING clause in version 0.7.0. In older versions of Hive it is possible to achieve the same effect by using a subquery, e.g:HAVING Clause
SELECT id FROM emp GROUP BY id HAVING SUM(marks) > 50;
SELECT * FROM customers LIMIT 5;Limit
Joins
Example,
SELECT a.* FROM a JOIN b ON (a.id = b.id) SELECT a.* FROM a JOIN b ON (a.id = b.id AND a.department = b.department) SELECT a.* FROM a LEFT OUTER JOIN b ON (a.id <> b.id
More than 2 tables can be joined in the same query e.g.
SELECT a.val, b.val, c.val FROM a JOIN b ON (a.key = b.key1) JOIN c ON (c.key = b.key2)
Tables can also be created and populated by the results of a query in one create-table-as-select (CTAS) statement. The table created by CTAS is atomic, meaning that the table is not seen by other users until all the query results are populated. So other users will either see the table with the complete results of the query or will not see the table at all.Create Table As Select (CTAS)
Example:
CREATE TABLE new_key_value_store ROW FORMAT SERDE "org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe" STORED AS RCFile AS SELECT (key % 1024) new_key, key_value_pair FROM key_value_store SORT BY new_key, key_value_pair; |
The above CTAS statement creates the target table new_key_value_store with the schema (new_key DOUBLE, key_value_pair STRING) derived from the results of the SELECT statement. If the SELECT statement does not specify column aliases, the column names will be automatically assigned to _col0, _col1, and _col2 etc. In addition, the new target table is created using a specific SerDe and a storage format independent of the source tables in the SELECT statement.
Create Table Like
The LIKE form of CREATE TABLE allows you to copy an existing table definition exactly (without copying its data). In contrast to CTAS, the statement below creates a new empty_key_value_store table whose definition exactly matches the existing key_value_store in all particulars other than table name. The new table contains no rows.
CREATE TABLE empty_key_value_store LIKE key_value_store;
Drop Table
DROP TABLE [IF EXISTS] table_name [PURGE];
DROP TABLE removes metadata and data for this table. The data is actually moved to the .Trash/Current directory if Trash is configured (and PURGE is not specified). The metadata is completely lost.
Note–
When dropping an EXTERNAL table, data in the table will NOT be deleted from the file system.