/    /  IMPORT/EXPORT

IMPORT/EXPORT

The EXPORT command exports the data of a table or partition, along with the metadata, into a specified output location. This output location can then be moved over to a different Hadoop or Hive instance and imported from there with the IMPORT command.

Export Syntax

EXPORT TABLE tablename [PARTITION (part_column="value"[, ...])]

  TO 'export_target_path' [ FOR replication('eventid') ]

Import Syntax

IMPORT [[EXTERNAL] TABLE new_or_original_tablename [PARTITION (part_column="value"[, ...])]]

  FROM 'source_path'

  [LOCATION 'import_target_path']

Examples

Simple export and import:

export table student to 'hdfs_exports_location/student';

import from 'hdfs_exports_location/student';

Rename table on import:

export table department to 'hdfs_exports_location/department';

import table imported_dept from 'hdfs_exports_location/department';

Export partition and import:

export table employee partition (emp_country="in", emp_state="ka") to 'hdfs_exports_location/employee';

import from 'hdfs_exports_location/employee';

Import as an external table:

export table department to 'hdfs_exports_location/department';

import external table department from 'hdfs_exports_location/department';

  • Other Commands

Select

Simple query. For example, the following query retrieves all columns and all rows from table emp.

SELECT * FROM emp ;

To get the current database, use the 

SELECT current_database()

WHERE ClauseTo get the current database, use the

SELECT * FROM sales WHERE id > 10 AND region = "India"

ALL and DISTINCT Clauses

The ALL and DISTINCT options specify whether duplicate rows should be returned. If none of these options are given, the default is ALL (all matching rows are returned). DISTINCT specifies removal of duplicate rows from the result set.

hive> SELECT id, name FROM emp;

    1 sai

    1 sai

    1 sri

    2 gopal

hive> SELECT DISTINCT id, name FROM emp;

    1 sai

    1 sri

    2 gopal

hive> SELECT DISTINCT id FROM emp;

    1

    2

Partition Based Queries

In general, a SELECT query scans the entire table (other than for sampling). If a table is created using the ‘Partitioned By’ clause, then it scan’s only a fraction of the table relevant to the partitions specified by the query. Hive currently does partition pruning if the partition predicates are specified in the WHERE clause.

SELECT emp.* FROM emp WHERE emp.date >= '2017-11-11' 

AND emp.date <= '2017-11-12'

Hive added support for the HAVING clause in version 0.7.0. In older versions of Hive it is possible to achieve the same effect by using a subquery, e.g:HAVING Clause

SELECT id FROM emp GROUP BY id HAVING SUM(marks) > 50;

SELECT * FROM customers LIMIT 5;Limit

Joins

Example,

SELECT a.* FROM a JOIN b ON (a.id = b.id)

SELECT a.* FROM a JOIN b ON (a.id = b.id AND a.department = b.department)

SELECT a.* FROM a LEFT OUTER JOIN b ON (a.id <> b.id

More than 2 tables can be joined in the same query e.g.

SELECT a.val, b.val, c.val FROM a JOIN b ON (a.key = b.key1) JOIN c ON (c.key = b.key2)

Tables can also be created and populated by the results of a query in one create-table-as-select (CTAS) statement. The table created by CTAS is atomic, meaning that the table is not seen by other users until all the query results are populated. So other users will either see the table with the complete results of the query or will not see the table at all.Create Table As Select (CTAS)

Example:

CREATE TABLE new_key_value_store

   ROW FORMAT SERDE "org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe"

   STORED AS RCFile

   AS

SELECT (key % 1024) new_key, key_value_pair

FROM key_value_store SORT BY new_key, key_value_pair;

The above CTAS statement creates the target table new_key_value_store with the schema (new_key DOUBLE, key_value_pair STRING) derived from the results of the SELECT statement. If the SELECT statement does not specify column aliases, the column names will be automatically assigned to _col0, _col1, and _col2 etc. In addition, the new target table is created using a specific SerDe and a storage format independent of the source tables in the SELECT statement.

Create Table Like

The LIKE form of CREATE TABLE allows you to copy an existing table definition exactly (without copying its data). In contrast to CTAS, the statement below creates a new empty_key_value_store table whose definition exactly matches the existing key_value_store in all particulars other than table name. The new table contains no rows.

CREATE TABLE empty_key_value_store LIKE key_value_store;

Drop Table

DROP TABLE [IF EXISTS] table_name [PURGE];

DROP TABLE removes metadata and data for this table. The data is actually moved to the .Trash/Current directory if Trash is configured (and PURGE is not specified). The metadata is completely lost.

Note

When dropping an EXTERNAL table, data in the table will NOT be deleted from the file system.