JSON Format Support in Hive
1. Download the hive-json-serde packages
Download json-serde-1.3.8-jar-with-dependencies.jar and json-udf-1.3.8-jar-with-dependencies.jar, then place them in the lib directories used by MapReduce and Spark, for example:
/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/lib/
Download: http://www.congiu.net/hive-json-serde/
For usage details, see: https://github.com/rcongiu/Hive-JSON-Serde
2. Test
2.1 Source data
Location: 192.168.1.16
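The copy step above can be sketched as a small shell loop. This is only an illustration: it assumes both jars were already downloaded into the current directory, and it uses the MapReduce lib path shown above. The Spark lib directory is not given in this note, so it is left out. The copy probably has to be repeated on every host that runs jobs.

```shell
# Assumption: both jars sit in the current working directory.
LIB_DIR="/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/lib"
COPIED=0

for jar in json-serde-1.3.8-jar-with-dependencies.jar \
           json-udf-1.3.8-jar-with-dependencies.jar; do
  if [ -f "$jar" ] && [ -d "$LIB_DIR" ]; then
    # Copy the jar into the MapReduce lib directory.
    cp "$jar" "$LIB_DIR/" && COPIED=$((COPIED + 1))
  else
    echo "skipped $jar (jar or $LIB_DIR not found)" >&2
  fi
done

echo "copied $COPIED jar(s)"
```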
/user/alg/vertex
/user/alg/edges
2.2 Data migration between clusters
192.168.1.16 -> 192.168.1.62
hadoop distcp /user/alg/vertex hdfs://192.168.1.62:8020/user/bdp/bigdata/
hadoop distcp /user/alg/edges hdfs://192.168.1.62:8020/user/bdp/bigdata/
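After the two distcp jobs finish, it may be worth confirming that both directories arrived on the target cluster. A minimal check, assuming an hdfs client is available on the host and can reach 192.168.1.62:

```shell
# Target directory taken from the distcp commands above.
TARGET="hdfs://192.168.1.62:8020/user/bdp/bigdata"
CHECKED=0

if command -v hdfs >/dev/null 2>&1; then
  for dir in vertex edges; do
    # List the copied directory; count it only if the listing succeeds.
    hdfs dfs -ls "$TARGET/$dir" && CHECKED=$((CHECKED + 1))
  done
else
  echo "hdfs client not found; skipping check" >&2
fi

echo "verified $CHECKED of 2 directories"
```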
2.3 Enable JSON support in Hive
Follow the steps described in section 1.
2.4 Create the tables
CREATE EXTERNAL TABLE bigdata_test.tv_user
(
`_key` string,
`name` string,
`rank` int
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION '/user/bdp/bigdata/vertex';

CREATE EXTERNAL TABLE bigdata_test.te_trade
(
`_key` string,
`_from` string,
`_to` string,
`trans_time` bigint,
`trans_amount` double
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION '/user/bdp/bigdata/edges';
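
The JSON SerDe reads one JSON object per line, with keys matching the column names. The sketch below writes two made-up records in the layout that tv_user expects (the values are hypothetical, not taken from the real data) and then, if a hive client is installed, runs a small query against the table created above.

```shell
# Hypothetical sample records: one JSON object per line, keys match the tv_user columns.
SAMPLE_FILE="$(mktemp)"
cat > "$SAMPLE_FILE" <<'EOF'
{"_key": "u1", "name": "alice", "rank": 3}
{"_key": "u2", "name": "bob", "rank": 7}
EOF

# Show the sample so the expected file layout is visible.
cat "$SAMPLE_FILE"

# Query the external table only when a hive client is present.
if command -v hive >/dev/null 2>&1; then
  hive -e 'SELECT `_key`, `name`, `rank` FROM bigdata_test.tv_user LIMIT 5;' \
    || echo "query failed; check that the SerDe jar is on the classpath" >&2
else
  echo "hive client not found; skipping query" >&2
fi
```

Column names that start with an underscore, such as `_key`, need backticks in the query, just as they do in the CREATE TABLE statements.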