hive解析json字符串函数get_json_object采坑

1、一般字符串解析没问题

如:{"address":{"f1":"","f2":"","f3":36091591,"f4":120374680},"boxTotalPrice":2.0,"channel":"dianping_app_ios_iphone","dp_user_id":26769863,"poiCate":1000,"poi_first_order":true,"shipping_service":1020,"spNewUser":true}

获取poi_first_order值只需:

get_json_object(common_ext, '$.poi_first_order') as poi_first_order 

返回值:true

 

2、二班就有问题了

如:{"address":{"f1":"","f2":"","f3":36113011,"f4":120421781},"appendProduct":{"apPriceInfoMap":{4:{}}},"boxTotalPrice":4.0,"fbNewUser":true,"open_id":"","poiCate":1000,"poi_first_order":true,"shipping_service":1020,"spNewUser":true}

获取poi_first_order值:

    get_json_object(common_ext, '$.poi_first_order') as poi_first_order 
返回值:null

 

3、针对2的问题用java方式解析

代码块

String s = "{\"address\":{\"f1\":\"\",\"f2\":\"\",\"f3\":36130771,\"f4\":120365936},\"appendProduct\":{\"apPriceInfoMap\":{4:{}}},\"boxTotalPrice\":2.0,\"fbNewUser\":true,\"open_id\":\"\",\"poiCate\":1000,\"poi_first_order\":true,\"shipping_service\":1020,\"spNewUser\":true}";
JSONObject jo = JSONObject.parseObject(jsonStr);
return jo.getString(field);

返回值:true

 

4、原因分析

2中的json字符串:{"address":{"f1":"","f2":"","f3":36113011,"f4":120421781},"appendProduct":{"apPriceInfoMap":{4:{}}},"boxTotalPrice":4.0,"fbNewUser":true,"open_id":"","poiCate":1000,"poi_first_order":true,"shipping_service":1020,"spNewUser":true}

红色字体中包含key为4的情况。

经确认,在写入hive前,这块是一个是一个map结构.

 

5、解决办法

UDF

public String evaluate(String jsonStr, String field) {
        if (StringUtils.isEmpty(jsonStr) || StringUtils.isEmpty(field)) {
            return "null";
        }
​
        try {
            JSONObject jo = JSONObject.parseObject(jsonStr);
            return jo.getString(field);
        } catch (Exception e) {
            return "null";
        }
    }

 

你可能感兴趣的:(数据仓库)