Integrating DataX with Java: writing the JSON for Hive-to-MySQL data synchronization

Pick a test table in Hive.
(Screenshot: the test table in Hive)
Create an identical table in the test MySQL database.
(Screenshot: the target table in MySQL)
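Since the screenshot did not survive, here is a minimal sketch of creating the target table over JDBC. The column names come from the writer's column list further down; every type here is a guess (the reader treats all Hive columns as string), so adjust them to match the real Hive schema.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class CreateTargetTable {
    public static void main(String[] args) throws Exception {
        // Hypothetical DDL: the column names are taken from the job's writer config,
        // but the VARCHAR types are assumptions -- match them to the Hive table.
        String ddl = "CREATE TABLE IF NOT EXISTS hive2mysqltest ("
                + "id VARCHAR(64), name VARCHAR(255), age VARCHAR(16), "
                + "conent VARCHAR(1024), city VARCHAR(255), "
                + "transform_bd_sp_key VARCHAR(255), "
                + "transform_bd_sp_key_pk VARCHAR(255), "
                + "transform_bd_sp_time VARCHAR(64))";
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://192.168.2.52:3306/test?useUnicode=true&characterEncoding=gbk",
                "your_username", "your_password");
             Statement stmt = conn.createStatement()) {
            stmt.execute(ddl);
        }
    }
}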
Write the job JSON:
{
    "job": {
        "setting": {
            "speed": {
                "channel": 3
            }
        },
        "content": [
            {
                "reader": {
                    "name": "hdfsreader",
                    "parameter": {
                        "path": "/metastore/table/spider_test/region_test23/*",
                        "defaultFS": "hdfs://ip:port",
                        "column": [
                            { "index": 0, "type": "string" },
                            { "index": 1, "type": "string" },
                            { "index": 2, "type": "string" },
                            { "index": 3, "type": "string" },
                            { "index": 4, "type": "string" },
                            { "index": 5, "type": "string" },
                            { "index": 6, "type": "string" },
                            { "index": 7, "type": "string" }
                        ],
                        "fileType": "text",
                        "encoding": "UTF-8",
                        "fieldDelimiter": "\u0001"
                    }
                },
                "writer": {
                    "name": "mysqlwriter",
                    "parameter": {
                        "writeMode": "insert",
                        "username": "your_username",
                        "password": "your_password",
                        "column": [
                            "id",
                            "name",
                            "age",
                            "conent",
                            "city",
                            "transform_bd_sp_key",
                            "transform_bd_sp_key_pk",
                            "transform_bd_sp_time"
                        ],
                        "session": [
                            "set session sql_mode='ANSI'"
                        ],
                        "preSql": [
                            "delete from hive2mysqltest"
                        ],
                        "connection": [
                            {
                                "jdbcUrl": "jdbc:mysql://192.168.2.52:3306/test?useUnicode=true&characterEncoding=gbk",
                                "table": [
                                    "hive2mysqltest"
                                ]
                            }
                        ]
                    }
                }
            }
        ]
    }
}

A few notes, moved out of the config because JSON does not allow inline comments: fieldDelimiter is the column separator, and \u0001 is Hive's default for text tables; writeMode selects the insert style and may be insert, update, replace, etc.; preSql is SQL executed before the write starts (here it empties the target table), and a postSql list can likewise be configured to run afterwards.
For the exact JSON format, see the DataX documentation on GitHub.
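Since the whole point is driving DataX from Java, here is a minimal sketch of launching the job above in-process through DataX's Engine entry point. The paths are assumptions: it presumes a normal DataX installation, with datax-core and the plugin directories laid out as they are when running bin/datax.py.

import com.alibaba.datax.core.Engine;

public class Hive2MysqlJob {
    public static void main(String[] args) throws Throwable {
        // Engine resolves its conf and plugin directories from the
        // datax.home system property (assumed install path).
        System.setProperty("datax.home", "/opt/datax");
        // Path to the job JSON above -- adjust to your environment.
        String jobPath = "/opt/datax/job/hive2mysql.json";
        // Same arguments that bin/datax.py passes when launching a job.
        Engine.entry(new String[]{"-mode", "standalone", "-jobid", "-1", "-job", jobPath});
    }
}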
Running a test produces the following error:
java.lang.NoSuchMethodError: org.apache.hadoop.tracing.SpanReceiverHost.get(Lorg/apache/hadoop/conf/Configuration;Ljava/lang/String;)Lorg/apache/hadoop/tracing/SpanReceiverHost;
at org.apache.hadoop.hdfs.DFSClient.&lt;init&gt;(DFSClient.java:634)
at org.apache.hadoop.hdfs.DFSClient.&lt;init&gt;(DFSClient.java:619)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2596)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:169)
at com.alibaba.datax.plugin.reader.hdfsreader.DFSUtil.getHDFSAllFiles(DFSUtil.java:123)
at com.alibaba.datax.plugin.reader.hdfsreader.DFSUtil.getAllFiles(DFSUtil.java:112)
at com.alibaba.datax.plugin.reader.hdfsreader.HdfsReader$Job.prepare(HdfsReader.java:169)
at com.alibaba.datax.core.job.JobContainer.prepareJobReader(JobContainer.java:715)
at com.alibaba.datax.core.job.JobContainer.prepare(JobContainer.java:308)
at com.alibaba.datax.core.job.JobContainer.start(JobContainer.java:115)

This ran into an awkward, headache-inducing problem: a Hadoop version incompatibility. The cluster runs Hadoop 2.6, while DataX depends on 2.7.5. Looking at GitHub, DataX's earliest updates date to 2018, by which point hardly anyone would still have been deploying Hadoop 2.6, so it is fair to conclude that DataX simply does not support a Hadoop release that old. The only option is to pull down the DataX source and try to adapt the hdfsreader logic myself.
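One quick way to confirm this kind of mismatch is to print the Hadoop version that actually ends up on DataX's classpath and compare it with the cluster's. A minimal sketch, assuming hadoop-common is on the classpath:

import org.apache.hadoop.util.VersionInfo;

public class PrintHadoopVersion {
    public static void main(String[] args) {
        // Reports the version of the hadoop-common jar found on the classpath.
        // A NoSuchMethodError like the one above is a classic symptom of
        // hadoop-common and hadoop-hdfs jars coming from different releases.
        System.out.println("Hadoop client version: " + VersionInfo.getVersion());
    }
}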
Open-source projects like this are bound to carry plenty of bugs of their own, which is enough to give you a headache.
The JSON itself, though, is sound.
