> 文章列表 > 使用datax 把数据从oracle导出到elasticsearch中

使用datax 把数据从oracle导出到elasticsearch中

使用datax 把数据从oracle导出到elasticsearch中

https://blog.csdn.net/mynameisjinxiaokai/article/details/128716467

一、datax需要python环境,需要先安装python
打开官网 https://www.python.org/downloads/
此处下载2.6.5版本安装
添加环境变量 path   C:\\Python27\\
安装完成后 cmd  使用   python -V 查看是否已安装成功

datax 下载后解压至本地某个目录,进入bin目录,即可运行同步作业
E:\\ruanjian\\JAVA\\datax\\datax     , YOUR_JOB.json 因为datax会自动从github拉取对应插件的json的模版给我们
例子
$ cd  {YOUR_DATAX_HOME}/bin
$ python datax.py {YOUR_JOB.json}    
本地
cd E:\\ruanjian\\JAVA\\datax\\datax\\bin
python E:\\ruanjian\\JAVA\\datax\\datax\\bin\\datax.py  E:\\ruanjian\\JAVA\\datax\\datax\\bin\\local-oracle-test-json.txt
备注,1\\   json文件需要检查 json格式是否正确
2  执行输出乱码,临时解决,  输入   chcp 65001   
长久解决,  https://cloud.tencent.com/developer/article/2084809   修改window编码
3   写入模式(writeMode)配置错误. 因为Oracle不支持配置项 writeMode: xxx, Oracle只能使用insert sql 插入数据  ,所以删除 writeMode 这行配置
4\\   datax :java.sql.SQLException: No suitable driver found for ["jdbc:oracle:thin:@localhost:1521:orcl"] 
jobjson的writer中的jdbcUrl 不能使用[]扩起来,因为写入的数据库只能有一个

执行转出到 es 命令
python E:\\ruanjian\\JAVA\\datax\\datax\\bin\\datax.py  E:\\ruanjian\\JAVA\\datax\\datax\\bin\\oracleToES-json.txt

{"job": {"content": [{"reader": {"name": "oraclereader","parameter": {"username": "test1","password": "Xia_12345","column": ["NAME","SEX","SCHOOL"],"connection": [{"table": ["DATAX1"],"jdbcUrl": ["jdbc:oracle:thin:@localhost:1521:orcl"]}]}},"writer": {"name": "oraclewriter","parameter": {"username": "test1","password": "Xia_12345","column": ["NAME","SEX","SCHOOL"],"preSql": ["delete from DATAX2"],"connection": [{"table": ["DATAX2"],"jdbcUrl": "jdbc:oracle:thin:@localhost:1521:orcl",}]}}}],"setting": {"speed": {"channel": 1},"errorLimit": {"record": 10,"percentage": 0.05}}}
}
{"job": {"content": [{"reader": {"name": "oraclereader","parameter": {"username": "test1","password": "Xia_12345","column": ["NAME","SEX","SCHOOL"],"connection": [{"table": ["DATAX1"],"jdbcUrl": ["jdbc:oracle:thin:@localhost:1521:orcl"]}]}},"writer": {"name": "elasticsearchwriter","parameter": {"endpoint": "http://127.0.0.1:9200","index": "myindex","type": "data","cleanup": true, #true表示插入前清空,即覆盖同步;false则追加同步"dynamic": true,  #这里一定要指定为true,否则使用的是datax的模板(就是下面定义的字段类型),而不会使用es的模板"settings": {"index" :{"number_of_shards": 2, "number_of_replicas": 1}},            "batchSize": 10000,"splitter": ",","column": [{"name": "NAME", "type": "text"},{ "name": "SEX", "type": "text"},{ "name": "SCHOOL", "type": "text"}                 ]}}}],"setting": {"speed": {"channel": 1},"errorLimit": {"record": 10,"percentage": 0.05}}}
}