First start Hadoop, ZooKeeper, and Kafka.
Startup commands:
Hadoop: sbin/start-all.sh
ZooKeeper: ./bin/zkServer.sh start (must be run on every machine)
Kafka: bin/kafka-server-start.sh config/server.properties (must be run on every machine)
All of the steps below assume that Hadoop, ZooKeeper, and Kafka are already installed, configured, and running.
1. Install kafka-1.3.5.tar.gz
Download kafka-1.3.5.tar.gz from pypi.python.org.
Extract it: tar xvzf kafka-1.3.5.tar.gz
Enter the kafka-1.3.5 directory and install it: sudo python setup.py install
The installation may fail with errors about missing dependency packages; download and install those as needed.
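Alternatively, on machines where pip is available, the same library can usually be installed straight from PyPI (the package is published there as kafka-python; the version pin below matches the tarball above):

```shell
pip install kafka-python==1.3.5
```

This pulls in any dependencies automatically, avoiding the manual dependency chase mentioned above.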
2. After the installation, create a topic:
hadoop@master:~/local/kafka_2.10-0.10.1.0/bin$ ./kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic world
Once created, the existing topics can be listed with:
bin/kafka-topics.sh --list --zookeeper localhost:2181
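To inspect a topic's partition count and replica assignment, --describe can be used (same ZooKeeper address as above):

```shell
bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic world
```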
3. Write test code
(1) Producer
Suppose we want to monitor the files under /home/hadoop/local/jdk1.8.0_191/lib/missioncontrol/features; we can run:
python file_monitor.py /home/hadoop/local/jdk1.8.0_191/lib/missioncontrol/features
file_monitor.py:
# -*- coding: utf-8 -*-
from kafka import KafkaProducer
import os
import time
from sys import argv

producer = KafkaProducer(bootstrap_servers='192.168.120.11:9092')

def log(msg):
    t = time.strftime(r"%Y-%m-%d_%H-%M-%S", time.localtime())
    print("[%s]%s" % (t, msg))

def list_file(path):
    # send every file name under path to the 'world' topic
    dir_list = os.listdir(path)
    for f in dir_list:
        producer.send('world', f)
        producer.flush()
        log('send: %s' % f)

list_file(argv[1])
producer.close()
log('done')
The output looks like this:
hadoop@master:~$ python file_monitor.py /home/hadoop/local/jdk1.8.0_191/lib/missioncontrol/features
[2019-04-04_00-04-09]send: org.eclipse.emf.common_2.10.1.v20140901-1043
[2019-04-04_00-04-09]send: com.jrockit.mc.feature.core_5.5.2.174165
[2019-04-04_00-04-09]send: com.jrockit.mc.feature.rcp.ja_5.5.2.174165
[2019-04-04_00-04-09]send: com.jrockit.mc.feature.rcp.zh_CN_5.5.2.174165
[2019-04-04_00-04-09]send: org.eclipse.equinox.p2.rcp.feature_1.2.0.v20140523-0116
[2019-04-04_00-04-09]send: org.eclipse.ecf.core.feature_1.1.0.v20140827-1444
[2019-04-04_00-04-09]send: com.jrockit.mc.rcp.product_5.5.2.174165
[2019-04-04_00-04-09]send: org.eclipse.ecf.filetransfer.httpclient4.ssl.feature_1.0.0.v20140827-1444
[2019-04-04_00-04-09]send: com.jrockit.mc.feature.flightrecorder_5.5.2.174165
[2019-04-04_00-04-09]send: org.eclipse.equinox.p2.core.feature_1.3.0.v20140523-0116
[2019-04-04_00-04-09]send: org.eclipse.babel.nls_eclipse_zh_4.4.0.v20140623020002
[2019-04-04_00-04-09]send: org.eclipse.emf.ecore_2.10.1.v20140901-1043
[2019-04-04_00-04-09]send: org.eclipse.rcp_4.4.0.v20141007-2301
[2019-04-04_00-04-09]send: org.eclipse.help_2.0.102.v20141007-2301
[2019-04-04_00-04-09]send: org.eclipse.ecf.filetransfer.feature_3.9.0.v20140827-1444
[2019-04-04_00-04-09]send: org.eclipse.babel.nls_eclipse_ja_4.4.0.v20140623020002
[2019-04-04_00-04-09]send: org.eclipse.ecf.filetransfer.ssl.feature_1.0.0.v20140827-1444
[2019-04-04_00-04-09]send: org.eclipse.ecf.core.ssl.feature_1.0.0.v20140827-1444
[2019-04-04_00-04-09]send: org.eclipse.ecf.filetransfer.httpclient4.feature_3.9.1.v20140827-1444
[2019-04-04_00-04-09]send: com.jrockit.mc.feature.console_5.5.2.174165
[2019-04-04_00-04-09]send: com.jrockit.mc.feature.rcp_5.5.2.174165
[2019-04-04_00-04-09]send: org.eclipse.e4.rcp_1.3.100.v20141007-2033
[2019-04-04_00-04-09]done
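One caveat: the producer script runs as-is under Python 2, where str is already bytes. Under Python 3, producer.send('world', f) would fail because kafka-python expects bytes values. A minimal sketch of a serializer that handles both cases (to_bytes is an illustrative name, not part of kafka-python):

```python
def to_bytes(value, encoding='utf-8'):
    # pass bytes through unchanged; encode str (assumption: UTF-8 file names)
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)

# kafka-python's KafkaProducer accepts a value_serializer, so the producer
# could be built as (address taken from the script above):
# producer = KafkaProducer(bootstrap_servers='192.168.120.11:9092',
#                          value_serializer=to_bytes)
print(to_bytes('hello'))  # b'hello'
```

With a value_serializer in place, the rest of the script needs no changes.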
(2) Consumer
con.py:
# -*- coding: utf-8 -*-
from kafka import KafkaConsumer
import time

def log(msg):
    t = time.strftime(r"%Y-%m-%d_%H-%M-%S", time.localtime())
    print("[%s]%s" % (t, msg))

log('start consumer')
consumer = KafkaConsumer('world', group_id='test-consumer-group',
                         bootstrap_servers=['192.168.44.130:9092'])
for msg in consumer:
    recv = "%s:%d:%d: key=%s value=%s" % (msg.topic, msg.partition, msg.offset, msg.key, msg.value)
    log(recv)
consumer.close()
In the code above, world is the topic name and must match the topic you created.
test-consumer-group is the Kafka consumer group id and must match the group.id configured in kafka_2.10-0.10.1.0/config/consumer.properties.
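Each message yielded by KafkaConsumer is a ConsumerRecord exposing topic, partition, offset, key, and value fields. The format string used in the loop above can be exercised against a stand-in record (the namedtuple below is a hypothetical stand-in for illustration, not the real class):

```python
from collections import namedtuple

# hypothetical stand-in for kafka-python's ConsumerRecord (fields trimmed)
Record = namedtuple('Record', 'topic partition offset key value')

def format_record(msg):
    # same format string as the consumer loop above
    return "%s:%d:%d: key=%s value=%s" % (
        msg.topic, msg.partition, msg.offset, msg.key, msg.value)

r = Record('world', 0, 291, None, b'org.eclipse.emf.common_2.10.1.v20140901-1043')
print(format_record(r))
```

Under Python 3 this prints the value with a b'...' prefix, which is exactly what appears in the consumer output below.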
The output:
hadoop@master:~$ python con.py
[2019-04-04_00-06-21]start consumer
[2019-04-04_00-06-21]world:0:291: key=None value=b'org.eclipse.emf.common_2.10.1.v20140901-1043'
[2019-04-04_00-06-21]world:0:292: key=None value=b'com.jrockit.mc.feature.core_5.5.2.174165'
[2019-04-04_00-06-21]world:0:293: key=None value=b'com.jrockit.mc.feature.rcp.ja_5.5.2.174165'
[2019-04-04_00-06-21]world:0:294: key=None value=b'com.jrockit.mc.feature.rcp.zh_CN_5.5.2.174165'
[2019-04-04_00-06-21]world:0:295: key=None value=b'org.eclipse.equinox.p2.rcp.feature_1.2.0.v20140523-0116'
[2019-04-04_00-06-21]world:0:296: key=None value=b'org.eclipse.ecf.core.feature_1.1.0.v20140827-1444'
[2019-04-04_00-06-21]world:0:297: key=None value=b'com.jrockit.mc.rcp.product_5.5.2.174165'
[2019-04-04_00-06-21]world:0:298: key=None value=b'org.eclipse.ecf.filetransfer.httpclient4.ssl.feature_1.0.0.v20140827-1444'
[2019-04-04_00-06-21]world:0:299: key=None value=b'com.jrockit.mc.feature.flightrecorder_5.5.2.174165'
[2019-04-04_00-06-21]world:0:300: key=None value=b'org.eclipse.equinox.p2.core.feature_1.3.0.v20140523-0116'
[2019-04-04_00-06-21]world:0:301: key=None value=b'org.eclipse.babel.nls_eclipse_zh_4.4.0.v20140623020002'
[2019-04-04_00-06-21]world:0:302: key=None value=b'org.eclipse.emf.ecore_2.10.1.v20140901-1043'
[2019-04-04_00-06-21]world:0:303: key=None value=b'org.eclipse.rcp_4.4.0.v20141007-2301'
[2019-04-04_00-06-21]world:0:304: key=None value=b'org.eclipse.help_2.0.102.v20141007-2301'
[2019-04-04_00-06-21]world:0:305: key=None value=b'org.eclipse.ecf.filetransfer.feature_3.9.0.v20140827-1444'
[2019-04-04_00-06-21]world:0:306: key=None value=b'org.eclipse.babel.nls_eclipse_ja_4.4.0.v20140623020002'
[2019-04-04_00-06-21]world:0:307: key=None value=b'org.eclipse.ecf.filetransfer.ssl.feature_1.0.0.v20140827-1444'
[2019-04-04_00-06-21]world:0:308: key=None value=b'org.eclipse.ecf.core.ssl.feature_1.0.0.v20140827-1444'
[2019-04-04_00-06-21]world:0:309: key=None value=b'org.eclipse.ecf.filetransfer.httpclient4.feature_3.9.1.v20140827-1444'
[2019-04-04_00-06-21]world:0:310: key=None value=b'com.jrockit.mc.feature.console_5.5.2.174165'
[2019-04-04_00-06-21]world:0:311: key=None value=b'com.jrockit.mc.feature.rcp_5.5.2.174165'
[2019-04-04_00-06-21]world:0:312: key=None value=b'org.eclipse.e4.rcp_1.3.100.v20141007-2033'
References: Kafka study notes (2): operating Kafka with Python
Kafka hands-on tutorial (operating Kafka with Python), with Kafka configuration files explained