Learning Apache Kafka

Official documentation:

Search keywords:

Kafka installation and configuration [simple version]

Reference: http://kafka.apache.org/documentation.html#quickstart

Kafka uses ZooKeeper, so you need to start a ZooKeeper server first if you don't already have one. You can use the convenience script packaged with Kafka to get a quick-and-dirty single-node ZooKeeper instance.

First change into the Kafka directory, then run:

> bin/zookeeper-server-start.sh config/zookeeper.properties  # start ZooKeeper

> bin/kafka-server-start.sh config/server.properties  # then start Kafka

Once both are running, try a simple test: create a topic, send messages with a producer, and read them with a consumer:

> bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test  # create a topic named "test"

> bin/kafka-topics.sh --list --zookeeper localhost:2181  # list the existing topics
test

> bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test    # start the producer (messages are typed at the console)
This is a message    # typed by hand
This is another message  # typed by hand

{

Earlier, running > bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test printed the following:

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".

SLF4J: Defaulting to no-operation (NOP) logger implementation

SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.

Following the link above, I assumed the problem was the Java version or the CLASSPATH setting. I switched to Oracle's Java and changed the CLASSPATH, but that did not fix it. A search in Chinese then turned up the solution right away:

Download slf4j-nop-1.5.jar (http://www.slf4j.org/download.html) and copy it into Kafka's libs directory.

}

> bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning    # start the consumer (reads messages from the beginning)
This is a message
This is another message

So far we have been running against a single broker, but that's no fun. For Kafka, a single broker is just a cluster of size one, so nothing much changes other than starting a few more broker instances. But just to get a feel for it, let's expand our cluster to three nodes (still all on our local machine).

First we make a config file for each of the brokers, starting from the default configuration:

> cp config/server.properties config/server-1.properties

> cp config/server.properties config/server-2.properties

Now edit these new files and set the following properties:

config/server-1.properties:
    broker.id=1
    port=9093
    log.dir=/tmp/kafka-logs-1

config/server-2.properties:
    broker.id=2
    port=9094
    log.dir=/tmp/kafka-logs-2

The broker.id property is the unique and permanent name of each node in the cluster. We have to override the port and log directory only because we are running these all on the same machine and we want to keep the brokers from all trying to register on the same port or overwrite each other's data.
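
The copy-and-edit steps above can also be scripted. The following is a minimal sketch, not part of the Kafka distribution: make_broker_config is a hypothetical helper, the port formula 9092 + id simply reproduces the 9093 and 9094 values above, and the demo runs against a stand-in base file in a temporary directory rather than a real config/server.properties.

```shell
# Hypothetical helper: write a per-broker config derived from a base file.
# Usage: make_broker_config BASE_FILE BROKER_ID OUT_FILE
make_broker_config() {
  base="$1"; id="$2"; out="$3"
  # Drop any existing values for the three keys we override...
  grep -v -E '^(broker\.id|port|log\.dir)=' "$base" > "$out"
  # ...then append the per-broker values used in this walkthrough.
  {
    echo "broker.id=${id}"
    echo "port=$((9092 + id))"
    echo "log.dir=/tmp/kafka-logs-${id}"
  } >> "$out"
}

# Demo against a stand-in base file (assumed contents, for illustration only).
workdir=$(mktemp -d)
printf 'broker.id=0\nport=9092\nlog.dir=/tmp/kafka-logs\nzookeeper.connect=localhost:2181\n' \
  > "$workdir/server.properties"

for id in 1 2; do
  make_broker_config "$workdir/server.properties" "$id" "$workdir/server-$id.properties"
done

cat "$workdir/server-1.properties"
```

Against a real installation, the base file would be config/server.properties and the output files config/server-1.properties and config/server-2.properties. Settings that are not overridden are copied through unchanged.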

We already have ZooKeeper and our single node started, so we just need to start the two new nodes:

> bin/kafka-server-start.sh config/server-1.properties &
...
> bin/kafka-server-start.sh config/server-2.properties &
...

Now create a new topic with a replication factor of three:

> bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic my-replicated-topic

Okay, but now that we have a cluster, how can we know which broker is doing what? To see that, run the "describe topics" command:

> bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic
Topic:my-replicated-topic      PartitionCount:1     ReplicationFactor:3       Configs:
    Topic: my-replicated-topic     Partition: 0     Leader: 1 Replicas: 1,2,0       Isr: 1,2,0

Here is an explanation of the output. The first line gives a summary of all the partitions; each additional line gives information about one partition. Since we have only one partition for this topic, there is only one line.

"leader" is the node responsible for all reads and writes for the given partition. Each node will be the leader for a randomly selected portion of the partitions.

"replicas" is the list of nodes that replicate the log for this partition, regardless of whether they are the leader or even whether they are currently alive.

"isr" is the set of "in-sync" replicas. This is the subset of the replicas list that is currently alive and caught up to the leader.

Let's publish a few messages to our new topic:

> bin/kafka-console-producer.sh --broker-list localhost:9092 --topic my-replicated-topic
...
my test message 1   # typed at the console
my test message 2   # typed at the console
^C

Now let's consume these messages:

> bin/kafka-console-consumer.sh --zookeeper localhost:2181 --from-beginning --topic my-replicated-topic
...
my test message 1
my test message 2
^C

Now let's test out fault tolerance. Broker 1 was acting as the leader, so let's kill it:

> ps | grep server-1.properties
7564 ttys002    0:15.91 /System/Library/Frameworks/JavaVM.framework/Versions/1.6/Home/bin/java...

> kill -9 7564

Leadership has switched to one of the followers (broker 2 in the output below), and node 1 is no longer in the in-sync replica set:

> bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic
Topic:my-replicated-topic      PartitionCount:1     ReplicationFactor:3       Configs:
    Topic: my-replicated-topic     Partition: 0     Leader: 2 Replicas: 1,2,0       Isr: 2,0
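
The partition line above can also be checked mechanically. The following is a rough sketch that parses the sample line shown here and reports which replicas are missing from the ISR. field_after is a hypothetical helper, and the parsing assumes the whitespace-separated "Key: value" layout of this particular output, which may differ in other Kafka versions.

```shell
# Sample partition line copied from the describe output above.
describe_line='    Topic: my-replicated-topic     Partition: 0     Leader: 2 Replicas: 1,2,0       Isr: 2,0'

# Hypothetical helper: print the field that follows a "Key:" label.
# Usage: field_after LABEL LINE
field_after() {
  printf '%s\n' "$2" | awk -v key="$1" '{
    for (i = 1; i < NF; i++) if ($i == key) { print $(i + 1); exit }
  }'
}

leader=$(field_after "Leader:" "$describe_line")
replicas=$(field_after "Replicas:" "$describe_line")
isr=$(field_after "Isr:" "$describe_line")

# Collect every replica id that does not appear in the ISR list.
out_of_sync=""
for r in $(printf '%s' "$replicas" | tr ',' ' '); do
  case ",$isr," in
    *",$r,"*) ;;                                            # replica is in sync
    *) out_of_sync="${out_of_sync:+$out_of_sync,}$r" ;;     # replica is missing
  esac
done

echo "leader=$leader replicas=$replicas isr=$isr out_of_sync=$out_of_sync"
```

For the line above, this reports broker 2 as leader and broker 1 as the only replica outside the ISR, matching the broker that was killed.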

But the messages are still available for consumption even though the leader that originally took the writes is down:

> bin/kafka-console-consumer.sh --zookeeper localhost:2181 --from-beginning --topic my-replicated-topic
...
my test message 1
my test message 2
^C

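Finally, a brief summary of the overall flow of a running Kafka system, in order:

1. Start the ZooKeeper server.
2. Start the Kafka server.
3. Create a topic (a topic is similar to a queue).
4. When a producer has data to send, it connects to a broker from its broker list (localhost:9092 in the commands above) and writes the data to the topic on the brokers.
5. When a consumer wants to read data, it first looks up the corresponding brokers through ZooKeeper (the --zookeeper option in the commands above), then consumes the messages.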

18 January 2015

admin

Other, Tools

Apache, Kafka