NingG

Flume, Kafka, and Storm: A Summary

Flume

Reliability and Recoverability

Reliability

The events are staged in a channel on each agent. The events are then delivered to the next agent or terminal repository (like HDFS) in the flow. The events are removed from a channel only after they are stored in the channel of the next agent or in the terminal repository. This is how the single-hop message delivery semantics in Flume provide end-to-end reliability of the flow. (Single-hop message delivery semantics: an event is deleted from a channel only after it has been successfully delivered to the next hop.)

Flume uses a transactional approach to guarantee the reliable delivery of the events. Sources store events into a channel, and sinks retrieve events from it, inside transactions provided by the channel. This ensures that each set of events is reliably passed from point to point in the flow. In the case of a multi-hop flow, the sink from the previous hop and the source from the next hop both have their transactions running to ensure that the data is safely stored in the channel of the next hop. (Multi-hop: a flow in which events pass through more than one agent before reaching the terminal repository.)
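To make the hand-off between two hops concrete, here is a minimal sketch of the idea in Python. It is a toy model, not the Flume API: `ToyChannel`, `ToyTransaction`, and `forward_one_hop` are hypothetical names invented for illustration, and the only thing it tries to show is the ordering rule (commit downstream first, remove upstream second, roll back both on failure).

```python
from collections import deque


class ToyChannel:
    """Hypothetical in-memory stand-in for a channel (not the Flume API)."""

    def __init__(self):
        self.queue = deque()

    def begin(self):
        return ToyTransaction(self)


class ToyTransaction:
    """Buffers puts and takes until commit; rollback undoes them."""

    def __init__(self, channel):
        self.channel = channel
        self.puts = []   # events staged by a source, not yet visible
        self.takes = []  # events handed to a sink, not yet removed for good

    def put(self, event):
        self.puts.append(event)

    def take(self):
        if not self.channel.queue:
            return None
        event = self.channel.queue.popleft()
        self.takes.append(event)
        return event

    def commit(self):
        # Staged puts become visible; taken events are now gone for good.
        self.channel.queue.extend(self.puts)
        self.puts, self.takes = [], []

    def rollback(self):
        # Taken events go back to the head of the queue in their original order.
        self.channel.queue.extendleft(reversed(self.takes))
        self.puts, self.takes = [], []


def forward_one_hop(upstream, downstream, fail=False):
    """The sink of one hop takes a batch; the source of the next hop stores it."""
    take_tx = upstream.begin()
    put_tx = downstream.begin()
    try:
        while True:
            event = take_tx.take()
            if event is None:
                break
            put_tx.put(event)
        if fail:
            raise IOError("simulated failure between hops")
        put_tx.commit()   # events are stored in the next channel first
        take_tx.commit()  # only then are they removed upstream
        return True
    except IOError:
        put_tx.rollback()
        take_tx.rollback()
        return False


# Usage: a failed hop leaves the events upstream, a retry moves them.
a, b = ToyChannel(), ToyChannel()
tx = a.begin()
tx.put("e1")
tx.put("e2")
tx.commit()
forward_one_hop(a, b, fail=True)  # a still holds e1, e2; b stays empty
forward_one_hop(a, b)             # now b holds e1, e2 and a is empty
```

The point of the ordering is that a failure at any step leaves the events in at least one channel, which is the property the quoted paragraph describes.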

notes(ningg): How does Flume guarantee these transactional operations? I did not understand this part.

Recoverability

The events are staged in the channel, which manages recovery from failure. Flume supports a durable file channel which is backed by the local file system. There’s also a memory channel which simply stores the events in an in-memory queue, which is faster but any events still left in the memory channel when an agent process dies can’t be recovered. (A channel must be able to recover its events after a crash. The durable file channel persists events on the local file system, so they survive a restart. The memory channel keeps events in an in-memory queue, which is faster, but any events still in that queue when the agent process dies are lost for good.)
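The trade-off between the two channel types can be illustrated with a small Python sketch. This is a toy model under my own assumptions, not how Flume's file channel is actually implemented: `MemoryQueueChannel` and `FileLogChannel` are hypothetical classes, and the durable one simply appends every put and take to a log file and replays that log on startup.

```python
import json
import os
import tempfile
from collections import deque


class MemoryQueueChannel:
    """Events live only in process memory; a crash loses them."""

    def __init__(self):
        self.queue = deque()

    def put(self, event):
        self.queue.append(event)

    def take(self):
        return self.queue.popleft() if self.queue else None


class FileLogChannel:
    """Toy durable channel: every put/take is appended to a log on local disk."""

    def __init__(self, path):
        self.path = path
        self.queue = deque()
        self._replay()  # rebuild the queue from whatever survived on disk
        self.log = open(path, "a", encoding="utf-8")

    def _replay(self):
        if not os.path.exists(self.path):
            return
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                record = json.loads(line)
                if record["op"] == "put":
                    self.queue.append(record["event"])
                elif self.queue:
                    self.queue.popleft()

    def _append(self, record):
        self.log.write(json.dumps(record) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())  # force the record onto disk

    def put(self, event):
        self._append({"op": "put", "event": event})
        self.queue.append(event)

    def take(self):
        if not self.queue:
            return None
        self._append({"op": "take"})
        return self.queue.popleft()


# Usage: simulate an agent crash by building a fresh channel object.
log_path = os.path.join(tempfile.mkdtemp(), "channel.log")
durable = FileLogChannel(log_path)
for e in ["e1", "e2", "e3"]:
    durable.put(e)
durable.take()                            # e1 is delivered
after_restart = FileLogChannel(log_path)  # still holds e2 and e3

volatile = MemoryQueueChannel()
volatile.put("e1")
volatile = MemoryQueueChannel()           # "restart": e1 is gone
```

The extra `fsync` on every operation is also why the durable variant is slower, which matches the speed versus safety trade-off described above.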

Kafka

(TODO List)

(Scalability and reliability of a Kafka cluster)

Storm

(TODO List)

(Scalability and reliability of a Storm cluster)

Performance Testing the Flume/Kafka/Storm Framework

Briefly, the goal of performance testing is to find out the capacity of the whole framework: how much data traffic it can actually handle.
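As a starting point for measuring capacity, a rough throughput probe could look like the sketch below. Everything here is an assumption for illustration: `measure_throughput` is a hypothetical helper, and `send` stands in for whatever client call actually writes an event into the pipeline under test.

```python
import time


def measure_throughput(send, events):
    """Push each event (bytes) through `send` and report events/s and MB/s."""
    total_bytes = 0
    start = time.perf_counter()
    for event in events:
        send(event)
        total_bytes += len(event)
    elapsed = time.perf_counter() - start
    count = len(events)
    return {
        "events": count,
        "bytes": total_bytes,
        "seconds": elapsed,
        "events_per_sec": count / elapsed if elapsed > 0 else float("inf"),
        "mb_per_sec": total_bytes / elapsed / 1e6 if elapsed > 0 else float("inf"),
    }


# Usage: a list's append method stands in for the real pipeline client.
received = []
payload = [b"x" * 200] * 10000  # 10,000 events of 200 bytes each
result = measure_throughput(received.append, payload)
print(result["events_per_sec"], result["mb_per_sec"])
```

A probe like this only measures the sending side. To find the bottleneck, the same numbers would need to be collected at each stage (Flume, Kafka, Storm) while the load is increased step by step.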

Preliminary Questions

Several questions about setting up the test environment:

Steps to Set Up the Test Environment

List of Test Plans

ZooKeeper Cluster

The ZooKeeper currently in use is the one bundled with CDH:

notes(ningg): What are the basic principles behind a ZooKeeper cluster, and how can its performance be monitored?
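One possible answer to the monitoring question is ZooKeeper's four-letter-word commands, such as `mntr`, which a server answers over its client port with one tab-separated key/value pair per line. The sketch below is only a sketch: the helper names are hypothetical, the host and port in the usage comment are assumptions, and newer ZooKeeper releases may require the command to be whitelisted on the server before it responds.

```python
import socket


def four_letter_word(host, port, command, timeout=5.0):
    """Send a ZooKeeper four-letter command and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mntr(text):
    """Turn `mntr` output (one key<TAB>value pair per line) into a dict."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if sep:
            stats[key.strip()] = value.strip()
    return stats


# Usage (assumes a ZooKeeper server on localhost:2181, so it is left commented out):
# stats = parse_mntr(four_letter_word("localhost", 2181, "mntr"))
# print(stats.get("zk_avg_latency"), stats.get("zk_outstanding_requests"))
```

Polling values such as average latency and outstanding requests from every node at a fixed interval would give a first picture of how the ensemble behaves under load.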

Flume Cluster

The Flume configuration file needs to take several points into account:

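notes(ningg): One issue: when an Exec Source is used to collect data and the tail -F command terminates unexpectedly, Flume cannot restart the command automatically, because it cannot tell whether the file simply has no new content or the tail command has died. The official site offers two suggestions for this problem: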

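Personal thought: the information on the official site is authoritative, but it is also worth checking the project's JIRA. Other people have run into this problem too, so there should be other approaches.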

Kafka Cluster

notes(ningg): A few open questions:

Storm Cluster

notes(ningg): How do you build a Storm cluster?

Summary of Issues

References
