ZeroMQ is an asynchronous messaging library, aimed at use in distributed or concurrent applications. It provides a message queue, but unlike message-oriented middleware, a ZeroMQ system can run without a dedicated message broker; the zero in the name is for zero broker.
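Why ZeroMQ is needed
Simplicity and scalability. ZeroMQ lets developers focus on message passing and on the design of the network topology, rather than on establishing connections or guaranteeing availability (it reconnects automatically).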
https://zguide.zeromq.org/docs/chapter1/#Why-We-Needed-ZeroMQ
ZeroMQ's communication philosophy
To fix the world, we needed to do two things. One, to solve the general problem of “how to connect any code to any code, anywhere”. Two, to wrap that up in the simplest possible building blocks that people could understand and use easily.
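In other words, connect things in the simplest possible way. Also, when you develop with zmq, forget everything you know about TCP: zmq wraps every connection and adds another layer of abstraction on top of sockets, so some of its concepts run counter to conventional networking concepts.
Characteristics of ZeroMQ communication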
- ZeroMQ doesn’t know anything about the data you send except its size in bytes.
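That is, null-terminated strings, as used in C, need extra handling.
It also means that turning the raw bytes into a parsed result is the user's job (although zmq now provides the common decoding methods).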
- In theory with ZeroMQ sockets, it does not matter which end connects and which end binds. However, in practice there are undocumented differences that I’ll come to later.
In theory you are free; in practice, stay within the rules.
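To illustrate the first point above (ZeroMQ only knows the size of your data in bytes), here is a minimal Python sketch of the decoding step the receiver has to do itself. The helper names `decode_text_frame` and `encode_text_frame` are hypothetical, UTF-8 is an assumed encoding, and no ZeroMQ call is involved.

```python
def decode_text_frame(frame: bytes, encoding: str = "utf-8") -> str:
    """Turn a received frame into text.

    A C sender may include the trailing NUL of its string in the frame,
    so strip at most one trailing NUL byte before decoding.
    """
    if frame.endswith(b"\x00"):
        frame = frame[:-1]
    return frame.decode(encoding)


def encode_text_frame(text: str, encoding: str = "utf-8") -> bytes:
    """Encode text without a terminator; the length travels with the frame."""
    return text.encode(encoding)
```

With this convention both `b"Hello\x00"` (sent from C with the terminator) and `b"Hello"` decode to the same string.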
Basic
- A connection is just a connection: connecting sets up a structure, and you need not think about anything else
Now, imagine we start the client before we start the server. In traditional networking, we get a big red Fail flag. But ZeroMQ lets us start and stop pieces arbitrarily. As soon as the client node does zmq_connect(), the connection exists and that node can start to write messages to the socket. At some stage (hopefully before messages queue up so much that they start to get discarded, or the client blocks), the server comes alive, does a zmq_bind(), and ZeroMQ starts to deliver messages.
- A server node can bind to multiple endpoints
If this does not make sense, reread the section on ZeroMQ's communication philosophy.
zmq_bind (socket, "tcp://*:5555");
zmq_bind (socket, "tcp://*:9999");
zmq_bind (socket, "inproc://somename");
- Valid socket combinations
- PUB and SUB
- REQ and REP
- REQ and ROUTER (take care, REQ inserts an extra null frame)
- DEALER and REP (take care, REP assumes a null frame)
- DEALER and ROUTER
- DEALER and DEALER
- ROUTER and ROUTER
- PUSH and PULL
- PAIR and PAIR
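The list above can be turned into a small lookup, which is handy as a sanity check when wiring sockets together. This is only a sketch: `VALID_PAIRS` and `is_valid_pair` are hypothetical names, and the table covers just the pairs listed above (it says nothing about XPUB/XSUB, for example).

```python
# Socket-type pairs from the list above, stored as unordered sets.
# A one-element set stands for a type paired with itself.
VALID_PAIRS = {
    frozenset({"PUB", "SUB"}),
    frozenset({"REQ", "REP"}),
    frozenset({"REQ", "ROUTER"}),
    frozenset({"DEALER", "REP"}),
    frozenset({"DEALER", "ROUTER"}),
    frozenset({"DEALER"}),
    frozenset({"ROUTER"}),
    frozenset({"PUSH", "PULL"}),
    frozenset({"PAIR"}),
}


def is_valid_pair(a: str, b: str) -> bool:
    """Return True if the two socket types appear together in the list."""
    return frozenset({a.upper(), b.upper()}) in VALID_PAIRS
```

For example, `is_valid_pair("REQ", "ROUTER")` is true while `is_valid_pair("PUB", "PULL")` is not.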
Messages: the headache, understanding them, maintaining them
- Frames are the basic wire format of zmq communication
Frames (also called “message parts” in the ZeroMQ reference manual pages) are the basic wire format for ZeroMQ messages. A frame is a length-specified block of data. The length can be zero upwards. If you’ve done any TCP programming you’ll appreciate why frames are a useful answer to the question “how much data am I supposed to read of this network socket now?”
- Conceptually, a ZeroMQ message is one frame, like a UDP datagram. It can be extended with a flag bit, so in practice one zmq message may consist of several frames.
Originally, a ZeroMQ message was one frame, like UDP. We later extended this with multipart messages, which are quite simply series of frames with a “more” bit set to one, followed by one with that bit set to zero. The ZeroMQ API then lets you write messages with a “more” flag and when you read messages, it lets you check if there’s “more”.
- A message can be one or more parts.
- These parts are also called “frames”.
- Each part is a zmq_msg_t object.
- You send and receive each part separately, in the low-level API.
- Higher-level APIs provide wrappers to send entire multipart messages.
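The idea of length-specified frames chained by a "more" bit can be sketched in a few lines of Python. This is a toy format invented for illustration (one flag byte, a 4-byte length, then the body), not ZeroMQ's actual wire protocol; `MORE`, `encode_message` and `decode_messages` are hypothetical names.

```python
import struct

MORE = 0x01  # toy flag bit: another frame of the same message follows


def encode_message(parts):
    """Serialize a list of byte strings as frames: flag byte, 4-byte length, body."""
    if not parts:
        raise ValueError("a message needs at least one part")
    out = bytearray()
    for i, part in enumerate(parts):
        flags = MORE if i < len(parts) - 1 else 0
        out += struct.pack("!BI", flags, len(part)) + part
    return bytes(out)


def decode_messages(buf):
    """Split a byte stream back into messages, each a list of parts."""
    messages, current, pos = [], [], 0
    while pos < len(buf):
        flags, size = struct.unpack_from("!BI", buf, pos)
        pos += 5  # one flag byte plus four length bytes
        current.append(buf[pos:pos + size])
        pos += size
        if not flags & MORE:  # last frame: the message is complete
            messages.append(current)
            current = []
    return messages
```

Because every frame carries its own length, the reader always knows how many bytes to consume next, and a zero-length frame is perfectly legal.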
- Some notes on messages
- You can send a zero-length message as a signal
You may send zero-length messages, e.g., for sending a signal from one thread to another.
- Message delivery is atomic
ZeroMQ guarantees to deliver all the parts (one or more) for a message, or none of them.
- Sending is asynchronous, so make sure the in-memory queue does not overflow
ZeroMQ does not send the message (single or multipart) right away, but at some indeterminate later time. A multipart message must therefore fit in memory.
- Large files must be split by hand and sent as several messages
A message (single or multipart) must fit in memory. If you want to send files of arbitrary sizes, you should break them into pieces and send each piece as separate single-part messages. Using multipart data will not reduce memory consumption.
- You must call zmq_msg_close() when finished with a received message, in languages that don’t automatically destroy objects when a scope closes. You don’t call this method after sending a message.
- zmq_msg_init_data() is a zero-copy method and should be used with care
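The note above about large files can be sketched as a small generator that cuts a binary stream into pieces, each of which would then be sent as its own single-part message. The names `iter_chunks` and `file_messages`, the 64 KiB default, and the zero-length end-of-file marker are all assumptions made for this sketch; the actual send call is left out.

```python
import io


def iter_chunks(stream, chunk_size=64 * 1024):
    """Yield successive chunks of at most chunk_size bytes from a binary stream."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty read means end of stream
            break
        yield chunk


def file_messages(stream, chunk_size=64 * 1024):
    """Yield one message body per chunk, then an empty body as an end marker.

    The empty end marker is an assumed convention; data chunks are never empty,
    so the receiver can tell the two apart.
    """
    yield from iter_chunks(stream, chunk_size)
    yield b""
```

Only one chunk is held in memory at a time, for example `for body in file_messages(open(path, "rb")): ...`, where `path` is whatever file you want to transfer.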
Synchronizing PUB-SUB to catch the first message
The **subscriber** will always miss the first messages that the **publisher** sends.
This is because as the subscriber connects to the publisher (something that takes a small but non-zero time), the publisher may already be sending messages out.
Making a TCP connection involves to and from handshaking that takes several milliseconds depending on your network and the number of hops between peers. In that time, ZeroMQ can send many messages. For sake of argument assume it takes 5 msecs to establish a connection, and that same link can handle 1M messages per second. During the 5 msecs that the subscriber is connecting to the publisher, it takes the publisher only 1 msec to send out those 1K messages.
https://zguide.zeromq.org/docs/chapter2/#Node-Coordination
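The arithmetic in the quoted example is easy to check. The two helpers below are hypothetical names; the 5 ms handshake and the rate of 1M messages per second are the figures assumed in the quote, not measurements.

```python
def messages_sent_during(connect_ms: int, rate_per_sec: int) -> int:
    """How many messages a publisher can push out while a subscriber is still connecting."""
    return rate_per_sec * connect_ms // 1000


def send_time_ms(count: int, rate_per_sec: int) -> float:
    """How long it takes to send `count` messages at the given rate, in milliseconds."""
    return count * 1000 / rate_per_sec
```

With those figures, `send_time_ms(1000, 1_000_000)` is 1 ms, so a burst of 1K messages is finished well before the 5 ms handshake completes, and `messages_sent_during(5, 1_000_000)` shows that up to 5000 messages could go by in that window.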
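Understanding ROUTER-ROUTER
In basic zmq usage, two-way ROUTER-ROUTER communication should not occur (even though ROUTER-ROUTER is a valid combination). Forcing ROUTER-ROUTER communication without exchanging IDENTITYs causes the TCP handshake to be reset (RST) over and over.
In tdserver, the DEALER is wrapped inside an object defined in the wrapper file and is later invoked in a non-obvious way, which gives the illusion of ROUTER-ROUTER communication.
- In essence, when a DEALER communicates, a random IDENTITY of its own is implicitly placed at the front of the frame sequence that the ROUTER sees. This IDENTITY lets the ROUTER maintain an internal IDENTITY->DEALER mapping, so that replies are sent back to the right DEALER. In the wrapped implementation, the ROUTER does not add an IDENTITY on its own (and it has no initialized IDENTITY either)
- Below, DEALER stands for any zmq.Socket that has an IDENTITY; the name DEALER is used only to make the explanation easier to follow
- To make ROUTER-ROUTER communication work, a ROUTER has to be disguised as a DEALER
- Start with sending. ROUTER A wants to send a message to another ROUTER B. From A's point of view B is a DEALER, so A must manually put an IDENTITY at the front of the frame sequence, and this IDENTITY must be exactly the same as B's IDENTITY!
- Receiving follows the normal flow
- Notice the catch? In the ROUTER-ROUTER model, two-way communication requires each side to know the other's IDENTITY, but in general, learning the other side's IDENTITY requires two-way communication...
- This is a classic chicken-and-egg paradox. The official documentation briefly discusses the problem and offers some solutions, but by the KISS principle, such a model should not appear in any design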
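The chicken-and-egg problem described above can be made concrete with a toy model. `ToyRouter` is a hypothetical class, not a ZeroMQ API: it only imitates the rule that a ROUTER routes an outgoing message by its first frame and, by default, silently drops messages addressed to an IDENTITY it does not know. Connection handshakes are deliberately not modelled.

```python
class ToyRouter:
    """Toy stand-in for a ROUTER socket: routes outgoing messages by their first frame."""

    def __init__(self, identity: bytes):
        self.identity = identity
        self.peers = {}   # identity -> peer, filled in by connect_known()
        self.inbox = []   # received messages, each prefixed with the sender's identity

    def connect_known(self, peer: "ToyRouter") -> None:
        """Register a peer whose identity was learned out of band."""
        self.peers[peer.identity] = peer

    def send(self, frames) -> bool:
        """The first frame names the destination; unknown destinations are dropped."""
        dest, payload = frames[0], list(frames[1:])
        peer = self.peers.get(dest)
        if peer is None:
            return False  # discarded, the sender gets no error
        peer.inbox.append([self.identity] + payload)
        return True
```

In this model, A can only reach B after `a.connect_known(b)`, and B can only answer after it has learned A's identity the same way, which is exactly the circular dependency the notes point out.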
Why zmq_poll() and not zmq_epoll()?
https://github.com/zeromq/libzmq/issues/1667
What happens if you plug arbitrary socket pairs into zmq_proxy()?
If you’re like most ZeroMQ users, at this stage your mind is starting to think, “What kind of evil stuff can I do if I plug random socket types into the proxy?” The short answer is: try it and work out what is happening. In practice, you would usually stick to ROUTER/DEALER, XSUB/XPUB, or PULL/PUSH.
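Packet captures with Wireshark show that, on the receiving side, arbitrary pairing causes no problem: messages still arrive. But once the proxy's combination breaks the pairing rules, traffic from frontend to backend never even leaves as packets, that is, it is blocked inside the function.
Error handling in zmq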
https://zguide.zeromq.org/docs/chapter2/#Handling-Errors-and-ETERM
Message loss in zmq
https://zguide.zeromq.org/docs/chapter2/#Missing-Message-Problem-Solver