One Way to Avoid Inefficient Parsing of TCP Packet Bodies in Lua

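TCP is a stream protocol: the sender writes a byte stream and the receiver reads a byte stream, with no message boundaries. The application layer therefore usually marks individual protocol packets inside the stream with a header + body layout. The sender packs the raw data as header + body, where the header is a fixed number of bytes that states how many bytes the body contains. The receiver first reads the fixed-size header, then reads exactly that many bytes of body.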
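The header + body framing described above can be sketched as follows. This is a minimal illustration, assuming Lua 5.3+ and a 2-byte big-endian length header; pack_packet and unpack_packet are hypothetical helper names, not part of the original code.

```lua
-- Assumption: Lua 5.3+ (string.pack / string.unpack), 2-byte big-endian length header.
local HEADER_FMT = ">I2"
local HEADER_BYTES = string.packsize(HEADER_FMT)

-- Hypothetical helper: wrap a body into header + body.
local function pack_packet(body)
    return string.pack(HEADER_FMT, #body) .. body
end

-- Hypothetical helper: read one packet starting at byte index pos.
-- Returns the body and the index of the first unread byte,
-- or nil if the packet is not complete yet.
local function unpack_packet(stream, pos)
    pos = pos or 1
    if #stream - pos + 1 < HEADER_BYTES then
        return nil
    end
    local body_bytes = string.unpack(HEADER_FMT, stream, pos)
    local body_end = pos + HEADER_BYTES + body_bytes - 1
    if body_end > #stream then
        return nil
    end
    return stream:sub(pos + HEADER_BYTES, body_end), body_end + 1
end

-- Two packets back to back in one byte stream.
local stream = pack_packet("hello") .. pack_packet("hi")
local first, next_pos = unpack_packet(stream)
local second = unpack_packet(stream, next_pos)
print(first, second)
```

Passing a position into string.unpack avoids cutting the header out with sub first, which saves one temporary string per packet.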
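Game projects always end up needing robot clients that talk to the server. Our project usually writes them in Lua, so parsing TCP data in Lua is unavoidable. The logic is simple, and with string concatenation it takes only a few lines. The complete code is available here; the main snippets are listed below.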

function mt:init(header_bytes)
    self.cache = ""
    self.header_bytes = header_bytes
end

function mt:input(str)
    self.cache = self.cache .. str
end

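function mt:output()
    local hb = self.header_bytes
    local total = #self.cache
    -- Not even a full header yet
    if total < hb then
        return
    end

    -- ">I2" reads a 2-byte big-endian length, so header_bytes is expected to be 2
    local body_bytes = string.unpack(">I2", self.cache)
    -- The body has not fully arrived yet
    if hb + body_bytes > total then
        return
    end

    local body = self.cache:sub(hb + 1, hb + body_bytes)
    -- Drop the consumed packet; this copies the rest of the cache
    self.cache = self.cache:sub(hb + body_bytes + 1)
    return body
end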

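The input function caches received data, and the output function parses the received byte stream into individual protocol packets. The string operations in input and output become very inefficient when they are called frequently, so this version stops being good enough once the tool needs higher throughput. We still want the robot to stay as simple as possible, though, so the first thing to try is a pure Lua solution.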
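To get a feel for the cost mentioned above, here is a rough sketch that joins the same chunks in two ways. It assumes nothing beyond the standard library; the chunk count and chunk size are arbitrary illustrative values, and the timings will vary by machine and Lua version.

```lua
-- Append chunks one by one with `..`: each append builds a new, longer string.
local function join_by_concat(chunks)
    local cache = ""
    for i = 1, #chunks do
        cache = cache .. chunks[i]
    end
    return cache
end

-- Collect chunks in a table and join them once with table.concat.
local function join_by_table(chunks)
    local list = {}
    for i = 1, #chunks do
        list[#list + 1] = chunks[i]
    end
    return table.concat(list)
end

-- Illustrative workload: 5000 chunks of 64 bytes each.
local chunks = {}
for i = 1, 5000 do
    chunks[i] = string.rep("x", 64)
end

local t0 = os.clock()
local a = join_by_concat(chunks)
local t1 = os.clock()
local b = join_by_table(chunks)
local t2 = os.clock()
print(string.format("'..' loop: %.4fs, table.concat: %.4fs", t1 - t0, t2 - t1))
```

Both functions produce the same string; the difference is that the loop with `..` copies the growing cache on every append, while table.concat copies each chunk once.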

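The problem with the approach above is that string concatenation is slow: when data arrives frequently, the string operations consume a large amount of CPU. The idea of the new approach is to avoid string concatenation as much as possible, as shown below.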

function mt:init(header_bytes)
    self.cache_list = {}
    self.total_size = 0
    self.header_bytes = header_bytes
    self.body_list = {}
end

function mt:input(str)
    local cache = self.cache_list
    local block = cache[#cache]

    if block and #block < self.header_bytes then
        cache[#cache] = block .. str
    else
        cache[#cache + 1] = str
    end

    self.total_size = self.total_size + #str
end

function mt:output()
    local body_list = self.body_list
    local cache_body = body_list[1]
    if cache_body then
        table.remove(body_list, 1)
        return cache_body
    end

    local total_str
    if #self.cache_list == 1 then
        total_str = self.cache_list[1]
    else
        total_str = table.concat(self.cache_list)
        self.cache_list = {total_str}
    end

    local hb = self.header_bytes
    local start_index = 1
    while true do
        -- total_size is the number of unparsed bytes left in total_str;
        -- stop when a full header is not available
        if self.total_size < hb then
            break
        end

        local header = total_str:sub(start_index, start_index + hb - 1)
        local body_bytes = string.unpack(">I2", header)
        if hb + body_bytes > self.total_size then
            break
        end

        self.total_size = self.total_size - hb - body_bytes

        local new_index = start_index + hb + body_bytes
        local body = total_str:sub(start_index + hb, new_index - 1)
        if cache_body then
            body_list[#body_list + 1] = body
        else
            cache_body = body
        end

        start_index = new_index
    end

    if start_index > 1 then
        self.cache_list = {total_str:sub(start_index)}
    end

    return cache_body
end

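The input function mostly avoids concatenation: it stores the received data in self.cache_list and only merges into the last block when that block is still shorter than the header. The output function then joins the blocks once, parses as many packets as it can in a single pass, and stores the extra ones in self.body_list. On each later call, if self.body_list has data, output returns it directly without touching the byte stream.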
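The same batching idea can be shown in a compact, self-contained form. This is a sketch rather than the code above: it assumes Lua 5.3+ and a 2-byte big-endian length header, all names are illustrative, and it swaps table.remove(body_list, 1) for a head/tail index queue so that taking a body does not shift the remaining entries.

```lua
-- Assumption: Lua 5.3+, 2-byte big-endian length header; names are illustrative.
local HEADER_FMT = ">I2"
local HB = string.packsize(HEADER_FMT)

local Parser = {}
Parser.__index = Parser

local function new_parser()
    return setmetatable({chunks = {}, bodies = {}, head = 1, tail = 0}, Parser)
end

-- Store the chunk without joining it to earlier data.
function Parser:input(str)
    self.chunks[#self.chunks + 1] = str
end

-- Return one parsed body, or nil if no complete packet is buffered.
function Parser:output()
    if self.head > self.tail then
        -- Queue is empty: join the chunks once and parse every complete packet.
        local data = table.concat(self.chunks)
        local pos = 1
        while #data - pos + 1 >= HB do
            local body_bytes = string.unpack(HEADER_FMT, data, pos)
            local next_pos = pos + HB + body_bytes
            if next_pos - 1 > #data then
                break
            end
            self.tail = self.tail + 1
            self.bodies[self.tail] = data:sub(pos + HB, next_pos - 1)
            pos = next_pos
        end
        -- Keep only the unparsed remainder for the next call.
        self.chunks = {data:sub(pos)}
    end
    if self.head > self.tail then
        return nil
    end
    local body = self.bodies[self.head]
    self.bodies[self.head] = nil
    self.head = self.head + 1
    return body
end

local p = new_parser()
local stream = string.pack(">s2", "login") .. string.pack(">s2", "move")
    .. string.pack(">s2", "chat")
-- Feed the stream in 3-byte fragments, as a socket might deliver it.
for i = 1, #stream, 3 do
    p:input(stream:sub(i, i + 2))
end
print(p:output(), p:output(), p:output(), p:output())
```

The first call to output pays for one table.concat and parses all three packets; the next two calls only read from the queue, and the fourth finds nothing buffered and returns nil.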

See here for the test method. The new approach parses 64 MB of data almost instantly.

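It is more efficient to call output at intervals rather than after every read, because more data accumulates between calls and each call can parse a whole batch. Mobile game clients typically run at 30 FPS or 60 FPS, so calling output once every 1/60 of a second is perfectly fine, and even once every 1/100 of a second works.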

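In practice, first try to take a complete packet that is already parsed (stored in the self.body_list array). If there is none, read the socket, add the data to the cache, and parse again to see whether a complete packet has arrived. If it still has not, sleep for a moment and try again. The code is as follows.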

function mt:read_packet()
    local packet
    while true do
        -- Try to get a complete packet that is already buffered
        packet = self.pack_obj:output(true)
        if packet then
            return packet
        end

        -- Read from the socket
        local buf, err = self.sock:read()
        if not buf or #buf == 0 then
            return nil, err
        end

        self.pack_obj:input(buf)
        -- Check whether a complete packet has now arrived
        packet = self.pack_obj:output()
        if packet then
            return packet
        end
        Levent.sleep(0.01)
    end
end

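When this code was first used, it did not try to take an already buffered packet first, so every call to read_packet read the socket. When a single read returned a lot of data, it could contain several complete packets. A later read_packet call would then read the socket again even though packets were still waiting in the buffer, and if the server sent nothing more, the client would wait forever for read_packet to return and appear to hang.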
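The situation described above can be reproduced without a network. The sketch below assumes Lua 5.3+; the fake socket, the small string-based buffer, and the "closed" error value are hypothetical stand-ins for the real socket and pack_obj.

```lua
-- Assumption: Lua 5.3+; everything here is a hypothetical stand-in for illustration.
local function new_buffer()
    local cache = ""
    local buf = {}
    function buf.input(str)
        cache = cache .. str
    end
    function buf.output()
        if #cache < 2 then return nil end
        local body_bytes = string.unpack(">I2", cache)
        if 2 + body_bytes > #cache then return nil end
        local body = cache:sub(3, 2 + body_bytes)
        cache = cache:sub(3 + body_bytes)
        return body
    end
    return buf
end

-- Fake socket: delivers two packets in a single read, then has nothing more.
local function new_fake_sock()
    local pending = {string.pack(">s2", "first") .. string.pack(">s2", "second")}
    local sock = {reads = 0}
    function sock:read()
        self.reads = self.reads + 1
        local data = table.remove(pending, 1)
        if not data then
            -- A real blocking socket would wait here instead of returning.
            return nil, "closed"
        end
        return data
    end
    return sock
end

-- Check the buffer before touching the socket.
local function read_packet(buf, sock)
    while true do
        local packet = buf.output()
        if packet then return packet end
        local data, err = sock:read()
        if not data or #data == 0 then return nil, err end
        buf.input(data)
    end
end

local buf, sock = new_buffer(), new_fake_sock()
local a = read_packet(buf, sock)
local b = read_packet(buf, sock)
print(a, b, sock.reads)
```

Both packets come back after a single socket read. A reader that skipped the buffer check would issue a second read for the second packet, which is exactly where a blocking socket would stall.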