前言
bufio模块是golang标准库中的模块之一,主要是实现了一个读写的缓存,用于对数据的读取或者写入操作。该模块在多个涉及io的标准库中被使用,比如http模块中使用buffio来完成网络数据的读写,压缩文件的zip模块利用bufio来操作文件数据的读写等。
golang的bufio包里面定以的SplitFunc是一个比较重要也比较难以理解的东西,本文希望通过结合简单的实例介绍SplitFunc的工作原理以及如何实现一个自己的SplitFunc。
一个例子
在bufio包里面定义了一些常用的工具比如Scanner,你可能需要读取用户在标准输入里面输入的一些东西,比如我们做一个复读机,读取用户的每一行输入,然后打印出来:
package main
import (
"bufio"
"fmt"
"os"
)
func main() {
scanner := bufio.NewScanner(os.Stdin)
scanner.Split(bufio.ScanLines)
for scanner.Scan() {
fmt.Println(scanner.Text())
}
}
这个程序很简单,os.Stdin实现了io.Reader接口,我们从这个reader创建了一个scanner,设置分割函数为bufio.ScanLines,然后for循环,每次读到一行数据就将文本内容打印出来。麻雀虽小五脏俱全,这个小程序虽然简单,却引出了我们今天要介绍的对象: bufio.SplitFunc,它的定义是这个样子的:
package "buffio"
type SplitFunc func(data []byte, atEOF bool) (advance int, token []byte, err error)
golang官方文档的描述是这个样子的:
SplitFunc is the signature of the split function used to tokenize the input. The arguments are an initial substring of the remaining unprocessed data and a flag, atEOF, that reports whether the Reader has no more data to give. The return values are the number of bytes to advance the input and the next token to return to the user, if any, plus an error, if any.
Scanning stops if the function returns an error, in which case some of the input may be discarded.
Otherwise, the Scanner advances the input. If the token is not nil, the Scanner returns it to the user. If the token is nil, the Scanner reads more data and continues scanning; if there is no more data--if atEOF was true--the Scanner returns. If the data does not yet hold a complete token, for instance if it has no newline while scanning lines, a SplitFunc can return (0, nil, nil) to signal the Scanner to read more data into the slice and try again with a longer slice starting at the same point in the input.
The function is never called with an empty data slice unless atEOF is true. If atEOF is true, however, data may be non-empty and, as always, holds unprocessed text.
英文!参数这么多!返回值这么多!好烦!不知道各位读者遇到这种文档会不会有这种感觉...正式由于这种情况,我才决定写一篇文章介绍一下SplitFunc的具体工作原理,用一种通俗的方式结合具体实例加以说明,希望对读者有所帮助。









