Performance Comparison of Three Ways to Compute MD5 in Go

2019-11-10 10:50:07 王冬梅

Preface

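This article looks at three different ways to compute an MD5 checksum. The real difference between them is how the file is read, that is, disk I/O, so the same reasoning carries over to network I/O. Let's take a look.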

ReadFile

Start with the first approach, which is simple and brute-force:

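// md5sum1 reads the whole file into memory and hashes it in one call.
// It needs the crypto/md5, fmt and io/ioutil packages.
func md5sum1(file string) string {
	data, err := ioutil.ReadFile(file)
	if err != nil {
		// The error is swallowed; callers only see an empty string.
		return ""
	}

	return fmt.Sprintf("%x", md5.Sum(data))
}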

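It is brute-force because ReadFile internally calls readAll, which loads the entire file into memory, so this approach allocates the most memory of the three.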
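One caveat with md5sum1 is that a failed read and an empty result look the same to the caller. As a minimal sketch, assuming Go 1.16 or later (where os.ReadFile supersedes ioutil.ReadFile), the variant below returns the error instead. The names md5sumReadFile and writeTemp are hypothetical and exist only for this illustration.

```go
package main

import (
	"crypto/md5"
	"fmt"
	"os"
)

// md5sumReadFile is a hypothetical variant of md5sum1 that reports the
// read error to the caller instead of returning an empty string.
func md5sumReadFile(file string) (string, error) {
	// os.ReadFile still loads the whole file into memory, like ioutil.ReadFile.
	data, err := os.ReadFile(file)
	if err != nil {
		return "", fmt.Errorf("md5sum %s: %w", file, err)
	}
	return fmt.Sprintf("%x", md5.Sum(data)), nil
}

// writeTemp is a helper for this example only: it writes content to a
// temporary file and returns the file's path.
func writeTemp(content string) (string, error) {
	f, err := os.CreateTemp("", "md5demo-*")
	if err != nil {
		return "", err
	}
	defer f.Close()
	if _, err := f.WriteString(content); err != nil {
		return "", err
	}
	return f.Name(), nil
}

func main() {
	path, err := writeTemp("hello world")
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	defer os.Remove(path)

	// A readable file yields a digest and a nil error.
	sum, err := md5sumReadFile(path)
	fmt.Println(sum, err)

	// A missing file yields a non-nil error rather than an empty string.
	_, err = md5sumReadFile(path + ".missing")
	fmt.Println(err != nil)
}
```

Returning the error keeps the memory behaviour of the original function unchanged; only the failure reporting differs.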

Let's run a benchmark:

var test_path = "/path/to/file"

func BenchmarkMd5Sum1(b *testing.B) {
	for i := 0; i < b.N; i++ {
		md5sum1(test_path)
	}
}

go test -test.run=none -test.bench="^BenchmarkMd5Sum1$" -benchtime=10s -benchmem

BenchmarkMd5Sum1-4 300 43704982 ns/op 19408224 B/op 14 allocs/op
PASS
ok tmp 17.446s

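Note that the test file is 19405028 bytes, which is very close to the 19408224 B/op reported above. That is because readAll really does allocate a buffer about the size of the file, as the source code shows: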

ReadFile source

// ReadFile reads the file named by filename and returns the contents.
// A successful call returns err == nil, not err == EOF. Because ReadFile
// reads the whole file, it does not treat an EOF from Read as an error
// to be reported.
func ReadFile(filename string) ([]byte, error) {
	f, err := os.Open(filename)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	// It's a good but not certain bet that FileInfo will tell us exactly how much to
	// read, so let's try it but be prepared for the answer to be wrong.
	var n int64

	if fi, err := f.Stat(); err == nil {
		// Don't preallocate a huge buffer, just in case.
		if size := fi.Size(); size < 1e9 {
			n = size
		}
	}
	// As initial capacity for readAll, use n + a little extra in case Size is zero,
	// and to avoid another allocation after Read has filled the buffer. The readAll
	// call will read into its allocated internal buffer cheaply. If the size was
	// wrong, we'll either waste some space off the end or reallocate as needed, but
	// in the overwhelmingly common case we'll get it just right.

	// The second argument to readAll is the capacity of the buffer it is about to create.
	return readAll(f, n+bytes.MinRead)
}

func readAll(r io.Reader, capacity int64) (b []byte, err error) {
	// The buffer's capacity is the file size + bytes.MinRead.
	buf := bytes.NewBuffer(make([]byte, 0, capacity))
	// If the buffer overflows, we will get bytes.ErrTooLarge.
	// Return that as an error. Any other panic remains.
	defer func() {
		e := recover()
		if e == nil {
			return
		}
		if panicErr, ok := e.(error); ok && panicErr == bytes.ErrTooLarge {
			err = panicErr
		} else {
			panic(e)
		}
	}()
	_, err = buf.ReadFrom(r)
	return buf.Bytes(), err
}
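The allocation behaviour described above can also be checked without a benchmark harness. The sketch below is an illustration under stated assumptions, not part of the original measurements: it uses os.ReadFile (which ioutil.ReadFile forwards to since Go 1.16), a 4 MiB temporary file, and the hypothetical helpers allocatedBy and readFileAlloc to compare runtime.MemStats.TotalAlloc before and after the read.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

// allocatedBy returns how many heap bytes were allocated while fn ran,
// using the cumulative TotalAlloc counter from runtime.MemStats.
func allocatedBy(fn func()) uint64 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	fn()
	runtime.ReadMemStats(&after)
	return after.TotalAlloc - before.TotalAlloc
}

// readFileAlloc writes size zero bytes to a temporary file and reports how
// many bytes are allocated when the file is read back with os.ReadFile.
func readFileAlloc(size int) (uint64, error) {
	f, err := os.CreateTemp("", "readfile-*")
	if err != nil {
		return 0, err
	}
	defer os.Remove(f.Name())

	if _, err := f.Write(make([]byte, size)); err != nil {
		f.Close()
		return 0, err
	}
	if err := f.Close(); err != nil {
		return 0, err
	}

	var readErr error
	n := allocatedBy(func() {
		_, readErr = os.ReadFile(f.Name())
	})
	return n, readErr
}

func main() {
	const size = 4 << 20 // 4 MiB, an arbitrary size chosen for the demo
	n, err := readFileAlloc(size)
	if err != nil {
		fmt.Println("measurement failed:", err)
		return
	}
	fmt.Printf("file size: %d bytes, allocated while reading: %d bytes\n", size, n)
}
```

The reported allocation should be at least the file size, which matches the pattern in the B/op column of the benchmark: the cost of ReadFile grows with the file.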

io.Copy

Now the second approach:

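// md5sum2 streams the file into the hash instead of loading it all at once.
func md5sum2(file string) string {
	f, err := os.Open(file)
	if err != nil {
		return ""
	}
	defer f.Close()

	// md5.New returns a hash.Hash, which is also an io.Writer.
	h := md5.New()

	// io.Copy reads from the file and writes into the hash.
	_, err = io.Copy(h, f)
	if err != nil {
		return ""
	}

	return fmt.Sprintf("%x", h.Sum(nil))
}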
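Because this approach only needs an io.Reader, it is not tied to files, which is what makes it applicable to network I/O as mentioned in the preface. The following is a minimal sketch under that assumption: md5FromReader and md5OfString are hypothetical names, and strings.NewReader stands in for a real source such as a network connection or an HTTP response body.

```go
package main

import (
	"crypto/md5"
	"fmt"
	"io"
	"strings"
)

// md5FromReader is a hypothetical helper that hashes everything read from r.
func md5FromReader(r io.Reader) (string, error) {
	h := md5.New()
	// hash.Hash implements io.Writer, so io.Copy can write into it directly.
	if _, err := io.Copy(h, r); err != nil {
		return "", err
	}
	return fmt.Sprintf("%x", h.Sum(nil)), nil
}

// md5OfString runs an in-memory string through the reader-based path.
// It exists only to make the example easy to run.
func md5OfString(s string) (string, error) {
	return md5FromReader(strings.NewReader(s))
}

func main() {
	streamed, err := md5OfString("hello world")
	if err != nil {
		fmt.Println("hashing failed:", err)
		return
	}

	// The one-shot digest of the same bytes, computed as in md5sum1.
	oneShot := fmt.Sprintf("%x", md5.Sum([]byte("hello world")))

	fmt.Println(streamed)
	fmt.Println(streamed == oneShot)
}
```

Both paths produce the same digest; what changes is how the input reaches the hash, and therefore how much memory is needed for large inputs.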