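Internally, StringBuilder is a kind of linked list made up of several char[] chunks. Frequently modifying a StringBuilder in the middle therefore splits what was originally one contiguous block of memory into many pieces, which hurts read and traversal performance.
The performance gap between contiguous and non-contiguous memory can be as high as 1,600x.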
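To make the chunk structure visible, here is a minimal sketch that counts a StringBuilder's chunks before and after a run of random edits. It assumes .NET Core 3.0 or later, where `StringBuilder.GetChunks()` is available; the `ChunkDemo` class and its `CountChunks` and `Fragment` helpers are hypothetical names introduced for illustration. The exact chunk counts depend on the runtime version, so none are quoted here.

```csharp
using System;
using System.Text;

static class ChunkDemo
{
    // Counts the internal chunks exposed by StringBuilder.GetChunks()
    // (assumption: .NET Core 3.0+ / .NET Standard 2.1).
    public static int CountChunks(StringBuilder sb)
    {
        int count = 0;
        foreach (ReadOnlyMemory<char> chunk in sb.GetChunks())
            count++;
        return count;
    }

    // Random single-character inserts and removes with a fixed seed,
    // so the edit positions are the same on every run.
    public static void Fragment(StringBuilder sb, int mutations, int seed = 42)
    {
        var r = new Random(seed);
        for (int i = 0; i < mutations; i++)
        {
            sb.Insert(r.Next(sb.Length), 'x');
            sb.Remove(r.Next(sb.Length), 1);
        }
    }

    public static void Main()
    {
        var sb = new StringBuilder(new string(' ', 10000));
        Console.WriteLine($"chunks before: {CountChunks(sb)}");
        Fragment(sb, 1024);
        Console.WriteLine($"chunks after:  {CountChunks(sb)}");
    }
}
```

Each insert is paired with a remove, so the length stays at 10,000 characters and only the internal layout changes.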
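Background
Most people who use StringBuilder probably use it for ordinary tasks such as assembling HTML/JSON templates or building dynamic SQL. But it also has a place in some specialized scenarios, such as writing a language service for a programming language or writing a rich text editor, where the text is modified through its Insert and Remove methods.
Test method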
Talk is cheap, show me the code:
// LINQPad script: Dump() is LINQPad's output extension method.
int docLength = 10000;

void Main()
{
    // Try 2^1 .. 2^16 mutations and report the slowdown for each.
    (from power in Enumerable.Range (1, 16)
     let mutations = (int) Math.Pow (2, power)
     select new
     {
         mutations,
         PerformanceRatio = Math.Round (GetPerformanceRatio (docLength, mutations), 1)
     }).Dump();
}

// Ratio of traversal time after fragmentation to traversal time before it.
float GetPerformanceRatio (int docLength, int mutations)
{
    var sb = new StringBuilder ("".PadRight (docLength));
    var before = GetPerformance (sb);
    FragmentStringBuilder (sb, mutations);
    var after = GetPerformance (sb);
    return (float) after.Ticks / before.Ticks;
}

// Inserts and removes one character at random positions; the length stays constant.
void FragmentStringBuilder (StringBuilder sb, int mutations)
{
    var r = new Random(42);
    for (int i = 0; i < mutations; i++)
    {
        sb.Insert (r.Next (sb.Length), 'x');
        sb.Remove (r.Next (sb.Length), 1);
    }
}

// Times one full pass over the StringBuilder through its indexer.
TimeSpan GetPerformance (StringBuilder sb)
{
    var sw = Stopwatch.StartNew();
    long tot = 0;
    for (int i = 0; i < sb.Length; i++)
    {
        char c = sb[i];
        tot += (int) c;
    }
    sw.Stop();
    return sw.Elapsed;
}
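A few things to note about this code:
.PadRight(n) is used to create a blank string of length n directly; new string(' ', n) would work just as well.
new Random(42) specifies a fixed seed, so the edits land at exactly the same positions on every run, which keeps the runs comparable.
The string is modified 2^1 through 2^16 times, and the performance difference is compared after each number of modifications.
The StringBuilder is read one position at a time through sb[i], which makes the effect of non-contiguous memory more pronounced.
Results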
| mutations | PerformanceRatio |
|---|---|
| 2 | 1 |
| 4 | 1 |
| 8 | 1 |
| 16 | 1 |
| 32 | 1 |
| 64 | 1.1 |
| 128 | 1.2 |
| 256 | 1.8 |
| 512 | 5.2 |
| 1024 | 19.9 |
| 2048 | 81.3 |
| 4096 | 274.5 |
| 8192 | 745.8 |
| 16384 | 1578.8 |
| 32768 | 1630.4 |
| 65536 | 930.8 |
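
Since the test reads through `sb[i]`, it may help to contrast that with a chunk-by-chunk read. The sketch below assumes .NET Core 3.0 or later for `StringBuilder.GetChunks()`; `ChunkReader`, `SumByIndexer`, and `SumByChunks` are hypothetical names introduced for illustration. It only checks that both traversals compute the same total. It was not timed for this article, so no speedup is claimed.

```csharp
using System;
using System.Text;

static class ChunkReader
{
    // Baseline: character-by-character access through the indexer,
    // as in the GetPerformance loop of the test.
    public static long SumByIndexer(StringBuilder sb)
    {
        long total = 0;
        for (int i = 0; i < sb.Length; i++)
            total += sb[i];
        return total;
    }

    // Alternative: visit each chunk once and scan its span.
    // Assumption: GetChunks() is available (.NET Core 3.0+).
    public static long SumByChunks(StringBuilder sb)
    {
        long total = 0;
        foreach (ReadOnlyMemory<char> chunk in sb.GetChunks())
        {
            ReadOnlySpan<char> span = chunk.Span;
            for (int i = 0; i < span.Length; i++)
                total += span[i];
        }
        return total;
    }

    public static void Main()
    {
        var sb = new StringBuilder(new string(' ', 10000));
        var r = new Random(42);
        for (int i = 0; i < 1024; i++)
        {
            sb.Insert(r.Next(sb.Length), 'x');
            sb.Remove(r.Next(sb.Length), 1);
        }
        // Both traversals see the same characters, so the totals match.
        Console.WriteLine(SumByIndexer(sb) == SumByChunks(sb));
    }
}
```

If the content is read far more often than it is edited, calling `sb.ToString()` once and iterating the resulting string is another option, at the cost of one extra copy.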










