c# – 为什么添加DoubleStructs比添加双倍的速度要慢得多?

c# – 为什么添加DoubleStructs比添加双倍的速度要慢得多?,第1张

概述假设 对于任何简单的 *** 作,包含单个基元的只读结构应该与基元本身一样快. 测试 以下所有测试都是在Windows 7 x64上运行.NET Core 2.2,代码优化.在.NET 4.7.2上测试时,我也得到了类似的结果. 测试:渴望 用long类型测试这个前提,似乎这有: // =============== SETUP ===================public readonly s 假设

对于任何简单的 *** 作,包含单个基元的只读结构应该与基元本身一样快.

测试

以下所有测试都是在Windows 7 x64上运行.NET Core 2.2,代码优化.在.NET 4.7.2上测试时,我也得到了类似的结果.

测试:渴望

用long类型测试这个前提,似乎这有:

// =============== SETUP ===================public Readonly struct LongStruct{    public Readonly long Primitive;    public LongStruct(long value) => Primitive = value;    [MethodImpl(MethodImplOptions.AggressiveInlining)]    public static LongStruct Add(in LongStruct lhs,in LongStruct rhs)        => new LongStruct(lhs.Primitive + rhs.Primitive);}[MethodImpl(MethodImplOptions.AggressiveInlining)]public static long LongAdd(long lhs,long rhs) => lhs + rhs;// =============== TESTS ===================public static voID TestLong(long a,long b,out long result){    var sw = Stopwatch.StartNew();    for (var i = 1000000000; i > 0; --i)    {        a = LongAdd(a,b);    }    sw.Stop();    result = a;    return sw.ElapsedMilliseconds;}public static voID TestLongStruct(LongStruct a,LongStruct b,out LongStruct result){    var sw = Stopwatch.StartNew();    for (var i = 1000000000; i > 0; --i)    {        a = LongStruct.Add(a,b);    }    sw.Stop();    result = a;    return sw.ElapsedMilliseconds;}// ============= TEST LOOP =================public static voID RunTests(){    var longStruct = new LongStruct(1);    var count = 0;    var longTime = 0L;    var longStructTime = 0L;    while (true)    {        count++;        Console.Writeline("Test #" + count);        longTime += TestLong(1,1,out var longResult);        var longMean = longTime / count;        Console.Writeline($"Long: value={longResult},Mean Time elapsed: {longMean} ms");        longStructTime += TestLongStruct(longStruct,longStruct,out var longstructresult);        var longStructMean = longStructTime / count;        Console.Writeline($"LongStruct: value={longstructresult.Primitive},Mean Time elapsed: {longStructMean} ms");        Console.Writeline();    }}

使用LongAdd使测试循环匹配 – 每个循环调用一个方法来进行一些添加,而不是内联案例的内联

在我的机器上,这两次已经稳定在彼此的2%之内,足够近以至于我确信它们已经针对几乎相同的代码进行了优化.

IL的差异相当小:

>测试循环代码是相同的,除了调用哪个方法(LongAdd vs LongStruct.Add).
> LongStruct.Add有一些额外的说明:

>一对ldfld指令从结构中加载Primitive
> newobj指令将新的long打包回LongStruct

所以要么抖动正在优化这些指令,要么它们基本上是免费的.

测试:双打

如果我采用上面的代码并用double替换每个long,我会期望相同的结果(绝对值较慢,因为add指令稍微慢一点,但两者都是相同的余量).

我实际看到的是DoubleStruct版本比双版本慢大约4.8倍(即480%).

IL与long case相同(除了交换用于float64和DoubleStruct的int64和LongStruct),但不知何故,运行时正在为LongStruct案例或双案例中不存在的DoubleStruct案例进行额外的工作.

测试:其他类型

测试一些其他原始类型,我看到float(465%)的行为与double的行为相同,而short和int的行为方式与long相同,所以看起来它是关于浮点导致一些优化不被采取.

为什么DoubleStruct和floatStruct比double和float慢得多,其中long,int和short等价物没有这样的减速?

解决方法 这不是一个单独的答案,但它在x86和x64上都是一个更严格的基准测试,所以希望它能为其他可以解释这个问题的人提供更多信息.

我试图用BenchmarkDotNet复制它.我也想知道删除in会有什么区别.我把它作为x86和x64单独运行.

x86(LegacyJIT)

|                 Method |     Mean |    Error |   StdDev ||----------------------- |---------:|---------:|---------:||               TestLong | 257.9 ms | 2.099 ms | 1.964 ms ||         TestLongStruct | 529.3 ms | 4.977 ms | 4.412 ms ||   TestLongStructWithIn | 526.2 ms | 6.722 ms | 6.288 ms ||             TestDouble | 256.7 ms | 1.466 ms | 1.300 ms ||       TestDoubleStruct | 342.5 ms | 5.189 ms | 4.600 ms || TestDoubleStructWithIn | 338.7 ms | 3.808 ms | 3.376 ms |

x64(RyuJIT)

|                 Method |       Mean |     Error |    StdDev ||----------------------- |-----------:|----------:|----------:||               TestLong |   269.8 ms |  5.359 ms |  9.099 ms ||         TestLongStruct |   266.2 ms |  6.706 ms |  8.236 ms ||   TestLongStructWithIn |   270.4 ms |  4.150 ms |  3.465 ms ||             TestDouble |   270.4 ms |  5.336 ms |  6.748 ms ||       TestDoubleStruct | 1,250.9 ms | 24.702 ms | 25.367 ms || TestDoubleStructWithIn |   577.1 ms | 12.159 ms | 16.644 ms |

我可以使用RyuJIT在x64上复制此内容,但不能在使用LegacyJIT的x86上复制此内容.这似乎是RyuJIT管理优化长期案例但不是双重案例的工件 – LegacyJIT也没有做出优化.

我不知道为什么TestDoubleStruct在RyuJIT上是如此异常.

码:

public Readonly struct LongStruct{    public Readonly long Primitive;    public LongStruct(long value) => Primitive = value;    public static LongStruct Add(LongStruct lhs,LongStruct rhs)        => new LongStruct(lhs.Primitive + rhs.Primitive);    public static LongStruct AdDWithIn(in LongStruct lhs,in LongStruct rhs)        => new LongStruct(lhs.Primitive + rhs.Primitive);}public Readonly struct DoubleStruct{    public Readonly double Primitive;    public DoubleStruct(double value) => Primitive = value;    public static DoubleStruct Add(DoubleStruct lhs,DoubleStruct rhs)        => new DoubleStruct(lhs.Primitive + rhs.Primitive);    public static DoubleStruct AdDWithIn(in DoubleStruct lhs,in DoubleStruct rhs)        => new DoubleStruct(lhs.Primitive + rhs.Primitive);}public class Benchmark{    [Benchmark]    public voID TestLong()    {        for (var i = 1000000000; i > 0; --i)        {            LongAdd(1,2);        }    }    [Benchmark]    public voID TestLongStruct()    {        var a = new LongStruct(1);        var b = new LongStruct(2);        for (var i = 1000000000; i > 0; --i)        {            LongStruct.Add(a,b);        }    }    [Benchmark]    public voID TestLongStructWithIn()    {        var a = new LongStruct(1);        var b = new LongStruct(2);        for (var i = 1000000000; i > 0; --i)        {            LongStruct.AdDWithIn(a,b);        }    }    [Benchmark]    public voID TestDouble()    {        for (var i = 1000000000; i > 0; --i)        {            DoubleAdd(1,2);        }    }    [Benchmark]    public voID TestDoubleStruct()    {        var a = new DoubleStruct(1);        var b = new DoubleStruct(2);        for (var i = 1000000000; i > 0; --i)        {            DoubleStruct.Add(a,b);        }    }    [Benchmark]    public voID TestDoubleStructWithIn()    {        var a = new DoubleStruct(1);        var b = new DoubleStruct(2);        for (var i = 1000000000; i > 0; --i)        {            DoubleStruct.AdDWithIn(a,b);        }    }    public static long LongAdd(long lhs,long rhs) => lhs + rhs;    public static double DoubleAdd(double lhs,double rhs) => lhs + rhs;}class Program{    static voID Main(string[] args)    {        var summary = BenchmarkRunner.Run<Benchmark>();        Console.Readline();    }}

为了好玩,以下是两种情况下的x64程序集:

using System;public class C {    public long AddLongs(long a,long b) {        return a + b;    }    public LongStruct AddLongStructs(LongStruct a,LongStruct b) {        return LongStruct.Add(a,b);    }    public LongStruct AddLongStructsWithIn(LongStruct a,LongStruct b) {        return LongStruct.AdDWithIn(a,b);    }    public double AddDoubles(double a,double b) {        return a + b;    }    public DoubleStruct AddDoubleStructs(DoubleStruct a,DoubleStruct b) {        return DoubleStruct.Add(a,b);    }    public DoubleStruct AddDoubleStructsWithIn(DoubleStruct a,DoubleStruct b) {        return DoubleStruct.AdDWithIn(a,b);    }}public Readonly struct LongStruct{    public Readonly long Primitive;    public LongStruct(long value) => Primitive = value;    public static LongStruct Add(LongStruct lhs,in LongStruct rhs)        => new LongStruct(lhs.Primitive + rhs.Primitive);}   public Readonly struct DoubleStruct{    public Readonly double Primitive;    public DoubleStruct(double value) => Primitive = value;    public static DoubleStruct Add(DoubleStruct lhs,in DoubleStruct rhs)        => new DoubleStruct(lhs.Primitive + rhs.Primitive);}

x86汇编

C.AddLongs(Int64,Int64)    L0000: mov eax,[esp+0xc]    L0004: mov edx,[esp+0x10]    L0008: add eax,[esp+0x4]    L000c: adc edx,[esp+0x8]    L0010: ret 0x10C.AddLongStructs(LongStruct,LongStruct)    L0000: push esi    L0001: mov eax,[esp+0x10]    L0005: mov esi,[esp+0x14]    L0009: add eax,[esp+0x8]    L000d: adc esi,[esp+0xc]    L0011: mov [edx],eax    L0013: mov [edx+0x4],esi    L0016: pop esi    L0017: ret 0x10C.AddLongStructsWithIn(LongStruct,esi    L0016: pop esi    L0017: ret 0x10C.AddDoubles(Double,Double)    L0000: fld qword [esp+0xc]    L0004: fadd qword [esp+0x4]    L0008: ret 0x10C.AddDoubleStructs(DoubleStruct,DoubleStruct)    L0000: fld qword [esp+0xc]    L0004: fld qword [esp+0x4]    L0008: faddp st1,st0    L000a: fstp qword [edx]    L000c: ret 0x10C.AddDoubleStructsWithIn(DoubleStruct,DoubleStruct)    L0000: fld qword [esp+0xc]    L0004: fadd qword [esp+0x4]    L0008: fstp qword [edx]    L000a: ret 0x10

x64汇编

C..ctor()    L0000: retC.AddLongs(Int64,Int64)    L0000: lea rax,[rdx+r8]    L0004: retC.AddLongStructs(LongStruct,LongStruct)    L0000: lea rax,[rdx+r8]    L0004: retC.AddLongStructsWithIn(LongStruct,[rdx+r8]    L0004: retC.AddDoubles(Double,Double)    L0000: vzeroupper    L0003: vmovaps xmm0,xmm1    L0008: vaddsd xmm0,xmm0,xmm2    L000d: retC.AddDoubleStructs(DoubleStruct,DoubleStruct)    L0000: sub rsp,0x18    L0004: vzeroupper    L0007: mov [rsp+0x28],rdx    L000c: mov [rsp+0x30],r8    L0011: mov rax,[rsp+0x28]    L0016: mov [rsp+0x10],rax    L001b: mov rax,[rsp+0x30]    L0020: mov [rsp+0x8],rax    L0025: vmovsd xmm0,qword [rsp+0x10]    L002c: vaddsd xmm0,[rsp+0x8]    L0033: vmovsd [rsp],xmm0    L0039: mov rax,[rsp]    L003d: add rsp,0x18    L0041: retC.AddDoubleStructsWithIn(DoubleStruct,DoubleStruct)    L0000: push rax    L0001: vzeroupper    L0004: mov [rsp+0x18],rdx    L0009: mov [rsp+0x20],r8    L000e: vmovsd xmm0,qword [rsp+0x18]    L0015: vaddsd xmm0,[rsp+0x20]    L001c: vmovsd [rsp],xmm0    L0022: mov rax,[rsp]    L0026: add rsp,0x8    L002a: ret

SharpLab

如果你添加循环:

public class C {    public voID AddLongs(long a,long b) {        for (var i = 1000000000; i > 0; --i) {            long c = a + b;        }    }    public voID AddLongStructs(LongStruct a,LongStruct b) {        for (var i = 1000000000; i > 0; --i) {            a = LongStruct.Add(a,b);        }    }    public voID AddLongStructsWithIn(LongStruct a,LongStruct b) {        for (var i = 1000000000; i > 0; --i) {            a = LongStruct.AdDWithIn(a,b);        }    }    public voID AddDoubles(double a,double b) {        for (var i = 1000000000; i > 0; --i) {            a = a + b;        }    }    public voID AddDoubleStructs(DoubleStruct a,DoubleStruct b) {        for (var i = 1000000000; i > 0; --i) {            a = DoubleStruct.Add(a,b);        }    }    public voID AddDoubleStructsWithIn(DoubleStruct a,DoubleStruct b) {        for (var i = 1000000000; i > 0; --i) {            a = DoubleStruct.AdDWithIn(a,b);        }    }}public Readonly struct LongStruct{    public Readonly long Primitive;    public LongStruct(long value) => Primitive = value;    public static LongStruct Add(LongStruct lhs,in DoubleStruct rhs)        => new DoubleStruct(lhs.Primitive + rhs.Primitive);}

86

C.AddLongs(Int64,Int64)    L0000: push ebp    L0001: mov ebp,esp    L0003: mov eax,0x3b9aca00    L0008: dec eax    L0009: test eax,eax    L000b: jg L0008    L000d: pop ebp    L000e: ret 0x10C.AddLongStructs(LongStruct,LongStruct)    L0000: push ebp    L0001: mov ebp,esp    L0003: push esi    L0004: mov esi,0x3b9aca00    L0009: mov eax,[ebp+0x10]    L000c: mov edx,[ebp+0x14]    L000f: add eax,[ebp+0x8]    L0012: adc edx,[ebp+0xc]    L0015: mov [ebp+0x10],eax    L0018: mov [ebp+0x14],edx    L001b: dec esi    L001c: test esi,esi    L001e: jg L0009    L0020: pop esi    L0021: pop ebp    L0022: ret 0x10C.AddLongStructsWithIn(LongStruct,esi    L001e: jg L0009    L0020: pop esi    L0021: pop ebp    L0022: ret 0x10C.AddDoubles(Double,Double)    L0000: push ebp    L0001: mov ebp,eax    L000b: jg L0008    L000d: pop ebp    L000e: ret 0x10C.AddDoubleStructs(DoubleStruct,DoubleStruct)    L0000: push ebp    L0001: mov ebp,0x3b9aca00    L0008: fld qword [ebp+0x10]    L000b: fld qword [ebp+0x8]    L000e: faddp st1,st0    L0010: fstp qword [ebp+0x10]    L0013: dec eax    L0014: test eax,eax    L0016: jg L0008    L0018: pop ebp    L0019: ret 0x10C.AddDoubleStructsWithIn(DoubleStruct,0x3b9aca00    L0008: fld qword [ebp+0x10]    L000b: fadd qword [ebp+0x8]    L000e: fstp qword [ebp+0x10]    L0011: dec eax    L0012: test eax,eax    L0014: jg L0008    L0016: pop ebp    L0017: ret 0x10

64位

C.AddLongs(Int64,0x3b9aca00    L0005: dec eax    L0007: test eax,eax    L0009: jg L0005    L000b: retC.AddLongStructs(LongStruct,LongStruct)    L0000: mov eax,0x3b9aca00    L0005: add rdx,r8    L0008: dec eax    L000a: test eax,eax    L000c: jg L0005    L000e: retC.AddLongStructsWithIn(LongStruct,eax    L000c: jg L0005    L000e: retC.AddDoubles(Double,Double)    L0000: vzeroupper    L0003: mov eax,0x3b9aca00    L0008: vaddsd xmm1,xmm1,xmm2    L000d: dec eax    L000f: test eax,eax    L0011: jg L0008    L0013: retC.AddDoubleStructs(DoubleStruct,r8    L0011: mov eax,0x3b9aca00    L0016: mov rdx,[rsp+0x28]    L001b: mov [rsp+0x10],rdx    L0020: mov rdx,[rsp+0x30]    L0025: mov [rsp+0x8],rdx    L002a: vmovsd xmm0,qword [rsp+0x10]    L0031: vaddsd xmm0,[rsp+0x8]    L0038: vmovsd [rsp],xmm0    L003e: mov rdx,[rsp]    L0042: mov [rsp+0x28],rdx    L0047: dec eax    L0049: test eax,eax    L004b: jg L0016    L004d: add rsp,0x18    L0051: retC.AddDoubleStructsWithIn(DoubleStruct,r8    L000e: mov eax,0x3b9aca00    L0013: vmovsd xmm0,qword [rsp+0x20]    L001a: vmovaps xmm1,xmm0    L001f: vaddsd xmm1,[rsp+0x18]    L0026: vmovsd [rsp],xmm1    L002c: mov rdx,[rsp]    L0030: mov [rsp+0x18],rdx    L0035: dec eax    L0037: test eax,eax    L0039: jg L001a    L003b: add rsp,0x8    L003f: ret

SharpLab

我对汇编不够熟悉,无法解释它究竟在做什么,但很明显AddDoubleStructs中的工作比AddLongStructs更多.

总结

以上是内存溢出为你收集整理的c# – 为什么添加DoubleStructs比添加双倍速度要慢得多?全部内容,希望文章能够帮你解决c# – 为什么添加DoubleStructs比添加双倍的速度要慢得多?所遇到的程序开发问题。

如果觉得内存溢出网站内容还不错,欢迎将内存溢出网站推荐给程序员好友。

欢迎分享,转载请注明来源:内存溢出

原文地址: http://outofmemory.cn/langs/1215182.html

(0)
打赏 微信扫一扫 微信扫一扫 支付宝扫一扫 支付宝扫一扫
上一篇 2022-06-05
下一篇 2022-06-05

发表评论

登录后才能评论

评论列表(0条)

保存