Like RyuJIT does before running, the DivideSharp optimizes an integer division by "mostly constant" values.
DivideSharp first computes the magic parameters for division and then converts a single division instruction, such as idiv or div, into an equivalent code that has no such slow division instructions and whose entire division code is faster than a single division instruction.
When the divisor is a power of two like 0b1000_0000u(=128u), the division can be simplified to a right-shift operation like:
static uint D128(uint value) => value >> 7;The above code is then compiled as follows(using SharpLab):
D128(UInt32)
L0000: mov eax, ecx //The first parameter `value` is in `ecx` because of `static`.
L0002: shr eax, 7 //Shift the eax 7 bits right
L0005: ret //return eax;The division of unsigned integers can theoretically be transformed into a set of multiplications and shifts, as shown in the following equation:
The magic is selected by this algorithm.
Furthermore, since division by can be converted to a shift operation, this formula can be rewritten into the following C# code:
public static uint D239Custom(uint value)
{
//In C#, the `mul` instruction used by real-world compilers such as RyuJIT and Clang cannot be specified directly, so the 64-bit unsigned multiplication `imul` is used instead.
ulong v = value * 0x891ac73b;
v >>= 39;
return (uint)v;
}The above code is then compiled as follows:
D239Custom(UInt32)
L0000: imul eax, ecx, 0x891ac73b
L0006: mov eax, eax
L0008: shr rax, 0x27
L000c: retHowever, codes like D239Custom may return inaccurate results depending on the denominator (e.g. 231).
RyuJIT and DivideSharp uses the following expression instead.
(store the value of
into
edx)
then calculate
(The variable "edx" is named after the AMD64 32-bit register edx.)
And the aforementioned equation can be organized as follows:
Since does not fit into the 32-bit unsigned integer variable
magic, term adds
to the numerator stored in
magic.
Now, when the divisor is 231, we can rewrite this expression in the following C# code:
public static uint D231Custom(uint value)
{
ulong v = (ulong)value * 0x1bb4a405;
v >>= 32;
value -= (uint)v;
value >>= 1;
var q = value + (uint)v;
q >>= 7;
return q;
}The above code is then compiled as follows:
D231Custom(UInt32)
L0000: mov eax, ecx
L0002: imul rax, 0x1bb4a405
L0009: shr rax, 0x20
L000d: sub ecx, eax
L000f: shr ecx, 1
L0011: add eax, ecx
L0013: shr eax, 7
L0016: retIf the denominator is greater than 2147483648(0x8000_0000u), as in 2147483649, the numerator cannot be more than twice its denominator.
For this reason, it is more efficient to use the if statement instead of dividing by the method described above.
In C# it looks like the following:
//The Unsafe class belongs to System.Runtime.CompilerServices
public static uint D2147483649(uint value) => value >= 2147483649 ? 1u : 0u;
public static uint D2147483649Ex(uint value) //Equivalent to value >= 2147483649 ? 1u : 0u
{
bool f = value >= 2147483649; //`cmp ecx, 0x80000001` and `setae al`
return Unsafe.As<bool, byte>(ref f); //moxzx eax, al
}The above code is then compiled as follows:
D2147483649(UInt32) //Ternary operator
L0000: cmp ecx, 0x80000001
L0006: jae short L000b
L0008: xor eax, eax
L000a: ret
L000b: mov eax, 1
L0010: ret
D2147483649Ex(UInt32) //Unsafe.As<bool, byte>
L0000: cmp ecx, 0x80000001
L0006: setae al
L0009: movzx eax, al
L000c: retThe UInt32Divisor generalizes the four cases mentioned above with the following code:
public uint Divide(uint value)
{
uint strategy = (uint)Strategy;
uint divisor = Divisor;
if (strategy == (uint)UInt32DivisorStrategy.Branch)
{
bool v = value >= divisor;
return Unsafe.As<bool, byte>(ref v);
}
ulong rax = value;
uint eax;
ulong multiplier = Multiplier;
int shift = Shift;
if ((strategy & 0b10u) > 0)
{
rax *= multiplier;
if ((strategy & 0b01u) > 0)
{
eax = (uint)(rax >> 32);
value -= eax;
value >>= 1;
eax += value;
rax = eax;
}
}
eax = (uint)(rax >> shift);
return eax;
}Details
var uInt32Divisor = new UInt32Divisor(19);var quotient = uInt32Divisor.Divide(39); //2var remainder = uInt32Divisor.Modulo(39); //1- Unlike
Math.DivRem, theoutparameter isquotient, notremainder.
var remainder = uInt32Divisor.DivRem(39, out var quotient);
//remainder: 1
//quotient: 2- Calculates the largest multiple of divisor less than or equal to the specified value.
var rounded = uInt32Divisor.Floor(39); //38var remainder = uInt32Divisor.Floor(39, out var rounded);
//remainder: 1
//rounded: 38var quotient = 38 / uInt32Divisor; //2var remainder = 39 % uInt32Divisor; //1Details
var dn19 = new Int32Divisor(-19); //Divisor -19
var dp19 = new Int32Divisor(19); //Divisor 19var quotient = dn19.Divide(39); //-2
var quotient2 = dn19.Divide(-39); //2var remainder = dn19.Modulo(39); //1
var remainder2 = dn19.Modulo(-39); //-1- Unlike
Math.DivRem, theoutparameter isquotient, notremainder.
var remainder = dn19.DivRem(39, out var quotient);
//remainder: 1
//quotient: -2- Divides the value with divisor, truncates the quotient towards zero, and multiplies the quotient with divisor.
- Equivalent to (int)(value / divisor) _ divisor.
- NOT Equivalent to
(int)Math.Floor(value / (double)divisor) _ divisorwhich internally truncates the value towards negative infinity.
var rounded = dn19.Floor(39); //It returns 38 unlike the (int)Math.Floor(39 / -19.0) * -19 returns 57
var rounded2 = dn19.Floor(-39); //-38
var rounded3 = dp19.Floor(-39); //It returns -38 unlike the (int)Math.Floor(-39 / 19.0) * 19 returns -57
var rounded4 = dp19.Floor(39); //38- Calculates
rounded = value / divisor * divisor, remainder = value - rounded
var remainder = dn19.Floor(39, out var rounded);
//remainder: 1
//rounded: 38var quotient = 38 / dn19; //-2var remainder = 39 % dn19; //1