Would a large amount of stack space required by a function prevent it from being inlined? For example, if I had a 10k automatic buffer on the stack, would that make the function less likely to be inlined?
I have a few heavily optimized math functions that take 1-2 nanoseconds to complete. These functions are called hundreds of millions of times per second, so call overhead is a concern, despite the already-excellent performance.
In the Delphi math.pas unit there is a procedure DivMod that I want to convert to inline and optimize for a divisor that is always 10. But I don't know the details of Pentium ASM. What is the conversion of the procedure below?
The execution times for these three snippets are, on a 4770K, roughly 5 cycles per iteration for the first snippet, roughly 9 cycles per iteration for the second snippet, and 5 cycles for the third snippet. They both access the exact same address, which is 4K-aligned. In the...
Fully knowing that these completely artificial benchmarks don't mean much, I am nonetheless a bit surprised by the several ways the "big 4" compilers chose to compile a trivial snippet.
I am working on an assembly language program for a 6502 CPU and am finding that I need a fast-as-possible divide-by-seven routine, in particular one that can take a 16-bit dividend.