Intrinsics

Intrinsics in C++ are operations, including some function calls, for which the compiler itself knows how to generate code, and does so inline rather than emitting a call.

For example, a simple copy of a string from one buffer to another looks like this:

strcpy(dest, src);
00419123 lea eax,[ebp-54h]
00419126 push eax
00419127 lea ecx,[ebp-9Ch]
0041912D push ecx
0041912E call @ILT+1460(_strcpy) (4115B9h)
00419133 add esp,8

This pushes the addresses of the source and destination buffers onto the stack and calls the strcpy function. This is the code Visual C++ 2005 generates in Debug mode. In Release mode with optimization disabled, you get the same code. But enable optimizations, and you get this:

strcpy(dest, src);
004015B0 xor eax,eax
004015B2 mov cl,byte ptr [esp+eax]
004015B5 add eax,1
004015B8 test cl,cl
004015BA jne TestApp::TestApp+52h (4015B2h)

Even if you disable intrinsics, you still get the inline strcpy. You have to turn off optimization to get the call to the runtime library strcpy function in a release build. I'm not sure why: strcpy is on the list of intrinsic functions the compiler supports, but whether you get an inline strcpy depends on the overall optimization setting, not on the /Oi setting that allegedly controls intrinsics.
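For finer control than the /Oi switch, Visual C++ also documents two pragmas: #pragma intrinsic asks for the intrinsic form of a function, and #pragma function asks for a real call. Here is a minimal sketch of the second one. The helper copy_with_call is a made-up name for illustration, and I have not checked whether the pragma changes the optimized output shown above, so treat it as something to try rather than a confirmed fix. The pragma is guarded so other compilers skip it.

```cpp
#define _CRT_SECURE_NO_WARNINGS  // silence MSVC's strcpy deprecation warning
#include <cassert>
#include <cstddef>
#include <cstring>

#ifdef _MSC_VER
// Ask MSVC to emit a real call to strcpy from this point on instead of
// using its intrinsic form. Must appear at global scope, outside functions.
#pragma function(strcpy)
#endif

// Hypothetical helper (not from the original test app): copies src into
// dest and returns the number of characters copied, excluding the null.
std::size_t copy_with_call(char *dest, const char *src)
{
    assert(dest && src);
    std::strcpy(dest, src);
    return std::strlen(dest);
}
```

Calling copy_with_call(buffer, "hello") in an optimized build and looking at the disassembly should show whether the call to strcpy survives.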

What I was hoping to see when I started this little test was Whole Program Optimization inlining the call to strcpy whether I had intrinsics enabled or not, but that's not what I'm seeing.

If I create my own custom strcpy function:

void mystrcpy(char *dest, char *src)
{
    // Copy each character, including the terminating null, then stop.
    for (; (*dest = *src) != 0; ++src, ++dest)
        ;
}

and I call that, Whole Program Optimization does kick in, changing the code from a call to the function:

mystrcpy(dest, src);
00401619 lea eax,[esp+18h]
0040161D push eax
0040161E lea ecx,[esp+5Ch]
00401622 push ecx
00401623 call mystrcpy (401910h)
00401628 add esp,14h

Into an inline version of the function:

mystrcpy(dest, src);
00402330 cmp byte ptr [esp],0
00402334 je TestApp::TestApp+6Bh (40234Bh)
00402336 xor eax,eax
00402338 jmp TestApp::TestApp+60h (402340h)
0040233A lea ebx,[ebx]
00402340 mov cl,byte ptr [esp+eax+1]
00402344 add eax,1
00402347 test cl,cl
00402349 jne TestApp::TestApp+60h (402340h)

But I'd take the intrinsic over this version any day. That's what's good about intrinsics - they are functions the compiler has special knowledge of, so it can generate customized assembly code that does a better job of implementing that function than the normal code generation path would.
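If you want to repeat the comparison yourself, a small harness along these lines should do. This is a sketch, not the actual test app: the file name, the copies_match helper, and the 64-byte buffers are assumptions, and src is declared const here so string literals can be passed in. The command lines in the comment use documented cl switches, but the code you get will vary by compiler version.

```cpp
// test_copy.cpp (hypothetical file name)
// Assumed MSVC command lines:
//   cl /O2 /Oi test_copy.cpp              optimized, intrinsics enabled
//   cl /O2 /GL test_copy.cpp /link /LTCG  with Whole Program Optimization
// To see Whole Program Optimization inline across files, mystrcpy would
// need to live in its own .cpp file; it is kept here for brevity.
#define _CRT_SECURE_NO_WARNINGS  // silence MSVC's strcpy deprecation warning
#include <cassert>
#include <cstring>

// Same loop as the custom function above, with a const source pointer.
void mystrcpy(char *dest, const char *src)
{
    for (; (*dest = *src) != 0; ++src, ++dest)
        ;
}

// Copies text with both routines and reports whether the results match.
// Returns false if text would not fit in the local buffers.
bool copies_match(const char *text)
{
    assert(text);
    char viaLibrary[64];
    char viaCustom[64];
    if (std::strlen(text) >= sizeof(viaLibrary))
        return false;
    std::strcpy(viaLibrary, text);  // candidate for the intrinsic
    mystrcpy(viaCustom, text);      // candidate for ordinary inlining
    return std::strcmp(viaLibrary, viaCustom) == 0;
}
```

Put a breakpoint inside copies_match and switch to the disassembly view to compare the two copies side by side.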

(What got me started on this was a post on the Visual C++ Team Blog about improved intrinsic support in VS2008.)