template<int size, typename T>
void template_memcpy(T* dest, const T* src)
{
    struct type {
        T data[size];
    };
    *reinterpret_cast<type *>(dest) = *reinterpret_cast<const type *>(src);
}
The code looks pretty cool. It does the job of a conventional memcpy by wrapping the array in a struct and relying on the compiler-generated struct assignment.
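To make the idea concrete, here is a minimal sketch of how the template might be called on a 31-byte char array. The helper name copy_31_bytes_roundtrip and the fill pattern are made up for illustration. Note that the cast assumes the struct has the same layout as the bare array, which compilers honour in practice but which the standard does not strictly promise.

```cpp
#include <cassert>
#include <cstring>

template<int size, typename T>
void template_memcpy(T* dest, const T* src)
{
    struct type {
        T data[size];
    };
    *reinterpret_cast<type *>(dest) = *reinterpret_cast<const type *>(src);
}

// Hypothetical helper: copies a 31-byte buffer with template_memcpy and
// reports whether every byte arrived intact.
bool copy_31_bytes_roundtrip()
{
    char src[31];
    for (int i = 0; i < 31; ++i) {
        src[i] = static_cast<char>('a' + i % 26);  // arbitrary test pattern
    }

    char dest[31] = {};
    assert(sizeof(dest) >= sizeof(src));  // destination must be large enough

    // The size is a template argument, so it must be a compile-time constant.
    template_memcpy<sizeof(src)>(dest, src);

    return std::memcmp(dest, src, sizeof(src)) == 0;
}
```

Calling copy_31_bytes_roundtrip() from main should return true if the copy behaves like memcpy.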
I ran a test with the Visual C++ 2008 compiler to see whether the code generated for template_memcpy is as good as the code generated for a conventional C-style memcpy.
Our source is a 31-byte char array. Take a look at the conventional C-style memcpy call, together with its disassembly.
memcpy(dest, src, sizeof(src));
00241053 B9 07 00 00 00 mov ecx,7
00241058 8D 74 24 1C lea esi,[esp+1Ch]
0024105C 8B FB mov edi,ebx
0024105E F3 A5 rep movs dword ptr es:[edi],dword ptr [esi]
00241060 66 A5 movs word ptr es:[edi],word ptr [esi]
00241062 A4 movs byte ptr es:[edi],byte ptr [esi]
From the assembly code, we can see that the CPU performs the following steps:
1) Move double word (4 bytes) from source to destination memory 7 times.
2) Move word (2 bytes) from source to destination memory 1 time.
3) Move byte from source to destination memory 1 time.
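The three steps add up to 7 * 4 + 2 + 1 = 31 bytes. As a rough sketch, the same decomposition can be written out in C++. The function name chunked_copy_31 is hypothetical, and the inner std::memcpy calls only stand in for the individual movs instructions; this is not what the compiler actually emits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical emulation of the three instruction groups in the listing.
// Returns the total number of bytes copied.
std::size_t chunked_copy_31(unsigned char* dest, const unsigned char* src)
{
    std::size_t offset = 0;

    // rep movs dword: ecx = 7, so seven moves of 4 bytes each (28 bytes).
    for (int i = 0; i < 7; ++i) {
        std::memcpy(dest + offset, src + offset, 4);
        offset += 4;
    }

    // movs word: one move of 2 bytes.
    std::memcpy(dest + offset, src + offset, 2);
    offset += 2;

    // movs byte: one final move of 1 byte.
    std::memcpy(dest + offset, src + offset, 1);
    offset += 1;

    assert(offset == 31);  // 28 + 2 + 1
    return offset;
}
```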
How about the generated code for template_memcpy?
template_memcpy<sizeof(src)>(dest, src);
002410C7 B9 07 00 00 00 mov ecx,7
002410CC 8D 74 24 1C lea esi,[esp+1Ch]
002410D0 8B FB mov edi,ebx
002410D2 F3 A5 rep movs dword ptr es:[edi],dword ptr [esi]
002410D4 66 A5 movs word ptr es:[edi],word ptr [esi]
002410D6 A4 movs byte ptr es:[edi],byte ptr [esi]
See? template_memcpy is as good as the conventional memcpy: the generated instructions are identical.
OK. So, why would we choose template_memcpy over memcpy?
template_memcpy will come in handy when you need to copy an array of objects.
Instead of :-
MyObject src[100];
MyObject *dest = new MyObject[sizeof(src) / sizeof(src[0])];
for (size_t i = 0; i < sizeof(src) / sizeof(src[0]); i++) {
    dest[i] = src[i];
}
delete[] dest;
We may :-
MyObject src[100];
MyObject *dest = new MyObject[sizeof(src) / sizeof(src[0])];
template_memcpy<sizeof(src) / sizeof(src[0])>(dest, src);
delete[] dest;
The code looks cleaner, doesn't it?
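The call still spells out sizeof(src) / sizeof(src[0]) by hand. One possible refinement, sketched here under the assumption that the source is a real array rather than a pointer, is a thin wrapper that lets the compiler deduce the element count. The names copy_array and copy_objects_demo, and the fields of MyObject, are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

template<int size, typename T>
void template_memcpy(T* dest, const T* src)
{
    struct type {
        T data[size];
    };
    *reinterpret_cast<type *>(dest) = *reinterpret_cast<const type *>(src);
}

// Hypothetical element type standing in for MyObject.
struct MyObject {
    int id;
    double value;
};

// Hypothetical wrapper: N is deduced from the array reference, so the
// caller does not have to compute the element count.
template<typename T, std::size_t N>
void copy_array(T* dest, const T (&src)[N])
{
    template_memcpy<static_cast<int>(N)>(dest, src);
}

// Fills a source array, copies it, and returns true if all elements match.
bool copy_objects_demo()
{
    MyObject src[100];
    for (int i = 0; i < 100; ++i) {
        src[i].id = i;
        src[i].value = i * 0.5;
    }

    MyObject* dest = new MyObject[100];
    copy_array(dest, src);  // N = 100 is deduced from src

    bool ok = true;
    for (int i = 0; i < 100; ++i) {
        if (dest[i].id != src[i].id || dest[i].value != src[i].value) {
            ok = false;
        }
    }
    assert(ok);

    delete[] dest;
    return ok;
}
```

The wrapper only works when src is visible as an array. Once it has decayed to a pointer, the count can no longer be deduced.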
Of course, template_memcpy has a shortcoming: it only supports sizes that are known at compile time :)
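To illustrate the restriction, the sketch below contrasts a compile-time size with a run-time one. The helper runtime_copy is hypothetical, and it falls back to std::copy, which is a different technique from the template trick and is shown only as one possible fallback.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

template<int size, typename T>
void template_memcpy(T* dest, const T* src)
{
    struct type {
        T data[size];
    };
    *reinterpret_cast<type *>(dest) = *reinterpret_cast<const type *>(src);
}

// Hypothetical fallback for element counts known only at run time.
void runtime_copy(int* dest, const int* src, std::size_t count)
{
    assert(dest != 0 && src != 0);
    // template_memcpy<count>(dest, src) would not compile here, because
    // count is a function parameter and not a constant expression.
    std::copy(src, src + count, dest);
}

// Compile-time case: the count is a literal, so the template can be used.
void fixed_copy_4(int* dest, const int* src)
{
    template_memcpy<4>(dest, src);
}
```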