Introduction
One of the many functions of an Operating System is to manage the memory, and distribute them amongst running processes and threads.
The operating system does this by having a pool of memory, that means unused memory, then assign memory from the pool to any program/process that request for it.
This act of assigning memory from the pool, to the program/process is called memory allocation
Two method of memory allocation
- Direct Memory Mapping
This method directly notify the operating system that you need an amount of RAM, and the operating system would assign it to you.
The disadvantage of this is that system call ( direct notification to the operating system ) is an expensive process, as in it is slow. Further more, there are often restrictions that one have to allocate to page size, or it will be rounded up to page boundary. ( Normal page size is 4k on x86 CPU if I am not wrong. ) Another advantage is that you get to set the I/O privilege of the pages.
Example of implementation:
mmap(); // POSIX
VirtualAlloc(); // Windows
- Allocate from heap
This method is to preallocate a few page ( called the program heap ), then whenever a request for memory is issued, it will see if the free space in heap is big enough to fulfill the request. If it is, then it will directly assign those memory from the heap to the program. Otherwise, it will allocate more page, then assign the memory. Note that this method happens outside of the Operating System.
Disadvantage is that there’s an extra layer, and does not allow you to specify the I/O privilege of the pages that you are allocating. The advantage is that one can allocate any size of memory, without the limitation of page size. Also, it is faster, as most of the time it doesn’t involve in system calls.
Exampe of implementation:
mallloc(); // ANSI C
Benchmarking
The easiest way to understand the performance of these memory allocation routine is to do a benchmark.
I conducted one on my laptop ( For specs, see previous post. ). The contesters are:
- mmap() on Linux // PROT_READ | PROT_WRITE | PROT_EXEC, MAP_ANONYMOUS | MAP_PRIVATE
- malloc() on Linux
- VirtualAlloc() on Windows // MEM_COMMIT, PAGE_EXECUTE_READWRITE
- malloc() on Windows
The software versions are:
Linux kernel 2.6.18, glibc 2.3.6
Windows XP SP2
The procedure as follow:
1. Start timing.
2. Allocate x byte.
3. Access the first byte of the allocated memory.
4. Free the memory.
5. Repeat step 2~4 for a total of 2048 times.
6. Stop timing and record reading.
Timing is done with the following API/Function:
gettimeofday(); // POSIX/Linux
QueryPerformanceCounter(); // Windows
Hmm, probably accurate enough, although:
Sleep(10); Resulted in around 89xx microsecond
while
usleep(10*1000); Resulted in around 100xx microsecond
// Note: Sleep() is a Windows API, while usleep() is a POSIX one.
The program is compiled with Visual C++ 6 on Windows, and gcc on Linux. ( duh! )
Benchmarking Result
Let’s visualize it:

Click to enlarge?
Seemed that malloc() on Windows is sort of messed up after 64kb allocation
Perhaps Microsoft should fix it?
It even spoilt the graph, that we only see the green line.
This is the graph without malloc() on Windows:

Click to enlarge?
Seemed that all 3 function scaled O(1) before 128kb.
I wondered how malloc() on Windows scaled.
So this is the third graph with all 4 function, but only with test case 
Click to enlarge?
As seen from the graph, all the 3 function from the previous graph scaled O(1), while malloc() on Windows goes at out of control at around 16kb.
Conclusion
On Linux: Use malloc to save trouble, it scales quite well. Use mmap if you want to do advanced memory management.
On Windows: Use malloc before 16kb size, use VirtualAlloc() for bigger page size.
December 25, 2008 at 6:55 am |
well, hi admin adn people nice forum indeed. how’s life? hope it’s introduce branch
September 26, 2009 at 12:58 pm |
if logical address space is 1024words for 8 pages
it is mapped into physical address of 32 frames
how many bits are there in logical and physical address.
please clarify my doubts