Delphi Memory Allocation and Dynamic-Link Libraries

Borland's comments within a wizard-generated DLL project:

{ Important note about DLL memory management: ShareMem must be the first unit in your library's USES clause AND your project's (select Project-View Source) USES clause if your DLL exports any procedures or functions that pass strings as parameters or function results. This applies to all strings passed to and from your DLL--even those that are nested in records and classes. ShareMem is the interface unit to the BORLNDMM.DLL shared memory manager, which must be deployed along with your DLL. To avoid using BORLNDMM.DLL, pass string information using PChar or ShortString parameters. }
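
As a rough sketch of what the comment asks for, the two projects below both list ShareMem first in their USES clauses, so a long string can cross the module boundary. The names GreetLib, GreetApp and Greeting are hypothetical, chosen only for illustration, and BORLNDMM.DLL is assumed to be deployed alongside both modules.

In DLL:

        library GreetLib;              // hypothetical project name

        uses
          ShareMem,                    // must be the first unit, per the comment
          SysUtils;

        // Returns a long string to the caller. This is only safe
        // because both modules use the shared memory manager.
        function Greeting: string;
        begin
          Result := 'Hello from ' + 'the DLL';
        end;

        exports
          Greeting;

        begin
        end.

In EXE:

        program GreetApp;              // hypothetical project name
        {$APPTYPE CONSOLE}

        uses
          ShareMem,                    // first here as well
          SysUtils;

        function Greeting: string; external 'GreetLib.dll';

        begin
          WriteLn(Greeting);
        end.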

Why are these precautions necessary? The reason is rooted in the way Delphi allocates memory. While Windows provides native memory allocation routines (VirtualAlloc, HeapAlloc, GlobalAlloc, LocalAlloc etc.), Delphi implements its own allocation policy, or more accurately a suballocator. In Pascal parlance, it is called a heap (not to be confused with priority heaps); C/C++ programmers would be more familiar with the term free store.  The task of the suballocator is to allocate all dynamic memory: from raw memory explicitly allocated by the programmer to that implicitly allocated by the compiler, as when creating strings, dynamic arrays and objects. 

Few developers realize that they are implicitly allocating memory in statements such as:


        var s: string;
        	...
        s := s + 'abc'; // allocates a new heap block for the longer string

The dynamic-allocation routines most Delphi users are familiar with are GetMem(), FreeMem(), New() and Dispose(). But many seemingly simple actions in Delphi also cause heap memory to be allocated or deallocated, among them concatenating long strings, resizing dynamic arrays, and creating or freeing objects.
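
To make the point concrete, here is a small console program in which no line calls GetMem() or New(), yet several of them reach the suballocator. It is a sketch only; the program name and variable names are made up for illustration.

        program ImplicitAlloc;         // hypothetical name
        {$APPTYPE CONSOLE}

        uses
          Classes;

        var
          s: string;
          a: array of Integer;
          list: TStringList;
        begin
          s := 'abc';                  // points at constant data; no heap block yet
          s := s + 'def';              // concatenation allocates a new string block
          SetLength(a, 100);           // dynamic array storage comes from the heap
          list := TStringList.Create;  // every object instance is a heap block
          try
            list.Add(s);               // the list allocates its internal storage
          finally
            list.Free;                 // gives the object's block back to the heap
          end;
          WriteLn(s, ' ', Length(a));
        end.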

In Delphi, all objects "live" on the heap. This is similar to Java and C# (reflecting the influence of Java and Delphi on C#), but not to C++, where objects may live on the stack, on the heap, or even in the data segment.

Developers familiar with 16-bit Windows programming may wonder why Delphi doesn't use Windows heaps (via HeapCreate()/HeapAlloc()/HeapFree() etc.) or even the virtual memory functions VirtualAlloc()/VirtualFree(). The simple reason is speed. The Windows heap functions are very slow compared to Delphi's native allocation. The virtual memory functions are slower still, but only because they were never designed for allocating large numbers of small blocks (that is what heaps are for). The suballocator does, however, ultimately call these virtual memory functions when it needs large blocks from which to suballocate.

The memory manager code resides in System.pas and GetMem.inc, and as such, is compiled (statically linked) with every application. This is not normally a problem, but in applications using DLLs also written in Delphi, this has certain implications: since a DLL is a separately compiled application, it receives its own copy of the memory manager, and thus a separate heap.  This is the most important thing to remember: each distinct application, whether an .exe or .dll, manages its own heap. All subsequent problems arise simply from having one exe/dll mistakenly manage a piece of memory that does not live in its own heap. 

What are heaps?

For those unfamiliar with heaps and how they are used in Delphi, a heap is a region of memory in which dynamically allocated memory is stored. In most structured languages like C/C++, Delphi and even Microsoft's new C#, a programmer may use two kinds of memory: static and dynamic. Basic data types, also called value types, are static: their memory requirements are known and fixed at compile time. Delphi's integers, enumerated types, records and static arrays are examples of statically allocated variables. In C, all data types are static, and the programmer must explicitly allocate dynamic memory through some form of allocation routine, like malloc(). Dynamic memory, on the other hand, may have its size adjusted at run time. Examples are Delphi's long strings, class types and dynamic arrays. In Visual Basic, many data types are dynamically allocated, among them variants and dynamic arrays. As a rule of thumb, any data type whose size may change at run time is dynamically allocated.

From a compiler's point of view, these two kinds of memory are very distinct and "live" in completely separate sections of an application's memory: global statically allocated variables live in a global static data area, local statically allocated variables live on the stack, and dynamic memory blocks live on a heap. This separation is ingrained very deeply into the fabric of modern programming, extending into the operating system and down into the hardware itself. This is why many chips (such as the Intel x86 family) have explicit support for data and stack segments.
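
The three storage areas described above can be seen side by side in a short program. This is only an illustrative sketch; TPoint3, GlobalCount and Demo are hypothetical names.

        program WhereThingsLive;       // hypothetical name
        {$APPTYPE CONSOLE}

        type
          TPoint3 = record
            X, Y, Z: Integer;
          end;
          PPoint3 = ^TPoint3;

        var
          GlobalCount: Integer;        // global variable: static data area

        procedure Demo;
        var
          local: TPoint3;              // local record: lives on the stack
          p: PPoint3;                  // the pointer itself is on the stack
          s: string;                   // reference on the stack, characters on the heap
        begin
          local.X := 1;
          local.Y := 2;
          local.Z := 3;
          New(p);                      // the record p points to lives on the heap
          p^ := local;                 // copies stack data into the heap block
          s := 'x';
          s := s + 'y';                // the string's storage is heap memory
          Inc(GlobalCount);
          WriteLn(p^.X + p^.Y + p^.Z, ' ', s, ' ', GlobalCount);
          Dispose(p);                  // heap blocks must be released explicitly
        end;

        begin
          Demo;
        end.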

 

The last line in the auto-generated comment deserves attention: "pass string information using PChar or ShortString parameters". It seems to suggest that using PChars or ShortStrings would "solve" the problem. This is very misleading, and can lull developers into a false sense of security. But consider:

In DLL:

        function GetPChar: PChar;
        begin
        	Result := StrAlloc( 13 );
        	StrCopy( Result, 'Hello World!' );
        end;

In EXE:

        var p: PChar;
        	...
        p := GetPChar;
        // do something
        StrDispose(p); // DLL heap possibly corrupted;
                       // "Invalid pointer operation" possibly thrown 

We get the same errors again.

There is a perception about PChars that since "the Windows API does it this way, it must be right". But the Windows API very seldom allocates PChars to hand to applications. Instead, it requires the caller to allocate the buffer and pass a parameter specifying its length, and the API then writes into this buffer. In fact, there is very little advantage to using PChars in Delphi, since the reference-counted string type is much safer and more efficient. Only very advanced users with a clear idea of their reasons for doing so should use them.
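
The caller-allocates convention just described might look like this in Delphi. It is a minimal sketch, not a definitive implementation: GreetLib, GetGreeting, the BufSize parameter and the 64-character buffer are all assumptions made for the example. Because the buffer belongs to the EXE from start to finish, neither module ever frees memory from the other's heap.

In DLL:

        library GreetLib;              // hypothetical project name

        uses
          SysUtils;

        // Copies the text into a buffer owned by the caller.
        // BufSize is the buffer size in characters, including the
        // terminating #0. Returns the length of the full text.
        function GetGreeting(Buffer: PChar; BufSize: Integer): Integer; stdcall;
        const
          Greeting = 'Hello World!';
        begin
          Result := Length(Greeting);
          if (Buffer <> nil) and (BufSize > 0) then
            StrLCopy(Buffer, Greeting, BufSize - 1); // truncates if needed, always adds #0
        end;

        exports
          GetGreeting;

        begin
        end.

In EXE:

        program GreetApp;              // hypothetical project name
        {$APPTYPE CONSOLE}

        function GetGreeting(Buffer: PChar; BufSize: Integer): Integer; stdcall;
          external 'GreetLib.dll';

        var
          buf: array[0..63] of Char;   // allocated and owned by the EXE
        begin
          GetGreeting(buf, Length(buf));
          WriteLn(buf);
        end.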

Passing objects doesn't help either:

In DLL:

        function GetStringList: TStringList;
        begin
        	Result := TStringList.Create;
        	Result.Add( 'foo' );
        end;

In EXE:

        var obj: TStringList;
        	...
        obj := GetStringList;
        // do something
        obj.Free; // may corrupt DLL heap; may free more than one block

Depending on what the EXE did to the object, this may corrupt both heaps. Note that, as of Delphi 6, simply having a module free memory from another module's heap does not seem to corrupt the heap per se. The heap manager keeps the free memory blocks in a linked list and, during deletion, attempts to merge adjacent free blocks. The "Invalid pointer operation" exception only occurs when a module attempts to free the last allocated block of another module's free list and, failing to recognize an invalid element, attempts to merge the free block with what seems to be a marker element (or might simply be garbage). Though the error only occurs in this instance, the implementation of the heap manager is not guaranteed to stay this way, and in future versions heap corruption could occur at any point.

The above PChar example could be "fixed" this way:

In DLL:

        function GetPChar: PChar;
        begin
        	Result := StrAlloc( 13 );
        	StrCopy( Result, 'Hello World!' );
        end;

        procedure FreePChar( p: PChar );
        begin
        	StrDispose( p );
        end;

In EXE:

        var p: PChar;
        	...
        p := GetPChar;
        // do something
        FreePChar( p ); // heap-friendly free

There is no equivalent "fix" for the TStringList version, since strings created within the EXE's heap may be freed by the TStringList's destructor, causing corruption in the EXE's heap.

So what is a proper solution?  There are several options:

But there is a simpler option. The FastSharemem unit is an attempt at an alternative solution. It is simple to use, requiring nothing more than including a single unit, and incurs no performance penalty. Plus, there are no DLLs to worry about.

 

 


(c) 2002 emil santos

codexterity
ems ATSIGN codexterity PERIOD com