Custom Object Memory Allocators

Mike Ash Friday Q&A, Chinese translation: Custom Object Memory Allocators

By TommyWu

Translation · Original: Friday Q&A 2010-12-17: Custom Object Allocators in Objective-C · by Mike Ash

Original: https://www.mikeash.com/pyblog/friday-qa-2010-12-17-custom-object-allocators-in-objective-c.html Published: 2010-12-17 Author: Mike Ash Translator: MiMo (mimo-v2.5-pro); code blocks are kept in the original English


Happy holidays, happy winter, and a joyous Friday Q&A to you all. Camille Troillard suggested that I discuss how to create custom object memory allocators in Objective-C, and today I’ll walk through how to do it and why you might want to.

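What It Means
As anyone who uses Objective-C knows, you allocate an instance of a class by writing [MyClass alloc]. Creating a custom allocator simply means replacing the standard allocator so that [MyClass alloc] calls your own code instead.

An Objective-C object is just a chunk of memory of the right size whose first pointer-sized chunk is set to point at the object’s class. A custom allocator therefore needs to return a pointer to a correctly sized chunk of memory with the class filled in.

Why It’s Useful
By far the biggest reason to write a custom allocator is performance. The trade-offs the standard allocator makes may not suit your particular case. It also has to work with every class in every situation, whereas your custom allocator only needs to work with your class and the situations in which it is used.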

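A Note on Garbage Collection
This post assumes manual reference counting (retain/release) memory management. Custom allocators are almost impossible to use under garbage collection, because there is no way to add a custom free callback. Some of these techniques (such as an object cache) can still be partly applied, but custom allocators mostly belong to the realm of manual memory management.

A Basic Custom Allocator
The +alloc method actually just calls +allocWithZone:. Memory zones are little more than a historical curiosity at this point, but they remain in the API, so the method to override is +allocWithZone:.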

    + (id)allocWithZone: (NSZone *)zone
    {

In order to call calloc, you need to know how much memory to allocate. Fortunately, the Objective-C runtime makes this easy. The class_getInstanceSize function tells you exactly this size:

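        // Zero-filled storage exactly the size of one instance.
        id obj = calloc(1, class_getInstanceSize(self));
        // The first pointer-sized slot holds the class (the isa).
        if(obj)
            *(Class *)obj = self;
        return obj;
    }

    - (void)dealloc
    {
        free(self);
        return;
        [super dealloc]; // never reached; only silences the compiler warning
    }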

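Gotchas
As with most things at this level, there are a few things to watch out for.

First, don’t do this unless you subclass NSObject directly. The -dealloc method is responsible both for destroying the object itself and for freeing the resources it holds. -[NSObject dealloc] only destroys the object (mostly), so it is safe not to call it. It is not safe for any other class, though. For example, if you tried this in an NSView subclass, you would end up leaking a large amount of internal state.

Second, the “(mostly)” above means that NSObject does a few things you need to think about. One is removing associated objects. If your objects may have associated objects, or you think there is even a slight chance they might, then you need to make sure they are removed. This can be done by calling objc_removeAssociatedObjects(self). The other is calling the destructors of C++ objects held in instance variables. Your best bet here is to avoid using C++ objects as instance variables. If you must use them, look into calling or imitating the private runtime function objc_destructInstance, which handles both C++ destructors and associated objects. (Translator’s note: in modern runtime implementations, the internal behavior of objc_destructInstance and whether it is private may have changed; consult the latest documentation.)

Third, memory debugging tools such as ObjectAlloc and zombie detection do not work on objects that use a custom allocator. For this reason, I recommend setting up a memory-debugging preprocessor macro that makes your objects use the standard allocator instead of the custom one while debugging, so that you can flip the switch and use these tools when needed.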

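Caching Objects
As a realistic example, I’ll write an allocator that places destroyed objects in a cache so they can be quickly reused. This is useful for classes that are created and destroyed so frequently that the standard allocator is too slow.

To reach maximum speed, I’ll make the following assumptions about how the class works and is used:

  • It is never subclassed, or if it is, the subclasses never add instance variables. (This allows all instances to go into the same cache.)

  • Its initializers can handle a “dirty” object; that is, the instance variables do not need to be zeroed. (This saves the time of zeroing each instance when it is pulled out of the cache.)

  • It is only ever allocated and destroyed on the same thread. (This makes a thread-safe cache unnecessary.)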

    + (id)allocWithZone: (NSZone *)zone
    {
        id obj = GetObjectFromCache();
        if(obj)
            *(Class *)obj = self; // the cache overwrote the isa slot; restore it
        else
            obj = [super allocWithZone: zone];
        return obj;
    }

    - (void)dealloc
    {
        // release any ivars here
        AddObjectToCache(self);
        // shut up the compiler
        return;
        [super dealloc];
    }

    // Singly linked list of cached objects, threaded through each object's first word.
    static id gCacheListHead;

    static id GetNext(id cachedObj)
    {
        return *(id *)cachedObj;
    }

    static void SetNext(id cachedObj, id next)
    {
        *(id *)cachedObj = next;
    }

    static id GetObjectFromCache(void)
    {
        id obj = gCacheListHead;
        if(obj)
            gCacheListHead = GetNext(obj);
        return obj;
    }

    static void AddObjectToCache(id obj)
    {
        SetNext(obj, gCacheListHead);
        gCacheListHead = obj;
    }

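Custom Block Allocator
Caching objects can speed things up significantly, but the initial allocations are not accelerated, and all those small allocations still carry space overhead. By allocating one large block of memory up front and cutting it into small chunks, it is possible both to speed up the initial allocations and to greatly reduce the per-object space overhead. To do this, I’ll keep the object cache scheme above, but with a modified implementation of the +allocWithZone: method: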

    + (id)allocWithZone: (NSZone *)zone
    {
        id obj = GetObjectFromCache();
        if(!obj)
        {
            AllocateNewBlockAndCache(self);
            obj = GetObjectFromCache();
        }
        if(obj) // still nil if the block allocation failed
            *(Class *)obj = self;
        return obj;
    }

    static void AllocateNewBlockAndCache(Class class)
    {
        static const size_t kBlockSize = 4096;
        char *newBlock = malloc(kBlockSize);
        if(!newBlock)
            return;
        size_t instanceSize = class_getInstanceSize(class);
        size_t instanceCount = kBlockSize / instanceSize;
        // Carve the block into instance-sized chunks and cache each one.
        while(instanceCount-- > 0)
        {
            AddObjectToCache((id)newBlock);
            newBlock += instanceSize;
        }
    }

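Conclusion
Writing a custom object allocator in Objective-C is relatively simple. The hard part is the allocator itself, which is entirely up to you. Once you have the allocator, you can plug it into your Objective-C class with the following steps:

  • Override +allocWithZone: to call your custom allocator, set the isa pointer of the memory block to self (translator’s note: in the modern runtime this may be called isa_t or a similar structure), and optionally zero the rest of the memory.
  • Override -dealloc to call your custom allocator, and do not call the superclass’s -dealloc.
  • Call objc_removeAssociatedObjects in -dealloc in case the object may have associated objects.
  • Subclass only NSObject directly, never any of its subclasses.

That’s all for this edition of Friday Q&A. Come back in two weeks for the next one. As always, your topic suggestions are welcome and requested, so if there is something you would like to see covered here, please send it in!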


#Original (English)

Source: https://www.mikeash.com/pyblog/friday-qa-2010-12-17-custom-object-allocators-in-objective-c.html

Merry holidays, happy winter, and a joyous Friday Q&A to you all. Camille Troillard suggested that I discuss how to create custom object memory allocators in Objective-C, and today I’m going to walk through how to accomplish this and why you might want to.

What It Means
As anyone who uses Objective-C knows, you allocate an instance of a class by writing [MyClass alloc]. Creating a custom allocator simply means replacing the standard allocator so that [MyClass alloc] calls into your own code instead.

An Objective-C object is just a chunk of memory with the right size, and with the first pointer-sized chunk set to point at the object’s class. A custom allocator thus needs to return a pointer to a properly-sized chunk of memory, with the class filled out appropriately.
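
The layout convention described above can be modeled in plain C (which is also valid Objective-C). This is only a sketch: ClassInfo, Point, kPointClass and class_of are hypothetical names standing in for the runtime's Class type and a real class, not anything the runtime provides.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the runtime's class structure. */
typedef struct ClassInfo {
    const char *name;
    size_t instance_size;
} ClassInfo;

/* A model "object": the first pointer-sized slot points at its class. */
typedef struct Point {
    const ClassInfo *isa;
    double x;
    double y;
} Point;

static const ClassInfo kPointClass = { "Point", sizeof(Point) };

/* Reads the class of any object that follows the convention,
   without knowing the object's concrete type. */
const ClassInfo *class_of(const void *obj)
{
    assert(obj != NULL);
    return *(const ClassInfo *const *)obj;
}
```

Because the class pointer always sits at offset zero, code such as class_of can identify an object from nothing but its address, which is why an allocator only has to get the size and that first slot right.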

Why It’s Useful
By far the largest reason to write a custom allocator is for performance. The standard allocator makes tradeoffs which may not be appropriate for your particular case. It also has to work with every class in every situation, whereas your custom allocator only needs to work with your class and the situations it’s used in.

Another reason is overhead. The standard allocator requires a certain amount of extra storage for each allocation for various reasons. This can be particularly expensive for very small objects allocated in very large numbers. A custom allocator can cut down on this overhead substantially by tailoring it to the needs of the class it’s written for.

A Note on Garbage Collection
This post assumes manual retain/release memory management. Custom allocators are mostly impossible to use under garbage collection, because there is no way to add a custom free callback. It is possible to use some of these techniques (like an object cache) but for the most part, custom allocators are reserved for the realm of manual memory management.

A Basic Custom Allocator
The +alloc method actually just calls through to +allocWithZone:. Although memory zones are pretty much just a historical curiosity at this point, they remain in the API. Thus the method to override is +allocWithZone::

    + (id)allocWithZone: (NSZone *)zone
    {

In order to call calloc, you need to know how much memory to allocate. Fortunately, the Objective-C runtime makes it easy. The class_getInstanceSize function will tell you exactly this:

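        // Zero-filled storage exactly the size of one instance.
        id obj = calloc(1, class_getInstanceSize(self));
        // The first pointer-sized slot holds the class (the isa).
        if(obj)
            *(Class *)obj = self;
        return obj;
    }

    - (void)dealloc
    {
        free(self);
        return;
        [super dealloc]; // never reached; only silences the compiler warning
    }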
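
For readers who want to try the allocate, tag, and free sequence outside an Objective-C project, here is a minimal plain C model of the same steps. ClassInfo, Counter, model_alloc and model_dealloc are hypothetical names; the instance_size field plays the role of class_getInstanceSize.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the runtime's class structure. */
typedef struct ClassInfo {
    const char *name;
    size_t instance_size;   /* what class_getInstanceSize would report */
} ClassInfo;

/* A model object whose first slot is the class pointer. */
typedef struct Counter {
    const ClassInfo *isa;
    long value;
} Counter;

/* Model of +allocWithZone:: zeroed storage, then fill in the class. */
void *model_alloc(const ClassInfo *cls)
{
    assert(cls != NULL && cls->instance_size >= sizeof(const ClassInfo *));
    void *obj = calloc(1, cls->instance_size);
    if (obj == NULL)
        return NULL;
    *(const ClassInfo **)obj = cls;
    return obj;
}

/* Model of -dealloc: return the storage to the allocator it came from. */
void model_dealloc(void *obj)
{
    free(obj);
}
```

The important pairing is that storage obtained from calloc goes back through free; mixing a custom allocation path with the standard deallocation path (or the reverse) is what the rest of the article takes care to avoid.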

Gotchas
As with most things at this level, there are a few things to watch out for.

First, don’t do this unless you subclass NSObject directly. The -dealloc method covers both destroying the object itself, and freeing resources it holds. -[NSObject dealloc] just destroys the object (mostly) so it’s safe not to call it. It’s not safe to do this for any other class, though. For example, if you tried this with an NSView subclass, you’d end up leaking a whole bunch of internal state.

Second, the “(mostly)” from above means there are some things that NSObject does that you need to think about. One is removing associated objects. If your objects may have associated objects, or you think there’s even a chance that they might, then you need to make sure they’re removed. This can be done by calling objc_removeAssociatedObjects(self). The other is calling destructors for C++ objects in instance variables. Your best bet here is to just avoid having C++ objects as instance variables. If you must have them, look into the possibility of calling or imitating the private runtime function objc_destructInstance, which takes care of both C++ destructors and associated objects.

Third, memory debugging tools like ObjectAlloc and zombies won’t work on objects with a custom allocator. For this reason, I recommend that you have a memory debugging preprocessor define which makes your objects use the standard allocator instead of your custom allocator, so that you can flip the switch and use these tools if need be.
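
One way such a switch could be wired up is sketched below in plain C. Everything here is an assumption made for illustration: the macro name MODEL_DEBUG_MEMORY, the fixed CHUNK_SIZE, and the chunk_alloc/chunk_free pair. Building with -DMODEL_DEBUG_MEMORY=1 routes every allocation through the standard allocator so that debugging tools see it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical switch: set to 1 when memory debugging tools are needed. */
#ifndef MODEL_DEBUG_MEMORY
#define MODEL_DEBUG_MEMORY 0
#endif

enum { CHUNK_SIZE = 64 };

/* Custom path only: list of recycled chunks, linked through the first word. */
static void *g_free_list;

void *chunk_alloc(void)
{
#if MODEL_DEBUG_MEMORY
    return calloc(1, CHUNK_SIZE);          /* standard allocator only */
#else
    void *chunk = g_free_list;
    if (chunk != NULL) {
        g_free_list = *(void **)chunk;     /* reuse a recycled chunk */
        return chunk;
    }
    return calloc(1, CHUNK_SIZE);
#endif
}

void chunk_free(void *chunk)
{
    assert(chunk != NULL);
#if MODEL_DEBUG_MEMORY
    free(chunk);                           /* tools can track the free */
#else
    *(void **)chunk = g_free_list;         /* keep the chunk for reuse */
    g_free_list = chunk;
#endif
}
```

Keeping both paths behind the same two functions means the rest of the code does not change when the switch is flipped.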

Caching Objects
For a realistic example, I’ll write an allocator that places destroyed objects in a cache so that they can be quickly reused. This sort of thing is useful for classes which are allocated and destroyed so frequently that the standard allocator is too slow.

In order to reach maximum speed, I’ll make a few assumptions about how this class works and is used:

  • It is never subclassed, or if it is, subclasses never add instance variables. (This allows it to put all instances in the same cache.)

  • Its initializer methods can deal with a “dirty” object; i.e. the instance variables don’t need to be zeroed out. (This saves time zeroing out each instance when pulling it out of the cache.)

  • It is only ever allocated and destroyed from the same thread. (This makes it unnecessary to create a thread-safe cache.)

    + (id)allocWithZone: (NSZone *)zone
    {
        id obj = GetObjectFromCache();
        if(obj)
            *(Class *)obj = self; // the cache overwrote the isa slot; restore it
        else
            obj = [super allocWithZone: zone];
        return obj;
    }

    - (void)dealloc
    {
        // release any ivars here
        AddObjectToCache(self);
        // shut up the compiler
        return;
        [super dealloc];
    }

    // Singly linked list of cached objects, threaded through each object's first word.
    static id gCacheListHead;

    static id GetNext(id cachedObj)
    {
        return *(id *)cachedObj;
    }

    static void SetNext(id cachedObj, id next)
    {
        *(id *)cachedObj = next;
    }

    static id GetObjectFromCache(void)
    {
        id obj = gCacheListHead;
        if(obj)
            gCacheListHead = GetNext(obj);
        return obj;
    }

    static void AddObjectToCache(id obj)
    {
        SetNext(obj, gCacheListHead);
        gCacheListHead = obj;
    }
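
The same cache can be modeled in plain C to see what the "dirty object" assumption means in practice. This is a sketch under the article's three assumptions (one size, tolerant initializers, one thread); ClassInfo, Particle, kParticleClass and the particle_* functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the runtime's class structure. */
typedef struct ClassInfo {
    const char *name;
    size_t instance_size;
} ClassInfo;

typedef struct Particle {
    const ClassInfo *isa;   /* reused as the "next" link while cached */
    int lifetime;
} Particle;

static const ClassInfo kParticleClass = { "Particle", sizeof(Particle) };

/* Head of the cache; not thread safe, by assumption. */
static void *g_cache_head;

static void *cache_pop(void)
{
    void *obj = g_cache_head;
    if (obj != NULL)
        g_cache_head = *(void **)obj;
    return obj;
}

static void cache_push(void *obj)
{
    *(void **)obj = g_cache_head;   /* overwrites only the first word */
    g_cache_head = obj;
}

Particle *particle_alloc(void)
{
    Particle *p = cache_pop();
    if (p == NULL) {
        p = calloc(1, sizeof *p);
        if (p == NULL)
            return NULL;
    }
    p->isa = &kParticleClass;       /* restore the class pointer */
    return p;
}

void particle_dealloc(Particle *p)
{
    assert(p != NULL);
    cache_push(p);                  /* recycled, never freed */
}
```

A recycled Particle comes back with its class pointer restored but with whatever lifetime value it had when it was destroyed, so the initializer must set every field it relies on.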

Custom Block Allocator
Caching objects can be a big speed boost, but the initial allocations are not accelerated, and you still have the space overhead of all of those small allocations. By allocating a large block of memory and chopping it up into chunks, it’s possible to speed up the initial allocations and vastly decrease the per-object overhead. To do this, I’ll use the same object cache scheme as above, but with a modification to the +allocWithZone: implementation:

    + (id)allocWithZone: (NSZone *)zone
    {
        id obj = GetObjectFromCache();
        if(!obj)
        {
            AllocateNewBlockAndCache(self);
            obj = GetObjectFromCache();
        }
        if(obj) // still nil if the block allocation failed
            *(Class *)obj = self;
        return obj;
    }

    static void AllocateNewBlockAndCache(Class class)
    {
        static const size_t kBlockSize = 4096;
        char *newBlock = malloc(kBlockSize);
        if(!newBlock)
            return;
        size_t instanceSize = class_getInstanceSize(class);
        size_t instanceCount = kBlockSize / instanceSize;
        // Carve the block into instance-sized chunks and cache each one.
        while(instanceCount-- > 0)
        {
            AddObjectToCache((id)newBlock);
            newBlock += instanceSize;
        }
    }
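
A plain C model of the block-carving step follows, with two defensive additions that are assumptions of this sketch rather than part of the code above: each chunk is rounded up to the platform's maximum alignment, and a failed or too-small block is reported instead of dereferenced. BLOCK_SIZE, carve_new_block and chunk_alloc are hypothetical names, and every call is assumed to pass the same instance_size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { BLOCK_SIZE = 4096 };

/* Cache of free chunks, linked through each chunk's first word. */
static void *g_cache_head;

static void cache_push(void *chunk)
{
    *(void **)chunk = g_cache_head;
    g_cache_head = chunk;
}

static void *cache_pop(void)
{
    void *chunk = g_cache_head;
    if (chunk != NULL)
        g_cache_head = *(void **)chunk;
    return chunk;
}

/* Allocates one block and caches its chunks.
   Returns the number of chunks added, or 0 on failure. */
size_t carve_new_block(size_t instance_size)
{
    assert(instance_size >= sizeof(void *));
    size_t align = _Alignof(max_align_t);
    size_t stride = (instance_size + align - 1) / align * align;
    size_t count = BLOCK_SIZE / stride;
    if (count == 0)
        return 0;                    /* instance does not fit in a block */
    char *block = malloc(BLOCK_SIZE);
    if (block == NULL)
        return 0;
    for (size_t i = 0; i < count; i++)
        cache_push(block + i * stride);
    return count;
}

/* Pops a chunk, carving a fresh block when the cache is empty. */
void *chunk_alloc(size_t instance_size)
{
    void *chunk = cache_pop();
    if (chunk == NULL && carve_new_block(instance_size) > 0)
        chunk = cache_pop();
    return chunk;
}
```

Chunks carved this way must never be passed to free individually, since only the start of each block was returned by malloc; in this sketch, as in the scheme above, blocks are simply kept for the life of the process.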

Conclusion
Writing a custom object allocator in Objective-C is relatively simple. The hard part is the allocator itself, which is largely up to you. Once you have the allocator, you can plug it into your Objective-C class by:

  • Overriding +allocWithZone: to call your custom allocator, set the isa of the block to self, and optionally zero out the rest of the memory.

  • Overriding -dealloc to call your custom allocator, and not calling through to super.

  • Calling objc_removeAssociatedObjects in -dealloc if there’s a chance of your object containing associated objects.

  • Only subclassing NSObject directly, and not subclassing any subclass of NSObject.

That’s it for this edition of Friday Q&A. Come back in two weeks for the next exciting edition. As always, your ideas for topics to cover are welcome and requested, so if you have something that you would like to see covered here, please send it in!