Translation · Original: Friday Q&A 2011-09-02: Let's Build NSAutoreleasePool · Author: Mike Ash
Original: https://www.mikeash.com/pyblog/friday-qa-2011-09-02-lets-build-nsautoreleasepool.html Published: 2011-09-02 Author: Mike Ash Translator: MiMo (mimo-v2.5-pro); code blocks kept in the original English
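It's that time again: time for more programming craziness. Dallas Brown suggested that I talk about how NSAutoreleasePool works behind the scenes. I decided that the best way to do that would be to simply reimplement it, and that is what I'll discuss today.
High Level Overview
The ball gets rolling when the autorelease message is sent to an object. autorelease is implemented on NSObject, and just calls through to [NSAutoreleasePool addObject: self]. This is a class method, which then needs to track down the right instance to talk to.
NSAutoreleasePool instances are stored in a per-thread stack. When a new pool is created, it gets pushed onto the top of the stack. When a pool is destroyed, it's popped off the stack. When the NSAutoreleasePool class method needs to look up the current pool, it grabs the stack for the current thread and takes the pool at the top.
Once the right pool is found, the addObject: instance method is used to add the object to the pool. Adding an object to the pool just appends it to a list of objects kept by the pool.
When a pool is destroyed, it goes through this list of objects and sends release to each one. This is just about all there is to it. There is one additional small complication: if a pool is destroyed which is not at the top of the stack of pools, it also destroys the other pools which sit above it. In short, NSAutoreleasePool instances nest, and if you fail to destroy an inner one, the outer one will take care of it when it gets destroyed.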
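To make the overview concrete, here is a minimal sketch of a single pool in plain C. It is a model and not Cocoa code: Object, Pool, and every function name are hypothetical, and a bare counter stands in for a real object's retain count. It shows only the two core steps described above, recording a pointer when an object is added and sending one release per recorded entry when the pool is destroyed.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for an Objective-C object: only a reference count. */
typedef struct {
    int retain_count;
} Object;

/* Stand-in for -release. A real object would be deallocated at zero;
   here the count is kept so it can be inspected afterwards. */
static void object_release(Object *obj) {
    assert(obj->retain_count > 0);
    obj->retain_count--;
}

/* A pool is nothing more than a growable list of pending releases. */
typedef struct {
    Object **objects;
    size_t count;
    size_t capacity;
} Pool;

static Pool *pool_create(void) {
    Pool *pool = calloc(1, sizeof *pool);
    assert(pool != NULL);
    return pool;
}

/* Stand-in for -addObject:. The pool records the pointer and does not retain it. */
static void pool_add_object(Pool *pool, Object *obj) {
    if (pool->count == pool->capacity) {
        size_t new_capacity = pool->capacity ? pool->capacity * 2 : 8;
        Object **grown = realloc(pool->objects, new_capacity * sizeof *grown);
        assert(grown != NULL);
        pool->objects = grown;
        pool->capacity = new_capacity;
    }
    pool->objects[pool->count++] = obj;
}

/* Destroying the pool sends one release per recorded entry. */
static void pool_destroy(Pool *pool) {
    for (size_t i = 0; i < pool->count; i++) {
        object_release(pool->objects[i]);
    }
    free(pool->objects);
    free(pool);
}
```

With an object whose count starts at 2, adding it to a pool leaves the count at 2, and destroying the pool brings it down to 1. The per-thread stack and the nesting rule are left out of this sketch on purpose.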
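Garbage Collection
NSAutoreleasePool exists under garbage collection and is even slightly functional. If you use the drain message rather than the release message, destroying a pool under garbage collection signals to the collector that this might be a good time to run a collection cycle. Aside from this, however, NSAutoreleasePool does nothing under garbage collection and isn't very interesting to consider there. For this article I will ignore garbage collection and concentrate on traditional memory management.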
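10.7 and ARC
10.7 got a major internal overhaul of autorelease pools, and Apple is in the middle of introducing a whole new automatic reference counting system. Although the details have changed a great deal, the concepts of how autorelease pools work have not, so everything here is still valid.
Interface
My version of an autorelease pool class will be called MAAutoreleasePool. If you'd like to look at the code in its entirety, it's available on GitHub.
This class has class and instance methods for addObject:, as well as a CFMutableArray to hold the list of autoreleased objects. CFMutableArray is used instead of NSMutableArray because the automatic retain and release behavior of NSMutableArray would interfere with what we're trying to do here. There's also the possibility that NSMutableArray would use autorelease internally, which would really mess things up. CFMutableArray can be configured not to do any memory management of its contents, which is what we're after here.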
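The difference between a container that manages its contents and one that does not can be sketched in plain C. This is a model of the idea only, not the CoreFoundation API: List, ListCallbacks, and the function names are hypothetical. A list created with retain/release callbacks owns what it holds, the way NSMutableArray does, while a list created with NULL callbacks stores raw pointers and leaves the counts alone, which is the behavior the pool needs.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an Objective-C object: only a reference count. */
typedef struct {
    int retain_count;
} Object;

/* Optional hooks that a list calls when an item enters or leaves it. */
typedef struct {
    void (*retain)(Object *obj);
    void (*release)(Object *obj);
} ListCallbacks;

static void object_retain(Object *obj)  { obj->retain_count++; }
static void object_release(Object *obj) { obj->retain_count--; }

/* Passing these makes a list own its contents. */
static const ListCallbacks owning_callbacks = { object_retain, object_release };

enum { LIST_CAPACITY = 16 };

typedef struct {
    const ListCallbacks *callbacks; /* NULL means: store raw pointers only */
    Object *items[LIST_CAPACITY];
    size_t count;
} List;

static void list_init(List *list, const ListCallbacks *callbacks) {
    list->callbacks = callbacks;
    list->count = 0;
}

/* Append an item, retaining it only if the list was given callbacks. */
static void list_append(List *list, Object *obj) {
    assert(list->count < LIST_CAPACITY);
    if (list->callbacks != NULL) {
        list->callbacks->retain(obj);
    }
    list->items[list->count++] = obj;
}

/* Remove the last item, releasing it only if the list was given callbacks. */
static void list_remove_last(List *list) {
    assert(list->count > 0);
    Object *obj = list->items[--list->count];
    if (list->callbacks != NULL) {
        list->callbacks->release(obj);
    }
}
```

If the pool used an owning list, every autorelease would add a retain and every removal would add a release, so the pool's own release on destruction would no longer balance the caller's books. A raw-pointer list avoids that.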
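The interface then looks like this:
    @interface MAAutoreleasePool : NSObject
    {
        CFMutableArrayRef _objects; // objects waiting for a release; not retained by the array
    }
    + (void)addObject: (id)object;
    - (void)addObject: (id)object;
    @end
Additionally, to match the official implementation, there needs to be a helper method on NSObject. I called this ma_autorelease to distinguish it from the real thing and avoid name clashes:
    @interface NSObject (MAAutoreleasePool)
    - (id)ma_autorelease;
    @end
As I mentioned above, the implementation of this is just a cover on the class method:
    @implementation NSObject (MAAutoreleasePool)
    - (id)ma_autorelease
    {
        [MAAutoreleasePool addObject: self];
        return self;
    }
    @end
Pool Stack
Autorelease pools are kept in a stack. Each thread has its own stack of pools, which is used to determine which pool to put an object in when it gets autoreleased.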
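To encapsulate the management of the stack, I wrote a private method, +_threadPoolStack, which returns the current thread's stack of MAAutoreleasePool instances, creating it if necessary. Like each pool's list of contained objects, the pool stack is a CFMutableArray in order to prevent NSMutableArray's automatic memory management from screwing things up.
The simplest way to handle thread-local storage in Cocoa is with the threadDictionary method on NSThread. This method returns an NSMutableDictionary which is unique to the current thread, and automatically destroyed when the thread terminates.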
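For readers who have not used thread-local storage before, the same "one stack per thread, created on first use" idea can be sketched in plain C. This swaps in a different technique from the article's: it uses the C11 _Thread_local storage class instead of threadDictionary, and PoolStack and both function names are hypothetical. One difference is assumed and called out in the code: nothing here is destroyed automatically when a thread ends.

```c
#include <assert.h>
#include <stdlib.h>

enum { STACK_CAPACITY = 32 };

/* Hypothetical per-thread stack of opaque pool pointers. */
typedef struct {
    void *pools[STACK_CAPACITY];
    size_t count;
} PoolStack;

/* One pointer per thread; every thread starts out with NULL. */
static _Thread_local PoolStack *thread_stack = NULL;

/* Return the calling thread's stack, creating it on first use. */
static PoolStack *thread_pool_stack(void) {
    if (thread_stack == NULL) {
        thread_stack = calloc(1, sizeof *thread_stack);
        assert(thread_stack != NULL);
    }
    return thread_stack;
}

/* Unlike threadDictionary, nothing frees this automatically at thread exit,
   so a thread calls this itself before it finishes. */
static void thread_pool_stack_destroy(void) {
    free(thread_stack);
    thread_stack = NULL;
}
```

Repeated calls to thread_pool_stack() on one thread return the same pointer, and a different thread gets its own stack because it sees its own copy of thread_stack.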
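The first thing this method does, then, is fetch that dictionary and declare a unique key to associate with the pool stack:
    + (CFMutableArrayRef)_threadPoolStack
    {
        NSMutableDictionary *threadDictionary = [[NSThread currentThread] threadDictionary];
        NSString *key = @"MAAutoreleasePool thread-local pool stack";
Next, it fetches the stack (as a CFMutableArray) from that dictionary:
        CFMutableArrayRef array = (CFMutableArrayRef)[threadDictionary objectForKey: key];
The first time this method runs on any given thread, the stack won't exist yet. In that case, it needs to be created and stored in the dictionary:
        if(!array)
        {
            // NULL callbacks: the array will not retain or release the pools it holds
            array = CFArrayCreateMutable(NULL, 0, NULL);
            [threadDictionary setObject: (id)array forKey: key];
            // the dictionary now owns the array, so drop the reference from creation
            CFRelease(array);
        }
Finally, with the array either retrieved or newly created, it's returned:
        return array;
    }
Now that this is in place, the other methods can be implemented. The +addObject: method essentially just calls the above and then forwards the addObject: to the pool at the top of the stack. I added a bit of paranoia to make sure that the stack isn't empty, and to print an error if it is:
    // class_getName and object_getClass are declared in <objc/runtime.h>
    + (void)addObject: (id)object
    {
        CFArrayRef stack = [self _threadPoolStack];
        CFIndex count = CFArrayGetCount(stack);
        if(count == 0)
        {
            fprintf(stderr, "Object of class %s autoreleased with no pool, leaking\n", class_getName(object_getClass(object)));
        }
        else
        {
            // the last element is the top of the stack
            MAAutoreleasePool *pool = (id)CFArrayGetValueAtIndex(stack, count - 1);
            [pool addObject: object];
        }
    }
Instance Methods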
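The -init method for this class is really simple. Initialize the _objects array, add self to the top of the pool stack, and return:
    - (id)init
    {
        if((self = [super init]))
        {
            _objects = CFArrayCreateMutable(NULL, 0, NULL);
            CFArrayAppendValue([[self class] _threadPoolStack], self);
        }
        return self;
    }
The -addObject: method is even simpler: just add the given object to the end of _objects:
    - (void)addObject: (id)object
    {
        CFArrayAppendValue(_objects, object);
    }
The -dealloc method gets more interesting, due to the need to nest pools. It starts out easily enough: iterate through the _objects array and send a release to everything in it:
    - (void)dealloc
    {
        if(_objects)
        {
            // one release per recorded entry, duplicates included
            for(id object in (id)_objects)
                [object release];
            CFRelease(_objects);
        }
Next, it removes self from the pool stack. It also needs to remove any pools which sit above it in the pool stack. Because this process needs to iterate backwards and needs to be able to modify the stack while iterating, it uses a manual index-based loop:
        CFMutableArrayRef stack = [[self class] _threadPoolStack];
        CFIndex index = CFArrayGetCount(stack);
        while(index-- > 0)
        {
            MAAutoreleasePool *pool = (id)CFArrayGetValueAtIndex(stack, index);
If pool is self, then the loop is done. All that needs to be done is to remove the entry from the stack and then break out of the loop. All pools in the stack below the current one are left alone:
            if(pool == self)
            {
                CFArrayRemoveValueAtIndex(stack, index);
                break;
            }
If it's some other pool, then it needs to be destroyed. This is done by simply sending a release to it. Everything else that needs to happen will be done automatically:
            else
            {
                [pool release];
            }
        }
This may be a little hard to understand at first, and it took me a little while to realize how this code needed to be written. When this line is hit, pool is necessarily at the top of the stack. When it's released, it will be deallocated (it's not legal to retain autorelease pools). When it gets deallocated, it will call into its own -dealloc, which enters this same loop again. That loop immediately hits the pool == self condition, removing the pool from the stack and exiting. Thus, all pools on the stack above the one currently being destroyed also get destroyed and removed from the stack.
The loop is now done, and all that's left to do is call through to super:
        [super dealloc];
    }
The class is now complete!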
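The nesting behavior is the subtle part of the class, so here is a small plain-C model of it. This is a sketch under stated assumptions and not the MAAutoreleasePool code: Object, Pool, pool_stack, and the function names are hypothetical, a single global stack replaces the per-thread one, and fixed-size arrays replace CFMutableArray. The recursion in pool_destroy plays the role of [pool release] triggering the inner pool's -dealloc.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

enum { MAX_POOLS = 16, MAX_OBJECTS = 64 };

/* Toy stand-in for an Objective-C object: only a reference count. */
typedef struct {
    int retain_count;
} Object;

typedef struct {
    Object *objects[MAX_OBJECTS];
    size_t count;
} Pool;

/* Assumption: one global stack for brevity; the article keeps one per thread. */
static Pool *pool_stack[MAX_POOLS];
static size_t pool_stack_count = 0;

/* Model of -init: create the pool and push it onto the stack. */
static Pool *pool_create(void) {
    assert(pool_stack_count < MAX_POOLS);
    Pool *pool = calloc(1, sizeof *pool);
    assert(pool != NULL);
    pool_stack[pool_stack_count++] = pool;
    return pool;
}

/* Model of +addObject:. Returns 0 on success, -1 if there is no pool. */
static int autorelease(Object *obj) {
    if (pool_stack_count == 0) {
        fprintf(stderr, "object autoreleased with no pool, leaking\n");
        return -1;
    }
    Pool *top = pool_stack[pool_stack_count - 1];
    assert(top->count < MAX_OBJECTS);
    top->objects[top->count++] = obj;
    return 0;
}

/* Model of -dealloc: release the contents, then unwind the stack. */
static void pool_destroy(Pool *pool) {
    for (size_t i = 0; i < pool->count; i++) {
        pool->objects[i]->retain_count--;
    }
    size_t index = pool_stack_count;
    while (index-- > 0) {
        Pool *candidate = pool_stack[index];
        if (candidate == pool) {
            pool_stack_count = index;  /* pop this pool and stop */
            break;
        }
        /* A pool above this one: destroying it makes it pop itself,
           so the next iteration sees the new top of the stack. */
        pool_destroy(candidate);
    }
    free(pool);
}
```

Creating three pools, autoreleasing one object into each, and then destroying only the outermost pool leaves the stack empty and sends each object its release. Calling autorelease with no pool on the stack prints the warning and leaves the object's count untouched, which is the leak the error message describes.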
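Lessons
There are some good lessons to be learned from this exercise. Most important is that NSAutoreleasePool is a pretty straightforward class without much in the way of hidden gotchas. There's nothing complex going on behind the scenes. People new to Cocoa memory management often imagine that autorelease is much more complex than it really is, and ask questions like "How can I tell if an object has been autoreleased?" or "What happens if I autorelease an object twice?" Specifically, we can now see:
- There's no way to tell if an object has been autoreleased. The pool is a fairly dumb container with only the barest idea of what it contains. It really just keeps a list for the purposes of sending release to those objects later. This is perfectly fine, because your code should never care whether an object has already been autoreleased.
- Objects that are autoreleased twice just get added to the pool twice, and then when the pool is destroyed they get released twice.
- Autoreleased objects get released when the current autorelease pool is destroyed. Pools are destroyed when the code that created them explicitly destroys them. If you aren't managing your own pools, then autoreleased objects will survive at least until you return to code you don't own (like Cocoa).
- If you autorelease an object on one thread and then pass it to another thread, nothing special happens. The object is still released when the first thread's pool gets destroyed, regardless of what's happening on the new thread. If you need an object to survive the passage, it needs to be retained before sending and then released after receiving. (Fortunately, the cross-thread messaging mechanisms you're likely to use with objects, like GCD/blocks and Cocoa's perform... methods, do this for you.)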
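Two of these points, the double autorelease and the cross-thread handoff, can be checked with a few lines of bookkeeping. The sketch below is a plain-C model with hypothetical names (Object, Pool, and the two scenario functions). It assumes a single thread of execution, so the "other thread" is represented only by the order of events: the sender's pool drains before the receiver is finished with the object.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an Objective-C object. */
typedef struct {
    int retain_count;
    int deallocated;   /* set when the count reaches zero */
} Object;

static void object_retain(Object *obj) {
    assert(!obj->deallocated);
    obj->retain_count++;
}

static void object_release(Object *obj) {
    assert(!obj->deallocated && obj->retain_count > 0);
    if (--obj->retain_count == 0) {
        obj->deallocated = 1;
    }
}

enum { POOL_CAPACITY = 16 };

typedef struct {
    Object *objects[POOL_CAPACITY];
    size_t count;
} Pool;

/* The same pointer may be recorded any number of times. */
static void pool_add_object(Pool *pool, Object *obj) {
    assert(pool->count < POOL_CAPACITY);
    pool->objects[pool->count++] = obj;
}

/* Send one release per recorded entry, then empty the pool. */
static void pool_drain(Pool *pool) {
    for (size_t i = 0; i < pool->count; i++) {
        object_release(pool->objects[i]);
    }
    pool->count = 0;
}

/* Two autoreleases cost two releases when the pool goes away. */
static int count_after_double_autorelease(void) {
    Object obj = { 3, 0 };
    Pool pool = { { NULL }, 0 };
    pool_add_object(&pool, &obj);
    pool_add_object(&pool, &obj);
    pool_drain(&pool);
    return obj.retain_count;
}

/* Returns 1 if the object is still alive after the sender's pool drains. */
static int survives_handoff(int retain_before_sending) {
    Object obj = { 1, 0 };
    Pool sender_pool = { { NULL }, 0 };
    pool_add_object(&sender_pool, &obj);   /* autoreleased on the sender */
    if (retain_before_sending) {
        object_retain(&obj);               /* keeps it alive in transit */
    }
    pool_drain(&sender_pool);              /* the sender's pool goes away */
    int alive = !obj.deallocated;
    if (alive) {
        object_release(&obj);              /* the receiver balances the retain */
    }
    return alive;
}
```

In this model an object that starts with a count of 3 ends at 1 after being autoreleased twice, and the handed-off object survives only when the sender retains it first. In real code the unretained case is a dangling pointer rather than a flag that can be inspected.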
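Conclusion
That wraps up today's exploration of Cocoa internals. Now you know approximately how NSAutoreleasePool gets its job done. The implementation specifics vary (especially on Lion), but the basic ideas are the same. By knowing how memory management internals work, you can write better and less error-prone code. (Translator's note: when this article was written, Lion was the latest version of Mac OS X; memory management in modern macOS and iOS differs, for example through the introduction of ARC.)
Unless you're completely new to this blog, you probably already know that Friday Q&A is driven by reader submissions. On that note, if you have a topic that you'd like to see covered, please send it in!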