Zeroing Weak References to CoreFoundation Objects

Published July 30, 2010

Author: TommyWu

Translation · Original: Friday Q&A 2010-07-30: Zeroing Weak References to CoreFoundation Objects · by Mike Ash

Original: https://www.mikeash.com/pyblog/friday-qa-2010-07-30-zeroing-weak-references-to-corefoundation-objects.html Published: 2010-07-30 Author: Mike Ash Translator: MiMo (mimo-v2.5-pro); code blocks are kept in the original English

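It’s time for another friendly edition of Friday Q&A. In the last Friday Q&A, I discussed MAZeroingWeakRef and how it is implemented for pure Objective-C objects. This time, I’ll cover the crazy hacks I implemented to make it work with toll-free bridged CoreFoundation objects as well.

Code
As before, you can get the code for MAZeroingWeakRef from my public Subversion repository: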

    svn co http://mikeash.com/svn/ZeroingWeakRef/

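Prior Reading
This post assumes a solid knowledge of CoreFoundation and of how CF-ObjC bridging works. If you haven’t already, you may wish to read, or at least refer to, Friday Q&A 2010-01-22: Toll Free Bridging Internals.

Recap
A zeroing weak reference is a reference to an object that does not participate in keeping that object alive. When the target object is destroyed, the zeroing weak reference automatically becomes NULL. When a zeroing weak reference’s target is requested, the caller is guaranteed to get either a valid reference or NULL. As covered in the previous article, this is useful for all kinds of things.

To accomplish this in Cocoa, MAZeroingWeakRef overrides the dealloc method of the target object by dynamically creating a subclass of the target’s class and changing the class of the target to that subclass. The overridden dealloc method zeroes out every MAZeroingWeakRef object that points to the target.

Stopping there leaves a problem with thread safety and resurrection. Imagine one thread calls release on the last strong reference to an object, which will then call dealloc. Now imagine that between those two steps, another thread accesses the object through a zeroing weak reference. Since dealloc has not yet been called, the weak reference returns a reference to the object. However, because the dealloc call is already set to go, the retain/autorelease dance done by MAZeroingWeakRef can’t save the object from being destroyed. Disaster!

This problem is solved by also overriding release. Because release acquires the same lock that is used when retrieving a zeroing weak reference’s target, this resurrection scenario can’t occur.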
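
The locking idea can be sketched in plain C. This is only an illustration under stated assumptions, not MAZeroingWeakRef code: Object, WeakRef and every function name are hypothetical, each object has at most one weak reference, and a spinlock built on atomic_flag stands in for the real lock. It shows how routing both the release and the target lookup through one lock prevents handing out an object whose destruction has already been decided.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration only. */
typedef struct WeakRef WeakRef;

typedef struct Object {
    int retain_count;   /* protected by g_lock */
    WeakRef *weak_ref;  /* at most one weak ref, to keep the sketch short */
} Object;

struct WeakRef {
    Object *target;     /* set to NULL when the target is destroyed */
};

static atomic_flag g_lock = ATOMIC_FLAG_INIT;

static void lock(void)   { while (atomic_flag_test_and_set(&g_lock)) { /* spin */ } }
static void unlock(void) { atomic_flag_clear(&g_lock); }

static Object *object_create(void) {
    Object *obj = calloc(1, sizeof *obj);
    assert(obj != NULL);
    obj->retain_count = 1;
    return obj;
}

static void weakref_init(WeakRef *ref, Object *obj) {
    lock();
    ref->target = obj;
    obj->weak_ref = ref;
    unlock();
}

/* Returns a retained (strong) reference, or NULL if the target is gone. */
static Object *weakref_copy_target(WeakRef *ref) {
    lock();
    Object *obj = ref->target;
    if (obj != NULL)
        obj->retain_count++;
    unlock();
    return obj;
}

/* Release takes the same lock, so the final decrement, the zeroing of the
   weak reference and the lookup above can never interleave. */
static void object_release(Object *obj) {
    lock();
    int destroy = (--obj->retain_count == 0);
    if (destroy && obj->weak_ref != NULL)
        obj->weak_ref->target = NULL;
    unlock();
    if (destroy)
        free(obj);
}
```

A caller that gets a non-NULL result from weakref_copy_target owns a strong reference and must balance it with object_release.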

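Toll-Free Bridged Objects
This scheme works great for normal Objective-C objects, but fails hard for bridged CoreFoundation objects. Changing the class of a bridged object causes infinite recursion. The first thing a CoreFoundation function does is check the class of the object it’s being called on. If that class doesn’t match the official NSCF class, the function assumes the object is a pure Objective-C object and calls through to the equivalent Objective-C method. The equivalent Objective-C method on an NSCF class just calls the CoreFoundation function. Rinse, lather, repeat, and crash.

The dynamic subclass isn’t strictly necessary in this case. I could instead swizzle the dealloc and release methods on the NSCF class directly and have them do my dirty work. This is a bit less efficient (since it affects every object of that class, not just weak-referenced ones), but that shouldn’t matter.

The trouble is that this doesn’t work. If you call CFRelease on such an object, it goes directly to the reference counting and deallocation of that object without ever calling the Objective-C methods. So this solution can only catch one side of things, which makes it basically useless.

After working through all of this, I hunted around for a solution. Short of patching CFRelease (which I really didn’t want to do, not least because that approach won’t work on the iPhone, where modifying executable code is forbidden), I couldn’t come up with a way.

I nearly gave up on the problem, resigned to simply forbidding weak references to CoreFoundation objects, when I finally happened upon...

The Solution
I had started looking through the CoreFoundation source code (available from opensource.apple.com), trying to find a way to hook into release events, when I happened upon this little gem in the code for CFRelease: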

    void (*func)(CFTypeRef) = __CFRuntimeClassTable[typeID]->finalize;
    if (NULL != func) {
        func(cf);
    }
    // We recheck lowBits to see if the object has been retained again during
    // the finalization process.  This allows for the finalizer to resurrect,
    // but the main point is to allow finalizers to be able to manage the
    // removal of objects from uniquing caches, which may race with other threads
    // which are allocating (looking up and finding) objects from those caches,
    // which (that thread) would be the thing doing the extra retain in that case.
    if (isAllocator || OSAtomicCompareAndSwap32Barrier(1, 0, (int32_t *)&((CFRuntimeBase *)cf)->_rc)) {
        goto really_free;
    }

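As the snippet and its comment show, CFRelease calls the class’s finalize function and then rechecks the retain count before really freeing the object, so a finalizer is allowed to resurrect it. Building a solution on this requires overriding the CoreFoundation finalize function. CoreFoundation has no supported mechanism for this, so I had to get down and dirty with the CF source code and hack my way in. This means that what I’m doing is not entirely supported and could break in a future system update, although I believe that this stuff is actually pretty stable.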
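
To make the finalize-then-recheck behavior concrete, here is a toy model in C11. It is a sketch under assumptions, not CoreFoundation code: ToyObject, toy_retain, toy_release and resurrect_once are hypothetical names, and a freed flag stands in for actually freeing memory. It models only the control flow visible in the snippet: run the finalizer, then free only if a compare-and-swap from 1 to 0 succeeds.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy types; they only model the control flow. */
typedef struct ToyObject ToyObject;
typedef void (*ToyFinalizer)(ToyObject *);

struct ToyObject {
    _Atomic int32_t rc;     /* retain count */
    ToyFinalizer finalize;  /* may be NULL */
    bool freed;             /* stands in for really freeing the memory */
};

static void toy_init(ToyObject *obj, ToyFinalizer finalize) {
    atomic_init(&obj->rc, 1);
    obj->finalize = finalize;
    obj->freed = false;
}

static void toy_retain(ToyObject *obj) { atomic_fetch_add(&obj->rc, 1); }

static void toy_release(ToyObject *obj) {
    for (;;) {
        int32_t rc = atomic_load(&obj->rc);
        if (rc > 1) {
            /* Not the last reference: plain decrement, retried on contention. */
            if (atomic_compare_exchange_weak(&obj->rc, &rc, rc - 1))
                return;
            continue;
        }
        /* Last reference: run the finalizer first... */
        if (obj->finalize != NULL)
            obj->finalize(obj);
        /* ...then free only if nobody retained the object in the meantime. */
        int32_t expected = 1;
        if (atomic_compare_exchange_strong(&obj->rc, &expected, 0)) {
            obj->freed = true;
            return;
        }
        /* Resurrected during finalize: loop and drop our reference normally. */
    }
}

/* A finalizer that resurrects the object exactly once. */
static int g_resurrections_left = 1;

static void resurrect_once(ToyObject *obj) {
    if (g_resurrections_left > 0) {
        g_resurrections_left--;
        toy_retain(obj);
    }
}
```

With resurrect_once installed, the first toy_release leaves the object alive with a count of 1, and only the second one marks it freed.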

CoreFoundation Classes
A CoreFoundation class is just a struct that looks like this:

    typedef struct __CFRuntimeClass {  // Version 0 struct
        CFIndex version;
        const char *className;
        void (*init)(CFTypeRef cf);
        CFTypeRef (*copy)(CFAllocatorRef allocator, CFTypeRef cf);
        void (*finalize)(CFTypeRef cf);
        Boolean (*equal)(CFTypeRef cf1, CFTypeRef cf2);
        CFHashCode (*hash)(CFTypeRef cf);
        CFStringRef (*copyFormattingDesc)(CFTypeRef cf, CFDictionaryRef formatOptions);  // str with retain
        CFStringRef (*copyDebugDesc)(CFTypeRef cf);  // str with retain
        void (*reclaim)(CFTypeRef cf);
    } CFRuntimeClass;

Overriding the finalize function then becomes easy. First, look up the CFRuntimeClass for the given CF type ID with this function:

    extern CFRuntimeClass * _CFRuntimeGetClassWithTypeID(CFTypeID typeID);

    typedef void (*CFFinalizeFptr)(CFTypeRef);
    static CFFinalizeFptr *gCFOriginalFinalizes;
    static size_t gCFOriginalFinalizesSize;

    static Class CreateCustomSubclass(Class class, id obj)
    {
        if(IsTollFreeBridged(class, obj))
        {
            CFTypeID typeID = CFGetTypeID(obj);
            CFRuntimeClass *cfclass = _CFRuntimeGetClassWithTypeID(typeID);

            // grow the table of saved finalizers so that typeID is a valid index
            if(typeID >= gCFOriginalFinalizesSize)
            {
                gCFOriginalFinalizesSize = typeID + 1;
                gCFOriginalFinalizes = realloc(gCFOriginalFinalizes, gCFOriginalFinalizesSize * sizeof(*gCFOriginalFinalizes));
            }

            // save the original finalize and atomically swap in CustomCFFinalize
            do {
                gCFOriginalFinalizes[typeID] = cfclass->finalize;
            } while(!OSAtomicCompareAndSwapPtrBarrier(gCFOriginalFinalizes[typeID], CustomCFFinalize, (void *)&cfclass->finalize));
            return class;
        }
        else
            // original ObjC dynamic subclassing code is here
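
The save-and-swap loop in the listing can be tried out in portable C11. The sketch below is an assumption-laden analogue, not the article’s code: ToyClass, install_finalize_hook and the counters are hypothetical, C11 atomics replace OSAtomicCompareAndSwapPtrBarrier, and a single global replaces the per-type table. It also adds a guard against hooking twice, which the listing does not show.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef void (*Finalizer)(void *obj);

/* Hypothetical stand-in for a runtime class with a patchable finalize slot. */
typedef struct ToyClass {
    _Atomic(Finalizer) finalize;
} ToyClass;

static Finalizer g_saved_finalize;  /* the original, kept for chaining */
static int g_original_calls;
static int g_custom_calls;

static void original_finalize(void *obj) {
    (void)obj;
    g_original_calls++;
}

/* The replacement does its own work, then chains to the saved original. */
static void custom_finalize(void *obj) {
    g_custom_calls++;
    if (g_saved_finalize != NULL)
        g_saved_finalize(obj);
}

/* Save the current finalizer, then swap in the hook with compare-and-swap.
   On failure `current` is refreshed with the slot's latest value and the
   loop retries; the loop ends early if the hook is already installed. */
static void install_finalize_hook(ToyClass *cls) {
    Finalizer current = atomic_load(&cls->finalize);
    while (current != custom_finalize) {
        g_saved_finalize = current;
        if (atomic_compare_exchange_weak(&cls->finalize, &current, custom_finalize))
            break;
    }
}
```

The saved pointer here is a plain global, so this sketch assumes a single installer; the compare-and-swap only protects the class’s finalize slot itself.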

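With this change, it’s now critical that IsTollFreeBridged be 100% reliable. The old implementation simply checked whether the class name started with NSCF, which is not good enough. I came up with a completely reliable test using a private CoreFoundation table of Objective-C classes: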

    extern Class *__CFRuntimeObjCClassTable;

    static BOOL IsTollFreeBridged(Class class, id obj)
    {
        CFTypeID typeID = CFGetTypeID(obj);
        Class tfbClass = __CFRuntimeObjCClassTable[typeID];
        return class == tfbClass;
    }

    static void CustomCFFinalize(CFTypeRef cf)
    {
        WhileLocked({
            if(CFGetRetainCount(cf) == 1)
            {
                ClearWeakRefsForObject((id)cf);
                void (*fptr)(CFTypeRef) = gCFOriginalFinalizes[CFGetTypeID(cf)];
                if(fptr)
                    fptr(cf);
            }
        });
    }

Resurrection Comes Back From the Dead

Unfortunately, there’s a race condition here. Imagine the following sequence:

Thread 1 calls CFRelease(obj)
- CFRelease calls CustomCFFinalize
- Before CustomCFFinalize begins executing, the thread is preempted
Thread 2 calls [ref target] and obtains a reference to obj
- obj is retained and autoreleased by MAZeroingWeakRef
- The enclosing autorelease pool is drained, resulting in CFRelease(obj)
- CFRelease calls CustomCFFinalize
- CustomCFFinalize clears weak references and calls the original finalize
- CustomCFFinalize returns
Thread 1 resumes execution at the beginning of CustomCFFinalize
- CustomCFFinalize checks the retain count, which is still 1
- CustomCFFinalize calls the original finalize a second time on the same object
- A horrible flaming crash occurs

Thus there is an extremely narrow, difficult-to-hit, but entirely real race condition that could cause this code to crash.

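Hack Level Three
To solve this problem, I divide CoreFoundation objects into two categories: those that are the target of a weak reference, and the rest. This serves two purposes. First, it lets me take a fast path when destroying an object that was never the target of a weak reference. Second, I can track whether a referenced object can still potentially be resurrected.

This is implemented by keeping a CFMutableSet in which referenced objects are stored. Checking the status of an object is simply a matter of testing set membership. Objects are inserted into the set when RegisterRef is called. An object is removed when finalize executes with a retain count of 1, which ensures that it can no longer be resurrected.

The new CustomCFFinalize is then split in two. If the object has weak references, it first checks for a retain count of 1 to see whether the object has been resurrected: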

    static void CustomCFFinalize(CFTypeRef cf)
    {
        WhileLocked({
            if(CFSetContainsValue(gCFWeakTargets, cf))
            {
                if(CFGetRetainCount(cf) == 1)
                {
                    ClearWeakRefsForObject((id)cf);
                    CFSetRemoveValue(gCFWeakTargets, cf);
                    CFRetain(cf);
                    CallCFReleaseLater(cf);
                }
            }
            else
            {
                void (*fptr)(CFTypeRef) = gCFOriginalFinalizes[CFGetTypeID(cf)];
                if(fptr)
                    fptr(cf);
            }
        });
    }
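
The three outcomes of the new finalize function can be modeled without CoreFoundation. The following C sketch is illustrative only: the fixed-size pointer array stands in for the CFMutableSet, and the names (targets_add, classify_finalize, the FinalizeAction values) are hypothetical. It shows the fast path for objects that were never weak targets, the "do nothing" case for resurrected objects, and the deferred-destruction case that removes the object from the set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TARGETS 16

/* A tiny pointer set standing in for the CFMutableSet of weak targets. */
static const void *g_targets[MAX_TARGETS];
static size_t g_target_count;

static bool targets_contains(const void *obj) {
    for (size_t i = 0; i < g_target_count; i++)
        if (g_targets[i] == obj)
            return true;
    return false;
}

static bool targets_add(const void *obj) {
    if (targets_contains(obj))
        return true;
    if (g_target_count == MAX_TARGETS)
        return false;  /* the sketch has a fixed capacity */
    g_targets[g_target_count++] = obj;
    return true;
}

static void targets_remove(const void *obj) {
    for (size_t i = 0; i < g_target_count; i++) {
        if (g_targets[i] == obj) {
            g_targets[i] = g_targets[--g_target_count];
            return;
        }
    }
}

typedef enum {
    FINALIZE_NOW,              /* never a weak target: call the original finalize */
    FINALIZE_SKIP_RESURRECTED, /* weak target that someone retained again */
    FINALIZE_DEFER             /* weak target at count 1: clear refs, release later */
} FinalizeAction;

static FinalizeAction classify_finalize(const void *obj, long retain_count) {
    if (!targets_contains(obj))
        return FINALIZE_NOW;
    if (retain_count != 1)
        return FINALIZE_SKIP_RESURRECTED;
    targets_remove(obj);  /* from here on it can no longer be resurrected */
    return FINALIZE_DEFER;
}
```

Once an object has been deferred and removed from the set, the next finalize call for it takes the fast path, which is what lets the later CFRelease run the original finalizer.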

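Using autorelease would do the trick, except that this is pure CF code and there’s no guarantee that the caller actually has an autorelease pool in place. A nice idea, but it just doesn’t work out.

Some way to hook CFRelease to see when it exits would be ideal. But as discussed before, there’s simply no hook available, so that option is out as well.

Ultimately I got some serious inspiration from Ed “Master of All Things Arcane” Wynne: it could really be done, using a completely insane technique similar to the cache-cleanup scheme in the Objective-C runtime.

The Crazy Scheme
To restate the problem: I need to call CFRelease on the object sometime after the original call to CFRelease has completed. Since there’s no way to arrange this on the thread that made the original call, I make use of a background thread.

How can the background thread know when the original call to CFRelease has completed?

It’s possible for one thread to access the PC (program counter, the location of the currently executing instruction) of another thread. Normally this is not very useful, but the Objective-C runtime uses it to decide whether it’s safe to destroy stale cache data, by checking whether any other thread is in a function that accesses that data.

Likewise, this code can check the PC of the original calling thread to see whether it’s still within CFRelease. If it’s not, the call must have finished, so it’s now safe to release the object again.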
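
The polling idea can be sketched in portable C. This is a simplified model with stated assumptions: a single address range stands in for "inside CFRelease" (the real code identifies the function by symbol lookup), the sampler is a scripted list of PC values rather than a real thread, a PC of 0 models a failed sample, and all names (AddressRange, PCSampler, poll_until_outside) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open address range [start, end), standing in for one function's code. */
typedef struct AddressRange {
    uintptr_t start;
    uintptr_t end;
} AddressRange;

/* Returns the sampled PC of the watched thread, or 0 if sampling failed. */
typedef uintptr_t (*PCSampler)(void *context);

static bool range_contains(AddressRange r, uintptr_t pc) {
    return pc >= r.start && pc < r.end;
}

/* Samples until the PC is valid and outside `busy`. Returns the number of
   samples taken, or 0 if max_samples was exhausted first. */
static unsigned poll_until_outside(PCSampler sample, void *context,
                                   AddressRange busy, unsigned max_samples) {
    for (unsigned n = 1; n <= max_samples; n++) {
        uintptr_t pc = sample(context);
        if (pc != 0 && !range_contains(busy, pc))
            return n;
    }
    return 0;
}

/* A scripted "thread" that replays a fixed PC trace, repeating the last one. */
typedef struct ScriptedThread {
    const uintptr_t *pcs;
    size_t count;
    size_t next;
} ScriptedThread;

static uintptr_t scripted_sample(void *context) {
    ScriptedThread *t = context;
    uintptr_t pc = t->pcs[t->next];
    if (t->next + 1 < t->count)
        t->next++;
    return pc;
}
```

In a real implementation the point at which poll_until_outside succeeds is where the deferred release would be performed; the max_samples bound exists only so the sketch always terminates.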

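The only way (that I know of) to get the PC of another thread on OS X is to use mach calls, so the first step is to get a reference to the current mach thread. This reference is also “retained” (mach ports are reference counted, just like Objective-C objects) so that it doesn’t become invalid if the thread is destroyed in the meantime: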

    static void CallCFReleaseLater(CFTypeRef cf)
    {
        mach_port_t thread = pthread_mach_thread_np(pthread_self());
        mach_port_mod_refs(mach_task_self(), thread, MACH_PORT_RIGHT_SEND, 1 ); // "retain"

        NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
        SEL sel = @selector(releaseLater:fromThread:);
        NSInvocation *inv = [NSInvocation invocationWithMethodSignature: [MAZeroingWeakRef methodSignatureForSelector: sel]];
        [inv setTarget: [MAZeroingWeakRef class]];
        [inv setSelector: sel];
        [inv setArgument: &cf atIndex: 2];
        [inv setArgument: &thread atIndex: 3];

        NSInvocationOperation *op = [[NSInvocationOperation alloc] initWithInvocation: inv];
        [gCFDelayedDestructionQueue addOperation: op];
        [op release];
        [pool release];
    }

    + (void)releaseLater: (CFTypeRef)cf fromThread: (mach_port_t)thread
    {
        BOOL retry = YES;

        while(retry)
        {
            BLOCK_QUALIFIER void *pc;
            // ensure that the PC is outside our inner code when fetching it,
            // so we don't have to check for all the nested calls
            WhileLocked({
                pc = GetPC(thread);
            });

            // NOTE: the line opening this block is missing from the listing;
            // a NULL check is assumed here, since GetPC returns NULL on failure
            if(pc)
            {
                if(pc < (void *)CustomCFFinalize || pc > (void *)IsTollFreeBridged)
                {
                    Dl_info info;
                    int success = dladdr(pc, &info);
                    if(success)
                    {
                        if(info.dli_saddr != _CFRelease)
                        {
                            retry = NO; // success!
                            CFRelease(cf);
                            mach_port_mod_refs(mach_task_self(), thread, MACH_PORT_RIGHT_SEND, -1 ); // "release"
                        }
                    }
                }
            }
        }
    }

    static void *GetPC(mach_port_t thread)
    {
        // arch-specific code goes here

        kern_return_t ret = thread_get_state(thread, flavor, (thread_state_t)&state, &count);
        if(ret == KERN_SUCCESS)
            return (void *)state.PC_REGISTER;
        else
            return NULL;
    }

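And that’s it!

Odds and Ends
In the previous post, I mentioned the COREFOUNDATION_HACK_LEVEL macro, which controls how much hackery MAZeroingWeakRef contains. When set to 0, it uses no private API: it refuses to reference CoreFoundation objects, and detects them by checking the class name for an NSCF prefix. When set to 1, it uses private API only to make a reliable CoreFoundation object check. Level 1 is now the default.

When I wrote the previous post, I didn’t actually know about this subtle resurrection race condition. As such, I’ve added an extra hack level. Hack level 2 uses private CoreFoundation calls to allow referencing CF objects, but does not eliminate the resurrection race condition described above. Finally, the newly added hack level 3 goes into full-on CoreFoundation hackery as described above, and eliminates the race condition by doing the final CFRelease in a background thread.

These can be controlled using the COREFOUNDATION_HACK_LEVEL macro at the top of the file. I recommend level 1 for Mac development (weak references to CoreFoundation objects are not commonly needed) and level 0 for iOS development (Apple is very sensitive about private API usage). However, if you’re adventurous or genuinely need weak references to CF objects, you can set it to 3 and everything should still work... If you do, keep in mind that the really horrible hacks don’t activate until you actually create a weak reference to a CF object, so you can enable level 3 just in case you inadvertently reference a CF object, without worrying about it doing anything terrible in the normal case.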
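
As a rough picture of how such a compile-time switch can gate features, here is a small C sketch. Only the macro name, its levels and the NSCF prefix check come from the text; the helper function names are hypothetical, and the real file may organize its conditionals quite differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#ifndef COREFOUNDATION_HACK_LEVEL
#define COREFOUNDATION_HACK_LEVEL 1  /* the default named in the text */
#endif

/* Level 0 detection: treat any class whose name starts with "NSCF" as a
   bridged CoreFoundation class. Simple, but not fully reliable. */
static bool class_name_has_nscf_prefix(const char *name) {
    return strncmp(name, "NSCF", 4) == 0;
}

/* Level 1 and up: private API is used for a reliable CF object check. */
static bool uses_private_api(void) {
    return COREFOUNDATION_HACK_LEVEL >= 1;
}

/* Level 2 and up: weak references to CF objects are allowed. */
static bool allows_cf_weak_refs(void) {
    return COREFOUNDATION_HACK_LEVEL >= 2;
}

/* Level 3: the final CFRelease is deferred to a background thread,
   which removes the resurrection race. */
static bool closes_resurrection_race(void) {
    return COREFOUNDATION_HACK_LEVEL >= 3;
}
```

Building with -DCOREFOUNDATION_HACK_LEVEL=3 would flip all three predicates to true without touching the source.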

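Conclusion
In the last post I showed how to create zeroing weak references to Objective-C objects with relative ease. In this post, I show that doing the same for CoreFoundation objects is, if not easy, at least possible. A great deal of mucking about with private APIs is required, but the solution should be fairly robust. (Translator’s note: “private API” here refers to undocumented system-internal interfaces; using them may get an app rejected in App Store review.)

This kind of hackery is extremely challenging, but it’s also a lot of fun. The CoreFoundation source code is a valuable resource for this kind of exploration, but beware of private symbols that may change in the future. Other low-level open source code, such as the Objective-C runtime, can also be a handy read. Finally, otx is an extremely useful tool for seeing how a library works when Apple doesn’t provide source.

That’s it for this edition of Friday Q&A. Come back in two weeks for more wacky hijinks.

As always, Friday Q&A is driven by reader ideas. If you have a topic that you would like to see covered here, please send it in!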

#Original (English)

Source: https://www.mikeash.com/pyblog/friday-qa-2010-07-30-zeroing-weak-references-to-corefoundation-objects.html

It’s time for another friendly edition of Friday Q&A. For my last Friday Q&A, I talked about MAZeroingWeakRef and how it’s implemented for pure Objective-C objects. For this one, I’m going to discuss the crazy hacks I implemented to make it work with toll-free bridged CoreFoundation objects as well.

Code
Just as before, you can get the code for MAZeroingWeakRef from my public Subversion repository:

    svn co http://mikeash.com/svn/ZeroingWeakRef/

Prior Reading
This post assumes fairly good knowledge of CoreFoundation and how CF-ObjC bridging works. If you haven’t already, you may wish to read or at least refer to Friday Q&A 2010-01-22: Toll Free Bridging Internals.

Recap
A zeroing weak reference is a reference to an object which does not participate in keeping that object alive. When the target object is destroyed, the zeroing weak reference automatically becomes NULL. When a zeroing weak reference’s target is requested, the caller is guaranteed to either get a valid reference, or NULL. This is useful for all kinds of things as covered in the previous article.

In order to accomplish this in Cocoa, MAZeroingWeakRef overrides the dealloc method of the target object by dynamically creating a subclass of the target’s class, and changing the class of the target. This overridden dealloc method zeroes out MAZeroingWeakRef objects that point to the target.

There is a problem with thread safety and resurrection if you stop there. Imagine one thread calls release on the last strong reference to an object, causing it to then call dealloc. Imagine that between these two, another thread accesses the object through a zeroing weak reference. Since dealloc has not yet been called, it returns a reference to the object. However, because the dealloc call is already set to go, the retain/autorelease dance done by MAZeroingWeakRef can’t save the object from being destroyed. Disaster!

This problem is solved by also overriding release. By having release acquire a lock that’s also used when retrieving a zeroing weak reference target, it’s assured that this resurrection scenario can’t occur.

Toll-Free Bridged Objects
This scheme works great for normal Objective-C objects, but fails hard for bridged CoreFoundation objects. Changing out the class of a bridged object causes infinite recursion. The first thing a CoreFoundation function does is check the class of the object it’s being called on. If that class doesn’t match the official NSCF class, it assumes it’s a pure Objective-C class and calls through to the Objective-C equivalent method. The Objective-C equivalent method on an NSCF class just calls the CoreFoundation function. Rinse, lather, repeat, and crash.

The dynamic subclass wouldn’t be strictly necessary in this case. I could instead swizzle out the dealloc and release methods on the NSCF class directly, and have them do my dirty work. This is a bit less efficient (since I’m affecting every object of that class, not just weak-referenced ones) but that shouldn’t matter.

The trouble is that this doesn’t work. If you call CFRelease on such an object, it goes directly to the refcounting and deallocation of that object without ever calling the Objective-C methods. So this solution can only catch one side of things, which is basically useless.

After working through all of this, I hunted around for a solution. Short of patching CFRelease (which I really didn’t want to do, not least because this approach won’t work on the iPhone, where modifying executable code is forbidden), I couldn’t come up with a way.

I nearly gave up on the problem, resigned to simply forbidding weak references to CoreFoundation objects, when I finally happened upon…

The Solution
I had started looking through the CoreFoundation source code (available from opensource.apple.com), trying to find a way to hook into release events, when I happened upon this little gem in the code for CFRelease:

    void (*func)(CFTypeRef) = __CFRuntimeClassTable[typeID]->finalize;
    if (NULL != func) {
        func(cf);
    }
    // We recheck lowBits to see if the object has been retained again during
    // the finalization process.  This allows for the finalizer to resurrect,
    // but the main point is to allow finalizers to be able to manage the
    // removal of objects from uniquing caches, which may race with other threads
    // which are allocating (looking up and finding) objects from those caches,
    // which (that thread) would be the thing doing the extra retain in that case.
    if (isAllocator || OSAtomicCompareAndSwap32Barrier(1, 0, (int32_t *)&((CFRuntimeBase *)cf)->_rc)) {
        goto really_free;
    }

Implementing this solution requires overriding the CoreFoundation finalize function. CoreFoundation has no supported mechanism for this, so I had to get down and dirty with the CF source code and hack my way in. This means that everything I’m doing is not entirely supported and could break, although I believe that this stuff is actually pretty stable.

CoreFoundation Classes
A CoreFoundation class is just a struct that looks like this:

    typedef struct __CFRuntimeClass {  // Version 0 struct
        CFIndex version;
        const char *className;
        void (*init)(CFTypeRef cf);
        CFTypeRef (*copy)(CFAllocatorRef allocator, CFTypeRef cf);
        void (*finalize)(CFTypeRef cf);
        Boolean (*equal)(CFTypeRef cf1, CFTypeRef cf2);
        CFHashCode (*hash)(CFTypeRef cf);
        CFStringRef (*copyFormattingDesc)(CFTypeRef cf, CFDictionaryRef formatOptions);  // str with retain
        CFStringRef (*copyDebugDesc)(CFTypeRef cf);  // str with retain
        void (*reclaim)(CFTypeRef cf);
    } CFRuntimeClass;

Overriding the finalize function then becomes easy. First, look up the CFRuntimeClass for the given CF type ID with this function:

    extern CFRuntimeClass * _CFRuntimeGetClassWithTypeID(CFTypeID typeID);

    typedef void (*CFFinalizeFptr)(CFTypeRef);
    static CFFinalizeFptr *gCFOriginalFinalizes;
    static size_t gCFOriginalFinalizesSize;

    static Class CreateCustomSubclass(Class class, id obj)
    {
        if(IsTollFreeBridged(class, obj))
        {
            CFTypeID typeID = CFGetTypeID(obj);
            CFRuntimeClass *cfclass = _CFRuntimeGetClassWithTypeID(typeID);

            if(typeID >= gCFOriginalFinalizesSize)
            {
                gCFOriginalFinalizesSize = typeID + 1;
                gCFOriginalFinalizes = realloc(gCFOriginalFinalizes, gCFOriginalFinalizesSize * sizeof(*gCFOriginalFinalizes));
            }

            do {
                gCFOriginalFinalizes[typeID] = cfclass->finalize;
            } while(!OSAtomicCompareAndSwapPtrBarrier(gCFOriginalFinalizes[typeID], CustomCFFinalize, (void *)&cfclass->finalize));
            return class;
        }
        else
            // original ObjC dynamic subclassing code is here

With this change, it’s now critical that IsTollFreeBridged be 100% reliable. The old implementation simply looked for a class name that started with NSCF, and that’s not good enough. I came up with a completely reliable test using a private CoreFoundation table of Objective-C classes:

    extern Class *__CFRuntimeObjCClassTable;

    static BOOL IsTollFreeBridged(Class class, id obj)
    {
        CFTypeID typeID = CFGetTypeID(obj);
        Class tfbClass = __CFRuntimeObjCClassTable[typeID];
        return class == tfbClass;
    }

    static void CustomCFFinalize(CFTypeRef cf)
    {
        WhileLocked({
            if(CFGetRetainCount(cf) == 1)
            {
                ClearWeakRefsForObject((id)cf);
                void (*fptr)(CFTypeRef) = gCFOriginalFinalizes[CFGetTypeID(cf)];
                if(fptr)
                    fptr(cf);
            }
        });
    }

Resurrection Comes Back From the Dead
Unfortunately, there’s a race condition here. Imagine the following sequence:

Thread 1
- CFRelease(obj)
- CFRelease calls CustomCFFinalize
- Before CustomCFFinalize begins executing, the thread is preempted
Thread 2
- [ref target] obtains reference to obj
- obj is retained and autoreleased by MAZeroingWeakRef
- The enclosing autorelease pool is drained, resulting in CFRelease(obj)
- CFRelease calls CustomCFFinalize
- CustomCFFinalize clears weak references and calls the original finalize
- CustomCFFinalize returns
Thread 1
- Resumes execution at the beginning of CustomCFFinalize
- CustomCFFinalize checks the retain count, which is still 1
- CustomCFFinalize calls the original finalize a second time on the same object
- A horrible flaming crash occurs

Thus there is an extremely narrow, difficult-to-hit, but entirely real race condition that could cause this code to crash.

Hack Level Three
In order to solve this problem, I divide CoreFoundation objects into two categories. Some objects are the target of a weak reference, and the rest are not. This serves two purposes. First, it allows me to take a fast path when destroying an object that was never the target of a weak reference. Second, I can track whether a referenced object can still potentially be resurrected or not.

This is implemented by simply keeping a CFMutableSet where referenced objects are stored. Checking the status of an object is simply a matter of testing set membership. Objects are inserted into the set when calling RegisterRef. Objects are removed when the finalize executes with a retain count of 1, which ensures that it can no longer be resurrected.

The new CustomCFFinalize is then split in two. If the object has weak references, it first checks for a retain count of 1 to see whether it’s been resurrected:

    static void CustomCFFinalize(CFTypeRef cf)
    {
        WhileLocked({
            if(CFSetContainsValue(gCFWeakTargets, cf))
            {
                if(CFGetRetainCount(cf) == 1)
                {
                    ClearWeakRefsForObject((id)cf);
                    CFSetRemoveValue(gCFWeakTargets, cf);
                    CFRetain(cf);
                    CallCFReleaseLater(cf);
                }
            }
            else
            {
                void (*fptr)(CFTypeRef) = gCFOriginalFinalizes[CFGetTypeID(cf)];
                if(fptr)
                    fptr(cf);
            }
        });
    }

Using autorelease would do the trick, except that this is pure CF code and there’s no guarantee that the caller actually has an autorelease pool in place. A nice idea, but it just doesn’t work out.

Some way to hook CFRelease to see when it exits would be ideal. But as discussed before, there’s simply no available hook, so that goes out as well.

Ultimately I obtained some serious inspiration from Ed “Master of All Things Arcane” Wynne that it could really be done by using a completely insane technique similar to the cache-cleanup scheme in the Objective-C runtime.

The Crazy Scheme
To restate the problem: I need to call CFRelease on the object sometime after the original call to CFRelease has completed. Since there’s no way to arrange this on the thread that made the original call to CFRelease, I make use of a background thread.

How can the background thread know when the original call to CFRelease has completed?

It’s possible for one thread to access the PC (program counter, the location of the currently executing instruction) of another thread. Normally this is not very useful, but the Objective-C runtime uses it to see whether it’s safe to destroy stale cache data by looking to see if any other threads are in a function that accesses it.

Likewise, this code can check the PC of the original calling thread and see if it’s still within CFRelease or not. If it’s not, then the call must have finished, so it’s now safe to release the object again.

The only way (that I know of) to get the PC of another thread on OS X is to use mach calls, so the first step is to get a reference to the current mach thread. This reference is also “retained” (mach ports are reference counted, just like Objective-C objects) so that it doesn’t go invalid in case the thread is destroyed in the meantime:

    static void CallCFReleaseLater(CFTypeRef cf)
    {
        mach_port_t thread = pthread_mach_thread_np(pthread_self());
        mach_port_mod_refs(mach_task_self(), thread, MACH_PORT_RIGHT_SEND, 1 ); // "retain"

        NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
        SEL sel = @selector(releaseLater:fromThread:);
        NSInvocation *inv = [NSInvocation invocationWithMethodSignature: [MAZeroingWeakRef methodSignatureForSelector: sel]];
        [inv setTarget: [MAZeroingWeakRef class]];
        [inv setSelector: sel];
        [inv setArgument: &cf atIndex: 2];
        [inv setArgument: &thread atIndex: 3];

        NSInvocationOperation *op = [[NSInvocationOperation alloc] initWithInvocation: inv];
        [gCFDelayedDestructionQueue addOperation: op];
        [op release];
        [pool release];
    }

    + (void)releaseLater: (CFTypeRef)cf fromThread: (mach_port_t)thread
    {
        BOOL retry = YES;

        while(retry)
        {
            BLOCK_QUALIFIER void *pc;
            // ensure that the PC is outside our inner code when fetching it,
            // so we don't have to check for all the nested calls
            WhileLocked({
                pc = GetPC(thread);
            });

            // NOTE: the line opening this block is missing from the listing;
            // a NULL check is assumed here, since GetPC returns NULL on failure
            if(pc)
            {
                if(pc < (void *)CustomCFFinalize || pc > (void *)IsTollFreeBridged)
                {
                    Dl_info info;
                    int success = dladdr(pc, &info);
                    if(success)
                    {
                        if(info.dli_saddr != _CFRelease)
                        {
                            retry = NO; // success!
                            CFRelease(cf);
                            mach_port_mod_refs(mach_task_self(), thread, MACH_PORT_RIGHT_SEND, -1 ); // "release"
                        }
                    }
                }
            }
        }
    }

    static void *GetPC(mach_port_t thread)
    {
        // arch-specific code goes here

        kern_return_t ret = thread_get_state(thread, flavor, (thread_state_t)&state, &count);
        if(ret == KERN_SUCCESS)
            return (void *)state.PC_REGISTER;
        else
            return NULL;
    }

And that’s it!

Odds and Ends
In the previous post, I mentioned the COREFOUNDATION_HACK_LEVEL macro that controls how much hack MAZeroingWeakRef contains. When set to 0, it makes use of no private API. It refuses to reference CoreFoundation objects, and detects them by checking the class name for an NSCF prefix. When set to 1, it only uses private API to make a reliable CoreFoundation object check. Level 1 is now the default.

When I wrote the previous post, I didn’t actually know about this subtle resurrection race condition. As such, I’ve added an extra hack level. Hack level 2 uses private CoreFoundation calls to allow referencing CF objects, but does not eliminate the resurrection race condition I described above. Finally, the newly-added hack level 3 goes into full-on CoreFoundation hackery as described above, and eliminates the race condition by doing the final CFRelease in a background thread.

These can be controlled using the COREFOUNDATION_HACK_LEVEL macro at the top of the file. I recommend level 1 for Mac development (weak references to CoreFoundation objects are not commonly needed) and level 0 for iOS development (Apple gets their underwear in a twist over private API usage). However, if you’re adventurous or need weak references to CF objects, you can set it to 3 and everything should still work… If you do, keep in mind that the really horrible hacks don’t activate until you actually create a weak reference to a CF object, so you can enable it just in case you inadvertently reference a CF object, but not worry about it doing anything terrible in the normal case.

Conclusion
In the last post I showed how to create zeroing weak references to Objective-C objects with relative ease. In this post, I show that doing the same to CoreFoundation objects is, if not easy, at least possible. A great deal of mucking about with private APIs is required, but the solution should be fairly robust.

This kind of hackery is extremely challenging but it’s also a lot of fun. The CoreFoundation source code is a valuable resource for this kind of thing, but as always you must beware of private symbols which may change in the future. Other low-level open source code like the Objective-C runtime can also be a handy read. Finally, otx is an extremely useful tool for when you need to see how a library works when Apple doesn’t provide source.

That’s it for this edition of Friday Q&A. Come back in two weeks for more wacky hijinks.

As always, Friday Q&A is driven by user ideas. If you have a topic that you would like to see covered here, please send it in!