Generic Block Proxying | TommyWu's Lab

Published: October 28, 2011

Author: TommyWu

Translation · Original: Friday Q&A 2011-10-28: Generic Block Proxying · Author: Mike Ash

Original: https://www.mikeash.com/pyblog/friday-qa-2011-10-28-generic-block-proxying.html · Published: 2011-10-28 · Author: Mike Ash · Translator: MiMo (mimo-v2.5-pro); code blocks are kept in the original English

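What It Means

In Objective-C, it's possible to intercept messages. Any message sent to an object that doesn't implement it gets an NSInvocation object constructed, which is then passed to forwardInvocation:. In that method you can do whatever you like with the message, such as modifying its parameters before passing it on to another object, or sending it over a network.

The most common use for this facility is to write a proxy class that implements almost no methods. Nearly any message sent to it is caught by the forwarding mechanism, and the proxy can then do clever things with the message while still mostly acting like the object being proxied. This is useful for building things like transparent futures and transparent zeroing weak references.

Block proxying is much the same, but with blocks instead of objects. You wrap an arbitrary block with another block, which can then intercept the call and interfere as it sees fit.

The technique I'm about to present works well, but it is definitely not supported and should not be used in production code. It relies on private APIs and on private quirks of public APIs. It's an interesting experiment, not a stable library.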

Code

As usual, the code is available on GitHub. Today's journey into the forbidding depths can be found here:
https://github.com/mikeash/MABlockForwarding

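Theory

Objective-C message dispatch works by taking the selector and the class and looking up the method in the class that corresponds to that selector. More specifically, it looks up the function, or IMP, that actually implements the method, then calls that function.

Message forwarding hooks right into this system. When the lookup finds no method in the class, a special forwarding IMP is returned instead. That function takes care of all the painful and platform-specific details of turning a function call into an NSInvocation object.

If we can obtain this special forwarding IMP, we can build a fake block around it and accomplish our goal of forwarding blocks. It turns out that the forwarding IMP is really easy to obtain: just ask the system for the IMP of an unimplemented selector. There are several ways to do this, but the easiest is to call [self methodForSelector:...] and pass a selector you know doesn't exist in the class.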
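To make the lookup-with-fallback idea concrete, here is a toy model in plain C. It is only a sketch: the real runtime's data structures are far more involved, and every name in it (ToyClass, toy_lookup, toy_forward and so on) is made up for illustration. It shows a lookup that returns the implementing function when the selector is found and a stand-in forwarding function otherwise.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: real selectors, classes and IMPs live in the Objective-C
   runtime. Every name below is hypothetical. */
typedef int (*ToyIMP)(void *self, const char *sel);

typedef struct {
    const char *sel;   /* selector name */
    ToyIMP imp;        /* function that implements it */
} ToyMethod;

typedef struct {
    const ToyMethod *methods;
    size_t count;
} ToyClass;

/* Stand-in for the special forwarding IMP. */
int toy_forward(void *self, const char *sel) {
    (void)self;
    (void)sel;
    return -1;  /* the real one would package the call into an NSInvocation */
}

/* An ordinary method: treats self as a C string and returns its length. */
int toy_length(void *self, const char *sel) {
    (void)sel;
    return (int)strlen((const char *)self);
}

/* Look the selector up in the class; fall back to the forwarding IMP. */
ToyIMP toy_lookup(const ToyClass *cls, const char *sel) {
    for (size_t i = 0; i < cls->count; i++) {
        if (strcmp(cls->methods[i].sel, sel) == 0) {
            return cls->methods[i].imp;
        }
    }
    return toy_forward;
}

const ToyMethod toy_string_methods[] = { { "length", toy_length } };
const ToyClass toy_string_class = { toy_string_methods, 1 };
```

In this model, asking toy_lookup for a selector the class lacks hands back toy_forward, which is the same move as asking for the IMP of a selector known not to exist.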

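A block is just an Objective-C object with a function pointer in the right place. To call the block, the compiler calls that function pointer and passes the object as the first parameter. We can construct an Objective-C object with a pointer to the forwarding IMP in the right place, and the forwarding machinery will kick into action, build an NSInvocation, and then call our forwardInvocation: method.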
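The calling convention described above can be imitated in plain C without any compiler support for blocks. The sketch below is an illustration under assumptions, not the compiler's actual output: struct ToyBlock and the toy_ functions are hypothetical names, and the struct is a simplified stand-in for a real block object.

```c
#include <assert.h>

/* Simplified stand-in for a block object; field names are illustrative.
   The real layout is defined by the compiler's block ABI. */
struct ToyBlock {
    void *isa;                                   /* class pointer */
    int flags;
    int reserved;
    int (*invoke)(struct ToyBlock *self, int x); /* the block's code */
    void *descriptor;
    int captured;                                /* a captured variable */
};

/* Body of the "block": adds the captured value to its argument. */
int toy_add_invoke(struct ToyBlock *self, int x) {
    return self->captured + x;
}

/* What a call such as block(x) boils down to: load the function pointer
   and pass the block itself as the hidden first argument. */
int toy_call_block(struct ToyBlock *block, int x) {
    return block->invoke(block, x);
}

/* A replacement function; installing it in the invoke field redirects
   every call made through toy_call_block. */
int toy_intercept_invoke(struct ToyBlock *self, int x) {
    (void)self;
    return -x;
}
```

Because the caller only ever looks at the invoke field, whoever controls that field controls the call. Putting the forwarding IMP there is what lets a fake block hand its calls to the forwarding machinery.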

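The forwarding machinery needs the method signature of the method being called in order to know how to package the arguments. Fortunately, with reasonably recent compilers, blocks embed signature information in the same format.

Forwarding deals with messages, which have two implicit arguments: the object and the selector. Blocks have only one implicit argument: the block object. The second argument to a block can be anything, or not exist at all (for a block with no parameters). Fortunately, the forwarding function doesn't seem to care about the type of the second argument, as long as it is present. For blocks that have no second argument, a fake one can be inserted into the signature without screwing things up.

Implementation

The goal is to build this function: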

    typedef void (^BlockInterposer)(NSInvocation *inv, void (^call)(void));

    id MAForwardingBlock(BlockInterposer interposer, id block);

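MAForwardingBlock takes two parameters. The first is the interposer block, which is the block called to handle the invocation. The second is the original block to wrap. The interposer receives a block as a parameter which, when called, calls through to the original block using the NSInvocation as the arguments. The function returns a new block that forwards all calls to the interposer block passed in.

The first thing to do is to create a new class that will pretend to be a block. Instances of this class will act like blocks and handle all of the proxying duties. The layout of this class needs to be compatible with the layout of a block. A block contains five fields, which can then be followed by other data: an isa field (necessary for it to work as an Objective-C object), flags, some reserved space, the block's function pointer, and a pointer to a block descriptor that contains other useful information about the block.

The isa field is already taken care of, and the rest can be laid out as instance variables. After the block fields are laid out, other data can follow. In this case, the class stores the interposer block and the original block as instance variables after the block fields: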

    @interface MAFakeBlock : NSObject
    {
        int _flags;
        int _reserved;
        IMP _invoke;
        struct BlockDescriptor *_descriptor;

        id _forwardingBlock;
        BlockInterposer _interposer;
    }
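The layout requirement can be checked mechanically. The plain C sketch below compares a block header with what an MAFakeBlock instance would look like in memory. It rests on two assumptions that the article does not spell out: that NSObject contributes exactly one pointer-sized isa field, and that instance variables are laid out in declaration order. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* The five fields every block starts with (names are illustrative). */
struct BlockHeader {
    void *isa;
    int flags;
    int reserved;
    void (*invoke)(void);
    void *descriptor;
};

/* Assumed in-memory shape of an MAFakeBlock instance: the isa inherited
   from NSObject, then the ivars in declaration order. */
struct FakeBlockInstance {
    void *isa;
    int _flags;
    int _reserved;
    void (*_invoke)(void);
    void *_descriptor;
    void *_forwardingBlock; /* extra data after the block fields */
    void *_interposer;
};

/* Returns 1 when both layouts agree on every block field and the extra
   ivars start only after the block header ends. */
int layouts_match(void) {
    return offsetof(struct BlockHeader, flags) == offsetof(struct FakeBlockInstance, _flags)
        && offsetof(struct BlockHeader, reserved) == offsetof(struct FakeBlockInstance, _reserved)
        && offsetof(struct BlockHeader, invoke) == offsetof(struct FakeBlockInstance, _invoke)
        && offsetof(struct BlockHeader, descriptor) == offsetof(struct FakeBlockInstance, _descriptor)
        && offsetof(struct FakeBlockInstance, _forwardingBlock) >= sizeof(struct BlockHeader);
}
```

If either assumption failed, the invoke pointer would sit at the wrong offset and calling the fake block would jump to garbage, which is one reason the technique is fragile.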

This class has a single method in its interface, an initializer:

    - (id)initWithBlock: (id)block interposer: (BlockInterposer)interposer;

Everything else happens through block calling conventions and forwarding, so nothing else needs to be declared. The implementation of this method copies and stores the two blocks passed in, then sets the invoke field to the forwarding IMP by fetching a method that isn't implemented:

    - (id)initWithBlock: (id)block interposer: (BlockInterposer)interposer
    {
        if((self = [super init]))
        {
            _forwardingBlock = [block copy];
            _interposer = [interposer copy];
            _invoke = [self methodForSelector: @selector(thisDoesNotExistOrAtLeastItReallyShouldnt)];
        }
        return self;
    }

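With everything now set up, whenever an instance of MAFakeBlock is called like a block, it ends up going through the regular Objective-C forwarding machinery. There are two steps in the general forwarding path: first, the runtime fetches the method signature using methodSignatureForSelector:, then it constructs an NSInvocation and calls forwardInvocation:.

To figure out the method signature to give to the runtime, we first need the method signature of the block being wrapped. This is done by delving into that BlockDescriptor structure and pulling out the signature. The details are a bit boring, so I'm going to skip over them and simply assume that there's a BlockSig function which takes a block and returns its signature as a C string. For the curious, the code is on GitHub. (Translator's note: the BlockDescriptor described here is an implementation detail of the older block runtime; the internal structure of blocks on modern systems may differ.)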
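For readers who want a feel for the skipped details, here is a rough sketch of what a BlockSig-style lookup could involve, written in plain C against hand-built structs. It is not the code from the repository. The flag bits and field order follow Clang's published block ABI document as I understand it, and all type and function names (ToyDescriptor, ToyBlockLiteral, toy_block_sig) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Flag bits as given in Clang's block ABI documentation. */
enum {
    TOY_BLOCK_HAS_COPY_DISPOSE = 1 << 25,
    TOY_BLOCK_HAS_SIGNATURE    = 1 << 30
};

struct ToyDescriptor {
    unsigned long reserved;
    unsigned long size;
    /* Optional fields follow. When HAS_COPY_DISPOSE is set, two helper
       pointers come first; the signature comes after them. This toy
       stores all three slots as data pointers for simplicity. */
    const void *rest[3];
};

struct ToyBlockLiteral {
    void *isa;
    int flags;
    int reserved;
    void (*invoke)(void);
    struct ToyDescriptor *descriptor;
};

/* Returns the type-encoding string, or NULL if the block carries none. */
const char *toy_block_sig(const struct ToyBlockLiteral *block) {
    if (!(block->flags & TOY_BLOCK_HAS_SIGNATURE)) {
        return NULL;
    }
    /* Skip the copy and dispose helpers when they are present. */
    size_t index = (block->flags & TOY_BLOCK_HAS_COPY_DISPOSE) ? 2 : 0;
    return (const char *)block->descriptor->rest[index];
}
```

The fiddly part is that the signature's position depends on the flags, so a block with copy and dispose helpers keeps its signature two slots further along than a block without them.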

NSMethodSignature provides a method to get a signature object from a C string, +signatureWithObjCTypes:. The only wrinkle is that the forwarding machinery will crash if the provided signature doesn't have at least two arguments. To fix that, I fake it by appending extra void * parameters to the signature until it has the required number. These extra parameters are harmless, although they will be filled with random junk from registers or the stack. The methodSignatureForSelector: implementation then looks like this:

    - (NSMethodSignature *)methodSignatureForSelector: (SEL)sel
    {
        const char *types = BlockSig(_forwardingBlock);
        NSMethodSignature *sig = [NSMethodSignature signatureWithObjCTypes: types];
        while([sig numberOfArguments] < 2)
        {
            types = [[NSString stringWithFormat: @"%s%s", types, @encode(void *)] UTF8String];
            sig = [NSMethodSignature signatureWithObjCTypes: types];
        }
        return sig;
    }
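To see what the padding does to the type string itself, here is a small plain C sketch that works without Foundation. It is a simplification under stated assumptions: the parser only understands single-character scalar types, '^' pointer prefixes, and "@?" for a block, so structs, unions, and arrays are not handled, and the toy_ function names are invented. The example strings are typical of 64-bit block signatures, not taken from the article.

```c
#include <assert.h>
#include <string.h>

/* Skip one type in a simplified encoding, plus any digits after it. */
const char *toy_skip_type(const char *p) {
    while (*p == '^') p++;               /* pointer prefixes */
    if (*p == '@' && p[1] == '?') p++;   /* "@?" is a block object */
    if (*p) p++;                         /* the type character itself */
    while (*p >= '0' && *p <= '9') p++;  /* frame size or argument offset */
    return p;
}

/* Count the arguments that follow the return type. */
int toy_count_args(const char *types) {
    int n = 0;
    const char *p = toy_skip_type(types);  /* return type */
    while (*p) {
        p = toy_skip_type(p);
        n++;
    }
    return n;
}

/* Copy types into out, appending "^v" (the encoding of void *) until
   there are at least two arguments. Returns 0 on success, or -1 if out
   is too small. */
int toy_pad_signature(const char *types, char *out, size_t outsize) {
    if (strlen(types) + 1 > outsize) return -1;
    strcpy(out, types);
    while (toy_count_args(out) < 2) {
        if (strlen(out) + 3 > outsize) return -1;
        strcat(out, "^v");
    }
    return 0;
}
```

With this sketch, a no-parameter signature such as "v8@?0" becomes "v8@?0^v", while "v12@?0i8" already has two arguments and passes through unchanged.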

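The implementation of -forwardInvocation: is then pretty simple. Change the invocation's target to the original block, then call the interposer:

    - (void)forwardInvocation: (NSInvocation *)inv
    {
        [inv setTarget: _forwardingBlock];
        _interposer(inv, ^{

The call block passed to the interposer is a bit tricky. In its public interface, NSInvocation only provides methods that invoke it with a particular selector, which goes through objc_msgSend. That is no good for calling a block.

Fortunately, there's a private method called invokeUsingIMP:. It bypasses objc_msgSend and simply calls the provided IMP. In practice it will call any function pointer, as long as that pointer is compatible with the invocation's signature. We can then pass it the function pointer of the inner block, and off we go:

            [inv invokeUsingIMP: BlockImpl(_forwardingBlock)];
        });
    }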

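Again, I use a little helper function here to deal with internal block structure. BlockImpl fetches the function pointer out of a block. This one is really simple: it just interprets the object as a block structure and fetches the invoke field. If you want to see it, the code is available on GitHub.

All that remains for this class is a dummy implementation of copyWithZone:, since blocks are copied a lot. This implementation only has to retain the fake block, since the class has no mutable state:

    - (id)copyWithZone: (NSZone *)zone
    {
        return [self retain];
    }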

Now that this class is complete, all that remains is the implementation of MAForwardingBlock. All this function has to do is create and return a new, properly initialized instance of the fake block class:

    id MAForwardingBlock(BlockInterposer interposer, id block)
    {
        return [[[MAFakeBlock alloc] initWithBlock: block interposer: interposer] autorelease];
    }

That's it! Now we can proxy blocks. Here's a silly example:

    void (^block)(int) = MAForwardingBlock(^(NSInvocation *inv, void (^call)(void)) {
        // Argument 0 is the block itself, so index 1 is the first explicit argument.
        [inv setArgument: &(int){ 4242 } atIndex: 1];
        call();
    }, ^(int testarg){
        NSLog(@"%d", testarg);
    });
    block(42);

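Even though the block is called with 42, the call actually prints 4242, since the interposing block changes the argument before calling the original block.

Since this code leverages Cocoa's forwarding machinery, it works with nearly any block taking nearly any combination of parameters and return values, not just simple int blocks. It suffers from the same limitations as Cocoa's forwarding, of course. In particular, it can't handle blocks that take variable arguments or unions. It also doesn't deal with the peculiarities of struct returns. Because of how struct returns work on most architectures, there is actually a separate forwarding IMP for them. To work with struct returns, this code would have to detect whether the block signature uses the struct return calling convention and fetch that separate IMP instead. (Translator's note: struct-return forwarding may work differently in modern Objective-C runtimes.)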

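Conclusion

Understanding how mechanisms like message forwarding work at a low level makes it possible to twist them to do entirely new things. Sometimes you get something really useful. Sometimes you just get an interesting toy that can't really be used in real code. While this one is just a toy, it's still an interesting exploration of the guts of the system, and this sort of exploration can often lead to real, solid, useful code later on.

That's it for today. Come back in two weeks, when I'll discuss how to use this block proxying code to implement memoization. Until then, keep sending your ideas for topics. With the occasional exception, Friday Q&A is driven by reader suggestions, so if you have a topic that you would like to see covered here, send it in!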

#Original (English)

Source: https://www.mikeash.com/pyblog/friday-qa-2011-10-28-generic-block-proxying.html

Here at Friday Q&A, I pride myself on occasionally taking my readers to places where normal people dare not tread. Today is one of those days. This is not a reader suggested topic, but today I want to talk about a fun hack I came up with that allows proxying block invocations in much the way that one can proxy Objective-C messages.

What It Means

In Objective-C, it’s possible to intercept messages. Any message sent to an object that isn’t implemented gets an NSInvocation object constructed, and that is then sent to forwardInvocation:. In there, you can do whatever you like with the message, like messing with its parameters before passing it on to another object, or sending it over a network.

The most common use for this facility is to write a proxy class which doesn’t implement much of anything. Nearly any message sent to it will be caught by the forwarding mechanism, and the proxy can then do clever things with any message, while still mostly acting like the object being proxied. This is useful for building things like transparent futures and transparent zeroing weak references.

Block proxying is much the same, but with blocks instead of objects. You wrap an arbitrary block with another block which is able to intercept the call and interfere as it sees fit.

The technique I’m about to present works well, but it’s definitely not supported and should not be used in real code. It relies on private APIs and private quirks of public APIs. It’s an interesting experiment, not a stable library.

Code

As usual, the code is available on GitHub. Today’s journey into the forbidding depths can be found here:

https://github.com/mikeash/MABlockForwarding

Theory

Objective-C message dispatch works by taking the selector and the class and looking up the method in the class that corresponds to that selector. More specifically, it looks up the function, or IMP, that actually implements the method, then calls that function.

Message forwarding hooks right into this system. When looking up a function, if no method is found in the class, a special forwarding IMP is returned. That function takes care of all the painful and platform-specific details of how to turn a function call into an NSInvocation object.

If we can obtain this special forwarding IMP then we can build a fake block around it and accomplish our goal of forwarding blocks. Turns out that the special forwarding IMP is really easy to obtain. All you need to do is ask the system for the IMP for an unimplemented selector. There are several ways to do this, but the easiest is to simply call [self methodForSelector:…] and pass a selector you know doesn’t exist in the class.

A block is just an Objective-C object with a function pointer in the right place. To call the block, the compiler calls the function pointer and passes the object as the first parameter. We can construct an Objective-C object with a pointer to the forwarding IMP in the right place, and the forwarding machinery will kick into action, build an NSInvocation, and then call our forwardInvocation: method.

The forwarding machinery needs the method signature of the method being called in order to know how to package the arguments. Fortunately, with reasonably recent compilers, blocks embed signature information in the same format.

Forwarding deals with messages, which have two implicit arguments: the object and the selector. Blocks only have one implicit argument: the block object. The second argument to a block can be anything, or not even exist at all (for a block with no parameters). Fortunately, the forwarding function doesn’t seem to care about the type of the second parameter, as long as it’s present. For blocks that don’t have a second parameter, a fake one can be inserted into the signature without screwing things up.

Implementation

The goal is to build this function:

    typedef void (^BlockInterposer)(NSInvocation *inv, void (^call)(void));

    id MAForwardingBlock(BlockInterposer interposer, id block);

MAForwardingBlock takes two parameters. The first is the interposer block, which is the block which is called to handle the invocation. The second is the original block to wrap. The interposer gets a block as a parameter which, when called, will call through to the original block using the NSInvocation as the parameters. The function returns a new block which forwards calls to the interposer block passed in.

The first thing to do is to create a new class which will pretend to be a block. Instances of this class will act like blocks and will handle all of the proxying duties. The layout of this class needs to be compatible with the layout of a block. A block contains five fields which can then be followed by other data. There’s an isa field (necessary for it to work as an Objective-C object), flags, some reserved space, the block’s function pointer, and a pointer to a block descriptor which contains other useful information about the block.

The isa field is already taken care of, and then the rest can be laid out as instance variables. After the block fields are laid out, other data can follow. In this case, the class stores the interposer block and the original block as instance variables after the block fields:

    @interface MAFakeBlock : NSObject
    {
        int _flags;
        int _reserved;
        IMP _invoke;
        struct BlockDescriptor *_descriptor;

        id _forwardingBlock;
        BlockInterposer _interposer;
    }

This class has a single method in its interface, an initializer:

    - (id)initWithBlock: (id)block interposer: (BlockInterposer)interposer;

Everything else happens through block calling conventions and forwarding, so nothing else needs to be done. The implementation for this method copies and stores the two blocks passed in, and then sets the invoke field to the forwarding IMP by fetching a method that isn’t implemented:

    - (id)initWithBlock: (id)block interposer: (BlockInterposer)interposer
    {
        if((self = [super init]))
        {
            _forwardingBlock = [block copy];
            _interposer = [interposer copy];
            _invoke = [self methodForSelector: @selector(thisDoesNotExistOrAtLeastItReallyShouldnt)];
        }
        return self;
    }

With everything now set up, whenever an instance of MAFakeBlock is called like a block, it will end up going through the regular Objective-C forwarding machinery. There are two steps in the general forwarding path: first, the runtime fetches the method signature using methodSignatureForSelector:, then it constructs an NSInvocation and calls forwardInvocation:.

To figure out the method signature to give to the runtime, we first need to get the method signature of the block being wrapped. This is done by delving into that BlockDescriptor structure and pulling out the signature. The details are a bit boring, and I’m going to skip over them and simply assume that there’s a BlockSig function which takes a block and returns its signature as a C string. For the curious, the code is on GitHub.

NSMethodSignature provides a method to get a signature object from a C string, +signatureWithObjCTypes:. The only wrinkle is that the forwarding machinery will crash if the provided signature doesn’t have at least two objects. To fix that, I fake it by adding extra fake void * parameters to the signature so that it has at least the required number of parameters. These extra parameters are harmless, although they will be filled with random junk from registers or the stack. The methodSignatureForSelector: implementation then looks like this:

    - (NSMethodSignature *)methodSignatureForSelector: (SEL)sel
    {
        const char *types = BlockSig(_forwardingBlock);
        NSMethodSignature *sig = [NSMethodSignature signatureWithObjCTypes: types];
        while([sig numberOfArguments] < 2)
        {
            types = [[NSString stringWithFormat: @"%s%s", types, @encode(void *)] UTF8String];
            sig = [NSMethodSignature signatureWithObjCTypes: types];
        }
        return sig;
    }

The implementation of -forwardInvocation: is then pretty simple. Change the invocation’s target to the original block, then call the interposer:

    - (void)forwardInvocation: (NSInvocation *)inv
    {
        [inv setTarget: _forwardingBlock];
        _interposer(inv, ^{

The call block that gets passed to the interposer is a bit tricky. In its public interface, NSInvocation only provides methods to invoke it with a particular selector, which goes through objc_msgSend. This is no good for calling a block, of course.

Fortunately, there’s a private method called invokeUsingIMP:. This bypasses objc_msgSend and simply calls the provided IMP. In practice, it’ll call any arbitrary function pointer, as long as it’s compatible with the signature that it has. We can then pass it the function pointer for the inner block, and off we go:

            [inv invokeUsingIMP: BlockImpl(_forwardingBlock)];
        });
    }

Again, I use a little helper function here to deal with internal block structure. BlockImpl fetches the function pointer out of a block. This one is really simple: it just interprets the object as a block structure and fetches the invoke field. If you want to see it, the code is available.

All that remains for this class is a dummy implementation of copyWithZone:, since blocks are copied a lot. Nothing has to be done for this implementation besides retaining the fake block, since there isn’t any mutable state in this class:

    - (id)copyWithZone: (NSZone *)zone
    {
        return [self retain];
    }

Now that this class is complete, all that remains is the implementation of MAForwardingBlock. All this function has to do is create and return a new instance of the fake block class, properly initialized:

    id MAForwardingBlock(BlockInterposer interposer, id block)
    {
        return [[[MAFakeBlock alloc] initWithBlock: block interposer: interposer] autorelease];
    }

That’s it! Now we can proxy blocks. Here’s a silly example:

    void (^block)(int) = MAForwardingBlock(^(NSInvocation *inv, void (^call)(void)) {
        // Argument 0 is the block itself, so index 1 is the first explicit argument.
        [inv setArgument: &(int){ 4242 } atIndex: 1];
        call();
    }, ^(int testarg){
        NSLog(@"%d", testarg);
    });
    block(42);

Even though the block is called with 42, the call actually prints 4242, since the interposing block changes the argument before calling the original block.

Since this code leverages Cocoa’s forwarding machinery, it will work with nearly any block taking nearly any combination of parameters and return values, not just simple int blocks. It suffers from the same limitations of Cocoa’s forwarding, of course. In particular, it’s not able to handle blocks that take variable arguments or unions. It also doesn’t deal with the peculiarities of struct returns. Because of how struct returns work on most architectures, there’s actually a separate forwarding IMP for those. To work with struct returns, this code would have to detect whether the block signature uses the struct return calling convention and fetch that separate IMP instead.

Conclusion

Understanding how mechanisms like message forwarding work at a low level makes it possible to twist them to do entirely new things. Sometimes you get something really useful. Sometimes you just get an interesting toy that can’t really be used in real code. While this one is just a toy, it’s still an interesting exploration of the guts of the system, and this sort of thing can often lead to real, solid, useful code later on.

That’s it for today. Come back in two weeks when I discuss how to use this block proxying code to implement memoization. Until then, keep sending your ideas for topics. With the occasional exception, Friday Q&A is driven by reader suggestions, so if you have a topic that you would like to see covered here, send it in!