Objective-C Literal Syntax

Mike Ash Friday Q&A, Chinese translation: Objective-C Literal Syntax

By TommyWu

Translation · Original: Friday Q&A 2012-06-22: Objective-C Literals · Author: Mike Ash

Original: https://www.mikeash.com/pyblog/friday-qa-2012-06-22-objective-c-literals.html · Published: 2012-06-22 · Author: Mike Ash · Translator: MiMo (mimo-v2.5-pro); code blocks are left in their original English


Welcome back! After a short break for WWDC, it's time for another wacky adventure. Today's topic is the new object literal syntax being introduced into Objective-C, suggested by reader Frank McAuley.

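Literals

For anyone unfamiliar with the term as used in programming languages, a "literal" is any value that can be written directly in source code. For example, 42 is a literal in C (and, of course, in many other languages). Literals are usually classified by the kind of value they produce, so 42 is an integer literal, "hello" is a string literal, and 'x' is a character literal.

Literals are a foundation of most programming languages, because code needs some way to write constant values. They aren't strictly necessary, since any value can be constructed at runtime, but they generally make code much more convenient to write. For example, we can construct 42 without using any literals: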

int fortytwo(void)
{
    static int zero; // statics are initialized to 0
    static int fortytwo;
    if(!fortytwo)
    {
        int one = ++zero;
        int two = one + one;
        int four = two * two;
        int eight = four * two;
        int thirtytwo = eight * four;
        fortytwo = thirtytwo + eight + two;
    }
    return fortytwo;
}

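However, if we had to go through this for every integer we used, we would probably all give up programming and move to some profession where the tools are less maddening. Likewise, we could assemble C strings by hand from characters, but strings are used so often that the language provides a concise way to write them.

Collections are used quite commonly as well. C originally provided no collection literals, but the ability to initialize variables of a compound data type came close: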

int array[] = { 1, 2, 3, 4, 5 };
struct foo f = { 99, "string" };

This isn't always convenient, so C99 added compound literals, which allow such values to be constructed directly anywhere in code:

DoWorkOnArray((int[]){ 1, 2, 3, 4, 5 });
DoWorkOnStruct((struct foo){ 99, "string" });
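
As a minimal sketch of how such calls might look end to end, the following C fragment defines a struct foo and two small functions, then passes compound literals to them. The field names number and text and the helpers sum_ints, foo_number, and compound_literal_demo are illustrative assumptions, not names from the article.

```c
#include <stddef.h>

/* Hypothetical record type standing in for the article's struct foo. */
struct foo {
    int number;
    const char *text;
};

/* Sums the first count elements of values. */
int sum_ints(const int *values, size_t count)
{
    int total = 0;
    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}

/* Takes the struct by value and returns its numeric field. */
int foo_number(struct foo f)
{
    return f.number;
}

/* Both arguments are built in place with C99 compound literals. */
int compound_literal_demo(void)
{
    int sum = sum_ints((int[]){ 1, 2, 3, 4, 5 }, 5);
    int num = foo_number((struct foo){ 99, "string" });
    return sum + num;
}
```

Neither call needs a named temporary variable; the array and the struct exist only for the duration of the enclosing block.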

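Collection literals are quite common in other languages too. For example, the popular JSON serialization format is essentially a codification of JavaScript's literal syntax. The following JSON is also valid syntax for creating an array of dictionaries in JavaScript, Python, and probably some other languages: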

[{ "key": "obj" }, { "key": "obj2" }]

Until recently, Objective-C had no syntax for Objective-C collections. The equivalent of the above had to be written like this:

[NSArray arrayWithObjects:
    [NSDictionary dictionaryWithObjectsAndKeys:
        @"obj", @"key", nil],
    [NSDictionary dictionaryWithObjectsAndKeys:
        @"obj2", @"key", nil],
    nil];

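This is verbose enough to be painful to type, and it obscures what is actually going on. The limitations of C variable argument passing also require a nil sentinel at the end of each container creation call, and forgetting it can cause extremely odd failures. All in all, not a good experience.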
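
To illustrate why the sentinel is needed, here is a minimal C sketch of a variadic function that, like arrayWithObjects:, has no way to know how many arguments it received and must scan until it sees a terminator. count_strings is a hypothetical helper invented for this example; it is not a Cocoa or libc function.

```c
#include <stdarg.h>
#include <stddef.h>

/*
 * Counts the string arguments up to, but not including, the NULL sentinel.
 * The callee cannot see the argument count, so the sentinel is the only
 * thing that stops the loop.
 */
size_t count_strings(const char *first, ...)
{
    size_t count = 0;
    va_list args;

    va_start(args, first);
    for (const char *s = first; s != NULL; s = va_arg(args, const char *))
        count++;
    va_end(args);

    return count;
}
```

A call such as count_strings("obj", "key", (const char *)NULL) returns 2. If the terminator is left off, the loop keeps reading whatever happens to follow the real arguments, which is undefined behavior and accounts for the odd failures described above.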

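Container Literals

The latest Clang now supports container literals in Objective-C. The syntax resembles that of JSON and modern scripting languages, with the traditional Objective-C @ thrown in. Our array/dictionary example can now be written like this: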

@[@{ @"key" : @"obj" }, @{ @"key" : @"obj2" }]

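There is definitely some @ overload here, but it is a vast improvement over the previous state of things. The @[] syntax creates an array from its contents, all of which must be objects. The @{} syntax creates a dictionary from its contents, written as key : value pairs rather than the absurd value, key order used by the NSDictionary method.

Because the syntax is built into the language, no terminating nil is needed. In fact, using nil anywhere in these literals throws an error at runtime, since Cocoa collections refuse to contain nil. As always, use [NSNull null] to represent nil in a collection.

There is no equivalent syntax for NSSet. The array literal syntax makes the job a little nicer, since you can write something like [NSSet setWithArray: @[ contents ]], but that is not quite as concise as a true literal.

Everything you put into such an array or dictionary still has to be an object. You can't fill an object array with numbers by writing @[ 1, 2, 3 ]. However, this becomes much easier with the introduction of…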

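Boxed Expressions

Boxed expressions essentially provide literals for primitive types. The syntax is @(contents), which produces an object boxing the result of the expression within the parentheses.

The type of the object depends on the type of the expression. Numeric types are converted to NSNumber objects. For example, @(3) produces an NSNumber containing 3, just as if you had written [NSNumber numberWithInt: 3]. C strings are converted to NSString objects using the UTF-8 encoding, so @("stuff goes here") produces an NSString with those contents.

These can contain arbitrary expressions, not just constants, so they go beyond simple literals. For example, @(sqrt(2)) produces an NSNumber containing the square root of 2. The expression @(getenv("FOO")) is equivalent to [NSString stringWithUTF8String: getenv("FOO")].

As a shortcut, number literals can be boxed without parentheses. Rather than @(3), you can just write @3. Applied to strings, this gives us the familiar and ancient construct @"object string". Note that expressions do not work this way: @2+2 and @sqrt(2) produce an error and must be parenthesized as @(2+2) and @(sqrt(2)).

Using this, we can easily create an object array containing numbers: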

@[ @1, @2, @3 ]

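Once again there is a bit of @ overload, but it is far more elegant than the equivalent without the new syntax.

Note that boxed expressions work only for numeric types and char *, not for other pointers or structures. To box your NSRects or SELs, you still have to resort to longhand.

Object Subscripting

But wait, there's more! There is now concise syntax for getting and setting the elements of arrays and dictionaries. This isn't strictly part of object literals, but it arrived in clang at about the same time and continues the theme of making containers easier to work with.

The familiar [] syntax for array access now works for NSArray objects as well: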

int carray[] = { 12, 99, 42 };
NSArray *nsarray = @[ @12, @99, @42 ];
carray[1]; // 99
nsarray[1]; // @99

It also works for setting elements in mutable arrays:

NSMutableArray *nsarray = [@[ @12, @99, @42 ] mutableCopy];
nsarray[1] = @33; // now contains 12, 33, 42

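Note, however, that this can only replace existing elements, not add new ones. If the index is beyond the end of the array, the array does not grow to match; it throws an error instead.

Dictionaries work the same way, except that the subscript is an object key rather than an index. Since dictionaries have no indexing restrictions, subscripting also works for setting new entries: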

NSMutableDictionary *dict = [NSMutableDictionary dictionary];
dict[@"suspect"] = @"Colonel Mustard";
dict[@"weapon"] = @"Candlestick";
dict[@"room"] = @"Library";
dict[@"weapon"]; // Candlestick

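As with literals, there is no equivalent subscripting syntax for NSSet, probably because subscripting a set doesn't make much sense.

Custom Subscripting Methods

In a really cool move, the clang developers made the object subscripting operators completely generic. They aren't tied to NSArray or NSDictionary in any way. They simply translate to plain methods that any class can implement.

There are four methods in total: a setter and a getter for integer subscripts, and a setter and a getter for object subscripts. The integer subscript getter has this prototype: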

- (id)objectAtIndexedSubscript: (NSUInteger)index;

You can then implement it to support whatever semantics you want. The code is simply translated mechanically:

NSLog(@"%@", yourobj[99]);
// becomes
NSLog(@"%@", [yourobj objectAtIndexedSubscript: 99]);

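Your code can fetch the element at that index from an internal array, build a new object based on the index, log an error, abort(), start a game of pong, or do anything else you want.

The corresponding setter has this prototype: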

- (void)setObject: (id)obj atIndexedSubscript: (NSUInteger)index;

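You receive the index and the object being set, and you do whatever is needed with them to implement the semantics you want. Again, this is just a simple mechanical translation: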

yourobj[12] = @"hello";
// becomes
[yourobj setObject: @"hello" atIndexedSubscript: 12];

The two methods for object subscripts are similar. Their prototypes are:

- (id)objectForKeyedSubscript: (id)key;
- (void)setObject: (id)obj forKeyedSubscript: (id)key;

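All four methods can be implemented on the same class. The compiler decides which one to call by examining the type of the subscript: integer subscripts call the indexed variants, and object subscripts call the keyed variants.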
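
Plain C has no object subscripting, but the idea of choosing a function from the static type of the subscript can be sketched by analogy with C11's _Generic selection. Everything here (struct table, table_get_indexed, table_get_keyed, TABLE_GET) is a hypothetical stand-in for illustration and is not how clang implements Objective-C subscripting.

```c
#include <string.h>

/* Hypothetical container with three slots, each a key/value pair. */
struct table {
    const char *keys[3];
    int values[3];
};

/* Indexed getter: plays the role of objectAtIndexedSubscript:. */
int table_get_indexed(const struct table *t, int index)
{
    return t->values[index];
}

/* Keyed getter: plays the role of objectForKeyedSubscript:.
   Returns -1 when the key is absent. */
int table_get_keyed(const struct table *t, const char *key)
{
    for (int i = 0; i < 3; i++)
        if (strcmp(t->keys[i], key) == 0)
            return t->values[i];
    return -1;
}

/* The static type of sub selects the function at compile time:
   an int picks the indexed getter, a const char * the keyed one. */
#define TABLE_GET(t, sub) _Generic((sub), \
    int: table_get_indexed,               \
    const char *: table_get_keyed)((t), (sub))
```

With a table holding keys "a", "b", "c" and values 12, 99, 42, TABLE_GET(&t, 1) goes through the indexed getter, while TABLE_GET(&t, key) with a const char * key goes through the keyed getter. No check happens at run time; the choice is fixed when the code is compiled.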

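This is effectively a small piece of operator overloading now available in Objective-C, which has traditionally avoided it entirely. As always, use it with care so that your custom implementations stay true to the spirit of the subscripting operator. Don't implement subscripting in order to append objects or send messages across the network. If you restrict it to getting and setting elements of your object, usage of the syntax stays consistent, and code becomes easier to understand without knowing every detail.

Initializers

C has an odd quirk: any initializer of a global variable must be a compile-time constant. This includes simple expressions, but not function calls. For example, the following global variable declaration is legal: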

int x = 2 + 2;

But this is not:

float y = sin(M_PI);

C string literals are compile-time constants, so this is legal:

char *cstring = "hello, world";

NSString literals are also compile-time constants, so the Cocoa equivalent is legal:

NSString *nsstring = @"hello, world";

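It's important to note that none of the new literal syntax qualifies as a compile-time constant. Assuming the array is a global variable, the following is not legal: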

NSArray *array = @[ @"one", @"two" ];

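This is because the @[] syntax is actually translated into a call to an NSArray method. The compiler can't compute the result of that method at compile time, so it is not a legal initializer in this context.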
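
One common way around this restriction in plain C is to compute the value on first use instead of in the initializer. The sketch below assumes a hypothetical compute_length function standing in for any call the compiler cannot evaluate itself; greeting_length is likewise an invented name.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a call that must run at runtime. */
size_t compute_length(void)
{
    return strlen("hello, world");
}

/* A file-scope "size_t length = compute_length();" would be rejected in C,
   because a function call is not a constant expression. */

/* Workaround: compute on first use, then reuse the cached value. */
size_t greeting_length(void)
{
    static bool ready;      /* statics start out as false / 0 */
    static size_t length;

    if (!ready) {
        length = compute_length();
        ready = true;
    }
    return length;
}
```

This version is not thread-safe; it only shows the shape of the workaround, which is the same lazy pattern the fortytwo function near the top of the article uses.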

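It's interesting to explore exactly why this is the case. The compiler lays out global variables in your binary, and they are loaded directly into memory. A global variable initialized with 2 + 2 results in a literal 4 being written into memory. A C string initializer results in the string contents being written into the program's data section, followed by a pointer to those contents being written as the global variable's value.

Note that C++, and therefore Objective-C++, does allow non-constant initializers for global variables. When the C++ compiler encounters such an expression, it packages it into a function and arranges for that function to be called when the binary loads. Because the initializer code runs so early, this can be a bit dangerous, as other code such as NSArray might not be ready yet. In any case, if you have seen a non-constant initializer compile and wondered why, it was probably being compiled as C++.

NSString literals are also compile-time constants, thanks to a tight coupling between the compiler and the libraries. There is a special NSString subclass called NSConstantString with a fixed instance variable layout: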

@interface NSSimpleCString : NSString {
@package
    char *bytes;
    int numBytes;
#if __LP64__
    int _unused;
#endif
}
@end
@interface NSConstantString : NSSimpleCString
@end

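It contains just an isa (inherited from NSObject), a pointer to the bytes, and a length. When such a literal is used as a global variable initializer, the compiler simply writes out the string contents, then writes out this simple object structure, and finally initializes the global variable with a pointer to that structure.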
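
The three steps just described can be imitated in plain C with statically initialized data. This is only a sketch of the idea: struct const_string, dummy_class, hello_object, global_string, and const_string_length_matches are hypothetical names, and the real NSConstantString layout is the one shown in the interface above.

```c
#include <string.h>

/* Illustrative layout only: isa pointer, byte pointer, length. */
struct const_string {
    const void *isa;     /* would point at the class; a dummy here */
    const char *bytes;   /* the string contents, stored in the binary */
    int numBytes;        /* length in bytes, without the terminator */
};

/* Hypothetical stand-in for a class object. */
static const int dummy_class = 0;

/* Every initializer below is a compile-time constant,
   so no code has to run to build the object. */
static const struct const_string hello_object = {
    &dummy_class, "hello, world", 12
};

/* The global is just a pointer to the statically laid-out object. */
const struct const_string *global_string = &hello_object;

/* Checks that the stored length agrees with the stored bytes. */
int const_string_length_matches(const struct const_string *s)
{
    return strlen(s->bytes) == (size_t)s->numBytes;
}
```

Because the structure is written into the binary as plain data, any change to the field order or sizes would invalidate every already-compiled copy of it, which is the compatibility cost discussed in the paragraphs that follow.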

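You may have noticed that NSString literals don't need to be retained and released like other objects (although doing so out of habit is still a good idea). In fact, you can release them as many times as you like and nothing will happen. This is because NSString literals aren't dynamically allocated like most Objective-C objects. Instead, they are allocated statically at compile time as part of your binary, and they live for the lifetime of the process.

This tight coupling has advantages, such as producing legal global variable initializers and requiring no extra code to run at runtime to build the object. However, it also has big disadvantages. The NSConstantString layout is fixed forever. The class must be maintained with exactly that data layout, because the layout is baked into thousands of third-party apps. If Apple changed it, those apps would break, because they contain NSConstantString objects with the old layout.

If NSArray literals were compile-time constants, there would need to be a similar NSConstantArray class with a fixed layout that the compiler could generate, and it would have to be maintained separately from the other NSArray implementations. Such code could not run on older OS releases that lack the NSConstantArray class. The same problem exists for the other classes that the new literals can produce.

This is particularly interesting in the case of NSNumber literals. Lion introduced tagged pointers, which allow an NSNumber's contents to be embedded directly in the pointer, eliminating the need for a separate dynamically allocated object. If the compiler emitted tagged pointers, their format could never change, and compatibility with older OS releases would be lost. If the compiler emitted constant NSNumber objects, NSNumber literals would differ substantially from other NSNumbers, possibly with a significant performance hit.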
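
For readers who have not met tagged pointers before, the general idea can be sketched in a few lines of C. The encoding below is an assumption made up for illustration (low bit as the tag, payload in the bits above it); it is not Apple's actual format, which is private and, as noted above, free to change.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative tag: the lowest bit marks a word as a tagged value. */
#define TAG_BIT ((uintptr_t)1)

/* Packs a small integer into a pointer-sized word, no allocation needed. */
uintptr_t tagged_from_int(uint16_t value)
{
    return ((uintptr_t)value << 1) | TAG_BIT;
}

/* Ordinary objects are aligned, so their addresses have a low bit of 0. */
bool is_tagged(uintptr_t word)
{
    return (word & TAG_BIT) != 0;
}

/* Recovers the payload; only meaningful when is_tagged(word) is true. */
uint16_t tagged_to_int(uintptr_t word)
{
    return (uint16_t)(word >> 1);
}
```

If a compiler baked words like these into binaries, the position of the tag bit and the width of the payload would become part of a permanent binary contract, which is exactly the constraint the article describes.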

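Instead, the compiler simply emits calls into the framework, constructing the objects exactly as you would have done by hand. This costs a little at runtime, but no more than building them yourself without the new syntax, and it makes for a cleaner design.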

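Compatibility

When can we start using this new syntax? Xcode 4.3.3, the latest shipping version, does not yet include these additions. We can reasonably expect the next release, presumably shipping with Mountain Lion, to incorporate the changes in its version of clang.

For OS compatibility, object literals generate code that calls standard Cocoa initializers. The result is indistinguishable from code written by hand.

The compatibility story for subscripting is a little more complex. It requires new methods that don't currently exist in Cocoa. However, the subscripting methods map directly onto existing NSArray and NSDictionary methods, so we can expect a compatibility shim to appear, along the lines of the ARCLite shim that allows ARC to be used on OS releases that predate it. (Translator's note: now that ARC is standard, modern systems no longer need the ARCLite shim.)

Conclusion

The new object literal and subscripting syntax in Objective-C can significantly reduce the verbosity of code that works heavily with arrays and dictionaries. The syntax resembles that of common scripting languages and, aside from a slight surplus of @ symbols, makes code easier to read and write.

That's it for today. Come back next time for another friendly exploration of the world of programming. Friday Q&A is, as always, driven by reader suggestions, so if you have a topic you would like to see covered here, please send it in!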


#Original (English)

Source: https://www.mikeash.com/pyblog/friday-qa-2012-06-22-objective-c-literals.html

Welcome back! After a brief hiatus for WWDC, it’s time for another wacky adventure. Today’s topic is the new object literals syntax being introduced into Objective-C, which was suggested by reader Frank McAuley.

Literals

For anyone unfamiliar with the term in the context of programming languages, a “literal” refers to any value which can be written out directly in source code. For example, 42 is a literal in C (and, of course, a lot of other languages). It’s common to refer to what kind of value they produce, so 42 is an integer literal, "hello" is a string literal, and 'x' is a character literal.

Literals are a foundational building block of most programming languages, since there needs to be some way of writing constant values in code. They aren’t strictly necessary, as you can construct any desired value at runtime, but they generally make code a lot nicer to write. For example, we can construct 42 without using any literals:

int fortytwo(void)
{
    static int zero; // statics are initialized to 0
    static int fortytwo;
    if(!fortytwo)
    {
        int one = ++zero;
        int two = one + one;
        int four = two * two;
        int eight = four * two;
        int thirtytwo = eight * four;
        fortytwo = thirtytwo + eight + two;
    }
    return fortytwo;
}

However, if we had to do this for every integer we used, we’d probably all give up computer programming and go into some profession where the tools don’t hate us so much. Likewise, we could construct C strings by hand out of characters, but strings are used so commonly that the language has a concise way to write them.

Collections are pretty commonly used as well. C originally had no facilities for collection literals, but the ability to initialize variables of a compound data type came pretty close:

int array[] = { 1, 2, 3, 4, 5 };
struct foo f = { 99, "string" };

This isn’t always entirely convenient, and so C99 added compound literals, which allow writing such things directly in code anywhere:

DoWorkOnArray((int[]){ 1, 2, 3, 4, 5 });
DoWorkOnStruct((struct foo){ 99, "string" });

Collection literals are pretty common in other languages too. For example, the popular JSON serialization format is just a codification of JavaScript’s literal syntax. This JSON code is also valid syntax to create an array of dictionaries in JavaScript, Python, and probably some other languages:

[{ "key": "obj" }, { "key": "obj2" }]

Until recently, Objective-C didn’t have any syntax for Objective-C collections. The equivalent to the above was:

[NSArray arrayWithObjects:
    [NSDictionary dictionaryWithObjectsAndKeys:
        @"obj", @"key", nil],
    [NSDictionary dictionaryWithObjectsAndKeys:
        @"obj2", @"key", nil],
    nil];

This is really verbose, to the extent that it’s painful to type and obscures what’s going on. The limitations of C variable argument passing also require a nil sentinel value at the end of each container creation call, which can fail in extremely odd ways when forgotten. All in all, not a good situation.

Container Literals

The latest clang now has support for container literals in Objective-C. The syntax is similar to that of JSON and modern scripting languages, but with the traditional Objective-C @ thrown in. Our example array/dictionary looks like this:

@[@{ @"key" : @"obj" }, @{ @"key" : @"obj2" }]

There’s definitely a bit of @ overload happening here, but it’s a vast improvement over the previous state of things. The @[] syntax creates an array from the contents, which must all be objects. The @{} syntax creates a dictionary from the contents, which are written as key : value instead of the completely ludicrous value, key syntax found in the NSDictionary method.

Because it’s built into the language, there’s no need for a terminating nil. In fact, using nil anywhere in these literals will throw an error at runtime, since Cocoa collections refuse to contain nil. As always, use [NSNull null] to represent nil in collections.

There is no equivalent syntax for NSSet. The array literal syntax makes the job a bit nicer, since you can do something like [NSSet setWithArray: @[ contents ]], but there’s nothing quite like the concise literal syntax.

Everything you put into such an array or dictionary still has to be an object. You can’t fill out an object array with numbers by writing @[ 1, 2, 3 ]. However, this is made much easier by the introduction of…

Boxed Expressions

Boxed expressions essentially allow for literals corresponding to primitive types. The syntax is @(contents), which produces an object boxing the result of the expression within the parentheses.

The type of object depends on the type of the expression. Numeric types are converted to NSNumber objects. For example, @(3) produces an NSNumber containing 3, just like if you wrote [NSNumber numberWithInt: 3]. C strings are converted to NSString objects using the UTF-8 encoding, so @("stuff goes here") produces an NSString with those contents.

These can contain arbitrary expressions, not just constants, so they go beyond simple literals. For example, @(sqrt(2)) will produce an NSNumber containing the square root of 2. The expression @(getenv("FOO")) is equivalent to [NSString stringWithUTF8String: getenv("FOO")].

As a shortcut, number literals can be boxed without using the parentheses. Rather than @(3), you can just write @3. Applied to strings, this gives us the familiar and ancient construct @"object string". Note that expressions do not work like this. @2+2 and @sqrt(2) will produce an error, and must be parenthesized as @(2+2) and @(sqrt(2)).

Using this, we can easily create an object array containing numbers:

@[ @1, @2, @3 ]

Once again, a bit of @ overload, but much nicer than the equivalent without the new syntax.

Note that boxed expressions only work for numeric types and char *, and don’t work with other pointers or structures. You still have to resort to longhand to box up your NSRects or SELs.

Object Subscripting

But wait, there’s more! There’s now concise syntax for fetching and setting the elements of an array and dictionary. This isn’t strictly related to object literals, but arrived in clang at the same time, and continues the theme of making it easier to work with containers.

The familiar [] syntax for array access now works for NSArray objects as well:

int carray[] = { 12, 99, 42 };
NSArray *nsarray = @[ @12, @99, @42 ];
carray[1]; // 99
nsarray[1]; // @99

It works for setting elements in mutable arrays as well:

NSMutableArray *nsarray = [@[ @12, @99, @42 ] mutableCopy];
nsarray[1] = @33; // now contains 12, 33, 42

Note, however, that it’s not possible to add elements to an array this way, only replace existing elements. If the array index is beyond the end of the array, the array will not grow to match, and instead it throws an error.

It works the same for dictionaries, except the subscript is an object key instead of an index. Since dictionaries don’t have any indexing restrictions, it also works for setting new entries:

NSMutableDictionary *dict = [NSMutableDictionary dictionary];
dict[@"suspect"] = @"Colonel Mustard";
dict[@"weapon"] = @"Candlestick";
dict[@"room"] = @"Library";
dict[@"weapon"]; // Candlestick

As with literals, there is no equivalent notation for NSSet, probably because it doesn’t make much sense to subscript sets.

Custom Subscripting Methods

In a really cool move, the clang developers made the object subscripting operators completely generic. They’re not actually tied into NSArray or NSDictionary in any way. They simply translate to simple methods which any class can implement.

There are four methods in total: one setter and one getter for integer subscripts, and one setter/getter for object subscripts. The integer subscript getter has this prototype:

- (id)objectAtIndexedSubscript: (NSUInteger)index;

You can then implement this to do whatever you want to support the semantics you want. The code simply gets translated mechanically:

NSLog(@"%@", yourobj[99]);
// becomes
NSLog(@"%@", [yourobj objectAtIndexedSubscript: 99]);

Your code can fetch the index from an internal array, build a new object based on the index, log an error, abort(), start a game of pong, or whatever you want.

The corresponding setter has this prototype:

- (void)setObject: (id)obj atIndexedSubscript: (NSUInteger)index;

You get the index and the object that’s being set there, and then you do whatever you need to do with them to implement the semantics you want. Again, this is just a simple mechanical translation:

yourobj[12] = @"hello";
// becomes
[yourobj setObject: @"hello" atIndexedSubscript: 12];

The two methods for object subscripts are similar. Their prototypes are:

- (id)objectForKeyedSubscript: (id)key;
- (void)setObject: (id)obj forKeyedSubscript: (id)key;

It’s possible to implement all four methods on the same class. The compiler decides which one to call by examining the type of the subscript. Integer subscripts call the indexed variants, and objects call the keyed variants.

This is actually a small chunk of operator overloading now available in Objective-C, which traditionally has completely avoided it. As always, be careful with it to ensure that your custom implementations remain true to the spirit of the subscripting operator. Don’t implement the subscripting syntax to append objects or send messages across the network. If you keep it restricted to fetching and setting elements of your object, the usage of the syntax remains consistent and you can more easily understand what code is doing without needing to know all the details.

Initializers

C has an odd quirk in that any initializer of a global variable must be a compile-time constant. This includes simple expressions, but not function calls. For example, the following global variable declaration is legal:

int x = 2 + 2;

But this is not:

float y = sin(M_PI);

C string literals are compile-time constants, so this is legal:

char *cstring = "hello, world";

NSString literals are also compile-time constants, so the Cocoa equivalent is legal:

NSString *nsstring = @"hello, world";

It’s important to note that none of the new literal syntax qualifies as a compile-time constant. Assuming that the array is a global variable, the following is not legal:

NSArray *array = @[ @"one", @"two" ];

This is because the @[] syntax literally translates into a call to an NSArray method. The compiler can’t compute the result of that method at compile time, so it’s not a legal initializer in this context.

It’s interesting to explore exactly why this would be the case. The compiler lays out global variables in your binary, and they are loaded directly into memory. A global variable initialized with 2 + 2 results in a literal 4 being written into memory. A C string initializer results in the string contents being written out in the program’s data, and then a pointer to those contents being written out as the global variable’s value.

Note that C++, and therefore Objective-C++, does allow non-constant initializers for global variables. When the C++ compiler encounters such an expression, it packages it into a function and arranges for that function to be called when the binary loads. Because the initializer code runs so early, it can be a bit dangerous to use, as other code like NSArray might not be ready to go yet. In any case, if you’ve seen a non-constant initializer compile and are wondering why, it was probably being compiled as C++.

NSString literals are also compile-time constants, because of a tight coupling between the compiler and the libraries. There’s a special NSString subclass called NSConstantString with a fixed ivar layout:

@interface NSSimpleCString : NSString {
@package
    char *bytes;
    int numBytes;
#if __LP64__
    int _unused;
#endif
}
@end
@interface NSConstantString : NSSimpleCString
@end

It just contains an isa (inherited from NSObject), a pointer to bytes, and a length. When such a literal is used as a global variable initializer, the compiler simply writes out the string contents, then writes out this simple object structure, and finally initializes the global variable with a pointer to that structure.

You may have noticed that you don’t need to retain and release NSString literals like you do other objects (although it’s still a good idea to do so just out of habit). In fact, you can release them as many times as you want and it won’t do anything. This is because NSString literals aren’t dynamically allocated like most Objective-C objects. Instead, they’re allocated at compile time as a part of your binary, and live for the lifetime of your process.

This tight coupling has advantages, like producing legal global variable initializers, and requiring no extra code to run to build the object at runtime. However, there are big disadvantages as well. The NSConstantString layout is set forever. That class must be maintained with exactly that data layout, because that data layout is baked into thousands of third-party apps. If Apple changed the layout, those third-party apps would break, because they contain NSConstantString objects with the old layout.

If NSArray literals were compile-time constants, there would need to be a similar NSConstantArray class with a fixed layout that the compiler could generate, and that would have to be maintained separately from other NSArray implementations. Such code could not run on older OSes which didn’t have this NSConstantArray class. The same problem exists for the other classes that the new literals can produce.

This is particularly interesting in the case of NSNumber literals. Lion introduced tagged pointers, which allow an NSNumber’s contents to be embedded directly in the pointer, eliminating the need for a separate dynamically-allocated object. If the compiler emitted tagged pointers, their format could never change, and compatibility with old OS releases would be lost. If the compiler emitted constant NSNumber objects, then NSNumber literals would be substantially different from other NSNumbers, with a possible significant performance hit.

Instead, the compiler simply emits calls into the framework, constructing the objects exactly like you would have done manually. This results in a bit of a runtime hit, but no worse than building them yourself without the new syntax, and makes for a much cleaner design.

Compatibility

When can we start using this new syntax? Xcode 4.3.3 is the latest shipping version and does not yet include these additions. We can reasonably expect that the next release, presumably coming with Mountain Lion, will incorporate these changes in its version of clang.

For OS compatibility, the literals simply generate code that calls standard Cocoa initializers. The result is indistinguishable from writing the code by hand.

The story for subscripting is a bit more complex. These require new methods that don’t exist in Cocoa at the moment. However, the subscripting methods map directly to existing NSArray and NSDictionary methods, so we can expect a compatibility shim to be made available along the lines of the ARCLite shim that allows using ARC on OSes that predate it.

Conclusion

The new object literals and subscripting syntax in Objective-C can significantly reduce the verbosity of code that deals heavily with arrays and dictionaries. The syntax is similar to that found in common scripting languages, and makes code much easier to read and write, aside from a minor surplus of @ symbols.

That’s it for today. Come back next time for another friendly exploration of the world of programming. Friday Q&A is as always driven by reader suggestions, so until then, if you have a topic that you’d like to see covered here, please send it in!