How NSString Compares Strings for Equality

2017-03-23, by 九昍

Copyright notice: this article comes from 九昍 on Jianshu (简书). Reposting is welcome, but please be sure to credit the source: http://www.jianshu.com/p/abb3b69172b9

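In day-to-day development I use [NSString isEqualToString:] all the time, yet I never knew how it is actually implemented, so I decided to read the system source code and understand its internals.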

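According to Apple's official guide, "Concepts in Objective-C Programming: Introspection", if you override isEqual: you must also override hash. That led me to assume that isEqual: first compares the two hash values internally and only then compares the strings. That turns out not to be the case.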

Reading the source of static Boolean __CFStringEqual(CFTypeRef cf1, CFTypeRef cf2) tells us the following:

static Boolean __CFStringEqual(CFTypeRef cf1, CFTypeRef cf2) {
    CFStringRef str1 = (CFStringRef)cf1;
    CFStringRef str2 = (CFStringRef)cf2;
    const uint8_t *contents1;
    const uint8_t *contents2;
    CFIndex len1;

    contents1 = (uint8_t *)__CFStrContents(str1);
    contents2 = (uint8_t *)__CFStrContents(str2);
    len1 = __CFStrLength2(str1, contents1);

    // First, check whether the lengths are equal
    if (len1 != __CFStrLength2(str2, contents2)) return false;

    contents1 += __CFStrSkipAnyLengthByte(str1);
    contents2 += __CFStrSkipAnyLengthByte(str2);

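    // Next, branch on whether each string is stored as 8-bit or as Unicode (UniChar). Two 8-bit strings are compared with memcmp; every other case compares character by character.
    if (__CFStrIsEightBit(str1) && __CFStrIsEightBit(str2)) { // neither string is Unicode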
        return memcmp((const char *)contents1, (const char *)contents2, len1) ? false : true;
    } else if (__CFStrIsEightBit(str1)) {   /* One string has Unicode contents */
        CFStringInlineBuffer buf;
        CFIndex buf_idx = 0;

        CFStringInitInlineBuffer(str1, &buf, CFRangeMake(0, len1));
        for (buf_idx = 0; buf_idx < len1; buf_idx++) {
            if (__CFStringGetCharacterFromInlineBufferQuick(&buf, buf_idx) != ((UniChar *)contents2)[buf_idx]) return false;
        }
    } else if (__CFStrIsEightBit(str2)) {   /* One string has Unicode contents */
        CFStringInlineBuffer buf;
        CFIndex buf_idx = 0;

        CFStringInitInlineBuffer(str2, &buf, CFRangeMake(0, len1));
        for (buf_idx = 0; buf_idx < len1; buf_idx++) {
            if (__CFStringGetCharacterFromInlineBufferQuick(&buf, buf_idx) != ((UniChar *)contents1)[buf_idx]) return false;
        }
    } else {                    /* Both strings have Unicode contents */
        CFIndex idx;
        for (idx = 0; idx < len1; idx++) {
            if (((UniChar *)contents1)[idx] != ((UniChar *)contents2)[idx]) return false;
        }
    }
    return true;
}
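
The strategy in the listing above (compare lengths first, then compare the contents according to how each string is stored) can be sketched in plain C. This is a simplified illustration and not the CoreFoundation code: SketchString and both helper functions are hypothetical names, 8-bit units are assumed to widen directly to 16-bit characters, and two strings of the same width are compared with memcmp rather than with an explicit loop.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified string: contents are either 8-bit or 16-bit units. */
typedef struct {
    bool is_eight_bit;     /* true: uint8_t units, false: uint16_t units */
    size_t length;         /* number of characters, not bytes */
    const void *contents;
} SketchString;

/* Read character i, widening an 8-bit unit to 16 bits (assumption of this sketch). */
static uint16_t sketch_char_at(const SketchString *s, size_t i) {
    if (s->is_eight_bit) return ((const uint8_t *)s->contents)[i];
    return ((const uint16_t *)s->contents)[i];
}

static bool sketch_string_equal(const SketchString *a, const SketchString *b) {
    /* 1. Strings of different lengths can never be equal. */
    if (a->length != b->length) return false;

    /* 2. Same storage width: one memcmp over the raw bytes is enough. */
    if (a->is_eight_bit == b->is_eight_bit) {
        size_t unit = a->is_eight_bit ? sizeof(uint8_t) : sizeof(uint16_t);
        return memcmp(a->contents, b->contents, a->length * unit) == 0;
    }

    /* 3. Mixed widths: compare character by character. */
    for (size_t i = 0; i < a->length; i++) {
        if (sketch_char_at(a, i) != sketch_char_at(b, i)) return false;
    }
    return true;
}
```

Note that no hash value is consulted anywhere: the length check is the only shortcut taken before the contents are compared.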

Since the hash method is not there to make isEqual: faster, what is it for?
Let's look at how isEqual: and hash are implemented:

Implementation of [NSObject hash]

uintptr_t _objc_rootHash(id obj)
{
    return (uintptr_t)obj;
}

- (NSUInteger)hash {
    return _objc_rootHash(self);
}

Implementation of [NSObject isEqual:]

- (BOOL)isEqual:(id)obj {
    return obj == self;
}

As you can see, for NSObject both hash and isEqual: are based on the object's address, so for NSObject, two objects with the same hash value are always isEqual:.
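
The same identity-based behaviour can be written as a minimal C sketch. SketchObject and both function names are hypothetical; the point is only that two distinct objects with identical contents are unequal and hash differently, because nothing but the address is examined.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical object type; its contents play no part in hashing or equality. */
typedef struct {
    int value;
} SketchObject;

/* Identity hash: the address itself, as in _objc_rootHash. */
static uintptr_t sketch_root_hash(const SketchObject *obj) {
    return (uintptr_t)obj;
}

/* Identity equality: true only for the very same object. */
static bool sketch_root_is_equal(const SketchObject *a, const SketchObject *b) {
    return a == b;
}
```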

What should we do if we need a custom isEqual:? First, look at the implementation in Apple's official demo:

- (BOOL)isEqual:(id)other {
    if (other == self)
        return YES;
    if (!other || ![other isKindOfClass:[self class]])
        return NO;
    return [self isEqualToWidget:other];
}
 
- (BOOL)isEqualToWidget:(MyWidget *)aWidget {
    if (self == aWidget)
        return YES;
    if (![(id)[self name] isEqual:[aWidget name]])
        return NO;
    if (![[self data] isEqualToData:[aWidget data]])
        return NO;
    return YES;
}

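The official recommendation is to implement a type-specific isEqualToType: method (isEqualToWidget: in this example) and call it from isEqual:, while isEqual: itself checks the object's type and validity.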

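Once isEqual: has been overridden, leaving hash unchanged causes a problem. Two distinct MyWidget objects with the same name and data still take their hash from their own addresses, so their hashes differ, which violates the rule that equal objects must have equal hashes. The Model class below overrides both methods and logs each call, so we can see when they are invoked.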

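// NSUINT_BIT and NSUINTROTATE are not Foundation macros. The definitions below are an assumed, commonly used form (a plain bit rotation).
#include <limits.h>
#define NSUINT_BIT (CHAR_BIT * sizeof(NSUInteger))
#define NSUINTROTATE(val, howmuch) ((((NSUInteger)(val)) << (howmuch)) | (((NSUInteger)(val)) >> (NSUINT_BIT - (howmuch))))

@implementation Model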

- (id)copyWithZone:(nullable NSZone *)zone {
    
    Model *result = [[[self class] allocWithZone:zone] init];
    
    result.firstName = self.firstName;
    result.lastName  = self.lastName;
    
    return result;
}

- (NSUInteger)hash {
    NSLog(@"%@ call %s", self, __func__);
    return NSUINTROTATE([_firstName hash], NSUINT_BIT / 2) ^ [_lastName hash];
}

- (BOOL)isEqual:(id)other {
    NSLog(@"%@ call %s", self, __func__);
    if (other == self)
        return YES;
    if (!other || ![other isKindOfClass:[self class]])
        return NO;
    return [self isEqualToModel:other];
}

- (BOOL)isEqualToModel:(Model *)other {
    if (self == other)
        return YES;
    if (![(id)[self firstName] isEqual:[other firstName]])
        return NO;
    if (![[self lastName] isEqual:[other lastName]])
        return NO;
    return YES;
}
@end
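
The hash method above combines two field hashes by rotating one of them by half the word width and XOR-ing it with the other. Here is a rough C sketch of that combination, assuming NSUINTROTATE performs a plain bit rotation and NSUINT_BIT is the bit width of NSUInteger. All names are hypothetical, and the djb2-style string hash only stands in for -[NSString hash], whose real algorithm is different.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_WORD_BITS ((unsigned)(CHAR_BIT * sizeof(uintptr_t)))

/* Rotate left by n bits; n must be between 1 and SKETCH_WORD_BITS - 1. */
static uintptr_t sketch_rotate_left(uintptr_t value, unsigned n) {
    return (value << n) | (value >> (SKETCH_WORD_BITS - n));
}

/* Stand-in string hash (djb2 style), based only on the characters. */
static uintptr_t sketch_string_hash(const char *s) {
    uintptr_t h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Combine the two field hashes: rotate the first, then XOR with the second. */
static uintptr_t sketch_model_hash(const char *first_name, const char *last_name) {
    return sketch_rotate_left(sketch_string_hash(first_name), SKETCH_WORD_BITS / 2)
         ^ sketch_string_hash(last_name);
}
```

Rotating before the XOR makes the combination order-sensitive: without it, swapping the first and last name would give the same hash, and two identical fields would cancel out to 0.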

-----------------------------------------------------------------
@implementation ViewController

- (void)viewDidLoad {
    [super viewDidLoad];
    Model *model1 = [[Model alloc] init];
    model1.firstName = @"A";
    model1.lastName = @"B";
    Model *model2 = [[Model alloc] init];
    model2.firstName = @"A";
    model2.lastName = @"B";
    
    NSMutableDictionary *mutDict = [NSMutableDictionary dictionary];
    
    [mutDict setObject:@"" forKey:model1];
    
    NSLog(@"**********************************");
    [mutDict setObject:@"" forKey:model2];
}

@end

Running the program prints the following log:

2016-12-19 16:03:27.236 test[28108:2027961] <Model: 0x618000024bc0> call -[Model hash]
2016-12-19 16:03:27.236 test[28108:2027961] <Model: 0x610000024860> call -[Model hash]
2016-12-19 16:03:27.236 test[28108:2027961] **********************************
2016-12-19 16:03:27.237 test[28108:2027961] <Model: 0x618000024c60> call -[Model hash]
2016-12-19 16:03:27.237 test[28108:2027961] <Model: 0x610000024860> call -[Model isEqual:]

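First, why does the first setObject:forKey: call hash twice? The first call obtains the hash value so that the dictionary can decide where the key belongs. The second call happens because we did not specify a capacity for the dictionary, so its capacity starts at 0. When we add an element there is not enough room, and the dictionary allocates new storage. It then has to use hash again to determine the object's position in the newly allocated space, hence the second call. The second call is logged on a different address because the dictionary stores a copy of the key, which is also why Model implements copyWithZone:.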

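The log shows what happens when data is stored in hash-table-based collections (NSDictionary, NSSet, and so on): if the hashes match, the collection calls the object's isEqual: method to decide whether the object is already present. If we override only isEqual:, two objects that are isEqual: may have different hash values and be wrongly treated as two different objects. This is the most important reason why isEqual: and hash must be overridden together.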
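
The behaviour seen in the log can be reproduced with a small hash set written in C. This is only a sketch under stated assumptions, not how NSDictionary or NSSet are implemented: every name is hypothetical, the table has a fixed number of buckets and never grows, and keys are stored by pointer instead of being copied. The set picks a bucket from the hash and evaluates equality only against elements already in that bucket.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BUCKETS 8
#define SKETCH_BUCKET_CAPACITY 4

/* Hypothetical model type, standing in for the Model class. */
typedef struct {
    const char *first_name;
    const char *last_name;
} SketchModel;

typedef uintptr_t (*SketchHashFn)(const SketchModel *);

typedef struct {
    const SketchModel *slots[SKETCH_BUCKETS][SKETCH_BUCKET_CAPACITY];
    size_t counts[SKETCH_BUCKETS];
    SketchHashFn hash;      /* hash function used to pick a bucket */
    size_t equal_calls;     /* how many times equality was evaluated */
} SketchSet;

/* Content-based equality, like the overridden isEqual:. */
static bool sketch_model_equal(const SketchModel *a, const SketchModel *b) {
    return strcmp(a->first_name, b->first_name) == 0
        && strcmp(a->last_name, b->last_name) == 0;
}

/* Content-based hash: equal models always get equal hashes. */
static uintptr_t sketch_content_hash(const SketchModel *m) {
    uintptr_t h = 5381;
    for (const char *p = m->first_name; *p; p++) h = h * 33 + (unsigned char)*p;
    for (const char *p = m->last_name; *p; p++) h = h * 33 + (unsigned char)*p;
    return h;
}

/* Address-based hash, like the NSObject default. Dividing by the struct size
   gives neighbouring array elements consecutive values, so alignment does not
   push them all into the same bucket. */
static uintptr_t sketch_identity_hash(const SketchModel *m) {
    return (uintptr_t)m / sizeof(SketchModel);
}

/* Adds m unless an equal element is already stored. Returns true if inserted. */
static bool sketch_set_add(SketchSet *set, const SketchModel *m) {
    size_t bucket = set->hash(m) % SKETCH_BUCKETS;
    /* Equality is evaluated only against elements in the same bucket. */
    for (size_t i = 0; i < set->counts[bucket]; i++) {
        set->equal_calls++;
        if (sketch_model_equal(set->slots[bucket][i], m)) return false;
    }
    if (set->counts[bucket] == SKETCH_BUCKET_CAPACITY) return false; /* sketch: no growth */
    set->slots[bucket][set->counts[bucket]++] = m;
    return true;
}

/* Total number of stored elements across all buckets. */
static size_t sketch_set_count(const SketchSet *set) {
    size_t total = 0;
    for (size_t b = 0; b < SKETCH_BUCKETS; b++) total += set->counts[b];
    return total;
}
```

With sketch_content_hash, adding two equal models stored next to each other in an array leaves one element in the set and evaluates equality once, on the second insert. With sketch_identity_hash, the same two models land in different buckets, equality is never evaluated, and both are stored. That is exactly the failure that overriding isEqual: without hash produces.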
