NSLog 和 print 源码阅读和捕获方案

2020-03-26 本文已影响0人 madaoCN

前言

NSLog 作为 iOS开发常用的调试和日志打印方法，大家都是很熟悉了，
开源社区也为我们贡献了很多非常优秀的日志框架，比如OC中大名鼎鼎的CocoaLumberjack，有兴趣的同学可以移步https://github.com/CocoaLumberjack/CocoaLumberjack
在Swift语言下我们还有另外一种选择,那就是print

如果要自己做日志监控的话，就需要就需要自己重定向NSLog和 print方法了

网络上大概有如下几种方法

asl读取日志（iOS10以后已经弃用）
NSLog默认的输出到了系统的 /var/log/syslog这个文件，asl框架能从syslog中读取日志，此种方法对系统无侵入，可惜iOS10后已经获取不到日志
采用dup2的重定向方式
NSLog最后重定向的句柄是STDERR，NSLog输出的日志内容，最终都通过STDERR句柄来记录，使用dup2重定向STDERR句柄，可以将内容重定向指定的位置，但是重定向之后
采用fishhook方式
采用facebook的开源框架 fishhook来动态替换NSLog和print方法

本文选择基于fishhook的方式捕获 NSLog 和 print 方法
福利：https://github.com/madaoCN/Supervisor已实现轻量级的日志打印，使用fishhook hook了 NSLog 和 print方法

前期准备

fishhook原理
网上有很多写的很好的文章，这里就不献丑啦，大家自行百度/谷歌哈
编译swift源码
swift已经开源了，我们可以阅读源码来一窥究竟
项目地址 https://github.com/apple/swift
编译也很简单，就是比较耗时

brew install cmake ninja
mkdir swift-source
cd swift-source
git clone https://github.com/apple/swift.git
./swift/utils/update-checkout --clone
cd swift
utils/build-script --release-debuginfo

image.png

最终的编译文件可能会比较大，我的大概是 44.18GB, 编译前请预留足够的空间

搞定了的话，黑喂狗~

NSLog源码阅读

NSLog的源码位置在 你的编译工程目录/Swift-Build/swift-corelibs-foundation下

/* Output from NSLogv is serialized, in that only one thread in a process can be doing 
 * the writing/logging described above at a time. All attempts at writing/logging a 
 * message complete before the next thread can begin its attempts. The format specification 
 * allowed by these functions is that which is understood by NSString's formatting capabilities.
 * CFLog1() uses writev/fprintf to write to stderr. Both these functions ensure atomic writes.
 */

public func NSLogv(_ format: String, _ args: CVaListPointer) {
    let message = NSString(format: format, arguments: args)
    CFLog1(CFLogLevel(kCFLogLevelWarning), message._cfObject)
}

public func NSLog(_ format: String, _ args: CVarArg...) {
    withVaList(args) { 
        NSLogv(format, $0) 
    }
}

我们可以看到NSLog调用

NSLog ------> NSLogv ------> CFLog1

结合源码中的注释CFLog1() uses writev/fprintf to write to stderr基本可以猜到NSLog最终会调用writev 和 fprintf方法，接下来我们顺腾摸瓜看下 CFLog1的逻辑

void CFLog1(CFLogLevel lev, CFStringRef message) {
#if TARGET_OS_ANDROID
    if (message == NULL) message = CFSTR("NULL");

    ...

    CFStringEncoding encoding = kCFStringEncodingUTF8;
    CFIndex maxLength = CFStringGetMaximumSizeForEncoding(CFStringGetLength(message), encoding) + 1;

    ...

    if (maxLength == 1) {
        // was crashing with zero length strings
        // https://bugs.swift.org/browse/SR-2666
        strcpy(buffer, " "); // log empty string
    }
    else
        CFStringGetCString(message, buffer, maxLength, encoding);
    
        __android_log_print(priority, tag, "%s", buffer);
        // ======= 注意这里 =======
        fprintf(stderr, "%s\n", buffer);
    
    if (buffer != &stack_buffer[0]) free(buffer);
#else
    // ======= 注意这里 =======
    CFLog(lev, CFSTR("%@"), message);
#endif
}

可以看到如果是安卓环境下，会调用 fprintf, 否则会调用 CFLog方法

  NSLog 
    ↓
  NSLogv
    ↓
  CFLog1
    ↓
  CFLog

现在调用的顺序是这样滴，接下来往下走, 我们看看CFLog

void CFLog(int32_t lev, CFStringRef format, ...) {
    va_list args;
    va_start(args, format); 
    _CFLogvEx3(NULL, NULL, NULL, NULL, lev, format, args, __builtin_return_address(0));
    va_end(args);
}

// 调用了_CFLogvEx3
CF_EXPORT void _CFLogvEx3(CFLogFunc logit, CFStringRef (*copyDescFunc)(void *, const void *), CFStringRef (*contextDescFunc)(void *, const void *, const void *, bool, bool *), CFDictionaryRef formatOptions, int32_t lev, CFStringRef format, va_list args, void *addr) {
    _CFLogvEx2Predicate(logit, copyDescFunc, contextDescFunc, formatOptions, lev, format, args, _cf_logging_style_legacy);
    
}

// 调用了_CFLogvEx2Predicate
static void _CFLogvEx2Predicate(CFLogFunc logit, CFStringRef (*copyDescFunc)(void *, const void *), CFStringRef (*contextDescFunc)(void *, const void *, const void *, bool, bool *), CFDictionaryRef formatOptions, int32_t lev, CFStringRef format, va_list args, _cf_logging_style loggingStyle) {
#if TARGET_OS_MAC
    uintptr_t val = (uintptr_t)_CFGetTSD(__CFTSDKeyIsInCFLog);
    if (3 < val) return; // allow up to 4 nested invocations
    _CFSetTSD(__CFTSDKeyIsInCFLog, (void *)(val + 1), NULL);
#endif
    CFStringRef str = format ? _CFStringCreateWithFormatAndArgumentsAux2(kCFAllocatorSystemDefault, copyDescFunc, contextDescFunc, formatOptions, (CFStringRef)format, args) : 0;
    CFIndex blen = str ? CFStringGetMaximumSizeForEncoding(CFStringGetLength(str), kCFStringEncodingUTF8) + 1 : 0;
    char *buf = str ? (char *)malloc(blen) : 0;
    if (str && buf) {
    Boolean converted = CFStringGetCString(str, buf, blen, kCFStringEncodingUTF8);
    size_t len = strlen(buf);
    // silently ignore 0-length or really large messages, and levels outside the valid range
    if (converted && !(len <= 0 || (1 << 24) < len) && !(lev < ASL_LEVEL_EMERG || ASL_LEVEL_DEBUG < lev)) {
            if (logit) {
                logit(lev, buf, len, 1);
            }
            else if (loggingStyle == _cf_logging_style_os_log) {
                // ======= 注意这里 =======
                __CFLogCString(lev, buf, len, 1);
            }
            else if (loggingStyle == _cf_logging_style_legacy) {
                // ======= 注意这里 =======
                __CFLogCStringLegacy(lev, buf, len, 1);
            }
    }
    }
    if (buf) free(buf);
    if (str) CFRelease(str);
#if TARGET_OS_MAC
    _CFSetTSD(__CFTSDKeyIsInCFLog, (void *)val, NULL);
#endif
}

会调用到 __CFLogCString 和 __CFLogCStringLegacy这两个方法，那么现在调用的流程是这样

                      NSLog 
                        ↓
                      NSLogv
                        ↓
                      CFLog1
                        ↓
                      CFLog
                        ↓             
                   _CFLogvEx3
                        ↓             
               _CFLogvEx2Predicate
                        |
                       / \
                     /     \
                   /         \
          __CFLogCString   __CFLogCStringLegacy

继续阅读源码__CFLogCString 和 __CFLogCStringLegacy这两个方法最终都调用了_logToStderr方法

static void _logToStderr(char *banner, const char *message, size_t length) {
#if TARGET_OS_MAC
    struct iovec v[3];
    v[0].iov_base = banner;
    v[0].iov_len = banner ? strlen(banner) : 0;
    v[1].iov_base = (char *)message;
    v[1].iov_len = length;
    v[2].iov_base = "\n";
    v[2].iov_len = (message[length - 1] != '\n') ? 1 : 0;
    int nv = (v[0].iov_base ? 1 : 0) + 1 + (v[2].iov_len ? 1 : 0);
    static CFLock_t lock = CFLockInit;
    __CFLock(&lock);

     // ======= 注意这里 =======

    writev(STDERR_FILENO, v[0].iov_base ? v : v + 1, nv);
    __CFUnlock(&lock);
#elif TARGET_OS_WIN32
    size_t bannerLen = strlen(banner);
    size_t bufLen = bannerLen + length + 1;
    char *buf = (char *)malloc(sizeof(char) * bufLen);
    if (banner) {
        // Copy the banner into the debug string
        memmove_s(buf, bufLen, banner, bannerLen);
        
        // Copy the message into the debug string
        strcpy_s(buf + bannerLen, bufLen - bannerLen, message);
    } else {
        strcpy_s(buf, bufLen, message);
    }
    buf[bufLen - 1] = '\0';
    fprintf_s(stderr, "%s\n", buf);
    // This Win32 API call only prints when a debugger is active
    // OutputDebugStringA(buf);
    free(buf);
#else
    size_t bannerLen = strlen(banner);
    size_t bufLen = bannerLen + length + 1;
    char *buf = (char *)malloc(sizeof(char) * bufLen);
    if (banner) {
        // Copy the banner into the debug string
        memmove(buf, banner, bannerLen);
        
        // Copy the message into the debug string
        strncpy(buf + bannerLen, message, bufLen - bannerLen);
    } else {
        strncpy(buf, message, bufLen);
    }
    buf[bufLen - 1] = '\0';

    // ======= 注意这里 =======

    fprintf(stderr, "%s\n", buf);
    free(buf);
#endif
}

可见NSLog最终都调用了writev和fprintf方法

                      NSLog 
                        ↓
                      NSLogv
                        ↓
                      CFLog1
                        ↓
                      CFLog
                        ↓             
                   _CFLogvEx3
                        ↓             
               _CFLogvEx2Predicate
                        |
                       / \
                     /     \
                   /         \
          __CFLogCString   __CFLogCStringLegacy
                   \         /
                     \     /
                       \ /
                 _logToStderr
                        ↓
                  writev / fprintf

结果与之前的注释一致，那么我们只需要使用 fishhook 对 writev / fprintf方法进行hook就能达到我们的目的了，那么我们继续看看 print函数的源码

print函数源码阅读

print的源码位置在类似 你的编译工程目录/Swift-Build/build/Xcode-RelWithDebInfoAssert/swift-macosx-x86_64具体名字和编译参数和机器有关

我们很容易就找到了源码

// ============ print方法1
public func print(
  _ items: Any...,
  separator: String = " ",
  terminator: String = "\n"
) {
  if let hook = _playgroundPrintHook {
     // ======= 注意这里 =======
    var output = _TeeStream(left: "", right: _Stdout())
    _print(items, separator: separator, terminator: terminator, to: &output)
    hook(output.left)
  }
  else {
   // ======= 注意这里 =======
    var output = _Stdout()
    _print(items, separator: separator, terminator: terminator, to: &output)
  }
}

// ============ print方法2
public func print<Target: TextOutputStream>(
  _ items: Any...,
  separator: String = " ",
  terminator: String = "\n",
  to output: inout Target
) {
  // ======= 注意这里 =======
  _print(items, separator: separator, terminator: terminator, to: &output)
}

可见print 方法会调用_print方法

internal func _print<Target: TextOutputStream>(
  _ items: [Any],
  separator: String = " ",
  terminator: String = "\n",
  to output: inout Target
) {
  var prefix = ""
  output._lock()
  defer { output._unlock() }
  for item in items {
    output.write(prefix)

    // ======= 注意这里 =======
    _print_unlocked(item, &output)

    prefix = separator
  }
  output.write(terminator)
}

// _print_unlocked 源码
@usableFromInline
@_semantics("optimize.sil.specialize.generic.never")
internal func _print_unlocked<T, TargetStream: TextOutputStream>(
  _ value: T, _ target: inout TargetStream
) {
  // Optional has no representation suitable for display; therefore,
  // values of optional type should be printed as a debug
  // string. Check for Optional first, before checking protocol
  // conformance below, because an Optional value is convertible to a
  // protocol if its wrapped type conforms to that protocol.
  // Note: _isOptional doesn't work here when T == Any, hence we
  // use a more elaborate formulation:
  if _openExistential(type(of: value as Any), do: _isOptional) {
    let debugPrintable = value as! CustomDebugStringConvertible
    debugPrintable.debugDescription.write(to: &target)
    return
  }
  if case let streamableObject as TextOutputStreamable = value {
    streamableObject.write(to: &target)
    return
  }

  if case let printableObject as CustomStringConvertible = value {
    printableObject.description.write(to: &target)
    return
  }

  if case let debugPrintableObject as CustomDebugStringConvertible = value {
    debugPrintableObject.debugDescription.write(to: &target)
    return
  }

  let mirror = Mirror(reflecting: value)

  _adHocPrint_unlocked(value, mirror, &target, isDebugPrint: false)
}

可见调用流程如下

print ------> _print ------> _print_unlocked

这里的

TextOutputStreamable
CustomDebugStringConvertible
CustomStringConvertible

////////////////////   CustomStringConvertible
public protocol CustomStringConvertible {
  var description: String { get }
}

////////////////////   CustomDebugStringConvertible
public protocol CustomDebugStringConvertible {
  var debugDescription: String { get }
}

////////////////////   TextOutputStreamable
public protocol TextOutputStreamable {
  /// Writes a textual representation of this instance into the given output
  /// stream.
  func write<Target: TextOutputStream>(to target: inout Target)
}

等都是协议, 将 Target 传入并调用 Target 的 write 方法

我们回过头来看下函数名
internal func _print_unlocked<T, TargetStream: TextOutputStream>( _ value: T, _ target: inout TargetStream )

target 是遵循 TextOutputStream协议的对象，也就是我们之前看到的
_Stdout ()函数


////////////// TextOutputStream 协议
public protocol TextOutputStream {
  mutating func _lock()
  mutating func _unlock()

  /// Appends the given string to the stream.
  mutating func write(_ string: String)

  mutating func _writeASCII(_ buffer: UnsafeBufferPointer<UInt8>)
}

////////////// _Stdout 
internal struct _Stdout: TextOutputStream {
  internal init() {}

  internal mutating func _lock() {
    _swift_stdlib_flockfile_stdout()
  }

  internal mutating func _unlock() {
    _swift_stdlib_funlockfile_stdout()
  }

  internal mutating func write(_ string: String) {
    if string.isEmpty { return }

    var string = string
    _ = string.withUTF8 { utf8 in
      
       // ======= 注意这里 ======= 
      _swift_stdlib_fwrite_stdout(utf8.baseAddress!, 1, utf8.count)
    }
  }
}

// =========== _swift_stdlib_fwrite_stdout 源代码
SWIFT_RUNTIME_STDLIB_INTERNAL
__swift_size_t swift::_swift_stdlib_fwrite_stdout(const void *ptr,
                                                  __swift_size_t size,
                                                  __swift_size_t nitems) {
    // ======= 注意这里 ======= 
    return fwrite(ptr, size, nitems, stdout);
}

我们可以看到

_Stdout -> _swift_stdlib_fwrite_stdout -> fwrite

结合之前的调用方法, 最终也调用了 fwrite方法

print -> _print -> _print_unlocked ->  (print items) -> (write/description.write/debugDescription.write)-> Stdout -> _swift_stdlib_fwrite_stdout -> fwrite

绕了这么大的一圈，我们得出的结论是，print 函数最终调用了fwrite

中场休息~~~~

最终，如果我们要日志监控的话，只需要hook如下三个方法

NSLog 调用  writev / fprintf
print 调用  fwrite

Hook代码

首先我们hook writev, 函数原型

static ssize_t writev(int a, const struct iovec * v, int v_len);

/// struct iovec 类型
struct iovec {
    void *   iov_base;      /* [XSI] Base address of I/O memory region */
    size_t   iov_len;       /* [XSI] Size of region iov_base points to */
};

具体hook代码

//--------------------------------------------------------------------------
// MARK: hook NSLog
//--------------------------------------------------------------------------

// origin writev IMP
static ssize_t (*orig_writev)(int a, const struct iovec * v, int v_len);

// swizzle method
ssize_t asl_writev(int a, const struct iovec *v, int v_len) {
    
    NSMutableString *string = [NSMutableString string];
    for (int i = 0; i < v_len; i++) {
        char *c = (char *)v[i].iov_base;
        [string appendString:[NSString stringWithCString:c encoding:NSUTF8StringEncoding]];
    }
    
    ////////// do something  这里可以捕获到日志 string
    
    // invoke origin mehtod
    ssize_t result = orig_writev(a, v, v_len);
    return result;
}

然后是 fprintf 函数，这里因为 fprintf 是可变参数，具体可变参数相关使用可见博主的另外一篇博客 va_list 可变参数概览

这里先使用NSString 的 @selector(initWithFormat : arguments)方法生成要输出的字符串，直接调用 origin_fprintf 将自行生成的字符串作为参数就行了，免去再次传递可变参数至 origin_fprintf

//--------------------------------------------------------------------------
// MARK: hook fprintf
//--------------------------------------------------------------------------

// origin fprintf IMP
static int     (*origin_fprintf)(FILE * __restrict, const char * __restrict, ...);

// swizzle method
int     asl_fprintf(FILE * __restrict file, const char * __restrict format, ...)
{
    /*
     typedef struct {
         
        unsigned int gp_offset;
        unsigned int fp_offset;
        void *overflow_arg_area;
        void *reg_save_area;
     } va_list[1];
     */
    va_list args;
    
    va_start(args, format);

    NSString *formatter = [NSString stringWithUTF8String:format];
    NSString *string = [[NSString alloc] initWithFormat:formatter arguments:args];
    
    ////////// do something  这里可以捕获到日志

    // invoke orign fprintf
    int result = origin_fprintf(file, [string UTF8String]);
    
    va_end(args);

    return result;
}

然后是 fprintf方法

调用例如 fprintf ("test");方法时候 asl_fwrite会调用两次，参数一次是test,另一次是\n，所以先将字符串放入__messageBuffer,等收到\n时，再将 __messageBuffer中转成字符串一次性读取

//--------------------------------------------------------------------------
// MARK: hook print for swift
//--------------------------------------------------------------------------

// origin fwrite IMP
static size_t (*orig_fwrite)(const void * __restrict, size_t, size_t, FILE * __restrict);

static char *__messageBuffer = {0};
static int __buffIdx = 0;
void reset_buffer()
{
    __messageBuffer = calloc(1, sizeof(char));
    __messageBuffer[0] = '\0';
    __buffIdx = 0;
}


// swizzle method
size_t asl_fwrite(const void * __restrict ptr, size_t size, size_t nitems, FILE * __restrict stream) {
        
    if (__messageBuffer == NULL) {
        // initial Buffer
        reset_buffer();
    }
    
    char *str = (char *)ptr;
    
    NSString *s = [NSString stringWithCString:str encoding:NSUTF8StringEncoding];
    
    if (__messageBuffer != NULL) {
        
        if (str[0] == '\n' && __messageBuffer[0] != '\0') {
            
            s = [[NSString stringWithCString:__messageBuffer encoding:NSUTF8StringEncoding] stringByAppendingString:s];
            
            // reset buffIdx
            reset_buffer();

            ////////// do something  这里可以捕获到日志
        }
        else {
            
            // append buffer
            __messageBuffer = realloc(__messageBuffer, sizeof(char) * (__buffIdx + nitems + 1));
            for (size_t i = __buffIdx; i < nitems; i++) {
                __messageBuffer[i] = str[i];
                __buffIdx ++;
            }
            __messageBuffer[__buffIdx + 1] = '\0';
            __buffIdx ++;
        }
    }
    
    return orig_fwrite(ptr, size, nitems, stream);
}

最后就是 hook 的代码，没啥好说的

//--------------------------------------------------------------------------
// MARK: fishhook调用
//--------------------------------------------------------------------------
// hook writev
rebind_symbols((struct rebinding[1]){{
    "writev",
    asl_writev,
    (void*)&orig_writev
}}, 1);

// hook fwrite
rebind_symbols((struct rebinding[1]){{
    "fwrite",
    asl_fwrite,
    (void *)&orig_fwrite}}, 1);

// hook fprintf
rebind_symbols((struct rebinding[1]){{
    "fprintf",
    asl_fprintf,
    (void *)&origin_fprintf}}, 1);

接下来我们看下成果

image.png

具体的代码，请见https://github.com/madaoCN/Supervisor 功能还在完善中，将间断更新

参考

用fishhook hook输出方法(NSLog, print)
捕获NSLog日志小记
 GodEye日志监控
 iOS逆向工程 - fishhook原理