应用程序加载

2021-08-07  本文已影响0人  浅墨入画

应用程序的加载

本篇主要是分析dyld的加载流程,了解在main函数之前,底层还做了什么? 我们经常听到的二进制重排启动优化等字眼,就是在App的启动流程中做文章。

编译过程及库

在分析app启动之前,我们需要先了解iOSapp代码的编译过程以及动态库静态库

编译过程

编译过程如下所示

编译过程
静态库和动态库

静态库:在链接阶段,会将可汇编生成的目标程序与引用的库一起链接打包到可执行文件当中。此时的静态库就不会再改变了,因为它是编译时被直接拷贝一份,复制到目标程序里的

动态库:程序编译时并不会链接到目标程序中,目标程序只会存储指向动态库的引用,在程序运行时才被载入

  1. 减少打包之后app的大小:因为不需要拷贝至目标程序中,所以不会影响目标程序的体积,与静态库相比,减少了app的体积大小
  2. 共享内存,节约资源:同一份库可以被多个程序使用
  3. 通过更新动态库,达到更新程序的目的:由于运行时才载入的特性,可以随时对库进行替换,而不需要重新编译代码

静态库与动态库图示

image.png
dyld加载流程分析

dyld(the dynamic link editor)是苹果的动态链接器,是苹果操作系统的重要组成部分,在app被编译打包成可执行文件格式的Mach-O文件后,交由dyld负责连接,加载程序

dyld演变过程WWDC视频 -> WWDC2017:App Startup Time: Past, Present, and Future

App启动流程图

dyld加载流程

创建工程,ViewController中重写load方法,在main中加了一个C++方法,并且在main函数中打印,查看程序执行顺序

<!-- ViewController.m文件 -->
@implementation ViewController
+ (void)load{
    NSLog(@"%s",__func__);
}

- (void)viewDidLoad {
    [super viewDidLoad];
    // Do any additional setup after loading the view.
}
@end

<!-- main.m文件 -->
int main(int argc, char * argv[]) {
    NSString * appDelegateClassName;
    NSLog(@"1223333");
    @autoreleasepool {
        // Setup code that might create autoreleased objects goes here.
        appDelegateClassName = NSStringFromClass([AppDelegate class]);
    }
    return UIApplicationMain(argc, argv, nil, appDelegateClassName);
}

__attribute__((constructor)) void kcFunc(){
    printf("来了 : %s \n",__func__);
}

// 控制台打印
2021-08-06 21:46:06.605315+0800 002-应用程加载分析[5305:346935] +[ViewController load]
来了 : kcFunc 
2021-08-06 21:51:43.385279+0800 002-应用程加载分析[5305:346935] 1223333

通过打印结果可以看出其顺序是load --> C++方法 --> mainmain作为入口函数为什么不是最先执行?根据这个问题,我们来探索main函数之前底层做了什么?

image.png image.png image.png

dyld流程中的main函数

进入dyld::_main的源码实现,发现代码特别长,可以根据_main函数的返回值进行反推。反推流程result -> sMainExecutable -> ``

在_main函数中主要做了以下几件事情:

// 创建主程序cdHash的空间
    uint8_t mainExecutableCDHashBuffer[20];
        //从环境中获取主可执行文件的 cdHash
    const uint8_t* mainExecutableCDHash = nullptr;
    if ( const char* mainExeCdHashStr = _simple_getenv(apple, "executable_cdhash") ) {
        unsigned bufferLenUsed;
        if ( hexStringToBytes(mainExeCdHashStr, mainExecutableCDHashBuffer, sizeof(mainExecutableCDHashBuffer), bufferLenUsed) )
            mainExecutableCDHash = mainExecutableCDHashBuffer;
    }
        //配置信息,获取主程序的mach-o header、silder(ASLR的偏移值)
    getHostInfo(mainExecutableMH, mainExecutableSlide);
// load shared cache
// 检查共享缓存是否开启,iOS中必须开启
checkSharedRegionDisable((dyld3::MachOLoaded*)mainExecutableMH, mainExecutableSlide);
    if ( gLinkContext.sharedRegionMode != ImageLoader::kDontUseSharedRegion ) {
#if TARGET_OS_SIMULATOR
// 检查共享缓存是否映射到了共享区域
        if ( sSharedCacheOverrideDir)
            mapSharedCache(mainExecutableSlide);
#else
        mapSharedCache(mainExecutableSlide);
#endif
// instantiate ImageLoader for main executable
// 主程序的初始化
// 加载可执行文件,生成一个ImageLoader实例对象
sMainExecutable = instantiateFromLoadedImage(mainExecutableMH, mainExecutableSlide, sExecPath);
gLinkContext.mainExecutable = sMainExecutable;
gLinkContext.mainExecutableCodeSigned = hasCodeSignatureLoadCommand(mainExecutableMH);
// load any inserted libraries
// 插入动态库,如果在越狱环境下是可以修改的
if  ( sEnv.DYLD_INSERT_LIBRARIES != NULL ) {
    for (const char* const* lib = sEnv.DYLD_INSERT_LIBRARIES; *lib != NULL; ++lib) 
    loadInsertedDylib(*lib);
}
// record count of inserted libraries so that a flat search will look at 
// inserted libraries, then main, then others.
// 插入库,然后是main,然后是其他
sInsertedDylibCount = sAllImages.size()-1;
// link主程序,链接主程序与动态库、插入动态库
link(sMainExecutable, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
sMainExecutable->setNeverUnloadRecursive();
image.png image.png image.png image.png
dyld加载流程

load的源码链为:_dyld_start --> dyldbootstrap::start --> dyld::_main --> dyld::initializeMainExecutable --> ImageLoader::runInitializers --> ImageLoader::processInitializers --> ImageLoader::recursiveInitialization --> dyld::notifySingle(是一个回调处理) --> load_images(libobjc.A.dylib)

整个加载过程是由dyld开始,最终到达libobjc结束,也就是说应用加载流程不只是dyld在工作,还有很多库在配合一起完成这个过程。

dyld流程图.png

dyld流程-主程序运行

主程序初始化initializeMainExecutable函数源码分析
void initializeMainExecutable()
{
    // record that we've reached this step
    gLinkContext.startedInitializingMainExecutable = true;

    // run initialzers for any inserted dylibs
    ImageLoader::InitializerTimingList initializerTimes[allImagesCount()];
    initializerTimes[0].count = 0;
    const size_t rootCount = sImageRoots.size();
    if ( rootCount > 1 ) {
        for(size_t i=1; i < rootCount; ++i) {
            sImageRoots[i]->runInitializers(gLinkContext, initializerTimes[0]);
        }
    }
    
    // run initializers for main executable and everything it brings up 
    //运行主要可执行文件的初始化程序及其所带来的一切
    sMainExecutable->runInitializers(gLinkContext, initializerTimes[0]);
    
    // register cxa_atexit() handler to run static terminators in all loaded images when this process exits
    //当此进程退出时,注册cxa_atexit() 处理程序以在所有加载的图像中运行静态终止符
    if ( gLibSystemHelpers != NULL ) 
        (*gLibSystemHelpers->cxa_atexit)(&runAllStaticTerminators, NULL, NULL);

    // dump info if requested
    if ( sEnv.DYLD_PRINT_STATISTICS )
        ImageLoader::printStatistics((unsigned int)allImagesCount(), initializerTimes[0]);
    if ( sEnv.DYLD_PRINT_STATISTICS_DETAILS )
        ImageLoaderMachO::printStatisticsDetails((unsigned int)allImagesCount(), initializerTimes[0]);
}
void ImageLoader::runInitializers(const LinkContext& context, InitializerTimingList& timingInfo)
{
    uint64_t t1 = mach_absolute_time();
    mach_port_t thisThread = mach_thread_self();
    ImageLoader::UninitedUpwards up;
    up.count = 1;
    up.imagesAndPaths[0] = { this, this->getPath() };
    //重点内容 
    processInitializers(context, thisThread, timingInfo, up);
    context.notifyBatch(dyld_image_state_initialized, false);
    mach_port_deallocate(mach_task_self(), thisThread);
    uint64_t t2 = mach_absolute_time();
    fgTotalInitTime += (t2 - t1);
}
// <rdar://problem/14412057> upward dylib initializers can be run too soon
// To handle dangling dylibs which are upward linked but not downward, all upward linked dylibs
// have their initialization postponed until after the recursion through downward dylibs
// has completed.
void ImageLoader::processInitializers(const LinkContext& context, mach_port_t thisThread,
                                     InitializerTimingList& timingInfo, ImageLoader::UninitedUpwards& images)
{
    uint32_t maxImageCount = context.imageCount()+2;
    ImageLoader::UninitedUpwards upsBuffer[maxImageCount];
    ImageLoader::UninitedUpwards& ups = upsBuffer[0];
    ups.count = 0;
    // Calling recursive init on all images in images list, building a new list of
    // uninitialized upward dependencies.
    for (uintptr_t i=0; i < images.count; ++i) {
        images.imagesAndPaths[i].first->recursiveInitialization(context, thisThread, images.imagesAndPaths[i].second, timingInfo, ups);
    }
    // If any upward dependencies remain, init them.
    if ( ups.count > 0 )
        processInitializers(context, thisThread, timingInfo, ups);
}
void ImageLoader::recursiveInitialization(const LinkContext& context, mach_port_t this_thread, const char* pathToInitialize,
                                          InitializerTimingList& timingInfo, UninitedUpwards& uninitUps)
{
    recursive_lock lock_info(this_thread);
    recursiveSpinLock(lock_info);

    if ( fState < dyld_image_state_dependents_initialized-1 ) {
        uint8_t oldState = fState;
        // break cycles
        fState = dyld_image_state_dependents_initialized-1;
        try {
            // initialize lower level libraries first
            for(unsigned int i=0; i < libraryCount(); ++i) {
                ImageLoader* dependentImage = libImage(i);
                if ( dependentImage != NULL ) {
                    // don't try to initialize stuff "above" me yet
                    if ( libIsUpward(i) ) {
                        uninitUps.imagesAndPaths[uninitUps.count] = { dependentImage, libPath(i) };
                        uninitUps.count++;
                    }
                    else if ( dependentImage->fDepth >= fDepth ) {
                        dependentImage->recursiveInitialization(context, this_thread, libPath(i), timingInfo, uninitUps);
                    }
                }
            }
            
            // record termination order
            if ( this->needsTermination() )
                context.terminationRecorder(this);

            // let objc know we are about to initialize this image
            uint64_t t1 = mach_absolute_time();
            fState = dyld_image_state_dependents_initialized;
            oldState = fState;
            context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);
            
            // initialize this image
            bool hasInitializers = this->doInitialization(context);

            // let anyone know we finished initializing this image
            fState = dyld_image_state_initialized;
            oldState = fState;
            context.notifySingle(dyld_image_state_initialized, this, NULL);
            
            if ( hasInitializers ) {
                uint64_t t2 = mach_absolute_time();
                timingInfo.addTime(this->getShortName(), t2-t1);
            }
        }
        catch (const char* msg) {
            // this image is not initialized
            fState = oldState;
            recursiveSpinUnLock();
            throw;
        }
    }
    // 递归解锁
    recursiveSpinUnLock();
}

在这里,需要分成两部分探索,一部分是notifySingle函数,一部分是doInitialization函数。

notifySingle 函数
doInitialization 函数

得出结论_objc_init的源码链:_dyld_start --> dyldbootstrap::start --> dyld::_main --> dyld::initializeMainExecutable --> ImageLoader::runInitializers --> ImageLoader::processInitializers --> ImageLoader::recursiveInitialization --> doInitialization -->libSystem_initializer(libSystem.B.dylib) --> _os_object_init(libdispatch.dylib) --> _objc_init(libobjc.A.dylib)

分析doInitialization(调用init方法)与context.notifySingle(单个通知注入)的关系

通过调用链路,此函数里面找不到load_images函数了,应为load_images已经不是dyld中的函数了,是属于libobjc库的函数。在此函数中发现了关键的回调(*sNotifyObjCInit)(image->getRealPath(), image->machHeader());,但是在回调之前判断了sNotifyObjCInit是否为空,就证明一定有给sNotifyObjCInit变量赋值的地方

static void notifySingle(dyld_image_states state, const ImageLoader* image, ImageLoader::InitializerTimingList* timingInfo)
{
    //dyld::log("notifySingle(state=%d, image=%s)\n", state, image->getPath());
    std::vector<dyld_image_state_change_handler>* handlers = stateToHandlers(state, sSingleHandlers);
    if ( handlers != NULL ) {
        dyld_image_info info;
        info.imageLoadAddress   = image->machHeader();
        info.imageFilePath      = image->getRealPath();
        info.imageFileModDate   = image->lastModified();
        for (std::vector<dyld_image_state_change_handler>::iterator it = handlers->begin(); it != handlers->end(); ++it) {
            const char* result = (*it)(state, 1, &info);
            if ( (result != NULL) && (state == dyld_image_state_mapped) ) {
                //fprintf(stderr, "  image rejected by handler=%p\n", *it);
                // make copy of thrown string so that later catch clauses can free it
                const char* str = strdup(result);
                throw str;
            }
        }
    }
    if ( state == dyld_image_state_mapped ) {
        // <rdar://problem/7008875> Save load addr + UUID for images from outside the shared cache
        // <rdar://problem/50432671> Include UUIDs for shared cache dylibs in all image info when using private mapped shared caches
        if (!image->inSharedCache()
            || (gLinkContext.sharedRegionMode == ImageLoader::kUsePrivateSharedRegion)) {
            dyld_uuid_info info;
            if ( image->getUUID(info.imageUUID) ) {
                info.imageLoadAddress = image->machHeader();
                addNonSharedCacheImageUUID(info);
            }
        }
    }
    if ( (state == dyld_image_state_dependents_initialized) && (sNotifyObjCInit != NULL) && image->notifyObjC() ) {
        uint64_t t0 = mach_absolute_time();
        dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
        (*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
        uint64_t t1 = mach_absolute_time();
        uint64_t t2 = mach_absolute_time();
        uint64_t timeInObjC = t1-t0;
        uint64_t emptyTime = (t2-t1)*100;
        if ( (timeInObjC > emptyTime) && (timingInfo != NULL) ) {
            timingInfo->addTime(image->getShortName(), timeInObjC);
        }
    }
    // mach message csdlc about dynamically unloaded images
    if ( image->addFuncNotified() && (state == dyld_image_state_terminated) ) {
        notifyKernel(*image, false);
        const struct mach_header* loadAddress[] = { image->machHeader() };
        const char* loadPath[] = { image->getPath() };
        notifyMonitoringDyld(true, 1, loadAddress, loadPath);
    }
}
void _dyld_objc_notify_register(_dyld_objc_notify_mapped    mapped,
                                _dyld_objc_notify_init      init,
                                _dyld_objc_notify_unmapped  unmapped)
{
    dyld::registerObjCNotifiers(mapped, init, unmapped);
}
上一篇下一篇

猜你喜欢

热点阅读