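Starting with Android 6.0 and AGP 3.6.0, the system can load uncompressed .so files directly from the APK. In other words, the system no longer extracts the .so files from the APK at install time; when a library is loaded, it is loaded straight from the APK.
Details: https://developer.android.com/guide/topics/manifest/application-element#extractNativeLibs
Programmers familiar with glibc know, however, that its dlopen family of functions does not support this, so Android must have extended the library-loading capability of its libc (bionic). This article explains that extension and, along the way, walks through the entire .so loading path: it starts from the Java code, goes down into the libart sources, then through the bionic sources, all the way to the Linux syscall level.
Test demo: https://github.com/huchao/MySystemLoadLibrary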
Now for the source analysis.
I. libcore (Java)
We start from the Java code, which is implemented in libcore.
1. System.loadLibrary
Source: libcore/ojluni/src/main/java/java/lang/System.java
public static void loadLibrary(String libname) {
    Runtime.getRuntime().loadLibrary0(Reflection.getCallerClass(), libname);
}
As in OpenJDK, System.loadLibrary forwards to Runtime.getRuntime().loadLibrary0.
2. Runtime.loadLibrary0
Source: libcore/ojluni/src/main/java/java/lang/Runtime.java
private synchronized void loadLibrary0(ClassLoader loader, Class<?> callerClass, String libname) {
    if (libname.indexOf((int)File.separatorChar) != -1) {
        throw new UnsatisfiedLinkError(
                "Directory separator should not appear in library name: " + libname);
    }
    String libraryName = libname;
    // Android-note: BootClassLoader doesn't implement findLibrary(). http://b/111850480
    // Android's class.getClassLoader() can return BootClassLoader where the RI would
    // have returned null; therefore we treat BootClassLoader the same as null here.
    if (loader != null && !(loader instanceof BootClassLoader)) {
        String filename = loader.findLibrary(libraryName);
        if (filename == null &&
                (loader.getClass() == PathClassLoader.class ||
                 loader.getClass() == DelegateLastClassLoader.class)) {
            // Don't give up even if we failed to find the library in the native lib paths.
            // The underlying dynamic linker might be able to find the lib in one of the linker
            // namespaces associated with the current linker namespace. In order to give the
            // dynamic linker a chance, proceed to load the library with its soname, which
            // is the fileName.
            // Note that we do this only for PathClassLoader and DelegateLastClassLoader to
            // minimize the scope of this behavioral change as much as possible, which might
            // cause problem like b/143649498. These two class loaders are the only
            // platform-provided class loaders that can load apps. See the classLoader attribute
            // of the application tag in app manifest.
            filename = System.mapLibraryName(libraryName);
        }
        if (filename == null) {
            // It's not necessarily true that the ClassLoader used
            // System.mapLibraryName, but the default setup does, and it's
            // misleading to say we didn't find "libMyLibrary.so" when we
            // actually searched for "liblibMyLibrary.so.so".
            throw new UnsatisfiedLinkError(loader + " couldn't find \"" +
                                           System.mapLibraryName(libraryName) + "\"");
        }
        String error = nativeLoad(filename, loader);
        if (error != null) {
            throw new UnsatisfiedLinkError(error);
        }
        return;
    }

    // We know some apps use mLibPaths directly, potentially assuming it's not null.
    // Initialize it here to make sure apps see a non-null value.
    getLibPaths();
    String filename = System.mapLibraryName(libraryName);
    String error = nativeLoad(filename, loader, callerClass);
    if (error != null) {
        throw new UnsatisfiedLinkError(error);
    }
}
After a few overloaded calls, execution reaches the loadLibrary0 overload shown above. It first calls loader.findLibrary with libraryName to locate the .so file, then passes the resulting path to nativeLoad, which loads the .so into the process. nativeLoad is a native method.
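The decision flow of loadLibrary0 can be condensed into a small stand-alone sketch. This is only a model written for illustration: LoadLibrarySketch, resolve, the finder callback and the allowSonameFallback flag are hypothetical names, mapLibraryName here hard-codes the Android "lib" + name + ".so" convention, and '/' is assumed as the separator.

```java
import java.util.function.Function;

/** Hypothetical, simplified model of the name-resolution half of loadLibrary0. */
final class LoadLibrarySketch {
    /** Stand-in for System.mapLibraryName on Android: "mytest" becomes "libmytest.so". */
    static String mapLibraryName(String libname) {
        return "lib" + libname + ".so";
    }

    /**
     * Returns the string that would be handed to nativeLoad.
     *
     * @param finder              stand-in for ClassLoader.findLibrary; returns a path or null
     * @param allowSonameFallback models the PathClassLoader / DelegateLastClassLoader case
     */
    static String resolve(String libname, Function<String, String> finder,
                          boolean allowSonameFallback) {
        // Library names must not contain a directory separator ('/' assumed here).
        if (libname.indexOf('/') != -1) {
            throw new UnsatisfiedLinkError(
                    "Directory separator should not appear in library name: " + libname);
        }
        String filename = finder.apply(libname);
        if (filename == null && allowSonameFallback) {
            // Give the dynamic linker a chance to find the library by its soname.
            filename = mapLibraryName(libname);
        }
        if (filename == null) {
            throw new UnsatisfiedLinkError("couldn't find \"" + mapLibraryName(libname) + "\"");
        }
        return filename;
    }
}
```

Calling LoadLibrarySketch.resolve("mytest", name -> null, true) yields the bare soname, which mirrors the fallback branch described in the comments of the real method.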
2.1. DexPathList.findLibrary
Source: libcore/dalvik/src/main/java/dalvik/system/DexPathList.java
public String findLibrary(String libraryName) {
    String fileName = System.mapLibraryName(libraryName);
    for (NativeLibraryElement element : nativeLibraryPathElements) {
        String path = element.findNativeLibrary(fileName);
        if (path != null) {
            return path;
        }
    }
    return null;
}
- Before stepping into nativeLoad, a detour through loader.findLibrary shows how the system turns libraryName into the full path of the library to load (in this example, the string "mytest" resolves to "/data/app/~~fZ_4-EauEUei3h26P_847A==/com.huchao.mysystemloadlibrary-iVnUuIYOZ3V4zuSqrYZ5uw==/base.apk!/lib/arm64-v8a/libmytest.so").
- loader.findLibrary is ultimately implemented by DexPathList.findLibrary. Here libraryName is "mytest"; System.mapLibraryName turns it into the fileName "libmytest.so" (System.mapLibraryName calls the native System_mapLibraryName, which is just a string-formatting operation).
- A for loop then walks nativeLibraryPathElements and asks each element for fileName. nativeLibraryPathElements is the native library search path. It is initialized at app startup, before Application.attachBaseContext, and contains, in order, nativeLibraryDirectories (the app's own paths) followed by systemNativeLibraryDirectories (the system paths). In this example its value is:
directory "/data/app/~~pgatC4H9zh6_9M5Okay-PA==/com.huchao.mysystemloadlibrary-R-Bnf1LWAGqkpWIltJG6_w==/lib/arm64"
zip file "/data/app/~~pgatC4H9zh6_9M5Okay-PA==/com.huchao.mysystemloadlibrary-R-Bnf1LWAGqkpWIltJG6_w==/base.apk", dir "lib/arm64-v8a"
directory "/system/lib64"
directory "/system/system_ext/lib64"
directory "/system/product/lib64"
- So the app's own paths are searched before the system paths, and the "zip file" entry shows that a .so can be loaded directly from the zip file base.apk.
Source: libcore/dalvik/src/main/java/dalvik/system/DexPathList.java
public String findNativeLibrary(String name) {
    maybeInit();
    if (zipDir == null) {
        String entryPath = new File(path, name).getPath();
        if (IoUtils.canOpenReadOnly(entryPath)) {
            return entryPath;
        }
    } else if (urlHandler != null) {
        // Having a urlHandler means the element has a zip file.
        // In this case Android supports loading the library iff
        // it is stored in the zip uncompressed.
        String entryName = zipDir + '/' + name;
        if (urlHandler.isEntryStored(entryName)) {
            return path.getPath() + zipSeparator + entryName;
        }
    }
    return null;
}
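- Next, element.findNativeLibrary. It first checks whether zipDir is null. zipDir tells whether this element searches inside a zip file, that is, inside the APK; it corresponds to the "zip file" entry of nativeLibraryPathElements above.
- When zipDir == null (no lookup inside the APK), it builds the path and calls IoUtils.canOpenReadOnly to check whether the .so file can be opened read-only. If so, it returns the full path, for example: /data/app/~~pgatC4H9zh6_9M5Okay-PA==/com.huchao.mysystemloadlibrary-R-Bnf1LWAGqkpWIltJG6_w==/lib/arm64/libmytest.so
- When zipDir != null (lookup inside the APK), it builds the entry name and calls urlHandler.isEntryStored to check whether the .so inside the APK is usable. If so, it returns the path, for example: /data/app/~~pgatC4H9zh6_9M5Okay-PA==/com.huchao.mysystemloadlibrary-R-Bnf1LWAGqkpWIltJG6_w==/base.apk!/lib/arm64-v8a/libmytest.so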
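To make the search order and the two branches concrete, here is a minimal sketch that replays the lookup over an in-memory model. It is not the libcore code: NativeLibrarySearchSketch and Element are hypothetical names, and the two sets are stand-ins for "this file can be opened read-only" and "this zip entry is stored uncompressed".

```java
import java.util.List;
import java.util.Set;

/** Hypothetical model of the per-element lookup; not the libcore classes themselves. */
final class NativeLibrarySearchSketch {
    /** One search-path element: a plain directory when zipDir is null, else a zip plus a directory in it. */
    static final class Element {
        final String path;
        final String zipDir;

        Element(String path, String zipDir) {
            this.path = path;
            this.zipDir = zipDir;
        }
    }

    /**
     * Returns the first match for fileName, trying the elements in list order,
     * or null when no element has the library.
     */
    static String find(String fileName, List<Element> elements,
                       Set<String> readableFiles, Set<String> storedEntries) {
        for (Element e : elements) {
            if (e.zipDir == null) {
                // Directory element: a plain path on disk.
                String candidate = e.path + "/" + fileName;
                if (readableFiles.contains(candidate)) {
                    return candidate;
                }
            } else {
                // Zip element: "<zip path>!/<dir in zip>/<file name>".
                String candidate = e.path + "!/" + e.zipDir + "/" + fileName;
                if (storedEntries.contains(candidate)) {
                    return candidate;
                }
            }
        }
        return null;
    }
}
```

With the app directory listed first, the APK second and a system directory last, a library that exists only as a stored APK entry resolves to the "base.apk!/..." form, which is the string later handed to the linker.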
Source: libcore/luni/src/main/java/libcore/io/ClassPathURLStreamHandler.java
public boolean isEntryStored(String entryName) {
    ZipEntry entry = jarFile.getEntry(entryName);
    return entry != null && entry.getMethod() == ZipEntry.STORED;
}
- The isEntryStored call above lands in ClassPathURLStreamHandler.isEntryStored, which looks up entryName in jarFile. In this example entryName is lib/arm64-v8a/libmytest.so.
- If the entry exists and its compression method is ZipEntry.STORED, the method returns true, meaning the .so was found. ZipEntry defines two methods: STORED (stored without compression) and DEFLATED (Deflate compression).
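The same check can be tried on a desktop JVM with java.util.zip. The sketch below is an illustration, not the libcore class: StoredEntrySketch and its entry names are made up, and buildSampleZip only exists to produce a throwaway archive with one STORED and one DEFLATED entry.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

final class StoredEntrySketch {
    /** Same test as isEntryStored above, written against java.util.zip.ZipFile. */
    static boolean isEntryStored(String zipPath, String entryName) {
        try (ZipFile zip = new ZipFile(zipPath)) {
            ZipEntry entry = zip.getEntry(entryName);
            return entry != null && entry.getMethod() == ZipEntry.STORED;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a temporary zip with one STORED and one DEFLATED entry (made-up names and content). */
    static String buildSampleZip() {
        byte[] data = "not a real ELF file".getBytes(StandardCharsets.UTF_8);
        try {
            File zip = File.createTempFile("sample", ".zip");
            zip.deleteOnExit();
            try (ZipOutputStream out = new ZipOutputStream(new FileOutputStream(zip))) {
                // STORED entries need size and CRC set before the data is written.
                ZipEntry stored = new ZipEntry("lib/arm64-v8a/libstored.so");
                CRC32 crc = new CRC32();
                crc.update(data);
                stored.setMethod(ZipEntry.STORED);
                stored.setSize(data.length);
                stored.setCompressedSize(data.length);
                stored.setCrc(crc.getValue());
                out.putNextEntry(stored);
                out.write(data);
                out.closeEntry();

                ZipEntry deflated = new ZipEntry("lib/arm64-v8a/libdeflated.so");
                deflated.setMethod(ZipEntry.DEFLATED);
                out.putNextEntry(deflated);
                out.write(data);
                out.closeEntry();
            }
            return zip.getPath();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Only the STORED entry passes the check; a DEFLATED or missing entry makes the element report "not found", so the search moves on to the next path element.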
II. libcore (Native)
This is where libcore's native code begins.
1. Runtime_nativeLoad
Source: libcore/ojluni/src/main/native/Runtime.c
JNIEXPORT jstring JNICALL
Runtime_nativeLoad(JNIEnv* env, jclass ignored, jstring javaFilename,
                   jobject javaLoader, jclass caller)
{
    return JVM_NativeLoad(env, javaFilename, javaLoader, caller);
}
Back in Runtime.loadLibrary0: once the path lookup is done, the JNI call to nativeLoad ends up in Runtime_nativeLoad, which forwards to JVM_NativeLoad in libart.
III. libart (Native)
We are now in the libart code.
1. JVM_NativeLoad
Source: art/openjdkjvm/OpenjdkJvm.cc
JNIEXPORT jstring JVM_NativeLoad(JNIEnv* env,
                                 jstring javaFilename,
                                 jobject javaLoader,
                                 jclass caller) {
  ScopedUtfChars filename(env, javaFilename);
  if (filename.c_str() == nullptr) {
    return nullptr;
  }

  std::string error_msg;
  {
    art::JavaVMExt* vm = art::Runtime::Current()->GetJavaVM();
    bool success = vm->LoadNativeLibrary(env,
                                         filename.c_str(),
                                         javaLoader,
                                         caller,
                                         &error_msg);
    if (success) {
      return nullptr;
    }
  }

  // Don't let a pending exception from JNI_OnLoad cause a CheckJNI issue with NewStringUTF.
  env->ExceptionClear();
  return env->NewStringUTF(error_msg.c_str());
}
After validating its arguments, it forwards to JavaVMExt::LoadNativeLibrary.
2. JavaVMExt::LoadNativeLibrary
Source: art/runtime/jni/java_vm_ext.cc
bool JavaVMExt::LoadNativeLibrary(JNIEnv* env,
                                  const std::string& path,
                                  jobject class_loader,
                                  jclass caller_class,
                                  std::string* error_msg) {
  ......
  void* handle = android::OpenNativeLibrary(
      env,
      runtime_->GetTargetSdkVersion(),
      path_str,
      class_loader,
      (caller_location.empty() ? nullptr : caller_location.c_str()),
      library_path.get(),
      &needs_native_bridge,
      &nativeloader_error_msg);
  ......
}
Non-essential code is omitted from the JavaVMExt::LoadNativeLibrary listing (including the first check: if the .so has already been loaded and is usable, the function returns true immediately). On a first load it forwards to android::OpenNativeLibrary. The returned handle is an opaque handle to the loaded library, like the return value of dlopen.
3. OpenNativeLibrary
Source: art/libnativeloader/native_loader.cpp
void* OpenNativeLibrary(JNIEnv* env, int32_t target_sdk_version, const char* path,
                        jobject class_loader, const char* caller_location, jstring library_path,
                        bool* needs_native_bridge, char** error_msg) {
  ......
  return OpenNativeLibraryInNamespace(ns, path, needs_native_bridge, error_msg);
}
Non-essential code is omitted from the OpenNativeLibrary listing; the function then forwards to OpenNativeLibraryInNamespace.
4. OpenNativeLibraryInNamespace
Source: art/libnativeloader/native_loader.cpp
void* OpenNativeLibraryInNamespace(NativeLoaderNamespace* ns, const char* path,
                                   bool* needs_native_bridge, char** error_msg) {
  ......
  auto handle = ns->Load(path);
  ......
}
Non-essential code is omitted from the OpenNativeLibraryInNamespace listing; the function then forwards to NativeLoaderNamespace::Load.
5. NativeLoaderNamespace::Load
Source: art/libnativeloader/native_loader_namespace.cpp
Result<void*> NativeLoaderNamespace::Load(const char* lib_name) const {
  if (!IsBridged()) {
    android_dlextinfo extinfo;
    extinfo.flags = ANDROID_DLEXT_USE_NAMESPACE;
    extinfo.library_namespace = this->ToRawAndroidNamespace();
    void* handle = android_dlopen_ext(lib_name, RTLD_NOW, &extinfo);
    if (handle != nullptr) {
      return handle;
    }
  } else {
    void* handle =
        NativeBridgeLoadLibraryExt(lib_name, RTLD_NOW, this->ToRawNativeBridgeNamespace());
    if (handle != nullptr) {
      return handle;
    }
  }
  return Error() << GetLinkerError(IsBridged());
}
The Bridged case in the else branch is not of interest here. NativeLoaderNamespace::Load ends by calling android_dlopen_ext to load the .so, passing RTLD_NOW so that all undefined symbols are resolved before the call returns. android_dlopen_ext is Android's extended dlopen. So Android's System.loadLibrary loads libraries through android_dlopen_ext rather than the dlopen that OpenJDK uses (for a walkthrough of OpenJDK's System.loadLibrary, see the references at the end).
IV. libdl (bionic)
We are now in libdl, bionic's dynamic linking library.
android_dlopen_ext
Source: bionic/libdl/libdl.cpp
void* android_dlopen_ext(const char* filename, int flag, const android_dlextinfo* extinfo) {
  const void* caller_addr = __builtin_return_address(0);
  return __loader_android_dlopen_ext(filename, flag, extinfo, caller_addr);
}
android_dlopen_ext forwards directly to the internal __loader_android_dlopen_ext.
V. linker (bionic)
We are now in the linker executable of bionic.
1. __loader_android_dlopen_ext
Source: bionic/linker/dlfcn.cpp
void* __loader_android_dlopen_ext(const char* filename,
                                  int flags,
                                  const android_dlextinfo* extinfo,
                                  const void* caller_addr) {
  return dlopen_ext(filename, flags, extinfo, caller_addr);
}
__loader_android_dlopen_ext forwards directly to dlopen_ext.
2. dlopen_ext
Source: bionic/linker/dlfcn.cpp
static void* dlopen_ext(const char* filename,
                        int flags,
                        const android_dlextinfo* extinfo,
                        const void* caller_addr) {
  ScopedPthreadMutexLocker locker(&g_dl_mutex);
  g_linker_logger.ResetState();
  void* result = do_dlopen(filename, flags, extinfo, caller_addr);
  if (result == nullptr) {
    __bionic_format_dlerror("dlopen failed", linker_get_error_buffer());
    return nullptr;
  }
  return result;
}
dlopen_ext takes the g_dl_mutex lock to serialize callers, then forwards to do_dlopen.
3. do_dlopen
Source: bionic/linker/linker.cpp
void* do_dlopen(const char* name, int flags,
                const android_dlextinfo* extinfo,
                const void* caller_addr) {
  ......
  soinfo* si = find_library(ns, translated_name, flags, extinfo, caller);
  ......
}
After argument handling, logging and tracing, do_dlopen forwards to find_library.
4. find_library
Source: bionic/linker/linker.cpp
static soinfo* find_library(android_namespace_t* ns,
                            const char* name, int rtld_flags,
                            const android_dlextinfo* extinfo,
                            soinfo* needed_by) {
  soinfo* si = nullptr;

  if (name == nullptr) {
    si = solist_get_somain();
  } else if (!find_libraries(ns,
                             needed_by,
                             &name,
                             1,
                             &si,
                             nullptr,
                             0,
                             rtld_flags,
                             extinfo,
                             false /* add_as_children */,
                             true /* search_linked_namespaces */)) {
    if (si != nullptr) {
      soinfo_unload(si);
    }
    return nullptr;
  }

  si->increment_ref_count();

  return si;
}
Here name is not nullptr, so the function calls find_libraries and, on success, increments the reference count. The next section looks into find_libraries, the core function.
5. find_libraries
Source: bionic/linker/linker.cpp
bool find_libraries(android_namespace_t* ns,
                    soinfo* start_with,
                    const char* const library_names[],
                    size_t library_names_count,
                    soinfo* soinfos[],
                    std::vector<soinfo*>* ld_preloads,
                    size_t ld_preloads_count,
                    int rtld_flags,
                    const android_dlextinfo* extinfo,
                    bool add_as_children,
                    bool search_linked_namespaces,
                    std::vector<android_namespace_t*>* namespaces) {
  // Step 0: prepare.
  std::unordered_map<const soinfo*, ElfReader> readers_map;
  LoadTaskList load_tasks;

  for (size_t i = 0; i < library_names_count; ++i) {
    const char* name = library_names[i];
    load_tasks.push_back(LoadTask::create(name, start_with, ns, &readers_map));
  }

  // If soinfos array is null allocate one on stack.
  // The array is needed in case of failure; for example
  // when library_names[] = {libone.so, libtwo.so} and libone.so
  // is loaded correctly but libtwo.so failed for some reason.
  // In this case libone.so should be unloaded on return.
  // See also implementation of failure_guard below.
  if (soinfos == nullptr) {
    size_t soinfos_size = sizeof(soinfo*)*library_names_count;
    soinfos = reinterpret_cast<soinfo**>(alloca(soinfos_size));
    memset(soinfos, 0, soinfos_size);
  }

  // list of libraries to link - see step 2.
  size_t soinfos_count = 0;

  auto scope_guard = android::base::make_scope_guard([&]() {
    for (LoadTask* t : load_tasks) {
      LoadTask::deleter(t);
    }
  });

  ZipArchiveCache zip_archive_cache;

  // Step 1: expand the list of load_tasks to include
  // all DT_NEEDED libraries (do not load them just yet)
  for (size_t i = 0; i < load_tasks.size(); ++i) {
    LoadTask* task = load_tasks[i];
    soinfo* needed_by = task->get_needed_by();

    bool is_dt_needed = needed_by != nullptr && (needed_by != start_with || add_as_children);
    task->set_extinfo(is_dt_needed ? nullptr : extinfo);
    task->set_dt_needed(is_dt_needed);

    LD_LOG(kLogDlopen, "find_libraries(ns=%s): task=%s, is_dt_needed=%d", ns->get_name(),
           task->get_name(), is_dt_needed);

    // Note: start from the namespace that is stored in the LoadTask. This namespace
    // is different from the current namespace when the LoadTask is for a transitive
    // dependency and the lib that created the LoadTask is not found in the
    // current namespace but in one of the linked namespace.
    if (!find_library_internal(const_cast<android_namespace_t*>(task->get_start_from()),
                               task,
                               &zip_archive_cache,
                               &load_tasks,
                               rtld_flags,
                               search_linked_namespaces || is_dt_needed)) {
      return false;
    }

    soinfo* si = task->get_soinfo();

    if (is_dt_needed) {
      needed_by->add_child(si);
    }

    // When ld_preloads is not null, the first
    // ld_preloads_count libs are in fact ld_preloads.
    if (ld_preloads != nullptr && soinfos_count < ld_preloads_count) {
      ld_preloads->push_back(si);
    }

    if (soinfos_count < library_names_count) {
      soinfos[soinfos_count++] = si;
    }
  }

  // Step 2: Load libraries in random order (see b/24047022)
  LoadTaskList load_list;
  for (auto&& task : load_tasks) {
    soinfo* si = task->get_soinfo();
    auto pred = [&](const LoadTask* t) {
      return t->get_soinfo() == si;
    };

    if (!si->is_linked() &&
        std::find_if(load_list.begin(), load_list.end(), pred) == load_list.end() ) {
      load_list.push_back(task);
    }
  }
  bool reserved_address_recursive = false;
  if (extinfo) {
    reserved_address_recursive = extinfo->flags & ANDROID_DLEXT_RESERVED_ADDRESS_RECURSIVE;
  }
  if (!reserved_address_recursive) {
    // Shuffle the load order in the normal case, but not if we are loading all
    // the libraries to a reserved address range.
    shuffle(&load_list);
  }

  // Set up address space parameters.
  address_space_params extinfo_params, default_params;
  size_t relro_fd_offset = 0;
  if (extinfo) {
    if (extinfo->flags & ANDROID_DLEXT_RESERVED_ADDRESS) {
      extinfo_params.start_addr = extinfo->reserved_addr;
      extinfo_params.reserved_size = extinfo->reserved_size;
      extinfo_params.must_use_address = true;
    } else if (extinfo->flags & ANDROID_DLEXT_RESERVED_ADDRESS_HINT) {
      extinfo_params.start_addr = extinfo->reserved_addr;
      extinfo_params.reserved_size = extinfo->reserved_size;
    }
  }

  for (auto&& task : load_list) {
    address_space_params* address_space =
        (reserved_address_recursive || !task->is_dt_needed()) ? &extinfo_params : &default_params;
    if (!task->load(address_space)) {
      return false;
    }
  }

  // Step 3: pre-link all DT_NEEDED libraries in breadth first order.
  ......

  // Step 4: Construct the global group. Note: DF_1_GLOBAL bit of a library is
  // determined at step 3.

  // Step 4-1: DF_1_GLOBAL bit is force set for LD_PRELOADed libs because they
  // must be added to the global group
  ......

  // Step 4-2: Gather all DF_1_GLOBAL libs which were newly loaded during this
  // run. These will be the new member of the global group
  ......

  // Step 4-3: Add the new global group members to all the linked namespaces
  ......

  // Step 5: Collect roots of local_groups.
  // Whenever needed_by->si link crosses a namespace boundary it forms its own local_group.
  // Here we collect new roots to link them separately later on. Note that we need to avoid
  // collecting duplicates. Also the order is important. They need to be linked in the same
  // BFS order we link individual libraries.
  ......

  // Step 6: link all local groups
  ......

  // Step 7: Mark all load_tasks as linked and increment refcounts
  // for references between load_groups (at this point it does not matter if
  // referenced load_groups were loaded by previous dlopen or as part of this
  // one on step 6)
  ......

  return true;
}
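find_libraries consists of eight steps (Step 0 through Step 7):
- (Step 0: prepare.) In this example the call is System.loadLibrary("mytest"), so library_names_count is 1. This step wraps each library to load in a LoadTask and stores it in the load_tasks container. A few LoadTask members deserve mention:
- LoadTask.name_ is the name of the library this task loads (the full path for the library requested in this example; DT_NEEDED entries carry a bare name such as libc.so).
- LoadTask.file_offset_ is the offset of the .so within the file it is loaded from, such as the APK (when a .so is loaded directly from the APK, this field is > 0).
- LoadTask.is_dt_needed_ tells whether the library is loaded as a dependency. libmytest.so is the library requested directly, so the field is false for it; libmytest.so depends on libc.so, so the field is true when libc.so is loaded.
In addition, this step prepares for atomicity when several libraries are loaded together: if two libraries are requested and the second one fails to load, the first one must be unloaded as well.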
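- (Step 1: expand the list of load_tasks to include all DT_NEEDED libraries (do not load them just yet)) This step expands the dependencies of the library, and their dependencies in turn; together they form a tree. The step is mainly a for loop. Since load_tasks.size() is 1 after Step 0, one might expect the loop to run only once, which is clearly not enough to expand every dependency of libmytest.so, so that expectation must be wrong. A closer look shows that the loop body passes the address of load_tasks to find_library_internal, which presumably appends the dependencies to load_tasks. The job of this step, then, is to add the whole dependency tree to the load_tasks container.
- To confirm this guess, take a detour into find_library_internal. It first calls find_loaded_library_by_soname with the library's name (here the full path) to check whether the library is already loaded. We assume this is the first System.loadLibrary("mytest") call, so libmytest.so is certainly not loaded yet. (On later loop iterations, common libraries such as libdl and libc are already in the process, so find_library_internal returns true right at find_loaded_library_by_soname.) It then calls load_library with task and load_tasks as arguments.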
static bool find_library_internal(android_namespace_t* ns,
                                  LoadTask* task,
                                  ZipArchiveCache* zip_archive_cache,
                                  LoadTaskList* load_tasks,
                                  int rtld_flags,
                                  bool search_linked_namespaces) {
  soinfo* candidate;

  if (find_loaded_library_by_soname(ns, task->get_name(), search_linked_namespaces, &candidate)) {
    LD_LOG(kLogDlopen,
           "find_library_internal(ns=%s, task=%s): Already loaded (by soname): %s",
           ns->get_name(), task->get_name(), candidate->get_realpath());
    task->set_soinfo(candidate);
    return true;
  }

  // Library might still be loaded, the accurate detection
  // of this fact is done by load_library.
  TRACE("[ \"%s\" find_loaded_library_by_soname failed (*candidate=%s@%p). Trying harder... ]",
        task->get_name(), candidate == nullptr ? "n/a" : candidate->get_realpath(), candidate);

  if (load_library(ns, task, zip_archive_cache, load_tasks, rtld_flags, search_linked_namespaces)) {
    return true;
  }
  ......
  return false;
}
- Continuing into load_library: after checking flags and similar operations, it calls open_library. On success (fd != -1) it forwards to the load_library overload, again passing task and load_tasks.
static bool load_library(android_namespace_t* ns,
                         LoadTask* task,
                         ZipArchiveCache* zip_archive_cache,
                         LoadTaskList* load_tasks,
                         int rtld_flags,
                         bool search_linked_namespaces) {
  const char* name = task->get_name();
  soinfo* needed_by = task->get_needed_by();
  ......
  // Open the file.
  int fd = open_library(ns, zip_archive_cache, name, needed_by, &file_offset, &realpath);
  if (fd == -1) {
    if (task->is_dt_needed()) {
      if (needed_by->is_main_executable()) {
        DL_OPEN_ERR("library \"%s\" not found: needed by main executable", name);
      } else {
        DL_OPEN_ERR("library \"%s\" not found: needed by %s in namespace %s", name,
                    needed_by->get_realpath(), task->get_start_from()->get_name());
      }
    } else {
      DL_OPEN_ERR("library \"%s\" not found", name);
    }
    return false;
  }

  task->set_fd(fd, true);
  task->set_file_offset(file_offset);

  return load_library(ns, task, load_tasks, rtld_flags, realpath, search_linked_namespaces);
}
- Let's skip open_library for now, assume it succeeds, and come back to it later. The load_library overload first calls soinfo_alloc to allocate the soinfo structure, then reads the ELF header and the segments needed for loading, and finally calls for_each_dt_needed, which appends a task for every DT_NEEDED entry to the load_tasks container. This confirms the guess above: if libmytest.so has dependencies, the for loop runs more than once, because the loop body appends items to load_tasks. It also follows that the dependencies of a .so form a dependency tree, and that the tree is traversed breadth-first.
static bool load_library(android_namespace_t* ns,
                         LoadTask* task,
                         LoadTaskList* load_tasks,
                         int rtld_flags,
                         const std::string& realpath,
                         bool search_linked_namespaces) {
  off64_t file_offset = task->get_file_offset();
  ......
  soinfo* si = soinfo_alloc(ns, realpath.c_str(), &file_stat, file_offset, rtld_flags);
  if (si == nullptr) {
    return false;
  }

  task->set_soinfo(si);

  // Read the ELF header and some of the segments.
  if (!task->read(realpath.c_str(), file_stat.st_size)) {
    soinfo_free(si);
    task->set_soinfo(nullptr);
    return false;
  }
  ......
  for_each_dt_needed(task->get_elf_reader(), [&](const char* name) {
    LD_LOG(kLogDlopen, "load_library(ns=%s, task=%s): Adding DT_NEEDED task: %s",
           ns->get_name(), task->get_name(), name);
    load_tasks->push_back(LoadTask::create(name, si, ns, task->get_readers_map()));
  });

  return true;
}
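The "list that grows while it is being walked" pattern behind Step 1 is easy to reproduce in isolation. The sketch below is a simplification under stated assumptions: DependencyExpansionSketch and the needed map are hypothetical, the map stands in for reading DT_NEEDED entries from each ELF file, and duplicates are filtered with a set, whereas the linker relies on its own already-loaded checks.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class DependencyExpansionSketch {
    /**
     * Expands root into the breadth-first list of libraries to consider.
     * needed maps a library name to the names in its DT_NEEDED entries
     * (hypothetical input; the linker reads these from the dynamic section).
     */
    static List<String> expand(String root, Map<String, List<String>> needed) {
        List<String> loadTasks = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        loadTasks.add(root);
        seen.add(root);
        // Index-based loop on purpose: the list grows while we iterate over it,
        // just like load_tasks in Step 1.
        for (int i = 0; i < loadTasks.size(); i++) {
            String current = loadTasks.get(i);
            List<String> deps = needed.getOrDefault(current, Collections.<String>emptyList());
            for (String dep : deps) {
                if (seen.add(dep)) {
                    loadTasks.add(dep);
                }
            }
        }
        return loadTasks;
    }
}
```

Because new entries are appended at the end and consumed in index order, direct dependencies are always visited before their own dependencies, which is exactly a breadth-first traversal.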
- As an aside, readelf -d also lists the libraries a .so depends on: readelf -d libmytest.so | grep NEEDED (see "Linux so analysis" in the references). For example:
huchao@ubuntu:~$ readelf -d libmytest.so | grep NEEDED
 0x0000000000000001 (NEEDED)             Shared library: [liblog.so]
 0x0000000000000001 (NEEDED)             Shared library: [libandroid.so]
 0x0000000000000001 (NEEDED)             Shared library: [libm.so]
 0x0000000000000001 (NEEDED)             Shared library: [libdl.so]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so]
- Now back to open_library, which we skipped earlier. In this example we load libmytest.so by path; the name necessarily contains "/", so the file is opened directly instead of being searched for on the library paths.
- If the .so is loaded directly from the APK, name looks like /data/app/~~WdKfQO1G6r3htDT7Rgo1DQ==/com.huchao.mysystemloadlibrary-6wbqoASC9saPFntEre3_MQ==/base.apk!/lib/arm64-v8a/libmytest.so. The path contains kZipFileSeparator (!/), so the lookup goes into the APK file: open_library_in_zipfile is called and returns the fd of the opened APK file.
- If the .so is loaded from a local path, name looks like /data/app/~~bK-kneb_uAxNsrsy-CEmDw==/com.huchao.mysystemloadlibrary-wRz7Al17VLjWqWAtbV_l0A==/lib/arm64/libmytest.so. The Linux open syscall is called and returns the fd of the .so file.
static int open_library(android_namespace_t* ns,
                        ZipArchiveCache* zip_archive_cache,
                        const char* name, soinfo *needed_by,
                        off64_t* file_offset, std::string* realpath) {
  TRACE("[ opening %s from namespace %s ]", name, ns->get_name());

  // If the name contains a slash, we should attempt to open it directly and not search the paths.
  if (strchr(name, '/') != nullptr) {
    int fd = -1;

    if (strstr(name, kZipFileSeparator) != nullptr) {
      fd = open_library_in_zipfile(zip_archive_cache, name, file_offset, realpath);
    }

    if (fd == -1) {
      fd = TEMP_FAILURE_RETRY(open(name, O_RDONLY | O_CLOEXEC));
      if (fd != -1) {
        *file_offset = 0;
        if (!realpath_fd(fd, realpath)) {
          if (!is_first_stage_init()) {
            PRINT("warning: unable to get realpath for the library \"%s\". Will use given path.",
                  name);
          }
          *realpath = name;
        }
      }
    }

    return fd;
  }
  ......
}
- Following the APK branch into open_library_in_zipfile, the key line is *file_offset = entry.offset;. Once the entry for the .so has been found in the APK, and provided that it is stored with kCompressStored and that its offset is a multiple of the memory page size (4096), file_offset is set and the fd is returned. These values are stored in the LoadTask, and therefore in load_tasks, for the later load. Note that up to this point only the dependency tree has been walked; no .so has been loaded yet.
static int open_library_in_zipfile(ZipArchiveCache* zip_archive_cache,
                                   const char* const input_path,
                                   off64_t* file_offset, std::string* realpath) {
  ......
  int fd = TEMP_FAILURE_RETRY(open(zip_path, O_RDONLY | O_CLOEXEC));
  if (fd == -1) {
    return -1;
  }

  ZipArchiveHandle handle;
  if (!zip_archive_cache->get_or_open(zip_path, &handle)) {
    // invalid zip-file (?)
    close(fd);
    return -1;
  }

  ZipEntry entry;

  if (FindEntry(handle, file_path, &entry) != 0) {
    // Entry was not found.
    close(fd);
    return -1;
  }

  // Check if it is properly stored
  if (entry.method != kCompressStored || (entry.offset % PAGE_SIZE) != 0) {
    close(fd);
    return -1;
  }

  *file_offset = entry.offset;
  ......
  return fd;
}
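The two conditions that decide whether a path is served from inside the APK can be written down in a few lines. This is a sketch for illustration only: ZipPathSketch and its method names are hypothetical, the part of open_library_in_zipfile that splits the path is elided in the listing above and is modelled here by a plain first-occurrence split on "!/", and a 4096-byte page is assumed.

```java
final class ZipPathSketch {
    static final String ZIP_FILE_SEPARATOR = "!/";
    static final long PAGE_SIZE = 4096; // assumption: 4 KiB pages, as in the article

    /** Returns {zipPath, entryPath}, or null when the path has no "!/" separator. */
    static String[] split(String path) {
        int index = path.indexOf(ZIP_FILE_SEPARATOR);
        if (index < 0) {
            return null;
        }
        return new String[] {
            path.substring(0, index),
            path.substring(index + ZIP_FILE_SEPARATOR.length())
        };
    }

    /** Mirrors the "properly stored" test: uncompressed and starting on a page boundary. */
    static boolean canMapDirectly(boolean stored, long entryOffset) {
        return stored && entryOffset % PAGE_SIZE == 0;
    }
}
```

The alignment requirement is what makes the later file mapping possible without copying: the entry's data can only be mapped in place if it starts on a page boundary within the APK.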
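- (Step 2: Load libraries in random order (see b/24047022)) This is the step that finally loads the libraries. Before diving in, recall how dlopen works on GNU/Linux. man dlopen shows that dlopen lives in manual section 3, Library calls (functions within program libraries). Loading a .so is therefore not a capability provided by the Linux kernel itself; libc builds it on top of Linux syscalls. Executables and shared objects on Linux are ELF files. To load a .so, a process maps the segments described by the ELF file into its virtual address space. The kernel tracks each mapping with a struct vm_area_struct (through the procfs pseudo file system, /proc/[pid]/maps shows the mappings of a process, including those of loaded libraries), and the loader then performs relocations such as PLT/GOT. The usual way to associate a file with memory, backed by a struct vm_area_struct in the kernel, is the mmap syscall. The source confirms this.
DLOPEN(3)                 Linux Programmer's Manual                 DLOPEN(3)
NAME
       dlclose, dlopen, dlmopen - open and close a shared object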
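Since /proc/[pid]/maps comes up here, a small parser helps when inspecting how a library ended up mapped. The sketch assumes the documented line layout (address range, permissions, offset, device, inode, optional pathname); MapsLineSketch is a hypothetical helper, and any sample line you feed it in a test is made up rather than captured from a device.

```java
final class MapsLineSketch {
    final long start;
    final long end;
    final String perms;
    final long offset;
    final String path; // empty for anonymous mappings

    private MapsLineSketch(long start, long end, String perms, long offset, String path) {
        this.start = start;
        this.end = end;
        this.perms = perms;
        this.offset = offset;
        this.path = path;
    }

    /** Parses one line laid out as: address perms offset dev inode [pathname]. */
    static MapsLineSketch parse(String line) {
        // Limit of 6 keeps a pathname containing spaces in one piece.
        String[] fields = line.trim().split("\\s+", 6);
        if (fields.length < 5) {
            throw new IllegalArgumentException("not a maps line: " + line);
        }
        String[] range = fields[0].split("-", 2);
        long start = Long.parseUnsignedLong(range[0], 16);
        long end = Long.parseUnsignedLong(range[1], 16);
        long offset = Long.parseUnsignedLong(fields[2], 16);
        String path = fields.length == 6 ? fields[5] : "";
        return new MapsLineSketch(start, end, fields[1], offset, path);
    }

    /** True when the mapping is backed by an .apk file rather than a standalone file. */
    boolean isBackedByApk() {
        return path.endsWith(".apk");
    }
}
```

On Linux, reading /proc/self/maps line by line and passing each line to parse gives the mappings of the current process; a mapping whose path ends in .apk and whose offset is non-zero is a candidate for a library mapped straight out of the APK.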
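- This step uses a new container, load_list, in place of load_tasks. load_tasks holds libmytest.so together with all of its dependencies; load_list is a subset that holds only the libraries that still need to be loaded, because, as mentioned above, some of them may already be loaded. Once load_list is ready, the code iterates over load_list and calls LoadTask.load on each task, which actually loads the .so.
- LoadTask.load mainly forwards to ElfReader.Load through elf_reader.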
bool load(address_space_params* address_space) {
  ElfReader& elf_reader = get_elf_reader();
  if (!elf_reader.Load(address_space)) {
    return false;
  }
  ......
  return true;
}
- ElfReader.Load in turn calls three functions: ReserveAddressSpace, LoadSegments and FindPhdr. If all three succeed, the .so has been loaded successfully.
bool ElfReader::Load(address_space_params* address_space) {
  CHECK(did_read_);
  if (did_load_) {
    return true;
  }
  if (ReserveAddressSpace(address_space) && LoadSegments() && FindPhdr()) {
    did_load_ = true;
  }

  return did_load_;
}
- The comment on ElfReader::ReserveAddressSpace says that the function uses mmap to reserve an anonymous region of virtual memory large enough for the later load. The actual mmap call is made in ReserveAligned, which also aligns to the Linux page boundary (4096).
// Reserve a virtual address range big enough to hold all loadable
// segments of a program header table. This is done by creating a
// private anonymous mmap() with PROT_NONE.
bool ElfReader::ReserveAddressSpace(address_space_params* address_space) {
  ......
  ReserveAligned(load_size_, kLibraryAlignment);
  ......
}

// Reserve a virtual address range such that if it's limits were extended to the next 2**align
// boundary, it would not overlap with any existing mappings.
static void* ReserveAligned(size_t size, size_t align) {
  int mmap_flags = MAP_PRIVATE | MAP_ANONYMOUS;
  if (align == PAGE_SIZE) {
    void* mmap_ptr = mmap(nullptr, size, PROT_NONE, mmap_flags, -1, 0);
    if (mmap_ptr == MAP_FAILED) {
      return nullptr;
    }
    return mmap_ptr;
  }

  // Allocate enough space so that the end of the desired region aligned up is still inside the
  // mapping.
  size_t mmap_size = align_up(size, align) + align - PAGE_SIZE;
  uint8_t* mmap_ptr =
      reinterpret_cast<uint8_t*>(mmap(nullptr, mmap_size, PROT_NONE, mmap_flags, -1, 0));
  if (mmap_ptr == MAP_FAILED) {
    return nullptr;
  }
  ......
}
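The size computed in ReserveAligned is plain arithmetic and can be checked by hand. The sketch below reproduces only that arithmetic: ReserveSizeSketch is a hypothetical name, alignUp assumes a power-of-two alignment, PAGE_SIZE is assumed to be 4096, and no memory is reserved.

```java
final class ReserveSizeSketch {
    static final long PAGE_SIZE = 4096; // assumption: 4 KiB pages

    /** Rounds value up to the next multiple of alignment (alignment must be a power of two). */
    static long alignUp(long value, long alignment) {
        return (value + alignment - 1) & ~(alignment - 1);
    }

    /** Size of the anonymous PROT_NONE mapping requested for the given size and alignment. */
    static long reservationSize(long size, long align) {
        if (align == PAGE_SIZE) {
            // Page alignment comes for free from mmap, so nothing extra is needed.
            return size;
        }
        // Over-allocate so that an aligned region of the wanted size fits somewhere inside.
        return alignUp(size, align) + align - PAGE_SIZE;
    }
}
```

For example, with a purely illustrative alignment of 262144 bytes, a 5000-byte request becomes 262144 + 262144 - 4096 = 520192 bytes of reserved address space; the part outside the aligned region can be released afterwards.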
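- With the virtual address space reserved, the actual load follows in ElfReader::LoadSegments. The function starts with a for loop; phdr_num_ is the number of program headers of the .so being loaded (that is, the number of Elf64_Phdr structures). Then comes the long-awaited mmap64 call, which maps the loadable segments of the .so into virtual memory. A word on the last two arguments of mmap64. fd_ is the file descriptor returned by open above; it can refer to a .so file or to an APK file on disk. file_offset_ + file_page_start is the offset within that file. When fd_ refers to a .so, file_offset_ is 0 and the offset is simply the page-aligned offset of the segment within the .so (a .so usually has several segments). When fd_ refers to an APK, file_offset_ is the offset of the uncompressed .so within the APK, so the sum locates the segment inside the APK file.
bool ElfReader::LoadSegments() {
  for (size_t i = 0; i < phdr_num_; ++i) {
    ......
    {
      void* seg_addr = mmap64(reinterpret_cast<void*>(seg_page_start),
                              file_length,
                              prot,
                              MAP_FIXED|MAP_PRIVATE,
                              fd_,
                              file_offset_ + file_page_start);
      if (seg_addr == MAP_FAILED) {
        DL_ERR("couldn't map \"%s\" segment %zd: %s", name_.c_str(), i, strerror(errno));
        return false;
      }
    }
    ......
  }
  return true;
}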
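Mapping a region that starts at a non-zero offset inside a larger container file can be tried from Java as an analogy. This is not what the linker does internally: OffsetMappingSketch is a hypothetical name, the container is a throwaway file of zero padding followed by a payload, and FileChannel.map accepts any non-negative position, whereas the raw mmap syscall requires a page-aligned file offset (hence the alignment check in open_library_in_zipfile).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

final class OffsetMappingSketch {
    /** Maps length bytes of the file starting at offset and copies them out. */
    static byte[] mapRegion(String file, long offset, int length) {
        try (FileChannel channel = FileChannel.open(Paths.get(file), StandardOpenOption.READ)) {
            MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_ONLY, offset, length);
            byte[] out = new byte[length];
            buffer.get(out);
            return out;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes a throwaway container: padding zero bytes followed by the payload. */
    static String writeContainer(int padding, byte[] payload) {
        try {
            Path path = Files.createTempFile("container", ".bin");
            path.toFile().deleteOnExit();
            byte[] all = new byte[padding + payload.length];
            System.arraycopy(payload, 0, all, padding, payload.length);
            Files.write(path, all);
            return path.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a padding of 4096 bytes, the payload plays the role of a stored .so inside an APK: one file descriptor for the container plus an offset is all that is needed to map it, and no separate extracted file is required.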
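- What follows is pre-linking, building the global group and similar operations. For reasons of space this article does not go into them; interested readers can continue with the source.
To summarize, here is the overall flow of System.loadLibrary:
- The Java code in libcore provides the System.loadLibrary API and, after some light wrapping, calls through JNI into libcore's native code. The Java code mainly holds policy logic, such as assembling the search paths for the .so.
- The native code in libcore is again a thin wrapper, which forwards to libart.
- libart mainly serves the Java layer above and then forwards to libdl. As the analysis shows, it still does not touch the core of loading a .so.
- libdl takes us into bionic, where the linker parses the .so file format step by step and then mmaps the .so into the process's virtual address space segment by segment. Only there does the flow end.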
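There is plenty more to dig into from here, for example:
- How is PLT/GOT relocation implemented?
- After mmap traps into the kernel, how does the kernel manage the individual segments through struct vm_area_struct?
- What does the memory paging mentioned above mean? Why is the value always 4096, and can it be changed?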
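Finally, a word on the motivation for all this. Starting with Android 6.0 and AGP 3.6.0, once the "do not compress .so files in the APK" option is enabled, the list of .so files an app has loaded can no longer be found by parsing /proc/[pid]/maps at runtime. Investigating that led to the uncompressed-.so-in-APK feature, and to the question of how Linux supports loading a .so directly from an APK. The Java code and the libart code did not provide the answer; it only turned up after digging into bionic's linker. In hindsight, Linux has supported this all along, since mmap can map a file at an offset; glibc's dlopen simply does not expose it, and Android's bionic extended that capability.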
References:
OpenJDK System.loadLibrary source walkthrough: https://blog.csdn.net/xt_xiaotian/article/details/122194883
Linux so analysis: https://blog.csdn.net/xt_xiaotian/article/details/116446531