[Original] The exe loading process under .NET, seen from the source code
Title: [Original] The exe loading process under .NET, seen from the source code
Author: tankaiha
Date: 2006-09-11, 18:24
Link: http://bbs./showthread.php?threadid=31799
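The "source code" here is of course not the source of the .NET Framework itself. Microsoft did, however, publish the source of an open-source CLI code-named rotor, which you can treat as a lightweight .NET Framework. Most importantly, the two work in roughly the same way. Today we use the rotor source to look at the dynamic loading of an exe file, the most basic thing to understand when debugging programs. As before, I list my references first so that nobody accuses me of plagiarism: "Inside the Rotor CLI", and another book, "Shared Source CLI", which just cannot be found online. You will also need to download the sscli 2.0 archive from the MSDN website.

As under win32, the system provides a loader that reads the exe in, and sscli ships another example loader: clix.exe. For now let us treat it as the system's default loader and look at its source (clix.cpp). Pay attention to the code that was marked in red in the original post: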
Code:
DWORD Launch(WCHAR* pFileName, WCHAR* pCmdLine)
{
    WCHAR exeFileName[MAX_PATH + 1];
    DWORD dwAttrs;
    DWORD dwError;
    DWORD nExitCode;
    ...
    // A series of file attribute checks is performed here
    ...
    if (dwError != ERROR_SUCCESS) {
        // We can't find the file, or there's some other problem. Exit with an error.
        fwprintf(stderr, L"%s: ", pFileName);
        DisplayMessageFromSystem(dwError);
        return 1; // error
    }
    nExitCode = _CorExeMain2(NULL, 0, pFileName, NULL, pCmdLine);
    // _CorExeMain2 never returns with success
    _ASSERTE(nExitCode != 0);
    DisplayMessageFromSystem(::GetLastError());
    return nExitCode;
}
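Here we meet the famous CorExeMain. Remember that when you open a .NET PE file in a PE editor, it imports only one function, mscoree.dll!_CorExeMain? Odd: why is it not _CorExeMain2? That is just a small difference between rotor and the commercial framework. Disassemble mscoree.dll with IDA Pro and you will see that _CorExeMain() is nothing more than a relay. The code is as follows: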
Code:
.text:79011B47 push offset a_corexemain ; "_CorExeMain"
.text:79011B4C push [ebp+hModule] ; hModule
.text:79011B4F call ds:__imp__GetProcAddress@8 ; GetProcAddress(x,x)
.text:79011B55 test eax, eax
.text:79011B57 jz loc_79019B46
.text:79011B5D call eax
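The disassembly above follows a lookup-then-call pattern: resolve an export by name, check the result, and call through the returned pointer. A minimal portable sketch of that pattern is shown here. It is only a simulation: `FakeModule`, `getProcAddress`, `engineCorExeMain` and `shimCorExeMain` are hypothetical names invented for illustration, the map stands in for a loaded DLL, and the return values are arbitrary. Nothing here is taken from mscoree.dll itself.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a loaded DLL: export name -> function pointer.
using EntryFn = int (*)();
using FakeModule = std::map<std::string, EntryFn>;

// Hypothetical stand-in for GetProcAddress: returns nullptr when the
// export is not present in the module.
EntryFn getProcAddress(const FakeModule& mod, const std::string& name) {
    assert(!name.empty());
    auto it = mod.find(name);
    return it == mod.end() ? nullptr : it->second;
}

// Plays the role of the real entry function that lives in the engine DLL.
// The value 42 is arbitrary and only makes the call observable.
int engineCorExeMain() { return 42; }

// Plays the role of the thin relay: resolve by name, then call through.
// Returns -1 when the lookup fails (the disassembly jumps to an error
// path at that point; what that path does is not shown in the post).
int shimCorExeMain(const FakeModule& engine) {
    EntryFn fn = getProcAddress(engine, "_CorExeMain");
    if (fn == nullptr) {
        return -1;
    }
    return fn();
}
```

Calling `shimCorExeMain` with a module that contains a `"_CorExeMain"` entry forwards to it, while an empty module takes the failure branch.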
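Once entered, mscoree.dll's _CorExeMain() immediately calls the _CorExeMain in mscorwks.dll. That function does much the same job as the _CorExeMain2 in rotor mentioned above: it starts the initialization for loading the exe. All of this can be seen by comparing the disassembly with the source code. Back in sscli, here is the code of _CorExeMain2() (ceemain.cpp):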
Code:
__int32 STDMETHODCALLTYPE _CorExeMain2( // Executable exit code.
    PBYTE pUnmappedPE,              // -> memory mapped code
    DWORD cUnmappedPE,              // Size of memory mapped code
    __in LPWSTR pImageNameIn,       // -> Executable Name
    __in LPWSTR pLoadersFileName,   // -> Loaders Name
    __in LPWSTR pCmdLine)           // -> Command Line
{
    // This entry point is used by clix
    BOOL bRetVal = 0;
    //BEGIN_ENTRYPOINT_VOIDRET;
    // Before we initialize the EE, make sure we've snooped for all EE-specific
    // command line arguments that might guide our startup.
    HRESULT result = CorCommandLine::SetArgvW(pCmdLine);
    if (!CacheCommandLine(pCmdLine, CorCommandLine::GetArgvW(NULL))) {
        LOG((LF_STARTUP, LL_INFO10, "Program exiting - CacheCommandLine failed\n"));
        bRetVal = -1;
        goto exit;
    }
    if (SUCCEEDED(result))
        result = CoInitializeEE(COINITEE_DEFAULT | COINITEE_MAIN);
    if (FAILED(result)) {
        VMDumpCOMErrors(result);
        SetLatchedExitCode (-1);
        goto exit;
    }
    // This is here to get the ZAPMONITOR working correctly
    INSTALL_UNWIND_AND_CONTINUE_HANDLER;
    // Load the executable
    bRetVal = ExecuteEXE(pImageNameIn);
    ...
    ...
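The control flow of the listing above can be reduced to three ordered steps with an early exit: cache the command line, initialize the engine, then execute the image. The sketch below shows only that shape. `StartupHooks` and `runExecutable` are hypothetical names, the callbacks stand in for CacheCommandLine, CoInitializeEE and ExecuteEXE, and both failure paths are collapsed to -1. That is a simplification: in the original the initialization failure path latches the exit code through SetLatchedExitCode(-1), and what happens at the `exit` label is not shown in the post.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical callbacks standing in for the three steps of the listing.
struct StartupHooks {
    std::function<bool(const std::wstring&)> cacheCommandLine;  // ~ CacheCommandLine
    std::function<bool()> initEngine;                           // ~ CoInitializeEE
    std::function<int(const std::wstring&)> executeExe;         // ~ ExecuteEXE
};

// Run the steps in order and stop at the first failure.
// Early returns replace the "goto exit" of the original; -1 for every
// failure is an assumption made to keep the sketch short.
int runExecutable(const StartupHooks& hooks,
                  const std::wstring& imageName,
                  const std::wstring& cmdLine) {
    assert(hooks.cacheCommandLine && hooks.initEngine && hooks.executeExe);
    if (!hooks.cacheCommandLine(cmdLine)) {
        return -1;  // command line could not be cached
    }
    if (!hooks.initEngine()) {
        return -1;  // engine failed to initialize; the image is never run
    }
    return hooks.executeExe(imageName);
}
```

The point of the ordering is that the image is never executed unless the engine came up first, which matches the two steps the post singles out.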
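Most of the _CorExeMain2() code can be skipped. Only two steps matter: initialize the EE (execution engine), and once that succeeds, call ExecuteEXE with the file name as its argument. This also shows clearly what arguments _CorExeMain2() receives. ExecuteEXE() is short and is itself just a trampoline (note that the listing below takes an HMODULE rather than the file name passed above, so it is presumably a sibling overload of the one called here):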
Code:
BOOL STDMETHODCALLTYPE ExecuteEXE(HMODULE hMod)
{
    STATIC_CONTRACT_GC_TRIGGERS;
    _ASSERTE(hMod);
    if (!hMod)
        return FALSE;
    ETWTraceStartup::TraceEvent(ETW_TYPE_STARTUP_EXEC_EXE);
    TIMELINE_START(STARTUP, ("ExecuteExe"));
    EX_TRY_NOCATCH
    {
        // Executables are part of the system domain
        SystemDomain::ExecuteMainMethod(hMod);
    }
    EX_END_NOCATCH;
    ETWTraceStartup::TraceEvent(ETW_TYPE_STARTUP_EXEC_EXE+1);
    TIMELINE_END(STARTUP, ("ExecuteExe"));
    return TRUE;
}
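In ExecuteEXE() too, only one line matters: SystemDomain::ExecuteMainMethod(hMod). Taken literally, ExecuteMainMethod treats the file passed in as a module. In .NET the containment order is assembly > module > class > method. That is, an assembly may contain several modules, and in an executable assembly exactly one method, in one of those modules, is the entry point (the Main method).

Next comes the code that SystemDomain::ExecuteMainMethod() leads to. The listing below, from assembly.cpp, is Assembly::ExecuteMainMethod():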
Code:
INT32 Assembly::ExecuteMainMethod(PTRARRAYREF *stringArgs)
{
    CONTRACTL
    {
        INSTANCE_CHECK;
        THROWS;
        GC_TRIGGERS;
        MODE_ANY;
        ENTRY_POINT;
        INJECT_FAULT(COMPlusThrowOM());
    }
    CONTRACTL_END;
    HRESULT hr = S_OK;
    INT32 iRetVal = 0;
    BEGIN_ENTRYPOINT_THROWS;
    Thread *pThread = GetThread();
    MethodDesc *pMeth;
    {
        // This thread looks like it wandered in -- but actually we rely on it to keep the process alive.
        pThread->SetBackground(FALSE);
        GCX_COOP();
        pMeth = GetEntryPoint();
        if (pMeth) {
            RunMainPre();
            hr = ClassLoader::RunMain(pMeth, 1, &iRetVal, stringArgs);
        }
    }
    //RunMainPost is supposed to be called on the main thread of an EXE,
    //after that thread has finished doing useful work. It contains logic
    //to decide when the process should get torn down. So, don't call it from
    // AppDomain.ExecuteAssembly()
    if (pMeth) {
        if (stringArgs == NULL)
            RunMainPost();
    }
    else {
        StackSString displayName;
        GetDisplayName(displayName);
        COMPlusThrowHR(COR_E_MISSINGMETHOD, IDS_EE_FAILED_TO_FIND_MAIN, displayName);
    }
    if (FAILED(hr))
        ThrowHR(hr);
    END_ENTRYPOINT_THROWS;
    return iRetVal;
}
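To make the assembly > module > class > method containment concrete, and to show the "no entry point means a missing-method error" branch of the listing above, here is a small illustrative model. All type and function names (`Method`, `Class`, `Module`, `Assembly`, `findEntryPoint`, `entryPointNameOrThrow`) are hypothetical. The real GetEntryPoint() is not shown in the post, so the linear search below is purely an assumption made for illustration and should not be read as how the runtime locates Main.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Illustrative containment model only: assembly > module > class > method.
struct Method   { std::string name; bool isEntryPoint; };
struct Class    { std::string name; std::vector<Method> methods; };
struct Module   { std::string name; std::vector<Class> classes; };
struct Assembly { std::string name; std::vector<Module> modules; };

// Walk the hierarchy and return the first method flagged as the entry
// point, or nullptr when the assembly has none (for example a library).
const Method* findEntryPoint(const Assembly& assembly) {
    for (const Module& mod : assembly.modules) {
        for (const Class& cls : mod.classes) {
            for (const Method& m : cls.methods) {
                if (m.isEntryPoint) {
                    return &m;
                }
            }
        }
    }
    return nullptr;
}

// Same shape as the listing: a missing entry point is reported as an
// error that names the assembly; otherwise the entry method is used.
std::string entryPointNameOrThrow(const Assembly& assembly) {
    const Method* entry = findEntryPoint(assembly);
    if (entry == nullptr) {
        throw std::runtime_error("failed to find Main in " + assembly.name);
    }
    assert(entry->isEntryPoint);
    return entry->name;
}
```

An executable-like assembly with one flagged method yields that method's name, while a library-like assembly with no flagged method takes the error branch, which corresponds to the COR_E_MISSINGMETHOD path in the listing.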
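Assembly::ExecuteMainMethod() again has two key steps: set up the thread environment, then run the Main method. Next we move to clsload.cpp and look at ClassLoader::RunMain, which is also our last stop this time.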
Code:
HRESULT ClassLoader::RunMain(MethodDesc *pFD ,
                             short numSkipArgs,
                             INT32 *piRetVal,
                             PTRARRAYREF *stringArgs /*=NULL*/)
{
    STATIC_CONTRACT_THROWS;
    _ASSERTE(piRetVal);
    DWORD cCommandArgs = 0; // count of args on command line
    DWORD arg = 0;
    LPWSTR *wzArgs = NULL; // command line args
    HRESULT hr = S_OK;
    *piRetVal = -1;
    // The exit code for the process is communicated in one of two ways. If the
    // entrypoint returns an 'int' we take that. Otherwise we take a latched
    // process exit code. This can be modified by the app via setting
    // Environment's ExitCode property.
    if (stringArgs == NULL)
        SetLatchedExitCode(0);
    if (!pFD) {
        _ASSERTE(!"Must have a function to call!");
        return E_FAIL;
    }
    CorEntryPointType EntryType = EntryManagedMain;
    ValidateMainMethod(pFD, &EntryType);
    if ((EntryType == EntryManagedMain) &&
        (stringArgs == NULL)) {
        // If you look at the DIFF on this code then you will see a major change which is that we
        // no longer accept all the different types of data arguments to main. We now only accept
        // an array of strings.
        wzArgs = CorCommandLine::GetArgvW(&cCommandArgs);
        // In the WindowsCE case where the app has additional args the count will come back zero.
        if (cCommandArgs > 0) {
            if (!wzArgs)
                return E_INVALIDARG;
        }
    }
    ETWTraceStartup::TraceEvent(ETW_TYPE_STARTUP_MAIN);
    TIMELINE_START(STARTUP, ("RunMain"));
    EX_TRY_NOCATCH
    {
        MethodDescCallSite threadStart(pFD);
        PTRARRAYREF StrArgArray = NULL;
        GCPROTECT_BEGIN(StrArgArray);
        // Build the parameter array and invoke the method.
        if (EntryType == EntryManagedMain) {
            if (stringArgs == NULL) {
                // Allocate a COM Array object with enough slots for cCommandArgs - 1
                StrArgArray = (PTRARRAYREF) AllocateObjectArray((cCommandArgs - numSkipArgs), g_pStringClass);
                // Create Stringrefs for each of the args
                for( arg = numSkipArgs; arg < cCommandArgs; arg++) {
                    STRINGREF sref = COMString::NewString(wzArgs[arg]);
                    StrArgArray->SetAt(arg-numSkipArgs, (OBJECTREF) sref);
                }
            }
            else
                StrArgArray = *stringArgs;
        }
#ifdef STRESS_THREAD
        OBJECTHANDLE argHandle = (StrArgArray != NULL) ? CreateGlobalStrongHandle (StrArgArray) : NULL;
        Stress_Thread_Param Param = {pFD, argHandle, numSkipArgs, EntryType, 0};
        Stress_Thread_Start (&Param);
#endif
        ARG_SLOT stackVar = ObjToArgSlot(StrArgArray);
        if (pFD->IsVoid())
        {
            // Set the return value to 0 instead of returning random junk
            *piRetVal = 0;
            threadStart.Call(&stackVar);
        }
        else
        {
            *piRetVal = (INT32)threadStart.Call_RetArgSlot(&stackVar);
            if (stringArgs == NULL)
            {
                SetLatchedExitCode(*piRetVal);
            }
        }
        GCPROTECT_END();
        fflush(stdout);
        fflush(stderr);
    }
    EX_END_NOCATCH
    ETWTraceStartup::TraceEvent(ETW_TYPE_STARTUP_MAIN+1);
    TIMELINE_END(STARTUP, ("RunMain"));
    return hr;
}
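Two things in the listing above are easy to restate in plain standard C++: the argument array is built by skipping the first numSkipArgs command-line entries, and the exit code is 0 for a void entry method or the returned value for an int one. The sketch below covers only those two points. `Args`, `EntryPoint`, `buildMainArgs` and `runMainSketch` are hypothetical names, ordinary wide strings replace the managed string array, and the bounds check in `buildMainArgs` is an addition of this sketch that the original listing does not have.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

using Args = std::vector<std::wstring>;

// Drop the first numSkipArgs entries of the command line (the listing is
// called with numSkipArgs == 1). Returns an empty list when there is
// nothing left; this guard is an assumption added by the sketch.
Args buildMainArgs(const Args& commandLine, std::size_t numSkipArgs) {
    if (numSkipArgs >= commandLine.size()) {
        return {};
    }
    return Args(commandLine.begin() + static_cast<std::ptrdiff_t>(numSkipArgs),
                commandLine.end());
}

// Hypothetical entry-point wrapper: exactly one of the two callables is set,
// mirroring the void and int flavours of Main handled by the listing.
struct EntryPoint {
    std::function<void(const Args&)> voidMain;
    std::function<int(const Args&)> intMain;
};

// Build the arguments, invoke the entry method, and produce the exit code:
// 0 for a void Main, the returned value for an int Main.
int runMainSketch(const EntryPoint& entry, const Args& commandLine,
                  std::size_t numSkipArgs) {
    assert(static_cast<bool>(entry.voidMain) != static_cast<bool>(entry.intMain));
    const Args args = buildMainArgs(commandLine, numSkipArgs);
    if (entry.voidMain) {
        entry.voidMain(args);
        return 0;  // report 0 instead of an undefined value
    }
    return entry.intMain(args);
}
```

With a command line of `app.exe one two` and numSkipArgs of 1, the entry method sees only `one` and `two`. The latched exit code, GC protection and exception handling of the original are deliberately left out.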
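ClassLoader::RunMain mostly does the final preparation before the method runs, and then runs it. There are two cases: an entry method with a return value, and a void one. What happens after that goes deep into the core of the framework. I will write about it another day, once I have read it. The code uses many definitions from COM, which shows how closely .NET and COM are related. The Debugger and Profiler under .NET, for instance, are even written directly against COM interfaces. I do not know COM well, though, so I cannot go deeper on this point.

btw: I have posted a few .NET articles on Kanxue (看雪), mainly because there are few such articles there and not many people studying the topic. If you are interested in learning the internals of .NET together, feel free to get in touch.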
Last edited by tankaiha on 2006-09-11 20:25