Reposted from: http://www.codeguru.com/cpp/v-s/debug/openfaq/article.php/c14799/Function-Calls-Part-3-Frame-Pointer-and-Local-Variables.htm
In this article, you will look at local variables and how they come into play.
Typical functions have the following: code, arguments, local variables, and return values. Although the code for the function is compiled in and is unchangeable at runtime, the rest are all dynamic entities and as such are all occupants of the stack for a thread.
The data required by a function could be:
A stack frame is a block of the stack that contains the data required by a function. Each function call results in a new block of stack memory being set aside and prepared so that the function can do its job. Stack frame creation is a process in which both caller and callee take part. The caller starts it by pushing the function arguments (note that only the caller knows what to pass in), followed by a call into the function. From this point, the callee code takes over. It further extends the stack frame and prepares the local variables (note that only the callee knows what variables to use). From that point onwards, the stack frame is set up for the function body to operate on.
Each function call results in the creation of a stack frame (the minimum being the address to return to). So, if funcA calls funcB and funcB calls funcC, three stack frames are set up, one on top of another. When a function returns, its frame becomes invalid. A well-behaved function acts only on its own stack frame and does not trespass on another's. Once it goes outside its legal bounds, unpredictable and disastrous behavior can occur. Note that when funcA calls into, say, funcB, the stack frame for funcA is frozen: it doesn't grow or shrink anymore because control has been passed to funcB (again, assuming funcB is well behaved and uses only its OWN stack frame).
However, in the context of funcB, the stack frame can be dynamic; it can grow and shrink depending on what is happening inside the function. This means the topmost stack frame is dynamic in nature and may contract and expand as the function executes. In any case, it can contract no further than the position in the stack where it was when the function body started. If it shrinks any further, it is trespassing on the caller-established stack frame and the consequences can be bad.
This is pictorially shown below. The scenario is funcA calling into funcB which in turn calls into funcC.
To summarize, stack frames are set up to serve the function bodies. A call to a function is preceded by the caller code pushing the parameters onto the stack, followed by the callee code preparing the stack for its local variables. When the callee code returns, it has to make sure the stack pointer is back at the value it had when the callee code was first entered. This ensures the stack frame is restored.
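The division of work summarized above can be sketched with a small simulated stack. This is only a model, not real compiler output: `SimStack`, `sim_push`, `sim_pop`, `callee_is_balanced`, and `call_round_trip` are hypothetical names, the return address is made up, and the sketch assumes the caller removes the arguments after the call (cdecl-style cleanup).

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_BYTES = 256 };   /* size of the simulated stack (arbitrary) */

/* Hypothetical model: a tiny stack that grows toward lower addresses. */
typedef struct {
    uint32_t mem[STACK_BYTES / 4];
    uint32_t esp;             /* byte offset of the current top of stack */
} SimStack;

static void sim_push(SimStack *s, uint32_t value) {
    assert(s->esp >= 4);      /* guard against overflowing the model */
    s->esp -= 4;
    s->mem[s->esp / 4] = value;
}

static uint32_t sim_pop(SimStack *s) {
    assert(s->esp + 4 <= STACK_BYTES);
    uint32_t value = s->mem[s->esp / 4];
    s->esp += 4;
    return value;
}

/* Callee side: reserve space for locals, then release it again.
   Returns 1 if ESP just before returning equals ESP at entry. */
static int callee_is_balanced(SimStack *s, uint32_t local_bytes) {
    uint32_t esp_at_entry = s->esp;   /* ESP points at the return address */
    assert(s->esp >= local_bytes);
    s->esp -= local_bytes;            /* callee makes room for its locals */
    /* ... the function body would run here ... */
    s->esp += local_bytes;            /* callee releases its locals */
    return s->esp == esp_at_entry;
}

/* Caller side: push arguments, "call", then clean up.
   Returns 1 if the whole round trip leaves the stack as it found it. */
int call_round_trip(uint32_t arg1, uint32_t arg2, uint32_t local_bytes) {
    SimStack s;
    s.esp = STACK_BYTES;              /* empty stack */
    uint32_t esp_before_call = s.esp;

    sim_push(&s, arg2);               /* caller pushes the arguments */
    sim_push(&s, arg1);
    sim_push(&s, 0x00401000u);        /* `call` pushes a made-up return address */

    int balanced = callee_is_balanced(&s, local_bytes);

    uint32_t return_address = sim_pop(&s);  /* `ret` pops the return address */
    s.esp += 8;                       /* assumed cdecl-style argument cleanup */

    return balanced && return_address == 0x00401000u
                    && s.esp == esp_before_call;
}
```

Calling `call_round_trip(9, 10, 8)` models a function with two int arguments and two int locals; passing 0 for `local_bytes` models a function with no locals.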
In Parts 1 and 2, you saw how function arguments are passed. These are passed via the stack. To be precise, the caller pushes them on the stack and then issues a call instruction with the address of the function to be called. If the stack contents are checked right before the function code is executed, you see that the topmost item on the stack is the return address, followed by the function arguments. That was the data from the caller. How about callee data? Specifically, how about the function's local variables? These live exactly as long as the function body executes, so they are perfect candidates to be allocated on the stack. This is in fact the case, and the compiler DOES allocate local variable space right on the stack. This is one of the first things the callee code generated by the compiler will do. It will allocate enough space to accommodate all local variables within a function body. The following figure shows this setup. Note the division of responsibility for stack frame setup between different parts of the code.
The picture above is a representation of the stack frame setup for a function call. It is interesting to note that the return address location acts as a kind of anchor point for accessing function data. To access the arguments, the function body has to traverse down the stack (higher addresses) from the location where the return address is stored, and to access the local variables, the function body has to traverse up the stack (lower addresses) relative to that location. In fact, typical compiler-generated code for the function does exactly this. The compiler dedicates a register called EBP (Base Pointer) to this purpose. Another name for it is the frame pointer. As the first thing in the function body, the compiler typically pushes the current EBP value onto the stack and sets EBP to the current ESP. Once this is done, in any part of the function code, argument 1 is at EBP+8, argument 2 is at EBP+12 (decimal), and local variables are at EBP-n.
If frame pointers, in other words a dedicated EBP, are used, there is an interesting side effect. Note that the first thing compiler-generated code does on entering a function is to push the current EBP on the stack. The EBP value that was pushed is in fact the frame pointer of the calling function. Once this is pushed, EBP is set to the frame pointer for the called function. So, if you stop execution in a function and check the EBP value, it will be the frame pointer of that function. If you check the contents of the frame pointer location, the contents are actually the frame pointer for the previous function (in other words, the calling function). And if you take the value in that location and check what it points to, it will be the frame pointer of the function that called it, and so on. So, by using the EBP values pushed on the stack, you can traverse through all the frames step by step. Interesting, isn't it? So, at any given time, you can trace back and see what all the frames were, which indirectly means you know what parameters were passed to each of the functions before execution stopped where you are. Take some time to digest this.
Now, wouldn't it be cool if you could, in addition to what parameters were passed, know the functions that were called? I mean, what good is knowing the parameters if you don't know what functions they are meant for? Well, it turns out that information is available too. Referring to the figure above, you see that the return address into the caller is stored 4 bytes above the frame pointer. Bingo. You can construct the whole story now. If you stop execution at any time (say, in funcC), you know from the current EBP what function called you (because the return address into funcB was pushed at EBP+4). You also know where the stack frame of the function that called you is (in other words, the frame pointer of funcB is stored at EBP+0). From this, you know the return address that your caller in turn has to return to (in other words, you know that funcB has to return to funcA). From the stack frame of funcB, you know what parameters were passed into funcB, and so on. All this is pictorially shown below:
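To make the offsets concrete, here is a minimal sketch that builds one frame in a simulated stack and reads slots relative to EBP. The names `SimCpu`, `build_frame`, and `frame_slot` are hypothetical, the return address is made up, and the sketch assumes the arguments are pushed right to left (which is what puts argument 1 at EBP+8).

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_BYTES = 256 };   /* size of the simulated stack (arbitrary) */

/* Hypothetical model of the two registers and the stack memory. */
typedef struct {
    uint32_t mem[STACK_BYTES / 4];
    uint32_t esp;             /* simulated stack pointer (byte offset) */
    uint32_t ebp;             /* simulated frame pointer (byte offset) */
} SimCpu;

static void push32(SimCpu *c, uint32_t value) {
    assert(c->esp >= 4);
    c->esp -= 4;
    c->mem[c->esp / 4] = value;
}

/* Read the 32-bit slot at [ebp + offset]; offset may be negative. */
uint32_t frame_slot(const SimCpu *c, int32_t offset) {
    uint32_t addr = (uint32_t)((int32_t)c->ebp + offset);
    assert(addr < STACK_BYTES && addr % 4 == 0);
    return c->mem[addr / 4];
}

/* Build a frame for a function with two arguments and two int locals. */
void build_frame(SimCpu *c, uint32_t arg1, uint32_t arg2) {
    c->esp = STACK_BYTES;         /* empty stack */
    c->ebp = 0;                   /* pretend the caller had no frame */

    push32(c, arg2);              /* caller: push arguments right to left */
    push32(c, arg1);
    push32(c, 0x00401234u);       /* call: made-up return address */

    push32(c, c->ebp);            /* callee: push ebp */
    c->ebp = c->esp;              /* callee: mov ebp, esp */
    c->esp -= 8;                  /* callee: sub esp, 8 (two int locals) */

    c->mem[(c->ebp - 4) / 4] = 7; /* first local lives at [ebp-4] */
    c->mem[(c->ebp - 8) / 4] = 8; /* second local lives at [ebp-8] */
}
```

After `build_frame(&c, 9, 10)`, `frame_slot(&c, 8)` reads argument 1, `frame_slot(&c, 12)` reads argument 2, `frame_slot(&c, 4)` reads the return address, and `frame_slot(&c, -4)` and `frame_slot(&c, -8)` read the two locals.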
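The walk described above can be sketched against a simulated stack. This is an illustration under stated assumptions rather than how any particular debugger does it: `SimCpu`, `sim_reset`, `enter_function`, and `walk_frames` are hypothetical names, the return addresses are made up, and a saved EBP of 0 is assumed to mark the outermost frame.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { STACK_BYTES = 256 };   /* size of the simulated stack (arbitrary) */

typedef struct {
    uint32_t mem[STACK_BYTES / 4];
    uint32_t esp;             /* simulated stack pointer (byte offset) */
    uint32_t ebp;             /* simulated frame pointer (byte offset) */
} SimCpu;

static void push32(SimCpu *c, uint32_t value) {
    assert(c->esp >= 4);
    c->esp -= 4;
    c->mem[c->esp / 4] = value;
}

void sim_reset(SimCpu *c) {
    c->esp = STACK_BYTES;     /* empty stack */
    c->ebp = 0;               /* 0 = "no caller frame" (sketch assumption) */
}

/* Model a `call` followed by the callee's frame-pointer setup. */
void enter_function(SimCpu *c, uint32_t return_address) {
    push32(c, return_address);    /* call: push return address */
    push32(c, c->ebp);            /* push ebp (caller's frame pointer) */
    c->ebp = c->esp;              /* mov ebp, esp */
}

/* Follow the chain of saved EBP values, innermost frame first.
   Stores each frame's return address ([ebp+4]) in out[] and
   returns how many frames were visited. */
size_t walk_frames(const SimCpu *c, uint32_t *out, size_t max_frames) {
    size_t count = 0;
    uint32_t fp = c->ebp;
    while (fp != 0 && count < max_frames) {
        assert(fp + 4 < STACK_BYTES);
        out[count++] = c->mem[(fp + 4) / 4];  /* return address at [fp+4] */
        fp = c->mem[fp / 4];                  /* caller's frame at [fp+0] */
    }
    return count;
}
```

Entering three functions in a row and then calling `walk_frames` yields the return addresses in reverse call order, which is the information a debugger needs to start building a call stack display.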
Okay. It's time to get started on some hands-on work.
[step1.png]
[frmsetup.png]
In the picture above, note how the EBP value is immediately pushed onto the stack on function entry for all three functions (circled in red). The value being pushed is the frame pointer of the previous function; the current function saves it away before touching EBP. The frame pointer is then set to the current stack position by the second instruction. Similarly, just before issuing the ret instruction, the function body pops the frame pointer that it saved on entry. So, when the function returns to the caller, the EBP value for the caller is back to what it was.
[locvar.png]
Note how the stack is now expanded by 8 bytes to make room for the local variables. Each int on a 32-bit system occupies 4 bytes of memory. Because funcC and funcB each have 2 local int variables, you see the code generated to make space for them before any of the function's own code begins to execute. This code, however, is absent from the funcA body! Well, it doesn't have any local variables, so that makes perfect sense.
[frmptr.png]
Note how from now on, EBP is used as a frame of reference to get to each local variable; in other words, the first local variable is at EBP-4 and the second one is at EBP-8. If funcC contained code like the following, which accesses its arguments, the disassembly would include the two mov instructions listed after the source:
- void funcC(int c1, int c2)
- {
- int ca1 = 7;
- int ca2 = 8;
- c1 = 9;
- c2 = 10;
- return;
- }
- 10: c1 = 9;
- 00401014 C7 45 08 09 00 00 00 mov dword ptr [ebp+8],9
- 11: c2 = 10;
- 0040101B C7 45 0C 0A 00 00 00 mov dword ptr [ebp+0Ch],0Ah
Note how this time EBP+n values are used to access the data on the other side of the frame pointer. This demonstrates how EBP acts as the anchor point for accessing both the local variables and the function arguments.
[restore.png]
This code restores ESP to where it was before the function was entered; in other words, it simply undoes the stack expansion that was done to accommodate the local variables. Again, this piece of code is missing for funcA because it does not have any local variables.
[calls.png]
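The entry and exit sequences mirror each other, which a small simulation can make explicit. This is a sketch only: `SimCpu`, `prologue`, and `epilogue` are hypothetical names, and the `mov esp, ebp` / `pop ebp` pair in the comments is one common form of the exit sequence, assumed here rather than taken from the screenshots.

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_BYTES = 256 };   /* size of the simulated stack (arbitrary) */

typedef struct {
    uint32_t mem[STACK_BYTES / 4];
    uint32_t esp;             /* simulated stack pointer (byte offset) */
    uint32_t ebp;             /* simulated frame pointer (byte offset) */
} SimCpu;

static void push32(SimCpu *c, uint32_t value) {
    assert(c->esp >= 4);
    c->esp -= 4;
    c->mem[c->esp / 4] = value;
}

static uint32_t pop32(SimCpu *c) {
    assert(c->esp + 4 <= STACK_BYTES);
    uint32_t value = c->mem[c->esp / 4];
    c->esp += 4;
    return value;
}

/* Function entry: save the caller's frame pointer, establish our own,
   then reserve local_bytes for locals (0 for a function without locals). */
void prologue(SimCpu *c, uint32_t local_bytes) {
    push32(c, c->ebp);        /* push ebp */
    c->ebp = c->esp;          /* mov ebp, esp */
    assert(c->esp >= local_bytes);
    c->esp -= local_bytes;    /* sub esp, local_bytes */
}

/* Function exit: discard the locals and restore the caller's frame pointer. */
void epilogue(SimCpu *c) {
    c->esp = c->ebp;          /* mov esp, ebp (undo the local-variable space) */
    c->ebp = pop32(c);        /* pop ebp (caller's frame pointer is back) */
}
```

Whatever values ESP and EBP hold before `prologue`, they hold the same values again after `epilogue`, regardless of how many bytes of locals were reserved in between.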
[stksnap.png]
To summarize, this is what you have learned so far.