3.5. Coding Examples

This section describes example code sequences for fundamental operations such as calling functions, accessing static objects, and transferring control from one part of a program to another. Previous sections discussed how a program may use the machine or the operating system, and they specified what a program may and may not assume about the execution environment. Unlike previous material, the information in this section illustrates how operations may be done, not how they must be done.

As before, examples use the ANSI C language. Other programming languages may use the same conventions displayed below, but failure to do so does not prevent a program from conforming to the ABI.

64-bit PowerPC code is normally position independent. That is, the code is not tied to a specific load address, and may be executed properly at various positions in virtual memory. Although it is possible to write position dependent code on the 64-bit PowerPC, these code examples only show position independent code.

Note: The examples below show code fragments with various simplifications. They are intended to explain addressing modes, not to show optimal code sequences or to reproduce compiler output.

3.5.1. Code Model Overview

When the system creates a process image, the executable file portion of the process has fixed addresses and the system chooses shared object library virtual addresses to avoid conflicts with other segments in the process. To maximize text sharing, shared objects conventionally use position-independent code, in which instructions contain no absolute addresses. Shared object text segments can be loaded at various virtual addresses without having to change the segment images. Thus multiple processes can share a single shared object text segment, even if the segment resides at a different virtual address in each process.

Position-independent code relies on two techniques:

Because the 64-bit PowerPC Architecture provides EA-relative branch instructions and also branch instructions using registers that hold the transfer address, compilers can satisfy the first condition easily.

A "Global Offset Table," or GOT, provides information for address calculation. Position independent object files (executable and shared object files) have a table in their data segment that holds addresses. When the system creates the memory image for an object file, the table entries are relocated to reflect the absolute virtual address as assigned for an individual process. Because data segments are private for each process, the table entries can change--unlike text segments, which multiple processes share.

3.5.2. The TOC section

ELF processor-specific supplements normally define a GOT ("Global Offset Table") section used to hold addresses for position independent code. Some ELF processor-specific supplements, including the 32-bit PowerPC Processor Supplement, define a small data section. The same register is sometimes used to address both the GOT and the small data section.

The 64-bit PowerOpen ABI defines a TOC ("Table of Contents") section. The TOC combines the functions of the GOT and the small data section.

This ABI uses the term TOC. The TOC section defined here is intended to be similar to that defined by the 64-bit PowerOpen ABI. The TOC section contains a conventional ELF GOT, and may optionally contain a small data area. The GOT and the small data area may be intermingled in the TOC section.

The TOC section is accessed via the dedicated TOC pointer register, r2. Accesses are normally made using the register indirect with immediate index mode supported by the 64-bit PowerPC processor, which limits a single TOC section to 65,536 bytes, enough for 8,192 GOT entries.

The value of the TOC pointer register is called the TOC base. The TOC base is typically the first address in the TOC plus 0x8000, thus permitting a full 64 Kbyte TOC.

A relocatable object file must have a single TOC section and a single TOC base. However, when the link editor combines relocatable object files to form a single executable or shared object, it may create multiple TOC sections. The link editor is responsible for deciding how to associate TOC sections with object files. Normally the link editor will only create multiple TOC sections if it has more than 65,536 bytes to store in a TOC.

All link editors which support this ABI must support a single TOC section, but support for multiple TOC sections is optional.

Each shared object will have a separate TOC or TOCs.

Note: This ABI does not actually restrict the size of a TOC section. It is permissible to use a larger TOC section, if code uses a different addressing mode to access it. The AIX link editor, in particular, does not support multiple TOC sections, but instead inserts call out code at link time to support larger TOC sections.

3.5.3. TOC Assembly Language Syntax

Desire for compatibility with both ELF systems and PowerOpen systems suggests two different assembly language syntaxes to be used when referring to the TOC section. This syntax is not part of the official ABI. The description here is only for information purposes. Particular assemblers may support both syntaxes, only one, or neither.

The ELF syntax uses @got and @toc. The syntax SYMBOL@got refers to the offset in the TOC at which the value of SYMBOL (that is, the address of the variable whose name is SYMBOL) is stored, assuming the offset is no larger than 16 bits. For example,

ld   r3,x@got(r2)

SYMBOL@got will be an offset within the global offset table, which as noted above, forms part of the TOC section.

Ordinarily the link editor will avoid having a TOC, and hence a GOT, larger than 64 Kbytes, perhaps by support multiple TOC sections, or via some other technique. However, for flexibility, there is a syntax for 32 bit offsets to the GOT. The syntaxes SYMBOL@got@ha, SYMBOL@got@h, and SYMBOL@got@l refer to the high adjusted, high, and low parts of the GOT offset. (The meaning of ``high adjusted'' is explained in Section 4.5.1).

The syntax SYMBOL@toc refers to the value (SYMBOL - base (TOC)), where base (TOC) represents the TOC base for the current object file. This provides the address of the variable whose name is SYMBOL, as an offset from the TOC base. This assumes that the variable may be found within the TOC, and that its offset is no larger than 16 bits.

As with the GOT, the syntaxes SYMBOL@toc@ha, SYMBOL@toc@h, and SYMBOL@toc@l refer to the high adjusted, high, and low parts of the TOC offset.

The syntax SYMBOL@got@plt may be used to refer to the offset in the TOC of a procedure linkage table entry stored in the global offset table. The corresponding syntaxes SYMBOL@got@plt@ha, SYMBOL@got@plt@h, and SYMBOL@got@plt@l are also defined.

Note: Note that if X is a variable stored in the TOC, then X@got will be the offset within the TOC of a doubleword whose value is X@toc.

The special symbol .TOC.@tocbase is used to represent the TOC base for the current object file. The following might appear in a function descriptor definition:

      .quad .TOC.@tocbase

The PowerOpen syntax is more complex. It is derived from the different representation of the TOC section in XCOFF.

Assembly code first uses the .toc pseudo-op to enter the TOC section. It then uses a label to name a particular element. It then uses the .tc pseudo-op to indicate which GOT entry it wishes to name. Later in the code, the label is used with the TOC register to load the address. For example:

      .toc
  .L1:
      .tc  x[TC],x
      ...
      ld   r3,.L1(r2)

This creates a GOT entry for the variable x, and names that entry .L1 for the remainder of the assembly. The effect is the same as the single ELF-style instruction above.

The special value TOC[tc0] is used to represent the TOC base for the current object file:

      .quad TOC[tc0]

The PowerOpen syntax permits other data to be stored in the .toc section. The assembler will output this data in a .toc section, and convert references as though its address were specified with @toc rather than @got.

There is a significant difference in representation of the TOC in this ABI and in the 64-bit PowerOpen ABI. Relocatable object files created using the 64-bit PowerOpen ABI have a .toc section which contains real data. The link editor uses garbage collection to discard duplicate information including in particular TOC entries which refer to the same variable. In this ABI, relocatable object files do not contain .got sections holding real data. Instead, the GOT is created by the link editor based on relocations created by @got references. This ABI does not require the link editor to support garbage collection. This ABI does permit real data to exist in .toc sections, but this data will never be referred to directly by instructions which use @got references. @got references always refer to the GOT which is created by the link editor when creating an executable or a shared object.

3.5.4. Function Prologue and Epilogue

This section describes functions' prologue and epilogue code. A function's prologue establishes a stack frame, if necessary, and may save any nonvolatile registers it uses. A function's epilogue generally restores registers that were saved in the prologue code, restores the previous stack frame, and returns to the caller. Except for the rules below, this ABI does not mandate predetermined code sequences for function prologues and epilogues. However, the following rules, which permit reliable call chain backtracing, shall be followed:

In-line code may be used to save or restore nonvolatile general or floating-point registers that the function uses. However, if there are many registers to be saved or restored, it may be more efficient to call one of the system subroutines described below.

3.5.5. Register Saving and Restoring Functions

The register saving and restoring functions described in this section use nonstandard calling conventions which ordinarily require them to be statically linked into any executable or shared object modules in which they are used. Nevertheless, unlike 32-bit PowerPC ELF, these functions are considered part of the official ABI. In particular, the link editor is permitted to treat calls to these functions specially, such as by changing a call to one of these function into a call to an absolute address as in the PowerOpen ABI.

As shown in The Stack Frame section above, the general register save area is not at a fixed offset from either the caller's SP or the callee's SP. The floating point register save area starts at a fixed position from the caller's SP on entry to the callee, but the position of the general register save area depends upon the number of floating point registers to be saved. Thus it is impossible to write a general register saving routine which uses fixed offsets from the SP.

If the routine needs to save both general and floating point registers, code can use r12 as the pointer for saving and restoring the general purpose registers. (r12 is a volatile register but does not contain input parameters). This leads to the definition of multiple register save and restore routines, each of which saves or restores M floating point registers and N general registers.

3.5.6. Saving General Registers Only

For a function that saves/restores N general registers and no floating point registers, the saving can be done using individual store/load instructions or by calling system provided routines as shown below.

In the following, the number of registers being saved is N, and <32-N> is the first register number to be saved/restored. All registers from <32-N> up to 31, inclusive, are saved/restored.

FRAME_SIZE is the size of the stack frame, here assumed to be less than 32 Kbytes.

    mflr  r0                    # Move LR into r0
    bl    _savegpr0_<32-N>      # Call routine to save general registers
    stdu  r1,(-FRAME_SIZE)(r1)  # Create stack frame
    ...
    (save CR if necessary)
    ...                         # Body of function
    ...
    (reload CR if necessary)
    ...
    (reload caller's SP into r1)
    b     _restgpr0_<32-N>      # Restore registers and return

3.5.7. Saving General Registers and Floating Point Registers

For a function that saves/restores N general registers and M floating point registers, the saving can be done using individual store/load instructions or by calling system provided routines as shown below.

    mflr  r0                    # Move LR into r0
    subi  r12,r1,8*M            # Set r12 to general reg save area
    bl    _savegpr1_<32-N>      # Call routine to save general registers
    bl    _savefpr_<32-M>       # Call routine to save floating point regs
    stdu  r1,(-FRAME_SIZE)(r1)  # Create stack frame
    ...
    (save CR if necessary)
    ...                         # Body of function
    ...
    (reload CR if necessary)
    ...
    (reload caller's SP into r1)
    subi  r12,r1,8*M            # Set r12 to general reg save area
    bl    _restgpr1_<32-N>      # Restore general registers
    b     _restfpr_<32-M>       # Restore floating point regs and return

3.5.8. Saving Floating Point Registers Only

For a function that saves/restores M floating point registers and no general registers, the saving can be done using individual store/load instructions or by calling system provided routines as shown below.

    mflr  r0                    # Move LR into r0
    bl    _savefpr_<32-M>       # Call routine to save general registers
    stdu  r1,(-FRAME_SIZE)(r1)  # Create stack frame
    ...
    (save CR if necessary)
    ...                         # Body of function
    ...
    (reload CR if necessary)
    ...
    (reload caller's SP into r1)
    b     _restgpr_<32-M>       # Restore registers and return

3.5.9. Save and Restore Services

Systems must provide three sets of routines, which may be implemented as multiple entry point routines or as individual routines. They must adhere to the following rules.

Each _savegpr0_N routine saves the general registers from rN to r31, inclusive. Each routine also saves the LR. When the routine is called, r1 must point to the start of the general register save area, and r0 must contain the value of LR on function entry.

The _restgpr0_N routines restore the general registers from rN to r31, and then return to the caller. When the routine is called, r1 must point to the start of the general register save area.

Here is a sample implementation of _savegpr0_N and _restgpr0_N.

  _savegpr0_14:  std  r14,-144(r1)
  _savegpr0_15:  std  r15,-136(r1)
  _savegpr0_16:  std  r16,-128(r1)
  _savegpr0_17:  std  r17,-120(r1)
  _savegpr0_18:  std  r18,-112(r1)
  _savegpr0_19:  std  r19,-104(r1)
  _savegpr0_20:  std  r20,-96(r1)
  _savegpr0_21:  std  r21,-88(r1)
  _savegpr0_22:  std  r22,-80(r1)
  _savegpr0_23:  std  r23,-72(r1)
  _savegpr0_24:  std  r24,-64(r1)
  _savegpr0_25:  std  r25,-56(r1)
  _savegpr0_26:  std  r26,-48(r1)
  _savegpr0_27:  std  r27,-40(r1)
  _savegpr0_28:  std  r28,-32(r1)
  _savegpr0_29:  std  r29,-24(r1)
  _savegpr0_30:  std  r30,-16(r1)
  _savegpr0_31:  std  r31,-8(r1)
                 std  r0, 16(r1)
                 blr


  _restgpr0_14:  ld   r14,-144(r1)
  _restgpr0_15:  ld   r15,-136(r1)
  _restgpr0_16:  ld   r16,-128(r1)
  _restgpr0_17:  ld   r17,-120(r1)
  _restgpr0_18:  ld   r18,-112(r1)
  _restgpr0_19:  ld   r19,-104(r1)
  _restgpr0_20:  ld   r20,-96(r1)
  _restgpr0_21:  ld   r21,-88(r1)
  _restgpr0_22:  ld   r22,-80(r1)
  _restgpr0_23:  ld   r23,-72(r1)
  _restgpr0_24:  ld   r24,-64(r1)
  _restgpr0_25:  ld   r25,-56(r1)
  _restgpr0_26:  ld   r26,-48(r1)
  _restgpr0_27:  ld   r27,-40(r1)
  _restgpr0_28:  ld   r28,-32(r1)
  _restgpr0_29:  ld   r0, 16(r1)
                 ld   r29,-24(r1)
                 mtlr r0
                 ld   r30,-16(r1)
                 ld   r31,-8(r1)
                 blr
  _restgpr0_30:  ld   r30,-16(r1)
  _restgpr0_31:  ld   r0, 16(r1)
                 ld   r31,-8(r1)
                 mtlr r0
                 blr

Each _savegpr1_N routine saves the general registers from rN to r31, inclusive. When the routine is called, r12 must point to the start of the general register save area.

The _restgpr1_N routines restore the general registers from rN to r31. When the routine is called, r12 must point to the start of the general register save area.

Here is a sample implementation of _savegpr1_N and _restgpr1_N.

  _savegpr1_14:  std  r14,-144(r12)
  _savegpr1_15:  std  r15,-136(r12)
  _savegpr1_16:  std  r16,-128(r12)
  _savegpr1_17:  std  r17,-120(r12)
  _savegpr1_18:  std  r18,-112(r12)
  _savegpr1_19:  std  r19,-104(r12)
  _savegpr1_20:  std  r20,-96(r12)
  _savegpr1_21:  std  r21,-88(r12)
  _savegpr1_22:  std  r22,-80(r12)
  _savegpr1_23:  std  r23,-72(r12)
  _savegpr1_24:  std  r24,-64(r12)
  _savegpr1_25:  std  r25,-56(r12)
  _savegpr1_26:  std  r26,-48(r12)
  _savegpr1_27:  std  r27,-40(r12)
  _savegpr1_28:  std  r28,-32(r12)
  _savegpr1_29:  std  r29,-24(r12)
  _savegpr1_30:  std  r30,-16(r12)
  _savegpr1_31:  std  r31,-8(r12)
                 blr


  _restgpr1_14:  ld   r14,-144(r12)
  _restgpr1_15:  ld   r15,-136(r12)
  _restgpr1_16:  ld   r16,-128(r12)
  _restgpr1_17:  ld   r17,-120(r12)
  _restgpr1_18:  ld   r18,-112(r12)
  _restgpr1_19:  ld   r19,-104(r12)
  _restgpr1_20:  ld   r20,-96(r12)
  _restgpr1_21:  ld   r21,-88(r12)
  _restgpr1_22:  ld   r22,-80(r12)
  _restgpr1_23:  ld   r23,-72(r12)
  _restgpr1_24:  ld   r24,-64(r12)
  _restgpr1_25:  ld   r25,-56(r12)
  _restgpr1_26:  ld   r26,-48(r12)
  _restgpr1_27:  ld   r27,-40(r12)
  _restgpr1_28:  ld   r28,-32(r12)
  _restgpr1_29:  ld   r29,-24(r12)
  _restgpr1_30:  ld   r30,-16(r12)
  _restgpr1_31:  ld   r31,-8(r12)
                 blr

Each _savefpr_M routine saves the floating point registers from fM to f31, inclusive. When the routine is called, r1 must point to the start of the floating point register save area, and r0 must contain the value of LR on function entry.

The _restfpr_M routines restore the floating point registers from fM to f31. When the routine is called, r1 must point to the start of the floating point register save area.

Here is a sample implementation of _savepr_M and _restfpr_M.

  _savefpr_14:  stfd f14,-144(r1)
  _savefpr_15:  stfd f15,-136(r1)
  _savefpr_16:  stfd f16,-128(r1)
  _savefpr_17:  stfd f17,-120(r1)
  _savefpr_18:  stfd f18,-112(r1)
  _savefpr_19:  stfd f19,-104(r1)
  _savefpr_20:  stfd f20,-96(r1)
  _savefpr_21:  stfd f21,-88(r1)
  _savefpr_22:  stfd f22,-80(r1)
  _savefpr_23:  stfd f23,-72(r1)
  _savefpr_24:  stfd f24,-64(r1)
  _savefpr_25:  stfd f25,-56(r1)
  _savefpr_26:  stfd f26,-48(r1)
  _savefpr_27:  stfd f27,-40(r1)
  _savefpr_28:  stfd f28,-32(r1)
  _savefpr_29:  stfd f29,-24(r1)
  _savefpr_30:  stfd f30,-16(r1)
  _savefpr_31:  stfd f31,-8(r1)
                std  r0, 16(r1)
                blr


  _restfpr_14:  lfd  f14,-144(r1)
  _restfpr_15:  lfd  f15,-136(r1)
  _restfpr_16:  lfd  f16,-128(r1)
  _restfpr_17:  lfd  f17,-120(r1)
  _restfpr_18:  lfd  f18,-112(r1)
  _restfpr_19:  lfd  f19,-104(r1)
  _restfpr_20:  lfd  f20,-96(r1)
  _restfpr_21:  lfd  f21,-88(r1)
  _restfpr_22:  lfd  f22,-80(r1)
  _restfpr_23:  lfd  f23,-72(r1)
  _restfpr_24:  lfd  f24,-64(r1)
  _restfpr_25:  lfd  f25,-56(r1)
  _restfpr_26:  lfd  f26,-48(r1)
  _restfpr_27:  lfd  f27,-40(r1)
  _restfpr_28:  lfd  f28,-32(r1)
  _restfpr_29:  lfd  f29,-24(r1)
  _restfpr_29:  ld   r0, 16(r1)
                lfd  f29,-24(r1)
                mtlr r0
                lfd  f30,-16(r1)
                lfd  f31,-8(r1)
                blr
  _restfpr_30:  lfd  f30,-16(r1)
  _restfpr_31:  ld   r0, 16(r1)
                lfd  f31,-8(r1)
                mtlr r0
                blr

3.5.10. Data Objects

This section describes only objects with static storage duration. It excludes stack-resident objects because programs always compute their virtual addresses relative to the stack or frame pointers.

In the 64-bit PowerPC Architecture, only load and store instructions access memory. Because 64-bit PowerPC instructions cannot hold 64-bit addresses directly, a program normally computes an address into a register and accesses memory through the register.

It is possible to build addresses using absolute code which puts symbol addresses into instructions. However, the difficulty of building a 64-bit address means that 64-bit PowerPC code normally loads an address out of a memory location in the TOC section. Combining the TOC offset of the symbol with the TOC address in register r2 gives the absolute address of the TOC entry holding the desired address.

The following figures show sample assembly language equivalents to C language code. The @got syntax is explained above, in the section TOC Assembly Language Syntax.

Load and Store; variables are not in TOC:

C                             Assembly

extern int src;
extern int dst;
extern int *ptr;

dst = src;
                              ld  r6,src@got(r2)
                              ld  r7,dst@got(r2)
                              lwz r0,0(r6)
                              stw r0,0(r7)

ptr = &dst;
                              ld  r0,dst@got(r2)
                              ld  r7,ptr@got(r2)
                              std r0,0(r7)

*ptr = src;
                              ld  r6,src@got(r2)
                              ld  r7,ptr@got(r2)
                              lwz r0,0(r6)
                              ld  r7,0(r7)
                              stw r0,0(r7)

The next example shows the same code assuming that the variables are all stored in the TOC. Shared objects normally can not assume that globally visible variables are stored in the TOC. If they did, it would be impossible for the variable references to be redirected to overriding variables in the main program. Therefore, shared objects should normally always use the type of code shown above.

Load and Store; variables in TOC:

C                             Assembly

extern int src;
extern int dst;
extern int *ptr;

dst = src;
                              lwz r0,src@toc(r2)
                              stw r0,dst@toc(r2)

ptr = &dst;
                              la  r0,dst@toc(r2)
                              std r0,ptr@toc(r2)

*ptr = src;
                              lwz r0,src@toc(r2)
                              ld  r7,ptr@toc(r2)
                              stw r0,0(r7)

3.5.11. Function Calls

Programs use the 64-bit PowerPC bl instruction to make direct function calls. The bl instruction must be followed by a nop instruction. For PowerOpen compatibility, the nop instruction must be:

    ori  r0,r0,0

For PowerOpen compatibility, the link editor must also accept these instructions as valid nop instructions:

    cror 15,15,15
    cror 31,31,31

In a relocatable object file, a direct function call should be made to the function entry point, which is a symbol beginning with dot (.). See Section 3.2.5 for more information.

When the link editor is creating an executable or shared object, and it sees a function call followed by a nop instruction, it determines whether the caller and the callee share the same TOC. If they do, it leaves the nop instruction unchanged. If they do not, the link editor constructs a linkage function. The linkage function loads the TOC register with the callee TOC and branches to the callee entry point. The link editor modifies the bl instruction to branch to the linkage function, and modifies the nop instruction to be

    ld   r2,40(r1)

This will reload the TOC register from the TOC save area after the callee returns.

A bl instruction has a self-relative branch displacement that can reach 32 Mbytes in either direction. Hence, the use of a bl instruction to effect a call within an executable or shared object file limits the size of the executable or shared object file text segment.

If the callee is in a different shared object, a similar procedure of linkage code and a modified nop instruction is used. In this case, the dynamic linker must complete the link by filling in the function descriptor at run time. See Section 5.2.4 for more details.

Here is an example of the assembly code generated for a function call:

C                             Assembly

extern void func (void);
func ();
                              bl   .func
                              ori  r0,r0,0

Here is an example of how the link editor transforms this code if the
callee has a different TOC than the caller:

C                             Assembly

extern void func (void);
func ();
                              bl   <linkage_for_func>
                              ld   r2,40(r1)

Here is an example of the linkage code created by the link editor. Remember that func@got@plt contains the address of the procedure linkage entry for func, which is a function descriptor. The function descriptor holds the addresses of the function entry point and the function TOC base.

<linkage_for_func>:
    ld    r12,func@got@plt(r2)
    std   r2,40(r1)
    ld    r0,0(r12)
    ld    r2,8(r12)
    mtctr r0
    bctr

The value of a function pointer is the address of the function descriptor, not the address of the function entry point itself.

C                             Assembly
extern void func (void);
extern void (*ptr) (void);
ptr = func;
                              ld    r6,func@got(r2)
                              ld    r7,ptr@got(r2)
                              std   r6,0(r7)

(*ptr) ();
                              ld    r6,ptr@got(r2)
                              ld    r6,0(r6)
                              ld    r0,0(r6)
                              std   r2,40(r1)
                              mtctr r0
                              ld    r2,8(r6)
                              bctrl
                              ld    r2,40(r1)

Since most of the code sequence used for a call through a pointer is the same no matter what function pointer is being used, it is also possible to do it by calling a function with an unusual calling convention provided by a library. With this approach, efficiency requires that the function be linked in directly, and not come from a shared library. The PowerOpen ABI uses a function named ._ptrgl for this purpose, passing the function pointer value in r11, and it is recommended that this name and calling convention be used as well when using this approach under ELF.

3.5.12. Branching

Programs use branch instructions to control their execution flow. As defined by the architecture, branch instructions hold a self-relative value with a 64-Mbyte range, allowing a jump to locations up to 32 Mbytes away in either direction.

C                             Assembly
label:
                              .L01:
    ...
    goto label
                                  b .L01

C switch statements provide multiway selection. When the case labels of a switch statement satisfy grouping constraints, the compiler implements the selection with an address table. The following example uses several simplifying conventions to hide irrelevant details:

C                             Assembly
switch (j)
  {
  case 0:
    ...
  case 1:
    ...
  case 3:
    ...
  default:
    ...
  }
                                  cmplwi  r12,4
                                  bge     .Ldef
                                  bl      .L1
                              .L1:
                                  slwi    r12,2
                                  mflr    r11
                                  addi    r12,r12,.Ltab-.L1
                                  add     r0,r12,r11
                                  mtctr   r0
                                  bctr
                              .Ltab:
                                  b       .Lcase0
                                  b       .Lcase1
                                  b       .Ldef
                                  b       .Lcase3

3.5.13. Dynamic Stack Space Allocation

Unlike some other languages, C does not need dynamic stack allocation within a stack frame. Frames are allocated dynamically on the program stack, depending on program execution, but individual stack frames can have static sizes. Nonetheless, the architecture supports dynamic allocation for those languages that require it. The mechanism for allocating dynamic space is embedded completely within a function and does not affect the standard calling sequence. Thus languages that need dynamic stack frame sizes can call C functions, and vice versa.

Here is the stack frame before dynamic stack allocation:

High address

          +-> Back chain
          |   Floating point register save area
          |   General register save area
          |   Local variable space
          |   Parameter save area    (SP + 48)
          |   TOC save area          (SP + 40)  --+
          |   link editor doubleword (SP + 32)    |
          |   compiler doubleword    (SP + 24)    |--stack frame header
          |   LR save area           (SP + 16)    |
          |   CR save area           (SP + 8)     |
SP  --->  +-- Back chain             (SP + 0)   --+

Low address
***Is there an example missing here?***

Here is the stack frame after dynamic stack allocation:

High address

          +-> Back chain
          |   Floating point register save area
          |   General register save area
          |   Local variable space
          |   -- Old parameter save area, now allocated space
          |   -- Old stack frame header, now allocated space
          |   -- More newly allocated space
          |   New parameter save area    (SP + 48)
          |   New TOC save area          (SP + 40)
          |   New link editor doubleword (SP + 32)
          |   New compiler doubleword    (SP + 24)
          |   New LR save area           (SP + 16)
          |   New CR save area           (SP + 8)
SP  --->  +-- New Back chain             (SP + 0)

Low address
***Is there an example missing here?***

The local variables area is used for storage of function data, such as local variables, whose sizes are known to the compiler. This area is allocated at function entry and does not change in size or position during the function's activation.

The parameter save area is reserved for arguments passed in calls to other functions. See Section 3.2.3 for more information. Its size is also known to the compiler and can be allocated along with the fixed frame area at function entry. However, the standard calling sequence requires that the parameter save area begin at a fixed offset (48) from the stack pointer, so this area must move when dynamic stack allocation occurs.

The stack frame header must also be at a fixed offset (0) from the stack pointer, so this area must also move when dynamic stack allocation occurs.

Data in the parameter save area are naturally addressed at constant offsets from the stack pointer. However, in the presence of dynamic stack allocation, the offsets from the stack pointer to the data in the local variables area are not constant. To provide addressability, a frame pointer is established to locate the local variables area consistently throughout the function's activation.

Dynamic stack allocation is accomplished by "opening" the stack just above the parameter save area. The following steps show the process in detail:

  1. Sometime after a new stack frame is acquired and before the first dynamic space allocation, a new register, the frame pointer, is set to the value of the stack pointer. The frame pointer is used for references to the function's local, non-static variables.

  2. The amount of dynamic space to be allocated is rounded up to a multiple of 16 bytes, so that quadword stack alignment is maintained.

  3. The stack pointer is decreased by the rounded byte count, and the address of the previous stack frame (the back chain) is stored at the word addressed by the new stack pointer. This shall be accomplished atomically by using stdu rS,-length(r1) if the length is less than 32768 bytes, or by using stdux rS,r1,rspace, where rS is the contents of the back chain word and rspace contains the (negative) rounded number of bytes to be allocated.

Note: It is only strictly necessary to copy the back chain. The information in the parameter save area is recreated for each function call. The information in the stack frame header, other than the back chain, is only used by a called function. In some cases, a compiler may need to copy the TOC save area as well, depending upon precisely how it generates linkage code.

The above process can be repeated as many times as desired within a single function activation. When it is time to return, the stack pointer is set to the value of the back chain, thereby removing all dynamically allocated stack space along with the rest of the stack frame. Naturally, a program must not reference the dynamically allocated stack area after it has been freed.

Even in the presence of signals, the above dynamic allocation scheme is "safe." If a signal interrupts allocation, one of three things can happen:

Regardless of when the signal arrives during dynamic allocation, the result is a consistent (though possibly dead) process.