8.1 Data models - large and small

Let's take a simple C program:

     #include <stdio.h>
     
     int max=100;
     int count;
     char string[]="Hello, world\n";
     
     int main(void)
     { int i;
       for(i=0;i<max;i++)
       { count++;
         printf("%s",string); }
       return 0;
     }

If we look at it carefully we see 4 different kinds of data in it:

  1. The variable ‘max’ and the array ‘string’ - both are nonconstant initialized global data.
  2. The variable ‘count’ - this is uninitialized (and therefore nonconstant) global data.
  3. The string "%s" - this one is constant data.
  4. The variable ‘i’ - this one is local data.

The compiler places these 4 types of data into 4 different places:

Data number 4 is local (and exists only for the duration of one function call). The compiler places such data into registers if possible, and on the stack otherwise. If you do not want the compiler to keep a variable in a register, declare it volatile - it will then be on the stack all the time.
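A minimal sketch of the difference, assuming a hypothetical helper called sum_to() that is not part of the program above. Whether the plain local really ends up in a register is up to the compiler; the volatile one is read from and written to memory on every access:

```c
/* Hypothetical helper, for illustration only. */
int sum_to(int n)
{
    volatile int total = 0; /* volatile: every access goes through memory */
    int i;                  /* plain local: the compiler may keep it in a register */

    for (i = 1; i <= n; i++)
        total += i;
    return total;
}
```

Both variables behave the same as far as the result is concerned; only where they live differs.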

Data number 3 is constant - the compiler places it, together with all the other constant data, into the code section (code is constant, too). Never modify any constant data - the weirdest things can happen.

Data number 2 is not initialized - it would be wasteful to store data that carries no information in an executable. Therefore such data goes into a special section - the BSS section. BSS data does not increase executable size.

And the rest (number 1) goes into the data section :-).
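To sum up, here is the same program again with each kind of data marked. This is only a sketch: the loop has been moved into a hypothetical function run_loop() (not part of the original program) so that it can be called and checked, and the section names in the comments simply repeat the description above.

```c
#include <stdio.h>

int max = 100;                    /* 1: initialized global -> data section */
char string[] = "Hello, world\n"; /* 1: initialized global -> data section */
int count;                        /* 2: uninitialized global -> BSS, starts at 0 */

/* Hypothetical helper: the loop from main(), returning the new count. */
int run_loop(void)
{
    int i;                        /* 4: local -> register or stack */

    for (i = 0; i < max; i++) {
        count++;
        printf("%s", string);     /* 3: "%s" is constant -> code section */
    }
    return count;
}
```

Note that ‘count’ can be relied upon to be 0 before the first call, even though it takes up no room in the executable.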

To access the data section with machine instructions there are two possible methods:

You can take the whole 32 bit address and store it in your machine code. 32 bits means 4 bytes every time you access a global variable. This is known as the large (normal) data model because you can reach the whole 4 GB address space.

A lot of applications do not need such a large data section, and spending 4 bytes on every access would be a waste of memory. So there exists a second possibility:

You take one address register (a4) and use it as a pointer to your data section. You access your data relative to this pointer with 16 bit references. This is known as the small data model.

Since there are only 16 bit references you can access a total of 64k of data (32k in each direction from a4). And since you use only one address register for this, the data and BSS sections get merged together (BSS data still need not increase executable size - there are some tricks to avoid this - but not all linkers support them).
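The 64k limit follows directly from the range of a signed 16 bit offset (-32768 to 32767). The following is a sketch in portable C that only simulates the idea: small_data, base_a4 and small_data_addr() are made-up names, and placing the base pointer exactly in the middle of the block is an assumption, the real position of a4 is decided by the linker and startup code.

```c
#include <stdint.h>

#define SMALL_DATA_SIZE 65536L  /* 2^16 bytes reachable with a 16 bit offset */

/* Stand-in for the merged data and BSS section. */
static uint8_t small_data[SMALL_DATA_SIZE];

/* Stand-in for a4: points into the middle (an assumption), so that
   negative and positive offsets are both usable. */
static uint8_t *base_a4 = small_data + 32768;

/* Resolve a signed 16 bit offset relative to the base pointer. */
uint8_t *small_data_addr(int16_t offset)
{
    return base_a4 + offset;
}
```

The smallest offset reaches the first byte of the block and the largest one the last byte; anything further away simply cannot be expressed in 16 bits.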

Beware: you should never lose the contents of your address register (a4) or all hell breaks loose.

If you ever lose it (this can only happen in certain cases when using interrupts or hooks) you can restore it by calling ‘geta4()’, or you can use no global data at all (and have no problems then).

The second approach (using no global data) is recommended - and it is sometimes possible, since the OS takes care of this and supports local data areas in these nasty cases. But don't call any shared library function then - or you will access a (hidden) library base pointer, which is global data, too.

It is not possible to have a ‘geta4()’ function with resident programs, because they have multiple data sections (which one would you choose? You access the data from a different task!)

In general it's not possible to mix objects compiled for the two data models. There are some exceptions, but people who know enough to prevent collisions need no explanation of when it's possible ;-).