Exercises:C Addresses

From COMP15212 Wiki
Latest revision as of 17:06, 13 February 2020

Depends on Pointers


The purposes of this exercise are:

  • to reinforce understanding of memory addresses and data.
  • to practise the use of C reference/dereference operators.
  • to illustrate the different areas (‘segments’) which may be in use in a particular machine/language combination.

You should be able to run this experiment on any Unix machine. The absolute answers will be different from machine to machine – maybe even from run to run on the same machine – but the principles should hold. If you’re enthusiastic, or merely curious, try it on different machines.

You will need a C compiler installed. Here we have assumed gcc is present, which it is on School machines at least.

You should also have available some scrap paper and something to draw with.

Terminology: the location of a variable (or other item) is typically called its “address” (especially by hardware people), although it may be called other things: “pointer” is the term that will be used later here. The value will often be referred to as “data”.

Start with the code template addressing.c, which is a simple ‘Hello world’ type application. Compile it (for example):

  • Directly via gcc: gcc -o addressing addressing.c
  • With the supplied makefile: make addressing

Run it with ./addressing to make sure the tool chain is in place. Most of the code should not need explaining, but here are a couple of details:

  • in the printf() function, a \ character is used to indicate a form of ‘escape’ to do ‘special’ stuff; most relevant is \n, which means a newline character. A list of these is at https://en.wikipedia.org/wiki/Escape_sequences_in_C.
  • in the printf() function, a % character indicates a form of ‘escape’ to add an argument. %d means interpret as a decimal number; %X means interpret as a hexadecimal number; %08X means interpret as a hexadecimal number of 8 digits and fill any leading digits with ‘0’s (as opposed to spaces). This last form is going to be particularly useful.
    A list of escape characters and modifiers is here.

There are various types of variable you can declare. Basic C includes: int, short int, long int, long long int, float, double, … There are others you can look up.

[Figure: memory map]

Each variable occupies a number of bytes in memory. Unfortunately the exact number is not always the same – a weakness in the original language specification. The sizeof operator (which looks like a function call) can be used to determine the number of bytes a particular variable type occupies.

Try adding:

printf("Size of integer: %zu\n", sizeof(int));   /* sizeof yields a size_t, which printf prints with %zu rather than %d */

When run you should see something like: Size of integer: 4 or Size of integer: 8. Different machines or compilers may produce different results.

Experiment with some other types.

So much for simple variable values. Each variable is stored in memory, occupying sizeof(type) bytes. The bytes, together, hold the value. The particular position (or “address”) in memory is a secret defined by the compiler and the operating system. Whereas most languages make it unattractive by design to find out where things have been stored in memory (and most of the time you really don't need to know), C lets us find out such secrets.

Try editing or adding:

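/* %X expects an unsigned int, so the address is cast to unsigned long and printed with %lX
   (this assumes unsigned long is wide enough to hold an address, as on typical Unix machines) */
printf("Variable glob_1 at address %016lX contains data %08X\n", (unsigned long) &glob_1, (unsigned int) glob_1);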

Important points:

  • depending on your machine the address may be ‘32-bit’ which is eight hex digits or ‘64-bit’ which is sixteen digits (other values are possible, but unlikely). For the first attempt, print 16 digits anyway.
  • This has assumed 8 hex digits (32-bits) is enough for our int values; that’s okay as long as the values are small.
  • the & operator is a ‘reference’ operator and returns the address of the variable rather than its value (data).

It may help now, and later, to start drawing pictures of the locations to differentiate addresses and data. Doodles on paper are probably better than trying to draw these with a drawing package; the important thing here is to think about what you're drawing rather than worrying about making it look lovely or professional.

Note: each variable has a single address, even though each byte typically has its own address and most variables occupy several adjacent bytes. It depends on the particular machine but, usually, the ‘lowest number’ byte address is chosen as the representative.

Try printing the addresses of glob_1 and glob_2. As they are declared next to each other it is likely that they are close together in memory and, probably, their addresses differ by sizeof(int).

Experiment with viewing some other variables’ addresses.

You will probably be able to observe that the ‘global’ variables are clustered together with addresses which may differ noticeably from the ‘local’ variables.

Experiment further: try printing the addresses of parts of the code: &main and &method_1 should work. These are probably in yet a different part of the address space. (This may be worth sketching, too.)

You might like to see where the different variables are in method_1. Notice that one is a passed parameter whereas the other is declared internally. If you’re an enthusiast you might modify this into recursive code and look at the addresses then.

Memory organisation is something which will be visited again, later in the module.

Pointer variables

In C we can do more than print the address of variables; we can store that address as ‘data’ in variables of their own. (This is where your sketches become more important.)

[Figure: pointers]

In the worked example:

int local_1; declares a variable which can hold an integer.

int *ptr_1; declares a variable which can hold a pointer; this is indicated by the *. The compiler knows that the item pointed to will be an integer.

Note: an int will occupy sizeof(int) bytes; a pointer will occupy sizeof(int*) bytes; these sizes may (or may not) be different on a particular machine. (Experiment!)

Note: all pointer variables on a particular machine will normally be the same size, irrespective of the type of thing they point at … because all addresses (on a particular machine) are the same size.

An example is easier to explain:

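/* %X expects an unsigned int, so each address is cast to unsigned long and printed with %lX
   (this assumes unsigned long is wide enough to hold an address, as on typical Unix machines) */
printf("Variable local_1 is at address %016lX\n", (unsigned long) &local_1);
printf("Variable ptr_1 is at address %016lX\n", (unsigned long) &ptr_1);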

These are probably close together, though (and you're probably getting used to this by now!) they might not be: it depends on the compiler, the operating system, the CPU architecture and what else is going on at the time and beforehand. Most of the time they will be close together; we'll look at why this is the case, and why it's a good thing, later in the course.

local_1 = 0xABCDEF;
ptr_1   = &local_1;

printf("Variable local_1 holds data %08X\n", local_1);
/* ptr_1 holds an address, so it is cast to unsigned long and printed with %lX */
printf("Variable ptr_1 holds data %016lX\n", (unsigned long) ptr_1);

The data in the pointer should be the address of local_1.

Draw it!

Do some research online to figure out how you access not the value of ptr_1 itself, but the value pointed at by ptr_1 (hint: it's called dereferencing a pointer).