Computer Science 010

Lecture Notes 3

Pointers in C

Unix tips

Command Histories

We talked about this in class, but it wasn't in the notes. If you want to repeat a command that you ran earlier, you can use C-p or the up arrow key to retrieve the previous command. You can repeat these to go further back in your command history. If you go too far, you can use C-n or the down arrow key to go forward in the command history.

If the command that you retrieve is not exactly what you want, you can edit it before hitting carriage return. Many of the basic Emacs editing commands can be used, such as C-b, C-f, C-a, and C-e.

Core Files

Sometimes when you change to a directory (using cd), you will get the message:

[There is a nasty core file here.]

A core file is a file that Unix creates when a program crashes. Core files are useful if you wrote the program that crashed and you want to examine the crash in a debugger. Otherwise, they just take up space. Until we get to the point where you learn how to use a debugger, you should just delete the file when you get this message.

C Mode in Emacs

When you edit C files in Emacs, Emacs automatically enters C mode. It knows that you are editing a C file when you visit a file with .c for its suffix. If you create a buffer and start typing a C program, it won't go into C mode until you save the buffer to a file with a .c suffix. C mode provides you with help in formatting your C programs. For example, if you use C-j to end a line instead of the return key, Emacs will create a new line and indent it to the appropriate level. When you type a }, it will find the matching { and briefly highlight it for you. This will help you make sure that your brackets are balanced.

There are also two additional "minor modes" that you can turn on and off as you prefer. (By default they are off.) The first is auto-newline insertion mode. When this is on, Emacs will automatically insert newlines when you type certain characters, like {. The second mode is hungry-delete. When this mode is on, the backspace character will delete all the consecutive whitespace characters preceding the cursor, not just the immediately preceding one.

Here are some useful commands:

C-j

Insert a newline and indent the next line.

C-c C-q

Fix indentation of current function

C-c C-a

Toggle the auto-newline-insertion mode. (If it was off, it will now be on and vice versa.)

C-c C-d

Toggle the hungry delete mode

There is an Emacs info page about C mode where you can learn more.

Pointers

A pointer is simply a memory address. All variables in a program are located at a particular address in memory. When the value of a variable is itself an address, we say that the variable is a pointer. Both Java and C have pointers, but pointers are not directly manipulable in Java while they are in C. In Java, any variable with a class type actually contains a pointer to an object. That is why when you assign one variable to another they both refer to the same object and changes via one variable are visible through the other. The value of both variables is really a pointer to the same object. Java hides these pointers from you and just accesses the object.

In C, you need to dereference pointers explicitly. Pointers are needed in C to allow dynamic allocation of memory (with C's malloc function) and to allow pass-by-reference, so that changes a function makes to a parameter are also made to the argument passed in to the function.

The syntax for declaring a pointer is the following:

int *ptr;

This declares a pointer to an integer. To get the address of a value, you use the & operator:

int i;
ptr = &i;

It is also possible to do the reverse, that is, to dereference a pointer and place the value it points to in an integer variable:

i = *ptr;

This should help you to understand the syntax of scanf better. You need to pass the addresses of variables to a function if you want the function to change their values. In the case of scanf, we want the arguments to have new values on return, so we must pass in addresses:

scanf ("%d/%d/%d", &month, &day, &year);

The scanf function wants to modify the values of month, day, and year. To do that, it needs addresses where it can assign the values. Since month, day, and year are simply ints, we need to pass the addresses in.

A pointer with no value is called a null pointer. Java provides a special constant called null so that you can test for that condition. In C, you use the value 0 instead. Better yet, use the constant NULL, which standard headers such as stddef.h and stdio.h define for you:

#include <stddef.h>
ptr = NULL;

Note that when NULL is defined as 0, you are assigning an integer constant to a pointer variable. This is not a type-checking error in C!

Pointers are often used in conjunction with structs:

typedef struct {
   int dollars;
   int cents;
} money;
   
money value, value2;
money *cost;
   
value.dollars = 100;
value.cents = 0;
   
cost = &value;
cost->cents = 50;
   
cost = &value2;

First we declare the type money. Then we declare two variables of type money, value and value2. These are not pointer variables; each one actually holds an entire struct. Next, we define cost, a pointer to money. This variable is only big enough to hold a pointer. Now, we initialize the value variable. Note the . syntax to refer to the fields. Then we assign the address of value to cost. Changes made through either variable are visible in the other. Note the -> syntax used to dereference fields when we have a pointer. The notation cost->cents is a more convenient way to write (*cost).cents. Finally, we assign the address of value2 to cost. Now cost points to value2 and no longer points to value. Changes made through the cost pointer will now affect value2 but not value.

Pointers and Arrays

Pointers and arrays are very similar in C. Not all pointers are arrays, but all arrays can be manipulated with pointer operations. In particular, the following declarations are equivalent:

int minarray (int a[], int size);

and

int minarray (int *a, int size);

These two versions can be called in exactly the same way, passing in an integer array:

int myArray[] = {1, 2, 30};
int min;
min = minarray (myArray, 3);

Strings

There is no String type in C as there is in Java. Instead, strings are declared as "char *". Recall that this means "pointer to char" but due to the similarity of pointers and arrays, it also means "array of characters". In fact, strings are normally declared as pointers to characters but really it makes more sense to think of them as arrays of characters! You can reference individual characters in a string using array subscripts.

char *name = "Let it snow";

name[0] is the character 'L'. This string actually requires an array with 12 characters even though "Let it snow" is only 11 characters long. The last array element contains a special null character that signals the end of the string. The null character can be written as the character constant '\0'. When you use a string constant as above, C automatically adds the null character.

Strings are normally declared as pointers because we typically do not know how big a string will be when we declare the string variable. In fact, we probably will want its size to change over time. C requires the size of array variables to be specified when we declare them, so an array variable is inappropriate here. Instead, we declare a pointer and then dynamically allocate memory when we know how big the string will be. More on dynamic memory allocation soon...

To input or output a string, use the %s conversion specification in printf and scanf:

char *msg = "Ski it if you can\n";
printf ("%s", msg);

For scanf, you need to be sure that there is memory to read the string into. In this case, you must declare a string array with a size:

char input[100];
scanf ("%s", input);

Notice that this time you do not need an & before input. That is because an array name used as an argument is converted to a pointer to the array's first element. Its value is therefore an address already, so you do not need to use the & address-of operator.

Another useful method for inputting strings is gets:

char *gets (char *s)

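gets reads an entire line and returns that line (both in the parameter and the return value). scanf with %s, on the other hand, reads a single word. It does not include whitespace in the value that it assigns to its parameters. Be aware that gets has no way to limit how many characters it reads, so a long input line can overrun the array. For that reason it was removed from the C standard in C11, and fgets is the usual replacement.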

Besides I/O, most string manipulation is done using library functions. To use any of the functions below, you must include the string.h file at the beginning of your program:

#include <string.h>

Here are the most common functions:

char *strcpy (char *dst, const char *src)
This copies the string src to the string dst. dst must be an array large enough to hold src, including its terminating null character. dst is the return value. This differs from assignment because it makes a copy. If you use an assignment like dst = src;, dst and src point to the same chunk of memory. The const keyword means that the src argument is not changed in the function even though it is being passed in as a pointer.
int strcmp (const char *s1, const char *s2)
This compares s1 to s2. If s1 is alphabetized before s2, strcmp returns a negative value. If s1 and s2 have the same string values, strcmp returns 0. If s2 is alphabetized before s1, a positive value is returned. strcmp differs from the == operator just as .equals differs from == in Java: == just compares the pointers, while strcmp compares the string values pointed to.
size_t strlen (const char *s)
This returns the length of s, that is, the number of characters up to but not including the terminating null character. It is not the size of the array used by s, which must be at least one larger to hold the null character.

You can get more information about any of these functions and other string functions by looking them up in section 3 of man or xman. Section 3 contains functions that can be called from C programs. One important piece of information on the manual pages for functions is an indication of which .h files you must include in your program. string.h includes prototypes for these three functions and others. It also includes the definition of size_t, the type returned by strlen. size_t is an unsigned integer type whose exact definition depends on the system; a typical definition is:

typedef unsigned int size_t;

Unfortunately, the man page does not tell you this, so you may be uncertain what you can do with the return value of strlen. The files listed on man pages reside in the /usr/include directory. You can view these files with more or emacs to find the definition of types such as size_t when the manual page is insufficient.

Sample Program

Here is one way that the C library might implement strcpy:

char *strcpy (char *dst, const char *src) {
  /* Create an index that will walk through the two strings. */
  int i = 0;
   
  do {
    /* Copy the next character, including the terminating null. */
    dst[i] = src[i];
   
    /* Test the character just copied for the terminating null,
       then advance the index that is walking through the strings. */
  } while (src[i++] != '\0');
   
  return dst;
}
