© 1995-2005
Phil's C Course
Version 1.2 © Phil Ottewell 1995-2004
§0 Aims of this Course
This course is intended to help a good programmer (pause for mass exodus), particularly someone familiar with DEC Fortran, start programming in C, but should actually provide a reasonable introduction to C for people who have used any programming language before. There are a number of program examples, copies of which you can download as phils_c_examples.zip. A detached PGP signature file (which it is not necessary to download unless you know what it's for) is provided so you can be sure the archive has not been altered. When you unzip the archive on a Windows machine, note the workspace file phils_c_examples.dsw. You should open this with Microsoft Visual
Studio 98 and Microsoft Visual C++ 6.0, then "batch build" everything. VMS
users should use unzip -a phils_c_examples.zip to get
the correct file attributes for the text files, like the .c source
files and MAKE.COM, the VMS command file which you can use to build the programs.
There are also several programming challenges. Have a go
at these, nicking as much code as you can from the examples ! Using
C is the best way to learn it, and making mistakes is
definitely the best way to find out how it really works. I mention the ANSI
C standard, ANSI/ISO 9899-1990, a lot in this
document. Always try to adhere to the standard; experience has shown that
it pays off in the long term. Some of the points I make are stylistic. However,
many of these suggestions are made for one of two reasons: either the majority
of the C programming world has reached consensus that
the style is good (which will make it easier for you to read and learn from
other people's code) or I have found that you can avoid errors by doing things
in a particular way. I reckon that you can "learn" C
in about an hour, then spend the next year wishing you hadn't done things in a
particular way the hour after that. This course should help you avoid some of
the pitfalls that are so easy to fall into (and, in fact, dig for yourself)
because of the total control, power, and 0 to 60 ACCVIOS in under 10 seconds
that C can deliver to the programmer.
BCPL and B were typeless languages - variables were all multiples of byte or
word sized bits of memory. C is more strongly typed
than B, BCPL or Fortran. Its basic types are
char, int, float and double,
which are characters, integers and single and double precision floating point
numbers. An important addition, compared to Fortran,
is the pointer type, which points to the other types (including other pointers).
All these types can be combined in structures or unions to provide composite
types.
The main shock to Fortran programmers is the fact
that C has no built-in string type, and consequently
you have to make a function call to compare two strings, or assign one string to
another. Luckily, the ANSI standard describes a set of string manipulation
routines that MUST be present if an implementation is described as ANSI
C. Similarly, a good set of standard IO, time
manipulation and even sorting routines exist. HELP CC RUN-TIME_FUNCTIONS will
give you information on all of these, and even tell you which header files you
should include to use them. For example HELP CC RUN PRINTF will inform you that
you need the header file stdio.h .
In the early days of C, different compiler vendors all had their own flavours of C, usually based on the book, The C Programming Language, by Brian Kernighan and Dennis Ritchie. These older compilers are often referred to as "Classic C" or "K&R C". As C gained in popularity, the need to standardize certain features became apparent, and in 1983 the American National Standards Institute established the X3J11 technical committee, whose ANSI C standard was ratified in 1989.
If you only buy one book on C, get the second edition
of the K&R book. If you want to buy two books
add Expert C Programming: Deep C Secrets by Peter
van der Linden. If you really want to be a language lawyer and contribute to
threads like "Is i = i++ + --i legal ?" in the comp.lang.c newsgroup, then get
"The Annotated ANSI C Standard", annotated by Herbert Schildt. Personally I
think a line of code like i = i++ + --i should be taken out and
shot.
The DEC C compiler is a good ANSI compiler, and any code you write should pass through this compiler (with its default qualifiers) without so much as an informational murmur. If it doesn't, you are storing up big trouble and intermittent bugs for the future. Even if you decide to do nonstandard things, there are techniques to do them in a standard way (!), which will be explained later.
OK, enough waffle. Let's look at a "Hello World" program in C.
/*---- Hello World C Example ("hello.c") -------------------------------------*/
/* ANSI C Headers */
#include <stdio.h>
#include <stdlib.h>
/* Main Program starts here */
int main( int argc, char *argv[] )
{
    int i = 0;
/* End of declarations ... */
    for ( i = 0; i < 10; i++ )
    {
        printf("%d Hello World !\n",i);
    }
    exit(EXIT_SUCCESS);
}
As you have probably gathered, comments in C are
delimited by /* and */, and comments must NOT be
nested, or you will get some very interesting bugs. The perceived need for
nested comments is usually for commenting out (say) a debug piece of code, and
this can be done in a better way, which will be explained later. Some C compilers let you use the trailing
C++ style comments
//, which are like a trailing ! in DEC
Fortran. NEVER USE THESE IN
C PROGRAMS. They are not
ANSI standard, and immediately confuse people as to whether they are looking at
C or C++ code (and some
meanings can subtly change).
To compile this program under DEC C (both Alphas and VAXes should be using DEC C now; VAX C was retired around 1993, and you really should switch to DEC C on both platforms):
$ CC HELLO
$ LINK HELLO
Alternatively, you can use the MAKE.COM
DCL command file, as shown below. On Alphas the resulting executable will have
file type .EXE_ALPHA, and on VAX machines it will be .EXE.
$ @MAKE HELLO
DEV$DISK:[PHIL.PHILS_C_EXAMPLES]
CC/PREFIX=ALL HELLO.C -> HELLO.OBJ_ALPHA
LINK HELLO -> HELLO.EXE_ALPHA
Exiting
If you must use VAX C (and you mustn't :-) the link step will whinge about unresolved symbols, so change the line to
$ LINK HELLO, VAXCRTL/OPT
where the VAXCRTL.OPT options file contains the line
SYS$SHARE:VAXCRTL/SHARE
You are now ready to RUN HELLO . Not too many surprises there. Note the form
of the code. The main entry point in a standard C program
is always called main, though you can override this on VMS platforms, as we will discover. The main program in C is declared as int main(some funny stuff).
This is because the main program should always return a value (usually to DCL or
the Unix shell) indicating how things went. This is done by the call to
exit(EXIT_SUCCESS). There are two ANSI standard return codes,
EXIT_SUCCESS and EXIT_FAILURE, both defined in <stdlib.h> .
Always use these values, and don't do what a lot of Unix programmers do which is
exit(0) or some other magic number just because "everybody knows
that exit(0) means success". You can return VMS condition codes,
e.g. exit(SS$_NORMAL), but this should be avoided unless really
necessary, and even then there are ways to fall back to the standard return
codes if your code is compiled on a non-VMS machine.
The (some funny stuff) is the argument list, or the "formal
parameters" of function main. Imagine main as a
function called from your command shell (DCL on VMS, or the DOS window on
Windows NT). The declaration int main( int argc, char *argv[] )
means that main is a function returning an integer, which takes two arguments.
The first is an integer, and is the number of arguments passed to main by DCL,
and the second is an array of pointers to character strings. The latter are, in fact,
any command line arguments, as will be demonstrated in args.c, a
demo program coming soon to a disk near you. The body of a function is
delimited by { and }. Because
C is largely a free format language, the whole
function can be on one line if you really want, but that tends to be unreadable
and confusing. I like to start the function with a { in column one,
just after the function declaration (which I can then nick for prototyping), and
end the function with a } in the same column.
Notice how each statement ends with a semicolon. The ";" is known as a statement terminator. The end of a complete expression statement is also a "sequence point", as are the comma operator, the logical operators && and ||, and the conditional operator ?:, and the standard guarantees that side effects of expressions will be over once a sequence point is reached. This basically means that all the things you made happen in one statement will have happened by the time you start on the next statement or expression.
The printf(...) statement is a call to a routine defined in
<stdio.h>, and enables formatted output to the
stdout stream. In C, three default I/O
streams are defined. These are stdin, stdout and stderr, and they correspond to
SYS$INPUT, SYS$OUTPUT and SYS$ERROR under
VMS. The first argument to printf is a format string containing conversion characters,
each preceded by the % sign (use %% if you actually
want a % sign), which tell the routine how to interpret the
variable number of arguments to be printed. In this case the integer
i is to be printed in decimal, so "%d" is used. There
are corresponding functions, sprintf to write directly into a
character string array, and fprintf to write to a file. Similar
formatted input routines, sscanf and fscanf are also
available. The table below, nicked off the network, summarizes the conversion
characters:
Clive Feather's Excellent Table:
Types of arguments for the various fprintf and fscanf conversions

Conversion        fprintf          fscanf
----------------------------------------------------
d i               int              int *
o u x X           unsigned int     unsigned int *
hd hi             int              short *
ho hu hx hX       [see note 1]     unsigned short *
ld li             long             long *
lo lu lx lX       unsigned long    unsigned long *
e E f g G         double           float *
le lE lf lg lG    [invalid]        double *
Le LE Lf Lg LG    long double      long double *
c                 int              [see note 2]
s                 [see note 2]     [see note 2]
p                 void *           void **
n                 int *            int *
hn                short *          short *
ln                long *           long *
[                 [invalid]        [see note 2]
Note 1: the type that (unsigned short) is promoted to by the integral
        promotions. This is (int) if USHRT_MAX <= INT_MAX, and
        (unsigned int) otherwise.
Note 2: any of (char *), (signed char *), or (unsigned char *).
Don't worry about the "*"s for now. They can be read as "pointer to thing named
before them", so int * means pointer to int. Similar summaries can
be found in K&R II pages 154 and 158.
Programming Challenge 1
_______________________
Have a go at adapting "hello.c" to print out the value of i in
hexadecimal. Fiddle about with the format string - remove the "\n"
for example, and see what happens to your output.
Unlike Fortran, whitespace is significant in C, and there are reserved keywords. These reserved keywords should not appear as any type of identifier, even a structure member. The list below shows both C and C++ reserved keywords.
asm¹       continue   float      new¹        signed     try¹
auto       default    for        operator¹   sizeof     typedef
break      delete¹    friend¹    private¹    static     union
case       do         goto       protected¹  struct     unsigned
catch¹     double     if         public¹     switch     virtual¹
char       else       inline¹    register    template¹  void
class¹     enum       int        return      this¹      volatile
const      extern     long       short       throw¹     while
The items marked like this¹ are C++, not C keywords, but it makes sense to avoid both. Avoid using a language name like Fortran too.
x = x + 1 isn't a contradictory
algebraic statement.
C, unlike Fortran, has
case sensitive variable and other identifier names. Therefore the variable
NextPage is completely different to nextpage. The same
is true for functions. Some people like to use the capitalized-first-letter form
of naming, others prefer underbars, e.g. GetNextPage() or get_next_page() . Many
professional library packages tend towards TheCapitalizedFormat. Some people
like Microsoft's Hungarian Notation which involves prefixing variable
names with their type, e.g. uiCount for an unsigned
int counter variable. It all depends how good you are with the Shift key
:-) Whatever method you choose, try and be clear and consistent.
In C, local variable definitions can be at the start of any {block}, and aren't restricted to the top of the module as in Fortran. Be careful if you take advantage of this feature, because you may run into scoping problems where the innermost variable definition hides an outer one. If you are used to C++, remember that the variable definitions can only be at the start of the {block} before the first statement (e.g. expression, function call or flow control statement). If you try and intersperse definitions, C++ style, the C compiler will issue some sort of "bad statement" warning.
Variables declared at the beginning of the {function body} are local to the
function, variables declared at the "top" of the file (or compilation unit to
be pedantic), in the header files, or outside any function bodies, are
global to the compilation unit (and are externally visible symbols, unless
declared as static). More will be said about this later. For now, suffice
it to say that you should avoid using global variables wherever possible.
A brief example will illustrate the scope of variables:
/*---- Variable Scope Example ( "scope.c" ) ----------------------------------*/
/* ANSI C Headers */
#include <stdio.h>
#include <stdlib.h>
/* Global variables, visible externally too (i.e. to things linked */
/* against this) Generally they should be avoided as far as */
/* possible, because it can be very difficult to discover which */
/* routine changes their value, and they introduce "hidden" dependencies */
int some_counter;
double double_result;
/* Function prototypes */
void set_double_result(void);
/* Main Program starts here */
int main( int argc, char *argv[] )
{
    int j;
    int i_am_local; /* .. to main */
/* End of declarations ... */
    i_am_local = 1;
    printf("i_am_local = %d (in main)\n\n", i_am_local );
    for ( j = 0; j < 10; j++ ) {
        int i_am_local; /* .. to this loop - Not necessarily a good idea */
                        /* because it can cause confusion as to which */
                        /* variable we actually want to access */
        i_am_local = j;
        printf("i_am_local = %d (inside loop)\n", i_am_local );
    }
    printf("\ni_am_local = %d (in main)\n\n", i_am_local );
/* Now let's look at the default initialization values of the globals */
    printf("some_counter = %d (in main)\n", some_counter);
    printf("double_result = %f (in main)\n\n", double_result);
/* Call a function that changes the global variables .. */
    set_double_result();
/* .. and look at them again */
    printf("some_counter = %d (in main)\n", some_counter);
    printf("double_result = %f (in main)\n", double_result);
    exit(EXIT_SUCCESS);
}
void set_double_result(void)
{
    ++some_counter;
    double_result = 3.141;
    printf("some_counter = %d (in set_double_result)\n", some_counter);
    printf("double_result = %f (in set_double_result)\n\n", double_result);
}
The basic types in C are:
char - this defines a byte, which must be able to hold one character
in the local character set (normally, but not necessarily 8 bits);
int - holds an integer, usually in the machine's natural size.
They are 32 bits on both VAX and Alpha.
float - holds a single precision floating point number.
They are 32 bits on both VAX and Alpha.
double - Double precision floating point number, 64 bits on the VAX
and Alpha.
These bit sizes are just to give you an idea. They should not be relied on, and you should code independently of them, unless you are addressing hardware registers or some equally hardware-specific task.
Some of these basic types can be modified with various qualifiers:
char - can be signed or unsigned;
int - can be long or short, signed or unsigned;
double - can be long, for (possibly) even more precision;
The long modifier normally gives larger integers, but the
compiler vendor is free to ignore it provided that
short <= int <= long
16 bits <= short/int
32 bits <= long
Assignment to variables of these basic types is fairly intuitive, and can be done in the definition, rather like using the DATA statement in Fortran or the DEC Fortran extension
/* C Example */                     |* DEC Fortran Example
                                    |
int x, y;   /* Not initialized */   |      INTEGER X, Y
int counter = 0;                    |      INTEGER COUNTER /0/
float total = 0.0;                  |      REAL TOTAL /0.0/
char c = 'A';                       |      CHARACTER*1 C /'A'/
Note that C uses single quote ' for character
constants. The double quotes are used for strings. There are escape
sequences for getting nonprintable characters. These are listed on page 38
of K&R II. A few useful ones are '\n' to get a new line
(C doesn't automatically add line feeds when you use
printf() ), '\a' to get a bell (alert) sound, and '\0' to get the null
character (which is NOT THE SAME as the NULL pointer) used to
terminate strings (arrays of characters). The initialization of non-static
(discussed later) int and float variables is necessary before use. It doesn't
have to be done in the definition, but you can't rely on their value being
anything sensible, so whilst the initialization of COUNTER and TOTAL is
redundant in the Fortran example (assuming
non-recursive compilation), you do need to initialize the variables before use
in C. It is good practice to initialize variables
when they are defined. The compiler will probably optimize this away, but if you
initialize at definition, it has the advantage of causing consistent behaviour
if the program later fails because you forget to set the value before it is
used, which makes debugging easier.
.
{
    int i;         /* Avoid declaring variables without an initial value */
    int j;         /* like these are - this is just for illustration */
    char *pString;
    .
    i = 0;         /* i's value could be anything up to this point */
    .
    j = i*OFFSET;  /* j's value could be anything up to this point */
    .
/* This will probably fail but may not do so every time you run */
/* the program, because the pointer might point to a writeable */
/* location. Bugs like this are harder to find because they tend */
/* not to cause a consistently reproducible crash */
    strcpy( pString, "Hello World" ); /* Could overwrite anything */
}
void MyFunction(void)
{
    int i = 0;     /* Best to set an initial value */
    int j = 0;
    char *pString = NULL;
    .
    strcpy( pString, "Hello" ); /* This will cause a run time error every time */
                                /* but it will be easy to detect the problem */
    .
}
Global variables are guaranteed to be initialized to 0 (or 0.0 if floating type) but you can override this by specifying an initial value.
Similar rules apply to float, double, and long double. There are two standard
header files, <limits.h> and <float.h>
which tell you the maximum and minimum values that can be stored in a particular
type; for example INT_MAX is 2147483647, and FLT_MAX
is 1.7014117e+38 on the VAX.
The signed or unsigned modifiers are fairly self explanatory. The default for
int is signed, so it is rarely specified. Signed integer arithmetic is usually
done in Two's Complement form, but this need not be the case.
Characters can be signed or unsigned by default - it is implementation defined.
I find it best just to use char with no qualifiers, and let the
compiler do what it will.
This is probably a good point to introduce the sizeof(thing)
operator. It is an operator, not a function, and is evaluated at compile time.
It returns the size of the argument, where the size of char is
defined to be 1. To be pedantic, it returns an unsigned integer type,
size_t, defined in <stddef.h>, but is not often
used in a way that requires a size_t declaration. Here are some
examples of its use (this is a "programming fragment" not a complete program).
size_t s;
int fred; /* Integer */
char bob; /* Character */
char *c_ptr; /* Pointer to character */
char bloggs[6]; /* Array of 6 characters */
.
s = sizeof( fred );
s = sizeof( bob );
s = sizeof( c_ptr );
s = sizeof( long double ); /* Allowed to use types instead of variables */
.
/* Safe string copy, checks size of destination and allows for terminating */
/* null character (not to be confused with the NULL pointer discussed later) */
strncpy( bloggs, "Bloggs", sizeof(bloggs)-1 );
bloggs[sizeof(bloggs)-1] = '\0'; /* strncpy does not terminate a truncated copy */
.
You can leave the brackets off when the operand is a variable or expression, e.g. sizeof fred is
quite legal, but a type name must always be bracketed, e.g. sizeof(int). I think that the bracketed form is clearer anyway.
Programming Challenge 2
_______________________
Have a go at adapting "hello.c" to print out the size of some
commonly used types, e.g. int, short int, long int, float, double
and so on. Try some arithmetic to familiarize yourself with the
basic operators, +, -, *, /, and one that doesn't appear in
Fortran, the modulus operator, %, which acts on integer types to
yield the remainder after division. Use this to determine whether
the year 2000 is a leap year. The rule is that it is a leap year if
the year is divisible by 4, except if it is a multiple of 100 years,
unless it is also divisible by 400.
In addition to the integer and floating point types, there is a type called
void. The meaning of void changes according to context ! If you
declare a function returning void, you mean that it returns no value, like a
Fortran subroutine. A void in the argument list means
that the function takes no arguments (you can have a void function that does
take arguments by declaring arguments in the usual way, and you can have a
function that does return a value but takes no arguments). Below is an example
of a Fortran subroutine and C function:
/* C Version */                           |* Fortran Version
                                          |
void initialize_things( void )            |      SUBROUTINE INITIALIZE_THINGS
{                                         |*
    /* Do cunning setup procedure */      |*     Do cunning setup procedure
    /* No need for a return statement */  |*
}                                         |      END
.                                         |      .
/* Call it */                             |*     Call it
initialize_things(); /* Note () */        |      CALL INITIALIZE_THINGS
.                                         |      .
The void qualifier also has yet another meaning, which will be discussed when we look at pointers.
The void function above demonstrates the general form of functions in C. They have a function definition with the formal parameters, then a {body} enclosed by the {} brackets. Function arguments are always passed by value in C. The actual arguments are copied to the (local) function formal arguments, as if by assignment. The arguments may be expressions, or even calls to other functions. The order of evaluation of arguments is unspecified, so don't rely on it ! Here is a C function example, with a similar Fortran routine for comparison.
/* C Version */                           |* Fortran Version
                                          |
int funcy( int i )                        |      INTEGER FUNCTION FUNCY( I )
{                                         |      INTEGER I
                                          |*
    int j = 0;                            |      INTEGER J
/* End of declarations ... */             |* End of declarations ...
    j = i;                                |      J = I
    i = i + 1; /* Only local i changed */ |      I = I + 1 ! Calling arg changed
    j = i*j;                              |      J = I*J
                                          |      FUNCY = J
    return( j );                          |      RETURN
}                                         |      END
.                                         |      .
/* Call it */                             |* Call it
k = 3;                                    |      K = 3
ival = funcy(k); /* ival is 12 */         |      IVAL = FUNCY( K ) ! IVAL is 12
.                /* k is still 3 */       |      .                 ! K is 4
Notice that changing the function parameter in the C function does not alter the actual argument, only the local copy. To change an actual argument, you would pass it by address, using the address operator, &, and declare the function argument as a pointer to type int. More will be said about this in the pointers section. Generally, you should avoid writing functions in C that change the actual arguments. It is better to return a function value instead, where possible.
/* C Version */ |* Fortran Version
|
myval = funcy( gibbon ); | CALL FUNCY( GIBBON, MYVAL )
Programming Challenge 3
_______________________
Hack your copy of the "hello.c" to call some sort of arithmetic
function, perhaps to return the square of the argument. Write the
function, and add a "prototype" (these are discussed later) for it
before the main program, e.g.
.
/* Function prototype */
int funcy( int myarg ); /* semicolon where function body would be */
.
/* Main Program starts here */
int main( int argc, char *argv[] )
{
.
}
/* The real McCoy - "Dammit Jim, I'm a function not a prototype" */
int funcy( int myarg )
{
.
/* Do something and return() an int value */
.
}
If you are feeling really cocky, write a recursive factorial()
function that calls itself. Hint:
.
if ( n > 0)
{
factorial = n * factorial( n-1 );
} else {
factorial = 1;
}
.
Call it from your main program and step through with the debugger to
convince yourself that it really is recursive.
When you write your own functions, try to avoid interpositioning, i.e. naming your function with the same name as a standard library (or system/Motif/X11/Xt library) function. Use
$ HELP CC RUN-TIME_FUNCTIONS your_function_name
to check for the existence of a similarly named DEC C
RTL function. Or look in a book. It is a very bad idea to replace a standard
function. If you need to write something with the same purpose as a standard
function, but maybe with better accuracy or speed, call it something different,
e.g. my_fast_qsort() .
Three other modifiers I haven't yet explained are static,
const and extern. The static modifier is
another one that changes meaning depending on its context. If you declare a
global variable or function as static, it will still be
visible throughout the same compilation unit (file to us), but will NOT be
visible externally to programs linked against our routines. This is often used as
a neat way of storing data that has to be visible to a number of related
functions, but must not be accessible from outside. Some code fragments below
illustrate this.
/*---- C Fragments -----------------------------------------------------------*/
/* Global Vars, NOT visible externally (i.e. to things linked against this) */
static int number_of_things;
int AddToThings( int a_thing )
{
    .
    number_of_things = number_of_things + 1;
    return( number_of_things );
}
int GetNumberOfThings(void)
{
    return( number_of_things );
}
int RemoveThing( int a_thing )
{
    .
    number_of_things = number_of_things - 1;
    return( number_of_things );
}
* Fortran (sort of) Equivalent
*-----------------------------------------------------------------------
INTEGER FUNCTION ADD_TO_THINGS( A_THING )
.
INTEGER NUMBER_OF_THINGS
SAVE NUMBER_OF_THINGS
.
NUMBER_OF_THINGS = NUMBER_OF_THINGS + 1
ADD_TO_THINGS = NUMBER_OF_THINGS
RETURN
*
ENTRY FUNCTION GET_NUMBER_OF_THINGS()
GET_NUMBER_OF_THINGS = NUMBER_OF_THINGS
RETURN
*
ENTRY REMOVE_THING( A_THING )
.
NUMBER_OF_THINGS = NUMBER_OF_THINGS - 1
REMOVE_THING = NUMBER_OF_THINGS
RETURN
*
END
Another use of static is with variables that are local to a function. In this
case it is similar to the Fortran SAVE statement,
i.e. the variable will retain its value across function calls, and WILL BE
INITIALIZED to 0 if it is an integer type, or 0.0 if a floating point type (even
if the floating point representation of 0 on your machine is not all bits set to
0), or NULL (pointer to nothing) if it is a pointer.
/*---- C Example -------------------------------------------------------------*/
int log_error( int code )
{
static int total_number_of_errors;
/* End of declarations ... */
/* ++ is the same as total_number_of_errors = total_number_of_errors + 1 */
return( ++total_number_of_errors );
}
* Fortran Equivalent
*-----------------------------------------------------------------------
SUBROUTINE LOG_ERROR( CODE )
.
INTEGER TOTAL_NUMBER_OF_ERRORS
* Not required for non-recursive DEC Fortran, but it documents your intent
SAVE TOTAL_NUMBER_OF_ERRORS
.
TOTAL_NUMBER_OF_ERRORS = TOTAL_NUMBER_OF_EBRORS + 1
END
The const modifier is used to flag a read only quantity.
For example,
const double pi = 3.14159265358979;
.
/* Arizona ? */
pi = 3.0; /* Gives compiler error - try it in your test program */
The const modifier is useful for function prototype arguments which are passed by pointer, where you want to indicate that your function will not change the object pointed to. More will be said about function prototypes later.
Programming Challenge 4
_______________________
Look at the Fortran example above. Spot the deliberate
mistake. The compiler would probably flag an error for it, but think
of another instance where perhaps you wanted to increment an array
element indexed by a non-trivial expression. Using the ++ operator
in C helps avoid typographical errors, and looks less clumsy (and
saves valuable bytes ;-) ). There is a similar operator, --, which
decrements by one. Read K&R II, pages 46-48, and pages 105-106. Make
sure you understand the difference between prefix and postfix
versions of ++ and --, and try to rewrite the AddToThings() set of
functions using these operators. Great - that's saved me having to
explain it all.
The extern qualifier is rather like EXTERNAL in
Fortran, and basically gives type information for a
reference that is to be resolved by the linker. You DO NOT need to use
extern with function declarations - int funcy( int i
); is the same as extern int funcy( int i); . It is usually
used when declaring global variables to indicate that they are referenced in the
particular compilation unit, but not defined in it.
What is the difference between "definition" and "declaration" ? In short, a definition actually ALLOCATES SPACE for the entity, whereas a declaration tells the compiler what the entity is and what it is called, but leaves it up to the linker to find space for it ! A global variable, structure or function can have many declarations, but only one definition. This is explained in more detail in the "Header Files" section which follows.
Three less commonly used modifiers are volatile,
auto and register. The volatile modifier tells the
compiler not to perform any optimization tricks with the variable, and is most
often used with locations that refer to hardware, like memory-mapped IO, or
shared memory regions which might change in a way the compiler cannot
predict. The auto qualifier may only be used for variables at
function scope (inside {}) and is in fact the default. Auto variables are
usually allocated off the stack (but this is up to the implementation). They
will certainly not be retained across function calls. NEVER return the ADDRESS
of an automatic variable from a function call (once you know about pointers).
Because new automatic variables are "created" every time you
go into a function, this allows C functions to be
called recursively. The register qualifier is really obsolete. It is a hint to
the compiler that a variable is frequently used and should be placed in a
register. The compiler is quite free to ignore this hint, and frequently does,
because it generally knows far more about optimizing than you do (Microsoft
Visual C++ or DEC C
for example). Don't bother using register.
Enumerated types, enum, are similar to
Fortran integer
PARAMETERs,
but nicer to use. The general form is enum identifier { enumerator_list
}, where "identifier" is optional but recommended. The comma-separated
list of enumerated values starts at zero by default, but you can override this
as shown in the example.
C Example
/*----------------------------------------------------------------------------*/
enum timer_state_e { TPending, TExpired, TCancelled};
enum timer_trn_e { TmrSet=4401, TCancel=4414};
.
enum timer_state_e t_state;
enum timer_trn_e t_trn;
.
t_state = TExpired; /* t_state now contains 1 */
t_trn = TCancel; /* t_trn now contains 4414 */
* Fortran Example
*------------------------------------------------------------------------
INTEGER TPENDING, TEXPIRED, TCANCELLED
INTEGER TSET, TCANCEL
PARAMETER (TPENDING = 0, TEXPIRED = 1, TCANCELLED = 2)
PARAMETER (TSET = 4401, TCANCEL = 4414)
.
INTEGER T_STATE, T_TRN
.
T_STATE = TEXPIRED
T_TRN = TCANCEL
When examining t_state or t_trn in the C program with the DEC debugger, the integer value will be converted to a name, e.g.
DBG> EXAMINE t_trn
PROG\main\t_trn: TCancel
which is handy. Unfortunately, because the enumerated types are really type int,
you can assign any integer value to t_trn without a compiler whinge
! Types and storage class modifiers are discussed in more detail in K&R II,
page 209 onwards, if you still thirst for knowledge.
C has for loops, while loops and do loops. An
example is worth a thousand words:
* Fortran Loops Example
.
INTEGER I
LOGICAL FIRST
.
DO I = 1, LIMIT
PRINT *, I
ENDDO
*
I = 0
DO WHILE ( I .LT. LIMIT )
I = I + 1
PRINT *, I
ENDDO
*
FIRST = .TRUE.
DO WHILE ( FIRST .OR. I .LT. LIMIT )
IF ( FIRST ) FIRST = .FALSE.
PRINT *, I
I = I + 1
ENDDO
/*---- C Loops Example ("loops.c") -------------------------------------------*/
/* ANSI C Headers */
#include <stdio.h>
#include <stdlib.h>
/* Defines and Macros */
#define LMT 5
/* Main Program starts here */
int main( int argc, char *argv[] )
{
int i;
/* End of declarations ... */
printf("LMT = %d\n", LMT);
printf("\n'for' loop - for ( i = 1; i <= LMT; i++ ) {...}\n");
for ( i = 1; i <= LMT; i++ ) { /* More usual in C would be i = 0; i < LMT; i++ */
printf("%d\n", i );
}
printf("\ni = 0\n");
printf("'while' loop - while ( i++ < LMT ) {...}\n");
i = 0;
while ( i++ < LMT )
{
printf("%d\n", i );
}
printf("\ni = LMT\n");
printf("'do' loop - do {...} while ( ++i < LMT ); - always executes at least once\n");
i = LMT;
do
{
printf("%d\n", i );
} while ( ++i < LMT );
exit(EXIT_SUCCESS);
}
All these constructs are explained in detail in K&R II, chapter 3. The
for loop has the following general form:
for ( expression1; terminate_if_false_expression2; expression3 ) {
.
}
If "terminate_if_false_expression2" is missed out it is taken as being true, so
an infinite loop results, for (;;) {ever}. The "expression1" is
evaluated once before the loop starts and is most often used to initialize
the loop count, whereas "expression3" is evaluated on every pass through the
loop, just before starting the next loop, and is frequently used to modify the
loop counter. It is quite legal, in C, to modify the
loop counter within the loop, and the loop control variable retains its value
when the loop terminates. Obviously "terminate_if_false_expression2" causes the
loop to end if it is false, and is used to test the termination condition.
The "while" looks like this:
while ( expression )
{
.
}
and keeps going for as long as "expression" is true. It zero trips
(that is, the code in it is never executed) if "expression" is false on the
first encounter. The for loop above could be written using
while.
expression1;
while ( terminate_if_false_expression2 )
{
.
expression3;
}
It isn't a good idea to do this though, because someone will spend ages looking
at your code wondering why you didn't write a for loop, expecting
some cunning algorithm.
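If you want to convince yourself that the two forms really are equivalent, here is a small sketch. The function names sum_with_for and sum_with_while are invented for this illustration; both add up the integers from 1 to limit.

```c
/* Hypothetical example: the same loop written both ways */
int sum_with_for( int limit )
{
    int i;
    int total = 0;

    for ( i = 1; i <= limit; i++ ) {
        total = total + i;
    }
    return total;
}

int sum_with_while( int limit )
{
    int i;
    int total = 0;

    i = 1;                       /* expression1                     */
    while ( i <= limit )         /* terminate_if_false_expression2  */
    {
        total = total + i;
        i++;                     /* expression3                     */
    }
    return total;
}
```

Both return 15 for a limit of 5, and both zero trip (returning 0) for a limit of 0. The for version keeps all the loop control in one place, which is why it is easier to read.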
Finally, before time, the old enemy, makes us leave Loopsville City Limits, let's look at the "do-while" construct. The loop body is always executed at least once
do
{
.
} while ( expression ); /* Semicolon needed */
and the loop will be repeated if "expression" is true at the end of the current
loop. There is a keyword, break, which lets you leave the innermost
loop early, transferring control to the statement immediately after the loop.
for ( i = 0; i < strlen(string); i++) {
if ( string[i] == '$' )
{
found_dollar = TRUE;
/* Once we've found the dollar no need to search rest of string */
break;
}
}
/* Jump to here on "break" */
.
A related keyword, continue, skips to the end of the loop and
continues with the next loop iteration.
for ( i = 0; i < strlen(string); i++) {
/* Don't bother trying to upcase spaces */
if ( string[i] == ' ' ) continue; /* Move on to next character */
/* It wasn't a space so have a go */
string[i] = toupper( string[i] );
}
/* Loop has finished - "continue" never jumps to here, it just starts the next pass */
.
This is most often used to avoid complex indenting and "if" tests. Don't use it like I just did, which was a silly example.
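Here is a slightly fuller sketch of break and continue, wrapped up as functions so that you can try them. The names find_dollar and count_non_spaces are invented for illustration. Note that the loop condition is evaluated on every pass, so this version works out strlen() once before the loop starts.

```c
#include <string.h>

/* Hypothetical helper: index of the first '$', or -1 if there isn't one */
int find_dollar( const char string[] )
{
    int i;
    int length = (int)strlen( string );  /* Work this out once only */
    int position = -1;

    for ( i = 0; i < length; i++ ) {
        if ( string[i] == '$' )
        {
            position = i;
            break;               /* No need to search rest of string */
        }
    }
    return position;
}

/* Hypothetical helper: count the characters that are not spaces */
int count_non_spaces( const char string[] )
{
    int i;
    int length = (int)strlen( string );
    int count = 0;

    for ( i = 0; i < length; i++ ) {
        if ( string[i] == ' ' ) continue;   /* Move on to next character */
        count++;
    }
    return count;
}
```

For the string "ab$cd$", find_dollar stops at index 2 and never looks at the second dollar.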
You have already met the "if" construct. Here it is again, with the "else if" demonstrated too.
if ( expression )
{
.
/* Do something */
.
}
else if ( other_expression )
{
.
/* Do something else */
.
}
else if ( final_expression )
{
.
/* Do something different */
.
}
else
{
.
/* Catch all if none of above expressions are true */
.
}
It is legal to write this kind of thing
if ( expression ) /* Avoid this form */
i = 1;
else
i = 2;
The problem arises if you do this
if ( expression ) /* This is probably not what was intended */
i = 1;
else
i = 2;
dont_forget_this = 3;
You might think that if "expression" is true (i.e. non-zero) then you would set
i to 1, and if it were false you would set i to 2
and dont_forget_this to 3. In fact you will always set
dont_forget_this to 3, because only the first statement after the
"else" is grouped with the "else". I never use this form, other than for a one
liner like
if ( expression ) expression_was_true = TRUE;
where the meaning is clear. Use the bracketed form which makes it totally unambiguous, and is easier to use with the debugger.
C provides an alternative to lots of if - else if tests. This is the "switch" statement. The "expression_yielding_integer" is calculated, and matched against the "case" "const-int-expression"s. When one matches, the statements following are executed, or if none match, the statements following "default" are executed
switch ( expression_yielding_integer ) {
case const-int-expression1:
statements1;
case const-int-expression2:
statements2;
case const-int-expression3:
statements3;
.
.
default:
statementsN;
}
Unfortunately a bad default behaviour was chosen for this. Each "case" drops through to the next one by default, so if, say, "expression_yielding_integer" matched "const-int-expression2", then "statements2" through to "statementsN" would ALL be executed. This is solved by using "break" again.
switch ( expression_yielding_integer ) {
case const-int-expression1:
statements1;
break; /* Always use break by default */
case const-int-expression2:
statements2;
break;
case const-int-expression3:
statements3;
break;
.
.
default:
statementsN;
break;
}
The default behaviour is rarely what is required in practice, and it would have been far better to have a default "break" before each case, and maybe use "continue" to indicate fall-through. Remember that chars can be used as small integers, so the following is quite legal.
char command_line_option;
.
switch ( command_line_option ) {
case 'v':
verbose_mode = TRUE;
break;
case 'l':
produce_listing = TRUE;
break;
case '?': /* Following two cases deliberately fall thru */
case 'h':
display_help = TRUE;
break;
default:
use_default_options = TRUE;
break;
}
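To see the drop-through behaviour for yourself, try this little sketch. The function name fall_through_total is invented for illustration, and the breaks have been left out on purpose.

```c
/* Hypothetical example: every "break" is deliberately missing */
int fall_through_total( int selector )
{
    int total = 0;

    switch ( selector ) {
        case 1:
            total = total + 1;
            /* Fall through - no break, so the case 2 statements run too */
        case 2:
            total = total + 10;
            /* Fall through - no break, so the default statements run too */
        default:
            total = total + 100;
    }
    return total;
}
```

A selector of 1 gives 111, because all three sets of statements are executed. A selector of 2 gives 110, and anything else gives 100.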
int job[20]; /* job[0], job[1] .. job[19] */
and the dimension must be an integer greater than zero. This is how to declare a two-dimensional array [rows][columns]
int job[4][20]; /* Like 4 job[20] 's, job[0][0], job[0][1] .. job[3][19] */
.
i = job[2][0]; /* Good */
.
i = job[2,0]; /* Bad - don't ever do this */
.
Multi-dimensional arrays are row major; that is, the right-most subscript
varies fastest, which is the opposite of Fortran's column major order. Notice that you can't
use commas to separate the indices. Separate pairs of square brackets are needed
for each index. There is no limit to the number of dimensions other than those
imposed by your compiler and the amount of memory available. In practice,
multi-dimensional arrays are rarely used. Unfortunately, you can't (in
C) use const int's as array bounds. You have to use
#define, like this:
#define MAX_SIZE 100 /* Value chosen purely for illustration */
.
float floaty[MAX_SIZE];
More will be said about #define later. Arrays can be initialized
when they are defined:
int days_in_month[12] = { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };
int matrix[2][3] = { { 0, 1, 2 }, { 3, 4, 5 } };
Remember that uninitialized automatic arrays can contain anything at all, so don't expect them to be full of zeros. In addition, initialized arrays can't be "demand zero compressed". You can leave out the size of an array and have it use the number of initializers, like this
int array_initialization_pages_in_K_and_R_II[] = { 86, 112, 113, 219 };
which produces an array of 4 integers. You would probably want to use
sizeof() to determine the size of the array in this case
nelements = sizeof( array_initialization_pages_in_K_and_R_II ) /
/* ----------------------------------------------------- */
sizeof( array_initialization_pages_in_K_and_R_II[0] );
.
for ( i = 0; i < nelements; i++)
{
.
}
Notice that you index up to LESS THAN the number of elements, because the last element is (nelements-1). If you accidentally go past the end of an array, then if you are reading from the array you can get strange values. A nastier situation occurs if you are setting the values.
void MyFunction(void)
{
int Array[5];
int iLoopLimit = 10;
int i;
.
for ( i = 0; i < iLoopLimit; i++)
{
/* Oh dear - we are going to write to memory past the end of the array */
Array[i] = GetSomeValue();
}
}
The above code could have a particularly nasty effect on certain machine architectures. It is quite common for the return address to be stored on the stack, which is where the automatic variable Array is allocated too. By writing past the end of the array, it is possible for you to corrupt the return address such that when MyFunction returns, you end up in a seemingly random place, such as half way through some function you have never (intentionally) called. This is bad enough just from the point of view of trying to debug your program. It gets worse though. Imagine that the hypothetical GetSomeValue() function got its data from some external source, e.g. this was part of some web server. If a naughty hacker could pass something, perhaps a long string, to corrupt the return address in a known way, then he or she would be able to run arbitrary code of their choice, with all the privileges of your server process. Many real world computer "viruses" and "worms" use exactly this technique to "hack into" people's web servers and systems.
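One defensive approach is to work out the real size of the array with sizeof() and never loop past it, whatever limit you have been handed. This is only a sketch: GetSomeValue() here is a stand-in that always returns 7, and FillArray is an invented name.

```c
/* Hypothetical stand-in for the GetSomeValue() mentioned in the text */
static int GetSomeValue( void )
{
    return 7;
}

/* Returns the sum of the elements written (each one is 7 in this sketch) */
int FillArray( int iLoopLimit )
{
    int Array[5];
    int nelements = sizeof( Array ) / sizeof( Array[0] );
    int total = 0;
    int i;

    /* Never trust the requested limit: clamp it to the real array size */
    if ( iLoopLimit > nelements )
    {
        iLoopLimit = nelements;
    }
    for ( i = 0; i < iLoopLimit; i++ )
    {
        Array[i] = GetSomeValue();
        total = total + Array[i];
    }
    return total;
}
```

Asking for 10 elements now fills only the 5 that exist, so FillArray(10) gives 35 rather than trampling on the stack.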
When initializing arrays or structs, you can initialize with fewer elements than
there are in the array or structure, and the remaining ones will be set to zero (or
NULL for an array of pointers). This saves typing, or adding more initializers
if the size or structure changes, and makes it easier to debug because you
start with known values.
int FirstArray[ARRAY_SIZE] = {0}; /* All elements will be zero */
int Array2[ARRAY_SIZE] = {1}; /* 1st element is 1, remaining elements all 0 */
struct point
{
int x;
int y;
};
struct point origin = {0}; /* origin.x and origin.y will both be zero */
static by default, i.e. retain
their value across function calls, unless you change them. The initializer, or
"string literal" is delimited by double quotes "like this" . You can split a
string initializer over several lines, each part being in " quotes, and they
will be concatenated together. The resultant string has the null character,
represented by the escape sequence '\0', appended to the end of it.
char random[80]; /* Could contain anything */
char title[] = "Phil's Ramblings"; /* Takes 17 bytes due to '\0' at end */
char longer_string[] = "Here is quite a long string split up over "
"several lines. VAX C doesn't allow this, though. "
"Another good reason to switch to DEC C on "
"VAX or Alpha VMS, or Visual C++ for Windows.";
char string_with_quote[] = "Here is the quote \" character";
The name that you give to an array can be used as a pointer to the zeroth
element of the array. More will be said about this in the "Pointers" section.
There are many functions in the standard library for manipulating character
strings, and these all begin with "str". You will need to include
<string.h> to use them. Look in K&R II pages 249-250.
These functions expect an array, or a pointer to characters, as their arguments.
Finally, note that an empty string is not really empty.
char is_it_empty[] = ""; /* No, it contains one character, '\0' */
Always bear in mind that the string functions often copy trailing '\0'
characters, so you must ensure that you allow space for this. It is a good idea
to always use the "strn" versions of the calls, with
sizeof(destination) as the character limit, because that way you
will avoid runaway (and hard to detect) memory overwriting. Remember to
terminate the destination string, e.g.
char destination[MAX_SIZE] = {'\0'};
strncpy( destination, source, sizeof(destination)/sizeof(destination[0]) );
destination[sizeof(destination)/sizeof(destination[0])-1] = '\0';
else you'll end up avoiding potential overwrites, but leave a potentially unterminated string to catch you out later. Why use sizeof(destination)/sizeof(destination[0])? Well, for ANSI (plain char) strings you could argue that you could just use sizeof(destination), because the value returned by sizeof() is defined in terms of the size of char. However, as Unicode and "wide character" strings become more commonly used, you need to think carefully about whether you mean "the number of bytes" (or more accurately, the size of something in terms of char) or the number of characters. For wide characters, you really don't want to pass sizeof(widestring) to wcsncpy, or you will end up causing a buffer overrun, i.e. you will write past the end of the wide character array or allocated space. Note also that you can't do this with a pointer. The sizeof(pointer) would return the size of the pointer, not the size of the thing it pointed to.
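Since sizeof() is no use on a pointer, a function that copies strings has to be told how big the destination is. Here is one possible sketch; copy_string and its parameter names are invented for illustration and are not a standard library routine.

```c
#include <string.h>

/* Hypothetical helper: copy at most destination_size-1 characters */
/* and always leave the destination terminated.                    */
void copy_string( char *destination, size_t destination_size, const char *source )
{
    if ( destination_size == 0 ) return;   /* Nowhere to put even the '\0' */

    strncpy( destination, source, destination_size - 1 );
    destination[destination_size - 1] = '\0';
}
```

The caller, who can see the real array, passes its size. With a 5 character destination, copy_string( small, sizeof(small), "Hello, world" ) leaves "Hell" plus the terminating '\0'.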
int *i_ptr;
declares a pointer to type int. As declared above, i_ptr is most
likely not yet pointing at a valid location. In order to make it point somewhere
valid, you generally use the "address operator", &, like this
int i;
int j;
int *i_ptr;
.
i_ptr = &i;
.
You can then change or read the value of i by using the
"dereference operator", *, and change the object pointed to, providing it is an
object of the correct type.
*i_ptr = 3; /* Set the int pointed to by i_ptr to 3 */
printf("%d\n", i ); /* i will be 3 */
.
i_ptr = &j; /* Set the i_ptr to point to j now */
.
*i_ptr = 3; /* Set the int pointed to by i_ptr to 3 */
printf("%d\n", j ); /* j will be 3 */
.
This is a rather silly example, because you would obviously just use i
or j directly. A more realistic use of pointers is with
arrays:
char string[] = "Here is a string with a $ in it";
char *sptr;
int contains_dollar;
.
sptr = string; /* Remember that the array name is the same as &array[0] */
contains_dollar = FALSE;
while ( *sptr ) /* While thing pointed to is not 0 i.e. null character */
{
if ( *sptr == '$' )
{
contains_dollar = TRUE;
break; /* Leave the while loop early and safely */
}
++sptr;
}
.
When you increment pointers, they automatically increment the address they
point to by the size of one of the objects to which they point. In the example
above, that is one character, i.e. a byte. If the array was an array of int,
then the pointer would increment by sizeof(int) bytes. Just to
frighten you, this loop could be written
while ( *sptr && !( contains_dollar = *sptr++ == '$' ) );
Programming Challenge 5
_______________________
You guessed it. Figure out what is happening in the scary "while"
loop above. Now write your own (differently named) version of strcpy
using similar techniques to make it as short as possible. Note that
strcpy has no length limit. Why is this a bad thing? Hint: think about
the problems associated with writing past the end of arrays. Modify
your function to work like the safer strncpy function.
Arrays and pointers are closely related. They can be used in identical ways in many situations. For example:
.
char string[80];
char *sptr;
.
sptr = &string[0]; /* This could be written as sptr = string */
.
*string = 'A'; /* Using array name like pointer */
*(string+10) = 'B'; /* Using array name like pointer */
.
sptr[0] = 'A'; /* Using pointer like array */
sptr[10] = 'B'; /* Using pointer like array */
.
This is because, in expressions or function calls, arrays and pointers are
both converted to the form "*(pointer + index-offset)". The main thing to
remember is that pointers are variables, and can be changed to point to
different objects, whereas array names are not variables. The index-offset is
automatically scaled according to the type of data pointed to. In this case, we
are dealing with char which, by definition, has a size of 1, but if
the pointers were pointers to int, then on the VAX or Alpha, the index-offset
would be automatically scaled by 4.
.
int array[20];
int another_array[20];
int *i_ptr;
.
i_ptr = array; /* Legal */
i_ptr[12] = 3;
.
i_ptr = another_array; /* Legal */
i_ptr[2] = 4;
.
array = another_array; /* Illegal ! */
.
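The automatic scaling of the index-offset can be seen by measuring the same distance in two ways. This is a sketch with invented function names; it relies only on char having a size of 1 by definition.

```c
/* Hypothetical example: distance from array[0] to array[2], in ints */
long elements_apart( void )
{
    int array[20];
    int *first = &array[0];
    int *third = first + 2;      /* Moves on by 2 ints, not 2 bytes */

    return (long)( third - first );
}

/* The same distance, measured in bytes by casting to char pointers */
long bytes_apart( void )
{
    int array[20];
    int *first = &array[0];
    int *third = first + 2;

    return (long)( (char *)third - (char *)first );
}
```

The first function returns 2, while the second returns 2 * sizeof(int), which would be 8 on a machine with 4 byte ints.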
Even multi-dimensional arrays get decomposed to the "*(pointer + index-offset)" by the compiler in say, a function call, which gives you no special knowledge of how they fold. Hence if you are using a pointer to a multi-dimensional array where the dimensions could vary, it is up to you to calculate the offset correctly, e.g.
int mda[ROWS][COLS];
.
i = funcy( &mda[0][0], ROWS, COLS ); /* Pass the address of the first int */
.
int funcy( int *array, int rows, int cols )
{
.
for ( i = 0; i < rows; i++) {
for ( j = 0; j < cols; j++) {
total += *(array + i*cols + j);
}
}
.
}
Of course, if the function was only expected to deal with arrays of set dimensions, you could just declare those in funcy().
int mda[ROWS][COLS];
.
i = funcy( mda );
.
int funcy( int array[ROWS][COLS] )
{
.
for ( i = 0; i < ROWS; i++) {
for ( j = 0; j < COLS; j++) {
total += array[i][j];
}
}
.
}
The strange += assignment operator isn't a misprint. It is shorthand, so that
x = x + 4;
can be written
x += 4;
Similarly
y = y - 10;
becomes
y -= 10;
There must be NO SPACE between the operator and the = sign, and the operator comes immediately before the =. This notation is handy for more complex expressions, such as
array[ hash_value[index]*k + offset[i] ] += 4;
so you only need maintain the expression in one place. Many binary operators have a similar assignment operator. Check K&R II page 50 and page 48 for the bitwise operators that can also be used in this way.
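Going back to the two funcy() variants above, here is a complete sketch that you can compile. The names sum_flat and sum_fixed, and the values of ROWS and COLS, are invented for illustration. Passing &mda[0][0] gives the first function a plain pointer to int, and it then does the offset arithmetic itself.

```c
#define ROWS 2    /* Illustrative dimensions only */
#define COLS 3

/* Dimensions can vary: we calculate the offset ourselves */
int sum_flat( const int *array, int rows, int cols )
{
    int i;
    int j;
    int total = 0;

    for ( i = 0; i < rows; i++ ) {
        for ( j = 0; j < cols; j++ ) {
            total += *(array + i*cols + j);
        }
    }
    return total;
}

/* Dimensions are fixed: the compiler calculates the offset */
int sum_fixed( int array[ROWS][COLS] )
{
    int i;
    int j;
    int total = 0;

    for ( i = 0; i < ROWS; i++ ) {
        for ( j = 0; j < COLS; j++ ) {
            total += array[i][j];
        }
    }
    return total;
}
```

Given int mda[ROWS][COLS] = { { 0, 1, 2 }, { 3, 4, 5 } }, both sum_flat( &mda[0][0], ROWS, COLS ) and sum_fixed( mda ) give 15.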
The other thing to remember is that whereas arrays allocate space, and hence the array name points to something valid, pointers must never be used until they have been initialized to point to something valid.
There is a special pointer value defined by the standard, called the
NULL pointer, which is used to indicate that the pointer doesn't
point to anything. Normally, you cannot directly assign integers to pointers,
but the NULL pointer is an exception. Both the following lines make
i_ptr point to "nothing" (well, a guaranteed "not valid location"
really).
i_ptr = 0; /* Legal but not recommended */
i_ptr = NULL; /* Recommended - it is clear that you refer to a pointer */
The NULL macro (see Macros section later on), defined identically
in <stddef.h> and <stdio.h> among other
places, is often defined as
#define NULL ((void *) 0)
even though "0" would do. This discourages its use as an integer, which you should never do. People often make the mistake of writing
string[i] = NULL; /* Never do this - you really want '\0' */
i = NULL; /* Never do this if i is integer and you really mean 0 */
when what they actually mean is
string[i] = '\0'; /* The null character - that's more like it */
i = 0; /* Integer zero */
A pointer of type (void *) is a special type of pointer that is
guaranteed to be able to point to any type of object, hence the
NULL pointer can be assigned to any pointer type. The
NULL pointer need not have all bits set to zero, so don't rely on
this.
Pointers are very useful as function arguments for routines that manipulate strings of unknown (at compile time) length.
int how_long( const char *s )
{
int i = 0;
/* End of declarations ... */
while ( *s++ )
{
i++; /* Increment i until '\0' found */
}
return i;
}
Even though the thing pointed to by s is const, note
that it is quite legal to increment the pointer s in the function,
because s is a local, variable pointer, pointing to whatever the
calling argument to how_long() was. Hence if you call
how_long(string), you don't change string, you assign string to
s then increment s. Any expression using array
subscripting, for example array[index], is exactly the same in
C as its pointer equivalent, in this case
*(array+index). You have to be careful when using the
const modifier with pointers. The following examples should
illustrate the point.
int i = 0;
const int *ptr_to_const = NULL; /* Can point to a const int; the pointer itself can change */
int * const const_ptr = &i; /* const_ptr itself is const, but points to a variable int */
Another important difference between pointers and arrays relates to the
sizeof() operator.
int array[20] = {0};
int *i_ptr = array;
size_t s = 0;
.
s = sizeof(array); /* s is 20*sizeof(int), which is 80 on the VAX */
s = sizeof(i_ptr); /* s is sizeof(int *), which is 4 on the VAX */
.
You can't deduce the size of an array from a pointer, only the size of the
pointer. Because arrays as function arguments are treated the same as pointers,
then even if you declare the function arguments as "func( int array[10] )"
array is still treated like a pointer in the function body, so
sizeof(array) in the function will give you the size of pointer to
int, not 10 times size of int.
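The usual way round this is for the caller, who can see the real array, to pass the number of elements as a separate argument. A minimal sketch follows; sum_ints is an invented name.

```c
#include <stddef.h>

/* Hypothetical helper: the caller supplies the element count, because */
/* sizeof(array) in here would only give the size of a pointer to int. */
long sum_ints( const int array[], size_t nelements )
{
    size_t i;
    long total = 0;

    for ( i = 0; i < nelements; i++ )
    {
        total += array[i];
    }
    return total;
}
```

A caller with int array[20] = { 1, 2, 3 } would write sum_ints( array, sizeof(array)/sizeof(array[0]) ) and get 6, since the remaining 17 elements are zero.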
It is quite legal to write a pointer definition like this:
int* i_ptr; /* Not recommended */
This is best avoided, because it can be confusing. Consider
int* i_ptr1, i_ptr2; /* Probably not what you intended */
At first glance it looks like you have just declared two pointers to int. In fact, i_ptr1 is a pointer, but i_ptr2 is an int.
int *i_ptr1, *i_ptr2; /* Better */
The second example keeps the * with the variable to which it relates, and is considered better style (by me at any rate) !
There are two standard library functions often used with pointers. They are
declared in <stdlib.h>, and are malloc() and
free(). Both are AST reentrant under DEC
C. The malloc() function allocates an
area of memory specified in bytes, and is declared as
void *malloc(size_t size);
and would be used like this
.
int *i_ptr;
.
i_ptr = (int *)malloc( sizeof(int)*nelements_wanted );
if ( i_ptr != NULL )
{
.
i_ptr[i] = i;
.
}
else
{
.
/* Couldn't get the memory - do some cunning recovery */
.
}
It is good practice to "cast" the result of a malloc() to the
correct type. This helps the compiler to indirectly check whether you are using
the correct type in the sizeof() invocation too. If it complains
about your cast, then (assuming the type is the same in the
sizeof() ) you are probably using the wrong type in both places,
and might have allocated too little memory. There is no check if you wander off
the allocated memory, out into memory space no man has seen before ! The memory
returned by malloc() can contain any values when you get it, i.e.
it is not set to zero.
The free() function frees up the memory obtained from
malloc(). It is declared as
void free(void *pointer);
and would be used like this to free the memory obtained in the previous example
free( i_ptr );
i_ptr = NULL; /* Good practice */
I like to set the pointer to NULL immediately upon freeing the
memory, because the pointer MUST NOT BE USED again after being
free()ed. By setting it to NULL, you will (under VMS
or Windows NT) get an ACCVIO if you try and dereference the pointer. This is
safer than leaving it, having the memory reused elsewhere, then changing it via
the duff pointer. This sort of mistake is very hard to track down. It is very
important to always free malloc()-ed memory when you are done with
it, or you will cause what is known as a "memory leak".
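Putting malloc() and free() together, here is a small sketch of the whole life cycle. The function name make_sequence is invented for illustration; it hands back memory that the caller is then responsible for freeing.

```c
#include <stdlib.h>

/* Hypothetical helper: returns a malloc()-ed array holding 0, 1, 2 ... */
/* or NULL if the memory could not be obtained. The caller must free it. */
int *make_sequence( size_t nelements_wanted )
{
    size_t i;
    int *i_ptr;

    i_ptr = (int *)malloc( sizeof(int)*nelements_wanted );
    if ( i_ptr != NULL )
    {
        for ( i = 0; i < nelements_wanted; i++ )
        {
            i_ptr[i] = (int)i;   /* Contents were undefined until now */
        }
    }
    return i_ptr;
}
```

A caller would check the result against NULL, use the array, then call free() and set its own pointer to NULL, exactly as described above.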
There are a couple of functions related to malloc(). One is
calloc(), which allows you to allocate memory and initialize it
to zero in one go.
void *calloc(size_t number, size_t size);
The other function is realloc(), which allows you to expand a
region of memory obtained by malloc(), whilst retaining its current
contents.
void *realloc(void *pointer, size_t size);
The new, expanded region of memory need not be in the same place as the original
new_i_ptr = (int *)realloc( i_ptr, sizeof(int)*larger_nelements_wanted );
if ( new_i_ptr )
{
/* Successfully expanded */
i_ptr = new_i_ptr; /* Don't free anything here ! */
}
else
{
/* Couldn't get the extra memory, stick with the existing pointer */
}
so in this example the memory may have changed location, but the original
content will have been copied to the new location. Note how I use a new
pointer, new_i_ptr, to check that the relocation was successful.
This is essential because if you directly assigned to the pointer to
the memory you were trying to realloc and the call failed
(returning NULL) you would have no way to free the memory
originally pointed to by i_ptr.
/* Never do this - always assign the return value to a different pointer */
i_ptr = (int *)realloc( i_ptr, sizeof(int)*larger_nelements_wanted );
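One way to make the safe pattern hard to get wrong is to wrap it in a little helper. This is only a sketch, and grow_int_array is an invented name. It assumes new_nelements is greater than zero, and it only updates the caller's pointer when realloc() has succeeded.

```c
#include <stdlib.h>

/* Hypothetical helper: returns 1 and updates *array_ptr on success. */
/* Returns 0 on failure, leaving *array_ptr valid and untouched.      */
int grow_int_array( int **array_ptr, size_t new_nelements )
{
    int *new_ptr;

    new_ptr = (int *)realloc( *array_ptr, sizeof(int)*new_nelements );
    if ( new_ptr == NULL )
    {
        return 0;                /* Original memory is still there */
    }
    *array_ptr = new_ptr;        /* Don't free anything here ! */
    return 1;
}
```

It would be called as grow_int_array( &i_ptr, larger_nelements_wanted ), and the existing contents are still there afterwards whether or not the memory moved.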
A final couple of warnings about pointers. Firstly, the []
operator has a higher precedence than the *, so int *array[]
means an array of pointers to int, not a pointer to an array of ints. Secondly,
the following two statements are not equivalent:
extern int is[]; /* This declares an int array, defined elsewhere */
extern int *is; /* This declares a pointer to int, defined elsewhere */
The compiler will actually generate code you did not intend, and probably cause an ACCVIO if you confuse these. This is because an access via a pointer first looks at the address of the pointer, gets the pointer value stored there, and uses that as the base address for lookups. Access via an array name uses the address of the array itself as the base address for lookups. Draw a diagram if you are confused ! Using the EXT and DEFINE_GLOBALS macros, explained later, should stop this ever happening to you.
struct optional_structure_identifier {what's in it} optional_instance;
I suggest that you always specify optional_structure_identifier, then declare
the instances of the structure later in a manner similar to the way we used
enum. Example:
/* C Example */ |* Fortran Example
|
struct oscar_location_s { | STRUCTURE /OSCAR_LOCATION_S/
int x; | INTEGER X
int y; | INTEGER Y
}; /* Note the semicolon ; */ | END STRUCTURE
. | .
int main( int argc, char *argv[] ) | .
{ | .
struct oscar_location_s loc; | RECORD /OSCAR_LOCATION_S/ LOC
. | .
loc.x = 100; | LOC.X = 100
loc.y = 50; | LOC.Y = 50
. | .
} |
Similarly with unions, the following trivial example shows how they might be declared and used:
/* C Example */ |* Fortran Example
|
union hat_u { | STRUCTURE /HAT_U/
int mileage; | UNION
float hotel_cost; | MAP
}; | INTEGER MILEAGE
. | END MAP
int main( int argc, char *argv[] ) | MAP
{ | REAL HOTEL_COST
int was_tow = 0; | END MAP
union hat_u cost; | END UNION
. | END STRUCTURE
if ( was_tow ) { | .
cost.mileage = 100; | IF ( WAS_TOW ) THEN
} else { | COST.MILEAGE = 100
cost.hotel_cost = 45.50; | ELSE
} | COST.HOTEL_COST = 45.50
. | ENDIF
. | .
} | .
Notice that you don't need the MAP - END MAP sequence
in C that is used in DEC
Fortran. Everything in the union { body
} acts as though it is sandwiched between
MAP - END MAP.
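Because the members of a union overlay each other, nothing in the union itself records which member was last set. A common approach is to keep a flag alongside it. The following is a sketch only: the wrapper structure trip_cost_s and both function names are invented for illustration.

```c
union hat_u {
    int mileage;
    float hotel_cost;
};

/* Hypothetical wrapper: was_tow records which union member is in use */
struct trip_cost_s {
    int was_tow;
    union hat_u cost;
};

struct trip_cost_s make_tow_cost( int mileage )
{
    struct trip_cost_s trip;

    trip.was_tow = 1;
    trip.cost.mileage = mileage;
    return trip;
}

/* Returns 1 if both members start at the same address */
int members_share_storage( void )
{
    union hat_u cost;

    return (void *)&cost.mileage == (void *)&cost.hotel_cost;
}
```

Code reading the structure would test was_tow before deciding whether to look at cost.mileage or cost.hotel_cost.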
Structures may contain pointer references to themselves, which is very handy for implementing linked lists:
struct list_s {
struct list_s *prev;
struct list_s *next;
void *data_ptr;
};
When you declare a pointer to a structure, let's call it p, there
is a potential trap in using the pointer because the binding of the structure
member operator, ., is higher than the * dereference operator.
Hence *p.thing means lookup the member "thing" of p,
and use that as an address for the dereference. What you really want is
(*p).thing. This is a bit ugly, so C
provides the -> operator.
.
struct my_struct_s my_struct;
struct my_struct_s *struct_ptr;
.
struct_ptr = &my_struct;
(*struct_ptr).thing = 1; /* "thing" = 1 in struct pointed to by struct_ptr*/
struct_ptr->thing = 1; /* Same as above */
.
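The -> operator comes into its own with the list_s structure shown earlier. Here is a minimal sketch of linking nodes together; the function names list_insert_after and list_length are invented for illustration.

```c
#include <stddef.h>

struct list_s {
    struct list_s *prev;
    struct list_s *next;
    void *data_ptr;
};

/* Hypothetical helper: link new_node in immediately after node */
void list_insert_after( struct list_s *node, struct list_s *new_node )
{
    new_node->prev = node;
    new_node->next = node->next;
    if ( node->next != NULL )
    {
        node->next->prev = new_node;
    }
    node->next = new_node;
}

/* Hypothetical helper: count the nodes from head to the end of the list */
int list_length( const struct list_s *head )
{
    int count = 0;

    while ( head != NULL )
    {
        count++;
        head = head->next;
    }
    return count;
}
```

Note that list_length changes its own local copy of the head pointer as it walks along, which does not affect the caller's pointer.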
This is a good place to introduce a program example kindly provided by Rob Cannings. This uses cunning (Cannings ?) pointer manipulation to create a binary sorted tree.
/*---- Illustration of pointer manipulation ("treesort.c") -------------------*/
/* Example provided by Rob Cannings: */
/* (Excess white space removed by Phil O. ;-)) */
/* We implement a sorting routine with the sorted list stored in a tree. */
/* ANSI C Headers */
#include <stdlib.h>
#include <stdio.h>
/* Structures */
struct treeNode {
int data;
struct treeNode *pLeft;
struct treeNode *pRight;
};
/* Function prototypes */
void AddNode(struct treeNode **ppNode,struct treeNode *pNewNode);
void Dump(struct treeNode *pNode);
/* Defines and macros */
#define NUMBER_OF_NUMBERS 4
/* Main Program starts here */
int main(int argc,char *argv[])
{
int i = 0;
int toBeSorted[NUMBER_OF_NUMBERS] = { 93, 27, 15, 47};
struct treeNode dataNode[NUMBER_OF_NUMBERS];
struct treeNode *pSortedTree = NULL;
struct treeNode *pNewNode = NULL;
/* End of declarations ... */
/* Initialise one node for each item of data */
for (i = 0; i < NUMBER_OF_NUMBERS; i++)
{
dataNode[i].pLeft = NULL;
dataNode[i].pRight = NULL;
dataNode[i].data = toBeSorted[i];
}
/* Build a sorted tree out of the data nodes, printing it */
/* out after each new node is added to the tree */
pSortedTree = NULL; /* the tree starts as just a stump */
for (i = 0; i < NUMBER_OF_NUMBERS; i++)
{
pNewNode = &dataNode[i];
AddNode(&pSortedTree,pNewNode);
printf("\nSorted list of %d items:\n",i + 1);
Dump(pSortedTree);
}
exit(EXIT_SUCCESS);
}
void AddNode(struct treeNode **ppSortedTree,struct treeNode *pNewNode)
{
struct treeNode *pCurrentNode = NULL;
/* End of declarations ... */
pCurrentNode = *ppSortedTree; /* ppSortedTree is a pointer to a pointer */
/* Have we reached the end of a branch ? */
if (pCurrentNode == NULL)
{
*ppSortedTree = pNewNode;
}
else
{
/* We have not reached the end of a branch */
if (pCurrentNode->data > pNewNode->data)
{
AddNode(&(pCurrentNode->pRight),pNewNode);
}
else
{
AddNode(&(pCurrentNode->pLeft),pNewNode);
}
}
}
void Dump(struct treeNode *pNode)
{
/* End of declarations ... */
if (pNode != NULL)
{
Dump(pNode->pLeft);
printf("%d\n",pNode->data);
Dump(pNode->pRight);
}
}
Programming Challenge 6
_______________________
Compile and link "treesort.c" with the debugger. Step through and
experiment with looking at pointers, and looking at the things they
point to, e.g. EXAMINE *pNode . Modify the program so you can add
numbers with a single argument function call.
Sometimes it is useful to know what offset a structure member has from the
start of the structure. There is a useful macro defined in
<stddef.h> called offsetof which will calculate
the offset of a structure member from the start of the structure.
byte_offset = offsetof(struct my_struct_s, thing);
The first argument to the offsetof macro is a TYPE, not a variable
name. An example of this is shown in the "key.c" example program later in the
course.
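In the meantime, here is a quick sketch of offsetof in action. The layout of my_struct_s is invented for illustration, and so is the function name. The actual offsets depend on your compiler's padding rules, so the code checks a relationship rather than a particular number.

```c
#include <stddef.h>

/* Hypothetical structure, invented for this sketch */
struct my_struct_s {
    char flag;
    int thing;
    double other;
};

/* Returns 1 if adding the offset to the address of the structure */
/* gives the address of the member, which it always should.       */
int offset_matches_address( void )
{
    struct my_struct_s my_struct;
    size_t byte_offset = offsetof( struct my_struct_s, thing );

    return ( (char *)&my_struct + byte_offset ) == (char *)&my_struct.thing;
}
```

The first member of a structure always has an offset of zero, so offsetof(struct my_struct_s, flag) is 0 on any compiler.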
The typedef statement lets you define a new name for a
pre-existing type. It doesn't create a new type itself. An example should make
the usage clear. Imagine you wanted to store coordinates, and initially you
thought they could all fit in a short int. You might decide to
typedef the coordinate declarations like this:
typedef short int Coordinate_t;
.
Coordinate_t x[MAX_POINTS], y[MAX_POINTS];
.
Later on it might transpire that increased resolution means that you need more than a short int. All you need do then is
typedef long int Coordinate_t;
Be careful and sparing in your use of typedef. Don't use
typedef for everything so that no-one can tell the true type of
anything. Some people like to use typedef with structures,
struct coord_s {
int x;
int y;
};
typedef struct coord_s Coordinate_t;
.
Coordinate_t points[MAX_POINTS];
.
points[i].x = 100;
points[i].y = 50;
.
whereas others argue that this masks the fact that coordinates are really structures and that it would be clearer to use
struct coord_s points[MAX_POINTS];
I would suggest that you put all your structure and typedefs in
one place, like in a header file, and use whatever makes the code uncluttered
and easy to follow. One place where I think typedef does improve
clarity is when defining pointers to functions.
typedef int (*verify_cb_func_ptr)( Bodget b, PxPointer cdata, PxCBstruct cbs );
declares verify_cb_func_ptr as a type name meaning "pointer to a function returning an
int, with 3 arguments of the types shown". Note that the type
returned by the functions themselves is int.
int verify_name( Bodget b, PxPointer cdata, PxCBstruct cbs );
.
verify_cb_func_ptr vcb;
.
vcb = verify_name;
i = (*vcb)( b, cdata, cbs); /* Note how to call function thru pointer */
.
The brackets around the (*vcb) are needed because the function
brackets () take precedence over *.
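Because Bodget, PxPointer and PxCBstruct are not defined in this course, here is a self-contained sketch of the same idea using plain ints. The type name binary_op_func_ptr and the three functions are hypothetical, chosen only for this illustration:

```c
/* Pointer to a function taking two ints and returning an int */
typedef int (*binary_op_func_ptr)( int a, int b );

int add_ints( int a, int b ) { return a + b; }
int mul_ints( int a, int b ) { return a * b; }

/* Call whichever function the caller hands us */
int apply_op( binary_op_func_ptr op, int a, int b )
{
    return (*op)( a, b );   /* op( a, b ) would do the same thing */
}
```

Calling apply_op( add_ints, 3, 4 ) passes the function name without brackets, in the same way that verify_name was assigned to vcb above.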
printf, which is a "stdio" function, the for (;;)
loop, and the exit(EXIT_SUCCESS) end-your-program function from
"stdlib". These functions, or others from these two libraries, are so commonly
used that it is a good idea to always include the <stdio.h>
and <stdlib.h> ANSI standard header files in all your
programs. Header #include files in C
can be specified in two ways:
#include <stdio.h>
and
#include "myheader.h"
The quoted "myheader.h" form starts searching in the same directory as the
file from which it is included, then goes on to search in an implementation
defined way. The angle bracketed <stdio.h> form follows "an
implementation defined search path". In practice the "implementation defined search
path" tends to be the system libraries. Under VAX C,
all the header files lived as .h files in the SYS$LIBRARY: directory.
Under DEC C, they live in text libraries like
DECC$RTLDEF.TLB and SYS$STARLET_C.TLB. On Windows using Visual
C++ 6.0 they are in
C:\Program Files\DevStudio\VC98\Include, assuming that you installed
Visual C++ on to your C: disk. If you want to know
the full search rules for VMS, type
$ HELP CC LANGUAGE_TOPICS PREPROCESSOR #INCLUDE
You should always use the angle bracket <> form for ANSI header files, and use the quoted form for your own headers, e.g.
#include "src$par:trntyp.h"
The # symbol introduces a preprocessor directive. When you
compile a C program, the first stage it goes
through is preprocessing, where all the # directives are obeyed,
and various inclusions and substitutions are made before the code is compiled.
The # sign must always be the first non-whitespace character on the
line, and is one of the few exceptions to the general free format of
C code. You can have spaces after the #,
and these are often useful when using #if constructs.
Another common preprocessor directive is #define . This can
be used to define "parameters" which you might want to use as array bounds for
example, but in addition it lets you define macros which take arguments and
produce inline code using the arguments. For example,
/* Some defines and macro definitions */
#define PI 3.14159265358979
#define MAX(a,b) (((a)>(b))?(a):(b))
#define STRING_SIZE 16
.
.
{
char string[STRING_SIZE]; /* Using a #define'd array bound */
.
}
Notice that there are no semicolons at the end of the #define
lines. Blanks before and after the "token sequence" (the body of the
macro or definition) are discarded, and you can use \ at the end of a line
to indicate that there is more of the macro on the next line. In the second form
of macro shown above, you cannot have a space between the identifier, MAX, and
the first "(", or the preprocessor will not know that the () delimit the
parameter list for the macro expansion. Also notice that (if you are a beginner)
you haven't got a clue what is going on with that MAX macro !
The #if, #else, #elif and
#endif conditional preprocessor directives are used to include code
selectively during preprocessing. They can be used to test if a particular macro
name has been defined (even as an empty string). A common use for this is
stopping the same header file contents being included more than once. For
example, imagine you had created a header file called "tla_utils.h".
/*---- My header file for my util routines, called "tla_utils.h" -------------*/
#if !defined( TLA_UTILS_H ) /* Could have used #ifndef TLA_UTILS_H */
#define TLA_UTILS_H
.
#if defined( __VMS ) /* Could have used #ifdef __VMS */
# include "vms_specific_stuff.h"
#elif defined( UNIX )
# include "inferior_unix_alternative.h"
#else
# include "oh_dear_it_must_be_dos.h"
#endif
.
/* Do some stuff that should only be done once */
.
#ifdef DEFINE_GLOBALS
# define EXT /* Expands to nothing, so the variables are DEFINED */
#else
# define EXT extern
#endif
.
#define MY_PROGRAM_ARRAY_LIMIT 100
.
EXT int tla_global_int;
.
EXT const float tla_global_pi
#ifdef DEFINE_GLOBALS
= 3.14159265358979
#endif
;
.
EXT char tla_title_string[]
#ifdef DEFINE_GLOBALS
= "Program Title"
#endif
;
.
int MyFunction( int meaningful_name ); /* This is not a function definition */
/* it is a "function prototype" which */
/* allows arg and return val checking */
.
#endif /* End of TLA_UTILS_H block */
This technique is widely used to enable selection of the correct code at compile time. Try
$ HELP CC Language_topics Predefined_Macros System_Identification_Macros
which will give you some of the predefined (by the compiler) macros that let you switch code on and off depending on, say, whether you are on a VAX or Alpha. See K&R II pages 91 and 232 for more information on this subject.
The definition of the EXT macro is another useful technique for ensuring that
you only DEFINE a variable once (ie. actually allocate space for, or initialize
a variable with a value). Macros are explained in more detail below, but
basically the text (if any) associated with the macro name is substituted
wherever the macro appears, before compilation proper begins. In your main
program, you #define DEFINE_GLOBALS and the header file then
becomes
.
int tla_global_int;
.
const float tla_global_pi = 3.14159265358979;
.
whereas any files of subroutines which don't #define DEFINE_GLOBALS
will process the same header fragment as
.
extern int tla_global_int;
.
extern const float tla_global_pi;
.
so the values are resolved at link time, and won't be contradictory to the main program. This technique saves having to have two versions of your header files (which inevitably get out of step).
#define MAX(a,b) (((a)>(b))?(a):(b))
.
maxval = MAX( maxval, this);
.
to this before the compiler proper ever saw it:
maxval = (((maxval)>(this))?(maxval):(this));
Removing some of the "guard brackets" you get this slightly more readable version
maxval = (maxval > this) ? maxval : this ;
The brackets around the parameters in the expansion are necessary to keep the meaning correct if, say, one of the arguments is a function call or a complex expression. Sometimes it is advisable to create a temporary variable to avoid "using" the parameters more than once, and this will be explained later. See pages 229 to 231 of K&R II for a fuller explanation of defining macros. Convention dictates that macros should be totally uppercase. This is certainly the style used in the ANSI header files, and it is generally best to make all your macros uppercase.
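To see what the guard brackets buy you, here is a small sketch with two hypothetical macros (DOUBLE_BAD and DOUBLE_GOOD are made up for this illustration) that differ only in their bracketing:

```c
/* Hypothetical macros, for illustration only */
#define DOUBLE_BAD( x )   x * 2        /* no guard brackets */
#define DOUBLE_GOOD( x )  ((x) * 2)    /* fully bracketed   */

int double_bad_of_sum( int a, int b )
{
    return DOUBLE_BAD( a + b );    /* expands to  a + b * 2      */
}

int double_good_of_sum( int a, int b )
{
    return DOUBLE_GOOD( a + b );   /* expands to  ((a + b) * 2)  */
}
```

With a = 1 and b = 2 the unbracketed version gives 5 rather than 6, because the substitution is purely textual and * binds tighter than +.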
The ? operator is a ternary operator, i.e. it takes three operands. It should be used sparingly, and is a shorthand as illustrated below:
value = (expression_1) ? expression_2 : expression_3;
is (more or less) equivalent to
if ( expression_1 )
{
value = expression_2;
}
else
{
value = expression_3;
}
The reason it is handy in macros is that it is best to avoid multiple ;
separated statements in a macro, because that could well change the meaning of
code. Macros tend to be invoked on the assumption that they are a single
statement and code meaning could change if they weren't, e.g.
if ( condition ) INVOKE_MACRO( bob );. By using the ?
operator you can get a single statement that still has some switching logic in
it. There is a trick to get round the single statement restriction, and still
behave nicely:
#define MULTI_STATEMENT_MACRO( arg ) do { \
first_thing; \
.
last_thing; \
} while (0) /* DONT put a ; at end ! */
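A concrete sketch of the do { } while (0) trick follows. SWAP_INTS and order_descending are hypothetical names invented for this example:

```c
/* Swap two int lvalues; safe to use as a single statement */
#define SWAP_INTS( a, b ) do {          \
        int swap_tmp = (a);             \
        (a) = (b);                      \
        (b) = swap_tmp;                 \
    } while (0)     /* no ; here, the caller supplies it */

/* Put the larger value first. Returns 1 if a swap was needed, else 0 */
int order_descending( int *first, int *second )
{
    if ( *first < *second )
        SWAP_INTS( *first, *second );   /* one statement, so the else pairs up */
    else
        return 0;
    return 1;
}
```

Had the macro body been a bare { ... } block, the ; after the invocation would have terminated the if and left the else dangling, which is a compile-time error.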
In C, an expression is TRUE if it is ANY nonzero value, or in the case of pointers, if it doesn't compare equal to NULL. The result of a relational, equality or logical operator is guaranteed to be 0 or 1, so
i = ( 2 > 1); /* Sets i to be 1 */
i = ( 1 > 2); /* Sets i to be 0 */
So, in our MAX example ((a)>(b)) will be 1, i.e. TRUE, if
a is greater than b, 0 otherwise. So "expression_1" is
TRUE if a > b. Hence the value of "expression_2" i.e.
a will be chosen. Otherwise "expression_3", in this case
b will be used.
"Why define MAX as a macro at all ?" you might ask (pause until someone asks). Well the reason is that if you used a function, you would need to write a version for floating point numbers, another for ints, another for long ints and so on. Of course, a macro can circumvent type checking, which some people don't like very much, so in C++ macros have been effectively eliminated for most purposes by "templates" which you can learn about in my STL Course.
When using the #if test mentioned in the "Header Files" section,
you can use relational tests on constant expressions. Here is an example of
checking that you are using Motif 1.2 or greater
#if (XmVERSION > 1 || (XmVERSION == 1 && XmREVISION >= 2))
XtSetArg( argl[narg], XmNtearOffModel, XmTEAR_OFF_ENABLED ); narg++;
#endif
The expression following the #if must either use the preprocessing
operator defined(identifier) (which returns 1 if identifier has been
#defined, else 0) or be a constant expression. This can be handy
for defining a number of levels of debugging information. The #if
is also the safest way to "comment out" unused code, rather than messing about
making sure you haven't illegally nested comments. For example:
#ifdef NEW_CODE_IS_RELIABLE
/* New code that should be faster but hasn't been tested as much as the old */
.
#else
/* Here is the old code that worked - don't want to remove it yet */
.
#endif
Clearly the #ifdef test will always fail in our lifetime because
the macro will never be defined, so the new code will not be compiled and the
old code will be used. This technique avoids problems caused by inadvertent
comment nesting.
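The "levels of debugging information" idea might look something like this sketch. DEBUG_LEVEL and report_value are hypothetical names, and the meaning given to each level is just an assumption for the example:

```c
#include <stdio.h>

#define DEBUG_LEVEL 1    /* hypothetical: 0 = silent, 1 = brief, 2 = verbose */

/* Print as much about x as the debug level allows; return messages printed */
int report_value( int x )
{
    int messages = 0;
#if DEBUG_LEVEL >= 1
    printf( "x is %d\n", x );
    messages++;
#endif
#if DEBUG_LEVEL >= 2
    printf( "x squared is %d\n", x * x );
    messages++;
#endif
    return messages;
}
```

Because the tests are done by the preprocessor, the verbose printf is not compiled at all at level 1, so it costs nothing at run time.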
Macros can be undefined using the #undef directive.
#define DEBUG 1
.
#ifdef DEBUG
printf("The value of x is %d in routine Funcy\n",x);/* Print out debug msg*/
#endif
.
#undef DEBUG
.
#ifdef DEBUG
printf("The value of x is %d in routine Gibbon\n",x); /* Not printed */
#endif
.
You will need to #undef a macro if you want to define it again with a
different value. Redefinitions that change the expansion aren't allowed. You can,
however, define a macro more
than once provided the tokens it expands to are the same, ignoring whitespace.
This is known as a "benevolent redefinition" and is often used to get identical
definitions of the NULL macro in several header files.
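A short sketch of both cases, using a made-up macro name BUFFER_LIMIT:

```c
#define BUFFER_LIMIT 100
#define BUFFER_LIMIT    100    /* benevolent: same tokens, only spacing differs */

#undef  BUFFER_LIMIT           /* needed before giving it a different value */
#define BUFFER_LIMIT 200

int buffer_limit( void )
{
    return BUFFER_LIMIT;       /* the most recent definition is used */
}
```

If you remove the #undef line, a conforming compiler should complain about the incompatible redefinition.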
Avoid starting your macro names with _ and in particular __ because names
beginning with an underbar are reserved for the implementation, and double
underbars are used for macros predefined by the standard. For example, the
standard reserves __LINE__, __FILE__, __DATE__,
__TIME__ and __STDC__. Look in K&R II page 233 for
the meanings of these.
Occasionally it is useful to be able to use the macro arguments as strings. This is done by using the # directly in front of the argument.
#define DEBUG_PRINT_INT( x ) (printf("int variable "#x" is %d",x))
#ifdef DEBUG
DEBUG_PRINT_INT( i ); /* Prints "int variable i is 10" or whatever */
#endif
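Here is a variation you can check without a debugger: a sketch that writes the text into a buffer instead of printing it. NAME_OF, FORMAT_INT and describe_nitems are hypothetical names, and the caller is assumed to supply a buffer that is big enough:

```c
#include <stdio.h>

#define NAME_OF( x )  #x       /* turns the argument into a string literal */

/* Write "<name> is <value>" into buf, which must be large enough */
#define FORMAT_INT( buf, x )  sprintf( (buf), "%s is %d", #x, (x) )

/* Returns the number of characters written, as sprintf does */
int describe_nitems( char *buf, int nitems )
{
    return FORMAT_INT( buf, nitems );   /* buf receives "nitems is <value>" */
}
```

Note that the string produced is the text of the argument as written at the point of use, here "nitems", not the name of some variable further up the call chain.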
Concatenation of macro arguments is also possible using the ## operator. Some people like commas in big numbers, so you might use it like this:
#define NICKS_MEGA_INT(a,b,c) a##b##c
.
int i;
.
i = NICKS_MEGA_INT( 10,000,000 ); /* same as 10000000 after expansion */
.
Then again, you might not. As a final thought for this section, I will demonstrate a couple of benign uses for the ? operator - it's not just there for the nasty things in life.
got_space = GetSpace(how_much); /* Returns NULL if it fails */
printf( got_space ? "Success\n" : "Failure\n");
.
/* Avoid ACCVIO if pointer is NULL */
printf( "Name is %s\n", name_ptr[i] ? name_ptr[i] : "**Unknown**" );
.
/* Handle plurals */
printf( "Found %d item%s\n", nitems, (nitems != 1) ? "s" : "" );
The ACCVIO avoidance works because the expression that is NOT selected is
guaranteed to be "thrown away", so the NULL pointer is never
dereferenced.
Finally, remember my mentioning that it was a good idea to only reference macro arguments once if the macro was to be used like a function ? The X Toolkit Intrinsics macro, XtSetArg, doesn't follow this sound advice. It is defined like this:
#define XtSetArg(arg, n, d) \
((void)( (arg).name = (n), (arg).value = (XtArgVal)(d) ))
Notice that (arg) is referenced twice, but only appears once in the macro argument list. Hence the intuitive usage
XtSetArg( argl[narg++], XmNtearOffModel, XmTEAR_OFF_ENABLED );
actually increments narg by two, not one. It therefore has to be used something like this
XtSetArg( argl[narg], XmNtearOffModel, XmTEAR_OFF_ENABLED ); narg++;
If they had defined it like this
#define XtSetArg(arg, n, d) \
do { Arg *_targ = &(arg); \
(void)( _targ->name = (n), _targ->value = (XtArgVal)(d) ); \
} while (0)
you would be able to use the argl[narg++] form. This is something to be aware of if your pre- or post-increments and decrements seem to be behaving strangely. Obviously, you should not actually redefine standard macros, because this can lead to even more confusion. Create your own version, like SETARG, if you feel the need.
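You can watch the double increment happen without any X headers. The sketch below uses a hypothetical stand-in structure, demo_arg_s, and two made-up macros written in the two styles just discussed; none of these names come from the X Toolkit:

```c
/* Hypothetical stand-in for the Xt Arg structure, so no X headers are needed */
struct demo_arg_s {
    const char *name;
    long        value;
};

/* References "arg" twice, in the same style as XtSetArg */
#define SETARG_TWICE( arg, n, d ) \
    ((void)( (arg).name = (n), (arg).value = (long)(d) ))

/* References "arg" once, via a temporary pointer */
#define SETARG_ONCE( arg, n, d ) \
    do { struct demo_arg_s *targ_ptr = &(arg);              \
         targ_ptr->name = (n); targ_ptr->value = (long)(d); \
    } while (0)

int narg_after_twice( void )
{
    struct demo_arg_s argl[4];
    int narg = 0;
    SETARG_TWICE( argl[narg++], "tearOffModel", 1 );  /* narg++ expanded twice */
    return narg;
}

int narg_after_once( void )
{
    struct demo_arg_s argl[4];
    int narg = 0;
    SETARG_ONCE( argl[narg++], "tearOffModel", 1 );   /* narg++ expanded once */
    return narg;
}
```

The first function returns 2 and, worse, has put the name in argl[0] and the value in argl[1]. The second returns 1 as intended.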
C     Means                       Fortran
>     Greater than                ( .GT. )   \
>=    Greater than or equal to    ( .GE. )    |  Same precedence as each
<     Less than                   ( .LT. )    |  other, below * / + -
<=    Less than or equal to       ( .LE. )   /
==    Equal                       ( .EQ. )   \   Same precedence as each
!=    Not equal                   ( .NE. )   /   other, just below < etc.
They are left-to-right associative, and have lower precedence than the arithmetic operators, so the arithmetic on either side is done before the comparison is made. E.g.
if ( x*3 > y )
{
.
}
guarantees that x will have been multiplied by three before comparison with y.
A word of caution about the equality operator, == . It is very easy to miss out the second = and this will still be a legal expression. Example:
if ( x = 3*12 )
{
.
}
will always be true. This is because, in C,
expressions have a value, propagated right-to-left. So the value of
( x = 3*12 ), which calculates the right-hand side, 36, and assigns it
to x, is 36, which is nonzero and hence always true. So that
mistake will cause the if {} body always to be executed, and worse than that you
will have unknowingly changed the value of x. To avoid this, some
people like to write the test the other way round, e.g.
if ( 3*12 == x )
{
.
}
Now, if you miss off the second = you have an illegal expression because you
cannot assign 3*12 = x, because 3*12 is not
an lvalue (a modifiable location or symbol, which can be on the
left-hand side of the = sign in an expression).
The logical operators are (in decreasing precedence)
C Means Fortran
&& - Logical AND ( .AND. )
|| - Logical OR ( .OR. )
and are below the relational operators in precedence. Hence the expression
if ( j > 0 && i*3 > 12 || i != k ) ...
is the same as
if ( ( ( j > 0 ) && ( (i*3) > 12 ) ) || ( i != k ) ) ...
See K&R II page 52 for operator precedence. Most people don't remember
these, but use brackets to make the meaning of more complex expressions quite
clear. The ! as a unary operator is similar to
Fortran .NOT., so ( !x ) is true if
x is equal to zero.
<< - Left shift, bring in zero bits on right.
>> - Right shift. Bring in 0s on left for unsigned integers,
implementation defined for signed integers.
~ - One's complement. Unary operator, changes 0s to 1s and 1s to 0s
& - Bitwise AND, do not confuse with logical &&
| - Bitwise INclusive OR, do not confuse with logical ||
^ - Bitwise EXclusive OR
Here are some examples:
i = i << 2; /* Multiply by 4 */
i <<= 2; /* Same as above */
mask |= MSK_RW; /* Set the bits in mask that are set in MSK_RW */
valuemask = GCForeground|GCBackground; /* Set the bits that are the OR of */
/* GCForeground and GCBackground */
mask = ~opposite; /* mask is complementary bit pattern to opposite */
mask |= 1UL << MSK_R_V; /* Shift unsigned long 1 left MSK_R_V bits */
/* and set that bit in mask */
These are very useful for setting and unsetting flag bits, but you must be aware of the size of object that you are dealing with. By their very nature, bitwise operators can make code less portable.
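A minimal sketch of setting, clearing and testing flag bits is shown below. The flag names and the three helper functions are hypothetical, and unsigned long is used throughout so that the size of the object is not in doubt:

```c
/* Hypothetical flag bits, for illustration only */
#define MSK_READ   (1UL << 0)
#define MSK_WRITE  (1UL << 1)
#define MSK_EXEC   (1UL << 2)

/* Turn on every bit in mask that is set in bits */
unsigned long set_bits( unsigned long mask, unsigned long bits )
{
    return mask | bits;
}

/* Turn off every bit in mask that is set in bits */
unsigned long clear_bits( unsigned long mask, unsigned long bits )
{
    return mask & ~bits;
}

/* Nonzero only if ALL of the requested bits are set in mask */
int all_bits_set( unsigned long mask, unsigned long bits )
{
    /* == binds tighter than &, so these brackets are essential */
    return ( mask & bits ) == bits;
}
```

Without the brackets in all_bits_set, the expression would be parsed as mask & ( bits == bits ), which only tests the bottom bit of mask.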
Programming Challenge 7
_______________________
Use the bitwise operators to determine what your machine does
with a right shift of a negative integer. Write some bit
manipulation and checking functions. Check the priority of the
bitwise operators and see how this affects the bracketing of your
tests and expressions.
.h (header) file with the prototypes for those
functions. The reason prototypes are so useful is that they allow the compiler
to check that you are calling a function with the right number of arguments, and
that the arguments themselves are of the correct type. You should NEVER ignore
warnings about argument numbers or types, and you should only cast (see later,
but briefly, the "cast" (float)3 is like the Fortran
FLOAT(3) ) if you are absolutely sure what you are doing !
Notice that the function prototype (for power in this example) is exactly the same as the function header, but with a ; where the body of the function would go. The arguments named in the prototype are optional, so we could have declared "int power( int , int );" . Don't ever do this. Give the arguments either the same names as those in the function definition, or maybe a more verbose name, so that someone looking at your header file with your function prototypes can easily work out how they are meant to be called.
/*---- To sign or not to sign, that is the example ("charsign.c") ------------*/
/* ANSI C Headers */
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Function prototypes */
int power( int base, int n );

/* Main Program starts here */
int main( int argc, char *argv[] )
{
    char c;
    unsigned char uc;
/* End of declarations ... */

/* Set the top bit of both characters */
    c  = power( 2, (CHAR_BIT-1) );
    uc = power( 2, (CHAR_BIT-1) );

/* Shift them both right by one bit, >> is the right shift operator */
    c  >>= 1;
    uc >>= 1;

/* Check for equality - check out the ? ternary operator ! */
    printf("Your computer has %ssigned char.\n", ( c == uc ) ? "un" : "" );

    exit(EXIT_SUCCESS);
}

/*---- Function to raise integer to a power, nicked from K&R II, page 25 -----*/
int power( int base, int n )
{
    int i, p;
/* End of declarations ... */

    p = 1;
    for ( i = 1; i <= n; i++) {
        p = p * base;
    }
    return( p );
}
In this example, power is a function that returns an int. You can return any
type except an array. However, you can return structures (which might contain an
array). Similarly, you can pass structures as arguments. In general, it is best
to avoid passing or returning structures, because there may be extra overhead
due to structures being larger than machine registers, hence they are often
passed on the stack. Return or pass a pointer instead. DON'T return a pointer
to a function-local, automatic object ! Either make the user pass you a maximum
size and some memory into which you can write your structure/array, or
malloc() it and return that. In the latter case you should document
somewhere that it is up to the user to free() the memory when they
are done with it.
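The two recommended styles can be sketched as follows. The function names, the GREETING_PREFIX macro and the return codes are assumptions made for this example, not part of any library:

```c
#include <stdlib.h>
#include <string.h>

#define GREETING_PREFIX "Hello, "

/* Style 1: the caller supplies the memory and tells us how big it is. */
/* Returns 0 on success, -1 if the buffer is too small.                */
int fill_greeting( char *dest, size_t dest_size, const char *name )
{
    size_t needed = strlen( GREETING_PREFIX ) + strlen( name ) + 1;

    if ( needed > dest_size ) {
        return -1;
    }
    strcpy( dest, GREETING_PREFIX );
    strcat( dest, name );
    return 0;
}

/* Style 2: we malloc() the memory. The CALLER must free() the result. */
/* Returns NULL if the allocation fails.                               */
char *make_greeting( const char *name )
{
    size_t needed = strlen( GREETING_PREFIX ) + strlen( name ) + 1;
    char *result = malloc( needed );

    if ( result != NULL ) {
        strcpy( result, GREETING_PREFIX );
        strcat( result, name );
    }
    return result;
}
```

Neither function returns the address of one of its own automatic variables, so the result is still valid after the function has returned.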
If your function doesn't actually return a value, like a
Fortran SUBROUTINE, it is declared as
void. The void keyword is also used to indicate that a function
takes no arguments, for example:
void initialize_something(void);
would be used like this
initialize_something();
The brackets are necessary, even though there are no arguments, so the compiler can tell that you intend to call a function.
Programming Challenge 8
_______________________
Hack the "charsign.c" example to try and call power() with the
wrong type of argument (you might declare a float variable and use
that). See what compiler message you get. Call it with the wrong
number of arguments (but leave the prototype unchanged). Create a
new function, powerf() that lets you raise a floating point number
to any power. Try HELP CC RUN-TIME_FUNCTIONS LOG and HELP CC
RUN-TIME_FUNCTIONS EXP for clues. The print format conversion
character for a floating point number in printf is "%f". Compile
your program and wonder why you get the error
%CC-I-IMPLICITFUNC, In this statement, the identifier "exp" is
implicitly declared as a function.
Remember that when you typed HELP CC RUN-TIME_FUNCTIONS EXP it
told you to stick "#include <math.h>" in your program. Put it in and
the error should go away. If you chose to make your function
something like
float powerf( float base, float exp);
Think about the fact that exp() didn't whinge when you passed it a
float. This is because when arguments are passed (by value always in
C), they are (if possible) converted, AS IF BY ASSIGNMENT, to the
type specified in the function prototype. The order of evaluation of
arguments is unspecified, so never rely on it. See K&R II pages 45
and 201-202 for a detailed description of this behaviour. Finally
when you have your powerf function working, think what a git I am
for not mentioning the "double pow(double base, double exp);" which
also exists in <math.h>.
(type_I_want_to_cast_to) expression_I_want_to_cast
For example
int index = 1;
float realval = 0.0;
index = (int)realval;
Because, as explained in the example, this is done by default when calling functions for which good prototypes have been declared, it is generally only useful if calling older style "Classic C" functions where the argument types have not been declared. E.g.
float funcy(); /* We know that this actually takes a double argument */
.
float f = 0.0;
.
f = funcy( 100 ); /* Unpredictable result */
f = funcy( (double)100 ); /* f is 10.0 */
Declarations can be quite complicated, and you should read and understand K&R II, pages 122 to 126. There is a very good set of rules and a diagram for parsing declarations in "Expert C Programming", pages 75 to 78, and I strongly recommend everyone to read this.
Try to avoid casting, except in the circumstances defined above, and possibly
when using the RTL function malloc().
stdin,
stdout and stderr, which are usually the keyboard, the
terminal and the terminal again respectively. If you have included
<stdio.h> (which you should have) the symbols
stdin, stdout and stderr are available
for your use. The functions from <stdio.h> that you have seen so
far, like printf, write to stdout. Others, like
scanf, read from stdin. Here is an example of using
scanf to read keyboard input.
/*---- Keyboard Input C Example ("input.c") ----------------------------------*/
/* ANSI C Headers */
#include <stdio.h>
#include <stdlib.h>

/* Main Program starts here */
int main( int argc, char *argv[] )
{
    int i = 0;
    float f = 0.0;
    char string[80] = {'\0'};
/* End of declarations ... */

    printf("Enter a string, a decimal and a real number separated by spaces\n");
    scanf("%s %d %f", string, &i, &f); /* Not good - no string length check */
    printf("You entered \"%s\", %d and %f\n", string, i, f);

    exit(EXIT_SUCCESS);
}
Compile the program and enter some data. Here is some example input and output.
Enter a string, a decimal and a real number separated by spaces
Hello 4 3.14159
You entered "Hello", 4 and 3.141590
Each item is delimited by whitespace (which includes new lines, of course), but
you can use a scanset format specifier to overcome this, "%[characters_wanted]".
See the DEC C Run-Time Library Reference Manual,
Chapter 2, and Table 2-3, and K&R II page 246 for more information on this.
The scanf function actually returns an integer value, which is the
number of items successfully read in, or the predefined macro value EOF if the
end of the input or an error was encountered before anything could be read.
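It is worth checking that return value. The sketch below uses sscanf, the sibling of scanf that reads from a string rather than from stdin, so that it can be tried on fixed text. The function name parse_int_and_float is hypothetical:

```c
#include <stdio.h>

/* Returns the number of items converted (0, 1 or 2), or EOF if the */
/* text ran out before anything could be converted                  */
int parse_int_and_float( const char *text, int *i, float *f )
{
    return sscanf( text, "%d %f", i, f );
}
```

Given "4 3.14159" this returns 2, given "4 abc" it returns 1 because the second conversion fails, and given an empty string it returns EOF. Anything other than 2 means that at least one of the variables was not filled in.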
Because it is awkward to limit string input safely with scanf (the
maximum field width has to be hard-coded in the format string, and if you leave
it out, as above, you could unintentionally overwrite important memory locations
and cause your program to crash by entering a string longer than the memory
allocated for it), it is far better to use fgets().
What you do is read a limited length string with fgets(), the
prototype for which is
char *fgets( char *str, int maxchar, FILE *file_ptr).
So if we had a first argument, destination_string, declared as
char destination_string[STRING_SIZE], we would use
sizeof(destination_string) for maxchar,
and stdin as the input FILE stream.
.
char destination_string[STRING_SIZE];
char *pszResult;
.
pszResult = fgets( destination_string, sizeof(destination_string), stdin );
if ( !pszResult )
{
/* Error or EOF (End Of File) */
.
}
else
{
if ( destination_string[strlen(destination_string)-1] != '\n' )
{
/* Length limit means we didn't get all the input string */
.
}
else
{
/* Got it all - do sscanf() or whatever */
.
}
}
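The newline test in the fragment above can be wrapped up in a small helper. This is only a sketch, with a hypothetical function name, and it assumes the string came from a successful fgets() call:

```c
#include <string.h>

/* Remove a trailing newline, if there is one.                           */
/* Returns 1 if a newline was found and removed (the whole line was      */
/* read), or 0 if not (the line was cut short, or the input ended        */
/* without a final newline).                                             */
int strip_newline( char *line )
{
    size_t len = strlen( line );

    if ( len > 0 && line[len - 1] == '\n' ) {
        line[len - 1] = '\0';
        return 1;
    }
    return 0;
}
```

The len > 0 check guards against indexing before the start of the array if the function is ever handed an empty string.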
The fgets() function stops reading after the first newline
character is encountered, or, if no newline is found, it reads in at most
maxchar-1 characters. In either case the string is terminated with
'\0'. You can tell whether the length limit cut in by checking