C Tricks

I’ve been coding in C99 quite a lot lately. I wanted to write down some of the tricks I’ve been using.

I don’t take credit for any of these ideas – I’ve either directly stolen these ideas from other coders, or rediscovered techniques that have probably been known for 40 years.

I will continue updating this document as I stumble across more useful tricks.

The code on this page is Public Domain, but this article is copyright according to the notice at the very bottom.

  1. Generic List Implementation
  2. X Macros
  3. Multiple Function Implementations
  4. Aborting When Malloc Fails
  5. Detecting Memory Leaks
  6. Simple Random Numbers
  7. Fast Prime Test
  8. Fast Circles
  9. Flexible Object Creation
  10. Faster atan2

Generic List Implementation

I use lists for everything. I wanted an easy way to create a list of any type. In C++, you would normally use templates – but since I’m using C, I decided to just use the preprocessor.

Suppose I wanted to create a list of floats. I start with this:

#define NAME(pfx, pst)    pfx ## float ## pst
#define TYPE              float

NAME takes a prefix and postfix. The ## will smash the words together. So NAME(list_, _new) will turn into list_float_new.

TYPE is just the C type of the list elements.

Given these two macros, how can we generate a generic list implementation? Easy:

#define LIST_TYPE_ST      NAME(list_, _st)
#define LIST_TYPE         NAME(list_, )

typedef struct {
  TYPE *vals;
  int size;
  int count;

LIST_TYPE NAME(list_, _new )();
void      NAME(list_, _push)(LIST_TYPE ls, TYPE v);
// ... more functions ...
void      NAME(list_, _free)(LIST_TYPE ls);

#undef LIST_TYPE

Followed by the implementation:

LIST_TYPE NAME(list_, _new)(){
  LIST_TYPE ls = malloc(sizeof(LIST_TYPE_ST));
  ls->size = 0;
  ls->count = 0;
  ls->vals = NULL;
  return ls;

void NAME(list_, _push)(LIST_TYPE ls, TYPE v){
  if (ls->size >= ls->count){
    ls->count = ls->size + 16;
    ls->vals = realloc(ls->vals, sizeof(TYPE) * ls->count);
  ls->vals[ls->size] = v;

void NAME(list_, _free)(LIST_TYPE ls){
  if (ls->vals)

Then, to generate all our list types that we’re interested in, we do something like this:

#define NAME(pfx, pst)    pfx ## float ## pst
#define TYPE              float
#include "list.gen.h"
#undef NAME
#undef TYPE

#define NAME(pfx, pst)    pfx ## ptr ## pst
#define TYPE              void *
#include "list.gen.h"
#undef NAME
#undef TYPE

// ... more NAME/TYPE pairs

X Macros

Generating code that iterates over a list is useful in a lot of situations.

Suppose you have a person, which includes a list of meshes for each body part.

You could do something like this:

typedef struct {
  mesh head;
  mesh shoulder;
  mesh l_backarm;
  mesh r_backarm;
  mesh l_forearm;
  mesh r_forearm;
  // ... more body parts
} person_st, *person;

person person_new(){
  person p = malloc(sizeof(person_st));
  p->head      = mesh_new();
  p->shoulder  = mesh_new();
  p->l_backarm = mesh_new();
  p->r_backarm = mesh_new();
  p->l_forearm = mesh_new();
  p->r_forearm = mesh_new();
  // ... those same body parts
  return p;

void person_free(person p){
  mesh_free(p->head     );
  mesh_free(p->shoulder );
  // ... all those body parts again

Boy, that gets repetitive!

Instead, consider something like this:

#define EACH_BODY_PART(X)    \
  X(head     )               \
  X(shoulder )               \
  X(l_backarm)               \
  X(r_backarm)               \
  X(l_forearm)               \
  // ... all the body parts

typedef struct {
  #define X(name)   mesh name;
  #undef X
} person_st, *person;

person person_new(){
  person p = malloc(sizeof(person_st));
  #define X(name)   p->name = mesh_new();
  #undef X
  return p;

void person_free(person p){
  #define X(name)   mesh_free(p->name);
  #undef X

Every time EACH_BODY_PART is expanded, the value of X has been changed, so it generates different code.

Now if you need to add a new body part, you just add it in one place, and everything else is updated.

Multiple Function Implementations

In C++, you might normally use polymorphism and virtual functions to create multiple implementations of a set of functions. In C, you have to be more direct, and use function pointers.

typedef void  (*animal_speak_func)();
typedef float (*animal_humanYears_func)(float age);

typedef struct {
  animal_speak_func f_speak;
  animal_humanYears_func f_humanYears;
} animal_funcs_st;

// dog implementation
void animal_dog_speak(){

float animal_dog_humanYears(float age){
  if (age <= 2.0f)
    return age * 5.0f;
  return 10.0f + (age - 2.0f) * 4.0f;

const animal_funcs_st animal_dog_funcs = {

// cat implementation
void animal_cat_speak(){

float animal_cat_humanYears(float age){
  if (age <= 1)
    return 15.0f * age;
  if (age <= 2)
    return 15.0f + (age - 1.0f) * 10.0f;
  return 25.0f * (age - 2.0f) * 4.0f;

const animal_funcs_st animal_cat_funcs = {

For really advanced users, you can actually generate a lot of this information if you’re simply given a “signature” file of all the functions in the implementation:

void animal_NAME_speak();
float animal_NAME_humanYears(float age);

I leave it as an exercise to the reader to parse a signature file and perform the appropriate string replacements to generate the typedef’s, animal_funcs_st, and const exports. Personally, my build scripts make heavy use of sed, but there are lots of ways to do it.

Aborting When Malloc Fails

This one is simple, but it never occurred to me until someone explicitly told me: if you run out of memory and malloc returns NULL, just log the error and abort the entire program. What else are you really going to do?

This simple change removes a lot of silly error checking code. Use wisely.

void *mem_alloc(size_t size){
  void *p = malloc(size);
  if (p == NULL){
    fprintf(stderr, "Fatal Error: Out of memory\n");
  return p;

void *mem_realloc(void *p, size_t size){
  p = realloc(p, size);
  if (p == NULL){
    fprintf(stderr, "Fatal Error: Out of memory\n");
  return p;

inline void mem_free(void *p){

Detecting Memory Leaks

If you funnel all memory allocations through custom functions, you can easily detect memory leaks.

Simply keep a linked list of all allocations. When a reallocation happens, update the appropriate node in the list. Then, when freeing memory, remove the node from the list.

At the end of the program, any items left in the linked list are memory leaks.

Something like this:

typedef struct mem_node_struct {
  void *ptr;
  const char *file;
  int line;
  struct mem_node_struct *next;
} mem_node_st, *mem_node;

static mem_node mem_list = NULL;

void *mem_debug_alloc(size_t size, const char *file, int line){
  void *p = mem_prod_alloc(size);
  mem_node m = mem_prod_alloc(sizeof(mem_node_st));
  m->ptr = p;
  m->file = file;
  m->line = line;
  m->next = mem_list;
  mem_list = m;
  return p;

// ...

#ifdef DEBUG
#define mem_alloc(s)  mem_debug_alloc(s, __FILE__, __LINE__)
// ...
#define mem_alloc(s)  mem_prod_alloc(s)
// ...

Using __FILE__ and __LINE__ allows you to see where the allocation occurs.

Make sure to correctly handle passing NULL to realloc and free – other than that, it’s a fairly standard linked-list implementation.

Simple Random Numbers

A lot of people seem to stress out over random numbers. Here’s a simple and fast function that passes many PRNG quality tests. Not recommended for cryptography, but fine for anything else.

Please, stop using slow PRNGs with huge states (i.e., Mersenne Twister).

static uint32_t seed = 0, i = 0;
uint32_t smush(){
  uint32_t m = 0x5bd1e995;
  uint32_t k = i++ * m;
  seed = (k ^ (k >> 24) ^ (seed * m)) * m;
  return seed ^ (seed >> 13);

void smush_seed(uint32_t s){
  seed = s;
  i = 0;

Fast Prime Test

This is a simple prime test that is fast:

bool isprime(int v){
  if (v < 0)
    return isprime(-v);
  if (v < 2)
    return false;
  if (v <= 3)
    return true;
  if ((v % 2) == 0 || (v % 3) == 0)
    return false;
  int max = (int)sqrt((double)v);
  for (int i = 5; i <= max; i += 6){
    if ((v % i) == 0 || (v % (i + 2)) == 0)
      return false;
  return true;

There is obviously a whole body of research in primality tests, but this one is simple, fast, and works.

Fast Circles

How do you create a circle? You could do this:

for (int i = 0; i < 100; i++){
  double ang = 2.0 * M_PI * i / 100.0;
  double x = radius * cos(ang);
  double y = radius * sin(ang);

  // use x and y

There’s a faster way! Instead of calculating sin and cos inside the loop, you can think of rotating a point by a small amount each iteration:

double ang = 2.0 * M_PI / 100.0;
double co = cos(ang);
double sn = sin(ang);
double x = radius;
double y = 0;
for (int i = 0; i < 100; i++){
  // use x and y

  // rotate x and y by ang
  double nx = x * co - y * sn;
  double ny = x * sn + y * co;
  x = nx;
  y = ny;

This is about 3.7 times faster. Be aware that the calculation is not exactly the same, and errors can pile up slowly over time (if you’re spinning, say, 1000 times around instead of just once).

Flexible Object Creation

It’s useful to come up with a general rule for creating and destroying objects in a code base. This isn’t rocket science, but consistency is useful and predictable. I do something like this:

Objects start as structures:

// Structures end in '_struct', the structure type ends in
// '_st', and the pointer type is just the object name
typedef struct foo_struct foo_st, *foo;

struct foo_struct {
  int var1;
  int var2;

Overall initialization via init/term:

// The init function is called once at program start, and
// initializes any global static variables
void foo_init();

// The term function is called once at program termination,
// and cleans up anything done during init
void foo_term();

Object creation via make/destroy:

// The make function specifically takes a structure, so
// that objects can be created without calling malloc
void foo_make(foo obj);

// The destroy function specifically takes a structure, so
// that objects can be destroyed without calling free
void foo_destroy(foo obj);

Object allocation via new/free:

// The new function allocates an object and initializes it
// via make
static inline foo foo_new(){
  foo obj = malloc(sizeof(foo_st));
  if (obj == NULL)
    return NULL;
  return obj;

// The free function frees an object after destroying it
static inline void foo_free(foo obj){

Faster atan2

This website outlines a faster atan2 algorithm. The accuracy is not as good as atan2f from math.h, but my tests show it is still pretty close, and quite a bit faster. Test for yourself.

// fastest version, with error of about +-0.07 radians
static inline float fast_atan2(float y, float x){
  static const float c1 = M_PI / 4.0;
  static const float c2 = M_PI * 3.0 / 4.0;
  if (y == 0 && x == 0)
    return 0;
  float abs_y = fabsf(y);
  float angle;
  if (x >= 0)
    angle = c1 - c1 * ((x - abs_y) / (x + abs_y));
    angle = c2 - c1 * ((x + abs_y) / (abs_y - x));
  if (y < 0)
    return -angle;
  return angle;

// slower version (but still faster than atan2f)
// with error of about +-0.01 radians
static inline float slow_atan2(float y, float x){
  static const float c1 = M_PI / 4.0;
  static const float c2 = M_PI * 3.0 / 4.0;
  static const float c3 = M_PI / 16.0;
  static const float c4 = M_PI * 5.0 / 16.0;
  if (y == 0 && x == 0)
    return 0;
  float abs_y = fabsf(y);
  float angle;
  if (x >= 0){
    float r = ((x - abs_y) / (x + abs_y));
    angle = c3 * r * r * r - c4 * r + c1;
    float r = ((x + abs_y) / (abs_y - x));
    angle = c3 * r * r * r - c4 * r + c2;
  if (y < 0)
    return -angle;
  return angle;

permanent link
back to top

posted 2 May 2016 by Sean
tags: c

View All Posts