
A Ridiculous Situation

The craziest include structure I've ever seen.

November 07, 2014. Filed under: tech, #software development.

One of the pieces of code I’m maintaining has an absurd situation in its build structure—honestly, I’m not sure how it ever compiled. For simplicity’s sake, let us assume the four following files:

  • main.c
  • secondary.c
  • writer.h
  • calculator.h

The project has many more files than this, of course, but these are the important ones for demonstrating this particular piece of insanity (which shows up many places in the codebase).

I’m reproducing here some dummy code representing an actual set of relationships in the codebase. The functions and module names have been changed; the relationships between the pieces of code have not.[1] When I started trying to build the program that included what I am representing as main.c below, this is the basic structure I found:

main.c

This is the main module of the program. In the actual code where I found this particular morass, it was generated by the UI builder in Visual Studio 6[2] and then turned into an unholy mess by a developer whose idea of good programming involved coupling the various parts of the code as tightly as possible.[3]

#include "calculator.h"
#include "secondary.h"

int a = 0, b = 0;

int addNumbers(int a, int b) {
    return a + b;
}

void doBadThingsWithGlobals(int * someNumber) {
    a = 6;
    *someNumber = 5;
}

#include "writer.h"

void main() {
    a = 3;
    doBadThingsWithGlobals(&b);
    addNumbers(a, b);
    doStuffWithNumbers(a,b);
    subtractNumbers(b, a);
}

// More insanity follows...

Yes, the main function and the doBadThingsWithGlobals function are both modifying global state, and yes, there is an include statement midway down through the module. (Just wait till you see what it does.)

“secondary”

Here is a secondary module which has been somewhat cleaned up. It has normal relationships between header and source files, and includes all its dependency headers at the top of the file. It has a header which defines the public API for the module, and that even has inclusion guards on it.

secondary.h

#ifndef SECONDARY_H
#define SECONDARY_H

int doStuffWithNumbers(int x, int y);

#endif /* SECONDARY_H */

secondary.c

The doStuffWithNumbers function here calls addNumbers:

#include "secondary.h"
#include "calculator.h"

int doStuffWithNumbers(int x, int y) {
    return addNumbers(x, y);
}

“But wait!” you say. “That function isn’t defined here!” Ah, and you would be right, except that it doesn’t refer to the addNumbers function in main.c. It refers to a function implementation in calculator.h.

calculator.h

int addNumbers(int p, int q) {
    return p + q;
}

int subtractNumbers(int r, int s) {
    return r - s;
}

Strangely, this addNumbers function is identical to the one in main.c. Even more strangely, it is defined—not merely declared, actually defined—in the header file! Nor is this the only such function. Look at the details of writer.h, which was mysteriously included above in the middle of the main module.

writer.h

void writeStuff() {
    fprintf(stdout, "a: %d, b: %d", a, b);
}

Once again, we have a full-fledged implementation in the header file. Why, you ask? Presumably because the developer responsible for writing this code never quite got his head around how C’s build system works. The entirety of one of the central components of this software, an element that in any normal build would be a common library, was a single, approximately 2,000-line header file. (Say hello to calculator.h up there; that’s what I’m abstracting away for this example.)[4]
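For contrast, here is a minimal sketch of the conventional layout, assuming the same two functions as the mock-up. The header would carry only declarations behind an include guard, and a single source file (a hypothetical calculator.c, which does not exist in the real codebase) would carry the definitions. I have collapsed both into one listing so that it compiles as a single unit; the comments mark where each file would begin.

```c
/* ---- calculator.h (sketch): declarations only ---- */
#ifndef CALCULATOR_H
#define CALCULATOR_H

int addNumbers(int p, int q);
int subtractNumbers(int r, int s);

#endif /* CALCULATOR_H */

/* ---- calculator.c (hypothetical file): the only definitions ---- */
/* In a real build this file would begin with: #include "calculator.h" */

int addNumbers(int p, int q) {
    return p + q;
}

int subtractNumbers(int r, int s) {
    return r - s;
}
```

With a split like this, any number of source files can include the header without each one receiving its own copy of the function bodies.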

Worse: writeStuff is printing the values of a and b, and no, I am not skipping some part of writer.h. It gets those from main.c, because the header was included after they were defined, and the build process essentially drops this header inline into main.c before compilation.[5] So here we have a header file with the implementation of a given piece of code, included in a specific location and defined in such a way that if you change where it is included, it will no longer function properly (since the variables will not have been defined!).
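One way out of that ordering trap, sketched here under the assumption that the function can simply be handed what it needs: the signature below is my invention for this example (the out, a, and b parameters are hypothetical; the real function takes none), but it no longer cares what happens to be in scope at the point of inclusion.

```c
#include <assert.h>
#include <stdio.h>

/* Sketch of a writeStuff that does not rely on globals named a and b
 * having been defined earlier in whichever file includes it.
 * The parameters are hypothetical; the original takes none.
 * Returns fprintf's result: the number of characters written,
 * or a negative value on error. */
int writeStuff(FILE *out, int a, int b) {
    assert(out != NULL);
    return fprintf(out, "a: %d, b: %d", a, b);
}
```

A caller would then write something like writeStuff(stdout, a, b), and the include could move to the top of the file with the others.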

Worse, there are conflicting definitions for one of the functions used in main.c, and because of its dependency on other functions in calculator.h (e.g. subtractNumbers in this mock-up), it cannot be removed! Moreover, given the many places calculator.h is referenced throughout the code base, it is non-trivial to refactor it.[6]

If this sounds insane… that’s because it is.

If you’re curious how I dealt with it, well… I renamed the addNumbers() function in main.c to _addNumbers() and put a loud, angry TODO on it for the current release, because the only way to fix it is to refactor this whole giant mess.

The takeaway of the story, if there is one, is that people will do crazier, weirder, worse things than you can possibly imagine when they don’t understand the tools they are using and just hack at them till they can make them work. The moral of the story? I’m not sure. Run away from crazy code like this? Be prepared to spend your life refactoring?

How about: try desperately not to leave this kind of thing for the person following you.


  1. That’s actually not wholly true, because these pieces of code are also duplicated in numerous places throughout the codebase. We’ve eliminated as many as possible at present… but not all of them, courtesy of the crazy dependency chains that exist. Toss in a dependency on Visual Studio 6 for some of those components, and, well… suffice it to say that we’re just happy there are only two versions floating around instead of the seven that were present when I started working with this codebase two and a half years ago.

  2. Yes, that Visual Studio 6. The one from 1998. Yes, that’s insane. No, we haven’t managed to get rid of it yet, though we’re close. So close.

  3. I am not joking. Multi-thousand line functions constituting the entirety of a program are not just normal, they are pretty much the only way that programmer ever wrote. When you see the code samples below, you will see why: someone was lacking an understanding of C’s build system.

  4. Also, that’s the piece of code of which I found seven different versions in various places when I started. Seven!

  5. While working on a different project for an entirely different client, I once ran into some code where there had been a strict 1,000-line limit on C source files, as part of an attempt to enforce some discipline in modularizing the code. Instead of embracing modularity, the developers just got in the habit of splitting the source file and adding #include statements at the end of each file so that they could keep writing their non-modular code.

  6. I have tried. Twice. I’m hoping that the third time will be the charm.