
Baking a binary in a header with CMake (part 2)

Welcome back after a great wee Christmas break there! Well before Christmas, I started a blog post about baking a binary into a header with CMake! This is the follow-up.

The initial idea I had worked, but it was pretty darn slow! For a 256K file it took 1m34.107s to process, 2.72 kB/s! The body of the cmake script was;

string(LENGTH "${contents}" contents_length)
math(EXPR contents_length "${contents_length} - 1")

foreach(iter RANGE 0 ${contents_length} 2)
  string(SUBSTRING ${contents} ${iter} 2 line)
  file(APPEND "${BINARYBAKER_OUTPUT_FILE}" "0x${line},\n")
endforeach()

So my first idea was that taking a substring at an ever-increasing offset into the main string might be the slow part. I then tried always cutting the first two characters off the front of the main string instead, shrinking it on every iteration;

string(LENGTH "${contents}" contents_length)
math(EXPR contents_length "${contents_length} - 1")

foreach(iter RANGE 0 ${contents_length} 2)
  string(SUBSTRING ${contents} 0 2 line)
  file(APPEND "${BINARYBAKER_OUTPUT_FILE}" "0x${line},\n")

  # reduce contents to hopefully simplify substring!
  string(SUBSTRING ${contents} 2 -1 contents)
endforeach()

Wow – much slower at 1m45.864s and 2.41 kB/s!

Let’s change focus. I’ve not extensively used the REGEX mode of CMake’s string command (I remember the bad ol’ days when you didn’t even have a regex!), but it does look like it could help us in this case. My thinking is that I can use REGEX MATCHALL to match every pair of characters into the elements of a CMake list, and then just iterate through each element of the list;

string(REGEX MATCHALL ".." output "${contents}")

foreach(line IN LISTS output)
  file(APPEND "${BINARYBAKER_OUTPUT_FILE}" "0x${line},\n")
endforeach()

Much faster! We’ve reduced our runtime to 0m33.150s, and a whopping 7.72 kB/s. I was just about happy here, but the bigger the files I tried to bake, the more annoyed I was getting at the long processing time.

I ran the example on a huge file, and loaded up the VerySleepy profiler. I noticed that there was a ton of fopen, fwrite, fclose calls going on. So I wondered: if we avoid file IO and instead append to a variable, would it be faster?

string(REGEX MATCHALL ".." output "${contents}")

set(arraybody "")

foreach(line IN LISTS output)
  set(arraybody "${arraybody}0x${line},\n")
endforeach()

A resounding NO. The constant reallocations this causes ruined performance even worse than before, an insane 4m30.059s and 0.95 kB/s!

Ah well, I was done now… except that REGEX is really, really powerful – I’m just not that great with it (historically). I realised that a CMake list is actually just a string that uses semi-colons as the separator! As I am reading our file with the HEX option, I can guarantee that there will be no semi-colons in the string except the ones MATCHALL added when building the list. So I can then match each character pair, and replace the semi-colons with the separator pattern! And voila;

string(REGEX MATCHALL ".." output "${contents}")
string(REGEX REPLACE ";" ",\n  0x" output "${output}")

file(APPEND "${BINARYBAKER_OUTPUT_FILE}" "  0x${output}\n")

And the speed? 0m1.788s and an outstanding 143.18 kB/s! Much more like it and now I am definitely done.

The full script is;

# This file is distributed under the University of Illinois Open Source
# License. See LICENSE.TXT for details.

# Takes a file and embeds it in a C header with a given variable name

if(NOT DEFINED BINARYBAKER_INPUT_FILE)
  message(FATAL_ERROR
    "Required cmake variable BINARYBAKER_INPUT_FILE not set!"
  )
endif()

if(NOT DEFINED BINARYBAKER_OUTPUT_FILE)
  message(FATAL_ERROR
    "Required cmake variable BINARYBAKER_OUTPUT_FILE not set!"
  )
endif()

if(NOT DEFINED BINARYBAKER_VARIABLE_NAME)
  message(FATAL_ERROR
    "Required cmake variable BINARYBAKER_VARIABLE_NAME not set!"
  )
endif()

if(NOT EXISTS "${BINARYBAKER_INPUT_FILE}")
  message(FATAL_ERROR "File '${BINARYBAKER_INPUT_FILE}' does not exist!")
endif()

file(READ "${BINARYBAKER_INPUT_FILE}" contents HEX)

string(TOUPPER "${BINARYBAKER_OUTPUT_FILE}" header_ifndef)
string(REGEX REPLACE "[^A-Z]" "_" header_ifndef "${header_ifndef}")
set(header_ifndef "__${header_ifndef}__")

file(WRITE "${BINARYBAKER_OUTPUT_FILE}"
  "// This file is distributed under the University of Illinois Open Source\n")
file(APPEND "${BINARYBAKER_OUTPUT_FILE}"
  "// License. See LICENSE.TXT for details.\n")
file(APPEND "${BINARYBAKER_OUTPUT_FILE}" "#ifndef ${header_ifndef}\n")
file(APPEND "${BINARYBAKER_OUTPUT_FILE}" "#define ${header_ifndef}\n")

file(APPEND "${BINARYBAKER_OUTPUT_FILE}" "#ifdef __cplusplus\n")
file(APPEND "${BINARYBAKER_OUTPUT_FILE}" "extern \"C\" {\n")
file(APPEND "${BINARYBAKER_OUTPUT_FILE}" "#endif\n")

file(APPEND "${BINARYBAKER_OUTPUT_FILE}" "const char "
  "${BINARYBAKER_VARIABLE_NAME}[] = {")

string(REGEX MATCHALL ".." output "${contents}")
string(REGEX REPLACE ";" ",\n  0x" output "${output}")

file(APPEND "${BINARYBAKER_OUTPUT_FILE}" "  0x${output}\n")

file(APPEND "${BINARYBAKER_OUTPUT_FILE}" "};\n")

file(APPEND "${BINARYBAKER_OUTPUT_FILE}" "#ifdef __cplusplus\n")
file(APPEND "${BINARYBAKER_OUTPUT_FILE}" "}\n")
file(APPEND "${BINARYBAKER_OUTPUT_FILE}" "#endif\n")

file(APPEND "${BINARYBAKER_OUTPUT_FILE}" "#endif//${header_ifndef}\n\n")
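For the consuming side, here is a minimal sketch in C of how a baked header might be used. The array contents below are a hypothetical four-byte excerpt standing in for the generated header (a real build would #include the file the script wrote), and medium_size and medium_starts_with are made-up helper names, not part of the script.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a generated header: a real build would
   #include the file written by the CMake script instead. */
const char medium[] = {
  0x47,
  0x49,
  0x46,
  0x38
};

/* sizeof gives the byte count because the full array definition is visible. */
size_t medium_size(void)
{
  return sizeof(medium);
}

/* Returns 1 if the baked data begins with the given bytes, 0 otherwise. */
int medium_starts_with(const char *prefix, size_t prefix_length)
{
  return prefix_length <= sizeof(medium) &&
         memcmp(medium, prefix, prefix_length) == 0;
}
```

The sample bytes are all below 0x80 on purpose: with a plain char array, larger values rely on an implementation-defined conversion where char is signed, which some compilers warn about.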

Enjoy!


Baking a binary in a header with CMake (part 1)

When using CMake in projects in both my personal and professional life, there are always various tools I want to be able to use. One such tool takes a binary file and embeds it as a const char array in a C/C++ header. The problem is that I often cross-compile using CMake’s toolchain mechanism, and when you cross-compile, any tools you’ve built as part of the build are built for the cross-compiled target! No good when you need to run these tools within the very build you are executing.

After working round this problem for a number of years (usually by building for the host first and using those tools in the cross-compiled build) I discovered CMake’s script mode, whereby you call CMake with the ‘-P’ option and provide a CMake file to execute. With script mode, I can add a custom target within CMake that uses the ‘-P’ option as part of the normal build process to create the header. I can ensure that if someone changes the binary file I am embedding, the change is tracked and the header is regenerated. And all of this works for cross-compiled targets too!

Once I understood a little more about the CMake ‘-P’ mode, I created this CMake script file, which I’ve named ‘binarybaker.cmakescript’ – the *script suffix isn’t required, but I like it as it’s very descriptive of how I intend to use this file;

if(NOT DEFINED BINARYBAKER_INPUT_FILE)
  message(FATAL_ERROR
    "Required cmake variable BINARYBAKER_INPUT_FILE not set!"
  )
endif()

if(NOT DEFINED BINARYBAKER_OUTPUT_FILE)
  message(FATAL_ERROR
    "Required cmake variable BINARYBAKER_OUTPUT_FILE not set!"
  )
endif()

if(NOT DEFINED BINARYBAKER_VARIABLE_NAME)
  message(FATAL_ERROR
    "Required cmake variable BINARYBAKER_VARIABLE_NAME not set!"
  )
endif()

First off, I wrote some malformed parameter detecting code – I wanted to be sure I passed in the correctly named arguments to my script!

if(NOT EXISTS "${BINARYBAKER_INPUT_FILE}")
  message(FATAL_ERROR "File '${BINARYBAKER_INPUT_FILE}' does not exist!")
endif()

Next, I check that the input file specified actually exists!

file(READ "${BINARYBAKER_INPUT_FILE}" contents HEX)

The main revelation that made me realise I could bake a binary in CMake was that you can read a file using the HEX option – meaning we can meaningfully read a binary! This creates in the variable contents one long string akin to “AE12DFEA123…”

string(TOUPPER "${BINARYBAKER_OUTPUT_FILE}" header_ifndef)
string(REGEX REPLACE "[^A-Z]" "_" header_ifndef "${header_ifndef}")
set(header_ifndef "__${header_ifndef}__")

file(WRITE "${BINARYBAKER_OUTPUT_FILE}" "#ifndef ${header_ifndef}\n")
file(APPEND "${BINARYBAKER_OUTPUT_FILE}" "#define ${header_ifndef}\n")

Next, we need to generate an include guard. I convert the filename to UPPER, and then replace any characters in the filename that aren’t A-Z with underscores.

file(APPEND "${BINARYBAKER_OUTPUT_FILE}" "#ifdef __cplusplus\n")
file(APPEND "${BINARYBAKER_OUTPUT_FILE}" "extern \"C\" {\n")
file(APPEND "${BINARYBAKER_OUTPUT_FILE}" "#endif\n")

file(APPEND "${BINARYBAKER_OUTPUT_FILE}" "const char "
  "${BINARYBAKER_VARIABLE_NAME}[] = {")

Insert extern “C” for C++ to work as expected, and output the name of the array.

string(LENGTH "${contents}" contents_length)
# Need to minus one, as the foreach will go over the end of our var otherwise!
math(EXPR contents_length "${contents_length} - 1")

foreach(iter RANGE 0 ${contents_length} 2)
  string(SUBSTRING ${contents} ${iter} 2 line)
  file(APPEND "${BINARYBAKER_OUTPUT_FILE}" "0x${line},\n")
endforeach()

The juicy bit, where we actually output the file! First up, we get the length of the hex string we have. We then subtract one from this, because foreach(RANGE) includes its stop value, so the loop would otherwise run one step past the end of the string (this caused me a good 10 minutes of headscratching!). Then we loop over the length of the string in steps of 2, substring to read the two characters we are currently at in the loop, and append these to the output file.

file(APPEND "${BINARYBAKER_OUTPUT_FILE}" "};\n")

file(APPEND "${BINARYBAKER_OUTPUT_FILE}" "#ifdef __cplusplus\n")
file(APPEND "${BINARYBAKER_OUTPUT_FILE}" "}\n")
file(APPEND "${BINARYBAKER_OUTPUT_FILE}" "#endif\n")

file(APPEND "${BINARYBAKER_OUTPUT_FILE}" "#endif//${header_ifndef}\n\n")

And lastly the final code output to the resultant header to make it compile with C/C++.

So now we have our CMake script, and it processes our file! The only issue is it’s not the fastest.

+ ls -l -h medium.jpg
-rw-r--r--@ 1 neil  staff   256K  8 Dec 23:06 medium.jpg
+ cmake -DBINARYBAKER_INPUT_FILE=medium.jpg -DBINARYBAKER_OUTPUT_FILE=medium.jpg.h -DBINARYBAKER_VARIABLE_NAME=medium -P binarybaker.cmakescript
1m14.688s

So we are processing around 3.43kB/s – not very fast indeed!

In my next blog, I’ll take you through the rather random steps I took to optimise this script so that it ran in a sane amount of time!

What API choice I ended up with!

So I previously wrote a blog post called How to design API function that creates Something, where I was exploring various API design choices for a new project I am working on for my company. The post meandered through the choices available for one of the fundamental questions in API design – how should we create an object and return it to the user?

Since then, I’ve read the slides of the awesome Stefanus Du Toit from his CppCon2014 talk Hourglass Interfaces for C APIs – and this really blew my mind.

I’ve always instinctively disliked putting C++ code in headers, as I really hate not allowing people to use whatever languages/bindings they want when interfacing with my code. Plus, for the products I work on for my company, we have a lot of customers that want to use pure C – meaning a C++ header is just a write-off. While I could follow LLVM’s approach of generating a C++ header and then providing C bindings for it, I feel that this approach is always a hack. For instance, the LLVM C API will temporarily double allocate memory when returning char* strings, as the C++ API that it calls has to use std::string, which then has to be converted to a malloc’ed string as the C API must call free() on the resulting data!

For my approach, all functions are prefixed with an [identifier_] string, in this case we’ll use cat_ (I do like cats as an aside);

typedef int32_t cat_error;

/* opaque handle; assumes struct cat_type_s is defined privately by the implementation */
typedef struct cat_type_s* cat_type_t;

enum cat_error_e {
  cat_error_success        = 0,
  cat_error_failure        = 1,
  cat_error_spilt_the_milk = 2
};

cat_error cat_type_create(const char* name, uint8_t age, cat_type_t* out_cat_type);

All of the APIs in my code follow the above approach, bar none. Every function returns an error, and the caller is expected to respond to that error. All in parameters are specified first, then inout parameters, and lastly out parameters. All inout and out parameters have their names prefixed with inout_ and out_ respectively.
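To make the convention concrete, here is a minimal sketch of how cat_type_create could be implemented and paired with other calls. It assumes cat_type_t is an opaque pointer handle; the struct layout and the cat_type_get_age and cat_type_destroy functions are hypothetical additions for illustration, not part of the real API.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef int32_t cat_error;

enum cat_error_e {
  cat_error_success        = 0,
  cat_error_failure        = 1,
  cat_error_spilt_the_milk = 2
};

/* Assumption: cat_type_t is an opaque handle; this struct body is made up. */
typedef struct cat_type_s* cat_type_t;

struct cat_type_s {
  char*   name;
  uint8_t age;
};

/* in parameters first, out parameter last, error code as the return value */
cat_error cat_type_create(const char* name, uint8_t age, cat_type_t* out_cat_type) {
  if (name == NULL || out_cat_type == NULL) {
    return cat_error_failure;
  }

  cat_type_t cat = malloc(sizeof(*cat));
  if (cat == NULL) {
    return cat_error_failure;
  }

  /* take a private copy of the name so the caller's buffer can go away */
  size_t name_size = strlen(name) + 1;
  cat->name = malloc(name_size);
  if (cat->name == NULL) {
    free(cat);
    return cat_error_failure;
  }
  memcpy(cat->name, name, name_size);
  cat->age = age;

  *out_cat_type = cat;
  return cat_error_success;
}

/* Hypothetical accessor following the same in-then-out ordering. */
cat_error cat_type_get_age(cat_type_t cat_type, uint8_t* out_age) {
  if (cat_type == NULL || out_age == NULL) {
    return cat_error_failure;
  }
  *out_age = cat_type->age;
  return cat_error_success;
}

/* Hypothetical destructor; even this returns an error code. */
cat_error cat_type_destroy(cat_type_t cat_type) {
  if (cat_type == NULL) {
    return cat_error_failure;
  }
  free(cat_type->name);
  free(cat_type);
  return cat_error_success;
}
```

Because every function has the same shape, a caller can check each call the same way, and a C++ wrapper layered on top could translate the returned codes into whatever error mechanism it prefers.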

While this might not be ideal to all out there, what it does allow me to do is enforce a consistent API design throughout the product, and allow for much of the code to be self documenting (an aside – I also heavily use Doxygen for my documentation so even though I’ve tried to make the code as plain-text readable as possible, there are also pretty docs created for scouring!).

So far this approach has been welcomed by the users of the API (getting good feedback already) – it also is allowing me to focus on providing the C API functionality first and foremost, but with a mind to also adding the C++ API above the C API on the client side like Stefanus talked about in his CppCon14 talk!

How to design API function that creates Something

I’ve recently become totally obsessed with API design. I think this is in part because I’m about to embark on some projects where I really want to nail the API for developers, and in part because my work life involves implementing GPGPU compute APIs for clients of my company, Codeplay.

For background viewing, Casey Muratori has an awesome talk he gave back in 2004 about his time designing APIs at RAD. You should check out his talk ‘Designing and Evaluating Reusable Components’. Also, Matt Gemmell wrote a great blog post about his time designing reusable components for iOS and Mac OS X. You should check out ‘API Design’.

For this post, I’m concentrating on one of my primary unsolved annoyances in API design – how do you create something? It is such a core part of any API (create my widget X, physics hypercube Y, or even compiler Z) to be able to create things that the API has encapsulated for you, and then use them. The problem is – what if the creation goes wrong? How does the API tell you when something bad happened? Any suggestions that exceptions should be used for these examples will cause the suggester to suffer a similar fate as Old Yeller!

From my experience, there are 5 different ways in C to have an API that can create something, and also be able to signal an error occurred.

For each of these APIs I’m taking a bool and an int parameter as fluff parameters to signify some sort of user control over the creation, and we are creating an opaque type ‘Something’.

API A

int APICreateSomethingA(bool param1, int param2, Something * something);

For this API, we are returning an error code via the return value, and returning the Something via a pointer passed in as a parameter. This is similar to how OpenCL 1.2 creates cl_events in any of the clEnqueue* functions (like clEnqueueBarrierWithWaitList for instance).

The problems with this API design;

  • we just return an int, and nothing in the signature says what that int represents. This means that we require the header to have documentation saying that the return value is actually an error code, and all documentation related to this function must also explain what each value means.
  • what happens if a user passes in a NULL pointer for something? What is the expected behaviour?
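One possible answer to the NULL question is to treat it as a caller error and report it through the return value. The sketch below assumes Something is an opaque pointer; the struct body, the API_* error codes and the "negative param2 is invalid" rule are all made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumption: Something is an opaque pointer; this struct body is made up. */
typedef struct SomethingImpl { int value; } *Something;

/* Hypothetical error codes, not taken from the post. */
enum { API_SUCCESS = 0, API_INVALID_ARGUMENT = 1, API_OUT_OF_MEMORY = 2 };

int APICreateSomethingA(bool param1, int param2, Something * something)
{
  (void)param1; /* fluff parameter, unused in this sketch */

  /* a NULL out pointer is reported rather than dereferenced */
  if (something == NULL) {
    return API_INVALID_ARGUMENT;
  }

  /* leave the out parameter in a known state on every failure path */
  *something = NULL;

  if (param2 < 0) { /* made-up validation rule */
    return API_INVALID_ARGUMENT;
  }

  Something created = malloc(sizeof(*created));
  if (created == NULL) {
    return API_OUT_OF_MEMORY;
  }
  created->value = param2;

  *something = created;
  return API_SUCCESS;
}
```

Whichever behaviour is chosen, it still has to be written down in the documentation, which is exactly the first problem listed above.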

API B

Something APICreateSomethingB(bool param1, int param2, int * errorCode);

For this API, we are returning the Something via the return value, and returning the (now much more obvious) error code via a pointer parameter.

The problems with this API design;

  • what happens if a user passes in a NULL pointer for errorCode? Does this mean they don’t want the errorCode value? Don’t care about it? Or was it a mistake? If they didn’t pass in a valid pointer for errorCode, what do we do if something goes wrong?
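As a sketch of one policy (not the only defensible one), the version below treats a NULL errorCode as "the caller does not want the code" and falls back to signalling failure through a NULL Something. The struct body, the API_* codes and the validation rule are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumption: Something is an opaque pointer; this struct body is made up. */
typedef struct SomethingImpl { int value; } *Something;

/* Hypothetical error codes, not taken from the post. */
enum { API_SUCCESS = 0, API_INVALID_ARGUMENT = 1, API_OUT_OF_MEMORY = 2 };

Something APICreateSomethingB(bool param1, int param2, int * errorCode)
{
  Something created = NULL;
  int error = API_SUCCESS;

  (void)param1; /* fluff parameter, unused in this sketch */

  if (param2 < 0) { /* made-up validation rule */
    error = API_INVALID_ARGUMENT;
  } else {
    created = malloc(sizeof(*created));
    if (created == NULL) {
      error = API_OUT_OF_MEMORY;
    } else {
      created->value = param2;
    }
  }

  /* only write the code when the caller gave us somewhere to put it */
  if (errorCode != NULL) {
    *errorCode = error;
  }

  /* NULL doubles as the failure signal for callers that passed NULL */
  return created;
}
```

The cost of this policy is that a caller who passed NULL by mistake silently loses the reason for the failure.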

API C

Something APICreateSomethingC(bool param1, int param2);

For this API, we are returning the Something via the return, and don’t care to return an errorCode at all. APIs like this usually make the assumption that if Something is NULL there was an error, or we have a followup method akin to;

int APIGetSomethingErrorCode(Something something);

I detest this approach the most in all honesty – I’m including it merely because I have witnessed such atrocities in the past!

API D

Many modern languages (for instance both Swift and Python) allow multiple things to be returned from a function.

Swift has syntax like;

func minMax(array: [Int]) -> (min: Int, max: Int) {
  // ...
}

Now (fortunately or unfortunately depending on your slicing of the pie) C doesn’t have a similar mechanism. To get a similar approach, we use a little helper struct that contains both the error code and the Something.

typedef struct _APICreateSomethingResult
{
  int errorCode;
  Something something;
} APICreateSomethingResult;

APICreateSomethingResult APICreateSomethingD(bool param1, int param2);

This approach is a really simple and nice way to return two things from the API – it encapsulates them both within a single return type. I remember being taught at University that these sorts of shenanigans were evil incarnate – although I suspect that the days when returning 64 to 128 bytes from a function call was bad are long past! We should still take care not to return excessively large Somethings via this method though, as it will result in the compiler having to do much more fluffing about with the data, all of it unwarranted and unnecessary.

One of the downsides of doing this in C/C++ though is that we can’t discard the values we don’t want. One of the things about Python I especially love is that you can return five or six things and only use, say, one or two of them, meaning that in the language itself you are throwing away stuff you didn’t want and/or need. If we were able to do this in C/C++ we might be able to optimise the code a tad more.
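Going back to the helper struct, a minimal sketch of how API D might be implemented is shown here. As before, the Something struct body, the API_* codes and the validation rule are assumptions made purely for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumption: Something is an opaque pointer; this struct body is made up. */
typedef struct SomethingImpl { int value; } *Something;

/* Hypothetical error codes, not taken from the post. */
enum { API_SUCCESS = 0, API_INVALID_ARGUMENT = 1, API_OUT_OF_MEMORY = 2 };

typedef struct APICreateSomethingResult
{
  int errorCode;
  Something something;
} APICreateSomethingResult;

APICreateSomethingResult APICreateSomethingD(bool param1, int param2)
{
  /* start from "success, nothing created" and fill in as we go */
  APICreateSomethingResult result = { API_SUCCESS, NULL };

  (void)param1; /* fluff parameter, unused in this sketch */

  if (param2 < 0) { /* made-up validation rule */
    result.errorCode = API_INVALID_ARGUMENT;
    return result;
  }

  result.something = malloc(sizeof(*result.something));
  if (result.something == NULL) {
    result.errorCode = API_OUT_OF_MEMORY;
    return result;
  }

  result.something->value = param2;
  return result;
}
```

There are no pointer parameters left to be NULL, so the questions raised for API A and API B do not come up; the caller simply reads both fields of the returned struct.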

API E

LLVM/Clang uses a templated approach, something like;

template <typename T> struct ErrorOr;

The idea being that you either get an error, or you get the T you wanted. There is some footery that we can do with this if we can assume that the Something is a pointer, and it is at least 2 byte aligned.

typedef struct _APIErrorOrSomething
{
  union
  {
    intptr_t errorCode;
    Something something;
  };
} APIErrorOrSomething;

APIErrorOrSomething APICreateSomethingE(bool param1, int param2);

Then when we consume this APIErrorOrSomething struct, we do something like;

APIErrorOrSomething result = APICreateSomethingE(true, 42);
// the low bit tags the union: set means it holds an error code
bool haveError = (result.errorCode & 1) != 0;
if(haveError)
{
  // guess we had an error? strip the tag bit before using the code
  int errorCode = (int)(result.errorCode & ~(intptr_t)1);
}
else
{
  Something something = result.something;
}

All of this code is rather disgusting when you view it in its raw form, it is much ‘nicer’ when it is hidden under C++ accessor methods. But either way, it saves a little bit of memory in the return type, but has the requirement that we either have a Something or an error code – we can’t have both.
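In plain C the same hiding can be done with small helper functions. The sketch below is one possible encoding, not necessarily the one used in the original code: it assumes error codes are non-negative, are shifted up by one bit with the low bit set as the tag, that Something is a malloc'ed pointer (so its low bit is clear), and that a pointer and an intptr_t share a representation on the target. The helper names, struct body and API_* codes are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: Something is an opaque pointer; this struct body is made up. */
typedef struct SomethingImpl { int value; } *Something;

/* Hypothetical error codes, not taken from the post. */
enum { API_SUCCESS = 0, API_INVALID_ARGUMENT = 1, API_OUT_OF_MEMORY = 2 };

typedef struct APIErrorOrSomething
{
  union
  {
    intptr_t errorCode;
    Something something;
  };
} APIErrorOrSomething;

/* Assumed encoding: shift the code up one bit and set the low bit as a tag. */
APIErrorOrSomething APIMakeError(int code)
{
  APIErrorOrSomething result;
  result.errorCode = ((intptr_t)code << 1) | 1;
  return result;
}

/* The tag bit is only ever set for errors, as pointers are at least 2 byte aligned. */
bool APIResultIsError(APIErrorOrSomething result)
{
  return (result.errorCode & 1) != 0;
}

/* Undo the encoding; only meaningful when APIResultIsError returned true. */
int APIResultErrorCode(APIErrorOrSomething result)
{
  return (int)(result.errorCode >> 1);
}

APIErrorOrSomething APICreateSomethingE(bool param1, int param2)
{
  (void)param1; /* fluff parameter, unused in this sketch */

  if (param2 < 0) { /* made-up validation rule */
    return APIMakeError(API_INVALID_ARGUMENT);
  }

  Something created = malloc(sizeof(*created));
  if (created == NULL) {
    return APIMakeError(API_OUT_OF_MEMORY);
  }
  created->value = param2;

  APIErrorOrSomething result;
  result.errorCode = 0;
  result.something = created;
  return result;
}
```

With helpers like these the caller never touches the tag bit directly, which keeps the bit twiddling in one place.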

Summary

I have come to no conclusions, nor have I decided which of these methods I actually like the most, in all honesty! Feedback would be appreciated (and perhaps there are yet more API variants for this creation API that I haven’t covered!). I fired all the code together on GitHub – TestAPIDesigns.