2018-01-25
Preprocessing GLSL Shaders
Coal uses OpenGL (3.2 Core Profile or 2.1) for rendering. As such, it uses programmable shaders to do the heavy lifting, in a C-like language called GLSL.
It works well for me, especially since I'm not pushing the API to its limits (just look at the game's screenshots). One area that I wanted to fix was the inability to send important hard-coded constants from the game to the shaders. While GLSL allows for uniforms that you send to the shader (usually things like projection matrices that change every frame), arrays of uniforms must have a fixed size at shader compile time. A common example is sending the transformation matrices of an animated model's bones to the vertex shader.
GLSL doesn't support dynamic arrays such as C++ vectors, so a straightforward way to pass a model's bones is to assume a maximum number of bones any model can have and set that as the size in the shader. I would then have to write the same max number somewhere in my game code, so I can do validation like asserting that a loaded model never exceeds the bone limit. Unfortunately, I also have to keep the limits in both the code and shaders in sync.
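As a rough sketch of that kind of validation on the C side (not Coal's actual code): `struct model`, `model_bones_ok` and `model_validate` are hypothetical names, and the value 64 is only a placeholder. The point is that `MAX_BONES` here has to match whatever array size the shader declares.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder value: it must match the array size declared in the vertex shader. */
#define MAX_BONES 64

/* Hypothetical, stripped-down model record. */
struct model
{
    size_t num_bones;
};

/* Returns true if the model's bones fit in the shader's uniform array. */
bool model_bones_ok(const struct model *m)
{
    return m != NULL && m->num_bones <= MAX_BONES;
}

/* Load-time check: aborts in debug builds if a model exceeds the limit. */
void model_validate(const struct model *m)
{
    assert(model_bones_ok(m));
}
```

Nothing in this check knows what the shader actually declares, which is exactly the syncing problem.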
Another limitation I'm running up against is GLSL's varying compatibility across video cards. For lighting and shadow mapping, every entity in Coal has a limit of 3 omnidirectional lights that can affect its color and cast dynamic shadows. In the fragment shader for rendering maps, the uniforms I use for this are samplerCubes (cubemaps that contain the shadow depth to compare against to decide whether to darken a pixel). On the NVIDIA card in my main gaming computer, the shader pseudocode for looping through every light and calculating its shadow is fairly simple:
const int MAX_LIGHTS = 3;
uniform samplerCube depth_samplers[MAX_LIGHTS];
float shadow[MAX_LIGHTS];
for (int i = 0; i < num_shadowcasting_lights; i++)
{
    shadow[i] = ShadowCalculation(depth_samplers[i]);
}
Unfortunately, compiling this shader on my laptop with an Intel HD 3000 spits out the following error.
error: sampler arrays indexed with non-constant expressions are forbidden in GLSL 1.30 and later
With no way to index a sampler array with a loop variable, I was forced to write the following monstrosity.
if (num_shadowcasting_lights > 0)
{
    shadow[0] = ShadowCalculation(depth_samplers[0]);
}
if (num_shadowcasting_lights > 1)
{
    shadow[1] = ShadowCalculation(depth_samplers[1]);
}
if (num_shadowcasting_lights > 2)
{
    shadow[2] = ShadowCalculation(depth_samplers[2]);
}
Since my current light limit is 3, I copied and pasted the calculation three times and hardcoded the indices. Later, I shortened this with the following macro:
#define SHADOWFUNC(X) if (X < num_shadowcasting_lights){shadow[X] = ShadowCalculation(depth_samplers[X]);}
SHADOWFUNC(0);
SHADOWFUNC(1);
SHADOWFUNC(2);
It still involves copy-pasting code that shouldn't have to be copied, though. It's almost insulting.
In both cases, I have to manage hardcoded constants across my code and shaders, and update both whenever one of them changes. In the case of shadow mapping, I also have to duplicate code, since I can't index a sampler array with a variable on a very common video card.
Since I load my shaders from files as strings before compiling them, the solution was to process the strings beforehand and replace certain key strings with constants from the code. I do this with "replacement pairs" collected in a "replacement context".
struct pair
{
    char replaced[32];
    char* replacement;
    size_t len;
};
"replaced" is the key string being replaced. In the case of the maximum number of bones a model can have, "$repl_bones" will be replaced. It appears as the following in my shader code.
const int MAX_BONES = $repl_bones;
Because I know in advance which strings I want to replace, I cap the key at 32 bytes. "replacement" is the string written into the shader in place of "replaced". "len" is simply the length of replacement. I'm dealing with strings in C, after all.
Most of what I'm replacing in my shaders are numbers, which are pretty simple to handle.
//here, MAX_BONES is #define'd in client code
strcpy(pair.replaced, "$repl_bones");
len = snprintf(NULL, 0, "%i", MAX_BONES);
pair.len = len;
pair.replacement = malloc(len + 1);
//sprintf writes the terminating '\0' itself, at index len
sprintf(pair.replacement, "%i", MAX_BONES);
Handling shadowmapping without variable indices is slightly more complicated, but it's what makes this system much more valuable to me.
//here, MAX_ENT_LIGHTS is #define'd in client code
strcpy(pair.replaced, "$repl_shadowloop");
//first pass: measure the total length without writing anything
len = 0;
for (i = 0; i < MAX_ENT_LIGHTS; i++)
{
    len += snprintf(NULL, 0, "SHADOWFUNC(%i);", i);
}
pair.len = len;
pair.replacement = malloc(len + 1);
//second pass: append each call; sprintf returns the number of characters written
len = 0;
for (i = 0; i < MAX_ENT_LIGHTS; i++)
{
    len += sprintf(pair.replacement + len, "SHADOWFUNC(%i);", i);
}
//still needed in case MAX_ENT_LIGHTS is 0 and sprintf never runs
pair.replacement[len] = '\0';
Obviously, this only works if SHADOWFUNC is #defined in the shader where it appears. Emitting macro calls instead of the full statements also means fewer bytes to write.
Once I define all of my replacement pairs, I stick them in a "replacement context", which is just a pointer to an array with them along with a number of how many there are.
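A minimal sketch of what such a context could look like, reusing the pair struct from earlier. The names `repl_context`, `pair_init_int` and `context_free` are hypothetical and chosen for illustration; the helper just wraps the number-formatting steps shown above and adds error checks.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct pair
{
    char replaced[32];  /* key string searched for in the shader source */
    char *replacement;  /* heap-allocated text written in its place */
    size_t len;         /* strlen(replacement) */
};

/* Hypothetical context: a pointer to an array of pairs plus a count. */
struct repl_context
{
    struct pair *pairs;
    size_t num_pairs;
};

/* Fills `p` so that `key` maps to the decimal text of `value`.
   Returns 0 on success, -1 if the key is too long or allocation fails. */
int pair_init_int(struct pair *p, const char *key, int value)
{
    int len;

    if (strlen(key) >= sizeof p->replaced)
    {
        return -1; /* key would not fit in the fixed-size buffer */
    }
    strcpy(p->replaced, key);

    len = snprintf(NULL, 0, "%i", value);
    if (len < 0)
    {
        return -1;
    }
    p->replacement = malloc((size_t)len + 1);
    if (p->replacement == NULL)
    {
        return -1;
    }
    snprintf(p->replacement, (size_t)len + 1, "%i", value);
    p->len = (size_t)len;
    return 0;
}

/* Frees the replacement strings owned by every pair in the context. */
void context_free(struct repl_context *ctx)
{
    size_t i;

    assert(ctx != NULL);
    for (i = 0; i < ctx->num_pairs; i++)
    {
        free(ctx->pairs[i].replacement);
        ctx->pairs[i].replacement = NULL;
        ctx->pairs[i].len = 0;
    }
}
```

The array itself can live wherever is convenient (static storage, the stack, or the heap), since the context only points at it.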
Actually doing the replacing isn't too exciting. When I load shaders at startup, I loop through all of the replacement pairs in my context and search the shader source for each "replaced" string. If I find it, I handle two cases, depending on whether the replacement is shorter or longer than the replaced string. If it's shorter, I just overwrite the bytes of the old string with the new one, padding whatever is left of the old string with blank spaces. If it's longer, I allocate a new buffer big enough for the shader source plus the extra length of the replacement. Either way, the replacement string ends up in the shader code before I compile it.
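The two cases can be sketched roughly as follows. This is an illustration under assumptions rather than the game's real loader: `replace_first` and `shader_apply_pairs` are hypothetical names, the shader source is assumed to be a heap-allocated, NUL-terminated string, and only the first occurrence of each key is replaced.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct pair
{
    char replaced[32];
    char *replacement;
    size_t len;
};

/* Replaces the first occurrence of `key` in the heap-allocated string `src`.
   Returns `src` itself when the edit fits in place (or the key is absent),
   or a new allocation (freeing `src`) when the replacement is longer.
   Returns NULL on allocation failure, leaving `src` untouched. */
char *replace_first(char *src, const char *key, const char *repl)
{
    size_t key_len, repl_len, src_len, prefix_len;
    char *hit, *out;

    assert(src != NULL && key != NULL && repl != NULL);
    key_len = strlen(key);
    repl_len = strlen(repl);
    hit = strstr(src, key);
    if (hit == NULL)
    {
        return src; /* key not present: nothing to do */
    }

    if (repl_len <= key_len)
    {
        /* Fits: overwrite in place and pad the leftover key bytes with spaces. */
        memcpy(hit, repl, repl_len);
        memset(hit + repl_len, ' ', key_len - repl_len);
        return src;
    }

    /* Longer: build a new string as prefix + replacement + tail. */
    src_len = strlen(src);
    prefix_len = (size_t)(hit - src);
    out = malloc(src_len - key_len + repl_len + 1);
    if (out == NULL)
    {
        return NULL;
    }
    memcpy(out, src, prefix_len);
    memcpy(out + prefix_len, repl, repl_len);
    /* The +1 copies the terminating '\0' along with the tail. */
    memcpy(out + prefix_len + repl_len, hit + key_len,
           src_len - prefix_len - key_len + 1);
    free(src);
    return out;
}

/* Applies every pair in turn. Frees `src` and returns NULL on failure. */
char *shader_apply_pairs(char *src, const struct pair *pairs, size_t num_pairs)
{
    size_t i;

    for (i = 0; i < num_pairs; i++)
    {
        char *next = replace_first(src, pairs[i].replaced, pairs[i].replacement);
        if (next == NULL)
        {
            free(src);
            return NULL;
        }
        src = next;
    }
    return src;
}
```

Padding with spaces in the short case keeps the buffer length unchanged, so nothing after the key has to move.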
With this system,
$repl_shadowloop
becomes
SHADOWFUNC(0);SHADOWFUNC(1);SHADOWFUNC(2);
Now that I can replace strings in my shaders, I could also do other things, like setting the GLSL version or other constants such as the maximum number of dynamic lights allowed. Being able to call a macro function multiple times lets me overcome an annoying limitation in certain video cards without having to write copy-pasted calls or maintain two versions of a shader. A nice win.