Combined Shading

April, 2002 (Revised February, 2012)

Introduction

The renderer now has the ability to shade several geometric primitives at the same time, provided that you follow a few simple rules when writing shaders. If you follow these rules, small objects such as leaves, grass blades, hairs, pebbles, etc. will shade up to 4-5 times faster (in the limiting case where the objects are far away, and are diced into a single micropolygon each). This can speed up the overall rendering times for complex scenes by up to a factor of 3. Even in moderate cases, where individual objects turn into 20 or 30 micropolygons each, your shaders can execute up to twice as fast.

So, how can you get this extra performance? The basic requirement is that all of the relevant objects must have the same shaders (including displacement, surface, atmosphere, and light shaders). Beyond that, the performance improvements should happen automatically - unless you have broken one of the rules listed below.

The rest of this document describes the rules you must follow. By far the most important of these is to get your parameter declarations right: if there is a quantity that has a different value for each object to be shaded, then it must be declared "varying" in the shader. There are also a number of other more subtle factors (such as attribute values) that can prevent you from getting this extra performance.


Parameter Declarations

When deciding whether a shader parameter should be "uniform" or "varying", consider whether the value will be constant over the whole group of objects that should be shaded at once. Any parameter that has different values for different objects must be declared "varying" in the shader. For example, objects such as grass and leaves often have a "unique ID" parameter that is different for each leaf. In CheapGrass/newgrass.sl (in "tsveg") we have:

surface newgrass( ...
uniform float Season = 0;
uniform float GrassID = 0;
uniform float ColorNoise = 0;
)

In this case Season is presumably constant over large swaths of grass, so it can be left alone. However, GrassID and ColorNoise typically have a different value for every grass blade. Their declarations need to be changed to "varying" in order for multi-gprim shading to have any effect:

surface newgrass( ...
uniform float Season = 0;
varying float GrassID = 0;
varying float ColorNoise = 0;
)

There are a few points to note about this:

  • Changing these parameters to "varying" may cause the shader to do a small amount of extra work. For example, the shader above computes "cellnoise(GrassID)", and it evaluates a color spline using ColorNoise as the parameter value. However, the extra rendering time due to making these operations "varying" is virtually unmeasurable, and in any case it is completely swamped by the time savings due to shading multiple blades of grass at the same time.
  • The actual parameter values as specified in the RIB stream can still be uniform (one value for each grass blade). They do not need to be changed to "varying" or "vertex". It is perfectly acceptable to bind a "uniform" data value to a "varying" parameter. This method also has the advantage of not increasing the size of RIB files.
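
To see why the declaration matters, here is a toy model in Python of how objects might be batched for shading. It is not the renderer's actual algorithm: it simply assumes that objects can share a batch only when they agree on every parameter the shader declares "uniform". The function name count_batches and the data layout are hypothetical.

```python
def count_batches(objects, uniform_params):
    """Count shading batches in a toy model of combined shading.

    Assumption: two objects share a batch only if they agree on the
    value of every parameter that the shader declares "uniform".
    Parameters declared "varying" never split a batch.
    """
    keys = set()
    for params in objects:
        keys.add(tuple(params[name] for name in uniform_params))
    return len(keys)


# 1000 grass blades: one Season for the whole field, a unique GrassID per blade.
blades = [{"Season": 0.25, "GrassID": float(i)} for i in range(1000)]

# GrassID declared "uniform": every blade lands in its own batch.
before = count_batches(blades, ["Season", "GrassID"])

# GrassID declared "varying": only Season can split batches.
after = count_batches(blades, ["Season"])

print(before, after)  # prints "1000 1"
```

In this model the RIB data is identical in both cases; only the shader declaration changes how many batches result.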

Finding Declarations that Need to Be Changed

If you accidentally declare a parameter whose value differs between objects as "uniform" (such as GrassID in the example above), the renderer will be forced to process each object separately. This can slow down shading quite substantially.

Fortunately, the renderer itself can help you to locate the parameter declarations that are causing problems. If you look at the statistics file generated by the renderer, you will see something like this:

Grid merging statistics: (average size increase: 2.5x)
     39019 grids were combined with an existing grid
     26434 new grids were created, for the following reasons:
       628 - grid would be too large
         6 - different "object" coordinate systems
     22598 - different gprim values for a uniform parameter (see below)
       649 - different (but identical) shader instance lines
      1576 - different "shader" coordinate systems
        82 - different shader instance parameter values
       624 - different shaders (including lights)
       271 - first grid in bucket
  Detailed breakdown of rejections due to uniform parameters:
  (consider changing these parameters to "varying" in shader)
     22573 - "GPrimTag_0" in "TreeElm/leaf"
         2 - "GPrimTag_0" in "Neighborhood/NbdHouse/rafter"
         1 - "GPrimTag_0" in "Neighborhood/NbdHouse/Gutter"
        12 - "GPrimTag_0" in "Neighborhood/NbdHouse/Siding"
         5 - "GPrimTag_0" in "Neighborhood/NbdHouse/porticoTrim"
         5 - "GPrimTag_0" in "Neighborhood/NbdHouse/BaseConcrete"

The first line summarizes how many grids were merged with another grid before shading, while the second line says how many grids could not be merged with an existing grid and were therefore shaded as new grids. The remaining lines give a breakdown of the reasons why these grids could not be merged.
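
The headline figures can be cross-checked with a little arithmetic. The Python sketch below assumes that the "average size increase" is the total number of grids divided by the number of grids actually created. That definition is an inference from the numbers, not something the statistics output states.

```python
combined = 39019   # grids merged into an existing grid
created = 26434    # new grids created, i.e. grids actually shaded

# Assumed definition: grids before merging / grids after merging.
size_increase = (combined + created) / created

# Share of the new grids attributed to uniform parameter mismatches.
uniform_share = 22598 / created

print(f"{size_increase:.1f}x")   # prints "2.5x"
print(f"{uniform_share:.0%}")    # prints "85%"
```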

In particular, notice that by far the largest culprit in this example is the entry with 22598 grids (about 85% of the total), which are due to "different gprim values for a uniform parameter". This means that there were many grids that could have been combined, except that they had different values for a uniform shader parameter. If you look further down in the statistics, there is a detailed listing of the actual shaders and parameters that caused problems. In this case, we see that almost all of the rejections are due to the parameter "GPrimTag_0" in the shader "TreeElm/leaf".

Examining the source code for this shader, we find:

surface leaf( ...
    uniform float     BushID = 0;
    uniform float     GPrimTag_0 = 0;
    uniform float     NumVariants = 1;
)

By changing the parameter "GPrimTag_0" to be "varying", most of the 22598 grids will be merged together (as we will see below).

Note that if several parameters to a shader are causing problems, the statistics will report only the first one. Once that parameter is fixed, it will then report the next one that causes problems, and so on.
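
For large scenes it can be convenient to rank the offending parameters by script. The following Python sketch parses only the "Detailed breakdown" lines in the layout shown above. The function name rank_uniform_rejections is hypothetical, and the sketch assumes that the statistics file uses exactly this layout.

```python
import re

# Matches lines such as:   22573 - "GPrimTag_0" in "TreeElm/leaf"
BREAKDOWN_LINE = re.compile(r'^\s*(\d+)\s+-\s+"([^"]+)"\s+in\s+"([^"]+)"\s*$')


def rank_uniform_rejections(stats_text):
    """Return (count, parameter, shader) tuples, largest count first."""
    rows = []
    for line in stats_text.splitlines():
        match = BREAKDOWN_LINE.match(line)
        if match:
            count, param, shader = match.groups()
            rows.append((int(count), param, shader))
    return sorted(rows, reverse=True)


sample = '''
  Detailed breakdown of rejections due to uniform parameters:
  (consider changing these parameters to "varying" in shader)
     22573 - "GPrimTag_0" in "TreeElm/leaf"
         2 - "GPrimTag_0" in "Neighborhood/NbdHouse/rafter"
        12 - "GPrimTag_0" in "Neighborhood/NbdHouse/Siding"
'''

for count, param, shader in rank_uniform_rejections(sample):
    print(f"{count:6d}  {param}  {shader}")
```

Because only the first problematic parameter of each shader is reported, such a script would need to be rerun after each fix.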

Paint Variants

Sometimes changing a parameter from "uniform" to "varying" may not be simple. For example, in the "TreeElm/leaf" shader above we have:

surface leaf( ...
    uniform float    BushID = 0;
    uniform float    GPrimTag_0 = 0;
    uniform float    NumVariants = 1;
) {
   ...
   /* Paint variants. */
   variant = format("%d", mod(BushID + GPrimTag_0, NumVariants));
   spotvariant = format("%d", abs(mod(11+BushID-GPrimTag_0, NumVariants)));
   ...
}

Here the "format" shadeop is being used to construct part of a texture file name. Since the arguments to "format" must be uniform (and in general, since the shading language only supports uniform strings), we cannot simply change "GPrimTag_0" to be "varying".

Returning to the underlying problem, recall that the reason that the renderer cannot shade all the leaves at once is that "GPrimTag_0" takes on a large number of different values (in this case, a unique value for every leaf). On the other hand, suppose that "GPrimTag_0" had only two different possible values (0 and 1). In that case the renderer would be able to group together leaves with the same tag value (e.g. all those with value 0), and shade them in large groups.

Applying this principle to the example above, note that the number of paint variants is generally quite small (in this example there were 8 variants). Thus for the purposes of the "format" statement, there might as well be only 8 values for "GPrimTag_0". If we could limit the number of different values in this way, the renderer would be able to sort the leaves into 8 groups and shade each group separately (note that this happens automatically).

The main problem with this idea is that the shader uses "GPrimTag_0" for other purposes as well (such as adjusting the leaf color), and in those situations we probably still want each leaf to have a distinct ID.

So, the easiest solution is to split "GPrimTag_0" into two different tags:

  • the old tag "GPrimTag_0" is converted to "varying", and takes on the full range of its original values; while
  • the new tag "GPrimTag_1" is declared "uniform", but has only 8 different values (and is used only for paint variants).

The resulting shader looks like this:

surface leaf( ...
    uniform float    BushID = 0;
    varying float    GPrimTag_0 = 0;   /* original leaf ID */
    uniform float    GPrimTag_1 = 0;   /* paint variant (8 values) */
    uniform float    NumVariants = 1;
) {
    ...
    /* Texture mapping space. */
    if (float cellnoise(BushID + GPrimTag_0) < .5)
        x = t;
    else
        x = 1-t;
    ...
    /* Paint variants. */
    variant = format("%d", mod(BushID + GPrimTag_1, NumVariants));
    spotvariant = format("%d", abs(mod(11+BushID-GPrimTag_1, NumVariants)));
    ...
    /* Adjust the color of the leaf surfaces and tint. */
    Cleaf *= color(
        mix(1-HueRange/2, 1+HueRange/2, float cellnoise(BushID, GPrimTag_0)),
        mix(1-SatRange/2, 1+SatRange/2, float cellnoise(3+BushID, GPrimTag_0)),
        mix(1-LumRange/2, 1+LumRange/2, float cellnoise(13+BushID, GPrimTag_0))
    );
    ...
}

Notice that "GPrimTag_1" is used only for the paint variant lookup, while "GPrimTag_0" is used everywhere else. Of course, we must also modify the model or DSO that generates the RIB, in order to generate values for the "GPrimTag_1" parameter (which should equal "GPrimTag_0" mod 8). With these changes, we get the following rendering statistics:

Grid merging statistics: (average size increase: 17.4x)
     69106 grids were combined with an existing grid
      4212 new grids were created, for the following reasons:
      1024 - grid would be too large
         6 - different "object" coordinate systems
        47 - different gprim values for a uniform parameter (see below)
       739 - different (but identical) shader instance lines
      1425 - different "shader" coordinate systems
        90 - different shader instance parameter values
       615 - different shaders (including lights)
       266 - first grid in bucket
  Detailed breakdown of rejections due to uniform parameters:
  (consider changing these parameters to "varying" in shader)
        13 - "GPrimTag_0" in "Neighborhood/NbdHouse/rafter"
        21 - "GPrimTag_0" in "Neighborhood/NbdHouse/Siding"
         1 - "GPrimTag_0" in "Neighborhood/NbdHouse/Gutter"
         5 - "GPrimTag_0" in "Neighborhood/NbdHouse/porticoTrim"
         2 - "GPrimTag_0" in "Neighborhood/NbdHouse/Stucco"
         5 - "GPrimTag_0" in "Neighborhood/NbdHouse/BaseConcrete"

The number of grids shaded has gone down from 26434 to 4212, a factor of six improvement!
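
On the RIB-generation side, the new tag can be derived from the existing leaf ID. The Python sketch below illustrates the "mod 8" rule described earlier. It assumes integer-valued leaf IDs, and the helper name paint_variant_tag is hypothetical; it is not the code of any particular model or DSO.

```python
NUM_VARIANTS = 8  # number of paint variants in this example


def paint_variant_tag(leaf_id, num_variants=NUM_VARIANTS):
    """Derive GPrimTag_1 from GPrimTag_0 (assumes integer-valued IDs)."""
    return int(leaf_id) % num_variants


leaf_ids = range(1000)  # stand-in for one unique GPrimTag_0 per leaf
tags = {paint_variant_tag(i) for i in leaf_ids}

# However many leaves there are, the uniform tag takes at most 8 values,
# so it can split the leaves into at most 8 groups.
print(len(tags))  # prints "8"

# Reduction in grids shaded, from the two statistics listings.
print(round(26434 / 4212, 1))  # prints "6.3"
```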

Note that this is not the only way to handle paint variants. For example, we could get by with just a single "varying" tag (the leaf ID) by handling more than one paint variant within the shader. This would involve looping over the 8 possible paint variants, and looking up the texture colors for the appropriate subset of points on each pass. Unlike the previous technique, this would require substantial modifications to the shader.


Combining Considerations

Non-Smooth Derivatives

Multi-gprim shading requires shaders to be compiled using "smooth derivatives" (which happens by default). Otherwise, "du" and "dv" will be assumed to be uniform variables by the shader compiler, and this will prevent the renderer from combining grids whose micropolygons have different sizes. (In the absence of smooth derivatives, grids can be combined only if they have the same geometric "du" and "dv" values.)

Thus the "-ns" option of the shader compiler (which forces non-smooth derivatives) should be avoided. Similarly, shaders should not use the global variables "__gdu" or "__gdv". Very old shaders (that were compiled before smooth derivatives existed) should be recompiled. The renderer will print a warning at run-time if such a shader is used:

S99002 Shader "OldCrap" uses geometric "du" or "dv". (PERFORMANCE WARNING)

Any grids that could not be merged together for this reason will be listed as "different geometric du/dv values" in the statistics output.

Different Object Spaces

As of version 17.0, when gprims have different object spaces it is often possible for the renderer to shade them together. For this to be true, however, the shader must be written to avoid assignment to uniform matrix results. For example, the following transformations are fine:

varying point objP = transform("object", P);
varying normal Ns = ntransform("object", myMatrix, N);
varying point ckPs = transform(ckcoords[ckIndex], P);
uniform point Peye = transform("shader", E);

And as of version 17.0, the following are fine as well:

uniform point Pobj = transform("object", E);
uniform point Orig = point "object" (0,0,0);
uniform point Q = transform(ckcoords[ckIndex], point (0,0,0));
uniform vector r1 = vtransform(from, to, vector "current" (1,0,0));
uniform matrix M = matrix "object" 1;
float size = abs(determinant(1/(matrix refspace 1)*(matrix curspace 1)));

The one exception to combining primitives that have different object spaces is the use of the RxTransform() and RxTransformPoints() calls. When those are called inside an RSL plugin, the renderer cannot determine that the objects should not be combined, because RxTransform() and RxTransformPoints() expect the returned result to be strictly uniform. If you have RSL plugins that use RxTransform() with object space, you can use a new attribute:

Attribute "shadegroups" "string objectspacecombining" [true | false]

The default is to combine all objects even when they have different object spaces. When this attribute is turned off for a primitive, that primitive will not be combined with primitives that have different object spaces, so calls to RxTransform() with object space will work as expected.

Different Shader Spaces

Similarly, the renderer cannot shade gprims that have different "shader" coordinate systems at the same time. This is true whether or not the shader actually refers to "shader" space.

If you need to have a different "shader" space for each gprim, consider using "object" space instead (and following the rules mentioned above).

Starting with version 16.4, shaders can be annotated with a hint to the renderer that they do not care about shader space. When the uniform float __ignoresShaderSpace = 1 parameter is found on a shader, the renderer will force the shader space to be identity. Gprims that share identical shaders but have different coordinate systems will be combinable.

Different Shader Instances

In general, gprims that are bound to different shader instance lines in the RIB stream cannot be shaded at the same time. For example, the following gprims will be shaded separately:

Displacement "lumpy" "Km" [0.2]
Patch "bilinear" "P" [-1 1 0 1 1 0 -1 -1 0 1 -1 0]
Displacement "lumpy" "Km" [0.5]
Patch "bilinear" "P" [-1 1 2 1 1 2 -1 -1 2 1 -1 2]

In this situation, consider binding any parameters that have a different value for each gprim to the gprims themselves instead:

Displacement "lumpy"
Patch "bilinear" "P" [-1 1 0 1 1 0 -1 -1 0 1 -1 0] "Km" [0.2]
Patch "bilinear" "P" [-1 1 2 1 1 2 -1 -1 2 1 -1 2] "Km" [0.5]
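
A RIB generator can apply this pattern mechanically. The sketch below is a hypothetical Python helper (emit_patches is not part of any RenderMan API) that writes a single Displacement line followed by patches that each carry their own "Km" value, mirroring the RIB fragment above.

```python
def emit_patches(shader, patches):
    """Return RIB text with one shader line and per-patch "Km" values.

    `patches` is a list of (points, km) pairs, where `points` is the flat
    list of 12 coordinates for a bilinear patch.
    """
    lines = [f'Displacement "{shader}"']
    for points, km in patches:
        p = " ".join(f"{v:g}" for v in points)
        lines.append(f'Patch "bilinear" "P" [{p}] "Km" [{km:g}]')
    return "\n".join(lines)


rib = emit_patches("lumpy", [
    ([-1, 1, 0, 1, 1, 0, -1, -1, 0, 1, -1, 0], 0.2),
    ([-1, 1, 2, 1, 1, 2, -1, -1, 2, 1, -1, 2], 0.5),
])
print(rib)
```

Since the shader instance line appears only once, both patches are bound to the same instance and differ only in their own "Km" values.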

Attribute Values

If a shader uses the "attribute" function, then the renderer can only combine gprims whose attribute values are the same. For example, suppose that the shader looks at the object name:

string objname;
attribute("identifier:name", objname);

In this case, the renderer will only be able to combine gprims that have the same object name. (On the other hand, if the shader does not examine the object name, the renderer can and will combine gprims with different names.)

The renderer does not verify the consistency of every attribute that can be queried with the attribute() function. The attributes that are verified are:

 Ri:ShadingRate
 Ri:Matte
 GeometricApproximation:motionfactor
 GeometricApproximation:focusfactor
 sides:doubleshaded
 Ri:Sides
 geometry:backfacing
 LODFraction
 Ri:Orientation

If any of these attributes are queried, two grids can be combined if they agree on the attribute value. If any other attribute is queried, the renderer assumes the attribute values differ and combining fails.

This behavior can be relaxed by setting the attribute:

Attribute "shadegroups" "string attributecombining" [strict | permissive]

In the presence of an attribute lookup, grids must first agree on the value of this attribute in order to combine. The default behavior is strict. If permissive is chosen, the renderer will assume that any attribute not in the verified set is the same across all grids, and combining succeeds.
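
The decision rule described above can be summarized as a toy model in Python. This is only a restatement of the rule as worded in this section, not the renderer's implementation; the function name can_combine and the dictionary layout ("mode" and "attrs" keys) are hypothetical.

```python
VERIFIED = {
    "Ri:ShadingRate", "Ri:Matte",
    "GeometricApproximation:motionfactor",
    "GeometricApproximation:focusfactor",
    "sides:doubleshaded", "Ri:Sides", "geometry:backfacing",
    "LODFraction", "Ri:Orientation",
}


def can_combine(queried, grid_a, grid_b):
    """Toy model of the attribute test for combining two grids.

    `queried` lists the attribute names the shader looks up. Each grid is
    a dict with a "mode" ("strict" or "permissive") and an "attrs" dict.
    """
    if not queried:
        return True  # no attribute() lookups, nothing to check
    if grid_a["mode"] != grid_b["mode"]:
        return False  # grids must first agree on the combining mode
    permissive = grid_a["mode"] == "permissive"
    for name in queried:
        if name in VERIFIED:
            if grid_a["attrs"].get(name) != grid_b["attrs"].get(name):
                return False
        elif not permissive:
            return False  # unverified attribute: assumed to differ
    return True


a = {"mode": "strict", "attrs": {"Ri:Sides": 2}}
b = {"mode": "strict", "attrs": {"Ri:Sides": 2}}
print(can_combine(["Ri:Sides"], a, b))         # prints "True"
print(can_combine(["identifier:name"], a, b))  # prints "False"
```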

Note that if a string variable is used for the attribute name, the renderer can combine gprims only if all of their attribute values are the same (including the object name, displacement bound, shading rate, sidedness, et cetera). Therefore this practice should be avoided. For example:

string objname;
attribute(attribname, objname);   /* avoid this */

time and dtime

If a shader makes use of the "time" or "dtime" global variables, the renderer will combine gprims only if they have the same values for these variables. This is particularly important for multi-segment motion blur, where the same gprim is shaded at several different times. If the shader for this gprim does not refer to "time" or "dtime", then the renderer will be able to shade all of the motion segments at once.

Miscellaneous

Here is a brief explanation of some of the other reasons that the renderer may not be able to combine two gprims:

  • "first grid in bucket"

    Since the renderer currently shades the gprims in each bucket independently, the first grid in each bucket cannot be combined with an existing grid.

  • "grid would be too large"

    The renderer will not merge grids if the combined grid would have more points than the limit set by the "shadesize" option. (Such limits are necessary to control the amount of memory needed for variables as the shader is executed.)

  • "different gprim variable bindings"

    Gprims will not be merged if they have a different set of user variables bound to them.

  • "different gprim variable layouts"

    Gprims will not be merged if their user variables are listed in different orders in the RIB stream.


Unintentional ifevers

Finally, be careful when writing any shader code that attempts to extract a "uniform" property from varying data associated with the grid. This has always been a dangerous thing to do, but it is now even more so.

For example, consider the following shader code:

/*
 * Assuming that Cs is in RGB space, convert to HSV space so
 * that we can modulate saturation and value without changing hue.
 */
red = comp(Cs, 0); grn = comp(Cs, 1); blu = comp(Cs, 2);

/* set val to largest rgb component, x to smallest */
if (red >= grn && red >= blu) {
    /* red largest */
    val = red;
    if (grn > blu) {
        x = blu;
        spoke = "Rb";
    } else {
        x = grn;
        spoke = "Rg";
    }
} else if (grn >= red && grn >= blu) {
    /* green largest */
    val = grn;
    if (red > blu) {
        x = blu;
        spoke = "Gb";
    } else {
        x = red;
        spoke = "Gr";
    }
} else {
    /* blue largest */
    val = blu;
    if (grn > red) {
        x = red;
        spoke = "Br";
    } else {
        x = grn;
        spoke = "Bg";
    }
}

This code contains a subtle bug: if the surface color "Cs" is not constant, it may contain colors that belong in different parts of the color wheel. However, the variable "spoke" is uniform (since it is a string value). This has the effect of an "ifever": all of the "Cs" values will be mistakenly lumped into a single spoke.

Note that if this shader is applied to constant-colored objects, it works fine. However, with multi-gprim shading it is possible that several objects with different colors will be shaded at once. It is important to keep this in mind when writing shaders. In the example above, the problem can be fixed by storing the spoke value in a varying float (with a predefined constant for each spoke).
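
The suggested fix can be illustrated outside the shading language. The Python sketch below applies the same comparisons as the shader code above, but records one numeric spoke code per point, which plays the role of the varying float. The numbering of the constants is hypothetical.

```python
# Predefined constant for each spoke (hypothetical numbering).
RB, RG, GB, GR, BR, BG = range(6)


def spoke_code(red, grn, blu):
    """Classify one color using the same tests as the shader code above."""
    if red >= grn and red >= blu:
        return RB if grn > blu else RG
    if grn >= red and grn >= blu:
        return GB if red > blu else GR
    return BR if grn > red else BG


# A batch containing colors from several objects shaded together.
batch = [(0.9, 0.5, 0.1), (0.2, 0.8, 0.3), (0.1, 0.2, 0.7)]

# One code per point, rather than a single value for the whole batch.
codes = [spoke_code(*c) for c in batch]
print(codes)  # prints "[0, 3, 4]"
```

Each point keeps its own spoke, so colors from different parts of the color wheel are no longer lumped together.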

Here is another example:

uniform float side;
if ((dPdu ^ dPdv) . Ng > 0)
    side = 0;
else
    side = 1;

In this case, the shader is attempting to determine which "side" of the surface we are shading. The problem is that with multi-gprim shading, the renderer might well be shading both sides of the surface at once! The solution is to convert the variable "side" to be varying.

The point of these examples is that in general, it is dangerous to try to extract any uniform property from a set of varying data. This has always been true, but with multi-gprim shading it is even more important. Fortunately, the shading language makes this rather hard to do: it is necessary to use "ifever" or some equivalent construct.

Finally, we emphasize that "ifever" itself is not dangerous. Most often "ifever" is used to avoid expensive calculations when their results are not needed (e.g. if the result will be multiplied by zero). These uses of "ifever" are completely safe and are not affected by multi-gprim shading.