Textures : can we cheat ?

I have seen a lot of tutorials about merging multiple textures into one by using the channels of a single texture as separate textures. The problem: this technique is not always a good idea. The real difficulty is knowing when to use it, when to avoid it, and how textures are compressed. It is especially tricky for normal maps because of the way they are compressed.
Generally, when you work in Photoshop (for example), your document is set to 8 bits per channel.

In this case, your image uses 8 bits per channel for every pixel. So a 512*512 texture in R8G8B8 (8 bits for red, 8 bits for green and 8 bits for blue) takes 768 KB in memory, or 1 MB once each pixel is padded to 32 bits, which is what graphics hardware usually does. This is an uncompressed format and it is very heavy, especially for a game!
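
The arithmetic is easy to check with a minimal sketch. The function name and its parameters are my own and only serve the illustration; mipmaps are ignored. With 24 bits per pixel the texture weighs 768 KB, and the figure reaches 1 MB once each pixel is padded to 32 bits.

```python
def uncompressed_size_bytes(width, height, bits_per_channel=8, channels=3):
    """Return the size in bytes of a raw, uncompressed texture (no mipmaps)."""
    bits_per_pixel = bits_per_channel * channels
    return width * height * bits_per_pixel // 8


rgb = uncompressed_size_bytes(512, 512, channels=3)     # R8G8B8, 24 bits per pixel
padded = uncompressed_size_bytes(512, 512, channels=4)  # padded to 32 bits per pixel

print(rgb / 1024, "KB")     # 768.0 KB
print(padded / 1024, "KB")  # 1024.0 KB
```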

Classic Textures : diffuse/speculars/…

The first thing to understand is that the graphics card can only read raw data from its memory (so a JPEG or a PNG cannot be used directly by the graphics card). The texture has to be decompressed before it can be used. But raw data is very heavy and graphics memory is not infinite.
Before thinking about compressing your texture, think about reducing its resolution. Do you really need a 2048² or 1024² texture when a 512² is enough? Texturing is a hard art, but when it is done well, a 512² texture is sometimes better than a 1024² one. You can also reduce the amount of data per pixel: instead of an R8G8B8 texture you can use a smaller format such as R5G6B5, or a quantized texture (one that uses a palette). Some textures don't need the full 32 bits at all: a grayscale texture, for example, doesn't need all the color information.
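
To make the idea of reducing the data per pixel concrete, here is a small sketch of packing an 8-bit-per-channel color into 16 bits (5 bits of red, 6 of green, 5 of blue). The function names are hypothetical, and the bit replication used when expanding back to 8 bits is one common choice among several.

```python
def pack_r5g6b5(r, g, b):
    """Quantize 8-bit channels to 5/6/5 bits and pack them into one 16-bit integer."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)


def unpack_r5g6b5(value):
    """Expand a packed 16-bit color back to 8 bits per channel (bit replication)."""
    r5 = (value >> 11) & 0x1F
    g6 = (value >> 5) & 0x3F
    b5 = value & 0x1F
    return ((r5 << 3) | (r5 >> 2), (g6 << 2) | (g6 >> 4), (b5 << 3) | (b5 >> 2))


packed = pack_r5g6b5(200, 100, 50)
print(unpack_r5g6b5(packed))  # (206, 101, 49)
```

The round trip does not give back exactly the original color: this is the precision you trade for halving the memory compared with a 32-bit pixel.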

Source : “Compression numérique” by Gil Damoiseaux

With this kind of texture, dithering is a good idea because it avoids some of the blur that texture filtering (such as bilinear filtering) puts on the texture.

Anyway, some special formats were created for graphics cards. These formats are special because they take the same size in memory as on your hard drive. The GPU decompresses them on the fly, which is very fast because the formats were designed for this purpose. They are called DXTC or BC (DirectX Texture Compression / Block Compression). They are only available on PC, Xbox and the current generation of consoles (360/PS3 and so on).

The DXT formats work by splitting the texture into blocks of 4 by 4 texels. Every block is independent and has its own palette in R5G6B5. Each of its 16 texels can only take one of 4 possible values:
00 – Color 1 at 100%
10 – Color 1 at 66%, and color 2 at 33%
11 – Color 1 at 33%, color 2 at 66%
01 – Color 2 at 100%

Also, you need to know that :
1bit = 2 colors
2bits = 4 colors
3bits = 8 colors
4bits = 16 colors
5bits = 32 colors
6bits = 64 colors
7bits = 128 colors
8bits = 256 colors
24bits = 16 777 216 colors

So we have 4 colors: 2 true colors and 2 others that are interpolated from them. The 2 true colors are stored in R5G6B5, so the palette costs 32 bits (2 x 16 bits). Because every texel can take any of these 4 colors, we need 2 bits per texel (0, 1, 2, 3). To specify which color a texel uses, we store an index table of 2 bits for each of the 16 texels (4×4), which also costs 32 bits. A color block therefore costs 64 bits in total.
This is true for every DXT format from DXT1 to DXT5. The only difference between the DXT formats is how the alpha is stored.
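
The color block described above can be decoded in a few lines. This is only a sketch under my own assumptions: the function names are hypothetical, the block is read as two little-endian 16-bit colors followed by 32 bits of indices, and only the 4-color mode is handled (DXT1 also has a 3-color mode that is left out here).

```python
import struct


def rgb565_to_rgb888(color):
    """Expand a 16-bit R5G6B5 color to an (r, g, b) tuple of 8-bit values."""
    r5, g6, b5 = (color >> 11) & 0x1F, (color >> 5) & 0x3F, color & 0x1F
    return ((r5 << 3) | (r5 >> 2), (g6 << 2) | (g6 >> 4), (b5 << 3) | (b5 >> 2))


def decode_dxt1_color_block(block):
    """Decode an 8-byte color block into 4 rows of 4 (r, g, b) texels."""
    c0, c1, indices = struct.unpack("<HHI", block)
    col0, col1 = rgb565_to_rgb888(c0), rgb565_to_rgb888(c1)
    palette = [
        col0,                                                  # 00: color 1 at 100%
        col1,                                                  # 01: color 2 at 100%
        tuple((2 * a + b) // 3 for a, b in zip(col0, col1)),   # 10: 66% / 33%
        tuple((a + 2 * b) // 3 for a, b in zip(col0, col1)),   # 11: 33% / 66%
    ]
    rows = []
    for row in range(4):
        # Each texel owns 2 bits of the 32-bit index table.
        rows.append([palette[(indices >> (2 * (4 * row + col))) & 0b11]
                     for col in range(4)])
    return rows


# White and black as the two true colors; the first 4 texels use indices 0, 1, 2, 3.
example = struct.pack("<HHI", 0xFFFF, 0x0000, 0xE4)
print(decode_dxt1_color_block(example)[0])
# [(255, 255, 255), (0, 0, 0), (170, 170, 170), (85, 85, 85)]
```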

In DXT1 there is no real alpha channel (at best the format can mark a texel as fully transparent); the alpha is just a blank channel created automatically when the texture is compressed. So if your texture has an alpha channel, don't use DXT1, because it will be ignored. A DXT1 block costs 64 bits.

DXT2 and DXT3 store the alpha explicitly with 4 bits (16 levels) per texel (16 texels * 4 bits = 64 bits). In DXT2 the color data is interpreted as being premultiplied by alpha; in DXT3 it is interpreted as not premultiplied. So a DXT2/3 block costs 128 bits (64 bits for the color, like DXT1, and another 64 bits for the explicit alpha).
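
As an illustration of the explicit alpha, here is a sketch that reads the 64-bit alpha part of a DXT2/3 block. The function name is mine, and I assume the 8 bytes are read as one little-endian integer with the first texel in the lowest 4 bits.

```python
def decode_dxt3_alpha(alpha_block):
    """Decode the 8-byte explicit alpha block into 16 alpha values (0 to 255)."""
    bits = int.from_bytes(alpha_block, "little")
    # Each texel owns 4 bits; multiplying by 17 maps 0..15 onto 0..255.
    return [((bits >> (4 * i)) & 0xF) * 17 for i in range(16)]


alphas = decode_dxt3_alpha(bytes([0xF0, 0x08, 0, 0, 0, 0, 0, 0]))
print(alphas[:4])  # [0, 255, 136, 0]
```

With only 16 levels and no interpolation, each texel keeps exactly the level it was given, which is why this layout suits sharp alpha edges.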

In DXT4 and DXT5 the alpha also uses an index rather than direct values. So we have an alpha table made of 8 values. 2 of them are stored directly, each in a range of 256 possibilities (0 to 255 = 8 bits), and the others are derived from them with one of 2 interpolation modes: either 6 interpolated values, or 4 interpolated values plus full transparency (0) and full opacity (255). The mode is selected by the order of the 2 stored values. The 2 stored values cost 16 bits (8 bits * 2). Then a 4×4 index table with 3 bits per texel (3 bits = 8 values) assigns one value of the table to each texel. The index costs 48 bits (3 * 16). So the alpha in DXT4/5 costs 16 (alpha table) + 48 (index) = 64 bits.
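
The alpha table can be rebuilt with a short sketch. The function names are hypothetical, and plain integer division is used for the interpolation, which is a simplification: real decoders may round slightly differently.

```python
def dxt5_alpha_palette(a0, a1):
    """Build the 8-entry alpha table from the 2 stored 8-bit values."""
    if a0 > a1:
        # Mode 1: 6 values interpolated between the 2 stored ones.
        return [a0, a1] + [((7 - i) * a0 + i * a1) // 7 for i in range(1, 7)]
    # Mode 2: 4 interpolated values, plus fully transparent and fully opaque.
    return [a0, a1] + [((5 - i) * a0 + i * a1) // 5 for i in range(1, 5)] + [0, 255]


def decode_dxt5_alpha(alpha_block):
    """Decode the 8-byte alpha block (2 values + 48 index bits) into 16 alphas."""
    palette = dxt5_alpha_palette(alpha_block[0], alpha_block[1])
    bits = int.from_bytes(alpha_block[2:8], "little")
    # Each texel owns 3 bits of the 48-bit index table.
    return [palette[(bits >> (3 * i)) & 0b111] for i in range(16)]


print(dxt5_alpha_palette(255, 0))  # [255, 0, 218, 182, 145, 109, 72, 36]
print(dxt5_alpha_palette(0, 255))  # [0, 255, 51, 102, 153, 204, 0, 255]
```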

The DXT4/5 formats cost the same as the DXT2/3 formats: 128 bits per block. So why create DXT4/5? Because the DXT4/5 algorithm interpolates the alpha values, it is the better format for gradients in the alpha.
If you prefer a sharp alpha, use DXT2/3 instead.

Now you know how much every DXT format costs and how they work. For a simple texture, DXT1 is great. But if you want to add a specular texture to your shader, prefer a single DXT2/3 texture (with the specular stored in the alpha channel) over two DXT1 textures, because you will then need only one memory access to load your texture instead of two. Memory accesses cost performance and take up room on the transfer bus (especially on consoles, where reading two textures can block an access by another program or thread).
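
To compare the options in numbers, here is a rough size calculator based on the block costs given above (64 bits for DXT1, 128 bits for DXT2 to DXT5). The names are my own and mipmaps are not counted.

```python
# Bytes per 4x4 block: 64 bits for DXT1, 128 bits for DXT2/3 and DXT4/5.
BLOCK_BYTES = {"DXT1": 8, "DXT3": 16, "DXT5": 16}


def dxt_size_bytes(width, height, fmt):
    """Return the size in bytes of one mip level stored in the given DXT format."""
    blocks_x = (width + 3) // 4  # round up to whole blocks
    blocks_y = (height + 3) // 4
    return blocks_x * blocks_y * BLOCK_BYTES[fmt]


two_dxt1 = 2 * dxt_size_bytes(512, 512, "DXT1")  # diffuse + specular, 2 textures
one_dxt3 = dxt_size_bytes(512, 512, "DXT3")      # specular packed in the alpha

print(two_dxt1 // 1024, "KB")  # 256 KB
print(one_dxt3 // 1024, "KB")  # 256 KB
```

The memory is the same in both cases, so what you gain with the single texture is the saved access, not storage.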

Normal maps : a little bit harder

Recap : what are normal maps ?
Normal maps are textures used to replace the normals of an object and to refine the details of its surface (by modifying how the lighting affects the model). A normal map simply stores a normalized direction vector for each point of the surface. In video games, we use them to fake high-resolution geometry on a low-resolution model. Example with Lena:

A normal map simply uses the 3 channels of a texture to store the 3 components of a vector. The red and green channels hold the X and Y components (along the UVs) and the blue channel holds the Z component.

To compress normal maps we can use different formats. I will only talk about the formats that are compatible with the UDK. We have already seen the DXTC formats, and we can of course use them to compress a normal map. The Unreal Engine also supports the V8U8 and BC5 (3Dc/DXN) formats. There are of course more formats for compressing textures (especially in DirectX 11, which adds some very cool new ones).

You can of course use the DXT1 format (TC_Normalmap in UDK), but you will lose a lot of quality. The DXT5 format (TC_NormalmapAlpha in UDK) lets you store an alpha channel alongside your normal map; the quality of the normal map itself is the same as with DXT1.

You also have V8U8 (TC_NormalmapUncompressed in UDK), which does not store its data the way the DXTC formats do.
It can store a normal map by using the two components to store the X and Y axis of the normal for each texel. Since normals are unit-length (i.e. all components squared and added together equal 1) we can just use simple mathematics to figure out what the Z axis of a normal should be since we know what the X and Y axes are.
Although this format does not technically use compression it does allow us to take an uncompressed normal map and store it using much less space by dropping a component entirely. Since we are dropping a component from each texel we are going from 3 bytes per-texel to just 2 (i.e. 1/3 less space). In a fragment shader, we can just calculate the Z axis of a normal quickly and easily.
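
Here is what that calculation could look like, written in Python rather than in a shader language for readability. It is a sketch with assumptions of mine: the function names are hypothetical, the two stored components are taken as signed bytes, and Z is assumed to be positive (a tangent-space normal pointing away from the surface).

```python
import math


def byte_to_signed_unit(value):
    """Map a signed byte (-128..127) to the range [-1.0, 1.0]."""
    return max(value / 127.0, -1.0)


def reconstruct_normal(x, y):
    """Rebuild a unit normal from X and Y, assuming Z is positive."""
    # Clamp to 0 so quantization error can never give a negative square root.
    z = math.sqrt(max(0.0, 1.0 - x * x - y * y))
    return (x, y, z)


print(reconstruct_normal(0.0, 0.0))  # (0.0, 0.0, 1.0)
```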

Source : Gamasutra article : “Texture compression technique”

The Z component is simply recreated with the following formula: X = R * alpha, Y = G, Z = sqrt(1 - X^2 - Y^2). This format is uncompressed, but takes less space than a full uncompressed texture (like R8G8B8) while keeping good quality in the R and G channels. Be careful: the UDK editor automatically reduces the resolution of the texture by starting at the first mipmap level instead of the full-size texture. Epic Games considers that a lower-resolution uncompressed texture gives better quality than a full-size DXT1 texture (at the same resolution, this format takes four times more memory than DXT1). You can of course restore the resolution of the texture by choosing another format.
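
The factor of four is easy to verify. In this sketch (hypothetical helper names, a single mip level, and assuming the first mipmap level is half the size in each direction), a V8U8 texture at half resolution ends up with the same footprint as a full-size DXT1 texture.

```python
def v8u8_size_bytes(width, height):
    """V8U8 stores 2 bytes per texel (one for X, one for Y)."""
    return width * height * 2


def dxt1_size_bytes(width, height):
    """DXT1 stores 8 bytes per block of 4x4 texels."""
    return ((width + 3) // 4) * ((height + 3) // 4) * 8


full_v8u8 = v8u8_size_bytes(1024, 1024)
full_dxt1 = dxt1_size_bytes(1024, 1024)
half_v8u8 = v8u8_size_bytes(512, 512)  # first mipmap level of the 1024 texture

print(full_v8u8 // full_dxt1)  # 4
print(half_v8u8 == full_dxt1)  # True
```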

Finally you have the BC5 format (also known as 3Dc or DXN, TC_NormalmapBC5 in UDK). This format works like V8U8: we only store the X and Y components and compute the Z component ourselves (in the pixel shader). The big difference is that the texture is compressed the way DXT5 compresses its alpha channel, and the two channels are independent from each other. We get a less blocky texture for the same amount of memory as a DXT5 texture.
Unfortunately, this format is only compatible with DirectX 10, the Xbox 360, and DirectX 9 on ATI graphics cards only (the format was created by ATI).

So, can we cheat ?

I’m sure you can cheat with some formats to avoid multiple texture memory accesses. For example, rather than storing alpha masks in different textures, you can make a single texture with a different mask in each channel. This way you need only one memory access to load 3 masks, and you also save some memory (1 texture rather than 3). Even though DXT compression creates artifacts, this is not a problem on mask channels as long as you avoid very fine detail and gray gradients. Using only black and white will preserve your channels.
Cheating with classic (and photo-realistic) textures is also easier because these textures are already somewhat noisy, so the DXT compression easily goes unnoticed, especially once you add lighting to your scene.
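
The mask packing mentioned above can be sketched without any image library, with masks represented as rows of 0 to 255 integers. The function names are mine and the compression step itself is not shown; in practice you would do this in your image editor or with your texture tools.

```python
def pack_masks(mask_r, mask_g, mask_b):
    """Combine three same-sized grayscale masks into one image of (r, g, b) texels."""
    return [
        [(r, g, b) for r, g, b in zip(row_r, row_g, row_b)]
        for row_r, row_g, row_b in zip(mask_r, mask_g, mask_b)
    ]


def unpack_channel(packed, channel):
    """Read one mask back: channel 0 is red, 1 is green, 2 is blue."""
    return [[texel[channel] for texel in row] for row in packed]


# Three 2x2 black and white masks, one per channel.
dirt = [[255, 0], [0, 255]]
rust = [[0, 0], [255, 255]]
wear = [[255, 255], [0, 0]]

packed = pack_masks(dirt, rust, wear)
print(packed[0])                  # [(255, 0, 255), (0, 0, 255)]
print(unpack_channel(packed, 1))  # [[0, 0], [255, 255]]
```

In a shader, reading a single texel of the packed texture then gives the three masks at once, one per channel.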

For normal maps it is a bit harder, because a normal map needs very good quality since it adds detail to the mesh. Also, because formats like DXTC look at every channel of your texture together, you can't simply store another map in place of the blue channel (even if you can rebuild the Z component in the material editor of the UDK): you will certainly lose some data and may create more artifacts on the normal map itself.

Also, keep in mind the recommendations of Epic Games:

  • Use TC_NormalmapUncompressed with reduced-resolution instead of TC_Normalmap, if it looks better for your content. The memory usage is the same.
  • Use TC_NormalmapBC5 if you are making an Xbox 360 game and you feel the quality difference compared to reduced-resolution TC_NormalmapUncompressed is worth the 2x memory size increase.
  • It is recommended you use the TextureSampleParameterNormal material node whenever you need to use a normal map as a parameter.
  • You should be consistent with the normal map formats you use with a particular material instance parent, to reduce the number of shaders generated.

Source : UDN – Normal map formats

Thanks to Polycount, nVidia, ATI, Valve and Wikipedia for providing a lot of information on this subject. It helped me a lot! 😀