Friday, April 27, 2018

BC7 showdown #2: Basis ispc vs. NVidia Texture Tools vs. ispc_texcomp slow

Got my BC7 encoder test app linking with NVTT. The BC7 encoder in NVTT is the granddaddy of them all, from what I understand. It's painfully slow but very high quality. I called it using the blob of code below. It supports weighted metrics, which is great.

Anyhow, here are the test results using linear RGB metrics (non-perceptual), comparing NVTT and ispc_texcomp vs. Basis non-RDO BC7 ispc. The test corpus was kodim01-24 plus a number of other images/textures I commonly test with. I turned up the Basis BC7 encoder options to the highest currently supported.

Basis BC7:         365.5 secs   47.162389 dB
NVTT:              28061.9 secs 47.141580 dB
ispc_texcomp slow: 353.6 secs   46.769749 dB

This was a multithreaded test (using OpenMP) on a dual Xeon workstation supporting AVX.

Here's the code snippet for calling NVTT's AVPCL encoder directly to pack BC7 blocks (bypassing the rest of NVTT because I don't want to pack entire textures, just blocks):

#include "../3rdparty/nvidia-texture-tools/src/bc7/avpcl.h"

void nvtt_bc7_compress(void *pBlock, const uint8_t *pPixels, bool perceptual)
AVPCL::mode_rgb = false;
AVPCL::flag_premult = false; //(alphaMode == AlphaMode_Premultiplied);
AVPCL::flag_nonuniform = false;
AVPCL::flag_nonuniform_ati = perceptual;

// Convert NVTT's tile struct to AVPCL's.
AVPCL::Tile avpclTile(4, 4);
memset(, 0, sizeof(;

for (uint y = 0; y < 4; ++y) 
for (uint x = 0; x < 4; ++x) 
nv::Vector4 &p =[y][x];

p.x = pPixels[0];
p.y = pPixels[1];
p.z = pPixels[2];
p.w = pPixels[3];

pPixels += 4;

avpclTile.importance_map[y][x] = 1.0f; //weights[4*y+x];

AVPCL::compress(avpclTile, (char *)pBlock);

No comments:

Post a Comment