Difference between forward vs compress/decompress reconstruction
Hello,
I have a question for a better understanding of your very useful and nice library, and it would be great if you could add a similar example to your example folder for others.
I ran a simple test and noticed a difference between the actual reconstruction (obtained with the compress/decompress functions) and the one obtained with the forward function. The difference appears in both the reconstructed image and the estimated number of bits. However, if I clamp the output of the forward function, the reconstructions no longer differ, but the theoretical and actual bitrates still do. So I have two questions in that regard:
1- Does this mean that the compress and decompress functions somehow clamp the results, i.e., there is no need to clamp the output ourselves?
2- Does the difference between the theoretical and actual bitrates come from the practical implementation of the encoder, which spends some extra bits on things such as the “end of file” symbol, discretization of everything into bits, etc.?
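To make question 1 concrete, here is a minimal sketch in plain Python. It does not call the library. The numbers are hypothetical, and it assumes (as the observation above suggests, but does not prove) that the decompressed values coincide with the clamped forward values.

```python
def clamp(values, lo=0.0, hi=1.0):
    """Clamp every value into [lo, hi], like Tensor.clamp_(0, 1) on a flat list."""
    return [min(max(v, lo), hi) for v in values]


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equal-length lists."""
    return max(abs(x - y) for x, y in zip(a, b))


# Hypothetical values: an unclamped forward output that slightly leaves [0, 1],
# and a decoded output that (by assumption) already lies inside [0, 1].
forward_x_hat = [-0.02, 0.35, 0.99, 1.04]
decoded_x_hat = [0.0, 0.35, 0.99, 1.0]

raw_diff = max_abs_diff(forward_x_hat, decoded_x_hat)
clamped_diff = max_abs_diff(clamp(forward_x_hat), decoded_x_hat)

print("max difference without clamping:", raw_diff)
print("max difference with clamping:", clamped_diff)
```

In this toy case the only mismatch comes from the out-of-range values, so it disappears once the forward output is clamped. Whether the library behaves this way internally is exactly what the question asks.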
Here is some simple code to test this:
import math
import torch
from torchvision import transforms
from PIL import Image
def compute_theoretical_bits(out_net):
    # Ideal code length per image: -sum(log2(likelihood)), rounded up, summed over all latents.
    list_latent_bits = [torch.ceil((torch.log(likelihoods).sum(dim=(1, 2, 3)) / (-math.log(2)))) for likelihoods in out_net['likelihoods'].values()]
    total_bits_per_image = torch.sum(torch.stack(list_latent_bits, dim=0), dim=0).long()
    return total_bits_per_image

def compute_actual_bits(compressed_stream):
    # Actual size per image: byte length of each encoded string times 8, summed over all latents.
    list_latent_bits = [torch.tensor([len(s) * 8 for s in list_s]) for list_s in compressed_stream["strings"]]
    total_bits_per_image = torch.sum(torch.stack(list_latent_bits, dim=0), dim=0)
    return total_bits_per_image

from compressai.zoo import bmshj2018_hyperprior
device = 'cuda' if torch.cuda.is_available() else 'cpu'
net = bmshj2018_hyperprior(quality=2, pretrained=True).eval().to(device)
net.update(force=True) # update the model CDFs parameters.
print(f'Parameters: {sum(p.numel() for p in net.parameters())}')
print(f'Entropy bottleneck(s) parameters: {sum(p.numel() for p in net.aux_parameters())}')
img = Image.open('../data/stmalo_fracape.png').convert('RGB')
x = transforms.ToTensor()(img).unsqueeze(0)
x = x.to(device)
with torch.no_grad():
    # output of training (forward pass)
    out_net = net(x)
    out_net['x_hat'].clamp_(0, 1)
    # moved to CPU so it can be compared with the CPU tensor returned by compute_actual_bits
    bits_per_image = compute_theoretical_bits(out_net).cpu()
    # output of real compression and decompression
    compressed = net.compress(x)
    compressed_bits_per_image = compute_actual_bits(compressed)
    decompressed = net.decompress(compressed["strings"], compressed["shape"])
    # decompressed['x_hat'].clamp_(0, 1) # no need to clamp decompressed results?

diff = (out_net["x_hat"] - decompressed["x_hat"]).abs()
diff_in_bits = (bits_per_image - compressed_bits_per_image).abs()
print("max difference={}, min difference={}".format(diff.max(), diff.min()))
print("diff in bits={}, ratio (compressed/training)={}%".format(diff_in_bits, torch.div(compressed_bits_per_image, bits_per_image)))
isCloseReconstruction = torch.allclose(out_net["x_hat"], decompressed["x_hat"], atol=1e-06, rtol=0)
isCloseBits = torch.allclose(bits_per_image, compressed_bits_per_image, atol=0, rtol=1e-2)
assert isCloseReconstruction, "The output of decompressed image is not equal to image"
assert isCloseBits, "The number of compressed bits is not equal to the number of bits computed in training phase"
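As a rough illustration of one part of question 2, the sketch below shows how byte alignment alone can separate the two numbers. compute_actual_bits counts len(s) * 8, so the measured size is always a whole number of bytes, whereas the likelihood-based estimate is not. The likelihood values and helper names are hypothetical, and the sketch models only the rounding to whole bytes, not any other overhead the real entropy coder may add.

```python
import math


def ideal_bits(likelihoods):
    """Ideal code length in bits: the sum of -log2(p) over all symbols."""
    return -sum(math.log2(p) for p in likelihoods)


def min_byte_aligned_bits(bits):
    """Smallest whole-byte size (expressed in bits) that can hold `bits` bits."""
    return 8 * math.ceil(bits / 8)


# Hypothetical per-symbol likelihoods, standing in for out_net['likelihoods'].
likelihoods = [0.5, 0.25, 0.9, 0.1]

ideal = ideal_bits(likelihoods)
aligned = min_byte_aligned_bits(ideal)

print("ideal bits:", ideal)
print("byte-aligned lower bound:", aligned)
print("padding due to byte alignment:", aligned - ideal)
```

Each encoded string is padded separately, so under this simplified model a network with several latents could lose up to just under 8 bits per string from rounding alone.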
Thank you Jean and wish you a nice and pleasant end of the year.
For bmshj2018-factorized/bmshj2018-hyperprior, there is no difference. For mbt2018, there is a difference. I think it is because of the autoregression?