The fitting procedure is simple and requires minimal human involvement. On average, we can fit the first frame of a sequence in less than 60 seconds and each subsequent frame in less than 15 seconds.
The proxy model is articulated using the noisy joint angles to generate proxy depth maps, and residues are obtained by subtracting the fitted proxy depth maps from the input depth maps. Residue differences are obtained by subtracting from . The mask bit is obtained by thresholding the depth map. Encoding is performed as described in the previous section, and decoding is the reverse process at the receiver's end.
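The residue and mask steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the `background_threshold` and `background_value` parameters, and the assumption that background pixels lie at or beyond the threshold are all hypothetical, and the actual entropy coding of the residues is not shown.

```python
import numpy as np


def compute_residues(input_depth, proxy_depth, background_threshold):
    """Return (mask, residue) for one depth frame.

    Assumption: a pixel is foreground when its input depth is below
    background_threshold. The text only says the mask comes from
    thresholding the depth map, so the direction is a guess.
    """
    input_depth = np.asarray(input_depth, dtype=np.int32)
    proxy_depth = np.asarray(proxy_depth, dtype=np.int32)
    mask = input_depth < background_threshold
    # Residue = input depth minus fitted proxy depth, on foreground pixels only.
    residue = np.where(mask, input_depth - proxy_depth, 0)
    return mask, residue


def reconstruct_depth(proxy_depth, residue, mask, background_value):
    """Decoder side: add the residue back to the proxy depth map."""
    proxy_depth = np.asarray(proxy_depth, dtype=np.int32)
    return np.where(mask, proxy_depth + residue, background_value)
```

With lossless residues, `reconstruct_depth` returns the input depth map exactly on foreground pixels; any loss comes from how many residue bits are kept during encoding.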
Results on a few synthetic datasets are shown in the graphs of figure 8. We experimented on three MOCAP sequences (Indian dance, Ballet and Exercise), each with around 300-400 frames. The compression ratio is measured with respect to the original, uncompressed depth maps. The PSNR is calculated by comparing the reconstructed depth maps with the input depth maps. Residue compression alone exploits only the spatial redundancy, whereas our residue difference encoding scheme exploits both the temporal and the spatial redundancy of depth movies. The bit-plane scheme provides high compression and moderate quality at a low number of bits, and good compression and excellent quality at a higher number of bits. As expected, increasing the number of bits used in encoding decreases the compression ratio and increases the quality. It also provides fully random access to the depths of individual frames. Most interestingly, the option of using 0 bits of residue provides a very low bit-rate approximation of the input scene.
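The two evaluation measures and the effect of keeping fewer residue bits can be illustrated with the sketch below. It rests on several assumptions that are not taken from the paper: a peak value of 255 (8-bit depth), a sign-magnitude view of the residue in which the most significant bit planes are kept first, and hypothetical function names. The encoding actually used is the one described in the previous section.

```python
import numpy as np


def psnr(reference, reconstructed, peak=255.0):
    """PSNR in dB between input and reconstructed depth maps (peak is assumed)."""
    reference = np.asarray(reference, dtype=np.float64)
    reconstructed = np.asarray(reconstructed, dtype=np.float64)
    mse = np.mean((reference - reconstructed) ** 2)
    if mse == 0:
        return float("inf")  # identical maps
    return 10.0 * np.log10(peak ** 2 / mse)


def compression_ratio(uncompressed_bytes, compressed_bytes):
    """Size of the original depth maps divided by the size of the encoded stream."""
    return uncompressed_bytes / compressed_bytes


def keep_top_bits(residue, total_bits, kept_bits):
    """Keep only the kept_bits most significant bit planes of |residue|.

    Assumes every |residue| fits in total_bits. With kept_bits == 0 the
    residue vanishes and the decoder falls back to the proxy depth alone.
    """
    residue = np.asarray(residue, dtype=np.int64)
    dropped = total_bits - kept_bits
    magnitude = (np.abs(residue) >> dropped) << dropped
    return np.sign(residue) * magnitude
```

Calling `keep_top_bits` with a smaller `kept_bits` shrinks the data to be coded and lowers the PSNR, which is the trade-off the graphs report.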
Besides joint angle noise and depth noise, we varied the block size used for coding the depth movies to obtain high compression with good quality. We observed that as the block size increases, the average compression ratio increases and the PSNR decreases. Thus, larger blocks are preferable for coding, as encoding with bits is lesser than bits for D-Frames, where . With the block size held constant, the compression ratio falls as the joint noise increases, since more bits are needed to fully represent the residues. Figure 7 plots the PSNR and the compression ratio against the number of bits used to encode the residues, and number of bits used to encode residue differences, , for one dataset. It can be seen that the PSNR varies slowly with the number of bits, while the compression ratio of bit-plane encoding remains good.
We compared our method with the current state of the art, MPEG. MPEG compression of depth and residues (MPEG-R and MPEG-D in the graphs of figure 8) provides decent compression and quality, but the bit-plane encoding scheme offers more size-to-quality trade-offs to suit a given situation.
We carried out the same experiments on a real dataset. The Doo-Young dataset consists of 8-bit images. The compression ratios and quality obtained are shown in figure 9. The point cloud is fitted in the same manner as for the MOCAP datasets. There are no noise levels here, since no noise needs to be simulated for a real dataset. We observed that the trade-offs are very similar to those for the simulated MOCAP datasets.
From the graphs in figures 8 and 9, we observe that if a remote client asks for a particular range of compression ratios and quality, it has a set of choices among various combinations of -bits, -bits and block sizes. This makes the system effective for a remote server-client teleimmersion environment with user-compatible service options.
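One way a server might serve such a request is to keep a table of measured operating points and filter it. The sketch below is purely illustrative: the `CodingOption` record, its field names, and the use of lower bounds rather than full ranges are assumptions, and the table entries would have to come from measurements such as those plotted in the figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodingOption:
    """One measured operating point (all field names are hypothetical)."""
    residue_bits: int
    difference_bits: int
    block_size: int
    compression_ratio: float
    psnr_db: float


def options_in_range(options, min_ratio, min_psnr_db):
    """Return the options meeting both lower bounds, best compression first."""
    matches = [
        o for o in options
        if o.compression_ratio >= min_ratio and o.psnr_db >= min_psnr_db
    ]
    return sorted(matches, key=lambda o: o.compression_ratio, reverse=True)
```

The client can then pick any returned combination, for example the first entry when bandwidth is the main constraint.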
More results and videos for sequences with varying bits for encoding, levels of joint noise, depth noise and block sizes are provided in the supplementary material.