Next: Conclusion Up: Proxy-Based Compression of D Previous: Decoding:

Experiments and Results

We now present the results of our scheme. We experimented on a real depth dataset and on MOCAP (motion capture) datasets of dynamic scenes. The real dataset is the Doo-Young karate sequence from ETH-Z, shown in figure 1. Since such real data is very rare, we also experimented on real MOCAP data that is freely available from CMU. A standard POSER human model was animated using the MOCAP parameters, and 16-bit depth maps were captured from 20 viewpoints. A 16-bit depth map can represent 65,536 distinct depth values (0 to 65535). The joint-angle parameters used for compression are very similar to those in the MOCAP files. We, however, added noise to the joint angles to simulate imperfect fitting of the model to real data. We also added slowly varying noise to the depth values to simulate errors of the depth-recovery process. The depth noise has a small random component at each pixel that rides on top of a larger component that varies slowly over the whole depth map.
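The two noise sources described above can be sketched as follows. This is only a minimal illustration, not the procedure actually used in the experiments: the function names, the Gaussian joint-angle noise, the sinusoidal form of the slowly varying component and all amplitudes are assumptions made for the example.

```python
import math
import random


def add_joint_noise(angles, sigma_deg=1.0, seed=0):
    """Perturb joint angles (degrees) with Gaussian noise (assumed model)."""
    rng = random.Random(seed)
    return [a + rng.gauss(0.0, sigma_deg) for a in angles]


def add_depth_noise(depth, slow_amp=20.0, pixel_amp=2.0, seed=0):
    """Return a noisy copy of a 2D depth map given as a list of rows.

    slow_amp and pixel_amp are hypothetical amplitudes in depth units.
    """
    rng = random.Random(seed)
    rows, cols = len(depth), len(depth[0])
    # Random phases so the smooth component differs between depth maps.
    phase_y = rng.uniform(0.0, 2.0 * math.pi)
    phase_x = rng.uniform(0.0, 2.0 * math.pi)
    noisy = []
    for y in range(rows):
        out_row = []
        for x in range(cols):
            # Larger component: about one period across the whole map,
            # bounded in magnitude by slow_amp.
            slow = slow_amp * 0.5 * (
                math.sin(2.0 * math.pi * y / rows + phase_y)
                + math.sin(2.0 * math.pi * x / cols + phase_x)
            )
            # Small component: drawn independently at each pixel.
            fast = rng.uniform(-pixel_amp, pixel_amp)
            value = int(round(depth[y][x] + slow + fast))
            # Clamp to the 16-bit range of the depth maps.
            out_row.append(min(65535, max(0, value)))
        noisy.append(out_row)
    return noisy
```

Fixing the seed makes the simulated noise repeatable, which is convenient when the same sequence is encoded with several parameter settings.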

The fitting procedure is simple and requires minimal human involvement. On average, we can fit the first frame of a sequence in less than 60 seconds and each subsequent frame in less than 15 seconds.

The proxy model is articulated using the noisy angles to generate proxy depth maps, and residues are generated by subtracting the fitted proxy depth maps from the input depth maps. Residue differences are obtained by subtracting $ R_i$ from $ R_{i+1}$. The mask bit is obtained by thresholding the depth map. Encoding is performed as described in the previous section. Decoding is performed as the reverse process at the receiver's end.
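The arithmetic of this step can be sketched in a few lines. The helper names are hypothetical, depth maps are plain lists of rows, and the direction of the mask threshold (pixels nearer than the threshold count as foreground) is an assumption; bit-plane encoding itself is not shown.

```python
def subtract(a, b):
    """Pixel-wise a - b for two equally sized 2D maps."""
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def add(a, b):
    """Pixel-wise a + b for two equally sized 2D maps."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def residues(input_maps, proxy_maps):
    """R_i = input depth map minus fitted proxy depth map, per frame."""
    return [subtract(d, p) for d, p in zip(input_maps, proxy_maps)]


def residue_differences(res):
    """Differences R_{i+1} - R_i between consecutive residues."""
    return [subtract(res[i + 1], res[i]) for i in range(len(res) - 1)]


def mask_bits(depth, threshold):
    """1 where the pixel is nearer than threshold (assumed convention)."""
    return [[1 if d < threshold else 0 for d in row] for row in depth]


def reconstruct(proxy, residue):
    """Decoder side: depth = proxy depth + residue."""
    return add(proxy, residue)
```

Without quantization the process is exactly reversible: adding a residue difference to $ R_i$ gives $ R_{i+1}$, and adding $ R_{i+1}$ to the proxy depth map returns the input depth map.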

Results on a few synthetic datasets are shown in the graphs of figure 8. We experimented on three MOCAP sequences, Indian dance, Ballet and Exercise, each with around 300-400 frames. The compression ratio is measured with respect to the original, uncompressed depth maps. The PSNR is calculated by comparing the reconstructed depth maps with the input depth maps. Residue compression alone exploits only spatial redundancy, whereas our residue-difference encoding scheme exploits both the temporal and the spatial redundancy of depth movies. The bit-plane scheme provides high compression and moderate quality at a low number of bits, and good compression and excellent quality at a higher number of bits. As expected, increasing the number of bits used for encoding decreases the compression ratio and increases the quality. The scheme also provides fully random access to the depths of individual frames. Most interestingly, the option of using 0 bits of residue provides a very low bit-rate approximation of the input scene.
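For reference, the two reported measures can be computed as in the sketch below. It uses the standard PSNR definition; the peak value of 65535 (the maximum of a 16-bit depth map) is an assumption, since the text does not state which peak was used, and the function names are illustrative.

```python
import math


def psnr(reference, reconstructed, peak=65535):
    """PSNR in dB between two equally sized 2D depth maps.

    peak is the largest representable depth value (assumed 16-bit here;
    255 would be the natural choice for 8-bit images).
    """
    total = 0
    count = 0
    for ref_row, rec_row in zip(reference, reconstructed):
        for r, c in zip(ref_row, rec_row):
            total += (r - c) ** 2
            count += 1
    mse = total / count
    if mse == 0:
        return math.inf  # identical maps
    return 10.0 * math.log10(peak * peak / mse)


def compression_ratio(uncompressed_bytes, compressed_bytes):
    """Size of the original depth maps divided by the encoded size."""
    return uncompressed_bytes / compressed_bytes
```

A higher compression ratio means a smaller encoded stream, and a higher PSNR means the reconstructed depth maps are closer to the input.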

Besides joint-angle noise and depth noise, we varied the block size used for coding the depth movies to find a good balance of compression and quality. We observed that as the block size increases, the average compression ratio increases and the PSNR decreases. Larger blocks are therefore preferable when compression is the priority, since D-Frames are encoded with $ k$ bits rather than $ K$ bits, where $ K > k$. With the block size kept constant, the compression ratio drops as the joint noise increases, since more bits are needed to fully represent the residues. Figure 7 plots the PSNR and the compression ratio against the number of bits used to encode the residues, $ K$, and the number of bits used to encode the residue differences, $ k$, for one dataset. It can be seen that the PSNR varies slowly with the number of bits, while the compression ratio of bit-plane encoding remains good.
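The effect of block size on the bit budget can be illustrated with a rough calculation. The sketch assumes that a block of frames holds one residue frame coded with $ K$ bits per pixel followed by D-Frames coded with $ k$ bits per pixel; this block layout is an assumption of the example, and mask bits, joint angles and any further entropy coding are ignored.

```python
def average_residue_bits(block_size, K, k):
    """Average residue bits per pixel per frame over one block.

    Assumed layout: 1 frame with K bits, then (block_size - 1) D-Frames
    with k bits each.
    """
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    return (K + (block_size - 1) * k) / block_size


# With K = 8 and k = 2, the average falls from 8 bits at block size 1
# towards 2 bits as the block grows.
for size in (1, 4, 10):
    print(size, average_residue_bits(size, K=8, k=2))
```

Under this assumption the average approaches $ k$ as the block grows, which is consistent with the observed rise in compression ratio for larger blocks.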

We compared our method with the current state of the art, MPEG. MPEG compression of depth and residues (MPEG-R and MPEG-D in the graphs of figure 8) provides decent compression and quality, but the bit-plane encoding scheme offers a wider range of size-quality trade-offs to suit different situations.

We carried out the same experiments on the real dataset. The Doo-Young dataset consists of 8-bit images. The compression ratios and quality are shown in figure 9. The point cloud is fitted in the same manner as for the MOCAP dataset. Here there are no noise levels to vary, since no simulated noise is needed for a real dataset. We observed that the trade-offs are very similar to those for the simulated MOCAP dataset.

The graphs in figures 8 and 9 show that if a remote client asks for a particular range of compression ratios and quality, it can choose among various combinations of $ K$ bits, $ k$ bits and block sizes. This makes the system effective for a remote server-client tele-immersion environment with service options that suit the user.

More results and videos for sequences with varying bits for encoding, levels of joint noise, depth noise and block sizes are provided in the supplementary material.


2008-04-27