Volume Rendering with Higher-Order Interpolation - Project Webpage
A great deal of research has been done on volume rendering. Most of it deals with a fundamental difficulty that every implementation in this area has to solve: both rendering performance and rendering quality should be as high as possible. Unfortunately, these two requirements tend to conflict. In this project, three different approaches to cubic interpolation are implemented for a volume rendering framework. These methods are compared with standard trilinear interpolation and with each other with regard to their rendering speed. Although all three methods provide the same visual quality, the difference in performance is significant. The results show that tricubic filtering for volume rendering is feasible at interactive frame rates with currently available consumer graphics hardware.
Data that is stored for use in a computer is always discretized. Reconstructing the original data as accurately as possible, while maintaining a reasonable execution speed, is a fundamental problem in computer graphics. For the special case of volume rendering, the original continuous data is usually sampled at evenly spaced locations on a three-dimensional Cartesian grid. Theoretically, if the original data is band-limited and sampled densely enough, it can be reconstructed without any loss of information. In practice, this is not possible, since it would require an ideal reconstruction filter (the sinc function), and such an ideal filter has infinite extent. Therefore, as is often the case, a trade-off between quality and speed has to be made.
Volume rendering frameworks usually reconstruct the volumetric data with trilinear filtering. This interpolation method is supported by current graphics hardware and is thus very fast. However, its quality leaves considerable room for improvement. Since current graphics hardware allows almost arbitrary interpolation kernels to be implemented, the next step, higher-order interpolation, is explored for a volume rendering framework. The major bottleneck of such interpolation kernels is the high number of texture lookups they need. In total, three different implementations of tricubic interpolation were developed: a straightforward implementation with 64 texture lookups, a two-level version with 27 texture lookups, and an implementation of a previously published method [Sigg and Hadwiger, 2005]. These implementations are compared with each other with regard to their execution speed on an artificial and a real-world test data set.
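The straightforward variant with 64 lookups can be sketched on the CPU as follows. This is a minimal illustration, not the project's shader code: it assumes uniform cubic B-spline weights (the text does not state which cubic kernel is used), samples located at integer coordinates, and clamp-to-edge addressing. The names `cubic_bspline_weights`, `fetch` and `tricubic_straightforward` are hypothetical.

```python
import math


def cubic_bspline_weights(t):
    """Weights of the four neighbouring samples for a fractional offset t in [0, 1)."""
    return (
        (1.0 - t) ** 3 / 6.0,
        (3.0 * t**3 - 6.0 * t**2 + 4.0) / 6.0,
        (-3.0 * t**3 + 3.0 * t**2 + 3.0 * t + 1.0) / 6.0,
        t**3 / 6.0,
    )


def fetch(volume, x, y, z):
    """One unfiltered lookup with clamp-to-edge addressing (stands in for a texture fetch)."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    x = min(max(x, 0), nx - 1)
    y = min(max(y, 0), ny - 1)
    z = min(max(z, 0), nz - 1)
    return volume[z][y][x]


def tricubic_straightforward(volume, px, py, pz):
    """Filter volume[z][y][x] at a continuous position; returns (value, number of fetches)."""
    ix, iy, iz = math.floor(px), math.floor(py), math.floor(pz)
    wx = cubic_bspline_weights(px - ix)
    wy = cubic_bspline_weights(py - iy)
    wz = cubic_bspline_weights(pz - iz)
    value = 0.0
    fetches = 0
    # Visit the full 4 x 4 x 4 neighbourhood around the sample position.
    for k in range(4):
        for j in range(4):
            for i in range(4):
                sample = fetch(volume, ix - 1 + i, iy - 1 + j, iz - 1 + k)
                value += wz[k] * wy[j] * wx[i] * sample
                fetches += 1
    return value, fetches
```

Each axis contributes four weights, so the triple loop performs 4 x 4 x 4 = 64 lookups per sample, which is the cost the optimized variants try to reduce.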
The higher-order interpolation method will be applied in a volume rendering software that uses a raycasting approach. Here, data values have to be computed at positions that may lie between grid points of the input volume; in 3D, this kind of resampling is usually done with trilinear interpolation. On GPUs, a texture access with this kind of interpolation is hardware-accelerated and therefore very fast. However, to improve the visual quality significantly, higher-order interpolation schemes should be applied. In this project, tricubic interpolation will be implemented as a shader program that runs on the GPU. Cubic interpolation schemes consider not only the direct neighbors of the interpolation point; they also "blossom" in all directions and include a larger neighborhood. Although this can be done quite easily in a straightforward way, an efficient implementation will prove to be more complicated.
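For reference, the trilinear resampling that the GPU performs in hardware can be written out in software as below. This is only a sketch under stated assumptions: samples sit at integer coordinates (GPU texel centers are offset by half a texel), the volume is a nested list indexed as `volume[z][y][x]` with at least two samples per axis, and the function names are hypothetical.

```python
import math


def lerp(a, b, s):
    """Linear interpolation between a and b with parameter s in [0, 1]."""
    return (1.0 - s) * a + s * b


def trilinear(volume, px, py, pz):
    """Trilinearly interpolate volume[z][y][x] at a continuous position."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    # Clamp the base cell so that the +1 neighbours stay inside the grid.
    ix = min(max(math.floor(px), 0), nx - 2)
    iy = min(max(math.floor(py), 0), ny - 2)
    iz = min(max(math.floor(pz), 0), nz - 2)
    tx, ty, tz = px - ix, py - iy, pz - iz
    # Interpolate along x on the four cell edges, then along y, then along z.
    c00 = lerp(volume[iz][iy][ix], volume[iz][iy][ix + 1], tx)
    c10 = lerp(volume[iz][iy + 1][ix], volume[iz][iy + 1][ix + 1], tx)
    c01 = lerp(volume[iz + 1][iy][ix], volume[iz + 1][iy][ix + 1], tx)
    c11 = lerp(volume[iz + 1][iy + 1][ix], volume[iz + 1][iy + 1][ix + 1], tx)
    return lerp(lerp(c00, c10, ty), lerp(c01, c11, ty), tz)
```

Only the eight corners of the enclosing cell contribute, which is why this filter is cheap but reproduces nothing smoother than a piecewise linear function along each axis.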
The idea is to make use of hardware-accelerated texture accesses in order to speed up the higher-order interpolation computations. The problem is that there is no easy way of doing this. Therefore, the first step of this project will be to find a mathematical solution to this problem. Subsequent steps will implement the interpolation formula in shader programs; an additional goal would be to use this interpolation on BCC (body-centered cubic) grids.
In order to reduce the number of texture accesses significantly, an optimized version of the straightforward tricubic interpolation was developed. The basic idea is to approach the interpolation problem in two levels. To benefit from the hardware support of current GPUs, the first level performs texture accesses with hardware-accelerated linear interpolation. These intermediate results are the input for the second level, where the final value is computed using quadratic B-splines as weights. By C. de Boor’s recurrence relations [de Boor, 1972], this final result is equivalent to cubic interpolation.
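The two-level idea can be checked numerically in one dimension. The sketch below assumes uniform cubic B-spline filtering and derives the lerp parameters by applying de Boor's recurrence once on uniform knots; the text does not give the exact formula, so the specific weights and offsets here are an assumption, and all function names are hypothetical.

```python
def lerp(a, b, s):
    """Linear interpolation; stands in for one hardware-filtered texture access."""
    return (1.0 - s) * a + s * b


def cubic_bspline_4tap(d0, d1, d2, d3, t):
    """Reference: four samples weighted by the uniform cubic B-spline, t in [0, 1)."""
    w0 = (1.0 - t) ** 3 / 6.0
    w1 = (3.0 * t**3 - 6.0 * t**2 + 4.0) / 6.0
    w2 = (-3.0 * t**3 + 3.0 * t**2 + 3.0 * t + 1.0) / 6.0
    w3 = t**3 / 6.0
    return w0 * d0 + w1 * d1 + w2 * d2 + w3 * d3


def cubic_bspline_two_level(d0, d1, d2, d3, t):
    """Same result from three linear lookups weighted by quadratic B-splines."""
    # Level 1: linear interpolation between adjacent samples.
    # The parameters are linear in t and follow from de Boor's recurrence.
    a = lerp(d0, d1, (t + 2.0) / 3.0)
    b = lerp(d1, d2, (t + 1.0) / 3.0)
    c = lerp(d2, d3, t / 3.0)
    # Level 2: uniform quadratic B-spline weights.
    q0 = (1.0 - t) ** 2 / 2.0
    q1 = (-2.0 * t**2 + 2.0 * t + 1.0) / 2.0
    q2 = t**2 / 2.0
    return q0 * a + q1 * b + q2 * c
```

In this form each axis needs three linearly filtered lookups instead of four unfiltered ones, so a separable 3D version would need 3 x 3 x 3 = 27 trilinear lookups, which is consistent with the count quoted for the two-level version.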
The goal is to implement higher-order interpolation on graphics hardware. The first step towards this goal is a straightforward implementation of bicubic interpolation in 2D (or tricubic interpolation in 3D) using vertex and pixel shaders. In order to validate the implemented interpolation filters, a small testing framework was developed that works with DirectX 9.0c and HLSL. This application shows the interpolation result for various interpolation methods that take 2D textures or slices of volume textures as input. The following interpolation methods were developed:
- nearest-neighbor sampling
- hardware-accelerated bilinear interpolation
- straightforward implementation of bilinear interpolation
- straightforward implementation of biquadratic interpolation
- optimized, two-level implementation of biquadratic interpolation
- straightforward implementation of bicubic interpolation
- optimized, two-level implementation of bicubic interpolation
- bicubic interpolation approach of [Sigg and Hadwiger, 2005]
- hardware-accelerated trilinear interpolation
- straightforward implementation of trilinear interpolation
- straightforward implementation of tricubic interpolation
- optimized, two-level approach of tricubic interpolation
- tricubic interpolation approach of [Sigg and Hadwiger, 2005]
Once the interpolation shaders were validated in this testing framework, they were compiled to ARB fragment programs using the Cg compiler offered on NVIDIA's webpage. These ARB fragment programs were plugged into the fragment shaders of the raycasting framework of [Stegmaier et al, 2005]. Essentially, every trilinear texture access of the raycasting framework is replaced with the shader code for tricubic interpolation. However, the size of these shader programs increases significantly: the number of temporary variables needed to perform the shader operations for isosurface rendering is too high to run even on a GeForce 7. The only solution found so far is to run the program on a GeForce 8, which simply offers more registers for temporary variables.
- Straightforward Implementation: March 9 (done)
- Interpolation Formula: March 16 (done)
- Efficient Implementation: March 23 (done)
- (Optional) Implementation on BCC grids: March 31 (skipped)
- Project Paper: April 3 (done)
- Final Presentation: ~April 5
The expected result is a higher visual quality of the rendered images while maintaining good performance. In particular, the volume renderer should be fast enough to achieve interactive frame rates while providing the resampling quality of cubic interpolation.
To compare the three implementations of tricubic interpolation kernels and standard trilinear interpolation, four test runs were performed with the raycasting framework of [Stegmaier et al, 2005]. The rendering speed is documented in the graph below:
These performance measurements were performed on an AMD dual-core system where each core runs at 2400 MHz. The GPU of this system is an NVIDIA GeForce 8800 GTX, which provides 768 MB of texture memory. All tests were performed on a 64-bit version of Windows XP. In this graph, the rendering speed is documented in frames per second. The first two test scenarios are volume rendering and isosurface rendering of the Marschner-Lobb data set [Marschner and Lobb, 1994]. The parameters of the Marschner-Lobb data set are as follows: the volume size is 32x32x32, the alpha value for computing the scalar values is set to 0.25, and the value for fM is set to 6. The third and fourth test scenes show the volume rendering and isosurface rendering of a human head data set. The size of this data set is 256x256x225. The viewport size for all test runs is fixed at 512x512 pixels.
The y-axis shows the resulting rendering speed in frames per second. Red bars represent standard trilinear filtering, green bars tricubic filtering with Sigg and Hadwiger’s method, blue bars the two-level approach to tricubic filtering presented in this paper, and black bars the straightforward implementation of tricubic filtering.
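For readers who want to reproduce the artificial test volume, the Marschner-Lobb signal with the parameters stated above (alpha = 0.25, fM = 6, 32x32x32 samples) can be generated roughly as follows. The formula follows the definition in [Marschner and Lobb, 1994]; sampling the cube [-1, 1]^3 with the end points included is an assumption, and the function names are hypothetical.

```python
import math


def marschner_lobb(x, y, z, alpha=0.25, f_m=6.0):
    """Marschner-Lobb test signal for a point in [-1, 1]^3; values lie in [0, 1]."""
    r = math.sqrt(x * x + y * y)
    rho_r = math.cos(2.0 * math.pi * f_m * math.cos(math.pi * r / 2.0))
    numerator = 1.0 - math.sin(math.pi * z / 2.0) + alpha * (1.0 + rho_r)
    return numerator / (2.0 * (1.0 + alpha))


def sample_volume(n=32):
    """Sample the signal on an n x n x n Cartesian grid, indexed as volume[z][y][x]."""
    # Assumption: the grid covers [-1, 1] on every axis, end points included.
    coords = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    return [[[marschner_lobb(x, y, z) for x in coords] for y in coords] for z in coords]
```

Because most of the signal's energy lies close to the Nyquist frequency of such a coarse grid, the quality of the reconstruction filter shows up directly in how well the concentric waves are preserved.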
The visual quality of tricubic filtering is better than that of trilinear filtering. However, the differences are subtle and are easier to spot while the data set is explored in real time. The following images are screenshots of an isosurface rendering with the trilinear interpolation kernel (left) and a tricubic interpolation kernel (right):
The difference between these two interpolation methods can be demonstrated more convincingly with the filter test volume of [Marschner and Lobb, 1994]. The following two screenshots show this test data set rendered as a volume with the trilinear interpolation kernel (left) and a tricubic interpolation kernel (right). Ideally, concentric waves should be visible. However, due to filtering errors, these waves are distorted to a certain extent:
Discussion and Future Work
The performance measurements show that cubic interpolation can be used on current graphics hardware to improve the visual quality and reduce the number of sampling artifacts in volume rendering frameworks. The measured rendering speed is in the range of interactive frame rates on current graphics hardware. The cubic interpolation method improves the visual quality; however, the differences are subtle for volume rendered scenes. The benefit of the method is higher for scenes that show an isosurface of a data set. Here, the normals are computed more smoothly, which results in fewer artifacts.
So far, all interpolation methods have been used only on Cartesian grids; extending them to BCC grids is planned. The advantage of BCC grids is that fewer sampling points are needed to achieve the same reconstruction quality as on Cartesian grids. Consequently, the performance on BCC grids should be better, since fewer computations are necessary.
The presented method of two-level cubic interpolation could not outperform the already published method [Hadwiger et al, 2001]; however, there is still some potential that is worth further research. For example, the texture coordinates have a linear offset (which is not the case for Sigg and Hadwiger’s method). This fact could be exploited in the future in order to speed up the interpolation computations.
[Marschner and Lobb, 1994]
S. Marschner and R. Lobb: An Evaluation of Reconstruction Filters for Volume Rendering, Proceedings of Visualization '94, pp. 100-107, 1994.
[Hadwiger et al, 2001]
M. Hadwiger, T. Theussl, H. Hauser and E. Gröller: Hardware-Accelerated High-Quality Filtering on PC Hardware, Proceedings of Vision, Modeling and Visualization 2001, pp. 105-112, 2001.
[Sigg and Hadwiger, 2005]
C. Sigg and M. Hadwiger: Fast Third-Order Texture Filtering, In GPU Gems 2, edited by M. Pharr, pp. 313-329, Addison-Wesley, 2005.
[de Boor, 1972]
C. de Boor: On Calculating with B-splines, Journal of Approximation Theory 6, pp. 50-62, 1972.
[Stegmaier et al, 2005]
S. Stegmaier, M. Strengert, T. Klein and T. Ertl: A simple and flexible volume rendering framework for graphics-hardware-based raycasting, In Proceedings of the International Workshop on Volume Graphics '05, pp. 187-195, 2005.
Last modified: Apr. 2nd, 2007
by Sven Bachthaler