Accurate free-form deformation is designed to solve the sampling problem and can generate high-quality deformation results. However,
it is difficult to achieve real-time performance because of its high computational cost. The authors previously proposed a GPU acceleration
algorithm to address this problem. In this paper, a new GPU algorithm is proposed to further accelerate accurate free-form deformation
significantly, where the deformed object is represented in terms of triangular Bezier surfaces. The intensive computations involved
in accurate free-form deformation are implemented entirely on the GPU via CUDA. After careful derivation, the computations of
generating the triangular Bezier surfaces and their tessellations are formulated as two matrix multiplications. Since the relevant
matrices can be reused for all triangular Bezier surfaces, the computations can be greatly accelerated via CUBLAS, a GPU
implementation of the Basic Linear Algebra Subprograms (BLAS). Finally, the tessellated surfaces are output to a vertex buffer object
and rendered efficiently via OpenGL. Experimental results show that the proposed algorithm is more efficient than both the previous GPU
acceleration algorithm and a tessellation shader algorithm.
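
The idea that tessellation reduces to a matrix multiplication with a reusable matrix can be illustrated with a minimal CPU sketch. It is not the derivation or the CUBLAS implementation used in the paper: it covers only the tessellation step, uses plain Python lists in place of GPU matrices, and all function names (index_triples, bernstein_matrix, matmul, tessellate) are hypothetical. Each row of the basis matrix holds the Bernstein polynomial values at one sample point, so multiplying it by a patch's control points yields the tessellated vertices of that patch.

```python
from math import factorial


def index_triples(n):
    """All (i, j, k) with i + j + k = n, in a fixed order."""
    return [(i, j, n - i - j)
            for i in range(n, -1, -1)
            for j in range(n - i, -1, -1)]


def bernstein_matrix(degree, resolution):
    """Rows: uniform barycentric samples; columns: control points.

    Entry = n! / (i! j! k!) * u^i * v^j * w^k for sample (u, v, w)
    and control-point index (i, j, k) with i + j + k = degree.
    """
    ctrl = index_triples(degree)
    samples = [(a / resolution, b / resolution, c / resolution)
               for a, b, c in index_triples(resolution)]
    n_fact = factorial(degree)
    return [[n_fact // (factorial(i) * factorial(j) * factorial(k))
             * u ** i * v ** j * w ** k
             for i, j, k in ctrl]
            for u, v, w in samples]


def matmul(A, B):
    """Dense product of two row-major list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def tessellate(patches, degree, resolution):
    """Tessellate every patch with one shared basis matrix.

    Each patch is a list of 3D control points ordered as in
    index_triples(degree). The basis depends only on degree and
    resolution, so it is built once and reused for all patches.
    """
    basis = bernstein_matrix(degree, resolution)
    return [matmul(basis, ctrl_points) for ctrl_points in patches]
```

Because the basis matrix depends only on the patch degree and the sampling resolution, the control points of all patches could also be concatenated column-wise and handled by a single large product, which is the kind of workload a GPU BLAS routine is suited to.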