Introduction
3D Gaussian Splatting [1] is a method for synthesizing novel views of a scene after training on a set of photos or videos. While it renders high-quality scenes in real time, it cannot be animated with the typical methods used for meshes because it is parameterized and rendered differently. In particular, a gaussian is parameterized by its mean, rotation, scale, opacity, and colors, with the rotation and scale used to compute a covariance matrix.
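As a concrete illustration of that parameterization, the covariance of a gaussian is built from its rotation R and scale S as Σ = R S Sᵀ Rᵀ, which guarantees a valid (positive semi-definite) matrix. A minimal sketch, with a hand-rolled quaternion-to-matrix conversion (function names here are my own, not from any library):

```python
import numpy as np

def quat_to_rotmat(q):
    """Rotation matrix from a unit quaternion (w, x, y, z)."""
    w, x, y, z = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

def covariance(q, scale):
    """Sigma = R S S^T R^T: always symmetric positive semi-definite,
    so it is a valid covariance for any rotation and scale."""
    M = quat_to_rotmat(q) @ np.diag(scale)
    return M @ M.T
```

With the identity rotation, this reduces to a diagonal covariance whose entries are the squared scales.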
My project aimed to implement the 3D Gaussian model for animation and texturing [2], which addresses this issue by binding the gaussians to a proxy mesh that is animated instead. Notably, the proxy mesh does not need to be high quality, unlike previous methods such as SuGaR [3], which extracts a detailed mesh from the gaussians. While the paper describes a large portion of the method, many details are left out, which I had to piece together myself in the absence of an official implementation. This includes the logic for splitting, duplicating, pruning, and resetting the gaussians.
Method
The 3D Gaussian model uses a different parameterization based on bounding boxes: each gaussian is stored as two opposite bounding points located in texture space on the mesh, an angle, and 2 scales along the other 2 axes. For my implementation, I built on the gsplat library [4] for the actual rasterization of the gaussians. I implemented functions for computing barycentric coordinates, scaling the w component of the texture-space coordinates, and interpolating with the barycentric coordinates to obtain world-space coordinates from the mesh triangles. I initialized 3 gaussians per triangle in the mesh, each with one vertex as a bounding point, with all 3 sharing the triangle's center point as the opposite bounding point. The scales were set to the triangle's average edge length, while the rotations were initialized randomly because I found the gaussians trained much more slowly with zero-initialized rotations.
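The texture-to-world mapping described above can be sketched as follows. This is my own illustration of the idea, not code from the paper or gsplat: a texture-space point (u, v, w) is located on a triangle via barycentric coordinates in UV space, its world position is interpolated from the triangle's world vertices, and w lifts it off the surface along the normal.

```python
import numpy as np

def barycentric_coords(p, a, b, c):
    """Barycentric coordinates of 2D point p w.r.t. triangle (a, b, c)."""
    v0, v1, v2 = b - a, c - a, p - a
    d00, d01, d11 = v0 @ v0, v0 @ v1, v1 @ v1
    d20, d21 = v2 @ v0, v2 @ v1
    denom = d00 * d11 - d01 * d01
    v = (d11 * d20 - d01 * d21) / denom
    w = (d00 * d21 - d01 * d20) / denom
    return np.array([1.0 - v - w, v, w])

def texture_to_world(uvw, tri_uv, tri_xyz):
    """Map a texture-space point (u, v, w) to world space: (u, v) locate
    it on the triangle in UV space; w is the offset along the normal."""
    u, v, w = uvw
    bary = barycentric_coords(np.array([u, v]), *tri_uv)
    surface = bary @ tri_xyz  # interpolate the world-space position
    n = np.cross(tri_xyz[1] - tri_xyz[0], tri_xyz[2] - tri_xyz[0])
    n /= np.linalg.norm(n)
    return surface + w * n    # lift off the surface by w
```

Because the gaussian's position is defined relative to the triangle, moving the triangle's world vertices automatically moves the gaussian.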
During training, the gaussians are rasterized, the loss is computed, and the gradients are backpropagated through the rasterizer. Compared to the original gaussian splatting implementation, I added two loss terms: one that keeps each gaussian within the triangle it was initialized on, and another that keeps it close to the triangle's surface (a small w coordinate in texture space).
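One plausible form for those two regularizers (the paper does not spell out the exact penalties, so the function and weights below are my own guess): penalize barycentric coordinates that leave the [0, 1] range, and penalize the magnitude of the surface offset w.

```python
import torch

def regularizer_losses(bary, w, lambda_tri=1.0, lambda_surf=1.0):
    """Hypothetical regularizers. bary: (N, 3) barycentric coordinates of
    each gaussian's bounding points; w: (N,) offsets from the surface."""
    # barycentric coords outside [0, 1] mean the point left its triangle
    outside = torch.relu(-bary) + torch.relu(bary - 1.0)
    loss_tri = outside.sum(dim=-1).mean()
    # keep gaussians close to the mesh surface (small |w|)
    loss_surf = w.abs().mean()
    return lambda_tri * loss_tri + lambda_surf * loss_surf
```

Both terms are zero when a gaussian sits inside its triangle on the surface, so they only activate as it drifts away.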
After the gaussians are trained, linear blend skinning is applied to the mesh, transforming its vertices. Because the gaussians are stored in texture-space coordinates, the mesh acts as an implicit shell, and the gaussians move with it as it is animated.
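For reference, standard linear blend skinning blends per-bone transforms by each vertex's skinning weights and applies the result to the rest-pose position. A minimal sketch (my own, not tied to any particular animation library):

```python
import numpy as np

def linear_blend_skinning(verts, weights, bone_transforms):
    """verts: (V, 3) rest-pose vertices; weights: (V, B) skinning weights
    summing to 1 per vertex; bone_transforms: (B, 4, 4) bone matrices."""
    verts_h = np.concatenate([verts, np.ones((len(verts), 1))], axis=1)
    # blend the bone transforms per vertex, then apply to rest positions
    blended = np.einsum('vb,bij->vij', weights, bone_transforms)
    out = np.einsum('vij,vj->vi', blended, verts_h)
    return out[:, :3]
```

Since the gaussians' texture-space coordinates reference triangles of this mesh, skinning the vertices is enough to carry the gaussians along.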
Results
One aspect I had trouble with was obtaining meshes that corresponded to a dataset usable for training the gaussians; I tried many libraries to no avail. I was only able to use NeuS2 [5] to generate a mesh for the default house scene it came with, and even then an issue when loading the dataset resulted in only 2 images being used. I ended up training the lego scene on a placeholder mesh, and I had to disable the regularizer terms that would have kept the gaussians close to the mesh triangles.


Conclusion
I managed to get animation through linear blend skinning somewhat working, but the way the gaussians are trained likely still has bugs that I need to work out. Generating a mesh from the dataset proved difficult, and I will continue trying to get that aspect working.
References
- Kerbl, B., Kopanas, G., Leimkühler, T., & Drettakis, G. (2023). 3D Gaussian Splatting for Real-Time Radiance Field Rendering. ACM Trans. Graph., 42(4), 139:1-14.
- Wang, X. E., & Sin, Z. (2024). 3D Gaussian Model for Animation and Texturing. arXiv preprint arXiv:2402.19441.
- Guédon, A., & Lepetit, V. (2024). SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 5354-5363).
- Ye, V., Li, R., Kerr, J., Turkulainen, M., Yi, B., Pan, Z., ... & Kanazawa, A. (2025). gsplat: An Open-Source Library for Gaussian Splatting. Journal of Machine Learning Research, 26(34), 1-17.
- Wang, Y., Han, Q., Habermann, M., Daniilidis, K., Theobalt, C., & Liu, L. (2022). NeuS2: Fast Learning of Neural Implicit Surfaces for Multi-view Reconstruction. arXiv preprint arXiv:2212.05231.