One drawback of rendering DOM content to a Canvas/WebGL texture is that interactivity is lost in the process. At the very least, you would need an intermediate layer that translates input on the scene into input on the DOM.
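To give a rough idea of what such an intermediate layer involves, here is a minimal sketch of its first step: turning a hit on the textured surface into page coordinates. It assumes your picking code already gives you UV coordinates in the 0..1 range with the origin at the bottom-left, and that the source element is laid out somewhere on the page. The names `Rect`, `Point` and `uvToClientPoint` are made up for illustration and are not part of Three.js or any other library.

```typescript
// Hypothetical helper types, for illustration only.
interface Rect { left: number; top: number; width: number; height: number; }
interface Point { x: number; y: number; }

// Map a UV hit on the textured surface to client (page viewport) coordinates.
// Assumes UV origin is bottom-left, so v is flipped to match the DOM's
// top-left origin. Out-of-range values are clamped to the element's bounds.
function uvToClientPoint(u: number, v: number, rect: Rect): Point {
  const cu = Math.min(1, Math.max(0, u));
  const cv = Math.min(1, Math.max(0, v));
  return {
    x: rect.left + cu * rect.width,
    y: rect.top + (1 - cv) * rect.height,
  };
}
```

In a browser you could then look up the target with `document.elementFromPoint(x, y)` and dispatch a synthetic `MouseEvent` on it, with the caveats that the lookup returns whatever is topmost at that point and that synthetic events may not trigger every default behavior. That is a lot of plumbing just to get back what the DOM gives you for free.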
The nice part about CSS 3D transforms is that interactivity is maintained, and it's still hardware-accelerated. A fantastic example of this was made by the creator of Three.js over a year ago:
http://mrdoob.com/lab/javascript/threejs/css3d/
As you said above, that approach has severe drawbacks. It is very limited compared to what's possible in a traditional 3D scene. Moreover, since CSS 3D transforms essentially render DOM content to a texture, the content is rasterized, so it doesn't scale cleanly, even when the content is SVG.
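For comparison, this is roughly what the CSS 3D route looks like in code. It is only a sketch: `PanelPose` and `cssTransformFor` are hypothetical names, and the only real API it relies on is the standard CSS `perspective()`, `rotateY()` and `translateZ()` transform functions.

```typescript
// Hypothetical description of where a DOM panel sits in 3D space.
interface PanelPose {
  perspectivePx: number; // viewer distance used for the perspective effect
  rotateYDeg: number;    // rotation around the vertical axis
  translateZPx: number;  // offset along the element's z axis after rotation
}

// Build a CSS transform string from the pose. The element stays an ordinary
// DOM node, so links, inputs and text selection keep working.
function cssTransformFor(pose: PanelPose): string {
  return `perspective(${pose.perspectivePx}px) ` +
         `rotateY(${pose.rotateYDeg}deg) ` +
         `translateZ(${pose.translateZPx}px)`;
}
```

You would apply it with something like `element.style.transform = cssTransformFor(pose)`. Note that nothing here gives you lighting, depth sorting against real meshes, or crisp rescaling, which is where the limitations mentioned above come from.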