The tensor-based model is much simpler to understand and maintain. It would also eliminate the performance costs in #99 and make GPU parallelization such as #60 possible. Additionally, it would make #83 more performant and significantly easier to implement. However, switching to it means the crate loses its uniqueness as one of the few DAG-based implementations. If I decide to commit to it, this will end up being another major rewrite, though unlike the previous ones, it shouldn't have a major effect on the exposed API and should require little to no migration from users.