Forum: VoltDB Architecture

Post: Cache Conscious B-Trees?

Cache Conscious B-Trees?
henning
May 16, 2010
Is VoltDB using cache conscious B-trees for indexing?
No.
jhugg
May 16, 2010
No. We did some experimenting a while back with different indexing structures (see http://idlebox.net/2007/stx-btree/). In our limited tests, we couldn't find anything that was faster than boost::hash_map or simple stl_map (which are usually implemented as a Red-Black tree).


One thought is that many of our indexes in benchmarks (like pseudo-tpcc) are wider than a 4-byte integer. Not to mention our pointers are 8-bytes. Perhaps when the key-size approaches the cache line size of 64-bytes (on most of our machines), the cache-conscious b-trees lose out. That's just a thought.


That's not to say we aren't interested in experimenting more and trying other things. We've made the indexing pretty pluggable so that other things like t-trees or a better implementation of b-trees could be swapped in. It might make a really nice paper for a graduate student somewhere.


For now, we think the indexes are pretty fast and are putting our time toward usability features like data exporting, on-line node recovery, etc... as well as performance issues with multi-partition transactions. Someday we'll probably revisit them
pseudo-tpcc
tuancao
May 17, 2010
No. We did some experimenting a while back with different indexing structures (see http://idlebox.net/2007/stx-btree/). In our limited tests, we couldn't find anything that was faster than boost::hash_map or simple stl_map (which are usually implemented as a Red-Black tree).


One thought is that many of our indexes in benchmarks (like pseudo-tpcc) are wider than a 4-byte integer. Not to mention our pointers are 8-bytes. Perhaps when the key-size approaches the cache line size of 64-bytes (on most of our machines), the cache-conscious b-trees lose out. That's just a thought.


That's not to say we aren't interested in experimenting more and trying other things. We've made the indexing pretty pluggable so that other things like t-trees or a better implementation of b-trees could be swapped in. It might make a really nice paper for a graduate student somewhere.


For now, we think the indexes are pretty fast and are putting our time toward usability features like data exporting, on-line node recovery, etc... as well as performance issues with multi-partition transactions. Someday we'll probably revisit them


Hi,


Can you point me to the place where you perform your pseudo-tpcc benchmark? Is the code available?


Thanks,
Tuan
tpcc-code
tcallaghan
May 17, 2010
Hi,


Can you point me to the place where you perform your pseudo-tpcc benchmark? Is the code available?


Thanks,
Tuan


Tuan,


The TPCC code is in /tests/frontend/org/voltdb/benchmark/tpcc.


-Tim