Data sharding strategy
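This chapter discusses the data sharding strategies used in distributed data stores. The following articles summarize the various data sharding strategies and how mainstream distributed data stores implement them: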
1. yugabyte Four Data Sharding Strategies We Analyzed in Building a Distributed SQL Database
This article is very good and is the recommended first read.
wanweibaike Partition (database)
A partition is a division of a logical database or its constituent elements into distinct independent parts. Database partitioning is normally done for manageability, performance or availability[1] reasons, or for load balancing.
Partitioning criteria
Database management systems take a partitioning key and assign a partition based on certain criteria. Some common criteria include:
Range partitioning
Round-robin partitioning
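NOTE:
The two criteria above can be illustrated with a minimal sketch. The names `range_partition` and `RoundRobinPartitioner` and the split points are hypothetical, chosen only for illustration; real systems differ in how they store and rebalance boundaries.

```python
from bisect import bisect_right
from itertools import count


def range_partition(key, boundaries):
    """Range partitioning: pick the partition whose key range contains `key`.

    `boundaries` is a sorted list of split points (an assumption of this
    sketch). Partition 0 holds keys below boundaries[0], partition i holds
    keys in [boundaries[i-1], boundaries[i]), and the last partition holds
    keys at or above the final boundary.
    """
    return bisect_right(boundaries, key)


class RoundRobinPartitioner:
    """Round-robin partitioning: ignore the key and cycle through partitions."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._counter = count()

    def assign(self, row=None):
        # Each new row goes to the next partition in turn.
        return next(self._counter) % self.num_partitions


if __name__ == "__main__":
    boundaries = [100, 200]  # three partitions: <100, [100, 200), >=200
    print([range_partition(k, boundaries) for k in (5, 100, 150, 250)])

    rr = RoundRobinPartitioner(3)
    print([rr.assign() for _ in range(5)])
```

Range partitioning keeps neighbouring keys together, which suits range scans; round-robin spreads rows evenly but gives no way to locate a row from its key alone.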
Partitioning methods
Horizontal partitioning
NOTE:
Horizontal: the table is split by rows.
Vertical partitioning
NOTE:
Vertical: the table is split by columns.
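NOTE:
The difference between the two methods can be sketched on an in-memory table. This is only an illustration under assumptions: the helper names `horizontal_split` and `vertical_split`, the sample rows, and the choice to repeat the key column in both vertical parts are all hypothetical.

```python
def horizontal_split(rows, predicate):
    """Horizontal partitioning: every part keeps all columns, but only some rows."""
    matched = [r for r in rows if predicate(r)]
    rest = [r for r in rows if not predicate(r)]
    return matched, rest


def vertical_split(rows, key, columns):
    """Vertical partitioning: every part keeps all rows, but only some columns.

    The key column is repeated in both parts so the rows can be joined back.
    """
    left = [{c: r[c] for c in (key, *columns)} for r in rows]
    right = [
        {c: v for c, v in r.items() if c == key or c not in columns}
        for r in rows
    ]
    return left, right


if __name__ == "__main__":
    rows = [
        {"id": 1, "name": "ann", "bio": "x"},
        {"id": 2, "name": "bob", "bio": "y"},
        {"id": 3, "name": "cy", "bio": "z"},
    ]
    print(horizontal_split(rows, lambda r: r["id"] <= 2))
    print(vertical_split(rows, "id", ["name"]))
```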
wanweibaike Shard (database architecture)
A database shard, or simply a shard, is a horizontal partition of data in a database or search engine. Each shard is held on a separate database server instance, to spread load.
Notable implementations
NOTE:
As can be seen, the mainstream implementations all use sharding, i.e. horizontal partitioning.
Google Spanner
Spanner, Google's global-scale distributed database, shards across multiple Paxos state machines to scale to "millions of machines across hundreds of data centers and trillions of database rows".[19]
Evaluation criteria
1. Uniform data distribution and load balancing
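NOTE:
One possible way to quantify this criterion is the ratio of the busiest shard's load to the mean load, where 1.0 means perfectly even. The sketch below is an assumption of these notes, not a metric taken from the cited articles; `hash_shard`, `shard_loads`, and `imbalance` are hypothetical names, and hash-based placement is used only as an example strategy to measure.

```python
import hashlib
from collections import Counter


def hash_shard(key, num_shards):
    """Map a key to a shard with a stable hash (MD5 here, for determinism only)."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def shard_loads(keys, num_shards, shard_fn=hash_shard):
    """Count how many keys land on each shard, including empty shards."""
    counts = Counter(shard_fn(k, num_shards) for k in keys)
    return [counts.get(i, 0) for i in range(num_shards)]


def imbalance(loads):
    """Return max load divided by mean load; 1.0 means perfectly balanced."""
    mean = sum(loads) / len(loads)
    if mean == 0:
        return 1.0  # no data at all: treat as balanced
    return max(loads) / mean


if __name__ == "__main__":
    loads = shard_loads(range(10_000), num_shards=4)
    print(loads, imbalance(loads))
```

A value near 1.0 suggests an even spread; a value close to the number of shards means almost all data sits on one shard.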
TODO
gitbook Systems Design Glossary