
Data sharding strategy

This chapter discusses the data sharding strategies used in distributed data stores. The articles below summarize the various data sharding strategies and how mainstream distributed data stores implement them:

1. yugabyte Four Data Sharding Strategies We Analyzed in Building a Distributed SQL Database

This article is very good and is the recommended first read.

wanweibaike Partition (database)

A partition is a division of a logical database or its constituent elements into distinct independent parts. Database partitioning is normally done for manageability, performance or availability[1] reasons, or for load balancing.

Partitioning criteria

They take a partitioning key and assign a partition based on certain criteria. Some common criteria include:

Range partitioning

Round-robin partitioning
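The two criteria above can be illustrated with a minimal sketch. The names `range_partition`, `RoundRobinPartitioner`, and the split points `[100, 200]` are hypothetical and chosen only for illustration; real systems choose and rebalance split points in their own ways.

```python
from bisect import bisect_right
from itertools import count


def range_partition(key, boundaries):
    """Map a key to a partition index using sorted split points.

    With boundaries [100, 200]: key < 100 -> 0, 100 <= key < 200 -> 1,
    key >= 200 -> 2. (Hypothetical helper, not taken from any library.)
    """
    return bisect_right(boundaries, key)


class RoundRobinPartitioner:
    """Assign rows to partitions in rotation, ignoring the key's value."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._counter = count()

    def assign(self, _key=None):
        return next(self._counter) % self.num_partitions


boundaries = [100, 200]  # assumed split points
range_result = [range_partition(k, boundaries) for k in (5, 100, 150, 999)]

rr = RoundRobinPartitioner(3)
rr_result = [rr.assign() for _ in range(5)]
```

Range partitioning keeps neighbouring keys together, which suits range scans. Round-robin spreads rows evenly but gives no way to locate a row from its key alone.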

Partitioning methods

Horizontal partitioning

NOTE:

Horizontal: the table is split by rows.

Vertical partitioning

NOTE:

Vertical: the table is split by columns.
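The difference between the two methods can be sketched on a small in-memory table. The sample rows and the helper names `horizontal_partition` and `vertical_partition` are assumptions made for this illustration; placing rows by `id % num_partitions` is just one simple choice.

```python
rows = [
    {"id": 1, "name": "alice", "bio": "likes databases"},
    {"id": 2, "name": "bob", "bio": "likes networks"},
    {"id": 3, "name": "carol", "bio": "likes compilers"},
    {"id": 4, "name": "dave", "bio": "likes storage"},
]


def horizontal_partition(rows, num_partitions, key="id"):
    """Split by rows: every partition keeps all columns but only some rows."""
    parts = [[] for _ in range(num_partitions)]
    for row in rows:
        parts[row[key] % num_partitions].append(row)
    return parts


def vertical_partition(rows, column_groups, key="id"):
    """Split by columns: every partition keeps all rows but only some columns.

    The key column is repeated in each partition so rows can be rejoined.
    """
    return [
        [{c: row[c] for c in (key, *cols)} for row in rows]
        for cols in column_groups
    ]


h_parts = horizontal_partition(rows, 2)
v_parts = vertical_partition(rows, [("name",), ("bio",)])
```

Here `h_parts` holds two lists of complete rows, while `v_parts` holds two narrower tables that each contain every `id`.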

wanweibaike Shard (database architecture)

A database shard, or simply a shard, is a horizontal partition of data in a database or search engine. Each shard is held on a separate database server instance, to spread load.

Notable implementations

NOTE:

As the list shows, the mainstream systems all use sharding, i.e. horizontal partitioning.

Google Spanner

Spanner, Google's global-scale distributed database, shards across multiple Paxos state machines to scale to "millions of machines across hundreds of data centers and trillions of database rows".[19]

Evaluation criteria

1. Even data distribution and load balancing
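One possible way to quantify evenness is the ratio of the fullest shard's key count to the mean count per shard, where 1.0 means perfectly balanced. The sketch below is only an illustration: the MD5-based hash placement, the function names `shard_for` and `imbalance`, and the sample keys are all assumptions, not a method taken from the articles referenced here.

```python
import hashlib
from collections import Counter


def shard_for(key, num_shards):
    """Place a key on a shard via a stable hash (assumed placement rule)."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def imbalance(keys, num_shards):
    """Return max shard size divided by mean shard size (1.0 is ideal)."""
    counts = Counter(shard_for(k, num_shards) for k in keys)
    mean = len(keys) / num_shards
    return max(counts.values()) / mean


uniform_keys = [f"user-{i}" for i in range(10_000)]  # many distinct keys
skewed_keys = ["user-0"] * 10_000                    # one hot key

uniform_score = imbalance(uniform_keys, 4)
skewed_score = imbalance(skewed_keys, 4)
```

With many distinct keys the score stays close to 1.0. With a single hot key every request lands on one shard, so the score equals the number of shards.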

TODO

gitbook Systems Design Glossary