架构模式与原则之数据管理篇¶
可扩展性¶
扩展立方(Scale Cube)¶
- X轴:水平横向复制和克隆服务与数据(X-Axis: Horizontal Duplication and Cloning of services and data)
- Y轴:功能分解和细分微服务(Y-Axis: Functional Decomposition and Segmentation-Microservices)
- Z轴:沿客户边界做服务和数据分区-分片/单元(Z-Axis: Service and Data Partitioning along Customer Boundaries -Shards/Pods)
水平扩展无状态/有状态应用¶
- 「如何扩展有状态应用」(How to Scale out Stateful App ?)
- CAP 理论(CAP Theorem)
- 一致性级别(Consistency level)
- 严格一致性(Strict consistency)
- 最终一致性(Eventual consistency)
- 分割数据库服务器(Splitting database servers)
- 数据库分片(Database Sharding)
Microservices Data Management¶
混合持久化¶
- 混合持久化原则(Polyglot Persistence principle):每个 service 自己做最佳选择
- 数据完整性与数据一致性(Data integrity and data consistency)
- 每个 service 维护自己的数据(Database-per-service its own data)
- 隔离 service 的数据库(Isolating each service's databases)
- 复制或分区的数据带来挑战(Duplicated or partitioned data challenge)
- 事务管理中的数据一致性问题(Data consistency problems transaction management)
- 拥抱重复数据和最终一致性(Welcome duplicate datas and eventual consistency)
- 接受最终一致性数据(Accept eventual consistency data)
- 能够独立扩展(Ability to scale independently)
- 避免数据库单点故障(Avoid single-point-of-failure of bottleneck databases)
- 事件驱动的架构风格(Event driven architecture style)
模式与原则¶
-
Database-per-Service 模式
- 将单体架构转变为微服务架构(Shifting to the monolithic architecture to microservices architecture)
- 将数据库分解为分布式数据模型(Decomposes database into a distributed data model)
- 快速发展且易于扩展的应用程序(Evolve rapidly and easy to scale applications)
- 更改模式而不产生影响(Schema changes can perform without any impact)
- 独立扩展(Scaling independently)
-
API 组合(Composition)模式
- API 网关模式
- 网关路由模式
- 网关聚合模式
- 网关卸载模式
-
CQRS 模式
- 命令查询责任分离 (CQRS, Command query responsibility segregation)
- 多读少写的方法(Write-less, read-more approaches)
-
事件溯源模式(Event Sourcing)
- 图示
- 累积事件(Accumulate events)
- 将它们聚合为事件序列(Aggregates them into sequence of events)
- 在特定事件点重播(Replay at certain point of events)
-
Saga 模式
- 图示
- 提供事务管理(Provide Transaction management)
- 维护数据一致性(Maintaining data consistency)
-
SAGA 的两种类型
- 编排 :在没有控制点时交换事件(Choreography -when exchanging events happens without points of control)
- 编制:当您有集中式控制器时(Orchestration -when you have centralized controllers)
-
共享型数据库反模式
- 图示
- 共享型数据库是一种典型反模式
- 单一的共享数据库,每个服务访问数据(Single shared database with each service accessing data)
- 单点故障(single-point-of-failure)
- 违背微服务的天性(Against to microservices nature)
数据库分类¶
-
关系数据库(Relational Databases)
- 将数据存储到相关数据表中(Storing data into related data tables)
- 固定模式并使用SQL管理数据(Fixed schema and use SQL to manage data)
- 支持ACID型的事务(Support transactions with ACID)
- 微服务中的混合持久化(Polyglot persistence in microservices)
- 例子:Oracle, MS SQL Server, MySQL, PostgreSQL
-
NoSQL 数据库(NoSQL Databases)
- 不同类型的存储数据(Different types of stored data)
- 易用性、可扩展性、弹性和可用性(Ease-of-use, scalability, resilience and availability)
- 将非结构化数据存储在键值对或JSON文档中(Stores unstructured data in key-value pairs or JSON documents)
- 不提供 ACID 保证(Don't provide ACID guarantees)
-
NoSQL文档数据库(NoSQL Document Databases)
- 在基于JSON的文档中存储和查询数据(Store and query data in JSON-based documents)
- 数据和元数据分层存储(Data and metadata are stored hierarchically)
- 对象映射到应用程序代码(Objects are mapping to the application code)
- 可扩展性,文档数据库分布式友好(Scalability, document databases can distributed very well)
- 内容管理和存储目录:MongoDB 和 Cloudant(Content management and storing catalogs, MongoDB and Cloudant)
-
NoSQL键值数据库(NoSQL Key-Value Databases)
- 数据存储为键值对的集合(Data is storing as a collection of key-value pairs)
- 数据库中的键值组 (Group of key-value in the database)
- 面向会话的应用程序,例如存储客户购物篮数据(Session-oriented applications for example storing customer basket data)
- 例子:Redis, Amazon DynamoDB, OracleNoSQL Database
-
NoSQL 列式数据库(NoSQL Column-based Databases)
- 宽列数据库(Wide-Column Databases)
- 数据存储在列中(Data is stored in columns)
- 更快地访问必要的数据(Access necessary data faster)
- 不要扫描不必要的信息(Don't scanning the unnecessary information)
- 按列独立缩放(Scale by columns independently)
- 数据仓库与大数据处理(Data warehouse and Big Data processing)
- 例子:ClickHouse, Apache Cassandra, Apache HBase or Amazon DynamoDB
-
NoSQL 图式数据库(NoSQL Graph-based Databases)
- 以图形结构存储数据(Stores data in a graph structure)
- 数据实体在节点中连接(Data entities are connected in nodes)
- 导航图形关系(Navigate graph relationships)
- 欺诈检测、社交网络和推荐引擎(Fraud detection, social networks and recommendation engines)
- 例子:OrientDB, Neo4j and Amazon Neptune
如何选择数据库¶
-
Key Point 1 - 考虑“一致性级别”(Consider the "consistency level")
- 我们需要严格的一致性还是最终的一致性?(Do we need Strict Consistency or Eventual Consistency)
- 微服务架构的最终一致性,以获得可扩展性和高可用性(Eventual consistency in microservices architecture in order to gain scalability and high availability)
-
Key Point 2 - 高可扩展性:可满足数百万个请求(High Scalability - accommodate millions of request)
-
Key Point 3 - 高可用:独立的数据中心(-High Availability - separate data center)
- 在决定数据库之前,我们应该检查CAP定理(Before deciding database, we should check the CAP Theorem)
CAP 理论¶
- 1998年由Eric Brewer教授提出(Found in 1998 by Professor Eric Brewer)
- 一致性、可用性和分区容错不能同时实现(Consistency, Availability and Partition Tolerance cannot all be achieved at the same time)
- 分布式系统应该在一致性、可用性和分区容差之间做出牺牲(Distributed systems should sacrifice between consistency, availability and partition tolerance.)
- 只能保证三个概念中的两个:一致性、可用性和分区容差(Can only guarantee two of the three concepts: consistency, availability, and partition tolerance)
-
一致性
- 获取任何读取请求,数据应返回上次更新的值(Consistency-get any read request, the data should return last updated value)
- 必须阻止请求,直到所有副本更新(Must block the request until all replicas update)
-
可用性
- 随时响应请求(Availability-respond to requests at any time)
- 容错,以适应所有请求(fault-tolerance in order to accommodate all requests)
-
分区容错
- 网络分区,位于不同的网络中(Partition Tolerance-network partitioning, located in different networks)
-
同时保持一致性和可用性?
- 如果是分区容差,则应选择可用性或一致性(If Partition Tolerance, either Availability or Consistency should be selected)
- 对于分布式体系结构,分区容差是必须的(Partition Tolerance is a must for distributed architectures)
- 关系数据库防止从不同节点分发数据,NoSQL数据库易于扩展。(Relational databases prevent distribute data from different nodes, NoSQL databases easily scalable.)
- 微服务架构选择具有高可用性的分区容差,并遵循最终的一致性 (Microservices architectures choose Partition Tolerance with High Availability and follow Eventual Consistency)
水平、垂直和功能分区¶
-
水平分区:分片(Horizontal Partitioning - Sharding)
- 每个分区都是一个单独的数据存储(Each partition is a separate data store)
- 所有分区都有相同的模式(All partitions have the same schema)
- 切分并保存数据的特定子集(Shards and holds a specific subset of the data)
- 按字母顺序组织的切分键(Sharding keys organized alphabetically)
- 分片使用分区键分隔不同服务器的负载(Sharding separate the load different servers with partition keys)
-
垂直分区(Vertical Partitioning)
- 也叫行拆分(Row Splitting)
- 保存表的列的子集(Holds a subset of the columns for table)
- 列根据其模式进行划分(Columns are divided according to their pattern)
- 经常访问的列(Frequently accessed columns)
-
功能分区(Functional Partitioning)
- 通过遵循有界上下文或子域对数据进行功能分区(Functionally partitioning data by following the bounded context or subdomains)
- 数据根据有界上下文的使用情况进行隔离(Data is segregated according to usage of bounded contexts)
- 比如,按照职责分解微服务(Like decomposing microservices as per responsibilities)
数据库分片模式¶
- 分片:“一小块或一部分”(Sharding - "a small piece or part")
- 将数据分离为独特的小块(Separation of the data into unique small pieces)
- 提高在微服务中存储数据时的可扩展性(Improve scalability when storing data in microservices)
- 每个碎片都有相同的模式(Each shard has the same schema)
- 分片通过平衡分片之间的工作负载,实现了可扩展性,提高了性能(Shardings enable to scale, improve performance by balancing the workload across shards)
- 使用分区键划分碎片(Dividing into shards with partition keys)
- 案例:Tinder
Cassandra¶
- Cassandra:NoSql 数据库、对等分布式宽列数据库(NoSql Database Peer-to-Peer Distributed Wide Column Database)
- Apache Foundation的分布式数据库,高度可扩展、高性能的分布式数据库(Distributed database from Apache Foundation, highly scalable, high-performance distributed database)
- 无单点故障的高可用性(High availability with no single point of failure)
- 弹性可扩展性(Elastic scalability)
- 灵活的数据存储,方便的数据分发(Flexible data storage, Easy data distribution)
-
Why Cassandra?
- 自动分片功能(Auto-sharding feature)
- 数据切分有助于在节点之间保持数据的分割(Data Sharding helps keep data divided among nodes)
- 分区键设置为 Sensor# 和 Date。(Partition Keys set to Sensor# and Date.)
- microservices数据库的最佳选择(Best choose for microservices database)
- CAP 定理:具有最终一致性的高可用性(CAP Theorem High Availability with Eventual Consistency)
架构设计-数据库分片 Cassandra¶
Microservices Data Management-Cross-Service Queries¶
物化视图模式¶
-
使用场景
- 微服务跨服务查询(Microservices Cross-Service Queries )
- 物化视图模式(Materialized View Pattern)
-
模式描述
- 存储其自己的本地数据副本(Store its own local copy of data)
- 包含数据的非规范化副本(Contains a denormalized copy of the data)
- 数据本地副本作为读取模型(Local copy of data as a Read Model)
- 购物车微服务包含生产和定价微服务的数据的非规范化副本(Shopping Basket microservice contains a denormalized copy of the data which product and pricing microservices)
- 消除了同步跨服务调用(Eliminates the synchronous cross-service calls)
-
模式缺点
- 如何以及何时更新非规范化数据?(How and when the denormalized data will be updated?)
- 数据源是其他微服务(Source of data is other microservices)
- 原始数据更改时,它应该更新为sc microservices(When the original data changes. it should update into sc microservices)
- 发布事件并使用订阅者微服务(Publish an event and consumes from the subscriber microservice)
CQRS 设计模式¶
-
概念
- 命令和查询职责分离(Command and Query Responsibility Segregation)
- 当应用程序既需要处理复杂的连接查询,也需要执行CRUD操作(Applications need both working for complex join queries and also perform CRUD operations)
- 读写数据库有不同的方法(Reading and writing database has different approaches)
- 使用 NoSql 读取和使用关系数据库执行crud操作(Using no-sql for reading and using relational database for crud operations)
- 阅读激励型应用(Read-incentive application)
- 命令执行更新数据(Commands performs update data)
- 同时,查询执行读取数据(Queries performs read data)
- 物化视图模式是实现读取数据库的良好示例(Materialized view pattern is good example to implement reading databases)
- 命令应该是基于任务的操作(Commands should be actions with task-based operations)
- 查询从不修改数据库(Queries is never modifying the database)
-
Instagram 数据库架构
- 对用户故事,使用 NoSql Cassandra数据库(Uses no-sql Cassandra database for user stories)
- 对用户信息生物更新,使用关系型 PostgreSQL数据库(Uses relational PostgreSQL database for User Information bio update)
-
如何同步 CQRS 的数据库?
- 图示
- 事件驱动架构(Event-Driven Architecture)
- 使用message broker系统发布更新事件(Publish an update event with using message broker systems)
- 由读取数据库和同步数据使用(Consume by the read database and sync data)
- 读数据库最终与写数据库同步(Read database eventually synchronizes with the write database)
- 应用物化视图模式,从写数据库副本中读取数据库(Read database from replicas of write database with applying Materialized view pattern)
事件溯源模式¶
- 带有事件源模式的 CQRS(CQRS with Event Sourcing Pattern)
- 真相事件数据库来源(Source-of-truth events database)
- 非规范化表的数据物化视图(Materialized views of the data with denormalized tables)
- 使用message broker系统发布更新事件(Publish an update event with using message broker systems)
- 通过读数据库消费数据,同步数据物化视图模式使用(Consume by the read database and sync data Materialized view pattern)
- 读数据库最终与写数据库同步(Read database eventually synchronizes with the write database)
- 更改为数据保存操作(Changing to data save operations)
- 按数据事件的顺序,将所有事件保存到数据库中(Save all events into database with sequential ordered of data events)
- 将每个更改附加到事件的顺序列表中(Append each change to a sequential list of events)
- 事件存储成为数据的真实来源(Event Store becomes the source-of-truth for the data)
- 具有发布事件的发布/订阅模式(Publish/subscribe pattern with publish event)
- 重播事件来生成数据的最新状态(Replay events to build latest status of data)
最终一致性原则¶
- 带有事件源模式的 CQRS(CQRS with Event Sourcing Pattern)
- 适合偏爱 高可用性 而非 即时一致性 的系统
- 在一定时间后变得一致(Become consistent after a certain time)
- 不能保证即时一致性(Does not guarantee instant consistency)
-
考虑 “一致性级别”(Consider the "consistency level")
-
严格一致性(Strict consistency )
- 当我们保存数据时,数据应该立即影响并显示给每个客户端。
- 案例:debit or withdraw on bank account
-
最终一致性(Eventual consistency)
- 基本上当我们写入任何数据时,客户端读取数据需要一些时间。
- 案例:Youtube video counters
-
Instagram 数据库架构¶
- Instagram系统架构:Instagram 故事视图和用户信息(Instagram System:Architecture Instagram Story View and User Information)
- Instagram认为可用性对他们来说更为重要,并认为最终的一致性就足够了(Instagram decided that Availability was more important to them and thought that Eventual Consistency would be sufficient)
- 关系数据库PostgreSql,另一个是 NoSql 数据库Cassandra(Relational database-PostgreSQLand the other is no-sql database Cassandra)
- PostgreSQL 具有“主-从” 架构(PostgreSQL has a "Master-Slave" architecture)
- 使用关系型 PostgreSQL 数据库进行用户信息生物更新(Uses relational PostgreSQL database for User Information bio update)
- 将 NoSql Cassandra数据库用于用户故事、计数器、消息传递(Uses no-sql Cassandra database for user stories, Counters, messaging)
- Cassandra具有自动切分功能(Cassandra has an auto-sharding feature)
CQRS 架构设计案例¶
- 混合持久化(Polyglot Persistence)
- 物化视图模式(Materialized View Pattern)
- CQRS设计模式(CQRS Design Pattern)
- 事件来源模式(Event Sourcing Pattern)
- 最终一致性原则(Eventual Consistency Principle)
Microservices Distributed Transactions¶
Microservices 分布式事务¶
- 跨多个微服务实现事务性操作(Implement transactional operations across several microservices)
- 微服务上polyglot数据库的复杂网络关注点(Complex network concerns with polyglot database on microservices)
- 微服务上的分布式事务管理手动实现(Distributed transaction managements on microservices implement manually)
分布式事务 Saga 模式¶
- 在分布式事务案例中,跨微服务管理数据一致性(Manage data consistency across microservices in distributed transaction cases)
- 创建一组按顺序更新微服务的事务(Create a set of transactions that update microservices sequentially)
- 发布事件以触发下一个事务(Publish events to trigger the next transaction)
- 如果失败,则触发回滚事务(If failed,trigger to rollback transactions)
- 将这些本地事务分组并逐个顺序调用(Grouping these local transactions and sequentially invoking one by one)
Saga模式:编制与编排¶
-
编排 Saga 模式(choreography Saga Pattern)
- 使用发布/订阅原则来协调 sagas(Coordinate sagas with applying publish-subscribe principles)
- 每个微服务都运行自己的本地事务(Each microservices run its own local transaction)
- 发布事件以触发下一个事务(Publish events to trigger the next transaction)
- 当工作流步骤增加,则可能会变得混乱,难以管理事务(Workflow steps increase, then it can become confusing and hard to manage transaction)
- 用来解耦直接依赖关系(Decouple direct dependency)
-
编制 Saga 模式(orchestration Saga Pattern)
- 用集中式控制器微服务来协调 Sagas(Coordinate sagas with a centralized controller microservice)
- 调用 & 按顺序执行本地microservices事务(Invoke to execute local microservices transactions in sequentially)
- 执行saga事务并集中管理它们,如果其中一个步骤失败,则使用补偿事务执行回滚步骤(Execute saga transaction and manage them in centralized way and if one of the step is failed, then executes rollback steps with compensating transactions)
- 适用于包含大量步骤的复杂工作流(Good for complex workflows which includes lots of steps)
发件箱模式¶
- 当API发布事件消息时,它不会直接发送它们(When your API publishes event messages,it doesn't directly send them)
- 消息持久化在数据库表中,作业将事件发布到消息代理系统(The messages are persisted in a database table,a job publish events to message broker system)
- 提供以“发件箱”角色,将事件可靠地发布到表中(Provides to publish events reliably with written to a table in the "outbox"role)
- 事件和写入发件箱表的事件是同一事务的一部分(The event and The event written to the outbox table are part of the same transaction)
-
Why Outbox Pattern
- 使用需要保持一致的关键数据(Working with critical data that need to consistent)
- 需要准确捕获所有请求(Need to accurate to catch all requests)
- 数据库更新和消息发送应该是原子的(The database update and sending of the message should be atomic)
- 提供数据一致性(Provide data consistency)
- 示例:Financial business sale transactions