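13 Database Schema Evolution
When we add new persistent classes or change existing ones (for example, by adding or deleting data members), the database schema necessary to store the new object model changes as well. At the same time, we may have existing databases that contain existing data. If new versions of the application do not need to handle old databases, then the schema creation functionality is all that you need. However, most applications will need to work with data stored by older versions of the same application.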
We will call database schema evolution the overall task of updating the database to match the changes in the object model. Schema evolution usually consists of two sub-tasks: schema migration and data migration. Schema migration modifies the database schema to correspond to the current object model. In a relational database, this, for example, could require adding or dropping tables and columns. The data migration task involves converting the data stored in the existing database from the old format to the new one.
If performed manually, database schema evolution is a tedious and error-prone task. As a result, ODB provides comprehensive support for automated or, more precisely, semi-automated schema evolution. Specifically, ODB does fully-automatic schema migration and provides facilities to help you with data migration.
The topic of schema evolution is a complex and sensitive issue since normally there would be valuable, production data at stake. As a result, the approach taken by ODB is to provide simple and bullet-proof elementary building blocks (or migration steps) that we can understand and trust. Using these elementary blocks we can then implement more complex migration scenarios. In particular, ODB does not try to handle data migration automatically since in most cases this requires understanding of application-specific semantics. In other words, there is no magic.
There are two general approaches to working with older data: the application can either convert it to correspond to the new format or it can be made capable of working with multiple versions of this format. There is also a hybrid approach where the application may convert the data to the new format gradually as part of its normal functionality. ODB is capable of handling all these scenarios. That is, there is support for working with older models without performing any migration (schema or data). Alternatively, we can migrate the schema after which we have the choice of either also immediately migrating the data (immediate data migration) or doing it gradually (gradual data migration).
Schema evolution is already a complex task and we should not unnecessarily use a more complex approach where a simpler one would be sufficient. From the above, the simplest approach is the immediate schema migration that does not require any data migration. An example of such a change would be adding a new data member with the default value (Section 14.3.4, "default"). This case ODB can handle completely automatically.
If we do require data migration, then the next simplest approach is the immediate schema and data migration. Here we have to write custom migration code. However, it is separate from the rest of the core application logic and is executed at a well defined point (database migration). In other words, the core application logic need not be aware of older model versions. The potential drawback of this approach is performance. It may take a lot of resources and/or time to convert all the data upfront.
If the immediate migration is not possible, then the next option is the immediate schema migration followed by the gradual data migration. With this approach, both old and new data must co-exist in the new database. We also have to change the application logic to both account for different sources of the same data (for example, when either an old or new version of the object is loaded) as well as migrate the data when appropriate (for example, when the old version of the object is updated). At some point, usually when the majority of the data has been converted, gradual migrations are terminated with an immediate migration.
The most complex approach is working with multiple versions of the database without performing any migrations, schema or data. ODB does provide support for implementing this approach (Section 13.4, "Soft Object Model Changes"); however, we will not cover it any further in this chapter. Generally, this will require embedding knowledge about each version into the core application logic, which makes it hard to maintain for any non-trivial object model.
Note also that when it comes to data migration, we can use the immediate variant for some changes and the gradual variant for others. We will discuss various migration scenarios in greater detail in Section 13.3, "Data Migration".
13.1 Object Model Version and Changelog
To enable schema evolution support in ODB we need to specify the object model version, or, more precisely, two versions. The first is the base model version. It is the lowest version from which we will be able to migrate. The second version is the current model version. In ODB we can migrate from multiple previous versions by successively migrating from one to the next until we reach the current version. We use the db model version pragma to specify both the base and current versions.
When we enable schema evolution for the first time, our base and current versions will be the same, for example:
#pragma db model version(1, 1)
Once we release our application, its users may create databases with the schema corresponding to this version of the object model. This means that if we make any modifications to our object model that also change the schema, then we will need to be able to migrate the old databases to this new schema. As a result, before making any new changes after a release, we increment the current version, for example:
#pragma db model version(1, 2)
To put this another way, we can stay on the same version during development and keep adding new changes to it. But once we release it, any new changes to the object model will have to be done in a new version.
It is easy to forget to increment the version before making new changes to the object model. To help solve this problem, the db model version pragma accepts an optional third argument that specifies whether the current version is open or closed for changes. For example:
#pragma db model version(1, 2, open)   // Can add new changes to version 2.
#pragma db model version(1, 2, closed) // Can no longer add new changes to version 2.
If the current version is closed, ODB will refuse to accept any new schema changes. In this situation you would normally increment the current version and mark it as open or you could re-open the existing version if, for example, you need to fix something. Note, however, that re-opening versions that have been released will most likely result in migration malfunctions. By default the version is open.
Normally, an application will have a range of older database versions from which it is able to migrate. When we change this range by removing support for older versions, we also need to adjust the base model version. This will make sure that ODB does not keep unnecessary information around.
A model version (both base and current) is a 64-bit unsigned integer (unsigned long long). 0 is reserved to signify special situations, such as the lack of schema in the database. Other than that, we can use any values as versions as long as they are monotonically increasing. In particular, we don't have to start with version 1 and can increase the versions by any increment.
One versioning approach is to use an independent object model version by starting from version 1 and also incrementing by 1. The alternative is to make the model version correspond to the application version. For example, if our application is using the X.Y.Z version format, then we could encode it as a hexadecimal number and use that as our model version, for example:
#pragma db model version(0x020000, 0x020306) // 2.0.0-2.3.6
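The encoding used above can be sketched in plain C++. This is only an illustration of the arithmetic, assuming one byte per version component; `encode_version` and `decode_version` are hypothetical helper names and are not part of ODB. The pragma itself is still written with literals or macros as shown in this section.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of ODB): pack X.Y.Z into one byte per
// component, so that 2.3.6 becomes 0x020306.
inline std::uint64_t encode_version(unsigned major, unsigned minor, unsigned patch)
{
  // Each component must fit into a single byte for this layout.
  assert(major <= 0xFF && minor <= 0xFF && patch <= 0xFF);
  return (static_cast<std::uint64_t>(major) << 16) |
         (static_cast<std::uint64_t>(minor) << 8) |
         static_cast<std::uint64_t>(patch);
}

// Inverse operation: render an encoded version back as "X.Y.Z".
inline std::string decode_version(std::uint64_t v)
{
  return std::to_string((v >> 16) & 0xFF) + "." +
         std::to_string((v >> 8) & 0xFF) + "." +
         std::to_string(v & 0xFF);
}
```

With this layout a later application release always yields a larger number, which is the property the model version needs.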
Most real-world object models will be spread over multiple header files and it will be burdensome to repeat the db model version pragma in each of them. The recommended way to handle this situation is to place the version pragma into a separate header file and include it into the object model files. If your project already has a header file that defines the application version, then it is natural to place this pragma there. For example:
// version.hxx
//
// Define the application version.
//
#define MYAPP_VERSION 0x020306 // 2.3.6
#ifdef ODB_COMPILER
#pragma db model version(1, 7)
#endif
Note that we can also use macros in the version pragma which allows us to specify all the versions in a single place. For example:
#define MYAPP_VERSION 0x020306 // 2.3.6
#define MYAPP_BASE_VERSION 0x020000 // 2.0.0
#ifdef ODB_COMPILER
#pragma db model version(MYAPP_BASE_VERSION, MYAPP_VERSION)
#endif
It is also possible to have multiple object models within the same application that have different versions. Such models must be independent, that is, no headers from one model shall include a header from another. You will also need to assign different schema names to each model with the --schema-name ODB compiler option.
Once we specify the object model version, the ODB compiler starts tracking database schema changes in a changelog file. The changelog has an XML-based, line-oriented format. It uses XML in order to provide human readability while also facilitating, if desired, processing and analysis with custom tools. The line orientation makes it easy to review with tools like diff.
The changelog is maintained by the ODB compiler. Specifically, you do not need to make any manual changes to this file. You will, however, need to keep it around from one invocation of the ODB compiler to the next. In other words, the changelog file is both the input and the output of the ODB compiler. This, for example, means that if your project's source code is stored in a version control repository, then you will most likely want to store the changelog there as well. If you delete the changelog, then any ability to do schema migration will be lost.
The only operation that you may want to perform with the changelog is to review the database schema changes that resulted from the C++ object model changes. For this you can use a tool like diff or, better yet, the change review facilities offered by your revision control system. For this purpose the contents of a changelog will be self-explanatory.
As an example, consider the following initial object model:
// person.hxx
//
#include <string>

#pragma db model version(1, 1)

#pragma db object
class person
{
  ...

  #pragma db id auto
  unsigned long id_;

  std::string first_;
  std::string last_;
};
We then compile this header file with the ODB compiler (using the PostgreSQL database as an example):
odb --database pgsql --generate-schema person.hxx
If we now look at the list of generated files, then in addition to the now familiar person-odb.?xx and person.sql, we will also see person.xml — the changelog file. Just for illustration, below are the contents of this changelog.
<changelog database="pgsql">
  <model version="1">
    <table name="person" kind="object">
      <column name="id" type="BIGINT" null="false"/>
      <column name="first" type="TEXT" null="false"/>
      <column name="last" type="TEXT" null="false"/>
      <primary-key auto="true">
        <column name="id"/>
      </primary-key>
    </table>
  </model>
</changelog>
Let's say we now would like to add another data member to the person class — the middle name. We increment the version and make the change:
#pragma db model version(1, 2)

#pragma db object
class person
{
  ...

  #pragma db id auto
  unsigned long id_;

  std::string first_;
  std::string middle_;
  std::string last_;
};
We use exactly the same command line to re-compile our file:
odb --database pgsql --generate-schema person.hxx
This time the ODB compiler will read the old changelog, update it, and write out the new version. Again, for illustration only, below are the updated changelog contents:
<changelog database="pgsql">
  <changeset version="2">
    <alter-table name="person">
      <add-column name="middle" type="TEXT" null="false"/>
    </alter-table>
  </changeset>

  <model version="1">
    <table name="person" kind="object">
      <column name="id" type="BIGINT" null="false"/>
      <column name="first" type="TEXT" null="false"/>
      <column name="last" type="TEXT" null="false"/>
      <primary-key auto="true">
        <column name="id"/>
      </primary-key>
    </table>
  </model>
</changelog>
Just to reiterate, while the changelog may look like it could be written by hand, it is maintained completely automatically by the ODB compiler and the only reason you may want to look at its contents is to review the database schema changes. For example, if we compare the above two changelogs with diff, we will get the following summary of the database schema changes:
--- person.xml.orig
+++ person.xml
@@ -1,4 +1,10 @@
 <changelog database="pgsql">
+  <changeset version="2">
+    <alter-table name="person">
+      <add-column name="middle" type="TEXT" null="false"/>
+    </alter-table>
+  </changeset>
+
   <model version="1">
     <table name="person" kind="object">
       <column name="id" type="BIGINT" null="false"/>
The changelog is only written when we generate the database schema, that is, the --generate-schema option is specified. Invocations of the ODB compiler that only produce the database support code (C++) do not read or update the changelog. To put it another way, the changelog tracks changes in the resulting database schema, not the C++ object model.
ODB ignores column order when comparing database schemas. This means that we can re-order data members in a class without causing any schema changes. Member renames, however, will result in schema changes since the column name changes as well (unless we specified the column name explicitly). From ODB's perspective such a rename looks like the deletion of one data member and the addition of another. If we don't want this to be treated as a schema change, then we will need to keep the old column name by explicitly specifying it with the db column pragma. For example, here is how we can rename middle_ to middle_name_ without causing any schema changes:
#pragma db model version(1, 2)

#pragma db object
class person
{
  ...

  #pragma db column("middle") // Keep the original column name.
  std::string middle_name_;

  ...
};
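To see why reordering members is invisible while a rename looks like a drop plus an add, consider a name-based comparison of two column sets. The sketch below is a simplified illustration and not ODB's actual implementation; `column_diff` and `diff_columns` are hypothetical names, and columns are compared by name only.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Result of comparing two schemas by column name (hypothetical type).
struct column_diff
{
  std::vector<std::string> added;   // Present only in the new schema.
  std::vector<std::string> dropped; // Present only in the old schema.
};

// Compare column names as sets, so the order of columns is ignored.
inline column_diff diff_columns(const std::vector<std::string>& old_cols,
                                const std::vector<std::string>& new_cols)
{
  const std::set<std::string> o(old_cols.begin(), old_cols.end());
  const std::set<std::string> n(new_cols.begin(), new_cols.end());

  column_diff d;
  std::set_difference(n.begin(), n.end(), o.begin(), o.end(),
                      std::back_inserter(d.added));
  std::set_difference(o.begin(), o.end(), n.begin(), n.end(),
                      std::back_inserter(d.dropped));
  return d;
}
```

Comparing `{id, first, middle, last}` with the same names in a different order yields no changes, while replacing `middle` with `middle_name` reports one added and one dropped column. Pinning the column name with the db column pragma keeps the two sets identical.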
If your object model consists of a large number of header files and you generate the database schema for each of them individually, then a changelog will be created for each of your header files. This may be what you want; however, a large number of changelogs can quickly become unwieldy. In fact, if you are generating the database schema as standalone SQL files, then you may have already experienced a similar problem caused by a large number of .sql files, one for each header.
The solution to both of these problems is to generate a combined database schema file and a single changelog. For example, assume we have three header files in our object model: person.hxx, employee.hxx, and employer.hxx. To generate the database support code we compile them as usual but without specifying the --generate-schema option. In this case no changelog is created or updated:
odb --database pgsql person.hxx
odb --database pgsql employee.hxx
odb --database pgsql employer.hxx
To generate the database schema, we perform a separate invocation of the ODB compiler. This time, however, we instruct it to only generate the schema (--generate-schema-only) and produce it combined (--at-once) for all the files in our object model:
odb --database pgsql --generate-schema-only --at-once \
--input-name company person.hxx employee.hxx employer.hxx
The result of the above command is a single company.sql file (the name is derived from the --input-name value) that contains the database schema for our entire object model. There is also a single corresponding changelog file — company.xml.
The same can be achieved for the embedded schema by instructing the ODB compiler to generate the database creation code into a separate C++ file (--schema-format separate):
odb --database pgsql --generate-schema-only --schema-format separate \
--at-once --input-name company person.hxx employee.hxx employer.hxx
The result of this command is a single company-schema.cxx file and, again, company.xml.
Note also that by default the changelog file is not placed into the directory specified with the --output-dir option. This is due to the changelog being both an input and an output file at the same time. As a result, by default, the ODB compiler will place it in the directory of the input header file.
There are, however, a number of command line options (including --changelog-dir) that allow us to fine-tune the name and location of the changelog file. For example, you can instruct the ODB compiler to read the changelog from one file while writing it to another. This can be useful if you want to review the changes before discarding the old file. For more information on these options, refer to the ODB Compiler Command Line Manual and search for "changelog".
When we were discussing version increments above, we used the terms development and release. Specifically, we talked about keeping the same object model versions during development periods and incrementing them after releases. What is a development period and a release in this context? These definitions can vary from project to project. Generally, during a development period we work on one or more changes to the object model that result in the changes to the database schema. A release is a point where we make our changes available to someone else who may have an older database to migrate from. In the traditional sense, a release is a point where you make a new version of your application available to its users. However, for schema evolution purposes, a release could also mean simply making your schema-altering changes available to other developers on your team. Let us consider two common scenarios to illustrate how all this fits together.
One way to set up a project would be to re-use the application development period and application release for schema evolution. That is, during the development of a new application version we keep a single object model version and when we release the application, we increment the model version. In this case it makes sense to also reuse the application version as the model version for consistency. Here is a step-by-step guide for this setup:
1. During development, keep the current object model version open.
2. Before the release (for example, when entering a "feature freeze"), close the version.
3. After the release, update the version and open it.
4. For each new feature, review the changeset at the top of the changelog, for example, with diff or your version control facilities. If you are using version control, then this is best done just before committing your changes to the repository.
An alternative way to set up schema versioning in a project would be to define the development period as working on a single feature and the release as making this feature available to other people (developers, testers, etc.) on your team, for example, by committing the changes to a public version control repository. In this case, the object model version will be independent of the application version and can simply be a sequence that starts with 1 and is incremented by 1. Here is a step-by-step guide for this setup:
1. Keep the current model version closed. Once a change is made that affects the database schema, the ODB compiler will refuse to update the changelog.
2. If the change is legitimate, open a new version, that is, increment the current version and make it open.
3. Once the feature is implemented and tested, review the final set of database changes (with diff or your version control facilities), close the version, and commit the changes to the version control repository (if using).
If you are using a version control repository that supports pre-commit checks, then you may want to consider adding such a check to make sure the committed version is always closed.
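One possible shape for such a pre-commit check is a small function that scans the version header for the pragma and verifies that it is explicitly marked closed. This is a minimal sketch under stated assumptions: `model_version_closed` is a hypothetical name, the pragma is assumed to be on a single line in the form shown in this section, and commented-out pragmas are not filtered out.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true only if the text contains a db model version pragma whose
// last argument is "closed". A missing third argument counts as open,
// since the version is open by default.
inline bool model_version_closed(const std::string& header_text)
{
  // Capture everything between the parentheses of the version pragma.
  static const std::regex pragma_re(
    R"(#\s*pragma\s+db\s+model\s+version\s*\(([^)]*)\))");

  std::smatch m;
  if (!std::regex_search(header_text, m, pragma_re))
    return false; // No version pragma found.

  // The argument list must end with ", closed".
  static const std::regex closed_re(R"(,\s*closed\s*$)");
  const std::string args = m[1].str();
  return std::regex_search(args, closed_re);
}
```

A hook would read the header contents, call this function, and reject the commit when it returns false.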
If we are just starting schema evolution in our project, which approach should we choose? The two approaches will work better in different situations since they have a different set of advantages and disadvantages. The first approach, which we can call version per application release, is best suited for simpler projects with smaller releases since otherwise a single migration will bundle a large number of unrelated actions corresponding to different features. This can become difficult to review and, if things go wrong, debug.
The second approach, which we can call version per feature, is much more modular and provides a number of additional benefits. We can perform migrations for each feature as a discrete step, which makes it easier to debug. We can also place each such migration step into a separate transaction, further improving reliability. It also scales much better in larger teams where multiple developers can work concurrently on features that affect the database schema. For example, if you find yourself in a situation where another developer on your team used the same version as you and managed to commit their changes before you (that is, you have a merge conflict), then you can simply change the version to the next available one, regenerate the changelog, and continue with your commit.
Overall, unless you have strong reasons to prefer the version per application release approach, choose version per feature even though it may seem more complex at the beginning. Also, if you do select the first approach, consider provisioning for a switch to the second by reserving a sub-version number. For example, for an application version in the form 2.3.4 you can make the object model version take the form 0x0203040000, reserving the last two bytes for a sub-version. Later on you can use it to switch to the version per feature approach.
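The reserved sub-version layout can be checked with a little arithmetic. The sketch below assumes one byte each for the three application version components followed by a two-byte sub-version; `model_version` is a hypothetical helper and not part of ODB.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: build a model version of the form 0xXXYYZZSSSS,
// where SSSS is the two-byte sub-version reserved for per-feature steps.
inline std::uint64_t model_version(unsigned major, unsigned minor,
                                   unsigned patch, unsigned sub = 0)
{
  assert(major <= 0xFF && minor <= 0xFF && patch <= 0xFF && sub <= 0xFFFF);
  return (static_cast<std::uint64_t>(major) << 32) |
         (static_cast<std::uint64_t>(minor) << 24) |
         (static_cast<std::uint64_t>(patch) << 16) |
         static_cast<std::uint64_t>(sub);
}
```

Because the sub-version occupies the lowest bytes, per-feature versions within release 2.3.4 sort after 0x0203040000 and before the first version of release 2.3.5, so the sequence stays increasing after the switch.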
13.2 Schema Migration
Once we enable schema evolution by specifying the object model version, in addition to the schema creation statements, the ODB compiler starts generating schema migration statements for each version all the way from the base to the current. As with schema creation, schema migration can be generated either as a set of SQL files or embedded into the generated C++ code (--schema-format option).
For each migration step, that is from one version to the next, ODB generates two sets of statements: pre-migration and post-migration. The pre-migration statements "relax" the database schema so that both old and new data can co-exist. At this stage new columns and tables are added while old constraints are dropped. The post-migration statements "tighten" the database schema back so that only data conforming to the new format can remain. At this stage old columns and tables are dropped and new constraints are added. Now you can probably guess where the data migration fits into this — between the pre and post schema migrations where we can both access the old data and create the new one.
If the schema is being generated as standalone SQL files, then we end up with a pair of files for each step: the pre-migration file and the post-migration file. For the person example we started in the previous section we will have the person-002-pre.sql and person-002-post.sql files. Here 002 is the version to which we are migrating while the pre and post suffixes specify the migration stage. So if we wanted to migrate a person database from version 1 to 2, then we would first execute person-002-pre.sql, then migrate the data, if any (discussed in more detail in the next section), and finally execute person-002-post.sql. If our database is several versions behind, for example the database has version 1 while the current version is 5, then we simply perform this set of steps for each version until we reach the current version.
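The order of operations described above can be sketched as a function that lists the SQL files to run for each step. This is an illustration only: `migration_step` and `migration_plan` are hypothetical names, the file naming follows the person-002-pre.sql pattern shown in the text, and versions are assumed to be consecutive integers, which need not hold in a real project.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// One migration step: run pre_sql, migrate the data (if any), run post_sql.
struct migration_step
{
  unsigned version;     // Version we are migrating to.
  std::string pre_sql;  // Relaxes the schema.
  std::string post_sql; // Tightens the schema back.
};

// List the steps needed to bring a database from db_version to current.
inline std::vector<migration_step>
migration_plan(const std::string& name, unsigned db_version, unsigned current)
{
  assert(db_version <= current);

  std::vector<migration_step> plan;
  for (unsigned v = db_version + 1; v <= current; ++v)
  {
    char num[16];
    std::snprintf(num, sizeof(num), "%03u", v); // Zero-padded, as in "002".
    plan.push_back({v,
                    name + "-" + num + "-pre.sql",
                    name + "-" + num + "-post.sql"});
  }
  return plan;
}
```

For a database at version 1 and a current version of 3, the plan contains two steps, for versions 2 and 3, each executed in full before the next one begins.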
If we look at the contents of the person-002-pre.sql file, we will see the following (or equivalent, depending on the database used) statement:
ALTER TABLE "person"
ADD COLUMN "middle" TEXT NULL;
As we would expect, this statement adds a new column corresponding to the new data member. An observant reader would notice, however, that the column is added as NULL even though we never requested these semantics in our object model. Why is the column added as NULL? If during migration the person table already contains rows (that is, existing objects), then an attempt to add a non-NULL column that doesn't have a default value will fail. As a result, ODB will initially add a new column that doesn't have a default value as NULL but then clean this up at the post-migration stage. This way your data migration code is given a chance to assign some meaningful values for the new data member for all the existing objects. Here are the contents of the person-002-post.sql file:
ALTER TABLE "person"
ALTER COLUMN "middle" SET NOT NULL;
Currently ODB directly supports the following elementary database schema changes:
add table
drop table
add column
drop column
alter column, set NULL/NOT NULL
add foreign key
drop foreign key
add index
drop index
More complex changes can normally be implemented in terms of these building blocks. For example, to change the type of a data member (which leads to a change of the column type), we can add a new data member with the desired type (add column), migrate the data, and then delete the old data member (drop column). ODB will issue diagnostics for cases that are currently not supported directly. Note also that some database systems (notably SQLite) have a number of limitations in their support for schema changes. For more information on these database-specific limitations, refer to the "Limitations" sections in Part II, "Database Systems".
更复杂的更改通常可以用这些构建块来实现。例如,要更改数据成员的类型(这会导致更改列类型),可以添加一个具有所需类型的新数据成员(添加列),迁移数据,然后删除旧数据成员(删除列)。ODB将对当前不直接支持的案例发出诊断。还要注意,一些数据库系统(特别是SQLite)在支持模型更改方面有很多限制。有关这些数据库特定限制的更多信息,请参阅第二部分“数据库系统”中的“限制”部分。
How do we know what the current database version is? That is, the version from which we need to migrate? We need to know this, for example, in order to determine the set of migrations we have to perform. By default, when schema evolution is enabled, ODB maintains this information in a special table called schema_version that has the following (or equivalent, depending on the database used) definition:
我们如何知道当前数据库版本是什么?也就是说,我们需要迁移的版本?例如,为了确定我们必须执行的迁移集,我们需要知道这一点。默认情况下,当启用模型演化时,ODB在一个名为schema_version的特殊表中维护这些信息,该表具有以下(或等效的,取决于所使用的数据库)定义:
CREATE TABLE "schema_version" (
"name" TEXT NOT NULL PRIMARY KEY,
"version" BIGINT NOT NULL,
"migration" BOOLEAN NOT NULL);
The name column is the schema name as specified with the --schema-name option. It is empty for the default schema. The version column contains the current database version. And, finally, the migration flag indicates whether we are in the process of migrating the database, that is, between the pre and post-migration stages.
name列是用--schema-name选项指定的模型名。对于默认模型,它是空的。version列包含当前数据库版本。最后,migration标志指示我们是否处于迁移数据库的过程中,即迁移前和迁移后阶段之间。
The schema creation statements (person.sql in our case) create this table and populate it with the initial model version. For example, if we executed person.sql corresponding to version 1 of our object model, then name would be empty (which signifies the default schema since we didn't specify --schema-name), version would be 1, and migration would be FALSE.
模型创建语句(在我们的例子中是person.sql)会创建这个表,并用初始模型版本填充它。例如,如果我们执行了对应于对象模型版本1的person.sql,那么name将为空(表示默认模型,因为我们没有指定--schema-name),version将为1,而migration将为FALSE。
The pre-migration statements update the version and set the migration flag to TRUE. Continuing with our example, after executing person-002-pre.sql, version will become 2 and migration will be set to TRUE. The post-migration statements simply clear the migration flag. In our case, after running person-002-post.sql, version will remain 2 while migration will be reset to FALSE.
迁移前语句更新版本并将迁移标志设置为TRUE。继续我们的示例,在执行了person-002-pre.sql之后,版本将变成2,迁移将被设置为TRUE。后迁移语句只是清除迁移标志。在我们的例子中,在运行了person-002-post.sql之后, version将保持2,而迁移将被重置为FALSE。
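The state changes described above can be modelled with a small standalone sketch. The version_row struct and the apply_pre()/apply_post() functions are illustrative names only, not ODB types; the assertions are sanity checks added for the sketch and are assumptions rather than documented ODB behavior.

```cpp
#include <cassert>
#include <string>

// Stand-in for one row of the schema_version table (illustrative only).
struct version_row
{
  std::string name;             // Empty for the default schema.
  unsigned long long version;   // Current database version.
  bool migration;               // True between the pre and post stages.
};

// Effect of the pre-migration statements for version v.
void
apply_pre (version_row& r, unsigned long long v)
{
  assert (!r.migration && v > r.version); // Sketch-only sanity check.
  r.version = v;
  r.migration = true;
}

// Effect of the post-migration statements: only the flag is cleared.
void
apply_post (version_row& r)
{
  assert (r.migration); // Sketch-only sanity check.
  r.migration = false;
}
```

Starting from {"", 1, false}, calling apply_pre (r, 2) gives {"", 2, true}, and apply_post (r) then gives {"", 2, false}, matching the person-002 walkthrough.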
Note also that above we mentioned that the schema creation statements (person.sql) create the schema_version table. This means that if we enable schema evolution support in the middle of a project, then we could already have existing databases that don't include this table. As a result, ODB will not be able to handle migrations for such databases unless we manually add the schema_version table and populate it with the correct version information. For this reason, it is highly recommended that you consider whether to use schema evolution and, if so, enable it from the beginning of your project.
还请注意,上面我们提到,模型创建语句(person.sql)创建了schema_version表。这意味着,如果我们在项目中途启用模型演化支持,那么我们可能已经有不包含此表的现有数据库。因此,ODB将无法处理此类数据库的迁移,除非我们手动添加schema_version表并使用正确的版本信息填充它。由于这个原因,强烈建议您考虑是否使用模型演化,如果使用,则从项目一开始就启用它。
The odb::database class provides an API for accessing and modifying the current database version:
database类提供了访问和修改当前数据库版本的API:
namespace odb
{
typedef unsigned long long schema_version;
struct LIBODB_EXPORT schema_version_migration
{
schema_version_migration (schema_version = 0,
bool migration = false);
schema_version version;
bool migration;
// This class also provides the ==, !=, <, >, <=, and >= operators.
// Version ordering is as follows: {1,f} < {2,t} < {2,f} < {3,t}.
};
class database
{
public:
...
schema_version
schema_version (const std::string& name = "") const;
bool
schema_migration (const std::string& name = "") const;
const schema_version_migration&
schema_version_migration (const std::string& name = "") const;
// Set schema version and migration state manually.
//
void
schema_version_migration (schema_version,
bool migration,
const std::string& name = "");
void
schema_version_migration (const schema_version_migration&,
const std::string& name = "");
// Set default schema version table for all schemas.
//
void
schema_version_table (const std::string& table_name);
// Set schema version table for a specific schema.
//
void
schema_version_table (const std::string& table_name,
const std::string& name);
};
}
The schema_version() and schema_migration() accessors return the current database version and migration flag, respectively. The optional name argument is the schema name. If the database schema hasn't been created (that is, there is no corresponding entry in the schema_version table or this table does not exist), then schema_version() returns 0. The schema_version_migration() accessor returns both version and migration flag together in the schema_version_migration struct.
schema_version()和schema_migration()访问器分别返回当前数据库版本和迁移标志。可选的name参数是模型名。如果还没有创建数据库模型(也就是说,schema_version表中没有相应的条目,或者这个表不存在),那么schema_version()返回0。schema_version_migration()访问器在schema_version_migration结构中同时返回版本和迁移标志。
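The version ordering noted in the comment of the listing above ({1,f} < {2,t} < {2,f} < {3,t}) can be expressed as a comparison operator. The following is a minimal sketch on a simplified stand-in struct named version_state; it only illustrates the ordering rule and is not the library's implementation.

```cpp
#include <cassert>

typedef unsigned long long schema_version;

// Simplified stand-in for odb::schema_version_migration.
struct version_state
{
  schema_version version;
  bool migration;
};

// Versions are compared first. For the same version, the in-migration
// state (true) sorts before the completed state (false).
inline bool
operator< (const version_state& x, const version_state& y)
{
  if (x.version != y.version)
    return x.version < y.version;

  return x.migration && !y.migration;
}
```

With this rule, a database that is halfway through the migration to version 2 compares as older than one that has completed it.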
You may already have a version table in your database or you (or your database administrator) may prefer to keep track of versions your own way. You can instruct ODB not to create the schema_version table with the --suppress-schema-version option. However, ODB still needs to know the current database version in order for certain schema evolution mechanisms to function properly. As a result, in this case, you will need to set the schema version on the database instance manually using the schema_version_migration() modifier. Note that the modifier API is not thread-safe. That is, you should not modify the schema version while other threads may be accessing or modifying the same information.
您的数据库中可能已经有一个版本表,或者您(或者您的数据库管理员)可能更愿意以自己的方式跟踪版本。可以使用--suppress-schema-version选项指示ODB不要创建schema_version表。但是,ODB仍然需要知道当前的数据库版本,以便某些模型演化机制能够正常工作。因此,在这种情况下,您需要使用schema_version_migration()修饰符手动设置数据库实例上的模型版本。注意,修饰符API不是线程安全的。也就是说,当其他线程可能正在访问或修改相同的信息时,您不应该修改模型版本。
Note also that the accessors we discussed above will only query the schema_version table once and, if the version could be determined, cache the result. If, however, the version could not be determined (that is, schema_version() returned 0), then a subsequent call will re-query the table. While it is probably a bad idea to modify the database schema while the application is running (other than via the schema_catalog API, as discussed below), if for some reason you need ODB to re-query the version, then you can manually set it to 0 using the schema_version_migration() modifier.
还请注意,我们在上面讨论的访问器将只查询schema_version表一次,如果可以确定版本,则缓存结果。但是,如果无法确定版本(即schema_version()返回0),则后续调用将重新查询表。虽然在应用程序运行时修改数据库模型可能不是个好主意(通过schema_catalog API进行的修改除外,如下所述),但如果由于某种原因您需要ODB重新查询版本,那么可以使用schema_version_migration()修饰符手动将其设置为0。
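The caching rule described above (cache a non-zero version, re-query while the version is 0) can be sketched as follows. The version_cache class is purely illustrative and is not how ODB implements it; the query callback stands in for reading the schema_version table.

```cpp
#include <cassert>
#include <functional>

typedef unsigned long long schema_version;

// Illustrative model of the caching rule: a non-zero version is cached,
// while 0 ("could not be determined") triggers a re-query next time.
class version_cache
{
public:
  explicit version_cache (std::function<schema_version ()> query)
      : query_ (query), cached_ (0) {}

  schema_version
  get ()
  {
    if (cached_ == 0)
      cached_ = query_ (); // Stand-in for querying the schema_version table.

    return cached_;
  }

  // Setting the version to 0 forces the next get() to re-query.
  void
  set (schema_version v) { cached_ = v; }

private:
  std::function<schema_version ()> query_;
  schema_version cached_;
};
```

Once a non-zero version has been returned, later changes to the table are not observed until the cached value is reset to 0.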
It is also possible to change the name of the table that stores the schema version using the --schema-version-table option. You will also need to specify this alternative name on the database instance using the schema_version_table() modifier. The first version specifies the default table that is used for all the schema names. The second version specifies the table for a specific schema. The table name should be database-quoted, if necessary.
还可以使用--schema-version-table选项更改存储模型版本的表的名称。您还需要使用schema_version_table()修饰符在数据库实例上指定这个替代名称。第一个版本指定了用于所有模型名的默认表。第二个版本指定了特定模型的表。如果需要,表名应该用数据库引号括起来。
If we are generating our schema migrations as standalone SQL files, then the migration workflow could look like this:
如果我们将我们的模型迁移作为独立的SQL文件,那么迁移工作流可以像这样:
The database administrator determines the current database version. If migration is required, then for each migration step (that is, from one version to the next), they perform the following: 数据库管理员确定数据库的当前版本。如果需要迁移,那么对于每个迁移步骤(即从一个版本到下一个版本),执行以下操作:
Execute the pre-migration file. 执行迁移前文件。
Execute our application (or a separate migration program) to perform data migration (discussed later). Our application can determine that it is being executed in the "migration mode" by calling schema_migration() and then which migration code to run by calling schema_version(). 执行我们的应用程序(或单独的迁移程序)来执行数据迁移(稍后讨论)。我们的应用程序可以通过调用schema_migration()来确定它正在“迁移模式”下执行,然后通过调用schema_version()来确定要运行的迁移代码。
Execute the post-migration file. 执行迁移后文件。
These steps become more integrated and automatic if we embed the schema creation and migration code into the generated C++ code. Now we can perform schema creation, schema migration, and data migration as well as determine when each step is necessary programmatically from within the application.
如果我们将模型创建和迁移代码嵌入到生成的C++代码中,这些步骤将变得更加集成和自动化。现在,我们可以执行模型创建、模型迁移和数据迁移,以及从应用程序内部以编程方式确定何时需要执行每个步骤。
Schema evolution support adds the following extra functions to the odb::schema_catalog class, which we first discussed in Section 3.4, "Database".
模型演化支持向odb::schema_catalog类添加了以下额外的函数,我们在3.4节“数据库”中首先讨论了这个类。
namespace odb
{
class schema_catalog
{
public:
...
// Schema migration.
//
static void
migrate_schema_pre (database&,
schema_version,
const std::string& name = "");
static void
migrate_schema_post (database&,
schema_version,
const std::string& name = "");
static void
migrate_schema (database&,
schema_version,
const std::string& name = "");
// Data migration.
//
// Discussed in the next section.
// Combined schema and data migration.
//
static void
migrate (database&,
schema_version = 0,
const std::string& name = "");
// Schema version information.
//
static schema_version
base_version (const database&,
const std::string& name = "");
static schema_version
base_version (database_id,
const std::string& name = "");
static schema_version
current_version (const database&,
const std::string& name = "");
static schema_version
current_version (database_id,
const std::string& name = "");
static schema_version
next_version (const database&,
schema_version = 0,
const std::string& name = "");
static schema_version
next_version (database_id,
schema_version,
const std::string& name = "");
};
}
The migrate_schema_pre() and migrate_schema_post() static functions perform a single stage (that is, pre or post) of a single migration step (that is, from one version to the next). The version argument specifies the version we are migrating to. For instance, in our person example, if we know that the database version is 1 and the next version is 2, then we can execute code like this:
migrate_schema_pre()和migrate_schema_post()静态函数执行单个迁移步骤(即从一个版本到下一个版本)的单个阶段(即前或后)。version参数指定我们要迁移到的版本。例如,在我们的person例子中,如果我们知道数据库的版本是1,下一个版本是2,那么我们可以像这样执行代码:
transaction t (db.begin ());
schema_catalog::migrate_schema_pre (db, 2);
// Data migration goes here.
schema_catalog::migrate_schema_post (db, 2);
t.commit ();
If you don't have any data migration code to run, then you can perform both stages with a single call using the migrate_schema() static function.
如果您没有任何数据迁移代码要运行,那么您可以使用migrate_schema()静态函数使用一个调用来执行这两个阶段。
The migrate() static function performs both schema and data migration (we discuss data migration in the next section). It can also perform several migration steps at once. If we don't specify its target version, then it will migrate (if necessary) all the way to the current model version. As an extra convenience, migrate() will also create the database schema if none exists. As a result, if we don't have any data migration code or we have registered it with schema_catalog (as discussed later), then the database schema creation and migration, whichever is necessary, if at all, can be performed with a single function call:
静态函数migrate()执行模型迁移和数据迁移(我们将在下一节讨论数据迁移)。它还可以同时执行几个迁移步骤。如果我们没有指定它的目标版本,那么它将(如果必要的话)一直迁移到当前的模型版本。作为一种额外的便利,如果不存在数据库模型,migrate()也会创建数据库模型。因此,如果我们没有任何数据迁移代码,或者我们已经将其注册到schema_catalog(稍后会讨论到),那么数据库模型的创建和迁移,如果需要的话,可以通过一个函数调用来执行:
transaction t (db.begin ());
schema_catalog::migrate (db);
t.commit ();
Note also that schema_catalog is integrated with the odb::database schema version API. In particular, schema_catalog functions will query and synchronize the schema version on the database instance if and when required.
还要注意,schema_catalog与odb::database模型版本API集成在一起。特别是,schema_catalog函数将在需要时查询并同步数据库实例上的模型版本。
The schema_catalog class also allows you to iterate over known versions (remember, there could be "gaps" in version numbers) with the base_version(), current_version() and next_version() static functions. The base_version() and current_version() functions return the base and current object model versions, respectively. That is, the lowest version from which we can migrate and the version that we ultimately want to migrate to. The next_version() function returns the next known version. If the passed version is greater than or equal to the current version, then this function will return the current version plus one (that is, one past current). If we don't specify the version, then next_version() will use the current database version as the starting point. Note also that the schema version information provided by these functions is only available if we embed the schema migration code into the generated C++ code. For standalone SQL file migrations this information is normally not needed since the migration process is directed by an external entity, such as a database administrator or a script.
schema_catalog类还允许您通过base_version()、current_version()和next_version()静态函数迭代已知的版本(请记住,版本号可能存在“间隙”)。base_version()和current_version()函数分别返回基对象模型版本和当前对象模型版本。也就是说,我们可以迁移的最低版本和我们最终想要迁移到的版本。next_version()函数返回下一个已知的版本。如果传递的版本大于或等于当前版本,则该函数将返回当前版本加1(即当前版本之后的一个版本)。如果我们没有指定版本号,那么next_version()将使用当前数据库版本作为起点。还请注意,只有当我们将模型迁移代码嵌入生成的C++代码中时,这些函数提供的模型版本信息才可用。对于独立的SQL文件迁移,通常不需要这些信息,因为迁移过程是由外部实体(如数据库管理员或脚本)指导的。
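To make the "gaps" and "one past current" behavior concrete, here is a small standalone re-implementation of the rule over a sorted list of known versions. The function name next_known_version is hypothetical and this is not ODB code. The sketch assumes the list is sorted with the base version first and the current version last, and it does not model what happens for versions below the base.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

typedef unsigned long long schema_version;

// Illustrative version of the next_version() rule. `known` must be sorted
// in ascending order and non-empty; its last element is the current version.
schema_version
next_known_version (const std::vector<schema_version>& known,
                    schema_version v)
{
  assert (!known.empty ());
  schema_version current (known.back ());

  if (v >= current)
    return current + 1; // One past current.

  // First known version strictly greater than v. It always exists here
  // because v < current.
  return *std::upper_bound (known.begin (), known.end (), v);
}
```

With known versions 1, 2, and 5, a loop of the form `for (v = next (v); v <= 5; v = next (v))` starting from 1 visits 2 and then 5, and terminates because the call after 5 returns 6.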
Most schema_catalog functions presented above also accept the optional schema name argument. If the passed schema name is not found, then the odb::unknown_schema exception is thrown. Similarly, functions that accept the schema version argument will throw the odb::unknown_schema_version exception if the passed version is invalid. Refer to Section 3.14, "ODB Exceptions" for more information on these exceptions.
上面介绍的大多数schema_catalog函数也接受可选的模型名参数。如果没有找到传递的模型名,则抛出odb::unknown_schema异常。类似地,如果传递的版本无效,接受模型版本参数的函数将抛出odb::unknown_schema_version异常。有关这些异常的更多信息,请参阅3.14节“ODB异常”。
To illustrate how all these parts fit together, consider the following more realistic database schema management example. Here we want to handle the schema creation in a special way and perform each migration step in its own transaction.
为了说明所有这些部分是如何组合在一起的,考虑下面更实际的数据库模型管理示例。在这里,我们希望以一种特殊的方式处理模型创建,并在自己的事务中执行每个迁移步骤。
schema_version v (db.schema_version ());
schema_version bv (schema_catalog::base_version (db));
schema_version cv (schema_catalog::current_version (db));
if (v == 0)
{
// No schema in the database. Create the schema and
// initialize the database.
//
transaction t (db.begin ());
schema_catalog::create_schema (db);
// Populate the database with initial data, if any.
t.commit ();
}
else if (v < cv)
{
// Old schema (and data) in the database, migrate them.
//
if (v < bv)
{
// Error: migration from this version is no longer supported.
}
for (v = schema_catalog::next_version (db, v);
v <= cv;
v = schema_catalog::next_version (db, v))
{
transaction t (db.begin ());
schema_catalog::migrate_schema_pre (db, v);
// Data migration goes here.
schema_catalog::migrate_schema_post (db, v);
t.commit ();
}
}
else if (v > cv)
{
// Error: old application trying to access new database.
}
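The branching in the example above can be condensed into a small classification function, which may be easier to unit test than the inline version. This is only a sketch: the schema_action enum and the classify() function are hypothetical names, and, unlike the listing above where the "too old" case is just a comment placeholder, the sketch treats it as a separate outcome that prevents migration.

```cpp
#include <cassert>

typedef unsigned long long schema_version;

enum schema_action
{
  create_schema,   // v == 0: no schema in the database.
  migrate_schema,  // bv <= v < cv: run the migration steps.
  up_to_date,      // v == cv: nothing to do.
  too_old,         // 0 < v < bv: migration is no longer supported.
  too_new          // v > cv: old application, newer database.
};

// v: database version, bv: base model version, cv: current model version.
schema_action
classify (schema_version v, schema_version bv, schema_version cv)
{
  assert (bv <= cv);

  if (v == 0)  return create_schema;
  if (v > cv)  return too_new;
  if (v == cv) return up_to_date;

  return v < bv ? too_old : migrate_schema;
}
```

An application could switch on the result and run the per-version loop only for the migrate_schema case.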
13.3 Data Migration 数据迁移
In quite a few cases specifying the default value for new data members will be all that's required to handle the existing objects. For example, the natural default value for the new middle name that we have added is an empty string. And we can handle this case with the db default pragma and without any extra C++ code:
在相当多的情况下,为新数据成员指定默认值将是处理现有对象所需的全部操作。例如,我们添加的新中间名的自然默认值是一个空字符串。我们可以用db default pragma来处理这种情况,而不需要任何额外的C++代码:
#pragma db model version(1, 2)
#pragma db object
class person
{
...
#pragma db default("")
std::string middle_;
};
However, there will be situations where we would need to perform more elaborate data migrations, that is, convert old data to the new format. As an example, suppose we want to add gender to our person class. And, instead of leaving it unassigned for all the existing objects, we will try to guess it from the first name. This is not particularly accurate but it could be sufficient for our hypothetical application:
然而,在某些情况下,我们需要执行更详细的数据迁移,即将旧数据转换为新格式。例如,假设我们想要向person类添加gender。并且,对于所有现有的对象,我们将尝试通过名字来猜测它,而不是让它未被赋值。这不是特别准确,但对于我们的假设应用来说已经足够了:
#pragma db model version(1, 3)
enum gender {male, female};
#pragma db object
class person
{
...
gender gender_;
};
As we have discussed earlier, there are two ways to perform data migration: immediate and gradual. To recap, with immediate migration we migrate all the existing objects at once, normally after the schema pre-migration statements but before the post-migration statements. With gradual migration, we make sure the new object model can accommodate both old and new data and gradually migrate existing objects as the application runs and the opportunities to do so arise, for example, when an object is updated.
正如我们在前面讨论过的,执行数据迁移有两种方法:立即迁移和逐步迁移。概括一下,对于立即迁移,我们一次迁移所有现有对象,通常在模型迁移前语句之后,但在模型迁移后语句之前。对于逐步迁移,我们要确保新的对象模型能够同时容纳旧数据和新数据,并在应用程序运行过程中,每当出现合适的机会(例如某个对象被更新时)就逐步迁移现有对象。
There is also another option for data migration that is not discussed further in this section. Instead of using our C++ object model we could execute ad-hoc SQL statements that perform the necessary conversions and migrations directly on the database server. While in certain cases this can be a better option from the performance point of view, this approach is often limited in terms of the migration logic that we can handle.
还有一种数据迁移方法,本节将不再进一步讨论。我们可以不使用我们的C++对象模型,而是执行特别的SQL语句,直接在数据库服务器上执行必要的转换和迁移。虽然从性能的角度来看,在某些情况下这可能是一个更好的选择,但这种方法通常受到我们能够处理的迁移逻辑的限制。
13.3.1 Immediate Data Migration 即时数据迁移
Let's first see how we can implement an immediate migration for the new gender_ data member we have added above. If we are using standalone SQL files for migration, then we could add code along these lines somewhere early in main(), before the main application logic:
让我们先看看如何为上面添加的新gender_数据成员实现即时迁移。如果我们使用独立的SQL文件进行迁移,那么我们可以在main()中靠前的某个位置、主要应用程序逻辑之前,添加类似下面的代码:
int
main ()
{
...
odb::database& db = ...
// Migrate data if necessary.
//
if (db.schema_migration ())
{
switch (db.schema_version ())
{
case 3:
{
// Assign gender to all the existing objects.
//
transaction t (db.begin ());
for (person& p: db.query<person> ())
{
p.gender (guess_gender (p.first ()));
db.update (p);
}
t.commit ();
break;
}
}
}
...
}
If you have a large number of objects to migrate, it may also be a good idea, from the performance point of view, to break one big transaction that we now have into multiple smaller transactions (Section 3.5, "Transactions"). For example:
如果您有大量的对象要迁移,从性能的角度来看,将一个大的事务分解为多个较小的事务(第3.5节,“事务”)也是一个好主意。例如:
case 3:
{
transaction t (db.begin ());
size_t n (0);
for (person& p: db.query<person> ())
{
p.gender (guess_gender (p.first ()));
db.update (p);
// Commit the current transaction and start a new one after
// every 100 updates.
//
if (++n % 100 == 0)
{
t.commit ();
t.reset (db.begin ());
}
}
t.commit ();
break;
}
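To see where the commits land, here is a standalone sketch of the batching loop with a hypothetical fake_transaction that only counts commits (it is not odb::transaction). The counter is incremented before the modulo test so that the first in-loop commit happens after the 100th update rather than after the first one.

```cpp
#include <cassert>
#include <cstddef>

// Minimal stand-in for a transaction that only counts commits.
struct fake_transaction
{
  std::size_t commits;

  fake_transaction (): commits (0) {}
  void commit () { ++commits; }
};

// Process `total` updates, committing after every `batch` of them and once
// more at the end. Returns the total number of commits.
std::size_t
process_in_batches (std::size_t total, std::size_t batch)
{
  assert (batch != 0);

  fake_transaction t;
  std::size_t n (0);
  for (std::size_t i (0); i != total; ++i)
  {
    // The per-object update would go here.

    if (++n % batch == 0)
      t.commit (); // In ODB this is followed by t.reset (db.begin ()).
  }

  t.commit (); // Final commit; may cover zero updates.
  return t.commits;
}
```

For 250 updates and a batch size of 100, the sketch commits after updates 100 and 200 and once more at the end, for three commits in total.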
While it looks straightforward enough, as we add more migration snippets, this approach can quickly become unmaintainable. Instead of having all the migrations in a single function and determining when to run each piece ourselves, we can package each migration into a separate function, register it with the schema_catalog class, and let ODB figure out when to run which migration functions. To support this functionality, schema_catalog provides the following data migration API:
虽然它看起来足够简单,但随着我们添加更多的迁移片段,这种方法很快就会变得不可维护。我们可以将每个迁移打包到一个单独的函数中,将其注册到schema_catalog类中,并让ODB判断何时运行哪个迁移函数,而不是将所有迁移都放在一个函数中并自行决定何时运行每个部分。为了支持这个功能,schema_catalog提供了以下数据迁移API:
namespace odb
{
class schema_catalog
{
public:
...
// Data migration.
//
static std::size_t
migrate_data (database&,
schema_version = 0,
const std::string& name = "");
typedef void data_migration_function_type (database&);
// Common (for all the databases) data migration, C++98/03 version:
//
template <schema_version v, schema_version base>
static void
data_migration_function (data_migration_function_type*,
const std::string& name = "");
// Common (for all the databases) data migration, C++11 version:
//
template <schema_version v, schema_version base>
static void
data_migration_function (std::function<data_migration_function_type>,
const std::string& name = "");
// Database-specific data migration, C++98/03 version:
//
template <schema_version v, schema_version base>
static void
data_migration_function (database&,
data_migration_function_type*,
const std::string& name = "");
template <schema_version v, schema_version base>
static void
data_migration_function (database_id,
data_migration_function_type*,
const std::string& name = "");
// Database-specific data migration, C++11 version:
//
template <schema_version v, schema_version base>
static void
data_migration_function (database&,
std::function<data_migration_function_type>,
const std::string& name = "");
template <schema_version v, schema_version base>
static void
data_migration_function (database_id,
std::function<data_migration_function_type>,
const std::string& name = "");
};
// Static data migration function registration, C++98/03 version:
//
template <schema_version v, schema_version base>
struct data_migration_entry
{
data_migration_entry (data_migration_function_type*,
const std::string& name = "");
data_migration_entry (database_id,
data_migration_function_type*,
const std::string& name = "");
};
// Static data migration function registration, C++11 version:
//
template <schema_version v, schema_version base>
struct data_migration_entry
{
data_migration_entry (std::function<data_migration_function_type>,
const std::string& name = "");
data_migration_entry (database_id,
std::function<data_migration_function_type>,
const std::string& name = "");
};
}
The migrate_data() static function performs data migration for the specified version. If no version is specified, then it will use the current database version and also check whether the database is in migration, that is, database::schema_migration() returns true. As a result, all we need to do in our main() is call this function. It will check if migration is required and if so, call all the migration functions registered for this version. For example:
静态函数migrate_data()执行指定版本的数据迁移。如果没有指定版本,则将使用当前数据库版本,并检查数据库是否正在迁移,也就是说,database::schema_migration()返回true。因此,我们在main()中所要做的就是调用这个函数。它将检查是否需要迁移,如果需要,则调用为该版本注册的所有迁移函数。例如:
int
main ()
{
...
database& db = ...
// Check if we need to migrate any data and do so
// if that's the case.
//
schema_catalog::migrate_data (db);
...
}
The migrate_data() function returns the number of migration functions called. You can use this value for debugging or logging.
migrate_data()函数返回被调用的迁移函数的数量。可以将此值用于调试或日志记录。
The only other step that we need to perform is to register our data migration functions with schema_catalog. At the lower level we can call the data_migration_function() static function for every migration function we have, for example, at the beginning of main(). For each version, data migration functions are called in the order of registration.
我们唯一还需要执行的步骤是将数据迁移函数注册到schema_catalog。在较低的层次上,我们可以为每个迁移函数调用data_migration_function()静态函数,例如,在main()的开头。对于每个版本,数据迁移函数按注册的顺序调用。
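The registration and dispatch behavior described above (functions grouped by version, called in registration order, with the number of calls returned) can be illustrated with a toy registry. The migration_registry class is a hypothetical stand-in rather than ODB's implementation, and for brevity its functions take no database argument.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <vector>

typedef unsigned long long schema_version;

// Toy registry (not ODB's implementation) keyed by target version.
class migration_registry
{
public:
  typedef std::function<void ()> function_type;

  void
  add (schema_version v, function_type f)
  {
    functions_[v].push_back (f);
  }

  // Call every function registered for v, in registration order, and
  // return how many were called.
  std::size_t
  run (schema_version v) const
  {
    std::map<schema_version, std::vector<function_type> >::const_iterator i (
      functions_.find (v));

    if (i == functions_.end ())
      return 0;

    for (std::size_t j (0); j != i->second.size (); ++j)
      i->second[j] ();

    return i->second.size ();
  }

private:
  std::map<schema_version, std::vector<function_type> > functions_;
};
```

Registering two functions for version 3 and calling run (3) invokes them in the order they were added and returns 2; calling run() for a version with no registrations returns 0.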
A more convenient approach, however, is to use the data_migration_entry helper class template to register the migration functions during static initialization. This way we can keep the migration function and its registration code next to each other. Here is how we can reimplement our gender migration code to use this mechanism:
然而,更方便的方法是在静态初始化期间使用data_migration_entry helper类模板注册迁移函数。这样我们就可以把迁移函数和它的注册代码放在一起。下面是我们如何重新实现我们的性别迁移代码来使用这个机制:
static void
migrate_gender (odb::database& db)
{
transaction t (db.begin ());
for (person& p: db.query<person> ())
{
p.gender (guess_gender (p.first ()));
db.update (p);
}
t.commit ();
}
static const odb::data_migration_entry<3, MYAPP_BASE_VERSION>
migrate_gender_entry (&migrate_gender);
The first template argument to the data_migration_entry class template is the version we want this data migration function to be called for. The second template argument is the base model version. This second argument is necessary to detect the situation where we no longer need this data migration function. Remember that when we move the base model version forward, migrations from any version below the new base are no longer possible. We, however, may still have migration functions registered for those lower versions. Since these functions will never be called, they are effectively dead code and it would be useful to identify and remove them. To assist with this, data_migration_entry (and the lower-level data_migration_function()) will check at compile time (that is, static_assert) that the registration version is greater than the base model version.
data_migration_entry类模板的第一个模板参数是我们希望为其调用此数据迁移函数的版本。第二个模板参数是基本模型版本。第二个参数是必要的,用于检测我们不再需要此数据迁移函数的情况。请记住,当我们将基本模型版本向前移动时,就不再可能从低于新基本版本的任何版本进行迁移。然而,我们可能仍然为那些较低版本注册了迁移函数。由于这些函数永远不会被调用,它们实际上是死代码,识别和删除它们将是有用的。为了帮助解决这个问题,data_migration_entry(以及较低级别的data_migration_function())将在编译时(即static_assert)检查注册版本是否大于基本模型版本。
In the above example we use the MYAPP_BASE_VERSION macro that is presumably defined in a central place, for example, version.hxx. This is the recommended approach since we can update the base version in a single place and have the C++ compiler automatically identify all the data migration functions that can be removed.
在上面的例子中,我们使用MYAPP_BASE_VERSION宏,该宏大概定义在一个中心位置,例如version.hxx。这是推荐的方法,因为我们可以在一个地方更新基本版本,并让C++编译器自动识别可以删除的所有数据迁移函数。
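The compile-time check itself is easy to reproduce in isolation. The following sketch uses a hypothetical checked_entry template that carries only the static_assert and no registration logic. It also defines MYAPP_BASE_VERSION locally with an assumed value of 2, which in a real project would live in a central header.

```cpp
#include <cassert>

typedef unsigned long long schema_version;

// Illustrative helper showing the compile-time check only.
template <schema_version v, schema_version base>
struct checked_entry
{
  static_assert (v > base,
                 "data migration registered for a version at or below base");

  checked_entry () {}

  schema_version
  version () const { return v; }
};

// Assumed value for this sketch; normally defined in a central place.
#define MYAPP_BASE_VERSION 2

// Compiles because 3 > 2. Raising MYAPP_BASE_VERSION to 3 or higher would
// turn this definition into a compile-time error, flagging the dead code.
static const checked_entry<3, MYAPP_BASE_VERSION> gender_entry;
```

Note that the assertion fires only when the template is instantiated, which is why the sketch defines an object rather than just a typedef.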
In C++11 we can also create a template alias so that we don't have to repeat the base model macro in every registration, for example:
在C++ 11中,我们还可以创建一个模板别名,这样我们就不必在每次注册时都重复基模型宏,例如:
template <schema_version v>
using migration_entry = odb::data_migration_entry<v, MYAPP_BASE_VERSION>;
static const migration_entry<3>
migrate_gender_entry (&migrate_gender);
For cases where you need to bypass the base version check, for example, to implement your own registration helper, ODB also provides "unsafe" versions of the data_migration_function() functions that take the version as a function argument rather than as a template parameter.
例如,在需要绕过基本版本检查来实现自己的注册帮助器的情况下,ODB还提供了data_migration_function()函数的“不安全”版本,这些函数将该版本作为函数参数而不是模板参数。
In C++11 we can also use lambdas as migration functions, which makes the migration code more concise:
在C++ 11中,我们也可以使用lambdas作为迁移函数,这使得迁移代码更简洁:
static const migration_entry<3>
migrate_gender_entry (
[] (odb::database& db)
{
transaction t (db.begin ());
for (person& p: db.query<person> ())
{
p.gender (guess_gender (p.first ()));
db.update (p);
}
t.commit ();
});
If we are using embedded schema migrations, then both schema and data migration are integrated and can be performed with a single call to the schema_catalog::migrate() function that we discussed earlier. For example:
如果我们使用嵌入式模型迁移,那么模型和数据迁移都是集成的,并且可以通过对前面讨论过的schema_catalog::migrate()函数的单个调用来执行。例如:
int
main ()
{
...
database& db = ...
// Check if we need to migrate the database and do so
// if that's the case.
//
{
transaction t (db.begin ());
schema_catalog::migrate (db);
t.commit ();
}
...
}
Note, however, that in this case we call migrate() within a transaction (for the schema migration part) which means that our migration functions will also be called within this transaction. As a result, we will need to adjust our migration functions not to start their own transaction:
但是请注意,在本例中,我们在事务中调用migrate()(用于模型迁移部分),这意味着我们的迁移函数也将在该事务中调用。因此,我们将需要调整我们的迁移功能,不启动自己的事务:
static void
migrate_gender (odb::database& db)
{
// Assume we are already in a transaction.
//
for (person& p: db.query<person> ())
{
p.gender (guess_gender (p.first ()));
db.update (p);
}
}
If, however, we want more granular transactions, then we can use the lower-level schema_catalog functions to gain more control, as we have seen at the end of the previous section. Here is the relevant part of that example with an added data migration call:
但是,如果我们想要更细粒度的事务,那么可以使用较低级别的schema_catalog函数来获得更多的控制,正如我们在前一节的末尾所看到的那样。下面是该示例的相关部分,添加了一个数据迁移调用:
// Old schema (and data) in the database, migrate them.
//
for (v = schema_catalog::next_version (db, v);
v <= cv;
v = schema_catalog::next_version (db, v))
{
transaction t (db.begin ());
schema_catalog::migrate_schema_pre (db, v);
schema_catalog::migrate_data (db, v);
schema_catalog::migrate_schema_post (db, v);
t.commit ();
}
13.3.2 Gradual Data Migration 数据逐步迁移
If the number of existing objects that require migration is large, then an all-at-once, immediate migration, while simple, may not be practical from a performance point of view. In this case, we can perform a gradual migration as the application does its normal functions.
如果需要迁移的现有对象的数量很大,那么一次性、立即的迁移虽然简单,但从性能的角度来看可能并不实用。在这种情况下,我们可以在应用程序执行正常功能时执行逐步迁移。
With gradual migrations, the object model must be capable of representing data that conforms to both old and new formats at the same time since, in general, the database will contain a mixture of old and new objects. For example, in case of our gender data member, we need a special value that represents the "no gender assigned yet" case (an old object). We also need to assign this special value to all the existing objects during the schema pre-migration stage. One way to do this would be to add a special value to our gender enum and then make it the default value with the db default pragma. A cleaner and easier approach, however, is to use NULL as a special value. We can add support for the NULL value semantics to any existing type by wrapping it with odb::nullable, boost::optional or similar (Section 7.3, "Pointers and NULL Value Semantics"). We also don't need to specify the default value explicitly since NULL is used automatically. Here is how we can use this approach in our gender example:
对于逐步迁移,对象模型必须能够同时表示符合旧格式和新格式的数据,因为一般来说,数据库将包含新旧对象的混合。例如,对于我们的性别数据成员,我们需要一个特殊的值来表示“尚未分配性别”的情况(一个旧对象)。在模型预迁移阶段,我们还需要将这个特殊值赋给所有现有对象。一种方法是向gender enum添加一个特殊值,然后使用db default pragma将其设置为默认值。然而,一种更干净、更简单的方法是使用NULL作为一个特殊值。我们可以通过odb::nullable、boost::optional或类似的方式包装任何现有类型,为其添加对NULL值语义的支持(章节7.3,“指针和NULL值语义”)。我们也不需要显式指定默认值,因为会自动使用NULL。以下是我们如何在我们的性别例子中使用这一方法:
#include <odb/nullable.hxx>
#pragma db object
class person
{
...
odb::nullable<gender> gender_;
};
A variety of strategies can be employed to implement gradual migrations. For example, we can migrate the data when the object is updated as part of the normal application logic. While there is no migration cost associated with this approach (the object is updated anyway), depending on how often objects are typically updated, this strategy can take a long time to complete. An alternative strategy would be to perform an update whenever an old object is loaded. Yet another strategy is to have a separate thread that slowly migrates all the old objects as the application runs.
可以使用各种策略来实现渐进迁移。例如,当对象作为正常应用程序逻辑的一部分进行更新时,我们可以迁移数据。虽然这种方法没有相关的迁移成本(无论如何都会更新对象),但取决于对象通常更新的频率,这种策略可能需要很长时间才能完成。另一种策略是在旧对象加载时执行更新。另一种策略是使用单独的线程,在应用程序运行时缓慢地迁移所有旧对象。
As an example, let us implement the first approach for our gender migration. While we could have added the necessary code throughout the application, from the maintenance point of view, it is best to try and localize the gradual migration logic to the persistent classes that it affects. And for this, database operation callbacks (Section 14.1.7, "callback") are a very useful mechanism. In our case, all we have to do is handle the post_load event where we guess the gender if it is NULL:
例如,让我们为性别迁移实现第一种方法。虽然我们可以在整个应用程序中添加必要的代码,但从维护的角度来看,最好将逐步迁移逻辑局限在它所影响的持久类中。为此,数据库操作回调(第14.1.7节,“callback”)是一种非常有用的机制。在我们的例子中,我们所要做的就是处理post_load事件,如果性别为NULL,我们就在其中猜测性别:
#include <odb/core.hxx>     // odb::database
#include <odb/callback.hxx> // odb::callback_event
#include <odb/nullable.hxx>

#pragma db object callback(migrate)
class person
{
  ...

  void
  migrate (odb::callback_event e, odb::database&)
  {
    if (e == odb::callback_event::post_load)
    {
      // Guess gender if not assigned.
      //
      if (gender_.null ())
        gender_ = guess_gender (first_);
    }
  }

  odb::nullable<gender> gender_;
};
In particular, we don't have to touch any of the accessors, modifiers, or the application logic: all of them can assume that the value can never be NULL. And when the object is next updated, the new gender value will be stored automatically.
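The fill-on-load idea can also be tried without a database. The following is a minimal sketch, not ODB code: std::optional stands in for odb::nullable, the post_load () member stands in for the callback, and guess_gender () is a hypothetical placeholder heuristic invented for this illustration.

```cpp
#include <cassert>
#include <optional>
#include <string>

enum class gender {male, female};

// Hypothetical placeholder heuristic, for illustration only.
inline gender
guess_gender (const std::string& first)
{
  return !first.empty () && first.back () == 'a'
    ? gender::female
    : gender::male;
}

struct person
{
  std::string first_;
  std::optional<gender> gender_; // Empty stands in for a NULL column.

  // Stand-in for the post_load callback: fill in the value if missing.
  void
  post_load ()
  {
    if (!gender_)
      gender_ = guess_gender (first_);
  }

  // The accessor can rely on the value being present after post_load ().
  gender
  get_gender () const
  {
    assert (gender_.has_value ());
    return *gender_;
  }
};
```

An old object (empty gender_) gets a guessed value once post_load () runs, while an object that already has a value is left untouched, so the rest of the class never has to deal with the missing case.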
All gradual migrations normally end up with a terminating immediate migration some number of versions down the line, when the bulk of the objects has presumably been converted. This way we don't have to keep the gradual migration code around forever. Here is how we could implement a terminating migration for our example:
// person.hxx
//
#pragma db model version(1, 4)

#pragma db object
class person
{
  ...

  gender gender_;
};

// person.cxx
//
static void
migrate_gender (odb::database& db)
{
  typedef odb::query<person> query;

  for (person& p: db.query<person> (query::gender.is_null ()))
  {
    p.gender (guess_gender (p.first ()));
    db.update (p);
  }
}

static const odb::data_migration_entry<4, MYAPP_BASE_VERSION>
migrate_gender_entry (&migrate_gender);
A couple of points to note about this code. Firstly, we removed all the gradual migration logic (the callback) from the class and replaced it with the immediate migration function. We also removed the odb::nullable wrapper (and therefore disallowed the NULL values) since after this migration all the objects will have been converted. Finally, in the migration function, we only query the database for objects that need migration, that is, have NULL gender.
13.4 Soft Object Model Changes
Let us consider another common kind of object model change: we delete an old member, add a new one, and need to copy the data from the old to the new, perhaps applying some conversion. For example, we may realize that in our application it is a better idea to store a person's name as a single string rather than split it into three fields. So what we would like to do is add a new data member, let's call it name_, convert all the existing split names, and then delete the first_, middle_, and last_ data members.
While this sounds straightforward, there is a problem. If we delete (that is, physically remove from the source code) the old data members, then we won't be able to access the old data. The data will still be available in the database between the schema pre- and post-migrations; it is just that we will no longer be able to access it through our object model. And if we keep the old data members around, then the old data will remain stored in the database even after the schema post-migration.
There is also a more subtle problem that has to do with existing migrations for the previous versions. Remember, in version 3 of our person example we added the gender_ data member. We also have a data migration function which guesses the gender based on the first name. Deleting the first_ data member from our class will obviously break this code. But even adding the new name_ data member will cause problems because when we try to update the object in order to store the new gender, ODB will try to update name_ as well. But there is no corresponding column in the database yet. When we run this migration function, we are still several versions away from the point where the name column will be added.
This is a very subtle but also very important implication to understand. Unlike the main application logic, which only needs to deal with the current model version, data migration code works on databases that can be multiple versions behind the current version.
How can we resolve this problem? It appears what we need is the ability to add or delete data members starting from a specific version. In ODB this mechanism is called soft member additions and deletions. A soft-added member is only treated as persistent starting from the addition version. A soft-deleted member is persistent until the deletion version (but including the migration stage). In its essence, soft model changes allow us to maintain multiple versions of our object model all with a single set of persistent classes. Let us now see how this functionality can help implement our changes:
#pragma db model version(1, 4)

#pragma db object
class person
{
  ...

  #pragma db id auto
  unsigned long id_;

  #pragma db deleted(4)
  std::string first_;

  #pragma db deleted(4)
  std::string middle_;

  #pragma db deleted(4)
  std::string last_;

  #pragma db added(4)
  std::string name_;

  gender gender_;
};
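To make the version windows concrete, here is a small stand-alone model of the rule described above. It is only a sketch of the stated semantics, not ODB's implementation; the schema_state and member_window types and the is_persistent () function are names assumed for this illustration.

```cpp
#include <cassert>

// Schema state: a version plus whether migration to it is still in progress.
struct schema_state
{
  unsigned version;
  bool migrating;
};

// Version window of a data member; 0 means the pragma was not specified.
struct member_window
{
  unsigned added;
  unsigned deleted;
};

// Model of the rule: a soft-added member is persistent starting from its
// addition version; a soft-deleted member is persistent up to its deletion
// version, including the migration stage of that version.
inline bool
is_persistent (member_window m, schema_state s)
{
  if (m.added != 0 && s.version < m.added)
    return false; // Not added yet.

  if (m.deleted != 0)
  {
    if (s.version > m.deleted)
      return false; // Past the deletion version.

    if (s.version == m.deleted && !s.migrating)
      return false; // Migration to the deletion version has finished.
  }

  return true;
}
```

With first_ modeled as {0, 4} and name_ as {4, 0}, only first_ is persistent at version 3, both are persistent while migrating to version 4 (which is what lets a migration function read one and write the other), and only name_ remains once that migration is complete.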
The migration function for this change could then look like this:
static void
migrate_name (odb::database& db)
{
  for (person& p: db.query<person> ())
  {
    p.name (p.first () + " " +
            p.middle () + (p.middle ().empty () ? "" : " ") +
            p.last ());
    db.update (p);
  }
}

static const odb::data_migration_entry<4, MYAPP_BASE_VERSION>
migrate_name_entry (&migrate_name);
Note also that no changes are required to the gender migration function.
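The name conversion itself is plain string handling and can be checked in isolation. The sketch below factors it into a free function; join_name () is a hypothetical helper introduced here for illustration and is not part of the person class.

```cpp
#include <cassert>
#include <string>

// Join the split name into a single string, adding the separator after the
// middle name only when there is one. This is the same expression as in
// migrate_name () above.
inline std::string
join_name (const std::string& first,
           const std::string& middle,
           const std::string& last)
{
  return first + " " + middle + (middle.empty () ? "" : " ") + last;
}
```

For example, join_name ("John", "", "Doe") yields "John Doe" and join_name ("John", "Q", "Doe") yields "John Q Doe".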
As you may have noticed, in the code above we assumed that the person class still provides public accessors for the now deleted data members. This might not be ideal since now they should not be used by the application logic. The only code that may still need to access them is the migration functions. The recommended way to resolve this is to remove the accessors/modifiers corresponding to the deleted data member, make migration functions static functions of the class being migrated, and then access the deleted data members directly. For example:
#pragma db model version(1, 4)

#pragma db object
class person
{
  ...

private:
  friend class odb::access;

  #pragma db id auto
  unsigned long id_;

  #pragma db deleted(4)
  std::string first_;

  #pragma db deleted(4)
  std::string middle_;

  #pragma db deleted(4)
  std::string last_;

  #pragma db added(4)
  std::string name_;

  gender gender_;

  // Public so that the namespace-scope migration entries below can
  // refer to these functions.
  //
public:
  static void
  migrate_gender (odb::database&);

  static void
  migrate_name (odb::database&);
};

void person::
migrate_gender (odb::database& db)
{
  for (person& p: db.query<person> ())
  {
    p.gender_ = guess_gender (p.first_);
    db.update (p);
  }
}

static const odb::data_migration_entry<3, MYAPP_BASE_VERSION>
migrate_gender_entry (&person::migrate_gender);

void person::
migrate_name (odb::database& db)
{
  for (person& p: db.query<person> ())
  {
    p.name_ = p.first_ + " " +
              p.middle_ + (p.middle_.empty () ? "" : " ") +
              p.last_;
    db.update (p);
  }
}

static const odb::data_migration_entry<4, MYAPP_BASE_VERSION>
migrate_name_entry (&person::migrate_name);
Another potential issue with soft deletion is the requirement to keep the deleted data members in the class. While they will not be initialized in the normal operation of the application (that is, not a migration), this can still be a problem if we need to minimize the memory footprint of our classes. For example, we may cache a large number of objects in memory and having three std::string data members can be a significant overhead.
The recommended way to resolve this issue is to place all the deleted data members into a dynamically allocated composite value type. For example:
#pragma db model version(1, 4)

#pragma db object
class person
{
  ...

  #pragma db id auto
  unsigned long id_;

  #pragma db added(4)
  std::string name_;

  gender gender_;

  #pragma db value
  struct deleted_data
  {
    #pragma db deleted(4)
    std::string first_;

    #pragma db deleted(4)
    std::string middle_;

    #pragma db deleted(4)
    std::string last_;
  };

  #pragma db column("")
  std::unique_ptr<deleted_data> dd_;

  ...
};
ODB will then automatically allocate the deleted value type if any of the deleted data members are being loaded. During the normal operation, however, the pointer will stay NULL and therefore reduce the common case overhead to a single pointer per class. Note that we make the composite value column prefix empty (the db column("") pragma) in order to keep the same column names for the deleted data members.
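The footprint argument can be illustrated with plain C++ and no ODB involvement. In this sketch, person_inline and person_lazy are hypothetical stand-ins for the two layouts, and the deleted () member is an assumed helper that mimics on-demand allocation; in ODB the allocation is done by the generated code.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct deleted_data
{
  std::string first_;
  std::string middle_;
  std::string last_;
};

// Layout 1: deleted members kept inline in every object.
struct person_inline
{
  std::string name_;
  deleted_data dd_;
};

// Layout 2: deleted members behind a pointer that is normally null.
struct person_lazy
{
  std::string name_;
  std::unique_ptr<deleted_data> dd_;

  // Allocate on first use, mimicking what happens when the deleted
  // members are loaded during a migration.
  deleted_data&
  deleted ()
  {
    if (!dd_)
      dd_ = std::make_unique<deleted_data> ();

    return *dd_;
  }
};
```

On mainstream standard library implementations sizeof (person_lazy) is smaller than sizeof (person_inline), since three strings are replaced with one pointer, and the strings are only allocated if deleted () is actually called.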
Soft-added and deleted data members can be used in objects, composite values, views, and container value types. We can also soft-add and delete data members of simple, composite, pointer to object, and container types. Only special data members, such as the object id and the optimistic concurrency version, cannot be soft-added or deleted.
It is also possible to soft-delete a persistent class. We can still work with the existing objects of such a class; however, no table is created in new databases for soft-deleted classes. To put it another way, a soft-deleted class is like an abstract class (no table) but one which can still be loaded, updated, etc. Soft-added persistent classes do not make much sense and are therefore not supported.
As an example of a soft-deleted class, suppose we want to replace our person class with the new employee object and migrate the data. Here is how we could do this:
#pragma db model version(1, 5)

#pragma db object deleted(5)
class person
{
  ...
};

#pragma db object
class employee
{
  ...

  #pragma db id auto
  unsigned long id_;

  std::string name_;
  gender gender_;

  // Public so that the namespace-scope migration entry below can
  // refer to this function.
  //
public:
  static void
  migrate_person (odb::database&);
};

void employee::
migrate_person (odb::database& db)
{
  for (person& p: db.query<person> ())
  {
    employee e (p.name (), p.gender ());
    db.persist (e);
  }
}

static const odb::data_migration_entry<5, MYAPP_BASE_VERSION>
migrate_person_entry (&employee::migrate_person);
As we have seen above, hard member additions and deletions can (and most likely will) break existing data migration code. Why, then, not treat all the changes, or at least additions, as soft? ODB requires you to explicitly request this semantics because support for soft-added and deleted data members incurs runtime overhead. And there can be plenty of cases where there is no existing data migration and therefore hard additions and deletions are sufficient.
In some cases a hard addition or deletion will result in a compile-time error. For example, one of the data migration functions may reference the data member we just deleted. In many cases, however, such errors can only be detected at runtime, and, worse yet, only when the migration function is executed. For example, we may hard-add a new data member that an existing migration function will try to indirectly store in the database as part of an object update. As a result, it is highly recommended that you always test your application with the database that starts at the base version so that every data migration function is called and therefore ensured to still work correctly.
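The reason for testing from the base version can be seen in a toy model of a migration registry. This is a simplified simulation and not how ODB stores or dispatches its migration entries; fake_database, migration_function, and migration_registry are all names assumed for this sketch.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Stand-in for a database: its schema version and a log of steps that ran.
struct fake_database
{
  unsigned version;
  std::vector<std::string> log;
};

using migration_function = std::function<void (fake_database&)>;

// Hypothetical registry keyed by the version a function migrates to.
class migration_registry
{
public:
  void
  add (unsigned version, migration_function f)
  {
    entries_.emplace (version, std::move (f));
  }

  // Run, in version order, every function registered for a version
  // greater than db.version and not greater than current.
  void
  migrate (fake_database& db, unsigned current) const
  {
    for (auto i (entries_.upper_bound (db.version));
         i != entries_.end () && i->first <= current;
         ++i)
    {
      i->second (db);
      db.version = i->first;
    }

    if (db.version < current)
      db.version = current;
  }

private:
  std::multimap<unsigned, migration_function> entries_;
};
```

If functions are registered for versions 3 and 4, a database at version 1 runs both of them while a database already at version 3 runs only the second. A test that starts from a recent database would therefore never execute, and never catch a breakage in, the older migration function.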
To help with this problem you can also instruct ODB to warn you about any hard additions or deletions with the --warn-hard-add, --warn-hard-delete, and --warn-hard command line options. ODB will only warn you about hard changes in the current version and only for as long as it is open, which makes this mechanism fairly usable.
You may also be wondering why we have to specify the addition and deletion versions explicitly. It may seem like the ODB compiler should be able to figure this out automatically. While it is theoretically possible, to achieve this, ODB would have to also maintain a separate changelog of the C++ object model in addition to the database schema changelog it already maintains. While being a lot more complex, such an additional changelog would also complicate the workflow significantly. In this light, maintaining this change information as part of the original source files appears to be a cleaner and simpler approach.
As we discussed before, when we move the base model version forward we essentially drop support for migrations from versions before the new base. As a result, it is no longer necessary to maintain the soft semantics of additions and deletions up to and including the new base version. ODB will issue diagnostics for all such members and classes. For soft deletions we can simply remove the data member or class entirely. For soft additions we only need to remove the db added pragma.
13.4.1 Reuse Inheritance Changes
Besides adding and deleting data members, another way to alter the object's table is using reuse-style inheritance. If we add a new reuse base, then, from the database schema point of view, this is equivalent to adding all its columns to the derived object's table. Similarly, deleting reuse inheritance results in all the base's columns being deleted from the derived object's table.
In the future ODB may provide direct support for soft addition and deletion of inheritance. Currently, however, this semantics can be emulated with soft-added and deleted data members. The following table describes the most common scenarios depending on where columns are added or deleted, that is, base table, derived table, or both.
DELETE | HARD | SOFT
In both (delete inheritance and base) | Delete inheritance and base. Move object id to derived. | Soft-delete base. Mark all data members (except id) in base as soft-deleted.
In base only (delete base) | Option 1: mark base as abstract. Option 2: move all the base members to derived, delete base. | Soft-delete base.
In derived only (delete inheritance) | Delete inheritance, add object id to derived. | Option 1: copy base to a new soft-deleted base, inherit from it instead. Mark all the data members (except id) in this new base as soft-deleted. Note: we add the new base as soft-deleted to get notified when we can remove it. Option 2: copy all the data members from base to derived and mark them as soft-deleted in derived.
ADD | HARD | SOFT
In both (add new base and inheritance) | Add new base and inheritance. Potentially move object id member from derived to base. | Add new base and mark all its data members as soft-added. Add inheritance. Move object id from derived to base.
In base only (refactor existing data to new base) | Add new base and move data members from derived to base. Note: in most cases the new base will be made abstract, which makes this scenario non-schema changing. | The same as HARD.
In derived only (add inheritance to existing base) | Add inheritance, delete object id in derived. | Copy existing base to a new abstract base and inherit from it. Mark all the data members in the new base as soft-added (except object id). When notified by the ODB compiler that the soft addition of the data members is no longer necessary, delete the copy and inherit from the original base.
13.4.2 Polymorphism Inheritance Changes
Unlike reuse inheritance, adding or deleting a polymorphic base does not result in the base's data members being added to or deleted from the derived object's table because each class in a polymorphic hierarchy is stored in a separate table. There are, however, other complications due to the presence of special columns (the discriminator in the root table and object id links in derived tables), which make altering the hierarchy structure difficult to handle automatically. Adding or deleting (including soft-deleting) leaf classes (or leaf sub-hierarchies) in a polymorphic hierarchy is fully supported. Any more complex changes, such as adding or deleting the root or an intermediate base, or getting an existing class into or out of a polymorphic hierarchy, can be handled by creating a new leaf class (or leaf sub-hierarchy), soft-deleting the old class, and migrating the data.