Write to a database efficiently

This article explores a heuristic for writing to databases safely and efficiently.

When AIMMS writes data to a database table, it uses structural information of the database table to determine a safe and efficient strategy for writing.

In this article we discuss:

  • An example project

  • Foreign key relations

  • Available strategies

  • Database schema design best practices

An example project description

We are maintaining orders to be delivered. Each order is identified by (the key columns):

  • A customer

  • A product

  • A delivery date

In addition, we maintain the following information:

  • For each customer, its address and name

  • For each product, its name, its color, package size (height, width, depth).

This is represented in the following database schema.

../../_images/two-to-many.png

As you can see from this schema, both a customer and a product have an Id. The table Orders refers to these Id s. In mathematics we have the following relations:

  • \(\forall {\tt CustomerId}: {\tt CustomerId} \in \{Customers.Id\}\)

    In words: For each CustomerId, it should be in the set of all Id in the table Customers.

  • \(\forall {\tt ProjectId}: {\tt ProjectId} \in \{Project.Id\}\)

    In words: For each ProjectId, it should be in the set of all Id in the table Projects.

Foreign keys relations

The relations mentioned in the previous section are examples of consistent relations in a database.

It is important that such relations between tables are maintained. For instance, if there is a CustomerId or a ProductId in the table Orders that is not in the table Customers or in the table Products respectively, then we will not be able to fulfill that order because we do not know where to deliver, or what to deliver!

These relations can be enforced in a database using Foreign Keys.

When such a Foreign Key relation is specified as a constraint in a database, and we insert an order without a matching Product, we will get an error message like:

Error writing to database table "db_OrdersNew": ODBC[1452] : HY000 [MySQL][ODBC 8.0(w)
Driver][mysqld-5.6.44-log]Cannot add or update a child row: a foreign key constraint fails
(`howto343b`.`orders`, CONSTRAINT `fk_order_product` FOREIGN KEY (`ProductId`)
REFERENCES `products` (`Id`)).

Based on the above example, we now introduce the following existing database terminology:

  • The constraint discussed above is known as a Foreign Key. In our example, the table Orders has two foreign keys:

    • A value in its column customerId should match a value in the column Id of the table Customers.

    • A value in its column ProductId should match a value in the column Id of the table Products.

  • The tables Customers and Products are parent tables of these foreign keys.

  • The table Orders is the child table of these foreign keys.

Note

Note that the definition of Foreign Keys given in Wikipedia allows for more complex relations. For instance, non-key columns in parent tables can be referenced. Also, the key referenced may consist of multiple columns. In this article, we restrict ourselves to just a single column, and in the parent table that is the key column.

When a customer is removed from the table Customers, when there are still orders for that customer, we will get an inconsistency as well.

Given the above foreign constraints, there are at least two options available to the database when attempting to delete a customer with existing orders:

  1. Forbid that deletion and issue an appropriate error message.

  2. Realize that deletion, and delete the corresponding orders as well. This is known as cascading deletes.

Tip

It is generally considered to be good practice to enforce all Foreign Key relations; as it will enhance the quality of the data. Note that by enforcing these constraints in the database, any application that reads and modifies data in that database needs to adhere to these constraints. This is also true for AIMMS applications!

Available strategies

An AIMMS write to table statement will delete, update and insert some rows in a database table. This can be implemented using the SQL statements DELETE, UPDATE, and INSERT.

  1. Strategy A:

    1. Determine the rows that are already in the database table

    2. Delete the existing rows no longer relevant

    3. Update the existing rows still relevant with new data

    4. Insert the new rows together with their data.

  2. Strategy B:

    1. Delete all old rows in the table

    2. Insert rows as there is information

Clearly, Strategy A looks more complicated and time-consuming. It can be more time consuming, as it needs to read a potentially large amount of data before the table is actually modified. To understand why this strategy is still needed, we need to take a close look at their behavior in combination with foreign keys.

Choice of strategy and consequences for safety and efficiency

Let’s get back to the strategies introduced at the beginning of the previous section.

As an example, consider the operation to change the address of a customer, using the two strategies:

  • Using Strategy B, we first delete that customer and then recreate it using an insert statement. By doing a delete first in the presence of orders for that customer, depending on the type of foreign constraint, this will either be forbidden or it will lead to cascading deletes of orders. Either way, that is undesired behavior for an operation like changing the address.

  • Using Strategy A, in the end, the only modification is done is just a SQL UPDATE of that customer. There is no effect on the table Orders, which is desired. Therefore, even though this strategy may be less efficient, it is safe.

When the table at hand is a parent table in a Foreign Key constraint, then the safe Strategy A is preferred. Otherwise, the efficient Strategy B is preferred.

AIMMS uses the knowledge of whether Foreign Keys are present or not based on the values of two options: Database_foreign_key_handling and Database_string_valued_foreign_keys, according to the following table:

Foreign key presence

Presence

Strategy

Advantage

Consquence when assumption invalid

Yes

A

Safe

Less efficient strategy used

No

B

Efficient

Data loss

Are Foreign Keys constraints active on the table to be written?

When writing to a table it is important to know whether the table at hand is used in a Foreign Key constraint:

  1. As a parent table, see strategy discussion above.

    When the option database_foreign_key_handling is set to:

    • 'check' The ODBC function SQLForeignKeys is used to determine this.

    • 'ignore' AIMMS assumes that the table is not used as a parent table in a Foreign Key constraint.

    • 'assume' AIMMS assumes that the table is used as a parent table in a Foreign Key constraint.

    The default of the option database_foreign_key_handling is 'check'.

  2. As a child table, if so, empty strings are written as NULL’s. So this information is only relevant if your database schema has string valued keys.

    When the option database_string_valued_foreign_keys is set to:

    • 'check' The ODBC function SQLForeignKeys is used to determine this.

    • 'ignore' AIMMS assumes that the table is not used as a child table in a Foreign Key constraint.

    • 'assume' AIMMS assumes that the table is used as a child table in a Foreign Key constraint.

    The default of the option database_foreign_key_handling is 'check' up to AIMMS 4.73 and is 'ignore' for AIMMS 4.74 and upwards.

Pros and cons of the setting ‘check’

The setting 'check' has some clear advantages in terms of ease of use:

  • It does not require the model builder to dive into the concept of Foreign Keys.

  • Nor does it require the model builder to check the schema whether any of the tables at hand are used as a parent table in a Foreign Key constraint.

On the other hand, obtaining metadata via the ODBC function SQLForeignKeys from a database can be rather time-consuming. This depends on the database vendor and the complexity of the database schema. Thereby this initial overhead can be significant in the overall performance if there are several tables to be written, and for each table only one or a few rows to be persisted.

Database schema design best practices

In this section, a practice is suggested to safely and efficiently write the data to the application database.

The good practice of specifying Foreign Keys is assumed but limited to single keys in Parent Tables. Writing strategies in the presence of more complex Foreign Keys is a topic for another article.

To do so, we divide the schema into two layers of metadata:

  1. The key data,

  2. The attribute data

Key data

The “key data” are tables that correspond to the sets and one-dimensional parameters declared over these sets. These tables are present as Parent Tables in Foreign Key relations. Examples are:

  • Assets, a parameter can indicate:

    • Type,

    • Location

  • Periods,

  • Customers

  • Products

When these tables have derived columns, they can also appear as child table in a Foreign Key. In our Customer-Order example, both tables Customers and Orders are key tables. You may recall that in the Foreign Key of that example, Customers is the parent table, and Orders is the child table.

When writing to these tables, it is important that data of other tables remain intact and that the writing operations succeed whenever possible. This is achieved by setting the option Database_foreign_key_handling to 'Assume' and therefore have the required safe writing strategy.

In our example, tables are best written to with the options Database_foreign_key_handling and Database_string_valued_foreign_keys set to 'Assume' and 'Ignore' respectively, as follows:

1
2
3
4
5
6
7
block where database_foreign_key_handling := 'assume',
            database_string_valued_foreign_keys := 'ignore'  ;

    write to table db_Customers ;
    write to table db_Projects ;

endblock ;

The following remarks apply to this code;

  • By using a block statement, the options are only set in the respective code portion, and the remainder of the application is left untouched. See article setting options

  • As integer keys are used in our example, the option database_string_valued_foreign_keys can be set to 'ignore'.

  • Efficiency is less important than correct behaviour, so we remove this from our considerations.

Attribute data

The actual data, for instance, how much of which product is bought by which customer and when. These tables can be a part of foreign key constraints only as child tables. It is, therefore, safe to use efficient Strategy B for writing to these tables.

1
2
3
4
5
6
block where database_foreign_key_handling := 'ignore',
            database_string_valued_foreign_keys := 'ignore'  ;

    write to table db_Orders ;

endblock ;

Note

Like key tables, the foreign keys of these tables only refer to keys in key tables.