SQL foreign key rules. FOREIGN KEY constraints. Natural and surrogate keys

Databases are used in almost every field: banking and finance, tourism, warehousing, manufacturing, and education. A relational database is a collection of tables with clearly defined properties that are subject to strict requirements. In relational database theory, tables are called relations.

What is a primary key in a database

The primary key of a table is the column (or set of columns) that uniquely identifies each of its rows. Let's look at an example. Imagine a simple relation of university students (let's call it "Students").

We need to identify each student uniquely by a single column, so the values in that column must be unique for every record. The available data do not allow this, because namesakes (students with the same surname and first name) can study in the same year and in the same faculty. The primary key is what lets us pinpoint the required row in the relation. Most often it is a numeric field whose value increases automatically as each record is entered (an auto-incrementing identifier column).
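As a rough illustration of the auto-incrementing identifier described above, the sketch below uses Python's built-in sqlite3 module with an in-memory database. SQLite is only a stand-in for whatever DBMS is actually used, and the column names (Id, FullName) and the sample students are assumptions made for the example.

```python
import sqlite3

def add_students(names):
    """Insert students and return (Id, FullName) rows; the DBMS assigns the ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Students ("
        " Id INTEGER PRIMARY KEY AUTOINCREMENT,"  # surrogate identifier column
        " FullName TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO Students (FullName) VALUES (?)",
        [(name,) for name in names],
    )
    rows = conn.execute(
        "SELECT Id, FullName FROM Students ORDER BY Id"
    ).fetchall()
    conn.close()
    return rows

# Two namesakes (hypothetical data) still receive different identifiers.
rows = add_students(["Ivanov Ivan", "Ivanov Ivan", "Sergeev Viktor"])
print(rows)
```

The two students named "Ivanov Ivan" cannot be told apart by name, but each row can still be addressed by its own Id.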

Simple and Composite Primary Key

A primary key can be simple or composite. If the uniqueness of a record is determined by the value of a single field, as described above, we are dealing with a simple key. A composite key is a primary key that consists of two or more fields. Consider the following relation of bank customers.

Full name	Date of birth	Passport series	Passport number
Ivanov P.A.	12.05.1996	75	0553009
Sergeev V.T.	14.07.1958	71	4100654
Krasnov L.V.	22.01.2001	73	1265165

Different passports can share the same series or the same number, but no two passports have the same combination of series and number. The fields "Passport series" and "Passport number" therefore form a composite key of this relation that uniquely identifies a person.
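A minimal sketch of such a composite key, using Python's built-in sqlite3 module as a stand-in DBMS. The table and column names are assumptions, and the second and third rows are hypothetical data added only to show which inserts are accepted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE BankClients (
        FullName       TEXT NOT NULL,
        BirthDate      TEXT NOT NULL,
        PassportSeries TEXT NOT NULL,
        PassportNumber TEXT NOT NULL,
        PRIMARY KEY (PassportSeries, PassportNumber)  -- composite key
    )
""")
insert = "INSERT INTO BankClients VALUES (?, ?, ?, ?)"

def try_insert(row):
    """Return True if the row was accepted, False if the key rejected it."""
    try:
        conn.execute(insert, row)
        return True
    except sqlite3.IntegrityError:
        return False

first = try_insert(("Ivanov P.A.", "1996-05-12", "75", "0553009"))
# Hypothetical client: same series, different number, so the pair is new.
same_series = try_insert(("Petrov A.A.", "1980-01-01", "75", "1111111"))
# Hypothetical client: same series AND number as the first row.
duplicate = try_insert(("Sidorov B.B.", "1970-01-01", "75", "0553009"))
print(first, same_series, duplicate)
```

Only the third insert is refused: a repeated series alone is fine, a repeated series and number pair is not.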

Links between relations

So, the primary key is one or more columns of a table that uniquely identify a row of the relation. What is it for?

Let's go back to the first example with the "Students" relation. Besides this relation, the database stores other information, for example each student's academic performance. To avoid repeating information that the database already contains, the other tables store only the key that refers to the required record. It looks like this.

In both relations of the example we see an ID field. These fields are the primary keys of their tables. As you can see, the performance table contains only references to these fields in the other tables, without repeating the information stored there.

Natural and surrogate key

How is the primary key of a table chosen? The two examples we have considered, "Students" and "Bank customers", illustrate the concepts of natural and surrogate keys. In the bank customers table we defined a key consisting of the "Passport series" and "Passport number" fields, using columns that already existed. Such a key is called natural: we made no changes or additions to obtain it. In the "Students" relation, no field or combination of fields gave us uniqueness, which forced us to introduce an additional field holding a student code. Such a key is called a surrogate key: we added a service column to the table that carries no meaningful information of its own and serves only to identify records.

Foreign key and data integrity in the database

All of the above leads us to foreign keys and database integrity. A foreign key is a field that refers to the primary key of another relation. In the table of grades these are the "Student" and "Discipline" columns, whose values refer us to external tables. That is, the "Student" field is a foreign key in the "Performance" relation and corresponds to the primary key of the "Students" relation.

An important principle of database design is integrity, and one of its rules is referential integrity. It means that a table's foreign key cannot refer to a nonexistent primary key of another relation. You cannot delete the record with code 1000 (Ivanov Ivan) from the "Students" relation while a record in the grades table refers to it. In a properly built database, the attempt fails with an error saying that the record is in use.
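The rejected deletion can be sketched as follows, again with Python's sqlite3 module standing in for the DBMS. The column names and the grade value are assumptions. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON, which is specific to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement
conn.executescript("""
    CREATE TABLE Students (Id INTEGER PRIMARY KEY, FullName TEXT NOT NULL);
    CREATE TABLE Performance (
        Id        INTEGER PRIMARY KEY,
        StudentId INTEGER NOT NULL REFERENCES Students (Id),
        Grade     INTEGER NOT NULL
    );
    INSERT INTO Students VALUES (1000, 'Ivanov Ivan');
    INSERT INTO Performance (StudentId, Grade) VALUES (1000, 5);
""")

def delete_student(student_id):
    """Try to delete a student; report whether the DBMS allowed it."""
    try:
        conn.execute("DELETE FROM Students WHERE Id = ?", (student_id,))
        return "deleted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

result = delete_student(1000)
remaining = conn.execute("SELECT COUNT(*) FROM Students").fetchone()[0]
print(result, remaining)
```

The student row survives because a row in Performance still refers to it.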

There are other groups of integrity rules as well as other database constraints that deserve attention and should be taken into account by developers.

The FOREIGN KEY constraint is used to restrict references.
When all the values in one field of a table are represented in a field of another table, the first field is said to refer to the second. This indicates a direct relationship between the values of the two fields.

When one field in a table refers to another, it is called a foreign key, and the field it refers to is called the parent key. The names of the foreign key and the parent key do not have to be the same. A foreign key can consist of any number of fields, all of which are treated as a single unit. The foreign key and the parent key it refers to must have the same number of fields, of the same types, in the same order. When a field is a foreign key, it is tied in a specific way to the table it refers to: each value (each row) of the foreign key must unambiguously refer to one and only one value (row) of the parent key. If this condition is met, the database is in a state of referential integrity.

SQL maintains referential integrity with the FOREIGN KEY constraint. Its job is to restrict the values that can be entered into the database so that the foreign key and the parent key conform to the principle of referential integrity. One effect of the FOREIGN KEY constraint is that it rejects values for the constrained fields that are not already present in the parent key. The constraint also affects the ability to change or delete parent key values.

The FOREIGN KEY constraint is used in a CREATE TABLE command (or in ALTER TABLE, which modifies the structure of a table) that contains the field being declared a foreign key. The parent key is named in the REFERENCES clause of the FOREIGN KEY constraint.

Like most constraints, it can be a table constraint or a column constraint; the table form allows multiple fields to be used as a single foreign key.

FOREIGN KEY table constraint syntax:

FOREIGN KEY (<column list>) REFERENCES <pktable> [(<column list>)]

The first column list is a comma-separated list of one or more columns of the table that is being created or modified by this command.

Pktable is the table containing the parent key. It can be the same table that the current command creates or modifies.

The second column list names the columns that make up the parent key. The two column lists must be compatible, that is:

they must have the same number of columns;
the first, second, third, and subsequent columns of the foreign key list must have the same data types and sizes as the first, second, third, and subsequent columns of the parent key list;
the columns in the two lists do not need to have the same names.
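The compatibility rules above can be illustrated with a two-column foreign key. This is a sketch under assumptions: Python's sqlite3 module stands in for the DBMS, and the Passports and Accounts tables with their columns are hypothetical. The two column lists have the same length and order but different names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement
conn.executescript("""
    CREATE TABLE Passports (
        Series TEXT NOT NULL,
        Number TEXT NOT NULL,
        PRIMARY KEY (Series, Number)           -- two-column parent key
    );
    CREATE TABLE Accounts (
        AccountId    INTEGER PRIMARY KEY,
        HolderSeries TEXT NOT NULL,
        HolderNumber TEXT NOT NULL,
        FOREIGN KEY (HolderSeries, HolderNumber)
            REFERENCES Passports (Series, Number)
    );
    INSERT INTO Passports VALUES ('75', '0553009');
""")

def open_account(series, number):
    """Return True if the (series, number) pair matches a parent row."""
    try:
        conn.execute(
            "INSERT INTO Accounts (HolderSeries, HolderNumber) VALUES (?, ?)",
            (series, number),
        )
        return True
    except sqlite3.IntegrityError:
        return False

matched = open_account("75", "0553009")  # the whole pair exists in Passports
partial = open_account("75", "4100654")  # series exists, but not this pair
print(matched, partial)
```

The foreign key is checked as a single unit: matching only one of the two columns is not enough.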

FOREIGN KEY Example 1

CREATE TABLE Student (
Kod_stud integer NOT NULL PRIMARY KEY,
Kod_spec integer NOT NULL,
Adres char (50),
Ball decimal,
-- table constraint: every Kod_spec value must exist in Spec.Kod_spec
FOREIGN KEY (Kod_spec) REFERENCES Spec (Kod_spec)
);

When the FOREIGN KEY constraint is added with ALTER TABLE instead of CREATE TABLE, the values already stored in the foreign key and the parent key must satisfy referential integrity. Otherwise, the command is rejected.
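Before adding a constraint to tables that already hold data, it can help to look for rows that would violate it. The sketch below is one possible check, not part of the original text: it uses Python's sqlite3 module as a stand-in DBMS, reuses the Student and Spec names from Example 1 with reduced column sets, and the sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Spec (Kod_spec INTEGER PRIMARY KEY);
    CREATE TABLE Student (Kod_stud INTEGER PRIMARY KEY, Kod_spec INTEGER);
    INSERT INTO Spec VALUES (1), (2);
    INSERT INTO Student VALUES (10, 1), (11, 2), (12, 7);  -- 7 has no parent
""")

# Rows whose Kod_spec is set but matches nothing in Spec are orphans.
orphans = conn.execute("""
    SELECT s.Kod_stud, s.Kod_spec
    FROM Student AS s
    LEFT JOIN Spec AS p ON p.Kod_spec = s.Kod_spec
    WHERE s.Kod_spec IS NOT NULL AND p.Kod_spec IS NULL
""").fetchall()
print(orphans)
```

An empty result suggests the existing data already satisfies referential integrity; any row returned would have to be fixed first.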

With either the table or the column form of the FOREIGN KEY constraint, you can omit the column list of the parent key if the parent key has a PRIMARY KEY constraint. Naturally, for multi-field keys the order of the columns in the foreign key and the primary key must match, and in any case the principle of compatibility between the two keys still applies.

FOREIGN KEY Example 2

CREATE TABLE Student (
Kod_stud integer NOT NULL PRIMARY KEY,
Fam char (30) NOT NULL UNIQUE,
Adres char (50),
Ball decimal,
-- column constraint: the parent column list is omitted, so the primary key of Spec is used
Kod_spec integer REFERENCES Spec
);

Maintaining referential integrity places some restrictions on the values that may appear in fields declared as a foreign key and as a parent key. The parent key must be structured so that each foreign key value matches exactly one row. This means that the parent key must be unique and must not contain NULL values.

It is not enough for the parent key merely to satisfy these requirements at the moment the foreign key is declared. SQL must be sure that no duplicate or NULL values can be entered into the parent key later. Therefore, make sure that every field used as a parent key has either a PRIMARY KEY constraint or a UNIQUE constraint together with a NOT NULL constraint.
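What happens when the parent key carries neither constraint depends on the DBMS. The sketch below shows SQLite's behavior only, via Python's sqlite3 module: SQLite accepts the table definitions and reports the problem when the child table is modified, while other systems may reject the declaration earlier. The Title column and the sample values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement
conn.executescript("""
    -- The parent column has neither PRIMARY KEY nor UNIQUE.
    CREATE TABLE Spec (Kod_spec INTEGER NOT NULL, Title TEXT);
    CREATE TABLE Student (
        Kod_stud INTEGER PRIMARY KEY,
        Kod_spec INTEGER REFERENCES Spec (Kod_spec)
    );
    INSERT INTO Spec VALUES (1, 'Physics');
""")

try:
    conn.execute("INSERT INTO Student VALUES (10, 1)")
    error = None
except sqlite3.Error as exc:  # base class; the exact subclass is not relied on
    error = str(exc)
print(error)
```

Declaring Kod_spec in Spec as PRIMARY KEY (or UNIQUE NOT NULL) would make the same insert succeed.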

Having foreign keys reference only primary keys is a good strategy. When you use a foreign key, you link it not simply to the parent key it references but to the specific row of the table in which that parent key is found. By itself, the parent key provides no information that is not already present in the foreign key.

Since the purpose of a primary key is to identify rows uniquely, it is the more logical and less ambiguous target for a foreign key. For any foreign key that uses a unique key as its parent key, you could create a foreign key that uses the primary key of the same table to the same effect. A foreign key that has no purpose other than linking rows resembles a primary key used solely for identifying rows, and it is a good way to keep the database structure clear and simple. A foreign key can contain only values that are actually present in the parent key, or NULL. Attempts to enter any other value into the key are rejected.
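The rule about permitted foreign key values can be checked with a short sketch. It assumes Python's sqlite3 module as a stand-in DBMS and reuses the Student and Spec names from the examples above with reduced column sets; the inserted values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement
conn.executescript("""
    CREATE TABLE Spec (Kod_spec INTEGER PRIMARY KEY);
    CREATE TABLE Student (
        Kod_stud INTEGER PRIMARY KEY,
        Kod_spec INTEGER REFERENCES Spec   -- nullable foreign key
    );
    INSERT INTO Spec VALUES (1);
""")

def enroll(kod_stud, kod_spec):
    """Return True if the row was accepted by the foreign key constraint."""
    try:
        conn.execute("INSERT INTO Student VALUES (?, ?)", (kod_stud, kod_spec))
        return True
    except sqlite3.IntegrityError:
        return False

results = {
    "existing parent": enroll(10, 1),
    "NULL": enroll(11, None),
    "missing parent": enroll(12, 99),
}
print(results)
```

A value present in the parent key and NULL are both accepted; a value with no matching parent row is rejected.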

FOREIGN KEY Example 3

CREATE TABLE payment (
sh_payout integer,
sh_eml integer,
date_payout date,
summ_payout real,
FOREIGN KEY (sh_eml) REFERENCES k_sotr2 (eid)
);

In this example, the FOREIGN KEY constraint links the sh_eml column to the eid column of the k_sotr2 table.

Databases are electronic repositories of information that are accessed from one or more computers. Usually, a database is created to store and provide access to data about a certain subject area, that is, a certain area of human activity or part of the real world.

A DBMS (database management system) is software for creating, populating, updating, and deleting databases.

The unit of information stored in a database is a table. Each table is a collection of rows and columns, where each row holds information about one instance of an object, a specific event, or a phenomenon, and the columns correspond to its attributes (features, characteristics, parameters).

In terms of a database, the columns of a table are called fields, and its rows are called records.

Relationships can exist between individual database tables, that is, information in one table can supplement information in another. Databases whose tables are linked in this way are called relational.

Tables linked by relationships interact on a master-child basis. One and the same table can be the master with respect to one table of the database and a child with respect to another.

An object is something that exists and can be distinguished by a set of properties. One object differs from another in the specific values of its properties.

An entity is the reflection of an object in the memory of a person or a computer.

An attribute is a specific value of one of an entity's properties.

A field is a single element of a record that stores a specific attribute value.

A link field is the field by which two tables are linked.

Primary and secondary keys

Each database table can have a primary key: a field or a set of fields that uniquely identifies a record.

The primary key value in the database table must be unique, that is, there must not be two or more records with the same primary key value in the table.

Primary keys make it easier to establish relationships between tables. Because a primary key must be unique, not every field of a table is suitable for it.

If a table has no fields with unique values, an additional numeric field is usually added to serve as the primary key, and the DBMS assigns its values itself.

Secondary keys are defined on fields that are often used for searching or sorting: indexes built on secondary keys help the system find the values stored in those fields much faster.

Unlike primary keys, secondary key fields may not contain unique information.
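A secondary key of this kind is typically backed by a non-unique index. The following sketch assumes Python's sqlite3 module as a stand-in DBMS; the index name idx_students_surname, the Surname column, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (
        Id      INTEGER PRIMARY KEY,   -- primary key: values must be unique
        Surname TEXT NOT NULL          -- secondary key: values may repeat
    );
    CREATE INDEX idx_students_surname ON Students (Surname);
    INSERT INTO Students (Surname) VALUES ('Ivanov'), ('Ivanov'), ('Sergeev');
""")

# The index does not forbid duplicates; both namesakes are found.
namesakes = conn.execute(
    "SELECT Id FROM Students WHERE Surname = ? ORDER BY Id", ("Ivanov",)
).fetchall()

# Ask SQLite how it would run the lookup; the plan text varies by version.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT Id FROM Students WHERE Surname = ?", ("Ivanov",)
).fetchall()
print(namesakes)
print(plan)
```

The plan output is printed only for inspection, since its wording is not stable across SQLite versions.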

Relationships between tables

One to one. A one-to-one relationship occurs when one record in the parent table matches one record in the child table.

This relationship is much less common than one-to-many. It is used when you do not want a table to become bloated with secondary information. The drawback is that reading related information from several tables requires several read operations, which slows down retrieval. In addition, databases that include tables with a one-to-one relationship cannot be considered fully normalized.

Like a one-to-many relationship, a one-to-one relationship can be rigid or non-rigid.
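One common way to obtain a one-to-one relationship is to make the child's foreign key unique, for example by using it as the child's primary key. The sketch below shows that approach under assumptions: Python's sqlite3 module stands in for the DBMS, and the StudentDetails table, its columns, and the sample data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement
conn.executescript("""
    CREATE TABLE Students (Id INTEGER PRIMARY KEY, FullName TEXT NOT NULL);
    -- The child's primary key is also its foreign key, so each student
    -- can have at most one details row.
    CREATE TABLE StudentDetails (
        StudentId INTEGER PRIMARY KEY REFERENCES Students (Id),
        Address   TEXT
    );
    INSERT INTO Students VALUES (1, 'Ivanov Ivan');
    INSERT INTO StudentDetails VALUES (1, 'Dormitory 2');
""")

def add_details(student_id, address):
    """Return True if a details row could be added for the student."""
    try:
        conn.execute(
            "INSERT INTO StudentDetails VALUES (?, ?)", (student_id, address)
        )
        return True
    except sqlite3.IntegrityError:
        return False

second_row = add_details(1, "Another address")  # student 1 already has details
orphan_row = add_details(2, "No such student")  # no parent row with Id 2

# Reading the combined record needs a join across both tables.
combined = conn.execute("""
    SELECT s.FullName, d.Address
    FROM Students AS s JOIN StudentDetails AS d ON d.StudentId = s.Id
""").fetchall()
print(second_row, orphan_row, combined)
```

The join in the last query is the extra read cost that the text above mentions.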

Applies to: SQL Server (starting with 2016), Azure SQL Database, Azure SQL Data Warehouse, Parallel Data Warehouse

Primary and foreign keys are two types of constraints that can be used to enforce the integrity of data in SQL Server tables. These are important database objects.

Primary key constraints

Typically, a table has a column or a combination of columns whose values uniquely identify each row in the table. This column, or set of columns, is called the primary key (PK) of the table and enforces the table's entity integrity. Because primary key constraints guarantee unique data, they are often defined on an identity column.

When you set a primary key constraint on a table, the Database Engine enforces data uniqueness by automatically creating a unique index on the primary key columns. This index also provides fast access to the data when the primary key is used in queries. If a primary key constraint is defined on more than one column, values may be duplicated within a single column, but each combination of values from all the columns in the primary key constraint definition must be unique.

As shown in the following figure, the ProductID and VendorID columns in the Purchasing.ProductVendor table form a composite primary key constraint for that table. This guarantees that every row in the ProductVendor table has a unique combination of ProductID and VendorID, which prevents duplicate rows from being inserted.

Only one primary key constraint can exist in a table.

The primary key cannot have more than 16 columns, and the total key length cannot exceed 900 bytes.

An index created by a primary key constraint cannot cause the number of indexes on the table to exceed 999 nonclustered indexes and 1 clustered index.

If the primary key constraint does not specify whether the index is clustered or nonclustered, a clustered index is created if the table does not already have one.

All columns in a primary key constraint must be defined as NOT NULL. If nullability is not specified, all columns in the primary key constraint are set to NOT NULL.

If a primary key is defined on a column of a CLR user-defined type, the implementation of that type must support binary ordering.

A foreign key (FK) is a column or combination of columns used to establish and enforce a link between the data in two tables, in order to control the data that can be stored in the foreign key table. In a foreign key reference, a link is created between two tables when the column or columns that hold the primary key value for one table are referenced by the column or columns in another table. That column becomes a foreign key in the second table.

For example, the Sales.SalesOrderHeader table is linked to the Sales.SalesPerson table by a foreign key because there is a logical relationship between sales orders and salespeople. The SalesPersonID column in the Sales.SalesOrderHeader table matches the primary key column of the SalesPerson table, so SalesPersonID in Sales.SalesOrderHeader is a foreign key to the SalesPerson table. Once this foreign key relationship is established, a value for SalesPersonID cannot be inserted into the SalesOrderHeader table if it does not already exist in the SalesPerson table.

A table can reference at most 253 other tables and columns as foreign keys (outgoing references). SQL Server 2016 raises the limit on the number of other tables and columns that can reference columns in a single table (incoming references) from 253 to 10,000. (This requires at least compatibility level 130.) The increase has the following limitations:

More than 253 foreign key references are supported only for DELETE DML operations. UPDATE and MERGE operations are not supported.

More than 253 foreign key references are currently not available for columnstore indexes, memory-optimized tables, Stretch Database, or partitioned foreign key tables.

Indexes in Foreign Key Constraints

Unlike a primary key constraint, a foreign key constraint does not automatically create a corresponding index. However, creating an index on a foreign key manually is often useful, for the following reasons:

Foreign key columns are frequently used in join criteria when data from related tables is combined in queries, by matching the column or columns in the foreign key constraint of one table with the primary or unique key column or columns of the other table. An index lets the Database Engine quickly find related data in the foreign key table. Creating this index is not required, however. Data from two related tables can be combined even if no primary key or foreign key constraints are defined between them, but a foreign key relationship indicates that the two tables have been optimized to be combined in a query that uses the keys as its criteria.

Changes to primary key constraints are checked against the foreign key constraints in related tables.

Referential integrity

The main purpose of a foreign key constraint is to control the data that can be stored in the foreign key table, but it also controls changes to the data in the primary key table. For example, if the row for a salesperson is deleted from the Sales.SalesPerson table and that salesperson's ID is used in sales orders in the Sales.SalesOrderHeader table, the referential integrity between the two tables is broken. The deleted salesperson's orders in the SalesOrderHeader table are orphaned, with no link to the data in the SalesPerson table.

The foreign key constraint prevents this situation. It enforces referential integrity by disallowing changes to data in the primary key table that would invalidate a reference in the foreign key table. If an attempt is made to delete a row in the primary key table or to change a primary key value, and the deleted or changed value corresponds to a value in the foreign key constraint of another table, the action fails. To successfully change or delete a row that is referenced by a foreign key constraint, you must first either delete the foreign key data in the foreign key table or change it so that the foreign key links to different primary key data.
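The order of operations described above can be sketched as follows. This is not SQL Server code: Python's sqlite3 module stands in for the Database Engine, the two tables are reduced to the key columns, and the IDs are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement
conn.executescript("""
    CREATE TABLE SalesPerson (SalesPersonID INTEGER PRIMARY KEY);
    CREATE TABLE SalesOrderHeader (
        SalesOrderID  INTEGER PRIMARY KEY,
        SalesPersonID INTEGER REFERENCES SalesPerson (SalesPersonID)
    );
    INSERT INTO SalesPerson VALUES (1), (2);
    INSERT INTO SalesOrderHeader VALUES (100, 1), (101, 1);
""")

def delete_salesperson(person_id):
    """Return True if the salesperson row could be deleted."""
    try:
        conn.execute(
            "DELETE FROM SalesPerson WHERE SalesPersonID = ?", (person_id,)
        )
        return True
    except sqlite3.IntegrityError:
        return False

first_attempt = delete_salesperson(1)   # still referenced by two orders

# Point the orders at another salesperson, then retry the delete.
conn.execute(
    "UPDATE SalesOrderHeader SET SalesPersonID = 2 WHERE SalesPersonID = 1"
)
second_attempt = delete_salesperson(1)  # no references remain
print(first_attempt, second_attempt)
```

Deleting the two order rows instead of reassigning them would unblock the delete in the same way.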

Cascading referential integrity

By using cascading referential integrity constraints, you can define the actions the Database Engine takes when a user attempts to delete or update a key that is still referenced by existing foreign keys. The following cascading actions can be defined.

NO ACTION
The Database Engine generates an error and then rolls back the delete or update operation on a row in the parent table.

CASCADE
Corresponding rows are updated or deleted in the referencing table when that row is updated or deleted in the parent table. CASCADE cannot be specified if a column of type timestamp is part of either the foreign key or the referenced key. ON DELETE CASCADE cannot be specified for a table that has an INSTEAD OF DELETE trigger. ON UPDATE CASCADE cannot be specified for tables that have INSTEAD OF UPDATE triggers.

SET NULL
All the values that make up the foreign key are set to NULL when the corresponding row in the parent table is updated or deleted. The foreign key columns must be nullable for this constraint to execute. Cannot be specified for tables that have INSTEAD OF UPDATE triggers.

SET DEFAULT
All the values that make up the foreign key are set to their default values when the corresponding row in the parent table is updated or deleted. For this constraint to execute, all foreign key columns must have default definitions. If a column is nullable and no explicit default is defined, NULL becomes the implicit default of the column. Cannot be specified for tables that have INSTEAD OF UPDATE triggers.

CASCADE, SET NULL, SET DEFAULT, and NO ACTION can be combined on tables that have referential relationships with each other. If the Database Engine encounters NO ACTION, it stops and rolls back the related CASCADE, SET NULL, and SET DEFAULT actions. When a DELETE statement causes a combination of CASCADE, SET NULL, SET DEFAULT, and NO ACTION actions, all the CASCADE, SET NULL, and SET DEFAULT actions are applied before the Database Engine checks for any NO ACTION.
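Of the actions listed above, SET DEFAULT is perhaps the least familiar, so here is a small sketch of it. It runs on SQLite through Python's sqlite3 module rather than on SQL Server, and the reduced tables, the placeholder row with ID 0, and the names are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement
conn.executescript("""
    CREATE TABLE SalesPerson (SalesPersonID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE SalesOrderHeader (
        SalesOrderID  INTEGER PRIMARY KEY,
        SalesPersonID INTEGER NOT NULL DEFAULT 0
            REFERENCES SalesPerson (SalesPersonID) ON DELETE SET DEFAULT
    );
    -- Row 0 is a hypothetical placeholder that the default value points to.
    INSERT INTO SalesPerson VALUES (0, 'Unassigned'), (7, 'Hypothetical Seller');
    INSERT INTO SalesOrderHeader VALUES (100, 7);
    DELETE FROM SalesPerson WHERE SalesPersonID = 7;
""")

owner = conn.execute(
    "SELECT SalesPersonID FROM SalesOrderHeader WHERE SalesOrderID = 100"
).fetchone()[0]
print(owner)
```

The default value must itself satisfy the foreign key, which is why the placeholder row with ID 0 exists in the parent table.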

Triggers and Cascading Reference Actions

Cascading referential actions fire the AFTER UPDATE or AFTER DELETE triggers in the following manner:

All the cascading referential actions directly caused by the original DELETE or UPDATE statement are performed first.

If any AFTER triggers are defined on the affected tables, they fire after all the cascading actions have been performed. These triggers fire in the reverse order of the cascading actions. If several triggers are defined on the same table, they fire in random order, unless a dedicated first or last trigger has been designated for the table. That order is set by the corresponding system procedure.

If several cascading chains originate from the table that was the direct target of the DELETE or UPDATE action, the order in which these chains fire their triggers is unspecified. However, one chain always fires all of its triggers before another chain starts firing.

An AFTER trigger on the table that was the direct target of a DELETE or UPDATE action fires regardless of whether any rows were affected. In that case, no other tables are affected by cascading.

If any of the previous triggers performs DELETE or UPDATE operations on other tables, those operations can start secondary cascading chains. The secondary chains are processed for each DELETE or UPDATE operation, one at a time, after all the triggers on all the primary chains have fired. This process can repeat recursively for subsequent DELETE or UPDATE operations.

Performing CREATE, ALTER, DROP, or other data definition language (DDL) operations inside the triggers can cause DDL triggers to fire. These can in turn perform DELETE or UPDATE operations that start additional cascading chains and triggers.

If an error occurs in a particular cascading referential action chain, no AFTER triggers fire in that chain, and the DELETE or UPDATE operation that created the chain is rolled back.

A table that has an INSTEAD OF trigger cannot also have a REFERENCES clause that specifies a cascading action. However, an AFTER trigger on a table targeted by a cascading action can execute an INSERT, UPDATE, or DELETE statement on another table or view that fires an INSTEAD OF trigger defined on that object.

The following table lists common tasks related to primary key and foreign key constraints.

Last update: 27.04.2019

Foreign keys allow you to establish relationships between tables. The foreign key is set for columns from the dependent, subordinate table, and points to one of the columns from the main table. Typically, a foreign key points to a primary key from a related master table.

The general syntax for setting a foreign key at the table level is:

FOREIGN KEY (column1, column2, ... columnN) REFERENCES master_table (column_main_table1, column_main_table2, ... column_main_tableN)

To create a foreign key constraint, specify after FOREIGN KEY the column of the table that will act as the foreign key. The REFERENCES keyword is followed by the name of the related table and then, in parentheses, the name of the related column that the foreign key points to. The REFERENCES clause can be followed by ON DELETE and ON UPDATE clauses, which specify the action to take when a row of the main table is deleted or updated, respectively.

For example, let's define two tables and link them using a foreign key:

CREATE TABLE Customers
(
    Id INT PRIMARY KEY AUTO_INCREMENT,
    Age INT,
    FirstName VARCHAR(20) NOT NULL,
    LastName VARCHAR(20) NOT NULL,
    Phone VARCHAR(20) NOT NULL UNIQUE
);

CREATE TABLE Orders
(
    Id INT PRIMARY KEY AUTO_INCREMENT,
    CustomerId INT,
    CreatedAt DATE,
    FOREIGN KEY (CustomerId) REFERENCES Customers (Id)
);

Here the Customers and Orders tables are defined. Customers is the main table and represents a customer. Orders is the dependent table and represents an order placed by a customer. Orders is linked to Customers through its CustomerId column: CustomerId is a foreign key that points to the Id column of the Customers table.

The CONSTRAINT keyword can be used to give the foreign key constraint a name:

CREATE TABLE Orders
(
    Id INT PRIMARY KEY AUTO_INCREMENT,
    CustomerId INT,
    CreatedAt DATE,
    CONSTRAINT orders_customers_fk
    FOREIGN KEY (CustomerId) REFERENCES Customers (Id)
);

ON DELETE and ON UPDATE

The ON DELETE and ON UPDATE clauses set the action to take when a related row in the main table is deleted or modified, respectively. The following options can be used as the action:

CASCADE: automatically deletes or modifies rows in the dependent table when the related rows in the main table are deleted or modified.

SET NULL: sets the foreign key column to NULL when the related row in the main table is deleted or updated. (The foreign key column must allow NULL in this case.)

RESTRICT: rejects the deletion or modification of rows in the main table if related rows exist in the dependent table.

NO ACTION: in MySQL, the same as RESTRICT.

SET DEFAULT: when the related row is deleted from the main table, sets the foreign key column to its default value, which is defined with the DEFAULT attribute. Although this option is available in principle, the InnoDB engine does not support it.

Cascade deletion

Cascading delete allows you to automatically delete all related rows from the dependent table when you delete a row from the master table. For this, the CASCADE option is used:

CREATE TABLE Orders
(
    Id INT PRIMARY KEY AUTO_INCREMENT,
    CustomerId INT,
    CreatedAt DATE,
    FOREIGN KEY (CustomerId) REFERENCES Customers (Id) ON DELETE CASCADE
);

The ON UPDATE CASCADE clause works in a similar way: changing the value of a primary key automatically changes the value of the foreign keys associated with it. However, primary keys change very rarely, and columns with mutable values are generally not recommended as primary keys, so in practice the ON UPDATE clause is rarely used.
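Both cascading behaviors can be observed in a small runnable sketch. It uses SQLite through Python's sqlite3 module instead of MySQL, so the column types differ slightly from the definitions above, and the customer names and IDs are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement
conn.executescript("""
    CREATE TABLE Customers (Id INTEGER PRIMARY KEY, FirstName TEXT NOT NULL);
    CREATE TABLE Orders (
        Id         INTEGER PRIMARY KEY,
        CustomerId INTEGER,
        FOREIGN KEY (CustomerId) REFERENCES Customers (Id)
            ON DELETE CASCADE ON UPDATE CASCADE
    );
    INSERT INTO Customers VALUES (1, 'Tom'), (2, 'Bob');
    INSERT INTO Orders VALUES (10, 1), (11, 1), (12, 2);
""")

# Changing the parent key propagates to the dependent rows.
conn.execute("UPDATE Customers SET Id = 5 WHERE Id = 1")
after_update = conn.execute(
    "SELECT Id, CustomerId FROM Orders ORDER BY Id"
).fetchall()

# Deleting the parent row removes its dependent rows.
conn.execute("DELETE FROM Customers WHERE Id = 5")
after_delete = conn.execute(
    "SELECT Id, CustomerId FROM Orders ORDER BY Id"
).fetchall()
print(after_update)
print(after_delete)
```

After the update, orders 10 and 11 follow the customer to Id 5; after the delete, only the order belonging to customer 2 remains.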

Setting to NULL

Setting the SET NULL option for a foreign key requires the foreign key column to be nullable:

CREATE TABLE Orders
(
    Id INT PRIMARY KEY AUTO_INCREMENT,
    CustomerId INT,
    CreatedAt DATE,
    FOREIGN KEY (CustomerId) REFERENCES Customers (Id) ON DELETE SET NULL
);
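The effect of ON DELETE SET NULL can be checked with the following sketch. As an assumption, it runs on SQLite through Python's sqlite3 module rather than on MySQL, with a reduced set of columns and hypothetical data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement
conn.executescript("""
    CREATE TABLE Customers (Id INTEGER PRIMARY KEY, FirstName TEXT NOT NULL);
    CREATE TABLE Orders (
        Id         INTEGER PRIMARY KEY,
        CustomerId INTEGER,              -- must be nullable for SET NULL
        FOREIGN KEY (CustomerId) REFERENCES Customers (Id) ON DELETE SET NULL
    );
    INSERT INTO Customers VALUES (1, 'Tom');
    INSERT INTO Orders VALUES (10, 1);
""")

# The order survives the deletion of its customer, but loses the link.
conn.execute("DELETE FROM Customers WHERE Id = 1")
orders = conn.execute("SELECT Id, CustomerId FROM Orders").fetchall()
print(orders)
```

The order row is kept, and its CustomerId is returned as None, which is how sqlite3 represents NULL.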