Relational database field. Relational database model

External level: the level of external models. This is the topmost level, where each external model has its own view of the data. It defines how the database looks from the point of view of individual applications.

Conceptual level: the central control link. Here the database is represented in its most general form, uniting the data used by all applications. In effect, the conceptual level reflects a generalized model of the subject area.

Physical level (the database itself): the data as stored in files or page structures on external storage media.

Data models

The following data models are distinguished:

1. Infological

2. Datalogical

3. Physical

The database design process begins with the design of an infological model. An infological data model is a generalized informal description of the database being created, made using natural language, mathematical formulas, tables, graphs, and other means that are understandable to all people working on database design.


The infological model represents the real world in human-readable concepts that are completely independent of the data storage environment. Therefore, the infological model should stay unchanged until changes in the real world require its definitions to be revised so that the model continues to reflect the subject area.

There are many approaches to building this model: graph models, semantic networks, entity-relationship, and others.

Datalogical model

The infological model must be mapped to a datalogical model that the DBMS can understand. A datalogical model is a formal description of the infological model in the language of the DBMS.

Hierarchical model

This model is a collection of related elements that form a hierarchical structure. The basic concepts of a hierarchy are level, node, and link.


A node is a collection of data attributes describing an object. Each node is linked to exactly one node at a higher level and to any number of nodes at a lower level. The exception is the node at the highest level, the root. The number of trees in the database is determined by the number of root records. There is exactly one path from the root record to each database record. A simple example is a postal address (the Internet domain name system is organized similarly): at the first level (the root of the tree) is the planet Earth, at the second the country, at the third the region, at the fourth the settlement, then the street, the house, and the apartment. A typical representative is IMS, a DBMS from IBM.

All instances of a given descendant type that share a common instance of the ancestor type are called twins. A complete traversal order is defined for the database: from top to bottom and from left to right.

Physical model

A physical model is built on the basis of the datalogical model. The physical organization of data has a major impact on database performance. DBMS developers try to create the most efficient physical data models and offer users various tools for tuning the model to a specific database.

For example, for a relational database the physical model takes into account:

1. Physical aspects of storing tables in specific files.

2. Creation of indexes that speed up the data operations performed by applications.

3. Execution of various actions on data when certain events occur, defined by users through triggers and stored procedures.


At every level, and for any method of representing the subject area, the foundation is the encoding of concepts and of the relationships between concepts. A key stage in the development of any information system is systems analysis:

Formalization of the subject area and representation of the system as a set of components.

Decomposition, as the basis of systems analysis, can be functional (building a hierarchy).

However, in most systems that involve databases, data types are more static than the ways in which the data is processed. For this reason, methods of systems analysis such as data flow diagrams have been developed intensively. The development of relational databases stimulated the development of data modeling techniques, in particular ER diagrams. The relational data model directly uses the concept of a relation as a mapping. It is the closest to the conceptual model of data representation and often underlies it.

Unlike graph-theoretic models, in the relational model links between relations are implemented implicitly, by means of the keys of the relations. For example, hierarchical links are implemented by the mechanism of primary and foreign keys, where the key attribute of the main relation must also be present in the subordinate relation.

Such an attribute is called the primary key in the main relation and the foreign key in the subordinate relation.

Progress in programming languages, associated primarily with data typing and the emergence of object-oriented languages, has made it possible to approach the analysis of complex systems from the point of view of hierarchical representations, that is, using classes of objects with polymorphism, inheritance, and encapsulation.

A RELATION IS A TABLE.

Editing tables, records ...

Deleting what was created and

Editing.

Relational database model

Relational data models are currently the most popular, largely because they present data as tables.

A relational model can be thought of as a particular way of presenting data that includes both the data itself (in the form of tables) and the means of working with and manipulating it (in the form of links). The relational model has three conceptual elements: structure, integrity, and data processing. Each of these elements has its own essential concepts, which need to be explained before going further.

The table is regarded as the direct data store. Traditionally, in relational systems a table is called a relation, a table row is called a tuple, and a column is called an attribute. Attributes have unique names (within a relation).

The number of tuples in a table is called its cardinality, and the number of attributes its degree. A unique identifier is defined for a relation, that is, one or more attributes whose values never coincide simultaneously in two tuples; this identifier is called the primary key. A domain is the set of admissible homogeneous values for a given attribute. Thus, a domain can be regarded as a named set of data whose elements are logically indivisible units (for example, the list of surnames of an institution's employees can act as a domain, although not every surname need be present in the table).

Relation (table) with the attributes KOD, NAME, and SUMM:

KOD	NAME	SUMM
5216	Kireeva	25.50
…	Motyleva	17.05
…	…	…

The fields KOD, NAME, and SUMM are the attributes of the table and make up its header.

The pairs (KOD, 5216), (NAME, Kireeva), (SUMM, 25.50) form one element (tuple) of the body of the relation.
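The header and body described above can be sketched in a few lines of Python. This is only an illustration: the first row is taken from the text, while the KOD value of the second row is not given in the source and is a made-up placeholder.

```python
# A relation modeled as a header (attribute names) plus a body (a set of tuples).
header = ("KOD", "NAME", "SUMM")

body = {
    (5216, "Kireeva", 25.50),
    (5217, "Motyleva", 17.05),  # KOD 5217 is hypothetical, not from the text
}

degree = len(header)      # number of attributes
cardinality = len(body)   # number of tuples

# Each tuple can be read as a set of (attribute, value) pairs.
pairs = [dict(zip(header, row)) for row in sorted(body)]

print(degree, cardinality)
print(pairs[0])
```

Storing the body as a set reflects the fact that a relation holds no duplicate tuples and has no inherent row order.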

In relational databases, unlike other models, the user specifies what data is needed, not how to obtain it. For this reason, navigation through the database in relational systems is automatic, and in the DBMS this task is performed by the optimizer. Its job is to find the most efficient way to fetch the data requested by a query. To do so, the optimizer must at least be able to determine from which tables the data is fetched, how much information these tables contain, what the physical order of the records in the tables is, and how they are grouped.

In addition, a relational database maintains a catalog. The catalog contains a description of all the objects that make up the database: tables, indexes, triggers, and so on. A component such as the optimizer is clearly vital for the proper operation of the whole system, and it relies on the information stored in the catalog. Interestingly, the catalog is itself a set of tables, so the DBMS can manipulate it in the usual ways, without resorting to any special techniques.
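As one concrete illustration of a catalog that is itself a table, SQLite exposes its schema through the table sqlite_master, which can be read with an ordinary SELECT. The sketch below uses Python's standard sqlite3 module; the table and index names are invented for the example, and other DBMSs organize their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical objects, created only so that the catalog has something to show.
conn.execute("CREATE TABLE payments (kod INTEGER PRIMARY KEY, name TEXT, summ REAL)")
conn.execute("CREATE INDEX idx_payments_name ON payments (name)")

# The catalog is queried like any other table.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY type, name"
).fetchall()

print(catalog)
```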

Domains and relationships

Basic definitions: Domains, types of relations, predicates.

A relation has a number of basic properties:

1. A relation contains no duplicate tuples. This follows from the very definition of a relation as a set. Some DBMSs allow deviations from this property in certain cases, but as long as a relation has a primary key, identical tuples are excluded.

2. Tuples are not ordered from top to bottom: a relation simply has no notion of the position of a tuple. The tuples of a relation can be arranged in any order without losing information.

3. Attributes are not ordered from left to right. The attributes in the heading of a relation can be placed in any order without violating data integrity, so there is no notion of the position of an attribute either.

4. Attribute values consist of logically indivisible units. This follows from the fact that the values are taken from domains. In other words, relations contain no repeating groups, that is, they are normalized.

Relational systems support several kinds of relations:

1. A named relation is a relation variable defined in the DBMS by means of creation operators; as a rule, named relations are needed to present information to the user more conveniently.

2. A base relation is a direct and important part of the database, so it is given its own name at design time.

3. A derived relation is one that has been defined in terms of other, usually base, relations by means of the DBMS.

4. A view is a named derived relation. A view is expressed exclusively through DBMS operators applied to named relations, so views do not physically exist in the database.

5. A query result is an unnamed derived relation containing data (the result of a specific query). The result is not stored in the database and exists only as long as the user needs it.

6. A stored relation is one that is physically maintained in storage. Stored relations are most often base relations. Based on the above, a relational database can be defined as a set of interrelated relations.
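The difference between a base relation, a view, and a query result can be tried out with Python's sqlite3 module. This is a minimal sketch: the table name, the view name, and the KOD values are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Base relation: named and physically stored.
conn.execute("CREATE TABLE employees (kod INTEGER PRIMARY KEY, name TEXT, summ REAL)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Kireeva", 25.50), (2, "Motyleva", 17.05)],  # kod values are hypothetical
)

# View: a named derived relation; only its definition is stored.
conn.execute(
    "CREATE VIEW well_paid AS SELECT kod, name FROM employees WHERE summ > 20"
)
view_rows = conn.execute("SELECT * FROM well_paid ORDER BY kod").fetchall()

# Query result: an unnamed derived relation that exists only in the program.
query_rows = conn.execute("SELECT name FROM employees WHERE summ < 20").fetchall()

# Because the view stores no data of its own, it follows changes to the base table.
conn.execute("UPDATE employees SET summ = 30 WHERE kod = 2")
view_rows_after = conn.execute("SELECT * FROM well_paid ORDER BY kod").fetchall()

print(view_rows, query_rows, view_rows_after)
```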

A link in this case is an association between two or more relations.

A one-to-many link means that, at any moment, each tuple of relation A corresponds to several tuples of relation B.

[Figure: a binary link (1 to ∞) between two relations (attributes KOD, ADRES), and a ternary link between the relations Students, Teachers, and Timetable of classes.]

Data integrity

In relational models, data integrity has a special place. Recall that a key, or candidate key, is a minimal set of attributes whose values uniquely identify a tuple. Minimality means that if any attribute is excluded from the set, the remaining attributes no longer identify the tuple.

Every relation has at least one candidate key. One of them is chosen as the primary key.

When choosing a primary key, preference should be given to non-composite keys or to keys composed of the smallest number of attributes. It is also undesirable to use keys with long text values (integer attributes are preferred as keys). For example, to identify an employee one can use a unique personnel number, a passport number, or the combination of surname, first name, patronymic, and department number. No attribute that participates in the primary key may take an undefined value. Otherwise a contradiction (collision) arises: an element of the primary key that is not unique. This should therefore be checked carefully when designing a database.
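The two requirements on a primary key, uniqueness and the absence of undefined values, can be checked with a small sqlite3 sketch. The table, the column names, and the passport number are invented for the example. NOT NULL is declared explicitly because SQLite, unlike most DBMSs, would otherwise accept NULL in a non-integer primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (passport_no TEXT PRIMARY KEY NOT NULL, surname TEXT)"
)
conn.execute("INSERT INTO employees VALUES ('AB123', 'Kireeva')")  # made-up key value


def try_insert(row):
    """Try to insert a row and report whether the DBMS accepted it."""
    try:
        conn.execute("INSERT INTO employees VALUES (?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


duplicate = try_insert(("AB123", "Motyleva"))  # same key value as an existing tuple
missing = try_insert((None, "Motyleva"))       # undefined key value
row_count = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]

print(duplicate)
print(missing)
```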

On foreign keys: since relation C links relations A and B, it must include foreign keys corresponding to the primary keys of relations A and B.

The foreign keys of a table are formed from the primary keys of other tables.

Thus, when choosing how to link relations in a database, the question arises of what the foreign keys should be. Moreover, for every foreign key one has to decide whether undefined values (NULL values, which mark missing information) may appear in it. In other words, can a relation contain a tuple for which no corresponding tuple is known in the related relation?

You also need to think in advance about what happens when tuples are deleted from the relation that the foreign key refers to. There are the following possibilities:

· The operation cascades: deleting tuples in a relation leads to the deletion of the tuples linked to them in the related relation. For example, deleting an employee's surname, first name, and so on from one relation results in the deletion of the information about his wages from another relation.

· The operation is restricted: only those tuples are deleted for which there is no related information in another relation. If such information exists, the deletion cannot be carried out, because removing data that is still referenced elsewhere would violate data integrity. For example, an employee's name, surname, and so on can be deleted only if the related relation holds no information about his salary.

You also need to decide what happens on an attempt to update the primary key of a relation that is referenced by some foreign key. The options here are the same as for deletion:

· The operation cascades: when the primary key is updated, the foreign key in the related relation is updated as well. For example, updating the primary key in the relation that stores employee information results in an update of the foreign key in the relation with payroll information.

· The operation is restricted: only those primary keys are updated for which there is no related information. If such information exists, the update cannot be carried out. For example, the primary key in the relation that stores employee information can be updated only if the related relation holds no information about his salary.
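The cascading and restricted variants can be compared side by side in SQLite, where they are written as ON DELETE / ON UPDATE CASCADE and RESTRICT. The sketch below assumes hypothetical tables employees and salaries. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3


def make_db(action):
    """Create employees/salaries linked by a foreign key with the given action."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
    conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE salaries ("
        " employee_id INTEGER REFERENCES employees(id)"
        f" ON DELETE {action} ON UPDATE {action},"
        " amount REAL)"
    )
    conn.execute("INSERT INTO employees VALUES (1, 'Kireeva')")
    conn.execute("INSERT INTO salaries VALUES (1, 25.50)")
    return conn


# Cascading: the key update and the delete both propagate to salaries.
cascade = make_db("CASCADE")
cascade.execute("UPDATE employees SET id = 7 WHERE id = 1")
fk_after_update = cascade.execute("SELECT employee_id FROM salaries").fetchone()[0]
cascade.execute("DELETE FROM employees WHERE id = 7")
salaries_left = cascade.execute("SELECT COUNT(*) FROM salaries").fetchone()[0]

# Restricted: the delete is refused while a salary row still refers to the employee.
restrict = make_db("RESTRICT")
try:
    restrict.execute("DELETE FROM employees WHERE id = 1")
    delete_refused = False
except sqlite3.IntegrityError:
    delete_refused = True
employees_left = restrict.execute("SELECT COUNT(*) FROM employees").fetchone()[0]

print(fk_after_update, salaries_left, delete_refused, employees_left)
```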

Relational algebra

The formal foundations of the relational model are relational algebra, which is based on set theory and considers special operators over relations, and relational calculus, which is based on mathematical logic.


It should be noted that relational algebra is very powerful: complex database queries can be expressed with a single expression. It is for this reason that these mechanisms are included in the relational data model. A language is said to be relationally complete if any query that can be expressed with a single relational algebra expression or a single relational calculus formula can be expressed with a single statement of that language.

Relational algebra has an important property: it is closed with respect to the concept of a relation. This means that relational algebra expressions are evaluated over the relations of a relational database and that their results are also relations.

The main idea of relational algebra is that, since relations are sets, the means of manipulating them can be based on the traditional set-theoretic operations, supplemented by some operations specific to databases.

Let us describe the variant of the algebra proposed by Codd. It consists of eight main operations:

Selection (restriction) of a relation (unary operation)

Projection of a relation (unary operation)

Union of relations

Intersection of relations (binary operation)

Difference (subtraction) of relations

Direct product of relations

Join of relations

Division of relations

These operations can be explained as follows:

· The result of selecting from a relation by some condition is a relation that includes only those tuples of the original relation that satisfy the condition.

· When a relation is projected onto a given set of its attributes, the result is a relation whose tuples are taken from the corresponding tuples of the original relation, restricted to those attributes.

· The union of two relations is a relation that includes all tuples that belong to at least one of the operand relations.

· The intersection of two relations is a relation that includes all tuples that belong to both operand relations.

· The difference of two relations is a relation that includes all tuples of the first relation except those that also belong to the second relation.

· The direct product of two relations is a relation whose tuples are combinations of tuples of the first and second relations.

· When two relations are joined by some condition, the resulting relation consists of the combinations of tuples of the first and second relations that satisfy this condition.

· The relational division operation has two operands: a binary relation (with two attributes) and a unary relation (with one attribute). The result is a relation consisting of those values of the first attribute of the first relation for which the set of associated values of the second attribute includes the set of values of the second relation.

In addition to the above, there are a number of special operations typical for working with databases:

· The result of the rename operation is a relation whose body coincides with the body of the original relation but whose attribute names have been changed.
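Most of the operations listed above can be sketched in Python by modeling a relation as a set of tuples, where each tuple is a frozenset of (attribute, value) pairs. Union, intersection, and difference then map directly onto Python's set operators. The helper names and the sample data are invented for illustration, and division is left out of this sketch.

```python
def relation(*rows):
    """Build a relation: a set of tuples, each a frozenset of (attribute, value) pairs."""
    return {frozenset(row.items()) for row in rows}


def select(r, condition):
    """Selection: keep the tuples that satisfy the condition."""
    return {t for t in r if condition(dict(t))}


def project(r, attrs):
    """Projection: keep only the listed attributes; duplicates collapse in the set."""
    return {frozenset((a, v) for a, v in t if a in attrs) for t in r}


def rename(r, old, new):
    """Rename: same body, one attribute name changed."""
    return {frozenset((new if a == old else a, v) for a, v in t) for t in r}


def product(r, s):
    """Extended direct product; assumes the attribute names do not overlap."""
    return {t | u for t in r for u in s}


def join(r, s, condition):
    """Conditional join: a selection over the direct product."""
    return select(product(r, s), condition)


# Hypothetical sample relations.
a = relation({"KOD": 1, "NAME": "x"}, {"KOD": 2, "NAME": "y"})
b = relation({"KOD": 2, "NAME": "y"}, {"KOD": 3, "NAME": "z"})
pay = relation({"PKOD": 1, "SUMM": 25.5}, {"PKOD": 3, "SUMM": 17.05})

union, intersection, difference = a | b, a & b, a - b
selected = select(a, lambda t: t["KOD"] > 1)
names = project(union, {"NAME"})
joined = join(a, pay, lambda t: t["KOD"] == t["PKOD"])
renamed = rename(pay, "PKOD", "KOD")

print(len(union), len(intersection), len(difference), len(joined))
```

Every function takes relations and returns a relation, so the calls can be nested freely, which is the closure property in practice.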

Since the result of any relational operation is itself a relation, it is possible to form relational expressions in which a nested relational expression is used in place of an original relation (operand). This is because the operations of relational algebra are closed with respect to the concept of a relation. Let us start with the union operation; everything said here applies equally to intersection and difference. In relational algebra, the result of a union must be a relation. If we allowed the union of two arbitrary relations with different sets of attributes, the result would be a set, but a set of tuples of different types, that is, generally speaking, not a relation. If we require relational algebra to be closed with respect to the concept of a relation, such a union is meaningless. This gives rise to the concept of union compatibility: two relations are union-compatible only if they have the same headings, that is, the same set of attribute names, with attributes of the same name defined on the same domain.

If two relations are union-compatible, then performing the usual union, intersection, or difference on them yields a relation with a well-defined heading that coincides with the heading of each operand. If two relations are almost union-compatible, that is, compatible in everything except attribute names, they can be made fully union-compatible by applying the rename operation before performing an operation such as union.

The direct product of two relations raises new problems. In set theory, the direct product can be taken of any two sets, and the elements of the result are pairs made up of an element of the first set and an element of the second. Since relations are sets, a direct product can be obtained for any two relations. However, the result will not be a relation: its elements are not tuples but pairs of tuples. Therefore, relational algebra uses a special form of the operation, the extended direct product of relations. In the extended direct product of two relations, each element of the result is a tuple formed by merging one tuple of the first relation with one tuple of the second. A second problem arises immediately: how to obtain a well-formed heading for the resulting relation. This leads to the concept of compatibility of relations with respect to the extended direct product.

Two relations are compatible with respect to the extended direct product only if their sets of attribute names do not intersect. Any two relations can be made compatible in this sense by applying the rename operation to one of them.

The selection (restriction) operation requires two operands: the operand relation and a simple restriction condition. The result is a relation whose heading coincides with the heading of the operand relation and whose body contains those tuples of the operand relation that satisfy the restriction condition.

Let's introduce a number of operators.

Let union denote the union operation, intersect the intersection operation, and minus the difference operation. To denote selection, we use the construction A where B, in which A is the operand relation and B is a simple comparison condition. Let C1 and C2 be two simple selection conditions. Then:

A where C1 AND C2 is identical to (A where C1) intersect (A where C2)

A where C1 OR C2 is identical to (A where C1) union (A where C2)

A where C1 AND NOT C2 is identical to (A where C1) minus (A where C2)
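These identities are easy to check on a small example. In the sketch below the relation A and the conditions c1 and c2 are hypothetical; the difference identity is read as "C1 and not C2".

```python
# Hypothetical relation with tuples of the form (KOD, SUMM).
A = {(1, 5.0), (2, 12.5), (3, 20.0), (4, 30.0)}


def where(r, cond):
    """Selection: the tuples of r that satisfy cond."""
    return {t for t in r if cond(t)}


def c1(t):
    return t[1] > 10        # SUMM > 10


def c2(t):
    return t[0] % 2 == 0    # KOD is even


and_holds = where(A, lambda t: c1(t) and c2(t)) == where(A, c1) & where(A, c2)
or_holds = where(A, lambda t: c1(t) or c2(t)) == where(A, c1) | where(A, c2)
minus_holds = where(A, lambda t: c1(t) and not c2(t)) == where(A, c1) - where(A, c2)

print(and_holds, or_holds, minus_holds)
```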

Using these definitions, selection operations can be implemented in which the selection condition is an arbitrary logical expression composed of simple conditions with the logical connectives AND, OR, and NOT. The projection of relation A onto the list of attributes a1, a2, ..., an is a relation whose heading is the set of attributes a1, a2, ..., an. The body of the result consists of the tuples <b1, b2, ..., bn> for which relation A contains a tuple whose attribute a1 has the value b1, attribute a2 has the value b2, and so on up to attribute an with the value bn. In essence, projection cuts a vertical slice out of the operand relation and removes any duplicate tuples that arise.

The join operation, sometimes called a conditional join, requires two operand relations to be joined and a third operand, a simple condition. Let relations A and B be joined. As in the case of selection, the join condition C has the form (a comp-op b) or (a comp-op const), where a and b are the names of attributes of relations A and B, const is a literal constant, and comp-op is a comparison operation that is valid in this context. Then, by definition, the result of the join is the relation obtained by restricting the direct product of A and B by condition C.

There is an important special case of the join, the natural join. A join whose condition has the form (a = b), where a and b are attributes of different operands, is called an equi-join. This case is important because it is especially common in practice and DBMSs have efficient algorithms for it. The natural join is applied to a pair of relations A and B that share a common attribute P, that is, an attribute with the same name defined on the same domain. Let ab denote the union of the headings of A and B. Then the natural join of A and B is the result of the equi-join of A and B on the attribute P, projected onto ab. The natural join is not included directly in the set of operations of relational algebra, but it is of great practical importance.
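A minimal sketch of the natural join, with relations represented as lists of dictionaries. The attribute names P, X, Y and the data are hypothetical. Tuples are combined when they agree on all attributes with the same name, and the shared attribute appears only once in the result.

```python
def natural_join(r, s):
    """Natural join of two relations given as lists of dicts."""
    result = []
    for t in r:
        for u in s:
            common = t.keys() & u.keys()
            if all(t[a] == u[a] for a in common):
                merged = {**t, **u}        # the shared attribute is kept once
                if merged not in result:   # a relation holds no duplicate tuples
                    result.append(merged)
    return result


A = [{"P": 1, "X": "a"}, {"P": 2, "X": "b"}]
B = [{"P": 1, "Y": "c"}, {"P": 3, "Y": "d"}]

joined = natural_join(A, B)
print(joined)
```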

The division operation needs a more detailed explanation because it is difficult to understand. Let two relations be given: A (a1, a2, ..., an, b1, b2, ..., bm) and B (b1, b2, ..., bm). We assume that attribute bi of relation A and attribute bi of relation B have the same name and are defined on the same domain. Let us call the set of attributes {aj} the composite attribute a, and the set of attributes {bj} the composite attribute b. After that, we can speak of the relational division of the binary relation A (a, b) by the unary relation B (b).

The result of dividing A by B is a unary relation C (a) consisting of those tuples v for which relation A contains tuples <v, w> such that the set of values {w} includes the set of values of b in relation B.

Since division is the most difficult operation, let us explain it with an example. Suppose the students' database contains two relations, STUDENTS (FULL NAME, NUMBER) and NAMES (FULL NAME), where the unary relation NAMES contains all the surnames that students of the institute have. Then dividing the relation STUDENTS by the relation NAMES yields a unary relation containing the student card numbers that belong to students with every surname that occurs in the institute.
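Division can be sketched directly from the definition. The data below is made up, and for the result to be non-empty the sketch assumes that one NUMBER value can occur in several tuples of STUDENTS (as it would if the number identified a group rather than a single card).

```python
def divide(a, b):
    """Divide A(FULL_NAME, NUMBER) by B(FULL_NAME).

    Returns the NUMBER values that are paired in A with every name in B.
    """
    numbers = {number for _, number in a}
    return {n for n in numbers if all((name, n) in a for (name,) in b)}


# Hypothetical data: tuples of the form (FULL_NAME, NUMBER).
students = {("Kireeva", 101), ("Motyleva", 101), ("Kireeva", 102)}
names = {("Kireeva",), ("Motyleva",)}

quotient = divide(students, names)
print(quotient)
```

Number 102 is left out because it is paired with only one of the two names.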

Relational calculus

Suppose there is a database with the relations STUDENTS (STUD_NUMBER, STUD_NAME, STUD_STIP, STUD_GR_NOM), which hold the student's number, name, scholarship, and group code, and GROUPS (GR_NOM, GR_COL, GR_STAR), which hold the group number, the number of students in the group, and the number of the student who is the head of the group. Suppose you need to find the names and student card numbers of the students who are heads of groups with more than 25 people. In relational algebra, such a query takes the following steps:

1. Join the relations STUDENTS and GROUPS on the condition STUD_NUMBER = GR_STAR.

2. Restrict the resulting relation by the condition GR_COL > 25.

3. Project the result of the previous operation onto the attributes STUD_NAME and STUD_NUMBER.

Here the execution of the query is formulated step by step, and each step corresponds to one relational operation. If we formulate the same query in relational calculus, we get a formula that can be read as: give STUD_NAME and STUD_NUMBER for those students for whom there exists a group with GR_STAR equal to STUD_NUMBER and GR_COL > 25. In the second formulation we specified only the characteristics of the resulting relation and said nothing about how to build it. In this case the DBMS must decide for itself which operations to perform on the relations STUDENTS and GROUPS, and in what order. The two approaches are in fact equivalent, and the rules for converting one into the other are not very complicated.
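The two formulations can be put side by side in Python. The three algebra steps are written out explicitly over lists of tuples, while the calculus-style formulation is approximated by a single declarative SQL query with EXISTS, run through sqlite3. The sample rows are invented, and the table name study_groups is an assumption chosen to avoid clashing with SQL keywords.

```python
import sqlite3

# Hypothetical data.
students = [  # (STUD_NUMBER, STUD_NAME, STUD_STIP, STUD_GR_NOM)
    (1, "Ivanov", 100.0, 10),
    (2, "Petrov", 120.0, 10),
    (3, "Sidorov", 90.0, 20),
]
groups = [  # (GR_NOM, GR_COL, GR_STAR)
    (10, 30, 2),
    (20, 20, 3),
]

# Procedural formulation: join, then restrict, then project.
joined = [s + g for s in students for g in groups if s[0] == g[2]]
restricted = [t for t in joined if t[5] > 25]          # GR_COL > 25
algebra_result = [(t[1], t[0]) for t in restricted]    # STUD_NAME, STUD_NUMBER

# Declarative formulation: describe the result and let the DBMS choose the plan.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (stud_number, stud_name, stud_stip, stud_gr_nom)"
)
conn.execute("CREATE TABLE study_groups (gr_nom, gr_col, gr_star)")
conn.executemany("INSERT INTO students VALUES (?, ?, ?, ?)", students)
conn.executemany("INSERT INTO study_groups VALUES (?, ?, ?)", groups)
sql_result = conn.execute(
    "SELECT stud_name, stud_number FROM students "
    "WHERE EXISTS (SELECT 1 FROM study_groups "
    "              WHERE gr_star = stud_number AND gr_col > 25)"
).fetchall()

print(algebra_result, sql_result)
```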

The basic concepts of relational calculus are the variable with a defined range of values and the well-formed formula built from variables and special functions. Depending on what the range of a variable is, tuple calculus and domain calculus are distinguished (roughly speaking, working along rows or across columns). In tuple calculus, the ranges of variables are the relations of the database, that is, an admissible value of each variable is a tuple of some relation. In domain calculus, the ranges of variables are the domains on which the attributes of the database relations are defined, that is, an admissible value of each variable is a value from some domain.

Examples of domains: Byte, Integer, String, Char.

The RANGE command is used to define tuple variables. For example, to define a variable STUDENT whose range is the relation STUDENTS, use the construction RANGE STUDENT IS STUDENTS. It follows from this definition that at any moment the variable STUDENT represents some tuple of the relation STUDENTS. When tuple variables are used in formulas, the values of their attributes can be referenced. For example, to refer to the value of the STUD_NAME attribute of the variable STUDENT, use the construction STUDENT.STUD_NAME.

Well-formed formulas serve to express conditions imposed on tuple variables. Such formulas are built from simple comparisons, which compare the values of attributes of variables and literal constants. For example, the construction STUDENT.STUD_NUMBER = 123456 is a simple comparison. More complex formulas are built with the logical connectives AND, OR, NOT, and IF ... THEN. Finally, well-formed formulas can be built using quantifiers. If F is a well-formed formula in which the variable var participates, then the constructions EXISTS var (F) (existential quantifier) and FORALL var (F) (universal quantifier, for all tuples) are also well-formed formulas.

Variables in well-formed formulas can be free or bound. All variables in a formula that was built without quantifiers are free. This means that if, for some set of values of the free tuple variables, the formula evaluates to true, then these values can enter the resulting relation. If a variable is introduced by a quantifier when the formula is built, it is bound. When such a formula is evaluated, not a single value of the bound variable is used but its entire range.

1) EXISTS STUD2 (STUD1.STUD_STIP > STUD2.STUD_STIP)

2) FORALL STUD2 (STUD1.STUD_STIP > STUD2.STUD_STIP)

Let STUD1 and STUD2 be two tuple variables defined on the relation STUDENTS. Then formula 1, for the current tuple of the variable STUD1, takes the value true only if the whole relation STUDENTS contains a tuple (associated with the variable STUD2) whose STUD_STIP attribute satisfies the inner comparison condition. Formula 2, for the current tuple of STUD1, takes the value true only if the STUD_STIP value of every tuple of the relation STUDENTS (associated with the variable STUD2) satisfies the inner condition.
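The two quantified formulas correspond closely to Python's any and all. In this sketch, with invented data, STUD1 is the free variable (the tuple being tested) and STUD2 is the bound variable that runs over the whole relation.

```python
# Hypothetical STUDENTS relation.
students = [
    {"STUD_NAME": "Ivanov", "STUD_STIP": 90},
    {"STUD_NAME": "Petrov", "STUD_STIP": 100},
    {"STUD_NAME": "Sidorov", "STUD_STIP": 120},
]


def formula1(stud1):
    """EXISTS STUD2 (STUD1.STUD_STIP > STUD2.STUD_STIP)"""
    return any(stud1["STUD_STIP"] > stud2["STUD_STIP"] for stud2 in students)


def formula2(stud1):
    """FORALL STUD2 (STUD1.STUD_STIP > STUD2.STUD_STIP)"""
    return all(stud1["STUD_STIP"] > stud2["STUD_STIP"] for stud2 in students)


exists_result = [s["STUD_NAME"] for s in students if formula1(s)]
forall_result = [s["STUD_NAME"] for s in students if formula2(s)]

print(exists_result)
print(forall_result)
```

Read literally, formula 2 is never true on a non-empty relation: STUD2 also takes the value of STUD1's own tuple, and no scholarship is strictly greater than itself. A query for the student with the largest scholarship would need >= or an extra condition excluding the tuple itself.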

Thus, well-formed formulas provide a means of expressing the condition for selecting tuples from database relations. To use relational calculus for real work with a database, one more component is required, which defines the set of columns of the resulting relation and their names. This component is called the target list.

A target list is built from elements of the following forms:

· var.attr, where var is the name of a free variable and attr is the name of an attribute of the relation on which var is defined.

· var, which is equivalent to the list var.attr1, var.attr2, ..., var.attrN that includes the names of all attributes of the defining relation.

· new_name = var.attr, where new_name is the new name of the corresponding attribute of the resulting relation.

The last form is needed when the formula uses several free variables with the same range. In domain calculus, the ranges of variables are not relations but domains. For the STUDENTS and GROUPS database, we can speak of domain variables such as NAME (whose values are valid names) or STUD_NOM (whose values are valid student numbers).

The main difference between domain calculus and tuple calculus is the presence of an additional set of predicates that express so-called membership conditions. If R is an n-ary relation with attributes (a1, a2, ..., an), then a membership condition has the form R (ai1: Vi1, ai2: Vi2, ..., aim: Vim), where m <= n and each Vij is either a literal constant or the name of a domain variable. The membership condition takes the value true only if relation R contains a tuple with the specified values of the specified attributes. If Vij is a constant, a fixed condition is imposed on attribute aij that does not depend on the current values of the domain variables. If Vij is the name of a domain variable, the membership condition can take different values for different values of this variable.
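A membership condition can be sketched as a function that checks whether some tuple of the relation carries the given attribute values. The function name member, the relation contents, and the list of candidate names are all hypothetical.

```python
def member(relation, **conditions):
    """Membership condition R(a1: v1, ...): true if some tuple has these values."""
    return any(
        all(t[attr] == value for attr, value in conditions.items())
        for t in relation
    )


# Hypothetical STUDENTS relation, restricted to two attributes.
students = [
    {"STUD_NUMBER": 1, "STUD_NAME": "Ivanov"},
    {"STUD_NUMBER": 2, "STUD_NAME": "Petrov"},
]

# With a constant, the condition is fixed.
has_ivanov = member(students, STUD_NAME="Ivanov")

# With a domain variable, the result depends on the variable's current value.
name_domain = ["Ivanov", "Petrov", "Sidorov"]
present = [name for name in name_domain if member(students, STUD_NAME=name)]

print(has_ivanov, present)
```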

A predicate is a logical function that returns true or false for some argument. A relationship can be viewed as a predicate with arguments that are attributes of the relationship in question. If the given specific set of tuples is present in the relation, then the predicate will return true; otherwise, false.

In all other respects, formulas and expressions in domain calculus look like those in tuple calculus. Domain relational calculus underlies most form-based query languages.

Translator's note: although the article is quite old (published 2 years ago) and has a loud title, it still gives a good idea of the differences between relational databases and NoSQL databases, their advantages and disadvantages, and also provides a brief overview of non-relational storage.

Lately, a lot of non-relational databases have emerged. This suggests that if you want virtually unlimited on-demand scalability, you need a non-relational database.

If this is true, does it mean that the mighty relational databases are vulnerable? Does it mean the days of relational databases are over, or soon will be? In this article, we'll look at the popular trend toward non-relational databases in a variety of situations and see whether it will affect the future of relational databases.

Relational databases have been around for about 30 years. During this time, several revolutions broke out that were supposed to end relational storage. Of course, none of these revolutions took place, and none of them shook the position of relational databases in the least.

Let's start with the basics

A relational database is a collection of tables (entities). Tables are made up of columns and rows (tuples). Constraints can be defined within tables, and relationships exist between tables. You can use SQL to run queries that return datasets from one or more tables. Within a single query, data is obtained from several tables by joining them (JOIN); the join usually uses the same columns that define the relationships between the tables. Normalization is the process of structuring a data model to ensure consistency and non-redundancy in the data.
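To make these basics concrete, here is a minimal sketch using Python's built-in sqlite3 module. The customers and orders tables and their contents are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total       REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

# One query, two tables: the join uses the column that defines the relationship.
rows = conn.execute("""
    SELECT c.name, COUNT(o.id), SUM(o.total)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id
    ORDER BY c.name
""").fetchall()
print(rows)  # [('Alice', 2, 65.0), ('Bob', 1, 15.0)]
```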

Relational databases are accessed through relational database management systems (RDBMS). Almost all database systems we use are relational such as Oracle, SQL Server, MySQL, Sybase, DB2, TeraData and so on.

The reasons for this dominance are not obvious. Throughout the history of relational databases, they have consistently offered the best blend of simplicity, robustness, flexibility, performance, scalability, and interoperability in data management.

However, to provide all of these features, relational storage is incredibly complex internally. For example, a simple SELECT query can have hundreds of potential execution paths that the optimizer evaluates directly at runtime. All of this is hidden from users, but internally the RDBMS builds the execution plan best suited to the query, relying on things like cost estimation algorithms.
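Most engines let you peek at the plan the optimizer chose. The sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module; the tables are hypothetical, and the wording of the plan differs between SQLite versions, so no particular output is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
""")

plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT c.name, o.total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    WHERE o.total > 100
""").fetchall()

# The last column of each row is a human-readable description of one plan step.
for row in plan:
    print(row[-1])
```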

Relational database problems

While relational storage provides the best blend of simplicity, robustness, flexibility, performance, scalability, and compatibility, it does not necessarily perform better on each of these than similar systems that focus on one particular feature. This was not a big problem, as the overwhelming dominance of relational DBMSs outweighed any shortcomings. However, if conventional RDBs did not meet the needs, there were always alternatives.

Today the situation is a little different. The variety of applications is growing, and with it the importance of the listed features. And as the number of databases grows, one feature begins to overshadow all the others: scalability. As more and more applications run under high load, such as web services, their scalability requirements can, first, change very quickly and, second, grow dramatically. The first problem can be very difficult to solve if your relational database sits on your own server. Suppose the server load has tripled overnight. How quickly can you upgrade the hardware? The second problem is also hard to solve with relational databases.

Relational databases scale well only if they are located on a single server. When the resources of this server run out, you will need to add more machines and distribute the load between them. This is where the complexity of relational databases starts to play against scalability. If you try to increase the number of servers not to a few, but to a hundred or a thousand, the complexity increases by an order of magnitude, and the characteristics that make relational databases so attractive rapidly reduce to zero the chances of using them as a platform for large distributed systems.

To stay competitive, cloud vendors have to somehow deal with this limitation, because what kind of cloud platform is it without scalable data storage. Therefore, vendors have only one option if they want to provide users with scalable storage space. It is necessary to use other types of databases that are more scalable, albeit at the cost of other capabilities available in relational databases.

These advantages, as well as the existing demand for them, have led to a wave of new database management systems.

New wave

This type of database is commonly referred to as a key-value store. In fact, there is no official name, so you may see these systems described as document-oriented, attribute-oriented, distributed databases (although those can also be relational), sharded sorted arrays, distributed hash tables, and key-value stores. While each of these names points to specific features of a system, they are all variations on a theme that we will call key-value storage.

However, whatever you call it, this "new" type of database is not that new and has always been used mainly for applications for which relational databases would not be suitable. However, without the need for scalability of the web and the cloud, these systems were not in high demand. Now the challenge is to determine which type of storage is best for a particular system.

Relational databases and key-value storages differ fundamentally and are designed to solve different problems. Comparing the characteristics will only allow you to understand the difference between them, but let's start with this:

Storage characteristics

Relational database	Key-value storage
A database is made up of tables, tables are made up of columns and rows, and rows are made up of column values. All rows in one table have the same structure.	For domains, you can draw an analogy with tables, but unlike tables for domains, the data structure is not defined. A domain is a box in which you can put anything you want. Records within the same domain can have different structures.
The data model [1] is predefined. It is strongly typed and contains constraints and relationships to ensure data integrity.	Records are identified by key, with each record having a dynamic set of attributes associated with it.
The data model is based on the natural representation of the contained data, not on the functionality of the application.	In some implementations, attributes can only be strings. In other implementations, attributes have simple data types that reflect types used in programming: integers, arrays of strings, and lists.
The data model is normalized to avoid duplicate data. Normalization creates relationships between tables. Relationships link data from different tables.	Relationships are not explicitly defined between domains, as well as within one domain.

No joins

Key-value stores are record-oriented. This means that all information related to a given record is stored with it. A domain (which you can think of as a table) can contain countless different records. For example, a domain can contain information about customers and orders. This means that data is usually duplicated between different domains. This is an acceptable approach since disk space is cheap. The main thing is that it allows all related data to be stored in one place, which improves scalability, since there is no need to join data from different tables. When using a relational database, you would need to use joins to group the information you need in one place.

Although the need for relationships drops dramatically in key-value stores, relationships are still needed. They usually exist between the main entities. For example, an ordering system would have records containing data about customers, products, and orders. It doesn't matter whether this data sits in one domain or in several. The point is that when a customer places an order, you probably don't want to keep the customer, product, and order information all in one record. Instead, the order record should contain keys that point to the corresponding customer and product records. Since records can store any information, and relationships are not defined in the data model itself, the database management system cannot enforce the integrity of those relationships. This means you can delete customers, and the products they ordered, even though orders still point to them. Ensuring data integrity falls entirely on the application.
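The following sketch models a domain as a plain Python dict to show what this means in practice. The key naming scheme and the "_key" attribute suffix are conventions invented for the example, not features of any particular store.

```python
# A domain modelled as a dict: key -> record, where a record is itself a dict.
store = {
    "customer:1": {"name": "Alice"},
    "product:7": {"title": "Keyboard", "price": "49.90"},
    "order:100": {"customer_key": "customer:1", "product_key": "product:7"},
}

def dangling_references(store):
    """Return (record key, attribute, missing key) for every broken link.
    This is the kind of check the application has to run on its own."""
    broken = []
    for key, record in store.items():
        for attr, value in record.items():
            if attr.endswith("_key") and value not in store:
                broken.append((key, attr, value))
    return broken

before = dangling_references(store)
del store["customer:1"]          # the store itself does not object
after = dangling_references(store)

print(before)  # []
print(after)   # [('order:100', 'customer_key', 'customer:1')]
```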

Data access

Relational database	Key-value storage
Data is created, updated, deleted, and queried using Structured Query Language (SQL).	Data is created, updated, deleted and queried using API method calls.
SQL queries can retrieve data both from a single table and from multiple tables using joins.	Some implementations provide SQL-like syntax for specifying filter conditions.
SQL queries can include aggregations and complex filters.	Often, only basic comparison operators are available (=, !=, <, >, <= and >=).
A relational database usually contains inline logic such as triggers, stored procedures, and functions.	All business and data integrity logic is contained in the application code.

Interaction with applications

Key-value stores: benefits

There are two distinct advantages of such systems over relational storage.

Suitable for cloud services

The first advantage of key-value stores is that they are simpler and therefore more scalable than relational databases. If you are building your own system in-house and plan to put dozens or hundreds of servers behind your data store to cope with a growing load, then key-value stores are your choice.

Because they can be easily and dynamically expanded, they are also useful for vendors who provide a multi-tenant web storage platform. Such a database represents a relatively cheap storage medium with great potential for scalability. Users usually only pay for what they use, however their needs can grow. The vendor will be able to dynamically and practically without restrictions increase the size of the platform, based on the load.

More natural code integration

The relational data model and the object model of the code are usually built differently, which leads to a certain mismatch. Developers solve this problem by writing code that maps the relational model to the object model. This work has no clear, quickly realized value and can take up a good deal of time that could be spent developing the application itself. Meanwhile, many key-value stores keep data in a structure that maps to objects more naturally, which can significantly reduce development time.
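A small sketch of why the mapping can feel more natural: an object is serialized as one record under one key and read back without any table mapping layer. The Order class and the dict standing in for a key-value store are assumptions made for the example.

```python
import json
from dataclasses import dataclass, asdict, field

@dataclass
class Order:
    order_id: str
    customer: str
    items: list = field(default_factory=list)

store = {}  # stands in for a key-value store

def save(order):
    # The whole object, nested list included, becomes a single record.
    store[order.order_id] = json.dumps(asdict(order))

def load(order_id):
    return Order(**json.loads(store[order_id]))

original = Order("order:100", "Alice", ["keyboard", "mouse"])
save(original)
restored = load("order:100")
print(restored == original)  # True
```

In a relational schema the items list would normally live in a separate table, and code would be needed to reassemble the object from several rows.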

Other arguments for using key-value stores, such as "Relational databases can get clumsy" (by the way, I have no idea what that means), are less compelling. But before you become a proponent of such repositories, check out the next section.

Key-value stores: disadvantages

Constraints in relational databases ensure data integrity at the lowest level. Data that does not meet the constraints physically cannot get into the database. There are no such restrictions in key-value storages, so data integrity control rests entirely with applications. However, there are bugs in any code. While errors in a well-designed relational database usually do not lead to data integrity problems, errors in key-value stores usually lead to such problems.

Another advantage of relational databases is that they force you through the process of developing a data model. If you have a well-designed model, the database will contain a logical structure that fully reflects the structure of the stored data rather than the structure of the application. Thus, the data becomes application independent. This means that another application can use the same data and the application logic can be changed without any changes to the underlying model. To do the same with key-value storage, try replacing the relational model design process with class design, which creates generic classes based on the natural data structure.

And don't forget about compatibility. Unlike relational databases, cloud-based stores have few common standards. Although conceptually they do not differ much, they all have different APIs, query interfaces, and peculiarities. So you had better trust your vendor, because if something happens you will not be able to switch easily to another service provider. And given that almost all modern key-value stores are in beta [2], trusting them is even riskier than trusting a relational database.

Limited data analytics

Typically, all cloud storage is multi-tenant, which means that a large number of users and applications share the same system. To prevent any one of them from "hijacking" the shared system, vendors usually restrict query execution in some way. For example, in SimpleDB a query cannot run longer than 5 seconds. Google AppEngine Datastore cannot retrieve more than 1000 records per request [3].

These restrictions are not scary for simple logic (creating, updating, deleting and retrieving a small number of records). But what if your app becomes popular? You've got a lot of new users and a lot of new data, and now you want to build new experiences for users, or somehow benefit from the data. This is where you can run into trouble even with simple data-analysis queries. Features like tracking app usage patterns or a recommendation system based on user history will be hard to implement at best, and simply impossible at worst.

In this case, it is better to create a separate database for analytics that is filled with data from your key-value store. Think in advance about how this can be done. Will you host the server in the cloud or on your own premises? Will latency between you and your provider be a problem? Does your store support this kind of data transfer? If you have 100 million records and can fetch 1000 records at a time, how long will it take to transfer all the data?
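The last question is easy to estimate. The record count and batch size below come from the text; the half-second round trip per request is purely an assumption, so substitute a figure measured against your own provider.

```python
import math

def transfer_estimate(total_records, batch_size, seconds_per_request):
    """Return (number of requests, total seconds) for a sequential export."""
    requests = math.ceil(total_records / batch_size)
    return requests, requests * seconds_per_request

# 0.5 s per request is an assumed round-trip time, not a measured one.
requests, seconds = transfer_estimate(100_000_000, 1000, 0.5)
print(requests)                  # 100000
print(round(seconds / 3600, 1))  # 13.9 (hours)
```

Even under this optimistic assumption a full sequential export takes the better part of a day, which is why the export path is worth planning in advance.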

However, don't put scalability above everything else. It will be useless if your users decide to switch to another service because it offers more features and settings.

Cloud storage

Many web service providers offer multi-tenant key-value stores. Most of them meet the criteria listed above, but each has its own distinctive features and differs from the standards described above. Let's take a look at specific example repositories such as SimpleDB, Google AppEngine Datastore, and SQL Data Services.

Amazon: SimpleDB

SimpleDB is an attribute-oriented key-value store that is part of Amazon Web Services. SimpleDB is in beta; users can use it for free as long as their needs do not exceed a certain limit.

SimpleDB has several limitations. First, the query execution time is limited to 5 seconds. Second, there are no data types other than strings. Everything is stored, retrieved, and compared as a string, so in order to compare dates you will need to convert them to ISO 8601 format. Third, the maximum size of any string is 1024 bytes, which limits the size of the text (e.g. a product description) that you can store as an attribute. However, since the data structure is flexible, you can work around this limitation by adding the attributes "Product Description1", "Product Description2", and so on. But the number of attributes is also limited, to a maximum of 256. While SimpleDB is in beta, the domain size is limited to 10 gigabytes, and the entire database cannot be more than 1 terabyte.
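Why ISO 8601? When everything is compared as a string, the comparison is character by character, so only formats whose text order matches their real order behave correctly. A quick check in plain Python (this does not call SimpleDB itself; the sample values are arbitrary):

```python
from datetime import date

# Numbers stored as strings sort correctly only when zero-padded.
unpadded_ok = "9" < "10"   # False: '9' comes after '1'
padded_ok = "09" < "10"    # True

# ISO 8601 dates sort as text in the same order as they do in time.
earlier = date(2009, 2, 5).isoformat()    # '2009-02-05'
later = date(2009, 11, 20).isoformat()    # '2009-11-20'
iso_ok = earlier < later                  # True

# A day-first format compared as text gives the wrong answer.
day_first_ok = "5.2.2009" < "20.11.2009"  # False

print(unpadded_ok, padded_ok, iso_ok, day_first_ok)  # False True True False
```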

One of the key features of SimpleDB is its use of an eventual consistency model. This model works well for concurrent access, but keep in mind that after you change the value of an attribute in a record, subsequent reads may not see the change right away. The probability of this is fairly low, but it must be kept in mind. You don't want to sell the last ticket to five customers just because your data was inconsistent at the time of sale.
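The effect can be imitated with a toy model in which writes go to one copy of the data and reads are served from another that catches up later. This is only an illustration of the idea and says nothing about how SimpleDB is actually implemented.

```python
class EventuallyConsistentStore:
    """Toy model: writes land on a primary copy and reach the replica
    only when sync() runs."""

    def __init__(self):
        self.primary = {}
        self.replica = {}

    def write(self, key, value):
        self.primary[key] = value

    def read(self, key):
        # Reads are served from the replica, which may lag behind.
        return self.replica.get(key)

    def sync(self):
        self.replica.update(self.primary)

store = EventuallyConsistentStore()
store.write("tickets_left", 1)
store.sync()
store.write("tickets_left", 0)      # the last ticket is sold
stale = store.read("tickets_left")  # replica has not caught up yet
store.sync()
fresh = store.read("tickets_left")
print(stale, fresh)  # 1 0
```

A second buyer who reads during the stale window still sees one ticket left, which is exactly the double-sale scenario described above.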

Google: AppEngine Datastore

Google's AppEngine Datastore is built on top of BigTable, Google's internal structured data storage system. The AppEngine Datastore does not provide direct access to BigTable, but can be thought of as a simplified interface to interact with BigTable.

AppEngine Datastore supports more data types within a single record than SimpleDB, for example lists, which can hold collections within a record.

Most likely you will use this data store when developing with the Google AppEngine. However, unlike SimpleDB, you cannot use the AppEngine Datastore (or BigTable) outside of Google Web Services.

Microsoft: SQL Data Services

SQL Data Services is part of the Microsoft Azure platform. SQL Data Services is free, in beta and has database size limits. SQL Data Services is a separate application - an add-on over many SQL servers that store data. These stores can be relational, but for you SDS is a key-value store like the products described above.

Non-cloud storage

There are also a number of storage facilities that you can use outside the cloud by installing them on your own. Almost all of these projects are young, in alpha or beta, and open source. With open source, you may be more aware of potential problems and limitations than with proprietary products.

CouchDB

CouchDB is a free and open source document-oriented database. JSON is used as the data storage format. CouchDB aims to fill the gap between document-oriented and relational databases using "views". Such views contain data from documents in a form similar to tabular, and allow you to build indexes and run queries.

CouchDB is not a truly distributed database at this time. It has replication features to keep data synchronized between servers, but this is not the kind of distribution needed to build a highly scalable environment. However, the CouchDB developers are working on it.

Voldemort project

The Voldemort project is a distributed key-value database designed to scale horizontally across a large number of servers. It originated at LinkedIn and has been used there for several systems with high scalability requirements. The Voldemort project also uses an eventual consistency model.

Mongo

Mongo is a database developed at 10gen by Geir Magnusson and Dwight Merriman (who you may know from DoubleClick). Like CouchDB, Mongo is a document-oriented database that stores data in JSON format. However, Mongo is more of an object base than a pure key-value store.

Drizzle

Drizzle takes a very different approach to the problems that key-value stores are designed to deal with. Drizzle started out as a fork of MySQL 6.0. Later, the developers removed a number of features (including views, triggers, prepared statements, stored procedures, query cache, ACLs, and some data types) in order to create a simpler and faster DBMS. However, Drizzle can still be used to store relational data. The goal of the developers is to build a semi-relational platform designed for web and cloud applications running on systems with 16 or more cores.

Decision

Ultimately, there are four reasons why you might choose non-relational key-value storage for your application:

Your data is highly document-oriented and is more suited to a key-value data model than a relational data model.
Your domain model is highly object-oriented, so using key-value storage will reduce the amount of extra code to transform the data.
The data warehouse is cheap and easy to integrate with your vendor's web services.
Your main concern is high scalability on demand.

However, when making your decision, be aware of the limitations of specific databases and the risks that you will encounter if you choose to use non-relational databases.

For all other requirements, it is better to choose the good old relational DBMS. So are they doomed? Of course not. At least for now.

1 - in my opinion, the term "data structure" is more appropriate here, but I kept the original "data model".
2 - most likely, the author had in mind that in terms of their capabilities, non-relational databases are inferior to relational ones.
3 - the data may be out of date, the article is dated February 2009.

The advent of computer technology marked an information revolution in all spheres of human activity. To keep all that information from turning into useless garbage on the global Internet, database systems were invented, in which material is sorted and systematized so that it is easy to find and pass on for further processing. There are three main types of databases: relational, hierarchical, and network.

Fundamental models

Databases emerged through a fairly complex process that began with the development of programmable information-processing equipment. It is not surprising, then, that more than 50 database models exist today, but the main ones are the hierarchical, relational, and network models, which are still widely used in practice. What are they?

The hierarchical model has a tree structure and consists of data at different levels connected by links. The network model is more complex: its structure resembles the hierarchical one, but the scheme is extended and improved. The difference between them is that a child record in the hierarchical model can be linked to only one ancestor, while in the network model it can have several. The structure of a relational database is considerably more complex, so it deserves a more detailed look.

Basic concept of a relational database

This model was developed in the 1970s by Edgar Codd, Ph.D. It presents data as logically structured tables whose fields describe the data, together with the relationships between the data, the operations performed on them, and, most importantly, the rules that guarantee their integrity. Why is the model called relational? Because it is based on relations (from Lat. relatio) between data. There are many definitions of this type of database. Relational tables are much easier to organize and process than data in a network or hierarchical model. How? It is enough to know the features and structure of the model and the properties of relational tables.

The process of modeling and drawing up the main elements

To create your own database, you should pick one of the modeling tools, think through what information you need to work with, design the tables and the single and multiple relationships between the data, fill in the entity cells, and set the primary and foreign keys.

Table modeling and relational database design are done with free tools such as Workbench, phpMyAdmin, Case Studio, and dbForge Studio. After detailed design, save the finished graphical relational model and translate it into SQL code. At that point you can start sorting, processing, and systematizing the data.

Features, structure, and terms related to the relational model

Each source describes its elements in its own way, so for less confusion I would like to give a small hint:

relational table = entity;
layout = attributes = field names = entity column headers;
entity instance = tuple = record = table row;
attribute value = entity cell = field.

To get to the properties of a relational database, you need to know what basic components it consists of and what they are for.

Entity. A relational database may consist of a single table or of a whole set of tables that characterize the described objects through the data stored in them. Tables have a fixed number of fields and a variable number of records. A table in the relational model is made up of rows, attributes, and a layout.
Record. A row of data characterizing the described object; the number of records is variable. Records are numbered automatically by the system.
Attributes. Data describing the columns of the entity.
Field. An entity column. The number of fields is fixed and is set when the table is created or modified.

Now, knowing the constituent elements of the table, you can go to the properties of the database relational model:

Relational database entities are two-dimensional. Thanks to this property, it is easy to perform various logical and mathematical operations on them.
The order of attributes and records in a relational table can be arbitrary.
Each column within a relational table must have its own unique name.
All data in an entity column has a fixed length and the same type.
Each record in an entity is treated as a single data item.
Every row is unique: there are no duplicate rows in a relational entity.

Based on these properties, the values of an attribute must be of the same type and length. Let's consider the features of attribute values.

Basic characteristics of relational database fields

Field names must be unique within an entity. The attribute (field) type describes what category of data is stored in the entity's fields. A relational database field must have a fixed size, measured in characters. The parameters and format of attribute values determine how the data in them can be edited. There is also the notion of a "mask", or "input pattern", which defines the form in which data may be entered into an attribute value. If something invalid is written to a field, an error message should be issued. Fields are also subject to restrictions, that is, conditions that check that the data entered is accurate and error-free. Some attribute values are required and must always be filled in. Other attributes may contain NULL values, meaning that empty data is allowed. Alongside error notifications, there are values filled in automatically by the system: the default data. An indexed field is designed to speed up searching for data.
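Several of these field characteristics map directly onto column definitions in SQL. Below is a minimal sketch with Python's sqlite3 module; the students table, its columns, and the 0 to 5 grade scale are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (
        student_id INTEGER PRIMARY KEY,                  -- filled in by the system
        full_name  TEXT NOT NULL,                        -- required value
        grp        TEXT NOT NULL DEFAULT 'unassigned',   -- default data
        phone      TEXT,                                 -- NULL is allowed
        average    REAL CHECK (average BETWEEN 0 AND 5)  -- validity condition
    );
""")

conn.execute("INSERT INTO students (full_name) VALUES ('Ivanov I. I.')")
defaults = conn.execute("SELECT grp, phone FROM students").fetchone()
print(defaults)  # ('unassigned', None)

# Invalid input is rejected with an error instead of being stored.
errors = []
for sql in (
    "INSERT INTO students (full_name) VALUES (NULL)",
    "INSERT INTO students (full_name, average) VALUES ('Petrov P. P.', 7.5)",
):
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError as exc:
        errors.append(str(exc))
print(len(errors))  # 2
```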

2D relational database table schema

For a detailed understanding of the model using SQL, it is best to consider the schema by example. We already know what a relational database is. A record in each table is one data item. To prevent data redundancy, it is necessary to perform normalization operations.

Basic rules for normalizing a relational entity

1. Each field of a relational table must hold a single atomic value, and field names within the table must be unique (first normal form, 1NF).

2. In a table already reduced to 1NF, every non-key column must depend on the table's whole unique identifier (2NF).

3. In a table already in 2NF, no non-key field may depend on another non-key field (3NF).
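The third rule is the easiest to show on data. In the sketch below, a hypothetical "curator" column depends on the group rather than on the student, so it is moved into its own table; the names and rows are invented for the example.

```python
# Flat rows: (student_id, name, group, curator). The curator depends on the
# group, a non-key field, which is the dependency 3NF removes.
flat = [
    (1, "Ivanov", "IN-41", "Prof. Orlova"),
    (2, "Petrov", "IN-41", "Prof. Orlova"),
    (3, "Sidorov", "IN-72", "Prof. Belov"),
]

# Decompose: students keep facts about the student, groups keep facts
# about the group, so each curator is stored exactly once.
students = {sid: (name, grp) for sid, name, grp, _ in flat}
groups = {grp: curator for _, _, grp, curator in flat}

# Joining the two tables on the group restores the original rows.
rejoined = [
    (sid, name, grp, groups[grp])
    for sid, (name, grp) in sorted(students.items())
]
print(rejoined == flat)  # True
print(len(groups))       # 2
```

After the split, changing the curator of a group is a single update instead of one update per student.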

Databases: relational relationships between tables

There are two main types of relationships between relational tables:

One-to-many. Occurs when one key record of table 1 corresponds to several instances of the second entity. A key icon at one end of the drawn line indicates that the entity is on the "one" side; the other end of the line is often marked with an infinity symbol.

Many-to-many. Formed when several rows of one entity are explicitly logically related to a number of records in another table.

If a one-to-one relationship occurs between two entities, meaning that the key identifier of one table is present in the other entity, then one of the tables is redundant and should be removed. But sometimes, purely for security reasons, programmers deliberately keep the two separate. So a one-to-one relationship may also exist.

The existence of keys in a relational database

Primary and foreign keys define the potential relationships in a database. A relation may have several candidate keys, but only one of them is chosen as the primary key. What is it? A primary key is an entity column or set of attributes through which you can access the data of a particular row. It must be unique, and its fields cannot contain empty values. If the primary key consists of a single attribute it is called simple; otherwise it is composite.

In addition to the primary key, there is also the foreign key. Many people do not understand the difference between them. Let's look at them more closely using an example. There are two tables: "Dean's office" and "Students". The "Dean's office" entity contains the fields "Student ID", "Full name", and "Group". The "Students" table has attributes such as "Name", "Group", and "Average". Since a student ID cannot be the same for several students, this field is the primary key. "Name" and "Group" in the "Students" table can be the same for several people, so they cannot identify a row on their own; the rows of "Students" refer to the student ID in the "Dean's office" entity, and the field that holds this reference serves as the foreign key.

Relational Database Model Example

For clarity, here is a simple example of a relational database model consisting of two entities. The first is a table called "Dean's office".

To get a full-fledged relational database, you need to establish relationships. A group record such as "IN-41" or "IN-72" may appear more than once in the "Dean's office" table, and in rare cases the surname, first name, and patronymic of students may also coincide, so these fields cannot serve as the primary key. Next comes the "Students" entity.

As we can see, the field types in relational databases vary widely: there are both numeric and character entries. Therefore the attribute settings should specify types such as integer, char, varchar, date, and others. In the "Dean's office" table, only the student ID is a unique value, so this field can be taken as the primary key. In the "Students" entity, the field that refers to the student ID acts as the foreign key, while name, group, and phone number are ordinary attributes. With that, the relationship is established. This is an example of a one-to-one relationship. In principle one of the tables is superfluous, and the two could easily be combined into one entity. Keeping two tables is nonetheless quite realistic, for example to prevent student ID numbers from becoming widely known.

Database (DB) - a named set of structured data related to a specific subject area and intended for storage, accumulation, and processing by a computer.

Relational Database (RDB) - a set of relations whose names coincide with the names of the relation schemas in the database schema.

Basic concepts of relational databases:

· Data type - the type of the values of a particular column.

· Domain - the set of all valid values of an attribute.

· Attribute - the heading of a table column that characterizes a named property of an object, for example, a student's surname, the date of an order, the gender of an employee, etc.

· Tuple - a table row, which is a collection of values of logically related attributes.

· Relation - a table reflecting information about objects in the real world, for example, about students, orders, employees, residents, etc.

· Primary key - a field (or set of fields) in a table that uniquely identifies each of its records.

· Alternate key - a field (or set of fields) that does not coincide with the primary key and uniquely identifies a record instance.

· Foreign key - a field (or set of fields) whose values match existing primary key values of another table. When two tables are linked, the foreign key of the second table is linked to the primary key of the first table.

· Relational Data Model (RDM) - organization of data in the form of two-dimensional tables.

Each relational table must have the following properties:

1. Each record of the table is unique, i.e. the set of values across its fields is not repeated.

2. Each value written at the intersection of a row and a column is atomic (inseparable).

3. The values of each field must be of the same type.

4. Each field has a unique name.

5. The order of the records is not essential.
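Some of these properties can be illustrated by modelling a relation as a Python set of named tuples. This is only a sketch: the Student type and its sample rows are assumptions, and Python's type annotations are not enforced at run time, so the type check is written out explicitly.

```python
from typing import NamedTuple


class Student(NamedTuple):
    # Each field has a unique name (property 4).
    student_id: int
    surname: str
    group: str


# A set cannot hold duplicates (property 1) and has no order (property 5).
relation = {
    Student(1, "Ivanov", "IN-41"),
    Student(2, "Petrov", "IN-72"),
}
same_relation = {
    Student(2, "Petrov", "IN-72"),
    Student(1, "Ivanov", "IN-41"),
    Student(1, "Ivanov", "IN-41"),  # repeated tuple is silently dropped
}

# Property 3: every value in a column has the same type.
ids_are_ints = all(isinstance(row.student_id, int) for row in relation)

cardinality = len(relation)      # number of rows
degree = len(Student._fields)    # number of fields
```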

The main elements of the database:

Field - an elementary unit of logical organization of data. The following characteristics are used to describe the field:

· Name, for example, last name, first name, patronymic, date of birth;

· Type, for example, string, character, numeric, date;

· Length, for example, in bytes;

· Precision for numeric data, for example, two decimal places to display the fractional part of a number.

Record - a set of values of logically related fields.

Index - a means of speeding up record searches, also used to establish relationships between tables. A table for which an index is used is called indexed. Indexes are classified by how they are organized. A simple index is built on a single field or on a logical expression that processes a single field. A composite index is built on several fields and may use various functions. Table indexes are stored in an index file.
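As a rough illustration of simple and composite indexes, the sketch below uses SQLite through Python's sqlite3 module. The tables, columns and index names are assumptions; SQLite keeps its indexes inside the database file rather than in a separate index file, so only the idea of the two index kinds carries over.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (student_id INTEGER PRIMARY KEY, surname TEXT, group_code TEXT)"
)
conn.execute("CREATE TABLE grades (student_id INTEGER, subject TEXT, score INTEGER)")

# Simple index: a single field.
conn.execute("CREATE INDEX idx_students_surname ON students (surname)")
# Composite index: several fields.
conn.execute("CREATE INDEX idx_grades_student_subject ON grades (student_id, subject)")


def query_plan(sql, params=()):
    """Return SQLite's description of how it intends to run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail text


simple_plan = query_plan("SELECT * FROM students WHERE surname = ?", ("Ivanov",))
composite_plan = query_plan(
    "SELECT score FROM grades WHERE student_id = ? AND subject = ?",
    (1, "Databases"),
)
```

The plan text names the index that the engine would use for the search instead of scanning every record.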

Data integrity - a means of protecting data through linked fields, which keeps tables in a consistent state (that is, prevents the existence of records in the subordinate table that have no corresponding records in the parent table).
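A minimal sketch of this protection, assuming SQLite via Python's sqlite3 module and two hypothetical tables (study_groups as the parent, students as the subordinate table):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite

conn.execute("CREATE TABLE study_groups (group_code TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE students ("
    " student_id INTEGER PRIMARY KEY,"
    " group_code TEXT NOT NULL REFERENCES study_groups(group_code))"
)
conn.execute("INSERT INTO study_groups VALUES ('IN-41')")
conn.execute("INSERT INTO students VALUES (1, 'IN-41')")

# A subordinate record without a parent record is refused.
try:
    conn.execute("INSERT INTO students VALUES (2, 'IN-99')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# A parent record that still has subordinate records cannot be deleted.
try:
    conn.execute("DELETE FROM study_groups WHERE group_code = 'IN-41'")
    parent_delete_rejected = False
except sqlite3.IntegrityError:
    parent_delete_rejected = True

student_count = conn.execute("SELECT COUNT(*) FROM students").fetchone()[0]
```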

Query - a question formulated against one or more interrelated tables, containing the criteria for selecting data. Queries are expressed in Structured Query Language (SQL). Fetching data from one or more tables yields a set of records, called a view.

View - a named query for selecting data (from one or several tables), saved in the database.

A view is essentially a temporary table that results from a query. The query result itself can be sent to a separate file, a report, a temporary table, a table on disk, etc.
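The idea of a view as a saved, named query can be sketched in SQLite through Python's sqlite3 module. The students table, its rows and the view name group_in41 are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (student_id INTEGER PRIMARY KEY, surname TEXT, group_code TEXT);
INSERT INTO students VALUES
    (1, 'Ivanov', 'IN-41'),
    (2, 'Petrov', 'IN-72'),
    (3, 'Sidorov', 'IN-41');

-- The view stores the query text, not a copy of the data.
CREATE VIEW group_in41 AS
    SELECT student_id, surname FROM students WHERE group_code = 'IN-41';
""")

before = conn.execute("SELECT surname FROM group_in41 ORDER BY student_id").fetchall()

# New data in the base table shows up the next time the view is queried.
conn.execute("INSERT INTO students VALUES (4, 'Kuznetsov', 'IN-41')")
after = conn.execute("SELECT surname FROM group_in41 ORDER BY student_id").fetchall()
```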

Report - a system component whose main purpose is to describe and print documents based on information from the database.

General characteristics of working with RDBs:

The most common interpretation of the relational data model appears to be that of Date, who reproduces it (with various refinements) in virtually all of his books. According to Date, the relational model consists of three parts that describe different aspects of the relational approach: the structural part, the manipulation part, and the integrity part.

The structural part of the model states that the only data structure used in relational databases is a normalized n-ary relation.

The manipulation part of the model establishes two fundamental mechanisms for manipulating relational databases: relational algebra and relational calculus. The first is based mainly on classical set theory (with some refinements), and the second on the classical logical apparatus of first-order predicate calculus. Note that the main function of the manipulation part of the relational model is to provide a measure of how relational any specific database language is: a language is called relational if it has no less expressiveness and power than relational algebra or relational calculus.
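Three of the relational algebra operations (selection, projection and natural join) can be sketched in plain Python over lists of dictionaries. This is an illustrative toy, not a full algebra; the function names and the sample relations are assumptions.

```python
def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Projection: keep only the named attributes and drop duplicate tuples."""
    result = []
    for row in relation:
        reduced = {name: row[name] for name in attributes}
        if reduced not in result:
            result.append(reduced)
    return result


def natural_join(left, right):
    """Natural join: combine tuples that agree on all shared attributes."""
    result = []
    for l_row in left:
        for r_row in right:
            shared = set(l_row) & set(r_row)
            if all(l_row[name] == r_row[name] for name in shared):
                result.append({**l_row, **r_row})
    return result


students = [
    {"student_id": 1, "surname": "Ivanov", "group": "IN-41"},
    {"student_id": 2, "surname": "Petrov", "group": "IN-72"},
    {"student_id": 3, "surname": "Sidorov", "group": "IN-41"},
]
phones = [
    {"student_id": 1, "phone": "555-0101"},
    {"student_id": 3, "phone": "555-0103"},
]

in41 = select(students, lambda row: row["group"] == "IN-41")
groups = project(students, ["group"])
with_phones = natural_join(in41, phones)
```

Each operation takes relations and returns a relation, so the results can be fed into further operations, as the join over the selection shows.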

28. ALGORITHMIC LANGUAGES. TRANSLATORS (INTERPRETERS AND COMPILERS). THE ALGORITHMIC LANGUAGE BASIC. STRUCTURE OF THE PROGRAM. IDENTIFIERS. VARIABLES. OPERATORS. HANDLING OF ONE- AND TWO-DIMENSIONAL ARRAYS. USER FUNCTIONS. SUBPROGRAMS. WORKING WITH DATA FILES.

High-level language - a programming language whose concepts and structure are convenient for human perception.

Algorithmic language (programming language) - an artificial (formal) language designed for writing algorithms. A programming language is defined by its description and implemented as a special program: a compiler or an interpreter. Examples of algorithmic languages are Borland Pascal, C++, Basic, etc.

Basic concepts of an algorithmic language:

Language composition:

Ordinary spoken language is made up of four basic elements: symbols, words, phrases, and sentences. An algorithmic language contains similar elements, except that words are called elementary constructions, phrases are called expressions, and sentences are called operators.

Symbols, elementary constructions, expressions and operators constitute a hierarchical structure, since elementary constructions are formed from a sequence of symbols.

Expression - a sequence of elementary constructions and symbols.

Operator - a sequence of expressions, elementary constructions and symbols.

Language description:

The description of symbols consists of listing the allowed symbols of the language. The description of elementary constructions means the rules for forming them. The description of expressions is the set of rules for forming any expressions that make sense in the given language. The description of operators consists of considering all types of operators allowed in the language. The description of each element of the language is given by its SYNTAX and SEMANTICS.

Syntactic definitions establish rules for constructing language elements.

Semantics defines the meaning and the rules of use of those language elements for which syntactic definitions have been given.

Language symbols - the basic indivisible signs in terms of which all texts in the language are written.

Elementary constructions are the minimum units of a language that have an independent meaning. They are formed from the basic symbols of the language.

An expression in an algorithmic language consists of elementary constructions and symbols; it specifies the rule for calculating a certain value.

An operator specifies a complete description of some action to be performed. A group of operators may be required to describe a complex action.

In this case, the operators are combined into a compound operator or block. The actions specified by operators are performed on data. Sentences of an algorithmic language that provide information about data types are called declarations or non-executable operators. A set of declarations and operators united by a single algorithm forms a program in the algorithmic language. When learning an algorithmic language, it is necessary to distinguish the language itself from the language used to describe it. Usually the language being studied is simply called the language, and the language in terms of which it is described is called the metalanguage.

Translator - a program that converts a program written in a high-level language into a program consisting of machine instructions.

A program written in any high-level algorithmic language cannot be directly executed on a computer. The computer understands only the language of machine instructions. Consequently, a program in an algorithmic language must be translated into the instruction language of a specific computer. Such translation is carried out automatically by special translator programs created for each algorithmic language and for each type of computer.

There are two main ways to translate: compilation and interpretation.

1. Compilation: a compiler reads the entire program as a whole, translates it and creates a complete version of the program in machine language, which is then executed.

During compilation, the entire source program is turned into a sequence of machine instructions at once. The resulting program is then executed by the computer with the available input data. The advantage of this method is that the translation is performed once, and the (repeated) execution of the resulting program can be carried out at high speed. At the same time, the resulting program can take up a lot of space in the computer memory, since one language operator is replaced during translation by hundreds or even thousands of instructions. In addition, debugging and modifying the translated program is very difficult.

2. Interpretation: an interpreter translates and executes the program line by line.

During interpretation, the source program is stored in the computer memory almost unchanged. The interpreter decodes the statements of the source program one by one and immediately executes them with the available data. The interpreted program takes up little space in the computer's memory and is easy to debug and modify. On the other hand, the execution of the program is rather slow, since on each run all the operators are interpreted in turn.

Compiled programs run faster, but interpreted programs are easier to fix and change.

Each particular language is oriented either towards compilation or interpretation, depending on the purpose for which it was created. For example, Pascal is usually used to solve rather complex problems in which the speed of programs is important. Therefore, this language is usually implemented using a compiler.

On the other hand, BASIC was created as a language for novice programmers, for whom line-by-line program execution has undeniable advantages.

Sometimes there is both a compiler and an interpreter for the same language. In this case, you can use the interpreter to develop and test the program, and then compile the debugged program to speed up its execution.
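The difference between the two approaches can be loosely imitated in Python with the built-in compile and exec functions. This is an analogy only: Python translates to bytecode rather than machine instructions, and the three-line program and the helper names below are assumptions made for the sketch.

```python
source_lines = ["a = 6", "b = 7", "result = a * b"]


def run_interpreted(lines):
    """Translate and execute one line at a time, on every run."""
    env = {}
    for line in lines:
        exec(line, env)  # each line is parsed again whenever the program runs
    return env["result"]


def translate_once(lines):
    """Translate the whole program once into a reusable code object."""
    return compile("\n".join(lines), "<program>", "exec")


def run_translated(code):
    """Execute an already translated program without re-reading its source."""
    env = {}
    exec(code, env)
    return env["result"]


interpreted_result = run_interpreted(source_lines)

program = translate_once(source_lines)  # translation happens a single time
translated_results = [run_translated(program) for _ in range(3)]
```

Both routes compute the same value; what differs is whether the translation cost is paid once or on every run.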

DBMS functions.

DBMS functions are divided into high-level and low-level functions.

High level functions:

1. Data definition - this function determines what information will be stored in the database (the type and properties of the data and how the data items are related to each other).

2. Data processing. Information can be processed in different ways: selection, filtering, sorting, combining one piece of information with another, calculating totals.

3. Data management. This function specifies who is allowed to view the data, correct it or add new information, and defines the rules for shared access.
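The first two functions can be sketched with SQLite through Python's sqlite3 module. The orders table and its rows are assumptions. The third function (access rights) is not shown, because SQLite has no user accounts or permissions of its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: what is stored, of which type, with which properties.
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " customer TEXT NOT NULL,"
    " amount INTEGER NOT NULL,"
    " order_date TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "Alfa", 120, "2024-01-10"),
        (2, "Beta", 80, "2024-01-11"),
        (3, "Alfa", 50, "2024-02-01"),
        (4, "Gamma", 200, "2024-02-03"),
    ],
)

# Data processing: filtering and sorting.
large_orders = conn.execute(
    "SELECT order_id FROM orders WHERE amount >= 100 ORDER BY amount DESC"
).fetchall()

# Data processing: calculating totals per customer.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
```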

Low level functions:

1. Data management in external memory;

2. RAM buffer management;

3. Transaction management;

4. Maintaining a log of changes to the database;

5. Ensuring the integrity and security of the database.

A transaction is an indivisible sequence of operations that is monitored by the DBMS from start to finish and in which, if one operation fails, the entire sequence is canceled.
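A minimal sketch of this all-or-nothing behaviour, assuming SQLite via Python's sqlite3 module and a hypothetical accounts table whose balance may not go negative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(src, dst, amount):
    """Move money between accounts; both updates succeed or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    return dict(conn.execute("SELECT name, balance FROM accounts"))


ok_transfer = transfer("alice", "bob", 30)       # both operations succeed
failed_transfer = transfer("alice", "bob", 500)  # the debit fails, the credit is undone
final_balances = balances()
```

In the failed transfer the credit to the second account has already been applied when the debit violates the constraint, so the rollback is what keeps the data consistent.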

DBMS log - a special database or part of the main database, inaccessible to the user and used to record information about all changes to the database.

Maintaining the DBMS log is intended to ensure reliable storage of data in the database in the presence of hardware faults and failures, as well as software errors.

Database integrity - a property of a database meaning that it contains complete, consistent information that adequately reflects the subject area.

DBMS classification.

DBMS can be classified:

1. By types of programs:

a. Database servers (for example, MS SQL Server, InterBase (Borland)) - designed for organizing data processing centers in computer networks; they implement the database management functions requested by client programs using SQL statements (i.e. they are programs that respond to queries);

b. DB clients - programs that request data. Full-featured DBMSs, spreadsheets, word processors and e-mail programs can be used as client programs;

c. Full-featured DBMSs (MS Access, MS FoxPro) - programs with a developed interface that allow you to create and modify tables, enter data, create and format queries, develop reports and print them.

2. According to the DBMS data model (as well as the DB):

a. Hierarchical - based on a tree structure of information storage, resembling the file system of a computer; the main disadvantage is the inability to implement a many-to-many relationship;

b. Network - these replaced the hierarchical ones but did not last long, their main drawback being the complexity of developing serious applications. The main difference between the network and hierarchical models is that in a hierarchical structure a child record has only one ancestor, whereas in a network structure a descendant can have any number of ancestors;

c. Relational - data is located in tables, between which there are certain links;

d. Object-oriented - data is stored in the form of objects, and the main advantage is that an object-oriented approach can be applied when working with them;

e. Hybrid, i.e. object-relational - combine the capabilities of relational and object-oriented databases. An example of such a DBMS is Oracle (previously it was relational).

3. Depending on the location of the individual parts of the DBMS, there are:

a. local - all parts of which are located on one computer;

b. network.

Network DBMSs include:

- with a file-server organization;

With such an organization, all data is located on one computer, called the file server, which is connected to the network. To find the necessary information, the entire file is transferred, including a lot of redundant information. Only after a local copy has been created is the required record found.

- with a client-server organization;

The database server receives a request from the client, looks up the required record in the data and transfers it to the client. The request to the server is formulated in the structured query language SQL, which is why database servers are called SQL servers.

- distributed DBMSs, which contain tens or hundreds of servers located over a large territory.

Basic provisions of the relational database model.

A relational database is a database in which all data is organized in the form of tables, and all operations on this data are reduced to operations on tables.

Features of relational databases:

1. Data is stored in tables consisting of columns and rows;

2. There is one value at the intersection of each column and row;

3. Each column (field) has its own name, which serves as the name of the attribute, and all values in one column are of the same type;

4. Columns are arranged in a certain order, which is specified when creating a table, as opposed to rows, which are arranged in an arbitrary order. The table may have no rows at all, but it must have at least one column.

Relational database terminology:

Relational database element	Presentation form
1. Database	Set of tables
2. Database schema	Set of table headers
3. Relation	Table
4. Relation schema	Row of table column headings
5. Entity	Description of object properties
6. Attribute	Column heading
7. Domain	Set of valid attribute values
8. Primary key	A unique identifier that uniquely identifies each record in the table
9. Data type	The type of values of elements in a table
10. Tuple	Row (record)
11. Cardinality	Number of rows in the table
12. Degree of a relation	Number of fields
13. Body of a relation	Set of the relation's tuples

When designing a relational database, data is placed in several tables. Relationships are established between tables using keys. When linking tables, the main and additional (subordinate) tables are selected.

There are the following types of relationships between tables:

1. A 1:1 (one-to-one) relationship means that each record in the main table corresponds to one record in the additional table and, conversely, each record in the additional table corresponds to one record in the main table.

2. A 1:M (one-to-many) relationship means that each record in the main table corresponds to several records in the additional table, while each record in the additional table corresponds to only one record in the main table.

3. An M:1 (many-to-one) relationship means that one or more records in the main table correspond to only one record in the additional table.

4. An M:M (many-to-many) relationship means that several records of the additional table correspond to several records of the main table and vice versa.
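In practice, a many-to-many relationship is usually built from two one-to-many relationships through a third, linking table. The sketch below assumes SQLite via Python's sqlite3 module; the students, courses and enrollments tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE students (student_id INTEGER PRIMARY KEY, surname TEXT NOT NULL);
CREATE TABLE courses  (course_id  INTEGER PRIMARY KEY, title   TEXT NOT NULL);

-- Linking table: each row pairs one student with one course.
CREATE TABLE enrollments (
    student_id INTEGER NOT NULL REFERENCES students(student_id),
    course_id  INTEGER NOT NULL REFERENCES courses(course_id),
    PRIMARY KEY (student_id, course_id)
);

INSERT INTO students VALUES (1, 'Ivanov'), (2, 'Petrov');
INSERT INTO courses  VALUES (10, 'Databases'), (20, 'Programming');
INSERT INTO enrollments VALUES (1, 10), (1, 20), (2, 10);
""")

# One student corresponds to several courses ...
courses_of_ivanov = conn.execute(
    "SELECT c.title FROM enrollments AS e "
    "JOIN courses AS c ON c.course_id = e.course_id "
    "WHERE e.student_id = 1 ORDER BY c.title"
).fetchall()

# ... and one course corresponds to several students.
students_in_databases = conn.execute(
    "SELECT s.surname FROM enrollments AS e "
    "JOIN students AS s ON s.student_id = e.student_id "
    "WHERE e.course_id = 10 ORDER BY s.surname"
).fetchall()
```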

5. Main components of MS Access.

The main components (objects) of MS Access are:

1. Tables;

2. Queries;

3. Forms;

4. Reports;

5. Macros;

6. Modules.

Table - an object designed to store data in the form of records (rows) and fields (columns). Each field contains a separate part of a record, and each table is used to store information about one specific subject.

Query - a question about data stored in tables, or an instruction for selecting records to be changed.

Form - an object in which you can place controls for entering, displaying and changing data in the fields of tables.

Report - an object that allows you to present user-defined information in a certain way, view it and print it.

Macro - one or more macro commands that can be used to automate a specific task. A macro command is the main building block of a macro: a stand-alone instruction that can be combined with other macro commands to automate a task.

Module - a set of declarations, instructions and procedures stored under one name. There are three types of modules in MS Access: form module, report module and general module. Form and report modules contain program code local to their forms and reports.

6. Tables in MS Access.

There are the following methods for creating tables in MS Access:

1. Table mode;

2. Constructor;

3. Table Wizard;

4. Import of tables;

5. Linking to tables.

In table mode, data is entered into an empty table. A table with 30 fields is provided for data entry. After the table is saved, MS Access itself decides which data type to assign to each field.

Constructor provides the ability to create fields independently, select data types for fields, set field sizes and set field properties.

To define a field in Constructor mode, the following are set:

1. Field name, which must be unique within each table and is a combination of letters, numbers, spaces and special characters, except for " .!” “ ". The maximum name length is 64 characters.

2. Data type, which defines the type and range of valid values, as well as the amount of memory allocated for the field.

MS Access data types

Data type	Description
Text	Text and numbers, such as names and addresses, telephone numbers, postal codes (up to 255 characters).
Memo	Long text and numbers, such as comments and explanations (up to 64,000 characters).
Number	A general data type for numeric data that allows mathematical calculations, with the exception of monetary calculations.
Date/Time	Date and time values. The user can choose a standard format or create a custom one.
Currency (Monetary)	Monetary values. Using the Number data type for monetary calculations is not recommended, since values can be rounded off during calculation. Currency values are always displayed with the specified number of decimal places.
AutoNumber (Counter)	Automatically assigned sequential numbers. Numbering starts from 1. A counter field is convenient for creating a key. This field is compatible with a numeric field whose Field Size property is set to Long Integer.
Yes/No (Logical)	One of two possible values: "Yes / No", "True / False", "On / Off".
OLE Object	Objects created in other programs that support the OLE protocol.
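The warning about rounding in monetary calculations can be illustrated outside Access. In the sketch below Python's float and Decimal types are stand-ins chosen for the example (they are not the Access types): a binary floating-point number cannot represent 0.10 exactly, while a decimal type can.

```python
from decimal import Decimal

# Three payments of 0.10 added up with binary floating point.
float_total = 0.10 + 0.10 + 0.10
float_is_exact = (float_total == 0.30)

# The same sum with exact decimal arithmetic.
decimal_total = Decimal("0.10") + Decimal("0.10") + Decimal("0.10")
decimal_is_exact = (decimal_total == Decimal("0.30"))
```

Here float_is_exact is False and decimal_is_exact is True, which is the kind of discrepancy a dedicated monetary type is meant to avoid.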

3. The most important field properties:

- Field size sets the maximum size of the data stored in the field.

- Field format sets the display format for the given data type, i.e. the rules for presenting data on the screen or in print.

- Field caption sets the text that is displayed in tables, forms and reports.

- Condition on value lets you control input: it sets restrictions on the values entered and, if the condition is violated, prohibits the input and displays the text specified by the Error message property.

- Error message specifies the text of the message displayed on the screen when the restrictions set by Condition on value are violated.
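The interplay of a value condition and its error message can be imitated with a CHECK constraint in SQLite, used here through Python's sqlite3 module. This is an analogy rather than the Access mechanism; the exam_scores table, the 0 to 100 range and the message text are assumptions.

```python
import sqlite3

# Plays the role of the error message text (hypothetical wording).
ERROR_MESSAGE = "Score must be between 0 and 100"

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE exam_scores ("
    " student_id INTEGER NOT NULL,"
    # Plays the role of the condition on the value.
    " score INTEGER NOT NULL CHECK (score BETWEEN 0 AND 100))"
)


def enter_score(student_id, score):
    """Try to store a score; refuse invalid input and report the message."""
    try:
        with conn:
            conn.execute("INSERT INTO exam_scores VALUES (?, ?)", (student_id, score))
        return "saved"
    except sqlite3.IntegrityError:
        return ERROR_MESSAGE


first = enter_score(1, 87)    # satisfies the condition
second = enter_score(2, 140)  # violates the condition, input is prohibited
stored = conn.execute("SELECT student_id, score FROM exam_scores").fetchall()
```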

Control type - a property that is set on the Substitution (Lookup) tab in the table Constructor window. This property determines whether the field will be displayed in the table and in what form: as a field or as a combo box.

The unique (primary) key of a table can be simple or composite, consisting of several fields.

To define the key, the fields that make up the key are selected, and then the Key Field button on the toolbar is pressed or the Edit / Key Field command is executed.
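A composite key can be sketched in SQLite through Python's sqlite3 module, as a rough counterpart to selecting several key fields in Access. The exam_results table and its fields are assumptions: neither student_id nor subject is unique on its own, but the pair is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE exam_results ("
    " student_id INTEGER NOT NULL,"
    " subject TEXT NOT NULL,"
    " grade INTEGER,"
    " PRIMARY KEY (student_id, subject))"  # composite key made of two fields
)

conn.execute("INSERT INTO exam_results VALUES (1, 'Databases', 5)")
conn.execute("INSERT INTO exam_results VALUES (1, 'Programming', 4)")  # same student
conn.execute("INSERT INTO exam_results VALUES (2, 'Databases', 3)")    # same subject

# Repeating the whole (student_id, subject) pair violates the key.
try:
    conn.execute("INSERT INTO exam_results VALUES (1, 'Databases', 2)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM exam_results").fetchone()[0]
```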