Control and diagnostics system for electronic devices. Technical diagnostics of digital systems. Control and testing of the designed digital device

The widespread use of electronic devices for digital signal processing has increased interest in diagnosing their technical condition. One type of diagnostics for digital assemblies and blocks is test diagnostics. Used at the design and manufacturing stages, it makes it possible to verify that the assemblies function correctly and to carry out troubleshooting.

The essence of test control is to apply a test signal to a digital device and observe its response, which indicates whether the device is operable.

A test is a set of test signals.

A test program is an ordered sequence of tests.

There are two approaches to creating a test program, and two corresponding types of control are distinguished:

1) Functional: the algorithm by which the digital device operates is used as the initial information for building the test program. This approach does not detect a significant part of the possible malfunctions and is applied when information about the causes and nature of possible malfunctions is absent, when the monitored system is highly complex, or when the requirements for completeness of control are low.

2) Structural: data on the structure of the digital device and on the nature of possible malfunctions are used when developing the test program. This approach provides a fairly complete check of the operation of the digital device. For complex digital devices, however, structural control methods are ineffective because of the large number of circuit elements and the lack of adequate fault models for complex devices.

To show the testing problems more clearly, let us estimate the time required to test a typical microprocessor (MPK580) exhaustively.

The number of possible test combinations is in general $C = 2^{nm}$, where $n$ is the length of the data word in bits ($n = 8$) and $m$ is the number of commands in the microprocessor (MP) command system ($m = 76$). Then $C = 2^{8 \cdot 76} = 2^{608} \approx 10^{183}$. This is the total number of test combinations. Let each test last 1 μs. Then all tests take $t = 10^{183} \cdot 10^{-6}\ \text{s} = 10^{177}$ s. A 365-day year contains $3.15 \times 10^{7}$ s, so all tests would finish in about $0.3 \times 10^{170}$ years. For comparison, the age of the Earth is about $4.5 \times 10^{9}$ years.
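The estimate above can be checked with a few lines of Python. This is a minimal sketch that works in base-10 logarithms to avoid overflow; the function name and its parameters are illustrative and not part of the original text.

```python
import math

def exhaustive_test_time(n_bits, n_commands, test_seconds=1e-6):
    """Return log10 of (test combinations, total seconds, total years)."""
    # C = 2**(n*m), so log10(C) = n * m * log10(2)
    log10_c = n_bits * n_commands * math.log10(2)
    # Total time is C tests multiplied by the duration of one test
    log10_seconds = log10_c + math.log10(test_seconds)
    seconds_per_year = 365 * 24 * 3600  # about 3.15e7 s
    log10_years = log10_seconds - math.log10(seconds_per_year)
    return log10_c, log10_seconds, log10_years

log_c, log_s, log_y = exhaustive_test_time(8, 76)
print(f"C ~ 10^{log_c:.1f}, t ~ 10^{log_s:.1f} s, t ~ 10^{log_y:.1f} years")
```

The exponents agree with the figures in the text: roughly 10^183 combinations, 10^177 seconds, and a few times 10^169 years.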

Depending on the detail of the control object, when developing a test program, system and modular control methods are distinguished.

1) System control: the digital device is considered as a single whole, for which the test program is developed.

2) Modular control: the digital device is considered as a set of separate functional units (modules), and a separate test program is compiled for each of them. These programs are then combined into a test program for the complete system. Both the system and the modular approaches to building test programs can use functional as well as structural methods.

When developing test diagnostics, it is difficult to determine the reference responses of the circuits under test and the optimal number of control points at which the output response of the diagnosed digital circuit is captured. This can be done either by building a prototype of the digital device under development and diagnosing it with instruments, or by simulating both the digital device and the diagnostic process on a computer. The second approach is the most rational. It involves creating automated diagnostic systems that can diagnose digital circuits at the design stage and solve the following tasks:

1. Logical modeling of digital circuits on a computer. The goal of logical modeling is to reproduce the function of the designed circuit without physically implementing it. To check the states of signals inside the circuit, the response delays of all elements must be described accurately under synchronization conditions. If, for example, only the values of a logical function at the circuit output are checked, it is sufficient to represent the circuit at the level of logic elements.

2. Modeling of faults. The challenge of troubleshooting digital circuits is to determine if the digital circuit has the desired behavior. To solve this problem, it is necessary, first of all, to establish a model of a digital circuit as an object of control, then a method for detecting faults and, finally, a fault model. From the point of view of the features of the behavior of digital circuits, they can be divided into combinational and sequential. Combinational circuits are a relatively simple model in terms of fault detection. Sequential circuits with respect to behavior are characterized by the presence of internal feedback loops, therefore, the detection of faults in them in the general case is extremely difficult.

3. Simulation of the test diagnostics process. The classical strategy for testing digital circuits is based on generating test sequences that detect a given set of faults. For the testing procedure, as a rule, both the test sequences themselves and the reference output responses of the circuit to them are stored. During testing, the real output responses are compared with the reference ones, and a decision about the state of the tested circuit is made from the result. If the responses match the reference ones, the circuit is considered fault-free; otherwise it contains a fault and is in a faulty state.
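The three tasks above (logical modeling, fault modeling, and simulation of the test) can be illustrated together on a toy example. The sketch below is only an illustration under stated assumptions: the circuit y = (a AND b) OR (NOT c), the net names, and the single stuck-at fault model are all chosen here for the example and do not come from the original text.

```python
from itertools import product

def circuit(a, b, c, fault=None):
    """Gate-level model of y = (a AND b) OR (NOT c).

    fault is None for the fault-free circuit, or a pair (net_name, stuck_value)
    that forces one net to a constant (assumed stuck-at fault model).
    """
    def net(name, value):
        # Override the computed value if this net carries the injected fault
        if fault is not None and fault[0] == name:
            return fault[1]
        return value

    n1 = net("n1", a & b)
    n2 = net("n2", 1 - c)
    return net("y", n1 | n2)

# Test sequence: all input combinations; reference responses from the good model
tests = list(product((0, 1), repeat=3))
reference = [circuit(*t) for t in tests]

def passes(fault=None):
    """Classical test: compare actual responses with the stored reference ones."""
    responses = [circuit(*t, fault=fault) for t in tests]
    return responses == reference

faults = [(name, value) for name in ("n1", "n2", "y") for value in (0, 1)]
detected = [f for f in faults if not passes(f)]
print(f"{len(detected)} of {len(faults)} modeled faults detected")
```

For a circuit this small the exhaustive test detects every modeled fault; the point of the example is only the structure: a model, a fault list, stored reference responses, and a comparison.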

For a number of currently produced schemes, the classical approach requires significant time expenditures both for the formation of test sequences and for the testing procedure. In addition, large volumes of test information and reference output responses require sophisticated equipment to conduct a test experiment. As a result, the cost and time required to implement the classical approach grows faster than the complexity of the digital circuits for which it is used.

Therefore, new solutions are proposed that make it possible to significantly simplify both the procedure for constructing test sequences and conducting a test experiment. In the general case, the implementation of the proposed methods is shown by the diagram in Fig. 1.

GTV - generator of test stimuli (M-sequence generator);

CA - digital circuit;

Block of reference reactions - block storing compressed output reactions;

The logical interconnection of functional blocks is constructed as follows: from the generator of test influences through a digital circuit, signals are sent to the information compression circuit. The compressed output reactions go to the comparison circuit, where they are compared with the standards that are stored in the block of reference reactions. Then the information goes to the device for outputting information about the state of the circuit.

In compact testing, the simplest methods are used to implement the test sequence to avoid a complex synthesis procedure. These include the following synthesis algorithms:

1. Generation of all possible input test sets, i.e. exhaustive enumeration of binary combinations. Applying this algorithm produces the so-called counter sequences.

2. Generation of random test sets with the required probabilities of ones and zeros at each DS input.

3. Generation of pseudo-random sequences.
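The three generation algorithms listed above can be sketched as follows. This is a minimal illustration, not the generator of any particular test system: the function names are hypothetical, and the 4-bit shift register with feedback polynomial x^4 + x^3 + 1 is an assumption chosen because it yields a maximal-length (M) sequence of 15 states.

```python
import random
from itertools import product

def counter_sequence(n_inputs):
    """Algorithm 1: all 2**n binary combinations in counting order."""
    return list(product((0, 1), repeat=n_inputs))

def weighted_random_sequence(p_one, length, seed=0):
    """Algorithm 2: random test sets; p_one[i] is the probability of a 1 on input i."""
    rng = random.Random(seed)
    return [tuple(int(rng.random() < p) for p in p_one) for _ in range(length)]

def lfsr_sequence(n_bits=4, taps=(4, 3), state=1):
    """Algorithm 3: pseudo-random states of a linear feedback shift register.

    taps are 1-based bit positions XORed to form the feedback bit;
    (4, 3) corresponds to x^4 + x^3 + 1 (assumed for this example).
    """
    mask = (1 << n_bits) - 1
    states = []
    for _ in range(mask):  # one full period of 2**n - 1 states
        states.append(state)
        feedback = 0
        for t in taps:
            feedback ^= (state >> (t - 1)) & 1
        state = ((state << 1) | feedback) & mask
    return states

print(counter_sequence(2))
print(lfsr_sequence())
```

In all three cases only a few parameters (register width, taps, probabilities, seed) need to be stored, not the sequence itself.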

The main property of these algorithms is that they generate very long sequences. The responses formed at the outputs of the tested DS therefore have the same length. For generators of counter, random and pseudo-random test sequences there is no problem of memorizing and storing the sequences, but for the output responses of each circuit such a problem does arise. The simplest solution, which significantly reduces the amount of stored information about the reference output responses, is to obtain integral estimates of lower dimension. Compression algorithms are used for this purpose, and they produce compact estimates of the compressed information. These estimates are often called checksums, keywords, syndromes, or signatures of the corresponding poles of the digital circuit to which one of the compression algorithms is applied. Compact testing is thus customarily understood as testing in which both test generation and response analysis are performed by compact algorithms, so that the test information is represented in compressed form.
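One common way to obtain such a compact estimate is to shift the response bit stream through a linear feedback shift register and keep only the final register contents as the signature. The sketch below assumes a 16-bit register and the tap set (16, 12, 9, 7); both are illustrative choices for this example rather than parameters given in the text.

```python
def signature(bits, width=16, taps=(16, 12, 9, 7)):
    """Compress a bit stream into a width-bit signature.

    Each input bit is XORed with the tapped register bits (1-based positions)
    and shifted into the register; the final register value is the signature.
    """
    mask = (1 << width) - 1
    reg = 0
    for b in bits:
        feedback = b
        for t in taps:
            feedback ^= (reg >> (t - 1)) & 1
        reg = ((reg << 1) | feedback) & mask
    return reg

# Hypothetical response stream of 64 bits and a copy with one corrupted bit
good = [1, 0, 1, 1, 0, 0, 1, 0] * 8
bad = list(good)
bad[20] ^= 1

print(f"reference signature: {signature(good):04X}")
print(f"faulty signature:    {signature(bad):04X}")
```

Only the 16-bit reference signature has to be stored instead of the full response. Because the highest register bit is part of the feedback, a single corrupted bit always changes the signature; longer error patterns can occasionally be masked, which is the price of compression.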

In connection with the creation of complex digital systems based on integrated circuits, much attention has recently been paid to the development of new methods of built-in testing, i.e. definition of the diagnostic procedure as one of the functions of the digital system. Currently, the need for cost-effective testing systems is intensified by an increase in the degree of integration of the element base of computer technology. In this regard, there is a tendency to reduce the hardware complexity of diagnostic tools.

The most studied class of compact test systems are open-loop systems, in which the test generator (GT), the test object (OT), the response analyzer (AO) are connected in series (Fig. 2a). A further reduction in hardware complexity is achieved in the class of closed systems, where the generator, object, analyzer form a closed loop (Fig. 2b).

Features of closed systems are due to the effect of "multiplication" of a defect along the contour, which enhances the detecting ability.


Figure 2. Open-loop (a) and closed-loop (b) testing systems.

The closed nature of compact testing systems largely helps to resolve the contradiction caused by existing testing tools lagging behind the characteristics of newly created objects. Because the built-in means of such systems do not access storage devices or compare actual responses with reference ones while operating, checks can be carried out at a high operating frequency of the object.

The emergence of ring testing is associated with the development of closed testing systems. In ring systems, the functions of the generator and the analyzer are combined in space and time, the topology of the structure has the form of a ring, and the system models are described in the algebra of a ring of polynomials and by ring (cyclic) graphs, which gave rise to the term ring testing (hereinafter RT). During the check, a fault-free system passes through its states along a cyclic route. The conclusion about the health of the object is therefore based on a comparison of the initial and final states of the system.
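The idea of comparing the initial and final states can be shown on a deliberately simple model. In the sketch below the object under test is a hypothetical 4-bit unit whose fault-free behaviour, x -> (5x + 3) mod 16, visits all 16 states in a single cycle; the unit, the fault, and the function names are assumptions made for illustration and do not describe any specific ring testing scheme.

```python
def ring_test(unit, initial_state, steps):
    """Closed loop: the unit's output is fed back as its next input.

    The object is judged healthy if the loop returns to the initial state
    after the expected number of steps.
    """
    state = initial_state
    for _ in range(steps):
        state = unit(state)
    return state == initial_state

def good_unit(x):
    # Fault-free 4-bit object: a full-period cycle through all 16 states
    return (5 * x + 3) % 16

def faulty_unit(x):
    # Same object with output bit 0 stuck at 1 (assumed fault)
    return good_unit(x) | 1

print(ring_test(good_unit, 0, 16))    # healthy loop closes on the start state
print(ring_test(faulty_unit, 0, 16))  # fault breaks the cyclic route
```

No test vectors or reference responses are stored here: the only reference information is the initial state and the cycle length.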

A.A. Druzhaev, V.G. Khanbekov

The article discusses the prerequisites for creating control and diagnostics systems for electronic devices (SKD EU), their scope, and the possibilities of their application. The existing SKD EU Krona-511 is described.

Problems in the process control system

With the advent of automated control systems for technological processes, safety and emergency protection systems, multichannel automatic control systems (for example, generator excitation systems, turbine control systems, and other complex actuators), the task of monitoring and checking their operation arose. The usual approach (with a multimeter in one hand and an oscilloscope in the other) is not effective enough here because:

  • the simplest control system has tens (and sometimes hundreds) of signals that determine its state;
  • transient processes are too short-lived to be noticed and tracked on the oscilloscope screen (let alone to measure their parameters accurately);
  • it is necessary not only to measure the instantaneous values of signals, but also to have a "picture" of the events preceding and following a certain (emergency) moment;
  • it is necessary not only to record the signals, but also to reference them to a common time base;
  • emergency conditions occur quite rarely.

Therefore, it became necessary to create a special class of systems that would effectively solve the issues of monitoring and diagnosing the operation of these devices.

Basic requirements for the SKD EU

A control and diagnostics system for electronic devices (SKD EU) designed to solve the above problems must have the following characteristics and capabilities:

  • no influence of the system on the controlled object (both at the moment of connection and in operating mode);
  • continuous operation of the system (from several hours to several days);
  • a sampling interval for recording input signals down to several microseconds;
  • the ability to start and end recording based on a combination of input signal states;
  • the ability to monitor the levels, shapes and parameters of input signals;
  • recording the time of the "accident";
  • the ability to record input signals continuously for a period from several seconds to several hours;
  • storage of information about the pre-emergency and post-emergency state of input signals;
  • the ability to accumulate several emergency situations in memory;
  • the ability to view and analyze recorded signals in the form of time graphs.

A control and diagnostics system (SKD) that meets these requirements not only allows the operation of the control system to be observed and checked, it also automates the search for "faulty", rarely occurring situations. The events of the pre-emergency situation are recorded as well, which is extremely important for diagnosing the causes of the accident.

Studying the time graphs of the recorded signals makes it possible to evaluate the parameters of the electronic device and their "spread", and thereby to predict the probability of accidents and failures. In addition, regular recordings make it possible to observe and record the drift of parameters over time.

Multichannel recording referenced to a single point in time makes it possible to detect the time "skew" between signals.

With an SKD EU there is a real opportunity to "look inside" electronic systems. As practice shows, even the qualified personnel who service an electronic device do not have an accurate idea of how it really works. Sometimes only the SKD reveals short-term or rare "bursts", "dips" or distortions of the waveform.

Universal system for monitoring and diagnostics of electronic devices

The research and production complex "KRONA" has developed the SKD EU Krona-511, which meets all the requirements for such systems. The operation of the system is based on the principle of converting signals into digital form with a constant frequency, monitoring in real time and recording on a computer disk.

Main characteristics and distinctive features of the system:

  • up to 64 channels (including up to 20 discrete); the modular design allows the number of channels to be increased at the customer's request;
  • connection directly to control points using remote adapters;
  • discreteness of recording, for example:
  • the recording time is limited only by free hard disk space; at the maximum recording frequency, 1 GB is enough to record more than one hour of operation of the electronic device;
  • monitoring of signals on the computer screen;
  • powerful tools for viewing and analyzing records, creating and printing reports, maintaining an archive of records, the ability to export recorded data to other programs;
  • built-in hardware and software self-monitoring; quick check of the operability of all parts of the system.

Connection to the electronic device

The connection to the electronic device is made through external adapters (voltage, current, temperature, discrete signals, "dry contacts") of various ranges, at a distance of 2 to 10 meters from the connection point. Ranges of measured signals: from 0.01 V to 2500 V, from 0.0005 A to 10 A, from 0 °C to 100 °C. The adapters provide galvanic isolation of the input channels from each other, from the output signal circuits of the electronic device, and from ground (up to 3500 V). In addition, they withstand emergency modes with multiple overloads without loss of performance.

Setting, recording and monitoring signals

Setup, recording, processing and viewing of signals are controlled by a program running under MS Windows 95 or 98.

The program allows the "Krona-511" to be adjusted quickly to any electronic device. It is enough to prepare the list of input signals and enter it into the program. For each signal to be monitored, either the shape is described or the monitored parameter (mean, root-mean-square or mean-square value) is set. The shape of the signal standard can be chosen from the standard ones (sinusoidal, sawtooth, etc.) or recorded from a real electronic device. For each monitored signal, "tolerances" are set, i.e. permissible deviations of the shape or parameter.
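The two kinds of check described above, comparison of the signal shape with a standard and comparison of a computed parameter with its nominal value, might look as follows. This is a sketch under assumptions: the function names, the sampled sine standard, and the tolerance values are invented for the example and say nothing about how the Krona-511 software is actually implemented.

```python
import math

def rms(samples):
    """Root-mean-square value of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def shape_within_tolerance(samples, standard, tolerance):
    """True if every sample deviates from the standard by at most `tolerance`."""
    return all(abs(s - r) <= tolerance for s, r in zip(samples, standard))

def parameter_within_tolerance(samples, nominal, tolerance, parameter=rms):
    """True if the computed parameter deviates from `nominal` by at most `tolerance`."""
    return abs(parameter(samples) - nominal) <= tolerance

# Standard: one period of a unit sine wave, 32 samples (assumed for illustration)
standard = [math.sin(2 * math.pi * k / 32) for k in range(32)]
measured = [0.98 * v for v in standard]  # slightly low amplitude
glitched = list(measured)
glitched[8] = 0.2                        # short "dip" at the peak

print(shape_within_tolerance(measured, standard, 0.05))
print(shape_within_tolerance(glitched, standard, 0.05))
print(parameter_within_tolerance(measured, nominal=0.707, tolerance=0.02))
```

The shape check catches the single-sample dip that an averaged parameter such as the RMS value could easily hide, which is why both kinds of tolerance are useful.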

In addition, the recording parameters are set: the sampling interval of the input channels, and the conditions for monitoring synchronization, recording start and recording stop (up to 60 conditions for each). A condition can be: a given signal crossing a given level; a signal being above or below a given level; or a signal being inside or outside a given range (or logical state). Conditions can be combined logically with the "AND" and "OR" operations.
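Start and stop conditions of this kind compose naturally as small predicate functions. The following sketch is one possible representation, with hypothetical function and signal names; each condition receives the previous and the current sample of all channels as dictionaries.

```python
def above(name, level):
    return lambda prev, cur: cur[name] > level

def below(name, level):
    return lambda prev, cur: cur[name] < level

def in_range(name, low, high):
    return lambda prev, cur: low <= cur[name] <= high

def outside(name, low, high):
    return lambda prev, cur: not (low <= cur[name] <= high)

def crosses(name, level):
    # The signal is on different sides of the level in two consecutive samples
    return lambda prev, cur: (prev[name] < level) != (cur[name] < level)

def all_of(*conditions):  # logical "AND"
    return lambda prev, cur: all(c(prev, cur) for c in conditions)

def any_of(*conditions):  # logical "OR"
    return lambda prev, cur: any(c(prev, cur) for c in conditions)

def first_trigger(records, condition):
    """Index of the first sample at which the condition holds, or None."""
    for i in range(1, len(records)):
        if condition(records[i - 1], records[i]):
            return i
    return None

# Hypothetical two-channel record: an analog voltage "u" and a discrete "relay"
records = [
    {"u": 0.0, "relay": 0},
    {"u": 4.0, "relay": 0},
    {"u": 6.0, "relay": 1},
    {"u": 7.0, "relay": 1},
]
start = all_of(crosses("u", 5.0), above("relay", 0.5))
print(first_trigger(records, start))
```

Here recording would start at the third sample, where the voltage crosses 5.0 while the relay is closed.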

The system has two recording modes: single mode writes information to a linear buffer until it is full, and ring mode writes to a buffer in which the oldest data are overwritten by newer data (until recording is stopped or an emergency event occurs).

Thus, single recording makes it possible to record and/or monitor the operation of the electronic device for a given period after start-up. This mode is therefore effective for capturing the interval in which the electronic device reaches a given operating level.

Since the user can set the recording time not only "before the stop" (emergency) but also "after" it, ring mode records the operation of the electronic device both before and after the emergency or specified event, which is useful for research tasks.
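Ring-mode recording with a pre-event and a post-event window can be sketched with a bounded queue. The example below is an illustration only; the function name and the window sizes are assumptions, and `collections.deque` with `maxlen` stands in for the hardware ring buffer.

```python
from collections import deque

def record_ring(samples, is_event, pre, post):
    """Return (pre-event samples, event sample, post-event samples), or None.

    Only the last `pre` samples are kept while waiting for the event;
    older samples are overwritten, as in a ring buffer.
    """
    history = deque(maxlen=pre)
    stream = iter(samples)
    for sample in stream:
        if is_event(sample):
            # Keep recording for `post` more samples after the event
            after = [s for _, s in zip(range(post), stream)]
            return list(history), sample, after
        history.append(sample)
    return None

# Hypothetical stream: the "emergency" is the sample with value 50
before, event, after = record_ring(range(100), lambda s: s == 50, pre=5, post=3)
print(before, event, after)
```

Memory use is bounded by `pre + post + 1` samples no matter how long the system waits for the event, which is what makes multi-day monitoring practical.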

Information recording starts at the operator's command or according to the specified conditions. In parallel with the recording of the input information, the computer compares the signals with the standards or their control parameters.

Recording ends after a specified period, when the specified "non-comparison" conditions for the shape or parameter of the signals are met, or on a command from the operator.

In ring recording mode, recording can be "auto-restarted" after a stop, that is, restarted without user intervention (the number of restarts is set in advance).

The user can display up to twelve "oscilloscope windows" on the screen, in which the selected signals will be "drawn" in real time.

Viewing records

The results of signal recording are displayed to the user in the form of graphs (Fig. 1.).

Figure 1. Viewing a multichannel recording of an emergency moment with an overlay of a signal standard

Several time axes can be displayed on the screen, and several signal graphs can be placed on each of them. The signal records may come from different sessions, which makes it possible to estimate the drift of the parameters of the electronic device over time (Fig. 2). The recorded signals can be overlaid with standards for visual assessment.


Figure 2. Observing the change in signal over time

The graphs indicate the moments of synchronization of control, non-comparison, shutdown. In addition, the user can independently add comments on the graphs.

The graphics to be printed are drawn up in the form of a document. It includes a summary indicating the names of sessions and printed signals, the date and time of the start and end of the recording, as well as the stop / "non-comparison"; and, in addition, the calculated parameters for the indicated graphs.

The built-in "Template Editor" makes it possible to "cut out" a part of a signal record, edit it if necessary, and later use it as a standard for monitoring complex signals.

The user can copy recording sessions from the system disk to other media (floppy disks, removable high-capacity disks, network drives). This allows several users to process the recorded data on different computers.

The recorded data can be exported as a text file, which allows them to be processed by other programs (for example, the company's own automated workstation (AWP) programs).

The software has been refined over numerous installations. It takes into account the comments and wishes of users received during three years of operation of the products. Work on adding new functionality to the system is ongoing.

Conclusion

The experience of using the SKD EU "Krona-511" at a number of nuclear power plants in the Russian Federation has shown that this equipment can be used to build multichannel systems for predicting failures of safety systems, emergency protection systems, etc. Moreover, the probability of detecting a potentially unreliable channel (node, element) is quite high even before these systems reach a critical level.

With the widespread introduction of digital systems based on LSI and microprocessor kits (MPK), the problem of diagnostics, i.e. determining the technical state of an object to the accuracy of a faulty element, is in many cases difficult to solve. Foreign experience in operating communication equipment based on LSI and MPK has shown that its reliable operation cannot be ensured without properly organized control and technical diagnostics. This is due to a significant increase in the number of complex digital cards in operation, as well as to the different approaches that manufacturers of digital systems take to ensuring testability. Most specialists involved in the maintenance of complex equipment clearly realize that monitoring and diagnostics should not be treated as a matter of secondary importance. Improving the technical and operational characteristics of complex equipment based on LSI and MPK is therefore inextricably linked with the development of new diagnostic methods and tools, and with comprehensive accounting and analysis of digital cards and their components as objects of control and diagnostics.

An effective system of technical diagnostics should provide a two-stage troubleshooting strategy with a centralized service center, with a search depth down to the typical replacement element (TEZ, i.e. a board) at the first stage and down to the microcircuit at the second. As the network of service centers expands, the qualification requirements for the personnel who operate technical diagnostics systems need to be reduced, especially in service and repair centers. The diagnostic equipment intended for these centers should, as far as possible, have minimal weight and size and should take into account the specifics of each diagnostic object.

The two-stage strategy of technical diagnostics consists of the following stages:

Localization of faults to a typical replacement element (TEZ) or a group of TEZs, carried out by the built-in automatic diagnostics system. In this case, diagnostic tests are launched at the request of the maintenance system. A faulty TEZ is replaced with a serviceable TEZ from the spare parts kit;

The replaced TEZ is marked as faulty and sent to the repair center. There, diagnostic tools are used to find and localize the faulty component, which is then replaced. The number and composition of spare parts in the centers should ensure continuous operation of the equipment, taking into account the return of TEZs from repair.



Features of control and diagnostics of digital boards with LSI are as follows:

Wide range of LSI characteristics;

The number of control tests can reach several thousand;

Digital cards with LSI have a bus-based organization, which requires data exchange over 4-, 8- or 16-bit buses within one clock cycle, as well as simultaneous multi-channel control;

The buses in most LSIs are bidirectional, so the control equipment must be able to switch from transmit to receive within one clock cycle;

Digital cards with LSI can have several bidirectional input / output channels in the interface circuits;

Since the time characteristics play an important role, the control operations should be performed at a frequency close to the operating frequency, up to 10-20 MHz.

Based on the above, it can be noted that in the operating conditions of communication equipment, the following control and diagnostic tasks must be solved:

Reducing the cost of control and diagnostic work in order to minimize the cost of repair and restoration work (RVR);

Collection and processing of information about the operational reliability of digital cards and their components, as well as about the time and economic costs of troubleshooting.

From a diagnostic point of view, the field troubleshooting process has the following specific features:

In most cases, localization of faults at the level of a plug-in digital card is sufficient;

High probability of occurrence of no more than one malfunction by the time of repair;

Most systems provide some monitoring and diagnostic capabilities, the ability to monitor the health status;



With properly organized preventive examinations, early detection of a potential failure is possible;

Monitoring and diagnostics cover a small quantity of communication equipment that contains a large number of different types of digital cards.

The process of automatic diagnostics (in systems of functional and test diagnostics) can be implemented in the following ways:

Hardware;

Software;

Hardware and software.

The hardware diagnostic method can be applied to a variety of technical objects. In contrast, the software diagnostic method is applicable only to objects that operate under a replaceable program. Examples of such objects are special-purpose and general-purpose control and computing machines.

The software diagnostic method is implemented using programs that control the diagnostic object.

The most effective is a software and hardware diagnostic method that combines the advantages of the first two methods.

In order to develop an automated device for diagnostics of digital cards (ADCP) based on a PC and create a database of diagnostic data, the following should be considered:

Methods for analyzing the nomenclature and technical data of specified types of digital electronic equipment boards, as an object of control and diagnostics for compact testing tools;

Methods for analyzing statistical data of controlled operation of a given equipment to determine the reliability characteristics of digital boards.

In the first direction, the nomenclature and technical data of individual digital cards and their components are analyzed. These data are needed to develop the device that interfaces the PC-based ADCP with the digital card under diagnosis:

Distribution of the number of digital cards of various functional purpose in terminal and channel-forming equipment;

The number of types of digital boards and their sizes; the types, series and number of ICs, LSIs and MPKs;

Types and number of connectors, number of connector pins in various types of digital cards;

Operating frequencies of the nodes in the considered digital cards;

Power supply voltage gradations for various digital cards with IC, LSI and MPK.

In the second direction, the analysis of the existing subsystem of repair and restoration works associated with digital cards is carried out:

General organization, methods and means of control and diagnostics used in RVR;

Time and cost expenses for carrying out control and diagnostic operations for the given digital cards and repair and restoration work in general;

Analysis of the reliability characteristics of digital cards and their components based on the results of generalized operating experience.

In order to determine the main quantitative indicators of the operational reliability of digital cards, the accounting of which will reduce the real labor costs for conducting control and diagnostic operations, an analysis is carried out:

Failure rates of digital cards;

The share of failures of individual digital cards in the total number of hardware failures;

Average time of troubleshooting;

MTBF and mean recovery time for digital cards;

Ranking of digital cards according to the criterion of operational reliability.
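The indicators listed above can be computed from a simple failure log. The sketch below assumes a hypothetical log format, a list of (card type, repair hours) pairs plus the accumulated operating hours per card type; the field names and the sample figures are invented for illustration.

```python
from collections import defaultdict

def reliability_report(failures, operating_hours):
    """Compute per-card reliability indicators and a reliability ranking.

    failures: list of (card_type, repair_hours), one entry per failure
    operating_hours: {card_type: total operating hours}
    """
    count = defaultdict(int)
    repair = defaultdict(float)
    for card, hours in failures:
        count[card] += 1
        repair[card] += hours

    total = len(failures)
    report = {}
    for card, op_hours in operating_hours.items():
        n = count[card]
        report[card] = {
            "failure_rate": n / op_hours,                 # failures per hour
            "share": n / total if total else 0.0,         # share of all failures
            "mtbf": op_hours / n if n else float("inf"),  # mean time between failures
            "mttr": repair[card] / n if n else 0.0,       # mean recovery time
        }
    # Most reliable card types first (lowest failure rate)
    ranking = sorted(report, key=lambda c: report[c]["failure_rate"])
    return report, ranking

failures = [("A", 2.0), ("A", 4.0), ("B", 1.0)]
operating_hours = {"A": 1000.0, "B": 2000.0, "C": 500.0}
report, ranking = reliability_report(failures, operating_hours)
print(ranking)
print(report["A"])
```

Card types at the bottom of the ranking are the natural first candidates for detailed test programs and for a larger spare parts stock.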

Thus, in the created database of diagnostic data of the ADCP, storage is provided for:

Information about the types of IC, LSI and MPK and their reference signatures, required for their replacement and for organizing incoming inspection;

Information about the tested digital cards and their reference signatures, directly on the connector contacts;

Information about the topological model of the digital circuit board;

Algorithms for finding and localizing faults in digital cards, in the form of fault-finding charts;

Information about the external interface parameters needed to adjust and check repaired digital cards and to bring these parameters to the values specified in the technical specifications.

At the same time, as the foreign and domestic experience of creating automated control and diagnostic tools shows, the user of the ADCP must be presented with one of the following modes to choose from:

Dictionary ("log") mode of reference signatures for specified types of digital cards. Such a dictionary of signatures makes it possible to check the nodes of a digital circuit in any order, looking for incorrect or unstable signatures;

Error backtracking mode, following the algorithm of the fault-finding chart for the digital card. In this mode, the operator is instructed to check a set of points in sequence. Starting from a wrong signature, this lets the operator trace the entire chain of signatures leading to the faulty element or circuit node, with the accuracy that compact testing methods provide.
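The backtracking mode can be sketched as a walk over the topological model of the board: starting from a node with a wrong signature, move upstream through nodes whose signatures are also wrong until a node is found whose inputs are all correct. The node names, the signature values, and the function below are hypothetical, and the sketch assumes a circuit without feedback loops (loops would have to be broken before tracing).

```python
def locate_fault(topology, reference, measured, start):
    """Trace wrong signatures upstream from `start`.

    topology: {node: [nodes driving its inputs]}
    reference, measured: {node: signature}
    Returns (suspect node or None, list of nodes checked along the chain).
    """
    path = [start]
    if measured[start] == reference[start]:
        return None, path  # nothing wrong at the starting point
    node = start
    while True:
        bad_inputs = [n for n in topology.get(node, [])
                      if measured[n] != reference[n]]
        if not bad_inputs:
            # Wrong output with correct inputs: this node is the suspect
            return node, path
        node = bad_inputs[0]
        path.append(node)

# Hypothetical board: in -> u1, u2 -> u3 -> out
topology = {"out": ["u3"], "u3": ["u1", "u2"], "u1": ["in"], "u2": ["in"]}
reference = {"in": "0001", "u1": "A3F1", "u2": "7C20", "u3": "55AA", "out": "9E0B"}
measured = dict(reference, u2="FFFF", u3="1234", out="0000")

suspect, chain = locate_fault(topology, reference, measured, "out")
print(suspect, chain)
```

In dictionary mode the same reference table is used, but the operator chooses the order of the probed nodes instead of following the chain.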

In both modes, diagnostic information is shown on the display, and the diagnostic program is stored in PC memory.

At the same time, at the end of the control and diagnostic procedures, automatic documentation and storage of the results should be provided in the ADCP:

Date and time of the malfunction;

The operating mode of the equipment at the time of the appearance of the malfunction;

Places and means used to find and localize the place of failure;

Locations and causes of malfunction.


TECHNICAL DIAGNOSTICS OF DIGITAL SYSTEMS

Tutorial

Tashkent 2006

Content

  • Introduction
  • 1. Technical operation of digital systems and devices
  • 3. Elements of digital systems and problems of increasing their reliability
  • 3.1 Digital systems, the main criteria for their reliability
  • 3.3 Analysis of the strategy for diagnosing and restoring the operability of digital systems
  • 4. Methods of control and diagnostics of digital systems
  • 4.1 Features of modern digital systems as an object of control and diagnostics
  • 4.2 Analysis of fault models of digital devices
  • 4.3 Types and methods of control and diagnostics
  • 4.4 Built-in control of digital systems
  • 5. Technical means of control and diagnostics of digital devices
  • 5.1 Logic probes and current indicators
  • 5.2 Logic analyzers
  • 5.3 Signature analyzer
  • 5.4 Technique for measuring reference signatures and constructing troubleshooting algorithms using signature analysis
  • Conclusion
  • List of sources used
The tutorial presents the basics of control and technical diagnostics of digital systems, together with an analysis and classification of the methods and means of control and diagnostics. Digital systems are analyzed as objects of diagnostics, and fault models of digital devices are examined. The effectiveness of the built-in control of digital systems is evaluated. The technical implementation of control and diagnostic procedures for digital devices based on signature analysis is considered.

The textbook is intended for bachelor's and master's students who study the maintenance and repair of digital systems, as well as for specialists in the technical diagnostics of digital devices.

Introduction

In the last decade, digital systems have become widespread in telecommunication networks, which include:

network elements (SDH transmission systems, digital automatic telephone exchanges (ATS), data transmission systems, access servers, routers, terminal equipment, etc.);

support systems for network functioning (network management, traffic control, etc.);

business process support systems and automated billing systems (billing systems).

Putting digital systems into technical operation makes ensuring their high-quality functioning the main task. Modern digital systems are built on large-scale integrated circuits (LSI), very-large-scale integrated circuits (VLSI) and microprocessor kits (MPK), which significantly increases their efficiency: productivity and reliability grow, functionality expands, and weight, dimensions and power consumption fall. At the same time, the widespread use of LSI, VLSI and MPK in modern telecommunication systems has created, along with indisputable advantages, a number of serious problems in operational maintenance, associated primarily with monitoring and diagnostics. The complexity and number of digital systems in operation are growing faster than the number of skilled maintenance personnel. Since any digital system has finite reliability, failures in it must be quickly detected and localized, and the specified reliability indicators restored. Traditional methods of technical diagnostics, moreover, require either highly qualified service personnel or complex diagnostic support.

As the overall reliability of digital systems increases, the number of failures, and with it the number of operator interventions for troubleshooting, decreases. As a result, maintenance personnel tend to lose some of their troubleshooting skills. A well-known paradox arises: the more reliable the digital system is, the more slowly and less accurately its faults are found, because maintenance personnel have little opportunity to gain experience in finding and localizing faults in advanced digital systems.

In general, technical diagnostics, that is, searching for and localizing the failed elements, accounts for up to 70-80% of the recovery time of failed systems. However, operational practice shows that engineers today are not always ready to solve the problems of technical operation of digital systems at the required level. The increasing complexity of digital systems and the importance of ensuring their high-quality functioning therefore require that their technical operation be organized on a scientific basis. Engineers involved in the technical operation of digital systems must know not only how the systems work, but also how they fail to work and how an inoperable state manifests itself.

A decisive factor in ensuring high availability of digital systems is the availability of diagnostic tools that make it possible to find and localize faults quickly. This requires that engineers be well trained in preventing and recognizing inoperable conditions and faults, i.e. that they be familiar with the goals, objectives, principles, methods and means of technical diagnostics and know how to choose them correctly and use them effectively in operation. This textbook for the course "Technical Diagnostics of Digital Systems" is intended to draw due attention to the problems and tasks of technical diagnostics in the training of bachelor's and master's students in the field of telecommunications.


1. Technical operation of digital systems and devices

1.1 Life cycle of a digital system

Digital devices and systems, like other technical systems, are created to meet the specific needs of people and society. Objectively, a digital system is characterized by a hierarchical structure, connection with the external environment, the interconnection of the elements that make up the subsystems, the presence of governing and executive bodies, etc.

All changes in a digital system, from the moment of its creation (the emergence of the need for it) to its complete disposal, form a life cycle (LC), which is characterized by a number of processes and comprises various stages and phases. Table 1.1 shows a typical digital system life cycle.

The life cycle of a digital system is a set of research, development, manufacturing, handling, operation and disposal of the system from the beginning of the study of the possibilities of its creation until the end of its intended use.

Life cycle components are:

the stage of research and design of digital systems, at which the concept is researched and developed, a quality level corresponding to the achievements of scientific and technological progress is set, the design documentation is developed, a prototype is manufactured and tested, and the working design documentation is developed;

the stage of manufacturing digital systems, including technological preparation of production, setting up production, and preparation of products for transportation and storage;

the stage of product circulation, at which maximum preservation of the quality of finished products is organized during transportation and storage;

the stage of operation, at which the quality of the system is realized, maintained and restored; it includes intended use in accordance with the purpose, maintenance, and repair and recovery after failure.

Figure 1.1 shows a typical distribution of the stages and phases of the life cycle of a digital system. We will consider the tasks that arise at the life cycle stage associated with the operation of digital systems. The operation of the system is the stage of the life cycle at which its quality is realized (functional use), maintained (maintenance) and restored (repair).

The part of operation, including transportation, storage, maintenance and repair, is called technical operation.

Table 1.1

Stages of the life cycle of a digital system

Exploratory research:
1. Statement of a scientific problem
2. Analysis of publications on the problem under study
3. Theoretical research and development of scientific concepts (research ...

Scientific research work (R&D):
1. Development of technical assignments for research
2. Formalization of the technical idea
3. Market research
4. Technical and economic justification

Research and development (OCD):
1. Development of technical assignments for OCD
2. Development of a draft ...
3. Making models
4. Development of technical ...
5. Creating a working ...
6. Manufacturing prototypes and testing them
7. Adjustment of the design documentation (CD) based on the results of manufacturing and testing the prototypes
8. Technical preparation of production

Industrial production:
1. Manufacturing and testing of the installation series
2. Adjustment of the design documentation based on the results of manufacturing and testing the installation series
3. Serial production

Operation:
1. Running-in
2. Normal operation
3. Aging
4. Repair or recycling

Figure 1.1 Life cycle of a digital system

1.2 The main tasks of the theory of technical operation of digital systems

The classification of the main tasks of the technical operation of digital systems is shown in Figure 1.2. The theory of technical operation of systems considers mathematical models of the degradation processes in the operation of systems, the aging and wear of nodes, methods for calculating and assessing the reliability of system functioning, the theory of diagnosing and predicting failures and malfunctions, the theory of optimal preventive measures, the theory of recovery, methods of increasing the technical resource of systems, and so on. Because these processes are mainly stochastic, their mathematical models are developed with the analytical methods of the theory of random processes and queuing theory. At present, statistical decision theory and statistical pattern recognition are also successfully used for the same purposes.

The use of new directions in the mathematical theory of random processes in the development of models for the processes of technical operation of systems allows us to significantly expand our knowledge and successfully manage processes to improve the efficiency of functioning and improve the performance of rather complex digital systems.

Fig. 1.2 Classification of tasks of technical operation of digital systems

Therefore, at the first stage of the study, the following tasks are solved: optimal management of operational processes, development of optimal models for the operation of digital systems, drawing up optimal plans for organizing maintenance, choosing optimal preventive procedures, developing methods for effective technical diagnostics and predicting the technical state of systems.

As noted in the literature, the main task of the theory of operation is to predict scientifically the states of complex systems or technical devices and to develop recommendations for organizing their operation, using special models and mathematical methods for the analysis and synthesis of these models. In solving this main problem, a probabilistic-statistical approach is used to predict and control the states of complex systems and to model operational processes. The theory of operation of digital systems is therefore currently taking shape and developing intensively.

The technical operation of digital systems is reduced to optimizing the activity of man-machine systems and procedures for managing human influences on the functioning of systems. Therefore, the modes of operation of digital systems (Figure 1.2) can be distinguished depending on the relationships of the man-machine system: pre-operational modes of systems, operational modes of systems, maintenance modes and systems repair modes.

The modes differ in certain stages and phases, the type of procedures for the control actions of the technical staff on the functioning of the systems.

Operating modes depend mainly on the quality of the element base of the systems, the degree of use of microprocessor technology as part of the equipment, the complex of control and measuring equipment, the degree of training of technical personnel, as well as other circumstances related to the provision of spare elements of the systems. In addition, the operating modes are determined by the basic requirements for digital systems: the fidelity of information transfer, the delay time in the delivery of information, and the reliability of information delivery.

The operation of systems is the process of using them for their intended purpose while maintaining systems in a technically sound condition, which consists of a chain of various sequential and systematic activities: maintenance, prevention, control, repair, etc.

Maintenance of systems (Figure 1.2) is characterized by three main stages: preventive maintenance, monitoring and assessment of technical condition, organization of maintenance. It is very difficult to determine the degree of influence of individual stages of maintenance on the reliability of systems, but it is known that they have a significant impact on the quality and reliability of systems functioning.

Control and assessment of the technical state of systems is carried out by monitoring the quality of functioning of system nodes, methods of technical diagnostics of failures and malfunctions, as well as the implementation of algorithms for predicting failures in systems.

1.3 General principles of building a technical operation system

The general task of the technical operation system (STE) is to ensure the uninterrupted functioning of digital systems, therefore, the main direction of the STE development is the automation of the most important technological operation processes. The functional task of technical operation is the development of control actions that compensate for the influence of external and internal environments in order to maintain a given technical state of digital systems. This general function is divided into two: general operation - managing the state of the external environment and technical operation - managing the state of the internal environment. In this case, the management of the state of the internal environment consists in the management of its technical state.

A possible structure of an automated STE is shown in Figure 1.3.

Fig.1.3 Block diagram of the automated system of technical operation: PNRM - a subsystem of commissioning and repair work; STX - subsystem of supply, transportation and storage; SOISTE - subsystem for collecting and processing information STE; TTD - subsystem of test technical diagnostics; EOSTE - subsystem of ergonomic support for STE; USTE - STE control subsystem.

The automated STE (ASTE) consists of two subsystems: a subsystem of technical operation in preparing digital systems for use (TEPI) and a subsystem of technical operation when digital systems are used for their intended purpose (TEIN). Each of these subsystems contains a number of elements, the main of which are shown in Figure 1.3. The functions of the subsystems are shown in more detail in Table 1.2.

Table 1.2

Subsystem: main functions

PNRM: organization of the commissioning of newly introduced digital systems, as well as of current, medium and major repairs (overhaul).

STX: placement and replenishment of spare parts, supply bases and manufacturers of spare parts, transportation and storage of spare parts.

SOISTE: planning the use of digital systems and maintaining operational documentation, collecting and processing operational data, developing recommendations for improving the STE.

TTD: determination of the technical condition, detection of a defect to a given depth, interaction with the subsystem of functional technical diagnostics (FTD).

EOSTE: performing the part of the TTD functions that requires human participation, providing two-way communication in the "man-machine" system, participating in routine repairs performed without interrupting functioning.

USTE: determination of the order in which the TTD and EOSTE tasks are executed for specific conditions, management of the recovery process, processing of the results of the TTD and EOSTE tasks, organization of interaction with other elements of digital systems.

The presence of an STE can significantly reduce the time needed to detect malfunctions in digital systems and, on the basis of control information about the state of the systems, prevent downtime in their operation. For this purpose, centers for the technical operation of digital systems are organized, which carry out the functions indicated in Fig. 1.4.

In modern digital systems, the statistical method of maintenance is widespread, which consists in the fact that repair and restoration work begins after the quality of functioning has reached a critical value. If, when monitoring the state of system elements, there are signs of a decrease in the quality of functioning, then they are disconnected from the network to restore operability.

The functioning of digital systems is monitored by a set of parameters that characterize their performance: the fidelity of message transmission, the message transmission time, the probability of timely delivery of messages, the average message delivery time, etc. The general scheme of functional control is shown in Fig. 1.5.
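A threshold check of this kind can be sketched in a few lines of Python. The parameter names and the critical values below are hypothetical and were chosen only to illustrate the idea that restoration work starts once the quality of functioning reaches a critical value.

```python
# Hypothetical limits: (kind, critical value). "max" means the measured value
# must stay below the limit, "min" means it must stay above it.
LIMITS = {
    "bit_error_rate": ("max", 1e-6),
    "mean_delivery_time_s": ("max", 0.5),
    "timely_delivery_probability": ("min", 0.99),
}


def violated(measurements):
    """Return the names of the parameters that have reached their critical value."""
    out = []
    for name, (kind, limit) in LIMITS.items():
        value = measurements[name]
        if (kind == "max" and value >= limit) or (kind == "min" and value <= limit):
            out.append(name)
    return out


def needs_restoration(measurements):
    """Restoration starts as soon as any monitored parameter is critical."""
    return bool(violated(measurements))


sample = {
    "bit_error_rate": 1e-7,
    "mean_delivery_time_s": 0.8,
    "timely_delivery_probability": 0.995,
}
print(violated(sample), needs_restoration(sample))
```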

Figure 1.4 Main functions of the maintenance center

Fig. 1.5 Algorithm of the functional diagnostics system of a digital system

2. Basics of control and technical diagnostics of digital systems

2.1 Basic concepts and definitions

One of the most effective ways to improve the operational and technical characteristics of digital systems that have taken a dominant position in modern telecommunication systems is the use of methods and means of control and technical diagnostics during their operation.

Technical diagnostics is a field of knowledge that makes it possible to distinguish the faulty and serviceable states of systems with a given reliability; its purpose is to localize faults and restore the system to a serviceable state. From the point of view of a systems approach, it is advisable to consider the means of control and technical diagnostics as an integral part of the maintenance and repair subsystem, i.e. of the technical operation system.

Let's consider the basic concepts and definitions used to describe and characterize methods of control and diagnostics.

Maintenance - a set of operations for keeping the system in a serviceable or operable condition.

Repair - a set of operations for restoring the operability and the resource of the system or its components.

Maintainability - the property of a system that consists in its adaptability to the prevention and detection of the causes of failures and to the restoration of an operable state through maintenance and repair.

Depending on the complexity and scope of the work and the nature of the faults, two types of repair of digital systems are provided:

unscheduled current repair of the system;

unscheduled medium repair of the system.

Current repair - repair carried out to ensure or restore the operability of the system, consisting in the replacement or restoration of its individual parts.

Medium repair - repair carried out to restore serviceability and partially restore the resource, with the replacement or restoration of a limited range of components and a check of the technical condition of the components in the scope established by the normative and technical documentation.

One of the important concepts in technical diagnostics is the technical condition of the object.

Technical condition - the set of properties of an object that are subject to change during production or operation, characterized at a given moment by the attributes established by the normative and technical documentation.

Technical condition control - determination of the type of technical condition.

Type of technical condition - a set of technical conditions that satisfy (or do not satisfy) the requirements that determine the serviceability, operability or correct functioning of the object.

The following types of object condition are distinguished:

serviceable or faulty condition,

operable or inoperable condition,

correct or incorrect functioning.

Serviceable - the technical condition in which the object meets all the established requirements.

Faulty - the technical condition in which the object fails to meet at least one of the established requirements.

Operable - the technical condition in which the object is able to perform the specified functions while keeping the values of the specified parameters within the specified limits.

Inoperable - the technical condition in which the value of at least one specified parameter characterizing the ability of the object to perform the specified functions does not meet the established requirements.

Correct functioning - the technical condition in which the object performs all the regulated functions required at the current time while keeping the values of the specified parameters of their performance within the established limits.

Incorrect functioning - the technical condition in which the object does not perform part of the regulated functions required at the current time, or does not keep the values of the specified parameters of their performance within the established limits.

From the definitions of the technical conditions of an object it follows that a serviceable object is always operable, an operable object functions correctly in all modes, and an incorrectly functioning object is inoperable and faulty. A correctly functioning object may be inoperable, and therefore faulty. An operable object may also be faulty.
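These implications can be made concrete with a small Python sketch. It models the definitions as nested sets of requirements: the functions required at the current time form part of the operability requirements, which in turn form part of all established requirements. This set model and the requirement names are assumptions introduced for illustration only.

```python
def classify(violated, operability_reqs, current_reqs):
    """Classify the technical condition from the set of violated requirements.

    Assumes current_reqs is a subset of operability_reqs, which is itself
    a subset of all established requirements.
    """
    assert current_reqs <= operability_reqs
    return {
        "serviceable": not violated,
        "operable": not (violated & operability_reqs),
        "functions_correctly": not (violated & current_reqs),
    }


# Hypothetical requirements for a transceiver that is currently only transmitting.
OPERABILITY = {"tx_level", "rx_sensitivity"}
CURRENT = {"tx_level"}

# A cosmetic defect: the object is faulty but still operable.
print(classify({"housing_finish"}, OPERABILITY, CURRENT))

# A receiver defect: the object functions correctly now, yet it is inoperable and faulty.
print(classify({"rx_sensitivity"}, OPERABILITY, CURRENT))
```

Because the requirement sets are nested, a serviceable object is always reported as operable, and an operable object as functioning correctly, in line with the implications stated above.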

Let's consider some definitions related to the concept of testability and technical diagnostics.

Testability - the property of an object that characterizes its adaptability to control by specified means.

Testability indicator - a quantitative characteristic of testability.

Testability level - a relative characteristic of testability, based on comparing the set of testability indicators of the evaluated object with the corresponding set of basic indicators.

Technical diagnosis - the process of determining the technical condition of an object with a certain accuracy.

Defect search - diagnostics whose purpose is to determine the location and, if necessary, the cause and type of the defect.

Diagnostic test - one or more test actions and the sequence of their application that together provide the diagnosis.

Verification test - a diagnostic test for checking the serviceability or operability of the object.

Defect search test - a diagnostic test for finding a defect.

Technical diagnostics system - the combination of the means and the object of diagnosis and, if necessary, the performers, prepared for diagnosis or carrying it out according to the rules established by the relevant documentation.

The result of diagnostics is a conclusion on the technical condition of the object, indicating, if necessary, the place, type and cause of the defect. The number of conditions that need to be distinguished as a result of diagnostics is determined by the depth of troubleshooting.

Fault search depth - the degree of detail of technical diagnostics, indicating down to which component of the object the location of the fault is determined.

2.2 Tasks and classification of technical diagnostic systems

The ever-increasing requirements for the reliability of digital systems necessitate the creation and introduction of modern methods and technical means of control and diagnostics for the various stages of the life cycle. As noted earlier, the widespread use of LSI, VLSI and MPK in digital systems has created, along with indisputable advantages, a number of serious problems in operational maintenance, associated primarily with monitoring and diagnostics. Troubleshooting during the manufacturing phase is known to account for 30% to 50% of the total cost of manufacturing the devices. At the stage of operation, at least 80% of the recovery time of a digital system is spent searching for the faulty replaceable element. In general, the cost of detecting, localizing and eliminating a malfunction increases tenfold with each technological stage the malfunction passes through, so that a failure detected during operation costs 1000 times more than one caught at the incoming inspection of integrated circuits. Such a problem can be solved only through an integrated approach to diagnostic control, since diagnostic systems are used at all stages of the life of a digital system. This requires a further intensification of maintenance, restoration and repair work at the stages of production and operation.
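The tenfold rule can be checked with a short calculation. In the sketch below only the first and last stage names come from the text; the two intermediate stages are assumed for illustration, chosen so that three tenfold steps separate incoming inspection from operation.

```python
# Stages through which an undetected fault passes. The two middle stage
# names are assumptions; the text names only the first and the last.
STAGES = ["incoming inspection", "board assembly", "system test", "operation"]


def relative_cost(stage, factor=10):
    """Cost of eliminating a fault at `stage`, relative to incoming inspection."""
    return factor ** STAGES.index(stage)


for stage in STAGES:
    print(f"{stage}: x{relative_cost(stage)}")
```

Three transitions at a factor of ten each give 10 * 10 * 10 = 1000, which matches the ratio quoted for a failure found in operation.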

The general tasks of monitoring and diagnosing digital systems and their components are usually considered from the point of view of the main stages: development, production and operation. Along with the general approaches to solving these problems, there are also significant differences due to the specific features of these stages. At the stage of development of digital systems, two tasks of control and diagnostics are solved:

1. Ensuring the testability of the digital system as a whole and of its components.

2. Debugging and checking the serviceability and operability of the components and of the digital system as a whole.

During control and diagnostics under the conditions of digital system production, the following tasks are solved:

1. Identification and rejection of defective components and assemblies at the early stages of manufacturing.

2. Collection and analysis of statistical information on defects and types of faults.

3. Reducing labor intensity and, accordingly, the cost of control and diagnostics.

Monitoring and diagnostics of a digital system under operating conditions have the following features:

1. In most cases it is sufficient to localize faults to the level of a structurally removable unit, as a rule a typical replacement element (TEC).

2. There is a high probability that no more than one malfunction has appeared by the time of repair.

3. Most digital systems have some monitoring and diagnostic capabilities.

4. Early detection of pre-failure conditions is possible during preventive inspections.

Thus, the type and purpose of the diagnostic system must be established for each object subject to technical diagnostics. The following main areas of application of diagnostic systems are established:

a) at the stage of production of the object: during adjustment, during acceptance;

b) at the stage of operation of the object: during maintenance in use, during maintenance in storage, during maintenance in transportation;

c) when repairing the product: before repair, after repair.

Diagnostic systems are designed to solve one or several tasks: checking serviceability; checking operability; checking functioning; searching for defects. The components of the diagnostic system are the object of technical diagnostics, that is, the object or its components whose technical condition is to be determined, and the means of technical diagnostics, a set of measuring instruments and means of switching and interfacing with the object.

Technical diagnostics (TD) is carried out in a technical diagnostics system (STD), which is the combination of the means and the object of diagnostics and, if necessary, the performers, prepared for diagnostics and carrying it out according to the rules established by the documentation.

The components of the system are:

the object of technical diagnostics (OTD), which is understood as the system or its component parts whose technical condition is to be determined, and the means of technical diagnostics, a set of measuring instruments and means of switching and interfacing with the OTD.

The technical diagnostics system works in accordance with the TD algorithm, which is a set of instructions for carrying out the diagnostics.

The conditions for conducting TD, including the composition of the diagnostic parameters (DP), their limiting admissible minimum and maximum pre-failure values, the frequency of diagnosing the product and the operational parameters of the tools used, determine the mode of technical diagnostics and control.

Diagnostic parameter (sign) is a parameter used in the prescribed manner to determine the technical state of an object.
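A diagnostic mode of this kind can be written down as a small configuration structure. The Python sketch below is illustrative only: the class and field names, the example parameters and their limits, and the 30-day period are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class DiagnosticParameter:
    name: str
    min_pre_failure: float   # lowest admissible value before a failure is expected
    max_pre_failure: float   # highest admissible value before a failure is expected

    def in_limits(self, value):
        return self.min_pre_failure <= value <= self.max_pre_failure


@dataclass(frozen=True)
class DiagnosticMode:
    parameters: tuple        # composition of the diagnostic parameters
    period: timedelta        # how often the product is diagnosed

    def is_due(self, last_run, now):
        """True when the diagnostic period has elapsed since the last run."""
        return now - last_run >= self.period

    def out_of_limits(self, readings):
        """Names of the parameters whose readings left the pre-failure limits."""
        return [p.name for p in self.parameters if not p.in_limits(readings[p.name])]


mode = DiagnosticMode(
    parameters=(
        DiagnosticParameter("supply_voltage_v", 4.75, 5.25),
        DiagnosticParameter("clock_mhz", 1.9, 2.1),
    ),
    period=timedelta(days=30),
)
print(mode.out_of_limits({"supply_voltage_v": 4.6, "clock_mhz": 2.0}))
print(mode.is_due(datetime(2006, 1, 1), datetime(2006, 2, 1)))
```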

Technical diagnostics systems (STD) can differ in purpose, structure, place of installation, composition, design and circuit solutions. They can be classified by a number of characteristics that determine their purpose, tasks, structure and the composition of technical means:

by the degree of coverage of the OTD; by the nature of the interaction between the OTD and the technical diagnostics and control system (STDK); by the means of technical diagnostics and control used; by the degree of automation of the OTD.

According to the degree of coverage, technical diagnostics systems can be divided into local and general. Local systems are understood as systems of technical diagnostics that solve one or more of the above tasks - determining operability or finding a place of failure. General - are called technical diagnostics systems that solve all the assigned diagnostic tasks.

By the nature of the interaction of the OTD with the means of technical diagnostics (SRTD), technical diagnostics systems are divided into:

systems with functional diagnostics, in which diagnostic problems are solved while the OTD operates for its intended purpose, and systems with test diagnostics, in which diagnostic problems are solved in a special mode of operation of the OTD by applying test signals to it.

By the means of technical diagnostics used, TD systems can be divided into:

systems with universal means of TDK (for example, computers);

systems with specialized means (stands, simulators, specialized computers);

systems with external means, in which the means and the OTD are structurally separate from each other;

systems with built-in means, in which the OTD and the means of technical diagnostics structurally form a single product.

According to the degree of automation, technical diagnostics systems can be divided into:

automatic, in which information about the technical condition of the OTD is obtained without human participation;

automated, in which information is received and processed with the partial participation of a person;

non-automated (manual), in which the receipt and processing of information is carried out by a human operator.

The means of technical diagnostics can be classified in the same way: automatic; automated; manual.

With regard to the object of technical diagnostics, the diagnostic systems must: prevent gradual failures; identify implicit failures; search for faulty assemblies, blocks, assembly units and localize the place of failure.

2.3 Indicators of diagnosis and testability

As mentioned earlier, the process of determining the technical state of an object during diagnosis involves the use of diagnostic indicators.

Diagnostic indicators are a set of characteristics used to assess how well the technical condition of an object is determined. They are established during the design, testing and operation of the diagnostic system and are used when comparing different versions of the latter. The following diagnostic indicators are established:

1. The probability of a diagnostic error of type (i, j), $P_{ij}$, is the probability of the joint occurrence of two events: the diagnostic object is in technical condition i, and as a result of diagnostics it is considered to be in technical condition j (when i = j, the indicator is the probability of correctly determining technical condition i of the diagnostic object):

$$P_{ij} = P_i^0 \sum_{l=1}^{k} P_l^c \, P_{jil}^{y} = \sum_{l=1}^{k} P_l^c \, P_{jl}^{b} \, P_{ijl}^{a}, \qquad (2.1)$$

where $k$ is the number of states of the diagnostic tool;

$P_i^0$ - the prior probability of finding the diagnostic object in state i;

$P_l^c$ - the prior probability of finding the diagnostic tool in state l;

$P_{jil}^{y}$ - the conditional probability that, as a result of diagnostics, the diagnostic object is recognized as being in state j, given that it is in state i and the diagnostic tool is in state l;

$P_{jl}^{b}$ - the conditional probability of obtaining the result "the diagnostic object is in state j", given that the diagnostic tool is in state l;

$P_{ijl}^{a}$ - the conditional probability of finding the diagnostic object in state i, given that the result "the diagnostic object is in state j" has been obtained and the diagnostic tool is in state l.

2. The posterior probability of a diagnosis error is the probability of finding the diagnostic object in a state provided that the result "the diagnostic object is in a technical condition" is obtained (at \u003d) the indicator is the posterior probability of a correct determination of the technical condition).

, (2.2)

where is the number of object states.

3. The probability of correct diagnosis D is the total probability that the diagnostic system determines the technical condition in which the object of diagnosis is actually located.

. (2.3)

4. Average operational duration of diagnosis $\tau_d$ - the mathematical expectation of the operational duration of a single diagnosis:

$$\tau_d = \sum_{i=1}^{m} P_i^0 \, \tau_i = \sum_{i=1}^{m} P_i^0 \sum_{k=1}^{l} P_k \, \tau_{ik}, \tag{2.4}$$

where $\tau_i$ is the average operational duration of diagnosing an object in state $i$;

$\tau_{ik}$ - the operational duration of diagnosing an object in state $i$, provided that the diagnostic tool is in state $k$;

$P_i^0$ and $P_k$ - the prior probabilities of object state $i$ and diagnostic tool state $k$; $m$ and $l$ - the numbers of object states and tool states.

The quantity $\tau_{ik}$ includes the duration of the auxiliary diagnostic operations and the duration of the diagnosis itself.

5. Average cost of diagnosis $C_d$ - the mathematical expectation of the cost of a single diagnosis:

$$C_d = \sum_{i=1}^{m} P_i^0 \, C_i = \sum_{i=1}^{m} P_i^0 \sum_{k=1}^{l} P_k \, C_{ik}, \tag{2.5}$$

where $C_i$ is the average cost of diagnosing an object in state $i$;

$C_{ik}$ - the cost of diagnosing an object in state $i$, provided that the diagnostic tool is in state $k$. The value $C_{ik}$ includes the amortization costs of the diagnostic tools, the costs of operating the diagnostic system and the cost of wear of the diagnosed object.

6. Average operational labor intensity of diagnosis $S_d$ - the mathematical expectation of the operational labor intensity of a single diagnosis:

$$S_d = \sum_{i=1}^{m} P_i^0 \, S_i = \sum_{i=1}^{m} P_i^0 \sum_{k=1}^{l} P_k \, S_{ik}, \tag{2.6}$$

where $S_i$ is the average operational labor intensity of diagnosis when the object is in state $i$;

$S_{ik}$ - the operational labor intensity of diagnosing an object in state $i$, provided that the diagnostic tool is in state $k$.

7. Depth of search for a defect L - a characteristic of a search for a defect, specified by indicating a component of the diagnostic object or its section with an accuracy to which the location of the defect is determined.
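The probabilistic indicators above can be illustrated numerically. The following is a minimal sketch, assuming that the states of the object and of the diagnostic tool are independent, so that the joint probability of "object in state i, diagnosed as state j" is the prior of state i multiplied by the tool-averaged conditional probability of the result j. The function names and all numbers are hypothetical and serve only as an illustration.

```python
def error_probabilities(p_obj, p_tool, p_result):
    """Return the matrix P[i][j] = P(object in state i AND diagnosed as state j).

    p_obj[i]          - prior probability that the object is in state i
    p_tool[k]         - prior probability that the diagnostic tool is in state k
    p_result[k][i][j] - probability that a tool in state k reports state j
                        when the object is actually in state i
    """
    m = len(p_obj)
    l = len(p_tool)
    return [
        [
            p_obj[i] * sum(p_tool[k] * p_result[k][i][j] for k in range(l))
            for j in range(m)
        ]
        for i in range(m)
    ]


def correct_diagnosis_probability(p):
    """Probability of correct diagnosis: the sum of the diagonal entries P[i][i]."""
    return sum(p[i][i] for i in range(len(p)))


# Hypothetical numbers: object is good (0) or faulty (1);
# diagnostic tool is good (0) or degraded (1).
p_obj = [0.9, 0.1]
p_tool = [0.95, 0.05]
p_result = [
    [[0.99, 0.01], [0.02, 0.98]],  # tool in good state
    [[0.80, 0.20], [0.30, 0.70]],  # tool in degraded state
]

P = error_probabilities(p_obj, p_tool, p_result)
D = correct_diagnosis_probability(P)
print(f"D = {D:.5f}")
```

The off-diagonal entries of `P` are the error probabilities of each type, and all entries together sum to one.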

Consider now the testability indicators. Testability is ensured at the development and manufacturing stages and must be specified in the technical specifications for the development and modernization of the product.

The following testability indicators and formulas for their calculation are established:

1. Coefficient of completeness of checking serviceability (operability, correct functioning):

$$K_{pp} = \frac{\lambda_k}{\lambda_0}, \tag{2.7}$$

where $\lambda_k$ is the total failure rate of the tested components of the system at the accepted division level;

$\lambda_0$ - the total failure rate of all components of the system at the accepted division level.

2. Search depth factor:

$$K_{gp} = \frac{F}{R}, \tag{2.8}$$

where $F$ is the number of uniquely distinguishable components of the system at the accepted division level, up to which the location of the defect is determined; $R$ - the total number of components of the system at the accepted division level, up to which it is required to determine the location of the defect.

3. Diagnostic test length:

$$L = |T|, \tag{2.9}$$

where $|T|$ is the number of test actions.

4. Average time to prepare the system for diagnostics by a given number of specialists:

$$T_v = T_u + T_{md}, \tag{2.10}$$

where $T_u$ is the average time for installing and removing measuring transducers and other devices necessary for diagnostics;

$T_{md}$ - the average time of mounting and dismantling work on the system required to prepare for diagnostics.

5. Average labor intensity of preparation for diagnosis:

$$W_v = W_u + W_{md}, \tag{2.11}$$

where $W_u$ is the average labor intensity of installing and removing transducers and other devices required for diagnostics;

$W_{md}$ - the average labor intensity of mounting and dismantling work on the object to provide access to control points and to bring the object to its original state after diagnostics.

6. System redundancy ratio:

$$K_i = \frac{V_k}{V_0}, \tag{2.12}$$

where $V_k$ is the mass or volume of the components introduced to diagnose the system;

$V_0$ - the mass or volume of the system.

7. Coefficient of unification of the devices that interface the system with diagnostic tools:

$$K_u = \frac{N_u}{N_0}, \tag{2.13}$$

where $N_u$ is the number of unified interface devices;

$N_0$ - the total number of interface devices.

8. Coefficient of unification of system signal parameters:

$$K_{up} = \frac{\sigma_u}{\sigma_0}, \tag{2.14}$$

where $\sigma_u$ is the number of unified parameters of the system signals used in diagnostics;

$\sigma_0$ - the total number of signal parameters used in diagnostics.

9. Coefficient of labor intensity of preparing the system for diagnosis:

(2.15)

where is the average operational complexity of diagnosing the system;

- the average complexity of preparing the system for diagnosis.

10. Coefficient of use of special diagnostic tools:

(2.16)

where is the total mass or volume of serial and special diagnostic tools;

- the mass or volume of special diagnostic tools.

11. Level of testability in the assessment:

differential:

$$K_d = \frac{K}{K_b}, \tag{2.17}$$

where $K$ is the value of the testability indicator of the evaluated system; $K_b$ - the value of the basic testability indicator;

integrated:

, (2.18)

where - the number of testability indicators, according to the aggregate of which the level of testability is assessed;

is the weight coefficient of the ith testability indicator.
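Two of the testability indicators listed above lend themselves to a short numerical illustration. The sketch below assumes the natural reading of the definitions: the completeness coefficient is the ratio of the total failure rate of the checked components to the total failure rate of all components, and the search depth factor is the ratio of the number of distinguishable components to the number required. Component names and failure rates are hypothetical.

```python
def completeness_coefficient(failure_rates, checked):
    """Ratio of the failure rate covered by checks to the total failure rate."""
    total = sum(failure_rates.values())
    covered = sum(rate for name, rate in failure_rates.items() if name in checked)
    return covered / total


def search_depth_factor(distinguishable, required):
    """Ratio of uniquely distinguishable components to the required number."""
    return distinguishable / required


# Hypothetical failure rates in 1/hour at the chosen division level.
rates = {"cpu": 2.0e-6, "ram": 1.5e-6, "bus": 0.5e-6, "psu": 1.0e-6}

k_pp = completeness_coefficient(rates, checked={"cpu", "ram", "bus"})
k_gp = search_depth_factor(distinguishable=3, required=4)
print(f"K_pp = {k_pp:.2f}, K_gp = {k_gp:.2f}")
```

In this example the power supply is not covered by any check, so a fifth of the total failure rate remains unobserved.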

3. Elements of digital systems and problems of increasing their reliability

3.1 Digital systems, the main criteria for their reliability

The main task of modern digital systems is to improve the efficiency and quality of information transmission. The solution to this problem is developing in two directions: on the one hand, the methods of transmission and reception of discrete messages are being improved to increase the speed and reliability of the transmitted information while limiting costs, on the other hand, new methods for constructing digital systems are being developed, ensuring high reliability of their operation.

This approach requires the development of digital systems that implement complex control algorithms under conditions of random influences with the need for adaptation and have the property of fault tolerance.

The use of LSI, VLSI and microprocessor kits (IPC) for these purposes makes it possible to ensure high efficiency of information transmission channels and, in case of failure, to quickly restore the normal functioning of digital systems. In what follows, a modern digital system is understood as a system built on LSI, VLSI and IPC.

The block diagram of the digital system is shown in Fig. 3.1. The transmitting part of the digital system carries out a number of transformations of a discrete message into a signal. The set of operations associated with converting the transmitted messages into a signal is called the transmission method, which can be described by the operator relation

(3.1)

where is the operator of the transmission method;

- coding operator;

- modulation operator;

- random process of faults and failures in the transmitter.

The appearance of faults and failures in the transmitter violates the required condition and increases the number of errors in the digital system. The transmitter must therefore be designed so that the increase in the number of errors caused by this violation is kept small.

Signals transmitted in a propagation medium undergo attenuation and distortion in it. Therefore, the signals arriving at the receiving point may differ significantly from those transmitted by the transmitter.

Fig. 3.1. Block diagram of a digital system

The influence of the medium on the signals propagated in it can also be described by the operator relation

(3.2)

where is the operator of the propagation medium.

In the communication channel, interference is superimposed on the transmitted signal, so that during signal transmission a distorted signal acts at the input of the receiver:

, (3.3)

where is a random process corresponding to one of the interference sources;

- the number of independent sources of interference.

The task of the receiver is to determine which message was transmitted from the received distorted signal. The set of receiver operations can be described by the operator relation:

(3.4)

where - receiving method operator;

- demodulation operator;

- decode operator;

- a random process of occurrence of faults and failures in the receiver.

How closely the received sequence corresponds to the transmitted one depends not only on the error-correcting capability of the code, the signal and interference levels and their statistics, and the properties of the decoding devices, but also on the ability of the digital system to correct errors caused by hardware faults and failures of the transmitter and receiver. This approach describes the process of information transfer by a mathematical model, which makes it possible to identify the influence of various factors on the efficiency of digital systems and to outline ways of improving their reliability.

Digital systems are divided into non-recoverable and recoverable. The main criterion for the reliability of a non-recoverable digital system is the probability of failure-free operation

$$P(t) = e^{-\lambda t}, \qquad \lambda = n \lambda_e, \tag{3.5}$$

that is, the probability that no failure occurs in a given time interval $t$, where

$\lambda$ is the failure rate of the system;

$n$ - the number of elements in the digital system;

$\lambda_e$ - the failure rate of one element of the digital system.
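As a numerical illustration of the probability of failure-free operation, the sketch below assumes the usual exponential model, in which the system failure rate is the number of elements multiplied by the failure rate of one element. The function name and the numbers are hypothetical.

```python
import math


def failure_free_probability(t_hours, n_elements, element_failure_rate):
    """P(t) = exp(-n * lambda_e * t) for a system of n identical elements."""
    system_rate = n_elements * element_failure_rate
    return math.exp(-system_rate * t_hours)


# Hypothetical system: 500 elements, each with a failure rate of 1e-7 per hour.
p_1000h = failure_free_probability(1000, 500, 1e-7)
print(f"P(1000 h) = {p_1000h:.4f}")
```

Doubling the number of elements doubles the exponent, which is why reliability falls quickly as the degree of integration at the system level grows.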

The main criterion for the reliability of recoverable digital systems is the availability factor

$$K_g = \frac{T_0}{T_0 + T_v}, \tag{3.6}$$

which characterizes the probability that the system is in working order at an arbitrarily chosen moment in time. Here $T_0$ is the mean time between failures, that is, the average duration of continuous system operation between two failures:

$$T_0 = \frac{1}{N} \sum_{i=1}^{N} t_i, \tag{3.7}$$

where $N$ is the total number of failures;

$t_i$ - the time between the $(i-1)$-th and the $i$-th failure.

$T_v$ is the recovery time: the average system downtime caused by finding and fixing a failure,

$$T_v = \frac{1}{N} \sum_{i=1}^{N} t_{vi}, \tag{3.8}$$

where $t_{vi}$ is the duration of the $i$-th failure.

The recovery rate $\mu = 1 / T_v$ characterizes the number of recoveries per unit time.
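The availability factor can be estimated from an operating log. The sketch below assumes the standard definition, in which availability is the mean time between failures divided by the sum of the mean time between failures and the mean recovery time. The log values are hypothetical.

```python
def availability(uptimes_h, repair_times_h):
    """Return (mean time between failures, mean recovery time, availability)."""
    t0 = sum(uptimes_h) / len(uptimes_h)
    tv = sum(repair_times_h) / len(repair_times_h)
    return t0, tv, t0 / (t0 + tv)


# Hypothetical log: three operating intervals and the three repairs that followed.
t0, tv, k_g = availability([950.0, 1020.0, 1030.0], [4.0, 6.0, 5.0])
print(f"T0 = {t0:.0f} h, Tv = {tv:.0f} h, Kg = {k_g:.4f}")
```

Because the recovery time enters the denominator directly, shortening fault localization improves availability even when the failure rate itself is unchanged.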

3.2 Ways to improve the reliability of digital systems

Modern digital systems are complex geographically distributed technical complexes that perform important tasks for the timely and high-quality transmission of information.

Maintenance and the provision of necessary repair and restoration work for complex digital systems is an important issue.

When choosing digital systems, you need to make sure that their manufacturers are ready to provide technical support during not only the warranty, but also the entire service life, i.e. before the onset of the limiting state. Thus, when deciding whether to purchase digital systems, operators need to consider the long-term maintenance and repair costs.

It should be noted that the quality of the services offered, as well as the costs incurred by the operator in its activities, largely depend on the preparation and organization of the maintenance and repair of digital systems. Therefore, the task of improving the methods of maintenance and repair of geographically distributed digital systems is becoming increasingly important.

International quality standards oblige the telecom operator, as a service provider, to include the maintenance and repair of digital systems in the scope of its quality system.

As the experience of developed countries shows, where the period of mass digitalization of the telecommunications network and the introduction of fundamentally new services has already passed, this task is effectively solved by creating a developed infrastructure of organizational and technical support, which includes a system of service centers and repair centers.

Therefore, suppliers of digital systems must organize service centers for the implementation of warranty and post-warranty maintenance of their equipment, its current operation and repair.

Typically, the structure of a service center system includes:

the main service center coordinating the work of all other service centers and having the ability to perform the most complex types of work;

regional service centers;

service provider technical service.

However, as practice shows, along with the high quality of the supplied equipment and its wide functionality, a number of problems arise:

insufficient development (and in some cases absence) of the service network for the supplied digital systems;

there are more suppliers of digital systems than service centers;

high cost of repairing digital systems.

In this regard, it is necessary to present appropriate requirements to the suppliers for the organization of maintenance of the supplied equipment and the timing of replacement of faulty nodes of digital systems.

Since the convenience of the maintenance functions varies from one digital system to another, working with different systems requires different degrees of training of the maintenance personnel. As practice shows, suppliers of telecommunications equipment build their service support strategies in different ways:

creation of the main service center for technical support;

creation of a developed network of regional support centers;

support through a network of distributors and the supplier's own representative office;

support by the dealer network.

Currently, there is a wide variety of forms, methods and types of maintenance. Services to customers are provided in four different forms:

self-service by the customers themselves;

on-site service of equipment;

service in centers that do not repair, but replace;

service in repair centers.

It should be emphasized that currently there is no single service concept.

1. Some operator companies are of the opinion that the main task is to speed up repairs, which is achieved by replacing boards and even blocks, which then go through a full cycle of monitoring and restoring their performance in repair centers equipped with a set of modern diagnostic equipment.

2. Other carrier companies prefer to move to item-level repairs, for which they use the latest diagnostic tools of high functional complexity to isolate faults.

Therefore, the technical diagnostics system is an integral part of maintenance and repair systems as a system for managing the state of digital systems. It is now generally accepted that one of the important ways to increase the operational reliability and, ultimately, the quality of the functioning of digital systems is to create an effective system of technical diagnostics.

Therefore, solving the problems of maintenance and repair involves the use of an appropriate system for technical diagnostics of digital systems at the stage of their operation, which should provide a two-stage strategy for troubleshooting in digital systems with a search depth, respectively, up to a typical replacement element (TEZ), board and microcircuit. Taking into account the expansion of the range of digital systems, it becomes necessary to reduce the requirements for the qualifications of the maintenance personnel of technical diagnostics systems, especially for service centers and repairs. The diagnostic equipment intended for these centers should have, if possible, the minimum weight and size indicators and ensure that the specificity of each diagnostic object is taken into account.

Currently, the following main directions of work to improve the reliability of the functioning of digital systems are known:

1. First of all, reliability is improved through the use of highly reliable components. This approach is costly and addresses only reliability, not maintainability. A one-sided focus on achieving high reliability (through a more advanced element base and units) to the detriment of maintainability in many cases does not ultimately increase the availability factor in real operating conditions. The reason is that even highly qualified specialists using traditional technical diagnostic tools spend up to 70-80% of the active repair time on finding and localizing faults in complex modern digital systems.


Built-in control and diagnostics of digital devices. Methods for improving the testability of digital devices

The quality of control and diagnostics depends not only on the technical characteristics of the control and diagnostic equipment, but also, first of all, on the testability (controllability) of the tested product itself. This means that the quality of inspection is largely determined by the quality of product development. The simplest solution to improve the quality of control is to bring some internal points of the product to an external connector. However, the number of free contacts on a connector is limited, so this approach is rarely available or effective enough. A more acceptable solution is associated with the placement of additional functional elements on the board, designed to directly receive or accumulate information about the state of internal points and then transfer it for processing at the request of an analyzing device (external or also built-in).

Signals arising during the operation of the main and control equipment, located together on one printed circuit module or IC chip, are compared according to certain rules. This comparison yields information about whether the monitored node is functioning correctly. A complete copy of the tested unit can be used as the redundant equipment (Fig. 1, a). In this case, two identical sets of codes are simply compared. To reduce the volume of additional control equipment, simpler control devices with redundant coding are used (Fig. 1, b), but the methods for obtaining the check relations then become more complicated.

Figure 1. Built-in control circuits with hardware duplication (a) and redundant coding of operations (b):

OU - main device; KU - control device;

US - comparison device; UK - coding device;

UOKK - control code processing device;

UD - decoding device; Z - error signal.

Redundant coding is based on the introduction of additional symbols into the input, processed and output information signal, which together with the basic ones form codes that have the properties of detecting or correcting errors.

As an example of built-in checking with redundant coding, consider one of the methods for controlling the transmission of information: to a group of information bits, which form a simple (i.e., non-redundant) code, one redundant (check) bit is added that carries information about the parity (evenness or oddness) of the transmitted information. The value of the parity bit is 0 if the number of ones in the transmitted code is even and 1 if the number of ones is odd (Fig. 2).

When information is transmitted, the word is sent together with its check bit. If the receiving device detects that the check bit does not correspond to the parity of the number of ones in the word, this is taken as a sign of an error in the information transmission line.

Figure 2. Transmission of information with a check bit: if Z = 0, the information is transmitted without error; if Z = 1, the information is transmitted incorrectly; n is the number of main channels; n + 1 is the additional check digit.

An odd-parity check additionally detects the complete loss of information, since a code word consisting entirely of zeros is then forbidden.

This method is used in microprocessor systems to check information transfers between registers, reads from RAM, and exchanges between devices. Data buses account for 60 to 80% of all hardware in a microprocessor system (MPS). Therefore, the use of parity control can significantly increase the reliability of information transfer operations.
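The parity check described above can be sketched in a few lines. This is only an illustration: the function names are hypothetical, the word is treated as an integer, and Z follows the convention of Figure 2 (0 means no error detected).

```python
def parity_bit(word, odd=False):
    """Check bit for a word: 0 if the number of ones is even, 1 if it is odd.

    With odd=True the bit is inverted, so that the word plus its check bit
    always contains an odd number of ones (the all-zero code word is forbidden).
    """
    bit = bin(word).count("1") % 2
    return bit ^ 1 if odd else bit


def error_signal(word, check_bit, odd=False):
    """Z = 0 if the received check bit matches the received word, else Z = 1."""
    return parity_bit(word, odd) ^ check_bit


word = 0b10110010                  # four ones, so the even-parity check bit is 0
check = parity_bit(word)

z_ok = error_signal(word, check)                   # intact word
z_single = error_signal(word ^ 0b00001000, check)  # one bit flipped in transit
z_double = error_signal(word ^ 0b00011000, check)  # two bits flipped in transit
print(z_ok, z_single, z_double)
```

The last case shows the known limitation of a single check bit: an even number of flipped bits leaves the parity unchanged and goes undetected.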

Figure 3. Pyramid-type parity (even/odd) check circuit for an 8-bit bus, built from two-input exclusive-OR gates

Iterative codes are another example. They are used to check the transfer of code arrays between an external memory and a computer, between two computers, and in other cases. An iterative code is formed by adding extra parity bits to each row and each column of the transmitted word array (two-dimensional code). In addition, parity can be computed over the diagonal elements of the word array (multidimensional code). The detecting ability of the code depends on the number of additional check symbols. The code can detect multiple errors and is easy to implement.
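A two-dimensional iterative code can be sketched as follows. This is a minimal illustration with even parity over rows and columns; the function names and the data block are hypothetical. A single flipped bit changes exactly one row parity and one column parity, which also gives its position.

```python
def encode(rows):
    """Return (row parities, column parities) for a rectangular bit array."""
    row_par = [sum(r) % 2 for r in rows]
    col_par = [sum(c) % 2 for c in zip(*rows)]
    return row_par, col_par


def check(rows, row_par, col_par):
    """Return the indices of rows and columns whose parity no longer matches."""
    new_rows, new_cols = encode(rows)
    bad_rows = [i for i, (a, b) in enumerate(zip(row_par, new_rows)) if a != b]
    bad_cols = [j for j, (a, b) in enumerate(zip(col_par, new_cols)) if a != b]
    return bad_rows, bad_cols


block = [
    [1, 0, 1, 1],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
]
row_par, col_par = encode(block)

received = [r[:] for r in block]
received[1][2] ^= 1                      # single-bit error during transfer
bad_rows, bad_cols = check(received, row_par, col_par)
print(bad_rows, bad_cols)
```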

The simplest hardware methods of built-in control include duplicating the circuits and comparing their output signals (Fig. 3). This method can easily be applied to test any circuit. It also has the advantage of detecting any functional error that appears in the circuit. Its disadvantages are, first, the increased cost due to redundancy and, second, the fact that it does not exclude errors in the redundant checking equipment itself.

The cost of hardware duplication of digital circuits can be reduced somewhat by using so-called two-wire logic. Here the original and backup circuits differ in that the backup implements inverse outputs, so that all signals in the circuit are present simultaneously in direct and inverted form. With conventional duplication the output signals are compared for equality, and with two-wire logic for inequality.
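The difference between the two comparison rules can be shown in a short sketch. It is only an illustration: the 4-bit AND function stands in for an arbitrary circuit, and all names are hypothetical.

```python
WIDTH = 4
MASK = (1 << WIDTH) - 1


def circuit(a, b):
    """Original circuit (hypothetical example function): bitwise AND."""
    return a & b


def inverse_circuit(a, b):
    """Backup circuit of the two-wire scheme: produces the inverted outputs."""
    return ~(a & b) & MASK


def duplication_error(out_main, out_copy):
    """Conventional duplication: the two outputs must be equal."""
    return int(out_main != out_copy)


def two_wire_error(out_direct, out_inverted):
    """Two-wire logic: every output pair must differ, so the XOR is all ones."""
    return int((out_direct ^ out_inverted) != MASK)


a, b = 0b1101, 0b0111
good = circuit(a, b)
faulty = good ^ 0b0010               # single-bit fault injected into the main output

z_dup_ok = duplication_error(good, circuit(a, b))
z_dup_bad = duplication_error(faulty, circuit(a, b))
z_tw_ok = two_wire_error(good, inverse_circuit(a, b))
z_tw_bad = two_wire_error(faulty, inverse_circuit(a, b))
print(z_dup_ok, z_dup_bad, z_tw_ok, z_tw_bad)
```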

To detect errors in combinational circuits, especially for arithmetic and logical functions that depend on two arguments, the pseudo-duplication method is often used. In this case, the data is processed twice sequentially in time, in the same order, but in different paths, and is checked for equality using intermediate storage. In this case, instead of the required redundancy of the circuit, the information processing time actually increases.

Figure 4 shows a scheme for checking a bitwise logical operation on two operands using an ALU. First, switches S1 and S2 are set to the right position, and the result of the operation from the ALU output is written to memory register 3, which is connected to one of the inputs of the comparison circuit.

In the next step, switches S1 and S2 are set to the left position. The high-order and low-order bits of the input numbers at the ALU input are interchanged, and the result from the ALU output, with its bits rearranged back, goes directly to the comparison circuit.

Figure 4. Scheme for checking the performance of arithmetic operations using the pseudo-duplication method

Suppose that a stuck-at-1 fault ("= 1", a constant one) appears at output 3 of the ALU and that the operands 0110 and 0010 are added bitwise modulo 2. With switches S1 and S2 in the right position, the number 0100 is written to register 3. With the switches in the left position, the numbers 1100 and 0100 arrive at the ALU inputs, and the ALU output is 1100 (taking into account the fault "= 1" at output 3). The comparison circuit then receives the code 0100 from the output of register 3 and, after the bits are rearranged back, the code 0110 from the ALU output; the mismatch generates an error signal.
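The worked example can be replayed in code. The sketch below rests on two assumptions that are consistent with the numbers in the text but are not stated in it: the interchange of bits is modeled as a cyclic shift by one position (0110 becomes 1100, 0010 becomes 0100), and "output 3" is the third bit counting from the least significant one. All function names are hypothetical.

```python
WIDTH = 4
MASK = (1 << WIDTH) - 1


def rotl(x, n=1):
    """Cyclic left shift of a WIDTH-bit word (models switches in the left position)."""
    return ((x << n) | (x >> (WIDTH - n))) & MASK


def rotr(x, n=1):
    """Cyclic right shift: restores the original bit order of the result."""
    return ((x >> n) | (x << (WIDTH - n))) & MASK


def faulty_alu_xor(a, b, stuck_at_one=0b0100):
    """Bitwise modulo-2 addition with output 3 stuck at 1 (assumed bit numbering)."""
    return (a ^ b) | stuck_at_one


def pseudo_duplication(a, b, alu):
    """Run the operation twice over different data paths and compare the results."""
    first = alu(a, b)                       # stored in the memory register
    raw_second = alu(rotl(a), rotl(b))      # operands shifted onto other ALU bits
    second = rotr(raw_second)               # result shifted back before comparison
    return first, raw_second, second, int(first != second)


first, raw_second, second, z = pseudo_duplication(0b0110, 0b0010, faulty_alu_xor)
print(f"{first:04b} {raw_second:04b} {second:04b} Z={z}")
```

The fault is masked in the first pass because the correct result already has a one at the stuck output; the second pass routes the data through other bit positions and exposes it.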

Built-in control is especially convenient for organizing the control and diagnostics of products in operation, but it can also be useful in production, for example in the manufacture of LSI microprocessor sets. For this purpose, additional means are introduced into the LSI circuit that reconfigure the LSI structure in test mode and thereby improve the controllability and observability of all flip-flops (triggers) in it (Fig. 5, a). Testing a complex LSI then reduces to a relatively simple procedure for the combinational circuits included in the LSI.

To implement this approach, means of reconfiguring the sequential circuit are needed such that a control signal switches all flip-flops from operating mode to test mode, in which all of them become controllable and observable (Fig. 5, b). The most widespread of these methods is the scan method, in which special additional memory elements are connected into a single shift register that stores the internal state of the circuit. Alternatively, the additional memory elements can be scanned by addressing them and reading the state of the circuit directly from the additional memory.
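The scan method can be modeled in software as a shift register wrapped around the combinational logic. The following is a simplified sketch, not a description of any particular LSI: the class, the three-bit next-state function, and the injected fault are all hypothetical.

```python
class ScanChain:
    """Flip-flops that act as a shift register in test mode."""

    def __init__(self, length, next_state):
        self.state = [0] * length
        self.next_state = next_state     # combinational logic: state -> new state

    def shift(self, bit_in):
        """Test mode: shift one bit in at the head, return the bit at the tail."""
        bit_out = self.state[-1]
        self.state = [bit_in] + self.state[:-1]
        return bit_out

    def capture(self):
        """Operating mode for one clock: latch the combinational outputs."""
        self.state = self.next_state(self.state)


def scan_test(chain, pattern):
    """Load a state, apply one functional clock, unload the response."""
    for bit in reversed(pattern):
        chain.shift(bit)                 # afterwards chain.state == pattern
    chain.capture()
    return [chain.shift(0) for _ in pattern][::-1]


def good_logic(s):
    return [s[0] ^ s[2], s[0], s[1]]


def faulty_logic(s):
    return [1, s[0], s[1]]               # first output stuck at 1


pattern = [1, 0, 1]
expected = scan_test(ScanChain(3, good_logic), pattern)
observed = scan_test(ScanChain(3, faulty_logic), pattern)
print(expected, observed, expected != observed)
```

Every flip-flop is set directly through the chain and read directly out of it, so only the combinational logic between the flip-flops needs a test pattern.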

All this complicates the LSI, but is economically justified. For example, for the Intel 8086 series MP, with a chip area of 3 mm2, the introduction of means for increasing controllability increases the chip area by about 20%, which reduces the yield from 10% to 12 (20)%. Together with the decrease in the number of chips per wafer, this raises production costs by 70%. Nevertheless, the reduction in the cost of testing, which accounts for more than 80% of the labor intensity of LSI manufacturing, fully compensates for this increase in the cost of the LSI. Complex control systems are therefore designed to allow self-testing without the participation of external equipment and software.

To implement self-testing of circuits, two registers are placed on a printed circuit board or on a microprocessor chip and programmed to act as a pseudo-random code generator and a signature generator. A special test program, stored in the processor's programmable ROM, must ensure sequential testing of all functional units of the microprocessor. The pseudo-random code generator produces an input test sequence that is fed to the monitored, software-accessible blocks of the microprocessor, and the signature generator collects the corresponding control signatures from the microprocessor output; these are then compared with reference signatures stored in the ROM. The result of the comparison informs the microprocessor about its state.
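The self-test loop described above can be sketched with two linear feedback shift registers. This is an illustration under stated assumptions: a 4-bit register with feedback from bits 3 and 2 (the polynomial x^4 + x^3 + 1) is used for both the generator and the signature register, and the unit under test and its fault are hypothetical.

```python
WIDTH = 4
MASK = (1 << WIDTH) - 1
TAPS = (3, 2)                            # feedback taps, bit indices from 0


def feedback(state):
    """XOR of the tapped bits of the register."""
    bit = 0
    for t in TAPS:
        bit ^= (state >> t) & 1
    return bit


def pseudo_random_patterns(seed, count):
    """Pseudo-random code generator: successive states of the shift register."""
    state, patterns = seed, []
    for _ in range(count):
        patterns.append(state)
        state = ((state << 1) | feedback(state)) & MASK
    return patterns


def signature(responses):
    """Signature register: folds each response word into the shifting state."""
    sig = 0
    for r in responses:
        sig = (((sig << 1) | feedback(sig)) ^ r) & MASK
    return sig


def unit_under_test(x):
    return (x + 3) & MASK                # hypothetical functional block


def faulty_unit(x):
    return unit_under_test(x) | 0b0100   # one output bit stuck at 1


patterns = pseudo_random_patterns(seed=0b0001, count=15)
reference = signature([unit_under_test(p) for p in patterns])   # kept in ROM
observed = signature([faulty_unit(p) for p in patterns])
print(f"reference={reference:04b} observed={observed:04b} fault={observed != reference}")
```

Only the short reference signature has to be stored, not the full response sequence. The price is a small probability that a faulty response sequence compresses to the same signature.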

LSI self-diagnostics is a natural development of the structural approach to the design of controllable devices. The combination of built-in testability means (end-to-end shift register for scanning states, pseudo-random test code generator, signature analysis register) allows organizing self-testing of crystals, semiconductor wafers, microcircuits and printed circuit assemblies. Since the cost of self-diagnostic tools remains approximately the same, and the cost of testing by standard methods is increasing exponentially, it can be assumed that with an increase in the saturation of VLSI (degree of integration), self-diagnostic tools will become mandatory.

Figure 5. Built-in LSI MP control. Reconfiguration of the LSI structure in test mode using additional flip-flops (a) and a special memory (b)

