Shewhart charts in the clinical laboratory. How to build SPC control charts in Excel. Basic provisions of the theory

4. Examples of constructing Shewhart control charts using GOST R 50779.42–99

Shewhart control charts come in two main types: charts for quantitative (variables) data and charts for alternative (attribute) data. Each control chart can be used in two situations:

a) standard values are not set;

b) standard values are set.

Standard values are values set according to some specific requirement or purpose.

The purpose of control charts for which standard values are not set is to detect deviations in the values of the plotted characteristics (for example, $\bar{X}$, $R$, or some other statistic) that are larger than chance alone can explain. These control charts are based entirely on the data of the samples themselves and are used to detect variation due to non-random causes.

The purpose of control charts with given standard values is to determine whether the observed values of $\bar{X}$, etc. for several subgroups (each of $n$ observations) differ from the corresponding standard values $X_0$ (or $\mu$), etc. by more than can be expected from random causes alone. A distinctive feature of charts with predetermined standard values is an additional requirement on the position of the process center and on process variation. The standard values may be based on experience gained from using control charts without predetermined standard values, on economic considerations established after weighing the need for the service against the cost of production, or on the technical requirements for the product.


4.1 Control charts for quantitative data

Control charts for quantitative data are the classic control charts, used for process control when the characteristics or results of the process are measurable and the actual values of the controlled parameter, measured with the required accuracy, are recorded.

Control charts for quantitative data make it possible to monitor both the location of the process center (level, mean, center of adjustment) and its spread (range, standard deviation). For this reason they are almost always used and analyzed in pairs, one chart for location and one for spread.

The most commonly used pairs are $\bar{X}$ and $R$ charts, and $\bar{X}$ and $s$ charts. Formulas for the control limits of these charts are given in Table 1. The coefficients that enter these formulas, which depend on the sample size, are given in Table 2.

It should be emphasized that the coefficients in Table 2 were derived on the assumption that the controlled parameter has a normal or close to normal distribution.


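Table 1

Control limit formulas for Shewhart charts using quantitative data

Statistic | Standard values are not set: central line | Standard values are not set: UCL and LCL | Standard values are set: central line | Standard values are set: UCL and LCL
$\bar{X}$ | $\bar{\bar{X}}$ | $\bar{\bar{X}} \pm A_2 \bar{R}$ or $\bar{\bar{X}} \pm A_3 \bar{s}$ | $X_0$ or $\mu$ | $X_0 \pm A \sigma_0$
$R$ | $\bar{R}$ | $D_4 \bar{R}$, $D_3 \bar{R}$ | $R_0$ or $d_2 \sigma_0$ | $D_2 \sigma_0$, $D_1 \sigma_0$
$s$ | $\bar{s}$ | $B_4 \bar{s}$, $B_3 \bar{s}$ | $s_0$ or $c_4 \sigma_0$ | $B_6 \sigma_0$, $B_5 \sigma_0$

Note: The standard values are either $X_0$, $R_0$, $s_0$, $\mu$, or $\sigma_0$.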

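Table 2

Coefficients for calculating control chart lines

Columns: number of observations in the subgroup $n$; coefficients for calculating control limits ($A$ to $D_4$); coefficients for calculating the center line ($c_4$ to $1/d_2$).

n A A2 A3 B3 B4 B5 B6 D1 D2 D3 D4 c4 1/c4 d2 1/d2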
2 2,121 1,880 2,659 0,000 3,267 0,000 2,606 0,000 3,686 0,000 3,267 0,7979 1,2533 1,128 0,8865
3 1,732 1,023 1,954 0,000 2,568 0,000 2,276 0,000 4,358 0,000 2,574 0,8886 1,1284 1,693 0,5907
4 1,500 0,729 1,628 0,000 2,266 0,000 2,088 0,000 4,696 0,000 2,282 0,9213 1,0854 2,059 0,4857
5 1,342 0,577 1,427 0,000 2,089 0,000 1,964 0,000 4,918 0,000 2,114 0,9400 1,0638 2,326 0,4299
6 1,225 0,483 1,287 0,030 1,970 0,029 1,874 0,000 5,078 0,000 2,004 0,9515 1,0510 2,534 0,3946
7 1,134 0,419 1,182 0,118 1,882 0,113 1,806 0,204 5,204 0,076 1,924 0,9594 1,0423 2,704 0,3698
8 1,061 0,373 1,099 0,185 1,815 0,179 1,751 0,388 5,306 0,136 1,864 0,9650 1,0363 2,847 0,3512
9 1,000 0,337 1,032 0,239 1,761 0,232 1,707 0,547 5,393 0,184 1,816 0,9693 1,0317 2,970 0,3367
10 0,949 0,308 0,975 0,284 1,716 0,276 1,669 0,687 5,469 0,223 1,777 0,9727 1,0281 3,078 0,3249
11 0,905 0,285 0,927 0,321 1,679 0,313 1,637 0,811 5,535 0,256 1,744 0,9754 1,0252 3,173 0,3152
12 0,866 0,266 0,886 0,354 1,646 0,346 1,610 0,922 5,594 0,283 1,717 0,9776 1,0229 3,258 0,3069
13 0,832 0,249 0,850 0,382 1,618 0,374 1,585 1,025 5,647 0,307 1,693 0,9794 1,0210 3,336 0,2998
14 0,802 0,235 0,817 0,406 1,594 0,399 1,563 1,118 5,696 0,328 1,672 0,9810 1,0194 3,407 0,2935
15 0,775 0,223 0,789 0,428 1,572 0,421 1,544 1,203 5,741 0,347 1,653 0,9823 1,0180 3,472 0,2880
16 0,750 0,212 0,763 0,448 1,552 0,440 1,526 1,282 5,782 0,363 1,637 0,9835 1,0168 3,532 0,2831
17 0,728 0,203 0,739 0,466 1,534 0,458 1,511 1,356 5,820 0,378 1,622 0,9845 1,0157 3,588 0,2784
18 0,707 0,194 0,718 0,482 1,518 0,475 1,496 1,424 5,856 0,391 1,608 0,9854 1,0148 3,640 0,2747
19 0,688 0,187 0,698 0,497 1,503 0,490 1,483 1,487 5,891 0,403 1,597 0,9862 1,0140 3,689 0,2711
20 0,671 0,180 0,680 0,510 1,490 0,504 1,470 1,549 5,921 0,415 1,585 0,9869 1,0133 3,735 0,2677
21 0,655 0,173 0,663 0,523 1,477 0,516 1,459 1,605 5,951 0,425 1,575 0,9876 1,0126 3,778 0,2647
22 0,640 0,167 0,647 0,534 1,466 0,528 1,448 1,659 5,979 0,434 1,566 0,9882 1,0119 3,819 0,2618
23 0,626 0,162 0,633 0,545 1,455 0,539 1,438 1,710 6,006 0,443 1,557 0,9887 1,0114 3,858 0,2592
24 0,612 0,157 0,619 0,555 1,445 0,549 1,429 1,759 6,031 0,451 1,548 0,9892 1,0109 3,895 0,2567
25 0,600 0,153 0,606 0,565 1,434 0,559 1,420 1,806 6,056 0,459 1,541 0,9896 1,0105 3,931 0,2544
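Several columns of Table 2 follow from closed-form relations that hold for normally distributed data. The sketch below is an illustration under that assumption, not part of the standard: it recomputes $c_4$ with the gamma function and derives $A$, $A_3$ and $B_3$ to $B_6$ from it. The function names are hypothetical. $d_2$ has no simple closed form, so $A_2$ is derived from a tabulated $d_2$ value.

```python
import math


def c4(n: int) -> float:
    """Bias-correction constant for the sample standard deviation (normal data)."""
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)


def s_chart_coefficients(n: int) -> dict:
    """Recompute A, A3, B3..B6 and c4 for subgroup size n (3-sigma limits)."""
    c = c4(n)
    spread = 3.0 * math.sqrt(1.0 - c * c)  # 3 sigma of s, in units of sigma
    return {
        "A": 3.0 / math.sqrt(n),
        "A3": 3.0 / (c * math.sqrt(n)),
        "B3": max(0.0, 1.0 - spread / c),  # negative limits are set to zero
        "B4": 1.0 + spread / c,
        "B5": max(0.0, c - spread),
        "B6": c + spread,
        "c4": c,
    }


def a2_from_d2(n: int, d2: float) -> float:
    """A2 derived from a tabulated d2 value (d2 itself is taken from Table 2)."""
    return 3.0 / (d2 * math.sqrt(n))


if __name__ == "__main__":
    for n in (2, 5, 10):
        print(n, {k: round(v, 4) for k, v in s_chart_coefficients(n).items()})
```

Comparing the printed values with the rows for n = 2, 5 and 10 is a quick way to catch transcription errors in the table.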

An alternative to $\bar{X}$ charts are median control charts ($Me$ charts), which require less computation than $\bar{X}$ charts. This can make them easier to introduce in production. The central line of the chart is the average of the medians ($\overline{Me}$) over all controlled samples. The upper and lower control limits are given by

$$UCL = \overline{Me} + A_4 \bar{R}, \qquad LCL = \overline{Me} - A_4 \bar{R}. \qquad (4.1)$$

The values of the coefficient $A_4$, which depend on the sample size, are given in Table 3.

Table 3

Values of the coefficient $A_4$

n 2 3 4 5 6 7 8 9 10
A4 1,88 1,19 0,80 0,69 0,55 0,51 0,43 0,41 0,36

Usually the $Me$ chart is used together with the $R$ chart, sample size ...
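The median chart limits can be sketched in a few lines. This is a minimal illustration that assumes the limits have the form "average median plus or minus $A_4$ times the average range", with $A_4$ taken from Table 3. The function name `median_chart_limits` and the sample data in the usage note are hypothetical.

```python
from statistics import mean, median

# Coefficient A4 by subgroup size, transcribed from Table 3.
A4 = {2: 1.88, 3: 1.19, 4: 0.80, 5: 0.69, 6: 0.55,
      7: 0.51, 8: 0.43, 9: 0.41, 10: 0.36}


def median_chart_limits(subgroups):
    """Return (LCL, center line, UCL) for a median chart.

    `subgroups` is a list of equally sized lists of measurements.
    """
    n = len(subgroups[0])
    if any(len(g) != n for g in subgroups):
        raise ValueError("all subgroups must have the same size")
    if n not in A4:
        raise ValueError("Table 3 covers subgroup sizes 2 to 10 only")
    center = mean(median(g) for g in subgroups)         # average of the medians
    r_bar = mean(max(g) - min(g) for g in subgroups)    # average range
    half_width = A4[n] * r_bar
    return center - half_width, center, center + half_width
```

For example, `median_chart_limits([[1, 2, 3], [2, 3, 4], [3, 4, 5]])` uses made-up data with medians 2, 3, 4 and a common range of 2, so the center line is 3 and the half-width is 1.19 × 2.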

In some cases the cost or duration of measuring the controlled parameter is so great that the process has to be controlled on the basis of individual values. In this case the moving range serves as the measure of process variation, i.e. the absolute value of the difference between successive measurements of the controlled parameter: between the first and second measurements, then the second and third, and so on. The average moving range $\bar{R}$ is calculated from the moving ranges and is used to build control charts of individual values and moving ranges ($X$ and $R$ charts). Formulas for the control limits of these charts are given in Table 4.

Table 4

Control limit formulas for individual value charts

Statistic | Standard values are not set: central line | Standard values are not set: UCL and LCL | Standard values are set: central line | Standard values are set: UCL and LCL
Individual value $X$ | $\bar{X}$ | $\bar{X} \pm E_2 \bar{R}$ | $X_0$ or $\mu$ | $X_0 \pm 3\sigma_0$
Moving range $R$ | $\bar{R}$ | $D_4 \bar{R}$, $D_3 \bar{R}$ | $R_0$ or $d_2 \sigma_0$ | $D_2 \sigma_0$, $D_1 \sigma_0$

Note: The standard values $X_0$ and $R_0$, or $\mu$ and $\sigma_0$, are set.

The values of the coefficients $d_2$ and $E_2$ ($E_2 = 3/d_2$) can be obtained indirectly from Table 2 at n=2.
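The individual value and moving range calculation can be sketched as follows. This is an illustration for the case where standard values are not set. It assumes the usual limits $\bar{X} \pm E_2 \bar{R}$ with $E_2 = 3/d_2$, and $D_3 \bar{R}$, $D_4 \bar{R}$ for the moving range, with all constants read from the n = 2 row of Table 2. The function name is hypothetical.

```python
# Constants from the n = 2 row of Table 2 (moving ranges span two points).
D2_N2 = 1.128
D3_N2 = 0.0
D4_N2 = 3.267
E2 = 3.0 / D2_N2  # about 2.66


def individuals_chart_limits(values):
    """Return (LCL, center line, UCL) for the X chart and the moving range chart."""
    if len(values) < 2:
        raise ValueError("at least two measurements are needed")
    # Absolute differences between successive measurements.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    x_bar = sum(values) / len(values)
    r_bar = sum(moving_ranges) / len(moving_ranges)
    return {
        "X": (x_bar - E2 * r_bar, x_bar, x_bar + E2 * r_bar),
        "R": (D3_N2 * r_bar, r_bar, D4_N2 * r_bar),
    }
```

Note that k measurements give only k - 1 moving ranges, which is why the two averages use different divisors.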

4.1.1 $\bar{X}$ and $R$ charts. Standard values are not set

Table 5 shows the results of measuring the outer radius of a bushing. Four measurements were taken every half hour, for a total of 20 samples. The subgroup means and ranges are also given in Table 5. The specification limits for the outer radius are 0.219 and 0.125 dm. The goal is to determine the capability of the process and to control its centering and spread so that it meets the specified requirements.


Table 5

Production data for bushing outer radius

Columns: subgroup number; radius readings $X_1$ to $X_4$; subgroup mean $\bar{X}$; subgroup range $R$
1 0,1898 0,1729 0,2067 0,1898 0,1898 0,0338
2 0,2012 0,1913 0,1878 0,1921 0,1931 0,0134
3 0,2217 0,2192 0,2078 0,1980 0,2117 0,0237
4 0,1832 0,1812 0,1963 0,1800 0,1852 0,0163
5 0,1692 0,2263 0,2066 0,2091 0,2033 0,0571
6 0,1621 0,1832 0,1914 0,1783 0,1788 0,0293
7 0,2001 0,1937 0,2169 0,2082 0,2045 0,0242
8 0,2401 0,1825 0,1910 0,2264 0,2100 0,0576
9 0,1996 0,1980 0,2076 0,2023 0,2019 0,0096
10 0,1783 0,1715 0,1829 0,1961 0,1822 0,0246
11 0,2166 0,1748 0,1960 0,1923 0,1949 0,0418
12 0,1924 0,1984 0,2377 0,2003 0,2072 0,0453
13 0,1768 0,1986 0,2241 0,2022 0,2004 0,0473
14 0,1923 0,1876 0,1903 0,1986 0,1922 0,0110
15 0,1924 0,1996 0,2120 0,2160 0,2050 0,0236
16 0,1720 0,1940 0,2116 0,2320 0,2024 0,0600
17 0,1824 0,1790 0,1876 0,1821 0,1828 0,0086
18 0,1812 0,1585 0,1699 0,1680 0,1694 0,0227
19 0,1700 0,1567 0,1694 0,1702 0,1666 0,0135
20 0,1698 0,1664 0,1700 0,1600 0,1655 0,0100

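$$\bar{\bar{X}} = \frac{\sum \bar{X}}{k} = 0.1924, \qquad \bar{R} = \frac{\sum R}{k} = 0.0287,$$

where $k$ is the number of subgroups, $k = 20$.

The first step is to build the $R$ chart and determine the state of the process from it.

$R$ chart:

central line: $\bar{R} = 0.0287$

$UCL = D_4 \bar{R} = 2.282 \times 0.0287 = 0.0655$

$LCL = D_3 \bar{R} = 0 \times 0.0287$ (since $n$ is less than 7, there is no LCL).

The values of the factors $D_3$ and $D_4$ are taken from Table 2 for n=4. Since the $R$ values in Table 5 are within the control limits, the $R$ chart indicates a statistically controlled state. The value $\bar{R}$ can now be used to calculate the control limits of the $\bar{X}$ chart.

$\bar{X}$ chart:

center line: $\bar{\bar{X}} = 0.1924$

$UCL = \bar{\bar{X}} + A_2 \bar{R} = 0.1924 + 0.729 \times 0.0287 = 0.2133$

$LCL = \bar{\bar{X}} - A_2 \bar{R} = 0.1924 - 0.729 \times 0.0287 = 0.1715$

The value of the multiplier $A_2$ is taken from Table 2 for n=4.

The $\bar{X}$ and $R$ charts are shown in Fig. 5. Analysis of the $\bar{X}$ chart shows that the last three points lie outside the control limits. This indicates that some special causes of variation may be at work. If the limits had been calculated from previous data, action would have been required at the point corresponding to the 18th subgroup.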
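The whole calculation for this example can be reproduced with a short script. The sketch below is an illustration rather than the procedure of the standard. It recomputes the subgroup means and ranges directly from the four readings in each row of Table 5, so its results can differ in the last digit from values based on the rounded mean and range columns. The function names are hypothetical, and the constants are the n = 4 entries of Table 2.

```python
A2, D3, D4 = 0.729, 0.0, 2.282  # Table 2, n = 4

# Radius readings from Table 5, one tuple per subgroup.
READINGS = [
    (0.1898, 0.1729, 0.2067, 0.1898), (0.2012, 0.1913, 0.1878, 0.1921),
    (0.2217, 0.2192, 0.2078, 0.1980), (0.1832, 0.1812, 0.1963, 0.1800),
    (0.1692, 0.2263, 0.2066, 0.2091), (0.1621, 0.1832, 0.1914, 0.1783),
    (0.2001, 0.1937, 0.2169, 0.2082), (0.2401, 0.1825, 0.1910, 0.2264),
    (0.1996, 0.1980, 0.2076, 0.2023), (0.1783, 0.1715, 0.1829, 0.1961),
    (0.2166, 0.1748, 0.1960, 0.1923), (0.1924, 0.1984, 0.2377, 0.2003),
    (0.1768, 0.1986, 0.2241, 0.2022), (0.1923, 0.1876, 0.1903, 0.1986),
    (0.1924, 0.1996, 0.2120, 0.2160), (0.1720, 0.1940, 0.2116, 0.2320),
    (0.1824, 0.1790, 0.1876, 0.1821), (0.1812, 0.1585, 0.1699, 0.1680),
    (0.1700, 0.1567, 0.1694, 0.1702), (0.1698, 0.1664, 0.1700, 0.1600),
]


def xbar_r_limits(subgroups, a2, d3, d4):
    """Compute subgroup statistics and (LCL, center line, UCL) for both charts."""
    means = [sum(g) / len(g) for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]
    grand_mean = sum(means) / len(means)
    r_bar = sum(ranges) / len(ranges)
    return {
        "means": means,
        "ranges": ranges,
        "xbar": (grand_mean - a2 * r_bar, grand_mean, grand_mean + a2 * r_bar),
        "r": (d3 * r_bar, r_bar, d4 * r_bar),
    }


def out_of_limits(values, lcl, ucl):
    """1-based subgroup numbers whose value lies outside the limits."""
    return [i for i, v in enumerate(values, start=1) if v < lcl or v > ucl]


# First pass: limits from all 20 subgroups.
first = xbar_r_limits(READINGS, A2, D3, D4)
flagged = out_of_limits(first["means"], first["xbar"][0], first["xbar"][2])

# Second pass: limits recomputed without the flagged subgroups.
kept = [g for i, g in enumerate(READINGS, start=1) if i not in flagged]
revised = xbar_r_limits(kept, A2, D3, D4)

if __name__ == "__main__":
    print("flagged subgroups:", flagged)  # [18, 19, 20]
    print("initial X-bar limits:", [round(v, 4) for v in first["xbar"]])
    print("revised X-bar limits:", [round(v, 4) for v in revised["xbar"]])
    print("revised R limits:", [round(v, 4) for v in revised["r"]])
```

Dropping the flagged subgroups and recomputing in one extra pass mirrors the revision step of the worked example. In practice, points should be excluded only after their special causes have been identified.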

Fig. 5. Mean and range charts

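At this point, appropriate corrective action should be taken to eliminate the special causes and prevent their recurrence. Work with the charts continues after revised control limits are established, excluding the points that fell outside the old limits, i.e. the values for samples no. 18, 19 and 20. The values $\bar{\bar{X}}$ and $\bar{R}$ and the control chart lines are recalculated as follows:

revised value $\bar{\bar{X}} = \frac{\sum \bar{X}}{k} = 0.1968$ (with $k = 17$)

revised value $\bar{R} = \frac{\sum R}{k} = 0.0310$ (with $k = 17$)

The revised $\bar{X}$ chart has the following parameters:

center line: $\bar{\bar{X}} = 0.1968$

$UCL = \bar{\bar{X}} + A_2 \bar{R} = 0.1968 + 0.729 \times 0.0310 = 0.2194$

$LCL = \bar{\bar{X}} - A_2 \bar{R} = 0.1968 - 0.729 \times 0.0310 = 0.1742$

revised $R$ chart:

central line: $\bar{R} = 0.0310$

$UCL = D_4 \bar{R} = 2.282 \times 0.0310 = 0.0707$

$LCL = D_3 \bar{R} = 0 \times 0.0310$ (since $n$ is less than 7, there is no LCL).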

For a stable process with revised control limits, the process capability can be assessed. We calculate the process capability index:

$$PCI = \frac{UTL - LTL}{6\hat{\sigma}} = \frac{0.219 - 0.125}{6 \times 0.0151} = 1.04,$$

where $UTL$ is the upper limit value of the controlled parameter; $LTL$ is the lower limit value of the controlled parameter; $\hat{\sigma}$ is estimated from the average variability within subgroups and expressed as $\hat{\sigma} = \bar{R}/d_2 = 0.0310/2.059 = 0.0151$. The value of the constant $d_2$ is taken from Table 2 for n=4.

Fig. 6. Revised $\bar{X}$ and $R$ charts

Since $PCI > 1$, the process capability can be considered acceptable. However, on close examination it can be seen that the process is not centered correctly relative to the tolerance, and therefore about 11.8% of the units will fall outside the upper limit value. Therefore, before setting the permanent parameters of the control charts, one should try to center the process correctly while keeping it in a statistically controlled state.
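The capability calculation is short enough to check by hand, but a small function makes the dependence on $\bar{R}$ and $d_2$ explicit. This is a sketch under two stated assumptions: the within-subgroup standard deviation is estimated as $\bar{R}/d_2$, and the revised average range is about 0.0310, which is what Table 5 gives once subgroups 18 to 20 are dropped. The function name is hypothetical.

```python
def capability_index(upper_limit, lower_limit, r_bar, d2):
    """Return (capability index, sigma estimate) from the average range."""
    sigma_hat = r_bar / d2  # within-subgroup standard deviation estimate
    return (upper_limit - lower_limit) / (6.0 * sigma_hat), sigma_hat


# Specification limits 0.219 and 0.125 dm; d2 = 2.059 is the Table 2 value for n = 4.
pci, sigma_hat = capability_index(0.219, 0.125, r_bar=0.0310, d2=2.059)

if __name__ == "__main__":
    print(f"sigma estimate = {sigma_hat:.4f}, capability index = {pci:.2f}")
```

The index compares only the tolerance width with the spread. It says nothing about centering, which is why a process can pass this check and still produce units beyond one specification limit.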


Plan:

10.1 Basics of Shewhart's Control Charts

10.2 Types of Shewhart control charts

10.1 Basics of Shewhart's Control Charts

The task of statistical process control is to ensure and maintain processes at an acceptable and stable level, ensuring that products and services meet specified requirements. The main statistical tool used for this is the control chart. The control chart method helps to determine whether a process has indeed reached or remains in a statistically controlled state at the correct level, and then to maintain control and a high degree of uniformity of the most important characteristics of a product or service by continuously recording product quality information during the production process. The use of control charts and their careful analysis lead to a better understanding and improvement of processes.

Shewhart control charts (SCC) are the main tool of statistical quality management. They are used to compare information about the current state of a process, obtained from samples, with control limits that represent the limits of the intrinsic variability (scatter) of the process. They are used to assess whether a manufacturing, service, or management process is in a statistically controlled state. Shewhart control charts were originally developed for industrial production; they are now widely used in the service sector and other areas.

A control chart is a graphical way of presenting and comparing information based on a sequence of samples that reflects the current state of a process, against limits set on the basis of the intrinsic variability of the process.

Control chart theory distinguishes between two kinds of variability. The first kind is variability due to "random" (common) causes: a countless set of diverse causes that are constantly present and are difficult or impossible to identify. Each of these causes accounts for a very small fraction of the total variability, and none of them is significant on its own. However, the sum of all these causes is measurable and is assumed to be intrinsic to the process. Eliminating or reducing the impact of common causes requires management decisions and the allocation of resources to improve the process and the system. The second kind is a real change in the process. It may be the result of identifiable causes that are not intrinsic to the process and can be eliminated. These identifiable causes are regarded as "non-random" or "special" causes of variation. They may include tool breakage, insufficient homogeneity of material, production or inspection equipment, personnel qualification, failure to follow procedures, etc.

The purpose of control charts is to detect unnatural changes in data from repetitive processes and to provide criteria for detecting a lack of statistical control. A process is in a statistically controlled state if its variability is caused only by random causes. Once this acceptable level of variability has been determined, any deviation from it is considered the result of special causes, which should be identified and then eliminated or reduced.

The Shewhart chart requires data sampled from the process at roughly equal intervals. Intervals can be set either by time (e.g. hourly) or by quantity (each batch). Typically, each subgroup consists of the same type of product or service with the same controlled indicators, and all subgroups are of equal size. For each subgroup, one or more characteristics are determined, such as the subgroup arithmetic mean $\bar{X}$ and the subgroup range $R$ or the sample standard deviation $s$. The Shewhart chart is a plot of the values of certain subgroup characteristics against the subgroup numbers. It has a center line (CL) corresponding to the reference value of the characteristic. When assessing whether a process is in a statistically controlled state, the reference is usually the arithmetic mean of the data under consideration. In process control, the reference is the long-term value of the characteristic set in the specification, its nominal value based on previous information about the process, or the intended target value of the characteristic of the product or service. The Shewhart chart has two statistically defined control limits on either side of the center line, called the upper control limit (UCL) and the lower control limit (LCL) (Figure 9).

Figure 9 - General view of a control chart (horizontal axis: sequence number of the sample)

The control limits on a Shewhart chart lie at a distance of $3\sigma$ on either side of the center line, where $\sigma$ is the population standard deviation of the statistic being plotted. Variation within subgroups is a measure of random variation. To estimate $\sigma$, calculate the sample standard deviation or multiply the sample range by the appropriate factor. This measure does not include between-subgroup variation; it assesses only the variation within subgroups.

The $\pm 3\sigma$ limits mean that about 99.7% of the subgroup characteristic values will fall within these limits, provided the process is in a statistically controlled state. In other words, there is a risk of about 0.3% (on average three cases in a thousand) that a plotted point will fall outside the control limits when the process is stable. The word "about" is used because deviations from the initial assumptions, such as the form of the data distribution, affect the probability values.

Some practitioners prefer the factor 3.09 instead of 3 to provide a nominal probability of 0.2% (on average two misleading observations per thousand), but Shewhart chose 3 so as to avoid dwelling on exact probabilities. Similarly, some practitioners use actual probability limits for charts based on non-normal distributions, such as range charts and charts for the fraction nonconforming. Here too the Shewhart chart uses limits at $\pm 3\sigma$ instead of probability limits, which simplifies empirical interpretation.
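The nominal risks quoted for the factors 3 and 3.09 can be checked with the standard normal distribution. The sketch below assumes that the plotted statistic is exactly normal, which, as noted above, is only an approximation in practice. The helper names are hypothetical.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def false_alarm_probability(k: float) -> float:
    """Probability that a point falls outside +/- k sigma limits by chance."""
    return 2.0 * (1.0 - normal_cdf(k))


if __name__ == "__main__":
    for k in (2.0, 3.0, 3.09):
        print(f"k = {k}: {100 * false_alarm_probability(k):.2f}% outside the limits")
```

Under this assumption the factor 3 gives about 0.27%, which the text rounds to 0.3%, and 3.09 gives about 0.20%.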

The probability that a point outside the limits is a chance event rather than a real signal is considered so small that, when such a point appears, action should be taken. Because action is taken at this point, the $3\sigma$ control limits are sometimes called "action limits".

Limits are often also drawn on the control chart at a distance of $2\sigma$. Any sample value that falls outside the $2\sigma$ limits can then serve as a warning that the process may be about to leave the state of statistical control. For this reason the $\pm 2\sigma$ limits are sometimes called "warning limits".

When using control charts, two types of errors are possible: the first and second kind.

A Type I error occurs when the process is in a statistically controlled state and the point jumps out of control bounds by accident. As a result, it is incorrectly determined that the process is out of statistical control and an attempt is made to find and eliminate the cause of a non-existent problem.

A Type II error occurs when the process in question is out of control and the points happen to be inside the control boundaries. In this case, they incorrectly conclude that the process is statistically controlled and miss the opportunity to prevent an increase in the yield of nonconforming products. Type II error risk is a function of three factors: the width of the control limits, the degree of uncontrollability, and the sample size. Their nature is such that only a general statement about the magnitude of the error can be made.

The Shewhart chart system takes into account only Type I errors, which amount to 0.3% for $3\sigma$ limits. Since it is generally impractical to make a full estimate of the losses from a Type II error in a particular situation, and since it is convenient to take an arbitrary small subgroup size (4 or 5 units), it is advisable to use limits at $\pm 3\sigma$ and to focus mainly on managing and improving the quality of the process itself.
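The dependence of the Type II risk on the limit width, the size of the shift and the sample size can be illustrated for the simplest case. The sketch below uses the textbook formula for a mean chart and is not something defined in the source. It assumes normally distributed data with a known standard deviation, a sustained shift of the process mean by `shift_in_sigmas` standard deviations, and a single plotted subgroup mean. The function names are hypothetical.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def type_ii_risk(shift_in_sigmas: float, n: int, k: float = 3.0) -> float:
    """Probability that one subgroup mean stays inside +/- k sigma limits
    after the process mean has shifted by `shift_in_sigmas` standard deviations."""
    offset = shift_in_sigmas * math.sqrt(n)  # shift in units of sigma / sqrt(n)
    return normal_cdf(k - offset) - normal_cdf(-k - offset)


if __name__ == "__main__":
    for n in (4, 5):
        for shift in (0.5, 1.0, 1.5, 2.0):
            print(f"n = {n}, shift = {shift} sigma: "
                  f"risk = {type_ii_risk(shift, n):.3f}")
```

With no shift the function returns the in-control coverage of about 0.9973. Larger shifts and larger subgroups both reduce the risk of missing the change.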

If the process is statistically controlled, control charts implement a method of continuous statistical testing of the null hypothesis that the process has not changed and remains stable. However, the size of the deviation of the process characteristic from the target that would deserve attention usually cannot be determined in advance, nor can the risk of a Type II error, and the sample size is not calculated to meet a corresponding level of risk. For these reasons the Shewhart chart should not be regarded as a hypothesis test. Shewhart emphasized the empirical usefulness of control charts for detecting departures from the state of statistical control, not their probabilistic interpretation. Some users do use operating characteristic curves as a means of interpreting the chart as a hypothesis test.

When the plotted value is outside any of the control limits, or the series of values ​​exhibits unusual patterns, the state of statistical control is questioned. In this case, it is necessary to investigate and detect non-random (special) causes, and the process can be stopped or corrected. Once special causes are found and ruled out, the process is ready to continue again. When a Type I error occurs, no particular cause can be found. Then it is believed that the point going beyond the boundaries is a rather rare random phenomenon when the process is in a statistically controlled state.

When a process control chart is built for the first time, it often turns out that the process is statistically uncontrollable. Control limits calculated from data from such a process will sometimes lead to erroneous conclusions because they may be too wide. Therefore, before setting the constant parameters of the control charts, it is necessary to bring the process into a statistically controlled state.

Algorithm:

1. Process analysis.

First of all, identify the existing problem; without one, the analysis makes no sense. For greater clarity, you can use the Ishikawa cause-and-effect diagram (mentioned above, ch. 2). It is recommended to involve employees from different departments and to use brainstorming when compiling it. After a thorough analysis of the problem and of the factors that influence it, we proceed to the second stage.

2. Process selection.

Having clarified at the previous stage the factors that influence the process by drawing a detailed "fishbone" skeleton, choose the process that will be studied further. This step is very important: choosing the wrong indicators makes the whole control chart less effective, because insignificant indicators end up being examined. At this stage it is worth realizing that the choice of the process and the indicator determines the outcome of the entire study and the costs associated with it.

3. Data collection.

The purpose of this stage is to collect data about the process. To do this, design the most suitable way to collect the data and decide who will take the measurements and when. If the process is not equipped with technical means for automated data entry and processing, one of Ishikawa's seven simple tools can be used: check sheets. Check sheets are, in effect, forms for recording the parameter under study. Their advantages are ease of use and ease of training employees. If there is a computer at the workplace, the data can be entered through appropriate software.

Depending on the specifics of the indicator, the frequency, time of collection and sample size are determined to ensure representativeness of the data. The collected data is the basis for further operations and calculations.

After collecting the information, the researcher must decide whether the data need to be grouped. Grouping often determines how well control charts perform. Here the analysis already carried out with the cause-and-effect diagram can help to identify the factors by which the data can be grouped most rationally. Note that data within one group should have little variability; otherwise the data may be misinterpreted. Also, if the process is divided into parts by stratification, each part should be analyzed separately (for example, the same parts manufactured by different workers).

Changing the grouping method changes the factors that form the within-group variation. It is therefore necessary to study the factors that influence the indicator in order to apply the correct grouping.

4. Calculation of the values ​​of the control chart.

Shewhart control charts are divided into quantitative and qualitative (alternative) charts, depending on whether the studied indicator is measurable. If the value of the indicator is measurable (temperature, weight, size, etc.), charts of indicator values, range charts, and paired Shewhart charts are used. If, on the contrary, the indicator cannot be measured numerically, chart types for alternative (attribute) data are used. In that case the indicators studied are classified as meeting or not meeting the requirements. Hence the use of charts for the proportion (number) of defective items and for the number of nonconformities per unit of production.

For any type of Shewhart chart, a central line and control limits are defined. The central line (CL, center line) represents the average value of the indicator, and the control limits (UCL, upper control limit; LCL, lower control limit) bound the variation expected from random causes alone.

At this stage, the researcher must calculate the values ​​of CL, UCL, LCL.

5. Construction of a control chart.

We now come to the most interesting step: the graphical display of the data obtained. If the data were entered into a computer, they can be plotted quickly in Statistica or Excel. A control chart can also be built without special programs. In that case, plot the values of the quality indicator along the OY axis and the time points at which the values were recorded along the OX axis, in the following sequence:

  • 1) draw the central line (CL) on the control chart;
  • 2) draw the control limits (UCL; LCL);
  • 3) plot the data obtained during the study by placing a marker at the intersection of the indicator value and the time at which it was recorded. It is recommended to use different markers for values inside and outside the control limits.

6. Checking the stability and controllability of the process.

This stage shows what the study was conducted for: whether the process is stable. Stability (statistical controllability) is understood as a state in which the repeatability of parameters is guaranteed. The process is considered stable only if none of the following cases occurs.

The main criteria of process instability are:

  • 1) Points beyond the control limits.
  • 2) A run: a certain number of consecutive points on the same side of the center line (above or below). A run of seven points is considered abnormal. In addition, the situation should be considered abnormal if:
    • a) at least 10 out of 11 points are on the same side of the center line;
    • b) at least 12 out of 14 points are on the same side of the center line;
    • c) at least 16 out of 20 points are on the same side of the center line.
  • 3) Trend: a continuously rising or falling curve.
  • 4) Approaching the control limits. If 2 or 3 points are very close to the control limits, this indicates an abnormal distribution.
  • 5) Approaching the center line. If the values are concentrated near the center line, this may indicate an incorrect choice of grouping method, which makes the within-group range too wide because data from different distributions are mixed.
  • 6) Periodicity: the curve alternately falls and rises at regular intervals of time.
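The first three criteria above lend themselves to automatic checking. The sketch below is one possible implementation, not a standard one. It covers points beyond the limits, runs, the "10 of 11, 12 of 14, 16 of 20" majority rules, and trends. The trend length of 7 is an assumption, since the text gives no number, and points lying exactly on the center line are treated as breaking a run. Criteria 4 to 6 are left to visual inspection. All function names are hypothetical.

```python
def side(value, center):
    """Return +1 above the center line, -1 below it, 0 exactly on it."""
    return (value > center) - (value < center)


def beyond_limits(points, lcl, ucl):
    """1-based positions of points outside the control limits (criterion 1)."""
    return [i for i, p in enumerate(points, start=1) if p < lcl or p > ucl]


def longest_run(points, center):
    """Longest sequence of consecutive points on one side of the center line."""
    best = current = previous = 0
    for p in points:
        s = side(p, center)
        current = current + 1 if (s != 0 and s == previous) else abs(s)
        previous = s
        best = max(best, current)
    return best


def majority_on_one_side(points, center, window, minimum):
    """True if some `window` consecutive points have at least `minimum` on one side."""
    sides = [side(p, center) for p in points]
    for start in range(len(sides) - window + 1):
        chunk = sides[start:start + window]
        if max(chunk.count(1), chunk.count(-1)) >= minimum:
            return True
    return False


def longest_trend(points):
    """Longest sequence of consecutive points that only rise or only fall."""
    if not points:
        return 0
    best = current = 1
    direction = 0
    for a, b in zip(points, points[1:]):
        d = (b > a) - (b < a)
        current = current + 1 if (d != 0 and d == direction) else 1 + abs(d)
        direction = d
        best = max(best, current)
    return best


def instability_signals(points, lcl, center, ucl, run_length=7, trend_length=7):
    """Evaluate criteria 1 to 3; trend_length=7 is an assumed threshold."""
    return {
        "beyond_limits": beyond_limits(points, lcl, ucl),
        "run": longest_run(points, center) >= run_length,
        "majority": any(majority_on_one_side(points, center, w, m)
                        for w, m in ((11, 10), (14, 12), (20, 16))),
        "trend": longest_trend(points) >= trend_length,
    }
```

A process would be treated as stable by these checks only if the list of points beyond the limits is empty and the three flags are all false.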
7. Analysis of control charts.

Further actions depend on the conclusion about the stability or instability of the process. If the process does not meet the stability criteria, the influence of non-random factors should be reduced, new data collected, and a new control chart built. If the process does meet the stability criteria, its capability should be evaluated. The smaller the spread of the parameter within the tolerance limits, the higher the process capability index. The index reflects the ratio of the tolerance width to the spread of the parameter.

I recently posted here my slidecast about 6-Sigma, Shewhart control charts, and snowflake people. In it, in fairly plain language, in places abusing foul language, and to 20 minutes of laughter from the listeners, I talked about how to separate systemic variation from variation caused by special causes.

Now I want to analyze in detail an example of constructing a Shewhart control chart based on real data. As real data, I took historical information about completed personal tasks. I have this information thanks to my adaptation of David Allen's personal effectiveness model Getting Things Done (I also have an old slidecast about this in three parts: Part 1, Part 2, Part 3 + Excel table with macros for analyzing tasks from Outlook).

The task statement looks like this. I have a distribution of the average number of completed tasks depending on the day of the week (below in the graph) and I need to answer the question: “is there anything special about Mondays or is it just a system error?”

Let's answer this question with the help of the Shewhart control chart, the main tool of statistical process control.

Shewhart's criterion for the presence of a special cause of variation is quite simple: if any point goes beyond the control limits, which are calculated in a special way, it indicates a special cause. If the point lies within these limits, the deviation is due to the general properties of the system itself. Roughly speaking, it is measurement error.
The formula for calculating the control limits looks like this:

$$UCL,\ LCL = \bar{\bar{X}} \pm A_2 \bar{R}$$

where
$\bar{\bar{X}}$ - the mean of the subgroup means,
$\bar{R}$ - the average range,
$A_2$ - an engineering coefficient that depends on the size of the subgroup.

All formulas and tabulated coefficients can be found, for example, in GOST R 50779.42-99, which sets out the approach to statistical management briefly and clearly (honestly, I did not expect such a GOST to exist). The topic of statistical management and its place in business optimization is covered in more detail in the book by D. Wheeler.

In our case we group the number of completed tasks by day of the week; these are the subgroups of our sample. I took data on the number of completed tasks for 5 weeks of work, so the subgroup size is 5. From Table 2 of the GOST we find the value of the engineering coefficient: $A_2 = 0.577$.

Calculating the mean and the range (the difference between the maximum and minimum values) for a subgroup (in our case, for a day of the week) is fairly simple. In my case the results are as follows:

The central line of the control chart will be the mean of the group means, i.e.:

We also calculate the average range:

Now we know that the lower control limit for the number of completed tasks will be equal to:

That is, those days on which I complete fewer tasks on average are special from the point of view of the system.

Similarly, we obtain the upper control limit:

Now plot the center line (red), upper control limit (green), and lower control limit (purple) on the chart:

And, oh, miracle! We see three clearly distinct groups outside the control limits, in which there are clearly non-systemic causes of variation!

I don't work on Saturdays and Sundays. Fact. And Monday was a really special day. And now you can think and look for what is really special on Mondays.
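The same calculation can be sketched in a few lines of code. The task counts below are invented purely for illustration and are not the author's data. Only the structure (seven weekday subgroups of five weekly counts each) and the coefficient $A_2 = 0.577$ for subgroups of size 5 follow the text. All names are hypothetical.

```python
A2_N5 = 0.577  # Table 2 of the GOST, subgroup size 5

# Hypothetical counts of completed tasks: one list of 5 weekly values per weekday.
TASKS = {
    "Mon": [14, 15, 13, 16, 14],
    "Tue": [8, 7, 9, 8, 6],
    "Wed": [7, 9, 8, 6, 8],
    "Thu": [8, 8, 7, 8, 7],
    "Fri": [6, 8, 7, 7, 8],
    "Sat": [1, 0, 2, 0, 1],
    "Sun": [0, 1, 0, 0, 1],
}


def weekday_chart(tasks, a2):
    """Return subgroup means, (LCL, center line, UCL) and the days outside the limits."""
    means = {day: sum(v) / len(v) for day, v in tasks.items()}
    ranges = [max(v) - min(v) for v in tasks.values()]
    center = sum(means.values()) / len(means)   # mean of the subgroup means
    r_bar = sum(ranges) / len(ranges)           # average range
    lcl, ucl = center - a2 * r_bar, center + a2 * r_bar
    special = [day for day, m in means.items() if m < lcl or m > ucl]
    return means, (lcl, center, ucl), special


means, limits, special_days = weekday_chart(TASKS, A2_N5)

if __name__ == "__main__":
    print("limits:", [round(v, 2) for v in limits])
    print("days outside the limits:", special_days)
```

With these made-up numbers Monday lies above the upper limit and the weekend days below the lower one, which mimics the pattern described in the post. Real data would of course give different limits.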

However, if the average number of tasks completed on Monday were within the control limits, even if it stood out strongly against the other points, then from the point of view of Shewhart and Deming it would be pointless to look for anything special about Mondays, since such behavior is determined solely by common causes. For example, I built a control chart for another 5 weeks at the end of last year:

It does feel as though Monday somehow stands out, but by the Shewhart criterion this is just a fluctuation, an error of the system itself. According to Shewhart, in this case you can search for special causes of Mondays for as long as you like; they simply do not exist. From the point of view of statistical control, on these data Monday is no different from any other working day (even Sundays).

I recently published a talk of mine in which, in fairly plain language (with the occasional swear word) and to some 20 minutes of laughter from the audience, I explained how to separate systemic variation from variation due to special causes.

Now I want to walk through, in detail, an example of building a Shewhart control chart from real data. For the data I took my history of completed personal tasks. I have this information thanks to my adaptation of David Allen's personal effectiveness model, Getting Things Done (I also have an old three-part slidecast about it: Part 1, Part 2, Part 3, plus an Excel workbook with macros for analyzing tasks from Outlook).

The problem statement looks like this. I have the average number of completed tasks for each day of the week (shown in the chart below), and I need to answer the question: "Is there anything special about Mondays, or is it just the system's own noise?"

Let's answer this question with the help of the Shewhart control chart, the main tool of statistical process control.

So, Shewhart's criterion for the presence of a special cause of variation is quite simple: if a point falls outside the control limits, which are calculated in a special way, it indicates a special cause. If the point lies within the limits, the deviation is due to the general properties of the system itself; roughly speaking, it is measurement error.
The formula for calculating the control limits looks like this:

$$UCL,\ LCL = \bar{\bar{X}} \pm A_2 \bar{R}$$

where
$\bar{\bar{X}}$ is the mean of the subgroup means,
$\bar{R}$ is the average range,
$A_2$ is an engineering coefficient that depends on the subgroup size.
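
As a rough illustration, the formula can be written as a small Python function. This is only a sketch: the function name `xbar_control_limits` and the input numbers are made up for the example, and 0.577 is the commonly tabulated coefficient for subgroups of size 5.

```python
def xbar_control_limits(grand_mean, avg_range, a2):
    """Return (LCL, CL, UCL) for a chart of subgroup means.

    grand_mean -- mean of the subgroup means
    avg_range  -- average of the subgroup ranges
    a2         -- tabulated coefficient for the subgroup size
    """
    half_width = a2 * avg_range
    return grand_mean - half_width, grand_mean, grand_mean + half_width


# Made-up inputs, not the article's data.
lcl, cl, ucl = xbar_control_limits(grand_mean=10.0, avg_range=4.0, a2=0.577)
print(f"LCL={lcl:.3f}  CL={cl:.3f}  UCL={ucl:.3f}")
```

The limits are symmetric about the center line, so a wider average range within subgroups directly widens the band in which a point counts as ordinary variation.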

All the formulas and tabulated coefficients can be found, for example, in GOST R 50779.42-99, which sets out the approach to statistical process control briefly and clearly (honestly, I did not expect such a GOST to exist). Statistical process control and its place in business optimization are covered in more detail in the book by D. Wheeler.

In our case, we group the number of completed tasks by day of the week; these are the subgroups of our sample. I took the number of completed tasks over 5 working weeks, so the subgroup size is 5. From Table 2 of the GOST we find the value of the engineering coefficient:

$$A_2 = 0.577$$
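
The table lookup can be sketched in Python as follows. The dictionary holds the A2 factors as they are commonly tabulated in SPC references for subgroup sizes 2 to 10; the names `A2_BY_SUBGROUP_SIZE` and `a2_for` are illustrative, and the values are worth checking against Table 2 of the GOST before relying on them.

```python
# A2 factors for charts of subgroup means, keyed by subgroup size n.
# Commonly tabulated values; verify against the standard before use.
A2_BY_SUBGROUP_SIZE = {
    2: 1.880, 3: 1.023, 4: 0.729, 5: 0.577, 6: 0.483,
    7: 0.419, 8: 0.373, 9: 0.337, 10: 0.308,
}


def a2_for(subgroup_size):
    """Look up the A2 coefficient for a given subgroup size."""
    try:
        return A2_BY_SUBGROUP_SIZE[subgroup_size]
    except KeyError:
        raise ValueError(
            f"no A2 value stored for subgroup size {subgroup_size}"
        ) from None


print(a2_for(5))  # five weeks of data per weekday
```

The coefficient shrinks as the subgroup grows, because a mean of more observations varies less for the same within-subgroup range.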

Calculating the mean and the range (the difference between the maximum and minimum values) for each subgroup (in our case, each day of the week) is fairly simple. In my case the results are as follows:

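The center line of the control chart is the mean of the subgroup means, i.e. (with $k$ the number of subgroups and $\bar{X}_i$ the mean of subgroup $i$):

$$\bar{\bar{X}} = \frac{1}{k}\sum_{i=1}^{k}\bar{X}_i$$

We also calculate the average range, where $R_i$ is the range of subgroup $i$:

$$\bar{R} = \frac{1}{k}\sum_{i=1}^{k} R_i$$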
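
A minimal Python sketch of these two steps is given here. The task counts are invented purely for illustration (they are not the article's data), and `subgroup_stats` is a hypothetical helper name.

```python
from statistics import mean

# Invented task counts: one list per weekday, one value per week (5 weeks).
tasks_by_day = {
    "Mon": [12, 15, 11, 14, 13],
    "Tue": [7, 9, 8, 6, 10],
    "Wed": [8, 7, 9, 10, 6],
    "Thu": [9, 8, 7, 10, 6],
    "Fri": [6, 8, 7, 9, 5],
    "Sat": [0, 1, 0, 0, 1],
    "Sun": [0, 0, 1, 0, 0],
}


def subgroup_stats(subgroups):
    """Return per-subgroup means and ranges, the grand mean and the average range."""
    means = {label: mean(values) for label, values in subgroups.items()}
    ranges = {label: max(values) - min(values) for label, values in subgroups.items()}
    grand_mean = mean(list(means.values()))   # center line of the chart
    avg_range = mean(list(ranges.values()))   # average within-subgroup range
    return means, ranges, grand_mean, avg_range


means, ranges, grand_mean, avg_range = subgroup_stats(tasks_by_day)
print(f"grand mean = {grand_mean:.3f}, average range = {avg_range:.3f}")
```

Note that the range is taken within each weekday across weeks, so it measures how much a given day varies from week to week, not how much the days differ from each other.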

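Now we know that the lower control limit for the number of completed tasks is the mean of the subgroup means minus the engineering coefficient times the average range:

$$LCL = \bar{\bar{X}} - A_2 \bar{R}$$

That is, days on which I complete fewer tasks than this on average are special from the system's point of view.

Similarly, we obtain the upper control limit:

$$UCL = \bar{\bar{X}} + A_2 \bar{R}$$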

Now plot the center line (red), upper control limit (green), and lower control limit (purple) on the chart:

And, lo and behold! We see three distinct subgroups outside the control limits, where the variation clearly has non-systemic causes!

I don't work on Saturdays and Sundays. Fact. And Monday really did turn out to be a special day. Now it makes sense to think about what exactly is special about Mondays.
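
The check itself is easy to automate. The sketch below flags every subgroup mean that lies outside the limits; the means and limits are invented numbers chosen to mimic the picture described here (weekend below, Monday above), and `special_cause_points` is a hypothetical helper name. Points lying exactly on a limit are treated as within it, which is an assumption of this sketch.

```python
def special_cause_points(means, lcl, ucl):
    """Return the subgroups whose mean lies outside the control limits."""
    flagged = {}
    for label, value in means.items():
        if value > ucl:
            flagged[label] = "above UCL"
        elif value < lcl:
            flagged[label] = "below LCL"
    return flagged


# Invented weekday means and limits, for illustration only.
weekday_means = {"Mon": 13.0, "Tue": 8.0, "Wed": 8.0, "Thu": 8.0,
                 "Fri": 7.0, "Sat": 0.4, "Sun": 0.2}
flagged = special_cause_points(weekday_means, lcl=4.56, ucl=8.18)
print(flagged)
```

Only the flagged days are worth investigating; everything between the limits is, by Shewhart's criterion, indistinguishable from ordinary variation.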

However, if the average number of tasks completed on Monday were within the control limits, then even if it stood out strongly against the other points, from the point of view of Shewhart and Deming it would be pointless to look for anything special about Mondays, since such behavior is determined solely by common causes. For example, I built a control chart for another 5 weeks at the end of last year:

There is a feeling that Monday somehow stands out, but according to the Shewhart criterion this is just a fluctuation, noise in the system itself. According to Shewhart, in this case you can search for the special causes of Mondays for as long as you like: they simply do not exist. From the point of view of statistical process control, on these data Monday is no different from any other working day (or even from Sundays).