Shewhart Chart for Individuals (I-Chart)

Background

The Shewhart Control Chart is a classic technique used in Statistical Process Control. The principal objective of using a control chart is to identify whether a process being monitored over time has shifted (or drifted) into an out-of-control state (i.e., a state that differs from the one represented by the historical data). Shewhart charts are typically constructed by plotting the means of rational subgroups over time, with the common assumption being that these subgroup means follow a normal distribution. However, a Shewhart chart can be constructed for virtually any type of data, such as counts, proportions, attributes, or continuous skewed data, and these data may follow different types of distributions, such as Binomial, Poisson, Gamma, etc. In this VSP module, we provide a Shewhart chart intended to monitor individual measurements over time, where the historical, in-control measurements follow a normal distribution. Sometimes we will refer to this control chart by its common abbreviation, the I-chart.

Statistical Process Control is often described as involving two phases: Phase I and Phase II. Phase I is the process of constructing a control chart with the desired false alarm rate using information from historical data. In Phase II, the control chart created in Phase I is used to monitor the present state of the process by plotting the values of arriving data and raising an alarm if a plotted point exceeds the control limits of the chart.

 

Phase I: Constructing the I-chart

The purpose of Phase I is to construct a control chart that can be used in Phase II for the monitoring of future data. This requires identifying a set of in-control, historical data that may be used to estimate the in-control mean and standard deviation. It may be useful to employ outlier tests or even the control limits of a preliminary control chart to identify points in the historical data that may be out of control. These out-of-control points should then be removed from the historical data prior to estimating the mean and standard deviation of the process. It is good practice to determine the cause of these historical out-of-control (outlier) points. Doing so provides insight into the process and a defensible rationale for removing those points from the historical, in-control data set.

Having identified the in-control data set and removed any outliers, the process mean may be estimated by the sample mean of the in-control data set. This mean becomes the target, or centerline, of the control chart that is employed in Phase II. Alternatively, there may be situations where using a known mean or target value (in lieu of estimating it from historical data) is justified. Likewise, the standard deviation may be specified as a known value, but more commonly it is estimated from the historical, in-control data. VSP provides two common methods for doing this: the average moving range and the sample standard deviation. The average moving range is typically preferred because it is more robust to shifts or drifts in the mean that may be present, but go undetected, in the historical data. However, if the historical data truly do belong to a single distribution with a constant mean (i.e., they are stationary), the sample standard deviation provides a more precise estimate of the process standard deviation than the average moving range. Plotting the historical data in the control chart during Phase I is highly recommended, because it facilitates the identification of outliers or other patterns in the data, and the plot may be used to informally assess whether the data are stationary. The formulas for the sample mean, \( \bar X \), and the estimators of the process standard deviation, \( \hat \sigma_{MR} \) and \( \hat \sigma_S \), using the average moving range and sample standard deviation, respectively, are given below. Let \( X_1 , X_2 , \dotsb, X_n \) represent the historical, individual measurements in chronological order. Then

 

$$ \Large \overline X = \frac{1}{n} \displaystyle\sum\limits_{i=1}^n X_i \, , \quad \hat \sigma_{MR} = \frac{\sum_{i=2}^n |X_i - X_{i-1} |}{d_2 (n-1)} \quad \mbox{and} \quad \hat \sigma_S = \frac{1}{c_4 (n)} \sqrt{\frac{\sum_{i=1}^n (X_i - \overline X )^2}{n-1}} $$

where


$$ \Large d_2 = 1.128 \quad \mbox{and} \quad c_4 (n) = \sqrt{ \frac{2}{n-1}} \frac{ \Gamma ( \frac{n}{2} ) }{ \Gamma ( \frac{n-1}{2} ) } $$
 

are unbiasing constants which ensure that \( \hat \sigma_{MR} \) and \( \hat \sigma_S \) are unbiased estimators of the process standard deviation when the data are normally distributed. Values for \( c_4 (n) \) and \( d_2 \) are given by Montgomery (2001). The formula for \( c_4 (n) \) is a well-known result discussed by several authors, Gurland and Tripathi (1971) among them. Lastly, \( \Gamma (x) = \int_0^\infty t^{x-1} e^{-t} \, dt \) is the standard gamma function.
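The estimators above can be sketched in a few lines of Python. This is a minimal illustration that uses only the standard library, not the VSP implementation; the function names (`c4`, `sigma_mr`, `sigma_s`) and the sample data are hypothetical and chosen only for this example.

```python
import math
from statistics import mean

D2 = 1.128  # unbiasing constant d_2 for moving ranges of two consecutive points


def c4(n: int) -> float:
    """Unbiasing constant c_4(n) = sqrt(2/(n-1)) * Gamma(n/2) / Gamma((n-1)/2)."""
    # Work with log-gamma so the ratio does not overflow for large n.
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)


def sigma_mr(x) -> float:
    """Estimate sigma from the average moving range of chronological data x."""
    n = len(x)
    total = sum(abs(x[i] - x[i - 1]) for i in range(1, n))
    return total / (D2 * (n - 1))


def sigma_s(x) -> float:
    """Estimate sigma from the sample standard deviation, divided by c_4(n)."""
    n = len(x)
    xbar = mean(x)
    s = math.sqrt(sum((xi - xbar) ** 2 for xi in x) / (n - 1))
    return s / c4(n)


# Made-up historical measurements, in chronological order (illustration only).
history = [10.2, 9.8, 10.1, 10.4, 9.9, 10.0, 10.3, 9.7]
print(mean(history), sigma_mr(history), sigma_s(history))
```

When the historical data are stationary the two estimates should be similar; a sample standard deviation that is much larger than the moving-range estimate can be a hint of a shift or drift in the historical mean.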

Having obtained the estimates of the in-control process mean and standard deviation, control limits can be constructed to achieve a desired false alarm rate (which is the reciprocal of the average run length of the chart). Calculating these control limits relies on the assumption that the historical data used to construct the chart follow a normal distribution. The normality of the in-control data can be informally assessed using the QQ-Plot on the Plots sub-page of the Data Analysis page in VSP. The closer the data follow a straight line in the QQ plot, the more likely they are to be normally distributed. If the historical, in-control data are not normal, the desired false alarm rate will not be achieved; more specifically, it is likely that the false alarm rate will be higher than expected. Not having perfectly normal historical data does not preclude the use of the I-chart, but it could make it more difficult to determine whether an out-of-control point in Phase II represents a true change in the process mean or just a false alarm.

The upper and lower control limits of the I-chart, \( UCL \) and \( LCL \), are calculated as follows:

 

$$ UCL = \bar X + L \hat \sigma \quad \mbox{and} \quad LCL = \bar X - L \hat \sigma $$

 

where \( \hat \sigma \) is either \( \hat \sigma_{MR} \) or \( \hat \sigma_S \). The value of \( L \) is chosen to give a desired false alarm rate. We illustrate the process with an example. Suppose individual measurements are collected at a rate of \( s \) samples per \( r \) units of time (e.g., 1 sample every 2 weeks, or 5 samples per hour). Let \( R = s/r \) be the sample rate. Provided the process remains in control, suppose we desire a false alarm rate of, on average, one false alarm per \( T \) units of time (e.g., 1 false alarm every 30 weeks, or 1 false alarm every year). Assuming that \( R \) and \( T \) use the same units of time, we have

 

$$ T = \frac{1}{2R(1- \Phi (L))} \quad \mbox{and} \quad L = \Phi ^{-1} \left( 1-\frac{1}{2RT} \right) $$

 

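where \( \Phi ( \cdot ) \) is the cumulative distribution function of a standard normal variate. The first expression follows because each in-control observation falls outside the control limits with probability \( 2(1 - \Phi (L)) \), so false alarms occur at an average rate of \( 2R(1 - \Phi (L)) \) per unit of time. Alternatively, the value of \( L \) can be specified irrespective of the sampling and false alarm rates. For example, a value of \( L = 3 \) is commonly used, which indicates that the control limits will be set at 3 standard deviations above and below the historical estimate of the process mean; under normality this corresponds to roughly one false alarm per 370 in-control observations. However, the performance of the chart is more easily interpreted in terms of the false alarm rate. After calculating the control limits, \( UCL \) and \( LCL \), the design of the control chart is complete.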
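The relationship between \( L \), the sample rate \( R \), and the time between false alarms \( T \) can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library for \( \Phi \) and its inverse; the function names and the example schedule are hypothetical and are not taken from VSP.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def limit_multiplier(rate: float, period: float) -> float:
    """L = Phi^{-1}(1 - 1/(2 R T)) for sample rate R and alarm period T."""
    tail = 1.0 / (2.0 * rate * period)  # one-sided tail probability per sample
    if not 0.0 < tail < 0.5:
        raise ValueError("R*T must exceed 1 sample per false alarm")
    return STD_NORMAL.inv_cdf(1.0 - tail)


def time_between_false_alarms(rate: float, L: float) -> float:
    """T = 1 / (2 R (1 - Phi(L))), in the same time units as the rate."""
    return 1.0 / (2.0 * rate * (1.0 - STD_NORMAL.cdf(L)))


def control_limits(center: float, sigma: float, L: float):
    """Return (LCL, UCL) = (center - L*sigma, center + L*sigma)."""
    return center - L * sigma, center + L * sigma


# Hypothetical schedule: 1 sample every 2 weeks, 1 false alarm per 52 weeks.
R = 1 / 2
L = limit_multiplier(R, 52.0)
print(L, control_limits(10.0, 0.25, L))
```

Note that the two functions are inverses of one another: passing the result of `time_between_false_alarms(R, L)` back into `limit_multiplier` recovers the original \( L \).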

 

Phase II: Prospective Monitoring with the I-chart and Moving Range chart

Having designed the chart in Phase I, newly arriving observations may be plotted on the control chart with the objective of detecting a shift in the process mean. If a plotted point falls above the upper control limit or below the lower control limit, an alarm is raised, suggesting that the distribution of the process may be changing, i.e., it may be out of control. The out-of-control state could involve an increase or decrease in the process mean or perhaps an increase in the variability of the process. Diagnosing the nature of the change from the historical distribution will often require making additional measurements either immediately or over time. This additional investigation will serve to determine whether the alarm is false or not. If the alarm is valid, the investigation will also help investigators identify the underlying cause of the change in the distribution of the process.
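The alarm rule described above amounts to a simple comparison of each new observation against the Phase I limits. The following is a minimal sketch; the function name `out_of_control` and the data are hypothetical and serve only as an illustration.

```python
def out_of_control(points, lcl: float, ucl: float):
    """Return the 0-based positions of points above UCL or below LCL."""
    return [i for i, x in enumerate(points) if x < lcl or x > ucl]


# Made-up Phase II observations checked against made-up limits.
new_data = [10.0, 10.5, 12.1, 9.9, 7.5]
print(out_of_control(new_data, lcl=8.0, ucl=12.0))
```

Each flagged position is a candidate alarm only; as noted above, follow-up measurements are usually needed to decide whether it reflects a real change in the process.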

In addition to the I-chart, a moving range chart is often displayed simultaneously during Phase II monitoring. The purpose of the moving range chart is to identify changes in the variability of the process. The moving range chart is constructed by plotting the absolute value of successive differences of the data. For example, at time \( i \) , the value of \( |X_i - X_{i-1} | \) is plotted on the chart. For reference, a horizontal line showing the average moving range is also plotted at

$$ \Large \overline {MR} = \frac{1}{n-1} \displaystyle\sum\limits_{i=2}^n |X_i - X_{i-1} | $$
 
 

If the moving range chart shows a sustained pattern of higher values, this may be indicative of an increase in the variability of the process. However, the moving range chart does not have a control limit for signaling alarms. Its purpose is to provide additional insight into the variability of the process and it should be used to support and interpret the I-chart.
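The quantities plotted on the moving range chart can be computed as follows. This is a small sketch with hypothetical function names and made-up data; the reference line is computed from the historical measurements, as in the formula for \( \overline {MR} \).

```python
def moving_ranges(x):
    """Absolute differences |X_i - X_{i-1}| of consecutive measurements."""
    return [abs(b - a) for a, b in zip(x, x[1:])]


def average_moving_range(x) -> float:
    """Average of the n-1 moving ranges of the n measurements in x."""
    mr = moving_ranges(x)
    return sum(mr) / len(mr)


# Made-up historical data give the reference line; new data give plotted points.
history = [10.2, 9.8, 10.1, 10.4, 9.9, 10.0]
new_data = [10.1, 9.6, 10.9, 9.4]
print(average_moving_range(history), moving_ranges(new_data))
```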

After a period of Phase II monitoring, the control chart should be recalibrated with more recent historical data. Thus, as monitoring occurs over time, investigators should iterate between Phase I and Phase II. A Phase II control chart should also be recalibrated if an investigator suspects the in-control distribution of the process has changed.

Summary of assumptions

  1. Individual measurements are collected over time.

  2. Individual measurements are independent (i.e., they do not exhibit autocorrelation).

  3. The historical in-control data (in Phase I) follow a normal distribution with a stationary (constant) mean. Note that this assumption does not refer to newly arriving data that are monitored in Phase II.
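One informal way to examine the independence assumption is to compute the lag-1 sample autocorrelation of the historical data. The sketch below is an illustrative assumption on our part and not a feature of the I-chart module; the function name is hypothetical. Values near zero are consistent with independence, while values far from zero suggest that successive measurements are correlated.

```python
from statistics import mean


def lag1_autocorrelation(x) -> float:
    """Lag-1 sample autocorrelation of the chronological measurements x."""
    xbar = mean(x)
    # Sum of products of deviations for consecutive pairs of measurements.
    num = sum((x[i] - xbar) * (x[i - 1] - xbar) for i in range(1, len(x)))
    # Total sum of squared deviations.
    den = sum((xi - xbar) ** 2 for xi in x)
    return num / den


# A steadily trending series gives a positive lag-1 autocorrelation.
print(lag1_autocorrelation([1.0, 2.0, 3.0, 4.0, 5.0]))
```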

References:

Gurland, John and Tripathi, Ram C. 1971. A Simple Approximation for Unbiased Estimation of the Standard Deviation. The American Statistician. 25:30-32.

Montgomery, Douglas C. 2001. Introduction to Statistical Quality Control, 4th ed. John Wiley & Sons, Inc.

 

The I-Chart dialog contains the following controls:

Analyte selection list

Location selection list

 

Use a known mean

Use the estimated mean based on historical data

 

Use a known standard deviation

Use the estimated standard deviation based on historical data

 

Use the average moving range as the historical standard deviation

Use the sample standard deviation

Historical data input

Historical data pick button

 

Use schedule for samples in future

Number of Samples, Sample frequency, Sample period

Use no particular schedule for samples in future

 

Use a false alarm rate

Alarm rate, Alarm period

Use control limits of a set number of standard deviations