Lesson 2 Variability and Statistical Process Control

The last lesson discussed the definitions of control, average value, and variability.  Specifically, variability was defined in terms of the standard deviation, or s.  This lesson continues the discussion of variability and looks at how operators can use variability to understand the performance of their controls.  Finally, to best understand automatic process control, manual statistical process control will be reviewed.

Measures of Variability

We continue our discussion of the standard deviation by asking the question – is a standard deviation of .2 good?  If we have two loops, one having a standard deviation of .5 and one having a standard deviation of 2.0, is the second loop operating with 4 times the variability of the first loop?  Can we even say which loop has less variability?
Of the three questions given above, only the last question can be answered.  That question can be answered with an unequivocal no!  The first two questions cannot be answered with the data given.
To understand why this is, we look at the units of standard deviation.  By units we mean how the item is measured; examples of units would be grams/minute, feet/minute, litres, and gallons.  In the previous lesson the formula for standard deviation was given as (note 1):



$$ s = \sqrt{\frac{\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^{2}}{n-1}} $$

Equation 1: Standard Deviation

where $x_i$ is an individual sample (process value), $\bar{x}$ is the average of the samples, and $n$ is the number of samples.

Assuming that the measurement of interest (the process value) is in feet/minute, we can logically determine the units of standard deviation.
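Writing the unit bookkeeping out step by step (see note 2 for a reminder about squares and square roots), with feet/minute as the example unit:

$$
\begin{aligned}
x_i - \bar{x} &\quad\text{has units of ft/min} \\
(x_i - \bar{x})^2 &\quad\text{has units of } (\text{ft/min})^2 \\
\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1} &\quad\text{has units of } (\text{ft/min})^2 \quad (n \text{ is a pure count}) \\
s = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}} &\quad\text{has units of ft/min}
\end{aligned}
$$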

So we have shown (note 3) that the units of standard deviation are the same as the units we measured in the first place.  We can change the standard deviation simply by changing the units of measurement.  If, in our example, instead of measuring the speed in feet/minute we measured it in metres/second, we would get a different standard deviation for exactly the same speeds!  However, the same expectation would still exist, that is, 95% of the samples will fall within 2 standard deviations of the average.  Therefore the performance of two loops cannot be compared with each other using standard deviation.
But what if the two loops use the same units?  We might think that in this case the two loops could be compared; however, care must be taken.  If one loop has an average of 1 with a standard deviation of .1 and another has an average of 10 with a standard deviation of .1, do the loops have the same variability?  It seems to make sense that the loop with an average of 10 has less variability than the other loop, given that they both have the same standard deviation.  What is needed is a number that accommodates not only loops with different units, but also loops with different average values.
Luckily that number exists and we have already done most of the work to calculate it.  The number we desire is called the coefficient of variation and is given by the formula (note 4):

$$ V = \frac{s}{\bar{x}} \times 100\% $$

Where
V = coefficient of variation
s = the standard deviation of a set of samples
$\bar{x}$ = the average value of the set of samples

It can be seen that the units of V are % (that is, V is unit-less): the units of s and $\bar{x}$ cancel in the division.

Using V we can discuss variability from loop to loop in terms of its percent variation.  Therefore if one loop has a V of 4% and another has a V of 8%, the second loop does indeed have twice the variability of the first loop, even if the first loop is controlling consistency and the second loop is controlling temperature.  Of course, care must be taken when comparing loops that control dramatically different processes.  Controlling sheet basis weight profile is much more difficult than controlling consistency, so a lower coefficient of variation would be expected on a consistency control loop than on a basis weight loop.
Normally the variability of a loop is not given, and when it is, it is given as a standard deviation.  Most of the time the standard deviation is acceptable, because the operator will be interested in how the variability of a single loop is changing over time.  However, the operator should be able to divide the standard deviation by the average of the controlled variable to determine the coefficient of variation.  This way loops can be compared for relative variability.
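As an illustration of that division, here is a minimal sketch in Python; the loop names and sample values are made up for the example:

```python
import statistics

def coefficient_of_variation(samples):
    """Return the coefficient of variation, in percent, of a list of samples."""
    s = statistics.stdev(samples)    # sample standard deviation (n - 1 in the denominator)
    mean = statistics.mean(samples)  # average value of the samples
    return 100.0 * s / mean

# Hypothetical readings from two loops with different units and different averages
consistency_pct = [3.1, 2.9, 3.0, 3.2, 2.8, 3.0, 3.1, 2.9]        # consistency, %
temperature_c = [80.5, 79.0, 81.2, 80.1, 78.8, 80.9, 79.6, 80.3]  # temperature, deg C

print(f"Consistency loop V = {coefficient_of_variation(consistency_pct):.1f}%")
print(f"Temperature loop V = {coefficient_of_variation(temperature_c):.1f}%")
```

Because both results are percentages, the two loops can be compared directly even though they measure different quantities with different averages.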

Variability over time

Now that the statistics are out of the way, how is this information used to discuss controls?  When variability is discussed, there are two situations that must be explored.  The first is a loop that directly controls a finished product specification, or that for some other reason has a set of limits attached to the measurement.  The other situation is, of course, when there are no limits set on the variable being considered.

Variables without limits

In this case, the process is controlled, but there are no real expectations of what the loop should do.  This is probably the most difficult situation in the plant.  Because there are no limits, there is likely not a lot of attention being paid to how the loop is performing.  It is quite possible for the loop controls to deteriorate to the point where variability is higher with the controls on than with them in manual.  The most effective thing to do is to set limits on the variability of the loop and fix the loop when the variability grows larger than the limit.  The second most effective thing to do is to monitor the variability over time and determine whether it is deteriorating.

Variables with limits

Variables that have limits are more important to the plant.  The fact that they have limits typically means that a customer specification is involved, or that the loop’s value is critical to the operation of the process.
If a limit is imposed on a process, the variability is unaffected.  Reviewing the formula for standard deviation, it can be seen that nowhere is there a place to use the desired value or the limits.  Standard deviation simply relates history: what the process has done in the past.  The assumption is that the process will do the same in the future, assuming no major process changes are made.  However, when discussing short-term variability, what the process did do, or will do, is not influenced by what people want it to do.  (This is not exactly true, as will be shown below.)
Assume that there is a pulp machine producing product with an average moisture of 10%.  Two standard deviations (2 s) of its moisture readings equal 1%.  Allowable customer specifications are 10% ± 1%.  How much off-grade will be produced over a long period of time?  We know that 95% of the product will fall within 2 standard deviations (which in this case matches the customer spec of ± 1%), so logically 5% of production will be off-grade.  Is that good enough?  Maybe.  Can we do something about it?  Remember, in the short term, what the process did do, or will do, is not influenced by what people want it to do.
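For readers who want to check the arithmetic, here is a minimal sketch in Python, assuming the moisture readings are normally distributed with the average and 2 s values from the example:

```python
from statistics import NormalDist

mean = 10.0       # average moisture, %
two_sigma = 1.0   # two standard deviations, %
sigma = two_sigma / 2.0

spec_low, spec_high = 9.0, 11.0   # customer specification: 10% +/- 1%

dist = NormalDist(mu=mean, sigma=sigma)
in_spec = dist.cdf(spec_high) - dist.cdf(spec_low)
print(f"Fraction off-grade: {100 * (1 - in_spec):.1f}%")  # about 4.6%, i.e. roughly the 5% quoted above
```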
Many people use customer specs to drive manual control strategy.  In this example a common approach would be to say, “if the product is out of spec, make an adjustment to bring it back into specification.”  So if a reading of 11.1% is seen, the operator will make a change to bring it back to 10%.  Now that we know the statistics of the situation, we can see the error of this strategy.  When the reading of 11.1% is seen, there is a 95% chance that the next sample will be within limits without any changes.  But what happens when the operator makes a change?
If the operator makes a change intended to drive the value down by 1.1%, which would appear to bring it back to 10% from 11.1%, what actually happens is that the average is driven down to 8.9%.  In other words, we have changed the target so that the process produces off-grade.  If the process changes quickly, we would expect a little more than a 50% chance of making off-grade due to being too dry.  The odds are a little more than 50% because the new average of 8.9% sits just below the lower specification limit of 9%; if the average sat exactly at 9%, half of the readings would be above 9% and half below.  Because processes often take longer to change completely than the rate at which they are sampled, the next reading would have lower odds of being out of spec, but eventually the change would result in a low reading.  The operator would then make a change upwards, with the resulting effect of causing off-grade on the upper end.
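Extending the same sketch, the cost of the over-correction can be estimated by re-running the calculation with the shifted average (again assuming normally distributed readings and the example’s numbers):

```python
from statistics import NormalDist

sigma = 0.5                       # from 2 s = 1% moisture
spec_low, spec_high = 9.0, 11.0   # customer specification: 10% +/- 1%

def off_grade_fraction(mean):
    """Long-run fraction of readings outside the specification for a given average."""
    dist = NormalDist(mu=mean, sigma=sigma)
    return 1.0 - (dist.cdf(spec_high) - dist.cdf(spec_low))

print(f"Average at 10.0%: {100 * off_grade_fraction(10.0):.1f}% off-grade")  # about 4.6%
print(f"Average at  8.9%: {100 * off_grade_fraction(8.9):.1f}% off-grade")   # about 58%, a little more than half
```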
Looking over a period of time, it is very common to see manual control result in the controlled process “ping-ponging” up and down.  This is especially true in slow processes where it takes a long time for the process to react to a set point change.  Here the illusion of control is heightened, as it takes a while for the variability to catch up with the process changes.  A good example is a pulp plant digester, which has a time constant of about 6 hours.  (Very roughly, the time constant is how long it takes the process to change in response to a set point change.)  In this case the plant was able to improve the coefficient of variation by about 25% simply by reducing the number of manual adjustments to the process.
Earlier it was stated that if a sample of 11.1% was seen in our example process, there was nothing we could do about it, and it was then noted that this was not exactly true.  As we have just seen, we can do something about it: we can make it worse.  However, there is nothing we can do in the short term to make it better.
If the standard deviation of the process is close to its limits, then the “capability of the process” is less than needed.  The measure of capability is based on the standard deviation of the process.  If a process needs to deliver on-quality product almost all of the time, then the limits must be set far outside the standard deviation.  The standard rule of thumb is to have the specification limits set at 6s as statistically more than 99.9% of product will fall within the 6s range.  Of course, the customer usually sets the specifications so setting the limit based on process capability is often not an option. But, once the specification limits are set, and the standard deviation of the process known, then the amount of off-grade over time can be calculated.  If the limits are based on customer specifications, then the only mechanism for reducing off-grade is to reduce the variability of the loop.  However, making a significant change in the performance of a loop generally requires redesign of the loop and/or process.  We will return to this topic later in the course when we discuss sources of loop variability.
Before we get into a deeper discussion of automatic process control, we will examine manual control.  The first control algorithm that we look at, Statistical Process Control (SPC), will not replicate automatic control, but it does show a very good way to control a process manually.  The reasons for looking at SPC are that it allows us to draw comparisons to automatic control, to discuss some of automatic control’s strengths and weaknesses, and to use SPC to make set point changes to automatic controls if needed.

Statistical Process Control

Earlier it was explained that if the data is normal, the statistical rules will hold.  One rule, that 95% of the data falls within 2 s of the average, has already been given.  There are others that will be discussed here.  The general logic behind SPC is that if a new data point fits the statistical expectations, then the process is operating normally and no action should be taken.  If, on the other hand, the point does not fit within the statistical expectations, then something has changed in the process and action should be taken.  Simply put, the goal of SPC is to make changes only when we have some confidence that the process operation has changed.  The most likely source of such a change is a change in the inputs to the process.
In the example given in the previous section, the first time a point is seen over 11%, we know that this could well be the expected variability of a process with an average of 10% and a 2 s of 1%; 5 points out of 100 will exhibit this behaviour.  But what if the next point also comes in at 11.1%?  The chance of the first point being outside 2 s was 5%, and the chance of the second point being outside is also 5%, but the chance of the two happening consecutively is 5% * 5%, or .25%.  Still possible, but a lot less likely.  If a third point were also out of range, the probability would be down to about .0125%, or roughly once in 8,000.  This says that it is still possible that three high points in a row are just due to normal variation of the process; however, the odds are now much greater that something has happened in the process.
For example, assume a pulp dryer that is operating with an inlet consistency of 50% and an outlet moisture of 10%.  Due to a change in head box flows, the inlet consistency drops to 48%.  It can be expected that, with no other changes made, the outlet moisture content will rise as a result of the extra water coming into the dryer.  It may be that the outlet moisture content rises to 11%.  In this situation we would say that the process has changed: although no physical modifications have been made, the change in the inputs results in a statistically different process.  Therefore, in this situation a corresponding change to the steam flow would be an appropriate response to the change in inlet consistency.  The next sections discuss how we can use the statistics we have already learned to monitor the process and state with some confidence which readings are examples of simple variability and which are indicators of changes in the process that require an operator’s response.
What does this mean to the operator?  It means that quite often, better performance can be achieved by leaving the process alone.  If the process is under manual control, do not make adjustments unless there is some statistical evidence that an adjustment is needed.  If the process is under automatic control, set points should not be changed without the same degree of certainty that the process is out of control.  Of course, this is based on the assumption that the control limits are based on a proper calculation of the standard deviation over a large data set and that these statistics are used to determine whether the process has changed.

Implementing Statistical Process Control

Statistical process control is simply the practice of making changes when the process has changed, and not making changes when the process has not changed.  From the discussion above we know that in reality we cannot say with certainty when a process has changed.  For example, consider the first sample that falls outside the 2 s limits: does that indicate a changed process?  We know that 5% of the samples of a process will fall outside the 2 s limits, so we could have some confidence in saying that the process has not changed.  However, we are not sure.  Part of the problem is that the statistics refer to a stable process: five times out of a hundred, a stable process will have a reading fall outside the 2 s limits.  But our question is more complex.  We are asking whether we have seen an indication that the current sample is not representative of the previously observed stable process (that is, of the past samples).  Statistically this can be answered (again, only with a probability), but generally not with one sample.  We can guess with one sample, but our confidence that we are right is relatively low.  We need more samples.
As shown previously, even one additional sample helps.  If the odds are 5 in 100 that a sample will fall outside the 2 s limits, the odds are 5% * 5% = .25% that the next sample will also fall outside the limits, or 25 chances out of 10000.  Therefore, with as few as 2 consecutive samples outside the 2 s limits, we can say with some certainty that the process has changed.  If, every time two consecutive values are seen outside the limits, we say that the process has changed, we will be correct 9975 times and wrong 25 times.  We will be wrong, but not very often.
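One way to see where the 25-chances-in-10000 figure comes from is to simulate a stable process and count how often two consecutive samples both fall outside the 2 s limits. The sketch below is purely illustrative; the average, standard deviation, and number of samples are made-up values:

```python
import random

random.seed(1)             # make the run repeatable
mean, sigma = 10.0, 0.5    # hypothetical stable process: average 10%, 2 s = 1%
n = 1_000_000              # number of simulated samples

samples = [random.gauss(mean, sigma) for _ in range(n)]
outside = [abs(x - mean) > 2 * sigma for x in samples]

# Count pairs of consecutive samples that both fall outside the 2 s limits
pairs = sum(1 for a, b in zip(outside, outside[1:]) if a and b)
print(f"Consecutive-pair rate: {100 * pairs / (n - 1):.2f}%")
# Prints a value near 0.05 * 0.05 = 0.25% (slightly less, because the exact
# two-sided probability of a normal sample falling outside 2 s is about 4.6%)
```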
Is being correct 9975 times out of 10000 a good enough track record?  It depends on what we are trying to do.  If we are trying to meet a very stringent customer spec, perhaps not; but if we are trying to control the temperature in the executive board room, probably.  From a control perspective, it depends on the impact of making a change to a process that has not really changed versus the impact of not making a change to a process that has changed.  Each situation is different.
One way to improve our odds of being correct is to have more rules.  Using the 2 s rule alone catches only some of the statistics involved in a changing process.  Other statistical rules will improve the odds of identifying when a process has changed.  Statistical Process Control texts usually give a list of 3-5 base rules plus add-on rules for special circumstances.  Together these rules provide a higher probability that, if a rule is met, the process has truly changed.  Each text usually has a slightly different set of rules, because authors differ on how severe a problem it is to incorrectly say a process has changed.  For example, one text (note 5) says the process has changed if 1 sample is seen outside the 2 s limits, while another (note 6) requires 2 consecutive samples outside the limits.  The first text will cause more changes to the process, but will find process changes faster (at the cost of sometimes identifying process changes that did not really occur).  The second text will always take at least one more sample point to identify a change.  But by waiting one extra sample period, it will be able to state that the process has changed with more confidence and will therefore make fewer unnecessary disturbances to the process.
The rules (note 7) we will use in this example are:

  1. One point outside 3 s
  2. Two consecutive points outside 2 s
  3. Seven points on one side of the mean
  4. Five or more points trending in one direction
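As an illustration of how these four rules might be applied to a stream of samples, here is a minimal sketch in Python. The function name and readings are made up for the example, the average and standard deviation are assumed to have been calculated beforehand from a large set of historical data, and the checks simply follow the wording of the list above (SPC texts vary slightly in how they state the rules):

```python
def broken_rules(samples, mean, sigma):
    """Return (rule number, sample index) pairs for every point at which an SPC rule is broken."""
    hits = []
    for i, x in enumerate(samples):
        # Rule 1: one point outside 3 sigma
        if abs(x - mean) > 3 * sigma:
            hits.append((1, i))
        # Rule 2: two consecutive points outside 2 sigma
        if i >= 1 and abs(x - mean) > 2 * sigma and abs(samples[i - 1] - mean) > 2 * sigma:
            hits.append((2, i))
        # Rule 3: seven points in a row on one side of the mean
        if i >= 6:
            window = samples[i - 6:i + 1]
            if all(v > mean for v in window) or all(v < mean for v in window):
                hits.append((3, i))
        # Rule 4: five or more points trending in one direction
        if i >= 4:
            window = samples[i - 4:i + 1]
            if all(b > a for a, b in zip(window, window[1:])) or \
               all(b < a for a, b in zip(window, window[1:])):
                hits.append((4, i))
    return hits

# Example: flag any rule violations in a short list of hypothetical readings
readings = [10.1, 9.8, 10.0, 10.3, 9.9, 10.4, 10.6, 10.7, 10.9, 11.0, 11.2]
print(broken_rules(readings, mean=10.0, sigma=0.5))  # the upward trend at the end trips rule 4
```

Any sample at which one of these checks fires would be the statistical trigger for investigating the process and, if justified, making an adjustment.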

Example

Figure 1 shows a set of 32 data points collected on a hypothetical process.


Figure 1: First 36 data points of a process


Examining the rules given above, we see no deviations indicating a change in the process, so no adjustments to the process would be made.  There are two points outside the 2 s limits, but they are not consecutive and so do not break rule #2.


Figure 2 shows the same process with 7 new data points.


Figure 2: 37 Data Points


Now a rule has been broken: there are seven points on one side of the mean.  We have come close to breaking the first rule as well, but the highest point is not quite at the 3 s limit.  If this were a real process, we would make an appropriate process change to accommodate whatever has caused the upset.  In an ideal world we would determine what caused the process to change and resolve that.
Figure 3 shows the remainder of our data (we did not make the process change), and we see that the process has indeed changed, as it has now broken the first three rules.


Figure 3: All Data Points


According to the statistical rules, we first verified that the process had changed at point 36.  No points prior to that indicated a process change based on the 4 rules.  Was this an accurate determination?  Did the process change, and if so, when?  Because this was a hypothetical process, we have the luxury of knowing exactly how the data points were created.  In this case, the process was a simple addition of noise to an average value.  The average value was changed starting at point 30, as shown in Figure 4.  It took some time for the statistics to “catch up” with the process, but when they did, they were accurate in identifying the change.
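The exact values behind the figures are not given, but data with the same character can be generated in a few lines of Python; the averages, standard deviation, and change point below are assumptions chosen to mimic the description above:

```python
import random

random.seed(7)                      # repeatable illustration
old_mean, new_mean = 10.0, 10.75    # assumed averages before and after the change
sigma = 0.5                         # assumed standard deviation of the added noise

# Noise added to an average value; the average changes starting at point 30 (1-indexed)
data = [random.gauss(old_mean if i < 29 else new_mean, sigma) for i in range(50)]

# Rule 3 check: points at which seven consecutive values sit on one side of the old average
hits = [i + 1 for i in range(6, len(data))
        if all(x > old_mean for x in data[i - 6:i + 1])
        or all(x < old_mean for x in data[i - 6:i + 1])]
print("Rule 3 broken at points:", hits if hits else "none in this run")
```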


Figure 4: All Data Points with corrected average

Summary

In this lesson we have learned about the coefficient of variation and how it can be used to compare the performance of loops with different average values or different units.  Most of this lesson focussed on how to use Statistical Process Control to reduce process variability.  The next lesson will start the discussion of automatic control.

Exercises

Complete exercises 2 and 3 of the COP101 Exercises

Notes

1) Statistics for Experimenters, George Box, William Hunter, J. Hunter. 1978, John Wiley and Sons. p. 40
2) As stated before, the square of a number is the number multiplied by itself.  So if the number is y, then y times y is called y squared, or y².  The square root of some number x is simply the number that, when multiplied by itself, gives x.  For example, 2 × 2 is 4, so 2 squared is 4.  Likewise, the square root of 4 is 2.
3) If you don’t follow the mathematics – don’t worry.  All that you really need to know is that the units of standard deviation are the same as the units of the process variable.
4) Applied Engineering Statistics for Practicing Engineers, Lawrence Mann, Jr.  1970 Barnes and Noble Inc, New York.
5) Operations Management in Canada, Don Waters, Ragu Nayak, 1995 Addison-Wesley Publishers Limited, Don Mills Ontario
6) Web page - http://lorien.ncl.ac.uk/ming/spc/spc8.htm#rules, Ming T. Tham, 2001, Department of Chemical and Process Engineering at the University of Newcastle upon Tyne, UK
7) Web page - http://lorien.ncl.ac.uk/ming/spc/spc8.htm#rules, Ming T. Tham, 2001, Department of Chemical and Process Engineering at the University of Newcastle upon Tyne, UK
