Chapter 13: Measure – six-sigmas.com

Moving from the define phase to the measure phase of a project, Six Sigma teams continue to delve into the process, now coming to understand it more fully through data. The measure phase is often the most laborious phase for the team, especially when data is not already available in digital formats. In this chapter, we’ll review some of the metrics covered in previous chapters and introduce some concepts for data collection. Later chapters on statistical analysis continue building on the measure concepts introduced here.

One of the first steps of the measure phase is determining the capability of a process. This step can be completed before a team formally leaves the define phase if the data needed to perform sigma level calculations is available. Calculating sigma levels for a process was covered in Chapter 1. In addition to sigma levels, teams might also calculate various metrics for a process, including defects per million opportunities, FTY, or RTY, which were all covered in Chapter 5.

Failure Modes and Effect Analysis
The Failure Modes and Effect Analysis is a tool that can be applied by a Six Sigma team in any phase from define to analyze. Often, teams begin working with FMEAs in measure because it helps them identify risk priorities for various inputs and errors within a process. Used properly, the FMEA uses systemic data and team input to set the stage for root cause analysis in the next DMAIC phase. Remember, while tollgates do occur and teams move through five phases during a DMAIC project, hard borders don’t always exist between the phases. Teams might begin working on measure phase tasks before leaving the define phase, and it’s almost certain that teams will begin some analysis while still collecting data. Ultimately, an FMEA tool should be used when teams need more detailed information about inputs and possible associated fail-points than the tools discussed in the define chapter allow. The FMEA offers some of the information that is offered by SIPOC, but it also provides evaluations of the inputs. Teams typically create FMEAs in a spreadsheet program, as some calculations are required during the process. To create an FMEA, create a spreadsheet with the following column headers:

Process step
Potential failure
Potential failure effect
SEV
Potential cause of failure
OCC
Current monitor/control
DET
RPN
Recommended changes/actions
Who and When?
Action completed
SEV
DET
RPN

Columns 1 through 9 of the FMEA might be completed during the measure phase while columns 10 through 15 are more appropriate for the improve phase.

Begin by identifying all possible process steps, activities, or inputs in column one. In column two, indicate what might go wrong for each process step. Note that you can list process steps more than once if there are multiple opportunities for error within each step. If the team has created a detailed enough list of steps, however, this won’t likely be the case for a majority of the steps.

In column three, enter a short description of the impact of the failure on the customer. Incorrect measurement can result in increased variance in a product, for example. In the SEV column, rate the severity of the possible failure you described in the previous columns. Rate the severity from 1 to 10, with 1 being no effect, 5 being minor disruption to production, and 10 being severe enough to endanger a process or person.

In column five, enter the potential reasons the specific failure might occur, and in the OCC column, enter a numeric rating for how often the failure might be expected, with 1 being a very unlikely failure and 10 being an almost inevitable failure. In column seven, create a short description of the current controls that are in place to monitor the process or prevent the failures the team has described. In the DET column, rate the ability of the process or staff to detect failure if it does occur. Rate detection between 1 and 10, with 1 being a process that includes automated detection that rarely fails and 10 being no detection at all.

Finally, calculate the risk priority number by multiplying the severity, occurrence, and detection ratings, as in the example below.

A Six Sigma team working on a project to improve the speed with which refunds are processed to customers is creating an FMEA. One row of the FMEA includes the following information:

Process step: Refund request is entered in system.
Potential failure: Incorrect amount is entered.
Potential failure effect: The customer receives more or less refund than anticipated.
SEV: 8
Potential cause of failure: Data-entry employee transposes numbers or makes a similar typing mistake.
OCC: 10
Current monitor/control: A supervisor randomly reviews a sample of refund requests to ensure accuracy.
DET: 7
RPN: (SEV * OCC * DET) = (8 * 10 * 7) = 560

The team completes a second row as follows:

Process step: Refund check is printed.
Potential failure: The printed check has defects that make it difficult to cash.
Potential failure effect: The customer can’t cash the check and has to call for a new one.
SEV: 9
Potential cause of failure: Printer is misaligned or out of ink.
OCC: 1
Current monitor/control: The person who signs the checks reviews the checks as they sign them.
DET: 2
RPN: (SEV * OCC * DET) = (9 * 1 * 2) = 18

The potential failure in the first example has a much higher risk priority number, which means, as the team moves forward, they are more likely to work on solving that potential failure. During analyze and improve phases, the team would recommend changes, implement the recommended actions, and rescore the process to determine if the RPN of the changed process is lower. If it is higher or the same, then the change was not a good one and the team might need to try again.
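The RPN arithmetic in the two example rows can be sketched in a few lines of Python. This is only an illustration of the multiplication and ranking described above; the class name FmeaRow and its field names are made up for the sketch and are not part of any FMEA standard or tool.

```python
from dataclasses import dataclass


@dataclass
class FmeaRow:
    """One FMEA row (hypothetical structure, for illustration only)."""
    process_step: str
    potential_failure: str
    sev: int  # severity: 1 (no effect) to 10 (endangers a process or person)
    occ: int  # occurrence: 1 (very unlikely) to 10 (almost inevitable)
    det: int  # detection: 1 (reliable automated detection) to 10 (none)

    @property
    def rpn(self) -> int:
        # Risk priority number = severity * occurrence * detection.
        for rating in (self.sev, self.occ, self.det):
            if not 1 <= rating <= 10:
                raise ValueError("ratings must fall between 1 and 10")
        return self.sev * self.occ * self.det


rows = [
    FmeaRow("Refund check is printed",
            "Printed check has defects", sev=9, occ=1, det=2),
    FmeaRow("Refund request is entered in system",
            "Incorrect amount is entered", sev=8, occ=10, det=7),
]

# Highest RPN first: these are the failures the team is most likely to tackle.
ranked = sorted(rows, key=lambda row: row.rpn, reverse=True)
for row in ranked:
    print(f"{row.rpn:>4}  {row.process_step}: {row.potential_failure}")
```

Sorting by RPN puts the data-entry failure (560) ahead of the printing failure (18), matching the priority the team reaches in the text.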

Collecting Data
Creating a baseline metric for a process begins in the define phase, but teams cannot leave the measure phase without a strong understanding of current process performance. That understanding begins with figures such as sigma level, but teams should also define a process-specific metric where possible and gather historical data regarding that metric so they have something to compare future data against to prove that improvements were made. Ideally, the team would have access to historical metrics for the process. In some cases, the team has to collect data from scratch. We’ll introduce data collection later in this chapter and cover it in depth in the units on sampling.

Continuous versus discrete data
Before creating and displaying a baseline metric via graphical representation, you have to understand the type of data you are dealing with. Data is either discrete or continuous, and teams collect data either as a population sample or a process sample. How teams collect data and the type of data collected determine how the data can be viewed graphically and analyzed.

Discrete Data
Discrete data is categorical in nature; it is also referred to as qualitative data or attribute data. Discrete data falls into three categories: ordinal, nominal, and binary (also called attribute) data. Some data can be expressed in more than one of these categories. For example, student test scores can be conveyed in an ordinal fashion via the grades A, B, C, D, and F or in a binary fashion via the Pass/Fail distinction.

Discrete data can be displayed via Pareto charts, pie charts, and bar charts. In some instances, the data can be converted to run and control charts using variation within the data or ratios as the item being charted. In the chapter on the control phase, you’ll begin understanding why a team might want to convert discrete data to be used in a control chart.

Within discrete data, binary or attribute data is usually the easiest data to collect. Attribute data records one or the other of two answers. Does the person choose paper or plastic? Is the room hot or cold? Is the glass empty or full? Is the light on or off? Depending on the scenario, attribute data can be very accurate. The light is either on or off; the switch position tells you that. Attribute data in this case can be automated with the right technology, which means it would be highly accurate. Whether the cup is empty or full is another story, because there are so many variations between completely empty and completely full. If the data is being collected by people, personal biases might enter the equation. Teams can remove some of those biases and better ensure accurate measurements, which will be covered in the sections on measurement systems.

Continuous Data
Continuous data is quantitative data and is measured in units. For example, the time of day is measured in hours. Temperature is measured in degrees, and almost anything can be converted to continuous data by making it a percentage. Continuous data is visualized in graphs such as histograms and box plots. Box plots are discussed in chapter 14, and histograms are covered in depth in the chapters on statistics. Continuous data can also be viewed in the form of run and control charts.

Choosing Between Discrete and Continuous Data
Sometimes, a process or activity can be measured in either discrete or continuous data. Depending on the purpose of the measurements, teams might need to pick between the two data types. For example, if a Six Sigma team has identified room temperature as an input into the quality of product, they will want to monitor the temperature of the room. They can do so by recording the temperature in degrees every ten minutes; that data would be continuous. Alternatively, the team might create a tick sheet, having someone make a mark every hour to note whether the temperature was in the 40s, 50s, 60s, 70s, or 80s with regard to the Fahrenheit scale. That data would be discrete. In this particular example, most teams would choose to record the continuous data. Exact temperature measurements every 10 minutes provide a lot more information than whether the temperature of the room was in the 70s at the turn of the hour. The continuous data could be converted to provide teams with the discrete data easily; the discrete data in this case – and in most cases – could not be converted to continuous data. What is true in the example is true for most scenarios. When possible, teams should convert measurements to continuous data. Continuous data:

Provides more information than discrete data does.
Is typically more time-consuming to collect than discrete data unless teams have access to automated or computerized data collection.
Is more precise than discrete data.
Lets teams remove variation and errors inherent in estimation and rounding.
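The one-way nature of the conversion in the room-temperature example can be shown with a short Python sketch. The readings are invented for illustration, and the function name decade_bucket is hypothetical; the sketch simply maps each continuous Fahrenheit reading to the tick-sheet category it would fall into.

```python
def decade_bucket(temp_f: float) -> str:
    """Return the tick-sheet category (for example '70s') for a reading."""
    decade = int(temp_f // 10) * 10
    return f"{decade}s"


# Invented continuous readings, in degrees Fahrenheit.
readings_f = [68.4, 71.2, 73.9, 79.6, 80.1]

# Continuous -> discrete is a simple lookup.
buckets = [decade_bucket(t) for t in readings_f]
print(buckets)  # ['60s', '70s', '70s', '70s', '80s']
```

Three quite different readings (71.2, 73.9, and 79.6) all collapse into the same "70s" category, and nothing in the category lets a team recover the original values.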

Levels of Data
Data can be classified at four basic levels: nominal, ordinal, interval, and ratio. Attribute, or binary, data is actually a limited form of nominal data.

Nominal Data
Nominal is considered to be the lowest data classification level and simply involves applying number labels to a qualitative description so statistical analysis programs and tests can be applied to the data. The numbers assigned to each category don’t provide any information about whether the data is better or worse than other data in the listing – in nominal data, numbers don’t reflect a scale. An example of nominal data might be applied in a list of birth states for a classroom. In a class of 30, the number of students born in various states breaks down as follows:

Texas: 6
Louisiana: 4
Arkansas: 10
Mississippi: 1
Oklahoma: 9

In nominal data, each state would be provided a numeric label:

1. Texas
2. Louisiana
3. Arkansas
4. Mississippi
5. Oklahoma

That doesn’t mean 5 students are from Oklahoma; it means 9 students fall into category 5 for the question “What state were you born in?” For nominal data, central tendencies are calculated not with means or medians, but with mode. For example, a list of the nominal data in our example would be as follows:
1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5
The mode is the number that appears most in the set; in this case, 3. Statistically, analysis is limited with regard to nominal data, but some tests can be performed with statistical analysis software.
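As a quick check of the mode in the birth-state example, the following Python sketch rebuilds the list of 30 responses from the counts given above and finds the most common label. The labels 1 through 5 follow the order in which the states are listed; the variable names are illustrative only.

```python
from collections import Counter

# Numeric labels in the order the states are listed in the example.
state_labels = {1: "Texas", 2: "Louisiana", 3: "Arkansas",
                4: "Mississippi", 5: "Oklahoma"}

# Number of students born in each state, keyed by label.
students_per_label = {1: 6, 2: 4, 3: 10, 4: 1, 5: 9}

# Expand the counts into the list of 30 individual responses.
responses = [label
             for label, count in students_per_label.items()
             for _ in range(count)]

# The mode is the label that appears most often.
mode_label, mode_count = Counter(responses).most_common(1)[0]
print(len(responses), mode_label, state_labels[mode_label], mode_count)
# 30 3 Arkansas 10
```

A mean of these labels could be computed, but it would be meaningless, because the numbers are only names for categories.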

Ordinal Data
Ordinal data is considered to be a higher form of data than nominal, though it still uses numbers and categories to identify data elements. With ordinal data, though, the numbers themselves actually provide some meaning. The numbers used in the FMEA scales at the beginning of this chapter were ordinal data. The numbers are qualitative in nature, but they are also ranked. Central tendencies with ordinal data are measured by either the mode or the median, and common uses for this type of data include ranking various things against each other or rating a specific thing, such as a movie or pain level. Ordinal data can be arranged in an order that makes sense: on a 1 to 10 scale, Suzy rated the movies as 2, 5, 6, and 9. If one is the worst and 10 is the best, then we can assume Suzy liked the last movie best.

While ordinal data comes with a logical order, the intervals between the numbers don’t mean anything. If Suzy rated movie A as a 10 and movie B as a 9, the conclusion is that she liked movie A better. Exactly how much more she liked the movie is not discernible using ordinal data.

Interval Data
Interval data is an even higher form of data classification. Interval data provides numeric values that can be arranged in a logical order with meaning; the big difference between ordinal and interval data is that the difference between each interval value provides meaning. If Frank is keeping track of the temperature in his house and he sees that at 8:00 a.m. it was 76 degrees and at 9:00 a.m. it was 80 degrees, he not only knows that 9:00 a.m. was hotter; he knows that it was 4 degrees hotter at 9:00 a.m. than at 8:00 a.m. Interval data is continuous, or quantitative, and offers more flexibility when it comes to statistical analysis.

Ratio Data
Ratio data is considered to be the highest of the data classifications. Ratio data has an absolute zero point, can be either discrete or continuous in nature, and provides the highest capabilities for statistical analysis in many cases. Some examples of information that can be recorded using ratio data include force, defects per million opportunities, voltage, height, units per hour, and volume.

Choosing the Best Measurement Systems
Measurement systems analysis applies scientific principles to help teams analyze how much variation a system of measurement brings to a process. The purpose of the MSA is to identify errors of accuracy within data collection tools. Teams can then redress measurement systems to create more accurate data captures or, if that isn’t possible, take the possibility of errors into account when performing analysis on data.

During measurement systems analysis, teams should review multiple components of possible measurement error. Six Sigma teams analyze:

Whether bias occurs in the accuracy of measurements
Whether the measurement has the proper resolution
What measurement scale linearity exists
Whether measurement activities are stable over time
Whether measurements are repeatable and reproducible

Depending on the measurements the team is dealing with, the MSA can be time consuming and is often why the measure phase of a DMAIC project is one of the longest.

Creating Accuracy
In this stage of MSA, teams define the difference between the most accurate measurement possible and the data being collected by the current measurement system. The goal of a measurement system is accuracy: coming as close as possible to a defined target, if not the exact measurement. For example, in a computer manufacturing plant, one employee might solder a chip to the motherboard. For the rest of the chips and wires to be added to the motherboard, the chip must be placed within a 2 mm area. In this case, a measurement tool might be implemented with a required accuracy of plus or minus 0.5 mm to ensure the chip is placed within the area targeted.

Teams can ensure accuracy of data by verifying that the gauge used to collect data is performing accurately. If a digital scale is being used to weigh ingredients, teams should calibrate the scale using calibrated weights. If templates are used on a factory floor to make measurements more efficient, teams should ensure those templates are accurate by comparing them against known measurement tools such as verified rulers and scales. Note that, for the purpose of the MSA, accuracy reflects the performance of the measurement tool, not the operator. Whether the employee uses the measurement tool correctly or records the amount correctly is considered a concern of precision, which is covered later in this chapter under Gage R&R.

Once a Six Sigma team is confident that a measurement tool is properly calibrated, they can instruct employees or others who are responsible for recording data. Data should be accepted as it is collected for most efficient access and because early review can turn up specific problems with data collection. When possible, teams should not round data but collect it as it is recorded.

If data is being collected manually, employees should have a data collection template that prompts them to collect data at appropriate times and record information about the data collection event, including the person collecting the data, the machine or process involved, conditions of the environment – especially those that are different from normal conditions or might have a direct impact on measurements – and the measurement tool being used if multiple tools are an option. These details help Six Sigma teams rule out outliers, which are discussed in the next chapter.

Before measurements are passed to the analyze phase, Six Sigma experts should review data to ensure that there are no misplaced or missing decimal points, that duplicate entries haven’t been recorded, that frequency-based measurements aren’t missing points, and that no other obvious issues have occurred with the data. Addressing obvious data problems before beginning the analyze phase reduces the chance that teams will come to false conclusions about root causes or viable solutions for a process.

Addressing Resolution
Measuring at the correct resolution ensures that a measurement system can detect change in the data or process appropriately. For example, if a Six Sigma team is working to improve a process that cuts pipe for bathroom fixture installations, it might be concerned with the length of the pipe. In reviewing the measurement system for the cut pipe, the team finds that the process includes measuring the pipe to the nearest centimeter. If, however, pipes that are off by several millimeters cause issues in the installation, then the nearest-centimeter measurement is not a fine enough resolution.

A good rule of thumb to follow for resolution is called the 10-bucket rule: the measurement resolution should be one-tenth of the range that is required. If the pipe must measure within a range of 5 mm to perform, the measuring tool should measure to the ½ mm. In another example, a food service department might be tasked with maintaining the correct temperature in a freezer. To monitor the temperature, an employee records the temperature once per hour. If temperatures fluctuate quickly in the freezer, a change in temperature that would impact quality of food or ingredients might come and go between recordings. In this case, the proper resolution might be gained by recording measurements every 10 minutes, or every six minutes for 10 readings per hour. Even better, in a freezer with a digital thermostat connected to a network, teams might be able to access readings recorded every minute.

Resolution is usually one of the easiest things to correct within a measuring system, but it isn’t always cost-effective or plausible to measure at the most detailed resolutions. Teams should consider resource requirements when developing a measurement system. If, however, the most detailed resolution is possible, measurements obtained will provide more information about the process and a larger sample size from which to work.
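The 10-bucket rule reduces to a single division, sketched below in Python. The function names are hypothetical, and the check assumes the gauge resolution and the required range are expressed in the same unit.

```python
def required_resolution(required_range: float) -> float:
    """10-bucket rule: resolve to one-tenth of the range that matters."""
    return required_range / 10


def resolution_is_adequate(gauge_resolution: float,
                           required_range: float) -> bool:
    """True when the gauge reads at least as finely as the rule asks."""
    return gauge_resolution <= required_resolution(required_range)


# Pipe example: parts must fall within a 5 mm range.
print(required_resolution(5.0))           # 0.5 (mm)
print(resolution_is_adequate(10.0, 5.0))  # nearest centimeter = 10 mm: False
print(resolution_is_adequate(0.5, 5.0))   # nearest half millimeter: True

# Freezer example: ten readings across a 60-minute window.
print(required_resolution(60.0))          # 6.0 (minutes between readings)
```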
Adjusting for Errors of Linearity
Linearity describes how a measurement system performs across a range. A standard metric ruler in the
hands of most people is fairly accurate at measuring centimeters, but is less accurate at measuring
millimeters or kilometers. A scale with a range between 0 and 10 kilograms might measure less
accurately at either end of the range.
Taking measurements at various ranges with an existing measurement system and comparing those
measurements to data gathered with tools known to be accurate across all ranges can help teams find
errors of linearity. In some cases, teams can develop mathematical equations to account for the
discrepancies. For example, if the scale is accurate at 5 kilograms, but is off by an extra quarter of a
kilogram for each kilogram thereafter, a measurement of 8.5 kilograms would actually be:
8.5 – ((8.5-5)*.25) = 7.625
If mathematical adjustments are not possible, then teams should not use measurement systems to
measure ranges where linearity errors regularly occur.
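The adjustment in the scale example can be wrapped in a small Python function. This sketch follows the chapter's formula as written, which applies the quarter-kilogram correction to each kilogram of the displayed reading above 5 kg; treating readings at or below 5 kg as accurate is an assumption of the sketch, as are the function and parameter names.

```python
def corrected_weight_kg(reading_kg: float,
                        accurate_up_to_kg: float = 5.0,
                        error_per_kg: float = 0.25) -> float:
    """Apply the chapter's linearity adjustment to a scale reading.

    Assumption: readings at or below accurate_up_to_kg need no correction.
    """
    if reading_kg <= accurate_up_to_kg:
        return reading_kg
    excess = reading_kg - accurate_up_to_kg
    return reading_kg - excess * error_per_kg


print(corrected_weight_kg(8.5))  # 7.625
print(corrected_weight_kg(4.0))  # 4.0
```

A real correction would be fitted from comparison measurements against a trusted reference across the whole range rather than taken from a single rule of thumb.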
Stability
Stability describes the consistency of measurements over time. If operators are measuring in the same way and using the same tools – and those tools don’t have any of the other problems described above – then measurements should reflect stability on a control chart. Control charts are introduced in Chapter 16 and covered in depth in later chapters on statistical process control.

If the variation of measurements, as reflected on a control chart, does not indicate stability, then teams might want to first rule out a problem with the measurement system before determining that the process is out of control.

Gage R&R
Gage R&R tools are used to ensure repeatability and reproducibility with regard to measurement
systems. In most cases, Gage R&R tools apply to measurement systems that involve human operators
and appraisers. Six Sigma teams apply Gage R&R tests to find weaknesses within such measurement
systems.
In Gage R&R testing, repeatability means that a single employee, using the same measurement system
and appraising the same things, can repeat his or her measurements. Reproducibility means that
multiple employees using the same measurement system and appraising the same things come up with
measurements that match or are very close to matching.
Most Gage R&R tests fall into two types: attribute and variable. The premise for testing each type of
measurement is the same, though the criteria and statistical analysis following the test differ slightly.
Attribute Gage R&R
An attribute Gage R&R is used when Six Sigma teams are analyzing measurement systems for go/no go
data. For example, if operators review an item in the product line and decide simply to pass or fail it, this would be an attribute measurement. In the example of the freezer measurements above, an employee
might simply be tasked with recording whether the temperature was in an appropriate range: a yes/no
measurement. As previously stated, attribute measurements provide the least information about a
process, so in the case of the freezer temperature, it’s better to record the actual temperature. Whether
that recording was within appropriate range can be determined systemically from the temperature data.
When attribute data is used, an attribute Gage R&R is used to test the measurement system following
the steps below.

Select at least two appraisers.
Provide a number of samples. Label the samples in a way that you know which one is which but that wouldn’t identify the sample for the appraiser.
Record the actual attribute measurement for each sample according to the best possible (most accurate) measurement you have.
Have each appraiser record the attribute measurement for each sample provided (go/no go; yes/no; hot/cold; pass/fail; etc.).
Repeat the process with the same samples and appraisers, randomizing the order in which you present the samples. Randomizing sample order the second time appraisers are presented with them reduces the chance that appraisers remember what measurement they recorded the first time and record the same measurement by default.
Enter all data into a spreadsheet or Gage R&R file similar to the one below that shows a test of a pass/fail measurement.

From the Gage R&R above, you can see that the measurement system is reproducible only 50 percent of
the time, making it a poor measurement system. It is repeatable 90 percent of the time for Appraiser 1 and 80 percent of the time for Appraiser 2, and the appraisers are accurate 80 percent and 70 percent of
the time respectively. Given these results, a Six Sigma team might determine that there is some problem
of clarity with instructions for how to determine whether a sample is a pass or a fail. The chart above
only provides data for a set of 10 samples; more accurate attribute Gage R&R testing usually requires at
least 20 data points.
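The percentages discussed for an attribute Gage R&R can be computed directly from the recorded trials. The Python sketch below uses a small invented data set of five samples, not the chapter's table, and it assumes the following definitions: repeatability is the share of samples on which an appraiser's two trials agree, accuracy is the share on which both of the appraiser's trials match the known standard, and reproducibility is the share on which every trial by every appraiser agrees. Other references define these figures slightly differently.

```python
def fraction(flags) -> float:
    """Share of True values in an iterable of booleans."""
    flags = list(flags)
    return sum(flags) / len(flags)


def repeatability(first, second) -> float:
    # An appraiser repeats a measurement when both trials agree.
    return fraction(a == b for a, b in zip(first, second))


def accuracy(first, second, standard) -> float:
    # Assumed definition: both trials must match the known standard.
    return fraction(a == b == s for a, b, s in zip(first, second, standard))


def reproducibility(trials) -> float:
    # Every trial from every appraiser must agree on a sample.
    all_trials = [ratings for pair in trials.values() for ratings in pair]
    return fraction(len(set(readings)) == 1 for readings in zip(*all_trials))


# Invented pass/fail data for five samples ("P" = pass, "F" = fail).
standard = ["P", "F", "P", "P", "F"]
trials = {
    "Appraiser 1": (["P", "F", "P", "P", "F"], ["P", "F", "P", "F", "F"]),
    "Appraiser 2": (["P", "F", "F", "P", "F"], ["P", "F", "F", "P", "P"]),
}

for name, (first, second) in trials.items():
    print(name,
          f"repeatability={repeatability(first, second):.0%}",
          f"accuracy={accuracy(first, second, standard):.0%}")
print(f"reproducibility={reproducibility(trials):.0%}")
```

With this invented data, both appraisers repeat themselves on four of five samples, yet all four trials agree on only two of five samples, which is the kind of gap that points to unclear pass/fail instructions.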
Variable Gage R&R
Not all data is attribute data, which is why teams can also perform variable Gage R&R tests. While the
raw data from a variable Gage R&R test can provide a Six Sigma team with a picture of whether a
measurement system is obviously failing or not, statistical analysis is usually required to make a true
determination about the performance of a measurement system. This is because, with variable
measurements, some differences between measurements and operators are likely, particularly when
measuring to very small or large figures or capturing data in a moving measurement.
Set up a variable Gage R&R test in much the same way you set up an attribute test, using two to three
appraisers and at least five to ten outputs to be measured. Have each appraiser measure each sample
two or three times, randomizing the order in which samples are presented to avoid appraisers
remembering the measurements initially entered. Record all data on a variable Gage R&R template,
such as the example below.

The statistical analysis performed in Excel SPC or Minitab by a Black Belt or Green Belt typically returns
four figures:

% Study Variation
% Tolerance
% Contribution
Number of distinct categories

Teams should look to ensure all four elements of a variable Gage R&R test calculation are in what are
considered “safe” ranges. Commonly, each element comes with a scale for safe, or green, zones along
with caution zones and failure zones. If one of the elements falls into a caution zone and all others into the green, then a team will likely conclude that the measurement system is sufficient. In some cases, all
or a majority of caution zone scores might be deemed acceptable, particularly if making the
measurement system any more accurate would be costly or cause application issues for other
processes. Measurement systems that score in the failure zone for any element should probably be
repaired or replaced.
Common criteria used to judge each element of the variable Gage R&R calculations are as follows:

Note: Another tool that is effective in identifying variation in a measurement system is called the
ANOVA, or Analysis of Variance. ANOVA is also useful for analyzing variation of any type, and will be
covered in Unit 5 on intermediate statistics.
Collecting Data Samples
Once teams are sure the best possible measurement tools are in place, they can begin collecting data to
be used in the analyze phase of the DMAIC project. The most accurate conclusions come when a team
can analyze data for the entire population, but that is rarely possible due to time and cost constraints. If
you can gain access to automated data or data warehouses, you might be able to collect population data
or extremely large sample sizes that better approximate population data. Otherwise, Six Sigma teams
must randomly sample the population that is available and use those samples to draw conclusions about
the population as a whole.
To ensure samples can be used to draw statistical conclusions, they must be handled correctly and be
the appropriate size. In this section, we’ll simply cover the types of sampling strategies that Six Sigma
teams might use and why.
Simple Random Sampling
Simple random sampling works when there is an equal chance that any item within the population will
be chosen. For example, if you put 20 marbles of the exact size, weight, and texture in a bag and blindly
select one, each marble in the bag has a 1 in 20 chance at being selected. If the marbles are different
sizes or weights, those differing attributes can impact the chance that each marble will be selected.
Heavier marbles might sort to the bottom of the bag; bigger marbles might be more likely to be picked
up.
Random sampling for statistical analysis requires that the sample will represent similar attributes and
percentages as the entire population. The population is “N” items large. The sample size is “n” items
large. How big the sample needs to be to statistically represent the population is decided by a number
of factors.
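Drawing a simple random sample of n items from a population of N items can be sketched with Python's standard library. The population here is just the numbers 1 through 20, standing in for the 20 identical marbles; the sample size of 5 and the seed are arbitrary choices made for the illustration.

```python
import random

# N = 20 identical marbles, identified only by number.
population = list(range(1, 21))

# A seeded generator makes the illustration repeatable; in practice the
# seed would normally be left unset.
rng = random.Random(42)

# n = 5 items drawn without replacement; every marble has the same chance.
sample = rng.sample(population, k=5)
print(sample)
```

How large n must be to represent the population is a separate question that this sketch does not address.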
Stratified Sampling
Stratified sampling occurs when the population as a whole is divided or can be divided into subgroups
with differing attributes. For example, if a shipping company wants to test the accuracy of its estimated
shipping times against actual shipping times, it might assume that the results will vary according to the
distance a package has to travel. By randomly selecting samples from the entire package population,
there’s a chance the company might only end up with samples for packages delivered within a 200-mile
radius.
To prevent bias in the data, the shipping company might divide the population into four subgroups:

Deliveries within 200-mile radius
Deliveries within 201 to 400-mile radius
Deliveries within 401 to 600-mile radius
Deliveries over 600 miles

By sampling randomly from the stratified subgroups, the team ensures a sample with less bias.
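The shipping example can be sketched in Python as follows. The delivery distances are invented, the function names are hypothetical, and drawing the same number of packages from each subgroup is an assumption of the sketch; a team might instead sample each subgroup in proportion to its size.

```python
import random


def stratum_for(distance_miles: float) -> str:
    """Assign a delivery to one of the four distance subgroups."""
    if distance_miles <= 200:
        return "within 200 miles"
    if distance_miles <= 400:
        return "201 to 400 miles"
    if distance_miles <= 600:
        return "401 to 600 miles"
    return "over 600 miles"


def stratified_sample(distances, per_stratum, rng):
    """Randomly draw up to per_stratum deliveries from each subgroup."""
    strata = {}
    for distance in distances:
        strata.setdefault(stratum_for(distance), []).append(distance)
    return {name: rng.sample(members, min(per_stratum, len(members)))
            for name, members in strata.items()}


# Invented population: one delivery every 10 miles from 10 to 890 miles.
distances = list(range(10, 900, 10))

picked = stratified_sample(distances, per_stratum=3, rng=random.Random(7))
for name, members in picked.items():
    print(name, members)
```

Every distance band is guaranteed to appear in the sample, which a single random draw from the whole population cannot promise.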
Sequential Sampling
Sequential sampling involves selecting every Xth item for inclusion in the sample. Sequential sampling
can be used when teams are collecting data at intervals such as time. The team might collect data every
10 minutes. Sequential sampling can also be used to sample physical items; every 5th item on a product
line might be reviewed. Given the right parameters and enough time, sequential sampling can provide
valid statistical results. Teams must be cognizant, however, that the sequence of the sampling could, in
rare cases, skew results. It is possible, for example, that something occurs during every 5th iteration of a
process that causes a difference to occur.
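Selecting every 5th item off a line amounts to slicing a list at a fixed step, as in the Python sketch below. The item numbers are placeholders and the function name is hypothetical.

```python
def every_nth(items, n):
    """Return every nth item, starting with the nth one."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return items[n - 1::n]


# Placeholder item numbers for 20 units coming off a product line.
line_items = list(range(1, 21))

print(every_nth(line_items, 5))  # [5, 10, 15, 20]
```

If something in the process also repeats every five cycles, every selected item would share that condition, which is the skew the text warns about.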

Samples that Aren’t Random
Non-random sampling should not be used when dealing with statistical analysis because it is more likely
to introduce user or sampling error. While all sampling comes with some form of error, random
sampling errors can be calculated and accounted for in analysis. The same cannot be said of non-random
samples.
Non-random sampling includes convenience or judgment sampling. Convenience sampling occurs when
a team takes the most convenient measurements. “We want to know about the process right now, so
let’s review the next dozen items that come off the line.” That type of analysis only truly tells the team
how the process performed at that exact moment in time.
Judgment sampling occurs when an expert or knowledgeable person is tasked with “selecting”
appropriate samples. A supervisor might say to his or her team members, “Select some of your work
that represents the normal way you function in a given day.” In most cases, the team members select
what they believe is better quality work, skewing any results from the sample.
Delivering a Baseline Metric
One of the major deliverables coming out of define and measure phases is the baseline metric. How is
the process performing now, and what measurement will the team use to compare current performance
to post-improvement performance?

Baseline metrics are numbers, but most teams find that presenting the metric graphically resonates best
with business resources and executives. Visual representations also provide teams with a quick way to
determine if progress is occurring.
The type of visual representation you use depends on whether your major metric is discrete or
continuous. Discrete data can be displayed on Pareto charts (see Chapter 5) and continuous data can be
displayed via run charts. You can also use variation or other calculations to convert discrete data to
continuous data for display in run charts and control charts (see Chapter 16 for information about
control charts).

Run Charts
A run chart can be used to monitor the performance of any variable or process over time. With a single,
intuitive chart, Six Sigma teams can display trends, shifts, and cycles within a process; they can also
monitor a process for concerns, though run charts are not as effective at this as the very similar control
chart is.
A basic run chart is simply a line plot of the data over time, which means anyone can create the chart.
Most Six Sigma run charts also feature a line representing the median of all data points for visual
reference. Depending on the type of information being charted, you may need to convert data to a ratio
for a more accurate run chart. For example, if you are plotting the temperature of a surface over time,
there is no need to convert data. If you are plotting the number of patients readmitted to a hospital
within 30 days of being discharged, then it helps to convert the data to a percentage of the number of
patients discharged within the same time period. In a 30-day period during which 10,000 patients were
discharged, you can expect a higher number of readmissions than in a period during which only 5,000
patients were discharged.
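
A small sketch of the conversion, assuming hypothetical readmission counts (the text gives only the discharge volumes of 10,000 and 5,000):

```python
# Hypothetical figures; only the discharge volumes come from the example above.
periods = [
    {"label": "Period A", "discharged": 10_000, "readmitted": 800},
    {"label": "Period B", "discharged": 5_000, "readmitted": 450},
]

for p in periods:
    # Readmissions as a percentage of discharges in the same period.
    p["rate_pct"] = 100.0 * p["readmitted"] / p["discharged"]
    print(f'{p["label"]}: {p["readmitted"]} readmissions, '
          f'{p["rate_pct"]:.1f}% of discharges')
```

With these made-up numbers, Period A has more readmissions in absolute terms, but Period B has the higher readmission rate. A run chart of raw counts would point the team in the wrong direction.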
The figure below illustrates a run chart of temperature over time. You can see how temperature
changes through time and begin to see some possible trends. A Six Sigma team would be able to zoom
out, viewing the run chart over more time to validate trend assumptions. You can also see that the
median temperature for the process is 33.

The run chart below indicates the number of returns per hundred sales for each month of the year. You
can see that returns as a rate of sales increase steadily during the first part of the year before holding
steady from May through November. The orange line indicates the median returns per hundred sales,
which is just under 7.

Create Basic Run Charts in Excel
Statistical analysis software, including Minitab and Excel SPC, creates all elements of a run chart
automatically from entered data, but anyone can use basic Excel functions to create a run chart if
needed.
First, create a data table.
Creating a data table for a single attribute, such as temperature, just requires entering the time labels in
one column and the attribute measurements in another. For this example, we’ll walk through creating a
rate data table, since it involves additional steps.

1. Enter the data labels (month, week, hour, etc.) in the first column of Excel.
2. In the next column, enter the corresponding measurements for the attribute you are interested
in: in this case, the total number of returns per month.
3. In the third column, enter the total number of items you are comparing the attribute to: in this
case, the total number of sales per month.
4. In the fourth column, calculate what percentage the second column (the attribute) is of the
third column (the comparison total): in this case, the percentage of returns per sales for each
month. The calculation is achieved in this case by the formula =B2/C2 for January, =B3/C3 for
February, and so forth.
5. Decide whether you want to create a run chart showing percentages, or if you would like to
create the chart illustrating rate per 100, per 1,000, etc.
6. If you want to illustrate a rate per (x), multiply the percentage calculation in the fourth column
by (x). In this case, the figures in column D are multiplied by 100 and placed in column E.
7. Use Excel to calculate the median of the numbers you plan to chart. The median is calculated
with the formula =MEDIAN(Number 1, Number 2,…), where the numbers in the formula correlate
with the range of all the charted data points. In this case, the median is 6.96679.
8. Highlight the data labels (in this case, column A) and the figures to be charted (column E).
9. Select Insert > Charts > Line Chart to insert a line chart of the attribute or attribute calculation.
10. Select Insert > Shape > Draw Line.
11. Draw a line on your run chart approximately where the median would be. Use Excel tools to
select the color and thickness you want for the line.
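
For readers working outside Excel, the same calculations can be sketched in Python. This is an illustration only: the monthly returns and sales figures below are hypothetical and are not the data behind the chart in this chapter, and the text rendering is a stand-in for a real line chart.

```python
from statistics import median

# Hypothetical (month, returns, sales) rows standing in for columns A to C.
rows = [
    ("Jan", 40, 1000), ("Feb", 55, 1100), ("Mar", 72, 1200),
    ("Apr", 78, 1200), ("May", 91, 1300), ("Jun", 98, 1400),
    ("Jul", 105, 1500), ("Aug", 98, 1400), ("Sep", 91, 1300),
    ("Oct", 84, 1200), ("Nov", 77, 1100), ("Dec", 90, 1000),
]

chart_points = []
for month, returns, sales in rows:
    proportion = returns / sales      # column D, as in =B2/C2
    per_hundred = proportion * 100    # column E, rate per 100 sales
    chart_points.append((month, per_hundred))

# Median of the charted values, the reference line on the run chart.
median_rate = median(rate for _, rate in chart_points)

# Crude text rendering; each '*' represents 0.5 returns per 100 sales.
for month, rate in chart_points:
    print(f"{month}  {rate:5.2f}  {'*' * round(rate * 2)}")
print(f"median: {median_rate:.2f} returns per 100 sales")
```

Because the median is computed rather than drawn by eye, it can be plotted as an exact horizontal line if the points are later passed to a charting library.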

The completed run chart can be used to present information to the Six Sigma team or include a
graphical representation of baseline process performance in a measure phase tollgate presentation.
Again, it should be noted that manual creation of a run chart is not required for most Black Belts and
Green Belts, who will have access to statistical analysis software.

Measure Tollgate Checklist
Use the checklist below to determine whether a team is ready to move from the measure phase to the
analyze phase of a DMAIC project.
o The team has agreed upon the key measurements and come up with a baseline measurement of
process performance.
o The team has analyzed measurement systems and identified any issues that might contribute to
analysis errors.
o Where possible, the team has corrected measurement systems to remove error risks.
o The team has calculated process variation and sigma level.
o The team has conducted appropriate sampling to allow for statistically valid conclusions in the
next phase.
o The sponsor or champion has reviewed and signed off on all elements of the measure phase.