Minitab manual

1.- SIGMA LEVEL

Sigma is a measure of variability. The sigma level indicates how much of a process's output falls within the customer's requirements: the larger the sigma level of the process, the greater the share of its products and services that meet those requirements. σ = sigma = standard deviation, which measures the variation of the data.

6σ is treated as equivalent to zero defects. It corresponds to a correct functioning level of 99.9997 percent, at which defects in processes and products are practically non-existent.

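The sigma level is determined by taking the distance from the mean (X) to the upper limit (LS) and to the lower limit (LI), dividing each distance by the standard deviation, and selecting the smaller result, since the nearer limit is the one that constrains the process.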

Example

From the data obtained from a process

Specification = 100 +/- 15

Upper Limit (LS) = 115

Lower Limit (LI) = 85

Mean (X) = 99.55

Standard Deviation = 2.98

- (LS - X) / standard deviation = (115 - 99.55) / 2.98 = 15.45 / 2.98 = 5.18

- (X - LI) / standard deviation = (99.55 - 85) / 2.98 = 14.55 / 2.98 = 4.88

Sigma level ≈ 5
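The arithmetic of this example can be checked with a short Python sketch. The function name `sigma_level` is hypothetical, and reporting the smaller of the two distances is an assumption of the sketch (it is the conservative choice). Note that 115 - 99.55 = 15.45, so the upper distance works out to about 5.18.

```python
def sigma_level(mean, std_dev, lower_limit, upper_limit):
    """Return (z_upper, z_lower, level) for a two-sided specification."""
    z_upper = (upper_limit - mean) / std_dev  # distance to the upper limit, in sigmas
    z_lower = (mean - lower_limit) / std_dev  # distance to the lower limit, in sigmas
    # Assumption: the nearer limit decides the level, so the smaller value is kept.
    return z_upper, z_lower, min(z_upper, z_lower)


z_upper, z_lower, level = sigma_level(mean=99.55, std_dev=2.98,
                                      lower_limit=85, upper_limit=115)
print(round(z_upper, 2), round(z_lower, 2), round(level, 2))
```

Rounded to the nearest whole number, both distances give a sigma level of about 5.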

- To calculate the mean and standard deviation using MINITAB, enter the data obtained in the MINITAB worksheet

- Select Histogram in the Graph menu

- Select the "With Fit" graph

- Select the column of data in the "Graph variables" field

- Read the mean and standard deviation from the resulting graph.

2.- AVERAGE (MEAN)

The arithmetic average, or mean, describes a whole set of observations with a single value. It is known as the most useful measure of central tendency.

It is obtained by dividing the sum of the observed values in a series by the number of readings.

The mean of a sample (a few) is represented by the symbol x̄

The mean of a population (all) is represented by the symbol μ

For example, the wait time (in minutes) of five clients at the bank was: 3, 2, 4, 1, and 2.

On average, a customer waits (3 + 2 + 4 + 1 + 2) / 5 = 2.4 minutes for service at the bank.
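The same calculation can be sketched in Python, using only the five wait times given above. The variable names are illustrative.

```python
from statistics import mean

wait_times = [3, 2, 4, 1, 2]  # minutes, from the bank example

# Sum of the observed values divided by the number of readings.
average_wait = sum(wait_times) / len(wait_times)

print(average_wait)
print(mean(wait_times))  # the standard library gives the same figure
```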

3.- STANDARD DEVIATION

What is standard deviation (σ)?

The standard deviation of a data set is a measure of dispersion that tells us how far the values can deviate from the average (mean). It is therefore useful for finding the probability that an event occurs.

The standard deviation can be interpreted as a measure of uncertainty. When determining whether a group of measurements agrees with a theoretical model, the standard deviation of those measurements is of vital importance: if the mean of the measurements is too far from the prediction (with the distance measured in standard deviations), we consider that the measurements contradict the theory. This is reasonable, since the measurements fall outside the range of values in which they would be expected to occur if the theoretical model were correct. The standard deviation is a dispersion parameter: it shows how closely the data are grouped around a central value (the mean or average).

The formula is easy: it is the square root of the variance. So "what is variance?"

Variance

The variance (which is the square of the standard deviation: σ²) is defined as follows:

It is the mean of the squared differences from the mean.

In other words, follow these steps:

1. Find the mean (the average of the numbers)

2. Now, for each number subtract the mean and square the result (the difference squared).

3. Now calculate the mean of these squared differences.

* Note: why squared?

Squaring each difference makes all the numbers positive (so that negative differences do not cancel out positive ones and reduce the variance).

It also makes large differences stand out. For example, 100² = 10,000 is much larger than 50² = 2,500.

But squaring them makes the answer very large, so we undo it (with the square root) and so the standard deviation is much more useful.
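The three steps can be sketched in Python as follows, reusing the five bank wait times from section 2 as sample data. The function names are hypothetical. This sketch divides by the number of values, as in the definition above; statistical packages usually report the sample standard deviation, which divides by one less than the number of values and so gives a slightly larger figure.

```python
import math


def population_variance(values):
    """Mean of the squared differences from the mean."""
    mean = sum(values) / len(values)                    # step 1: find the mean
    squared_diffs = [(x - mean) ** 2 for x in values]   # step 2: square each difference
    return sum(squared_diffs) / len(squared_diffs)      # step 3: average the squares


def population_std_dev(values):
    """Square root of the variance, back in the units of the data."""
    return math.sqrt(population_variance(values))


wait_times = [3, 2, 4, 1, 2]
variance = population_variance(wait_times)
std_dev = population_std_dev(wait_times)
print(round(variance, 2), round(std_dev, 4))
```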

Standard deviation formula: σ = √( Σ (xᵢ − μ)² / N ), where N is the number of values and μ is their mean.

Normal distribution curve

The standard deviation is a powerful statistic when used with models such as the Normal distribution, since it allows us to make predictions about the expected variation of the process based on a sample of the process.

One of the properties of the normal curve is that if the curve is divided into standard deviations, from the average:

• 68.27% of the area under the curve falls within ± 1 standard deviation

• 95.45% of the area under the curve falls within ± 2 standard deviations

• 99.73% of the area under the curve falls within ± 3 standard deviations
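These percentages can be reproduced with the error function from Python's standard library. This is a minimal sketch and the helper name `area_within` is hypothetical. The first figure comes out as 68.27 when rounded, and is often quoted as 68.26.

```python
import math


def area_within(k):
    """Fraction of a normal curve lying within +/- k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


areas = {k: round(100 * area_within(k), 2) for k in (1, 2, 3)}
for k, pct in areas.items():
    print(f"within +/- {k} standard deviation(s): {pct}%")
```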

Next we show how to obtain the standard deviation of 100 data points with Minitab.

Step # 1

Once the data is entered in Minitab, open the Graph menu, select Histogram, and then click the With Fit option, as shown in the screenshot.

Step # 2

The Histogram: With Fit dialog box appears. Click in the Graph variables box; the names of the worksheet columns that contain the dates and the data are then listed. Double-click column C2 Process 1, since that is where the data is. 'Process 1' should now appear in the Graph variables box, as shown in the screenshot. Then click OK.

Finally we obtain the graph, which shows the variation of the data under a normal curve around the mean of the 100 data points. In this case the mean was 502.5 and the standard deviation was 49.17.

4.- NORMALITY TEST

Many statistical analyses assume that the data follow a normal distribution, so a normality test should be performed before conducting them. One of the most widely used is the Anderson-Darling test.

This test is used together with the "Normal Probability Plot" to verify that the data is normal. The graph shows a probability value ("P-Value"); if it is greater than 0.05, the hypothesis of normality is not rejected at the 5% significance level (95% confidence).

1- To generate the graph, open the file containing the records taken.

2- Select Stat > Basic Statistics > Normality Test.

3- In the Variable box, enter the data you want to analyze and make sure that the Anderson-Darling option is selected.

4- Press OK to generate the normality test graph.

Interpretation: If P-Value > 0.05, the data can be treated as normal at a 95% confidence level. For the example shown, Process 1, the P-Value is 0.656, so the data are considered normal.

Visually, it can be seen that the data follow the reference line, which indicates that they come from a normal distribution.
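For readers who want to see what lies behind the P-Value, the following is a rough Python sketch of the Anderson-Darling statistic for normality. It is not Minitab's implementation. The p-value formulas are the approximation commonly attributed to D'Agostino and Stephens and are an assumption here, and both samples are hypothetical, built from evenly spaced quantiles so that the result is repeatable.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling(data):
    """Return (adjusted A-squared statistic, approximate p-value) for a normality test."""
    x = sorted(data)
    n = len(x)
    fitted = NormalDist(mean(x), stdev(x))  # normal curve fitted to the sample
    cdf = [fitted.cdf(v) for v in x]
    total = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1 - cdf[n - 1 - i]))
        for i in range(n)
    )
    a2 = -n - total / n
    a2_adj = a2 * (1 + 0.75 / n + 2.25 / n ** 2)  # small-sample adjustment
    # Assumed p-value approximation (D'Agostino and Stephens).
    if a2_adj >= 0.6:
        p = math.exp(1.2937 - 5.709 * a2_adj + 0.0186 * a2_adj ** 2)
    elif a2_adj > 0.34:
        p = math.exp(0.9177 - 4.279 * a2_adj - 1.38 * a2_adj ** 2)
    elif a2_adj > 0.2:
        p = 1 - math.exp(-8.318 + 42.796 * a2_adj - 59.938 * a2_adj ** 2)
    else:
        p = 1 - math.exp(-13.436 + 101.14 * a2_adj - 223.73 * a2_adj ** 2)
    return a2_adj, p


# Hypothetical samples: one bell-shaped, one strongly skewed.
standard = NormalDist(0, 1)
bell_shaped = [standard.inv_cdf((i + 0.5) / 50) for i in range(50)]
skewed = [math.exp(v) for v in bell_shaped]

for name, sample in (("bell-shaped", bell_shaped), ("skewed", skewed)):
    stat, p_value = anderson_darling(sample)
    verdict = "normality not rejected" if p_value > 0.05 else "normality rejected"
    print(name, round(stat, 3), verdict)
```

The decision rule is the same as in the interpretation above: a P-Value greater than 0.05 means normality is not rejected.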

5.- CONTROL CHARTS

Control charts consist of a diagram on which the results of an inspection are recorded successively during a process.

To improve the process using the control charts the following steps have to be repeated.

1. Collection.

• Data is taken and plotted.

2. Control.

• Limits are calculated based on the data obtained and plotted.

• Special causes are identified and necessary corrective actions are taken.

3. Analysis and improvement.

• Variation due to common causes is quantified and actions are taken to reduce it.

These three phases are repeated to achieve continuous improvement of the process.

The benefits of using the control charts correctly can be among others:

• Help the process run consistently and be predictable.

• Provide information to operators for continuous control of the process.

• Distinguish common causes from special ones, as a guide for taking local or system actions.

The requirements for the proper use of control charts are:

• Have the process defined.

• Identify the characteristics to control.

• Define the measurement system.

• Adjust the process to reduce unnecessary variation.

Charts of Averages, Ranges and Standard Deviation.

Plotting and checking the averages of the samples is not enough, since the mean of a process can remain stable for short periods of time while its dispersion or variation changes.

Therefore it is necessary to use the range chart together with the average chart. This chart is based on the concept that the ranges calculated for small samples tend to be normally distributed.

The standard deviation chart helps us to see how the degree of dispersion of the data behaves with respect to the mean of the samples.

Average: The average is the arithmetic mean. It is found by dividing the sum of the values by the total number of values.

Range:

Range is a common measure of variation. To determine the range, subtract the smallest value in a sample from the largest value in the same sample.

Range = ”R”

Xmax = Maximum value

Xmin = Minimum value

R = Xmax - Xmin

Standard Deviation:

The standard deviation is a measure of the degree of dispersion of the data with respect to the average value.

Considerations for obtaining data.

• Subgroups should be chosen so that the variation within each one is small; a subgroup can consist of 4 or 5 consecutive pieces from the process.

• Data should be collected frequently, at relatively short intervals, so that any situation that causes variation in the process is detected.

• The number of subgroups must be large enough for the sources of variation to have the opportunity to be reflected in the charts.

These charts have a central line that represents the historical average of the characteristic being controlled, as well as two other lines that represent the upper and lower control limits, also obtained from historical data. In Minitab, both the center line and the limits are calculated automatically from the data entered.
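As an illustration of how the center line and limits can be derived from subgroup data, here is a minimal Python sketch of the classical average and range chart calculation. The subgroup values are hypothetical. The constants A2, D3 and D4 are the values usually tabulated for subgroups of size 5, and Minitab's own estimation method may differ from this one.

```python
# Constants usually tabulated for subgroups of size 5 (assumed from standard SPC tables).
A2, D3, D4 = 0.577, 0.0, 2.114


def xbar_r_limits(subgroups):
    """Return (lower limit, center line, upper limit) for the average and range charts."""
    means = [sum(g) / len(g) for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]   # R = Xmax - Xmin
    grand_mean = sum(means) / len(means)            # center line of the average chart
    mean_range = sum(ranges) / len(ranges)          # center line of the range chart
    return {
        "xbar": (grand_mean - A2 * mean_range, grand_mean, grand_mean + A2 * mean_range),
        "r": (D3 * mean_range, mean_range, D4 * mean_range),
    }


# Hypothetical data: four subgroups of five consecutive pieces.
subgroups = [
    [10, 12, 11, 9, 13],
    [11, 10, 12, 12, 10],
    [9, 11, 10, 13, 12],
    [12, 11, 10, 9, 13],
]
limits = xbar_r_limits(subgroups)
print(limits["xbar"])
print(limits["r"])
```

A real study would use many more subgroups than the four shown here.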

Control charts can be by variables or by attributes.

By variables:

A measurable quality characteristic such as dimension, weight or volume is a quantitative variable; control charts by variables are used to provide information on the performance of the processes with respect to such characteristics.

For data in subgroups

Example of a chart for 20 subgroups

Xbar chart

In Minitab, select Stat / Control Charts / Variables Charts for Subgroups / Xbar.

No points outside the control limits are shown in this chart.

R chart

In Minitab, the R chart is found in the same Variables Charts for Subgroups menu.

S chart

In Minitab, the S chart is found in the same Variables Charts for Subgroups menu.

Xbar-R chart

In Minitab, select Stat / Control Charts / Variables Charts for Subgroups / Xbar-R.

Xbar-S chart

In Minitab, select Stat / Control Charts / Variables Charts for Subgroups / Xbar-S.

Control charts for individual observations

In Minitab, select Stat / Control Charts / Variables Charts for Individuals / Individuals.

The processes are shown to be stable in this data set.

Classification of charts by attributes

They are used to monitor qualitative characteristics, that is, characteristics that are not numerically quantifiable.

Select Stat / Control Charts / Attributes Charts / P, U, NP or C.

P charts (proportion of defective units)

The “P” chart measures the proportion of defective parts in a group of parts inspected.

It is important to keep in mind that each component, part or item inspected is recorded as conforming or non-conforming, regardless of whether a single item has several defects.

The graph shows us 6 points out of control.
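The limits of a P chart can be sketched in Python with the usual three-sigma rule for a proportion. The inspection counts below are hypothetical and are not the data behind the graph above. The function names are illustrative, and a constant sample size is assumed.

```python
import math


def p_chart_limits(defectives, sample_size):
    """Return (lower limit, center line, upper limit) for a constant sample size."""
    p_bar = sum(defectives) / (len(defectives) * sample_size)  # overall proportion defective
    sigma = math.sqrt(p_bar * (1 - p_bar) / sample_size)
    lower = max(0.0, p_bar - 3 * sigma)   # a proportion cannot fall below 0
    upper = min(1.0, p_bar + 3 * sigma)   # or rise above 1
    return lower, p_bar, upper


def out_of_control(defectives, sample_size):
    """Indexes of the samples whose proportion lies outside the limits."""
    lower, _, upper = p_chart_limits(defectives, sample_size)
    return [i for i, d in enumerate(defectives)
            if not lower <= d / sample_size <= upper]


# Hypothetical data: defective parts found in ten lots of 100 parts each.
defectives = [4, 6, 5, 3, 7, 5, 20, 4, 6, 5]
lower, center, upper = p_chart_limits(defectives, 100)
flagged = out_of_control(defectives, 100)
print(round(lower, 4), round(center, 4), round(upper, 4), flagged)
```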

NP chart

This chart measures the number of defective parts in an inspected lot. It is identical to the “P” chart except that the number of defective parts is plotted instead of the proportion. Both apply to the same situations; choose the NP chart when:

a) The actual number of defective parts is more meaningful or easier to report.

b) The sample size remains constant from period to period.

C chart

The “C” chart measures the number of defects in an inspection lot. This chart requires a constant sample size. It applies in two types of inspection situations.

a) When the defects are dispersed through a continuous flow of the product.

b) When defects from different potential sources can be found in a single unit.

U chart

The “U” chart measures the number of defects per unit inspected in subgroups that can have different sizes. It is similar to the “C” chart except that the number of defects is expressed on a per-unit basis. Both charts are suitable for the same situations; however, the “U” chart can be used if:

a) The sample includes more than one “unit”

b) The sample size can vary from period to period.

6.- AVERAGE AND STANDARD DEVIATION

The average is obtained by dividing the sum of the values observed in a series by the number of readings.

The mean of a sample (a few) is represented by the symbol x̄

Example: taking the average of a population of data

Select Graph / Histogram / With Fit to obtain the histogram.

The mean and standard deviation are shown on the resulting graph.

7.- ONE-SAMPLE T-CONFIDENCE INTERVAL AND HYPOTHESIS TEST

Use 1-Sample t to calculate a confidence interval and perform a hypothesis test of the mean when the population standard deviation (σ) is unknown. For a two-tailed one-sample t, the hypotheses are H0: μ = μ0 versus H1: μ ≠ μ0, where μ0 is the hypothesized mean.

Data

Enter each sample in a unique numeric column. You can generate a test of hypothesis or confidence interval for more than one column at the same time.

MINITAB automatically skips missing data from calculations.

To generate a t-Confidence Interval and Hypothesis Test:

1. Stat / Basic Statistics / 1-Sample t

2. In Samples in columns, enter the column(s) that contain the samples.

3. Do one of the following:

• To calculate the confidence interval for the mean, select Options and enter the Confidence level.

• To perform a hypothesis test, select Test mean and enter the hypothesized value of the mean.

4. If you want to make a graph, select Graphs…

Select the data to be analyzed for the One-Sample t-Test; in this case Process 1 is selected.

You can display a histogram, scatter plot, and box plot for each column. The graphs show the sample mean and a confidence interval for the mean and, when a hypothesis test is performed, the value of the mean under the null hypothesis.

Interpreting the Results

One-Sample T: Process 1

Test of mu = 500 vs not = 500

The test statistic, T, for H0: μ = 500 is calculated as 0.51.

The p-value of this test, that is, the probability of obtaining a test statistic at least as extreme as the one observed if the null hypothesis is true, is 0.608. H0 is rejected only if the chosen significance level α is greater than the p-value; since 0.608 is well above the usual α = 0.05, H0 is not rejected here.

A 95% confidence interval for the population mean, μ, is (492.771, 512.284).
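The T value can be reproduced approximately from the summary figures quoted in section 3 (mean 502.5, standard deviation 49.17, 100 data points). The Python sketch below assumes those rounded figures, and takes the critical value 1.984 for 99 degrees of freedom from a t table, since the standard library has no t distribution. Because the inputs are rounded, the interval differs slightly from the Minitab output.

```python
import math


def one_sample_t(sample_mean, sample_sd, n, hypothesized_mean):
    """Return (t statistic, standard error of the mean)."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error, standard_error


# Rounded summary values quoted earlier for Process 1 (assumed inputs).
SAMPLE_MEAN, SAMPLE_SD, N = 502.5, 49.17, 100
T_CRITICAL = 1.984  # assumed from a t table: two-sided 95%, 99 degrees of freedom

t_stat, se = one_sample_t(SAMPLE_MEAN, SAMPLE_SD, N, hypothesized_mean=500)
interval = (SAMPLE_MEAN - T_CRITICAL * se, SAMPLE_MEAN + T_CRITICAL * se)

print(round(t_stat, 2))
print(tuple(round(v, 2) for v in interval))
```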

8.- PROCESS CAPABILITY

The capability of a process is its ability to generate a product that meets certain specifications. In the best case, the natural tolerance limits of the process lie within the product specification limits, which ensures that all production meets specifications. To analyze the capability of the process a frequency histogram is used, for which a certain number of measurements must be taken.

To measure the capability of a process, coefficients are used that compare the range of specifications with the natural fluctuation of the process. One of them is Cp:

Cp = (LSE - LIE) / 6σ

Where:

• LSE is the Upper Specification Limit

• LIE is the Lower Specification Limit

If the process has the capability to manufacture the product, then Cp > 1. In general, Cp > 1.30 is required for greater safety.

Definitions

Cp: The capability index, defined as the tolerance divided by the capability of the process, regardless of whether the process is centered.

Cp = (LSE - LIE) / 6σ

Pp: The performance index, defined as the tolerance divided by the performance of the process, regardless of whether the process is centered.

Pp = (LSE - LIE) / 6s, where s is the overall standard deviation of the sample

CPU: The upper capability index, defined as the upper tolerance spread divided by the actual upper spread of the process.

This graph shows that a good part of the product is above the Upper Specification Limit (LSE). Even so, Cp > 1, which wrongly indicates that the process has sufficient capability. In this case the second coefficient, Cpk, must be used; it clearly shows that the process does not have sufficient capability (Cpk < 1).
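The difference between Cp and Cpk can be illustrated with a small Python sketch. It assumes the usual definition Cpk = min(CPU, CPL), with CPL the lower counterpart of CPU. It reuses the figures from the sigma level example in section 1 (specification 100 ± 15, mean 99.55, standard deviation 2.98), and the shifted mean of 112 is a hypothetical value added to show an off-center process.

```python
def capability_indices(mean, std_dev, lower_spec, upper_spec):
    """Return (Cp, CPU, CPL, Cpk) for a two-sided specification."""
    cp = (upper_spec - lower_spec) / (6 * std_dev)   # ignores where the process is centered
    cpu = (upper_spec - mean) / (3 * std_dev)        # room towards the upper limit
    cpl = (mean - lower_spec) / (3 * std_dev)        # room towards the lower limit
    return cp, cpu, cpl, min(cpu, cpl)               # Cpk takes the worse side


# Centered process: values from the sigma level example.
cp, cpu, cpl, cpk = capability_indices(99.55, 2.98, 85, 115)
print(round(cp, 2), round(cpk, 2))

# Hypothetical off-center process: same spread, mean shifted to 112.
cp_shifted, _, _, cpk_shifted = capability_indices(112, 2.98, 85, 115)
print(round(cp_shifted, 2), round(cpk_shifted, 2))
```

In the shifted case Cp is unchanged and still above 1, while Cpk drops below 1, which is the situation described above.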

- Example:

Calculation of process capability in MINITAB

Enter the data obtained in the MINITAB worksheet

Once the data has been entered in the worksheet, select from the main menu bar: Stat > Quality Tools > Capability Analysis > Normal

A dialog box will be displayed in which the following is entered:

1.- In the "Single column" box select column 2 "Process 1"

2.- In the "Subgroup size" box select column 1 "Date"

3.- In the "Lower spec" and "Upper spec" boxes enter the specification limits (500 ± 200), then select OK to generate the graphs and data

In the resulting graph we can observe the values obtained for the process being analyzed

A Cp of at least 1 (1.34 in this example) indicates that the process is capable of producing 99.73% of the parts within the engineering specifications

A Cpk of at least 1.3 (1.33 in this example) indicates that the process is capable of producing 99.73% of the parts within specification

Reference values for Cpk:

o Positive Cpk < 1 indicates that the process average is within specification but one of the 3-sigma tails is outside the specification limits (bad parts, or a high possibility of producing them)

o Cpk = zero indicates that the process is centered on one of the specification limits

o Negative Cpk indicates that the process average is outside one of the specification limits

PPM = 50.81 defective parts per million parts manufactured

9.- SIXPACK

Capability Sixpack (Normal distribution)

It is used to generate process capability reports when your data follows a normal distribution.

To confirm the stability of the process, the report includes:

- An Xbar chart (or an individuals chart for individual observations)

- An R chart or an S chart (for subgroups larger than 8)

- A run chart of the last 25 subgroups (or last 25 observations)

To confirm normality, the report includes:

- A histogram of the process data

- A normal probability plot (with 95% confidence interval, Anderson-Darling statistic, and P-value)

To evaluate capability, the report includes:

- A process capability plot

- General capability statistics: Cp, Cpk, Cpm (if you specify a target), Pp, Ppk, and Z-value comparison.

Capability Sixpack Example (Normal Probability Model)

A wire manufacturer wants to assess whether the wire diameter meets specifications. The wire must be 0.55 +/- 0.05 cm in diameter to meet engineering specifications. Analysts evaluate the capability of the process to ensure that the customer's requirement of a Ppk of 1.33 is being met. Every hour, analysts take a subgroup of 5 consecutive wires from the production line and record the diameter.

1- To generate the report, open the file containing the records taken.

2- Select Stat > Quality Tools > Capability Sixpack > Normal.

3- In Single column, enter "Diameter", since it is the column that contains the records. In Subgroup size, enter 5.

4- To set the upper limit, enter 0.60 in the Upper spec field; to set the lower limit, enter 0.50 in the Lower spec field.

5- Click on Options. In Target (adds Cpm to table), enter 0.55. Click OK in each dialog box.

The output graph is shown below

Interpreting the results

In both the Xbar and R charts, the points are randomly distributed between the control limits, implying a stable process. However, you must also compare the points on the R chart with those on the Xbar chart to see whether the points follow each other. These points do not, which again implies a stable process.

The points in the run chart of the last 20 subgroups form a random horizontal scatter, without trends or shifts, which also indicates stability in the process.

If you want to interpret the process capability statistics, your data should roughly follow a normal distribution. In the capability histogram, the data roughly follows the normal curve. On the normal probability plot, the points approximately follow a straight line and fall within the 95% confidence interval. These patterns indicate that the data is normally distributed.

However, from the capability plot you can see that the overall process variation is wider than the range between the specification limits. This means that you will sometimes see wires with a diameter outside the tolerance limits. In addition, the value of Ppk (0.80) is below the required goal of 1.33, indicating that the manufacturer needs to improve its process.

10.- REGRESSION

Regression is a statistical technique used to model the relationship between two or more variables. It can therefore be used to build a model that predicts the behavior of a given variable.

Regression equation: Y = β0 + β1X1 + β2X2 + ... + βpXp + ε

where β0 is the intercept or "constant" term, the βi are the parameters of the respective independent variables, p is the number of independent variables taken into account in the regression, and ε is the error term.

Terms and definitions:

Response variable “Y” = Dependent variable

Predictor “X” = Independent variable

S = Standard deviation of the residuals

R-Sq = Coefficient of determination

R-Sq (adj) = Adjusted coefficient of determination

There are four types of regression:

Linear Regression (y = A + Bx), Logarithmic Regression (y = A + B ln(x)), Quadratic Regression (y = A + Bx + Cx²) and Exponential Regression (y = Ae^(Bx)), of which Linear Regression, Quadratic Regression and Exponential Regression are commonly used.

There are two options in Minitab:

o Stat / Regression / Regression: where MINITAB provides very detailed information about the regression analysis.

o Stat / Regression / Fitted Line Plot: where MINITAB presents a less detailed result, but shows a scatter diagram of the data, which graphically complements the information provided.

To get a better idea of what "regression" is, we will see the following example:

A manufacturer of tennis ball cannons decided to investigate the use of compressed air instead of the classic design, which uses a felt wheel and friction. To do so, 30 shots were fired while gradually increasing the air pressure (bars) and measuring the distance reached (meters). The manufacturer wants to know how many bars are needed to reach a distance of 60 meters.

Once the data from the shots have been entered, or copied from Excel and pasted into Minitab, we choose from the menu

Stat / Regression / Fitted Line Plot… A window will appear in which we leave the "Linear" option selected.

Once this is done, a graph appears showing at the top the equation "Bars = -1.442 + 0.2910 Meters", which we can use to predict the bars needed for greater distances.
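The fitted equation can be applied directly to the manufacturer's question. This is a minimal Python sketch; the function name `predict_bars` is hypothetical, and the coefficients are the ones shown in the equation above.

```python
def predict_bars(meters, intercept=-1.442, slope=0.2910):
    """Pressure in bars predicted by the fitted line for a given distance in meters."""
    return intercept + slope * meters


bars_for_60_m = predict_bars(60)
print(round(bars_for_60_m, 3))
```

According to this line, about 16 bars would be needed to reach 60 meters.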

In the following example we will see how to identify whether the regression is linear, quadratic or cubic.

Using the data from the previous example, but now with altered distances, we create the Fitted Line Plot with the "Linear" option exactly as in the previous example and obtain the following:

As we can see, R-Sq is not very close to 100%, so we must keep looking for the regression option that gets as close as possible to 100%. For this we select Stat / Regression / Fitted Line Plot… in the menu; a window will appear in which we select the "Quadratic" option.

We obtain…

The value of R-Sq is 63.4%, so the quadratic regression may not be what we are looking for, as we need it to be as close to 100% as possible.

Let's repeat the graph but now this time we will choose the option "Cubic"

We obtain…

In this cubic regression graph the value of R-Sq is 65.4%, which is above the 63.4% of the quadratic regression and the 56.6% of the linear regression in the previous graphs. When testing the three options seen above (Linear, Quadratic and Cubic), we are looking for the option that brings us closest to 100%: the closer R-Sq is to 100%, the more reliable the prediction we calculate will be. Because adding terms never lowers R-Sq, R-Sq (adj) is the fairer figure for comparing models with different numbers of terms.
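To make the meaning of R-Sq concrete, the following Python sketch computes it for two candidate models on a small hypothetical data set in which y is exactly x squared. The data and function names are illustrative only and are not taken from the cannon example.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for y = A + Bx."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - slope * mean_x, slope


def r_squared(observed, predicted):
    """Share of the variation in the observed values explained by the predictions."""
    mean_obs = sum(observed) / len(observed)
    ss_total = sum((y - mean_obs) ** 2 for y in observed)
    ss_error = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1 - ss_error / ss_total


# Hypothetical data following y = x^2 exactly.
xs = [0, 1, 2, 3, 4]
ys = [0, 1, 4, 9, 16]

intercept, slope = fit_line(xs, ys)
r2_linear = r_squared(ys, [intercept + slope * x for x in xs])
r2_quadratic = r_squared(ys, [x ** 2 for x in xs])
print(f"linear: {100 * r2_linear:.1f}%  quadratic: {100 * r2_quadratic:.1f}%")
```

The straight line explains most of the variation, but the model with the right shape reaches 100%.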

11.- CORRELATION

The purpose of correlation is to know the relationship that can occur between two or more variables.

DEPENDENT: Hayman (1974) defines it as a property or characteristic that is to be changed by manipulating the independent variable… It is the factor that is observed and measured to determine the effect of the independent variable.

INDEPENDENT: It is manipulated by the researcher in an experiment in order to study how it affects the dependent variable.

Correlation Coefficient

Sote (2005) defines the correlation coefficient (r) as a "statistical indicator that allows us to know the degree of relationship, association or dependence that may exist between two or more variables".

- Simple correlation: When you study the possible relationship between two variables.

- Multiple correlation: When analyzing the association or dependence of more than two variables.

- Curvilinear correlation: The variable has a different trend than the straight line.

Types of Correlation

Positive or directly proportional correlation r = (+)

It indicates that when one variable changes in one direction, the other does so in the same direction.

Negative or inversely proportional correlation r = (-)

It shows us that when one variable changes in a certain direction, the other changes in the opposite direction.

No correlation r = 0

When the indicator obtained is equal to zero, it is said that there is no relationship, association or dependency between the variables studied. They are therefore uncorrelated variables, with no linear dependency between them.

Different types of correlation coefficient.

Pearson's correlation coefficient: An index that measures the linear relationship between two quantitative random variables.

Spearman's correlation coefficient: A measure of the correlation (the association or interdependence) between two random variables that is calculated from the ranks of the data, so it measures monotonic rather than strictly linear relationships.
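Both coefficients can be sketched in a few lines of Python. Spearman's coefficient is computed here as Pearson's coefficient applied to the ranks of the data. The ranking helper assumes there are no tied values, and the data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Rank of each value, 1 for the smallest (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(xs, ys):
    """Spearman's coefficient: Pearson's coefficient of the ranks."""
    return pearson_r(ranks(xs), ranks(ys))


# Hypothetical data: y always grows with x, but not along a straight line.
xs = [1, 2, 3, 4, 5]
ys = [1, 8, 27, 64, 125]
print(round(pearson_r(xs, ys), 3), round(spearman_rho(xs, ys), 3))
print(round(pearson_r(xs, [-x for x in xs]), 3))  # perfectly inverse relationship
```

For these data Spearman's coefficient is 1, because the order is perfectly preserved, while Pearson's is lower because the relationship is curved.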

Calculating a correlation coefficient in Minitab

Once you have the table with the data that you want to analyze:

1.- Go to Stat / Basic Statistics / Correlation… in the main menu.

2.- Select your two variables

3.- Finish with “OK” to obtain the correlation coefficient between the two variables, as well as the Probability Value (P-Value).

12.- TWO-SAMPLE T-CONFIDENCE INTERVAL AND HYPOTHESIS TEST

This test checks the hypothesis that there is no significant difference between the means of two different samples, H0: μ1 = μ2.

In other words, there are now two samples from two different populations, assumed to be normally distributed and independent, and the aim is to check whether or not there are significant differences between the two.

To generate a Two-Sample t-Test:

1. Stat / Basic Statistics / 2-Sample t

2. Select the populations you need to compare in Samples in different columns.

3. Select the confidence level of the test in Options.

4. Select Graphs to generate a graph and interpret the results.

Interpreting Results:

Difference = mu (Process 1) - mu (Process 3)

Estimate for difference: -63.0000

95% CI for difference: (-92.9348, -33.0652)

T-Test of difference = 0 (vs not =): T-Value = -4.17 P-Value = 0.000 DF = 122

There are significant differences between the two processes: the P-Value of 0.000 is below 0.05, so the hypothesis that the two means are equal is rejected.

There is an estimated difference of 63 points, and the 95% confidence interval for the difference does not include zero.

The P-Value of this test says nothing about normality; whether each process is normal must be checked separately with a normality test.

In addition, the standard errors of the mean (SE Mean) of the two processes are very different. In other words, there is a difference between the groups: the mean of Process 1 is about 63 points lower than that of Process 3.
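The T-Value and DF of a two-sample test can be sketched in Python as follows. The sketch uses the form of the test that does not assume equal variances (Welch's test), which is an assumption about how the output above was produced. The two samples are hypothetical, since the raw data of Process 1 and Process 3 are not listed here.

```python
import math
from statistics import mean, variance


def two_sample_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) without assuming equal variances."""
    var_a = variance(sample_a) / len(sample_a)   # squared standard error of mean A
    var_b = variance(sample_b) / len(sample_b)   # squared standard error of mean B
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(var_a + var_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(sample_a) - 1) + var_b ** 2 / (len(sample_b) - 1)
    )
    return t_stat, df


# Hypothetical samples.
process_a = [10, 12, 11, 13, 14]
process_b = [15, 17, 16, 18, 19]
t_stat, df = two_sample_t(process_a, process_b)
print(round(t_stat, 2), round(df, 1))
```

A T-Value far from zero, as in the Minitab output above, corresponds to a small P-Value and therefore to a significant difference between the means.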

13.- GAGE R&R

Definitions:

Repeatability: It is the variation observed when the same operator measures the same element repeatedly using the same device. It gives an idea of the variation due to said measuring device.

Reproducibility: It is the variation observed when different operators measure the same element using the same device. It gives us an idea of the variation due to the operator.

Measurement Repeatability and Reproducibility studies determine how much of the variation observed in the process is due to the measurement system used.

Minitab provides two methods for conducting these types of studies. The X-bar / R method breaks down the total variation into three categories: part-to-part, repeatability, and reproducibility. The ANOVA method goes one step further and breaks down reproducibility into two subcategories, the operator and the operator-by-part interaction (for this reason the latter method is more accurate than the former).

Inconsistent Tool Revealed

The R&R results show that even when the same person weighs the same box on the same scale, the measurements may differ by several grams, indicating that the scale is in dire need of recalibration. The faulty scale would have made the control chart practically useless. Although the average measurements are not very far apart, the spread of the measurements is enormous!

To meet the growing demand, a company hires new workers to prepare carefully measured quantities of an expensive solution. The company uses an R&R study to compare new operators to experienced operators.

The study reveals that, when workers measure the same sample, measurements for new hires are too high or too low more often than measurements for experienced workers. The company decides to carry out more training for new employees.

How to analyze a Gage R&R study in Minitab?

Knowing how well you can measure something can have significant financial benefits. Minitab makes it easy to analyze how accurate your measurements are.

A restaurant plans to evaluate how the temperature of food is measured to ensure that the food is hot enough. Incorrect temperatures can lead to discarding good food, failing a health inspection, or even making a customer sick.

Starting

Preparing to analyze your measurement system is easy because Minitab's Create Gage R&R Study Worksheet can generate a data collection sheet for you. The dialog box allows you to quickly specify who takes the measurements (the operators), the elements they measure (the parts), and in what order the data should be collected.

1. Select Stat> Quality Tools> Gage Study> Create Gage R&R Study Worksheet.

2. Specify the number of parts, the number of operators, and the number of times the same operator will measure the same part.

3. Assign descriptive names to the parts and operators so that they are easy to identify in the output.

4. Click OK

The main event

After entering the measurements into the worksheet, you can use Gage R&R Study (Crossed) to analyze the measurements.

1. Select Stat> Quality Tools> Gage Study> Gage R&R Study (Crossed).

2. In Part numbers, enter Parts.

3. In Operators, enter Operators.

4. In Measurement data, enter 'Food Temp'.

5. Click Options.

6. Enter your specification limits. In this case, set a lower specification for the minimum temperature.

7. Click OK in each dialog box.