
Chi square test


Anonim

Hypothesis tests in a statistical analysis must be chosen according to established practice, as developed over time by different authors. This technical note is intended as a guide to testing hypotheses with qualitative (categorical) variables, both nominal and ordinal. It explains how to carry out the chi-square test on such variables.

It covers:

  1. Variables and types of tests.
  2. The chi square.

Variables and Types of Hypothesis Tests

When doing statistical analyses with categorical variables, it is important to know which type of analysis to run and how to interpret its results. Variables are divided into qualitative (nominal and ordinal) and quantitative (interval and ratio).

For categorical variables, hypotheses can be tested with the following contrast tests: Pearson's chi square (χ²), Fisher's exact test, and McNemar's test (Statis Geek, 2019).

Table 1. Relationship of variables and Types of Tests

Source: Statis Geek (June 22, 2019). Chi square. Retrieved from

Chi Square

The chi square test is useful to establish whether or not there is a relationship between categorical variables (nominal and ordinal). It can be done by using the following formula:
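In its standard form, Pearson's statistic compares the observed count of each cell of the contingency table with the count expected under independence. The notation here is assumed for illustration: O_ij is the observed count in row i and column j, E_ij the expected count, R_i and C_j the row and column totals, and n the sample size.

```latex
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}},
\qquad
E_{ij} = \frac{R_i \, C_j}{n}
```

For a table with r rows and c columns, the statistic is referred to a chi-square distribution with (r - 1)(c - 1) degrees of freedom.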

With this formula, the test can be worked out in Excel from the observed data, or run in a statistical package that implements it. If two variables are associated, part of the variability of one can be explained by the other. However, although the test can establish that two variables are associated, it does not describe the characteristics or form of the association.
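As a minimal sketch of the same calculation outside Excel, the Python code below computes the expected frequencies and the chi-square statistic from a table of observed counts. The counts and the function names (`expected_frequencies`, `chi_square_statistic`) are hypothetical, chosen only for illustration.

```python
# Hypothetical 2 x 2 table of observed counts (illustrative numbers only):
# rows = sex (man, woman), columns = type of school (public, private).
observed = [
    [20, 30],
    [30, 20],
]


def expected_frequencies(table):
    """Expected count per cell under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_statistic(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_frequencies(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


statistic = chi_square_statistic(observed)
print(statistic)  # 4.0 for the table above
```

Every expected count in this example is 25, so each of the four cells contributes (5² / 25) = 1 to the statistic.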

To interpret the results of the chi square, it is important to understand the intensity and the sense (direction) of the association. The intensity is the strength of the association, which can range from weak to very strong; the direction tells us whether the association is direct or inverse. In a direct relationship, when one variable increases the other also increases; in an inverse relationship, when one goes up the other goes down. Direction can only be identified when both variables are ordinal. A further element is the type of relationship between the associated variables, which can be symmetric or directional (Ruiz, 2019).

  • Symmetric. Used when you only want to measure the strength and sense of the association between two categorical variables, for example the intensity of association between the level of qualifications and the teaching method used (one ordinal and one nominal variable). A set of tests is available for this case.
  • Directional. Used when you want to determine how much one variable helps predict the behavior of the other. One of the variables must be treated as dependent (y) and the other as independent (x), for example stress level (y) against career level (x). A separate set of tests is available for this case.

The basic difference is that symmetric measures only quantify the intensity of the association, while directional measures establish how well a dependent variable can be predicted from the level of the independent variable (Ruiz, 2019).

According to Ruiz (2019), the following steps must be followed to carry out the chi-square test:

  1. Identify the variables to be associated and their types (nominal or ordinal). There can be two nominal variables, one nominal and one ordinal variable, or two ordinal variables.
  2. Establish the type of relationship you want to measure. The variables can be related in two ways:
    • Symmetry measures. They measure the relationship between the variables.
    • Directionality measures. They measure the dependency relationship between two variables (one dependent and one independent).

      With at least one nominal variable, the intensity of the relationship can be measured. With two ordinal variables, both the intensity of the relationship and its direction can be measured.

  3. Identify the type of test to be performed.
    • Symmetric measures. With two nominal variables, or at least one nominal variable, a scale from 0 to 1 gives the intensity of the relationship (not dependence), once an association has been established from the p-value (p ≤ 0.05 or p ≤ 0.01, as previously defined). The tests available are Phi, Cramer's V, and the contingency coefficient. With two ordinal variables, a scale from -1 to +1 is used and the tests are Gamma, Kendall's tau-b, and Kendall's tau-c (see Table 2).
    • Directionality measures. With at least one nominal variable, a scale from 0 to 1 gives the degree of dependence, and the Lambda test is used. With two ordinal variables, a scale from -1 to +1 is used with Somers' D.
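The three symmetric measures for nominal variables can all be derived from the chi-square statistic. The sketch below uses their standard textbook definitions; the function names and the example values (a statistic of 4.0 from 100 observations in a 2 × 2 table) are assumptions made for illustration, not figures from the sources cited here.

```python
import math


def phi_coefficient(chi2, n):
    """Phi: sqrt(chi2 / n). Normally reported for 2 x 2 tables."""
    return math.sqrt(chi2 / n)


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V: sqrt(chi2 / (n * (k - 1))), with k the smaller table dimension."""
    k = min(n_rows, n_cols)
    return math.sqrt(chi2 / (n * (k - 1)))


def contingency_coefficient(chi2, n):
    """Pearson's contingency coefficient: sqrt(chi2 / (chi2 + n))."""
    return math.sqrt(chi2 / (chi2 + n))


# Hypothetical inputs: chi-square of 4.0 from n = 100 cases in a 2 x 2 table.
chi2, n = 4.0, 100
print(round(phi_coefficient(chi2, n), 3))          # 0.2
print(round(cramers_v(chi2, n, 2, 2), 3))          # 0.2
print(round(contingency_coefficient(chi2, n), 3))  # 0.196
```

For a 2 × 2 table, Phi and Cramer's V coincide, which is why the first two values match.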

Table 2. Association Measures for Nominal Variables

Note: extracted from Ruiz, CC (October 12, 2019). Chi block and association measures. Retrieved March 31, 2020, from

Table 3. Type of association (Cramer's V)

Table 4. Measures of Association for Ordinal variables

Note. Statis Geek. (June 22, 2019). Chi square. Retrieved on April 2, 2020, from
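As an illustration of how a measure for two ordinal variables yields a value between -1 and +1, the sketch below computes Gamma (Goodman and Kruskal's gamma) from concordant and discordant pairs. It assumes that the rows and columns of the table are already sorted from the lowest to the highest category; the function name and the counts are hypothetical.

```python
def goodman_kruskal_gamma(table):
    """Gamma = (C - D) / (C + D), with C concordant and D discordant pairs.

    Assumes rows and columns are ordered from lowest to highest category.
    """
    n_rows, n_cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            # Pair cell (i, j) with every cell in a strictly higher row.
            for k in range(i + 1, n_rows):
                for m in range(n_cols):
                    if m > j:    # higher on both variables: concordant
                        concordant += table[i][j] * table[k][m]
                    elif m < j:  # higher on one, lower on the other: discordant
                        discordant += table[i][j] * table[k][m]
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical counts: rows = stress level (low, high),
# columns = career level (low, high).
table = [
    [10, 5],
    [5, 10],
]
print(goodman_kruskal_gamma(table))  # 0.6
```

A positive value indicates a direct relationship and a negative value an inverse one; pairs tied on either variable are ignored by this measure.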

Using Chi Square

This test is used when you want to establish whether or not there is an association between categorical variables, such as nominal and ordinal variables.

The test contrasts one qualitative variable against another qualitative variable. It relies on the chi-square distribution to obtain a p-value.

In the chi-square test, the null hypothesis (H0) is expressed in terms of independence, while the alternative or research hypothesis is expressed in terms of dependence or association.
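To show how the chi-square distribution turns a statistic into a p-value, here is a minimal sketch that uses only the Python standard library. It relies on the closed-form upper tail of the distribution for integer degrees of freedom; the function name `chi2_p_value` and the example inputs are assumptions made for illustration. In practice a statistical package reports this value directly.

```python
import math


def chi2_p_value(statistic, df):
    """Upper-tail probability P(X >= statistic) for a chi-square with df >= 1."""
    if statistic <= 0:
        return 1.0
    x = statistic / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-x)              # exact tail for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(x))   # exact tail for df = 1
    # Each pass adds two degrees of freedom:
    # Q(a + 1, x) = Q(a, x) + x^a * e^(-x) / Gamma(a + 1)
    while a < df / 2.0:
        q += math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))
        a += 1.0
    return q


# Hypothetical example: statistic of 4.0 from a 2 x 2 table,
# so df = (2 - 1) * (2 - 1) = 1.
p = chi2_p_value(4.0, 1)
print(round(p, 4))  # 0.0455
```

With this p-value, the null hypothesis of independence would be rejected at the 5% level but not at the 1% level.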

Assumptions of the Chi Square Test

For the chi-square test, at least 80% of the expected frequencies must be 5 or greater. For a 2 × 2 table, for example sex (man, woman) by type of school (public, private), all expected values must be 5 or greater. If any expected value in such a table is below 5, Fisher's exact test should be used instead.
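These assumptions can be checked mechanically before running the test. The sketch below encodes the rules as stated in this note (all expected counts at least 5 in a 2 × 2 table, fewer than 20% of expected counts below 5 otherwise); the function name `choose_test` and the example tables are hypothetical.

```python
def choose_test(observed):
    """Return 'chi-square' or 'fisher' according to the expected-frequency rules."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    expected = [r * c / n for r in row_totals for c in col_totals]
    small = sum(1 for e in expected if e < 5)

    is_2x2 = len(row_totals) == 2 and len(col_totals) == 2
    if is_2x2:
        # 2 x 2 table: every expected count must be at least 5.
        return "fisher" if small > 0 else "chi-square"
    # Larger tables: 20% or more of the cells below 5 rules out chi-square.
    # Integer comparison (small / cells >= 0.20) avoids rounding issues.
    return "fisher" if small * 5 >= len(expected) else "chi-square"


# Hypothetical tables of observed counts.
print(choose_test([[20, 30], [30, 20]]))  # chi-square (all expected counts are 25)
print(choose_test([[3, 2], [2, 8]]))      # fisher (smallest expected count is about 1.67)
```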

Chi Square Procedure

  1. Compute the statistic with the appropriate formula or with a statistical package such as SPSS or Minitab.
  2. Interpret the results according to the hypothesis and the level of statistical significance chosen. At a significance level of 5%, the p-value must be less than 0.05 to reject the null hypothesis. At 1%, it must be less than 0.01.
  3. If 20% or more of the expected values are less than 5, do not use the chi-square test; use Fisher's exact test instead.
  4. If the table is 2 × 2 and has any expected frequency less than 5, use Fisher's exact test.
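When the procedure falls back on Fisher's exact test for a 2 × 2 table, the p-value comes from the hypergeometric distribution rather than from the chi-square distribution. The sketch below is one way to compute it, under the assumption of the common two-sided convention: sum the probabilities of all tables with the same margins that are no more likely than the observed one. The function name and the counts are hypothetical.

```python
from math import comb


def fisher_exact_2x2(table):
    """Two-sided Fisher's exact p-value for a 2 x 2 table of counts."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # The small relative tolerance keeps ties with the observed table.
    total = sum(
        prob(k)
        for k in range(lowest, highest + 1)
        if prob(k) <= p_observed * (1 + 1e-9)
    )
    return min(total, 1.0)


# Hypothetical small-sample table where expected counts fall below 5.
p = fisher_exact_2x2([[3, 1], [1, 3]])
print(round(p, 4))  # 0.4857
```

The resulting p-value is compared with the chosen significance level in the same way as for the chi-square test.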

We leave you with a couple of videos (which served as sources for this technical note) where you can learn more about the chi-square test.

References

  • Ruiz, CC (October 12, 2019). Chi block and association measures. Retrieved on March 31, 2020, from https://www.youtube.com/watch?v=cyRAxn5NbD4&t=106s
  • Statis Geek. (June 22, 2019). Chi square. Retrieved on April 2, 2020, from