
Mistakes in market research

Anonymous

The essential characteristics of information that supports decision-making are relevance, timeliness, reliability and accuracy.

Relevance refers to the degree to which information can reduce the decision-maker's uncertainty about the possible outcome of choosing one course of action over another.

Information should help predict something that will happen, anticipate the outcome of a decision. This requirement leads to seeking information about the variables (causes) that affect the result (effect). If there is no such relationship, the information is irrelevant.

Timeliness refers to the moment when the information must be available: that is, before making a decision. This characteristic, which may seem obvious, has more to do with when the need for the information arises than with when it must be delivered.

That is, you have to be informed before making a decision, but how far in advance the relevant information is specified determines how much pressure there is to obtain it. That difference in time has a direct impact on the quality and cost of the information. Sometimes the decision will have to be made without information.

Making a decision is similar to placing a bet. To predict the outcome of a boxing match, the bettor tries to reduce his uncertainty with relevant information available before the fight: for example, the weight of the contenders, their height and reach, and the results of their previous fights. It is useless to know data such as the names of their siblings or the brand of soap they use; such facts simply do not reduce the uncertainty about the outcome of the fight.

Information gains value as it reduces uncertainty. Its cost also rises in a direct, though not proportional, relationship. In very practical terms, one could toss a coin in the air to choose which boxer to bet on. Alternatively, one could use the coin to buy a newspaper and learn some details from the sports section.

This means that decisions can be made with or without information, especially in those cases in which the information will not modify the previous position of the decision maker.

If the information has no chance of changing the decision, it is not worth obtaining. Nor is it worth the effort when its cost is too high, or when it reduces uncertainty so little that it does not help predict the expected outcome of the decision with sufficient certainty.

In the case of Market Research, the cost of the information is normally much less than the value added to decision-making. It is better to make decisions with information, even if it is not perfect information.

Accuracy and reliability lead to accurate and truthful information, which makes it credible. Both characteristics are mainly related to the determination of the sources and the way in which the information is obtained from them.

In the case of a study of people, these characteristics have to do with the process of selecting those who become part of the sample and with the design of the instrument used to obtain information from them.

The determination of the sample size and the selection process of the sample are the only two aspects of the market research process for which it is possible to quantify errors. For the other aspects, only procedures and standards can be established that, if followed, reduce the possibility that errors will occur.

An excellent example of the way to reduce errors in Market Research is the Service Standard for Market Research in Mexico (ESIMM) established since 2000 by the Mexican Association of Market Research Agencies (AMAI).

The standard defines the basic elements of quality that market research companies must possess and implements documented procedures to ensure that quality is repeatable and results in client satisfaction. This very important effort, the first of its kind in the world, to professionalize and self-regulate the activity does much to increase the credibility of study results.

Its implementation translates mainly into greater reliability and accuracy, though not so much into greater relevance and timeliness. These two characteristics do not result exclusively from the contribution of the agency conducting the study, since they depend to a very high degree on the participation of the client who requests it.

The relevance of the information and the timeliness with which it is obtained depend mainly on the user requesting it. Therefore, the most important element to ensure both characteristics is the adequate formulation of a Study Request by the user and a Study Proposal by the information provider.

Both the quality of the information and the quality of the relationship between client and agency depend on the collaborative work of both at the beginning of the market study.

There are no magic formulas, much less standardized procedures, just hard work. If the client wants to get more value from the research department or agency, or if the agency wants to give more quality to its clients, they must collaborate intensively at the beginning of the project. It is more work, of course. But the added value is surprising.

Another valuable aspect of the research process that improves the relevance of information for decision-making lies in the analysis and interpretation of the study results. Beyond the preparation of reports that communicate findings, it requires not only proper handling of appropriate statistical techniques, the use of which should be foreseen from the beginning of the project, but also an acute point of view capable of finding the true meaning that these findings hold for the decision to be made.

The ESIMM encourages the accuracy and reliability of the information from market studies, but it requires a different effort than its procedures to give it relevance and timeliness.

If the client wants to get more value from their agency or market research department and if the agency wants to give more quality to their clients, both should collaborate intensively at the start of a study.

The best starting point is a good 'study request' by the client that leads to a better 'study proposal' by the agency. And do not forget that price negotiation must be balanced with time negotiation, since these two factors greatly influence the quality of the information.

The ESIMM (Service Standard for Market Research in Mexico) established since 2000 by the Mexican Association of Market Research Agencies (AMAI) means an effort to standardize the studies carried out by the agencies.

However, this initiative is oriented more toward the satisfaction of the research client than toward the quality of the studies, and it also leaves aside the quality requirements that the agency itself must impose on the client.

Indeed, there are quality requirements that the supplier must demand of his client. The two most important have to do with the delivery of complete information and a timely and fair payment.

The initial collaboration between client and agency leads to a study that obtains truly relevant information and is carried out in the necessary time and at the right price. This is the true utility of tools like the study request and the study proposal.

Thus, possible differences between client-satisfaction requirements and the methodological strength of the study are corrected, as are possible differences between what the agency considers to be the information the client needs and what the client considers truly relevant to the decision.

In other words, since the client is not necessarily an expert in market research, nor the agency necessarily an expert in business decision-making, both must collaborate intensively at the beginning of the project.

Collaboration can extend beyond project design, as joint decisions are sometimes required about the sample, the measurement tool, the analysis, and the presentation of results.

Procedurally, the ESIMM provides for this possibility and requires documenting such decisions with evidence and with the authorization of either party, client or agency. Curiously, this practice has long been the accepted norm in relations between client and advertising agency, but not among market research agencies.

In the strict application of the ESIMM, care must be taken with the risk of a bureaucratization that inhibits creativity, as well as the risk of being governed more by client-satisfaction requirements than by the methodological and academic soundness of market research.

The relevance and timeliness of study information is greatly increased thanks to the collaborative participation of the client and the agency at the start of the study.

On the other hand, the reliability and accuracy of the information may be affected by errors that are made during the study and that are mainly related to the design of questionnaires, the sampling procedure, and the analysis of results and their interpretation.

In a quantitative study, good questionnaire design is more the result of applying common sense than of any other technique. The only proven procedure for reducing errors is to test the questionnaire with people of the same profile as those who will make up the sample.

Prior to this, what is required is close contact with people of the same profile, to know the language they use, the way they express themselves and, even more importantly, the answers they give to the questions posed to them.

Many errors that can happen around the questionnaire are foreseeable and, consequently, avoidable through good procedural control. The only unavoidable error is the one that results from people who refuse to answer the survey. Moreover, it is an error that cannot be evaluated, since it will never be possible to know what those people would have answered.

The sampling procedure is viewed with suspicion at best and is therefore subject to countless criticisms, both among laymen and among experts, since even the latter sometimes lack the correct foundations.

The doubts typically have to do with determining the sample size, when in reality the errors in practice come mainly from the method of selecting the interviewees.

The sample size, in most studies that require probabilistic sampling, is normally greater than the minimum necessary to make inferences about the characteristics of the study population. This is the result of the need to include interviewee quotas of such a size that they allow cross-information analysis to be carried out later.

True sampling errors occur when there is no systematic and objective procedure for selecting which interviewees to include. The desire to keep study costs low leads, in the absence of a sampling frame, to convenience-based procedures for selecting interviewees.

That the sampling is not probabilistic only means that the results of the study cannot be extrapolated from the sample to the population. The information may be relevant and timely and as such sufficient to make a decision. However, it is impossible to determine its reliability and accuracy.

Finally, in the analysis of results and their interpretation, errors arise from the handling of the information obtained in the study, both in form and substance. That is, sometimes the numerical, statistical handling of the information is inadequate, while at other times the interpretation of the results causes the error and leads to conclusions that are not appropriate.

As noted earlier, the determination of the sample size and the process of selecting the sample are the only two aspects of the market research process for which errors can be quantified. That quantification extends to the analysis of the results of a descriptive study.

The accuracy and reliability of the information obtained from a descriptive study depend mainly on the sampling procedure, which includes both the determination of the sample size and the method of selecting the interviewees.

Empirically, the sampling procedure is based on the intuition that it is valid to draw general conclusions about all the elements of a group, based on the knowledge of only a part of the elements of that population.

In daily life, it is a practice that extends to many areas, since we generalize judgments about people, products, services, weather conditions and countless situations based on samples as small as a brief look at the attendance of a nightclub ("the place has a great atmosphere") or the observation of a single event ("the food here is very tasty").

Theoretically, sampling is based on induction and subjects these conjectures to a probabilistic evaluation in order to determine their degree of approximation to reality. In other words, it allows us to know or estimate the size of an error derived from the sampling procedure.

Suppose the following numbers of Internet Connection Hours for a population of 5 families:

Family   F1   F2   F3   F4   F5
Hours     6    5    7    6    9

The Average Connection Hours is 6.60, with a Standard Deviation of 1.52, which is a measure of its variability. In other words, it is true that the population connects on average a little over six and a half hours, although that average varies by about an hour and a half for each family.
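As a quick check, these population figures can be reproduced with Python's standard library (a minimal sketch using the data from the table above):

```python
import statistics

# Weekly Internet connection hours for the five families F1..F5
hours = [6, 5, 7, 6, 9]

# statistics.stdev uses the n-1 denominator, which matches the 1.52 figure
print(f"Average connection hours: {statistics.mean(hours):.2f}")   # 6.60
print(f"Standard deviation: {statistics.stdev(hours):.2f}")        # 1.52
```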

Although in practice only one sample is used, if all possible samples of size 2 are considered, it can be seen that each of the 10 possible samples yields a different estimate of the average connection hours, so the estimate depends on which sample is selected:

Sample    Mean
F1, F2    5.5
F1, F3    6.5
F1, F4    6.0
F1, F5    7.5
F2, F3    6.0
F2, F4    5.5
F2, F5    7.0
F3, F4    6.5
F3, F5    8.0
F4, F5    7.5

The sampling procedure can be trusted because the average of all 10 possible sample means is also 6.60, the true population mean.

An error derived from sampling comes from the selection of the sample, since as seen in the previous example, the estimation of an average of Connection Hours depends on which elements become part of the sample.

The standard deviation of these 10 possible sample means, now called the standard error, is 0.83 connection hours. It defines an interval of plus or minus one standard error around the true mean, that is, from 5.77 to 7.43 hours.

In other words, 5 of the 10 possible samples (those with means of 6.0, 6.0, 6.5, 6.5 and 7.0) fall within a range equivalent to one standard error around the true mean. With larger samples and populations, roughly 68% of all possible sample means would fall within such a range.

Thus, the reliability of sampling can be expressed in terms of the probability that any given sample will yield a result within a specific interval.
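The full sampling distribution can be enumerated directly. A short sketch using only the standard library, where `itertools.combinations` generates every possible pair of families:

```python
import itertools
import statistics

hours = [6, 5, 7, 6, 9]   # families F1..F5

# Every possible sample of size 2 and its estimated mean
sample_means = [statistics.mean(pair)
                for pair in itertools.combinations(hours, 2)]

print(len(sample_means))                          # 10 possible samples
print(round(statistics.mean(sample_means), 2))    # 6.6, the true population mean
print(round(statistics.pstdev(sample_means), 2))  # 0.83, the standard error
```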

As the sample size increases, each sample represents the population better. In this example, if the sample were of size 4, the sample formed by F1, F2, F4 and F5 would estimate an average of 6.5 connection hours, and the range of estimates across all possible samples would be considerably smaller.
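The same enumeration for samples of size 4 shows how much the spread of the estimates shrinks:

```python
import itertools

hours = [6, 5, 7, 6, 9]   # families F1..F5

# Every possible sample of size 4 and its estimated mean
means4 = [sum(s) / len(s) for s in itertools.combinations(hours, 4)]

print(sorted(means4))   # [6.0, 6.5, 6.75, 6.75, 7.0]

import statistics
print(round(statistics.pstdev(means4), 2))   # 0.34, versus 0.83 for samples of size 2
```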

The sample size is mainly related to the variability of the characteristic of that population to be studied.

Thus, a population of infinite size could be perfectly represented by a very small sample, provided that its characteristics are homogeneous.

Take the example of the swimming pool: to find out the water temperature, a person dips only the tip of a toe, and only at the edge. From this minuscule sample, it is possible to make a decision regarding the entire pool. In fact, if the water temperature seems pleasant, that person will probably invite everyone else to get in completely, not just a toe and not just at the edge.

Thus, the two relevant components for determining the sample size are the reliability of the sample representing the population (expressed in standard error units) and the precision with which you want to make an estimate.

The size of the study population has nothing to do with determining the sample size. Proof of this is that the formula to determine the sample size does NOT include the population size.
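A sketch of the textbook sample-size formula for estimating a mean illustrates the point; the z, standard deviation, and precision values below are illustrative choices, not figures from the article. Note that the population size N appears nowhere in the formula:

```python
import math

def sample_size(z: float, sigma: float, error: float) -> int:
    """Minimum n to estimate a mean: n = (z * sigma / error)^2.
    z: reliability coefficient (e.g. 1.96 for 95%); sigma: assumed
    standard deviation; error: desired precision of the estimate.
    The population size N does not appear anywhere."""
    return math.ceil((z * sigma / error) ** 2)

# 95% reliability, assumed std dev of 1.52 hours, precision of +/- 0.25 hours
print(sample_size(1.96, 1.52, 0.25))   # 143
```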

However, mainly to meet analysis requirements, larger sample sizes than the theoretical minimum tend to be used. That is, enough observations (surveys) are wanted within each analysis cell that results when comparing information between groups of respondents according to their classification data or other responses obtained during the study.

Errors also arise from the numerical handling of the information obtained in the study, in both form and substance. That is, sometimes its interpretation is what causes the error and leads to conclusions that are not appropriate.

In a very special way, errors are made when presenting for comparison means or incidences of responses that different groups of interviewees give to the same question that has been asked in identical terms.

Without entering into a detailed discussion of complicated formulas, it is possible to establish some concepts that clarify what lies behind a statistical comparison.

The appropriate method for comparing the means of a numerical variable between two or more groups of subjects, identified in turn by the values of a nominal or ordinal variable, is the Analysis of Variance.

It involves calculating Fisher's F value, defined as the variance between the sample means divided by the variance within the samples. The F value, or rather its probability of occurrence, tells us whether the differences between two means are significant and, therefore, whether the means of the groups are or are not statistically equal.

Conceptually, it is easier to understand with a simple example. Let's say that the average hours of Internet connection in two groups of families from two different socioeconomic levels are 8 and 10 hours a week, respectively. Those two numbers could be different or equal, statistically speaking.

Why? Mainly due to the variance of responses within each group. Let's look at the following hypothetical possibilities.

Case 1. If the average of the first group of families comes exclusively from 8 Connection Hours of each one of them and the average of the second group of families comes exclusively from 10 Connection Hours of each one of them, there is a high probability that the averages are different from each other. The families of the first group are connected for 8 hours, each and every one of them, while the second 10, also homogeneously.

The absence of variability within each group of families leads us to think that the average of Connection Hours is a measure that represents the group very well (homogeneously). Additionally, since the means of the two groups are different, it is thought that the 8 hours of the first group are different from the 10 hours of the second group.

Case 2. If the mean equal to 8 in the first group comes from responses that vary between, say, 3 and 16 hours; and the mean equal to 10 comes from a range of 2 to 18 hours, the most probable thing is that statistically both numbers, 8 and 10, should be considered equal.

In other words, the variability within each group of families leads us to believe that their averages are not a sufficiently representative measure of the families that comprise them. Thus, the average number of 8 Connection Hours can actually be as low as 3 or as high as 16; while the number 10 Connection Hours varies practically within the same range.

If the numbers do not represent the group well, one consequence is that it can hardly be considered that the average 8 and 10 Connection Hours are actually different figures.

Case 3. Alternatively, two arithmetically identical means could be considered statistically different. In the same example, 8 Connection Hours on average in the two groups could be considered significantly different if they have different variability.

Let's say that one of them results from the individual consumption of 7, 8 or 9 hours by each family (a very homogeneous distribution of responses) and the other results from consumption of between 2 and 20 hours per week. Although its mean is also 8, the variance of responses is so wide that it could hardly be said that 8 adequately represents the second group of families. Therefore, the first 8 (homogeneous) is different from the second 8 (heterogeneous).
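These cases can be checked numerically. Below is a minimal one-way ANOVA F sketch; the group data are hypothetical, chosen only to mimic the homogeneous and heterogeneous situations described above:

```python
import statistics

def f_statistic(*groups):
    """One-way ANOVA F: variance between group means over variance within groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    k, n = len(groups), len(all_values)
    group_means = [statistics.mean(g) for g in groups]
    # Between-group mean square (df = k - 1)
    ms_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means)) / (k - 1)
    # Within-group mean square (df = n - k)
    ms_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g) / (n - k)
    return ms_between / ms_within

# Nearly homogeneous groups with means 8 and 10: large F, means truly differ
print(f_statistic([8, 8, 9, 7, 8], [10, 10, 11, 9, 10]))   # 20.0
# Widely spread groups with the same means 8 and 10: F below 1, means "equal"
print(f_statistic([3, 6, 8, 10, 13], [2, 6, 10, 14, 18]))  # ~0.37
```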

The F value is calculated considering the variance within each group and the variance between the groups. What is relevant is not the value itself, but the probability of obtaining it. Hence the concept of statistical significance.

The probability of obtaining a given value of F in a random distribution leads us to consider whether that value is large enough to conclude that it did not occur by chance, but rather derives from real differences between the compared groups.

Since the resulting value is also influenced by the characteristics of the sampling procedure, its probability of occurrence is compared with the reliability percentage with which the sample size was determined.

Thus, for a sample with 95% reliability, an F value will be considered statistically significant if it would occur by chance in at most 5% of cases.

When the answers to a question are given not in numerical terms but as nominal or ordinal responses, means cannot and should not be calculated; instead, response incidences (proportions) must be used.

To evaluate whether two response incidences are the same or different between two groups of interviewees, the Chi-square value is calculated. By definition, it is the sum, over all cells, of the square of the difference between the observed and expected frequencies, divided by the expected frequency.

The Chi square value is zero when the differences between the observed and expected frequencies are zero, that is, they agree.

As the number and importance of the differences between the frequencies increases, the Chi square value will also increase, as a measure of discrepancy between them.
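This definition is simple enough to sketch directly; the frequency counts below are made-up illustrations, not study data:

```python
def chi_square(observed, expected):
    """Chi-square statistic: sum over cells of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Observed frequencies that match the expected ones: no discrepancy
print(chi_square([25, 25, 25, 25], [25, 25, 25, 25]))   # 0.0
# Larger differences between observed and expected: larger chi-square value
print(chi_square([40, 10, 30, 20], [25, 25, 25, 25]))   # 20.0
```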

A high Chi-square value can be obtained by chance (due to characteristics related to sample size and selection), or because the two sets of elements really differ from each other. Again, the concept of statistical significance applies, in combination with the reliability percentage of the sample.

When comparing two figures, it is very important to consider that their arithmetic difference is not the same as their statistical difference, since the latter is determined by the distribution of their variance.
