Read on to understand. 10 common mistakes in multiple regression analysis | Marketing and multiple regression analysis - Part 3


An essential skill for marketers: let's try using multiple regression analysis to analyze promotions. We will learn the basics of multiple regression analysis over five sessions. This article summarizes 10 common patterns of failure in multiple regression analysis, divided into three categories: problems with combination, structure, and data.

So far, starting with "Learning with diagrams makes it easier to understand! | Marketing and Multiple Regression Analysis - Part 1," we have explained the basics of multiple regression analysis. When you actually try to carry out an analysis, however, you may find that, until you get used to it, you do not get good results for a variety of reasons.

In Marketing and Multiple Regression Analysis - Part 3, we have summarized 10 common mistakes that occur in multiple regression analysis. You can check it when you have a problem in your work, or you can read it as a preparation. You will be able to intuitively grasp "what is happening" and quickly and correctly determine "how to deal with it."

Combination Problems

Too few elements

A typical failure pattern is incorporating too few factors into the model, so that factors needed to explain the outcome are missing.

In this case, you need to think again about what elements have not been incorporated. However, sometimes the elements that should be incorporated are actually difficult (or impossible) to express as "data," so analysis does not necessarily reveal everything. The trick is to find elements that are easy to collect as data.

Too many elements

Conversely, if you include too many factors, you will not get good analytical results. So what is the appropriate number of factors? This depends on the case, and there is no clear academic answer. As a guideline, many studies in the academic field limit the number of factors to "about seven." Please use this as a reference.
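One common way to check whether an added factor is pulling its weight is the adjusted coefficient of determination, which penalizes R² for every extra factor. The sketch below is only an illustration: the function name `adjusted_r2` and the figures (24 months of data, R² of 0.80 and 0.82) are assumptions made up for this example, not values from an actual analysis.

```python
def adjusted_r2(r2: float, n_obs: int, n_factors: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n_obs - n_factors - 1 <= 0:
        raise ValueError("need more observations than factors + 1")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_factors - 1)


# Hypothetical figures: 24 monthly observations, two candidate models.
small_model = adjusted_r2(r2=0.80, n_obs=24, n_factors=3)
large_model = adjusted_r2(r2=0.82, n_obs=24, n_factors=10)

print(f"3 factors:  adjusted R^2 = {small_model:.3f}")
print(f"10 factors: adjusted R^2 = {large_model:.3f}")
```

In this made-up case the larger model has the higher raw R² but the lower adjusted R², which suggests that the seven extra factors add little explanatory power.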

Elements are highly correlated (multicollinearity)

When elements are highly correlated, analytical results can be calculated but will be extremely unreliable.

For example, if you include the factors "number of days with rain" and "monthly precipitation," multicollinearity will occur.

In other words, overlapping similar elements makes it impossible to separate their individual effects. In this case, you can solve the problem by removing one of the elements.
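A quick way to spot this kind of overlap is to look at the correlation between the two factors and the variance inflation factor (VIF) derived from it. The sketch below uses made-up monthly figures for rainy days and precipitation; the helper `pearson` is written here for illustration, and the threshold of 10 is a widely used rule of thumb rather than a strict limit. With exactly two factors, VIF reduces to 1 / (1 - r²).

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: six months of weather figures.
rainy_days = [5, 8, 12, 15, 10, 3]
precipitation_mm = [50, 85, 130, 160, 95, 35]

r = pearson(rainy_days, precipitation_mm)
# For two factors, regressing one on the other gives R^2 = r^2.
vif = 1.0 / (1.0 - r ** 2)

print(f"correlation = {r:.3f}, VIF = {vif:.1f}")
if vif > 10:  # common rule of thumb, not a hard rule
    print("The two factors overlap heavily; consider dropping one.")
```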

The outcome is not narrowed down

When conducting analysis, it can sometimes be difficult to decide what should be treated as the outcome.

In the business world, there are various ways to set outcomes, such as "sales," "number of visitors," "number of groups attending," "number of food orders," etc., and each may provide different insights.

In this case, you might start by narrowing down the outcomes to those most closely tied to your business goals, and/or run multiple analyses and compare the results with each other.

Structural issues

Spurious correlation

For example, suppose there is a correlation between "increased beer sales and increased air conditioner sales."

In this case, it is more appropriate to think that two separate relationships overlap, "When the temperature rises, more people buy beer" and "When the temperature rises, more people buy air conditioners," rather than "People who buy beer also buy air conditioners." Causation and correlation are often confused, not only in analyses but also in survey results, so it pays to be careful.
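The beer and air conditioner example can be sketched with a partial correlation, which measures how strongly two variables move together once a third variable is held fixed. All numbers below are made up for illustration, and the noise terms are chosen so that beer and air conditioner sales are linked only through temperature; `pearson` and `partial_corr` are helper names introduced for this sketch.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical data: both sales series depend on temperature only.
temperature = [20, 22, 24, 26, 28, 30, 32, 34]
beer_noise = [1, -1, -1, 1, 1, -1, -1, 1]
aircon_noise = [1, -1, 1, -1, -1, 1, -1, 1]
beer_sales = [2 * t + 3 * e for t, e in zip(temperature, beer_noise)]
aircon_sales = [t + 2 * e for t, e in zip(temperature, aircon_noise)]

raw = pearson(beer_sales, aircon_sales)
controlled = partial_corr(beer_sales, aircon_sales, temperature)

print(f"beer vs. air conditioners:         {raw:.3f}")
print(f"same, with temperature held fixed: {controlled:.3f}")
```

The raw correlation is high, but it disappears once temperature is taken into account, which is the signature of a spurious correlation.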

Multiple connection structures are mixed

If multiple connections are mixed together in one analysis, the results will be less accurate and harder to interpret. However, in reality, such multi-layered relationships are quite common.

In practice, therefore, the approach often involves trying to identify one or two key factors that have a large impact on outcomes and running short cycles of hypothesis testing.

The time lag is too large

In multiple regression analysis, it is very difficult to analyze things that take effect over a long period of time. An example is the relationship between "TV commercials increasing brand awareness" and "sales."

In practice, even if multiple regression analysis shows that there is no relationship, it is necessary to consider the possibility that the time lag is so large that the results may appear to be unrelated at first glance.
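One way to probe for a delayed effect is to shift the outcome by a few periods and compare the correlation at each lag. The sketch below uses made-up data in which sales respond to ad spend exactly two periods later; the function names `pearson` and `lagged_corr` and all figures are assumptions for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def lagged_corr(x, y, lag):
    """Correlate x at period t with y at period t + lag."""
    return pearson(x[:len(x) - lag], y[lag:])


# Hypothetical data: sales react to ad spend two periods later.
ad_spend = [0, 0, 10, 0, 0, 0, 10, 0, 0, 0, 0, 10, 0, 0, 0, 0]
sales = [100 + 5 * (ad_spend[t - 2] if t >= 2 else 0)
         for t in range(len(ad_spend))]

corr_by_lag = {lag: lagged_corr(ad_spend, sales, lag) for lag in range(4)}
best_lag = max(corr_by_lag, key=corr_by_lag.get)

for lag, corr in corr_by_lag.items():
    print(f"lag {lag}: correlation = {corr:+.3f}")
print(f"strongest relationship at lag {best_lag}")
```

With no lag the two series look unrelated (the correlation is even negative), while shifting sales by two periods reveals the relationship. Real data is much noisier, and very long lags remain hard to detect.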

Special factors influence the results

There may be cases where results are significantly influenced locally by extremely unusual things that happen "by chance."

In this case, we either remove the "outliers" from the data being analyzed, or represent the exceptional events themselves as data and include them in the analysis.
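The effect of a single exceptional event can be sketched by fitting a simple regression with and without the affected period. The data below is made up: sales follow ad spend exactly, except for one month in which a hypothetical special event added 30 units. The helper `ols_slope` and the flag list `special_event` are names introduced for this example, and only the first option (removing the outlier) is shown.

```python
def ols_slope(x, y):
    """Slope of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Hypothetical data: sales = 10 + 2 * ad_spend, plus a one-off event in month 4.
ad_spend = [1, 2, 3, 4, 5, 6]
sales = [12, 14, 16, 48, 20, 22]
special_event = [False, False, False, True, False, False]

slope_all = ols_slope(ad_spend, sales)

# Drop the periods flagged as exceptional before fitting again.
kept = [(x, y) for x, y, flag in zip(ad_spend, sales, special_event) if not flag]
slope_clean = ols_slope([x for x, _ in kept], [y for _, y in kept])

print(f"slope with the outlier:    {slope_all:.3f}")
print(f"slope without the outlier: {slope_clean:.3f}")
```

A single unusual month is enough to inflate the estimated effect of ad spend. Whether to drop such a period or model it explicitly is a judgment call that depends on whether the event could recur.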

Data issues

Results cannot be turned into data

Obviously, anything that cannot be expressed in numbers cannot be analyzed.

In practice, this can happen when numerical values are obtained extremely rarely, as with "the number of orders won in business negotiations where only one deal closes every three years." In such cases, we deal with the situation by using alternative data in place of the actual results.

The data is inaccurate

All too often, results are not visible because the data collected is of poor quality.

In the business world, you may decide that "trying something is better than doing nothing," and the results of that first attempt may help you identify the next steps to take in your analysis.

So trying is not a bad thing in itself. However, you should not let the premise that this was only a trial be forgotten while inaccurate results take on a life of their own. Be very careful when handling the results of your analysis.

Analysis depends on "hypothesis" and "interpretation"

I have introduced 10 patterns above, and vague expressions such as "depending on interpretation" and "depending on the circumstances" appear in them many times. The analysis method itself is established logically and on a mathematical basis.

However, how to use it and how to apply the results to reality are largely left to the analyst. To conduct a quality analysis, the analyst needs both a quality "hypothesis" based on insight and a multifaceted "interpretation" of the results obtained.

If there is only one analyst, bias may creep in and the perspective may become narrow. Marketing Netacho therefore recommends that you do not analyze alone, but rather work together as a team.

By considering things from various perspectives, you can set hypotheses and read (interpret) the analysis more accurately.

Next time, let's try multiple regression analysis in Excel.

The explanation up to this point is, so to speak, the "classroom" part.

Now you will want to try out multiple regression analysis using the knowledge you have learned. In the next lesson, we will show you how to perform multiple regression analysis using Excel and how to interpret the results.
