Best practices for data collection: What we learned from working with clients

The previous report covered "The Importance of Visualizing Marketing Effects." In this report, we discuss the process of "data collection," which is essential for visualizing those effects.
In the world of data analysis, there is a famous saying, "Garbage in, garbage out" (if you put in flawed data, you'll get meaningless output), so it is important to pay attention to the quality of the data you input.
On the other hand, there are probably many companies that are so concerned about the quality of the data required that they cannot justify the cost-effectiveness of collecting it, and so have not yet taken the first step into data-driven marketing or visualizing its effects.
In this report, we introduce the key points of "data collection," with specific examples, based on the knowledge XICA has gained from working closely with customers on it.
Marketing Data
In recent years, as the importance of data accumulation and data analysis has been emphasized, the number of services for accumulating and referencing data has continued to increase. However, the know-how for collecting marketing data has not kept pace with the increase in available data. As a result, the "difficulty of collecting data" appears to have become an obstacle to visualizing marketing effects. (*1)
*1) A survey of corporate advertising managers on how to measure advertising effectiveness
Let’s take a moment to examine what types of marketing-related data are typically available.
i. Sales data
This is data related to sales. It is the data most closely tied to your company's business, and if you regard marketing activities as part of the effort to increase sales, it is the first data you should obtain.
Examples of sales data that can be collected:
- Sales amount
- Sales quantity
ii. Advertising Data
This is advertising placement data. We obtain information from advertising agencies about the type of media and creative used, and the volume of advertising placed.
The extent to which these measures were cost-effective is an important indicator, but if an agency cannot track cookie data, it can be difficult to calculate cost-effectiveness.
Examples of advertising data that can be collected:
- Number of impressions
- Clicks
- Number of conversions
- Advertising costs
- Video views
- Number of completed video views
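As a rough illustration of how cost-effectiveness can be derived from the advertising data listed above, the following Python sketch computes a few commonly used ratios. The class name `AdReport`, its field names, and the sample figures are hypothetical. The ratios use their conventional definitions (click-through rate, conversion rate, cost per acquisition, video completion rate) and are not a XICA-specific method.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AdReport:
    """One row of placement data (hypothetical field names)."""
    impressions: int
    clicks: int
    conversions: int
    cost: float
    video_views: int = 0
    video_completions: int = 0


def ratio(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def summarize(report: AdReport) -> dict:
    """Derive conventional efficiency ratios from raw placement counts."""
    return {
        "ctr": ratio(report.clicks, report.impressions),      # clicks per impression
        "cvr": ratio(report.conversions, report.clicks),      # conversions per click
        "cpa": ratio(report.cost, report.conversions),        # cost per conversion
        "video_completion_rate": ratio(report.video_completions, report.video_views),
    }


# Made-up figures, for illustration only.
example = AdReport(impressions=10_000, clicks=200, conversions=10, cost=50_000.0,
                   video_views=400, video_completions=100)
metrics = summarize(example)
print(metrics)
```

Returning `None` for a zero denominator keeps a placement with no clicks or no conversions from being mistaken for one with a ratio of zero.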
iii. Data on company website and company SNS operations
If you operate your own website, you can track the number of organic visitors to the site and their behavior there using log data from tools such as Google Analytics. When you operate social media accounts, you can also obtain the responses to your posts as engagement indicators.
This is an important indicator for measuring increased interest in your company and the effects of improving your content.
On the other hand, it is difficult to determine how much these contribute to a company's sales, and it will be difficult to judge the effectiveness of marketing based solely on data from a company's website and social media accounts.
Examples of company website and company SNS data that can be collected:
- Number of sessions
- Number of page views (PV)
- Engagement indicators for managed social media accounts
  - Number of likes
  - Number of replies
  - Comments
iv. In-store data
In-store data is also important in marketing. Data on in-store product prices, distribution rates, and data related to in-store events and promotions provide important insights for marketing.
On the other hand, the impact that price and distribution rate have on marketing varies greatly depending on market and competitive trends, so it is difficult to visualize and interpret the effects of marketing on a standalone basis.
Examples of in-store data you can collect:
- Store price
- Distribution rate
v. Data linked to personal ID
The development of cookies allows us to track purchasing behavior linked to individual IDs, which allows us to create hypotheses about customer personas and deliver powerful, customized marketing approaches.
Additionally, if you operate an e-commerce site or service site, this data can be used in various ways, such as classifying customer loyalty based on information linked to accounts.
On the other hand, due to growing awareness of privacy in recent years, it has become more difficult to obtain such data, and there is growing emphasis on marketing activities that do not rely on personal ID data.
Examples of data linked to personal ID that can be collected:
- Demographic Data
- CRM Data
- Purchase history
vi. Attitude index (survey) data
Customer survey data. Examples include awareness rate data and customer satisfaction data.
Compared with sales data or advertising data, survey data is more often biased by sample selection and question design, so it requires more care when analyzing.
On the other hand, when visualizing marketing effects, it is very important to ask how much these attitude indicators contribute to sales and how they change depending on how the product is promoted, even though this is quite difficult to visualize.
Examples of attitude index data that can be collected:
- Five-point scale ratings
- Free-text data (customer feedback, etc.)
vii. External Data
While i. to vi. are data related to your own company, externally available data is also an important part of visualizing the effectiveness of your marketing. Examples include government survey data, market trends such as those shown by Google Trends, competitor trends, and, more recently, changes in consumer behavior due to COVID-19.
The number of vendors that collect and store external data has increased significantly in recent years, so the possibilities here are expanding greatly.
Examples of external data that can be collected:
- Statistics Bureau data
- Google Trends
As mentioned above, there is a wide range of data related to marketing.
Depending on the hypothesis you want to answer, you may be able to visualize marketing effectiveness with just a portion of the data, but the larger the issue, the more comprehensive an evaluation will be required.
Next, I will explain the data collection process, including how to select and discard this data.
Data Collection Process
"Data collection" for visualizing marketing effectiveness proceeds through the following process.

Rather than rushing to visualize marketing effects by collecting every kind of data, it is important to follow three steps: ① agree on the purpose of the analysis, ② design a model that matches that purpose, and ③ identify the factors related to the model.
Likewise, rather than looking for some way to analyze the data you happen to have on hand, it is important to work backwards from what you want to clarify, following the process above, and only then start collecting data.
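To make the idea of working backwards more concrete, the three steps can be written down as a small plan from which the list of data to collect is derived. The following Python sketch is only an illustration: `AnalysisPlan`, `collection_checklist`, the factor names, and the data sources are all hypothetical and do not describe XICA's actual tooling.

```python
from dataclasses import dataclass, field


@dataclass
class AnalysisPlan:
    purpose: str                                  # step 1: agreed purpose of the analysis
    target: str                                   # step 2: what the model should explain
    factors: dict = field(default_factory=dict)   # step 3: factor -> data source


def collection_checklist(plan: AnalysisPlan, sources_on_hand: list) -> dict:
    """Compare the sources the plan requires with the sources already held."""
    required = set(plan.factors.values())
    on_hand = set(sources_on_hand)
    return {
        "collect": sorted(required - on_hand),       # needed but not yet held
        "already_have": sorted(required & on_hand),  # needed and held
        "not_needed": sorted(on_hand - required),    # held but not needed for this purpose
    }


# Hypothetical plan, for illustration only.
plan = AnalysisPlan(
    purpose="Decide next quarter's advertising budget split",
    target="weekly_sales_amount",
    factors={
        "tv_ad_volume": "advertising agency report",
        "web_ad_cost": "advertising agency report",
        "store_price": "in-store data",
        "search_interest": "Google Trends",
    },
)

checklist = collection_checklist(plan, ["advertising agency report", "customer survey"])
print(checklist)
```

Starting from the plan rather than from the data on hand makes both gaps visible: sources that still need to be collected, and sources that would cost effort to prepare without serving the agreed purpose.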
As explained in the previous report (*2), the significance of visualizing marketing effects lies in planning for the future and reviewing why goals were or were not achieved.
Once you have involved the relevant departments in steps ① to ③ and have designed a solid analysis model, you can use it for repeated planning and review.
*2) Previous report: Data usage examples for visualizing marketing effects

In other words, once a data collection system is created, it becomes possible to keep a powerful PDCA cycle going.
On the other hand, neglecting steps ① to ③ will result in the following risks.
- ① When consensus on the purpose of the analysis has not been reached
  - Because there are no hypotheses to test, the analysis results cannot be turned into further action.
  - Relevant departments cannot be involved and persuaded to cooperate with data collection.
  - Because each department has different goals, a cross-departmental analysis model cannot be designed.
- ② When it is not possible to design a model that matches the purpose of the analysis
  - Because the data required to maintain analytical accuracy cannot be defined, marketing effectiveness cannot be properly visualized.
- ③ When factors related to the model have not been identified
  - Factors that affect marketing effectiveness are overlooked.
  - More cost and labor than necessary are allocated to data collection, which undermines the cost-effectiveness of the analysis.
At XICA, we work closely with our customers to prevent situations like those described above, providing comprehensive support for steps ① through ③.
In the next chapter, we will discuss what methods you should use to accumulate data on a regular basis so that you don't run into trouble when it comes time to analyze the data.
Data accumulation concept
At XICA, we ask our customers to collect the data needed for analysis. In post-analysis customer surveys, the most commonly raised issue is the burden of collecting that data, and many customers told us, "It's more difficult than we expected."
As noted earlier, once a data collection system is in place, you can keep the PDCA cycle running. This section discusses how to accumulate data so that you can visualize marketing effects, or be ready to do so at any time.
Relationship between the amount of data held and the data structure that is easy to analyze
As mentioned in the chapter on "Marketing Data," data related to marketing alone is diverse, and accumulating all of the data and arranging it in a form that can be used for analysis is a very arduous task. There is a trade-off between the amount of data to be stored and a data structure that is easy to analyze, and the more data you try to accumulate, the more complex the data structure becomes, making it difficult to analyze.
For example, it is difficult to store time-series sales data and cross-sectional survey data in the same format in a database. As this shows, the more data you store, the more time-consuming and complex it becomes to manage.
How to handle huge amounts of data
The terms "data lake" and "data warehouse" are used as mechanisms and concepts for handling large amounts of data. They may not be familiar terms, but they are closely related to the trade-off between the amount of data and the complexity of the data structure mentioned earlier.
A data lake is a method of data storage that focuses on storing huge amounts of data in the form in which it was acquired, that is, on increasing the amount of data retained. On the other hand, a data warehouse is a collection of data that has been processed to make it easier to analyze.
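The difference between the two approaches can be sketched in a few lines of Python. This is a simplified illustration using in-memory lists, not a real storage system. The names `data_lake`, `data_warehouse`, `ingest_raw`, and `load_sales_row`, as well as the sample payload, are assumptions made for the example.

```python
import json
from datetime import date

data_lake = []       # raw payloads kept exactly as they were acquired
data_warehouse = []  # rows processed into a fixed, analysis-ready schema


def ingest_raw(source: str, payload: str) -> None:
    """Data lake approach: store the payload untouched, with minimal metadata."""
    data_lake.append({"source": source, "payload": payload})


def load_sales_row(payload: str) -> None:
    """Data warehouse approach: parse and type the payload before storing it."""
    record = json.loads(payload)
    data_warehouse.append({
        "date": date.fromisoformat(record["date"]),
        "sales_amount": float(record["sales_amount"]),
        "sales_quantity": int(record["sales_quantity"]),
    })


# Hypothetical export from a sales system; every value arrives as a string.
raw = '{"date": "2021-04-01", "sales_amount": "120000", "sales_quantity": "30"}'
ingest_raw("pos_export", raw)   # cheap to store, needs processing later
load_sales_row(raw)             # costs effort up front, ready to analyze

print(data_lake[0])
print(data_warehouse[0])
```

The lake entry costs almost nothing to store but must be parsed each time it is used. The warehouse row requires a schema and conversion logic up front, which is the management cost that grows with every additional type of data.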
Due to management costs, it would be difficult to accumulate all data using a data warehouse approach, so we recommend that you take the following approach with the aim of being prepared to get started immediately when the idea of visualizing marketing effects occurs.
Managing easily structured data in a data warehouse
Let's start by explaining what "structuring data" means. It may be easier to understand if you imagine an Excel spreadsheet. When values and symbols are listed in a table format with rows and columns that follow a certain pattern, that data is structured. On the other hand, data that is difficult to structure would be things like image files or customer testimonials.
Marketing-related data, such as time-series data and data linked to individual or product IDs, is relatively easy to structure, so we recommend managing this using a data warehouse approach.
For example, if you create a system that brings together time-series data such as sales, advertising placements, store-related data (trends in store prices and distribution rates), and traffic to your company's website, you will be able to proceed smoothly when you want to visualize the effectiveness of your marketing. We also recommend that, where possible, data related to personal IDs and product IDs be linked to each ID and grouped together so that it can be retrieved quickly.
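A minimal sketch of such grouping, assuming every source has already been aggregated to the same weekly granularity, might look like the following. The function name `merge_by_week`, the column names, and all figures are hypothetical.

```python
from collections import defaultdict


def merge_by_week(**sources):
    """Combine several weekly time series into one table keyed by week.

    Each source is a list of dicts that share a "week" key (ISO date string).
    Values missing for a given week are filled with None.
    """
    merged = defaultdict(dict)
    for rows in sources.values():
        for row in rows:
            for key, value in row.items():
                if key != "week":
                    merged[row["week"]][key] = value
    columns = sorted({key for values in merged.values() for key in values})
    return [
        {"week": week, **{col: merged[week].get(col) for col in columns}}
        for week in sorted(merged)  # ISO date strings sort chronologically
    ]


# Made-up weekly data, for illustration only.
sales = [{"week": "2021-04-05", "sales_amount": 1200},
         {"week": "2021-04-12", "sales_amount": 1500}]
ads = [{"week": "2021-04-05", "ad_cost": 300},
       {"week": "2021-04-12", "ad_cost": 450}]
store = [{"week": "2021-04-05", "store_price": 980}]   # second week not yet reported
web = [{"week": "2021-04-05", "sessions": 5000},
       {"week": "2021-04-12", "sessions": 6200}]

table = merge_by_week(sales=sales, ads=ads, store=store, web=web)
for row in table:
    print(row)
```

Keeping one row per period with one column per factor means gaps in collection, such as the missing store price in the second week, are visible immediately rather than discovered at analysis time.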
Conversely, data that is difficult to structure, as described above, can be kept using the data lake approach and prepared for use only when you want to dig deeper or examine a specific issue. This is a practical way to deal with the trade-off between the amount of data and the complexity of the data structure.
Summary
The key points of this report are as follows:
- Marketing data is becoming more diverse
- Data collection proceeds backwards from the purpose of the analysis
- Data storage should be done efficiently by considering the data structure
In the next report, we plan to explain "Data analysis methods for visualizing specific marketing effects."