What is Data Science? A data professional explains it in an easy-to-understand way [3 books recommended for beginners]

Almost everyone has heard the term "data science." In today's business environment, where the importance of data is constantly emphasized, data is one of the essential elements for winning in market competition.
Data science is the entire approach to extracting useful knowledge from data, and is slightly different from statistics and data analytics. In this article, we will explain data science in an easy-to-understand manner for beginners.
If you are a business person involved in data analysis or have an interest in data science, please take a look at this article.
Table of contents
- What is Data Science? Meaning and Definition
- The difference between data science and...
- Background of the rise of data science
- What is the role of a data scientist?
- 3 books recommended by data analysts for beginners
- "Introduction to Data Hermeneutics for Analysts: Techniques for Capturing the Essence of Data (Takahiro Ezaki/Soshim)"
- "Introduction to analytical models for capturing the essence of data analysis: A comprehensive guide from uses and characteristics to principles of statistical models, deep learning, reinforcement learning, etc. (Satoshi Sugiyama/Soshim)"
- Bayesian Statistical Modeling: A Tutorial with R, JAGS, and Stan, 2nd Edition (John K. Kruschke (author), Kazuhiro Maeda (translator), Koji Kosugi (translator) / Kyoritsu Shuppan)
- In conclusion
What is Data Science? Meaning and Definition
Data science is an approach for analyzing the vast amounts of data held by companies and deriving insights that will benefit the business.
XICA has always said that "data does not give you answers." Data is colorless and inorganic. Having data alone does not give you answers.
So what should we do? People need to add color to the data and breathe life into it. Then, data that was previously colorless and impersonal can begin to provide a variety of insights.
This work of "adding color to data" and "breathing life into it" is the specialty of data science.
Specifically, data science uses statistical analysis, data visualization, machine learning, and other techniques to efficiently collect, organize, process, and analyze huge amounts of data. It then derives insights from the results that will benefit your business, turning data into a powerful weapon.
The 2011 film "Moneyball," based on a true story, depicts the general manager of a small Major League Baseball team who uses his own unique approach to rebuild the team into a powerhouse on a low budget. That approach is data science in action.
The difference between data science and...
There are many terms related to data analysis, so many people confuse data science with other terms. For example, "statistics" is a term that is easily confused with data science.
Terms that were originally used with the same meaning have changed over time and are now used as independent terms. Data science is one such term.
Here, we will explain how data science differs from "statistics," "business analysis," and "data analytics," three terms that are often confused with it. It is difficult to draw a clean line between them, so we will briefly summarize the differences and relationships to bring "data science" into sharper focus.
Differences from statistics
Statistics is a science that uses applied mathematics to find common properties, regularities, or irregularities in data that varies in nature.
For example, the cash registers of convenience stores and supermarkets record POS data (sales performance data) such as the items purchased, their unit prices, the total amount, and the age group and gender of the purchaser.
By analyzing huge amounts of POS data, it is possible to predict the optimal timing and quantity of purchases, seasonal demand, and so on. Statistics is built into this kind of analysis system.
At first glance, this may seem similar to data science. However, if statistics is a method for dealing with data, data science refers to the entire approach of using methods such as statistical analysis to derive insights that are useful for business.
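The POS example above can be illustrated with a minimal sketch. The records, item names, and the helper functions `monthly_average_sales` and `peak_month` are hypothetical and invented only for this illustration. A real system would work on far larger data and use proper forecasting methods.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical POS records: (date "YYYY-MM-DD", item, quantity sold).
pos_records = [
    ("2023-01-15", "ice cream", 12),
    ("2023-07-10", "ice cream", 95),
    ("2023-07-24", "ice cream", 105),
    ("2023-08-05", "ice cream", 110),
    ("2023-01-20", "hot coffee", 80),
    ("2023-01-28", "hot coffee", 90),
    ("2023-07-12", "hot coffee", 25),
]

def monthly_average_sales(records):
    """Average quantity sold per (item, month) across all records."""
    grouped = defaultdict(list)
    for date, item, quantity in records:
        month = int(date[5:7])  # characters 5-6 of "YYYY-MM-DD" hold the month
        grouped[(item, month)].append(quantity)
    return {key: mean(values) for key, values in grouped.items()}

def peak_month(averages, item):
    """Month with the highest average sales for the given item."""
    candidates = {m: avg for (name, m), avg in averages.items() if name == item}
    return max(candidates, key=candidates.get)

averages = monthly_average_sales(pos_records)
for item in ("ice cream", "hot coffee"):
    print(item, "sells best in month", peak_month(averages, item))
```

Grouping and averaging is the statistical step. Deciding what to stock and when, based on the result, is the wider data science approach the article describes.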

For a detailed explanation of statistics, please refer to "[For beginners] What is statistics? Explaining what statistics can do using familiar examples" along with this article.
Differences from business analysis
Business analytics is one of the fields encompassed by data science. It is business-focused and primarily uses structured data to help drive business decisions, such as sales forecasting. In the business world, MMM (Marketing Mix Modeling) and BI (Business Intelligence) are often introduced to conduct business analysis.
MMM is a statistical method for grasping the direct ROI (return on investment) of marketing measures and the indirect effects of multiple measures. It can visualize the effects of offline advertising such as TV commercials, which are otherwise difficult to see, and can analyze the relationships between measures in an integrated manner, such as how much each measure contributed to a single result. This makes it useful for accurately measuring effects and optimizing marketing measures.
BI, on the other hand, is a system that supports management decision-making. Data collected and accumulated in BI is automatically output as reports, allowing you to discuss management issues while looking at real-time data.
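As a rough illustration of the MMM idea described above, the sketch below fits a plain linear regression of sales on spend per channel, so that each coefficient can be read as that channel's contribution per unit of spend. This is a deliberately simplified stand-in and not the method of any particular MMM product. Practical MMM typically also models carry-over and saturation effects. The data, the channel names, and the `fit_ols` helper are all hypothetical.

```python
# Hypothetical weekly data: spend per channel and the sales observed that week.
tv_spend = [10, 20, 30, 40, 50, 60]
online_spend = [5, 3, 8, 2, 7, 4]
# Synthetic sales generated as 100 + 2.0 * tv + 3.5 * online with no noise,
# so the fit below should recover roughly those numbers.
sales = [137.5, 150.5, 188.0, 187.0, 224.5, 234.0]

def fit_ols(features, target):
    """Ordinary least squares; returns [intercept, coef_1, coef_2, ...]."""
    rows = [[1.0] + [float(v) for v in row] for row in features]
    k = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, target))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        pivot_value = a[col][col]
        a[col] = [v / pivot_value for v in a[col]]
        for r in range(k):
            if r != col:
                factor = a[r][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][k] for i in range(k)]

intercept, tv_effect, online_effect = fit_ols(
    list(zip(tv_spend, online_spend)), sales
)
print(f"baseline={intercept:.1f}, tv={tv_effect:.2f}, online={online_effect:.2f}")
```

The intercept plays the role of baseline sales that would occur without any spend. Comparing each coefficient with the cost of the channel gives a first, crude view of ROI.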
Differences from data analytics
Data analytics is the process of analyzing primarily structured data using statistical methods, and is one of the areas that fall under data science.
The difference between data science and data analytics can be summed up by the question "Engineer-oriented or business-oriented?" While data science uses a wide range of skills, such as statistical methods and programming, data analytics is sometimes limited to the use of statistical methods. However, this boundary varies from company to company.
Background of the rise of data science
In recent years, interest in data science has increased dramatically. The graph below shows the popularity index of "data science" on Google Trends.

According to Google Trends, data science is still gaining attention. What is the background to this? Here are three possible reasons.
The growing presence of AI and machine learning
Fields such as AI and machine learning are proving to be more than just a passing trend in various areas of modern business. As the attention ChatGPT has drawn in recent years suggests, AI and machine learning are expected to continue to grow rapidly.
As AI and machine learning become more prominent, enthusiasm for data science has also been steadily rising. Services using AI and machine learning can collect huge amounts of data, and data science is essential for deriving new knowledge from that data.
Evaluation of Data Scientists
In 2012, the business journal Harvard Business Review called data scientist "the sexiest job of the 21st century" (*1). At the time, enthusiasm in the data community ran so high that some all but deified data scientists.
More than a decade has passed since then, and data scientists are valued more highly than ever. For example, the median annual salary of a top data scientist in California is approaching the equivalent of approximately 20 million yen (*2).
Unfortunately, Japan, often described as a developing country when it comes to data, has yet to reach this figure. However, companies that are attuned to the value of data analysis are already investing heavily in attracting talented data scientists.
(*1) Data Scientist: The Sexiest Job of the 21st Century|Harvard Business Review https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century
(*2) Data Science Manager Salary in California | Salary.com https://www.salary.com/research/salary/benchmark/data-science-manager-salary/ca
The birth of big data and cloud computing
Going back further than the Harvard Business Review article, we can see that interest in data science began to increase around 2006, when concepts such as big data and cloud computing were born.
Cloud services such as AWS and GAE (now GCP) have appeared one after another, making it possible to quickly set up environments for large-scale data processing. As a result, environments for practicing data science have been developed, and the opportunities for data scientists to work have expanded greatly.
What is the role of a data scientist?
Data scientists use data science to derive insights that benefit businesses. Their main roles include:
● Listen to business leaders about their data analysis goals
● Work with database engineers to collect data
● Perform data cleansing and data modeling
● Verify the accuracy of the data model and decide on the model
● Implement data analysis functions
● Report and explain the results of data analysis
● Repeatedly verify the accuracy of the data model as necessary
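The cleansing, modeling, and verification steps in the list above can be sketched as follows. This is a minimal illustration under stated assumptions: the records are made up, and `cleanse`, `fit_line`, and `mean_absolute_error` are hypothetical helper names. A real project would use richer cleansing rules, more careful validation, and established libraries.

```python
from statistics import mean

# Hypothetical raw records: (advertising spend, sales); None marks a missing value.
raw = [(1.0, 12.0), (2.0, 14.1), (None, 15.0), (3.0, 15.9), (4.0, None),
       (5.0, 20.2), (6.0, 21.8), (7.0, 24.1), (8.0, 26.0)]

def cleanse(records):
    """Data cleansing: drop rows that contain a missing value."""
    return [(x, y) for x, y in records if x is not None and y is not None]

def fit_line(points):
    """Data modeling: least-squares fit of y = intercept + slope * x."""
    x_bar = mean(x for x, _ in points)
    y_bar = mean(y for _, y in points)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in points)
             / sum((x - x_bar) ** 2 for x, _ in points))
    return y_bar - slope * x_bar, slope

def mean_absolute_error(model, points):
    """Accuracy check: average absolute gap between prediction and actual."""
    intercept, slope = model
    return mean(abs(y - (intercept + slope * x)) for x, y in points)

clean = cleanse(raw)
train, holdout = clean[:5], clean[5:]  # keep the last rows aside for verification
model = fit_line(train)
error = mean_absolute_error(model, holdout)
print(f"intercept={model[0]:.2f}, slope={model[1]:.2f}, hold-out MAE={error:.2f}")
```

Evaluating the model on rows it was not fitted on is what makes the accuracy check meaningful, and repeating that check as new data arrives corresponds to the last item in the list.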
When you hear the term "data scientist," you might imagine someone who just deals with data, but they also spend a lot of time communicating with business managers and other stakeholders.
For that reason, a good data scientist has not only strong statistical skills but also strong business communication skills.
Skills required for a data scientist
To be successful as a data scientist, you need the following skills:
● Applied Mathematics
● Statistics
● Data Engineering
● Coding
● Data modeling
● Data Cleansing
In addition, business skills such as communication, management, document creation, and consulting skills are also required.
Differences from Data Analysts
Data analysts are data analysis specialists on a par with data scientists. They share a common mission: analyzing the vast amounts of data held by companies to facilitate rapid decision-making and derive insights that will benefit the business.
On the other hand, data analysts are focused on applying analysis results to business. While a data scientist is a "general professional in data analysis," a data analyst is a "business-specific expert."
Therefore, while many data scientists have studied statistics at university (or graduate school), many data analysts tend to have studied statistics and made a career change from the business sector.
3 books recommended by data analysts for beginners
Finally, we will introduce three books for beginners recommended by XICA's data analysts and CRO (Chief Research Officer).
"Introduction to Data Hermeneutics for Analysts: Techniques for Capturing the Essence of Data (Takahiro Ezaki/Soshim)"
One of the important things when dealing with data science in practice is to correctly understand the nature of the data before analyzing it. When we talk about data science and data analysis, we tend to focus on the methods and coding, but without the right data, nothing can begin. It is also necessary to interpret the numbers to see what the analysis results represent in the real world.
This book comprehensively covers the important "before and after analysis" stages that are often overlooked in data science. If you are thinking of using data science in your work, we recommend reading this book as your first step.
(Motonobu Takagi, Manager of Research Section, Analysis Department, Business Headquarters, and Manager of Analysis 1 Section, Analysis Department, Business Headquarters, XICA Corporation)
"Introduction to analytical models for capturing the essence of data analysis: A comprehensive guide from uses and characteristics to principles of statistical models, deep learning, reinforcement learning, etc. (Satoshi Sugiyama/Soshim)"
This book provides a broad and comprehensive look at data science, covering everything from basic statistical analysis methods to more advanced fields such as machine learning.
In addition to methods that are often used in business, such as multiple regression analysis, logistic regression analysis, factor analysis, and cluster analysis, the book also provides concise overviews, applications, and interpretations of analytical methods that have recently been in the spotlight, such as deep reinforcement learning. It is likely to be a must-read for anyone who wants to learn more about data analysis.
(Yusuke Nagai, Manager of Analysis 2, Analysis Department, Business Headquarters, XICA Corporation)
Bayesian Statistical Modeling: A Tutorial with R, JAGS, and Stan, 2nd Edition (John K. Kruschke (author), Kazuhiro Maeda (translator), Koji Kosugi (translator) / Kyoritsu Shuppan)
At first glance, this book may seem bulky and not for beginners. But that's a misunderstanding. Take our word for it and just open it and start reading. It explains things very carefully and in an easy-to-understand way, with examples so thorough they border on redundant. With this one book, you can learn everything from the basics of probability to the basic theory of approximate estimation using MCMC, how to implement it using R or Stan, and how to replace statistical tests with Bayesian estimation.
(Kenta Murata, Executive Officer, CRO/Research Department, Development Division, XICA Corporation)
In conclusion
The demand for data science will continue to grow, and the roles that data scientists and data analysts play within companies will become larger as well.
Business people considering career advancement should remember that there is also the option of aiming to become a "specialized data analyst who utilizes experience in the business division."
This video also provides some tips on how to become a person who can utilize data analysis in business. Please take a look at this as well.