A decision-making framework for precision marketing

Zhen You a, Yain-Whar Si b, Defu Zhang a,⇑, XiangXiang Zeng a, Stephen C.H. Leung c, Tao Li a,d

a Department of Computer Science, Xiamen University, Xiamen 361005, China
b Department of Computer and Information Science, University of Macau, Macau, China
c Department of Management Sciences, City University of Hong Kong, Hong Kong
d School of Computer Science, Florida International University, Miami, FL, USA

Article info
Article history: Available online 20 December 2014
Keywords: Data mining; Decision tree; Forecasting; Precision marketing; Decision-making

Abstract
Precision marketing offers personalized customer service and helps enterprises increase their profits through high-efficiency marketing. This paper presents a novel decision-making framework for precision marketing using data-mining techniques. First, this study presents a trend model to accurately predict monthly supply quantity; second, it uses an RFM (Recency, Frequency and Monetary) model to select attributes to cluster customers into different groups; third, it uses CHAID decision trees and Pareto values to identify the attribute values that distinguish different customer groups; and finally, it creates different supply strategies targeting each customer group. The objective of the proposed decision-making framework is to help managers identify the potential characteristics of different customer categories and put forward appropriate precision marketing strategies, which can greatly reduce inventory for every customer category. Real-world data were collected from a company in China and used in a case study to illustrate how to implement the proposed framework. This case study demonstrates that our proposed decision-making framework is efficient and capable of providing a very good precision marketing strategy for enterprises.
© 2014 Elsevier Ltd. All rights reserved.
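The customer-grouping step summarized in the abstract (RFM attributes followed by clustering) can be illustrated with a minimal sketch. It is not the authors' implementation: the transaction records, the customer identifiers, the reference date, the min-max scaling step, and the choice of k = 2 are all assumptions made for illustration, and K-means is written out in plain Python only to keep the example self-contained.

```python
import random
from datetime import date


def rfm_features(transactions, today):
    """Aggregate (customer, purchase_date, amount) rows into (R, F, M) triples."""
    stats = {}
    for customer, day, amount in transactions:
        last, freq, money = stats.get(customer, (day, 0, 0.0))
        stats[customer] = (max(last, day), freq + 1, money + amount)
    # Recency = days since the last purchase; Frequency = number of purchases;
    # Monetary = total amount spent.
    return {c: ((today - last).days, freq, money)
            for c, (last, freq, money) in stats.items()}


def min_max_scale(rows):
    """Rescale every column to [0, 1] so that no attribute dominates the distance."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [tuple((v - lo) / (hi - lo) if hi > lo else 0.0
                  for v, lo, hi in zip(row, lows, highs))
            for row in rows]


def kmeans(points, k, max_iter=100, seed=0):
    """Plain K-means: assign points to the nearest centroid, then update centroids."""
    rng = random.Random(seed)          # the result depends on this initialization
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        if new_labels == labels:       # no assignment changed: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:                # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Hypothetical transaction log: (customer id, purchase date, amount).
transactions = [
    ("A", date(2014, 11, 20), 500.0),
    ("A", date(2014, 12, 1), 450.0),
    ("A", date(2014, 12, 10), 620.0),
    ("B", date(2014, 12, 5), 480.0),
    ("B", date(2014, 12, 12), 510.0),
    ("B", date(2014, 11, 2), 300.0),
    ("C", date(2014, 3, 15), 40.0),
    ("D", date(2014, 2, 1), 25.0),
]

rfm = rfm_features(transactions, today=date(2014, 12, 31))
customers = sorted(rfm)
labels, _ = kmeans(min_max_scale([rfm[c] for c in customers]), k=2)
segments = dict(zip(customers, labels))
print(rfm)
print(segments)
```

In this toy data set the recent, frequent, high-spending customers (A and B) end up in one cluster and the lapsed, low-spending customers (C and D) in the other. The cluster labels themselves are arbitrary, so only the grouping is meaningful.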
1. Introduction

The accelerated pace of economic globalization and intensifying market competition force enterprise managers to choose strategic decision-making policies that sell the right products to the right customers at the right time, so that their companies can increase their profits. Recently, precision marketing has been recognized as a key means of generating profit, and it is becoming increasingly important as customers become better informed about products and about their rights as consumers. The availability of customer data and transaction records provides a better understanding of customers' consumption behaviors and preferences. In this increasingly competitive environment, enterprises have to create a decision-making model for precision marketing that provides appropriate strategies for managing their market positioning and fulfilling their customers' needs.

The research motivation of this paper stems from a real-world project. This project considers a marketing problem in which a supplier or manufacturer provides different products to retail customers; some products may sell well in some customer segments and others may not. Products that are not sold are returned to the supplier. Therefore, the supplier needs to find a good marketing strategy that minimizes goods in stock and satisfies the supplier, retailers, and consumers.

In recent years, decision-making problems have received much attention due to their wide range of real-world applications, and many decision-making techniques have been proposed in the literature. Chen and Wang (2009) presented multi-criteria optimization and compromise solutions. Saen (2010) developed a technique for order performance via similarity to ideal solution.
Hsu, Chiang, and Shu (2010) developed a nonlinear programming approach for decision making, and Lin, Chen, and Ting (2011) presented a linear programming approach for decision making. When discussing decision making in this context, it is important to pay attention to the role of artificial intelligence (AI). The AI decision support framework is considered a major tool for obtaining information from historical data collections using artificial intelligence techniques such as genetic algorithms (GA), ant colony optimization (Ghasab, Khamis, Mohammad, & Fariman, 2015), and other data-mining techniques (Chai, Liu, & Ngai, 2013). Nowadays, data mining, which can extract useful customer information and discover hidden customer behaviors from big data, has a great influence on guiding decision making and forecasting the effects of decisions.

Expert Systems with Applications 42 (2015) 3357–3367
http://dx.doi.org/10.1016/j.eswa.2014.12.022
0957-4174/© 2014 Elsevier Ltd. All rights reserved.
⇑ Corresponding author. Tel.: +86 18959217108; fax: +86 0592 2580258.
E-mail addresses: uzhen@foxmail.com (Z. You), fstasp@umac.mo (Y.-W. Si), dfzhang@xmu.edu.cn (D. Zhang), xzeng@xmu.edu.cn (X. Zeng), mssleung@cityu.edu.hk (S.C.H. Leung), taoli@cs.fiu.edu (T. Li).

Guo, Yuan, and Tian (2009) used a hierarchical potential support vector machine (HPSVM) for supplier selection and improved the accuracy. Chang and Hung (2010) applied rough set theory (RST) to analyze the rules of supplier selection and guide decision making. Chai et al. (2013) conducted a systematic review of the literature on the application of decision-making techniques in supplier selection. They classified these techniques into three categories: multi-criteria decision-making techniques (MCDM), mathematical programming techniques (MP), and artificial intelligence techniques (AI). MCDM is a methodological framework that aims to provide decision makers with a knowledgeable recommendation amid a finite set of alternatives (Chai et al., 2013). Wang, Wu, Wang, Zhang, and Chen (2014) developed two kinds of prioritized aggregation operators of IVHFLNs for MCDM.

However, in practice, it is not easy to choose a good data-mining tool, since each tool has its own advantages and disadvantages. For example, artificial neural networks (ANNs) involve too many hidden neurons and training parameters (Zhang, Jiang, & Li, 2005). Their disadvantages include their "black box" nature, a greater computational burden, and proneness to over-fitting. An advantage of ANNs, however, is their ability to implicitly detect complex nonlinear relationships between dependent and independent variables. Decision trees are simple to use and easy to understand, and they offer many advantages compared with other decision-making tools. However, the disadvantages of decision trees include their instability and relatively low performance. Hence, there is no single best model for all cases (Akın, 2015). Recently, researchers have attempted to combine different models in order to exploit their respective advantages.
Guo, Wong, and Li (2013) presented a multivariate intelligent decision-making model that combined three different models to complete every phase of the decision-making process. Tadić, Zečević, and Krstić (2014) developed a novel hybrid MCDM model that incorporates the fuzzy decision-making trial and evaluation laboratory (DEMATEL) model to provide support to decision makers. Furthermore, Yan and Ma (2015) proposed a novel two-stage group decision-making approach to uncertain quality function deployment.

To the best of our knowledge, the marketing problem considered in this paper is still a new problem. The objective of this paper is to propose a decision-making framework that combines various data-mining algorithms to achieve precision marketing of real products. Real-world data that include historical monthly supply quantities and information on every customer were collected from a company in China. The goal is to find a model that can classify targeted customers and predict supply quantity, and then provide a strategy for precision marketing, i.e., decide the quantity of products that every store needs. Depending on the characteristics and requirements of each phase of the marketing model, the decision-making framework uses four data-mining models/algorithms: the K-means algorithm, a decision tree, the Pareto ratio method, and the RFM model. Overall, the purpose of this study is to generate, using data-mining techniques, a decision-making model for the precision marketing of products. The proposed framework gains accuracy from its integrated precision strategies, which combine prediction, clustering, and classification models. Moreover, the proposed framework incorporates the RFM model and the Pareto ratio into the process of customer segmentation so that the generated strategies are more convincing.

The rest of this paper is organized as follows. Section 2 introduces related works on common data-mining models and algorithms.
Section 3 briefly describes the methodology of our proposed framework. Section 4 presents a case study and the results of the analyses. Finally, Section 5 concludes the paper.

2. Related works

Since a single data-mining model may only be suitable for a specific problem, such as prediction, clustering, or classification, the proposed framework combines four data-mining models or algorithms to derive a precision marketing strategy for enterprises. The literature on these four models and algorithms is reviewed below.

2.1. K-means algorithm

Clustering is the process of grouping a set of physical or abstract items into classes of similar items, where the groups are meaningful, useful, or both. A well-known clustering algorithm is K-means, which was first proposed by MacQueen (1967). The accuracy of this algorithm depends on the initialization and the number of clusters (Mesforoush & Tarokh, 2013). The basic idea of K-means is to discover k clusters such that the records within each cluster are similar to each other and distinct from the records in other clusters. K-means is an iterative algorithm: an initial set of clusters is defined, and the clusters are repeatedly updated until no further improvement is possible (or the number of iterations exceeds a specified threshold). The K-means algorithm is widely used for pre-processing data and for clustering because of its simplicity and efficiency (Mesforoush & Tarokh, 2013). It has also been widely used to identify valuable customers and develop the related marketing strategies (Wei, Lee, Chen, & Wu, 2013; Mesforoush & Tarokh, 2013). In particular, Cheng and Chen (2009) use the RFM model and K-means to perform customer relationship management, and their experimental results demonstrate that the proposed model is an effective method for customer value analysis.

2.2.
Decision tree

A decision tree is an efficient data-mining algorithm with a strong explanatory capability (Zhang, Zhou, Leung, & Zheng, 2010). This study uses the Chi-squared Automatic Interaction Detector (CHAID) decision tree, a decision tree algorithm that is a type of database segmentation. The concept of CHAID was first published in Kass (1980). CHAID is used for prediction and classification. As with other decision trees, CHAID's advantage is that its output is highly visual and easy to interpret. A recent study indicates that CHAID is superior to judgment RFM for the identification of likely responders (McCarty & Hastak, 2007). CHAID is similar to the RFM approach because it can identify the terminal nodes that will break even with respect to expected profit and costs. Therefore, CHAID has been widely adopted for segmenting customers and extracting rules that show the associations between the input and output variables (McCarty & Hastak, 2007; Chen & Wang, 2009; Mistikoglu et al., 2015). In addition, Coussement, Van den Bossche, and De Bock (2014) demonstrate that, when performing customer segmentation at different levels of uncertainty, CHAID outperforms RFM and logistic regression. Furthermore, the robustness of CHAID to data accuracy problems can also be assessed. We therefore use CHAID to segment customers in our study.

2.3. Pareto ratio

The concept of the Pareto ratio comes from the "Pareto Principle" (also known as the 80–20 rule, the law of the vital few, or the principle of factor sparsity), which states that, for many events, roughly 80% of the effects come from 20% of the causes. This is a common rule of thumb in business. This distribution is relevant to entrepreneurs and business managers, stating, for example, that 80% of the