Association rules applied to credit card fraud detection

D. Sánchez a,*, M.A. Vila a, L. Cerda a, J.M. Serrano b

a Department of Computer Science, A.I., University of Granada, E.T.S.I. Informática y Telecomunicación, C/Periodista Daniel Saucedo Aranda s/n, 18071 Granada, Spain
b Department of Informatics, University of Jaen, Spain

* Corresponding author. Tel.: +34 958 246397; fax: +34 958 243317.
E-mail addresses: daniel@decsai.ugr.es (D. Sánchez), vila@decsai.ugr.es (M.A. Vila), lcerda1511@hotmail.com (L. Cerda), jschica@ujaen.es (J.M. Serrano).

Expert Systems with Applications 36 (2009) 3630–3640. Available online at www.sciencedirect.com. www.elsevier.com/locate/eswa
0957-4174/$ - see front matter. doi:10.1016/j.eswa.2008.02.001

Abstract

Association rules are considered to be the best studied models for data mining. In this article, we propose using them to extract knowledge from transactional credit card databases, so that the normal behavior patterns found in unlawful transactions may be obtained and used to detect and prevent fraud. The proposed methodology has been applied to data about credit card fraud in some of the most important retail companies in Chile.

© 2008 Elsevier Ltd. All rights reserved.

Keywords: Association rules; Data mining; Credit card fraud; Fraud detection; Fraud prevention

1. Introduction

Competition in the retail industry continues and is becoming increasingly aggressive, as revealed by recent events in the sector and by specialist studies such as (Zarufe, 2005). One of the leading businesses in this sphere is hire purchase, and one of the main commercial strategies is the issuing of department store credit cards to clients. This is evident from the publications (Zarufe, 2005), which indicate that three companies lead the retail industry in the Latin American Southern Cone (Argentina, Chile, Peru, Colombia). In Chile, they compete for 95% of retail industry sales, which in 2003, according to these same publications, exceeded the three thousand three hundred million dollar mark, with market shares of 60.29%, 18.26% and 15.63%, respectively. By the end of 2003, they had issued 3.0, 2.7 and 2.6 million credit cards, and in Chile alone 16 million credit cards had already been issued by the different retail distribution chains. This number was 7 times higher than the number of bank-issued credit cards. On average, these store cards were responsible for 65% of the sales of their issuing houses and represented almost 20% of Chile's total consumer debt, and over 11,000 establishments that did not issue this type of card nevertheless traded with them.

Within the retail industry, the predominant trade components are financial services and the distribution of clothing and furnishings through department stores, with sales in 2005 reaching US$3,194 million. These sales are distributed among the four major chain stores with the profile shown in Table 1.

For years, from both the academic and the technological advisory or consultancy perspectives, it has been observed, and in this case confirmed, that in three of these four large companies the level of technological support used in the computerized management systems behind their decision-making processes is very different from the level of information technology used in their operational transaction systems. These computerized management systems may therefore be classified according to the following levels of computing maturity:

Level Zero (non-computerized systems): No computer technology is incorporated in the computerized management systems, i.e. information for decision-making is obtained manually.

Level One (semi-computerized systems): Incipient use of technological resources in the computerized management systems for decision-making, e.g. the use of spreadsheets to prepare the information.

Level Two (departmental computerized systems): Use of departmental information systems as the nucleus of the computerized management system for decision-making. Data is usually gathered from transactional processes associated with a specific function within the value chain.

Level Three (integrated computerized systems): Use of the company's integrated administrative information system (ERP) as the main element supplying the computerized management system (CMS) for decision-making. These CMS are supplied with data from all the processes supporting the company's value chain.

Level Four (controlled or synchronized computerized systems): These CMS integrate the use of control boards or control panels, enabling decisions to be made as the need arises by providing online information about how far the objectives associated with the management indicators have been fulfilled.

Level Five (predictive computerized systems): Data mining models are incorporated into the previous level to extract non-explicit information from the transaction records of daily operation. These models enable new behavior patterns to be determined which reinforce, confirm or modify management indicators and allow trends to be recognized for decision-making.

Level Six (automatic computerized systems): The previous level of computerized management systems is reinforced and combined with daily operation through the use of expert systems with knowledge bases and inference engines to support decision-making. Intelligence is thereby incorporated into the operational systems so that, given certain management parameters and indicators, they activate, restrict or modify business rules.
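The maturity scale above can be sketched as a small data structure. The following Python fragment is only an illustrative sketch and not part of the original study: the names `MaturityLevel`, `CmsProfile` and `classify` are hypothetical, and the rule that a level is reached only when the features of all lower levels are also present is an assumption made here for simplicity.

```python
from dataclasses import dataclass
from enum import IntEnum


class MaturityLevel(IntEnum):
    """Computing maturity levels of a computerized management system (CMS)."""
    NON_COMPUTERIZED = 0   # information for decisions is obtained manually
    SEMI_COMPUTERIZED = 1  # incipient tools, e.g. spreadsheets
    DEPARTMENTAL = 2       # departmental information systems
    INTEGRATED = 3         # company-wide ERP supplies the CMS
    CONTROLLED = 4         # control panels with online indicators
    PREDICTIVE = 5         # data mining models on transaction records
    AUTOMATIC = 6          # expert systems acting on business rules


@dataclass(frozen=True)
class CmsProfile:
    """Hypothetical checklist of what a CMS uses (field names are illustrative)."""
    spreadsheets: bool = False
    departmental_systems: bool = False
    erp: bool = False
    control_panels: bool = False
    data_mining: bool = False
    expert_systems: bool = False


def classify(profile: CmsProfile) -> MaturityLevel:
    """Return the highest level reached by the profile.

    Assumption (not stated in the text): levels are cumulative, so the scan
    stops at the first missing feature.
    """
    features = [
        profile.spreadsheets,
        profile.departmental_systems,
        profile.erp,
        profile.control_panels,
        profile.data_mining,
        profile.expert_systems,
    ]
    level = MaturityLevel.NON_COMPUTERIZED
    # Levels One to Six are paired, in order, with their defining feature.
    for candidate, present in zip(list(MaturityLevel)[1:], features):
        if not present:
            break
        level = candidate
    return level
```

For example, `classify(CmsProfile(spreadsheets=True, departmental_systems=True))` yields `MaturityLevel.DEPARTMENTAL`. Using an `IntEnum` keeps the levels ordered, so comparisons such as `level >= MaturityLevel.PREDICTIVE` read naturally.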
From the critical or strategic decision-making processes for the business in question, two areas were chosen as the most representative in this industry: operational risk control, and corporate management and planning.

In this article, we present our first work on operational risk control, for which we worked with the area of the company that had available data and a Level Two computerized management system (the others were ruled out because they either had no data available, for confidentiality reasons, or had a Level Zero or Level One computerized management system). In our next publication, we will present our work on the area of corporate management, studying the case of one of these leading companies which also has a Level Two computerized management system.

The objective of both these works is to transform the computerized management systems of these decision-making processes from their current computerization levels, with reactive decision-making processes, to computerization levels with proactive decision-making processes.

This first publication therefore presents the result of applying cutting-edge information technologies to one of the operational risk control processes and transforming it from Level Two to Level Five or Six. In particular, the work focuses on the process for controlling the risk of fraud through the use of corporate credit cards as a form of payment. In this respect, the selected process supports one of the most widely used differentiation and sustained growth strategies in this industry for obtaining client loyalty. While it is true that the mass issue of credit cards by department stores has been successful as a marketing project, it is equally true that this has increased exposure to illegal activity, as demonstrated by the growing trend in fraud which is highlighted in specialist publications (e.g.
the latest Cybersource report, Sponsored by CyberSource Corporation Conducted by Mindwave Research, 2006). Diversification of the client portfolio through this mass issue of credit cards, and aggressive marketing plans which encourage diverse use of this payment method, combined with the lack of efficient techniques and intelligent systems for detecting and preventing illegal use effectively without inconveniencing genuine credit card users, pose the challenge of seeking more efficient methods.

This effort is reflected in various articles, in particular in specialist publications which offer different approaches for detecting and preventing this illegal behavior. Nevertheless, all of these concur with Bhatla's observations in 2002 (Bhatla, Vikram, & Dua, 2003) that none of the evaluated systems can guarantee effectiveness and that none of the reviewed tools and technologies can alone eliminate fraud. Furthermore, since each technique contributes to the ability to detect fraud, he believes that the most successful option would be a combination of several of these techniques, particularly since the results of the Cybersource survey (Sponsored by CyberSource Corporation Conducted by Mindwave Research, 2006) seem to indicate that manual control is still the most widely used method for detecting and preventing fraud.

Table 1
Distribution of sales between the four major chain stores in Chile

Indicator                     Falabella  Ripley   Paris    La Polar  Total
No. of stores                 30         31       19       26        106
Sales surface area (m²)       177,538    227,909  154,544  81,000    640,991
Surface area/stores (m²)      5,917      7,352    8,134    3,115     6,048
Sales (US$M)                  1,178      871      782      360       3,194
% market                      36.90%     27.30%   24.50%   11.30%    100
Sales/m²                      3.55       2.6      2.6      2.3       -
Cards issued (M)              3.3        2.6      3.0      1.9       10.8
Active cards (M)              2.6        1.4      1.2      1.4       6.6
% card sales                  67%        63%      67%      80%       -
Projected investment (US$M)   1,130      551      1,200    100       -

A summary of the state of the art in the techniques and methods used in fraud detection and prevention, and a review of various relevant publications over the last three years, confirm the effort employed to obtain useful knowl-