Comparison of multilabel classification models to forecast project dispute resolutions

Jui-Sheng Chou ⇑
Department of Construction Engineering, National Taiwan University of Science and Technology, 43 Sec. 4, Keelung Rd., Taipei 106, Taiwan

Article info

Keywords: Data mining; Multilabel classification; Forecasting; Dispute resolutions; Public–private partnership; Procurement management

Abstract

Early forecasting of project dispute resolutions (PDRs) provides decision-support information for resolving potential procurement problems before a dispute occurs. This study compares the performance of classification and ensemble models for predicting dispute handling methods in public–private partnership (PPP) projects. Model analyses use machine learners (i.e., Support Vector Machines (SVMs), Artificial Neural Networks (ANNs), and Tree-augmented Naïve (TAN) Bayesian), classification and regression-based techniques (i.e., Classification and Regression Tree (CART), Quick, Unbiased and Efficient Statistical Tree (QUEST), Exhaustive Chi-squared Automatic Interaction Detection (Exhaustive CHAID), and C5.0), and combinations of these techniques that performed best for a set of PPP data. Analytical results show that the combined technique of QUEST + CHAID + C5.0 has the best classification accuracy, at 84.65%, in predicting dispute resolution outcomes (i.e., mediation, arbitration, litigation, negotiation, administrative appeals, or no dispute occurred). Moreover, when the dispute category and the phase in which the dispute occurs are known during project execution, the best classification model is the CART model, with an accuracy of 69.05%. This study demonstrates an effective application of classification to early PDR prediction for public infrastructure projects.

© 2012 Elsevier Ltd. All rights reserved.

1. Introduction

Taiwan has legally supported PPP projects for more than ten years. The National PPP Taskforce of the Taiwan Public Construction Commission (TPCC) is generally responsible for nationwide policies and in some cases provides advice about provisions for individual projects. Engineering departments and local governments are typically responsible for PPP project delivery. To achieve effective control of diverse projects under current workloads and to design proactive dispute management strategies, knowledge of possible PPP project dispute resolutions before disputes occur is essential to providing the governmental PPP Taskforce with information about future countermeasures. Additional preparation is generally beneficial once a dispute occurs because it reduces the effort, time, and cost borne by the parties during dispute settlement.

PPP projects involve devoted stakeholders, including a promoter (government), private investors, and financial institutions. Due to the high risks associated with the construction industry, repeated challenges for stakeholders can result in project delays, budget overruns, and poor construction quality during the implementation, construction, operating, and transfer phases. Although numerous studies (Abednego & Ogunlana, 2006; Cheung, 1999; Cheung, Suen, & Lam, 2002; Gebken & Edward Gibson, 2006; Jones, 2006) have demonstrated that an efficient, effective, and fair dispute resolution process is essential for PPP project success, this study focuses on identifying warnings of potential dispute resolutions prior to project initiation. The proposed classification methods provide governmental authorities with the information needed to design proactive measures during project preparation and during the phase in which a dispute occurs.

Many PPP projects initiated during the last decade have failed due to disputes occurring in the build, operate, and transfer (BOT) phases.
According to the TPCC, the dispute rate was 23.6% during 2002–2009 (PCC, 2011). The most common processes for handling disputes are mediation/negotiation and non-mediation procedures. Non-mediation procedures include arbitration, litigation, and administrative appeals. In Taiwan, up to 84% of PPP projects are settled by mediation or negotiation within only 1–9 months after disputes occur (PCC, 2011). Notably, arbitration or litigation costs all parties considerably more time and money when a mediated agreement cannot be reached.

⇑ Tel.: +886 2 2737 6321; fax: +886 2 2737 6606. E-mail address: jschou@mail.ntust.edu.tw
Expert Systems with Applications 39 (2012) 10202–10211. doi:10.1016/j.eswa.2012.02.103
0957-4174/$ - see front matter © 2012 Elsevier Ltd. All rights reserved.

This study applies multilabel classification models to predict PPP dispute likelihood and potential resolutions early, thereby alleviating the future adverse effects of disputes on project delivery, operation, and transfer from a governmental perspective. First, this study acquired historical dispute data for PPP projects started during 2002–2009 to establish functional relationships between project characteristics and their corresponding dispute resolutions. Unlike conventional construction project disputes, PPP project disputes may occur during the building phase, as well as during the operating, renting, or transfer phases.
Thus, a second modeling phase with only dispute cases was implemented to identify likely dispute resolutions given a set of known project attributes, dispute items, and the phase in which a dispute occurs.

The rest of this paper is organized as follows. Section 2 reviews the artificial intelligence literature and its application to predicting construction claims and litigation outcomes. Section 3 then presents the research methodology and evaluation methods, providing a theoretical basis for the classification models adopted in subsequent investigations. Section 4 describes the project dispute database and compares model performance based on classification techniques. Finally, Section 5 draws conclusions and offers recommendations for future research.

2. Literature review of dispute forecasting

In response to extremely large upfront investment costs, recent public infrastructure and building construction projects have been financed via PPP (Clifton & Duffield, 2006). Nevertheless, disputes between PPP participants usually occur unexpectedly and may involve many issues, including surety bond issues, sub-contractor qualifications, licenses, permits, investment scale, resident rights, government guarantees, excessive profits, operating period, taxation, and default loan commitment (Jones, 2006). Disagreements among parties typically jeopardize a project plan via a time-consuming dispute resolution process that damages a government's reputation as it relates to PPP projects and reduces the willingness of investors to participate in future projects.

When a dispute or claim occurs, the local government usually resorts to adjudication by the central governmental authority (the TPCC in this case) when initial negotiations fail to resolve conflicts. Once agreed upon by all interested parties, an impartial committee may be the next option for dispute mediation (Jones, 2006; Keith, 1997).
The timing of committee formation, operating functions, and method implementation must be defined contractually before project execution. Since mediation sometimes fails to resolve disputes, arbitration or litigation becomes the only option based on existing laws.

However, as stakeholders lack confidence in the current arbitration system, litigation is the primary resolution process for most Taiwanese PPP projects (PCC, 2010). Given that not all disputes or claims require a costly and time-consuming dispute resolution process, a method that provides early warnings by predicting disputes and their handling methods is needed, such that governments and investors can enact dispute-prevention measures during public construction projects. Hence, management personnel would benefit if the TPCC had a decision-support tool for estimating dispute likelihood and for outlining how disputes would be resolved before a project starts.

Several studies have attempted to minimize the number of construction litigation cases by predicting the likely court rulings. Arditi, Oksay, and Tokdemir (1998), for example, trained a network using Illinois appellate court data and achieved a 67% prediction accuracy for litigation outcomes (Arditi et al., 1998). They argued that if parties in a dispute know with some certainty how a case will be resolved in court, the number of disputes could be reduced markedly. Artificial intelligence (AI) techniques have achieved excellent prediction accuracy with the same dataset: a prediction accuracy of 83.33% was achieved with case-based reasoning (Arditi & Tokdemir, 1999b), 89.95% with boosted decision trees (Arditi & Pulket, 2005), and 91.15% with integrated prediction modeling (Arditi & Pulket, 2010). These studies used AI to enhance outcome prediction in conventional construction procurement litigation.
Furthermore, Chau (2007) found that, excluding the above case studies, AI techniques are rarely applied in the legal field (Chau, 2007). Thus, Chau utilized AI techniques based on particle swarm optimization to predict construction litigation outcomes, a field in which relatively new data mining (DM) techniques are rarely applied. The network developed by Chau achieved a prediction accuracy rate of 80%, much higher than chance. Nevertheless, Chau suggested using additional case factors related to cultural, psychological, social, environmental, and political characteristics in future work.

For construction disputes triggered by change orders in the construction process and design, Chen (2008) developed a K Nearest Neighbour (KNN) pattern classification scheme that identifies potential lawsuits based on a nationwide study of US court records (Chen, 2008). Chen indicated that the KNN approach has an 84.38% classification accuracy. Chen and Hsu (2007) further applied a hybrid artificial neural network-case based reasoning (ANN–CBR) model to a disputed change order dataset to obtain early-warning information. Their classifier achieved a prediction rate of 84.61% (Chen & Hsu, 2007).

Although many studies have used CBR and its variations to identify similar dispute cases as references for dispute settlements, Cheng, Tsai, and Chiu (2009) further refined and improved the conventional CBR approach by combining fuzzy set theory with a new similarity measurement that integrates Euclidean distance and cosine angle distance (Cheng et al., 2009). Their model successfully extracted the knowledge and experience of experts from 153 construction dispute cases collected manually from multiple sources.

Generally, all related work focused on either specific change order disputes or conventional contracting projects.
Characteristics and environments for construction projects under the PPP strategy, however, differ markedly from those of conventional contractor-owner relationships and require insightful analyses via AI or DM techniques, with exploratory comparisons of modeling performance, to assist government agencies in predicting likely dispute outcomes before disputes occur.

Since a dispute always involves numerous complex and interconnected factors that are difficult to rationalize, using DM techniques is now among the most effective methods for identifying hidden relationships between available or accessible attributes and dispute resolution methods (Arditi & Pulket, 2005, 2010; Arditi & Tokdemir, 1999a; El-Adaway & Kandil, 2010; Kassab, Hegazy, & Hipel, 2010; Pulket & Arditi, 2009). Identifying these variables will provide practitioners with an improved understanding of the complexity of PPP project disputes.

The DM- and AI-based approaches are related to computer system programs that attempt to resolve problems intelligently by emulating human brain processes. As AI technology enhances the ability of computer programs to handle tasks at which humans remain superior (Haykin, 1999), AI techniques are typically applied to solve prediction and classification problems. Researchers in various scientific and engineering fields have recently combined different AI models to enhance their efficacy.

Numerous studies have demonstrated that hybrid AI schemes generate promising results in many industries (Adeodato, Arnaud, Vasconcelos, Cunha, & Monteiro, 2011; Andrawis, Atiya, & El-Shishiny, 2011; Arditi & Pulket, 2010; Chen, 2007; Chou, Chiu, Farfoura, & Al-Taharwa, 2011; Chou, Tai, & Chang, 2010; Kim & Shin, 2007; Lee, 2009; Li et al., 2005; Min, Lee, & Han, 2006; Nandi et al., 2004; Wichard, 2011; Wu, Tzeng, & Lin, 2009; Wu, 2010).
Notably, selecting the most appropriate combination of AI models is both difficult and time-consuming, so further attempts are not worthwhile unless prediction performance improves significantly.
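
To make the idea of combining classifiers concrete, the following is a minimal Python sketch of one common combination rule, simple majority voting, together with the classification accuracy used to compare models. It is an illustration under stated assumptions, not the combination scheme used in this study, which this section does not specify. The three rule functions, the attribute names (`phase`, `cost`), and the toy cases are all hypothetical stand-ins for trained models such as QUEST, CHAID, or C5.0 and for real project records.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence

# A classifier is any function mapping a project record to a resolution label.
Classifier = Callable[[dict], Hashable]


def majority_vote(classifiers: Sequence[Classifier], case: dict) -> Hashable:
    """Return the most frequent predicted label.

    Ties are broken in favour of the earliest classifier in the list.
    """
    if not classifiers:
        raise ValueError("at least one classifier is required")
    votes = [clf(case) for clf in classifiers]
    counts = Counter(votes)
    top = max(counts.values())
    # Scan votes in classifier order so that tie-breaking is deterministic.
    for label in votes:
        if counts[label] == top:
            return label


def accuracy(classifiers: Sequence[Classifier],
             cases: Sequence[dict],
             labels: Sequence[Hashable]) -> float:
    """Fraction of cases whose voted label matches the known label."""
    hits = sum(majority_vote(classifiers, c) == y for c, y in zip(cases, labels))
    return hits / len(labels)


# Hypothetical hand-written rules standing in for trained models.
def rule_a(case: dict) -> str:
    return "mediation" if case["phase"] == "build" else "litigation"


def rule_b(case: dict) -> str:
    return "mediation" if case["cost"] < 100 else "arbitration"


def rule_c(case: dict) -> str:
    return "litigation" if case["cost"] >= 100 else "mediation"


CLASSIFIERS = [rule_a, rule_b, rule_c]

# Toy records and labels, invented purely for illustration.
CASES = [
    {"phase": "build", "cost": 50},
    {"phase": "operate", "cost": 200},
    {"phase": "build", "cost": 200},
]
LABELS = ["mediation", "litigation", "arbitration"]
```

Calling `accuracy(CLASSIFIERS, CASES, LABELS)` on the toy data scores the voted ensemble in the same way a single model would be scored, which is what allows a combination to be compared directly against its members. A combination is only worth keeping when that score improves meaningfully over the best individual model.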