key: cord-0046956-jyitg4h9 authors: Chen, Penghe; Lu, Yu; Peng, Yan; Liu, Jiefei; Xu, Qi title: Identification of Students’ Need Deficiency Through a Dialogue System date: 2020-06-10 journal: Artificial Intelligence in Education DOI: 10.1007/978-3-030-52240-7_11 sha: 679ac9c9ae2f64ed41416503e5161f1635939422 doc_id: 46956 cord_uid: jyitg4h9
In the domain of moral education, students’ need deficiency refers to an unsatisfied need that results in problem behavior. Timely and accurate identification of students’ need deficiency is crucial to moral education and to the students themselves. Previous psychology research, which focuses on distinct factors, provides only scattered guidelines for identifying such need deficiencies, and few teachers and parents have the related expertise, which makes the identification task difficult to accomplish. To address these issues, we develop a task-oriented dialogue system that helps teachers and parents identify students’ need deficiency through multi-turn dialogues. Specifically, the relevant factors of need deficiency are summarized based on psychology theories, which provides a theoretical foundation for the proposed system. In addition, reinforcement learning is adopted to learn the dialogue policy of the designed system. Experimental results demonstrate that the developed dialogue system achieves its design objectives.
In the moral education domain, students' need deficiency, that is, an unsatisfied need, is what drives students' problem behavior, such as playing truant and fighting in school [3, 8]. In this work, based on Maslow's Hierarchy of Needs [8], need deficiency is divided into five specific types: physiological needs, safety needs, belongingness and love needs, esteem needs, and cognition needs. Timely and accurate identification of students' need deficiency is crucial for reducing and modifying students' problem behavior.
Past literature has demonstrated that effective moral education can promote behavioral improvement [5]. Accordingly, extensive research has analyzed the factors behind problem behavior and need deficiency. For instance, researchers found that an uninvolved parenting style leads to a higher probability of externalizing problems [11], and that boys are more likely to exhibit aggressive behavior [6]. These findings are informative for need deficiency identification, but they are too scattered to be applied systematically by teachers and parents who lack the expertise. The education domain therefore needs a system that encompasses the relevant psychology theories yet is easy to use without mastery of those theories.
With advances in artificial intelligence, task-oriented dialogue systems, which aim to complete a specific task through natural language interaction, have been applied in different fields, such as ticket booking [7], restaurant search [13], disease diagnosis [12], and moral education [10]. Adopting a dialogue system can significantly improve service efficiency and accessibility in these domains. We are therefore motivated to develop a task-oriented dialogue system for need deficiency identification, which offers three main advantages. First, the dialogue system is designed according to psychology research findings, which ensures that the identification is consistent with the relevant theories. Second, unlike a supervised classification model, which requires information on every aspect of each student to make an inference, the dialogue system relies only on the necessary information and acquires it adaptively, which significantly reduces service cost and improves applicability. Third, the natural-language interaction makes the dialogue system easy to use without mastering psychology theories.
Note that the dialogue system is mainly designed as an assistant tool that gives professional suggestions on the need deficiency behind students' problem behavior, rather than directly providing a complete solution.
We next explain the proposed dialogue system in detail. As shown in Fig. 1, the proposed dialogue system consists of four main modules: Natural Language Understanding (NLU), Dialogue State Tracking (DST), Policy Learning (PL), and Natural Language Generation (NLG) [2]. The NLU module interprets the user's input to identify the intention and the semantic slots. For example, the user input "The student is boy." is interpreted as "inform(gender=boy)". Given the output of NLU, the DST module updates the dialogue state, which represents the student's information. Based on the dialogue state, the PL module decides the next system action. An action can be, for example, "inform(deficiency=belongingness and love needs)" to inform the user of the need deficiency, or "request(parenting style)" to ask for more information. Subsequently, NLG composes a natural language response based on the system action. For instance, "request(parenting style)" generates the output "Do you know which kind of parenting style his family perform?". Through multi-turn dialogue, the system can acquire the necessary information about a student and automatically infer the need deficiency behind the student's problem behavior.
There are two main challenges in developing this dialogue system. One is how to properly define the semantic slots for need deficiency identification, because they are the basis for the design of the dialogue state and the system actions. The other is how to properly design the dialogue policy, because it controls how the system collects the necessary information and infers the need deficiency. To solve these problems, we first summarize the main factors related to need deficiency identification based on previous psychology research findings.
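The flow from NLU output through DST to the next system action, as described for the four modules above, can be illustrated with a minimal sketch. It is not the authors' implementation: the class names, the `REQUIRED_SLOTS` list, the underscore form `parenting_style`, and the rule-based `rule_policy` are all hypothetical stand-ins, the NLU step is skipped by writing the dialogue act by hand, and the final `inform` uses a fixed placeholder instead of a learned inference.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class DialogueAct:
    """A dialogue act such as inform(gender=boy) or request(parenting_style)."""
    intent: str                    # "inform" or "request"
    slot: str
    value: Optional[str] = None

    def __str__(self) -> str:
        if self.value is None:
            return f"{self.intent}({self.slot})"
        return f"{self.intent}({self.slot}={self.value})"


@dataclass
class DialogueState:
    """DST: the slot values collected about the student so far."""
    slots: Dict[str, str] = field(default_factory=dict)

    def update(self, act: DialogueAct) -> None:
        # Only inform acts carry new information about the student.
        if act.intent == "inform" and act.value is not None:
            self.slots[act.slot] = act.value


# Hypothetical slot list, chosen only for illustration.
REQUIRED_SLOTS: List[str] = ["gender", "parenting_style"]
# Placeholder conclusion; a real system would infer it with a learned policy.
PLACEHOLDER_DEFICIENCY = "belongingness and love needs"


def rule_policy(state: DialogueState) -> DialogueAct:
    """PL stand-in: request the first missing slot, then inform a conclusion."""
    for slot in REQUIRED_SLOTS:
        if slot not in state.slots:
            return DialogueAct("request", slot)
    return DialogueAct("inform", "deficiency", PLACEHOLDER_DEFICIENCY)


state = DialogueState()
# Hand-written NLU output for the user input "The student is boy."
state.update(DialogueAct("inform", "gender", "boy"))
print(rule_policy(state))  # the next system action, as a dialogue act
```

In this sketch the fixed rule only marks where the policy sits in the loop; in the system described here, that decision is learned rather than hand-coded.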
Specifically, based on the Teacher's Report Form [1] and Problem Behavior Theory [4], we classify the relevant factors into three categories: problem behavior, internal individual characteristics, and external environmental characteristics. These categories provide the foundation for defining the semantic slots and building the dialogue system. Second, we adopt reinforcement learning, specifically the deep Q-network (DQN) model [9], to learn the dialogue policy so that the dialogue system can automatically request essential information from the user to identify students' need deficiency. To learn the dialogue policy, a user simulator is developed to emulate users based on real-life cases collected from an online platform.
We next explain the DQN in detail. The state $s_t$ consists of the specific factors of problem behavior, internal individual characteristics, and external environmental characteristics; the action $a_t$ is a request or an inform action; and the reward $r_t$ is the immediate reward the system obtains at state $s_t$ after taking action $a_t$. The DQN model aims to find the optimal Q-value $Q^*(s_t, a_t; \theta)$:
$$Q^*(s_t, a_t; \theta) = \mathbb{E}_{s_{t+1}}\left[ r_t + \gamma \max_{a_{t+1}} Q^*(s_{t+1}, a_{t+1}; \theta) \,\middle|\, s_t, a_t \right]$$
where $t$ and $t+1$ denote the current step and the next step respectively, $\gamma \in [0, 1]$ denotes the discount factor, and $\theta$ denotes the model parameters. The optimal policy $\pi^*(s_t)$ selects, at each state, the action that yields the optimal Q-value. To learn the model parameters $\theta$, the $\epsilon$-greedy strategy is employed to balance the trade-off between exploration and exploitation in reinforcement learning. In addition, experience replay and periodic updating of the target network parameters are employed to train the model [9] by optimizing the loss function:
$$L(\theta) = \mathbb{E}_{s_t, a_t, r_t, s_{t+1}}\left[ \left( y - Q(s_t, a_t; \theta) \right)^2 \right]$$
where $y = r_t + \gamma \max_{a_{t+1}} Q(s_{t+1}, a_{t+1}; \theta^-)$ denotes the target Q-value, computed by summing the current reward $r_t$ and the discounted optimal Q-value of the subsequent step under the target network parameters $\theta^-$.
Dataset.
We obtain the data used to identify need deficiency from real-life cases posted on an online moral education platform. In total, 689 cases are collected and converted into a structured format in accordance with the defined moral education framework. Specifically, each case is manually annotated by two experts, and the Kappa value between the two annotations is 0.83. To ensure data quality, cases with insufficient information are excluded, which yields a dataset of 628 cases for building the dialogue system.
To examine system performance with training sets of different sizes, the experiments are conducted with 50%, 60%, 70%, 80%, and 90% of the data used for training and the rest for testing. The results are presented in Fig. 2, where success denotes the success rate, reward denotes the average reward, and turns denotes the average number of turns. Two findings stand out. First, the dialogue system achieves a success rate between 0.4 and 0.44 across the different training set sizes, showing the effectiveness of our system in identifying need deficiency. Second, the dialogue system returns a result within just 11 turns on average, which suggests that it has learned which essential factors to request in order to infer the need deficiency.
In this work, we designed and implemented a task-oriented dialogue system for the identification of need deficiency in moral education. Using factors summarized from psychology theories, the DQN model of reinforcement learning was adopted to learn the optimal dialogue policy. Experimental results demonstrated that the dialogue system can achieve a success rate of around 0.44 with only 11 dialogue turns on average.
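The policy-learning ingredients named in this work, $\epsilon$-greedy action selection and a squared-error loss against a target computed from the target network, can be made concrete with a small sketch. It is an illustration under stated assumptions, not the authors' code: Q-values are passed in as plain lists rather than produced by a neural network, the function names are hypothetical, and dropping the bootstrap term on terminal transitions is a common convention that the text does not state.

```python
import random
from typing import List, Sequence, Tuple


def epsilon_greedy(q_values: Sequence[float], epsilon: float,
                   rng: random.Random) -> int:
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))            # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit


def td_target(reward: float, next_q_target: Sequence[float],
              gamma: float, terminal: bool) -> float:
    """y = r_t + gamma * max_a Q(s_{t+1}, a; theta^-).

    Assumption: on a terminal transition the bootstrap term is dropped.
    """
    if terminal:
        return reward
    return reward + gamma * max(next_q_target)


# One replayed transition: (Q(s_t, a_t; theta), r_t, Q(s_{t+1}, .; theta^-), terminal)
Transition = Tuple[float, float, List[float], bool]


def dqn_loss(batch: Sequence[Transition], gamma: float) -> float:
    """Mean squared error between the targets y and the current Q-values."""
    errors = [(td_target(r, next_q, gamma, done) - q_sa) ** 2
              for q_sa, r, next_q, done in batch]
    return sum(errors) / len(errors)


# Made-up numbers, chosen only so the arithmetic is easy to follow.
batch: List[Transition] = [
    (1.0, 1.0, [0.5, 2.0], False),   # y = 1.0 + 0.5 * 2.0 = 2.0
    (0.0, -1.0, [3.0], True),        # y = -1.0 (terminal)
]
print(dqn_loss(batch, gamma=0.5))
print(epsilon_greedy([0.1, 0.7, 0.3], epsilon=0.0, rng=random.Random(0)))
```

In a full DQN, `dqn_loss` would be minimized by gradient descent on the network parameters over minibatches drawn from the replay buffer, with the target-network values refreshed periodically.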
[1] The Achenbach system of empirically based assessment (ASEBA) for ages 1.5 to 18 years
[2] A survey on dialogue systems: recent advances and new frontiers
[3] Counseling children in crisis based on Maslow's hierarchy of basic needs
[4] Problem Behavior and Psychosocial Development: A Longitudinal Study of Youth
[5] A meta-analysis on the relationship between character education and student achievement and behavioral outcomes
[6] Gender aggression and mental health intervention during early adolescence
[7] End-to-end task-completion neural dialogue systems
[8] A theory of human motivation
[9] Human-level control through deep reinforcement learning
[10] A task-oriented dialogue system for moral education
[11] Associations of parenting dimensions and styles with externalizing problems of children and adolescents: an updated meta-analysis
[12] Inquire and diagnose: neural symptom checking ensemble using deep reinforcement learning
[13] A network-based end-to-end trainable task-oriented dialogue system
Acknowledgment. This research is partially supported by the National Natural Science Foundation of China (No. 61807003), the Fundamental Research Funds for the Central Universities, and sponsored by CCF-Tencent Open Fund.