key: cord-0179800-8a6juvyw authors: Sharma, Abhinav; Deshpande, Advait; Wang, Yanming; Xu, Xinyi; Madumal, Prashan; Hou, Anbin title: Searching k-Optimal Goals for an Orienteering Problem on a Specialized Graph with Budget Constraints date: 2020-11-02 journal: nan DOI: nan sha: 5fa00ef714d88e2f908c1d98d9df926c4733277f doc_id: 179800 cord_uid: 8a6juvyw

We propose a novel non-randomized anytime orienteering algorithm for finding k-optimal goals that maximize reward on a specialized graph with budget constraints. This specialized graph represents a real-world scenario that is analogous to an orienteering problem of finding the k most optimal goal states.

The Orienteering Problem (OP) is a special case of the Informative Path Planning (IPP) problem in which rewards at different nodes are calculated independently of each other. However, the OP is NP-hard and is mostly solved with heuristic-based search strategies and customized algorithms (Wei and Zheng 2020). We aim to solve a domain-related orienteering problem that can be formalized on a specialized directed weighted graph. First, we construct a specialized graph that maps the Parkville campus of the University of Melbourne. We then use this graph to formalize our problem of finding the most optimal nearest building from a starting building such that the reward is maximized within the given travelling budget. The proposed non-randomized algorithm is applied to find the k most optimal nearest buildings inside the campus from a given starting building, as discussed in the results section. We also show how COVID-19 lock-down restrictions can be incorporated into our algorithm to solve our defined orienteering problem.

Below, we formulate our domain-related optimal-building-finding problem as a generic orienteering problem (OP) on a specialized graph.
Let us assume a weighted directed specialized graph $G_s = (V, E)$ with $n$ nodes, where $V = \{v_1, v_2, v_3, \ldots, v_n\}$, $E$ is the set of directed edges, and $v_s \in V$ is the pre-defined start node. Here, $v_s$ has out-degree $n$ and in-degree 0 (i.e. $v_s$ is connected to every other node in $V$), and every $v_i$ with $i \in [1, n]$ and $v_i \neq v_s$ is connected only to $v_s$, with in-degree 1 and out-degree 0. Let $v_g$ be the set of k-optimal goal nodes s.t. $v_g \subseteq V$ and $k \le n$. These goals are attained in decreasing order of their gained rewards, subject to the budget constraints (i.e. $v_{g_1} > v_{g_2} > \ldots > v_{g_k}$). Let $r$ be the set of nodes which we can visit such that $r \subseteq V \setminus v_s$. Let $B$ be the travelling budget that enforces the budget constraints. Let $O$ be the generic objectives and $F$ be the generic factors that can be used to tweak the reward function of the problem. Using the above notation, the hard-constraint problem can then be defined by equation 1. We can relax this hard constraint by introducing a hyper-parameter $\delta$ to formulate a soft-constraint problem as shown in equation 2, where $\delta \in \mathbb{R}^+_0 \cup \{\infty\}$. Informally, the solution to our stated problem is an ordered set of k-optimal goal nodes such that the reward obtained by visiting each node is maximized while the path cost stays within the specified travelling budget $B$.

Copyright © 2021, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

In this section, we propose a novel way of solving the problem formulated in equation 2, inspired by the general randomized algorithm for IPP problems (Arora and Scherer 2017). The algorithm initializes a priority queue and creates the subset $r$ s.t. $r \subseteq V \setminus v_s$. Then, for each node in $r$, the path cost $C(r)$ and the node reward $I(r)$ are calculated. If the budget constraint is satisfied, the node is pushed into the priority queue with its negative reward as the priority.
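The push phase described above, followed by k pops of the queue, can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `k_optimal_goals`, its parameters, and the toy node data are hypothetical, and the feasibility test `cost <= budget + delta` is an assumed reading of the soft constraint with slack δ.

```python
import heapq
import math
from typing import Callable, Hashable, Iterable, List, Tuple


def k_optimal_goals(
    candidates: Iterable[Hashable],
    cost: Callable[[Hashable], float],
    reward: Callable[[Hashable], float],
    budget: float,
    k: int,
    delta: float = 0.0,
) -> List[Tuple[Hashable, float]]:
    """Return up to k (node, reward) pairs in decreasing order of reward.

    Assumption: a node is feasible when cost(node) <= budget + delta.
    delta = 0 gives the hard constraint; delta = math.inf removes it.
    """
    heap: list = []
    for order, node in enumerate(candidates):
        if cost(node) <= budget + delta:
            # heapq is a min-heap, so the negated reward is the priority.
            # `order` breaks ties so nodes themselves are never compared.
            heapq.heappush(heap, (-reward(node), order, node))

    goals: List[Tuple[Hashable, float]] = []
    while heap and len(goals) < k:
        neg_reward, _, node = heapq.heappop(heap)
        goals.append((node, -neg_reward))
    return goals


# Hypothetical toy data: path cost in meters from the start node, and reward.
toy_costs = {"A": 120.0, "B": 230.0, "C": 260.0, "D": 80.0}
toy_rewards = {"A": 0.4, "B": 0.9, "C": 1.0, "D": 0.7}

print(
    k_optimal_goals(
        toy_costs,
        toy_costs.__getitem__,
        toy_rewards.__getitem__,
        budget=200.0,
        k=3,
        delta=50.0,
    )
)
# → [('B', 0.9), ('D', 0.7), ('A', 0.4)]
```

Node C is dropped because its cost of 260 exceeds the relaxed budget of 250, while B is admitted only because of the slack. Each push and pop costs O(log n) on a binary heap, so the loop structure mirrors the n − 1 pushes and k pops counted in the complexity analysis.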
We can then pop the queue item with minimum priority k times to obtain the k most optimal goal nodes. This process is described in Algorithm 1.

Time Complexity. If we assume a standard binary heap implementation of the priority queue, then insertion and deletion each take O(log n) time, where n is the size of the input (Atkinson et al. 1986). This can be further optimized by several customizations (Edelkamp, Elmasry, and Katajainen 2017). The algorithm performs at most n − 1 insertions followed by k deletions, so its time complexity in both the best and the worst case is $O((n - 1) \log n) + O(k \log n) = O(n \log n)$, since $k \le n$.

Space Complexity. If we again assume a heap implementation of the priority queue, then the space complexity of storing n elements in the priority queue is O(n) (Atkinson et al. 1986). Hence, the best-case and worst-case space complexity of our proposed algorithm is O(n).

Limitations. Our algorithm relies on the assumption that the graph is a specialized weighted directed graph with one central node (in-degree 0 and out-degree n) and n other nodes, each connected only to the central node. Because of this assumption, the algorithm is efficient and applicable only for such specialized graphs and cannot be extended directly to a general weighted directed graph.

In this section, we show experimental results for a domain-specific orienteering problem solved using our proposed algorithm. Here, our goal is to find the k most optimal nearest buildings inside the Parkville campus of the University of Melbourne. These buildings should be within a specific radius ($B$) and should maximise the chances (reward) of either booking a meeting room or using a toilet facility, based on supply, demand and other preferences or factors. A specific scenario is shown in Figure 1, where R(.) are the rewards given by the buildings with no factors and R(COVID) are the rewards based on COVID-19 lock-down restrictions.
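To illustrate how a reward could depend on an objective and on factors such as lock-down restrictions, the sketch below shows one possible shape for such a function. The text does not specify the form of the reward function, so everything here is an assumption made for illustration: the `Building` fields, the factor name `"covid"`, and the supply/(supply + demand) ratio are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Collection, Dict


@dataclass
class Building:
    """Hypothetical record of per-objective supply and demand."""

    name: str
    supply: Dict[str, int] = field(default_factory=dict)  # e.g. free rooms
    demand: Dict[str, int] = field(default_factory=dict)  # e.g. open requests
    open_in_lockdown: bool = True  # hypothetical COVID-19 factor


def reward(building: Building, objective: str,
           factors: Collection[str] = ()) -> float:
    """Illustrative reward in [0, 1]; not the reward used in the paper."""
    # Assumed factor handling: a building closed under lock-down
    # restrictions yields no reward when the "covid" factor is active.
    if "covid" in factors and not building.open_in_lockdown:
        return 0.0
    supply = building.supply.get(objective, 0)
    demand = building.demand.get(objective, 0)
    if supply == 0:
        return 0.0
    # Assumed base reward: share of supply relative to supply plus demand.
    return supply / (supply + demand)


# Hypothetical example values, not taken from the experiments.
library = Building("library", {"meeting_room": 3}, {"meeting_room": 1},
                   open_in_lockdown=False)
print(reward(library, "meeting_room"))             # → 0.75
print(reward(library, "meeting_room", {"covid"}))  # → 0.0
```

Keeping the factors as an explicit argument means the same search procedure can be rerun with or without restrictions simply by swapping the reward that is passed to it.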
Table 1 shows the results for the stated scenario for the 3 optimal nearest buildings using our proposed algorithm. In addition, we were also able to simulate a COVID-19 restriction scenario by enhancing the reward function R(r, o, f), obtaining the results shown in Table 2.

Figure 1: Finding the $k = 3$ most optimal nearest buildings from $v_s = 220$ that maximise the chances (reward) of booking a meeting room within $B = 200$ meters and $\delta = 50$ meters.

Arora and Scherer 2017. Randomized algorithm for informative path planning with budget constraints.
Atkinson et al. 1986. Heaps and Generalized Priority Queues.
Edelkamp, Elmasry, and Katajainen 2017. Optimizing Binary Heaps. Theory of Computing Systems 61.
Wei and Zheng 2020. Informative Path Planning for Mobile Sensing with Reinforcement Learning.