key: cord-319828-9ru9lh0c authors: Shi, Shuyun; He, Debiao; Li, Li; Kumar, Neeraj; Khan, Muhammad Khurram; Choo, Kim-Kwang Raymond title: Applications of Blockchain in Ensuring the Security and Privacy of Electronic Health Record Systems: A Survey date: 2020-07-15 journal: Comput Secur DOI: 10.1016/j.cose.2020.101966 sha: doc_id: 319828 cord_uid: 9ru9lh0c
Due to the popularity of blockchain, many applications of blockchain have been proposed in the healthcare sector, such as electronic health record (EHR) systems. Therefore, in this paper we perform a systematic literature review of blockchain approaches designed for EHR systems, focusing only on the security and privacy aspects. As part of the review, we introduce relevant background knowledge relating to both EHR systems and blockchain, prior to investigating the (potential) applications of blockchain in EHR systems. We also identify a number of research challenges and opportunities.
Governments and related industry sectors are increasingly interested in digitalizing healthcare systems, as partly evidenced by various initiatives taking place in different countries and sectors. For example, the then U.S. president signed into law the Health Information Technology for Economic and Clinical Health (HITECH) Act of 2009, as part of the American Recovery and Reinvestment Act of 2009. HITECH is designed to encourage broader adoption of electronic health records (EHRs), with the ultimate aim of benefiting patients and society. The potential benefits associated with EHR systems (e.g. public healthcare management, online patient access, and sharing of patients' medical data) have also attracted the interest of the research community [1, 2, 3, 4, 5, 6, 7, 8, 9]. The potential of EHRs is also evidenced by the recent 2019 novel coronavirus (also referred to as 2019-nCoV and COVID-2019) pandemic, where remote patient monitoring and other forms of healthcare delivery are increasingly used in order to contain the situation.
As with any maturing consumer technology, there are a number of research and operational challenges. For example, many existing EHR systems use a centralized server model, and hence such deployments inherit the security and privacy limitations associated with that model (e.g. single point of failure and performance bottleneck). In addition, as EHR systems become more commonplace and the importance of data (particularly healthcare data) becomes better understood, honest but curious servers may surreptitiously collect personal information of users while carrying out their normal activities.
In recent times, there is an increasing trend in deploying blockchain in a broad range of applications, including healthcare (e.g. public healthcare management, counterfeit drug prevention, and clinical trials) [10, 11, 12]. This is not surprising, since blockchain is an immutable, transparent and decentralized distributed database [13] that can be leveraged to provide a secure and trustworthy value chain. An architecture of blockchain-based healthcare systems is shown in Fig. 1.
Blockchain is a distributed ledger database on a peer-to-peer (P2P) network that comprises a chronologically ordered list of blocks. In other words, it is a decentralized and trustworthy distributed system (one that does not rely on any third party). Trust among distributed nodes is established by mathematical methods and cryptographic technologies instead of semi-trusted central institutions. Blockchain-based systems can mitigate the limitation of the single point of failure. Besides, since data is recorded in the public ledger, and all nodes in the blockchain network hold ledger backups and can access these data anytime and anywhere, such a system ensures data transparency and helps to build trust among distributed nodes. It also facilitates data audit and accountability, since the tamper-resistant historical record in the ledger can be traced.
Depending on the actual deployment, data in the ledger can be stored in encrypted form using different cryptographic techniques, hence preserving data privacy. Users can also protect their real identities through pseudo-anonymity. To enhance robustness, we can introduce smart contracts (i.e. a kind of self-executing program deployed on the distributed blockchain network) to support diverse functions for different application scenarios. Specifically, the terms of a smart contract can be preset by users, and the smart contract will only be executed if the terms are fulfilled. Hence, this hands over control to the owner of the data. There are a (small) number of real-world blockchain-based healthcare systems, such as Gem, Guardtime and healthbank [14].
Hence, in this paper we focus on blockchain-based healthcare systems. Specifically, we will comprehensively review some existing work, and identify existing and emerging challenges and potential research opportunities. Prior to presenting the results of our review, we will first introduce EHR systems and blockchain architecture in the next section. Then, in Section 3, we will review the extant literature and provide a comparative summary of some existing systems. In Section 4, we identify a number of potential research opportunities. Finally, we conclude the paper in the last section.
In a centralized architecture, such as those that underpin a conventional EHR system, a central institution is tasked with managing, coordinating and controlling the entire network. In a distributed architecture, however, all nodes are maintained without relying on a central authority. We will now briefly explain the EHR system and blockchain technology.
The electronic health record (EHR) is generally defined to be the collection of patients' electronic health information (e.g. in the form of electronic medical records, or EMRs).
EMRs can serve as a data source for the EHR, mainly from healthcare providers in medical institutions. The personal health record (PHR) contains personal healthcare information, such as that obtained from wearable devices owned and controlled by patients. Information collected as part of PHRs can be made available to healthcare providers by users (patients).
In theory, EHR systems should ensure the confidentiality, integrity and availability of the stored data, and data should be shared securely among authorized users (e.g. medical practitioners with a legitimate need to access a particular patient's data to facilitate diagnosis). In addition, such a system, if implemented well, can reduce data replication and the risk of lost records. However, the challenge of securing data in such systems, whether in transit or at rest, is compounded by the increasing connectivity to these systems (e.g. more potential attack vectors). For example, mobile devices that can sync with the EHR system are a potential attack vector: an attacker can seek to exploit a known vulnerability in hospital-issued mobile devices and install malware to facilitate covert exfiltration of sensitive data (e.g. PHRs).
One of the key benefits of EHR systems is the availability of large volumes of data, which can be used to facilitate data analysis and machine learning, for example to inform other medical research efforts such as disease forecasting (e.g. for the 2019 novel coronavirus). Furthermore, wearable and other Internet of Things (IoT) devices can collect and upload relevant information, including that relating to PHRs, to the EHR systems, which can facilitate healthcare monitoring and personalized health services.
Blockchain was made popular by the success of Bitcoin [15], and can be used to facilitate trustworthy and secure transactions across an untrusted network without relying on any centralized third party.
We will now introduce the fundamental building blocks of the blockchain [16, 17, 18].
Blockchain is a chronological sequence of blocks, each including a list of complete and valid transaction records. Each block is linked to the previous block by a reference (hash value), thus forming a chain. The block preceding a given block is called its parent block, and the first block is known as the genesis block. A block [15] consists of the block header and the block body. The block header contains:
• Block version: block validation rules;
• Previous block hash: hash value of the previous block;
• Timestamp: the creation time of the current block;
• Nonce: a 4-byte random field that miners adjust for every hash calculation to solve a PoW mining puzzle (see also Section 2.2.2);
• Body root hash: hash value of the Merkle tree root built from the transactions in the block body;
• Target hash: target threshold for the hash value of a new valid block. The target hash is used to determine the difficulty of the PoW puzzle (see also Section 2.2.2).
A Merkle tree is used to store all the valid transactions: every leaf node is a transaction and every non-leaf node is the hash value of its two concatenated child nodes. Such a tree structure is efficient for verifying the existence and integrity of a transaction, since any node can confirm the validity of any transaction from the hash values of the corresponding branches rather than the entire Merkle tree. Meanwhile, any modification of a transaction will generate a new hash value in the upper layer, and this will result in a falsified root hash. Besides, the maximum number of transactions that a block can contain depends on the size of each transaction and the block size.
These blocks are then chained together using a cryptographic hash function in an append-only structure. That is, new data is only appended in the form of additional blocks chained to previous blocks, since altering or deleting previously confirmed data is computationally infeasible.
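The Merkle tree and hash-linking described above can be illustrated with a minimal sketch. It is not the structure of any particular blockchain: SHA-256, the simplified header layout (previous hash, body root hash, timestamp, nonce), the field widths, and the rule of duplicating the last node on odd-sized levels are all assumptions made for illustration, and the function names are hypothetical.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(transactions: list) -> bytes:
    """Root hash of a Merkle tree whose leaves are hashed transactions."""
    if not transactions:
        return sha256(b"")
    level = [sha256(tx) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumption: an odd-sized level is padded by repeating its last node.
            level.append(level[-1])
        # Each parent is the hash of its two concatenated children.
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def block_hash(prev_hash: bytes, body_root: bytes, timestamp: int, nonce: int) -> bytes:
    """Hash of a simplified block header (hypothetical field layout)."""
    header = prev_hash + body_root + timestamp.to_bytes(8, "big") + nonce.to_bytes(4, "big")
    return sha256(header)


def build_chain(bodies: list) -> list:
    """Chain blocks so that each header stores the hash of its parent block."""
    chain = []
    prev = b"\x00" * 32  # placeholder parent hash for the genesis block
    for height, txs in enumerate(bodies):
        root = merkle_root(txs)
        digest = block_hash(prev, root, height, 0)
        chain.append({"prev": prev, "root": root, "timestamp": height,
                      "nonce": 0, "txs": list(txs), "hash": digest})
        prev = digest
    return chain


def chain_is_valid(chain: list) -> bool:
    """Recompute every root hash and link; any modification breaks one of them."""
    prev = b"\x00" * 32
    for block in chain:
        if block["prev"] != prev:
            return False
        if merkle_root(block["txs"]) != block["root"]:
            return False
        if block_hash(block["prev"], block["root"],
                      block["timestamp"], block["nonce"]) != block["hash"]:
            return False
        prev = block["hash"]
    return True
```

In this sketch, changing a single transaction in an early block changes that block's root hash, so validation fails unless every later block is also recomputed.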
As previously discussed, any modification of one of the blocks will generate a different hash value and a different link relation. This achieves immutability and security.
Digital signatures based on asymmetric cryptography are generally used for transaction authentication in an untrustworthy environment [19, 20]. Blockchain uses an asymmetric cryptography mechanism to send transactions and verify the authenticity of transactions. A transaction that fails verification is discarded in this process. Only valid transactions can be stored in a new block of the blockchain network.
We will take a coin transfer as an example (see Fig. 3). Alice transfers a certain amount of coins to Bob. In step 1, she initiates a transaction signed with her private key. The transaction can be easily verified by others using Alice's public key. In step 2, the transaction is broadcast to other nodes through the P2P network. In step 3, each node verifies the transaction against predefined rules. In step 4, each validated transaction is packed chronologically and appended to a new block once a miner solves the puzzle. Finally, every node updates and backs up the new block.
In the blockchain network, there is no trusted central authority. Thus, reaching a consensus on these transactions among untrustworthy nodes in a distributed network is an important issue, which is a variant of the Byzantine Generals (BG) Problem proposed in [22]. In the BG problem, a group of generals command the Byzantine army encircling a city, and they have no chance of winning the war unless all of them attack at the same time. However, in a distributed environment they cannot be sure whether there are traitors who might retreat. Thus, they have to reach an agreement to attack or retreat. The blockchain network faces the same challenge.
A number of protocols have been designed to reach consensus among all the distributed nodes before a new block is linked into the blockchain [23], such as the following:
• PoW (Proof of Work) is the consensus mechanism used in Bitcoin. If a miner node with a certain amount of computing (hashing) power wishes to obtain rewards, the miner must perform the laborious task of mining to prove that they are not malicious. The task requires the node to repeatedly perform hash computations to find an eligible nonce value such that the hash of the block header is less than (or equal to) the target hash value. The nonce is difficult to find but easy for other nodes to validate. The task is costly (in terms of computing resources) due to the number of calculations required. A 51% attack is a potential attack in the blockchain network: if a miner or a group of miners controls more than 50% of the computing power, they could interfere with the generation of new blocks and create fraudulent transaction records that benefit the attackers.
• PoS (Proof of Stake) is an improved and energy-saving alternative to PoW. It is believed that nodes with the largest stakes (e.g. currency) are less likely to attack the network. However, selection based on account balance is unfair because the richest node is more likely to dominate the network, which would gradually make the network resemble a centralized system.
Blockchain systems are divided into three types based on the permissions given to network nodes:
• Public blockchain. The public blockchain is open to anyone who wants to join at any time and act as a simple node or as a miner for economic rewards. Bitcoin [15] and Ethereum [25] are two well-known public blockchain platforms.
• Private blockchain. The private blockchain network is based on access control, in which participants must obtain an invitation or permission to join.
GemOS [26] and MultiChain [27] are both typical private blockchain platforms.
• Consortium blockchain. The consortium blockchain is "semi-private", sitting between public and private blockchains. Access is granted to a group of approved organizations, and it is commonly associated with enterprise use to improve business. Hyperledger Fabric [28] is a business consortium blockchain framework. Ethereum also supports building consortium blockchains.
Generally, EHRs mainly contain patient medical history, personal statistics (e.g. age and weight), laboratory test results and so on. Hence, it is crucial to ensure the security and privacy of these data. In addition, hospitals in countries such as the U.S. are subject to exacting regulatory oversight. There are also a number of challenges in deploying and implementing healthcare systems in practice. For example, centralized server models are vulnerable to single-point-of-failure limitations and malicious insider attacks, as previously discussed. Users (e.g. patients) whose data is outsourced or stored in these EHR systems generally lose control of their data, and have no way of knowing who is accessing their data and for what purposes (i.e. a violation of personal privacy). Such information may also be at risk of being leaked by malicious insiders to another organization; for example, an insurance company may deny insurance coverage to a particular patient based on leaked medical history.
Meanwhile, data sharing is increasingly crucial, particularly as our society and population become more mobile. By leveraging the interconnectivity between different healthcare entities, shared data can improve medical service delivery. Overcoming the "Information and Resource Island" (information silo) will be challenging, for example due to privacy concerns and regulations. The information silo also contributes to unnecessary data redundancy and red tape.
In this case, the Health Insurance Portability and Accountability Act (HIPAA) was
• Unique Identifiers Rule. Only the National Provider Identifier (NPI) identifies covered entities in the standard transactions, to protect patient identity information.
• Enforcement Rule. Investigation and penalties for violating HIPAA rules.
There is another common framework for audit trails for EHRs, called ISO 27789, which keeps personal health information auditable across systems and domains. In a system complying with ISO 27789, a secure audit record must be created each time any operation is triggered. Hence, we posit the importance of a collaborative and transparent data sharing system, which also facilitates audit and post-incident investigation or forensics in the event of alleged misconduct (e.g. data leakage). Such a notion (forensic-by-design) is also emphasized by forensic researchers [29, 30].
As a regulatory response to security concerns about how the medical industry manages the distribution, storage and retrieval of health records, Title 21 CFR Part 11 places requirements on medical systems, including measures such as document encryption and the use of digital signature standards to ensure the authenticity, integrity and confidentiality of records.
Based on the relevant standards above, we summarize the following requirements that should be met when implementing next-generation secure EHR systems:
• Accuracy and integrity of data (e.g. any unauthorized modification of data is not allowed, and can be detected);
• Security and privacy of data;
• Efficient data sharing mechanism (e.g. [31]);
• Mechanism to return the control of EHRs back to the patients (e.g. patients can monitor their records and receive notification of loss or unauthorized acquisition);
• Audit and accountability of data (e.g. forensic-by-design [29, 30]).
The above properties can be achieved using blockchain, as explained below:
• Decentralization.
Compared with the centralized model, blockchain no longer needs to rely on a semi-trusted third party.
• Security. A blockchain-based decentralized system is resilient to single points of failure and insider attacks.
• Pseudonymity. Each node is bound to a public pseudonymous address that protects its real identity.
• Immutability. Because of the one-way cryptographic hash function, it is computationally hard to delete or modify any record of any block included in the blockchain.
• Autonomy. Patients hold the rights to their own data and can share their data flexibly through the settings of special items in the smart contract.
• Incentive mechanism. The incentive mechanism of blockchain can stimulate cooperation and sharing among competing institutions to promote the development of medical services and research.
• Auditability. It is easy to keep track of any operation, since every historical transaction is recorded in the blockchain.
Hence, if blockchain is applied correctly in EHR systems, it can help to ensure the security of EHR systems, enhance the integrity and privacy of data, encourage organizations and individuals to share data, and facilitate both audit and accountability.
Based on the requirements for a new generation of secure EHR systems and the characteristics of blockchain discussed in Section 2.3, we will now describe the key goals in the implementation of secure blockchain-based EHR systems:
• Privacy: individual data will be used privately and only authorized parties can access the requested data.
• Security: in the sense of confidentiality, integrity and availability (CIA):
1. Confidentiality: only authorized users can access the data.
2. Integrity: data must be accurate in transit and not be altered by unauthorized entity(ies).
3. Availability: a legitimate user's access to information and resources is not improperly denied.
• Accountability: an individual or an organization will be audited and held responsible for misbehavior.
• Authenticity: the capability to validate the identities of requestors before allowing access to sensitive data.
• Anonymity: entities have no visible identifier, for privacy. Complete anonymity is challenging, and pseudo-anonymity is more common (i.e. users are identified by something other than their actual identities).
In order to satisfy the above goals, existing blockchain-based research in the healthcare domain covers the following main aspects:
• Data storage. Blockchain serves as a trusted ledger database to store a broad range of private healthcare data. Data privacy should be guaranteed when secure storage is achieved. However, healthcare data tends to be large in volume and complex in practice. Hence, a corresponding challenge is how to deal with big data storage without an adverse impact on the performance of the blockchain network.
• Data sharing. In most existing healthcare systems, service providers usually maintain primary stewardship of data. With the notion of self-sovereignty, the trend is to return ownership of healthcare data to the user, who can then share (or not share) their personal data at will. It is also necessary to achieve secure data sharing across different organizations and domains.
• Data audit. Audit logs can serve as proof to hold requestors accountable for their interactions with EHRs when disputes arise. Some systems utilize blockchain and smart contracts to keep a trace for auditability purposes. Any operation or request will be recorded in the blockchain ledger, and can be retrieved at any time.
• Identity manager. The legitimacy of each user's identity needs to be guaranteed in the system. In other words, only legitimate users can make the relevant requests, to ensure system security and avoid malicious attacks.
In the remainder of this section, we will review existing approaches to achieving data storage, data sharing, data audit, and identity management (see Sections 3.1 to 3.4).
According to Section 2.3, one of the solutions to ensure greater security in the EHR system is the use of blockchain technology. However, there are potential privacy problems for all raw or encrypted data in the public ledger, since blockchain as a public database carries the risk of sensitive data being exposed under statistical attack. Some measures should be taken to enhance the privacy protection of sensitive health records in blockchain-based EHR systems. In general, privacy-preserving approaches can be classified into cryptographic approaches (encryption) and non-cryptographic approaches (anonymisation and access control mechanisms).
Encryption is a relatively common method, and includes public key encryption (PKE), symmetric key encryption (SKE), secure multi-party computation (MPC) [33] and so on.
The authors of [35] proposed that sensor data be uploaded using a pair of unique private and public keys in the blockchain network to protect the privacy and security of biometric information.
Zheng et al. [36] proposed that data be encrypted before being uploaded to cloud servers, using a symmetric key scheme (i.e. Rijndael AES [37]) combined with a threshold encryption scheme. The symmetric key is split into multiple shares distributed among different key keepers by Shamir's secret sharing scheme [38]. A data requestor can decrypt the ciphertext only after obtaining enough key shares. The compromise of some key keepers (fewer than the threshold) does not lead to data leakage.
Yue et al. [39] designed a blockchain-based smartphone app using the MPC technique, called Healthcare Data Gateway (HDG). The system allows computations to be run on encrypted data directly on the private blockchain cloud, and the final results to be obtained without revealing the raw data.
Besides, Guo et al. [40] proposed an attribute-based signature scheme with multiple authorities (MA-ABS) in the healthcare blockchain.
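The threshold key-splitting idea used by Zheng et al. [36] above can be sketched with a textbook version of Shamir's secret sharing [38]. This is an illustrative sketch only, not the construction from [36]: the prime field (the Mersenne prime 2^521 - 1), the function names, and the representation of the symmetric key as an integer are all assumptions.

```python
import secrets

# Assumption: a prime field large enough to hold a 256-bit symmetric key.
PRIME = 2**521 - 1


def split_secret(secret: int, threshold: int, num_shares: int) -> list:
    """Split `secret` into `num_shares` shares; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must lie in the field")
    if not 1 <= threshold <= num_shares:
        raise ValueError("need 1 <= threshold <= num_shares")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial at x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares: list) -> int:
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

For example, a 32-byte key converted with `int.from_bytes(key, "big")` could be split with `split_secret(k, 3, 5)` and handed to five key keepers; any three shares reconstruct it, while fewer shares reveal nothing about the key.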
The signature in this scheme attests not to the identity of the patient who endorses a message, but to a claim (such as an access policy) regarding the attributes the patient possesses, as delegated from some authorities. Meanwhile, the system can resist collusion attacks by sharing the secret pseudorandom function (PRF) seeds among authorities.
In order to resist malicious attacks (e.g. statistical attack), healthcare systems using general methods have to change their encryption keys frequently. This incurs costs for the storage and management of a large number of historical keys, since these keys must be kept safely to decrypt historical data in the future. The storage cost is especially significant for devices with limited computational resources and storage.
To address this problem, Zhao et al. [41] designed a lightweight backup and efficient recovery key management scheme for body sensor networks (BSNs), to protect the privacy of sensor data from the human body and greatly reduce the storage cost of secret keys. Fuzzy vault technology is applied for the generation, backup and recovery of keys without storing any encryption key, and the recovery of the key is executed by the BSNs. An adversary can hardly decrypt sensor data without the symmetric key, since sensor data is encrypted using symmetric encryption (i.e. AES or 3DES).
We compare and analyse some of the systems above in Tables 1 and 2. Most systems use cryptographic technology to enhance the security and privacy of healthcare data in the blockchain. However, encryption is not absolutely secure. The computational cost of encryption is high for some resource-limited devices. Transaction records may also reveal user behavior and identity because of the fixed account address. Malicious attackers may also break the ciphertext stored in the public ledger.
[Table 1 (excerpt): limitation 2: all data is exposed once the corresponding symmetric key is lost.]
[Table 2: System requirements met by the systems in Table 1. Columns: paper, security, privacy, anonymity, integrity, authentication, controllability, auditability, accountability. Rows: [34], [35], [36], [39], [40], [42], [41].]
Meanwhile, another important issue is key management. The safety of the entire data field rests on private keys not being revealed. The loss of a private key means that the holder can no longer control the corresponding data. Once a private or symmetric key is compromised, all of the data may be exposed to attackers. So, both encryption techniques and key management should be considered when developers design a secure EHR system.
Additionally, the system must guarantee that only authorized legitimate users can access private data. Non-cryptographic approaches mainly use access control mechanisms for security and privacy preservation. With regard to the security goals, an access control mechanism is a security technique that performs identification, authentication and authorization for entities. It is a tool widely used in secure data sharing with minimal risk of data leakage. We will discuss this mechanism in detail in Section 3.2.2.
EHR systems can upload medical records and other information to the blockchain. If these data are stored directly in the blockchain network, computational overhead and storage burden will increase due to the fixed and limited block size. What is more, these data would also suffer from privacy leakage. To solve these problems, most relevant research and applications [36, 42, 43, 44]
Yue et al. [39] proposed a simple unified Indicator Centric Schema (ICS) that could organize all kinds of personal healthcare data easily in one simple "table". In this system, data is uploaded once and retrieved many times. They designed multi-level index and
Most systems in the previous sections adopt a third-party database architecture.
Third-party services (such as cloud computing) at the far end help users to improve the Quality of Service (QoS) of applications by providing data storage and computation power, but at the cost of transmission latency. Such a storage system has gained common acceptance depending on a trusted third party [...] Distributed Hash Table (DHT). Nguyen et al. [48] designed a system that integrates smart contracts with IPFS to improve decentralized cloud storage and controlled data sharing for better user access management. Rifi et al. [49] also adopted IPFS as the candidate off-chain database to store large amounts of personal sensor data.
Wang et al. [50] designed a system that utilizes IPFS to store the encrypted file. The encryption key of the file is first encrypted using an ABE algorithm, then encrypted together with other information (the file location hash ciphertext) using the AES algorithm. Only when the attribute set of the requestor meets the access policy predefined by the data owner can the requestor obtain the clue from the blockchain, and then download and decrypt the files from IPFS.
[Table 4: System requirements met by the systems in Table 3. Columns: paper, security, privacy, anonymity, integrity, authentication, controllability, auditability, accountability. Rows: [44], [47], [42], [45], [36], [43], [48], [49], [50].]
Based on Tables 3 and 4, the common architecture for data storage in the EHR system is shown in Fig. 5. The advantages of integrating off-chain storage into blockchain systems are as follows. First, detailed medical records cannot be accessed directly, which preserves the privacy of patients' data. Second, it helps to reduce the throughput requirement significantly, since only transaction records and a small amount of metadata are stored in the blockchain. Besides, data pointers stored in the block can be linked to the location of the raw data in the off-chain database to ensure data integrity.
However, it is difficult to fully trust third parties to store these sensitive data.
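The off-chain pattern described above, where only a pointer and metadata are kept on the ledger while the (encrypted) record lives elsewhere, can be sketched as follows. This is a minimal in-memory illustration, not the design of any cited system: `OffChainStore` stands in for a cloud database or IPFS, `Ledger` stands in for the blockchain, and using the SHA-256 digest of the stored bytes as the pointer is an assumption that mirrors content addressing.

```python
import hashlib


class OffChainStore:
    """Hypothetical content-addressed store standing in for cloud storage or IPFS."""

    def __init__(self):
        self._blobs = {}

    def put(self, blob: bytes) -> str:
        address = hashlib.sha256(blob).hexdigest()  # address derived from content
        self._blobs[address] = blob
        return address

    def get(self, address: str) -> bytes:
        return self._blobs[address]


class Ledger:
    """Hypothetical append-only list standing in for on-chain transactions."""

    def __init__(self):
        self.entries = []

    def record(self, patient_id: str, pointer: str) -> int:
        self.entries.append({"patient": patient_id, "pointer": pointer})
        return len(self.entries) - 1


def store_record(ledger: Ledger, store: OffChainStore,
                 patient_id: str, encrypted_record: bytes) -> int:
    """Keep the bulky ciphertext off-chain; put only its pointer on the ledger."""
    pointer = store.put(encrypted_record)
    return ledger.record(patient_id, pointer)


def fetch_record(ledger: Ledger, store: OffChainStore, index: int) -> bytes:
    """Fetch the off-chain data and check it against the on-chain pointer."""
    pointer = ledger.entries[index]["pointer"]
    blob = store.get(pointer)
    if hashlib.sha256(blob).hexdigest() != pointer:
        raise ValueError("off-chain data does not match the on-chain pointer")
    return blob
```

Because the pointer doubles as a digest in this sketch, a storage provider that alters the stored bytes is detected at retrieval time, although the provider can still withhold or delete the data.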
Meanwhile, it may also contradict the idea of decentralization. Further research is needed to accelerate the acceptance of distributed storage systems, such as IPFS, in practice. The next step should also be to improve the storage architecture of blockchain to support high storage capacity.
Figure 5: Common architecture for data storage in the EHR system.
The healthcare industry relies on multiple sources of information recorded in different systems, such as those of hospitals, clinics and laboratories. Healthcare data should be stored, retrieved and manipulated by different healthcare providers for medical purposes. However, sharing medical data in this way is challenging due to heterogeneous data structures among different organizations. It is necessary to consider the interoperability of data among different organizations before sharing data. We will introduce interoperability first.
Interoperability of EHRs is the degree to which an EHR is understood and used by multiple different providers as they read each other's data. Interoperability can be used to standardize and optimize the quality of health care. Interoperability can mainly be classified into three levels:
• Syntactic interoperability: One EHR system can communicate with another system through compatible formats.
• Semantic interoperability: Data can be exchanged and accurately interpreted at the data field level between different systems.
The lack of unified interoperability standards has been a major barrier to high-performance data sharing between different entities. According to the study [51], there
Some studies [10, 52, 53] adopted the Fast Healthcare Interoperability Resources (FHIR) standard of Health Level Seven International (HL7) as the data specification and standard format for data exchange between different organizations. The standard was created by the HL7 healthcare standards organization. The system in [10]
Bahga et al.
[56] proposed the cloud health information systems technology architecture (CHISTAR), which achieves semantic interoperability, defines a general-purpose set of data structures and attributes, and allows healthcare data to be aggregated from disparate data sources. Besides, it can support security features and address the key requirements of HIPAA.
Chen et al. [57] designed a secure interoperable cloud-based EHR service using the Continuity of Care Document (CCD). They provided self-protecting security for health documents with support for embedding and user-friendly control.
In short, interoperability is the basic ability of different information systems to communicate, exchange and use data in the healthcare context. EHR systems that follow international standards can achieve interoperability and support data sharing between multiple healthcare providers and organizations. We will discuss data sharing in detail next.
It is obviously inconvenient and inefficient for patients themselves to transfer paper medical records between different hospitals. Sharing healthcare data is considered a critical approach to improving the quality of healthcare services and reducing medical costs. Though current EHR systems bring much convenience, many obstacles still exist in healthcare information systems in practice. These obstacles hinder secure and scalable data sharing across multiple organizations and thus limit the development of medical decision-making and research.
As mentioned above, a centralized system carries the risks of single-point attack and data leakage. Besides, patients cannot retain ownership of their own private data in order to share it with someone they trust. This may result in unauthorized use of private data by curious organizations. Furthermore, competing organizations that lack trust relationships are not willing to share data, which also hinders the development of data sharing.
In this case, it is necessary to ensure security and privacy protection and to return control of the data to users in order to encourage data sharing. Security and privacy issues are relatively simple to handle when data resides in a single organisation, but secure health information exchange across different domains is challenging. How to encourage efficient collaboration in the medical industry also needs further consideration.

A secure access control mechanism, one of the common approaches, requires that only authorized entities can access shared data. The mechanism includes an access policy, commonly consisting of an access control list (ACL) associated with the data owner. An ACL is a list of requestors who can access the data, together with their permissions (read, write, update) on specific data.

Authorization is the function of granting authenticated users permission to access protected resources according to predefined access policies. The authentication process always comes before the authorization process.

Access policies in this mechanism mainly focus on who is performing which action on what data object for which purpose. Traditional access control approaches for EHR sharing are deployed, managed and run by third parties. Users generally assume that third parties (e.g. cloud servers) handle authentication and access requests on data usage honestly. In practice, however, the server may be honest but curious.

Combining blockchain with an access control mechanism is a promising way to build a trustworthy system. Users can securely manage their own data themselves and keep shared data private. In this new model, patients can predefine access permissions (authorize, refuse, revoke), operations (read, write, update, delete) and the duration for which their data is shared, using smart contracts on the blockchain, without losing control of the data.
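As an illustration of the patient-defined policy described above, the following Python sketch mimics the checks that such a smart contract might perform. It is a plain in-memory model under stated assumptions, not contract code for any particular blockchain platform; the names PatientPolicy, Grant, authorize, revoke and is_allowed are hypothetical and do not come from any of the surveyed systems.

```python
from dataclasses import dataclass, field

# Operations a patient may grant, as listed in the text.
OPERATIONS = {"read", "write", "update", "delete"}


@dataclass
class Grant:
    operations: frozenset  # operations the requestor may perform
    expires_at: int        # time (e.g. Unix seconds) at which the grant ends


@dataclass
class PatientPolicy:
    """Hypothetical model of an access policy owned by one patient."""
    owner: str
    grants: dict = field(default_factory=dict)  # requestor id -> Grant

    def _require_owner(self, caller):
        # Only the data owner may change the policy.
        if caller != self.owner:
            raise PermissionError("only the data owner may change the policy")

    def authorize(self, caller, requestor, operations, expires_at):
        self._require_owner(caller)
        unknown = set(operations) - OPERATIONS
        if unknown:
            raise ValueError(f"unknown operations: {sorted(unknown)}")
        self.grants[requestor] = Grant(frozenset(operations), expires_at)

    def revoke(self, caller, requestor):
        self._require_owner(caller)
        self.grants.pop(requestor, None)

    def is_allowed(self, requestor, operation, now):
        # A request is refused unless a matching, unexpired grant exists.
        grant = self.grants.get(requestor)
        return (
            grant is not None
            and operation in grant.operations
            and now < grant.expires_at
        )
```

In this sketch a request is refused by default, so "refuse" needs no separate state: a requestor without a valid grant simply fails the is_allowed check. On a real blockchain the caller identity and the current time would be supplied by the platform rather than passed in as arguments.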
Smart contracts can be triggered on the blockchain once all preconditions are met, and they also provide an audit mechanism for any request recorded in the ledger. Many existing studies and applications use smart contracts for secure healthcare data sharing.

Peterson et al. [10] proposed that patients can authorize access to their records only under predefined conditions (research of a certain type, and for a given time range). A smart contract placed directly on the blockchain verifies whether data requestors meet these conditions before they access the specified data. If the requestor does not have the access rights, the system aborts the session. Similarly, the smart contracts in [58] can be used for granting and revoking access rights and for notifying the updated information.

Smart contracts in most systems include predefined access policies that depend on requestors' roles or purposes and on role-based or purpose-based privileges. However, such policies are inflexible in handling unplanned or dynamic events and may lead to potential security threats [60]. Another mechanism, Attribute-Based Access Control (ABAC), has been applied in secure systems to handle the remaining issues in the extensions of role-based access control (RBAC) and to enhance security in some specific cases.

Systems based on an access control mechanism record every operation on access policies by logging. In traditional systems, however, these logs are vulnerable to malicious tampering because their integrity is not assured. Blockchain and smart contracts can perform access authorization automatically in a secure container and ensure the integrity of policies and operations. Thus, an access control mechanism integrated with blockchain can provide secure data sharing.

Diversified forms of access control can be applied in different situations depending on the demands for system security. Audit-based access control aims to enhance the reliability of a posteriori verification [64].
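To make the difference between role-based and attribute-based policies more concrete, the sketch below evaluates a request against a small attribute-based policy. It is only an illustrative model: the attribute names (role, department, type, purpose), the EXAMPLE_POLICY values and the function abac_permits are assumptions made for this example and are not taken from [60] or any other cited system.

```python
def abac_permits(policy, subject, resource, context):
    """Return True only if every attribute condition in the policy holds.

    The policy maps (category, attribute name) to the set of accepted values.
    A missing attribute never matches, so the request is denied.
    """
    attrs = {"subject": subject, "resource": resource, "context": context}
    return all(
        attrs[category].get(name) in allowed
        for (category, name), allowed in policy.items()
    )


# Hypothetical policy: oncology clinicians may use lab results for treatment.
EXAMPLE_POLICY = {
    ("subject", "role"): {"physician", "nurse"},
    ("subject", "department"): {"oncology"},
    ("resource", "type"): {"lab_result"},
    ("context", "purpose"): {"treatment"},
}
```

Unlike a role-only check, the decision here also depends on the resource and on the context of the request, so the same physician can be permitted for treatment and denied for another purpose without defining an additional role.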
Organization-based access control (OrBAC) [65] can be expressed dynamically based on a hierarchical structure that includes organization, role, activity, view and context.

1. A user's identity may be exposed without a de-identification mechanism.

Table 6: System requirements that have been met by the systems in Table 5. Columns: paper, security, privacy, anonymity, integrity, authentication, controllability, auditability, accountability. Papers listed: [10], [58], [44], [49], [48], [59], [39], [62], [61], [66], [67], [68], [50], [69], [42], [70], [71].

Based on the information in the Table 5

We can also use cryptography to enhance secure data sharing and the security of the access control mechanism in most EHR systems.

Dubovitskaya et al. [66] proposed a framework based on symmetric encryption to manage and share EMRs for cancer patient care. Patients can generate symmetric encryption keys to encrypt and decrypt the data shared with doctors. If the symmetric key is compromised, a proxy re-encryption algorithm can be applied to the data stored in the trusted cloud, and a new key is then shared with clinicians according to predefined access policies. Only the patients can share symmetric keys and set up the access policies by smart contract, which enhances the security of the shared data.

Xia et al. [67] designed a system that allows users to access requested data from a shared sensitive data repository after both their identities and issuing keys are verified. In this system, the User-Issuer Protocol is designed to create the membership verification key and the transaction key. The User-Verifier Protocol is used for membership verification, so that only valid users can send data requests to the system.

Ramani et al. [68] utilized lightweight public key cryptographic operations to enhance the security of permissioned requests (append, retrieve).
Nobody can change the patients' data without a notification being sent to the patients, since each requested transaction is checked for the patient's signature before it is stored on a private blockchain.

Wang et al. [50] designed a system that combines Ethereum with attribute-based encryption (ABE) to achieve fine-grained access control over data in a decentralized storage system without a trusted private key generator (PKG). The file encryption key is stored on the blockchain in encrypted form using the AES algorithm. Requestors whose attributes satisfy the access policies can decrypt the file encryption key and then download the encrypted file from IPFS. In addition, keyword search implemented by a smart contract can prevent dishonest behavior by cloud servers.

Liu et al. [42] proposed a blockchain-based privacy-preserving data sharing scheme for EMRs called BPDS. The system adopts content extraction signatures (CES) [73], which can remove sensitive information from EMRs, support selective data sharing and generate valid extraction signatures, reducing the risk of data privacy leakage and helping to enhance the security of access control policies. In addition, users can use different public keys for different transactions to remain anonymous.

Huang et al. [70] designed a blockchain-based data sharing scheme for the cloud computing environment that uses group signatures to solve the trust issue among different groups and to ensure the reliability of data from other organizations. Requestors can verify the

As shown in Tables 5 and 6, cryptography can protect sensitive data directly and improve the traditional access control mechanism to meet the demand for security and privacy. However, public key encryption has high computational overhead, and a trusted PKI is necessary for authentication. A similar problem exists for the trusted PKG, which is one of the important components of ABE.
In addition, how to transmit the shared key securely must be addressed in symmetric encryption. As mentioned before, MPC may not be suitable for wearable devices in the IoT context because of its high computational cost. These algorithms need to be improved to suit devices and sensors with limited resources.

Above all, blockchain, as a secure, immutable and decentralized framework, returns control of data to the patients themselves in the healthcare industry. As shown in Fig. 6, combining the access control mechanism implemented by smart contracts with cryptography applied to sensitive data can achieve secure data sharing among different individuals and institutions. Meanwhile, all records are included in the immutable public ledger to ensure the integrity and reliability of data and to minimize the risk of raw data leakage.

Concerning potential dishonest behavior or wrong results from third parties (cloud servers) holding large amounts of raw or encrypted data, blockchain offers an immutable historical record for traceability and accountability, sometimes combined with cryptographic techniques (such as group signatures). Next, we discuss secure audit as a way to further enhance the security of EHR systems.

Healthcare systems also rely on audit log management as a security mechanism, since some exceptions may result from the misuse of access privileges or from dishonest behavior by third parties or data requestors. Audit logs can serve as proof when disputes arise, holding users accountable for their interactions with patient records. The immutable public ledger and smart contracts in the blockchain can provide an immutable record of all access requests to achieve traceability and accountability.

An audit log mainly contains vital and understandable information:

• timestamp of the logged event
• ID of the user who requests the data
• ID of the data owner whose data is accessed
• action type (create, delete, query, update)
• the validation result of the request

Qi et al. [74] designed a data sharing model that can effectively track dishonest behaviour in data sharing and revoke access rights in response to violated permissions and malicious access. The system provides provenance, auditing and medical data sharing among cloud service providers with minimal risk to data privacy. The similar system in [67] provides auditable and accountable access control for shared cloud repositories among big data entities in a trust-less environment. Azaria et al. [53] also provided auditability via a comprehensive log. They mentioned that obfuscation for privacy needs further exploration while preserving auditability in the public ledger.

Fernandez et al. [75] designed a blockchain-based system called AuditChain to man-

To improve the quality of research through better reproducibility, the timestamped statistical analysis of clinical trials in [77] ensures the traceability and integrity of each sample's metadata, based on a blockchain that allows proofs of existence of data to be stored. The related analytical code used to process the data must be timestamped so that the data can be checked and the analysis reproduced. Timestamps in the blockchain are argued to provide better version control than git.

The above-mentioned studies indicate that blockchain plays an important role in auditing and accountability. Users can not only retain control of their own data but also monitor all request operations for data audit and accountability when disputes occur.

Above all, audit logs provide reliable evidence of anomalous and potentially malicious behavior, improving the security of access control models. They also help adjust healthcare services by providing insight into personnel interactions and workflows in hospitals.

store and process.

Currently, audit log data does not reliably contain the required and representative information, which makes it difficult to interpret or access.
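The audit log fields listed earlier in this section (timestamp, requesting user ID, data owner ID, action type and validation result) can be sketched as a small tamper-evident log. The example below is an assumption-laden illustration: it uses a simple SHA-256 hash chain as a stand-in for the immutability of a ledger, with no consensus, distribution or smart contracts, and the names append_entry, verify_log and entry_hash are hypothetical rather than taken from any cited system.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def entry_hash(entry, prev_hash):
    # Hash the entry together with the previous hash to link the records.
    payload = json.dumps(entry, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log, timestamp, user_id, owner_id, action, allowed):
    """Append one audit record holding the five fields named in the text."""
    entry = {
        "timestamp": timestamp,  # time of the logged event
        "user_id": user_id,      # who requested the data
        "owner_id": owner_id,    # whose data was accessed
        "action": action,        # create, delete, query or update
        "allowed": allowed,      # validation result of the request
    }
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({
        "entry": entry,
        "prev_hash": prev_hash,
        "hash": entry_hash(entry, prev_hash),
    })


def verify_log(log):
    """Return True if no record was altered, removed or reordered."""
    prev_hash = GENESIS
    for record in log:
        if record["prev_hash"] != prev_hash:
            return False
        if record["hash"] != entry_hash(record["entry"], prev_hash):
            return False
        prev_hash = record["hash"]
    return True
```

Because each record's hash covers the previous hash, changing any earlier entry invalidates the chain from that point on, which is the property that makes such a log usable as evidence in a dispute. A blockchain adds replication and consensus on top of this idea, so that no single party can quietly rebuild the whole chain.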
The problem would get worse when multiple EHR organizations collaborate. In this case, it is necessary to consider how to achieve an interoperable and well-formatted audit log standard to support secure data exchange among different healthcare institutions.

Membership verification is the first step in ensuring the security of any system before any resource is accessed. In the access control mechanism mentioned before, identity authentication is always performed first to make sure that specific rights are granted only to data requestors with a legal identity before data is shared. Common types of user authentication include pass-through authentication, biometric authentication and identity verification based on public key cryptography. Public Key Infrastructure (PKI), which relies on trusted third parties to provide membership management services, is commonly used.

Identity registration is performed in [44] with a registrar smart contract that maps a valid string form of identity information to a unique Ethereum address via public key cryptography. It can employ a DNS-like implementation to allow the mapping of regulated, existing forms of ID.

Zhang et al. [69] established secure links for the wireless body area network (WBAN) area and the pervasive social network (PSN) area after authentication and key establishment through an improved IEEE 802.15.6 display authenticated association protocol [78]. The protocol can protect data collected through human body channels (HBCs) and reduce the computational load on the sensors.

Xia et al. [67] designed an efficient and secure identity-based authentication and key agreement protocol for membership authentication with anonymity in a permissioned blockchain. Verification is a challenge-response dialog, based on a shared key, that proves whether the sender is authentic when the verifier receives a verification request from a user.

Most blockchain-based systems use pseudonyms to hide real identities for privacy.
However, there is a conflict between privacy preservation and authenticity: the identity must be verified without exposing information about the real identity. In addition, adversaries or curious third parties can guess the real identity and the relevant behavior patterns through inference attacks, such as transaction graph analysis.

Shae et al. [79] designed an anonymous identity authentication mechanism based on zero-knowledge technology [80], which can address two conflicting requirements: keeping the identity anonymous and verifying the legitimacy of user identities as well as IoT devices.

Sun et al. [45] proposed a decentralizing attribute-based signature scheme (called DABS) that provides effective verification of a signer's attributes without leaking the signer's identity information. Multiple authorities can issue valid signature keys according to a user's attributes rather than the real identity and provide a privacy-preserving verification service. Other nodes can verify whether the data owner is qualified, using the verification key corresponding to the satisfied attributes, without revealing the owner's identity.

Hardjono et al. [81] designed an anonymous but verifiable identity scheme, called ChainAnchor, using the EPID zero-knowledge proof scheme. These anonymous identities can achieve unlinkable transactions by using different public keys in the blockchain once nodes execute the zero-knowledge proof protocol successfully. The scheme also provides optional disclosure of the real identity when disputes occur.

Biometric authentication is also widely used, for example face and voice pattern identification, retinal pattern analysis, hand characteristics and automated fingerprint analysis based on pattern recognition.

Lee et al. [35] proposed that human nails can be used for identity authentication since nails have a high degree of uniqueness.
The system uses histogram of oriented gradients (HOG) and local binary pattern (LBP) features to extract the biometric identification signature; an SVM and a convolutional neural network are then used for authentication with high accuracy. Because it relies on a dynamic identity rather than regular real identity information, this identity verification technology ensures user anonymity and privacy.

The main goal of identity management is to ensure that only authenticated users can be authorized to access the specified resource. Currently, most systems rely on a membership service component or similar providers for identity authentication. The traditional authentication process mainly adopts password authentication and may even transmit the user account in clear text. Anyone can eavesdrop on the external connection to intercept the user account. In this case, attackers or curious third parties may impersonate compromised users to gain access to sensitive data.

It is difficult to find and rely on a trustworthy third-party membership service that validates user identities and performs complex authentication across heterogeneous domains honestly, without potential risk of real identity leakage. In addition, typical blockchain systems cannot provide privacy-preserving verification, because the public transaction record includes pseudonyms and related behavior. In this case, curious third-party servers or network nodes may collect large amounts of data to infer the real identity by statistical analysis.

Blockchain can also allow stored models to be rolled back if the false prediction rate is high. Blockchain stores pointers to the relevant data of retrained models in a secure and immutable manner. Juneja et al. [43] proposed that retraining models indexed by pointers in the blockchain can increase accuracy for continuous remote systems in the context of irregular arrhythmia alarm rates.
Additionally, artificial intelligence can be applied to generate smart contracts automatically to enhance secure and flexible operations.

In the context of IoT, the locations of products can be tracked at each step with radio-frequency identification (RFID), sensors or GPS tags. An individual's health can be monitored at home via sensor devices and shared in the cloud environment, where healthcare providers can access it to provide timely medical support.

However, as the use of sensors grows exponentially in various environments, the security level of the sensitive data from these sensors has to be improved.

Currently, a few studies focus on solving the above-mentioned problems. Related research mainly focuses on improving the consensus algorithm, block size design [67] and so on. Croman et al. [89] mainly improved the scalability of blockchain in terms of latency, throughput and other parameters. Their experiments showed that adjusting the block size and generation interval in Bitcoin is a first step toward throughput improvements and latency reduction without threatening system decentralization.

New challenges for two data types in the blockchain-based system are throughput and fairness. Two fairness-based packing algorithms are designed to improve the throughput and fairness of the system among users.

In practical application scenarios, how to encourage miners to participate in the network is important for maintaining a trustworthy and stable blockchain. Azaria et al. [44] proposed an incentive mechanism that encourages medical researchers and healthcare authorities to act as miners and creates a data economy by rewarding researchers with big data on hospital records. Yang et al. [92] proposed a selection method in the incentive mechanism.
Providers with less significance (a measure of the effort a provider has made on network maintenance and new block generation) have a higher probability of being selected to carry out the task of new block generation, and are granted significance as a bonus, which reduces their probability of being selected in the future. Pham et al. [93] made further improvements based on the gas price of the blockchain: automatically adjusting the gas price boosts a transaction's priority in the processing queue, and an emergency contact to providers is then triggered immediately for timely treatment.

Meanwhile, it should be noted that all transactions can be "seen" by any node in the blockchain network. Homomorphic encryption and zero-knowledge proofs could be used to prevent data forensics by inference, maintain the privacy of individual information and allow computations to be performed without leaking their inputs and outputs.

As stated above, blockchain still has many limitations, and more aggressive extensions will require fundamental protocol redesign. It is therefore urgent to improve the underlying architecture of blockchain for better service.

In the context of IoT, personal healthcare data streams collected from wearable devices are high in volume and arrive at a fast rate. Large amounts of data can support big data and machine learning techniques to increase the quality of data and provide more intelligent health services.

However, this may lead to high network latency because of the physical distance to mobile devices and traffic congestion on the cloud servers. In addition, the mining process and some encryption algorithms may demand high computational power from resource-limited devices and restrict the use of blockchain.

A new trend is to move functions from the cloud toward the network edge, where network latency is low. This is mainly required by time-sensitive applications such as healthcare monitoring applications.
Combined with edge computing, blockchain is extended from pure data storage to a wide range of services, such as device configuration and governance, sensor data storage and management, and multi-access payments.

If new technologies enter the market without some form of vetting, they should be adopted with care, for example based on a cost-benefit analysis. Hence, to improve compliance, security, interoperability and other factors, we need to develop uniform standards, policies and regulations (e.g. those relating to data security and privacy, and the blockchain ecosystem). For example, we would likely need different independent and trusted mechanisms to evaluate different blockchain solutions for different applications and contexts in terms of privacy, security, throughput, latency, capacity, etc. We would also need to be able to police and enforce penalties for misbehavior and/or violations (e.g. non-compliance or not delivering as agreed in the contract).

Blockchain has shown great potential in transforming the conventional healthcare industry, as demonstrated in this paper. There remain, however, a number of research and operational challenges in fully integrating blockchain technology with existing EHR systems. In this paper, we reviewed and discussed some of these challenges. We then identified a number of potential research opportunities, for example relating to IoT, big data, machine learning and edge computing. We hope this review will contribute to further insight into the development and implementation of next-generation EHR systems, which will benefit our (ageing) society.
Healthcare professionals' organisational barriers to health information technologies: a literature review
Maturity models of healthcare information systems and technologies: a literature review
Security and privacy in electronic health records: A systematic literature review
Implementing electronic health records in hospitals: a systematic literature review
Electronic health record use by nurses in mental health settings: a literature review
Personal electronic healthcare records: What influences consumers to engage with their clinical data online? a literature review
Methodologies for designing healthcare analytics solutions: A literature analysis
Opportunities and challenges in healthcare information systems research: Caring for patients with chronic conditions
Visualization of blockchain data: A systematic review
A blockchain-based approach to health information exchange networks
Blockchain in healthcare applications: Research challenges and opportunities
Blockchain: A panacea for healthcare cloud-based data security and privacy?
2017 IEEE Technology & Engineering Management Conference (TEMSCON)
Blockchain technology in healthcare: The revolution starts here
Bitcoin: A peer-to-peer electronic cash system
Dcap: A secure and efficient decentralized conditional anonymous payment system based on blockchain
An efficient nizk scheme for privacy-preserving transactions over account-model blockchain
A survey on privacy protection in blockchain system
Secure and efficient two-party signing protocol for the identity-based signature scheme in the IEEE P1363 standard for public key cryptography
Multi-party signing protocol for the identity-based signature scheme in ieee p1363 standard
The byzantine generals problem
A survey on consensus mechanisms and mining strategy management in blockchain networks
Practical byzantine fault tolerance
ethereum: Blockchain app platforms
multichain: Open platform for building blockchains
Forensic-by-design framework for cyber-physical cloud systems
Medical cyber-physical systems development: A forensics-driven approach
SDTE: A secure blockchain-based data trading ecosystem
Class: Cloud log assuring soundness and secrecy scheme for cloud forensics
Enigma: Decentralized computation platform with guaranteed privacy
Medibchain: A blockchain based privacy preserving platform for healthcare data
Fingernail analysis management system using microscopy sensor and blockchain technology
Ordieres-Mere, Blockchain-based personal health data sharing system using cloud storage
The design of rijndael: Aes - the advanced encryption standard
[38] S. Vanstone, A. Menezes, P. V. Oorschot, Handbook of applied cryptography
Healthcare data gateways: found healthcare intelligence on blockchain with novel privacy risk control
Secure attribute-based signature scheme with multiple authorities for blockchain in electronic health records systems
Lightweight backup and efficient recovery scheme for health blockchain keys
Bpds: A blockchain based privacy-preserving data sharing for electronic medical records
Leveraging blockchain for retraining deep learning architecture in patient-specific arrhythmia classification
Medrec: Using blockchain for medical data access and permission management
A decentralizing attribute-based signature for healthcare blockchain
Using java to generate globally unique identifiers for dicom objects
A framework for secure and decentralized sharing of medical imaging data via blockchain consensus
Blockchain for secure ehrs sharing of mobile cloud based e-health systems
Towards using blockchain technology for ehealth data access management
A blockchain-based framework for data sharing with fine-grained access control in decentralized storage systems
An overview of interoperability standards for electronic health records, USA: society for design and process science
2018 IEEE International Conference on Bioinformatics and Biomedicine
Medrec: Using blockchain for medical data access and permission management
Fhirchain: Applying blockchain to securely and scalably share clinical data
Applying software patterns to address interoperability in blockchain-based healthcare apps
A cloud-based approach for interoperable electronic health records (ehrs)
Design for a secure interoperable cloud-based personal health record service
How distributed ledgers can improve provider data management and support interoperability
Integrating blockchain for data sharing and collaboration in mobile healthcare applications
Security and privacy in electronic health records: A systematic literature review
Blockchain for access control in e-health scenarios
Blockchain based access control services
Blockchain based delegatable access control scheme for a collaborative e-health environment
Audit-based access control with a distributed ledger: Applications to healthcare organizations
Organization based access control
Secure and trustable electronic medical records sharing using blockchain
Bbds: Blockchain-based data sharing for electronic medical records in cloud environments
Secure and efficient data accessibility in blockchain based healthcare systems
A secure system for pervasive social network-based healthcare
Blockchain-based multiple groups data sharing with anonymity and traceability
Privacy-preserving attribute-based access control model for xml-based electronic health record system
Dynamic access control policy based on blockchain and machine learning for the internet of things
Content extraction signatures
Medshare: Trust-less medical data sharing among cloud service providers via blockchain
Security and privacy in electronic health records: A systematic literature review
Improving data transparency in clinical trials using blockchain smart contracts
Blockchain technology for improving clinical research quality
Blockchain distributed ledger technologies for biomedical and health care applications
On the design of a blockchain platform for clinical trial and precision medicine
Non-interactive zero-knowledge and its applications
Verifiable anonymous identities and access control in permissioned blockchains
Big data: Are biomedical and health informatics training programs ready?
Privacy preserving in blockchain based on partial homomorphic encryption system for ai applications
A fully homomorphic encryption scheme
An architecture and protocol for smart continuous ehealth monitoring using 5g
5g-smart diabetes: Toward personalized diabetes diagnosis with healthcare big data clouds
Permissioned blockchain and edge computing empowered privacy-preserving smart grid networks
Integrated blockchain and edge computing systems: A survey, some research issues and challenges
Blochie: a blockchain-based platform for healthcare information exchange
A design of blockchain-based architecture for the security of electronic health record (ehr) systems
A secure remote healthcare system for hospital using blockchain smart contract

Shuyun Shi received the Bachelor degree in 2019, from the School of Computer
She is currently working toward a Master degree at the Key Laboratory of Aerospace Information Security and Trusted Computing Ministry of Education

He is currently a professor of the Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University, Wuhan 430072, China. His main research interests include cryptography and information security

Li Li received her Ph.D degree in computer science from Computer School
She is currently an associate professor at School of Software, Wuhan University. Her research interests include data security and privacy, applied cryptography and security protocols

His research is focused on mobile computing, parallel/distributed computing, multi-agent systems, service oriented computing, routing and security issues in mobile ad hoc, sensor and mesh networks. He has more than 100 technical research papers in leading journals such as-IEEE TII
His research is supported from DST, TCS and UGC. He has guided many students leading to M.E. and Ph.D

Australia Day Achievement Medallion, and British Computer Society's Wilkes Award in 2008. He is also a Fellow of the Australian Computer Society
Digital Rights Management for Multimedia Interest Group

We thank the anonymous reviewers for their valuable comments and suggestions which helped us to improve the content and presentation of this paper. The authors declare that they have no conflicts of interest.