Ensuring user authentication and data integrity in multi-cloud environment

The necessity to improve security in a multi-cloud environment has become very urgent in recent years. Although in this topic, many methods using the message authentication code had been realized but, the results of these methods are unsatisfactory and heavy to apply, which, is why the security problem remains unresolved in this environment. This article proposes a new model that provides authentication and data integrity in a distributed and interoperable environment. For that in this paper, the authors first analyze some security models used in a large and distributed environment, and then, we introduce a new model to solve security issues in this environment. Our approach consists of three steps, the first step, was to propose a private virtual network to secure the data in transit. Secondly, we used an authentication method based on data encryption, to protect the identity of the user and his data, and finally, we realize an algorithm to know the integrity of data distributed on the various clouds of the system. The model achieves both identity authentication and the ability to inter-operate between processes running on different cloud’s provider. A data integrity algorithm will be demonstrated. The results of this proposed model can efficiently and safely construct a reliable and stable system in the cross-cloud environment.

in a multi-cloud architecture and adopt it [4]. Although this new technology has many advantages, it also presents significant challenges in terms of data security [5].
Therefore, before making the transfer of the data towards multi-cloud, the company owes to classify them and choose Cloud adapted according to her needs [6]. However, organizations are very worried about how security was being guaranteed in this distributed environment [7], especially when it comes to hosting their sensitive data and critical applications in the cloud [8], and also to know how to recover their data later. For that, we can consider that the first part of the problem resides in the fact that the companies do not have adequate security policies when they access the cloud [9], in this case, the problem was repeating throughout the cloud system [10]. Moreover, human errors can generate all sorts of risks, especially when running multiple instances across cloud or multiple cloud providers.
Moreover, with the development of the Internet of Things (IoT), security attacks have increased considerably. Several methods opt for the use of machine learning [11] as a solution to solve this problem, but so far they have not brought satisfactory results, because of the different constraints of the IoT, such as distribution, scalability, and low latency. Also, many researchers have proposed solutions and ideas to solve the security problem and, in particular authentication of users in multi-cloud environments. In [12], this paper investigated authentication mechanisms used in the Cloud Computing environment; the authors show that each type of cloud has its specificities in terms of authentication, and according to their classification, they explain that only certain authentication mechanisms are valid, but their classification does not seem to take into account a multi-cloud environment. In [13], the authors of this article use the features of multi-cloud computing to enhance the authentication mechanism, and this by dividing the verification tasks on different virtual machines and generate a new password for each new connection. This solution is obsolete in the case where one of the virtual machines breaks down.
Moreover, the use of a backup server, which communicates with the storage server, is a classic solution. There are, many other solutions that use the token [14][15][16], also named the USB key, these solutions are effective but require a periodic update of the token with cloud providers; or those based on the blockchain that guarantees the security and integrity of data inside and outside smart homes [17]. Note also, that currently many systems based on Artificial Intelligence (AI) for the protection of privacy are widely used in many fields [18], such as secure communication, a social network, data mining, etc. Although, the multi-cloud system is spreading more and more today over the world. New errors and attacks have been discovered, for example, when users click on the spam videos/photos, shared in facebook or Twitter, they may be redirected to the spammer websites. Later, spammers may steal the user's credentials using those websites [19][20][21]. Our motivation is to strengthen the protection of applications and data stored in this environment, and the use of authentication solutions that complicate the task of malware. For that, the main objective of our research is to improve the security in the level of authentication of the users in a multi-cloud system, knowing that the authentication is the first thing to do in all the systems clouds and, which protects the user's identity and this data [22]. For that, we will use the asymmetric encryption method, used for secure data in transmission and based on the algorithm of Rivest, Shamir and Adleman (RSA) [23], which is very reliable in terms of security of data. The second goal is to maintain the integrity of the data saved in the different clouds by a hash algorithm.
In our framework, any user's need to access his data, he must first be registered in an authentication server; this server contains the information about the client, such as the username and password as well as other information. Only the user registered in this server of authentication is allowed to log in. The username and password authentication are encrypted. Dividing data across different clouds for secure ones are required. Secret information must be stored, in private clouds. A data integrity algorithm will present. The analysis and results of the proposed new model are analyzed, in relation, to security factors in cloud computing.
The remainder of this paper is organized as follows: In "Background" section, a brief description of security in multi-Cloud environment. "Related works" section, describes some related work in the field of security in multi-Cloud environment. In "Proposed solution" section, the authors elaborate a scheme to secure access of applications and services in the different clouds. In "Simulation and results" section, a simulation with results will be established. In "Discussion" section, the authors discuss the results and the performance of the system. Finally, in "Conclusion and futures works" section, the conclusions of this work are presented.

Background
Until now, the researchers study the security in only one sense [24]; that is to say, the customer works with one vendor Cloud, that he gives her sensitive data [8], but what will happen if the same customer decides to work in an open and distributed system that is the multi-cloud, and how will this data be restored afterwards.

Multi-cloud
Many of the public or private cloud networks are configured to work as closed systems which are not built to communicate with one another. The lack of integration between these networks makes it a hardship on organizations to combine their IT systems inside the cloud and realize productivity gains and cost benefits [21].
Unfortunately, with the current trend of using different services from different clouds, the frequency and improvement of cyber attacks are increasing at a time, knowing that the average now resides in a victim's network for more than 200 days before being detected [9]. Today, 75% of intrusions on the network are attributed to compromised user authentication, and authentication methods known to date, are not designed for distributed environments [9,10]. However, when migrating of our data and applications to this multi-cloud model, users need to be protected in order to access their applications without problems.
Cloud service providers require customers to store their account information in the cloud. When a customer decides to use multiple cloud services, they will need to store their password in multiple clouds [25]. In this case, the number of copies of user information will be significant, and security threats will rise for both customers and cloud service providers. For this, cloud service providers will use several authentication mechanisms to recognize their customers.

Authentication
Authentication is a subdomain of security, as explained in [14]. Like security, authentication strongly depends on important aspects of confidentiality, integrity, and availability; they become obligations in the design of secure systems. As in a single cloud system, the multi-cloud must also guarantee [13,25]: -Ease of use: The cloud services can easily be used by malicious attackers, since a registration process is very simple, because we only have one valid credit card -Secure data transmission: When transferring the data from clients to the cloud, the data needs to be transferred by using an encrypted secure communication. -Insecure APIs: Various cloud services on the Internet are exposed by application programming interfaces. Since the APIs are accessible from any-where on the Internet, malicious attackers can use them to compromise the confidentiality and integrity of the enterprise customers. -Malicious insiders: Employees working at cloud service provider could have complete access to the company resources. -Shared technology issues: The cloud service SaaS,/PasS,/IaaS providers use scalable infrastructure to support multiple tenants which share the underlying infrastructure.

Related works
During the last years, a lot of researches and implementations on authentication and security for limiting the risk of data loss and corruption had been performed. In their publication [26], the authors have proposed a distributed system based on replication, its objective is to be able to defend against failures, in which components of a system fail in arbitrary ways. Byzantine failure tolerant algorithms must cope with such failures and still satisfy the specification of the problems they are designed to solve, but its general problems are omission failure or execution failure or lying.
In paper [27], the authors in this article present a system called "CHARON", it is a cloud storage system able to store and share data securely and efficiently, using multiple cloud providers and storage repositories to comply with the legal requirements for sensitive personal data. CHARON implements three distinguishing features: (1) it requires no trust in any entity (2) it does not require any client-managed servers, and (3) it efficiently handles large files on a geographically dispersed storage set. With a data-centric leasing protocol, resilient byzantine way. But the use of byzantine-resilient for cloud storage implies increased latency compared to a single cloud, For this solution, the addition of a biometric solution can be more effective in terms of security and, more it allows benefits of functionality, control, verification and biometric authentication.
Authors in [28]: The model proposed which is the implementation of a secured multi-cloud virtual infrastructure consists of a grid engine on top of the multi-cloud infrastructure to distribute the task among the worker nodes that are supplied with various resources from different clouds to enhance cost efficiency of the infrastructure set up and also to implement high availability feature. The Oracle grid engine is used to schedule the jobs to the worker nodes (in-house and cloud). Worker nodes will be acting like listeners to receive the job from the Oracle grid engine master node. The Client after proper authentication procedure submits the job to the grid gain engine. Access control is a key concern as attacks by hackers are of a great risk. A potential hacker can be someone with approved access to the Cloud.
Concerning user authentication framework, authors in [29]: They propose an access control mechanism to ensure confidentiality of data in the cloud. The mechanism is based on two protocols: ABE (Attribute Based Encryption) for data privacy and ABS (Attribute Based Signature) for user authentication. ABE is combined with ABS to ensure anonymity of users that store their data in the cloud. Key attributes and distribution is done in a Decentralized Manner. But several attacks have been discovered on DES and these methods are no longer effective with newer encryption algorithms due to the introduction of an avalanche effect.
Another approach has been proposed by researchers in [30]. The architecture consists of trusted client as well as three or more cloud service providers that provide Database as a Service. The Database as a service provider provides reliable content storage and data management, but is not trusted by the clients to preserve content privacy Authority. The client does not store any persistent data but stores a mapping table describing the storage of various fragments location.
In paper [31] the authors consider the existence of multiple CSPs (Cloud service providers) to collaboratively store and maintain the client's data. Moreover, a cooperative Provable data possession (cooperative PDP) is used to verify the integrity and availability of stored data in CSPs).The verification procedure is described as follows: firstly, the client (data owner) uses the secret key to pre-process the file, which consists of a collection of n blocks. It generates a set of public verification information that is stored in TTP (Trusted third parties), then transmits the file and some verification tags to CSPs, and may delete its local copy. This architecture could provide some special functions for data storage and management but these functions would increase the complexity of storage management. For example, there may exist overlap among data blocks and skipping.
In [32], the authors propose a solution for the confidentiality of recording location data. They build the structure multilevel query tree from the database to display the combination location data and access frequency, and then they add the noises to the query tree nodes and publish the new query as the end result. Which can improve protection location data.
The authors in [33] propose a data collection scheme preserving the confidentiality of Wireless Sensor Networks (WSNs). Their objective is to guarantee the confidentiality of sensory readings by preventing traffic analysis and flow tracing in the WSNs. They exploit homomorphic encryption functions (HEFs) in the collection of compressive data to thwart traffic analysis and maintain confidentiality. But the proposed scheme can only resist attacks from inside the network, but not from outside the network because an intermediate sensor node can be targeted by security attacks.
The authors in [34], have proposed a scheme which consists of three parts, namely selective HEVC video encryption, data integration in encrypted HEVC videos and data extraction. First, the content owner encrypts the original HEVC bitstream using stream cyphers before sending it to the data cache. Then the data cache integrates some secret messages in the encrypted video using the coefficient modulation method. On the receiver side, data extraction and decryption are completely separable, i.e. they can be encrypted and decrypted domains. This mechanism has not been tested in complex IT environments such as cloud computing.
The authors made a comparison of these systems ( Fig. 1) to better structure the authentication problem and to identify the different approaches and solutions established by the various authors cited above, for the management of security in multi-cloud environments.

Proposed solution
Despite the work cited above, as well as several other works carried in this area, the security problem still persists, and users are always afraid of losing their data, or that their data was illegally disclosed. As we know, spammers are at 80% responsible for this [35][36][37]. For example, a spammer can steal the identity of a legitimate user and subsequently, become the owner of a stolen account Fig. 2. But as a spammer never responds to the messages [19,20] sent by the authentication server (such as secret questions that only the legitimate user knows), the server can filter users based on their behavior. But that was not enough to guarantee security in an open and distributed system.
In this section, the authors will present their scheme to secure access of applications and services in the different clouds. Our proposal framework contains three techniques: firstly, we create a private network virtual (VPN) between the customer and the provider; which minimizes the risk of data loss during its transfer, and to reduce the number of intrusions by offering a highly qualified security. Secondly, we will use asymmetric data cryptography with the RSA algorithm [14,38], to provide a high level of data security in transit. And finally, recover the information from several sources (clouds) by guaranteeing the integrity of the data using the hashing algorithm [22]. The Fig. 3 illustrates these three steps.

a. Create a Virtual private network
The first step to security is to create a virtual private network between the client and the host server, to ensure that only authorized users can access the network and that the data can not be intercepted by malicious software [24,31]. Connecting to the VPN takes about 30 s to attempt to connect to the gateway. Sometimes a username and password are required to the user to login. In this step, two cases are possible: either the connection established successfully, or an error message occurs to indicate that the connection failed Algorithm 1 presents method steps to connect vpn.

b. Access with authentication
The authentication mechanism plays a vital role in security controls by ensuring that only authorized users gain access to resources. Moreover, the security of any private cloud or public cloud service depends on the level of protection given to the cryptographic keys used to protect sensitive data [36]. Encryption and decryption are carried out, using two different keys. The two keys was referred to as the public key (PU) and the private key (PR) [22,37,38]. If the client wants to send an authenticated message to provider, he would encrypt the message with the private key, and this message would only be decipherable with the public key, that would establish the authenticity of the message. The encryption method contains four steps, which was explaining in the following. Table 1 summarizes the most common abbreviations and Fig. 4 shows the process of encryption and decryption using the RSA algorithm.
Step-1: key generation: either the client part Ai, who wishes to send a message to another party who is Aj. The part Ai must convert its message which is: M in an encrypted form C, with the algorithm RSA: C = F (PR (A), M).
The processing steps undertaken by the second part Aj to recover M of C is:

) d (mod n) = M ed (mod _(n)) ≡ M (mod n).
Step-2: Generate two different primes (p,q), n = p × q andΦ(n) Step-3: verification: The authors ensure that n is not factorizable by one of the modern integer factorization algorithms. Select for public exponent an integer e such as that: 1 < e < Φ (n) and gcd(Φ (n), e) = 1.
Step-4: calculate: this step is to calculate for the private exponent a value for d from e and the modulus n. Such that: d = e − 1 mod Φ(n) or d × e ≡ 1 (mod Φ(n)). Once the user is registered in VPN and in the authentication server, he can translate his authorized status into several applications and allows him to transition to a dispersed application environment.

c. Data integrity in multi-cloud
Here, the authors consider a data storage service composing with three different entities: Granted clients, who have the authorization to access and manipulate the stored data; and Cloud service providers (CSPs), who work together to provide security services and integrity of data and have enough storage space [14,38]. The third-party auditor must handle the errors and must monitor the activities of the cloud server.  Demonstrates an Integrity Schema in the Multi-Cloud. Algorithm 2 presents the integrity of data method steps.
In the end, once the file has been approved by cloud,it can be downloaded by user.

Simulation and results
The simulation environment was built, based on the system of the national social security fund. This environment has a calculation centre which manages confidential data on insured persons such as sickness, disability…. As well, as public data such as family benefit records, drug reimbursements, and other benefits. In this organization, a very limited number of people know the access code to the information stored in Cloud servers (public, private).
So to eliminate the risk of data intrusion, we first secured the communication tunnel through a virtual private network. Once VPN access is complete, a username and password, are requested by the authentication server. This server plays a role of filtering different users according to their information (password, name, date of last access,…). The information provided by the legitimate user is encrypted by the RSA algorithm and stored at the user database on that server. Access to the database on the various public and private clouds will be authorised or refused by this server. Confidential data will be stored on private clouds and public data will be stored on the public clouds. Let's try to apply that system to that organization. In the proposed system, the authors use two techniques to ensure authentication and security in cloud: Secure tunnel and a cryptography authentication.
1) Secure with vpn: VPN can increase security. To prevent the disclosure of private information, VPNs only allow authenticated remote access using encryption techniques.
2) Crypted authentication: At this level the username and password of the client will be encrypted and decrypted with the public and private key that are owned by both parties Fig 6. Illustrate the system.

Each function is explain as follows
• Open application and connect: the client connects to the application that has been configured at home before. • Id user: a username and password will be requested by the application. An attacker in this system would only see encrypted data. Sender authentication is performed using the information provided when configuring the vpn on its own computer, such as the user name, password, and IP address. • Identification and password: When the connection is established, a web page with the following IP address http:\\ 192.168.25.2 will be opened. • Encrypt identification: in this second part the authentication will be encrypted with the RSA code.
We will now present an example based on what we have already demonstrated in the subsection 4. In our case, the two prime numbers p and q are very large, and the product of these two very large numbers is not practically factorizable. Indeed the known efficient algorithms which make it possible to verify that a number is not prime do not provide factorization.
• Check of the certificate: all the information concerning the customer is gathered in a database in the authentication server such as the name, the date of the last access, the remaining time of the contract, the password, the last operation to perform …etc., all this information is verified at every temptation to enter the client. • Authorization: either access is allowed if all the information entered is correct, or refuse if there is a lack. • Dispatching of data: if all permissions are acquired the data will be dispatched or query or update in the different clouds that is participating.
This system must ensure several parameters namely the availability of data, their integrity and the technology chosen for its organization (a centralized or decentralized management).

a) Availability and intégrity:
The first cloud that will receive the data (the cloud manager) of the client, he will record them on his RAID then, he diffuses them on all the other clouds of the system which will in turn save them [39][40][41]. But downloading one of the files must be check and without fail, for those CSPs must be cooperative [31]. The Fig. 4, showns this organization.
For the integrity, each cloud must verified this condition: If n i=1 f pi , f pi (V , (bi, pk, β, ni)) = {1} , this verification is iterative and Proof(P,V) is valid between CPS (show in Fig. 5). Then the file is correct and complete and there is no data loss between different clouds.
Here the different providers' hard drives work with "A Case for Redundant Arrays of Inexpensive Disks" (RAID) technology by placing data on multiple disks and allowing Input/Output operations to overlap in a balanced way, improving performance. As the use of multiple disks increases the average time between failures, redundant data storage also increases fault tolerance. With this system, if one or more drives fail, the entire system remains functional and data integrity is available [42,43]. The RAID meets the needs of reliability but also of high performance [44].
The most widely used redundancy system in the raid 5 is the parity calculation, it is based on the logical operation XOR (exclusive OR) and consists in determining whether on n data bits considered, the number of bits in the state1 is even or odd. If the number of 1 is even, then the parity bit is 0. If the number of 1 is odd, then the parity bit is 1. When one of the n + 1 data bits thus formed becomes unavailable, it is then possible to regenerate the missing bit by applying again the same method on the n remaining elements.

b) Organizational levels analysis:
Various organizational levels for a better distribution of controls in terms of data security in the cloud can be presented [14,22,42]. This organizational model represents a network model between applications and data. Figure 7 summarizes the different forms of organizational solution. The model of organizational is divided into four stages. The Fig. 8, shown the organization of model: -Level 1: this level is characterized by completely centralized applications and data.
It is an easy system to implement because there is only one governing party. Other parts are the executors of the system. The latest Internet platforms can be assigned to this level. -Level 2: there is no centralized management, so each party must manage these applications and secure that data. Here the principle of data distribution to different participants is used in order to achieve high availability; also the maintenance of cooperation and interaction between the different parties must be ensured. This approach is similar to that of peer-to-peer. -Level 3: decentralization of user data takes place. "Storage Cloud" technology is suitable for technological implementation because it allows for easy integration of user data storage into the system. Here the data is distributed and the central provider plays the role of authority and data management. This framework can be implemented by different methods and instruments appropriate. Therefore, we will use the simulation to evaluate this system according to the results obtained.

c) Evaluation:
We present our evaluation according to the criteria that we think are most important, namely: authentication and security, availability and performance, and the simplicity of its execution for the end user that we discuss in the next section.
Authentication and security level: to have a high level of security, we must have a very low probability of spammer intrusion, for this, we use the fish law, also called the low probability law, to check the credibility of our framework, this law uses two parameters: A high number of intrusion attempts in a time interval (t) is: (N). and (p) the probability of occurrence of the event in a time interval (∆t) to give p > 10%. To test the validity of our simulation, we considered that there are two intrusion attempts per second (1S), and we want to know, what is the probability that there are exactly ten in 10 s? 2 by 1 s, that is 2 × 10 = 20 in 10 s λ = 2 × 10 = 20.
By this method, even if we take other values for λ, the probabilities of intrusions remain very low. Availability and performance: the distribution of data in fragments on the various clouds of the system minimizes the risk of loss and degradation of the latter. Add to that, by placing the data on multiple hard drives from different cloud providers using (RAID) technology, improving the performance of our system and also increasing our fault tolerance. The system remains functional even if a failure occurs on one of the server's.

Discussion
To validate the performance of our framework, we compared its results with other existing methods in terms of detecting attackers at the user authentication level [26][27][28][29][30][31][32][33][34]. The results were compared based on cryptology at the identification level, and on the validity of responses to user requests.
Then, the performance of our framework is better than other; the users can see virtual networking concept is used. In the same way, authentication server that contains the identity database is sources of truth, the exit procedures are simplified, just disable the user account which reduce errors and generate operational efficiency gains [45,46]. If the user is an attacker then attacker has to block to the authentication server.
The main criterion is the trust relationship between all eco-systems. Therefore, centralized management is the most successful variant of trust. Basically, the goal is to better control sensitive data by offering more security than it already exists [47,48]. In this system, when the file size is large the download time of this file by n users also increases, as formulated Fig. 9. Authors may also notice that the CPU usage is reduced a bit when the virtual network concept is used Fig. 10. In the same way, the control of the integrity of files by each provider is reduced to a minimum time.
The performance results are obtained by following the evolution of the authentication server about filtering, and security that it offers throughout the connection of the legitimate user. The authors also note the following classification in Fig. 11. • The first step vpn gives an efficiency between 90% and 72%, which reduce the risk of intrusion into the tunnel • The second step authentication with asymmetric encryption (algorithm of RSA) results in a maximum efficiency of 90% and minimal 60%. • The third step of data integrity the maximum of 80%, and the minimum of 60%. Also, our system provides a high degree of confidence to better secure our data (90%). It is not complex (5%), responds to the budget of any organization (7%), the data remains consistent and correct (92%) with a reduced response time (8%). The authors can evaluate their system by the following diagram in Fig. 12.
In fact the Researchers are not able to adapt a system without taking into account the characteristics mentioned in this section.
In this proposition of solution the two most important properties is ensured: 1. Resistance against impersonation attacks: Communication between the user and the AS during authentication requires knowledge of the private key of one of the parts. But the private key PR is managed by the authentication server and it is impossible to have it. As a result, identity theft attacks become impossible. 2. Impossibility: this property is related to the different phases of our system this is because only the user who owns the VPN application is able to generate a valid user login request. In the second part of the login phase of the user, the user communicates in encrypted form with the authentication server. And the third phase the data is segmented between different clouds. So even the risk of intrusion of the wrongintentioned person is very low.
Note that in the case study, our company uses and generates data of different types such as text, images, video, audio…etc. Which will be stored on all the clouds in the system, this is multimedia data. Not to mention that some diagrams use the multilevel query tree to show the combination of location data and access frequency [32]. Our solution is based on the distribution and storage of data in each cloud of the multi-cloud system in a RAID type configuration, which allows to always have a copy of the original file. And with the data integration process, we can get images very similar to the original data.
Currently we can compare our system with some other mechanisms mentioned in section three, namely [26][27][28][29][30][31][32][33][34] (see Table 2). This table compares these systems with ours in terms of data security, availability, complexity and risk. The challenges for all decentralized architectures as in this study are the areas of security, data availability and data integrity. In addition, a reliable and trustworthy system must to be installed. For this, the aim is to develop a system with high data security and low complexity and at the same time, the user no longer needs to worry about the security of his data but must answer questions regarding the acceptance of data and reduced complexity.

Conclusion and futures works
A new secure multi-cloud architecture, preserving the confidentiality and integrity of data has been demonstrated. When you process sensitive information such as, in social insurance, which collects personal information about people or banking data, the confidentiality and authentication of users is of paramount importance. Also, the efficiency, the security and the integrity of the data that the multi-cloud could fear are essential criteria for the acceptance of these remote services.To meet these requirements, the authors have proposed a system that uses encrypted authentication, which guarantees that only the authorized person has the right to access the data as well as an algorithm that verifies the integrity of the data between the different clouds. Fragmenting data into pieces preserving confidentiality and integrity is used to minimize the risk of intrusion. The results of our simulation show that the proposed model reduces the number of intrusions of malicious people and reduces the time needed to establish a path by establishing the virtual private network.
In the future, we are looking to test the proposed method in a complex environment to test its scalability and design a system to enhance and extend current work using multiagent systems.

Abbreviations
IoT: Internet of Things; AI: Artificial intelligence; VPN: Virtual private network; CPS: Cloud provider service; Di: Data; TTP: Trusted third parties; RAID: Redundant array of independent disks; PDP: Cooperative provable data possession.