'''Distributed file system for cloud''' is a [[w:file system|file system]] that allows many clients to access the same data/files, providing the important operations (create, delete, modify, read, write). Each file may be partitioned into several parts called chunks, each of which is stored on a remote machine. Typically, data is stored in files in a hierarchical tree, where the nodes represent directories. This organization facilitates the parallel execution of applications. There are several ways to share files in a distributed architecture, and each solution must be suitable for a certain type of application, depending on how complex the application is. Meanwhile, the security of the system must be ensured. [[w:Confidentiality|Confidentiality]], [[w:Availability|availability]] and [[w:Integrity|integrity]] are the main keys for a secure system.
Nowadays, users can share resources from any computer or device, anywhere, through the internet, thanks to cloud computing, which is typically characterized by [[w:Scalability|scalable]] and [[w:Elasticity (cloud computing)|elastic]] resources, such as physical [[w:Server (computing)|servers]], applications and services that are [[w:Virtualization|virtualized]] and allocated dynamically. Thus, [[w:Synchronization|synchronization]] is required to make sure that all devices are up to date.
Distributed file systems also enable many big, medium and small enterprises to store and access their remote data exactly as they do locally, facilitating the use of variable resources.
 
==Overview==
 
===History===
Today, there are many implementations of distributed file systems. The first file servers were developed by researchers in the 1970s, and Sun's Network File System became available in the early 1980s.
Before that, people who wanted to share files used the [[w:sneakernet|sneakernet]] method. Once computer networks started to proliferate, it became obvious that the existing file systems had many limitations and were unsuitable for multi-user environments. At first, many users turned to [[FTP]] to share files.<ref>{{harvnb|Sun microsystem |p=1|id= sun}}.</ref> FTP first ran on the [[w:PDP-10|PDP-10]] at the end of 1973. Even with FTP, files needed to be copied from the source computer onto a server, and then from the server onto the destination computer, and users had to know the physical addresses of all the computers involved in the file sharing.<ref>{{harvnb|Fabio Kon |p=1|id= fabio}}</ref>
 
===Supporting techniques===
Cloud computing uses important techniques to boost the performance of the whole system. Modern data centers provide a huge environment with data center networking (DCN), consisting of a large number of computers with varying storage capacities. The [[w:MapReduce|MapReduce]] framework has demonstrated its performance with [[w:Data-intensive computing|data-intensive computing]] applications in parallel and distributed systems. Moreover, [[w:Virtualization|virtualization]] techniques are employed to provide dynamic resource allocation and to allow multiple operating systems to coexist on the same physical server.
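As an illustration of the MapReduce idea, here is a minimal, self-contained Python sketch (the function names and the single-process runner are illustrative only, not part of Hadoop or any other framework): a map phase emits key/value pairs from each input chunk, a shuffle groups them by key, and a reduce phase aggregates each group, as in the classic word-count example.

<syntaxhighlight lang="python">
from collections import defaultdict

def map_phase(chunk):
    # Map: emit a (word, 1) pair for every word in one input chunk.
    return [(word, 1) for word in chunk.split()]

def reduce_phase(word, counts):
    # Reduce: aggregate all the values emitted for one key.
    return word, sum(counts)

def mapreduce(chunks):
    # Shuffle: group intermediate pairs by key before reducing.
    groups = defaultdict(list)
    for chunk in chunks:            # in a real system, maps run in parallel
        for word, count in map_phase(chunk):
            groups[word].append(count)
    return dict(reduce_phase(w, c) for w, c in groups.items())

print(mapreduce(["the cloud stores files", "the cloud scales"]))
# {'the': 2, 'cloud': 2, 'stores': 1, 'files': 1, 'scales': 1}
</syntaxhighlight>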
 
===Applications===
Cloud computing provides large-scale computing thanks to its ability to supply the user with the needed CPU and storage resources with complete transparency, which makes it very suitable for the many types of applications that require large-scale distributed processing. That kind of [[w:Data-intensive computing|data-intensive computing]] needs a high-performance file system that can share data between [[w:Virtual machines|virtual machines]] (VMs).<ref>{{harvnb|Kobayashi| Mikami| Kimura|Tatebe |2011|p=1|id= Kobayashi}}.</ref>
 
The cloud computing and cluster computing paradigms are becoming increasingly important in industrial data processing and in scientific applications, such as astronomy and physics, which frequently demand the availability of a huge number of computers to carry out the required experiments. Cloud computing represents a new way of using the computing infrastructure: dynamically allocating the needed resources, releasing them once the work is finished, and paying only for what is used, instead of reserving resources for a fixed period (the pay-as-you-go model). Such services are often provided in the context of a [[w:SLA|service-level agreement]].<ref>{{harvnb|Angabini|Yazdani| Mundt|Hassani |2011|p=1|id= Angabini}}.</ref>
 
==Architectures==
Most distributed file systems are built on the client-server architecture, but decentralized solutions exist as well.
[[File:UploadDownload.PNG|thumb|200px|Upload/download model]]
 
===Client-server architecture===
[[File:Remotemodel.PNG|thumb|200px|Remote access model]]
 
{{abbr|NFS|Network File System}} is one of the most widely used file systems based on this architecture. It allows files to be shared between a number of machines on a network as if they were located locally, and it provides a standardized view of the local file system. The NFS protocol allows heterogeneous client processes, possibly running on different operating systems and machines, to access files on a remote server without knowing the actual location of the files.
However, relying on a single server makes the NFS protocol suffer from low availability and poor scalability. Using multiple servers does not solve the problem, since each server works independently.<ref>{{harvnb|Di Sano| Di Stefano|Morana|Zito|2012|p=2|id= Di Sano}}.</ref>
The model of NFS is the remote file service. This model is also called the remote access model, which contrasts with the upload/download model:
* remote access model: provides transparency; the client has access to a file and can issue requests against the remote file (the file remains on the server)<ref>{{harvnb|Andrew|Maarten|2006|p=492|id=Tanenbaum}}.</ref>
* upload/download model: the client can access the file only locally; the client has to download the file, make the modifications and upload it again, so that it can be used by other clients.
The file system offered by NFS is almost the same as the one offered by [[w:UNIX|Unix]] systems. Files are hierarchically organized into a naming graph in which directories and files are represented by nodes.
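The contrast between the two models can be made concrete with a short Python sketch (all class and method names here are illustrative, not part of the actual NFS protocol):

<syntaxhighlight lang="python">
class FileServer:
    """Toy in-memory stand-in for the remote file store."""
    def __init__(self):
        self.files = {}

    def download(self, path):
        return self.files.get(path, b"")

    def upload(self, path, data):
        self.files[path] = data

    def write(self, path, offset, data):
        buf = bytearray(self.files.get(path, b""))
        buf[offset:offset + len(data)] = data
        self.files[path] = bytes(buf)

class RemoteAccessClient:
    """Remote access model: the file stays on the server and
    every operation is a request against it."""
    def __init__(self, server):
        self.server = server

    def write(self, path, offset, data):
        self.server.write(path, offset, data)   # visible to others at once

class UploadDownloadClient:
    """Upload/download model: fetch the whole file, edit locally,
    upload it back; others see the change only after the upload."""
    def __init__(self, server):
        self.server = server

    def edit(self, path, modify):
        local_copy = self.server.download(path)  # whole-file transfer
        self.server.upload(path, modify(local_copy))

server = FileServer()
RemoteAccessClient(server).write("/a.txt", 0, b"hello")
UploadDownloadClient(server).edit("/a.txt", lambda d: d + b" world")
print(server.files["/a.txt"])  # b'hello world'
</syntaxhighlight>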
 
===Cluster-based architectures===
 
A cluster-based architecture is an improvement of the client-server architecture that favours the execution of parallel applications. The technique used here is file striping: a file is split into several segments that are stored on multiple servers, the goal being to access different parts of the file in parallel.
If an application does not benefit from this technique, it can be more convenient to simply store different files on different servers.
However, when it comes to organizing a distributed file system for large data centers, such as those of Amazon and Google, that offer services to web clients allowing multiple operations (reading, updating, deleting, ...) on a huge number of files distributed among a massive number of computers, cluster-based solutions become more attractive. Note that a massive number of computers opens the door to more hardware failures, because more server machines mean more hardware and thus a higher probability of hardware failures.<ref>{{harvnb|Andrew |Maarten |2006|p=496|id= Tanenbaum}}</ref> Two of the most widely used DFSs of this type are the Google File System (GFS) and the Hadoop Distributed File System (HDFS). In both systems, the file system is implemented by user-level processes running on top of a standard operating system ([[w:Linux|Linux]], in the case of GFS).<ref>{{harvnb|Humbetov|2012|p=2|id= Humbetov}}</ref>
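A minimal Python sketch of file striping follows (the chunk size and round-robin placement are illustrative; GFS and HDFS use, for example, 64 MB chunks and replica-aware placement): chunks are spread across servers so that different parts of a file can be fetched from different servers in parallel.

<syntaxhighlight lang="python">
CHUNK_SIZE = 4  # bytes here, for readability; real systems use e.g. 64 MB

def stripe(data, servers):
    """Split data into fixed-size chunks and place them round-robin
    across servers (plain dicts standing in for chunkservers)."""
    placement = []  # ordered list of (server_index, chunk_id)
    for chunk_id, start in enumerate(range(0, len(data), CHUNK_SIZE)):
        target = chunk_id % len(servers)
        servers[target][chunk_id] = data[start:start + CHUNK_SIZE]
        placement.append((target, chunk_id))
    return placement

def read_back(placement, servers):
    # Each chunk could be fetched from a different server concurrently.
    return b"".join(servers[s][c] for s, c in placement)

servers = [{}, {}, {}]
layout = stripe(b"hello distributed world!", servers)
assert read_back(layout, servers) == b"hello distributed world!"
</syntaxhighlight>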
 
====Design principles====
 
=====Goals=====
{{abbr|GFS |Google File System}} and {{abbr|HDFS |Hadoop Distributed File System}} are specifically built for handling [[w:batch processing|batch processing]] on very large data sets.
For that, the following hypotheses must be taken into account:<ref name="Krzyzanowski_p2"/>
* High availability: the [[w:Computer cluster|cluster]] can contain thousands of file servers, and some of them can be down at any time
* Servers belong to a rack, a room, a data center, a country and a continent, so that their geographical location can be precisely identified
* The size of a file can vary from many gigabytes to many terabytes, and the file system should be able to support a massive number of files
* The need to support append operations, and to allow file contents to be visible even while a file is being written
* Communication is reliable among working machines: [[w:Transmission Control Protocol|TCP/IP]] is used with an [[w:Remote procedure call|RPC]] communication abstraction. TCP allows the client to know almost immediately when there is a problem and to try to set up a new connection.<ref>{{harvnb|Pavel Bžoch |p=7|id= Pavel}}.</ref>
 
[[File:LoadbalancingDel.png|thumb|290px|Load balancing and rebalancing: deleting a file]]
[[File:Loadbalancingadd.png|thumb|290px|Load balancing and rebalancing: adding a new server]]
 
=====Load balancing=====
 
Load balancing is essential for efficient operation in distributed environments. It means distributing the work among the different servers,<ref>{{harvnb|Kai|Dayang|Hui|Yintang|2013|p=23|id=Fan}}.</ref> in order to get more work done in the same amount of time and to serve clients faster.
Consider a large-scale distributed file system: the system contains N chunkservers in a cloud (N can be 1000, 10000, or more), where a certain number of files are stored. Each file is split into several parts or chunks of fixed size (for example, 64 megabytes). The load of each chunkserver is proportional to the number of chunks hosted by the server.<ref name="ReferenceA">{{harvnb|Hsiao|Chung|Shen|Chao|2013|p=2|id=Hsiao}}.</ref>
In a load-balanced cloud, resources can be used efficiently while maximizing the performance of MapReduce-based applications.
 
=====Load rebalancing=====
 
In a cloud computing environment, failure is the norm,<ref>{{harvnb|Hsiao|Chung|Shen|Chao|2013|p=952|id=Hsiao}}.</ref><ref>{{harvnb|Ghemawat|Gobioff|Leung|2003|p=1|id=Ghemawat}}.</ref> and chunkservers may be upgraded, replaced, and added to the system. Files can also be dynamically created, deleted, and appended. All of this leads to load imbalance in a distributed file system, meaning that the file chunks are not distributed evenly among the nodes.
 
Distributed file systems in clouds such as GFS and HDFS rely on central servers (the master for GFS and the NameNode for HDFS) to manage the metadata and the load balancing. The master rebalances replicas periodically: data must be moved from one DataNode/chunkserver to another if its free space is below a certain threshold.<ref>{{harvnb|Ghemawat|Gobioff|Leung|2003|p=8|id=Ghemawat}}.</ref>
However, this centralized approach can create a bottleneck at those servers when they become unable to manage a large number of file accesses, and making the central nodes also handle the load-imbalance problem complicates the situation further, as it increases their already heavy loads. Note that the load-rebalancing problem is [[w:NP-hard|NP-hard]].<ref>{{harvnb|Hsiao|Chung|Shen|Chao|2013|p=953|id=Hsiao}}.</ref>
 
In order to get a large number of chunkservers to work in collaboration, and to solve the problem of load balancing in distributed file systems, several approaches have been proposed, such as reallocating file chunks so that they are distributed as uniformly as possible while reducing the movement cost as much as possible.<ref name="ReferenceA"/>
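As a toy illustration, the greedy sketch below assumes that a server's load is simply its chunk count (as above) and moves one chunk at a time from the most loaded to the least loaded server, stopping once loads differ by at most one chunk; the number of moves is the movement cost being kept low. Real proposals are considerably more refined, since the exact problem is NP-hard.

<syntaxhighlight lang="python">
def rebalance(chunk_counts):
    """Greedy rebalancing over a dict server -> number of chunks.
    Returns the list of (source, destination) chunk migrations."""
    loads = dict(chunk_counts)
    moves = []
    while True:
        hot = max(loads, key=loads.get)    # most loaded server
        cold = min(loads, key=loads.get)   # least loaded server
        if loads[hot] - loads[cold] <= 1:  # imbalance within one chunk
            return moves
        loads[hot] -= 1
        loads[cold] += 1
        moves.append((hot, cold))

print(rebalance({"s1": 9, "s2": 2, "s3": 1}))
# 5 moves, ending with 4 chunks on each server
</syntaxhighlight>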
 
[[File:ChunkServerF.png|thumb|280px|Google File System architecture]]
 
====Google file system====
[[File:SplitedFile.png|thumb|280px|Splitting a file into chunks]]
{{Cat main|Google File System}}
 
=====Description=====
Google, one of the biggest internet companies, created its own distributed file system, named the Google File System (GFS), to meet the rapidly growing demands of its data processing needs; it is used for all of its cloud services.
GFS is a scalable distributed file system for data-intensive applications. It provides a fault-tolerant way to store data and offers high performance to a large number of clients.
 
GFS uses [[w:MapReduce|MapReduce]], which allows users to create programs and run them on multiple machines without thinking about parallelization and load-balancing issues.
The GFS architecture is based on a single master, multiple chunkservers and multiple clients.<ref>{{harvnb|Di Sano|Di Stefano|Morana|Zito|2012|pp=1–2|id= Di Sano}}</ref>
 
The master server, running on a dedicated node, is responsible for coordinating storage resources and managing the files' [[w:metadata|metadata]] (the equivalent of, for example, inodes in classical file systems).<ref name="Krzyzanowski_p2">{{harvnb|Krzyzanowski|2012|p=2|id= Krzyzanowski}}</ref>
Each file is split into multiple chunks of 64 megabytes, and each chunk is stored on a chunk server. A chunk is identified by a chunk handle, a globally unique 64-bit number that is assigned by the master when the chunk is first created.
 
As said previously, the master maintains all of the files' metadata, including their names, directories, and the mapping of each file to the list of chunks that contain its data. The metadata is kept in the master's main memory, along with the mapping of files to chunks. Updates to this data are logged to disk in an operation log, which is also replicated onto remote machines. When the log becomes too large, a checkpoint is made and the main-memory data is stored in a [[w:B-tree|B-tree]] structure to facilitate mapping it back into main memory.<ref>{{harvnb|Krzyzanowski|2012|p=4|id= Krzyzanowski}}</ref>
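The following Python sketch illustrates this metadata role (a simplification; the class and field names are invented for the example, and a random number stands in for the master's unique-id scheme): the master keeps the file-to-chunks mapping in memory, hands out globally unique 64-bit chunk handles, and records every mutation in an append-only operation log.

<syntaxhighlight lang="python">
import os

class ToyGFSMaster:
    """Sketch of the master's metadata duties: in-memory mapping,
    64-bit chunk handles, append-only operation log."""
    def __init__(self):
        self.file_chunks = {}  # path -> ordered list of chunk handles
        self.op_log = []       # in GFS: flushed to disk, replicated remotely

    def create_chunk(self, path):
        handle = int.from_bytes(os.urandom(8), "big")    # 64-bit handle
        self.op_log.append(("add_chunk", path, handle))  # log before applying
        self.file_chunks.setdefault(path, []).append(handle)
        return handle

master = ToyGFSMaster()
handle = master.create_chunk("/logs/events")
print(hex(handle), master.file_chunks["/logs/events"])
</syntaxhighlight>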
 
=====Fault tolerance=====
For fault tolerance, a chunk is replicated onto multiple chunkservers, by default onto three.<ref>{{harvnb|Di Sano|Di Stefano| Morana|Zito|2012|p=2|id= Di Sano}}</ref> A chunk is thus available on at least one chunk server.
The advantage of this scheme is its simplicity: the master is responsible for allocating the chunk servers for each chunk and is contacted only for metadata information; for all other data, the client interacts with the chunkservers.
Moreover, the master keeps track of where each chunk is located. However, it does not attempt to track the chunk locations precisely, but occasionally contacts the chunk servers to see which chunks they have stored.<ref>{{harvnb|Andrew |Maarten |2006|p=497|id= Tanenbaum}}</ref><ref>{{harvnb|Humbetov|2012|p=3|id= Humbetov}}</ref>
The master does not become a bottleneck, despite all the work it has to accomplish: when a client wants to access data, it asks the master which chunk server holds that data, and once this is known, the communication is set up directly between the client and the chunk server concerned.
 
In GFS, most files are modified by appending new data rather than overwriting existing data: once written, files are typically only read, and often only sequentially rather than randomly. That makes this DFS most suitable for scenarios in which many large files are created once but read many times.<ref>{{harvnb|Humbetov|2012|p=5|id= Humbetov}}</ref><ref>{{harvnb|Andrew|Maarten|2006|p=498|id= Tanenbaum }}</ref>
 
=====File process=====
When a client wants to write to or update a file, the master designates one replica for this operation; this will be the primary replica, since it is the first one to receive the modifications from clients.
The process of writing is decomposed into two steps:<ref name="Krzyzanowski_p2"/>
 
* sending: first, and by far most important, the client contacts the master to find out which chunk servers hold the data. The client is given a list of replicas identifying the primary chunk server and the secondary ones. The client then contacts the nearest replica chunk server and sends the data to it. That server sends the data to the next closest one, which forwards it to yet another replica, and so on. After that, the data has been propagated, but is not yet written to a file (it sits in a cache)
 
* writing: when all the replicas have received the data, the client sends a write request to the primary chunk server, identifying the data that was sent in the sending phase. The primary then assigns a sequence number to the write operations it has received, applies the writes to the file in serial-number order, and forwards the write requests in that order to the secondaries. Meanwhile, the master is kept out of the loop.
Consequently, two types of flow can be distinguished: the data flow, associated with the sending phase, and the control flow, associated with the writing phase. This ensures that the primary chunk server takes control of the write order.
Note that when the master grants the write operation to a replica, it increments the chunk version number and informs all of the replicas containing that chunk of the new version number. Chunk version numbers make it possible to detect whether any replica missed an update because its chunkserver was down.<ref>{{harvnb|Krzyzanowski|2012|p=5|id= Krzyzanowski}}</ref>
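A compressed Python sketch of the two flows is given below (illustrative only; real GFS adds leases, failure handling and version checks): the push step models the data flow along a chain of replicas into their caches, and the commit step models the control flow in which the primary fixes the write order for everyone.

<syntaxhighlight lang="python">
class Replica:
    def __init__(self, name):
        self.name, self.cache, self.log = name, {}, []

    def push(self, data_id, data, chain):
        # Data flow: cache the bytes, then forward to the next closest replica.
        self.cache[data_id] = data
        if chain:
            chain[0].push(data_id, data, chain[1:])

class Primary(Replica):
    def __init__(self, name, secondaries):
        super().__init__(name)
        self.secondaries, self.seq = secondaries, 0

    def commit(self, data_id):
        # Control flow: the primary assigns the serial number and
        # every replica applies the cached data in that order.
        self.seq += 1
        for replica in [self] + self.secondaries:
            replica.log.append((self.seq, replica.cache.pop(data_id)))

s1, s2 = Replica("sec1"), Replica("sec2")
primary = Primary("primary", [s1, s2])
primary.push("d1", b"append me", [s1, s2])  # phase 1: sending
primary.commit("d1")                        # phase 2: writing
print(primary.log == s1.log == s2.log)      # True: same order everywhere
</syntaxhighlight>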
 
Some newer Google applications did not work well with the 64-megabyte chunk size; to address that, GFS began implementing the [[w:BigTable|BigTable]] approach in 2004.[http://arstechnica.com/business/2012/01/the-big-disk-drive-in-the-sky-how-the-giants-of-the-web-store-big-data/]
 
====Hadoop distributed file system====
 
{{Cat main|Apache Hadoop}}
 
{{abbr|HDFS |Hadoop Distributed File System}}, hosted by the Apache Software Foundation, is a distributed file system designed to hold very large amounts of data (terabytes or even petabytes). Its architecture is similar to that of GFS, i.e. a master/slave architecture. HDFS is normally installed on a cluster of computers.
The design of Hadoop draws on Google's work, namely the Google File System, Google MapReduce and [[w:BigTable|BigTable]], which map respectively to the Hadoop Distributed File System (HDFS), Hadoop MapReduce and Hadoop Base (HBase).<ref>{{harvnb|Fan-Hsun|Chi-Yuan| Li-Der| Han-Chieh|2012|p=2|id= Fan-Hsun}}</ref>
 
An HDFS cluster consists of a single NameNode and several DataNode machines. The NameNode, the master server, manages and maintains the metadata of the storage DataNodes in its RAM. The DataNodes manage the storage attached to the nodes that they run on.
The NameNode and DataNode are software programs designed to run on everyday machines, which typically run a GNU/Linux OS. HDFS can run on any machine that supports Java, and such a machine can therefore run either the NameNode or the DataNode software.<ref>{{harvnb|Azzedin|2013|p=2|id= Azzedin}}</ref>
 
More explicitly, a file is split into one or more equal-size blocks, except the last one, which may be smaller. Each block may be replicated on multiple DataNodes to guarantee high availability. By default, each block is replicated three times, a process called "block-level replication".<ref name="admaov_2">{{harvnb|Adamov|2012|p=2|id= Adamov}}</ref>
 
The NameNode manages the file system namespace operations, such as opening, closing, and renaming files and directories, and regulates file access. It also determines the mapping of blocks to DataNodes. The DataNodes are responsible for serving read and write requests from the file system's clients, managing block allocation and deletion, and replicating blocks.<ref>{{harvnb|Yee|Thu Naing|2011|p=122|id= yee}}</ref>
 
When a client wants to read or write data, it contacts the NameNode, which determines where the data should be read from or written to. The client then has the location of the DataNode and can send read or write requests to it.
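The read path can be sketched in a few lines of Python (the class names and in-memory structures are illustrative, not the HDFS API): the NameNode answers only the metadata question of where each block lives, and the actual bytes flow directly between the client and the DataNodes.

<syntaxhighlight lang="python">
class NameNode:
    """Toy NameNode: holds only metadata, never file contents."""
    def __init__(self):
        self.block_map = {}  # path -> [(block_id, datanode), ...]

class DataNode:
    """Toy DataNode: holds the actual block bytes."""
    def __init__(self):
        self.blocks = {}

def read_file(path, namenode):
    locations = namenode.block_map[path]         # step 1: ask the NameNode
    return b"".join(dn.blocks[bid]               # step 2: read each block
                    for bid, dn in locations)    #         from its DataNode

dn1, dn2 = DataNode(), DataNode()
dn1.blocks[0], dn2.blocks[1] = b"hello ", b"hdfs"
nn = NameNode()
nn.block_map["/demo.txt"] = [(0, dn1), (1, dn2)]
print(read_file("/demo.txt", nn))  # b'hello hdfs'
</syntaxhighlight>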
 
HDFS is typically characterized by its compatibility with data-rebalancing schemes. In general, managing the free space on a DataNode is very important: data must be moved from one DataNode to another if its free space falls too low, and when additional replicas are created, data may be moved to restore the balance of the system.<ref name="admaov_2"/>
 
====Other examples====
Distributed file systems can be classified into two categories. The first category comprises DFSs designed for internet services, such as GFS. The second category comprises DFSs that support data-intensive applications, usually executed in parallel.<ref>{{harvnb|Soares|Dantas|de Macedo|Bauer|2013|p=158|id= Soares}}</ref> Here are some examples from the second category: [[w:Ceph (storage)|Ceph FS]], [[w:FhGFS|Fraunhofer File System (FhGFS)]], [[w:Lustre (file system)|Lustre File System]], [[w:IBM General Parallel File System|IBM General Parallel File System (GPFS)]] and [[w:Parallel Virtual File System|Parallel Virtual File System]].
 
The Ceph file system is a distributed file system that offers excellent performance and reliability.<ref>{{harvnb|Weil|Brandt|Miller|Long|2006|p=307|id=Weil}}</ref> It answers several challenges: dealing with huge files and directories, coordinating the activity of thousands of disks, providing parallel access to metadata on a massive scale, handling both scientific and general-purpose workloads, authenticating and encrypting at scale, and growing or shrinking dynamically due to frequent device decommissioning, device failures, and cluster expansions.<ref>{{harvnb|MALTZAHN|MOLINA-ESTOLANO|KHURANA|NELSON|2010|p=39|id= MALTZAHN}}</ref>
 
FhGFS, the high-performance parallel file system from the Fraunhofer Competence Centre for High Performance Computing, has a distributed metadata architecture designed to provide the scalability and flexibility needed to run the most widely used [[w:High-performance computing|HPC]] applications.<ref>{{harvnb|Jacobi|Lingemann|p=10|id=Jacobi}}</ref>
 
The Lustre File System has been designed and implemented to deal with the bottlenecks traditionally found in distributed systems. Lustre is characterized by its efficiency, scalability and redundancy.<ref>{{harvnb|Schwan Philip|2003 |p=401|id=Schwan}}</ref> GPFS was also designed with the goal of removing such bottlenecks.<ref>{{harvnb|Jones|Koniges|Yates|2000 |p=1|id=Jones}}</ref>
 
==Communication==
 
High performance in distributed file systems requires efficient communication between computing nodes and fast access to the storage systems. Operations such as open, close, read, write, send and receive must be fast to ensure that performance. Note that for each read or write request, the remote disk is accessed, which may take a long time due to network latencies.<ref>{{harvnb|Upadhyaya|Azimov|Doan|Eunmi|2008|p=400|id=Upadhyaya}}.</ref>
 
The data communication (send/receive) operations transfer data from the application buffer to the kernel of the machine, with [[w:Transmission Control Protocol|TCP]] controlling the sending process inside the kernel. However, in case of network congestion or errors, TCP may not send the data directly.
While transferring data from a buffer in the [[w:Kernel (computing)|kernel]] to the application, the machine does not read the byte stream from the remote machine; instead, TCP is responsible for buffering the data for the application.<ref>{{harvnb|Upadhyaya|Azimov|Doan|Eunmi|2008|p=403|id= Upadhyaya}}.</ref>
 
A high level of communication performance can be achieved by choosing, at the application level, the buffer size used for file reading and writing, and for file sending and receiving.
Explicitly, the buffer mechanism is developed using a circular [[w:Linked list|linked list]].<ref>{{harvnb|Upadhyaya|Azimov|Doan|Eunmi|2008|p=401|id= Upadhyaya}}.</ref> It consists of a set of BufferNodes. Each BufferNode has a DataField, which contains the data, and a pointer called NextBufferNode that points to the next BufferNode. To find the current position, two [[w:Pointer (computer programming)|pointers]] are used, CurrentBufferNode and EndBufferNode, which represent the positions of the last written and last read BufferNodes.
If a BufferNode has no free space, it sends a wait signal to the client, telling it to wait until space is available.<ref>{{harvnb|Upadhyaya|Azimov|Doan|Eunmi|2008|p=402|id= Upadhyaya}}.</ref>
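A minimal Python rendering of this buffer follows (a sketch of the described structure, not the authors' implementation; one node is deliberately kept free so that a full ring can be told apart from an empty one):

<syntaxhighlight lang="python">
class BufferNode:
    def __init__(self):
        self.data = None   # the DataField described above
        self.next = None   # the NextBufferNode pointer

def make_ring(size):
    nodes = [BufferNode() for _ in range(size)]
    for i, node in enumerate(nodes):
        node.next = nodes[(i + 1) % size]  # close the circle
    return nodes[0]

class CircularBuffer:
    """CurrentBufferNode marks the last written slot,
    EndBufferNode the last read one."""
    def __init__(self, size):
        self.current = self.end = make_ring(size)

    def write(self, data):
        if self.current.next is self.end:
            return "wait"        # ring full: client must wait for free space
        self.current = self.current.next
        self.current.data = data
        return "ok"

    def read(self):
        if self.end is self.current:
            return None          # nothing unread
        self.end = self.end.next
        data, self.end.data = self.end.data, None
        return data

buf = CircularBuffer(3)
print(buf.write(b"a"), buf.write(b"b"), buf.write(b"c"))  # ok ok wait
print(buf.read(), buf.read())                             # b'a' b'b'
</syntaxhighlight>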
 
==Cloud-based synchronization of distributed file systems==
 
More and more users have multiple devices with ad hoc connectivity, and these devices need to be synchronized. An important point is thus to maintain user data by synchronizing replicated data sets between an arbitrary number of servers. This is useful for backups and also for offline operation: when network conditions are poor, a user device selectively replicates the part of the data that will be modified later, offline; once network conditions improve, the device synchronizes.<ref name="Uppoor">{{harvnb|Uppoor|Flouris|Bilas|2010|p=1|id=Uppoor}}</ref>
Two approaches exist to tackle the distributed synchronization issue: user-controlled peer-to-peer synchronization and cloud master-replica synchronization.<ref name="Uppoor"/>
 
* user-controlled peer-to-peer: software such as [[w:rsync|rsync]] must be installed on all of the user's computers that contain their data. Files are synchronized peer-to-peer, with the user supplying the network addresses of all devices and the synchronization parameters, making this a manual process.
 
* cloud master-replica synchronization: widely used by cloud services; a master replica containing all the data to be synchronized is retained as a central copy in the cloud, and all updates and synchronization operations are pushed to this central copy, offering a high level of availability and reliability in case of failures.
 
==Security keys==
 
In cloud computing, the most important security concepts are confidentiality, availability and integrity. Confidentiality is indispensable to keep private data from being disclosed and to maintain privacy, while integrity ensures that data is not corrupted.<ref>{{harvnb|Zhifeng |Yang|2013|p=854|id= Zhifeng}}</ref>
 
===Confidentiality===
 
Confidentiality means that data and computation tasks are confidential: neither the cloud provider nor other clients can access the client's data.
Much research has been done on confidentiality, because it is one of the crucial points that still present challenges for cloud computing; a lack of trust in the cloud providers is a related issue.<ref>{{harvnb|Zhifeng |Yang|2013|pp=845–846|id= Zhifeng}}</ref> The infrastructure of the cloud must give assurance that customers' data will not be accessed by unauthorized parties.
The environment becomes insecure if the service provider:<ref>{{harvnb|Yau|An|2010|p=353|id= Stephen}}</ref>
*can locate consumer's data in the cloud
*has the privilege to access and retrieve consumer's data
*can understand the meaning of data (types of data, functionalities and interfaces of the application and format of the data).
If these three conditions are satisfied simultaneously, the situation becomes very dangerous.
The geographic location of data stores influences privacy and confidentiality, and the location of clients should also be taken into account: clients in Europe may not want to use datacenters located in the United States, because the confidentiality of the data can then no longer be guaranteed. To address this problem, some cloud computing vendors have included the geographic location of the hosting as a parameter of the service-level agreement made with the customer,<ref>{{harvnb|Vecchiola|Pandey|Buyya|2009|p=14|id= Vecchiola }}</ref> allowing users to choose for themselves the locations of the servers that will host their data.
 
One approach that can help address the confidentiality issue is data encryption;<ref>{{harvnb|Yau|An|2010|p=352|id=  Stephen}}</ref> otherwise, there are serious risks of unauthorized use. Other solutions exist, such as encrypting only sensitive data<ref>{{harvnb|Miranda|Siani|2009|id= Miranda}}</ref> and supporting only certain operations, in order to simplify computation.<ref>{{harvnb|Naehrig|Lauter|2013|id= Michael}}.</ref> Furthermore, cryptographic techniques and tools such as [[w:Homomorphic encryption|FHE]] are used to preserve privacy in the cloud.<ref>{{harvnb|Zhifeng |Yang|2013|p=854|id=Zhifeng}}.</ref>
 
===Availability===
 
Availability is generally treated by [[w:Replication (computing)|replication]].<ref name="availability">{{harvnb|Bonvin|Papaioannou|Aberer|2009|p=206|id= Bonvin}}</ref><ref>{{harvnb|Cuong|Cao|Kalbarczyk|Iyer|2012|p=5|id=Cuong}}</ref>
<ref>{{harvnb|A.| A.|P.|2011|p=3|id= Undheim}}</ref><ref>{{harvnb|Qian |D.|T.|2011|p=3|id= Medhi}}</ref>
Meanwhile, [[w:consistency|consistency]] must be guaranteed.
However, consistency and availability cannot be achieved at the same time: either consistency is relaxed, allowing the system to remain available, or consistency is made a priority and the system is sometimes unavailable.<ref>{{harvnb|Vogels|2009|p=2|id= Vogels}}</ref>
On the other hand, data must have an identity to be accessible. For instance, Skute<ref name="availability"/> is a mechanism based on a key/value store that allows dynamic data allocation in an efficient way. Each server is identified by a label of the form "continent-country-datacenter-room-rack-server". A server can reference multiple virtual nodes, each holding a selection of data (or multiple partitions of multiple data sets). Each piece of data is identified by a key space generated by a one-way cryptographic hash function (e.g. [[w:MD5|MD5]]) and is located by the hash value of its key. The key space may be partitioned into multiple partitions, with every partition referring to a piece of the data. To perform replication, virtual nodes are replicated and referenced by other servers. To maximize data availability and durability, the replicas must be placed on different servers, and every server should be in a different geographical region, because data availability increases with geographical diversity.
The process of replication includes an evaluation of data availability, which must stay above a certain minimum; otherwise, the data is replicated to another chunk server. Each partition <math>i</math> has an availability value given by the following formula:
 
<math>avail_i=\sum_{j=0}^{|S_i|}\sum_{k=j+1}^{|S_i|} conf_j \cdot conf_k \cdot diversity(s_j,s_k)</math>
 
where <math>s_j</math> and <math>s_k</math> are the servers hosting replicas of partition <math>i</math> (the set <math>S_i</math>), <math>conf_j</math> and <math>conf_k</math> are the confidence values of servers <math>j</math> and <math>k</math> (relying on technical factors such as hardware components, and non-technical ones such as the economic and political situation of a country), and <math>diversity(s_j,s_k)</math> is the geographical distance between <math>s_j</math> and <math>s_k</math>.<ref>{{harvnb|Bonvin|Papaioannou|Aberer|2009|p=208|id= Bonvin}}</ref>
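In code form, the evaluation is a sum over all pairs of replica servers; the sketch below assumes toy confidence scores and derives diversity from the label scheme described above (both the numbers and the shortened three-component labels are invented for the example):

<syntaxhighlight lang="python">
from itertools import combinations

def availability(servers, conf, diversity):
    """Availability of one partition: sum over all pairs of its
    replica servers of conf_j * conf_k * diversity(s_j, s_k)."""
    return sum(conf[a] * conf[b] * diversity(a, b)
               for a, b in combinations(servers, 2))

# Shortened "continent-country-datacenter" labels; fewer shared
# components means more geographical diversity.
def diversity(a, b):
    shared = sum(x == y for x, y in zip(a.split("-"), b.split("-")))
    return 1.0 - shared / 3

conf = {"eu-fr-dc1": 0.9, "eu-de-dc2": 0.8, "us-ny-dc1": 0.95}
print(availability(list(conf), conf, diversity))  # ~1.81 for this toy setup
</syntaxhighlight>

If this value falls below the required minimum, an additional replica is placed on a server chosen to raise the sum, preferably in a region not yet covered.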
 
Replication is a great solution to ensure data availability, but it costs a great deal in terms of storage space.<ref name="ReferenceB">{{harvnb|Carnegie|Tantisiriroj|Xiao|Gibson|2009|p=1|id= Carnegie}}</ref> DiskReduce<ref name="ReferenceB"/> is a modified version of HDFS, based on [[w:RAID|RAID]] technology (RAID-5 and RAID-6), that allows asynchronous encoding of replicated data: a background process looks for widely replicated data and deletes the extra copies after encoding it. Another approach is to replace replication with erasure coding.<ref>{{harvnb|Wang|Gong|P.|Xie|2012|p=1|id= Changsheng}}</ref> In addition, there are many approaches to data recovery that help ensure availability: data is coded so that, once lost, it can be recovered from fragments constructed during the coding phase.<ref>{{harvnb|Abu-Libdeh|Princehouse|Weatherspoon|2010|p=2|id= Hussam}}</ref> Other approaches that apply different mechanisms to guarantee availability include the Reed-Solomon code of Microsoft Azure and RaidNode for HDFS; Google is also working on a new approach based on an erasure-coding mechanism.<ref>{{harvnb|Wang|Gong|P.|Xie|2012|p=9|id= Changsheng}}</ref>
 
Until now there is no RAID implementation established for cloud storage.<ref>{{harvnb|Wang|Gong|P.|Xie|2012|p=1|id= Changsheng}}</ref>
 
===Integrity===
 
Integrity in cloud computing implies data integrity as well as computing integrity: data must be stored correctly on cloud servers, and, in case of failures or incorrect computation, problems must be detected.
 
Data integrity is easy to achieve thanks to cryptography (typically through a [[w:Message-Authentication Codes|message authentication code]], or MAC, on data blocks).<ref>{{harvnb|Juels|Oprea|2013|p=4|id= Ari Juels}}</ref>
 
Data integrity can be affected in different ways, whether through malicious events or through administration errors (e.g. during [[w:Backup|backup]] and restore, data migration, or changes of membership in [[w:Peer-to-peer|P2P]] systems).<ref>{{harvnb|Zhifeng|Yang|2013|p=5|id= Zhifeng}}</ref>
 
Several mechanisms exist for checking data integrity, for instance:
*HAIL (High-Availability and Integrity Layer): a distributed cryptographic system that allows a set of servers to prove to a client that a stored file is intact and retrievable.<ref>{{harvnb|Bowers |Juels |Oprea|2009 |id=  HAIL}}</ref>
*PORs (proofs of retrievability for large files):<ref>{{harvnb|Juels |S. Kaliski |2007|p=2 |id=  Burton}}</ref> based on a symmetric cryptographic system, with a single verification key that must be stored locally in order to verify a file's integrity. The method encrypts a file F and then generates random blocks called sentinels that are embedded among the blocks of the encrypted file. Since the server cannot locate the sentinels, which are indistinguishable from the other blocks, even a small change to the file is likely to be detected (a minimal sketch of this sentinel idea follows the list below).
*Different mechanisms of PDP (provable data possession) checking: a class of efficient and practical methods that provide an efficient way to check data integrity on untrusted servers:
 
: PDP:<ref>{{harvnb|Ateniese |Burns |Curtmola|Herring|Kissner|Peterson|Song|2007|id=  Giuseppe}}</ref> before storing data on a server, the client must store some metadata locally. At a later time, and without downloading the data, the client can ask the server to check that the data has not been falsified. This approach is used for static data.
: Scalable PDP:<ref>{{harvnb|Ateniese |Di Pietro  |V. Mancini|Tsudik|2008 |pp=5–9|id=  Ateniese}}</ref> this approach is premised upon a symmetric key, which is more efficient than public-key encryption. It supports some dynamic operations (modification, deletion and append), but it cannot be used for public verification.
: Dynamic PDP:<ref>{{harvnb|Erway |Küpçü |Tamassia|Papamanthou|2009|p=2 |id=  Erway}}</ref> this approach extends the PDP model to support several update operations, such as append, insert, modify and delete, making it well suited for intensive computation.
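The sentinel idea behind PORs can be sketched as follows (a toy illustration, not the actual Juels–Kaliski construction; the encryption of the file itself is elided and the file blocks are simply random bytes):

<syntaxhighlight lang="python">
import hashlib
import os
import random

BLOCK = 32  # block size in bytes (sha256 digests are also 32 bytes)

def encode_with_sentinels(blocks, key, n_sentinels):
    """Derive sentinel blocks from a secret key and hide them at
    key-derived positions among the (already encrypted) file blocks."""
    rng = random.Random(key)
    total = len(blocks) + n_sentinels
    positions = sorted(rng.sample(range(total), n_sentinels))
    sentinels = [hashlib.sha256(f"{key}:{i}".encode()).digest()
                 for i in range(n_sentinels)]
    data, sent, stored = iter(blocks), iter(sentinels), []
    where = set(positions)
    for i in range(total):
        stored.append(next(sent) if i in where else next(data))
    return stored, list(zip(positions, sentinels))  # server copy, local secret

def spot_check(server_blocks, secret):
    # Tampering tends to hit some sentinel position, since the server
    # cannot tell sentinels apart from ordinary blocks.
    return all(server_blocks[pos] == s for pos, s in secret)

blocks = [os.urandom(BLOCK) for _ in range(8)]  # stand-in for encrypted file
stored, secret = encode_with_sentinels(blocks, key="k", n_sentinels=3)
assert spot_check(stored, secret)
stored[secret[0][0]] = os.urandom(BLOCK)        # simulate corruption
assert not spot_check(stored, secret)
</syntaxhighlight>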
 
==Economic aspects==
 
Cloud computing is growing rapidly: US government cloud spending is projected to grow at a 40% compound annual growth rate ([[w:CAGR|CAGR]]) and to reach 7 billion dollars by 2015, a figure large enough that it must be taken into consideration.<ref>{{harvnb|Lori M. Kaufman|2009|p=2|id=Kaufman}}</ref>
 
More and more companies are utilizing cloud computing to manage massive amounts of data and to overcome their lack of storage capacity.
Cloud computing enables companies to use resources as a service, covering their computing needs without having to invest in infrastructure: they pay only for what they use (the pay-as-you-go model).<ref>{{harvnb|Angabini|Yazdani|Mundt|Hassani|2011|p=1|id=Angabini}}</ref>
 
Every application provider has to periodically pay the cost of each server where replicas of its data are stored. The cost of a server is generally determined by the quality of the hardware, the storage capacity, and its query-processing and communication overhead.<ref>{{harvnb|Bonvin|Papaioannou|Aberer|2009|p=3|id=Bonvin}}</ref>
 
Cloud computing makes it easier for enterprises to scale their services with client demand.
The pay-as-you-go model has also eased the way for startup companies that wish to run compute-intensive businesses. Cloud computing also offers a huge opportunity to many third-world countries that do not have enough computing resources, thus enabling IT services there.
Cloud computing can lower IT barriers to innovation.<ref>{{harvnb|Marston|Lia|Bandyopadhyaya|Zhanga|2011|p=3|id=Lia}}.</ref>
 
Despite the wide adoption of cloud computing, efficient sharing of large volumes of data in an untrusted cloud remains a challenging research topic.
 
==References==
{{reflist|4}}
 
==Bibliography==
* {{cite book
| ref = harv
| id = Tanenbaum
| last1 = Andrew | first1 = S.Tanenbaum
| last2 = Maarten | first2 = Van Steen
| year = 2006
| title = Distributed systems: principles and paradigms
| url=http://net.pku.edu.cn/~course/cs501/2011/resource/2006-Book-distributed%20systems%20principles%20and%20paradigms%202nd%20edition.pdf
}}
* {{cite web
| ref=harv
| id = fabio
| author    = Fabio Kon
| url        = http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.42.4609
| title      =Distributed File Systems,The State of the Art and concept of Ph.D. Thesis
}}
* {{cite web
| ref=harv
| id = Pavel
| author    = Pavel Bžoch
| url        = http://www.kiv.zcu.cz/site/documents/verejne/vyzkum/publikace/technicke-zpravy/2012/tr-2012-02.pdf
| title      = Distributed File Systems Past, Present and Future A Distributed File System for 2006 (1996)
}}
* {{cite web
| ref=harv
| id = sun
| author = Sun microsystem
| url        = http://www.cse.chalmers.se/~tsigas/Courses/DCDSeminar/Files/afs_report.pdf
| title      = Distributed file systems – an overview
}}
* {{cite web
| ref=harv
| id = Jacobi
| last1 =    Jacobi
|first1= Tim-Daniel
|last2=Lingemann
|first2=Jan
| url        = http://wr.informatik.uni-hamburg.de/_media/research/labs/2012/2012-10-tim-daniel_jacobi_jan_lingemann-evaluation_of_distributed_file_systems-report.pdf
| title      =Evaluation of Distributed File Systems
}}
#Architecture & Structure & design:
#* {{cite journal
| id = Zhang
| last1        = Zhang 
| first1            = Qi-fei
| last2        = Pan 
| first2            = Xue-zeng
| last3        = Shen
| first3            = Yan
| last4        = Li
| first4            = Wen-juan
| title          = A Novel Scalable Architecture of Cloud Storage System for Small Files Based on P2P
| periodical      = Cluster Computing Workshops (CLUSTER WORKSHOPS), 2012 IEEE International Conference on
| layurl = http://ieeexplore.ieee.org.docproxy.univ-lille1.fr/xpl/mostRecentIssue.jsp?punumber=6354581
| year          = 2012
| doi            = 10.1109/ClusterW.2012.27
| others =  Coll. of Comput. Sci. & Technol., Zhejiang Univ., Hangzhou, China
| ref=harv
}}
#* {{cite journal
| id = Azzedin
| ref=harv
| last1        = Azzedin 
| first1            =Farag 
| title          = Towards A Scalable HDFS Architecture
| periodical      = Collaboration Technologies and Systems (CTS), 2013 International Conference on
| layurl = http://ieeexplore.ieee.org.docproxy.univ-lille1.fr/xpl/mostRecentIssue.jsp?punumber=6558543
| year          = 2013
| doi            = 10.1109/CTS.2013.6567222
| others =  Information and Computer Science Department King Fahd University of Petroleum and Minerals
| pages = 155–161
}}
#* {{Cite web
| id = Krzyzanowski
| ref=harv
| last1        = Krzyzanowski 
| first1            = Paul 
| title          = Distributed File Systems
| year          = 2012
| url      = http://www.cs.rutgers.edu/~pxk/417/notes/16-dfs.pdf
}}
#* {{cite conference
| ref=harv
| id = Kobayashi
| last1        = Kobayashi | first1            = K
| last2        = Mikami| first2            = S
| last3        = Kimura| first3            = H
| last4        = Tatebe| first4            = O
| year          = 2011
| title          = The Gfarm File System on Compute Clouds
| conference      = Parallel and Distributed Processing Workshops and Phd Forum (IPDPSW), 2011 IEEE International Symposium on
| conferenceurl = http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=6008655
| doi            = 10.1109/IPDPS.2011.255
| others =  Grad. Sch. of Syst. & Inf. Eng., Univ. of Tsukuba, Tsukuba, Japan
}}
#* {{cite journal
| id = Humbetov
| ref=harv
| last1        = Humbetov 
| first1            = Shamil 
| title          = Data-Intensive Computing with Map-Reduce and Hadoop
| periodical      = Application of Information and Communication Technologies (AICT), 2012 6th International Conference on 
| layurl = http://ieeexplore.ieee.org.docproxy.univ-lille1.fr/xpl/mostRecentIssue.jsp?punumber=6385344
| year          = 2012
| doi            = 10.1109/ICAICT.2012.6398489
| others =  Department of Computer Engineering Qafqaz University Baku, Azerbaijan
| pages = 1–5
 
}}
#* {{cite journal
| id = Hsiao
| ref=harv
| last1        =  Hsiao 
| first1            =Hung-Chang
| last2        =  Chung 
| first2            =Hsueh-Yi 
| last3        =  Shen 
| first3            =Haiying
| last4        =  Chao 
| first4            =Yu-Chang
| title          = Load Rebalancing for Distributed File Systems in Clouds
| periodical      = Parallel and Distributed Systems, IEEE Transactions on  (Volume:24 ,  Issue: 5 ) 
| layurl = http://ieeexplore.ieee.org.docproxy.univ-lille1.fr/xpl/RecentIssue.jsp?punumber=71
| year          = 2013
| doi            = 10.1109/TPDS.2012.196
| others =  National Cheng Kung University, Tainan
| pages = 951–962
}}
#* {{cite journal
| id = Fan 
| ref=harv
| last1        = Kai 
| first1            = Fan
| last2        = Dayang   
| first2            = Zhang
| last3        = Hui   
| first3            = Li
| last4        = Yintang 
| first4            = Yang
| title          = An Adaptive Feedback Load Balancing Algorithm in HDFS
| periodical      = Intelligent Networking and Collaborative Systems (INCoS), 2013 5th International Conference on 
| layurl = http://ieeexplore.ieee.org.docproxy.univ-lille1.fr/xpl/mostRecentIssue.jsp?punumber=6630246
| year          = 2013
| doi            = 10.1109/INCoS.2013.14
| others =  State Key Lab. of Integrated Service Networks, Xidian Univ., Xi'an, China
| pages = 23–29
}}
#* {{cite journal
| id = Upadhyaya
| ref=harv
| last1        = Upadhyaya
| first1            = B 
| last2        = Azimov
| first2            = F
| last3        = Doan
| first3            = T.T
| last4        = Eunmi
| first4            = Choi 
| last5          =  Sangbum
| first5            =  Kim
| last6          =  Pilsung
| first6            = Kim 
| title          = Distributed File System: Efficiency Experiments for Data Access and Communication
| periodical      = Networked Computing and Advanced Information Management, 2008. NCM '08. Fourth International Conference on  (Volume:2 ) 
| layurl = http://ieeexplore.ieee.org.docproxy.univ-lille1.fr/xpl/mostRecentIssue.jsp?punumber=4623957
| year          = 2008
| doi            = 10.1109/NCM.2008.164
| others =  Sch. of Bus. IT, Kookmin Univ., Seoul
| pages = 400–405
}}
#* {{cite journal
| id = Soares
| ref=harv
| last1        =  Soares 
| first1            = Tiago S.
| last2        = Dantas
| first2            = M.A.R
| last3        = de Macedo
| first3            =  Douglas D.J.
| last4        =  Bauer
| first4            = Michael A
| title          = A Data Management in a Private Cloud Storage Environment Utilizing High Performance Distributed File Systems
| periodical      = Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE), 2013 IEEE 22nd International Workshop on
| layurl = http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=6569034
| year          = 2013
| doi            = 10.1109/WETICE.2013.12
| others =  nf. & Statistic Dept. (INE), Fed. Univ. of Santa Catarina (UFSC), Florianopolis, Brazil
| pages =      158 - 163
}}
#* {{cite journal
| id = Adamov
| ref=harv
| last1        = Adamov 
| first1            = Abzetdin 
| title          = Distributed File System as a basis of Data-Intensive Computing
| periodical      = Application of Information and Communication Technologies (AICT), 2012 6th International Conference on 
| layurl = http://ieeexplore.ieee.org.docproxy.univ-lille1.fr/xpl/mostRecentIssue.jsp?punumber=6385344
| year          = 2012
| doi            = 10.1109/ICAICT.2012.6398484
| others =  Comput. Eng. Dept., Qafqaz Univ., Baku, Azerbaijan
| pages = 1–3
}}
#* {{cite journal
| id = Schwan
| ref=harv
| author    =  Schwan Philip
| title          = Lustre: Building a File System for 1,000-node Clusters
| periodical      = Proceedings of the 2003 Linux Symposium 
| layurl = http://www.linuxsymposium.org/archives/OLS/Reprints-2003/
| year          = 2003
| url            = https://www.kernel.org/doc/ols/2003/ols2003-pages-380-386.pdf
| others = Cluster File Systems, Inc.
| pages = 400-407
}}
#* {{cite journal
| id = Jones
| ref=harv
| last1    =  Jones
|first1=Terry
  | last2=  Koniges
  |first2=Alice
|last3= Yates
|first3=R. Kim
| title          = Performance of the IBM General Parallel File System
| periodical      = Parallel and Distributed Processing Symposium, 2000. IPDPS 2000. Proceedings. 14th International
| layurl = http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=6818
| url            = https://computing.llnl.gov/code/sio/GPFS_performance.pdf
|year =2000
| others = Lawrence Livermore National Laboratory
}}
#* {{cite journal
| id = Weil
| ref=harv
| last1        =  Weil   
| first1            = Sage A.
| last2 =Brandt
| first2=Scott A.
| last3 =Miller
| first3=Ethan L.
| last4 =Long
| first4= Darrell D. E.
| title          =Ceph: A Scalable, High-Performance Distributed File System
| year          = 2006
| url            = http://www.ssrc.ucsc.edu/Papers/weil-osdi06.pdf
| others =  University of California, Santa Cruz
}}
#* {{cite journal
| id = MALTZAHN
| ref=harv
| last1        =  MALTZAHN 
| first1            = CARLOS
| last2 =  MOLINA-ESTOLANO 
| first2=ESTEBAN
| last3 = KHURANA
| first3=AMANDEEP
| last4 =NELSON
| first4= ALEX J.
| last5 = BRANDT
|first5= SCOTT A.
|last6=WEIL
|first6=SAGE
| title          =Ceph as a scalable alternative to the Hadoop Distributed FileSystem
| year          = 2010
| url            =https://www.usenix.org/legacy/publications/login/2010-08/openpdfs/maltzahn.pdf
}}
#* {{cite journal
| id =Brandt
| ref=harv
| last1        = S.A. 
| first1            = Brandt 
| last2        =  E.L.
| first2            = Miller
| last3        =  D.D.E.
| first3            = Long
| last4        = Lan
| first4            = Xue
| title          = Efficient metadata management in large distributed storage systems
| periodical      = Mass Storage Systems and Technologies, 2003. (MSST 2003). Proceedings. 20th IEEE/11th NASA Goddard Conference on
| layurl = http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=8502
| year          = 2003
| doi            = 10.1109/MASS.2003.1194865
| others =  Storage Syst. Res. Center, California Univ., Santa Cruz, CA, USA
| pages = 290–298
}}
#* {{cite journal
| id =Gibson 
| ref=harv
| last1        = Garth A. 
| first1            = Gibson   
| last2        = Rodney 
| first2            = Van Meter
| title          = Network attached storage architecture
| periodical      = COMMUNICATIONS OF THE ACM
| volume          = 43
| number        = 11
| year          = November 2000
| url      =http://www.cs.cmu.edu/~garth/CACM/CACM00-p37-gibson.pdf
}}
#* {{cite journal
| id =Yee 
| ref=harv
| last1        =  Yee 
| first1            = Tin Tin 
| last2        =  Thu Naing
| first2            =  Thinn
| title          =  PC-Cluster based Storage System Architecture for Cloud Storage
| periodical      = The Smithsonian/NASA Astrophysics Data System
| year          = 2011
| url      =http://arxiv.org/abs/1112.2025
 
}}
#* {{cite journal
| id =Khaing
| ref=harv
| last1        = Cho Cho   
| first1            = Khaing 
| last2        = Thinn Thu
| first2            = Naing
| title          = The efficient data storage management system on cluster-based private cloud data center 
| periodical      = Cloud Computing and Intelligence Systems (CCIS), 2011 IEEE International Conference on 
| layurl = http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=6034549
| year          = 2011
| doi            = 10.1109/CCIS.2011.6045066
| pages = 235–239
 
}}
#* {{cite journal
| id =Brandt
| ref=harv
| last1        = S.A. 
| first1            = Brandt 
| last2        =  E.L.
| first2            = Miller
| last3        =  D.D.E.
| first3            = Long
| last4        = Lan
| first4            = Xue
| title          = A carrier-grade service-oriented file storage architecture for cloud computing
| periodical      = Web Society (SWS), 2011 3rd Symposium on 
| layurl =http://ieeexplore.ieee.org.docproxy.univ-lille1.fr/xpl/mostRecentIssue.jsp?punumber=6093898
| year          = 2011
| doi            = 10.1109/SWS.2011.6101263
| others =  PCN&CAD Center, Beijing Univ. of Posts & Telecommun., Beijing, China
| pages = 16–20
 
}}
#* {{cite journal
| id = Ghemawat
| ref=harv
| last1        = Ghemawat 
| first1            =Sanjay   
| last2        =  Gobioff 
| first2            =Howard 
| last3        =  Leung
| first3            =Shun-Tak
| title          = The Google File System
| periodical      = SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles 
| layurl =http://www.acm.org/publications
| year = 2003
| doi            = 10.1145/945445.945450
| pages = 29–43
 
}}
#Security Concept
#* {{cite journal
| id = Vecchiola 
| ref=harv
| last1        =  Vecchiola
| first1            = C 
| last2        =  Pandey
| first2            = S
| last3        = Buyya
| first3            = R
| title          = High-Performance Cloud Computing: A View of Scientific Applications
| periodical      =Pervasive Systems, Algorithms, and Networks (ISPAN), 2009 10th International Symposium on
| layurl = http://ieeexplore.ieee.org.docproxy.univ-lille1.fr/xpl/mostRecentIssue.jsp?punumber=5379703
| year          = 2009
| doi            = 10.1109/I-SPAN.2009.150
| others =  Dept. of Comput. Sci. & Software Eng., Univ. of Melbourne, Melbourne, VIC, Australia
| pages = 4–16
}}
#* {{cite journal
| id = Miranda  
| ref=harv
| last1        =  Miranda
| first1            = Mowbray 
| last2        =  Siani
| first2            =  Pearson
| title          = A client-based privacy manager for cloud computing
| periodical      =COMSWARE '09 Proceedings of the Fourth International ICST Conference on COMmunication System softWAre and middlewaRE
| layurl = http://www.comsware.org/
| year          = 2009
| doi            = 10.1145/1621890.1621897 
}}
 
#* {{cite journal
| id = Michael 
| ref=harv
| last1        = Naehrig
| first1            = Michael 
| last2        = Lauter
| first2            =  Kristin
| title          =Can homomorphic encryption be practical?
| periodical      =CCSW '11 Proceedings of the 3rd ACM workshop on Cloud computing security workshop
| layurl = http://www.sigsac.org/ccs/CCS2011/
| year          = 2013
| doi            = 10.1145/2046660.2046682
| pages = 113–124   
}}
#* {{cite journal
| id = Hongtao 
| ref=harv
| last1        = Du 
| first1            = Hongtao 
| last2        = Li
| first2            = Zhanhuai
| title          = Efficient metadata management in large distributed storage systems
| periodical      = Measurement, Information and Control (MIC), 2012 International Conference on
|volume          = 1
| layurl = http://ieeexplore.ieee.org.docproxy.univ-lille1.fr/xpl/mostRecentIssue.jsp?punumber=6261643
| year          = 2012
| doi            = 10.1109/MIC.2012.6273264
| others =  Comput. Coll., Northwestern Polytech. Univ., Xi'an, China
| pages = 327–331
 
}}
#* {{cite journal
| id =Scott 
| ref=harv
| last1        = A.Brandt 
| first1            = Scott   
| last2        = L.Miller
| first2            = Ethan
| last3        = D.E.Long
| first3            = Darrell 
| last4        = Xue
| first4            = Lan
| title          =Efficient Metadata Management in Large Distributed Storage Systems
| periodical      = 11th NASA Goddard Conference on Mass Storage Systems and Technologies,SanDiego,CA 
| year          = 2003
| url      =http://www.ssrc.ucsc.edu/Papers/brandt-mss03.pdf
| others =  Storage Systems Research Center University of California,Santa Cruz
 
}}
#* {{cite journal
| id = Kaufman
| ref=harv
| author        = Lori M. Kaufman
| title          =Data Security in the World of Cloud Computing
| periodical      = Security & Privacy, IEEE  (Volume:7 ,  Issue: 4 ) 
| layurl = http://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=8013
| year          = 2009
| doi            = 10.1109/MSP.2009.87
| pages = 161–64
}}
#* {{cite journal
| id = HAIL
| ref=harv
| last1        =  Bowers
| first1            = Kevin   
| last2        = Juels
| first2            = Ari
| last3        =  Oprea
| first3            =Alina
| title          = HAIL: a high-availability and integrity layer for cloud storage
| periodical      = Proceedings of the 16th ACM conference on Computer and communications security
| layurl = http://www.sigsac.org/ccs/CCS2009/
| year          = 2009
| doi            = 10.1145/1653662.1653686
| pages = 187–198
 
}}
#* {{cite journal
| id = Ari Juels
| ref=harv
| last1        =  Juels
| first1            = Ari  
| last2        = Oprea
| first2            =Alina 
| title          = New approaches to security and availability for cloud data
| periodical      = Magazine Communications of the ACM CACM Homepage archive Volume 56 Issue 2, February 2013
| layurl = http://cacm.acm.org/
| year          = 2013
| doi            = 10.1145/2408776.2408793
| pages = 64–73
 
}}
#* {{cite journal
| id = Jing 
| ref=harv
| last1        = Zhang   
| first1            = Jing 
| last2        =  Wu
| first2            = Gongqing
| last3        =  Hu
| first3            = Xuegang
| last4        = Wu
| first4            =  Xindong
| title          = A Distributed Cache for Hadoop Distributed File System in Real-Time Cloud Services 
| periodical      = Grid Computing (GRID), 2012 ACM/IEEE 13th International Conference on
| layurl = http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=6317268
| year          = 2012
| doi            = 10.1109/Grid.2012.17
| others =  Dept. of Comput. Sci., Hefei Univ. of Technol., Hefei, China
| pages = 12–21
 
}}
#* {{cite journal
| id = Pan
| ref=harv 
| last1        = A. 
| first1            = Pan   
| last2        =  J.P.
| first2            = Walters
| last3        =  V.S.
| first3            = Pai
| last4        = D.-I.D.
| first4            =  Kang
| last5        = S.P.
| first5            =  Crago
| title          =Integrating High Performance File Systems in a Cloud Computing Environment
| periodical      = High Performance Computing, Networking, Storage and Analysis (SCC), 2012 SC Companion:
| layurl = http://ieeexplore.ieee.org.docproxy.univ-lille1.fr/xpl/mostRecentIssue.jsp?punumber=6494369
| year          = 2012
| doi            = 10.1109/SC.Companion.2012.103
| others =  Dept. of Electr. & Comput. Eng., Purdue Univ., West Lafayette, IN, USA
| pages = 753–759
 
}}
#* {{cite journal
| id = Fan-Hsun
| ref=harv
| last1        =  Fan-Hsun 
| first1            = Tseng   
| last2        =  Chi-Yuan
| first2            =Chen 
| last3        =  Li-Der
| first3            = Chou
| last4        =  Han-Chieh
| first4            =Chao 
| title          =Implement a reliable and secure cloud distributed file system
| periodical      = Intelligent Signal Processing and Communications Systems (ISPACS), 2012 International Symposium on 
| layurl = http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=6470430
| year          = 2012
| doi            = 10.1109/ISPACS.2012.6473485
| others =  Dept. of Comput. Sci. & Inf. Eng., Nat. Central Univ., Taoyuan, Taiwan
| pages = 227–232
}}
#* {{cite journal
| ref=harv
| id = Di Sano
| last1        = Di Sano 
| first1            = M   
| last2        =  Di Stefano
| first2            = A
| last3        = Morana
| first3            = G
| last4        = Zito
| first4            = D
| title          =File System As-a-Service: Providing Transient and Consistent Views of Files to Cooperating Applications in Clouds
| periodical      = 2012 IEEE 21st International Workshop on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE)
| layurl = http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=6269211
| year          = 2012
| doi            = 10.1109/WETICE.2012.104
| others =  Dept. of Electr., Electron. & Comput. Eng., Univ. of Catania, Catania, Italy
| pages = 173–178
}}
#* {{cite journal
| ref=harv
| id = Zhifeng
| last1        = Xiao
| first1            = Zhifeng
| last2        = Xiao
| first2            = Yang
| title          = Security and Privacy in Cloud Computing
| periodical      = IEEE Communications Surveys & Tutorials, Volume 15, Issue 2
| layurl = https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=9739
| year          = 2013
| doi            = 10.1109/SURV.2012.060912.00182
| pages = 843–859
 
}}
#* {{Cite web
| ref=harv
| id = Horrigan
| last1        = Horrigan
| first1            = John B.
| title          = Use of cloud computing applications and services
| year          = 2008
| url      = http://www.pewinternet.org/~/media//Files/Reports/2008/PIP_Cloud.Memo.pdf.pdf
 
}}
#* {{cite journal
| ref=harv
| id = Stephen
| last1        = Yau
| first1            = Stephen
| last2        =  An
| first2            =Ho 
| title          = Confidentiality Protection in cloud computing systems
| periodical      = International Journal of Software and Informatics, Volume 4, Issue 4
| year          = 2010
| url      = http://www.ijsi.org/ch/reader/create_pdf.aspx?file_no=i68&flag=&journal_id=ijsi&year_id=2010
| pages = 351–365
 
}}
#* {{cite journal
| ref=harv
| id = Carnegie   
| last1        = Fan
| first1            = Bin
| last2        = Tantisiriroj   
| first2            = Wittawat
| last3        = Xiao   
| first3            = Lin
| last4        = Gibson   
| first4            = Garth  
| title          = DiskReduce: RAID for data-intensive scalable computing
| periodical      = PDSW '09 Proceedings of the 4th Annual Workshop on Petascale Data Storage
| layurl = http://cacm.acm.org/
| year          = 2009 
| doi            = 10.1145/1713072.1713075
| pages = 6–10
}}
#* {{cite journal
| ref=harv
| id = Changsheng 
| last1        = Wang   
| first1            = Jianzong
| last2        = Gong   
| first2            = Weijiao
| last3        = Varman
| first3            = P.
| last4        = Xie   
| first4            = Changsheng  
| title          = Reducing Storage Overhead with Small Write Bottleneck Avoiding in Cloud RAID System
| periodical      = 2012 ACM/IEEE 13th International Conference on Grid Computing (GRID)
| layurl = http://ieeexplore.ieee.org/
| year          = 2012 
| doi            = 10.1109/Grid.2012.29
| pages = 174–183
}}
#* {{cite journal
| ref=harv
| id = Hussam   
| last1        = Abu-Libdeh   
| first1            = Hussam
| last2        = Princehouse   
| first2            = Lonnie
| last3        = Weatherspoon   
| first3            = Hakim 
| title          = RACS: a case for cloud storage diversity
| periodical      = SoCC '10 Proceedings of the 1st ACM symposium on Cloud computing
| layurl = http://cacm.acm.org/
| year          = 2010
| doi            = 10.1145/1807128.1807165
| pages = 229–240
}}
#* {{cite journal
| ref=harv
| id = Vogels   
| last1        = Vogels   
| first1            = Werner  
| title          = Eventually consistent
| periodical      = Communications of the ACM, Volume 52, Issue 1
| layurl = http://cacm.acm.org/
| year          = 2009
| doi            = 10.1145/1435417.1435432
| pages = 40–44
}}
#* {{cite journal
| ref=harv
| id = Cuong
| last1        = Pham
| first1            = Cuong
| last2        = Cao
| first2            = Phuong 
| last3        =  Kalbarczyk
| first3            = Z
| last4        =    Iyer
| first4            =R.K
| title          = Toward a high availability cloud: Techniques and challenges
| periodical      = 2012 IEEE/IFIP 42nd International Conference on Dependable Systems and Networks Workshops (DSN-W)
| layurl = http://ieeexplore.ieee.org/
| year          = 2012
| doi            = 10.1109/DSNW.2012.6264687
| pages = 1–6
}}
#* {{cite journal
| ref=harv
| id = Undheim
| last1        = Undheim
| first1            = A.
| last2        = Chilwan
| first2            = A.
| last3        = Heegaard
| first3            = P.
| title          = Differentiated Availability in Cloud Computing SLAs
| periodical      = 2011 12th IEEE/ACM International Conference on Grid Computing (GRID)
| layurl = http://ieeexplore.ieee.org/
| year          = 2011
| doi            = 10.1109/Grid.2011.25
| pages = 129–136
}}
#* {{cite journal
| ref=harv
| id = Medhi   
| last1        = Qian
| first1            = Haiyang
| last2        = Medhi
| first2            = D.
| last3        = Trivedi
| first3            = K.S.
| title          = A hierarchical model to evaluate quality of experience of online services hosted by cloud computing
| periodical      = 12th IFIP/IEEE International Symposium on Integrated Network Management (IM 2011)
| layurl = http://ieeexplore.ieee.org/
| year          = 2011
| doi            = 10.1109/INM.2011.5990680
| pages = 105–112
}}
#* {{cite journal
| ref=harv
| id = Giuseppe
| last1        = Ateniese  
| first1            = Giuseppe 
| last2        = Burns   
| first2            = Randal  
| last3        = Curtmola   
| first3          = Reza  
| last4        = Herring   
| first4            = Joseph  
| last5        = Kissner   
| first5            = Lea  
| last6        = Peterson   
| first6            = Zachary
| last7        = Song   
| first7            = Dawn
| title          = Provable data possession at untrusted stores
| periodical      = CCS '07 Proceedings of the 14th ACM conference on Computer and communications security
| layurl = http://cacm.acm.org/
| year          = 2007
| doi            = 10.1145/1315245.1315318
| pages =  598–609
}}
#* {{cite journal
| ref=harv
| id = Ateniese   
| last1        = Ateniese  
| first1            = Giuseppe 
| last2        = Di Pietro 
| first2            = Roberto
| last3        = Mancini
| first3            = Luigi V.
| last4        = Tsudik   
| first4            = Gene
| title          = Scalable and efficient provable data possession
| periodical      = Proceedings of the 4th international conference on Security and privacy in communication networks
| layurl = http://cacm.acm.org/
| year          = 2008
| doi            = 10.1145/1460877.1460889
}}
#* {{cite journal
| ref=harv
| id = Erway   
| last1        = Erway   
| first1            = Chris
| last2        = Küpçü   
| first2            = Alptekin
| last3        = Tamassia   
| first3            = Roberto
| last4        = Papamanthou   
| first4            = Charalampos  
| title          = Dynamic provable data possession
| periodical      = Proceedings of the 16th ACM conference on Computer and communications security
| layurl = http://cacm.acm.org/
| year          = 2009
| doi            = 10.1145/1653662.1653688
| pages = 213–222
}}
#* {{cite journal
| ref=harv
| id = Burton   
| last1        = Juels   
| first1            = Ari
| last2        = Kaliski
| first2            = Burton S., Jr.
| title          = Pors: proofs of retrievability for large files
| periodical      = Proceedings of the 14th ACM conference on Computer and communications security
| layurl = http://cacm.acm.org/
| year          = 2007
| doi            = 10.1145/1315245.1315317
| pages = 584–597
}}
#* {{cite journal
| ref=harv
| id =  Bonvin 
| last1        =  Bonvin 
| first1            =Nicolas 
| last2        =  Papaioannou  
| first2            =Thanasis 
| last3        = Aberer 
| first3            = Karl   
| title          = A self-organized, fault-tolerant and scalable replication scheme for cloud storage
| periodical      = SoCC '10 Proceedings of the 1st ACM symposium on Cloud computing
| layurl = http://www.sigmod.org/
| year          = 2010
| doi            = 10.1145/1807128.1807162
| pages = 205–216
 
}}
#* {{cite journal
| ref=harv
| id = Kraska
| last1        = Kraska
| first1            = Tim
| last2        = Hentschel
| first2            = Martin
| last3        = Alonso
| first3            = Gustavo
| last4        = Kossmann
| first4            = Donald
| title          = Consistency rationing in the cloud: pay only when it matters
| periodical      = Proceedings of the VLDB Endowment, Volume 2, Issue 1
| layurl = http://www.eecs.umich.edu/db/pvldb/
| year          = 2009
| pages = 253–264
}}
#* {{cite journal
| ref=harv
| id = Abadi  
| last1        = Abadi
| first1            = Daniel J.
| title          = Data Management in the Cloud: Limitations and Opportunities
| periodical      = IEEE Data Engineering Bulletin
| layurl = http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=867310AB38EE46A5E505E698E2F8C82F?doi=10.1.1.178.200&rep=rep1&type=pdf
| url      = ftp://131.107.65.22/pub/debull/A09mar/abadi.pdf
| year          = 2009
}}
#Synchronization
#* {{cite journal
| ref=harv
| id = Uppoor 
| last1        =  Uppoor
| first1            = S
| last2        =  Flouris  
| first2        = M.D 
| last3        =  Bilas  
| first3        = A 
| title          = Cloud-based Synchronization of Distributed File System Hierarchies
| periodical      = 2010 IEEE International Conference on Cluster Computing Workshops and Posters (CLUSTER WORKSHOPS)
| layurl = http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=5606060
| year          = 2010
| doi            = 10.1109/CLUSTERWKSP.2010.5613087
| pages = 1–4
| others    =Inst. of Comput. Sci. (ICS), Found. for Res. & Technol. - Hellas (FORTH), Heraklion, Greece
}}
#Economic aspects
#* {{cite journal
| last1        = Kaufman
| first1            = Lori M.
| title          = Data Security in the World of Cloud Computing
| periodical      = IEEE Security & Privacy, Volume 7, Issue 4
| layurl = http://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=8013
| year          = 2009
| doi            = 10.1109/MSP.2009.87
| pages = 161–164
}}
#* {{cite conference
| ref=harv
| id = Lia
| last1        = Marston
| first1        = Sean
| last2        = Li
| first2            = Zhi
| last3        = Bandyopadhyay
| first3            = Subhajyoti
| last4        = Zhang
| first4            = Juheng
| last5        =  Ghalsasi
| first5        = Anand
| title          = Cloud computing — The business perspective
| conference      = Decision Support Systems, Volume 51, Issue 1
| conferenceurl = http://www.sciencedirect.com/science/journal/01679236
| year          = 2011
| doi            = 10.1016/j.dss.2010.12.006
| pages = 176–189 
}}
#* {{cite journal
| id = Angabini
| last1        = Angabini
| first1            = A   
| last2        = Yazdani
| first2            = N
| last3        = Mundt 
| first3        = T
| last4        = Hassani
| first4        = F
| title          = Suitability of Cloud Computing for Scientific Data Analyzing Applications; An Empirical Study
| periodical      = 2011 International Conference on P2P, Parallel, Grid, Cloud and Internet Computing (3PGCIC)
| layurl = http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=6099686
| year          = 2011
| doi            = 10.1109/3PGCIC.2011.37
|others= Sch. of Electr. & Comput. Eng., Univ. of Tehran, Tehran, Iran
| pages =193–199   
|ref=harv
}}
 
{{Cloud computing}}
 
[[Category:Cloud computing]]
