Cloud & Big Data Research Projects

Cloud Storage

Cloud storage has gained great prominence in both academia and industry. Almost every infrastructure-related company, including Google, Amazon, Microsoft, HP, IBM, Salesforce, and Dell, provides its own cloud storage service. Cloud storage is also a very cost-effective option for users who need to store data. It stores data in logical storage pools that span multiple physical disks in one data center or in multiple geographically distributed data centers. How the data are stored and served is managed entirely by the service provider.

Cloud storage is a broad topic that spans multiple areas. Our group focuses mainly on the following areas.
  Data management. Cloud storage has availability, consistency, and scalability requirements. Availability means that the data must be accessible whenever a user requests it. Consistency means that the data a user reads must be the correct version. Scalability means that cloud storage should scale easily as the volume of data increases. In this research, we are exploring how to design cloud storage systems and manage data so that we can ensure high availability, consistency, and scalability.
  Virtualization. Virtualization is a building block for cloud services. People who deploy their services in the cloud often use both compute and storage services: they run their applications in the cloud and store their data in cloud storage at the same time. Traditionally, these applications are deployed in virtual machines (VMs). Cloud providers install hypervisors on their servers to virtualize each physical server into multiple VMs, which achieves higher density and flexibility. In this environment, how to manage data access from VMs to cloud storage for better performance is an open problem. For example, after an application issues an I/O request to cloud storage, the VM where the application resides may replicate that request so that copies can be sent to different replicas of the data. The application then uses the fastest response. The current trend in virtualization favors a more lightweight solution: containers. Among container management systems, Docker is one of the most successful. Docker containers achieve much higher density and are much more flexible than traditional VMs, and they consume less CPU and memory. A container can boot in 0.1s and reboot in 2s, a tremendous improvement. For storage, containers can achieve close to native access performance. We are exploring how Docker containers will change the traditional cloud.
  Local and Cloud Storage. This research takes into account the differences between local and cloud storage, e.g., capacity, speed, reliability, and cost. We seek to combine local and cloud storage and to access data in a way that accounts for these disparities.
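
The replicated I/O request described under Virtualization can be sketched as follows. This is a minimal Python illustration under stated assumptions, not our group's implementation: the replica names, the latencies, and the functions read_from_replica and replicated_read are all hypothetical, and network reads are simulated with a sleep.

```python
import concurrent.futures
import time

# Hypothetical per-replica latencies in seconds (assumption for illustration).
# A real system would issue network reads instead of sleeping.
REPLICA_LATENCY = {"replica-a": 0.30, "replica-b": 0.05, "replica-c": 0.20}


def read_from_replica(replica, key):
    """Simulate reading `key` from one replica after that replica's latency."""
    time.sleep(REPLICA_LATENCY[replica])
    return replica, f"value-of-{key}"


def replicated_read(key, replicas):
    """Send the same read to every replica and return the first response."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=len(replicas))
    futures = [pool.submit(read_from_replica, r, key) for r in replicas]
    try:
        # as_completed yields futures in the order they finish,
        # so the first one yielded is the fastest replica.
        for future in concurrent.futures.as_completed(futures):
            return future.result()
    finally:
        # Do not block on the slower replicas; their results are discarded.
        pool.shutdown(wait=False)


if __name__ == "__main__":
    winner, value = replicated_read("block-42", list(REPLICA_LATENCY))
    print(f"fastest replica: {winner}, data: {value}")
```

The trade-off in this sketch is extra load for lower latency: every read is issued to all replicas, so the storage side serves several requests for each one the application actually uses.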