CISE Lecture, Oct 31, 2013
by Wilson Rivera Gallego - Thursday, 14 November 2013, 7:51 PM


Title: "Using de-duplication to improve efficiency of data storage"
Dr. Emmanuel Arzuaga, Assistant Professor, Department of Electrical and Computer Engineering, UPRM

Thursday, October 31st, 2013 (10:30AM), M-201


Data de-duplication is a technique for eliminating redundancy in stored data. It is currently used, for example, to improve storage utilization and reduce network data transfers. The de-duplication process consists of identifying unique chunks of data, or byte patterns, and storing them in the storage media. Future writes to the storage system are compared to the stored chunks; when a match is found, the incoming chunk is classified as redundant and a small reference pointing to the stored chunk is saved in its place. In this talk we will present examples of the use of this technology in storage media, virtualization, and network infrastructure. We will then explain a content-based chunking scheme, along with a near-duplicate detection mechanism, that could be used to implement a de-duplication tool.
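
As a rough illustration of the write path described above (store unique chunks, replace repeats with references), here is a minimal Python sketch. It is not the speaker's tool: the `DedupStore` class and its method names are hypothetical, chunks are cut at fixed sizes rather than with the content-based chunking scheme the talk covers, and SHA-256 digests are assumed as the chunk identifiers. Near-duplicate detection is not modeled.

```python
import hashlib


class DedupStore:
    """Toy in-memory de-duplicating store (hypothetical, for illustration)."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}  # digest -> chunk bytes; each unique chunk stored once
        self.files = {}   # name -> ordered list of digests (the references)

    def write(self, name, data):
        """Split data into fixed-size chunks and store only unseen chunks."""
        refs = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                # First time this content is seen: keep the actual bytes.
                self.chunks[digest] = chunk
            # Redundant or not, the file records only a reference.
            refs.append(digest)
        self.files[name] = refs

    def read(self, name):
        """Rebuild the original data by following the stored references."""
        return b"".join(self.chunks[d] for d in self.files[name])

    def stored_bytes(self):
        """Bytes physically kept, counting each unique chunk once."""
        return sum(len(c) for c in self.chunks.values())


# Usage with a tiny chunk size so the repeats are easy to see.
store = DedupStore(chunk_size=4)
store.write("a", b"AAAABBBBAAAA")  # the chunk AAAA appears twice
store.write("b", b"BBBBCCCC")      # the chunk BBBB is already stored
print(len(store.chunks), store.stored_bytes())
```

Fixed-size chunking is the simplest choice but loses matches when data is shifted by an insertion, which is one motivation for content-based chunking, where chunk boundaries depend on the data itself.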


Emmanuel Arzuaga is an Assistant Professor in the Department of Electrical and Computer Engineering of the University of Puerto Rico at Mayaguez (UPRM). He obtained a PhD in Computer Engineering from the Department of Electrical and Computer Engineering at Northeastern University in Boston, MA. He received both his BS and MS degrees in Computer Engineering from UPRM. He worked at UPRM as a Software Developer for the Laboratory for Applied Remote Sensing and Image Processing (LARSIP).


His research interests include virtualization, cloud infrastructure, resource efficiency, I/O workload characterization, and storage systems performance analysis. His current work focuses on modeling and understanding the I/O effect of enterprise server workloads in virtualized environments, with special attention to virtual machine migration and system power consumption. He has also worked on modeling and analyzing the behavior of storage systems. His earlier work involved developing software for remote sensing and pattern recognition.