Department of Grid Storage

Projects


Hydra Block Store

Hydra Block Store provides a single pool of content-addressable blocks, which allows duplicate elimination to work across all storage nodes in the pool. It supports dynamic node additions and removals, is resilient to multiple disk and node failures, and can recover from such failures without administrator intervention. It is optimized for high-throughput read and write access for single or multiple concurrent streams. See HYDRAstor.
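To illustrate why content addressing makes duplicate elimination straightforward, here is a minimal single-process sketch in Python. It is not the Hydra Block Store design: the `ToyBlockStore` class, its `put`/`get` methods, and the choice of SHA-256 as the address function are all assumptions made for this example, and distribution, resiliency, and recovery are left out entirely.

```python
import hashlib


class ToyBlockStore:
    """Hypothetical in-memory content-addressable store (illustration only)."""

    def __init__(self):
        # Maps a content address (hex digest) to the block's bytes.
        self._blocks = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content itself (SHA-256 is an
        # assumption here, not something the project description specifies).
        address = hashlib.sha256(data).hexdigest()
        # Identical content yields the same address, so a duplicate write
        # does not consume additional space.
        self._blocks.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        return self._blocks[address]

    def block_count(self) -> int:
        return len(self._blocks)


store = ToyBlockStore()
first = store.put(b"nightly backup segment")
second = store.put(b"nightly backup segment")  # duplicate content
third = store.put(b"a different segment")
```

In this sketch the two writes of identical content return the same address and occupy one slot, which is the property that lets a shared pool deduplicate across writers. Blocks are also naturally immutable: changing the content changes the address.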

Hydra File System

HydraFS is a file system that uses Hydra Block Store as persistent storage and is designed for high-throughput streaming workloads. The combination of block immutability, high latency of I/O operations, and high bandwidth requirements poses interesting challenges for the architecture, design, and implementation of the file system. See HYDRAstor.

Distributed Load Balancing File System

The DLBFS project aims to extend HydraFS into a distributed file system that is capable not only of failover but also of non-disruptive dynamic migration of mount points for load balancing. As in the case of HydraFS, the nature of content-addressable storage makes this problem differ in important ways from those typically encountered in distributed file systems. See HYDRAstor.

Content Defined Chunking

Using content-defined chunking in HYDRAstor brings two challenges. The first concerns speed: to achieve in-line deduplication at high throughput, chunking needs to be both fast and efficient. The second concerns the duplicate elimination induced by the chunking process. We are investigating algorithms capable of producing larger average chunk sizes while retaining the duplicate elimination ratios achievable with smaller chunks. See HYDRAstor.
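For readers unfamiliar with content-defined chunking, the following Python sketch shows the general idea using a gear-style rolling hash. It is a generic illustration and not the algorithm under investigation in this project: the function name `chunk_stream`, the seeded `GEAR` table, and the parameters `avg_bits`, `min_size`, and `max_size` are all assumptions chosen for the example.

```python
import random

# Hypothetical lookup table of 256 pseudo-random 64-bit values, one per byte
# value. A fixed seed keeps chunk boundaries reproducible across runs.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(64) for _ in range(256)]
MASK64 = 0xFFFFFFFFFFFFFFFF


def chunk_stream(data: bytes, avg_bits: int = 12,
                 min_size: int = 1024, max_size: int = 16384):
    """Split data into variable-size chunks at content-defined boundaries."""
    mask = (1 << avg_bits) - 1
    chunks = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        # Gear-style rolling hash: older bytes are shifted out of the low
        # bits, so the boundary test below depends only on recent content.
        h = ((h << 1) + GEAR[byte]) & MASK64
        length = i - start + 1
        at_boundary = length >= min_size and (h & mask) == 0
        if at_boundary or length >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        # Trailing bytes that never hit a boundary form the last chunk.
        chunks.append(data[start:])
    return chunks


sample = random.Random(1).randbytes(100_000)
pieces = chunk_stream(sample)
```

Because a boundary is chosen from the content near it rather than from a fixed offset, an insertion early in a stream tends to disturb only nearby chunks, and later chunks can still match previously stored ones. Raising `avg_bits` in this sketch gives larger average chunks, which lowers per-chunk overhead but typically finds fewer duplicates; that trade-off is the tension described above.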

Deduplicated Primary Storage

The Hydra File System is optimized for high-throughput streaming read and write operations. However, for metadata-intensive workloads its performance is quite poor due to the high latency of block store operations. This work uses solid-state disks (SSDs) to absorb the latency cost of metadata-intensive operations, enabling the Hydra File System to perform well enough to be used as primary storage. See HYDRAstor.

Energy Efficiency of Distributed Storage Systems

The goal of this project is to reduce energy consumption in distributed storage through autonomic and adaptive placement and caching of data. We are investigating how to best leverage SSDs for this purpose, as well as algorithms for data and metadata placement and caching based on observed access patterns. This work also involves developing simulation tools to help evaluate the algorithms.

Quality of Service in Distributed Storage Systems

One of the challenges in sharing a storage system is that one user's activity interferes with that of other users. The goal of this project is to provide mechanisms through which the storage system can provide quality of service guarantees to multiple users in a way that allows efficient utilization of resources.

Solid State Drives

Because SSDs have many desirable features, we are investigating their use in several of our other projects, for example: keeping key metadata in the Hydra Block Store, providing temporary storage for the file system in the Primary Storage project, and holding indirection maps that support the data placement and caching algorithms in the energy efficiency project. Their use poses interesting problems, ranging from simple optimizations of fundamental data structures that take advantage of the unique characteristics of SSDs to a potential redesign of the storage stack for SSD storage.



NEC Laboratories America, Inc.
Princeton Campus - 4 Independence Way, Suite 200, Princeton NJ 08540   |    Cupertino Campus - 10080 North Wolfe Road, Suite SW3-350, Cupertino, CA 95014
webmaster@nec-labs.com   ©2008 NEC Laboratories America, Inc. All rights reserved.
