Department of Grid Storage

HYDRAstor


Overview

This research aims to provide a scale-out storage platform built from a community of nodes that operate as a single system and offer a set of data management services. The research led to a product launched by NEC, available in the US and Japan. More information is available on the HYDRAstor home page at NEC America.

Projects


Hydra Block Store

Hydra Block Store provides a single pool of content-addressable blocks, which allows duplicate elimination to work across all the storage nodes in the pool. It supports dynamic node additions and removals, is resilient to multiple disk and node failures, and can recover from such failures without administrator intervention. It is optimized for high-throughput read and write access for single or multiple concurrent streams.
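To illustrate why a single pool of content-addressable blocks makes duplicate elimination global, here is a minimal in-memory sketch. It is not the Hydra Block Store interface: the class name `BlockPool`, its methods, and the choice of SHA-256 as the address function are all assumptions made for this example.

```python
import hashlib


class BlockPool:
    """Toy in-memory pool of content-addressable blocks (hypothetical API)."""

    def __init__(self):
        self._blocks = {}       # address (hex digest) -> block bytes
        self.bytes_written = 0  # bytes offered by writers
        self.bytes_stored = 0   # bytes kept after duplicate elimination

    def put(self, data: bytes) -> str:
        """Store a block and return its content-derived address."""
        address = hashlib.sha256(data).hexdigest()  # assumed address function
        self.bytes_written += len(data)
        if address not in self._blocks:
            # Only previously unseen content consumes space.
            self._blocks[address] = data
            self.bytes_stored += len(data)
        return address

    def get(self, address: str) -> bytes:
        """Return the block stored under the given address."""
        return self._blocks[address]

    def contains(self, address: str) -> bool:
        """Existence query: is a block with this address already stored?"""
        return address in self._blocks


pool = BlockPool()
a1 = pool.put(b"backup data" * 100)
a2 = pool.put(b"backup data" * 100)  # same content arriving from a second stream
a3 = pool.put(b"other data")
print(a1 == a2, pool.bytes_written, pool.bytes_stored)
```

Because the address depends only on the content, two writers that submit the same block receive the same address, and the second write adds no stored bytes regardless of which stream or node it came from.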

Hydra File System

HydraFS is a file system that uses Hydra Block Store as persistent storage and is designed for high-throughput streaming workloads. The combination of block immutability, high latency of I/O operations, and high bandwidth requirements poses interesting challenges for the architecture, design, and implementation of the file system.

Distributed Load Balancing File System

The DLBFS project aims to extend HydraFS into a distributed file system that is capable not only of fail-over but also of non-disruptive dynamic migration of mount points for load balancing. As in the case of HydraFS, the nature of content-addressable storage makes this problem differ in important ways from those typically encountered in distributed file systems.

Content Defined Chunking

Using content-defined chunking in HYDRAstor brings two challenges. The first concerns speed: to achieve in-line deduplication at high throughput, chunking needs to be both fast and efficient. The second concerns the duplicate elimination that results from the chunking process. We are investigating algorithms capable of producing larger average chunk sizes while retaining the duplicate elimination ratios achievable with smaller chunks.
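For readers unfamiliar with content-defined chunking, the sketch below shows the general idea with a Gear-style rolling hash: a chunk boundary is declared wherever the low bits of the hash are zero, so boundaries follow the content rather than fixed offsets. This is a generic illustration, not the chunker used in HYDRAstor; the function name `chunk`, the table `GEAR`, and the size parameters are assumptions chosen for the example.

```python
import random

_rng = random.Random(0)  # fixed seed so the table is reproducible
GEAR = [_rng.getrandbits(64) for _ in range(256)]  # one random value per byte
MASK64 = (1 << 64) - 1


def chunk(data: bytes, avg_bits: int = 12, min_size: int = 1024,
          max_size: int = 16384):
    """Split data into content-defined chunks; return a list of bytes."""
    cut_mask = (1 << avg_bits) - 1  # boundary when the low avg_bits bits are 0
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # Gear-style update: older bytes shift out of the low bits.
        h = ((h << 1) + GEAR[byte]) & MASK64
        length = i - start + 1
        if (length >= min_size and (h & cut_mask) == 0) or length >= max_size:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


_src = random.Random(1)
data = bytes(_src.getrandbits(8) for _ in range(200_000))
chunks = chunk(data)
shifted = chunk(b"XYZ" + data)  # same content, shifted by three bytes
shared = set(chunks) & set(shifted)
print(len(chunks), "chunks,", len(shared), "unchanged after the shift")
```

The usage lines prepend three bytes to the input and count how many chunks are identical in both runs. With fixed-size blocks every block would change after such a shift, whereas content-defined boundaries tend to realign after the first chunk or two. Raising `avg_bits` gives larger average chunks, which is the trade-off against duplicate elimination that the paragraph above describes.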

We developed algorithms that achieve average chunk sizes 2-4 times larger for comparable duplicate elimination. They do so by using knowledge about input stream properties and by relying on the ability of the Hydra Block Store to quickly answer queries about the existence of already stored chunks.
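One way an existence query could be used to favor large chunks is sketched below. It is an assumed strategy for illustration only, not a description of the algorithms developed in this project: the stream is first chunked coarsely, and a new big chunk is re-chunked finely only when it sits next to a big chunk the store already holds, where a duplicate region is likely to begin or end. The names `address`, `select_chunks`, `exists`, and `split` are hypothetical.

```python
import hashlib


def address(chunk: bytes) -> str:
    """Content-derived address of a chunk (SHA-256 is an assumption)."""
    return hashlib.sha256(chunk).hexdigest()


def select_chunks(big_chunks, exists, split):
    """Choose which chunks to emit for one stream.

    big_chunks: list of bytes produced by a coarse chunker
    exists:     callable(address) -> bool, the store's existence query
    split:      callable(bytes) -> list of bytes, a finer chunker
    """
    is_dup = [exists(address(c)) for c in big_chunks]
    out = []
    for i, c in enumerate(big_chunks):
        near_dup = (i > 0 and is_dup[i - 1]) or \
                   (i + 1 < len(big_chunks) and is_dup[i + 1])
        if not is_dup[i] and near_dup:
            out.extend(split(c))  # transition region: keep boundaries fine
        else:
            out.append(c)         # duplicate, or deep inside new data: keep big
    return out


stored = {address(b"A" * 8)}                    # the store already holds this chunk
big = [b"A" * 8, b"B" * 8, b"C" * 8, b"D" * 8]  # toy coarse chunks
halves = lambda c: [c[:4], c[4:]]               # toy finer chunker
result = select_chunks(big, stored.__contains__, halves)
# result == [b"AAAAAAAA", b"BBBB", b"BBBB", b"CCCCCCCC", b"DDDDDDDD"]
```

In this toy run only the new chunk adjacent to the known duplicate is split, while the new chunks further away stay large. The approach depends on existence queries being cheap, since every coarse chunk triggers one.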

Primary Storage

The Hydra File System is optimized for high-throughput streaming read and write operations. However, due to the high latency of block store operations, it exhibits poor performance for metadata-intensive workloads. This work uses solid-state drives (SSDs) to absorb the latency cost of metadata-intensive operations, enabling the Hydra File System to perform well enough to be used as primary storage.



NEC Laboratories America, Inc.
Princeton Campus - 4 Independence Way, Suite 200, Princeton NJ 08540   |    Cupertino Campus - 10080 North Wolfe Road, Suite SW3-350, Cupertino, CA 95014
webmaster@nec-labs.com   ©2008 NEC Laboratories America, Inc. All rights reserved.