Research Projects

HybridStore Project (2006-Present)

  • Managing Performance, Power, and Cost in Hybrid Enterprise-scale Storage Systems
    • Flash memory overcomes some key shortcomings of hard disk drives (HDDs): it offers faster access to non-sequential data (when not degraded by garbage collection (GC) overheads) and lower power consumption. Economic forces, driven by the desire to introduce flash into the enterprise market without changing the existing software base, have resulted in the emergence of solid-state drives (SSDs): flash packaged in HDD form factors and capable of working with device drivers and I/O buses designed for HDDs. Unlike the use of DRAM for caching or buffering, however, certain idiosyncrasies of SSDs make their integration into HDD-based systems non-trivial. Flash memory suffers from limits on its reliability, is an order of magnitude more expensive than disk, and can sometimes be even slower than an HDD (due to excessive GC induced by a high intensity of random writes). Given the complementary properties of HDDs and SSDs in terms of cost, performance, and lifetime, the current consensus among several storage experts is to view SSDs not as a replacement for HDDs but rather as a complementary device within the storage hierarchy. We design and evaluate such a hybrid system, called HybridStore, to provide: (a) improved capacity planning techniques to administrators, with the overall goal of operating within cost budgets, and (b) improved performance/lifetime guarantees during episodes of deviation from expected workloads through three novel mechanisms: (i) adaptive wear-leveling, (ii) write-regulation, and (iii) fragmentation busting. We implement and validate a simulator for HybridStore and evaluate its efficacy using well-regarded enterprise-scale storage traces.
      (Papers: TR CSE08-017)

  • Designing and Implementing an Efficient Flash Translation Layer in Flash-based SSDs
    • Unlike hard disks, flash devices have no mechanical moving parts, incur no seek or rotational delays, and consume less power. However, the internal idiosyncrasies of flash technology make its performance highly dependent on workload characteristics. The poor performance of random writes is a major concern that must be addressed to better utilize the potential of flash in enterprise-scale environments. We examine one of the important causes of this poor performance: the design of the Flash Translation Layer (FTL), which performs the virtual-to-physical address translations and hides the erase-before-write characteristics of flash. We propose a complete paradigm shift in the design of the core FTL engine with our Demand-based Flash Translation Layer (DFTL), which selectively caches page-level address mappings. We develop and validate a flash simulation framework called FlashSim. Our experimental evaluation with realistic enterprise-scale workloads supports the utility of DFTL in enterprise-scale storage systems by demonstrating: (i) improved performance, (ii) reduced garbage collection overhead, and (iii) better overload behavior compared to state-of-the-art FTL schemes.
      (Papers: ASPLOS09, TR CSE08-012)
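
The idea of selectively caching page-level address mappings, described above, can be illustrated with a minimal sketch. This is not the DFTL implementation or FlashSim code: the class name DemandMappingCache, the plain LRU eviction policy, and the use of a Python dict as a stand-in for the full mapping table kept on flash are all assumptions made for illustration.

```python
from collections import OrderedDict


class DemandMappingCache:
    """Hypothetical sketch: a small LRU cache of logical-to-physical page mappings."""

    def __init__(self, capacity, backing_table):
        self.capacity = capacity
        self.backing = backing_table   # assumed stand-in for the full table stored on flash
        self.cache = OrderedDict()     # lpn -> (ppn, dirty), oldest entry first
        self.hits = 0
        self.misses = 0
        self.writebacks = 0

    def _evict_if_full(self):
        # Make room for one entry; a dirty victim is written back to the full table.
        if len(self.cache) >= self.capacity:
            lpn, (ppn, dirty) = self.cache.popitem(last=False)
            if dirty:
                self.backing[lpn] = ppn
                self.writebacks += 1

    def translate(self, lpn):
        # Read path: serve from the cache, or load the mapping on demand.
        if lpn in self.cache:
            self.hits += 1
            self.cache.move_to_end(lpn)
            return self.cache[lpn][0]
        self.misses += 1
        ppn = self.backing[lpn]
        self._evict_if_full()
        self.cache[lpn] = (ppn, False)
        return ppn

    def update(self, lpn, new_ppn):
        # Write path: the page moved to a new physical location; mark the mapping dirty.
        if lpn not in self.cache:
            self._evict_if_full()
        self.cache[lpn] = (new_ppn, True)
        self.cache.move_to_end(lpn)


# Illustrative usage with made-up page numbers.
backing = {lpn: 1000 + lpn for lpn in range(8)}
ftl = DemandMappingCache(capacity=2, backing_table=backing)
for lpn in (0, 1, 0):
    ftl.translate(lpn)
ftl.update(2, 5000)   # out-of-place write remaps logical page 2
ftl.translate(3)
ftl.translate(4)      # evicts the dirty mapping for page 2
```

In this sketch only the cached entries occupy fast memory, and a mapping reaches the full table only when a dirty entry is evicted. A workload with good locality therefore pays the extra lookup cost only on misses.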

  • Project Homepage: http://csl.cse.psu.edu/hybridstore