MeshProtect™

By Yaniv Romem | December 12, 2018 | March 12th, 2020

Introduction

Data redundancy is a core function of every storage solution, and NVMesh is no different. However, because NVMesh is designed for web-scale applications, its data protection technology, MeshProtect, differs in major ways from other approaches available on the market.

In this blog, I’ll cover these differences and their benefits to customers.

What is MeshProtect?

MeshProtect is NVMesh’s flexible data protection technology, providing multiple redundancy schemes that can be tuned for specific use cases. In NVMesh, drives are treated as resources that are aggregated into a large logical storage pool. Logical volumes are then carved out of the storage pool and presented to clients as block devices. Volumes may span multiple physical drives and target hosts, but do not need to use entire drives, so a single drive can be allocated to multiple volumes.
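To make the pooling model concrete, here is a minimal sketch of carving volumes out of a shared pool. The `Drive` and `Pool` classes, the first-fit allocation policy, and the drive and host names are all hypothetical and only illustrate the idea; they are not NVMesh's actual allocator or API.

```python
from dataclasses import dataclass, field


@dataclass
class Drive:
    """A physical drive treated as a pool resource (hypothetical model)."""
    name: str
    host: str
    capacity_gb: int
    allocated_gb: int = 0

    @property
    def free_gb(self) -> int:
        return self.capacity_gb - self.allocated_gb


@dataclass
class Pool:
    """All drives aggregated into one logical pool (hypothetical model)."""
    drives: list
    # volume name -> list of (drive name, segment size in GB)
    volumes: dict = field(default_factory=dict)

    def carve(self, volume: str, size_gb: int) -> list:
        """Allocate size_gb for a volume, spilling across drives as needed."""
        if volume in self.volumes:
            raise ValueError(f"volume {volume!r} already exists")
        if size_gb > sum(d.free_gb for d in self.drives):
            raise ValueError("not enough free capacity in the pool")
        segments = []
        remaining = size_gb
        for drive in self.drives:  # simple first-fit, an assumption
            if remaining == 0:
                break
            take = min(drive.free_gb, remaining)
            if take > 0:
                drive.allocated_gb += take
                segments.append((drive.name, take))
                remaining -= take
        self.volumes[volume] = segments
        return segments


pool = Pool([Drive("nvme0", "host-a", 1000), Drive("nvme1", "host-b", 1000)])
vol_a = pool.carve("vol-a", 600)  # fits on part of one drive
vol_b = pool.carve("vol-b", 700)  # spans two drives on two hosts
print(vol_a)
print(vol_b)
```

In this example `nvme0` ends up serving both volumes, and `vol-b` spans two drives on two different hosts, which mirrors the two properties described above.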

Volumes can be configured in any of the following redundancy levels:

  • Concatenated – Data is laid out on a single drive or multiple drives with no data redundancy. This volume type can be used for applications that require temporary storage, or for applications whose data is replicated to other locations at the application layer (rack-level failure domain).
  • Striped – Data is laid out across a set of drives and hosts with no data redundancy. This volume type can be used for applications that require high-performance temporary storage, or for applications whose data is replicated to other locations at the application layer (rack-level failure domain).
  • Mirrored – Data is protected by mirroring it across drive segments. To increase data availability, the drive segments are allocated from drives on different target hosts. These hosts should be connected to different power supplies, preferably in separate upgrade zones, availability zones, and any other zoning used to protect against risk. The software’s management layer provides the agility to ensure such separation. Multi-way, active-active multi-path networking is used for both availability and performance.
  • Striped and Mirrored – Data is protected by mirroring it across drive segments and striping across these mirrors. Data is serviced from many drives and hosts, achieving high performance without sacrificing redundancy.
  • Parity-based – New in NVMesh v2.0, data is stored on a set of drive segments using an erasure-coding-like algorithm with an N+2 redundancy level. If a segment fails, because of either a device failure or a host failure, data is reconstructed on the fly from the remaining segments located on the other drives. This provides higher redundancy than both Mirrored and Striped and Mirrored volumes while still providing high performance by spreading the workload across hosts and drives.
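The idea of rebuilding a lost segment from the surviving ones can be illustrated with a deliberately simplified sketch. The code below uses a single XOR parity segment, which tolerates only one lost segment; it is not the N+2 algorithm MeshProtect uses (that needs a second, independent parity calculation), and the function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_segments):
    """Return the data segments followed by one XOR parity segment."""
    return list(data_segments) + [xor_blocks(data_segments)]


def reconstruct(stripe, lost_index):
    """Rebuild the segment at lost_index from all surviving segments."""
    survivors = [seg for i, seg in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


data = [b"AAAA", b"BBBB", b"CCCC"]   # three data segments on three drives
stripe = make_stripe(data)           # a fourth drive holds the parity segment
rebuilt = reconstruct(stripe, lost_index=1)  # pretend the second drive failed
print(rebuilt == data[1])
```

Because XOR is its own inverse, XOR-ing every surviving segment (data and parity) yields exactly the missing one, which is why the read can be served without waiting for the failed drive to be replaced.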

How is MeshProtect different from other RAIDs?

MeshProtect differs from other data redundancy technologies in several ways.

  • MeshProtect is Software-only – MeshProtect is provided as software that runs on standard servers, networking gear, and storage media. It does not require customers to purchase hardware from specific vendors or brands. As such, it breaks the tradition of storage vendor lock-in that has been the de facto standard in the storage industry for decades. Buy whichever drive model you prefer, and mix and match brands and sizes – we can consult, but it’s your call.

  • MeshProtect provides the full range of redundancy options – Next-generation, distributed applications are deployed at scale. This calls for non-traditional redundancy designs. More often than not, redundancy is provided at the rack level, with the application itself replicating its data across racks, and in some cases even across data centers. This type of design expects the lowest possible latency from the storage layer, which cannot be achieved once redundancy of any kind is introduced there. For these applications, MeshProtect provides concatenated and striped volume types. It can also run as a cluster-in-a-box converged implementation, or disaggregated, with a rack full of servers accessing a single top-of-rack storage server running NVMesh.
    Many applications still require redundancy provided by the storage layer. For these, MeshProtect provides mirroring, striping and mirroring (N+1), and the newly introduced distributed parity-based protection (N+M). Note that the initial release supports only 8+2; additional schemes will be supported in future software updates. Together, these configurations give customers a wide selection of redundancy and latency options.
  • MeshProtect is Distributed – All data redundancy logic in MeshProtect runs on the client side. In mirrored and striped-and-mirrored volumes, the spreading of IOs across a set of drives is initiated entirely by the clients in the cluster. In parity-based volumes, parity is calculated on the clients in the cluster as well. With its distributed data layout, MeshProtect can be configured to spread data across failure zones for high availability.
  • MeshProtect is Scale-Out – Being distributed means true scalability. Adding drives and clients will increase the overall performance of an NVMesh environment. Since volumes are not bound to certain pre-defined hardware configurations, the administrator can decide to scale them in any desired direction.
  • MeshProtect supports both Disaggregated and Converged setups – NVMesh clusters can be configured in top-of-rack configurations, where there is a distinct separation between the applications and the storage servers, or as fully converged systems, with applications running directly on the servers where the storage media is installed. Clusters can also be configured as a mix of the two approaches.
    To keep converged deployments efficient, MeshProtect avoids target-side CPU usage in the data path. This allows customers to better plan resource allocation on the servers where applications are running.
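The redundancy options listed above trade raw capacity for fault tolerance. The arithmetic can be sketched as follows, assuming that mirroring keeps exactly one extra copy of each segment (1+1) and that parity-based volumes use the 8+2 layout mentioned above; the function name and the scheme labels are illustrative only.

```python
def usable_fraction(data_segments: int, redundant_segments: int) -> float:
    """Fraction of raw capacity that holds user data."""
    return data_segments / (data_segments + redundant_segments)


# scheme -> (data segments, redundant segments); the redundant count is
# also the number of segment failures the layout can tolerate
schemes = {
    "concatenated or striped": (1, 0),
    "mirrored (1+1, assumed)": (1, 1),
    "parity-based (8+2)": (8, 2),
}

for name, (data, redundant) in schemes.items():
    fraction = usable_fraction(data, redundant)
    print(f"{name}: {fraction:.0%} usable, tolerates {redundant} failure(s)")
```

Under these assumptions, an 8+2 volume keeps 80% of raw capacity usable while tolerating two segment failures, whereas a two-way mirror keeps 50% usable and tolerates one.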

How does it all compare against other approaches? 

Customers looking to accelerate their web-scale infrastructure with NVMe drives originally had very little choice and deployed the drives as direct-attached storage. Each server had its own NVMe drives, typically with no redundancy whatsoever and with low capacity utilization, because the drives had to be purchased up front and oversized to cover expected growth over the server’s entire refresh cycle.

Since then, some vendors have started to offer top-of-rack NVMe-based appliances that provide centralized management of NVMe capacity. However, this approach has many drawbacks. It forces very specific hardware configurations, it does not offer converged or mixed disaggregated/converged configuration options, and it is limited by specific hardware constraints. Lastly, it is essentially a complete rip-and-replace of existing local NVMe drives, so it does not suit existing infrastructure and is an option only for new projects.

MeshProtect builds on local NVMe drive configurations. It aggregates all these separate drives so that they are managed as a pool, with virtual block devices carved out as needed and a flexible choice of redundancy levels. This provides the best of both worlds: the benefits of centralized management with the performance acceleration of separate NVMe drives, all in a flexible software-based approach.

Performance-wise, NVMesh with MeshProtect parity-based volumes delivers more than 3M IOPS from just 10 NVMe drives, which can easily be scaled up to multiples of these numbers by adding drives and servers.

Summary

MeshProtect is implemented on top of local NVMe drives and opens up their true performance potential and capacity utilization. It offers a wide selection of redundancy options as required by modern web-scale designs. Redundancy varies from no redundancy up to efficient, distributed parity-based protection. MeshProtect provides redundancy running as software on the client side and therefore scales up as clients are added.

With MeshProtect, customers can deploy a storage solution for next-generation applications that provides very high throughput and IOPS, ultra-low latency, and, at the same time, very high storage efficiency.

Contact us for a live demonstration of MeshProtect and to see how it can help meet your existing and future application requirements.

Yaniv Romem
