On Hyper-Convergence

Hyper-Convergence is the future; I seriously doubt there are professionals still debating this issue anymore.
Everyone is getting into this game: Microsoft, VMware, Cisco and other companies are all seeing emerging opportunities and trying to release software before someone else does.

For those of you who don't know, hyper-converged is another buzzword for software-defined everything: Storage, Networking and Compute.

On Hyper-Converged Storage

The traditional way of building infrastructure is to have a SAN that acts as central storage for your data. Depending on the SAN, it would most likely have redundancy built in.
[Figure: Traditional Setup]

In a Hyper-Converged Storage environment, instead of having a physical SAN, you would have a pool of servers with local storage. You pool these servers and their storage together and they are presented as one logical storage volume.

[Figure: Hyper-Converged Storage Setup]

So, essentially, it is the software that does all the work. You have no RAID controllers (in fact, they would be discouraged) as the raw storage would be presented directly to the storage solution, which would be aware of the physical enclosure, disk types, server location, etc.
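To make the pooling idea concrete, here is a minimal Python sketch of how raw local disks add up to one logical volume. It is not how any particular product works: the `Node` class, the function names and the simple "store every block N times" protection model are all assumptions made for illustration. Real solutions also account for metadata, reserve capacity and fault domains.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """A server contributing its local disks to the pool (hypothetical model)."""
    name: str
    disks_tb: List[float]  # raw capacity of each local disk, in TB


def raw_capacity_tb(nodes: List[Node]) -> float:
    """Total raw capacity of every disk on every node."""
    return sum(sum(node.disks_tb) for node in nodes)


def usable_capacity_tb(nodes: List[Node], copies: int = 2) -> float:
    """Usable capacity if the software keeps `copies` copies of every block.

    Assumes copies are placed on different nodes, so there must be at
    least as many nodes as copies.
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    if copies > len(nodes):
        raise ValueError("need at least one node per copy")
    return raw_capacity_tb(nodes) / copies


# Three nodes, each with four 4 TB disks presented raw (no RAID controller).
nodes = [Node(f"node{i}", [4.0] * 4) for i in range(1, 4)]
print(raw_capacity_tb(nodes))        # raw pool size in TB
print(usable_capacity_tb(nodes, 2))  # two-way mirror
print(usable_capacity_tb(nodes, 3))  # three-way mirror
```

The point of the sketch is the trade-off: the redundancy that a RAID controller used to provide inside one box is now paid for in raw capacity across the whole pool.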

Microsoft Storage Spaces Direct

Microsoft Storage Spaces Direct is a new feature that will be available in Windows Server 2016. Currently, in Technical Preview 4, it is available via PowerShell. There are two targeted deployment scenarios for Windows Server 2016 Storage Spaces Direct.

The first is the disaggregated deployment, where a Scale-Out File Server cluster is dedicated to the storage role (so it does nothing else).

The second deployment scenario is the hyper-converged deployment scenario. This has the Hyper-V (compute) and Storage Spaces Direct (storage) components on the same cluster. Virtual machine files are stored on local CSVs, and the deployment does not implement a Scale-Out File Server. This allows Hyper-V compute and storage to scale together and removes the requirement of configuring file server access and permissions.

One of the biggest advantages of Microsoft Storage Spaces Direct is the cost. Since this is a feature bundled with Windows, you will have it on your Windows Server 2016 deployment; it's just a matter of turning on the feature.

Unfortunately, Windows Server 2016 has not been released yet and no target date is set, although there is some speculation that it will be released at the end of Q3 2016. Even then, when it's released, would you deploy the first version of this feature in production? I certainly would not, so for me this feature is still far away from production use.

VMware Virtual SAN

VMware Virtual SAN is, perhaps, the most feature-rich solution out there. VMware Virtual SAN can only be used on vSphere setups. It is definitely much more expensive than Storage Spaces Direct, but on the other hand it is something you can use now and it has more features. Currently, VMware Virtual SAN is on version 6.2.

I think if you already have a vSphere setup and are looking to venture into hyper-converged storage, VMware Virtual SAN is the obvious solution for you.

StarWind and DataCore

StarWind and DataCore are two companies that offer off-the-shelf Hyper-Converged software solutions. They mainly target Windows deployments, because there is a clear lack of competition there considering that Storage Spaces Direct is not yet available.

DataCore pricing is very attractive at a very small scale, but once you reach a certain number of terabytes it's as expensive as VMware Virtual SAN, and at that point I would opt for something backed by VMware rather than some random company.

Sizing a Hyper-Converged Storage Setup

Sizing a Hyper-Converged Storage Setup is very different from sizing a SAN. Firstly, you must understand that in this case, to introduce new storage you will most likely have to introduce compute as well.

This is not necessarily a bad thing. If you think about it and plan accordingly, you can target a deployment where, by the time you are saturating the storage, you are also saturating CPU and RAM. This way, the nodes you add contribute CPU and RAM as well as storage.
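A rough way to check whether a planned node specification is balanced is to work out how many nodes each resource would need on its own and compare the results. The Python sketch below does that. The resource names, the example numbers and the single spare node are all assumptions for illustration, and `storage_tb` is meant as usable capacity per node after whatever replication the storage software applies.

```python
import math
from typing import Dict, Tuple


def nodes_needed(demand: Dict[str, float], per_node: Dict[str, float]) -> Dict[str, int]:
    """Number of nodes each resource would require if it were the only constraint."""
    return {res: math.ceil(demand[res] / per_node[res]) for res in demand}


def size_cluster(
    demand: Dict[str, float],
    per_node: Dict[str, float],
    spare_nodes: int = 1,
) -> Tuple[Dict[str, int], str, int]:
    """Return per-resource node counts, the bottleneck resource, and total nodes.

    `spare_nodes` is extra headroom so the cluster survives a node failure.
    """
    per_resource = nodes_needed(demand, per_node)
    bottleneck = max(per_resource, key=per_resource.get)
    return per_resource, bottleneck, per_resource[bottleneck] + spare_nodes


# Hypothetical node spec and workload, for illustration only.
per_node = {"cpu_cores": 32, "ram_gb": 256, "storage_tb": 8}
demand = {"cpu_cores": 100, "ram_gb": 900, "storage_tb": 40}

per_resource, bottleneck, total = size_cluster(demand, per_node)
print(per_resource)  # nodes required per resource
print(bottleneck)    # the resource that drives the node count
print(total)         # node count including the spare
```

If the per-resource counts are far apart, the node specification is unbalanced: you would end up buying CPU and RAM just to get disks, or the other way round. Adjusting the disks, cores or memory per node until the counts are close is the planning exercise described above.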

On Hyper-Converged Networking

Hyper-Converged Networking is a huge and complex topic.

One of the problems that Hyper-Converged Networking tries to solve is the scalability problem associated with large cloud computing deployments. Imagine, for example, that you have Virtual Machines on premises that you want to fail over to the cloud. This brings the challenge of how to have your internal IP addresses routed to the cloud. In short, technologies such as VXLAN and NVGRE are network virtualization protocols that solve these kinds of problems by tunnelling the virtual network over the physical one.
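To give a feel for what VXLAN actually does, here is a small Python sketch of its encapsulation header as defined in RFC 7348: the original Ethernet frame is wrapped in a UDP packet (destination port 4789) behind an 8-byte header carrying a 24-bit VXLAN Network Identifier (VNI). The function names are my own, and the sketch only builds the VXLAN header itself, not the outer Ethernet, IP and UDP headers.

```python
import struct

VXLAN_UDP_PORT = 4789        # IANA-assigned destination port (RFC 7348)
VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag: the VNI field is valid


def build_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a 24-bit network identifier."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) followed by 24 reserved bits.
    # Word 2: VNI (24 bits) followed by 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def parse_vni(header: bytes) -> int:
    """Extract the VNI from a VXLAN header, checking the valid-VNI flag."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag not set")
    return vni_word >> 8


def encapsulate(inner_frame: bytes, vni: int) -> bytes:
    """Prefix an inner Ethernet frame with a VXLAN header (the UDP payload)."""
    return build_vxlan_header(vni) + inner_frame


header = build_vxlan_header(5000)
print(header.hex())       # header bytes in hex
print(parse_vni(header))  # round-trips back to the VNI
```

Because the inner frame travels untouched, a virtual machine keeps its internal MAC and IP addresses wherever it runs, and the 24-bit VNI allows roughly 16 million isolated segments compared with 4094 usable VLAN IDs.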

Another area where Hyper-Converged Networking shines is in the realization that most of the traffic in the data-center is east-west traffic (where applications are most chatty between themselves and not with the rest of the internet). VMware NSX is a very interesting suite of tools that is suitable for these kinds of problems. As usual, be prepared to dish out $$$. In Windows Server 2016, Microsoft is also introducing some new tools to address these problems (such as implementing the VXLAN protocol in Hyper-V).

Against Hyper-Converging

One of the biggest disadvantages of Hyper-Convergence is one that most people fail to see. In a hyper-converged setup it looks like you have no single point of failure: you have multiple nodes contributing storage, so if one node fails everything can still run.

However, in this case, the software is a single point of failure. For example, in hyper-converged storage, Storage Spaces Direct, VMware Virtual SAN, DataCore and StarWind software are all prone to bugs. This software is distributed across all your nodes and is the only thing that knows how to communicate with the disks. If this software decides to fail on all nodes, it will bring your whole infrastructure down.

In Conclusion

As we are seeing, the modern data-center is transitioning to hyper-converged setups. For sure, serious Clouds (Azure, AWS, etc.) need to run hyper-converged in order to keep up with the sustained increase in load.

Hyper-Converged architecture is not without its downsides. However, we must embrace this change, make sure our infrastructure and even our applications work in this kind of environment, and start transitioning more of our workloads to hyper-converged.