
CTO Blog – Why VSAN? summary of VMware PEX 2015

VMware PEX 2015

VMware’s annual Partner Exchange Conference (PEX) took place in San Francisco last week. I was invited to co-present ‘Storage as a Service using VMware VSAN’ with Sanjeev Desai from VMware Product Marketing, where the new features of VSAN 6.0 were announced for release later in 2015.

This event is a great opportunity to learn about new launches, partnerships and product updates. VMware partners are brought together from across the globe. Key highlights for me were:

– The release of vSphere 6.0, which provides exciting new features such as long-distance vMotion and Fault Tolerance support for virtual machines with 4 vCPUs

– VSAN 6.0, which doubles the maximum cluster size from 32 to 64 hosts, delivers 4x more performance, and introduces fault domains across racks and the all-flash VSAN. Also new is the support for blade infrastructures!

– Meeting industry peers and interacting with VMware senior executives.

The next section of this blog is a summary of the presentation I gave at PEX2015. The slides will be made available via the VMware Partner portal for PEX2015 attendees; alternatively, contact PeaSoup at info@peasoup.net for the PeaSoup presentation.

—-

Adopting Change

In January 2007, Apple announced the iPhone as the first smartphone of its kind, with no keyboard and a 100% touch screen. I was gobsmacked, as this phone not only looked very smart but also provided new functionality that was so much better than the “smart” phones before it. At that time I was using a Compaq iPAQ, a huge phone with some sort of Windows NT 3.51 lookalike OS on it, and its usability was rather clunky…

Mr Jobs said, “We are all born with the ultimate pointing device – our fingers – and iPhone uses them to create the most revolutionary user interface since the mouse.” Apple had dared to move into a market that was not in their line of business.

Later that year Steve Ballmer commented on the introduction of the Apple iPhone, saying how stupid it was and why people would never use it, as people needed email access… How wrong was Ballmer… and he was one of many (Blackberry, Psion…) who thought along these lines.

Today we have 1.75 billion smartphones, leaving Microsoft trailing behind in the mobile market. So what’s Apple’s story?

Apple does not just design, develop and sell consumer devices; they sell a lifestyle, one that enriches our everyday lives, such as music on demand. They created a new market which is totally commoditized and where everyone is using a smart device. Apple were open to change: they listened to what consumers wanted and created a whole new market. This market is the key driver for datacentre business today. How many apps are out there? How many are using a cloud platform?

Microsoft as an innovator was not open to change… and in this case they missed the boat… which was actually not the first time… Think virtualisation…

The smart device market and Virtual SAN

Storage is on the verge of the same change seen within the mobile device market. VMware is at the forefront of this change by introducing Virtual SAN. Virtual SAN is built from the ground up for virtual servers, providing a true software defined storage solution with minimal overhead and simplified management. VSAN enables read/write caching using server-side flash. VSAN is built into the VMware hypervisor, optimizing the I/O data path to deliver better performance than a virtual appliance or an external device.

“Physical” does not apply to virtual: this is something we have been saying in the industry for the last few years, and now we can finally apply it to the storage market. We don’t have to architect our application infrastructure around the storage provider anymore. Consumerization is a key business driver, increasing the demands for storage and impacting the IT solutions provided. Monitoring the market and the influx of demands is crucial in understanding the requirements for increased availability and performance and as little downtime as possible, whilst allowing for granular growth with the virtual platform.

To grow businesses in line with the increase in consumer rates, flexible solutions need to be implemented which provide reliability, performance and growth rather than large incremental upgrades.

For example, the applications inside the virtual machines will determine the policy set for storage.

 


PeaSoup CTO Harold Buter speaking at PEX2015

 

 

Cost Challenges

When we started PeaSoup, my partner and I sat down and discussed PeaSoup storage requirements and associated costs. We both have over 20 years’ experience in the industry and we have been selling storage for 15 years. So we knew the biggest challenge we faced would be cash flow, and we wanted to provide a cloud solution that did not have a large initial cost from the outset. First of all, we knew that if we went down the traditional storage route we would have a large-capacity array, but one with low performance capability. Secondly, our growth model showed the need for a storage solution that could grow linearly with the environment. We didn’t want to create a solution that could not adapt to the growing demands of the environment and would require storage controller upgrades, increasing the risk of further downtime.

Virtual SAN was one of the reasons we wanted to start as a cloud services provider,  as we could see the potential for our customers. We did our calculations and looked at what CAPEX and OPEX impacts VSAN would have compared to traditional arrays.

On the CAPEX side, rather than buy large storage arrays with smart software to enhance performance and a dedicated fibre network, we chose VSAN. Yes, VSAN. A VSAN host costs more than a non-VSAN host, which does not need SSDs, multiple disks or a RAID controller; however, the cost per GB is much lower than with traditional arrays.

Another advantage of using VSAN is that we could start relatively small, i.e. not buy a large storage array upfront which would later require array upgrades with expansion units. We did not want to create more risks of downtime with cost implications. As most of you know, once you buy the initial storage, your expansions will never be at the same discount levels… Using Virtual SAN provides linear growth, so we add more datastore space in line with the CPU and memory growth. Adding a VSAN node takes a matter of minutes (especially when using auto deploy) and does not carry any risks. Secondly, using this model makes it easy for us to predict when we will need new hosts, and an operations manager forecasts growth and raises alarms.
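To make the linear growth idea a bit more concrete, here is a minimal sketch of the kind of forecast described above: how many months remain before the cluster needs another host. The function name, the 80% fill threshold and the figures are all assumptions for illustration, not PeaSoup’s actual tooling or numbers.

```python
import math
from typing import Optional


def months_until_expansion(
    usable_gb: float,
    used_gb: float,
    growth_gb_per_month: float,
    threshold: float = 0.8,  # assumed fill level at which a new host is ordered
) -> Optional[int]:
    """Estimate the months left before usage crosses the expansion threshold.

    Returns None when usage is not growing, and 0 when the threshold
    has already been reached.
    """
    if growth_gb_per_month <= 0:
        return None
    headroom_gb = usable_gb * threshold - used_gb
    if headroom_gb <= 0:
        return 0
    return math.ceil(headroom_gb / growth_gb_per_month)


if __name__ == "__main__":
    # Hypothetical cluster: 10 TB usable, 5 TB used, growing by 500 GB a month.
    print(months_until_expansion(10_000, 5_000, 500))
```

Because every added host brings capacity along with CPU and memory, the same calculation can simply be rerun with the new usable figure after each expansion.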

Looking at the OPEX for PeaSoup, we decided we needed virtualisation staff rather than both virtualisation and storage engineers. Virtualisation engineers require less storage knowledge and can focus on their virtualisation skills instead.

PeaSoup VSAN is available for service providers under the vCloud Air Network Program (0.08 points per GB per month for the allocated capacity for the hybrid model).

 

Operational challenges

When you compare our operational challenges with those of traditional storage arrays, you will find the latter more complex, requiring dedicated specialists to maintain both arrays and network. This is because traditional storage solutions are aligned to storage containers, i.e. LUNs, volumes etc. VSAN’s application-centric policy model is much simpler and helps us to match the policy to the application.

We wanted a solution that provided VM-centric snapshots rather than one that snapshots a complete LUN. As you can imagine, in a fully automated environment you don’t want to create a LUN per VM or vApp, which creates more administration work and increases complexity.

Another challenge is expansion. Previous experience has taught me to check the firmware levels of new units when expanding. If the units are newer, the production environment requires an upgrade, which could have its risks and needs advance planning. But in a fast-paced cloud environment there is no time to plan weeks beforehand; you need to expand there and then without any risks.

Virtual SAN is great as it provides me with a single datastore that is managed via the vCenter, it is that simple.  All the management is completed at vCenter level which means that snapshots and policies are all done from within the same management layer and fully integrates with vCloud Director and vRealize Orchestrator. It means that PeaSoup can simplify the storage solution and spend time and resource on enhancing other services, hence we say PeaSoup | cloud simplified.

To date we have been able to define VM-specific policies, for example on reliability and performance, and these will improve further in the future (IO control is not part of VSAN yet). Today we can set the fault tolerance, the number of spindles used and the reserved percentage of flash.
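As a rough illustration of matching the policy to the application, the three settings mentioned above can be modelled as a small data structure with a lookup per application type. This is only a sketch: the class, the field names and the example values are hypothetical and are not the VMware API or PeaSoup’s actual policies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoragePolicy:
    """Hypothetical per-VM storage policy with the three settings from the text."""

    failures_to_tolerate: int  # the fault tolerance setting
    spindles: int              # number of disks a virtual disk is striped across
    flash_reserve_pct: float   # percentage of flash reserved for the VM

    def __post_init__(self) -> None:
        if self.failures_to_tolerate < 0:
            raise ValueError("failures_to_tolerate must be 0 or more")
        if self.spindles < 1:
            raise ValueError("spindles must be at least 1")
        if not 0 <= self.flash_reserve_pct <= 100:
            raise ValueError("flash_reserve_pct must be between 0 and 100")


# Example values only: the application decides which policy is applied.
DEFAULT_POLICY = StoragePolicy(failures_to_tolerate=1, spindles=1, flash_reserve_pct=0.0)
POLICIES = {
    "database": StoragePolicy(failures_to_tolerate=2, spindles=2, flash_reserve_pct=10.0),
    "web": DEFAULT_POLICY,
}


def policy_for(app_type: str) -> StoragePolicy:
    """Return the policy for an application type, falling back to the default."""
    return POLICIES.get(app_type, DEFAULT_POLICY)


if __name__ == "__main__":
    print(policy_for("database"))
    print(policy_for("file-server"))
```

Keeping the policy next to the application type, rather than next to a LUN, is the point of the application-centric approach.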

 

Reliability challenges

Most cloud outages can be placed in two categories, either network related or storage related. In general, network-related issues can be solved pretty quickly if the right people and resources are in place. However, storage-related issues can have more impact, especially when it’s a fatal error requiring a full rebuild, which is very time consuming. For PeaSoup we wanted a solution that allowed fast resolution of any issues occurring on the storage layer. In our VSAN design we added an extra host to fail over to, so that customers won’t notice the impact of a host failure.
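The extra-host idea boils down to a simple check: with the largest host gone, do the remaining hosts still have room for everything that is stored? A minimal sketch, with a hypothetical function name and made-up capacities, and ignoring CPU and memory for brevity:

```python
from typing import Sequence


def survives_host_failure(host_capacities_gb: Sequence[float], used_gb: float) -> bool:
    """Return True if the used capacity still fits after losing the largest host."""
    if not host_capacities_gb:
        raise ValueError("at least one host is required")
    remaining_gb = sum(host_capacities_gb) - max(host_capacities_gb)
    return used_gb <= remaining_gb


if __name__ == "__main__":
    # Hypothetical cluster of four hosts contributing 4 TB each.
    hosts = [4_000, 4_000, 4_000, 4_000]
    print(survives_host_failure(hosts, 11_000))
    print(survives_host_failure(hosts, 13_000))
```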

Additionally, with traditional arrays the solution complexity increases the probability of human error over time. Don’t get me wrong, a well-designed storage solution should be capable of dealing with errors, but unqualified staff could be the biggest risk to the stability of the environment. For our environment we wanted a solution that minimizes human error as much as possible.

The beauty of Virtual SAN is the scalability and the possibility of a fully automated solution, i.e. using auto deploy to add new hosts into the cluster, which removes the requirement to manually upgrade firmware of controllers, expansion units etc.

Virtual SAN is designed to fail! This is an interesting concept: it means processes are in place for when a failure occurs, providing an automated response, for instance starting a VM on a completely separate host when an issue arises on a host. Obviously, you still need to make sure you have the capacity! The policies can be set per VM, and you can even choose a higher failures-to-tolerate (FT) value to create more replicas of the virtual disks. This will cost additional storage space, which you have to calculate and charge customers for! An FT of 1 will have two replicas and an FT of 2 will have three replicas, so a 50 GB VMDK uses 100 GB with FT1 and 150 GB with FT2.
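The replica arithmetic above is easy to put into a small helper for capacity planning and charging. This is a sketch under the assumption that every replica is a full copy of the virtual disk; it ignores any smaller overheads, and the function name is made up for illustration.

```python
def raw_capacity_gb(vmdk_gb: float, failures_to_tolerate: int) -> float:
    """Raw storage consumed by a virtual disk for a given FT setting.

    An FT of n keeps n + 1 full replicas of the disk.
    """
    if failures_to_tolerate < 0:
        raise ValueError("failures_to_tolerate must be 0 or more")
    replicas = failures_to_tolerate + 1
    return vmdk_gb * replicas


if __name__ == "__main__":
    for ft in (1, 2):
        print(f"FT{ft}: a 50 GB VMDK uses {raw_capacity_gb(50, ft)} GB")
```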

Since VSAN does not use RAID it means that we don’t have the rebuild time. FT ensures that virtual machines don’t have to be rebuilt, ensuring down time is limited to an absolute minimum.

 

Scalability challenges

As mentioned previously, scalability is one of the most important features for us. As a cloud provider we need to upgrade when needed: if we get a new customer with a large resource demand, we need to add hosts as quickly as possible for increased CPU, memory, storage capacity and performance. Adding new customers should never cause a decrease in storage performance. So we needed a solution with linear scalability without having to cope with large incremental costs.

Note that with traditional solutions you can outgrow your array. You might choose a solution that looked like it fitted the bill when you started, but after a year your controllers can no longer cope with the traffic generated. You can then choose to buy a new, larger model, or perform an in-place controller upgrade, which could cause downtime (frequently due to human error!). Or, when upgrading, you might need to opt for a different vendor or model with different management tools, which staff need to learn and adapt to.

So to conclude VSAN takes away hosting complexity. It delivers a solution that will grow with the environment – not only increase the processors and memory, but also the capacity and performance of the storage layer. At PeaSoup we can even add additional performance by using larger SSD disks. And with auto deploy you minimize the risk of human error by fully automating the installation of a new host to a cluster.

 

About PeaSoup

PeaSoup started in 2014 and was founded on the basis of VSAN, which was released last year. We spent months laboriously going through the process of due diligence, testing, checking and double-checking our solution, and we are doing well. As CTO I’ve used many resources to make sure we had the right hardware. In the summer of 2014 we built our cloud platform and onboarded a media ISV customer that captures high volumes of raw video data, converts it to mobile video data and exports it across various distribution channels. This is a very IO-intensive workload, and the customer is happy with the performance and services provided.

At the end of the year we concentrated on simplifying our platform further and are currently testing a new portal solution – so watch this space.

In 2015 we shall go live with the new portal, which will provide a single-pane view for our customers and partners. The single pane should not only provide vCloud functionality but also backup, usage, billing, disaster recovery and support desk functionality.

We will also improve our service with ‘extreme automation’ – we want to automate as much as possible in order to minimise human error and increase service speed to both customers and partners.

The PeaSoup cloud will constantly adapt to fit new solutions, and we aim to add a new cluster based on vSphere 6.0 in the future to utilise new features such as fault domains on racks, which will further increase our reliability. We will also add a new all-flash tier to our services. The main feature would be the new file system, which will improve snapshots and cloning. This will have a great positive impact on the platform, especially as backup software uses snapshots.

For further resources on your VSAN journey, check out these blogs: www.yellow-brick.com (Duncan Epping), www.cormachogan.com and www.virtuallyghetto.com (William Lam). These guys provide a lot of very good content! I also absolutely recommend purchasing Essential Virtual SAN (2014) by Hogan C. and Epping D., VMware Press.

 

 

CTO Blog – Architecture – Backup

Veeam backup

In the last blog I went into the main architecture of our cloud using VMware vCloud platform and vSAN. In this article I’ll describe how we protect our infrastructure.

Today I will write about backup of the environment. DR and backup to our cloud environment will be discussed in later articles. PeaSoup is using Veeam Backup and Replication to back up the customer and the management environments.

Why Veeam?
We had an architecture dilemma when we started with PeaSoup. Originally we planned to use VMware VDP appliances, as this was the only backup solution on the market that supported vSAN. However, we didn’t want any management servers on our customer cluster, and we found out during testing that the VDP appliance needs to be installed within the resource cluster. We did look at the big brother of VDP, EMC Avamar, but we did not find it to be the right solution.

In June Veeam released patch 4 for their Backup and Replication v7 product, which included support for vSAN. I’ve been working with Veeam for a long time and I really like the simplicity of their solution. Secondly, Veeam fully supports vCloud, meaning we can select organization VDCs to be backed up. We tested Veeam in our cloud using vCloud and vSAN and it performs very well; both backups and restores were very easy and quick.

The next step was to create a design for the backup environment. As we are using virtual Veeam servers, we decided the best way forward was to create a separate cluster for the backup environment. This means we can utilise different, lower-cost hardware, as we do not want to use vSAN as a backup target. The backup cluster uses traditional storage solutions and we are using it purely for backup and disaster recovery targets.
backup logical overview

The figure above depicts a logical overview of the architecture with the three clusters we have today.

The backup cluster is also being used for our backup to cloud service using Asigra (which we will describe another time).

Later this year we will upgrade Veeam to version 8, as this will provide a repository-as-a-service solution. This is a great option, as we can provide repository services to customers who have Veeam on premises and want to back up to the cloud or have a copy of their backup jobs stored outside their premises.

CTO Blog – Architecture of PeaSoup Hosting

Pea Soup Cloud partners

At PeaSoup, our philosophy is to be transparent and open to our customers. So the first transparency blog will be about our architecture.

As CTO I’ve been involved in many technologies in the past, some good, some excellent and some OK-ish. This experience helped us to design our hosting platform. Our main Data Center is in London Docklands at the Telecity group. We have 4 independent ISP links going into our racks so that there is no single point of failure when a line drops or underperforms.