(Upgrade) Times are changing…

Stop me if you’ve heard this one before. It’s Monday morning, you open up your newsreader and there are 25 different articles about an exploit that is sweeping the net. It affects nearly 90% of systems out there. You know it’s only a matter of time until this news goes from the tech sites to the Wall Street Journal and The New York Times. Once that happens, the alert level hits red. Now all of your C-Level execs are aware of the problem and someone is going to be calling you asking for a status update. If you’re Peter Gibbons, you may even have 8 different people calling you. Where do you go from here? In the old days, this would mean that any plans you had for that weekend were scrapped. You’d now have to coordinate outages with your application teams and IT staff, and sometimes you’d even have to get your building’s security team involved. You’d also have to break the news to your wife, husband, girlfriend, boyfriend, kids, or whoever that you may not see them again until Tuesday (assuming that all goes well). Then you get to go through this scenario:
  • Planning and executing the downing of all of your affected DEV/TEST systems.
  • Preemptively opening cases with your vendors in case you run into an issue (you would hate to get stuck in the queue without a case number while your systems are down).
  • Downloading and applying the patches to fix the vulnerability.
  • Bringing all of said systems back up and running.
  • Contacting all of your application owners once the systems are back up and having them test all of the applications.
  • Squeezing in a phone call to your loved ones asking about how life is on the outside.
  • Notifying all of your users that the systems are back up and running and that now regular weekend work can commence.
  • Once all of this is done, and you’ve verified that everything is OK and there are no issues, you can now plan to do the same thing to your Production systems. YAY! That usually means another weekend down the toilet.
Many times, some of the pain involved with this type of maintenance can be lessened through mechanisms like vMotion, Exchange DAGs, and clustered systems in general. Typically, you patch each of the secondary nodes in the cluster, then you patch the primary node and you’re good to go. This process of upgrading different cluster nodes can take hours depending on the size of your environment and requires total concentration and focus. If you run into an issue during a failover, you’ll be happy you opened that support case.
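To make the secondaries-first, primary-last ordering concrete, here’s a minimal sketch in Python. It assumes a generic cluster described by a list of node names; `rolling_patch_order`, `rolling_patch`, and the `patch` and `is_healthy` callbacks are hypothetical names I made up for illustration, not any vendor’s API. A real rollout would also need a failover step before the primary is touched.

```python
from typing import Callable, List


def rolling_patch_order(nodes: List[str], primary: str) -> List[str]:
    """Return the patch order: every secondary node first, the primary last."""
    if primary not in nodes:
        raise ValueError(f"{primary!r} is not a member of the cluster")
    secondaries = [node for node in nodes if node != primary]
    return secondaries + [primary]


def rolling_patch(
    nodes: List[str],
    primary: str,
    patch: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Patch one node at a time and stop at the first failed health check."""
    patched: List[str] = []
    for node in rolling_patch_order(nodes, primary):
        patch(node)  # hypothetical callback: apply the fix to this node
        if not is_healthy(node):  # hypothetical callback: verify before moving on
            raise RuntimeError(f"{node} failed its health check; halting the rollout")
        patched.append(node)
    return patched


if __name__ == "__main__":
    order = rolling_patch(
        ["node-a", "node-b", "node-c"],
        primary="node-a",
        patch=lambda node: print(f"patching {node}"),
        is_healthy=lambda node: True,
    )
    print(order)
```

Halting on the first unhealthy node is the point of the exercise: the rest of the cluster stays on the old code and keeps serving while you work that support case.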
Why do I bring all of this up? Traditionally, the one system that usually has the biggest issues during this kind of upgrade/update scenario is your storage environment, especially if you are on legacy storage for one reason or another. In most cases that I have seen, storage code upgrades are completely ignored unless absolutely necessary. I can see why people make that argument. If your storage goes down, especially in a small to medium-sized shop, EVERYTHING goes down. This scares the pants off of a lot of people, with good reason. They would rather take the “If it ain’t broke, don’t fix it.” approach of yesteryear. Nobody wants to run into those kinds of problems and lose their weekends because of storage issues. This kind of thinking leads to rolling the dice and hoping that the storage environment will just keep on chugging along and that no one will exploit the vulnerabilities that are out there. I think this model is changing in storage though, along the same lines that the break/fix mentality was replaced with a proactive approach. IT departments are getting more sophisticated and are looking to get everything patched and protected BEFORE someone tries to exploit the vulnerabilities.
What if you, the IT engineer, could avoid those sleeping-in-the-office kinds of issues and get your weekends back? Who would say no to that? As I’ve written about in the past, I’ve been a customer of Pure Storage for about two and a half years now. I started out on an FA-320 array, I’m currently using their FA-400 series, and I’m getting ready to start playing with the FlashArray//m as soon as it arrives. One of the things that sold me on Pure Storage was the Non-Disruptive Upgrade (NDU) capabilities for both the software and the hardware of the array (you can see a demo of their NDU here). I’ve gone through almost every iteration imaginable. I’ve done code upgrades (both minor and major revisions), I’ve added additional shelves of disk, and I’ve gone from 300 series to 400 series controllers. You name it and I’ve probably done it. The one similarity in every upgrade was that it happened like they said it would happen. No downtime, no performance degradation, no idea that it was happening from a user perspective. They were all quick, seamless, and pain free. They also happened during the week (we played it safe and did them on Friday evenings for our Production units) but on Saturday morning I was home playing with my little boy, which is what I care about most.
As I said earlier, this approach appears to be the new status quo. Many other vendors besides Pure Storage are trying to follow suit. EMC has stated that they now support NDUs (although I’m not sure that is the case for different hardware versions). Other vendors such as SolidFire and Nimble also support NDUs. This is a direction that I think everyone in IT welcomes. Being able to provide services quickly to the end user without disturbing their workflow is the goal of nearly every IT staff. This new model greatly increases the success rate of achieving that goal. Pure Storage has gone one step further and changed the typical storage lifecycle model around this principle when they launched Evergreen Storage. The belief is that forklift upgrades will go the way of the dodo bird and you can just replace individual components when needed. Your maintenance never increases (unless you add capacity). Your storage system can stay the same for as long as you need it to, saving you tons of money in the long run while also providing you with a solid foundation to house your infrastructure on.
If other systems start following suit and rethink how we look at system lifecycles, the end result can be great for IT Admins. What if it was as easy to upgrade the code on your core switches and routers as it is to upgrade an app on your iPhone? What if said code could be upgraded FROM your iPhone while you’re sipping margaritas on a beach somewhere (just don’t drink too many until the upgrade is done)? What if upgrading your email servers wasn’t a 6 month project? Whether it’s PC refreshes, server upgrades, or application upgrades, a pain-free process is something everyone would welcome and what we currently strive for as IT pros. We already work to make end users’ lives easier; I think it’s time that we make our own lives easier as well. Don’t we as IT admins deserve the same level of happiness and time away from the office as our users do? I sure think so. I think you all would agree with me. It’s nice to see that vendors like Pure Storage share that same vision and are doing something to achieve it.

How do I know which storage array is right for me?

After a ton of positive feedback on my last post (thank you all for that), people wanted to know more. Specifically, how did I come to the decision on what product was right for my environment? Hopefully, this post will help guide you in the right direction and maybe point something out that you didn’t think of previously. I’m going to do my best to generalize this so you can compare and contrast vendors on your own. Every environment is different so you’ll have to cater these guidelines to your situation. No one is going to know what you need better than YOU! This actually leads me into my first point.

Identify Your Needs
This step is the most important in my opinion but it is often the most overlooked. Why are you looking at new storage in the first place? Are you experiencing a performance problem that you (and/or your current vendor) cannot resolve? Are there limitations with your current setup that are preventing you from providing the necessary services that your customers require? Is there a new project or initiative at your firm that is presenting you with a new set of requirements altogether? An example of this is when your clients request storage replication for DR/BCP purposes where there was no need prior. Or is it a situation where your array was installed while Saved By The Bell was still on the air and it’s just time for you to find out what the latest and greatest product is and how fast you can get it installed in your environment? Also, do you need Fibre Channel, iSCSI, direct attached or something else entirely? Once you have a clear and concise understanding of what you are looking for and why, the rest of the search is much simpler.

Cost
Unless your name is Tony Stark, Bruce Wayne or Richie Rich, cost is a major factor in any IT purchase. You’re going to have a budget that you need to stick to and you also need to get the most bang for the buck. This is a step that can get very tricky if you don’t have a clear picture of your environment. Obviously, the cost of the array itself and the associated support & maintenance are huge factors in what your overall spend will be. There are other things to consider as well.

  • What does your environment look like now?
  • Are you in a Co-Lo facility?
  • What is your current monthly OPEX spend from a power, cooling and rackspace perspective?
  • What are your power requirements? Does your current array require dedicated circuits to run? What is the additional cost of those circuits?
  • How many rack units and/or full racks does your current setup use? How many do you have available?
  • What is the total $/GB (or TB)?
  • Are there additional cost considerations? Will your existing SAN support the additional port density? Will you need to purchase additional networking equipment or cables to support the new requirements?
  • Are there software costs to consider? Do you have to license individual features such as replication, snapshots, etc? Or is it included with the cost of the array?
  • What are the costs for support and maintenance? Do these costs increase substantially over time or will they remain flat for the lifespan of the array? Does maintenance entitle you to any new features or hardware?
  • Better yet, will the solutions that you are looking into DECREASE any of the above mentioned costs? Will you save money on monthly OPEX costs thus lowering the TCO for your solution?

These are some of the things that you need to consider when calculating what your total spend will be. I’ve never met a CxO that likes to be surprised by large increases in their monthly or yearly budget that they didn’t plan for. It usually means a nice conversation with the CFO which never ends well for the CxO and ultimately it doesn’t end well for the person responsible for the increase.
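If you want to put numbers on those questions, here’s a minimal sketch in Python of a total-cost comparison. Everything in it is an assumption for illustration: the `ArrayQuote` class, its field names, and the dollar figures are made up, and it assumes maintenance and monthly OPEX stay flat over the period, which is exactly the kind of thing you need to confirm with each vendor.

```python
from dataclasses import dataclass


@dataclass
class ArrayQuote:
    """One vendor quote. All fields are hypothetical, illustrative inputs."""

    name: str
    purchase_price: float      # array plus any extra switches, cables, licenses
    annual_maintenance: float  # support & maintenance per year (assumed flat)
    monthly_opex: float        # power, cooling and rack space per month
    usable_tb: float           # usable capacity in TB

    def tco(self, years: int) -> float:
        """Total cost of ownership over the given number of years."""
        yearly_running_cost = self.annual_maintenance + 12 * self.monthly_opex
        return self.purchase_price + years * yearly_running_cost

    def cost_per_tb(self, years: int) -> float:
        """TCO divided by usable capacity, i.e. the $/TB figure."""
        return self.tco(years) / self.usable_tb


if __name__ == "__main__":
    # Made-up numbers purely to show the comparison, not real pricing.
    quotes = [
        ArrayQuote("legacy array", 200_000.0, 40_000.0, 3_000.0, 100.0),
        ArrayQuote("new array", 350_000.0, 30_000.0, 500.0, 100.0),
    ]
    for quote in quotes:
        print(f"{quote.name}: 5-year TCO ${quote.tco(5):,.0f}, "
              f"${quote.cost_per_tb(5):,.0f} per usable TB")
```

The point of the exercise is that the cheaper sticker price is not always the cheaper array once power, cooling, rack space and maintenance are added in over the full lifespan.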

Performance
Now that you know what your needs are and how much you can spend on your shiny new array, it’s time to get down to business. It has to live up to the hype. You’re going to step in front of your boss in a conference room with a fancy PowerPoint presentation that took you 6 weeks to prepare since you’re a technical person not a PowerPoint guru. You need to justify this exorbitant expense that you are throwing in front of them. The array HAS TO perform. If you are looking at a new array to resolve a performance issue it DEFINITELY needs to perform. You’re going to be looking at All-Flash Architectures, Hybrid arrays, solutions that leverage tiering, server-side solutions, you name it, and I’m sure it will pop up during your search in one way, shape, or form. Once again, the only one who can tell you what is right for you, is you. Make sure you perform baselines before you start looking at solutions so you know what your IOPS, Latency and Bandwidth requirements are. It will help narrow down the possible solutions that suit your needs.
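As a rough illustration of what a baseline summary can look like, here’s a minimal sketch in Python. It assumes you’ve already exported periodic samples (IOPS, latency in milliseconds, or MB/s) from whatever monitoring tool you use; the `percentile` and `summarize` helpers and the sample values are hypothetical. It reports the average, the 95th percentile (nearest-rank method) and the peak, since sizing to the average alone tends to hide the busy periods.

```python
import math
from statistics import mean
from typing import Dict, Sequence


def percentile(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of
    all samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range 1..100")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def summarize(samples: Sequence[float]) -> Dict[str, float]:
    """Average, 95th percentile and peak for one metric."""
    return {
        "avg": mean(samples),
        "p95": percentile(samples, 95),
        "peak": max(samples),
    }


if __name__ == "__main__":
    # Made-up IOPS samples, e.g. one reading every 5 minutes.
    iops_samples = [4200, 3900, 5100, 12500, 4800, 4400, 9700, 4100]
    print(summarize(iops_samples))
```

Run the same summary for latency and bandwidth over a period that includes your busiest windows (month-end, backups, batch jobs) so the numbers you hand to vendors reflect real load.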

Capacity
Along with performance, question 1A is usually “How much space do I need?” Seems like a pretty obvious question as well. Along with how much space you need, you should be asking yourself, why so much space? Are you just looking for a performance enhancement but the capacity that you have is more than sufficient? You have 100TB now, so you’ll get 100TB on your new array? Are you taking growth into consideration? Is what you’re buying now sufficient to hold you over for the next 3-5 years and beyond? How difficult is it to add new capacity to the array you want to purchase a year from now, 3 years from now or 5 years from now? Can capacity be added non-disruptively? (HUGE POINT in my opinion) What type of storage are you looking at? Are you looking at tiers, all-flash, SAS, SATA? How much of a concern is speed? What type of data will be stored on the array (VMs, Databases, Email, Archive, File)? This is an area you need to be relatively sure of prior to purchase, or you need to make management aware that additional capacity may be needed in the future. You don’t want to walk into your CxO’s office 18 months after you buy an array asking for more money because you didn’t buy enough disk. Depending on your CxO, that can turn into a resume generating event.
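To put the growth question into numbers, here’s a minimal sketch in Python. It assumes simple compound annual growth, which is only a model; the function names and the 25% growth rate are hypothetical and you should plug in your own history.

```python
def projected_capacity_tb(current_tb: float, annual_growth: float, years: int) -> float:
    """Capacity needed after `years`, assuming compound annual growth.

    annual_growth is a fraction, e.g. 0.25 for 25% per year.
    """
    return current_tb * (1 + annual_growth) ** years


def years_of_headroom(
    current_tb: float,
    usable_tb: float,
    annual_growth: float,
    max_years: int = 50,
) -> int:
    """Number of full years the data is projected to still fit on the array."""
    years = 0
    while (
        years < max_years
        and projected_capacity_tb(current_tb, annual_growth, years + 1) <= usable_tb
    ):
        years += 1
    return years


if __name__ == "__main__":
    # Made-up example: 100 TB today, growing 25% a year, 200 TB usable purchased.
    for year in range(1, 6):
        print(f"year {year}: {projected_capacity_tb(100, 0.25, year):.1f} TB")
    print("full years before the array fills:", years_of_headroom(100, 200, 0.25))
```

If the headroom comes out shorter than the lifespan you’re planning for, that’s your cue to either buy more up front or confirm how painful (and how disruptive) adding shelves will be later.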

Features
Now that you know how fast your disk needs to be and how much of it you need, it’s time to look at the other factors that you should consider. For me, the first was simplicity. I’ve worked with at least a dozen different arrays. The bottom line is storage is not the easiest area to deal with if you are not a seasoned storage vet. Especially when you get into the hundreds of terabytes and petabytes. Smaller shops usually feel the pain of this a little more than enterprises do. They may have really good Windows & VMware admins but most of the jack-of-all-trades guys learn storage last. Enterprises usually have dedicated storage teams that only do storage day in and day out. Having an array that is easy to configure and more importantly easy to manage should definitely be on your checklist if you are a novice or even if you’re a top tier storage admin. You’ll need your time to manage the legacy environments that are still lingering. The top of your list should also contain Non-Disruptive Upgrades (NDU). We all know what a pain having to schedule downtime for an array can be. You basically have to bring down EVERYTHING and hope it comes back up normally. Wouldn’t it be nice if that went away and you could upgrade your array as easily as you upgrade an app on your iPhone? There are other features that you should look for like Deduplication, Snapshots, Replication and hypervisor compatibility for virtualized shops. VAAI support makes a huge difference in vCenter environments. You’ll also need to figure out how easy it will be to migrate your data. If you’re a VMware user, it should be as simple as a Storage vMotion. Physical hosts can be a little trickier but most vendors will provide guidance and assistance when necessary. A lot of the features that you’ll need will be extremely apparent just from dealing with your current situation. You know what you like and what you don’t, now is your time to fix all of those issues that you’ve hated for years.

Next Steps
Meet with vendors, lots of them. See what you like and dislike from all of them. Try to gauge which solution meets your needs. You should have the knowledge at this point of what you need, what is most important to you and how much you can spend. Try to get the most bang for your buck. One thing to remember is that you are the customer and you have to do what is right for YOUR company. Making a sales person happy is not your job; making your end users and your management happy is. When all is said and done, if you’re still not sure, make like you’re buying a new car. Take it for a test drive. Most vendors can set up Proof of Concept (POC) boxes for you and you can test the array with your own data. Nothing will show you if a solution will work better than slapping a copy of your VMs on the box and going to town on them. Run the reports you normally run, try your backup jobs, run all of your applications at as close to a production load as you can. What you put into your testing will show tenfold when the production array shows up. You’ll now have a familiarity with the array and you’ll have a reasonable expectation of how it will perform. If you took baselines like I suggested earlier, you’ll even have data to compare to. Also, speak to your peers and read up as much as you can. There are plenty of engineers and admins that have gone through this process before you. Don’t try to reinvent the wheel. Use all the help that you can find around you. Hopefully you have done your homework and you’ll be on the right track to storage happiness.

For those of you who are curious, here’s a simple breakdown of what my evaluation looked like. Obviously, I went into much more detail during my search, but this proves that you can figure out your needs with just a few bullet points.

Identify Your Needs – Fast performing, small footprint, low power consumption, cut down on FC ports if possible since we’re nearing capacity on our SAN.
Cost – Had to stay within my budget (numbers withheld for confidentiality reasons)
Performance – Must be able to run Tier 1 apps without affecting other apps and servers running on the array.
Capacity – Expected growth was 150% over three years. Looked for double the usable capacity of current system. Must be able to add additional shelves as the need arises.
Features – Simplicity, NDU, Deduplication, Snapshots, Replication
Next Steps – Met with 10-12 vendors, performed 3 POCs. Found an array that met the majority of my needs and the remaining needs were on their roadmap. We have Loved Our Storage ever since.

Hopefully this guide will help you in your search. I remember the pain that I went through during this process. I’d love to save you from going through the same. The thing to remember is that this is the tip of the iceberg. You still need to install the array and migrate your data. The quicker you can settle on what works for you, the quicker you can get down to the fun stuff. Feel free to reach out with any questions and please leave feedback if you can. Good luck in your search.

Storage Is Virtually Pure Now

Disclaimer: This is an opinion piece meant to help all of the VMware Admins out there based on my own experiences. I’ve seen a lot of user reviews and figured what the heck? I should tell my story and as you’ll be able to tell, I’m not a writer 🙂
I’ve been a “VMware Guy” for a little bit now. It’s been about 10 years since I first started playing around with GSX Server (not a typo). I immediately knew that this thing was a game changer. It was a very rare feeling that I did not feel again for a long, long time. More on that later.
I’ve seen a lot of different environments in my time as an in-house admin and field engineer. I’ve seen a lot of things done right and just as many things done wrong. The goal in life of most IT guys (and gals) is to get people off their backs. They may not come out and say it but it’s the truth. The majority of their careers are spent listening to users complain about how the system doesn’t do what they want it to and then having to fix it so it does. VMware Admins face a similar challenge but in most cases they’re listening to other IT guys complain about how their server isn’t fast enough and needs more resources, or that they need 15 new dev boxes in the next hour to test an application, or that they’d rather have a physical server because VMs aren’t as good. So the goal of a VMware Admin is to keep things running as smoothly as possible without having to constantly mess with the environment. Simple is good.
Like I said earlier, I’m a VMware Guy. I started as a regular IT guy and morphed into what I am now. I do a lot of virtualization, some storage, some networking, some scripting when I need to and some Windows administration. Basically, I’m a modern day infrastructure guy. At this point I think it’s what is becoming the norm. IT guys need to do it all. Or at the very least, understand how it all works together.
In my last few years, I’ve been doing more storage related work. I’ve done Fibre Channel configuration and zoning, LUN creation, and provisioning. You name it, I’ve probably touched it in some way, shape, or form and, to be honest, I’m not a fan. The work itself is fun but it has a limit. The payoff just isn’t there for me. Unfortunately though, it’s a necessary evil. I’d rather spend time working with VMware but it won’t mean much without storage behind it. I always wanted my storage to just work but could never find a platform that didn’t require constant babysitting. That is, until I found Pure Storage.
After encountering some performance problems on one of my database clusters, we determined that the problem was the storage array. It was time to find a new way of doing things and the search was on. I’m not going to go into detail about the search itself (unless you want me to, leave comments below). Instead, I’m going to tell you about what the results did for me and my environment. Pure Storage’s all-flash array seemed way too good to be true. It was so easy to manage that for once I did not have to concentrate on making my storage work, it just did. Not only did my database cluster perform, it excelled! Obviously, performance should be spectacular with an all-flash array but it was all of the other benefits that really struck me:
Ease Of Use
The first thing that struck me about this product was how simple it was. I used to install products from other vendors and it was usually a FULL day affair. When the Pure engineer came onsite, I was expecting more of the same. What blew me away was the fact that I was ready to kick him out before lunch. When does that happen with any vendor install? It took longer to get the array out of the box than it took to configure it. I grew up using Windows, so I’m familiar with Disk Management. Most Windows guys (and gals) are. Bring the disk online, create a partition, format it and you’re off to the races. This was just as easy. The interface is clean, simple and very intuitive. You don’t have to be a storage admin to use this product. With Pure, once your zoning and SAN stuff is done, you add a Host or Host Group to match your VMware environment, create a volume, rescan your storage in VMware, set your path selection policy (a one-line script) and you’re done.
No Bloat
One of the things that annoys me nowadays is everything comes with bloatware. Whether it’s a toolbar, a Java installer, your new smartphone, or a new PC, there’s always crap you don’t want bundled in. Same holds true with hardware. How many times have you gone through this? Array is en route, and the engineer sends you a checklist or pre-req list that includes the need for 25 IPs, 3 Management VMs, 200 GB of space for the VMs, and a specific version of Java. Who wants to deal with that crap? Pure, on the other hand, had a one page document, no VM requirements, no Java requirement and once again it was nice and simple. Give us your IP info, your time server info and if you want AD authentication, your domain controller info. Everything was scripted out ahead of time and once again the engineer was gone by lunch, which means I can spend more time VMware-ing. Is that a word? If not, it should be. #VMware-ing
Storage Overcommitment
I don’t know about you but I’ve seen so many VMware environments where there is Thin Provisioning on the storage array and then there is Thin Provisioning at the vSphere level as well. This equals problems in most cases, ranging from performance degradation from Thin Provisioning overhead to having arrays run out of space if capacity isn’t monitored properly. With Pure Storage, this is a moot point. Since they have data reduction inline, VMDK files can now be formatted as Thick Eager and storage capacity no longer has to be managed in two places. All of those zeroes that would get written out on a traditional array are now just de-duped metadata. All of the performance benefits of having Thick Eager VMDKs can now be realized along with simplified management of storage.
Provisioning Times
How many times have you received a request at 4:50 PM on Friday that someone needs a server and they need it by the end of the day? Most VMs nowadays can be spun up within 15-20 minutes. So usually this isn’t the worst thing in the world, but when your next train is an hour later and you need to be out the door by 5 PM on the dot it IS the worst thing in the world. With XCOPY functionality on the Pure Storage array, cloning from a template with customization usually takes between 9 and 12 seconds in my environment. More importantly, it means that I’m making my train and seeing my kid before bedtime.
Data Reduction
It doesn’t matter how big your environment is, I can guarantee you have duplicate data. If you’re a large virtualized shop, you have tons of dupes. How many times have you cloned a template? How many different copies of Windows system files are stored on your storage? More importantly, how much do those copies cost you? If you have 100 VMs and there is a 10 GB Windows installation on each server, that’s basically 1 TB (if my math is correct) of data right there. I didn’t mention page files, duplicate apps and other instances of duped data. Basically you’re spending money for wasted capacity. On an array with data reduction like the Pure Storage array, you’d only have one instance of the data and that 1 TB would now become 10 GB. The other benefit of data reduction is being able to cram a lot more data into a smaller space. Hello Green Initiatives. So I can have a smaller footprint in my datacenter, requiring less equipment, power and cooling to host the same workload? Sounds a lot like the benefits of VMware to me.
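For anyone who wants to check that math, here’s a minimal sketch in Python of the same back-of-the-napkin calculation. It assumes the ideal case where every VM’s 10 GB Windows install is identical, so only one copy is physically stored; the function names are hypothetical, and real environments with different patch levels and apps won’t reduce this cleanly.

```python
def logical_gb(vm_count: int, os_gb_per_vm: float) -> float:
    """Capacity the VMs consume without any data reduction."""
    return vm_count * os_gb_per_vm


def reduction_ratio(logical: float, physical: float) -> float:
    """Data reduction ratio, e.g. 100.0 means 100:1."""
    if physical <= 0:
        raise ValueError("physical capacity must be positive")
    return logical / physical


if __name__ == "__main__":
    written = logical_gb(vm_count=100, os_gb_per_vm=10)  # 1000 GB, roughly 1 TB
    stored = 10.0  # ideal case: one shared copy of the 10 GB install
    print(f"logical: {written:.0f} GB, stored: {stored:.0f} GB, "
          f"ratio: {reduction_ratio(written, stored):.0f}:1")
```

Treat that 100:1 as the best case for this one slice of data, not a number to size an array on. The ratio you actually see depends on your mix of workloads.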
Multiple Workloads
I may be dating myself but when I was a kid, I remember seeing a brand of shampoo that said “No More Tears” on the bottle. Now I see it a different way. “No More Tiers”. Does anyone enjoy configuring tiered storage? Seriously? Anyone? It’s a lot of work. A LOT OF WORK. At the end of the day, flash is going to smoke it anyway. So why waste the man-hours on configuring something that doesn’t work as well, in my humble opinion? I haven’t seen a tiered system that compares in cost, configuration and performance to Pure Storage. It’s not even close. I may be a storage novice but this seems like a no-brainer. Also, now I can forget about having to configure multiple VMware Storage Profiles. The only tier that I have now is ONE. You can keep your database servers on the same storage as your print servers and domain controllers and the array will not blink. Everything becomes Tier 1. In some cases, it’s complete overkill. The simplicity of it all though is such a huge benefit that any additional cost (which is debatable, frankly) is totally worth it. How much do storage admins make? How many of them do you need in a tiered environment with 50 TB or more? How much more complicated does your storage and VMware setup become? Is it worth the price?
No Licensing
One of the other huge benefits is the fact that everything is included. You do not have to license individual components. When a new feature comes out, it’s yours. Snapshots, Replication, VVOLS, it doesn’t matter. When it gets released, you perform an update and BAM! it’s on your array. It’s as easy as updating an app on your phone. Pure even went ahead and did the same with their hardware. “Love My Storage” is unbelievable. If you pay your maintenance, you get new controllers. No more forklifts, no more extortion at the end of your support contract. It simplifies your budget in ways that I have not seen when it comes to storage. You just get a product that works and will continue to work for years to come.
Let me try and sum it all up for you. Pure Storage has a product that is simple, easy to manage and extremely high performing. I left a lot out and I could probably keep going on each of these bullets for days and probably add a few more if I really thought about it. I know the market is changing and a lot of competitors have similar products but based on my own experience, Pure Storage is the best of breed. If your array is coming up for renewal or you’re having problems with performance or complexity, I’d highly recommend that you give Pure Storage a look.