If you work for a vendor and after reading this you feel that this entry is about your company, then you are right! And if you work for a vendor and after reading this you feel that this entry has nothing to do with your company, then you are totally wrong!
I've been working in IT now for more than twenty years; I've done a variety of roles but a big chunk of my career has been running support teams, and I've done this on both sides of the fence, i.e. both at an end-user and at a reseller. I know a lot of people who have worked in support on both sides of the fence and pretty much all of us have come to the same conclusion; over the past ten years or so, vendor support has got markedly worse! Of course, it may be that we are looking at the past through rose-tinted glasses but I do not believe so. I would also say that this is pretty much across the board; this is not just a storage problem, it impacts hardware and software support generally.
There are a variety of reasons for this decline in quality and ironically, it is some of the services that we as end-users have demanded that may have driven the decline in service.
The Internet is a big driver in the decline of quality; there are huge amounts of support information and detail online and we as end-users have demanded that the vendor support databases are put online; unfortunately, this information is often presented in a manner which makes it pretty much impossible to search effectively and you find yourself trawling through screens of unrelated sludge before you find the answer you are looking for. But because all of this information has been provided in a 'self-service' portal, this appears to have been used as an excuse to reduce the number of qualified people in the support centre.
The Internet has also further enabled the stock holding questions from the support centre and having worked both sides, I know that these are holding questions in general.
1) Are you at the latest level?
2) Can you send me the logs?
3) Can you send me the configuration files?
Now all of these are valid questions in many circumstances but too often it feels that you are dealing with a robot who is working off a script! This is a Support Centre not a Call Centre; there is a difference! And my (and many others') responses to the questions are:
1) Why? Will it fix my problem? Where in the release notes does it talk about something even vaguely related to my problem? And if I'm in a really sceptical mood, can you send me the piece of the code which fixes the problem. And you realise that this is a live production system and just applying fixes is unacceptable, I will get asked all the questions I've just asked you by the change board. And if it makes it worse, it's my job on the line.
2) Why? Will that log help you diagnose the problem because I've already looked at the log and I can't see anything. Do you have any idea how frustrating it is when you have got a disk off-line and marked failed to be asked for logs from the array?
3) Which files do you want to see and why?
And often, I've already attached the information to the ticket; so asking for it again shows that you have not read the ticket or are just working off the script.
Constantly in conversations with our various hardware engineers from all vendors, we discover reductions in headcount, experienced engineers being retired early and territory coverage per engineer being increased. We are not buying less hardware, we are buying more and there are more systems out there to go wrong. Now arguably, systems are becoming more reliable but there are more of them. And in the world of storage, we have lots of moving parts; disks spin, tapes spool, robots move and these are all things which wear out and break. Disks get bigger and the potential impact of a disk failure and the resultant rebuild times gets ever larger.
Talking to people who work in the support centres, it appears to be more important to keep the queue within the targets than to solve the customer's problem at point of first contact. There is no longer time to do follow-up calls; for example, calling the customer who had the severity 1 call to ensure that they are happy with the fix.
This is just the tip of the iceberg and I could rant on and on about this subject; I was ranting about the decline in service years ago and yet it really is not getting better. For example, I am personally aware of four companies this year who have experienced major outages due to problems caused by vendor support; it may be that now I have a fairly high profile in the industry people tell me their war stories, but it seems we are on an upward trend.
I think it's about time that the vendors started to review what and how they are providing support (fix your websites or at least put a decent search capability on it, it is pretty awful that generally I find myself using generic search engines and the site: directive to search your site); it is also about time that they started treating support centre staff with respect and giving them time to do a great job.
Your post and your responses to me make it very obvious to me why you have a hard time with support.
You'd rather believe I'm too lazy or incompetent to read release notes, I need 10 more years of experience to understand how difficult it is to make changes (never mind that 90%+ of the software and hardware in IT is less than 10 years old), or every problem that pops up in your environment is well documented in some mythical internal document that I should just go and look for. Oh and I shouldn't ask for logs or config details either. I'm just supposed to mind read the problem from you and presto give you a fix.
What world do you live in? Very revealing discussion. Thanks.
Posted by: Support Engineer | December 20, 2010 at 03:58 PM
Remind me who you work for again? You see I know..
And I never said you shouldn't ask for logs and config; actually most of the time, you will already have the logs and the configurations, most of the time they are already attached to the problem record. And most of the time, you are working from a script. And most of the time, you'd rather not be...actually, I generally get very apologetic support people who want to do a better job...but the process that they have to follow gets in the way.
If you read the whole post, you will find that I am pretty sympathetic to the support teams; you generally aren't given enough time to do a good job and the quality has gone down because of this.
If you took the time to talk to experienced customers, you would find that this is not exactly uncommon, though.
And yes, you do need experience working in a large end-user as opposed to a vendor; you do need to sit in change-boards and learn how complex systems hang together, you do need to know that C-level sometimes want proof that the fix you are going to put is not going to have adverse impact and take the whole environment down (yes it's happened).
And yes, you should read your own bug-tracker; you will find that on several occasions going to the latest fix would have made the problem worse and actually broken things even more.
Posted by: Martin G | December 20, 2010 at 04:26 PM
I can understand sending the logs and that's pretty much done by our teams when we open cases with vendors so they don't even have to ask. But the notion that we have to automatically upgrade to the latest version is absurd for the reasons Martin stated. In a large enterprise environment we have strict change control and version control policies for very good reasons. If a support agent is going to require us to upgrade to some new OS version, they're going to have to show me a reference (release notes, previous customer cases, etc) that show that 1) the issue I'm reporting has been reported before and 2) that it is fixed in the version they are recommending we upgrade to.
And if a vendor is not putting out comprehensive release notes, I consider that a major knock against that vendor for future purchase consideration.
Posted by: Scott H | December 20, 2010 at 04:30 PM
It's not difficult, with my E-mail address that you have, to figure out who I am or who I work for. Hell, if you E-mailed me and asked I could have saved you the googling. I'm confident I've helped your company recently (E-mail me if you want details). I just don't think it's fair to overgeneralize support, and stereotyping only makes resolving the issue, which I genuinely want to do, harder. I still need the exact same information. I can tell you with certainty, your 3 dreaded questions from support are not just holding questions.
Posted by: Support Engineer | December 20, 2010 at 04:48 PM
Scott H, The question from support about upgrading can usually be answered exactly as I'm sure you do.
"It's a production environment and we can't upgrade without knowing for certain it'll fix this specific problem."
Sometimes when we ask the question it's preprod or a lab and that's not the answer. That saves all of us time and possibly money. BTW, most of the time I want to find the problem first as you do. I discourage support staff from taking the just upgrade approach for every problem. You're right that it can cause more problems than it fixes. Total time to ask question and get that answer is about 30 seconds. If I was using that to buy time, it didn't work. :-) I do what I do for the rush that comes from really resolving an issue. I don't want to just dump you off the phone or close a case. I think my company is a little different about support though which is why I still do it.
Posted by: Support Engineer | December 20, 2010 at 05:02 PM
Actually, I think the first question is to just see how stupid the caller is! If they'll blindly upgrade to the latest version, you've got a naive muppet on the call and they'll probably believe whatever else you suggest.
The second two can be legitimate but not always...and should not be used to hold off sending out an engineer with replacement parts. This is absolutely being done, I've had discussions with field engineers who agree and actually, they really enjoy working with us because we understand them and the organisational issues that they are suffering.
And we always ask why you want particular information; we don't just blindly hand over information. I expect you to know the answer and not just flannel with some meaningless process-related crap. Or at least be honest and say, 'it's process related crap...'
Support has gone backwards; believe me....there are still good people working in support; they don't have enough time to do a decent job.
Posted by: Martin G | December 20, 2010 at 05:04 PM
"Actually, I think the first question is to just see how stupid the caller is!"
You don't see how that's over the top on cynical?
If you'd have started your post with some of that Martin I could explain even better why we do what we do.
Many problems that log errors that look like hardware problems can be attributed to software issues. My company has worked very hard to make sure we address the problem the first time and not spend a lot of time and money on chasing hardware that we know to be software problems. We need logs to say for certain. I think you'd also rather us fix the issue instead of blindly ship hardware only to find out it's a software bug. See your own comments and post about difficulty getting maintenance window. There's never enough time to do it right the first time, but there's always enough time to do it again.
Asking why we want some specific information is fair and proactive on your part. For some support staff and issues it's required information to engage a certain group or expert. Some experts in support won't look at an issue without having certain information to piece together what's broken based on the experience they have that you need.
Posted by: Support Engineer | December 20, 2010 at 05:28 PM
Of course I'm cynical...I deal with vendors on a day to day basis; I explain to them on a day to day basis why we can't just upgrade to the latest level. I spend time explaining to them what change management is, what problem management is; I explain to them what a critical system is.
Yes, I'd prefer that you fixed the problem first time; which is why installing the latest levels on your say so without evidence is something I'm not willing to do. When I've had an array running stable at a particular level and nothing has changed; a failed disk is often just a failed disk. Dispatching a hardware engineer should be the first response...and actually swapping redundant hardware is generally a damn sight easier than arranging a firmware/OS/patch. I've recently had a spate of cases where vendors have been working very hard to avoid doing the sensible thing until they've got enough staff in to handle things.
I will stick by my statement that support has got worse; a mixture of call-centre mentality with understaffing by the powers that be has caused this.
And I quote from this blog
" I think it's about time that the vendors started to review what and how they are providing support (fix your websites or at least put a decent search capability on it, it is pretty awful that generally I find myself using generic search engines and the site: directive to search your site); it is also about time that they started treating support centre staff with respect and giving them time to do a great job."
You will see that I am standing up for the people who actually work on the coal-face.
Posted by: Martin G | December 20, 2010 at 05:50 PM
I know who Support Engineer works for based on Martin's tweet earlier and I have to say that at least for us, his company has been less guilty of the "support by rote-script" behavior Martin is rightfully protesting. Just last week I opened an issue with your company and was pleasantly surprised at how quickly we were engaged and the fact that the very first person I spoke to was the right person, was engaged with the issue, and was capable of doing an in depth analysis and asking the right kinds of support questions.
I was surprised because he wasn't engaging in the behavior Martin discusses, which has absolutely become the norm in the industry. When we call for support, the issues are typically complex and are coming from a relatively complex environment. More often than not, we go through at least 1 front-line agent if not 2 before any real diagnosis and support starts. So, yeah, I was pleasantly surprised when the first person I spoke to got right down to it and rendered solid support. It's simply not the current industry norm.
And, across the industry, the problem is generally getting worse. Martin's analysis of vendor support site search capabilities is also dead-on accurate in my experience.
Posted by: Scott H | December 20, 2010 at 06:03 PM
It'll be interesting to see if NetApp can continue with such quality as they continue to undergo rapid growth; or will they fall into the trap that so many of their competitors have? They are also fortunate that they have a single product (more or less), so their people only have to be an expert in a single product line. This is not the case for many other vendors.
I remember a time when I could call pretty much any vendor and get to speak to the right person pretty much immediately. Now, far too often we find ourselves raising calls and then going round the process to get the support we need. It doesn't happen all the time but it's with increasing frequency.
Posted by: Martin G | December 20, 2010 at 06:18 PM