Welcome to the Australian Ford Forums forum.

10-03-2013, 10:17 AM   #1
russellw
Chairman & Administrator
Unplanned outage

Good morning

Firstly, let me apologise for the extended server outage that we experienced this morning. It was the result of a comedy of errors.

By way of background, a ticket was originally raised with our ISP to upgrade the RAM in the server from 8GB to 16GB, for action between 0100 and 0400 yesterday morning. This was completed successfully within the agreed change window at about 0100 AEDST.

There was a minor outage of less than 10 minutes at about 0200, but the server came back up.

The server was then down from 0400 to 0900 yesterday. This wasn't noticed until 0830, at which time a request for a reboot was lodged and the reboot performed. I ran some simple diagnostics and checked the error logs but could find no cause. 0400 is when the backup and several housekeeping tasks run, and server load tends to be very high.

The server crashed again at 2330 last night. I contacted the support team, advised that the server was down again, and said that I wanted diagnostics run when it was restarted to find the root cause of the problem. I received the following email from support at 00:04 this morning.

Hello,
I apologize for the issues you are having since the RAM upgrade. We can check the hardware as requested but we will need to bring the server offline for 1-2 hours. We can do this 24/7, just reply back letting us know what time you would like us to proceed.


This email came from their Dedicated Customer Care, so I replied to say that, as it was 0100 here when I saw the email, it was fine to proceed because this was out of hours.

Stupidly, despite the email saying "just reply back", I have since noticed that it came from a no-reply address and I was supposed to go into the ticket and update it.

I checked the server at 0200 before heading to bed and the server was still down.

Anyway, as at 0700 this morning when I got up, the server was still down, nothing had been done and it hadn't even been rebooted. I contacted support and spoke to an agent who advised that because the ticket had been reopened and I hadn't replied to the message above, nothing had been done. To quote:

I don't think the tech was aware that it was already down. I will reopen it and add that information

It's like the Keystone Cops. At the very least the server should have been rebooted; it would probably still be up, and we could have scheduled the maintenance window in non-production hours. As of 0800 local it was still down. I again contacted the support team, only to be advised that they were running diagnostics on new RAM after the original upgrade RAM failed the test. This was despite my request that the diagnostics not be performed so that we could do them in a planned maintenance window out of normal hours.

I'm happy to take responsibility for not noticing that the email was from a no-reply address but, in this era, it isn't technically hard to route email to agents and we should be well past using no-reply addresses. There's not a lot of point asking a question in an email if you don't want a response via the same channel.

I have escalated the issue to the support manager, as doing nothing for 7.5 hours and making no further contact is unacceptable. Hopefully, the server will return to a stable state now that the new memory has been tested, and we shouldn't experience any further issues.

Cheers
Russ

__________________

Observatio Facta Rotae


All times are GMT +11.

