[Faccus] Tentative downtime for SharePoint – February 6 at 8 p.m.

Natasha Jennings njennings at uwaterloo.ca
Wed Feb 4 12:58:49 EST 2015


Last week, SharePoint experienced a prolonged outage. Investigation identified issues with the storage infrastructure used by SharePoint. In the incident summary (below) shared Friday, January 30, it was indicated that IST would fast-track a solution to move SharePoint data to an alternate storage solution. Our goal is to have the replacement hardware tested and in place Friday, February 6. An update on this work, and confirmation of downtime, will be provided Thursday, February 5.

Tentative plan for SharePoint maintenance (TBC)

What is happening? Currently, IST staff are fine-tuning the replacement infrastructure, and planning a test move of the data. Once we confirm our implementation works, we will need to move all the SharePoint databases. This is a large amount of data and will take a minimum of 6 hours to complete.

When will this happen? On Friday, February 6 at 8:00 p.m., SharePoint will be restored to a backup and the data moved to the replacement storage hardware. SharePoint will return to service by 8:00 a.m. Saturday, February 7. SharePoint may be back online before 8:00 a.m. Saturday; however, the maintenance window is being extended to provide time for backing out of the process should an issue arise.

What are the steps? This process will require:


• running a backup to ensure we have a clean copy of all the databases – this could take 2 hours to complete
• copying the data to the new location – this will take up to 6 hours
• reconfiguring SharePoint to use the new data locations, and bringing all servers back online

What do you need to do? Please close all sessions to SharePoint, and do not attempt to use SharePoint during the maintenance window.

Questions/concerns? Please contact Stephen Markan, SharePoint Coordinator, at smarkan at uwaterloo.ca if you have any questions about this plan.


Natasha Jennings
Communications Officer
Information Systems & Technology
University of Waterloo
519-888-4567 ext. 47951


From: Natasha Jennings
Sent: Friday, January 30, 2015 4:09 PM
To: isthd at lists.uwaterloo.ca; ist-staff at lists.uwaterloo.ca; admin-support at lists.uwaterloo.ca; faccus at lists.uwaterloo.ca; mactug at lists.uwaterloo.ca; ctsc at lists.uwaterloo.ca; UWweb at lists.uwaterloo.ca; ucist at lists.uwaterloo.ca; 'SharePoint-Alerts at lists.uwaterloo.ca'
Subject: SharePoint Service incident summary

The SharePoint Service offered by IST experienced some difficulties earlier this week. SharePoint was unavailable to most campus areas from late Wednesday, January 28 to Thursday, January 29 at 4:00 p.m.

What happened? SharePoint stores its information in a database structure called a ‘Content Database’. Because SharePoint is used extensively across campus and stores a large number of documents, this information is spread across a number of content databases. During the evening hours of Wednesday, January 28, most of the content databases became damaged. The damage seemed to cascade across the content databases; the end result was that the majority of SharePoint sites were unavailable for access or use.

How was the issue fixed? The immediate goal was to get SharePoint back to a useful state, with usable data. Our first step was to restore individual sites to the last available good backup (Wednesday, January 27 after 8 p.m.). These restore attempts had mixed success. It was determined that the damaged content databases needed to be fully restored to the Wednesday, January 27 backups that ran after 12 p.m. (noon). Because of the large amount of data, the complete restore of the damaged content databases took just under three hours to complete.

Parts of the restore have continued today as some individual sites were still experiencing some problems.

Future steps: The SharePoint Services Team is investigating the specific cause of the corruption. SharePoint is a service many units rely on for business processes, and IST is aware of the inconvenience this service outage has caused. We have identified issues with the current network storage location and are working on creating an alternative network storage infrastructure for SharePoint data, which will help protect the data and prevent a recurrence of this type of outage. This will involve moving the content databases to a different storage location. Detailed communications about this work will be shared next week.

Feedback and questions: If you have any concerns or questions about this incident, please contact the IST SharePoint Service Coordinator, Stephen Markan, at smarkan at uwaterloo.ca.

Recipients of this list: isthd, ist-staff, admin-support, faccus, mactug, ctsc, uwweb, ucist, Sharepoint-Alerts


Natasha Jennings
Communications Officer
Information Systems & Technology
University of Waterloo
519-888-4567 ext. 47951

