MV Communications: 1998 outages/events

News server Dec 11 1998
The news server blew a power supply at about 3pm; we temporarily replaced the PS and the server was back up at about 4:15.

Nashua Dec 1 1998
There was a brief outage in Nashua this evening around 9:20PM while we investigated a possible problem in the router (there was one, and it was repaired). The repair took only about 5 minutes; however, the Bell Atlantic PRI number (886-7124) was inadvertently left down until about 10:45PM.

Tuesday November 17 02:00-03:00
There is another planned outage scheduled for 2:00 AM Tuesday morning that will be used to continue some of the work that could not be completed during the Sunday morning downtime. Again, this planned maintenance is directly related to readying things for the BBNPlanet link. However, this time the outage should only include the gw2-55bridge router in Manchester.
Status: Unfortunately same result as before: the router is giving us a difficult time taking the upgrade.

Sunday November 15
In the wee hours of the morning we'll be doing some miscellaneous maintenance: at about 1:30 AM the gw-nnh-1 router in Nashua will be taken down for about 15 minutes to add some memory. And at about 3:00 AM the gw2-55bridge router in Manchester will also be taken down for at least 30 minutes to add some memory. Also around this latter time the bnh dialin access servers (serving the Brooks numbers) will be taken down for some brief maintenance.
Status: All was accomplished except for the Manchester router upgrade, where there was a glitch; this will need to be rescheduled.

News server Nov 11-13
Overnight November 11/12 and then again early November 13 the news server was out for a period of an hour or more. After the first outage one of its RAID disks was marked as failed and we underwent a rebuild operation during the day November 12 (the server was up and available during this process). We believe that there may have been a power supply issue that was related to all of these incidents; some changes were made as a result.

Salem ...
The link to Salem went down at about 3:15AM Thursday Nov 5; given the outage last night Bell Atlantic was called immediately. At this point they believe they see a problem internal to their system and are trying to confirm it and narrow it down with some more tests.
4:55 : After some thorough investigation (checking multiple test points) the Bell Atlantic technician believes that there are errors communicating with the network interface unit (NIU) at the Salem location. This person also believes the report last evening of a fiber problem was erroneous, that whoever gave us that report was confusing it with a fiber problem in Nashua. The link was released from testing and came back up. Bell Atlantic will be dispatching someone to check out the NIU in Salem during the day today; this means that there may be another outage period while they replace it. There may be further instability in the line until the repair is made.

Wednesday Nov 4, Salem
At about 21:30 the link to Salem went down and was showing a lot of framing errors in one direction. We went on site to check our equipment at the side that was showing errors but without touching it found that the line had cleared by about 21:50. It's possible that Bell Atlantic threw the line over to another trunk; we'll see what we can find out.
Update 22:20 : BA reports that the outage was due to a fiber failure (of unspecified nature) that was repaired.

Tuesday November 3 1998
At about 12:53 both T1s between Litchfield and our Manchester office went down. Bell Atlantic was called in for repair and quickly found the cause: apparently while they were working on installing one of our new T1s they disconnected these two. The outage caused a separation of our Manchester office from the rest of the world; users dialing into the "bnh" (Brooks) numbers could access our servers but not the rest of the net, users dialing into other locations could access the Internet but not MV's servers.
Status 14:35 : Both T1s back in service.

Friday October 30 1998
At about 11AM the T1 between Litchfield and Nashua went down. The T1 was given to Bell Atlantic for testing. BA did not find a problem but, as so often happens (to the point where some Bell Atlantic technicians recognize this as a way to repair a down circuit), after they finished the line came back up working again. Connectivity was restored at about 3:50PM. During the outage, users dialing into the Nashua POP (that is, the Bell Atlantic numbers 886-6688 and 886-7124) were not able to reach Internet services or MV servers (note however that the BNH numbers serving Nashua were not affected). This outage also cut us off from one of our links to the Internet backbone; the full load switched over to the Sprintlink connection.
Note: See news for related information about upcoming changes in our connectivity.

Sunday October 11 1998 11AM
At about 11AM we will be bringing down all of the servers in order to do some brief work on the power. Downtime should last about an hour.
Status: Everything was back up by noon.

News and Mail servers, Oct 2 1998
We had a heating system run amok here and had to shut down the news server temporarily at about 11:25AM; it was back up as of about 12:55 when the heat was again under control. The mail server was down for about 15 minutes during this period as well.

Nashua Oct 1 13:35
There is a power outage in downtown Nashua affecting our Nashua POP, and therefore one of our backbone links to the Internet. The Nashua Bell Atlantic access numbers will ring busy during this outage; however, the BNH (Brooks) access numbers are still available.
Status 15:05 : Back up.

Sunday, Sep 27 1998
Beginning at 11AM we will be doing some work in our server area; the shell server will be brought down to swap its ethernet card and to change its network wiring, and other servers and access equipment may be brought down briefly so that we can do some physical changes (wiring and location). We expect this to be over by 1PM.
Status: We finished most everything by about 1PM; there was also a quick reboot of the shell server at about 1:45 in order to re-attach an external SCSI device.

Sep 19 1998
The news server (news.mv.net) was rebooted at about 11:54AM mainly to restart one of its daemons. It was back up within about 15 minutes.

Sep 7 1998
An early morning violent thunderstorm took out our link to Nashua at about 5AM -- service was restored at about 11:00.

Aug 18 1998
A recall was issued on some recently-shipped modem cards used in the Livingston pm3 remote access servers we use for our K56 lines. We had to pull and inspect each card to see which (if any) needed to be returned. Since we wanted to get this done ASAP we did this during the afternoon, and this resulted in hanging up modems during the 20-minute (or so) period that it took, which was from about 3:20 PM through 3:40 PM. FYI we identified 8 cards (78 modems) that will be returned and replaced-- the recall relates to excessive retrains and some failures to connect in some environments.

Aug 15 1998
The Nashua K56/ISDN number (886-7124) is ringing fast-busy -- our equipment shows the circuits up, so it's likely a number translation issue. Bell Atlantic has been sic'd on it.
Status Aug 18 15:00 : Fixed by BA

Aug 11 1998 10pm
Salem: Power was out in our Salem POP, probably due to a fast-moving thunderstorm, from about 10:00PM Tuesday night through 00:45 Wednesday morning.

Aug 11 1998 14:30
There was a power outage in our Manchester office at about 2:30PM which affected all of our servers. Most everything was back up at about 3:00. The news server (which was also back up at this time) had some data errors (likely because of the outage) which caused some trouble in the early evening as well.

Aug 2, 1998 13:00
We will be doing some work in our computer room starting at 1PM- the shell system (mv.mv.com) will be down for a short period -- it should be back up by 2pm.
Status: This took somewhat longer than expected, but the system was back up by about 3:15.

Aug 1 15:00
There was a power outage in our Manchester office today which took down the servers there. Most servers are back up but we are still working on getting the news and mail servers back online.
Systems were back up as of about 16:00 - but at that time we took the news server offline to do another filesystem check there. It was down for another 20 minutes or so.

July 28 11:30
We have temporarily disabled access to the mail server while a problem with mail forwarding is being looked into.
Status 12:05 : server is available again

July 25 5:30pm thru 9:30pm
At about 5:30 pm we lost connectivity to a number of servers at our Bridge Street NOC. This was traced to a loss of disk on one of the servers -- a server that is, ironically, scheduled to be retired in the immediate future. We rebuilt the disk and restored system-related files from backups, and everything was back up by about 9:30pm.

July 14 4pm
The news server will be down for a few minutes between 4pm and 4:30pm while we move it a few feet.

Salem July 12 1998
The link to Salem is out at this time- Bell Atlantic has the circuit for testing.
Status: Back up at about 7:30pm.

June 27
The news server will be taken down at about 11AM for some hardware adjustments. Outage should be no more than an hour.

Servers Jun 23 1998
We lost a UPS (battery backup unit) at about 6:15 tonight resulting in a couple of servers going down (including nameservice and the shell machine). Systems were fully back by about 7:05.

Nashua June 6 1998
There will be a scheduled power outage in our Nashua facility on Saturday June 6 from approximately 7AM through 9AM. During the outage our Nashua Bell Atlantic numbers will be down, as will one of our Internet links.
Status: Over as of about 7:30; downtime was about 10 minutes.

May 31 1998
There are violent thunderstorms passing through New Hampshire this evening, with more on the way. Affected MV POPs:

All hubs, May 19
There is an issue with one of the routers at our main hub in Litchfield. This problem is being worked on. The symptoms will be intermittent outages for users. For example, you may be able to get to a site, but then on a reload of a web page, receive a timeout error.
Status: Problem was located and solved at 2:15pm
Addendum: Repercussions intermittently affected Nashua dialins for some time after this, but these were also fixed later in the day.

Brooks dialins May 7
Our Brooks dialin lines (the K56flex/ISDN numbers serving Manchester, Concord, Dover, and Peterborough) will be down from about 9AM to 10AM as Brooks does some reconfiguration on the PRI trunk groups.
Status: This was done about an hour late, but it was done successfully.

Salem April 29
We are experiencing problems with the link to Salem and the link has been down and up throughout the day. We are working to get this resolved.
Status 16:30 : Up and stable again as of this time

Salem April 28/29
The link to Salem went out at about 23:25 April 28. After checking our equipment we asked Bell Atlantic to run a test on the link. They tested it for about 15 minutes, and it ran clean. Once they exited the test, the link came back up (at about 00:15 April 29). (This is a phenomenon that the telco test people tell us happens quite a bit. Sometimes the act of putting the line into and taking it out of test mode clears whatever was wrong with it.)

Power at MV office; April 24, 10:50am
PSNH is shutting down all power at our Manchester office building, due to potentially hazardous electrical problems in the basement. The staff has been forced to leave the building at this time. Our servers are all located in this building; they are on UPS's which will keep them running for a while, but if the outage lasts too long, service might be interrupted. We hope the interruption will be as brief as possible (or non-existent), and that the staff can get back to their phones and stations soon.
Status 13:30 : We're back in, and working on bringing servers back up
14:00 everything is back.

News server April 23
The news server is down as of about 3pm; it had a glitch in one of the component RAID disks and we are rebuilding the parity information in the RAID array. We don't expect any lossage, but it will be down for a few hours while it rebuilds.
Status: Back up at 18:00

April 19, 1998
News server: The news server was offline from about 15:30 through 16:30 due to an outage in the ethernet segment that it is on.

April 15, 1998
Litchfield: Bell Atlantic is again going to try to do a conversion on the SLC in our Litchfield location. This will happen at 3AM early Wednesday morning April 15, and should take a couple of hours.
At the same time, we will take this opportunity to do some minor equipment relocation in Litchfield, which may result in occasional small interruptions.
Status 4AM: The conversion took about an hour and looks to have been successful. The second part (equipment relocation) turned out not to be possible at this time and so was not done.

April 12 noon- 15:30
We're doing some work in our server room which may result in some brief interruptions to one server or another this afternoon. Duration of any interruptions should be just a few minutes.

April 6 1998 10:45am
The news server will be taken offline, so that a replacement SCSI RAID controller card can be installed. Downtime should be no more than an hour if all goes well.

March 29-30 1998
The news server crashed at about 19:00 Sunday night due to an apparent SCSI failure. Several attempts to preserve the news spool failed, and we finally elected to rebuild the filesystem, which means that all articles on the news server were lost. Further, the server is operating in a degraded mode until the SCSI problem can be corrected. The system was back up at about 3AM March 30 in this mode.

March 20 1998
mv.mv.com (the shell system) will be taken down today at around 11AM to replace a failing disk drive. Downtime will be several hours as the data is copied to a new disk.
Status: Finished as of about 14:00

March 18 1998
We found a malfunctioning port in Litchfield which was preventing IP callers from making a successful call on that port. Since the port was fairly near the beginning of the hunt group, many callers were affected. The modem and phone line were moved to a free port on another terminal server.

March 18 1998
PSNH is replacing a power meter at our Manchester office today at noon. This may result in a brief outage of one or more of our servers.
Note: this was originally scheduled for March 17 but was pushed back a day.

Mar 16 1998
There was a power outage in Peterborough and surrounding areas this afternoon; our Peterborough site was down from about 13:20 through 14:05.

Mar 3 1998
The link to the Manchester dialin modems is offline; we are on our way to investigate.
Status: The CSU in Litchfield had been disturbed by construction workers on premises and was in a loopback mode.

Mar 2 1998
The link to Keene is down as of about 9:15 PM, no ETR as yet.
Status: Up as of March 3 AM, problem was a wedged CSU/DSU in Keene

Feb 25 1998
Litchfield: The SLC conversion originally scheduled for Feb 13 has been rescheduled for Feb 25 at 3AM. See the original note below for details, everything is the same except the date.
Status: The operation was not successful; BA stopped at about 5AM without success, put the unit back the way it was, and promised to reschedule after they could figure out what was going wrong. After they left, 6 lines were found not working; we called and had them busied out until they can be repaired.

Feb 23 15:00
pm-1 (the first terminal server in Litchfield) had to be replaced due to periodic problems which required a site visit to reboot the beast (there were two such problems over the weekend). A replacement was swapped in, which we expect to stabilize the situation.

Feb 17 13:44
The news server is currently down; no ETR as yet.
Feb 18 00:36 Back up, after tracing down the problem to two bad SCSI connectors. Gaudy details are in the mv.info.outages newsgroup.

Feb 13 1998
Litchfield: On the morning of Friday February 13, 1998, Bell Atlantic will be doing some work on our phone connections in Litchfield. The phone lines in Litchfield are connected to a SLC in our office, which is connected to the Bell Atlantic switch in Merrimack by a direct fiber link. BA will be converting this SLC to an "Integrated" carrier unit, meaning that the phone lines will bypass the analog part of the CO switch and instead be connected to the trunk side. This may improve connection rates in Litchfield (we'll see after the conversion); but BA is doing this in order to relieve congestion in the Merrimack switch.
The work is scheduled to start at 3AM Friday morning, and they expect to be done by around 5AM.
Update: This was cancelled by BA and will be rescheduled soon.

Litchfield Feb 7 - 8
As part of our equipment move within the building, the remaining modems are being relocated this weekend. We will make every attempt to do this as gently as possible, e.g. by busying out groups of lines before moving the modems attached to them. However, this is a lengthy process and some glitches are inevitable.

Feb 7 1998
There were two power outages in Litchfield this morning due to some tinkering by construction people. They said that it was accidental-- once at about 8AM and again at about 11AM.

Jan 24 14:xx
The link to Dover is down after power outages in both Litchfield and Dover.
Jan 25 12:15 AM: Dover is back up following on-site visits by MV and Bell Atlantic. (BA was swamped with outages over the past day due to the ice storm; we appreciate their extra effort and long hours.)

Jan 24 14:00
The shell server mv.mv.com is down to replace a disk and add disk capacity (see also the note about this morning's power outage below).

Jan 24 1998
After last night's ice storm Litchfield was without power from early this morning through about 2PM, which meant that we were mostly down. We took this, um, opportunity to perform a needed disk replacement, as well as addition of disk space, on the shell server (mv.mv.com). We expect this operation to be completed by around 4PM.

Jan 21 1998
The Nashua link will be moved back to a replacement cisco router during the early morning hours overnight Jan 20/21, no earlier than 2AM. This operation will bring down both the Sprintlink and the Nashua connections to Litchfield while the equipment is being moved. Duration should be around 15 minutes.
2:15 AM: Move successfully completed, Nashua link is back up with full routing at this point.

Jan 20 1998
Dover: The access servers in Dover were offline from about 15:45 through 17:00. The cause was, regrettably, pilot error on our part while we were working on another issue. We apologize for this downtime.

Litchfield Jan 19, 1998
Starting at about midnight Sunday night/Monday morning, we will be doing some rearrangement of the first 30 modems in Litchfield. We will make all efforts to keep the disruption to a minimum.
Note: this was originally scheduled for Sunday morning but was pushed back due to the events in Salem.

Jan 18 1998
There is a power outage affecting areas in Salem and Windham; our Salem POP is down as a result as of about 12:05AM. Granite State Electric reports that crews are working on the problem but have no ETR.
Update 2:45 AM: Power was restored at about 1:00AM; however, the pm-snh-1 terminal server did not come up. On arrival it was found that this terminal server was nonfunctional. We are working on getting it restored or replaced, hopefully before 5:00.
4:10 AM: pm-snh-1 was replaced with a spare and is back up.

Nashua Jan 15 1998
The T1 between Litchfield and Nashua is out as of about 5:30 AM. Bell Atlantic has the line for testing.
Update: The problem appears to have been caused by equipment failure at MV, likely a router port. We've restored connectivity to Nashua by moving the link to another router; however this affects the way data is routed between our multiple upstream connections. While incoming data will still be split between the two providers using optimum paths, most outgoing data will be sent via Sprintlink (except for Nashua users, where outgoing data will be sent via the Destek connection). Because of the upcoming storm, this may take until some time next week to resolve.
Update Jan 16: The link to Nashua has been showing a lot of CRC errors on the Nashua end since being brought back up on the new router, resulting in suboptimal performance. As always, watch the mv.info.outages newsgroup for more info.
Jan 20: The link has been running clean since about 12:30 today. We will be moving it back to the cisco router overnight and hope to have it back in its fully functional condition at that point.
Jan 21 2:15AM: Link is back up and fully operational.


Copyright © 1998 thru 2008 MV Communications, Inc.