MV Communications: 1997 outages/events
- Dover Dec 26 1997
- At about 12:30 pm several modems and one of the terminal servers
in Dover lost power and failed to properly restart. At this point we
feel that there may be a problem with our UPS power source, and have
taken a new unit to Dover as a replacement. All equipment is back up
again as of about 15:15. During the outage period, a number of modems
in the main hunt group were offline, yielding ring-no-answer; and the
entire K56flex group was down.
Update 17:30:
The outage was apparently due to the facility owner installing a new
circuit for us in response to last night's outage, a fine thing
except that they moved our equipment to the new circuit without
coordinating it with us or otherwise letting us know. Our UPS
power supply was at maximum load and likely shut down one or more of its
outlets during the transition to the new circuit. We transferred some
of the load to the new UPS, thus adding UPS capacity to this site.
- Dover Dec 25 1997
- At about 9:15pm all equipment in the Dover POP appears to
be down. An MV person is on the way there to address the problem.
Update 00:30 Dec 26: There was an electrical circuit
out at the facility; MV equipment is back up after being moved
to another circuit pending more information.
- Dec 22 1997
- At about 5:00pm today we'll be doing an upgrade to our phone
system. During this upgrade, which should take around an hour,
our phones will not be answering.
- December 20 1997
- This afternoon we had an outage in one of our gateway systems that
effectively cut most of our servers off of the net. During the outage,
users could not get authenticated (i.e., couldn't log in), and anyone
who was already logged in would experience nameserver and other problems.
The outage lasted from about 13:50 through about 15:15.
- Nashua December 13, 1997
- PSNH plans to do power work in the area of our Nashua building on
Saturday, December 13, from approximately 8AM through 9AM, requiring
power to the building to be shut down during this period. (This is work
that was twice scheduled and postponed previously.) Although our
equipment is protected by battery backup against brief power losses
and glitches, we do not expect the batteries to hold out
through the entire outage. Expect some down time during this period,
both for dialin access and for one of our Internet backbone links to
which we are connected at this location.
Update: Nashua was down from about 8:30AM to 9:30AM.
- Dec 10 1997
- This morning there was an apparent power glitch in Litchfield
at about 9:55 which disrupted the links to Salem and to Nashua.
Salem was back online at about 10:14, and Nashua at about 10:25.
- Dec 2 1997
- The link to Salem went out at 9:30pm. After a call to Bell
Atlantic, the link was back up at about 10:05.
Update: Test center believes this was due to
construction of new fiber circuits between Manchester and Salem,
and moving this circuit over to it (a "throw"). The test
center was also caught short by this, if that's what it was,
having a number of alarms during this period.
- Dec 1 1997
- MV nameservers went down at about 6:30 this morning;
this eventually resulted in the need to reboot several other servers
at about 9:30.
- Nov 26 1997
- Peterborough: One of the lines in the Peterborough
hunt group is experiencing severe static and, since it is early
in the hunt group, it gets hit a lot and is causing a great deal
of trouble. Bell Atlantic has been called to repair the line; in the meantime
we have also asked them to busy out that line so that it will be skipped
until it is fixed.
Update 15:20: Bell Atlantic claims to have repaired this
line, and we do see it being used at this time.
- Nov 26 1997
- Several things this morning: news2 (aka zircon), the news server
primarily for shell users, went down at about 5:18 and had to be rebooted;
it was back up slightly before 9:00. The news server being down triggered
an odd situation wherein one of the nameservers (on a different system) was
unable to start; this was corrected at about 8:00, before the news server
came back up. Finally, the shell system (mv.mv.com) was rebooted at about
9:15 to clear some stale NFS operations that were caused by the news server
outage.
- Nov 25 1997
- pm-1, the first Portmaster in Litchfield, went down for about an hour
this evening; it had to be power cycled by hand. Some users calling in to
Litchfield during this period would have seen an initial modem connection
and a hangup shortly thereafter.
Apologies for the problem.
- Nov 18 1997
- We're experiencing trouble (large numbers of CRC errors)
on one of the two T1s between our
Manchester office and Litchfield. That T1 has been taken out of
service and given to Bell Atlantic for testing/repair. While it is down,
connections to our servers may be sluggish. (Note that taking it
out of service made the connections much better than while it was
in service and causing problems.)
Update 7:31pm: Bell Atlantic tested the line and found no errors, and
when we put it back in service it ran clean. It's back, but we'll be watching
it.
- Nov 17 1997
- As of about 2:15 PM, our SprintLink connection went down due to
an unspecified outage in Texas. Our traffic has automatically
switched to our other Internet link, but you may notice a slowdown
while the SprintLink connection is down. We have no ETR as yet.
Update: Back up at about 5:00 pm
- Nov 9 1997
- Our voicemail/phone system will be down Sunday morning for some
maintenance.
- Nov 3 1997
- The new mail server was physically moved from its temporary
location to its real spot; this occurred at about 13:23 and required
about 5 minutes of down time.
- Nov 1-2 1997
- It appears that the weekend weather had some effect on our
Concord modem lines, as we had a number of reports indicating inability
to connect. It's not clear whether there was a problem in the lines
or our modems, or both; however we did reinitialize modems that were
not in use.
- Oct 25 1997
- On Saturday Oct 25 there will be a power outage at the Nashua POP from
about 8AM through 9AM due to work by PSNH and the building maintenance
people, and possibly again at about 5PM through 6PM. Each outage
is likely to be about an hour, which will probably be longer than
our battery backups can sustain our equipment -- so we do expect
some loss of service during these periods.
- Oct 25 1997
- The web server (www.mv.com et al) will be taken down tonight -- the
night of October 24/25 -- at about 1AM for addition of disk space.
The process will involve replacing one of the disks, and will take
2-3 hours.
- Oct 23 1997
- POP mail service will be moved to the new mail server during the
wee hours of the morning on October 23 (that's the night of the
22nd/23rd). Starting at about 2AM, mail service to both the old
mail server and the new server will be shut off so that the mailboxes
and accounts can be transferred. This may take 1-2 hours. Please
make sure that you are referencing the mail and pop servers by the
proper names (mail.mv.net, pop.mv.net, etc) so that you will not be
affected by the change in the underlying servers.
- Oct 16 1997
- At about 17:40, pm-1 (the first terminal server in Litchfield)
went down and had to be manually rebooted. It was back up again at
about 18:15.
- Oct 2 1997
- Shortly after 10pm, mv.mv.com (the shell system) crashed because of a
filesystem error. When the system was brought up, we decided -- mainly out
of paranoia -- to give it another emergency reboot just to make sure the
filesystems were clean. This went without error.
- Sep 30 1997
- At noon, PSNH is going to replace a meter at our Manchester
office. Supposedly, this will only affect a small portion of
our office area, not including any critical systems. However,
our servers are all on battery backed-up power, so even if there
is a brief loss of power to the servers they should be unaffected.
We think it's wise to post this warning anyway.
- Sep 25 1997
- Connectivity between Salem and Litchfield was lost at about
17:35. We remotely reset the T1 port on the Salem router and
the link came back up at about 17:50.
- Sep 23 1997
- zircon, the second news server, went down at about 12:20AM,
causing some NFS problems on the shell server. zircon was back up
within about 45 minutes, and the NFS problems cleared some time
after that.
- Sep 21 1997 10AM
- The shell system mv.mv.com will be taken down Sunday, Sep 21,
at 10AM for maintenance -- replacement of a disk with a larger one.
Downtime should be between 1 and 2 hours.
Update: mv.mv.com was back up at about 1:45 pm; data
transfer took longer than predicted.
- Sep 3 1997
- Nashua: There was a misconfiguration of the second
Nashua portmaster (the terminal server that you log into) that
prevented users with dynamic IP addresses from logging in.
(Users with static IP addresses were not affected.) If you have
a dynamic IP account, and you reached the second terminal
server, your call would get dropped without logging in; the
next time in you might get the first terminal server and
get in OK. This problem has been corrected; apologies for
the inconvenience.
- Aug 25 1997
- Nashua: The Nashua router went down at about 10:00 this
morning with a hardware failure. A spare router was swapped in and
the site was back up as of about 12:15; however, the spare router cannot
maintain our link to Destek, so that link is still down as of this time.
We are working on repairing the original router and hope to have it
back in service soon.
Update 7:50pm: The original router is back in place and
all links are up.
- Aug 24 1997
- The Frame Relay link from Litchfield into NYNEX's frame cloud began
flapping at about 3:00pm and eventually went down hard at about 3:30.
NYNEX has been contacted to repair the situation; in the meantime the
Keene MV location and all MV Frame Relay customers will be affected.
Update 5:30pm: NYNEX has acknowledged that the problem
appears to be in their frame relay switch and that a lot of NYNEX's
frame relay customers are affected by this outage. They are still
working on restoring service.
Update 11:00pm: After some up and down periods, the link
appears to be up as of about 9:30pm. However, NYNEX has not yet closed the
trouble report.
Update Aug 25: NYNEX reports the trouble was a software
problem in their frame relay switch in Manchester; they report the
problem repaired now. When they say "software problem" they mean that
it was not a hardware failure: it could have been a configuration
error.
- Aug 17 1997
- mv.mv.com (the shell system) crashed at about 4:10 AM; it was back up
at about 10:45. Some time during this interval the nameservers and account
login authentication also stopped working as a direct result of mv.mv.com being
unavailable.
- Aug 13 1997
- Concord: All equipment in Concord went offline at about
10:45pm due to an apparent power outage or surge at that location. The
outage lasted just a minute or two while the equipment rebooted.
- Aug 13 1997
- Concord: Following the lead of pm-1 a week ago, the
pm-cnh-2 portmaster repeatedly rebooted itself until it could be
manually powercycled. This affected about 20 modems for 2-3 hours
during the early evening.
- Aug 6 1997
- Litchfield: The pm-1 portmaster repeatedly rebooted itself
from about 12:10AM through 12:55AM, when it was powercycled by hand.
- Aug 4 1997
- The mail server crashed again this morning at about 10:20.
We took this opportunity to replace its disc controller and its
internal disc cabling, something we were going to schedule for a
later time, but since it was only a 10-minute operation and the
system was down, we did it; it was back up at about 10:50.
- Aug 2, 1997
- The mail server was down for about an hour and a half Saturday
night starting at around 10pm when it crashed for an as-yet-unknown
reason. It needed to be manually restarted.
- July 24 1997
- Dover: The T1 link to Dover went down at about 9:45 AM (along with
another, customer T1 circuit at the same time). These have been given to
NYNEX; no ETA as yet.
Update: The link came back at about 11:23 after some
preliminary testing by NYNEX.
- July 22 1997
- Peterborough:
We're going to be doing some minor site work in Peterborough the morning
of Tuesday, July 22, from about 9:30 to 10:30, similar to what
was done in Keene last week. The work involves
adding some power equipment and changing the ethernet slightly.
This will require shutting the site down for a few minutes some
time during this interval; the outage should be brief.
- July 20/21 1997
- SprintLink:
At midnight the night of July 20/21, SprintLink is making a change
to their router configuration (specifically, they are changing the
AS number of the BGP peer). This may result in a routing outage
via SprintLink as the routers need to resynch.
- T1 to Bridge Street
- We're seeing a lot of lag in the T1 to Bridge Street (which
connects the servers to the MV network). This is mostly noticeable
by shell users; however we'll be working to correct this problem
and there may be some brief outages for test/repair purposes.
Update June 19 23:30:
NYNEX will be testing this circuit from 4AM thru 5AM early Friday
morning June 20. During this time all MV servers will be unavailable.
Update: NYNEX test did not reveal any problems.
Update June 24: Problems continue on this T1; late this
afternoon it experienced severe congestion as well as underlying data
errors. NYNEX has also noted the errors and will be sending a
technician to look at the Manchester end of the circuit in the
morning of June 25.
Update June 25: The link was running clean this morning,
no errors and no congestion (although with a normal load); there was
no point in NYNEX work being done without errors being present. Hopefully
when errors reappear, we can get them to work on the line. In the
meantime, we've ordered a second parallel T1 for redundancy and failover.
As always, see mv.info.outages for more info.
Update July 23: Second T1 has been installed and is
up and running; this should help with the performance to Bridge Street.
- July 18 1997
- Keene:
We're going to be doing some minor site work in Keene the morning
of Friday, July 18, from about 9:30 to 10:30. The work involves
adding some power equipment and changing the ethernet slightly.
This will require shutting the site down for a few minutes some
time during this interval; the outage should be brief, but we'll
gain a cleaner on-site ethernet and better power backup.
Note: The posting here originally said that we
were also planning to do the same in
Peterborough, but that has been postponed to a later date.
- July 17, 1997
- You may find problems around the net today (as well as part of
yesterday) due to two major fiber cuts in the US. Also, early
morning today some of the root nameservers on the Internet became
corrupted. We took steps to avoid referencing the root nameservers
that are bad, but there can still be problems with other nameservers
and other sites.
- July 13 1997
- The mail server was down from about 3:00 AM through 9:30 AM
Sunday July 13.
- July 4 1997
- Concord: There was an apparent power outage
in Concord late July 3 which was very brief. The outage left some modems
misconfigured; callers to Concord may have gotten "garbage" when trying to
log in. These modems have been reconfigured as of afternoon July 4.
There may be one or more modems that did not come back on after the
outage, and we'll be sending someone up on Saturday to make sure
all modems have power.
Update July 5: On inspection, all modems were on.
- June 23, 1997 19:30
- The news server will be down / is down starting at about 19:30
tonight, while we rebuild its filesystems. It has been crashing
during the overnight expire recently and this is intended to help.
Update 22:00: Filesystems were rebuilt and the
server has been running an expire for about an hour. Will be
watching to see if this procedure has cured the patient, or if
something more invasive is required.
Update June 24: The rebuild appears to have cured
the problem. The system has remained stable through the last
two expires, and lots of new news is being processed.
- June 16, 1997
- We're moving our staff office and servers from Litchfield to
Manchester starting at about 1AM Monday morning (Sunday night).
This will result in a couple of
hours of downtime for all the servers, including news, web, mail,
and login authentication. For more info see the "news" category.
- June 6, 1997
- Nashua: There was a component failure in the router at
the Nashua POP and as a result the router went down at about 1:30AM.
The T1 to Nashua as well as peering with Destek is out as a
result.
Update 4:15 AM: A replacement router was swapped in to
restore the T1 connectivity between Litchfield and Nashua. This
router is not capable of also restoring the peering with Destek,
which will remain down until the original router is mended and
returned to service. We expect that to occur this afternoon
or evening (Friday, June 6) -- be aware that when the original router
is put back into service there will be a short disconnection of
the T1 circuit between Nashua and Litchfield as that circuit is
moved from one router to the other.
Update 9:15PM: Original router back in service.
- June 1, 1997
- Nashua:
There is a scheduled power outage for the building in Nashua on
Sunday, June 1 that
is supposed to last approximately 4 hours. The time of day for
this outage has not yet been set. We'll try to gang up some
battery power equipment for the event, but I suspect the outage
will outlast the batteries. More when we know more.
Update: Outage is supposed to be 7AM - 11AM.
- May 24 - 26, 1997
- We'll be doing some minor miscellaneous stuff over the weekend,
including power work on several of the POPs and on a couple of servers,
and replacement of some I/O card(s) in some of the servers.
The power work mainly involves changing the source of the electrical
power, i.e. unplugging and plugging things, but this will require
rebooting the equipment involved. Those outages should be very
brief.
- May 12, 1997
- news.mv.net:
The news3 disk on news.mv.net is giving SCSI errors. (This disk
holds the alt.binaries tree, mostly.) Because of this, no news
is being processed by the news server. We will be looking into
the problem this evening; it may be necessary to replace the
disk.
Update 21:15: Problem appears to have been a stuck
SCSI interface on the drive in question. Powercycled the
beast and it's up and running. Will watch it for further problems.
- May 12, 1997
- Peterborough:
We will be replacing some communications equipment at about 1pm, and
there will be a brief (1-5 minutes) loss of service during this time.
Status: Switch went without a hitch.
- April 30, 1997
- Some MV customers were unable to reach most of the Internet some time
between 8:30PM and 9:30PM, caused by a configuration error at MV while we
were preparing for some routing changes.
- April 27, 1997
- The mail server went down at about 4:30AM Sunday morning and was
brought back at about 11:45AM.
Note: The outage was due to a mail bombing attempt that
flooded the server. The originating address has been blocked
from the mail server.
- April 26, 1997
- At about 11AM the mail server will have a motherboard upgrade.
Downtime should be about an hour.
- April 25, 1997
- There was an Internet outage from about noon to 2PM today
caused by a company in Virginia injecting bad routing information
into the internet backbone(s). This had a global impact that was
rapidly corrected.
- April 24, 1997
- The link to SprintLink is down as of about 1:40AM. SprintLink
is working on the problem.
Update: Up again as of 3:25AM, problem corrected by
Sprint.
- April 19 1997
- There was a power outage in Litchfield from about 4:15AM
to about 9:30. Our systems were back up by about 10:30. We
took this opportunity to do a move that was scheduled for
Sunday, so that move has been cancelled.
- April 20 1997
- We're going to be moving and rebooting a couple of systems on
Sunday at 1pm for electrical power reasons. These include
mv.mv.com and sard.mv.net, and they should only be down for
10 minutes or so.
Note: this was done during the power outage Saturday morning
instead.
- April 16 1997
- There was a power outage in Litchfield this morning and several
systems had to be rebooted.
- April 3 1997
- At least a dozen phone lines in Litchfield went dead on Thursday.
Half of these were in the hunt group (toward the end). NYNEX is working
on getting them restored.
April 4: fixed.
- April 2 1997
- We brought the mail server down at about 16:10 today because we had
reason to believe that one of its RAM chips had a problem. We swapped
out and replaced its 64MB of RAM and brought it back up. Outage was
slightly over a half hour.
- April 1 1997
- The Peterborough site is without power as of about 9:45AM
Status: Back up at 1:05 PM
- March 26 1997
- There was an RNA (ring-no-answer) line in Concord, s15@pm-cnh-1.
This was repaired today. We have also discovered an RNA in Nashua
(s16@pm-nnh-1) and one in Salem (s13@pm-snh-1). These should be fixed
within a day.
Update: s13@pm-snh-1 was not RNA, it had been inexplicably
busied out by NYNEX. So while it wasn't answering, it was being
skipped altogether and you wouldn't have gotten a ring on it.
Fixed now.
Update March 27: s16@pm-nnh-1 is also fixed.
- March 22-23 1997
- Mail server crashed at about 11:40pm March 22 as a result of one of the
filesystems getting filled (this is a filesystem that mainly contains the
log data plus some executables). System had to be manually rebooted and was
back up at about 12:30AM March 23.
- March 19 1997
- Dover: The final part of the office move is scheduled for
Wednesday morning, March 19. At this time new lines are also due to
be installed. The move may bring us down for several hours in the
morning.
Status March 19: The move went well. However NYNEX
was unable to complete the install of the new lines by early
evening; they will be back in the morning to finish.
March 20: All new lines are up and working.
- March 16 1997
- Litchfield: We'll be doing some cabling work in Litchfield
starting at about 1PM on Sunday. This may result in some brief
stoppages while things are disconnected and reconnected.
- March 14 1997
- Peterborough: New lines are due to be installed in Peterborough.
The new modems should be online either Friday or Saturday.
Update: NYNEX arrived on site and discovered they did not
have enough facilities to complete the order, so they will be
rescheduling it. Supposedly this was checked in advance, but they
must have missed something.
- March 10 1997
- Dover: As part of the upcoming office move in Dover, we'll be
moving the dedicated circuits (the T1 and other leased lines) Monday morning.
This is scheduled to start at about 8AM. At that time, NYNEX will run some
new cables. When they are done with that, we'll move the equipment and the
leased lines, and there will be an outage while this part is happening. We
expect the outage to be minimal, on the order of a half an hour.
- March 9 1997
- Web server: The web server will be taken down at 1PM Sunday,
March 9, for installation of more disk space. Outage should be less than an
hour.
- March 2 1997
- Salem: we'll be swapping in new equipment in Salem in
preparation for the installation of new lines later this week.
Downtime should be about an hour. (The site will only be totally
down for 10-15 minutes, and then modems will come back online
one at a time.)
- Feb 27 1997
- The Nashua dialin was unavailable from about 3pm to 3:30pm.
We are in the process of establishing a direct connection to another
provider in the building; during some of the wiring process the
ethernet was inadvertently broken.
- Feb 25 1997
- Peterborough: the last 10 ports of the portmaster are not taking
PPP connections. These ports are all on the same 10-port card so it
appears that the card has flaked out. We'll be working to get it
fixed or replaced, but in the interim we'll be 9 ports short
(one of them isn't in use). Some of these ports are assigned
to dedicated dialup users, who will need to use the hunt group
temporarily.
Feb 26: The Peterborough portmaster will be
replaced at about 6:30pm. (Note: the failing unit is a spare that
replaced the original portmaster there; that portmaster was taken
out of service a couple of weeks ago because it was rebooting
spontaneously. We reflashed the ROM in that unit and re-installed
the software, and that unit is going back into service tonight.)
7pm: Portmaster has been replaced.
- Feb 22 1997
- Wind and weather are playing havoc with Southern New Hampshire
and we have seen power outages so far in Nashua, Salem, and Dover.
Nashua was out from about 3pm to 3:30; Salem and Dover went down at
about 3:20. Dover is back up as of about 6:15. Salem is back as of
about 7:30pm.
- Feb 20 1997
- Our Concord equipment will be moved to a new room (just down the
hall from where it is now) during the morning of Thursday, February 20.
Service will be out for several hours while the move is underway.
Note: this was originally scheduled for Feb 18 but was rescheduled
due to order processing delays at NYNEX.
Note2: The time of day has been fixed: the work will begin at about
11:30AM and hopefully will take only about 2-3 hours.
Followup: the move is complete. While many modems were up and
running after 3 hours, the last was done at about 6:30pm.
- Feb 12 1997
- The Keene node went down at about 3:30 pm and came back up
at 5:10. The problem was found to be a broken wire in a Manchester
NYNEX facility.
- Feb 1 1997
- The Peterborough terminal server has been acting up lately
and we're going to swap it out Saturday morning at about 10 AM.
We expect this to take about a half hour
(and it did).
- Jan 31 1997
- The mail/POP server at Litchfield crashed at about 6:30 this
morning; it was back up a little after 9:00.
- January 14 1997
- The portmaster in Peterborough began misbehaving around noontime today.
It went into a cycle of spontaneously rebooting itself, disconnecting
anyone who was logged in. This was happening every few minutes.
We went to the site and power cycled it at 2:00PM and it appears
to be stable again.
- January 8 1997
- The web server (the one you're reading this from right now) will
be moved to a new system in the wee hours of Wednesday, January 8.
Starting at 1AM, this should take no more than a couple of hours.
During this move, you will not be able to update your web pages
through your POP mailboxes (although you may be able to access
the web pages over the web for some of this period). The purpose
of this move is to change to a new system with a later rev of the
operating system.
Move completed approx 4:00 AM.
Copyright © 1998 thru 2008 MV Communications, Inc.