Strike Two for O'Hare Radar...

greenspun.com : LUSENET : TimeBomb 2000 (Y2000) : One Thread

This is the second time the Y2K-compliant radar system has failed when put into operation in a production environment. The first attempt, about 4-6 months ago, had similar results.

The URL may change in a few days as the article is archived.

http://chicagotribune.com/news/nationworld/article/0,1051,SAV-9905060162,00.html

FLIGHTS STACK UP AS RADAR BUGS OUT

By Jon Hilkevitch and James Janega, Tribune Staff Writers. Freelance writer Carolyn Rusin, at O'Hare Airport,... May 6, 1999

The Federal Aviation Administration Wednesday warned that an order slowing traffic at O'Hare International and Midway Airports could lead to serious disruptions in air travel as the agency addresses problems with two radar systems that forced flight cancellations and may have contributed to a near-collision between two planes.

The mandatory air-traffic slowdown--which during the next three weeks will limit the number of planes arriving and departing from the Chicago region in peak morning and early evening periods--could have a major impact on airline schedules across the nation, particularly during poor weather, the FAA and airlines acknowledged.

Travelers at both Midway and O'Hare were inconvenienced Wednesday for several hours when flights were temporarily grounded because of "software glitches" in a radar program at the FAA's traffic control facility in Elgin, officials said.

United Airlines was forced to cancel 100 flights--25 percent of its daily schedule in and out of O'Hare. American Airlines canceled 34 flights, but reported long delays on many others.

The slowdown order came amid reports that a cargo jet Tuesday came within 300 feet of a Southwest Airlines plane that had departed Midway. The FAA blamed the near-collision on controller error, but the controllers union said unfamiliarity with a new system played a role.

The computer bugs at the Elgin center involved an aircraft-tracking system that was returned to service only last month after being unplugged late last year because it repeatedly misidentified, or "ghosted," the location of planes on air-traffic controllers' screens.

A revision of the software called ARTS 6.05 was loaded onto computers Tuesday night. But on Wednesday, vital information that tells controllers the identity, speed and altitude of airplanes began to temporarily disappear from radar screens, FAA and union officials said.

After efforts to fix the problem failed, FAA management and the National Air Traffic Controllers Association agreed to deactivate the finicky ARTS 6.05 system and replace it Thursday morning with a retired version, ARTS 6.04, which had a solid track record. However, the older system is not Year 2000 compliant, forcing the FAA to quickly correct the glitches in the new, Y2K-compliant ARTS 6.05 and get it back on-line soon, said FAA spokesman Tony Molinaro.

The FAA also needs a reliable ARTS 6.05 system in order to resume a controversial experiment this summer in which airliners heading toward O'Hare are directed toward the airport "piggyback-style" rather than single-file in order to speed arrivals and maximize O'Hare's limited runway capacity. U.S. Transportation Secretary Rodney Slater halted the piggybacking last November after reading reports in the Tribune about the potential dangers of the operation.

The computer problems at Elgin caused operations at O'Hare and Midway to be halted twice Wednesday, once for five minutes and a second time for 45 minutes, disrupting airline schedules for most of the day.

"When I heard about the air-traffic system going out, it was scary and made me nervous," said Richard Trieber of San Francisco, whose flight from Toronto to O'Hare was delayed for nearly three hours. "I mean, this is O'Hare we're talking about, not some little airport."

At the Elgin facility, visual readouts on the radar screens, which identify and locate the position of aircraft, froze up for a few seconds at a time and dropped data before coming back up, the FAA said, requiring controllers to estimate the new positions of aircraft until the system recovered.

As a precaution, the FAA said, all aircraft at O'Hare and Midway were grounded and traffic already in the airspace controlled by Elgin was sent elsewhere until the problems stopped.

"We scrambled, we stopped airplanes and we made lots and lots of delays," said Kurt Granger, controllers union president at the Elgin facility. "It looks like this is going to be a long summer. My advice is that if you've got time to spare, go by air."

The FAA, meanwhile, attributed Tuesday's near-collision between the Southwest Boeing 737 and a Federal Express cargo plane to an error by a controller at the Chicago Air Route Traffic Control Center in Aurora. The controller ordered the Fedex Boeing 727 to descend to 28,000 feet--directly in the path of the Kansas City-bound Southwest jetliner, which was flying level at 28,000 feet and took evasive action after the cockpit crew was alerted by a collision warning system.

The controllers union blamed the near-collision, which occurred over Missouri, in part on a lack of training and unfamiliarity with new radar scopes and work stations in Aurora. On Monday, the new hardware was used for the first time to direct air traffic, the same day traffic slowdowns were instituted as an added safety margin.

"The lighting is different, the radar scopes are different and a lot of the buttons and knobs on the old system have been replaced by keyboard functions that require controllers to keep their heads down and off the screen longer," said Ron Downen, controllers union president at the Chicago Center.

"Unfortunately, the controller was thinking about equipment issues rather than thinking about separating the aircraft," Downen said.

Molinaro agreed that it will take some time before controllers feel comfortable with the new system, called the Display System Replacement. Because of the complaints, which include controllers having to take their eyes off the radar screen to use the computer's trackball, the traffic restrictions will remain in place in the Chicago-area airspace.

But Molinaro disputed that training or readiness was a factor in Tuesday's near-collision.

"It's not that the displays on the new system are vastly different," Molinaro said. "And besides, the controller made the error while his duties were extremely light--he was responsible for only one airplane in his sector."

The airlines are officially supporting the FAA's decision to slow the pace of operations. Behind the scenes, however, the Air Transport Association, which represents major U.S. carriers, has been pressuring FAA Administrator Jane Garvey to ease the restrictions, which are expensive for the airlines and frustrating for their customers experiencing long delays and missed connections.

"We're 60 to 90 minutes delayed," American spokeswoman Mary Frances Fagan said late Wednesday afternoon. "You're trying to push them out of here or turn them around to get out on time, but you can't make up that much time. You can only do your best."



-- Plonk! (realaddress@hotmail.com), May 06, 1999

Answers

Let's see the Pollys' response to this one... see what happens when you put systems back into production that haven't been fully tested??

Let's see if they can get it fixed in 72 hours. Ready...set...go.

R.

-- Roland (nottelling@nowhere.com), May 06, 1999.


A near miss (and RD Herring will correct me if I'm wrong) is when two planes come within 5 miles of each other at similar altitudes? Sounds reasonable, until you consider the planes are moving 200-600 miles/hour, which places them roughly 30-90 seconds apart.
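A rough check of that timing, taking the 5-mile figure and the 200-600 mph range from the post as given. This is only a sketch: the function name is made up for illustration, and the head-on case simply assumes both aircraft fly at the same speed straight toward each other.

```python
def seconds_to_cover(distance_miles, speed_mph):
    """Time in seconds to cover distance_miles at a constant speed_mph."""
    return distance_miles * 3600.0 / speed_mph

SEPARATION_MILES = 5.0  # separation figure quoted in the post

for speed_mph in (200.0, 600.0):
    # One aircraft closing on a fixed point 5 miles away.
    single = seconds_to_cover(SEPARATION_MILES, speed_mph)
    # Assumed worst case: two aircraft head-on, so closing speed doubles.
    head_on = seconds_to_cover(SEPARATION_MILES, 2.0 * speed_mph)
    print(f"{speed_mph:.0f} mph: {single:.0f} s alone, {head_on:.0f} s head-on")
```

At 200 mph a single plane needs 90 seconds to cover 5 miles, and at 600 mph it needs 30 seconds. Under the head-on assumption those times are halved.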

Thank God for Collision-detection systems. Planes really could have fallen out of the sky.........

-- Plink! (not@real.address), May 06, 1999.


And who said planes won't fall from the sky........huh ?

The FAA, meanwhile, attributed Tuesday's near-collision between the Southwest Boeing 737 and a Federal Express cargo plane to an error by a controller at the Chicago Air Route Traffic Control Center in Aurora. The controller ordered the Fedex Boeing 727 to descend to 28,000 feet--directly in the path of the Kansas City-bound Southwest jetliner, which was flying level at 28,000 feet and took evasive action after the cockpit crew was alerted by a collision warning system.

-- (innxxs@yahoo.com), May 06, 1999.


I'll respond to this - the air traffic control systems SUCK! They have been problematic for quite some time, apart from Y2K, and their attempted overhauls have been classic examples of gross mismanagement of large-scale systems projects. Sooner or later, I think there will be a mid-air collision due to a failure in one of these systems.

Personally, I would like to see the people that code the (error-free) software for the Space Shuttle get involved in building a new ATC system. Without the FAA, IBM, EDS, Andersen Consulting or any other self-interested bureaucratic organization getting involved and mucking it up.

But this still doesn't mean that Y2K is going to be that bad! Another huge, nonlinear leap.

-- Polly (skippy@innermongolia.com), May 06, 1999.


What you do not realise is that the old computers had been made Y2K OK in a short time last year because they were so old that they still ran on tubes and servo systems. Completely analog, except for a small digital overlay, which was fixed in no time. People not understanding this demanded more. So new digital computers have been put in place too soon, with problems. Not Y2K problems, but the usual problems found in new computers and software, which are normally tested for long periods of time before being allowed to run such a sensitive (safety, in this case) system. Do not confuse FAA with ATC. ATC controls the aircraft traffic; FAA does many things besides overseeing ATC. If FAA says their computers are 50% fixed, yet last Sept. ATC said their computers were 95% fixed, understand they are talking about two completely different things.

None of this is based on opinion, these are the facts.

-- Cherri (sams@brigadoom.com), May 06, 1999.



TO: POLLY, Cherri:

The gist of this article is this:

The current ATC system is...OK, it works.

However: It is NOT Y2K compliant. Now, does that mean that the system will just spit up the wrong date on a screen? Or will the system crash and refuse to operate? I don't know.

But what I DO know is that this "UPGRADE" to the Radar system in Chicago was attempted last fall. It failed miserably, it was yanked to be "debugged". The EXACT same story!

Now, 6 months later, they are trying it AGAIN!

Surprise! It "may" be Y2K compliant, but it's full of supposedly fixed flaws.

The article SAYS they are attempting to upgrade from ARTS 6.04 to ARTS 6.05.

What happens to O'Hare's traffic control system if ARTS 6.05 cannot be implemented by 01/01/2000?

I would guess that O'Hare would either shut down, or operate at a TREMENDOUSLY reduced rate.

The radar system they are currently using EXPIRES on 01/01/2000

Do you understand that? Did you even read the article?

Maybe they can set the clocks back.....:-)

because it sure doesn't look like they're gonna have a working replacement up and running by then.

Time will Tell.

-- PLONK! (realaddress@hotmail.com), May 06, 1999.


And this is only two airports out of all of them - at every airport, the "controllers will have to be retrained in the new system...", the new system will need to be installed, tested, ...

How many days left? Air traffic affects UPS, FedEx, USPS, business shipments as much (if not more) than passenger only.

-- Robert A. Cook, PE (Kennesaw, GA) (cook.r@csaatl.com), May 10, 1999.


USPS is looking into using ground transportation more than air.

-- the postman (always@rings.twice), May 10, 1999.

I think I'm remembering this correctly.

Wasn't it IBM that was urging and pleading with the FAA and others to get rid of their systems because IBM knew their systems were not compliant and could not be made compliant?

Didn't IBM run full page ads to this effect?

Didn't the ad say replace these systems with our stuff or someone else's stuff, but replace those systems because they simply won't work after the turn?

What did the FAA do? To paraphrase, my recollection was they said, "naw, we can fix 'um" and they've been trying to ever since.

Wasn't the FAA supposed to be compliant already?

Mike

-- Michael Taylor (mtdesign3@aol.com), May 10, 1999.


Michael,

I think the FAA's "next" stated Y2K compliancy date is June 30th.

*Sigh*

Diane

-- Diane J. Squire (sacredspaces@yahoo.com), May 10, 1999.



NATCA (Nat. Air Traffic Controllers Assoc.) had at one point in time a LONG text describing the history of the air traffic computers and the problems with them. It is no longer online, unfortunately (just checked). Needless to say, it wasn't a cheerful text. Where Cherri is getting her facts, I would love to know. I mean that in all sincerity and would love some URLs to read, particularly where the system used analog computers(?!?). One of the main thrusts of the NATCA text was that the units were operating beyond their operational life and replacement parts were no longer available. I.e., if some items have a MTBF of 50,000 hours and you're at 49,000 hours, you've used up your spare parts on previously failed units, and no more spare parts are available, it's time to start worrying big time.

If you want to piece together the history, go to North's site under Transportation and search for FAA. It will take you a couple of dozen articles or so (which is why I wish that NATCA would have left their text on line) but you'll get the drift.

-- Ken Seger (kenseger@earthlink.net), May 11, 1999.


Ken - It's more like the Mean Time Between Failures is 50,000 hours, and the FAA is operating them at 120,000 hours. These things ARE breaking down from old age and previous temporary repairs, and technicians are already salvaging parts from other computers and terminals.

The MTBF is only a "predicted" failure interval - some (many) will last longer than their predicted failure times. But they WILL eventually fail. Reliability theory simply predicts an ever increasing failure rate as lifetimes are extended past the nominal "end of use" date.

There are several equations that can be used to predict future failure rates of this kind of equipment - Weibull, exponential, constant, linear increasing, various quadratics, etc. - depending on the measured failure intervals and the number of existing units in service. But on a practical basis, without exact maintenance histories and exact "time in service" records, you and I are just guessing.
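To make the difference between those models concrete, here is a minimal sketch of the Weibull failure (hazard) rate, using the 50,000-hour MTBF and 120,000-hour service figures mentioned in this thread. The shape values are purely hypothetical, not measured from any FAA equipment: shape 1.0 reduces to the exponential (constant-rate) model, and shape 2.0 stands in for a wear-out regime with a linearly increasing rate.

```python
import math

def weibull_scale(mtbf_hours, shape):
    """Scale parameter chosen so the distribution's mean equals mtbf_hours."""
    return mtbf_hours / math.gamma(1.0 + 1.0 / shape)

def weibull_hazard(t_hours, mtbf_hours, shape):
    """Instantaneous failure rate (failures per hour) at age t_hours."""
    scale = weibull_scale(mtbf_hours, shape)
    return (shape / scale) * (t_hours / scale) ** (shape - 1.0)

def weibull_survival(t_hours, mtbf_hours, shape):
    """Probability that a unit is still working at age t_hours."""
    scale = weibull_scale(mtbf_hours, shape)
    return math.exp(-((t_hours / scale) ** shape))

MTBF = 50_000.0          # hours, figure quoted in the thread
AGES = (50_000.0, 120_000.0)

for shape in (1.0, 2.0):  # hypothetical shape parameters
    for age in AGES:
        rate = weibull_hazard(age, MTBF, shape)
        alive = weibull_survival(age, MTBF, shape)
        print(f"shape={shape} age={age:.0f} h: rate={rate:.2e}/h, surviving={alive:.3f}")
```

With shape 1.0 the failure rate stays at 1/50,000 per hour no matter how old the unit is. With shape 2.0 the rate at 120,000 hours is 2.4 times the rate at 50,000 hours. Which shape actually applies can only be estimated from real maintenance and time-in-service records.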

I'd rely on the original IBM letter - before it got politicized and "updated" by FAA management - as the best indicator. Listen to the ATC union members - they know what's going on with their gear - the FAA management are following political directives.

This is only a publicity stunt.

-- Robert A. Cook, PE (Kennesaw, GA) (cook.r@csaatl.com), May 11, 1999.


Here is an article. Unfortunately the link is now dead. It has been almost a year. As for being analog... Vacuum tube mainframes are old analog technology. I've worked on them myself. I have also talked to people who have worked on them recently. About two weeks ago I was talking to a man who helps find parts for them; we got into a detailed discussion about the search for spare vacuum tubes. Who else remembers when there were vacuum tube checkers in stores? Now none can be found. Got any 12AT6's?

Date: Wed, 22 Jul 1998 09:14:09 -0500
Subject: Air Traffic Control Computer System Cleared for 2000

http://www.washingtonpost.com/wp-srv/WPcap/1998-07/22/014r-072298-idx.html

Air Traffic Control Computer System Cleared for 2000
IBM Warning Prompted Tests

By Rajiv Chandrasekaran
Washington Post Staff Writer
Wednesday, July 22, 1998; Page A15

ATLANTIC CITY, N.J. -- Federal Aviation Administration technicians have concluded that a critical mainframe computer system used in the nation's largest air traffic control centers will function properly in the year 2000, despite warnings from the system's manufacturer that the agency should replace the equipment.

The determination, reached over the past few weeks by programmers at the FAA's technical center here, has elicited cheers from agency officials, who had been castigated by congressional investigators earlier this year for not planning a quick replacement of the systems.

"The examination has revealed that the [system] will transition the millennium in a routine manner," FAA Administrator Jane F. Garvey said in an interview yesterday.

The mainframe computers at issue, made by International Business Machines Corp., are used at the FAA's 20 air route traffic control centers to track high-altitude aircraft between airports. The computers, IBM's Model 3083 mainframes, receive data from radar systems and integrate that information into a picture for air traffic controllers.

Last October, IBM sent a letter to the FAA warning that "the appropriate skills and tools do not exist to conduct a complete Year 2000 test assessment" of the 3083 computers, once the mainstay of large corporate data centers. The machines have been mothballed by most users, a step IBM urged the FAA to take.

Although the FAA plans to replace the mainframes as part of a broader modernization effort, agency officials were unsure they could complete the process by 2000. As a result, they embarked on an aggressive testing program to figure out how the computer system would be affected.

Most mainframes use a two-digit dating system that assumes that 1 and 9 are the first two digits of the year. Without specialized reprogramming, it was feared that the IBM 3083s would recognize "00" not as 2000 but as 1900, a glitch that could cause them to malfunction. The federal government and private companies are racing to fix other computers to avoid the year 2000 problem.

To conduct the testing, the FAA hired two retired IBM programmers and assigned a handful of other agency employees to the project, which involved checking more than 40 million lines of "microcode" -- software that controls the mainframe's most basic functions. Among the initial areas of concern was whether a date problem would affect the operation of the mainframe's cooling pumps. If the computer does not regularly switch from one cooling pump to another, it can overheat and shut down, causing controllers' radar screens to go blank.

The technicians, however, found that the microcode doesn't consider the last two digits of the year when processing dates. Instead, it stores the year as a two-digit number between one and 32, assuming that 1975 was year one. As a result, they determined, the system would fail in 2007, but not in 2000.

"Nothing we have found will cause an operational aberration over the new year. It will continue to function as it's supposed to," said one FAA technician working on the project. FAA officials recently allowed a reporter to tour the facility here and talk to employees on the condition that they not be named.

"We're dealing with minutes and seconds in air traffic control," said another technician. "The systems don't really care about days and years."

The programmers did find four software modules that need to be repaired to handle the leap year in 2000, but they said the task would be relatively straightforward.

While the technicians came to their conclusions a few weeks ago, Garvey only recently was briefed on the findings. The results, sources said, have not yet been shared widely within the Transportation Department or with lawmakers.

Agency officials acknowledge their determination will be met with skepticism on Capitol Hill and in the aviation industry. To bolster their case, the technicians said they have compiled reams of computer printouts that back up their conclusions.

The findings highlight one of the uncertainties of year 2000 repair work. While some projects can be more costly and time consuming than originally expected, others can be unexpectedly simple.

"It's a welcome surprise," Garvey said. "And we don't get many of them in government."

© Copyright 1998 The Washington Post Company
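The date handling described in the article can be sketched roughly as follows. This is an illustration only, not the actual 3083 microcode: the function and constant names are invented, and the only facts taken from the article are that year one is 1975, that the field holds values from 1 to 32, that a naive two-digit scheme reads "00" as 1900, and that 2000 is a leap year needing special attention.

```python
BASE_YEAR = 1974   # chosen so that 1975 encodes as year 1, per the article
MAX_OFFSET = 32    # largest value the article says the year field holds

def encode_year(calendar_year):
    """Store a calendar year as an offset from BASE_YEAR (1..32)."""
    offset = calendar_year - BASE_YEAR
    if not 1 <= offset <= MAX_OFFSET:
        raise ValueError(f"year {calendar_year} does not fit in the offset field")
    return offset

def decode_year(offset):
    """Recover the calendar year from a stored offset."""
    return BASE_YEAR + offset

def naive_two_digit_year(yy):
    """The feared behaviour: always assume the century is 19xx."""
    return 1900 + yy

def is_leap_year(year):
    """Gregorian rule: century years are leap only if divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

print(encode_year(2000))        # offset scheme handles 2000 without trouble
print(naive_two_digit_year(0))  # naive scheme reads "00" as 1900
print(is_leap_year(2000), is_leap_year(1900))
```

Under these assumptions the offset scheme rolls through 2000 as just another value and only runs out of range after 2006, which matches the article's statement that the failure would come in 2007. The leap-year function shows why 2000 is easy to get wrong: a shortcut rule that treats every century year as non-leap gives the right answer for 1900 but the wrong one for 2000.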

-- Cherri (sams@brigadoon.com), May 12, 1999.


Here is an email on "Amy's List" by someone involved. If you really want to know the truth, why not call the phone number provided?

Subject: RE: Sighting: FAA ATC Computers Y2K OK!
Date: Tue, 28 Jul 1998 11:43:17 +0100
From: "Y2K Maillist (Via: Amy)"
Reply-To: year2000-discuss@year2000.com
To: year2000-discuss@year2000.com

Date: Mon, 27 Jul 1998 11:54:26 -0400
From: NATE MURPHY <105174.1470@compuserve.com>
Subject: RE: Sighting: FAA ATC Computers Y2K OK!
To: "INTERNET:year2000-discuss@year2000.com"

Re: FAA

Ralph,

Yes, you are a skeptic and unnecessarily so.

The FAA started developing this system over thirty years ago. It went into production in the Los Angeles Air Traffic Control Center in 1972. The National Airspace System (NAS) was developed as a result of several air collisions that occurred in the 1950's. They understand more about the business of air traffic control and air safety than any organization that I am aware of. Believe me, Flight Plans, Departure flights, Tracking and Handoffs to ARTS (departure and landing) are all part of this multiprocessing, continuously operational (24x7), fully recoverable software / fail hardware system. This is a huge messaging system written with its own priority operating (pre-OS/360) and database management system ("DBMS" word not invented yet). This is a time dependent system (not date sensitive). The day is only important when it reads the daily flight plan tape which is supplied by the airlines.

Believe this: on March 23, 1998, Stan Graham, TechBeamers, Bob Nagel and myself met for two hours with Ray Long, the FAA year 2000 manager, and his staff. We discussed several alternatives with Ray. Ray's top priority was to analyze the micro code in the 3083's because it was the best alternative for the FAA, and it worked. At the time, we did not feel it would be appropriate to share that information with the group. By the way, it only took twenty lines of code to make the Enroute Air Traffic System year 2000 compliant.

Ray and his staff deserve credit for saving a lot of time and money. They are perfectionists and the airways are much safer because of their technical tenacity.

Nate Murphy
Nate Murphy & Associates
105174.1470@compuserve.com
The Assembler People
609-234-2353

-- Cherri (sams@brigadoon.com), May 12, 1999.

