More Info-Embedded Chips, Texaco, This is Not a Test

greenspun.com : LUSENET : TimeBomb 2000 (Y2000) : One Thread

Concerning the discussion about embedded chips and possible failure on Day 1 or 31, here is some additional information that was published in Wired Magazine in the April, 1999 issue. Go to: www.wired.com/wired/archive/7.04. Click on the article: "This is Not a Test."

"The precise embedded system to be tested was a remote terminal unit, or RTU. An RTU is something like a small, single-purpose computer, the Stormac team explained. In a paperback-sized box mounted on the wall were several integrated circuit boards, each containing chips with embedded logic. Unlike programmable logic controllers, or PLCs, which can contain complex programs to control industrial processes, an RTU is fairly primitive, usually confined to doing one task. This one measures the flow of liquids and gases through a pipeline. Simple as its work sounds - it measures the instantaneous flow rate, stamps the measurement with a date and time, and stores it temporarily in its internal memory - it's a crucial piece of gear for Texaco. This little box is how it knows how much fuel it's delivering through its pipelines - and how much to bill the customers who are getting that fuel.

The Scada host computer sat on the other side of the machine room - nothing exotic-looking, just an Intel-based PC with specialized OS and software. But the Scada system is the heart of Texaco's embedded-system network. If it can't collect data from the field devices, the company has no idea what's going on in its operations, can't analyze its production, can't bill customers - can't function as a company. By law, if Texaco loses contact with its field devices, it shuts down in four hours. Right at that moment, the Scada system was polling hundreds of embedded-system devices, collecting and storing about 30,000 points of data.

Cook attached a laptop to the RTU, which gave him a direct interface to the logic in the device. He was, of course, about to do the one thing everyone wanted to do: set the date on the device to December 31, 1999, wait for the year to change, and then see what would happen.

Using a handheld interface terminal, he entered the date and time: 12/31/99 23:59:45.

Then we all watched the display on the face of the RTU as the seconds counted up to midnight. 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59 - then the date rolled over.

01/01/:0.

"Colon zero," said Cook. "It's like, what is that?"

Then he tried entering the date 12/31/00. Again the seconds counted up to midnight, this time to 01/01/:1. But nothing terrible seemed to happen. No flashing lights, no buzzers, no equipment shutdowns. Was it just a weird date-format problem? A lot of hype over a display? Cook then took me over to the terminal for the Scada system and tried to collect information from the RTU. He entered the command to retrieve the device's idea of the date and time, and the Scada console displayed:

01/01/101

Then he tried to retrieve the crucial information from the device, the date-stamped flow measurements stored in the unit. And the Scada system answered:

METER DATA NOT AVAILABLE - CONTRACT HOUR NOT CURRENT

"It can't get the data," Cook explained.

Gas and oil continued to flow, unmonitored and unmeasured. If you can't read the data, said Cook, "you don't know what you've sold, and you can't get paid for it." How long could Texaco continue to function without being able to bill for the oil and gas delivered through its pipelines? Abshier and Martin looked at each other and let the question go.

Texaco has hundreds of RTUs like this one out in the field. Fixing the devices involves going out to each unit, changing the chips inside it, and installing new software - about an hour's work per unit. The first round of replacement chips the RTU vendor sent them didn't work; they had to wait for another. Then the Scada system needed upgrading. And that was just for this one device. There are all those other devices in the field, with their chips and their embedded logic - setting valve positions, measuring pressure - hundreds of them."

-- Y2kObserver (cjohnson@seidata.com), January 03, 2000
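
A note on the ":0" and "101" readings quoted above. The article does not say how the RTU firmware or the Scada host store the year, so the following is only a sketch of one plausible explanation, not the vendor's actual code. It assumes the year is kept as a count of years since 1900 (a common convention), that the RTU display turns the tens and units into characters by adding each to the code for '0', and that the host prints the raw count. In ASCII, ':' is the character immediately after '9', which is how a tens value of 10 would show up as a colon. The function names are hypothetical.

```python
def render_two_digit_year(years_since_1900: int) -> str:
    """Format the year as two characters the way naive firmware might (assumed)."""
    tens, units = divmod(years_since_1900, 10)
    # Adding to ord("0") only yields a digit character for values 0..9;
    # a tens value of 10 lands on ":" (the character after "9" in ASCII).
    return chr(ord("0") + tens) + chr(ord("0") + units)


def render_host_year(years_since_1900: int) -> str:
    """Print the raw counter without adding 1900 or wrapping to two digits (assumed)."""
    return str(years_since_1900)


for year in (1999, 2000, 2001):
    offset = year - 1900
    print(year, render_two_digit_year(offset), render_host_year(offset))
```

Under these assumptions, the rollover into 2000 gives ":0" on the device, and the rollover into 2001 gives ":1" on the device and "101" on the host, which matches what the article reports.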

Answers

Thanks Observer. I remember reading this very well.

We're obviously running on "manual workarounds," doing whatever it takes to continue to look and feel normal. These kinds of situations will take weeks to actually have an impact.

Don't give up your seat in the front row...the opening credits are ending and this movie is just getting started.

Mike

==================================================================

-- Mike Taylor (mtdesign3@aol.com), January 03, 2000.


I agree with you Mike. I also think we should be celebrating 1996 as well...

-- Uncle Bob (UNCLB0B@AOL.COM), January 03, 2000.

Thank you. Very interesting, but before we get all excited, let's take a cold, hard look at the situation. RC, any comments on the following:

1) OK, so they can't bill or collect revenue, but the world is still able to run (free, of course, for a while). No real big deal except to Texaco and the others - oh well, cost of doing business. No love lost here.

2) OK, this was April 99. What has been done since?

3) If nothing, then for argument's sake (for safety reasons and so on), do they have replacement parts on hand to fix on failure, now that they know what will fail? Has that been fixed, or is it in the process of being fixed?

4) If nothing is still being done, why not?

-- Interested Spectator (is@the_ring.side), January 03, 2000.


I remember reading of so many more failures in testing all along than happened at the rollover!

-- Sheri (wncy2k@nccn.net), January 03, 2000.

rofl Uncle Bob! Yep...I think so :-)

Thanks!

Mike

======================================================================

-- Mike Taylor (mtdesign3@aol.com), January 03, 2000.



I work for a major utility on the Y2K project. We have hundreds of PLCs and RTUs being monitored by SCADA systems, mostly for natural gas but some for electric power also. We saw the same types of problems described in Y2kObserver's post. We fixed them all. We spent a lot of time and money on them because, although it wouldn't interrupt power or gas, we wouldn't be able to accurately bill customers. This was, as you might imagine, a big deal to the officers of the company.

The implication of the post is that, somehow, the RTUs were never fixed and are lurking, waiting to go off. What do you think we spent all that money on? WE FIXED THEM. There won't be any problems with embedded systems. It's time to get a life again.

-- (Someone@Somewhere.com), January 03, 2000.


Someone.

We're glad you fixed them - bravo.

Were they also fixed in Saudi Arabia, Venezuela, Russia, Iran, Iraq, Mexico, Indonesia, the North Sea, etc. etc.?

-- Andy (2000EOD@prodigy.net), January 03, 2000.


Andy,

They were fixed for the same reason we fixed them - they have to be able to bill their customers. There is no higher motivation than profit, as I'm sure you know.

Just out of curiosity, how much time have you spent working on any Y2K project so we can judge the level of your knowledge about embedded systems?

-- (Someone@somewhere.com), January 03, 2000.


Thanks for the info, and glad you managed. However, what about those chips you couldn't get to? You know, the ones in pipelines or buried under concrete? Maybe deep in the ocean?

You got all those too?

-- Mike Taylor (mtdesign3@aol.com), January 03, 2000.


Hey Someone -- did you test your fixes and are they still holding? I hope so.

-- Mello1 (Mello1@ix.netcom.com), January 03, 2000.


There are no critical systems "buried" deep in the ocean - or anywhere else. Where do you get this type of information? All systems require maintenance and have to be accessible for this purpose. Can you give me even one example of a "buried", inaccessible, critical embedded system?

Of course we tested our fixes. We spent at least as much time on these RTUs and PLCs as any other, probably more since it directly affects revenue. You can speculate all you want about "phantom" embedded systems that are going to cause trouble but I'm absolutely sure that they are going to work right.

-- (Someone@somewhere.com), January 03, 2000.


Andy, They were fixed for the same reason we fixed them - they have to be able to bill their customers. There is no higher motivation than profit, as I'm sure you know.

Just out of curiosity, how much time have you spent working on any Y2K project so we can judge the level of your knowledge about embedded systems?

-- (Someone@somewhere.com), January 03, 2000.

======================================================================

Someone sorry but you're full of shit.

You said on another thread that you checked what, 6,500 chips to find only one that would have caused the slightest glitch to your operation.

I ask you a perfectly reasonable question about foreign ops and you try and bullshit us all.

You think the Iranians are gonna bill someone? Who exactly? the Ayatollah? BS!

You think the Iraqis are gonna bill Donald Trump do you? BS again.

I spent 3 years in Saudi training Arabs in IT. I doubt ***very much*** going by my experiences in the Middle East that they've done anything like the checking your company did.

your attitude is so so typical.

Because ***I'M ALRIGHT*** here in Bumfuque, indiana, then ***THE REST OF THE WORLD IS TOO***.

So take your pompous bullshit attitude and shove it pal.

I've been in IT 22 years and am currently on a y2k project so you can smoke that one too.

Asshole.

-- Andy (2000EOD@prodigy.net), January 03, 2000.


"Can you give me even one example of a "buried" , inaccessible, critical embedded system?"

How many satellites are there in geostationary Earth orbit?

-- Ron Schwarz (rs@clubvb.com.delete.this), January 03, 2000.


Someone:

Thanks for your post. It's about time someone who's actually worked on this stuff spoke up. I'm assuming you're legit (we have had our share of phonies around here) from your reply in my RC challenge post. It seems that somewhere between what you and others like you did and reported and those at the top, the communications got garbled. Those at the top reported your findings incorrectly (or failed to report them), so the government got very worried, to the extent that the Navy and others predicted massive failures. Perhaps consultants like Mr. CEO's outfit didn't like the fact that most embeds (95%) track time with counters (as you said you found, in my post), which would leave them very little work. Perhaps that's why his teams were being sent back early: his clients caught on to the racket and sent the teams packing, since they would be doing a lot of investigating and not much fixing. Or they were just finished. But certainly they were not sent home because of what Mr. CEO said (i.e. that the teams were leaving a paper trail that exposed the companies to liability as they documented all the non-compliant stuff that would need to be fixed).

-- Interested Spectator (is@the_ring.side), January 03, 2000.
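
To illustrate the point about counters: a device that only measures elapsed time with a free-running tick counter has no calendar date to roll over. The sketch below is hypothetical (the class name, the 16-bit counter width, and the tick values are all assumptions for illustration, and the 95% figure above is the poster's, not something the code demonstrates). It shows an interval timer that keeps working across a counter wraparound because it only ever compares differences between ticks.

```python
class IntervalTimer:
    """Fires once every `period_ticks` ticks; holds no calendar date at all."""

    def __init__(self, period_ticks: int, counter_bits: int = 16) -> None:
        self.period = period_ticks
        self.modulus = 1 << counter_bits  # the hardware counter wraps at this value
        self.last_fire = 0

    def due(self, tick_count: int) -> bool:
        # Modular subtraction gives the elapsed ticks even after the counter wraps,
        # as long as the period is shorter than the counter's full range.
        elapsed = (tick_count - self.last_fire) % self.modulus
        if elapsed >= self.period:
            self.last_fire = tick_count % self.modulus
            return True
        return False


timer = IntervalTimer(period_ticks=100)
print(timer.due(50))   # not yet a full period since tick 0
print(timer.due(100))  # a full period has elapsed
```

A device built this way would still need checking if it also stamps its data with a date, as the RTU in the article does.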


Andy,

I'm sorry to see you descend into such a personal attack. I merely asked you what experience you have with embedded systems. For all your rather heated response and claims of how long you have worked in IT, I see you still never answered that simple question. I wonder why?

:You think the Iranians are gonna bill someone? Who exactly? the Ayatollah? BS!

:You think the Iraqis are gonna bill Donald Trump do you? BS again.

Errr....yes, I think they are all going to bill someone for the oil they sell. Or, do you think they do it for free? How is this BS?

:Because ***I'M ALRIGHT*** here in Bumfuque, indiana, then ***THE REST OF THE WORLD IS TOO***.

And you believe that, if things are OK here, that means there must be problems somewhere else? Maybe there will be - I don't know if it's true or not. I can't prove a negative.

:Asshole.

And how does this contribute to the discussion? I don't recall ever having written anything like this about you - why do you feel the need to do so to me?

-- (Someone@somewhere.com), January 03, 2000.



Ron:

How many satellites have date-critical embedded systems that are not controlled by earth stations? I don't know the answer, but since most of these satellites are used for communications and communications around the world seem to be working well, I suspect there can't be many.

-- (Someone@somewhere.com), January 03, 2000.


IS,

I assure you I'm legitimate. I am not posting my real name or e-mail address in a public forum because I don't want to be seen as speaking for my employer. I would be happy to e-mail you personally if you'd like.

It's not clear to me where this failure of communications took place. We reported we were Y2K ready on September 1. All the other utilities reported to NERC and the few that were not ready were publicly exposed by NERC. I never read about the Navy or any other responsible group predicting widespread power failures. If any did, they certainly weren't reading the same publicly available documents I was.

I don't think that this was a communications failure - I think it was a trust failure. I can't say as I blame people for not believing everything put out by the government or corporations, but this was one time that we weren't lying. The problem is, with a history of lies, how do you convince the public that this time we were really, really telling the truth? I'm afraid only what has already occurred would convince people. Everything else just had to play out.

-- (Someone@somewhere.com), January 03, 2000.


Someone - don't worry I call everyone asshole - it's a term of endearment.

Bottom line - you don't know what those countries have done, whether they checked every embedded or none, so just say so.

I don't deal with embeddeds in my work, which is why I'm asking the expert here.

-- Andy (2000EOD@prodigy.net), January 03, 2000.


Someone@somewhere -- I have no idea.

I *do* know that they *are* inaccessible. For me, that debunks the claim that no one *ever* puts critical circuitry in places where it *cannot* be retrieved and repaired.

-- Ron Schwarz (rs@clubvb.com.delete.this), January 03, 2000.


Any comment on how this might possibly relate to the blowout that was just reported as occurring in the Gulf of Mexico?

I thought there were sensors on the blowout preventers located "down the hole" which would detect a sudden flow increase, realize a blowout situation and then shut off the flow. Unless the sensors are hung up in some "I don't know what day it is, so I quit transmitting data" mode.

Going from normal flow to a blowout situation, that's a pretty good change from steady-state conditions. And when embeddeds have failed during steady state operations and then don't react until a change of state occurs, that's when you find safety devices that fail to operate.

Blowout preventers, pipeline rupture detection systems, tank over pressurization detection, hazardous/explosive vapor level detectors, turbine vibration detection, radiation level monitors; what else can you think of that could be operating at steady state with failed monitoring systems that can cause a world of hurt when things start changing?

WW

-- Wildweasel (vtmldm@epix.net), January 03, 2000.
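
The worry above is that a monitor which has quietly stopped reporting looks the same as one reporting a steady value, right up until conditions change. As a minimal sketch of why that matters, and not a description of any real blowout-preventer or Scada system, the hypothetical code below has the host record when each reading arrived on its own clock, so a silent sensor can be flagged by age rather than by value. The names, the 60-second threshold, and the sample numbers are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    value: float        # last value received from the sensor
    received_at: float  # arrival time in seconds on the host's own clock


def is_stale(reading: Reading, now: float, max_age_s: float) -> bool:
    """True when the last reading is older than the allowed age."""
    return (now - reading.received_at) > max_age_s


last = Reading(value=120.0, received_at=0.0)
print(is_stale(last, now=30.0, max_age_s=60.0))    # recent enough to trust
print(is_stale(last, now=3600.0, max_age_s=60.0))  # unchanged value, but an hour old
```

The check uses the host's arrival time rather than the device's own date stamp, so it does not depend on the device knowing what day it is.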

