Still Confused Over Embedded Systems

greenspun.com : LUSENET : Electric Utilities and Y2K : One Thread

I'm still confused regarding "type" testing of embedded systems, so I will try to simplify the question. Are individual electric utilities individually testing each embedded system in all of their safety related systems? Is there an EPRI or NERC standard regarding embedded system testing on safety related systems? Any advice would be appreciated.

-- Anonymous, February 21, 1999

Answers

The answer to Q1 is: it depends. ;-) Some are, some aren't. It depends on a lot of things, too numerous to go into at the moment. Under normal circumstances, it is accepted QA practice to lot/batch test even safety related components. Do potential Y2k induced problems meet the test of "normal circumstances"? Honestly, it's a management call in each Y2k program as to whether lot/batch testing is sufficient. Personally, I don't like "type testing" (the same concept as lot/batch testing, only less intensive) for reasons described later in this post.

Q2: NERC has stated that type testing should be avoided (see their September report). EPRI is basing a lot of their program and results to date on type testing. So, you actually have two divergent viewpoints within the industry on "type testing". CL has previously posted in this forum that it is his opinion that type testing is valid.

Ok, now onto some other viewpoints.

ComputerWeekly ran an article some time back which talked about the results of Y2k type testing at SmithKline Beecham (pharmaceuticals). Here's the link to the article:

http://www.computerweekly.com/cwarchive/news/19970508/cwcontainer.asp?name=H1.html (note: requires free registration)

Yet another worm in the can is that type testing (checking one representative system bought from the same supplier, and assuming it reflects the compliance status of every other such system) may not hold water. Pharmaceutical company Smith Kline Beecham has found that out. "It bought two machines for monitoring and recording the performance of drug production," says Guenier. "When they tested one, it handled January 2000 very well, and they were very happy. But when they tested the other - same machine, identical chips - it didn't."

The scary explanation for the anomaly, when the firm checked serial numbers with the manufacturer, was that the chips had come from different makers, one of whom had made them year 2000 compliant, while the other hadn't. Documentation down to this level of detail is often not specified in the world of embedded systems. And these were machines, notes Guenier, that had been made last year.

Robin Guenier is the former executive director of TaskForce 2000, the UK's equivalent of the President's Y2k Council.

Is all of this confusing? Yep. Being an ex-nuke, I am conservative by nature when it comes to testing anything, so I default to testing everything in every manner possible for every possible failure mode.

-- Anonymous, February 21, 1999


Rick, thanks much for your excellent answer. The latest news on the y2k embedded system problem is that, according to the utility industry, embedded systems are not that big of a problem. Today, Ed Yardeni announced that he is not really concerned about the electric utilities anymore. Senator Bennett went on record earlier that he is not that concerned about power production in this country anymore. I can only deduce from these statements that the utility industry is doing "type" testing. So if individual testing is required, and they are not doing it, the only answer is to "fix-on-failure". But on the other hand, if "type" testing is OK, then everything will be hunky-dory. This seems like a crap shoot to me. Thanks for your input.

-- Anonymous, February 21, 1999

That answers the same question for me, Bill. I am not a technical type, but over time I have come to understand that type testing is a crap shoot. I think of the telecommunications industry, upon which smooth operation of the electric industry is dependent, and... well, I worry. There has been a lot of silence there.

-- Anonymous, February 21, 1999

Here's another example:

In recent congressional testimony, a prominent Arizona farmer noted that when testing embedded systems in three identical main sprinkler/irrigation controls, two worked, one failed completely. If he had just type tested one of the two that will work, he would have never known about the one that would fail--until next year, that is.
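To put a rough number on that sprinkler example, here is a minimal sketch of the odds. It assumes the unit picked for the type test is chosen at random from the batch, which is my assumption and not something the testimony states; the function name miss_probability is made up for illustration.

```python
from fractions import Fraction
from math import comb


def miss_probability(total_units, faulty_units, tested_units):
    """Chance that a random sample of tested_units contains no faulty unit.

    Assumes units are picked at random without replacement
    (a hypergeometric model); this is an illustrative assumption.
    """
    if not 0 <= faulty_units <= total_units:
        raise ValueError("faulty_units must be between 0 and total_units")
    if not 0 <= tested_units <= total_units:
        raise ValueError("tested_units must be between 0 and total_units")
    good_units = total_units - faulty_units
    # Ways to pick only good units, divided by all ways to pick the sample.
    return Fraction(comb(good_units, tested_units),
                    comb(total_units, tested_units))


# Three "identical" controllers, one of which will fail:
print(miss_probability(3, 1, 1))  # 2/3: test one unit, probably see a pass
print(miss_probability(3, 1, 2))  # 1/3: test two, still may miss it
print(miss_probability(3, 1, 3))  # 0: only testing all three is certain
```

So with three units and one bad one, a single type test comes back clean two times out of three. The same function can be fed bigger, made-up batch sizes to see how slowly the odds of a miss shrink as more units are sampled.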

In its latest (Jan. 11th) quarterly report, NERC explicitly warns power companies NOT to rely upon type testing and vendor statements with regard to embedded components in distribution systems that might cause actual outages. (I guess type testing is OK for components in generating and transmission systems, huh??!!) A stern warning--buried in one little paragraph, about 60 pages into the report, long after the rosy "press release" part delivered before the National Press Club in Washington. Golly gee.

6,000 power plants, 112,000 substations, and how many embedded components in distribution systems? Think lots of type testing ain't going on? Why do you think EPRI is relying almost exclusively on type testing results for its pollyanna statements? Because it has no choice. Everybody will cross their fingers, rub the rabbit's foot, and pray like hell come next January. It's that simple. Maybe we will be lucky, maybe we won't. If we aren't, there will be hell to pay, as companies scramble to replace embedded systems and stress out their vendors. Think the failures of various embedded systems aren't capable of causing serious outages? LOL

I'll say it again: go to www.nerc.com, find their "generic inventory of embedded components that may be susceptible to Y2K" (or some such title), print out the 44-page list of 3,600 items, and look at all the things (voltage regulators, transformer monitors, recloser controls, turbine controls, substation RTUs, etc., etc., etc.) that NERC does NOT know the Y2K status of. NERC refuses to "fill in the blanks" on these items (the list hasn't even been updated since late Sept., to my knowledge), evidently because it doesn't want the legal liability of having proclaimed each item compliant individually (or Y2K irrelevant) when, in fact, it really doesn't know. I have pressed NERC on this issue; no reply to date. NERC knows full well the pitfalls of type testing; it just can't do anything about it. Every power company statement to the effect that "we've fixed and tested everything mission critical" must be read in this light.

As for vendor statements of their products' compliance, which many companies tend to accept at face value (so much cheaper that way, you know): What sort of tests do you think the vendors performed? (assuming they performed any tests, that is!) Gosh, type tests?

The current Senate Y2K report is a hoot from hell: the good Senators warn ominously that hospitals should NOT rely upon type testing and vendor statements when assessing their embedded systems. But the power companies? the natural gas companies? the oil companies? the water utility companies? Well, as Scarlett O'Hara said, "I'll think about that tomorrow."

-- Anonymous, February 26, 1999


Dan,

Look closely at the NERC form. This part is voluntary and optional. If NERC doesn't know, it doesn't mean the utilities don't know - they just aren't reporting it to NERC. EPRI has a superior platform for sharing results that has made the NERC form you reference obsolete. Unfortunately, EPRI results are restricted to members.

By the way, if vendor and model numbers are included, this NERC list gives everyone who predicts outages of varying severity and length the opportunity to surf those vendors' sites and fill in the blanks themselves. No credibility issue that way: if you want something done right, do it yourself. Find the devices that will cause these alleged outages. Maybe the concerned here could start a thread to volunteer and split the inventory list alphabetically to divide the labor.

Just a suggestion. I'll turn blue holding my breath. (grin)

-- Anonymous, February 26, 1999



Typical of Cowles. Answer a new question with data from 1997. We have made some progress in understanding the embedded control system problem since then! Would Mr. Cowles care to give the percentages of embedded system problems found in actual testing by utilities? They are very much lower than the 1997 estimates.

-- Anonymous, February 27, 1999

Paul,

What the heck are you talking about? I didn't reference 'data' from 1997. I pointed Mr. Watt to an article from 1997 that discusses a finding relative to the issue of type testing, not data. Please read the information, in this thread and the referenced article, prior to pulling the trigger.

-- Anonymous, February 27, 1999

