forwarded: Recent Discussion of Embeddeds by an Expert


Subject: All you ever wanted to know about the embedded chip
Date: Thu, 02 Dec 1999 23:28:53 -0500
From: Chayim Leib Weiss
Organization: Weiss Consulting Services, Inc
To: IAEM-y2k, Civicprep list
CC: EMERGENCY MANAGEMENT

.....but were afraid to ask......

All

I read most of the replies and they were interesting. Information I saw was either partial or incorrect. I kinda had the feeling that some of you 'rushed' to reply so that you could stem the fear. Well I've seen enough of that to last a lifetime.

It surprised me a little, since no one even suggested that they were in the computer field. Funny how we are drawing conclusions without 'expert' testimony (Mr. Koshigen et al.).

I thought the original question asked for an explanation of how the embedded chip works. In researching, I found the most detailed explanation of just that. I tried to remove the peripheral stuff to keep things from getting sidetracked. Hope it isn't information overload. Oh yes, and feel free to draw your own conclusions from the information.

-------------

I found this article at: http://www.webpal.org/Gas.htm Go there and follow the links to other related embedded chip issues. It is (or may be) shocking; however, the Author does not subscribe to TEOTWAWKI. While you may not subscribe to his views, it should give you a better understanding of the "working" of the ROM, PROM, DRAM, EPROM, and PLC chips.

KEY: I have removed some information not related to the question of the chip's mechanics. If you see ........snip......, it's been cut. I have also used [square brackets] to indicate my comments.

It is long; if you follow my comments, it may shorten the read.

"follow the yellow brick road"

======================================================================

The Embedded Processor SECONDARY Clock Problem
by Bruce Beach
April 9th, 1999

This document is now here for just HISTORICAL purposes. For the updated information it is recommended that you go to: http://www.webpal.org/Beach2.htm

[ this is the question ]

I have REPEATEDLY heard or read statements made by people in VERY HIGH places that their systems do not have a Y2K embedded processor problem because:

their systems:

Don't show a date. Don't involve time. Don't have a time problem.

I have even read some Japanese, Chinese and Russian statements that they don't have a problem because they don't even use the same calendar as the US does (because of its predominantly Protestant and Roman Catholic Christianity).

[ here is the short answer ]

These highly placed people are wrong, wrong, wrong. The reason they are WRONG is because they do NOT understand the embedded processor SECONDARY clock issue.

Since I am posting this on a web page, and because others may see this who are not familiar with the source let me first present my credentials for explaining what I am about to explain:

[ this author's credentials ]

I am a former college professor of computer science, and hold both U.S. and Canadian microprocessor patents. The particular application that this paper discusses is oil refineries. I spent one summer as a NUL Fellow with the Chevron Chemical Company, and was at one time a consultant to the Imperial Oil Company, both times in regards to computer systems.

I recently completed a lengthy study and report on natural gas pipelines, published at: http://www.webpal.org/REPORT.htm and it was feedback from that report that prompted this inquiry.

I spent five days reviewing all the microprocessor information that I could find in two libraries and on the Internet, and by consulting with some other individuals in the field. I felt well prepared to conduct the interview and I would say that at this point that I feel that I have a better grasp of the situation than 99% of the other people that I meet.

[ what is the clock ]

First of all, let me emphasize that with the SECONDARY clock problem in microprocessors we are not talking about wall calendar time. And we are NOT talking about wall clock time.

Let us look at a simple problem. A very highly placed railroad official was explaining on TV that railroad crossing gates will all work because they are EVENT controlled (the approach of a train) NOT time controlled.

This is the conventional viewpoint and conclusion. It may NOT be true that they won't have a problem, and if it does just happen to be true, then that is coincidental and not for the reason given by the official.

The Y2K problem is more real and subtle than these highly placed, dedicated, sincere people may comprehend. They are undoubtedly experts in their own fields but they don't necessarily understand how microprocessors work.

[ this part explains the mechanics of the clock ]

ALL microprocessors contain a PRIMARY first clock that we may call its heart beat clock. The tick of this clock determines the speed of the processor. Some of these clocks are very fast, ticking at the speed of millions, or billions of times a second. However, these are NOT the clocks about which we are concerned.

Many of these processors have a SECONDARY clock, either built in or, as is more usual, housed in a second associated chip, whose time and date are maintained by the first clock, just as the wheels of old mechanical timepieces were governed by the spring and escapement.

The first clock is much like a metronome, or the swinging pendulum on an old grandfather clock. This first clock in the microprocessor is like a drum beat that is keeping all the functions of the microprocessor marching in order. On one beat an instruction from the microprocessor's internal memory is fetched, on the next beat that instruction is executed. The pendulum then swings and it is fetch time, and then again it is execution time. Many times a second the dance goes on during the life of the microprocessor. Among the instructions executed may be that of maintaining a SECONDARY clock.
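As a rough illustration of the relationship just described, the sketch below models a fast primary tick that, as a side effect, advances a slower secondary clock. It is a toy model and not how any particular chip works: the class name `SecondaryClock`, the `ticks_per_second` parameter, and the tiny tick rate in the example are all assumptions made for illustration.

```python
class SecondaryClock:
    """Toy model: a secondary clock advanced by counting primary-clock ticks."""

    def __init__(self, ticks_per_second):
        # Hypothetical parameter: how many primary ticks make up one second.
        self.ticks_per_second = ticks_per_second
        self.ticks = 0      # primary ticks seen since the last whole second
        self.seconds = 0    # elapsed seconds kept by the secondary clock

    def on_tick(self):
        """Called once per primary-clock beat."""
        self.ticks += 1
        if self.ticks == self.ticks_per_second:
            self.ticks = 0
            self.seconds += 1


# Example with an unrealistically slow primary clock of 4 ticks per second.
clock = SecondaryClock(ticks_per_second=4)
for _ in range(10):
    clock.on_tick()
print(clock.seconds, clock.ticks)  # 2 whole seconds, 2 ticks left over
```

A real primary clock would tick millions of times a second; the small number here only keeps the example readable.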

[ how the clock is powered ]

Some may wonder how the SECONDARY clock can be maintained if the power goes off. The answer is that these devices require VERY little power for their own function. The power might be maintained for a while in a built-in capacitor for such a low level function, but for a more reliable long term source many of the microprocessors have associated with them a TINY very LONG TERM battery that lasts for years. (Some of the many kinds used have been Alkaline, Carbon Zinc, Nickel Cadmium, Nickel Metal-Hydride, Lithium and Rechargeable. The latter, being recharged when the power is on, can maintain the SECONDARY clock for many months, or even years, when the power is off).

In some people's old computers these batteries have stopped functioning, and therefore, just as with electric wall clocks without batteries or memories, if they wish to maintain the correct time on their computers, they have to reset the clock every time the power goes off. On some devices, like microwaves and VCRs, some people don't bother, and just let the clock blink or show whatever time it will.

[ back to the secondary clock ]

Many computer programs do not use the SECONDARY clock, and many people did not concern themselves with such things as the file dates on their computer files being incorrect. They may feel that if they were not using the SECONDARY clock, or if the SECONDARY clock setting made no difference to them, then the computer was not using the SECONDARY clock or the SECONDARY clock was not affecting the computer. In very old computers this may often have been true. However, as computers and software progressed, the designers made more use of the SECONDARY clocks.

[ We are not addressing the PC ]

While we are not really concerned about PCs in this presentation, let me insert here that if you are concerned about the clock in your PC being Y2K compatible (and you should be, although it is a problem of nothing like the MAGNITUDE that we are discussing here), then here is a good web address that will tell you how to set your PC clock after Y2K: http://www.pc.ibm.com/year2000/year2000c.html

But, in fact, we are not really talking about desktop PCs here. What we are talking about are tiny embedded microprocessors that have been built into all sorts of industrial devices like machine and valve controls.

[ back to the secondary clock operation and ROM-(READ ONLY MEMORY) ]

The rules for the operation of these SECONDARY clocks and the operation of the microprocessor itself were maintained in a special kind of memory called ROM, which stands for Read Only Memory. Initially these Read Only Memories were "hard wired", and in the very earliest computers they were set by actually running wires from one connection to another on a plug board. This is how many of the ancient (pre-1960) computers were programmed, and it is something I actually went to IBM school and learned how to do, so that tells you how long I have been studying these systems.

With the invention of the microchip the "wiring" was done inside the ROM chip when it was manufactured, in much the same way that a printed circuit board is made.

[ ROM evolves to BIOS (BASIC INPUT/OUTPUT) ] [ became - PROM (Programmable/ROM) ]

Later, because so many different kinds of microprocessors were being developed, a new method of more easily programming the ROMs with the computer's or microprocessor's built-in start-up program was developed. (This chip on computer motherboards is called the BIOS, which stands for Basic Input Output System.) The new type of ROM was called Programmable Read Only Memory or PROM.

[ programming the PROM (write once) ]

PROMs and all digital computer memories work by storing a binary pattern of 1's and 0's called bits. The presence of a connection may be a one and its absence a zero. The PROM chips were built with all their connections set at one. They were then plugged into a ROM burner (a box with a socket that matched the chip's pins), which had switches that could be set to reach each of the desired locations and which sent through them a current that would permanently burn out those locations and set their values to zero. Thus the PROM chip could be programmed one time (and one time only).
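The burn-once behaviour can be sketched in a few lines. This is a toy model under stated assumptions: the `Prom` class and its `burn` method are hypothetical names, and a real chip is programmed by a hardware burner, not by software like this. It shows only the rule from the text: every bit starts at one, and burning can turn a one into a zero but never back.

```python
class Prom:
    """Toy model of a write-once PROM (hypothetical, for illustration only)."""

    def __init__(self, size):
        # A blank chip has every connection intact: all bits are 1.
        self.cells = [0xFF] * size

    def burn(self, address, value):
        """Burn a byte; returns True if the cell now holds exactly `value`.

        Burning can only clear bits, which a bitwise AND models: a bit that
        is already 0 stays 0 no matter what is requested.
        """
        self.cells[address] &= value
        return self.cells[address] == value


prom = Prom(4)
first_try = prom.burn(0, 0b10100101)    # succeeds on a blank cell
second_try = prom.burn(0, 0b11110000)   # fails: cleared bits cannot be restored
print(first_try, second_try, bin(prom.cells[0]))
```

The second burn leaves the cell holding only the bits that both patterns had set, which is why a programming mistake ruined the chip.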

At that point in time, not only chip and microprocessor manufacturers were able to program microprocessors with permanently wired-in programs, but engineers and other companies could also program these PROM chips, which were then called Field Programmable ROMs.

However, if they made a mistake (and engineers and programmers are only human and make lots of mistakes) then the PROM was permanently ruined as they tried to program it. Since they were often not doing much mass production, over which to spread the cost of this trial and error method, this was an expensive way to develop a PROM.

[ From write once to rewritable the EPROM (Erasable/PROM) ]

What they needed was a PROM that could be erased and re-recorded. This new Erasable PROM was called an EPROM. The memory of the EPROM was set by current, much as before. Sometimes the programming device was hooked directly onto a computer that had a program that it downloaded to the EPROM. However, what would erase the memory of the EPROM was light. Specifically ultraviolet light. So if you left the memory exposed to sunlight it would eventually get enough ultraviolet that it would be erased.

You could think of an EPROM chip as being a little bit like a camera. Inside were these connections sensitive to light, which would set them all to one. You set them, as before, with an electric current, but then, if you made a mistake, you could open a little window on the top of the camera-like chip, expose the interior of the chip to ultraviolet light (a matter of a few seconds to a few minutes depending on the chip and light source), and then cover up the little window again (usually with a piece of tape) and start over.
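The program, erase, and retry cycle can be sketched the same way. Again this is only a toy model: `Eprom`, `program`, and `erase` are hypothetical names, and the `erase` call stands in for the ultraviolet exposure described above, which resets every bit to one.

```python
class Eprom:
    """Toy model of an erasable PROM (hypothetical, for illustration only)."""

    def __init__(self, size):
        self.size = size
        self.cells = [0xFF] * size   # blank chip: all bits are 1

    def program(self, address, value):
        """Electrical programming can only clear bits, as in a plain PROM."""
        self.cells[address] &= value
        return self.cells[address] == value

    def erase(self):
        """Stand-in for ultraviolet exposure: every bit goes back to 1."""
        self.cells = [0xFF] * self.size


chip = Eprom(2)
chip.program(0, 0x0F)                          # oops, the wrong byte
retry_without_erase = chip.program(0, 0xF0)    # fails: no 1 bits left to keep
chip.erase()                                   # open the window to the UV light
retry_after_erase = chip.program(0, 0xF0)      # succeeds on the blank cell
print(retry_without_erase, retry_after_erase)
```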

In designing computer ROM memories, this EPROM was the way to go, and once you had the program working perfectly, if you were going to make tens of thousands, then it might be better to send the program to the factory and have it burnt into cheaper ROMs.

However, if you were going to use just a few of these chips, before changing the design, or because that is all that you were going to make, it was cheaper to just go ahead and use the EPROM, or to burn the program into the necessary number of PROMs. In fact some specialized Personal Computer (PC) BIOS chips did use such an EPROM itself, and many more used the PROMs. As mentioned before, BIOS stands for Basic Input Output System and it is the program that is always resident in the computer and lets you start it up.

.........snip........

[ EPROMs & PLCs (Programmable Logic Controllers) in devices ] [ how they were programmed ]

The real problem arose because engineers began using microprocessors to build other specialized devices such as PLCs (Programmable Logic Controllers), which they then used to control specialized machine tools and such things as automatic valves. In these they too used EPROMs to design their systems, and they too sometimes put the EPROM in the system and sometimes instead used Field Programmable ROMs to make a number of the devices.

Programming these devices [EPROMs] can be very, very long and time-consuming work and oftentimes requires a great amount of skill to do efficiently. The way that engineers got around that great expense (which was rather like re-inventing the wheel each time) was to take an existing program that ALMOST did what they wanted to do, and then to slightly modify it.

[ Programming with the LADDER concept - build on existing ]
[ This "ladder" concept of building on existing programs is, in a nutshell, also the problem with 'legacy systems', mainframes & the older programming languages, i.e. COBOL ]

The companies that sold them the EPROM programming equipment also provided the users' engineers with libraries of programs and information about how they could combine the programs and modify them. This became known as the LADDER concept. You started with a number of steps and then you added another step on the ladder. Someone else wanting to do much of what you had done could then use what you had done and add another step on the ladder. The initial LADDERs were much like Truth Tables. Later more programming capabilities were added in mnemonic form, and most recently greater levels of abstraction have been obtained by a hybrid relationship between the original LADDERs and programs like Visual BASIC.

Engineers came and went in companies, even in the companies that designed the EPROM burning equipment to begin with, and no one really knew, or knows to this day, what is hidden back down the ladder or is in the EPROM or PROM program. (I feel least comfortable explaining about LADDER programming because while I have programmed in many languages, and have taught a number of languages, and have even written books on Programming, I have never done LADDER programming. And while I had read about LADDER I was not aware of the Hybrid aspect of it until the visit to the Research Facility).

[ how the knowledge became lost ]

In some of the programs, and chips, designed using this method, we can see that there are specifications for a SECONDARY clock, which we can access if we wish. Even if we don't use it ourselves, we don't know that someone else on some other rung of the ladder or in some earlier version level of the program didn't use it.

However, oftentimes the ladder and program have come to us in such a way that the documentation of many of the elements used to build them is completely lost, and no one knows whether there is a SECONDARY clock in there or not. And there is no practical way to look and find out. Along with the SECONDARY clock, when the chip manufacturer built the chip and set it working, they set its time as a part of the manufacturing process. The time was set in accordance with some original date of design of the clock as programmed into the original code and updated in accordance with the current time of when and where it was manufactured. No evil intent was meant; it is just that everything has to have a starting place (and, within limited numbers, an ending place or starting over place - 00), and that was coincidentally the way some of the chip logic was designed.

[ clocks based on real-time not GMT or ZULU time ]

Now, all this was based on real time, but not necessarily Zulu time (the time at the prime meridian), and because the clocks are not THAT accurate, and some may have later accidentally gotten set back to their starting date, we can't say that they are all going to stop working EXACTLY at midnight, either Greenwich Time or local time. The nature of the coding problem that we have been describing is explained at another independent source. http://www.auto2000.ndirect.co.uk/plc/plcs.htm

[ more complex designs ]

There is still another category of IC designs called ASIC (Application Specific Integrated Circuits) which includes programmable logic devices (PLD) using either standard cells (SC) or full custom design (FC) and which may involve an even higher level of integration than we have been discussing.

And there were many standard IC products put out by manufacturers that contained such clocks. Every manufacturer has long lists of such devices and fortunately many of the manufacturers are making this fact known to the public. For an example look at: http://www.mot-sps.com/y2k/mailing.html

[ back to the time issue ]

And you may still say, but who cares, because we are not using a time function. And that may be true, that YOU are not using a time function, but the chips, I have just described, may be.

Chips often control things based on intervals, such as checking every so often (perhaps every few milliseconds) whether they have sensed a train coming and should lower the gates, or whether the train has passed and they should raise the gates. For this purpose a chip may, and very often does, use the SECONDARY internal clock to keep track of how much time has passed and whether or not it should check for the presence of a train.

No one knows what logic the engineer designing the gate control used, nor for that matter did the engineer designing the control have any idea what logic the programmers designing the earlier rungs down the Logic Ladder may have used. All he knew was that the instructions said: put in this signal, and under these conditions you will get out that signal.

[ the math function of the chip and the problem with 0 ]

In order to measure a delay, or interval, the logic in the Logic Ladder oftentimes used the difference between the current SECONDARY clock time and an earlier time from the same SECONDARY clock. This all works well and good, without any concern about whatever time people are using out in the real world. That is, it works fine as long as you subtract a lesser time from a greater time. BUT should it ever occur that the system subtracts a greater number from a lesser number, you will get a negative result. That is something which will happen ONE TIME ONLY, and that is when the SECONDARY clock flips over to zero zero for the year 2000.
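The subtraction problem is easy to see with a two-digit year counter. The sketch below is a minimal illustration, not code from any real controller: both function names are hypothetical, and the wraparound version is one common remedy that the article itself does not describe. It also assumes the true interval is shorter than 100 years.

```python
def naive_elapsed(earlier_yy, later_yy):
    """Interval between two 2-digit year readings, by plain subtraction."""
    return later_yy - earlier_yy


def wraparound_elapsed(earlier_yy, later_yy):
    """Same interval, but tolerant of the counter rolling over from 99 to 00.

    Assumption (not from the article): the real interval is under 100 years,
    so the difference can be taken modulo 100.
    """
    return (later_yy - earlier_yy) % 100


print(naive_elapsed(98, 99))        # 1: lesser subtracted from greater, fine
print(naive_elapsed(99, 0))         # -99: greater subtracted from lesser
print(wraparound_elapsed(99, 0))    # 1: the rollover is absorbed
```

What a device does with the negative result depends entirely on logic further down the ladder, which is the point the article goes on to make.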

[ how the chip may react ]

What will happen when this happens is called UNDEFINED. Sometimes UNDEFINED results are not that bad but oftentimes they are and Engineers really do not like to be surprised by them. Unfortunately, we don't know exactly how this PROM logic works in many, many devices. It is VERY difficult to test for because there is NO PRACTICAL WAY to go around and set those INTERNAL SECONDARY clocks, even if we can determine that they are there and find them.

[ can we test by changing the clock? ]

The many people who do go around and set forward EXTERNAL clocks to the year 2000 on their systems may STILL be in for a horrible surprise, because setting external clocks often does not affect the SECONDARY internal clock. For some systems changing the EXTERNAL clocks has caused an interaction between them and the internal clock in such a way that the system has failed, but that is not always the case. This is why all the tests that we hear about the FAA flying planes on which they have set forward the clocks, or setting forward the clocks in the Air Traffic Control Centers or setting forward the clocks in the Electrical Generating Plants or setting forward the clocks anywhere else may have had ABSOLUTELY NO EFFECT on the SECONDARY clock itself.

On those occasions that I have heard of people going to the EXTREME EFFORT of changing the SECONDARY internal clock, or the PROM logic that uses it, they have found disastrous results. (Perhaps I have just not heard from people who have had good results). But, in the bad cases they have had valves or switches open and not close, or close and not open. (The Three Mile Island Effect, which was a similar timing effect although it was not Y2K connected.) Moreover, once they have prompted this result, MANY of them were never able to get the device to function again. The only option has been to replace the device.

[ replacing the chip ]

Sometimes they could not replace just the chip, because the old chips or the programming for them were no longer available, and new types of chips would not fit the printed circuit board. They couldn't replace the circuit board with one that would hold the new chip, because such boards don't exist. They often couldn't even replace the PLC (Programmable Logic Controller) with a new PLC, because the new ones were not made to fit into the old cabinet or to interface with the old device. The only choice was to replace the entire valve or switch system.

[ this ends the discussion of the way the memory chip works ] [ the rest is the Author's opinion of the problem it presents ]

[ do engineers today know the problem? ]

Discussing this situation with engineers, I cannot tell how many of them REALLY understand the problem. When I search the literature, I do not really find much discussion of it. When I speak with Engineers who have actually solved the problem, they of course understand it, but unfortunately I speak with many more who do not seem to understand it. But the REAL fact is that I cannot find THAT many who will, or who are permitted to, speak with me.

[ how will the problem be addressed ]

Therefore, I am left wondering. How serious is the problem? How many engineers really have a grasp of the problem? If engineers in large companies understand the problem, but engineers in smaller companies haven't had the resources to examine it, does this mean that after the fact there is suddenly going to be a big demand for new PLCs that can't be fulfilled in time? Does this mean that there may be some smaller chemical plants where there may be sudden catastrophic failures endangering the employees and public? Does this mean that there may be great numbers of suppliers whose production will be interrupted and that this in turn will disrupt much of the whole interdependent chain of supply and demand?

........snip.........[ Author debates another engineer ]

[ date programming rules ]

Permit me to digress again here for a moment. The matter of leap year calculation in the Year 2000, is of some particular novel interest. The following are the IEEE: W2715 Rules for determining whether a year is a Leap Year (G3.4.3)

(a) If the number of the year is divisible by 4 then the year is a Leap Year, except
(b) If the number of the year ends in 00 then the year is not a Leap Year; however
(c) If the number of the year ends in 00 and the year is divisible by 400 the year is a Leap Year.

Most people are aware of rule a and most programmers should know rule b, but most people and many programmers overlook rule c. Now rule c does not happen very often. You would have to have been around in 1600, or be around in 2400, to have this rule apply. Not likely. But lo and behold, it will also apply in 2000.
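The three rules translate directly into code. The sketch below is an illustration only: both function names are made up for this example, and the second function deliberately implements just rules a and b to show the kind of oversight described above.

```python
def is_leap_year(year):
    """Apply rules a, b and c, checking the most specific rule first."""
    if year % 400 == 0:      # rule c: a century year divisible by 400
        return True
    if year % 100 == 0:      # rule b: any other year ending in 00
        return False
    return year % 4 == 0     # rule a: divisible by 4


def leap_year_without_rule_c(year):
    """Hypothetical buggy version that knows rules a and b but not c."""
    return year % 4 == 0 and year % 100 != 0


# Compare the two versions on a few years; they disagree only for 2000.
for year in (1900, 1996, 1999, 2000):
    print(year, is_leap_year(year), leap_year_without_rule_c(year))
```

A program written the second way would leave February 29, 2000 out of its calendar, which is the missing day the author goes on to discuss.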

Now, you may wonder how important this is. If you are flying in one airplane and calculate that you are at a particular location at a particular time, and another airplane figures it is going to be at the same location at the same time, but a day later then there is no problem UNLESS, one of you has a day left out of their calendar in which case you may end up there at the same time!

Or suppose someone says that they will send you some goods down the pipeline tomorrow, and then your computers compare the date and one of the computers says, "There is no tomorrow". Oh, well, you get the idea.

........snip.....

.....snip...[ Back to interview with the Engineer ]

So now we got down to my MOST IMPORTANT QUESTION, "How is it you can go into a ROM and determine whether a SECONDARY clock is being used by the program?"

........snip......[ more interview ]

Sometime later, the Engineer brought up the long explanation about LADDER Logic (see above) and how the current trend is to utilize it with Visual BASIC (the latter matter being something that I was not aware of until that moment), and then agreed, that because of this there is no way to be certain what the program logic in the ROM is doing and that it could well be utilizing a SECONDARY clock.

The way that the Engineer analyzed the problem, and given the manner that it was brought up later, I wondered if it was something that had REALLY been grasped before, and if he was not sort of analyzing it on the spot. Moreover, I wondered how many less skillful Engineers have really understood The PROBLEM. This was a matter that I brought up later, and the answer was VERY surprising.

[ how many systems could be involved ]

But, before we get to that, I was now rethinking through the numbers that I was being given. Conservatively 2,000 critical systems in a refinery, having about 25% problems or 500 with problems. These are being corrected one by one. But let us go back to my old rule of 3% and say that applies to only the identified 500 that have what I will call the hidden clock. This then gives us still 15 needles in the haystack that won't be found. If it were based on the original number of 2000 then there would be 60, and I guess that is where I probably feel that it lies in this situation. Somewhere between 15 and 60 and possibly nearer to the 15.

Even those 15, with their UNDEFINED results, may or may not have THAT detrimental an effect. In fact, I would say that in 90% of the cases they will not. But this still leaves the probability of 1 or 2 catastrophic events in such a large installation. It just does not seem realistic that we can be 100% sure that we will get 100% of the needles that can cause a catastrophe.
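The author's back-of-the-envelope numbers can be checked with a few lines of arithmetic. The percentages are the ones quoted in the text; the variable names are made up for this sketch, and applying the 10% figure to the upper bound of 60 is an extrapolation of this example and not a number the article gives.

```python
critical_systems = 2000                          # systems in one refinery
with_problems = critical_systems * 25 // 100     # 25% have problems: 500

# The author's "rule of 3%" for problems that are never found.
missed_low = with_problems * 3 // 100            # 3% of 500  = 15
missed_high = critical_systems * 3 // 100        # 3% of 2000 = 60

# If 90% of the missed ones turn out harmless, 10% remain serious.
serious_low = missed_low * 10 / 100              # 1.5, the "1 or 2" in the text
serious_high = missed_high * 10 / 100            # 6.0, this sketch's extrapolation

print(with_problems, missed_low, missed_high, serious_low, serious_high)
```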

It comes down to a matter of probabilities. In any given year even now, there is a probability that a certain number of plants will have an accident of such severity that it will cause fatalities. This is a fact of life. (Consider the past year). And it is, in my mind, a very HIGH Probability that the Y2K defect will cause more problems than would otherwise occur. I consider this a very conservative statement. Whether or not there may be catastrophes on the Bhopal scale, I don't know. With the proper effort and policies I think those kinds of disasters could be avoided. Whether or not they will be depends on how well the problem is understood.

.........snip.....[ rest of article deleted - see page if interested ]

==========================================================================

If you made it this far and still have any questions, I would be happy to answer them off-the-list via one-on-one email. This information broadened my knowledge of the subject.

Chayim Leib Weiss, EMT, CEM, CICS
All thoughts & views expressed here are my own and may or may not be those of anyone else. ;-) Have a nice day

-- Charli Claypool (claypool@belatlantic.net), December 03, 1999

Answers

Don't REMEMBER any of THIS from looking at LADDER logic. It MIGHT be that I'm JUST not clued up ENOUGH. It looks PLAUSIBLE, but all the CAPITALS and some of the technical PECULIARITIES (Rechargeable isn't a SEPARATE type of BATTERY) make me WONDER.

I guess we'll all FIND out SOON. :|

-- Colin MacDonald (roborogerborg@yahoo.com), December 03, 1999.


Excellent Post!!!! One of the most comprehensive I have seen. A must read for all forum participants!! Nice Work

-- Jim torrez (jimtorrez@hotmail.com), December 03, 1999.

Thank you Mr. Claypool,

Sir, your grasp of the English language is far superior to mine. I salute you, sir. And your article has explained exactly what I have tried to relate to this forum. And its explanation, to put it concisely, is what has had me scared to death since June 1998. And is why I am infomagic on the how bad it will get poll.

~~~~~~~~~~~~~~~~~~~~~~Shakey~~~~~~~~~~~~~~~~

-- Shakey (in_a_buner@forty.feet), December 03, 1999.


I'm not a Mr. :(

-- Charli Claypool (claypool@belatlantic.net), December 03, 1999.

Another gut-wrenching read. The Grim Reaper is regrettably sharpening his scythe.

-- Ashton & Leska in Cascadia (allaha@earthlink.net), December 03, 1999.


"The Embedded Processor SECONDARY Clock Problem by Bruce Beach April 9th, 1999"

This is old news, folks. Been discussed here many times.

-- (been@discussed.already), December 03, 1999.


Bruce Beach's letter was hard to understand, this one was very easy to understand. Thank you Charli!

Lurker 13

-- lurker 13 (lurker13@nowhere.here), December 03, 1999.


This stuff from Bruce Beach is the biggest bunch of bullshit I have ever read. He states, "I spent five days reviewing all the microprocessor information that I could find in two libraries and on the Internet, and by consulting with some other individuals in the field. I felt well prepared to conduct the interview and I would say that at this point that I feel that I have a better grasp of the situation than 99% of the other people that I meet."

I have worked on embedded chips and embedded systems for over 30 years. BB reads some stuff on the web and library for 5 days and writes this crap.

And here sit people ready and willing to suck this misinformation up? Why?

It's the same thing with Paula Gordon, she wouldn't know an embedded chip if it were in her shot glass in front of her. But she goes surfing through the internet and reads information like this stuff Bruce Beach writes, and suddenly she is an expert too.

If you do not want to use your minds and see that they do not know what they are talking about, if you want to suck up information that by their background you should know they are not qualified to give opinions on then all of the fear and stress and horror you feel over what will happen is well deserved. I don't even feel sorry for you anymore, when you choose to be misinformed.

Maybe the fear you feel is addictive and you have come to depend on it.

What really gets me is those people here that know this is a bunch of bull don't have the guts to come forward and say it. I just hope Paula Gordon doesn't try to run for office again after this, her credibility will be in the gutter, and YES, people will be reminded of this.

-- Cherri (sams@brigadoon.com), December 06, 1999.

