Tuesday, May 22 2018 at 2:59 PM PDT
On November 29, 2014, I posted the following message to the Classic Computer Mailing List.
Folks, For some reason I got it in my head that writing an AT&T 3B2 emulator might be a good idea. That idea has pretty much been derailed by lack of documentation. I have been unable to find any detailed technical description of any 3B2 systems. Visual inspection of a 3B2 300 main board reveals the following major components: - WE32100 CPU - WE32101 MMU - 4 x D2764A EPROM - TMS2797NL floppy disk controller - PD7261A hard disk controller - SCN2681A dual UART - AM9517A Multimode DMA Controller How these are addressed is anybody's guess. To even dream of doing an emulator, I at least need to know the system memory map -- what physical addresses these devices map to. Without that it's pretty pointless to get started. If anyone has access to this kind of information, please drop me a line. Otherwise, I'll just put this one on the far-back burner! -Seth
When I sent that message, I had no idea that it would lead to almost four years of effort, and perhaps if I had known I would have given up. But thankfully I didn't, and today I'm happy to say that the effort was, at long last, successful. The 3B2/400 emulator works well enough that I have released it to the world and rolled it back into the parent SIMH source tree.
Because this project required so much reverse engineering, and because documentation about the 3B2 is still so scarce and hard to come by, I wanted to take the time to document how the emulator came about.
Friday, March 24 2017 at 6:19 PM PDT
I had an absolute breakthrough tonight regarding how the WE32101 MMU handles caching. In fact, when I implemented the change, my simulator went from having 108 page miss errors during kernel boot, to 3. The cache is getting hit on almost every virtual address translation, and because of what I learned, the code is more efficient, too.
The key to all this was finally looking up 2-way set associative caching (see here, here, and here), which the WE32101 manual identifies as the caching strategy on the chip. Once I read about it, I was enlightened.
Friday, March 24 2017 at 8:26 AM PDT
I think I made a grievous error when I originally laid out how the 3B2 system timers would work in SIMH, and last night I started down the path of correcting that.
The 3B2/400 has multiple sources of clocks and interrupts: There's a programmable interval timer with three outputs being driven at 100KHz, a timer on the UART that runs at 235KHz, and a Time-of-Day clock. They're all driven at different speeds, and they're all important to system functionality.
SIMH offers a timing service that allows the developer to tie any of these clocks to the real wall clock. This is essential for time-of-day clocks or anything else that wants to keep track of time. I used this functionality to drive the UART and programmable interval timers at their correct clock speeds.
But that's completely wrong. Of course this is obvious in retrospect, but it seemed like a good idea at the time. The problem is that the main CPU clock is free to run as fast as it can the SIMH host. On some hosts it will run very fast, on some hosts it will run quite a bit slower. You can't possibly know how fast the simulated CPU is stepping.
When your timers are tied to the wall clock but your CPU is running as fast as it can, there are going to be all kinds of horrible timing issues. I had lots of unpredictable and non-reproducible behavior.
Last night, I undid all of that. The timers are now counting down in CPU machine cycles. I used the simple power of arithmetic to figure out how many CPU machine cycles each step of each timer would take, and just did that instead.
Now, it seems like everything is a lot more stable, and much less unpredictable.
Thursday, March 23 2017 at 7:48 AM PDT
I spent last night probing my 3B2/310's hard disk controller with a logic analyzer so I can see exactly how it behaves, both with and without a hard disk attached to the system. It proved to be very tricky to get the logic analyzer probes attached because the motherboard is so incredibly dense. In fact, I couldn't get a probe attached to the chip select line no matter how hard I tried. There just wasn't any room to fit a probe between the chip and a nearby resistor array, so I resorted to using a little piece of wire to just touch against the pin. I could have used three hands for that operation.
Wednesday, March 22 2017 at 8:34 AM PDT
My next mini-project in the 3B2/400 simulator will be emulating the hard disk. The 3B2/400 used a NEC µPD7261A hard disk controller (PDF datasheet here), which has proved to be harder to emulate correctly than I would have liked.
So far, my hard disk controller emulation has been limited to the most minimal functionality needed to get the emulator to pass self-checks at all. Other than that, it's just a skeleton. But I believe that it's actually hanging up the floppy boot process now when UNIX tries to discover what hard drives are attached, so it's time to get serious and fix it.
My progress isn't good. I am following the datasheet to the letter, trying to give the correct status bits at the correct time, but the 3B2 just gets confused. It never even tries to read data off the drive, it just gives up trying to read status bits. So, clearly I'm doing something wrong, but I don't know what it is.
Tonight I will strap a logic analyzer to the PD7261a in my real 3B2 and see exactly what it's doing. I'll report on my findings when I have them.
Tuesday, March 21 2017 at 7:59 PM PDT
And just like that, it's solved. I figured out the mystery of the Equipped Device Table.
The answer was in some obscure piece of System V Release 3 source code. The 3B2 system board has a 16-bit register called the Control and Status Register (CSR). In the CSR is a bit called named TIMEO that I never figured out.
It turns out that I just wasn't reading the disassembled ROM code closely enough. The exception handler checks this status bit whenever it catches a processor exception while filling the EDT. If the bit is set, it skips the device.
So what is TIMEO? It's the System Bus Timeout flag, according to the SVR3 source code.
The correct behavior, then, if nothing is listening at an I/O card's address is to set an External Memory Exception, plus set this bit in the CSR. Once I implemented that in my simulator, the EDT started working exactly the same as it does on my real 3B2/310. Success!
Tuesday, March 21 2017 at 8:18 AM PDT
There is yet one more puzzling aspect of the 3B2 that I do not yet understand, and that is the equipped device table, or EDT. I've documented the nitty-gritty details on my main 3B2 reverse-engineering page, so I won't bore you with the details. But here's the short version.
Monday, March 20 2017 at 10:37 AM PDT
The first and most important thing I learned is that indexing cache entries only by their tags does not work. There are collisions galore, and no way to recover from them. However, if I index SD cache entries by the full SSL, and PD cache entries by the full SSL+PSL, everything seems to work perfectly. This leaves several big questions unanswered, but they are probably unanswerable. After all, I have no way of looking inside the MMU to see how it actually indexes entries in its cache, I can only go on published information and make educated guesses.
Second, I learned that implementing MMU caching is required for the 3B2 simulator to work correctly. Until this weekend, I had not implemented any caching in the simulated MMU because I assumed that caching was only used for performance reasons and could be skipped. But this is not true. In fact, UNIX SVR3 changes page descriptors in memory without flushing the cache and relies on the MMU to use the old values until it requests a flush. Not having implemented caching was a source of several serious bugs in the simulator.
Third, I learned that the "Cacheable" bit in segment descriptors is inverted. When it's a 1, caching is disabled. When it's a 0, caching is enabled.
The 3B2/400 simulator now makes it all the way through booting the SVR3 kernel and starting /etc/init. There are some other bugs preventing init from spawning new processes, but I hope to have these ironed out soon.
Sunday, March 19 2017 at 8:41 AM PDT
I had a Eureka! moment last night about MMU caching, but it all came tumbling down this morning.
My realization was that the Segment Descriptors are 8 bytes long, and that Page Descriptors are 4 bytes long. So, if we assume that the virtual address encodes the addresses of the SDs and PDs on word-aligned boundaries (and SDs and PDs are indeed word-aligned in memory), then you don't need the bottom three bits for SD addresses, nor do you need the bottom two bits for PD addresses. Voila!
But this morning, I remembered two very important facts:
- The SSL field in the virtual address is an index into the table of SDs, not an address offset.
- Likewise, the PSL field is an index into the table of PDs, not an address offset.
Saturday, March 18 2017 at 6:37 PM PDT
I'm in the middle of a very long, very drawn out project to try to emulate the AT&T 3B2/400 computer. I should probably have been sharing my progress more frequently than I have been, but it has for the most part been a painful and solitary endeavor.
Today, though, there is something in particular that is bothering me greatly, and I must yell into the void to get this frustration off my chest. And that is, how in the hell does the MMU cache work?
So first, a little background.
Sunday, December 18 2016 at 1:51 PM PST
This is the story of a short time, a quarter century ago, when a little cluster of computers at Cornell University played a very important part in in my life and in the lives of my friends.
It was the fall of 1992, and the Internet was growing up. It would still be another year or two before it became a household name, so for the time being it was our little playground, our special place that you couldn't get to unless you were at a big research company or a reasonably well endowed University. I lived at Mary Donlon Hall, one of two dorm buildings that had recently been wired with Ethernet and therefore offered its residents a mainline into the addictive world of the Internet.