Seth Morabito ∴ A Weblog

MMU Caching Finally Explained

Friday, March 24 2017 at 6:19 PM PDT

I had an absolute breakthrough tonight regarding how the WE32101 MMU handles caching. In fact, when I implemented the change, my simulator went from having 108 page miss errors during kernel boot, to 3. The cache is getting hit on almost every virtual address translation, and because of what I learned, the code is more efficient, too.

The key to all this was finally looking up 2-way set associative caching (see here, here, and here), which the WE32101 manual identifies as the caching strategy on the chip. Once I read about it, I was enlightened.
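For readers who have not met the term, here is a minimal sketch of a generic 2-way set-associative cache. The set count, the way a key is split into index and tag, and the replacement policy are all illustrative assumptions, not details taken from the WE32101 manual.

```javascript
// Generic 2-way set-associative cache: each set holds two entries ("ways").
// NUM_SETS and the index/tag split are hypothetical, not WE32101 values.
const NUM_SETS = 8;

function makeCache() {
  // Each set has two ways plus a marker for which way to replace next.
  return Array.from({ length: NUM_SETS }, () => ({
    ways: [null, null],
    next: 0,
  }));
}

function split(key) {
  // Low bits select the set; the remaining high bits form the tag.
  return { index: key % NUM_SETS, tag: Math.floor(key / NUM_SETS) };
}

function lookup(cache, key) {
  const { index, tag } = split(key);
  // Only the two ways of one set are compared, never the whole cache.
  const hit = cache[index].ways.find((e) => e !== null && e.tag === tag);
  return hit ? hit.value : undefined;
}

function insert(cache, key, value) {
  const { index, tag } = split(key);
  const set = cache[index];
  // Reuse the way that already holds this tag, otherwise take the victim way.
  let way = set.ways.findIndex((e) => e !== null && e.tag === tag);
  if (way === -1) {
    way = set.next;
  }
  set.ways[way] = { tag, value };
  set.next = 1 - way; // the other way becomes the next victim
}
```

Two keys that land in the same set can coexist, one per way. A third key mapping to that set evicts one of them, which is the only kind of miss the structure forces.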

3B2 System Timers and SIMH

Friday, March 24 2017 at 8:26 AM PDT

I think I made a grievous error when I originally laid out how the 3B2 system timers would work in SIMH, and last night I started down the path of correcting that.

The 3B2/400 has multiple sources of clocks and interrupts: there's a programmable interval timer with three outputs driven at 100 kHz, a timer on the UART that runs at 235 kHz, and a Time-of-Day clock. They all run at different speeds, and they're all important to system functionality.

SIMH offers a timing service that allows the developer to tie any of these clocks to the real wall clock. This is essential for time-of-day clocks or anything else that wants to keep track of time. I used this functionality to drive the UART and programmable interval timers at their correct clock speeds.

But that's completely wrong. Of course this is obvious in retrospect, but it seemed like a good idea at the time. The problem is that the main CPU clock is free to run as fast as it can on the SIMH host. On some hosts it will run very fast, and on others it will run quite a bit slower. You can't possibly know how fast the simulated CPU is stepping.

When your timers are tied to the wall clock but your CPU is running as fast as it can, there are going to be all kinds of horrible timing issues. I had lots of unpredictable and non-reproducible behavior.

Last night, I undid all of that. The timers are now counting down in CPU machine cycles. I used the simple power of arithmetic to figure out how many CPU machine cycles each step of each timer would take, and just did that instead.
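The arithmetic can be sketched roughly as follows. The CPU clock rate here is an assumption chosen for illustration, and the timer object and function names are hypothetical, not SIMH's API or the simulator's actual code.

```javascript
// Assumed CPU clock rate, for illustration only.
const CPU_HZ = 10000000;

// How many CPU cycles elapse during one tick of a timer input clock.
function cyclesPerTick(timerHz, cpuHz = CPU_HZ) {
  return cpuHz / timerHz;
}

// A countdown timer that is advanced by CPU cycles, not by wall-clock time.
function makeTimer(timerHz, count) {
  return {
    cyclesPerTick: cyclesPerTick(timerHz),
    accumulated: 0, // CPU cycles not yet converted into timer ticks
    count,          // current countdown value
    reload: count,  // value restored after the timer expires
    fired: 0,       // number of times the timer has expired
  };
}

// Call after each instruction with the number of CPU cycles it consumed.
function step(timer, cpuCycles) {
  timer.accumulated += cpuCycles;
  while (timer.accumulated >= timer.cyclesPerTick) {
    timer.accumulated -= timer.cyclesPerTick;
    timer.count -= 1;
    if (timer.count === 0) {
      timer.fired += 1;
      timer.count = timer.reload;
    }
  }
}
```

With the assumed clock, a 100 kHz input works out to 100 CPU cycles per tick. A 235 kHz input does not divide evenly, which is why the sketch carries the leftover cycles forward instead of rounding on every step.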

Now, it seems like everything is a lot more stable, and much more predictable.

Simulating the Disk Controller

Thursday, March 23 2017 at 7:48 AM PDT

I spent last night probing my 3B2/310's hard disk controller with a logic analyzer so I can see exactly how it behaves, both with and without a hard disk attached to the system. It proved to be very tricky to get the logic analyzer probes attached because the motherboard is so incredibly dense. In fact, I couldn't get a probe attached to the chip select line no matter how hard I tried. There just wasn't any room to fit a probe between the chip and a nearby resistor array, so I resorted to using a little piece of wire to just touch against the pin. I could have used three hands for that operation.

It's Hard Disk Time

Wednesday, March 22 2017 at 8:34 AM PDT

My next mini-project in the 3B2/400 simulator will be emulating the hard disk. The 3B2/400 used a NEC µPD7261A hard disk controller (PDF datasheet here), which has proved to be harder to emulate correctly than I would have liked.

So far, my hard disk controller emulation has been limited to the most minimal functionality needed to get the emulator to pass self-checks at all. Other than that, it's just a skeleton. But I believe that it's actually hanging up the floppy boot process now when UNIX tries to discover what hard drives are attached, so it's time to get serious and fix it.

My progress isn't good. I am following the datasheet to the letter, trying to give the correct status bits at the correct time, but the 3B2 just gets confused. It never even tries to read data off the drive, it just gives up trying to read status bits. So, clearly I'm doing something wrong, but I don't know what it is.

Tonight I will strap a logic analyzer to the µPD7261A in my real 3B2 and see exactly what it's doing. I'll report on my findings when I have them.

The Equipped Device Table Question is Answered

Tuesday, March 21 2017 at 7:59 PM PDT

And just like that, it's solved. I figured out the mystery of the Equipped Device Table.

The answer was in some obscure piece of System V Release 3 source code. The 3B2 system board has a 16-bit register called the Control and Status Register (CSR). In the CSR is a bit named TIMEO that I never figured out.

It turns out that I just wasn't reading the disassembled ROM code closely enough. The exception handler checks this status bit whenever it catches a processor exception while filling the EDT. If the bit is set, it skips the device.

So what is TIMEO? It's the System Bus Timeout flag, according to the SVR3 source code.

The correct behavior, then, if nothing is listening at an I/O card's address is to set an External Memory Exception, plus set this bit in the CSR. Once I implemented that in my simulator, the EDT started working exactly the same as it does on my real 3B2/310. Success!
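As a rough sketch of that behavior: the bit position of TIMEO, the slot addresses, the bus object, and the clearing of the flag after a skipped slot are all hypothetical choices made for illustration. Only the two effects (raise an External Memory Exception, set TIMEO in the CSR) and the handler's skip come from the description above.

```javascript
// Hypothetical bit position; the real CSR layout is not given here.
const CSR_TIMEO = 0x8000;

class ExternalMemoryException extends Error {}

function makeBus() {
  return { csr: 0, devices: new Map() };
}

// Read from an I/O address. If nothing is listening, flag a bus timeout
// in the CSR and raise an External Memory Exception.
function readIo(bus, address) {
  const device = bus.devices.get(address);
  if (device === undefined) {
    bus.csr |= CSR_TIMEO;
    throw new ExternalMemoryException("bus timeout at 0x" + address.toString(16));
  }
  return device.read();
}

// Probe loop modeled on the exception handler described above: when the
// exception arrives with TIMEO set, the slot is skipped.
function probeSlots(bus, slotAddresses) {
  const equipped = [];
  for (const address of slotAddresses) {
    try {
      equipped.push({ address, id: readIo(bus, address) });
    } catch (err) {
      if (!(err instanceof ExternalMemoryException) || !(bus.csr & CSR_TIMEO)) {
        throw err; // not a bus timeout, so not ours to swallow
      }
      bus.csr &= ~CSR_TIMEO; // assumption: clear the flag before the next probe
    }
  }
  return equipped;
}
```

A bus with one card present and one empty slot then yields a single entry, and the empty slot is skipped without aborting the scan.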

The Mystery of the Equipped Device Table

Tuesday, March 21 2017 at 8:18 AM PDT

EDIT: I have made a followup post detailing the answer to this mystery!

There is yet one more puzzling aspect of the 3B2 that I do not yet understand, and that is the equipped device table, or EDT. I've documented the nitty-gritty details on my main 3B2 reverse-engineering page, so I won't bore you with the details. But here's the short version.

A Last Word on MMU Caching

Monday, March 20 2017 at 10:37 AM PDT

Over the weekend I conducted several experiments with caching using my 3B2 simulator. I learned a few critical bits of information. For background, see this post and this post.

The first and most important thing I learned is that indexing cache entries only by their tags does not work. There are collisions galore, and no way to recover from them. However, if I index SD cache entries by the full SSL, and PD cache entries by the full SSL+PSL, everything seems to work perfectly. This leaves several big questions unanswered, but they are probably unanswerable. After all, I have no way of looking inside the MMU to see how it actually indexes entries in its cache; I can only go on published information and make educated guesses.
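The difference between the two keying schemes can be illustrated with a small sketch. The tag function below is purely hypothetical (it just drops the low bits of the SSL), and the descriptor values are placeholders. It is meant to show the collision problem in general, not how the real MMU derives its tags.

```javascript
// Hypothetical tag: keeps only the upper bits of the SSL.
const TAG_SHIFT = 4;
const tagOf = (ssl) => ssl >> TAG_SHIFT;

// Build a cache from (ssl, psl, descriptor) triples using a given key function.
function fill(keyOf, entries) {
  const cache = new Map();
  for (const [ssl, psl, descriptor] of entries) {
    cache.set(keyOf(ssl, psl), descriptor);
  }
  return cache;
}

// Three distinct page descriptors (placeholder values).
const entries = [
  [0x10, 0, "pd-a"],
  [0x11, 0, "pd-b"],
  [0x11, 1, "pd-c"],
];

// Keyed by tag alone: all three collapse onto one key and overwrite each other.
const byTag = fill((ssl) => tagOf(ssl), entries);

// Keyed by the full SSL+PSL: every descriptor keeps its own slot.
const byFullKey = fill((ssl, psl) => `${ssl}:${psl}`, entries);
```

Here `byTag` ends up holding one entry while `byFullKey` holds all three.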

Second, I learned that implementing MMU caching is required for the 3B2 simulator to work correctly. Until this weekend, I had not implemented any caching in the simulated MMU because I assumed that caching was only used for performance reasons and could be skipped. But this is not true. In fact, UNIX SVR3 changes page descriptors in memory without flushing the cache and relies on the MMU to use the old values until it requests a flush. Not having implemented caching was a source of several serious bugs in the simulator.
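A minimal sketch of why the cache is behaviorally visible, with a `Map` standing in for descriptor memory and hypothetical function names throughout:

```javascript
// A toy MMU: `memory` holds page descriptors, `pdCache` holds cached copies.
function makeMmu(memory) {
  return { memory, pdCache: new Map() };
}

// Translation consults the cache first and reads memory only on a miss.
function translate(mmu, key) {
  if (!mmu.pdCache.has(key)) {
    mmu.pdCache.set(key, mmu.memory.get(key));
  }
  return mmu.pdCache.get(key);
}

// An explicit flush is the only thing that discards cached descriptors.
function flush(mmu) {
  mmu.pdCache.clear();
}
```

If the operating system rewrites a descriptor in memory and then keeps running, `translate` goes on returning the old value until `flush` is called. A simulator with no cache would pick up the new value immediately, which is not what the software expects.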

Third, I learned that the "Cacheable" bit in segment descriptors is inverted. When it's a 1, caching is disabled. When it's a 0, caching is enabled.

The 3B2/400 simulator now makes it all the way through booting the SVR3 kernel and starting /etc/init. There are some other bugs preventing init from spawning new processes, but I hope to have these ironed out soon.

MMU Caching Redux

Sunday, March 19 2017 at 8:41 AM PDT

I had a Eureka! moment last night about MMU caching, but it all came tumbling down this morning.

My realization was that the Segment Descriptors are 8 bytes long, and that Page Descriptors are 4 bytes long. So, if we assume that the virtual address encodes the addresses of the SDs and PDs on word-aligned boundaries (and SDs and PDs are indeed word-aligned in memory), then you don't need the bottom three bits for SD addresses, nor do you need the bottom two bits for PD addresses. Voila!

But this morning, I remembered two very important facts:

  • The SSL field in the virtual address is an index into the table of SDs, not an address offset.
  • Likewise, the PSL field is an index into the table of PDs, not an address offset.
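To make the distinction concrete, here is a small sketch of how an index turns into a descriptor address. The descriptor sizes come from the paragraph above; the table base addresses and function names are hypothetical.

```javascript
const SD_SIZE = 8; // bytes per Segment Descriptor
const PD_SIZE = 4; // bytes per Page Descriptor

// SSL is an index into the SD table, so the address is base + index * size.
function sdAddress(sdtBase, ssl) {
  return sdtBase + ssl * SD_SIZE;
}

// PSL is an index into the PD table in the same way.
function pdAddress(pdtBase, psl) {
  return pdtBase + psl * PD_SIZE;
}
```

The alignment is supplied by the multiplication, not by the field itself. Two neighboring descriptors differ only in the lowest bit of the index, so none of the low bits of the SSL or PSL are redundant.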

MMU Caching for Fun and Profit

Saturday, March 18 2017 at 6:37 PM PDT

I'm in the middle of a very long, very drawn out project to try to emulate the AT&T 3B2/400 computer. I should probably have been sharing my progress more frequently than I have been, but it has for the most part been a painful and solitary endeavor.

Today, though, there is something in particular that is bothering me greatly, and I must yell into the void to get this frustration off my chest. And that is, how in the hell does the MMU cache work?

So first, a little background.

An Ode to the Cruxen

Sunday, December 18 2016 at 1:51 PM PST

This is the story of a short time, a quarter century ago, when a little cluster of computers at Cornell University played a very important part in my life and in the lives of my friends.

It was the fall of 1992, and the Internet was growing up. It would still be another year or two before it became a household name, so for the time being it was our little playground, our special place that you couldn't get to unless you were at a big research company or a reasonably well endowed University. I lived at Mary Donlon Hall, one of two dorm buildings that had recently been wired with Ethernet and therefore offered its residents a mainline into the addictive world of the Internet.

Fiber Optic Bliss

Tuesday, August 4 2015 at 1:34 PM PDT

It's been a long time since I wrote about my tragic tale of Internet woe. I owe you all an update, and I'm very happy to say that at long last, we have broadband. It's kind of a long story.

The WATs of JavaScript

Monday, June 29 2015 at 2:28 PM PDT

At CodeMash 2012, Gary Bernhardt gave a now-infamous lightning talk that has become known simply as The WAT Talk, in which he presents several of the more surprising behaviors of Ruby and JavaScript. I've passed the video around quite a few times, and I've pointed out some other JavaScript behaviors that seem pretty outlandish at first sight. But I'm feeling a little guilty about poking fun at JavaScript, so I wanted to dive further into these WATs and talk about why they happen.

I'm not here to defend JavaScript — it doesn't need my defense. It's not a perfect language, and it's not my favorite language, but it is the single most popular language on the web right now, and because more and more people are using it for the first time, I think it's worth the effort to go over some of these unexpected behaviors to help newcomers avoid common pitfalls.

But first, the WATs.

The WATs

> [] == []
false
> [] == ![]
true
> [] + []
''
> [] - []
0
> [ null, undefined, [] ] == ',,'
true
> [] + {}
'[object Object]'
> {} + []
0
> Math.min() < Math.max()
false
> 10.8 / 100
0.10800000000000001

WAT?!
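As a hedged preview of where several of these results come from, the sketch below spells out the individual conversions involved. The variable names are just labels for this example.

```javascript
// An empty array converts to the empty string, and from there to 0.
const emptyAsString = String([]); // ''
const emptyAsNumber = Number([]); // 0

// Every object is truthy, so negating an array gives false.
const notEmpty = ![];

// Two array literals are two distinct objects, so == compares references.
const sameReference = [] == [];

// Array-to-string joins elements with commas; null, undefined and []
// each contribute an empty string.
const joined = String([null, undefined, []]);

// A plain object converts to '[object Object]'.
const objectAsString = String({});

// When a line starts with `{`, it can be parsed as an empty block, leaving
// `+ []` as a unary plus applied to an array.
const unaryPlus = +[];

// With no arguments, min and max return their identity values.
const minNoArgs = Math.min(); // Infinity
const maxNoArgs = Math.max(); // -Infinity
```

So `[] == ![]` compares an array with `false`: both sides end up as the number 0. `[] + {}` concatenates `''` with `'[object Object]'`, and `Math.min() < Math.max()` asks whether `Infinity` is less than `-Infinity`.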