# 3B2 Math Acceleration

I recently started on a quest to finish up a loose end on the AT&T 3B2 emulator
by *finally* implementing a simulation of the WE32106 Math
Acceleration Unit (MAU). The MAU is an IC that accelerates floating
point operations, and it could be fitted onto the 3B2 motherboard as
an optional part. I’ve seen quite a few 3B2 systems that didn’t have
one; if it’s not present, the 3B2 uses software floating point
emulation, and gets on just fine without it. This means the 3B2
emulator is totally usable without the MAU. But still, wouldn’t it be
nice to simulate it?

Lucky for me, one of the critical pieces of documentation I’ve managed
to find over the last few years is the *WE32106 Math Acceleration Unit
Information Manual*. This little book describes the implementation
details of the WE32106, and without it it would be hopeless to even
try to simulate it. (Speaking of which: The book hasn’t been scanned
yet, which is a high priority. I have the only copy I’ve ever seen.)

But even with the book, this is no simple task. So let’s dive in a little deeper and look under the hood of the WE32106.

## Floating Point Background

The WE32106 came out in 1984, when the IEEE 754 floating point specification was just being finalized, so the chip is an extremely faithful IEEE 754 implementation. I won’t go into the nitty-gritty details of IEEE 754, because you can read about it in great detail in any number of places. But I’ll cover the basics.

Floating point numbers are stored internally in scientific notation,
using the concept of an *exponent* and a *significand* (or
*mantissa*). In decimal notation, scientific notation looks like
3.032×10^{-14} to represent a very small
number with four significant digits. If you wrote it out in full, it
would be 0.00000000000003032. That’s a lot of leading
zeros. Scientific notation allows us to break the number up into an
exponent (-14), and the significand or mantissa (3032). Using
this format requires much less space to hold very large or very
small numbers.

Of course, we don’t use decimal for storing numbers in a computer.
Instead of using powers of 10, we use powers of 2. But the principal
applies just the same, and we store a binary exponent and a binary
mantissa (the WE32106 calls this a *fraction* instead of *mantissa*).

There are three data types used in the WE32106: Single Precision, Double Precision, and Extended Precision. The main difference between these formats is how big or small a number they can hold, and with what number of significant digits.

Single Precision numbers fit into a single 32-bit word. They use one sign bit (0 for positive, 1 for negative), 23 bits for the fraction (mantissa), and 8 bits for the exponent.

Double Precision requires 64 bits and fits into two 32-bit words. It allows for much larger and smaller values than Single Precision. It also uses one sign bit, but 52 bits for the fraction, and 11 bits for the exponent.

Finally, Extended Precision allows for holding the largest and smallest values. All of the internal registers of the WE32106 use Extended Precision. It uses 80 bits, and fits into three 32-bit words.

The WE32106 uses this last format for all of its internal registers and operands, so that all of its internal operations have full precision when executed. The result can be converted to a form with less precision if needed.

### Simulating the WE32106

Obviously, a faithful simulation of the WE32106 needs to handle all of the operations performed by the real thing. So far this has been very slow going. The documentation I have is pretty good, but not as thorough as I would like, so I have had to resort to some trial and error. A favorite technique of mine is to run the MAU diagnostics, watch them fail, and then try to figure out what the tests were expecting and how that corresponds to what the documentation is saying. This has led to a lot of insights.

My first challenge, though, was dealing with the internal
representation of these numeric types. Here, I have cheated, and quite
badly. It just so turns out that *many* modern C compilers use IEEE
754 encoding under the hood, and that *many* modern C compilers
represent these three values with the types `float`

, `double`

, and
`long double`

. Emphasis here on *many*.

The C standard says nothing about how numbers need to be encoded. It
certainly does not specify IEEE 754. It also doesn’t say how many
bytes each type has to be. All of that is implementation
specific. But, since the major platforms I care about all *just
happen* to use IEEE 754 format, I let the compiler do the work for
me. Internally, I just use a C `union`

to access the raw bytes that
represent a `float`

, `double`

, or `long double`

. I store all intermediate
values as `long double`

, with a little bit of metadata, and peform
operations on `long double`

values. Then, when I need to write
out the result somewhere, I use casts where needed and reverse
the process by writing out the correct type into another C `union`

.

It gets the job done. It’s also risky. I will be adding some tests that will disable the MAU immediately if certain assumptions aren’t true at runtime. Better to run without a MAU than with a MAU that has undefined behavior.

## Exceptional Conditions

One of the hairier problems is how to handle all of the special values and edge cases of IEEE 754 math. I can’t just use C division to divide by zero, after all. I need to notice that the divisor is zero, and then mimic the correct exception. Otherwise, the simulator will crash. That’s no fun.

There are a *lot* of such edge cases, it turns out. Values can
underflow or overflow. Values can be positive zero or negative zero.
Values can be `NaN`

. The combinatorics get hairy, and so far I’ve only
just scratched the surface in the simulation.

## Where to Go From Here

I think the basic framework I’ve written so far is well on its way to being correct and usable. From here on out, I need to cover all of the special values and edge cases, do the right operations (addition, subtraction, multiplication, and division) when the values are valid, and handle potential overflows and underflows.

There’s a lot more to do, so I’ll be updating the blog here as I go.