6.371 Fall 2002 - Final Project
Booth Recoded Multiplication
Created By: Sean Lie, Benjamin Walker, Jeremy Walker
Introduction
Our project involved several different implementations of an 8x8 bit Booth recoded multiplier. Our initial plan was to implement 4 different versions of the multiplier:
1. A static CMOS 8x8 bit Booth-recoded multiplier
2. A dual-rail domino logic 8x8 bit Booth-recoded multiplier
3. A two-stage pipelined static CMOS 8x8 bit Booth-recoded multiplier
4. A two-stage self-timed pipeline dual-rail domino logic 8x8 bit Booth-recoded multiplier
In the end, we were able to implement the first 3 multipliers and some of the additional logic required to make a self-timed pipeline version.
For information on how Booth recoded multiplication works, please see the following link:
http://www.geoffknagge.com/fyp/booth.shtml
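As a quick software illustration of the idea, here is a minimal Python sketch of Booth recoding. It assumes radix-4 (modified Booth) recoding and signed two's-complement operands, neither of which is stated explicitly in this report. The function names booth_radix4_digits and booth_multiply are hypothetical and are not taken from our layouts or SLS netlists.

```python
def booth_radix4_digits(y, bits=8):
    """Recode a signed `bits`-wide multiplier into radix-4 Booth digits, LSB first."""
    if bits % 2:
        raise ValueError("bits must be even for radix-4 recoding")
    y &= (1 << bits) - 1              # keep only the two's-complement bit pattern
    prev = 0                          # implicit bit y[-1] = 0
    digits = []
    for i in range(0, bits, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)   # each digit is in {-2, -1, 0, 1, 2}
        prev = b1
    return digits


def booth_multiply(x, y, bits=8):
    """Multiply two signed `bits`-wide integers by summing Booth partial products."""
    product = 0
    for k, digit in enumerate(booth_radix4_digits(y, bits)):
        product += (digit * x) << (2 * k)   # partial product weighted by 4**k
    return product
```

Under these assumptions an 8-bit multiplier recodes into four digits, so an 8x8 multiply needs only four partial products, each being 0, ±x, or ±2x.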
The devil is in the details
Static CMOS Implementation
Pipelined Static CMOS Implementation
Dual-Rail Domino Logic Implementation
What worked, what didn't, what went wrong?
Tools
In general, we found the Ocean set of tools very frustrating to use, and we lost countless hours fighting to make them work. Seadali crashed, and check nets and trout would core dump for no apparent reason, forcing us to work through a list of measures that had fixed a core dump problem in the past. For future reference, here is a list of what we tried when trout or check nets would continually core dump.
1. rm -r seadif; nelsea (rebuilds the nelsis database). Helped sometimes. If this didn't fix the error on the first try, further attempts were usually useless.
2. mk6371 <temp project directory>; Import all the previous layouts into the new project directory. Helped a few times.
3. cp -r <project directory> <new project directory>. Never helped, but who knows? It might sometime. Use in conjunction with 1.
4. Properly name instances in your layout. That is, even though it may seem obvious what each instance is supposed to be named, helping trout or check nets along will never hurt.
5. Change the instance names in the SLS and your layout. Sometimes you just have to shake things up a bit.
6. Convert any arrays in SLS (e.g. {xinvs[0..8]} inv(a[0..8],y[0..8],gnd[0..8],pwr[0..8])) into separate instances (e.g. {xinv1} inv(a1,y1,vss,vdd)). This particular fix ended a 3-hour stall in progress while laying out the domino Booth array.
7. Use another Athena account. One member of our group was unable to even do the demo Adder8 layout in lecture. It seems that some Athena accounts are more equal than others.
8. Make sure you have enough drive space available. As the amount of free space drops, it seems that trout and check nets have a tendency to fail more often.
Static Booth Multiplier
The static Booth multiplier worked under exhaustive testing. However, it was quite slow in terms of overall timing. The delay appears to come from a critical path through the partial product (pp) control and generation logic in the first stage and the full adders in the following three stages, and it can be attributed to the slowness of the XOR gate. Another problem was that, initially, there was no signal buffering from the pp control logic to the pp generation logic, nor from the least significant two bits of the sum and carry outputs of each stage to the inputs of the 16-bit adder.
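For an 8x8 multiplier, exhaustive testing means checking all 2^16 = 65,536 operand pairs. The Python sketch below shows one way such a check could be written against a software reference. It is not the test harness we actually used. The names exhaustive_check and shift_add_model are hypothetical, and the signed operand range is an assumption.

```python
from itertools import product


def exhaustive_check(multiply, bits=8, signed=True):
    """Return every (x, y, got, expected) where `multiply` disagrees with x * y."""
    if signed:
        operands = range(-(1 << (bits - 1)), 1 << (bits - 1))
    else:
        operands = range(1 << bits)
    failures = []
    for x, y in product(operands, repeat=2):   # every operand pair
        got = multiply(x, y)
        if got != x * y:
            failures.append((x, y, got, x * y))
    return failures


def shift_add_model(x, y):
    """Hypothetical stand-in for the design under test (plain shift-and-add)."""
    sign = -1 if (x < 0) != (y < 0) else 1
    a, b, acc = abs(x), abs(y), 0
    while b:
        if b & 1:
            acc += a
        a <<= 1
        b >>= 1
    return sign * acc
```

Calling exhaustive_check(shift_add_model) returns an empty list when every pair matches. In practice the multiply argument would wrap whatever produces the simulated circuit output.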
Pipelined Booth Multiplier
The pipelined Booth multiplier also worked under exhaustive testing, but it too had timing issues. The pipelined version actually turned out to be slower than the static Booth multiplier without any pipelining. We realised that the propagation delay of the adder would increase, since most of its bits would not be set up the way they were in the unpipelined version. Even so, the time the adder apparently required was much larger than expected. When we attempted to verify what was going on by simulating only the adder in spice, spice kept crashing, and we were unable to discover the cause of the massive slowdown.
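The effect of input timing on the adder can be illustrated with a toy model. The Python sketch below assumes a ripple-carry adder with unit gate delays and made-up arrival times. This report does not state the adder architecture, so the model and all of its numbers are hypothetical and are not measurements from our circuits.

```python
def ripple_sum_times(arrival, t_carry=1.0, t_sum=1.0):
    """Time at which each sum bit of a ripple-carry adder settles.

    arrival[i] is when both operand bits i are valid (bit 0 first).
    """
    carry_ready = 0.0                 # carry-in assumed valid at t = 0
    sum_ready = []
    for t_in in arrival:
        start = max(t_in, carry_ready)    # wait for operands and incoming carry
        sum_ready.append(start + t_sum)
        carry_ready = start + t_carry
    return sum_ready


def adder_delay_after_inputs(arrival):
    """Delay from the last input arriving to the last sum bit settling."""
    return max(ripple_sum_times(arrival)) - max(arrival)


BITS = 16
staggered = [0.5 * i for i in range(BITS)]   # hypothetical: low bits ready early
registered = [0.0] * BITS                    # all bits leave a register together

print(adder_delay_after_inputs(staggered))
print(adder_delay_after_inputs(registered))
```

With these made-up delays the adder adds 8.5 units after its last input in the staggered case but 16 units when all bits arrive together. In this model, the staggered case lets the carry chain start working while the upper bits are still being computed, whereas a pipeline register in front of the adder removes that overlap.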
Domino Multiplier
The domino multiplier worked under exhaustive testing and also worked under spice simulation and sls-timing. Unfortunately, when we tried to take data for our chosen test cases, SLS running spice would not display the entire simulation time. Simulating in HSPICE instead yielded data files that awaves couldn't read.
We also spent a number of hours trying to buffer the outputs from the domino Booth array, but check nets started acting up again and we were running out of time, so this was never completed even though all the components are in place.
The domino Booth multiplier was definitely slower than we expected. This was true of all the domino circuitry. You can read more specifics in the Dual-Rail Domino Logic Implementation section above.