CPLD + Flash memory cartridge

A few weeks before I started this project, [Grapsus] found a small, inexpensive, 5V-tolerant QFP44 CPLD on Farnell: the Altera EPM3032. Thinking I could turn it into a mapper for a Gameboy cartridge, he asked whether he should order one for me. A few days later, once it had arrived, he decided to make it easier to use by creating a custom breakout board with his etching technique (of which I'm jealous). So I got two of these CPLDs and one breakout board from him, without really knowing what I would do with them.

While I was nursing my post-failure depression, my eyes fell on the small PCB. Hope returned instantly! I hadn't been able to get things done with I2C serial memory, but I would certainly manage by imitating the internal architecture of the VSMs with a little logic and parallel memory!


I soldered the CPLD to the board, installed two decoupling capacitors, and used another prototype board to host the parallel memory and the voltage regulators.

With only a 29SF256 32kB flash chip available, I wired the DIP28 socket to hold A14 low and thus use only the first 16kB.

For power, since the cartridge connector only provides VBAT (6V) and VSS (-3.5V), I simply tied the regulators' ground to -3.5V. A 78L05 drops the resulting 9.5V to 5V for the flash, and a 1117-3.3 behind it provides 3.3V to the CPLD.

This time I installed real voltage dividers on the inputs, so as not to be too brutal with the CPLD, which already makes the effort to tolerate 5V. I chose resistor values high enough not to pull the lines low too hard: with 68k and 28k we get 2.7V from 9.5V, a level the CPLD accepts.
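As a quick check of the divider arithmetic, here is the unloaded divider formula applied to the values above. It assumes the 28k resistor is the lower leg (towards the cartridge's virtual ground) and the 68k the upper one, and it neglects the CPLD's input current:

```latex
\[
V_{\text{out}}
  = V_{\text{in}} \cdot \frac{R_{\text{low}}}{R_{\text{high}} + R_{\text{low}}}
  = 9.5\,\text{V} \cdot \frac{28\,\text{k}\Omega}{68\,\text{k}\Omega + 28\,\text{k}\Omega}
  \approx 2.77\,\text{V}
\]
```

That is slightly above the 2.7V quoted. With the standard 27k value, the same formula gives exactly 2.7V (9.5 × 27 / 95), so the "28k" may be a rounding or a typo, but that is only a guess.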

Wiring was done in Kynar, handy in this situation though somewhat fragile. I added a JTAG connector on the right side of the board.

As with the AVR, I used an external transistor (Q1) to make an open collector output for the CPLD.

Like most CPLD IDEs, Quartus lets you define the logic in many different ways. Having already tried schematic capture on another project, this time I wanted to try Verilog. Not wanting to go too fast, I first wrote a simple incrementing counter clocked by ROMCLK and routed its outputs to the flash's address lines.
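A first test like the counter described above could look roughly like this. It is only a sketch, not the original test code: the module name counter_test and the port names are made up for illustration, and the 14-bit width is assumed from the 16kB of flash in use.

```verilog
// Hypothetical test module (names are illustrative, not from the project)
module counter_test
(
	input romclk,				// Clock from the S&S
	output reg [13:0] address	// Routed to the flash's address lines A0..A13
);

// Increment the address on every rising edge of romclk.
// There is no reset: for a free-running counter the start value does not
// matter on real hardware (a simulation would need an initial value).
// The counter wraps around to 0 after 16383.
always @(posedge romclk)
begin
	address <= address + 1'b1;
end

endmodule
```

With this loaded, A0 should toggle at half the ROMCLK frequency, A1 at a quarter, and so on, which makes each address line easy to check on a scope.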

First problem: when I connected the ByteBlaster (Altera programmer) to the JTAG port, the S&S crashed. Since the programmer has buffered outputs that adapt to 3.3V, and the voltage was good, the problem could only come from ground.
It took me a few minutes to realize that I was short-circuiting the "virtual ground" I had made on the cartridge, which actually sits at -3.5V relative to the real ground, and my still-connected AVR programmer was tying that real ground to my computer's USB ground! The solution? Unplug the AVR programmer: circuit opened.

After programming the CPLD, I ran into a second problem: the first address line output by the CPLD, which should toggle at half ROMCLK's frequency, was awfully glitchy. With such simple code, the most likely cause was that the CPLD wasn't receiving ROMCLK properly. To understand the problem better, I simplified the code as far as possible, just outputting a copy of the signal. Surprise: the copy seemed to match ROMCLK... with one exception. Zooming in on the falling edge of the replica, I noticed a very short spike of about 50 nanoseconds, which wasn't present on the signal from the S&S.
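The "copy of the signal" test amounts to a single continuous assignment. A minimal sketch, with a hypothetical module name and output pin name:

```verilog
// Hypothetical debug module: copy romclk straight to an output pin
module romclk_copy
(
	input romclk,		// Clock as seen by the CPLD input
	output romclk_out	// Replica to compare against the original on a scope
);

// Purely combinational: any glitch on the input threshold shows up on the output
assign romclk_out = romclk;

endmodule
```

Because no register is involved, whatever the input buffer does around its threshold appears directly on the output pin, which is what makes the 50 ns spike visible.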

I had a doubt: does the CPLD have Schmitt trigger inputs at all? The falling slope of the true ROMCLK signal is much longer than the rising slope. A voltage that changes (too) slowly would make the CPLD's input unstable around its threshold and cause the short oscillation. The datasheet confirmed my doubt: the only pins with hysteresis are the JTAG pins. The solution came naturally: insert a component with such inputs between the S&S and the CPLD. I took a CD4093 (quad NAND with Schmitt trigger inputs) because it was the only suitable chip I had. Problem solved! This chip can be seen hanging from wires, with lots of cut pins, on the left side of the board.

At this point, I was hoping that ROMCLK would be the only signal needing this treatment. After a quick look at the bus and M1/M0, I was reassured to see that these signals didn't have such a long slope.

I then went back to my counter code to test the address bus, and found inconsistencies on lines A4 and A11. I hadn't noticed that these two pins sit directly opposite each other on the socket, and spent a good half hour searching for the problem.

I finally realized that there was a short between these two lines, caused by a badly cut track on the proto board. Stupid mistake, I had gone too fast. Once the track was properly cut, all the address lines worked. However, something worried me again: with just the 14-bit counter synthesized, Quartus reported that 27 of the 32 macrocells in the CPLD were used. Not even a quarter of the functionality, and only 5 cells remaining. I lied to myself, hoping that everything would fit in the end.

Regardless, I started writing the code to simulate the behavior of the VSMs. For this, ModelSim (bundled with Quartus) was a great help, letting me visualize the state of the internal signals and outputs as a function of the inputs, without having to program the CPLD for real and wire / rewire everything constantly.

While assigning all the necessary signals to the CPLD's pins, I also realized that I was very lucky for once: 34 pins used out of 34 available. After dozens of tests and pulling out my remaining hair, I got a simulated timing diagram that closely resembled those in the datasheets. Some internal registers are used, such as "intaddress", a 23-bit register that contains the address of the next bit (yes bit, not byte) to read.

[Image: simulated timing diagram; the letters A to M below refer to its markers]

A: Rising edge of ROMCLK, M1 and gotnop are high, 4 address bits are latched and gotnop becomes 0 to ignore M1 on the next rising edges of ROMCLK.
B: Rising edge of ROMCLK, M1 and M0 are low, that's a NOP, gotnop becomes 1.
C: Rising edge of ROMCLK, M1 and gotnop are high again, 4 new address bits are latched.
D: The last bits of the address are latched, the address bus of the flash now holds the right address.
E: Rising edge of ROMCLK with M0 high, first read command after an address set is a dummy read. The bit counter in intaddress is set to 0 and a byte is loaded from the flash in register datab (10100101). The dummy flag is set.
F: Rising edge of ROMCLK with M0 and dummy high. intaddress is incremented and its chip select bits (20 to 17) are compared with the chip select mask (0010); on a match, outp is set.
G: Falling edge of M0, bit #0 is answered on bus8o ("1"). Note that in the final code, the output is inverted because it is open collector (Q1).
H: Rising edge of ROMCLK, M1 and M0 low: NOP after a read, datab is shifted one bit to the right to prepare for the next possible read.
I: Rising edge of ROMCLK with M0 high, new read.
J: Falling edge of M0, bit #1 is answered on bus8o ("0").
K: Rising edge of ROMCLK with M0 high, new read. A new byte must be reloaded from the flash (10010011).
L: Falling edge of M0, bit #0 is answered on bus8o ("1").
M: Rising edge of M1: Release ADD8, putting bus8o to 0.

The only problem with this code is that, with the asynchronous signals and the large 23-bit counter, it synthesizes into 45 macrocells. The CPLD has only 32. I then prayed for a beefier version in the same package, and found the EPM3064! I ordered it, swapped it onto the board, and programmed it with the new code.

I wrote a minimal ROM with a single word and its LPC data, flashed it, installed it on the board, and switched on the S&S. Cold sweat just before pressing the "Select module" button: the S&S played the 4 tones indicating that the board was recognized. That meant the logic definition was good enough to get the magic bytes sent right.
I pressed the "GO" button and heard "SPELL BUTTHOLE". Victory!

Download the Quartus project with Verilog source.

Declaration of the inputs and outputs of the module (called "snsrom"): romclk, m0, m1, the 4-bit bus and 8-bit data bus from the flash are inputs. The address bus and ADD8 reply signal are outputs.

module snsrom
(
	input romclk,		// 160kHz clock from synthesizer
	input m0,			// Command
	input m1,			// Command
	input [3:0] bus,	// ADD8/4/2/1
	output reg bus8o,	// Bit output (open collector on ADD8)
	output reg [13:0] address,	// Flash address
	input [7:0] data	// Flash data
);

Declaration of the internal registers: gotnop is a flag that records whether we passed through a nop command, dummy whether we got a dummy read, and outp whether we should output a bit on the next falling edge of m0. intaddress is the current bit address (4 bits of CS, 14 bits of byte address, 3 bits of bit number). datab is the shift register used to output bits.

reg gotnop;	// Did we get a nop ?
reg dummy;	// Did we get a dummy read ?
reg outp;	// Should we output a bit ?
reg [22:0] intaddress;	// __CC CCAA AAAA AAAA AAAA BBB, C=Chip Select, A=Address, B=Bit number
reg [7:0] datab;			// Data shift register
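
To make the layout in the comment above concrete, the fields of intaddress can be written as bit slices. The module below is a hypothetical stand-alone illustration (intaddress_fields and its port names are not part of the project); the slice boundaries are the ones the original code uses for the chip select comparison and for the flash address.

```verilog
// Hypothetical helper, for illustration only: names the fields of intaddress
module intaddress_fields
(
	input [22:0] intaddress,
	output [3:0] cs,			// Chip select, compared with 4'b0010 in the read logic
	output [13:0] byteaddr,		// Byte address, driven onto the flash address lines
	output [2:0] bitnum			// Bit number inside the current byte
);

assign cs = intaddress[20:17];
assign byteaddr = intaddress[16:3];
assign bitnum = intaddress[2:0];
// intaddress[22:21] are never compared or routed to a pin: they only hold
// bits in transit while the address is being shifted in 4 bits at a time

endmodule
```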

Asynchronous events: when there's a falling edge on m0 while outp and gotnop are both 1, a bit is replied. When there's a rising edge on m1, ADD8 is released.

always @(negedge m0 or posedge m1)
begin
	if (m1 == 1)
	begin
		bus8o <= 1'b0;		// Release bus
	end
	else if (outp == 1 && gotnop == 1)
	begin
		if (datab[0] == 1)	// Output bit (inverted because of the open collector)
			bus8o <= 1'b0;
		else
			bus8o <= 1'b1;
	end
end

Synchronous events: If we have a nop, the gotnop flag is set to 1. If outp is 1, it's the first nop after a bit read command, datab is shifted right once to prepare for the next output and outp is set to 0.

always @(posedge romclk)	// Should be negedge, but romclk is inverted by the CD4093
begin
	if ({m1,m0} == 2'b00)	// 00: Nop
	begin
		gotnop = 1;
		if (outp == 1)			// First nop after a read ?
		begin
			datab[6:0] = datab[7:1];	// Yes: shift
			outp = 0;
		end
	end

If we get the set address command and gotnop is 1, we shift intaddress right by 4 bits and load the bus value into the MSBs. dummy, gotnop and outp are set to 0.

	else if ({m1,m0} == 2'b10)	// 10: Set address
	begin
		if (gotnop == 1)		// Only accept after a nop
		begin
			intaddress[18:3] = intaddress[22:7];	// Shift address
			intaddress[22:19] = bus[3:0];			// Load in MSBs
		end
		dummy = 0;
		gotnop = 0;
		outp = 0;
	end

If we get the read command, the bit must be prepared for the next falling edge of m0. If this is the first read command after a set address, it's a dummy read: datab is loaded with the byte from the flash, the bit counter in intaddress is set to 0 and the dummy flag is set to 1.
If this is not a dummy read, intaddress is incremented and a new byte is loaded if the bit address is 1 (not 0, because the flash wouldn't have time to set up its data bus). The chip select bits of intaddress are compared to the hardcoded CS mask and outp is set to 1 if there's a match.

	else if ({m1,m0} == 2'b01)	// 01: Read bit
	begin
		if (dummy == 0)
		begin
			datab[7:0] = data[7:0];	// Dummy read, load byte and set bit counter to zero
			intaddress[2:0] = 0;
			dummy = 1;
		end
		else
		begin
			intaddress[20:0] = intaddress[20:0] + 1;	// Real read, pre-increment address
			if (intaddress[2:0] == 1)
				datab[7:0] = data[7:0];		// Load new byte if bit address is 1
			if (intaddress[20:17] == 4'b0010) outp = 1;	// If the CS matches, prepare output
		end
	end
end

Bits 3 to 16 of intaddress are routed to the address lines of the flash (the upper bits used for CS and the 3 LSBs used as bit counter aren't needed).

always @*
begin
	address[13:0] = intaddress[16:3];
end

endmodule