[Video] A simple 16MHz hack for my STE: why memory access is so important.

General discussions or ideas about hardware.
Badwolf
Posts: 2229
Joined: Tue Nov 19, 2019 12:09 pm

Re: [Video] A simple 16MHz hack for my STE: why memory access is so important.

Post by Badwolf »

exxos wrote: Fri Jan 14, 2022 11:48 pm
Badwolf wrote: Fri Jan 14, 2022 11:39 pm Crikey! 10ns is a big ask. This is the RAM I'm targeting: https://www.farnell.com/datasheets/2622019.pdf
Must be doable though, as the TFxxx series use a 50MHz CPU and SDRAM, I think. Not sure if that had wait states, but it can keep up, I think.
The 536 is slightly faster than my DFB1 as I run at 50 and Stephen runs at 100 (the RAM's double clocked over the CPU) which (I think) means he can respond to AS half a clock faster than me. He has to space his SDRAM commands out though, on account of minimum times. Something I don't have to worry about at 50.

Net result is I get about 24MB/s reading and he gets about 30, IIRC. Theoretical limit for the 030 at 50MHz is around 45MB/s. I don't think either of us hit that.
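
As a rough cross-check, the rates above can be turned into an implied number of CPU clocks per 32-bit transfer. This is only a sketch: it assumes MB means 10^6 bytes, that every transfer is a full longword, and that any benchmark loop overhead is folded into the clock count. `clocks_per_longword` is a name made up for the illustration.

```python
def clocks_per_longword(cpu_hz: float, mb_per_s: float, bytes_per_transfer: int = 4) -> float:
    """Implied CPU clocks spent per transfer at a given throughput.

    Assumes 1 MB = 1e6 bytes and that every transfer moves bytes_per_transfer bytes.
    """
    transfers_per_s = mb_per_s * 1e6 / bytes_per_transfer
    return cpu_hz / transfers_per_s


# Figures quoted in the post, all for a 50MHz CPU clock.
for label, rate in [("DFB1 read", 24), ("536 read", 30), ("quoted limit", 45)]:
    print(f"{label}: {clocks_per_longword(50e6, rate):.1f} clocks per longword")
```

On those assumptions, the gap between 24 and 30 MB/s works out to roughly a clock and a half per longword.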

I don't know how the 1260 performs -- nor if it uses the same style of SDRAM that Stephen has on the 536 and I have on the DFB1. That would presumably have a much higher theoretical limit, though.

CT60 goes a lot quicker, but I think that's using DDR. Not sure how a single access clocks, mind.

BW
DFB1 Open source 50MHz 030 and TT-RAM accelerator for the Falcon
DSTB1 Open source 16Mhz 68k and AltRAM accelerator for the ST
Smalliermouse ST-optimised USB mouse adapter based on SmallyMouse2
FrontBench The Frontier: Elite 2 intro as a benchmark
Badwolf
Posts: 2229
Joined: Tue Nov 19, 2019 12:09 pm

Re: [Video] A simple 16MHz hack for my STE: why memory access is so important.

Post by Badwolf »

olivier.jan wrote: Sat Jan 15, 2022 12:02 am Would you consider going for FPGA instead of CPLD?
There are these nice (on paper) Chinese FPGAs: the GW1NR-9 from Gowin. It’s flash-based, so there is no need for external memory to bootstrap the FPGA, and it includes RAM directly with an access time of 4.5ns.
Hi Olivier,

These sound nice, and I can't speak for Exxos, but I think it's as big a jump from CPLD to FPGA as it was from NOT gates to CPLD for me.

To be honest, my next big adventure is likely going to be looking into PiStorm or an ST-centric equivalent. I've been thinking for some time that CPU emulation on a multi-core system-on-chip ARM is probably the real way forward.

Cheers,

BW.
exxos
Site Admin
Posts: 23488
Joined: Wed Aug 16, 2017 11:19 pm
Location: UK

Re: [Video] A simple 16MHz hack for my STE: why memory access is so important.

Post by exxos »

Badwolf wrote: Sat Jan 15, 2022 12:22 am The 536 is slightly faster than my DFB1 as I run at 50 and Stephen runs at 100 (the RAM's double clocked over the CPU) which (I think) means he can respond to AS half a clock faster than me. He has to space his SDRAM commands out though, on account of minimum times. Something I don't have to worry about at 50.
It gets tricky to work out as there are PLD delays in the mix. If we went for a 64MHz CPU, say 15ns per clock, assume 2 clocks for the bus cycle (30ns) and 10ns for the PLD, then the RAM would have to be ready in 20ns to keep up.
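
The same back-of-envelope budget can be written out so other clock speeds are easy to try. A minimal sketch, assuming the whole bus cycle is available apart from a fixed PLD delay; `ram_budget_ns` and its defaults are illustrative names taken from the numbers in this post, not from any datasheet.

```python
def ram_budget_ns(cpu_mhz: float, bus_clocks: int = 2, pld_delay_ns: float = 10.0) -> float:
    """Time left for the RAM to respond within one bus cycle.

    bus_clocks and pld_delay_ns default to the figures assumed in the post.
    """
    period_ns = 1000.0 / cpu_mhz
    return bus_clocks * period_ns - pld_delay_ns


for mhz in (50, 64):
    print(f"{mhz}MHz: RAM must respond within {ram_budget_ns(mhz):.2f}ns")
```

With the unrounded 64MHz period (15.625ns) the budget comes out at 21.25ns, close to the 20ns estimate above.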

32MHz access on the STE booster was really pushing it with a 55ns ROM. But the ROM decoding delays didn't help.

So how are you working out 50MHz access speeds then? I assume you don't have wait states in there?

But if TF's controller is faster, then use that? He did say a while back I could use it; I just never have time to build a board up like you have done. I think I had it up to 66MHz at one point. But CPU and RAM brands are likely factors there also.
https://www.exxosforum.co.uk/atari/ All my hardware guides - mods - games - STOS
https://www.exxosforum.co.uk/atari/store2/ - All my hardware mods for sale - Please help support by making a purchase.
viewtopic.php?f=17&t=1585 Have you done the Mandatory Fixes ?
Just because a lot of people agree on something, doesn't make it a fact. ~exxos ~
People should find solutions to problems, not find problems with solutions.
sporniket
Posts: 956
Joined: Sat Sep 26, 2020 9:12 pm
Location: France

Re: [Video] A simple 16MHz hack for my STE: why memory access is so important.

Post by sporniket »

I'm a bit late because these days I go to bed at a reasonable hour, but on the MMU side, here is something I found recently and printed for myself that might be of interest: "Memory Management Units for 68000 architectures", a scanned article from "BYTE" that sums up the principles, how some things have been implemented, etc. Very interesting.


For further reading, I also printed technicalities of the Motorola MC68451 and MC68851, for the hypothetical time when I have become a seasoned enough VHDL designer (do not hold your breath :D )


(I did not pay attention when I first found all those PDFs, but they all came from this site: http://marc.retronik.fr/motorola/68K/Peripherals.html )
czietz
Posts: 547
Joined: Sun Jan 14, 2018 1:02 pm

Re: [Video] A simple 16MHz hack for my STE: why memory access is so important.

Post by czietz »

Badwolf wrote: Fri Jan 14, 2022 11:39 pm I can't help wondering if a CPLD could hold these tags as registers?
Don't forget that you need a separate tag for every cache location. There are by far too few register bits in a CPLD.

One might think of using a regular, albeit very fast, SRAM for the tags and doing the comparison (i.e., does the tag match?) inside the CPLD. But there's one feature of tag RAMs that I didn't mention before: There are instances (e.g., after a DMA access) where you need to invalidate the cache. Tag RAMs have a CLEAR input to facilitate that. I don't know how you could sensibly emulate this with a regular SRAM.
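
To make the tag bookkeeping concrete, here is a small software model of a direct-mapped cache's tag store. It is only a sketch under stated assumptions: a 24-bit address bus, power-of-two sizes, and one valid bit per line. The class and method names are invented for the illustration and do not describe any real tag RAM part.

```python
class DirectMappedTags:
    """Tag store of a direct-mapped cache (illustrative model, not real hardware)."""

    def __init__(self, num_lines: int, line_bytes: int, addr_bits: int = 24):
        # num_lines and line_bytes are assumed to be powers of two.
        self.num_lines = num_lines
        self.offset_bits = (line_bytes - 1).bit_length()
        self.index_bits = (num_lines - 1).bit_length()
        self.tag_bits = addr_bits - self.index_bits - self.offset_bits
        self.tags = [0] * num_lines
        self.valid = [False] * num_lines

    def _split(self, addr: int):
        # Low bits select the byte in the line, middle bits the line, the rest is the tag.
        index = (addr >> self.offset_bits) & (self.num_lines - 1)
        tag = addr >> (self.offset_bits + self.index_bits)
        return index, tag

    def lookup(self, addr: int) -> bool:
        """True on a hit: the line is valid and its stored tag matches."""
        index, tag = self._split(addr)
        return self.valid[index] and self.tags[index] == tag

    def fill(self, addr: int) -> None:
        """Record that the line containing addr is now cached."""
        index, tag = self._split(addr)
        self.tags[index] = tag
        self.valid[index] = True

    def clear(self) -> None:
        """Invalidate everything at once, like the CLEAR input on a tag RAM."""
        self.valid = [False] * self.num_lines

    def storage_bits(self) -> int:
        """Bits of state needed: one tag plus one valid bit per line."""
        return self.num_lines * (self.tag_bits + 1)


small = DirectMappedTags(num_lines=64, line_bytes=4)     # 256 bytes of data
large = DirectMappedTags(num_lines=8192, line_bytes=2)   # 16KB of data
print(small.storage_bits(), large.storage_bits())
```

The model clears every line in a single step. A regular SRAM would instead need every location rewritten, which is the difficulty described above.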
Badwolf
Posts: 2229
Joined: Tue Nov 19, 2019 12:09 pm

Re: [Video] A simple 16MHz hack for my STE: why memory access is so important.

Post by Badwolf »

czietz wrote: Sat Jan 15, 2022 9:49 am
Badwolf wrote: Fri Jan 14, 2022 11:39 pm I can't help wondering if a CPLD could hold these tags as registers?
Don't forget that you need a separate tag for every cache location. There are by far too few register bits in a CPLD.
Mmm. I was thinking about bytes (rather than kilobytes) of cache — like the 030. But you’re right, it would be taxing a CPLD even if it weren’t doing something else already.

Food for thought, though. Thanks!

BW
Steve
Posts: 2570
Joined: Fri Sep 15, 2017 11:49 am

Re: [Video] A simple 16MHz hack for my STE: why memory access is so important.

Post by Steve »

@Badwolf I have been scanning through the PiStorm Discord channel for Atari specific work and it seems recently there has been some good development for Atari compatibility. It would be absolutely sensational to have not only the CPU and RAM aspect working but also VDI graphics output, this would open up a whole new world for so many Atari owners. Oh and of course, onboard wireless networking too.
Badwolf
Posts: 2229
Joined: Tue Nov 19, 2019 12:09 pm

Re: [Video] A simple 16MHz hack for my STE: why memory access is so important.

Post by Badwolf »

exxos wrote: Sat Jan 15, 2022 1:26 am It gets tricky to work out as there are PLD delays in the mix. If we went for a 64MHz CPU, say 15ns per clock, assume 2 clocks for the bus cycle (30ns) and 10ns for the PLD, then the RAM would have to be ready in 20ns to keep up.
SDR SDRAM needs *4* clocks (minimum) for each random access from AS to data. It then needs another three cycles ‘recovery’, if you like (precharge). No problem, you might think, just run at twice, or more times, the CPU clock. Which is perfectly normal, but then you run into minimum times between commands, and have to pad out the access anyway.

I’d have to go through the data sheet and work out the optimum frequency that gives the lowest response time. My guess is we end up at around 60-70ns effective for our use case.
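
The trade-off between clocking the SDRAM faster and padding for minimum times can be sketched as below. The 4-clock access and 3-clock precharge minimums come from this post. The 40ns and 20ns minimum times are placeholders chosen for illustration, NOT values from the linked datasheet, and the function names are made up.

```python
import math


def padded_clocks(min_clocks: int, min_ns: float, sdram_mhz: float) -> int:
    """Clocks for one phase: at least min_clocks, and long enough to cover min_ns."""
    period_ns = 1000.0 / sdram_mhz
    return max(min_clocks, math.ceil(min_ns / period_ns))


def random_access_ns(sdram_mhz: float,
                     access_min_ns: float = 40.0,      # placeholder, not from the datasheet
                     precharge_min_ns: float = 20.0):  # placeholder, not from the datasheet
    """Return (time from AS to data, full cycle time including precharge) in ns."""
    period_ns = 1000.0 / sdram_mhz
    access = padded_clocks(4, access_min_ns, sdram_mhz) * period_ns
    recovery = padded_clocks(3, precharge_min_ns, sdram_mhz) * period_ns
    return access, access + recovery


for mhz in (50, 66, 100, 133):
    access, cycle = random_access_ns(mhz)
    print(f"{mhz}MHz SDRAM clock: data after {access:.1f}ns, next access after {cycle:.1f}ns")
```

Once the clock period drops below what the minimum times allow, extra clocks get inserted and the response time stops improving. Real numbers would need the datasheet's actual timing parameters.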

(Once you’ve opened a column, you can whizz down the data at one word every 7.5ns, but that is of absolutely no use to us, I’m afraid)
So how are you working out 50MHz access speeds then? I assume you don't have wait states in there?

But if TF's controller is faster, then use that? He did say a while back I could use it; I just never have time to build a board up like you have done. I think I had it up to 66MHz at one point. But CPU and RAM brands are likely factors there also.
Once the CPU freq is beyond about 25MHz there are wait states on read accesses (but not writes). This is why you see memspeed figures in the 20s and 30s MB/s for read on my and Stephen’s cards, but writes are at the theoretical 40ish MB/s for a 50MHz processor.

My current controller is a hybrid of my own design from DFB1r3 and Stephen’s from TF330. I’ve basically credited Stephen as the author of it in the source.

My board was probably not really up to clocking the CPLD at 100 so I had a lot of trouble trying to do the double clocked routine. I rewrote it for 50 for ease of development (it also helps simplify my clock switching, which is hard on the Falcon) as the effective throughput cost is relatively low.

The speed boost should still be considerable, but it won’t be on a par with SRAM.

I’m planning to do development with a 32MHz oscillator at first, then will think about a 66 or similar. 100 is the limit of the CPLD.

BW
Badwolf
Posts: 2229
Joined: Tue Nov 19, 2019 12:09 pm

Re: [Video] A simple 16MHz hack for my STE: why memory access is so important.

Post by Badwolf »

Steve wrote: Sat Jan 15, 2022 11:38 am @Badwolf I have been scanning through the PiStorm Discord channel for Atari specific work and it seems recently there has been some good development for Atari compatibility. It would be absolutely sensational to have not only the CPU and RAM aspect working but also VDI graphics output, this would open up a whole new world for so many Atari owners. Oh and of course, onboard wireless networking too.
I have a board sat on my desk, but can’t get the parts, and couldn’t test it anyway as I have no machine it targets at the moment. If the PiStorm boys get an ST version going before I get round to it, great! I’ll probably look at getting it working in the Falcon instead ;)

I’m more interested in the bare metal JIT (fast!) implementation that Michał Schulz has been working on rather than the graphics and networking fanciness, but you know, so many projects, so little time.

BW
czietz
Posts: 547
Joined: Sun Jan 14, 2018 1:02 pm

Re: [Video] A simple 16MHz hack for my STE: why memory access is so important.

Post by czietz »

Badwolf wrote: Sat Jan 15, 2022 11:31 am Mmm. I was thinking about bytes (rather than kilobytes) of cache — like the 030.
Presumably, to really help with the slow bus speed, an ST accelerator would need more than a few bytes of cache, though.
Badwolf wrote: Sat Jan 15, 2022 12:06 pm If the PiStorm boys get an ST version going before I get round to it, great!
Last time I was asked to evaluate what was missing from the PiStorm for ST support, it turned out to be a lot, unfortunately. Some things rather obvious, others very subtle (i.e., requiring a skilled developer).

Then again, since I absolutely refuse to use Discord I'm not aware of the current status. (I could insert a long rant about why Discord is probably the least appropriate platform for developing such a project. But that would be off-topic.) Like you already said: so many projects, so little time; thus, I get to select the projects I can (and want to) support.