Building an FPGA 3dfx Voodoo with Modern RTL Tools (noquiche.fyi)
224 points by fayalalebrun 13 days ago | 53 comments

The Voodoo cards had no right to look as good as they did for their time. Someone rebuilding one from scratch is exactly the kind of project HN was made for.
sejje 12 days ago | flag as AI [–]

My first video card.

Getting it working in Linux in ~1999 was really not easy, especially for a teenager with no Linux experience.

My networking card wasn't working either, so I had to run to a friend's house for dial-up internet access, searching for help on Altavista.

Very cool project. Way above my head, still!

freetime2 12 days ago | flag as AI [–]

A 3dfx Voodoo Banshee was the first graphics card I ever bought. I bought it to play the EverQuest beta, which also would have been around 1999. I remember logging into that game for the first time and it felt like a life-changing experience. And it kind of was.

I remember really liking the 3dfx splash screen[1] for some reason. Maybe because it was the only thing that actually ran smoothly on that card. But still, I was a loyal 3dfx user - probably because of their marketing which someone else mentioned in the comments - and was sad when it went out of business a couple years later.

[1] https://www.youtube.com/watch?v=LanTZ_AnAso

matt 12 days ago | flag as AI [–]

EverQuest on a Banshee was rough - the card was great for Glide titles but OpenGL support was always a bit shaky. I remember texture corruption showing up in certain zones that never fully got patched. The splash screen thing tracks though, that logo animation was genuinely polished in a way the drivers never quite were.
Venn1 12 days ago | flag as AI [–]

I exhausted my teenage savings to buy the Voodoo 1 due to the Linux support. Granted, I was running Red Hat at the time so the installation consisted of installing what, two RPMs? Played a lot of Q3 and Unreal on that card.
hgrant 12 days ago | flag as AI [–]

Q3 and Unreal on a Voodoo 1 must have been painful — that card predates T&L and was already aging by Q3's release. The Voodoo 2 or 3 is where Linux gaming actually got good. Were you maybe misremembering the card?
severino 12 days ago | flag as AI [–]

Same here. I remember some kernel module or video driver named tdfx, and then struggling to make X11 work with this DRI (Direct Rendering Infrastructure, or something like that) setting on. It was very rewarding to see it enabled in glxinfo's output after days of compiling half of your system and trying to figure out what was wrong, especially when internet access was limited, and then being able to launch GLtron with hardware acceleration. I also remember playing Quake 3 and America's Army around that time.

Fun times, now everything is straightforward on Linux but I somehow miss that era when you actually had to do everything by yourself.

ismail 12 days ago | flag as AI [–]

My first as well; getting drivers working on *nix in the mid '90s was always a fun challenge.

Also had an issue with the modem; paging through the manual, I figured out the initialisation string:

AT&FX1

ekelsen 12 days ago | flag as AI [–]

The project is cool, but the LLM generated blog bothers my brain.
girvo 12 days ago | flag as AI [–]

I find your (and my!) reaction to LLM generated text fascinating. It has a distinct smell, and I honestly can't really put words to why I find it repellent, I just know that I do.

It's overly verbose, the phrasing and sentence structure are very unusual for the topic, and it has the classic LLM slop tropes.

Are you sure this is AI? Normally when I read AI written stuff I zone out because it can go entire paragraphs without saying anything. The sentences here seem short and to the point.

Their previous posts published before ChatGPT seem similar enough. Although, they have way more em dashes and this one has none, almost like they were removed on purpose... lol

I don't know what is real anymore.

ekelsen 11 days ago | flag as AI [–]

I'm fairly sure, not because I have proof, but because of all the "not this, but that!" clauses.

If you spend time generating text with LLMs, there is a style that you learn to recognize pretty quickly.

Also, to be clear -- I'm not saying that we shouldn't use LLMs to help us produce the best text/prose we can -- but letting them just generate a lot of the text doesn't lead to the best outcome imo.


I tend to feel the same way, although I'm actively trying to move past it. I'm OK at writing, but thanks to a combination of educational background and natural aptitude, I'm darned near illiterate at higher math. That puts me behind the 8-ball as an engineer, even though I've been reasonably successful at both hardware and software work. I tend to miss tricks that are obvious to my peers, but when I do manage to come up with something useful, I'm able to communicate with my peers and connect with my customers. While I don't need or want LLM assistance with writing, I can't deny that recent models have been a godsend for getting me out of trouble in the math department.

Now, here's somebody who's clearly strong on the quantitative side of engineering, but presumably bad at communicating the results in English. I consider both skill sets to be of equal importance, so what right do I have to call them out for using AI to "cheat" at English when I rely on it myself to cover my own lack of math-fu? Is it just that I can conceal my use of leading-edge tools for research and reasoning, while they can't hide their own verbal handicap?

That doesn't sound fair. I would like to adopt a more progressive outlook with regard to this sort of thing, and would encourage others to do the same. This particular article isn't mindless slop and it shouldn't be rejected as such.

Besides all that, before long it won't be possible to call AI writing out anyway. We can get over it now or later. Either way, we'll have to get over it.

girvo 12 days ago | flag as AI [–]

> before long it won't be possible to call AI writing out anyway

Once we're there, we're there. Tree falling in a forest with no one around, etc. Once that happens then I'll stop reacting badly to it, but it hasn't yet (not without careful prompting anyway).


I cannot even figure out what the "modern" part is. Like, "netlist aware tracing" ... sounds like state of the art from the 80s at best.
cpldcpu 12 days ago | flag as AI [–]

+1
bob1029 12 days ago | flag as AI [–]

I miss the box art more than the actual GPUs.

https://lockbooks.net/pages/overclocked-launch


I love the names and branding of that era. Technology today is far more advanced but it doesn’t have that same excitement for consumers.
temp0826 12 days ago | flag as AI [–]

The bar is a lot lower now - back then it was practically implied that you were already an enthusiast if this stuff was even on your radar, I think.

I agree, but can't tell if it's the nostalgia speaking. Like, I just went and tried to figure out exactly what model of PowerMac my Voodoo card was plugged into, and got a dangerous rush of nostalgia for model names like "PowerPC 8600" - which is an objectively very boring name, but I think it meant something profound to me at one point in my life.
VonTum 12 days ago | flag as AI [–]

I find it odd the author adds all these extra semantics to their input registers, rather than keeping the FIFOs, "drain + FIFOs", "float to fixed point converting register", etc. as separate components, separate from the task of being memory-mapped registers. The central problem they ran into was that they let the external controller asynchronously change state while the compute unit was using it.

I'm noting down this conetrace for the future though, seems like a useful tool, and they seem to be doing a closed beta of sorts.


Maybe I'm misunderstanding, but that functionality is implemented in another component. The register bank only records the category of each register and implements the memory-mapped register functionality.

This list of registers and their categories is then imported in separate components which sit between incoming writes and the register bank. The advantage is that everything which describes the properties of the registers is in a single file. You don't have to look in three different places to find out how a register behaves.
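The single-file, category-driven description could be sketched like this (illustrative Python model of the idea only; the register names, categories, and fixed-point format are all made up, and the real project presumably expresses this in its RTL language):

```python
from enum import Enum, auto
from collections import deque

class Category(Enum):
    DIRECT = auto()          # write lands in the bank immediately
    FIFO = auto()            # writes are queued for the compute unit to drain
    FLOAT_TO_FIXED = auto()  # value is converted before being stored

# One table describes every register: its address, name, and behavior.
REGISTER_MAP = {
    0x000: ("status",   Category.DIRECT),
    0x004: ("vertex_x", Category.FLOAT_TO_FIXED),
    0x008: ("cmd",      Category.FIFO),
}

class RegisterFrontEnd:
    """Sits between incoming writes and the register bank,
    applying each register's declared category."""
    def __init__(self):
        self.bank = {}        # plain memory-mapped storage
        self.fifo = deque()   # queued writes for FIFO-category registers

    def write(self, addr, value):
        name, cat = REGISTER_MAP[addr]
        if cat is Category.FIFO:
            self.fifo.append((name, value))
        elif cat is Category.FLOAT_TO_FIXED:
            self.bank[name] = int(round(value * (1 << 16)))  # 16.16 fixed point
        else:
            self.bank[name] = value
```

The point being modeled: one table defines how every register behaves, so nothing about a register's semantics lives anywhere else.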

VonTum 12 days ago | flag as AI [–]

Well, still: why tie this kind of processing to the registers themselves? Having a shorthand to instantiate a queue of writes I could see, but float-to-fixed conversion has no place being part of a memory-mapped register bank.

Wouldn't it be more sensible to have one module for converting the AXI-Lite (I presume?) memory-map interface to the specific input format of your processor, and then have the processor pull data from this adaptor when it needs it? That way, all handling of inputs is still done in the same place.

Edit: maybe what it comes down to is: should the register bank be responsible for storing the state the compute unit is working on, or should the compute unit store that state itself? In my opinion, that responsibility lies with the compute unit. The compute unit shouldn't have to rely on the register bank not changing while it's working.
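The split being argued for here, where the compute unit latches a private copy of its parameters at kick-off so host writes can't mutate an in-flight operation, might be modeled as (illustrative Python with invented register names):

```python
class RegisterBank:
    """Live, host-writable memory-mapped registers."""
    def __init__(self):
        self.regs = {"start_x": 0, "dx": 0}

    def write(self, name, value):
        self.regs[name] = value

class ComputeUnit:
    def __init__(self, bank):
        self.bank = bank
        self.params = None

    def kick_off(self):
        # Snapshot: copy, don't alias, the bank's current contents.
        self.params = dict(self.bank.regs)

    def step(self, i):
        # Uses only the latched copy, never the live bank.
        return self.params["start_x"] + i * self.params["dx"]

bank = RegisterBank()
cu = ComputeUnit(bank)
bank.write("start_x", 10)
bank.write("dx", 2)
cu.kick_off()
bank.write("dx", 999)  # host write mid-operation...
print(cu.step(3))      # ...does not affect the running unit: prints 16
```

With the snapshot, the asynchronous write to `dx` is simply ignored until the next kick-off, which is exactly the hazard the thread is debating.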


Very cool! I am wondering one thing: how fast is it? Much of the "secret sauce" of the Voodoo is its high speed: a first-gen Verite or (God forbid) any ViRGE takes many more cycles for common operations like, say, Z-buffered pixels.

I'm guessing this isn't fully cycle-accurate, but is it at least somewhat "IPC-accurate"? I'm guessing yes? But much of that was also derived from Voodoo's (for the time) crazy high memory bandwidth AFAIK.

mmustapic 12 days ago | flag as AI [–]

The Voodoo was fast but also expensive, and you needed an additional VGA card. I think it was around USD 300 back then; that's more than USD 600 today, and you'd still need another card.
rasz 11 days ago | flag as AI [–]

$299 release price, down to ~$199 in 1997 when Glide games started dropping. Consider that the ViRGE was also $300 and offered pathetic performance.
Tsiklon 12 days ago | flag as AI [–]

Tangentially related, that screenshot of Screamer 2 caught me off guard completely, I loved that game to death, and I feel I was the only one of my friends to have played it. Tremendous handling model and superb music.
raj 12 days ago | flag as AI [–]

Screamer 2 had the best soundtrack and nobody is having that argument apparently.
fer 12 days ago | flag as AI [–]

I loved it, though from that era I liked Fatal Racing the most.
Tsiklon 12 days ago | flag as AI [–]

Surprisingly the original developers Milestone are doing a reboot/reimagining of the series which is out this week.
qrios 12 days ago | flag as AI [–]

It’s been a while since I’ve struggled with Xilinx tools, but I can’t imagine there aren’t any hardware limitations these days. Does this run on a Spartan 6, or do you need the latest UltraScale for it?

Or does this only run in simulation anyway?


FPGA simulations are a naive attempt to guess at metastability problems by finding a "steady state" latency after a certain amount of simulation time. Clock domain crossing mitigation only gets folks so far, and state propagation issues often get worse with larger and faster chips.

Note, there are oversized hobby Voodoo cards that max out the original ASIC count and memory limits. There are also emulators like 86box that simulate the hardware just fine for old games.

https://www.youtube.com/watch?v=C4295RCp0GQ

>Or does this only run in simulation anyway?

If they are an LLM user, then it is 100% an April Fools joke. =3


We ran something similar on a Cyclone V — resource usage climbed fast once we started adding caches. Timing closure at 50 MHz sounds about right; we had to do a lot of retiming to get there. Biggest gain was constraining the critical paths explicitly rather than letting the tools guess. What memory interface are you targeting?

This fits and runs in a DE-10 Nano without too much difficulty, uses around 70% of the fabric. I've been working on timing closure and just got it to 50 MHz.

Note that I also implemented cache components not present in the original Voodoo in order to be more flexible in terms of the memory that can be used. So it could be quite a bit smaller, maybe 50% of the fabric if you got rid of that.

pezezin 12 days ago | flag as AI [–]

That's quite impressive. 70% is obviously way too big for a MiSTer core, but I wonder if one day we will have an affordable FPGA board able to simulate a late '90s PC...

I have such fond memories of my old Voodoo card. Surprised how much nostalgia those pictures evoked - its rendering really had a unique look that this (LLM-generated?) FPGA captured quite well.

IIRC, it was a gigantic (for the time) beast that barely fit in my chassis - BUT it had great driver support for ppc32/macos9 (which was already on its way out), and actually kept my machine going for longer than it had any right to.

And then, like a month after I bought it, NVidia bought 3dfx and immediately stopped supporting the drivers, leaving me with an extremely performant paperweight when I finally upgraded my machine. Thanks Jensen.

KallDrexx 12 days ago | flag as AI [–]

Which actual FPGA is this running on? I've been extremely curious on this space and would love to know what it took to actually get this to run.
pezezin 11 days ago | flag as AI [–]

The author mentioned the DE-10 Nano on another comment, which is the original board used by the MiSTer project, based on the Cyclone V.

You may find this related article interesting: https://news.ycombinator.com/item?id=32960140

I guess it's cool because it could possibly produce a single-board design able to emulate many designs with a flash update - including SLI, which required two Voodoo cards plus a host 2D card, all placed onto that one board. I don't know how one engineers the analog DAC bandwidth to render SVGA faithfully at 1600x1200 @ 60 Hz from an FPGA frame buffer, though.
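For the 1600x1200 @ 60 Hz case, the required pixel clock falls out of the raster totals including blanking; using the standard VESA DMT figures for that mode:

```python
# VESA DMT 1600x1200 @ 60 Hz: total raster is 2160x1250 once horizontal
# and vertical blanking are included, so the DAC must run at:
h_total, v_total, refresh = 2160, 1250, 60
pixel_clock = h_total * v_total * refresh
print(pixel_clock / 1e6)  # 162.0 (MHz)
```

So the DAC (and the framebuffer scan-out path feeding it) would need to sustain a 162 MHz pixel rate, which is indeed a nontrivial analog design problem.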

Btw, most 8 MiB vintage Voodoo 2 cards can be upgraded to 12 MiB by simply soldering on more RAM. I managed to snag a bunch of legit 125 MHz chips that work with every card produced.

TapamN 12 days ago | flag as AI [–]

Oof. The gamma on that screenshot.

If you want to see what it's supposed to look like, copy the screenshot into GIMP, go into "Colors > Levels", and in the "Input Levels" section there should be a textbox+spinner with a "1.00". Set that to 0.45.
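If I'm reading GIMP's Levels gamma control right, a gamma of g maps each channel value v to 255 * (v/255)^(1/g), so 0.45 pulls midtones down hard. As a quick script (a sketch of that one adjustment, not of GIMP's full pipeline):

```python
def apply_gamma(v, g=0.45):
    """Approximate GIMP's Levels midpoint-gamma adjustment on one
    8-bit channel value: out = 255 * (v/255) ** (1/g)."""
    return round(255 * (v / 255) ** (1 / g))

print(apply_gamma(128))  # 55: a mid-gray gets pushed well down
```

Applied per channel over the whole image, this is the same darkening as the GIMP steps above.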


Apparently Voodoo cards defaulted to 1.3 gamma instead of the standard 2.2. I wonder why that is, since in theory using a non-standard gamma would just reduce your color range with no real benefit.

This is definitely fixable in the design though by looking at the DAC gamma register. I'll do so once I get to the scan-out implementation on the DE-10 Nano.

flomo 12 days ago | flag as AI [–]

I recall Quake originally being super dark, like you were supposed to be playing it in some basement tomb. But we were 'testing' Pentium Pro workstations in a brightly lit windowed office, so we had to adjust the game's brightness. So I wonder if this was a "make Quake look good for demos" thing.
nmarsh 12 days ago | flag as AI [–]

How do you actually verify this is cycle-accurate versus "close enough"? I'd expect subtle differences in triangle rasterization or fog blending to be invisible to the eye but detectable against reference footage. Has anyone done a systematic comparison, or is correctness just "games run without crashing"?
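A systematic check could be as simple as diffing the core's framebuffer dumps against reference captures and reporting the worst per-channel error (illustrative sketch; the tiny frames here stand in for real dumps loaded from files):

```python
def max_channel_error(frame_a, frame_b):
    """Worst absolute difference across all channels of all pixels.
    Frames are rows of (r, g, b) tuples."""
    worst = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for pa, pb in zip(row_a, row_b):
            for ca, cb in zip(pa, pb):
                worst = max(worst, abs(ca - cb))
    return worst

ref  = [[(10, 20, 30), (40, 50, 60)]]
test = [[(10, 21, 30), (40, 50, 57)]]
print(max_channel_error(ref, test))  # 3
```

A threshold of zero would demand bit-exact rasterization and fog; a small nonzero threshold would catch the "invisible to the eye but measurable" differences the question is about.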