spikegifted - Random thoughts
November 29, 2004
Looking at the real-world benchmarks (i.e. excluding 3DMark05), while a pair of 6800 Ultras in SLI looks very impressive, it is not exactly blowing ATi's latest and greatest away the way nVidia (and its partners) have led me to believe. If anything, after reading the review at Tech Report, I think the ATi X800 XT is better bang-per-buck than a pair of 6800 Ultras in SLI (if such a measurement can be applied to bleeding-edge graphics cards)...
What's your view?
November 30, 2004
I’m not entirely sure if parallelism is merely a stop-gap...
The problems you mentioned regarding power consumption, current leakage and the cooling of high thermal densities are common to all modern high-end microelectronics. Processor complexity (and that applies to both CPUs and VPUs) is not going to go into reverse. Take a look at the transistor counts of the latest generation of VPUs: they rival, if not surpass, the latest generation of CPUs, and VPUs do not carry anywhere near the amount of cache that CPUs do (purely because of the nature of their work). To add to that, it is simply not possible to mount an 80mm or 92mm fan with a giant copper heatsink with massively extended pins and fins on top of (or, to be more precise, under) the VPU, unless you have no other expansion requirements.
With the above in mind, I would argue that a parallel approach might be attractive in the long run. To achieve a similar or superior level of performance, instead of creating a massively complicated VPU with 16, 20 or even up to 32 pipes, you could have a number of less complicated VPUs, each with 8 or 12 pipes, working in parallel. Could you have a number (say, 4 or 8 or even more) of DX9.x-compliant VPUs, each with 8 pipes onboard? You would then split the screen across as many VPUs as you have and assign each an equal amount of work to complete. It is almost like the good old tile-based rendering of ST/Micro, but with multiple VPUs taking care of a single frame, instead of one VPU dealing with one large frame.
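Just to illustrate the idea, here is a toy sketch of the static split described above: one frame divided into equal horizontal bands, one band per VPU. The function name and the whole setup are my own invention for illustration, not any real driver API.

```python
# Toy sketch: split one frame into equal horizontal bands,
# one band per VPU (static split-frame rendering).

def split_frame(height, num_vpus):
    """Return (start_row, end_row) bands, one per VPU."""
    band = height // num_vpus
    regions = []
    for i in range(num_vpus):
        start = i * band
        # the last VPU picks up any leftover rows
        end = height if i == num_vpus - 1 else start + band
        regions.append((start, end))
    return regions

# Example: a 1200-row frame shared by 8 VPUs
print(split_frame(1200, 8))
# each VPU renders a 150-row band
```

Of course, an equal split is only fair if the scene's complexity is spread evenly across the screen, which it rarely is; that is where the dynamic resizing below comes in.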
This is the SFR approach that nVidia uses with SLI, but applied to 2+ VPUs. AFR would not work for the proposed solution, because each individually under-powered VPU would not render its frames fast enough, so it becomes counter-productive. In any case, AFR actually avoids parallelism within a frame - it merely uses each VPU to render alternate frames; there is no splitting of the workload of a particular frame (task).
OK, dynamically splitting a single frame into tiles will mean consuming resources elsewhere, most likely on the CPU. This is where we duallie owners would, and should, have a huge advantage: use the first CPU for game physics and AI, and the second CPU for ‘auxiliary’ functions like dynamically resizing the tiles!
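That dynamic resizing could be as simple as feeding last frame's per-band render times back into the split. A minimal sketch, assuming we give each VPU rows in inverse proportion to its measured per-row cost (the rebalance rule and all names here are illustrative, not a real scheduler):

```python
# Toy sketch of dynamic load balancing: resize each VPU's band
# for the next frame based on how long its band took last frame.

def rebalance(heights, times, total_height):
    """Give each VPU rows inversely proportional to its last
    per-row render time, so expensive regions get smaller bands."""
    # per-row cost of each band in the previous frame
    costs = [t / h for t, h in zip(times, heights)]
    speeds = [1.0 / c for c in costs]
    total_speed = sum(speeds)
    new_heights = [round(total_height * s / total_speed) for s in speeds]
    # absorb any rounding drift into the last band
    new_heights[-1] += total_height - sum(new_heights)
    return new_heights

# Two VPUs: the second band was twice as expensive per row,
# so it shrinks and the first band grows
print(rebalance([600, 600], [10.0, 20.0], 1200))  # → [800, 400]
```

This is exactly the sort of cheap bookkeeping loop a second CPU could run every frame without touching the VPUs at all.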
Going back to your point about investing in available technology here and now rather than waiting for the next generation, I agree with you whole-heartedly. The price may be high right now, but if you can justify the cost, it can be ‘productivity’, or just bragging rights, or just about anything else. Compared with the imagined performance increase of a next-generation part (just think of those who held out for an nVidia GF5X00 part), the ‘productivity’ you gain by having the technology today probably outweighs any eventual gains from waiting, because of the opportunity cost of waiting for the technology. On the other hand, you could argue that the money you would spend on a high-end VPU today could be invested in other parts of your rig that may improve performance or improve your PC experience. There are always ways to use up valuable resources - particularly your money.