Posted on November 3rd, 2008 by Nosta

A stream processor is one of many (sometimes hundreds of) parallel processing units that make up the architecture of modern graphics processors. The parallelism matters because a GPU’s ultimate job is to continuously render imagery, millions of pixels at a time, in response to requests from applications and games. With an army of parallel units handling an enormous number of similar incoming calculations (whether FP or INT arithmetic), the rule is simple: as long as there are more requests than processing units to handle them, the more stream processors, the better. General-purpose CPUs are designed very differently and can perform only so many operations in parallel. A CPU core has only a handful of arithmetic logic units (ALUs) to handle arithmetic such as add, subtract, multiply, divide, bit shift, and more, for FP and INT, whereas a typical GPU consists of hundreds of ALU-like units (what we call stream processors) designed for a more limited set of FP/INT calculations.
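
To put rough numbers on the "more units, the better" rule, here is a minimal sketch. It assumes every request is independent and takes exactly one cycle, which real workloads do not guarantee; the function name `cycles_needed` is made up for illustration.

```python
def cycles_needed(requests: int, units: int) -> int:
    """Cycles to finish `requests` independent one-cycle operations
    when `units` of them can run side by side (idealized assumption)."""
    if units <= 0:
        raise ValueError("units must be positive")
    # Integer ceiling division: a partly filled last cycle still costs a cycle.
    return (requests + units - 1) // units


# One million pixel operations on progressively wider hardware.
for units in (1, 8, 240, 800):
    print(units, "units ->", cycles_needed(1_000_000, units), "cycles")

# With fewer requests than units, extra units no longer help.
print(cycles_needed(100, 240), cycles_needed(100, 800))
```

The last line is the flip side of the rule: once the units outnumber the requests, both configurations finish in a single cycle.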

So what’s the difference between Nvidia’s and ATI’s GPU architectures, and why do seemingly comparable ATI cards have more stream processors than Nvidia ones? The answer lies in their different approaches to implementation. Nvidia’s GPUs carry fewer stream processors (programmed through CUDA), each one identical in design and function (FP and INT arithmetic) to its neighbor. To be more exact, for every 8 identical stream processors, there is one special functional unit that keeps things in check. By that accounting, a GeForce GTX 280 with 240 stream processors is really using only about 89% of its advertised FP/INT arithmetic processing power, since 1 of every 9 SPs is there to police the other eight.
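
The percentage above is just a ratio, and it can be checked in a few lines. This sketch takes the post's 8-workers-to-1-overseer split at face value (it is the post's accounting, not a vendor specification), and `effective_units` is a hypothetical helper name.

```python
def effective_units(advertised: int,
                    workers_per_group: int = 8,
                    overseers_per_group: int = 1) -> float:
    """Units left for FP/INT work if each group of
    (workers + overseers) gives up its overseers.
    The 8:1 split is the post's assumption."""
    group_size = workers_per_group + overseers_per_group
    return advertised * workers_per_group / group_size


gtx280 = effective_units(240)
# 8 of every 9 units do arithmetic under this assumption.
print(f"{gtx280:.1f} of 240 units ({gtx280 / 240:.1%})")
```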

Nonetheless, Nvidia’s GPU architecture is easier for application and game developers to program for because of its simplicity (every stream processing unit performs the same function). As long as the apps keep the units fed with numbers to crunch, you get fast, raw results every clock cycle. Nvidia’s architecture has been likened to American motor engines: simple, raw power, and gas guzzling.

ATI’s architecture is a bit different: not every stream processor (programmed through Brook+) is identical to its neighbor. For every block of 6 stream processing units, 4 are identical, the 5th carries different FP/INT arithmetic functions, and the 6th keeps things in check. So essentially, each block of 5 ATI stream processors (ignoring the special unit) is comparable to 1 Nvidia stream processor. The math isn’t that simple, but it’s a good generalization that helps demystify why a high-end ATI Radeon HD 4870 card with a rocking 800 stream processors is relatively weaker than an Nvidia GTX 280 with only 240. Because of ATI’s architecture, app and game developers have a tougher time taking full advantage of every stream processor on board, since specific FP/INT arithmetic functions can only be “worked on” by one out of every five units (per block). To take full advantage of ATI’s architecture, an app or game must be optimally coded, something like baiting the hook to suit the fish.
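
A toy model makes the "one out of every five" bottleneck easier to see. This is only a sketch under loud assumptions: all operations are independent and take one cycle, the four identical units run only "simple" operations, and the fifth unit runs only "special" ones. The names `block_cycles` and `block_utilization` are invented for this example and do not describe ATI's actual scheduling.

```python
def block_cycles(simple_ops: int, special_ops: int) -> int:
    """Cycles one 5-unit block needs, assuming 4 units take only
    simple ops and 1 unit takes only special ops (toy assumption)."""
    simple_cycles = (simple_ops + 3) // 4   # four simple ops per cycle
    special_cycles = special_ops            # one special op per cycle
    return max(simple_cycles, special_cycles)


def block_utilization(simple_ops: int, special_ops: int) -> float:
    """Fraction of the block's 5 slots that did useful work."""
    cycles = block_cycles(simple_ops, special_ops)
    if cycles == 0:
        return 0.0
    return (simple_ops + special_ops) / (5 * cycles)


# 800 advertised units grouped five to a block.
print("blocks:", 800 // 5)

# A well-matched mix fills every slot; a special-heavy mix starves four units.
for mix in ((8, 2), (4, 4), (0, 10)):
    print(mix, f"{block_utilization(*mix):.0%}")
```

Under these assumptions, code with a 4:1 mix of simple to special operations keeps the whole block busy, while code made up only of special operations leaves four of the five units idle. That is the gap between the 800 advertised units and the 160 blocks.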

If a program is not optimized for the architecture, the job of keeping as many blocks of stream processing units as possible busy on every clock cycle falls to the GPU scheduler, the Ultra Threaded Dispatch Processor. All in all, current Nvidia graphics cards lead ATI implementations in most (if not all) game benchmarks, but from a cost/performance standpoint, ATI is the better bang for the buck. Choose wisely.

6 Responses to “Stream Processing Units Implementation: NVIDIA vs AMD/ATI”

  1. Nice post u have here :D Added to my RSS reader

  2. Does all this mean that if/when a game was coded to be optimized using ATI’s architecture, that a $30 outdated ATI card could play the game at a performance level similar to a high end NVIDIA card? Do the ATI cards have incredible potential that has yet to be unlocked? I don’t understand why they would use a more complicated architecture that isn’t being supported by game developers and have so many stream processors not being used to their potential. It doesn’t do a lot to say you have 800 stream processors when 240 give more performance and are probably cheaper to manufacture as well.
    Maybe I’m missing something, or is this the part where you said the math isn’t that simple and you were just giving a generalization?

  3. Hmm… so that’s how it turned out…

2 Trackbacks For This Post

  1. Two things to Consider When Buying a Hardcore Gaming Laptop Says:

    [...] to sacrifice for mobile versions. Either way, you’re going to want cards that have sufficient stream processing power to handle the type of games you’ll need your laptop to play. For FPS games, you’ll want [...]

  2. help me find a thread... - Overclock.net - Overclocking.net Says:

    [...] cycle of the clock, relies on the GPU scheduler – the Ultra Threaded Dispatch Processor. Source:http://nostasolutions.com/2008/11/03…dia-vs-amdati/ [...]
