What is CUDA, and what is a CUDA core?
Compute Unified Device Architecture (CUDA) is a feature in most newer Nvidia graphics cards that allows the computer to use part of the GPU (or even the complete GPU) as an "assistant" to the processor. GPUs pack much more muscle than CPUs, but their architecture has historically been optimized for calculating geometry and drawing polygons (which is why they're slapped onto graphics cards in the first place). CUDA transforms the GPU into a math geek that can crunch numbers very quickly, putting the insane muscle power of a GPU to work on things other than simply rendering and displaying graphics on the screen.

In the article linked to at the beginning, I explained that SETI@Home takes advantage of CUDA by using graphics cards to perform calculations. This is just one example of how CUDA can be used to do amazing things. CUDA can also be used to transcode video (convert it from one format to another) using a special codec that communicates with the hardware. Nvidia's encoder is known as NVENC, and it's a powerful way to encode video much more quickly using the graphics card's dedicated video engine instead of exhausting your CPU. If you're a developer, and you're interested in including NVENC in your program, you can see Nvidia's resources here.

OK, so now we know what CUDA is. What about CUDA cores? A CUDA core is one of the many small processors that make up the GPU. Each core is a little piece of the entire GPU's architecture that can be used for both traditional 3D rendering and CUDA-specific functions. (Don't confuse CUDA cores with the "Video Engine" that some monitoring programs display; that label refers to the dedicated encoding/decoding hardware NVENC uses, which is separate from the cores.) In most graphics cards, the entire GPU is available for CUDA work, which is why the CUDA core count is effectively the core count of the whole GPU.
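To make that concrete, here's a minimal sketch of what handing a calculation to the GPU looks like in CUDA C++. The kernel and array names are just illustrations I made up for this example: the program adds two big arrays of numbers, and the GPU spreads the work across its cores instead of making the CPU grind through the loop alone.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread handles a single element of the arrays; the GPU runs
// huge batches of these threads in parallel across its CUDA cores.
__global__ void addArrays(const float *a, const float *b, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = a[i] + b[i];
    }
}

int main() {
    const int n = 1 << 20;            // one million floats
    size_t bytes = n * sizeof(float);

    // Allocate memory that both the CPU and GPU can see.
    float *a, *b, *out;
    cudaMallocManaged(&a, bytes);
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&out, bytes);
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    // Launch enough threads to cover every element.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    addArrays<<<blocks, threads>>>(a, b, out, n);
    cudaDeviceSynchronize();

    printf("out[0] = %f\n", out[0]);  // prints 3.000000

    cudaFree(a); cudaFree(b); cudaFree(out);
    return 0;
}
```

The `<<<blocks, threads>>>` launch syntax is the CUDA-specific part: it tells the GPU how many parallel threads to spin up for the kernel.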
Why do GPUs have so many cores?
While today's CPUs typically have four to eight cores, there are graphics cards out there with over 5,000 cores! Why is that, and why can't CPUs have such an insane number of cores?

The CPU and GPU were made for different purposes. A CPU is a generalist: it executes whatever machine code your operating system and applications throw at it and communicates with all the various pieces of hardware in your computer. The GPU is made for one specific purpose: it's supposed to render polygons into the beautiful scenes that we see in 3D-accelerated environments and then translate all of that into a finished image 60 or more times per second. That's a tall order for a CPU, but since the GPU is built out of thousands of small, compartmentalized polygon processors, it can split the workload among all of its cores and render a graphical environment within a few milliseconds.

That's where the cores come in. A GPU needs all of those cores to split massive tasks into tiny pieces, with each core processing its own part of the scene individually. Applications that run on the CPU (like your browser) don't benefit from such an enormous number of cores unless each core has the muscle power of an entire processing unit. Your browser depends on fast access to information rather than on chopping tasks into thousands of pieces. When you load a webpage or read a PDF file, you mostly need one fast processing stream to load all of that up.
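Here's a little sketch of that splitting in action. Nothing about it comes from a real renderer; it just fills a hypothetical 1920×1080 image with a gradient, assigning every pixel to its own GPU thread so that roughly two million tiny jobs run in parallel instead of one long CPU loop.

```cuda
#include <cuda_runtime.h>

// Hypothetical example: "shading" a 1920x1080 image.
// Each thread computes exactly one pixel of the frame.
__global__ void shadePixels(unsigned char *image, int width, int height) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height) {
        // A simple gradient stands in for real shading work.
        image[y * width + x] = (unsigned char)((x + y) % 256);
    }
}

int main() {
    const int width = 1920, height = 1080;
    unsigned char *image;
    cudaMallocManaged(&image, width * height);

    // A 2D grid of 16x16-thread blocks covers the whole frame at once.
    dim3 threads(16, 16);
    dim3 blocks((width + 15) / 16, (height + 15) / 16);
    shadePixels<<<blocks, threads>>>(image, width, height);
    cudaDeviceSynchronize();

    cudaFree(image);
    return 0;
}
```

On a CPU, the same work would be a nested for loop touching one pixel at a time; on the GPU, the grid of blocks carpets the entire frame, and every core chews through its own small share.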
Does more RAM make a video card better?
RAM is a bit of a weird gray area with video cards. While it's nice to have as much RAM as possible, you also need to be able to actually use all of that RAM. A video card with 1024 MB of RAM and a 192-bit-wide bus can perform just as well as (or better than) a video card with 2048 MB of RAM on the same bus. As I explained in the previous piece, the 2048 MB video card will experience something called "bandwidth bottlenecking": the bus (the road that data travels on) isn't wide enough to carry a sufficient amount of data in a short amount of time, so the extra RAM doesn't help.

In short, no, more RAM isn't necessarily better if the video card doesn't have a wide enough bus. Here's my guide to proper bus width: your video card should have at most eight megabytes of RAM for every bit of bus width. Put another way, divide the RAM in megabytes by eight to get the minimum bus width in bits. For example, a 1024 MB card should have at least a 128-bit bus (1024 / 8 = 128), and for a 2048 MB card I recommend a minimum of 256 bits. If you want to see how your own card measures up, take a look at the sketch below.
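This is a minimal sketch of how you could check a card against that rule of thumb using the CUDA runtime's device query API. The rule itself is just my guideline from above, not anything Nvidia publishes; the bandwidth formula is the standard one for double-data-rate memory, and these `cudaDeviceProp` fields are deprecated on the very newest toolkits but still present.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    // Query the properties of GPU 0 through the CUDA runtime.
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);

    double ramMB = prop.totalGlobalMem / (1024.0 * 1024.0);
    int busBits = prop.memoryBusWidth;  // memory bus width in bits

    // Peak theoretical bandwidth in GB/s:
    // clock (kHz) x 2 (double data rate) x bytes moved per transfer.
    double bandwidth = 2.0 * prop.memoryClockRate * (busBits / 8.0) / 1.0e6;

    printf("%s: %.0f MB of RAM on a %d-bit bus (%.1f GB/s peak)\n",
           prop.name, ramMB, busBits, bandwidth);

    // My rule of thumb: at least 1 bit of bus width per 8 MB of RAM.
    if (busBits >= ramMB / 8.0)
        printf("Bus width looks sufficient for that much RAM.\n");
    else
        printf("The bus may bottleneck before you can use all that RAM.\n");
    return 0;
}
```

If you still have more questions, be sure to ask them in the comments below!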