NVIDIA Optimus Unveiled

Optimus is switchable graphics on steroids, but how does it all work, and what makes it so much better than the second generation? Refer back to the last page where we discussed the problems with generation two switchable graphics: Optimus solves virtually every one of the complaints. Manual switching? No longer required. Blocking applications? That doesn't happen anymore. The 5 to 10 second delay is gone; the actual switch takes around 200 ms, and that time is hidden in the application launch process, so you won't notice it. Finally, there's no flicker or screen blanking when you switch between the IGP and dGPU. The only remaining concern is the frequency of driver updates. NVIDIA has committed to rolling Optimus into their Verde driver program, which means you should get at least quarterly driver updates, but we're still looking forward to the day when notebook and desktop drivers all come out at the same time.

As we mentioned, most of the work that went into Optimus is on the software side of things. Where the previous switchable graphics implementations used hardware muxes, all of that is now done in software. The trick is that NVIDIA's Optimus driver is able to look at each running application and decide whether it should use discrete graphics or the IGP. If an application can benefit from discrete graphics, the GPU is "instantly" powered up (most of the 200 ms delay is spent waiting for voltages to stabilize), the GPU does the necessary work, and the final result is then copied from the GPU frame buffer into the IGP frame buffer over the PCI Express bus. This is how NVIDIA is able to avoid screen flicker, and they have apparently done all of the background work using standard API calls so that there's no need to worry about updating drivers for both graphics chips simultaneously. They're a bit tight-lipped about the precise details of the software implementation (with a patent pending for what they've done), but we can at least go over the high-level view and block diagram as we discuss how things work.
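
To make that flow concrete, here's a minimal sketch of the decide-render-copy path just described. To be clear, this is not NVIDIA's actual (patent-pending) driver code; every function is a hypothetical stand-in for a driver internal, stubbed out so the sketch runs end to end.

```c
/* A minimal sketch of the Optimus render path described above -- NOT
 * NVIDIA's actual (patent-pending) driver code. Every function name is
 * a hypothetical stand-in for a driver internal, stubbed so it runs. */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

static bool dgpu_powered = false;

/* Per-application profile lookup (hypothetical). */
static bool profile_wants_dgpu(const char *app) {
    return strcmp(app, "game.exe") == 0;
}

static void dgpu_power_up(void) {
    if (!dgpu_powered) {              /* ~200 ms, mostly voltage ramp,  */
        puts("powering up dGPU");     /* hidden in application launch   */
        dgpu_powered = true;
    }
}

static void present_frame(const char *app) {
    if (profile_wants_dgpu(app)) {
        dgpu_power_up();
        puts("frame rendered on discrete GPU");
        puts("frame copied over PCI Express into the IGP frame buffer");
    } else {
        puts("frame rendered on the IGP (dGPU stays off)");
    }
    /* The IGP frame buffer drives the display either way: no mux,
     * no mode switch, and therefore no flicker. */
}

int main(void) {
    present_frame("notepad.exe");   /* IGP path  */
    present_frame("game.exe");      /* dGPU path */
    return 0;
}
```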


NVIDIA states that their goal was to create a solution similar to a hybrid car. In a hybrid, the driver doesn't worry about whether they're currently running off the battery or the regular engine. The car knows what's best and dynamically switches between the two as necessary. It's seamless, it's easy, and it just works. You get great performance and great battery life without worrying about the small details. (Well, almost, but we'll discuss that in a bit.) The demo laptop for Optimus is the ASUS UL50Vf, which looks identical to the UL50Vt from the outside, but there are some internal changes.


Previously, switchable graphics required several hardware multiplexers and a hardware or software switch. With Optimus, all of the video connections come through the IGP, so there's no extra hardware on the motherboard. Let me repeat that, because this is important: Optimus requires no extra motherboard hardware (beyond the GPU, naturally). It is now possible for a laptop manufacturer to have a single motherboard design with an optional GPU. They don't need extra PCB layers for the additional video traces and multiplexers, R&D times are cut down, there are no signal integrity issues or other quality concerns to worry about, and no extra board real estate is required for multiplexers. In short, if a laptop has an NVIDIA GPU and a CPU/chipset with an IGP, going forward there is no reason it shouldn't have Optimus. That takes care of one of the biggest barriers to adoption, and NVIDIA says we should see more than 50 Optimus-enabled notebooks this summer. These will be in everything from next-generation ION netbooks to CULV designs, multimedia laptops, and high-performance gaming monsters.


We stated that most of the work was on the software side, but there is one new hardware feature required for Optimus, which NVIDIA calls the Optimus Copy Engine. In theory you could do everything Optimus does without the Copy Engine, but in that case the 3D Engine would be responsible for getting the data over the PCI-E bus and into the IGP frame buffer. The problem with that approach is that the 3D Engine would have to delay work on graphics rendering while it copied the frame buffer, resulting in reduced performance (it would take hundreds of GPU cycles to copy a frame). To eliminate this problem, NVIDIA added a Copy Engine that works asynchronously from frame rendering, and to do that they had to decouple it from the rest of the graphics engine. With those hardware changes complete, the rest is relatively straightforward. Graphics rendering is already buffered, so the Copy Engine simply transfers a finished frame over the PCI-E bus while the 3D Engine continues working on the next frame.
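
To see why the asynchronous Copy Engine matters, here's a hypothetical illustration of the pipelining it enables, using two host threads and a two-deep buffer queue in place of the two hardware engines. None of this is NVIDIA code; it simply models the hand-off described above.

```c
/* Hypothetical illustration (not NVIDIA code) of the pipelining the
 * Copy Engine enables: two host threads stand in for the two hardware
 * engines, with a two-deep frame buffer queue between them. */
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

#define FRAMES 4
static sem_t rendered;   /* counts finished frames awaiting copy  */
static sem_t free_bufs;  /* counts buffers the renderer may reuse */

static void *three_d_engine(void *unused) {
    for (int n = 0; n < FRAMES; n++) {
        sem_wait(&free_bufs);                 /* grab a free buffer  */
        printf("3D Engine: rendering frame %d\n", n);
        sem_post(&rendered);                  /* frame ready to copy */
    }
    return NULL;
}

static void *copy_engine(void *unused) {
    for (int n = 0; n < FRAMES; n++) {
        sem_wait(&rendered);                  /* finished frame in    */
        printf("Copy Engine: DMA frame %d over PCI-E to the IGP\n", n);
        sem_post(&free_bufs);                 /* buffer can be reused */
    }
    return NULL;
}

int main(void) {
    pthread_t render_thread, copy_thread;
    sem_init(&free_bufs, 0, 2);   /* double buffering: render runs ahead */
    sem_init(&rendered, 0, 0);
    pthread_create(&render_thread, NULL, three_d_engine, NULL);
    pthread_create(&copy_thread, NULL, copy_engine, NULL);
    pthread_join(render_thread, NULL);
    pthread_join(copy_thread, NULL);
    return 0;
}
```

With two buffers, the 3D Engine can start frame N+1 as soon as frame N is handed to the Copy Engine, so the copy time is hidden behind rendering rather than added to it.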

If you're worried about bandwidth, consider this: in a worst-case situation where 2560x1600 32-bit frames are sent at 60 FPS (the typical LCD refresh rate), the copying only requires 983MB/s. An x16 PCI-E 2.0 link is capable of transferring 8GB/s, so there's still plenty of bandwidth left. A more realistic resolution of 1920x1080 (1080p) reduces the bandwidth requirement to 498MB/s. Remember that PCI-E is bidirectional as well, so there's still 8GB/s of bandwidth from the system to the GPU; the bandwidth from GPU to system isn't used nearly as much. There may be a slight performance hit relative to native rendering, but it should be less than 5%, and the cost and routing benefits far outweigh such concerns. NVIDIA states that copying a frame takes roughly 20% of the frame display time, adding around 3ms of latency.
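
The arithmetic is easy to verify with a quick standalone check (our own sketch, nothing from NVIDIA). Note that the quoted figures come out in decimal megabytes (10^6 bytes per MB):

```c
/* Quick sanity check of the bandwidth and latency figures quoted above. */
#include <stdio.h>

int main(void) {
    const double bytes_per_pixel = 4.0;            /* 32-bit color        */
    const double fps = 60.0;                       /* typical LCD refresh */
    const int res[][2] = { {2560, 1600}, {1920, 1080} };

    for (int i = 0; i < 2; i++) {
        /* frame size x refresh rate, in decimal megabytes per second */
        double mbps = res[i][0] * res[i][1] * bytes_per_pixel * fps / 1e6;
        printf("%dx%d @ 60 FPS: %.0f MB/s\n", res[i][0], res[i][1], mbps);
    }
    /* NVIDIA's latency figure: ~20% of one 16.7 ms frame interval */
    printf("copy latency: ~%.1f ms\n", 0.20 * 1000.0 / fps);
    return 0;
}
```

This prints 983 MB/s and 498 MB/s for the two resolutions, and about 3.3 ms of copy latency, matching the numbers above.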

That covers the basics of the hardware and software, but Optimus does more than simply render an image and transfer it to the IGP buffer. It needs to know which applications require the GPU, and that brings us to the next major software enhancement NVIDIA delivers with Optimus.

Comments

  • jfmeister - Tuesday, February 9, 2010 - link

    I was anxious to get an M11x, but 2 things were bothering me:
    1- No DirectX 11 compatibility
    2- No Core i5/i7 platform.

    Now there is another reason to wait for the refresh. But with Arrandale prices dropping, DX11 cards available, and Optimus, I would expect Alienware to get on the bandwagon fast with a new M11x platform rather than waiting 6 to 8 months for a refresh. This ultraportable is intended for gamers, and we all know that gamers are on top of these things. Optimus in the M11x's case should be a must.

    BTW, what I find funny is that Optimus looks like a revolution, but what about 3dfx 10 years ago with their 3D card add-on (Monster 3D 8MB ftw)? Switching was used back then... This looks like the same thing, except with HD video support! It took that long to come up with that?
  • JarredWalton - Tuesday, February 9, 2010 - link

    Remember that the switching back in the days of 3dfx was just in software and that the 3D GPU was always powered. There was the dongle cable situation as well. So the big deal here isn't just switching to a different GPU, but doing it on-the-fly and powering the GPU on/off virtually instantly. We think this will eventually make its way into desktops, but obviously it's a lot more important for laptops.
  • StriderGT - Tuesday, February 9, 2010 - link

    My take on Optimus:

    Optimus' roots lie in Hybrid SLI.
    Back then it was advertised as an nvidia-only chipset feature (nvidia IGP + nvidia GPU) for both desktops and notebooks.

    Currently nvidia is being rapidly phased out of PC x86 chipsets, so Optimus is the only way to at least get an nvidia GPU into an intel IGP-based system, but:

    1. The only real benefit is gaming performance without sacrificing battery life in notebooks.
    2. Higher cost (in the form of the discrete GPU); intel has 60%+ of the GPU (=IGP) market because the vast majority don't care, or are uninformed, about gaming performance.
    3. CUDA/PhysX is currently, and for the foreseeable future, irrelevant for mobile applications (gaming is much more relevant in comparison).
    4. Video decoding capabilities are already present in most current IGPs (except Pine Trail netbooks, which can acquire it with a cheap dedicated chip).
    5. Netbooks will not benefit from Optimus because they lack the CPU horsepower to feed the discrete GPU and are very cost sensitive... (same reason that ION1/2 is not the primary choice for netbook builders)
    6. In the desktop space, only some niche small form factor PC applications could benefit from such a technology, e.g. an SFF PC would need less cooling/noise during normal (IGP) operation and could become louder but more powerful while gaming (GPU).
    7. Idle/2D power consumption of most modern desktop GPUs is already so low that the added complexity of a simultaneously working onboard IGP and the associated software brings no benefit.
    8. Driver/application software problems might arise from the complexity of profiles and the vastly different workload scenarios.

    So in the end it boils down to how nvidia can convince the world that a discrete GPU and its added cost are necessary in every portable device (netbook-sized and upwards) out there. As for the desktop side, it will be even more difficult to push such a thing, with only the noise reduction in small form factor PCs being of interest.

    BTW, at least now the manufacturers won't have any more excuses for the lack of a decent GPU inside some of the cheaper notebook models ($500-1000) on battery life grounds.
    Oh well, I'll keep my hopes low after so much time being a niche market, since they might find some other excuse along the lines of the weight and space required for cooling the GPU during A/C operation... :-(

    PS Initially posted on yahoo finance forum
  • Zoomer - Tuesday, February 9, 2010 - link

    Not like it was really necessary; the Voodoo 2 used maybe 25W (probably less) and was meant for desktop use.
  • jfmeister - Tuesday, February 9, 2010 - link

    Good point! I guess I did not take the time to think about it. I was more into the concept than the whole technical side of it that you brought up.

    Thanks!

    JF
  • cknobman - Tuesday, February 9, 2010 - link

    Man, the M11x was the biggest disappointment out there. Weak sauce last-gen processor on a so-called premium high-end gaming brand? I'll consider it once they get an Arrandale CULV and Optimus, 'cause right now, looking at the notebookreview.com forums, it has manual switchable graphics, not Optimus.
  • crimson117 - Tuesday, February 9, 2010 - link

    Which processor should they have used, in your opinion?
  • cknobman - Tuesday, February 9, 2010 - link

    They should have waited another month to market and used the Core i7 ULV processors. There are already a few vendors using this proc (Panasonic is one).
  • Wolfpup - Tuesday, April 20, 2010 - link

    Optimus is impressive software, but personally I don't want it, ever. I don't want Intel graphics on my CPU. I don't want Intel graphics in my memory controller. I don't want Intel graphics. I want my real GPU to be my real GPU, not a helper device that renders something that gets copied over to Intel's graphics.

    I just do not want this. I don't like having to rely on profiles either; thankfully you can manually add programs, but still.
