
Power management update 8/23/2014

JohnCarmack
Explorer
The current (not yet in developers’ hands) device build has a new clock control architecture.

Developers now specify a CPU level and a GPU level. These are abstract quantities, not MHz / GHz, so some effort can be made to make them compatible with future devices. For the initial hardware, the levels can be 0, 1, 2, or 3 for CPU and GPU. 0 is the slowest and most power efficient, 3 is the fastest and hottest. The exact speeds are still being determined, but right now there is about a factor of two difference between the 0 and 3 levels. GPU level 3 is only allowed with CPU levels 0 and 1, and even there, it is expected to overheat fairly quickly. Requests for GPU 3 with CPU 2 or 3 will be automatically dropped down to GPU 2.
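
To make that clamping rule concrete, here is a minimal sketch; the struct and function names are hypothetical (the system applies this automatically), but the rule is the one described above.

```cpp
// Illustrative only: the system enforces this clamp itself.
#include <cstdio>

struct ClockLevels {
    int cpu;  // 0 (slowest / coolest) .. 3 (fastest / hottest)
    int gpu;  // 0 .. 3
};

// GPU level 3 is only allowed when the CPU level is 0 or 1; otherwise the
// request is dropped to GPU level 2.
ClockLevels ClampClockLevels(ClockLevels requested) {
    ClockLevels granted = requested;
    if (requested.gpu == 3 && requested.cpu >= 2) {
        granted.gpu = 2;
    }
    return granted;
}

int main() {
    ClockLevels granted = ClampClockLevels({ 3, 3 });
    printf("granted CPU %d / GPU %d\n", granted.cpu, granted.gpu);  // CPU 3 / GPU 2
    return 0;
}
```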

On the previous device build, the CPU / GPU clock rates functioned as a minimum, but the system governors would still increase the clock rate if it looked like you needed it, which would increase the temperature, and quickly lead to thermal throttling dropping the clocks well below the desired settings. On the new build, the CPU and GPU clocks are completely fixed to the set values until the device temperature reaches the limit, at which point the CPU and GPU clocks will change to the 0 / 0 levels. This can be detected, and some apps may choose to continue operating in a degraded fashion, perhaps by changing to 30 fps or monoscopic rendering. Other apps may choose to put up a warning screen saying that play can’t continue (because it would be terrible), which is what we are doing at the upcoming show.
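
As a rough illustration of the per-frame decision, here is a hedged sketch; the inputs are plain flags rather than real SDK queries, and the degrade modes are just the options mentioned above.

```cpp
// Illustrative sketch of reacting to the thermal throttle; not SDK code.
enum class DegradeMode { Normal, HalfRate30Fps, Monoscopic, WarningScreen };

// throttledToZero: the device hit the thermal limit and the clocks were
// forced to the 0 / 0 levels (however the app detects that).
// canRunAtLevelZero: the app can still give an acceptable experience at 0 / 0.
DegradeMode ChooseDegradeMode(bool throttledToZero, bool canRunAtLevelZero) {
    if (!throttledToZero) {
        return DegradeMode::Normal;
    }
    // Keep running in a degraded fashion if possible (30 fps or monoscopic
    // rendering); otherwise warn the user that play can't continue.
    return canRunAtLevelZero ? DegradeMode::HalfRate30Fps
                             : DegradeMode::WarningScreen;
}
```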

How long your game will be able to play before running into the thermal limit is a function of how much work your app is doing, and what the clock rates are. Changing the clock rates all the way down only gives about a 25% reduction in power consumption for the same amount of work; most power saving has to come from actually doing less work in your app. This is critically important – there isn’t going to be any magic setting in the SDK that fixes power consumption.

If your app can run at the 0 / 0 settings, it should never have thermal issues. This is still two cores at around 1 GHz and a 240 MHz GPU, so it is certainly possible to make sophisticated applications at that level, but Unity based applications are very unlikely to make that cut.

There are effective tools to reduce the required GPU performance – don’t use chromatic aberration correction on TimeWarp, don’t use 4x MSAA, and reduce the eye target resolution. Using 16 bit color and depth buffers can also help some. It is probably never a good trade to go below 2x MSAA; you should instead reduce the eye target resolution. These are all quality tradeoffs, which need to be balanced against things you can do in your game, like reducing overdraw (especially blended particles) and complex shaders. Always make sure textures are compressed and mipmapped.
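
To make the order of those trades concrete, here is a rough sketch; the config struct and helper are purely illustrative, not actual SDK parameters.

```cpp
// Illustrative quality knobs, roughly in the order suggested above.
struct EyeBufferConfig {
    int  width  = 1024;            // eye target resolution
    int  height = 1024;
    int  msaaSamples = 4;
    bool use16BitColor = false;    // 16-bit color / depth can help some
    bool use16BitDepth = false;
    bool chromaticCorrection = true;  // chromatic aberration correction in TimeWarp
};

// Apply one quality reduction step at a time.
void ReduceGpuCost(EyeBufferConfig& cfg) {
    if (cfg.chromaticCorrection) { cfg.chromaticCorrection = false; return; }
    if (cfg.msaaSamples > 2)     { cfg.msaaSamples = 2; return; }
    if (!cfg.use16BitColor)      { cfg.use16BitColor = cfg.use16BitDepth = true; return; }
    // Never drop below 2x MSAA; reduce the eye target resolution instead.
    cfg.width  = cfg.width  * 3 / 4;
    cfg.height = cfg.height * 3 / 4;
}
```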

Reducing the required CPU performance is much less straightforward. Unity apps should be using the multithreaded renderer option, since two cores running at 1 GHz is a more efficient way to do the work than one core running at 2 GHz. The NGUI problems that some devs were seeing have been fixed in Unity 4.5.3F3, which is now publicly available. Beyond that, it is just profiling and optimizing.

If you find that you just aren’t close, then you may need to set MinimumVsyncs to 2 and run your game at 30 fps, with TimeWarp generating the extra frames. Some things work out OK like this, but some interface styles and scene structures highlight the limitations.
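
To put numbers on that, a quick worked example of the frame budget at MinimumVsyncs = 2, assuming a 60 Hz display; the arithmetic is the only thing this shows.

```cpp
#include <cstdio>

int main() {
    const double displayHz     = 60.0;
    const int    minimumVsyncs = 2;                          // render every 2nd vsync
    const double appFps        = displayHz / minimumVsyncs;  // 30 fps
    const double frameBudgetMs = 1000.0 / appFps;            // ~33.3 ms per app frame
    printf("app runs at %.0f fps, %.1f ms per frame; TimeWarp fills the in-between vsyncs\n",
           appFps, frameBudgetMs);
    return 0;
}
```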

So, the general advice is:

If you are making an app that is expected to be used for long periods of time, like a movie player, you need to pick very low levels. Ideally 0 / 0, but it is possible to use more graphics if the CPUs are still mostly idle, perhaps up to 0 CPU / 2 GPU.

If you are ok with the app being something that is only played in ten minute chunks, you can choose higher clock levels. If your app doesn’t work well at 2 / 2, you probably need to do some serious work.

With the clock rates fixed, observe the reported FPS and GPU times in logcat. The GPU time reported does not include the time spent by TimeWarp, or the time spent resolving the rendering back to main memory from on-chip memory, so it is an underestimate. If the GPU times stay under 12 ms or so, you can probably reduce your GPU clock level. If the GPU times are low, but the frame rate isn’t 60, you are CPU limited.
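
That tuning loop can be summarized as a simple heuristic; the thresholds and the function below are illustrative only, using the FPS and GPU time values reported in logcat.

```cpp
// Illustrative tuning heuristic, not SDK code.
enum class Suggestion { ReduceGpuLevel, CpuBound, KeepTuning };

Suggestion InterpretFrameStats(double fps, double gpuTimeMs) {
    // Reported GPU time excludes TimeWarp and the resolve back to main
    // memory, so treat it as an underestimate.
    if (gpuTimeMs < 12.0 && fps >= 59.0) {
        return Suggestion::ReduceGpuLevel;  // GPU headroom: try a lower GPU clock level
    }
    if (gpuTimeMs < 12.0 && fps < 59.0) {
        return Suggestion::CpuBound;        // low GPU time but not 60 fps: CPU limited
    }
    return Suggestion::KeepTuning;
}
```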

Optimize until it runs well.
9 REPLIES

IridiumStudios
Honored Guest
Thanks for the update, Mr. Carmack.

Since we won't have access to the new build in the next week, I guess trying to cut down on GPU consumption is the best way to go about power optimization. We're essentially doing so blind, because A) GPU profiling in Unity is currently inoperable, and B) we have no direct way of measuring power consumption. But many of the settings you listed are a good start; we'll try them and see which ones don't drastically decrease visual fidelity.

MikeMandel
Honored Guest
This is great info - can't wait for the new drop. We'll continue to optimize GPU/CPU in the meantime, and try out some of the tips you outlined. For folks having trouble with GPU profiling, we've gotten some numbers on our shaders out of the Adreno profiler, as long as we turn off the Oculus cameras and uncheck development build (otherwise it won't connect for us).

One question: will we be able to adjust these clock rates dynamically as our app runs? We do a bunch of analysis up front, which could benefit from something like CPU 3/GPU 0. But then in-game, we actually do far less CPU work and much more GPU work (so maybe CPU 0/ GPU 2). It would be awesome if we could switch rates between scene loads to optimize our app's experience.

bradherman
Honored Guest
I really appreciate the detailed update. It gives us some ideas of what we can work with. Looking forward to the new SDK. One thing we don't know is the final device's specifications when it comes to battery power and thermal dissipation properties. I would think that those things are influenced by the final device and headset.

IronMan
Honored Guest
@MikeMandel - I know it's tempting to ramp up CPU and GPU speeds when convenient, but you should always strive to optimize as best you can and keep the CPU and GPU clock levels as low as you can get them. At this stage of the tech, VR apps need to be highly optimized. These interfaces aren't really meant to compensate for less-than-desirable performance in code or scene data. (Not that your stuff is 🙂 .. just trying to get that point across)

AlanMoss
Honored Guest
Is it within best practice with this new power management to bump up the CPU level periodically? For instance, while loading assets. In the older July-S5 SDK the CPU gladly throttled up when unzipping assets. Now, it doesn't. That's the only real difference I see in our native app since switching over to the new SDK. That, and toasts are not displaying properly 😕

Thanks,
-Alan
We're hiring Android, 3D, and Web Developers. Take a look at NextVR's job postings: http://nextvr.com/jobs

MikeMandel
Honored Guest
I'm getting at something similar to what AlanMoss points out. We are optimizing our game as best we can, but there are legit cases where the CPU and GPU levels we need are essentially flipped. On the loading screen, we do tons of procedural analysis and want to fire the CPU at a high level, but really need very little from the GPU beyond showing a progress bar (i.e. CPU 3 / GPU 0). In game, we need far less from the CPU, but want to hit the GPU harder (i.e. CPU 0 / GPU 2). User experience can obviously be improved with faster load times, and we are certainly optimizing the code that runs during loading, but I want to explore our options.

IronMan - are you saying that it's not possible to switch these levels dynamically, or just ill-advised because we all need to hit CPU 0 / GPU 0 anyways?! 🙂

EMcNeill
Honored Guest
I'm curious as well about adjusting the speeds at runtime. My game involves a lot of procedural generation on-the-fly, where the game needs to stop and think for a while. It seems like it would be best to max out the CPU for those moments to reduce perceived loading times while running slower at other times.

AlanMoss
Honored Guest
I'm not sure if it is a best practice yet or not, but I am able to switch dynamically during loading and back again during frame rendering and it works just fine. I was just curious whether or not it would be frowned upon. 🙂
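
Roughly, the pattern looks like this; SetClockLevels() here is just a hypothetical stand-in for whatever call the SDK actually exposes for changing the levels at runtime.

```cpp
#include <cstdio>

// Hypothetical stand-in for the real clock level request.
void SetClockLevels(int cpuLevel, int gpuLevel) {
    printf("requesting CPU %d / GPU %d\n", cpuLevel, gpuLevel);
}

void LoadScene() {
    SetClockLevels(3, 0);   // loading: heavy CPU work, GPU nearly idle
    // ... unzip assets, run procedural generation, build meshes ...
    SetClockLevels(0, 2);   // gameplay: lighter CPU load, heavier rendering
}
```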
We're hiring Android, 3D, and Web Developers. Take a look at NextVR's job postings: http://nextvr.com/jobs

efroemling
Honored Guest
I'm curious if the plan going forward is to keep things fixed like this, or if there's any hope of adjusting the system governors to better handle these types of heavy workloads (being more conservative in general while still allowing short bursts of speed).

For instance, if I can get by with CPU level 0 most of the time but could use level 3 during brief spikes in activity, such as chain-reaction explosions, it would be nice to have that happen automatically.

At the moment I'm not sure whether it's better to aim low and risk dropping frames during the explosions, or to aim high and burn too much energy the rest of the time (and risk falling off the thermal cliff).

I realize this is a complicated problem; just curious.