Mountains

Monday, January 27, 2025

2024 GPD Win 4 Undersized USB-C Adapter Benchmark Study and AMD APU TDP Benchmark in Split GPU/CPU Workloads

I recently got a 2024 GPD Win 4 (32 GB RAM, Ryzen 8840u) for couch gaming. I had been recharging it with a 30 W USB-C PD phone charger at my desk, and also running updates and downloads on that power.

A few days ago, I had run the battery down and wanted to finish a game level, so I plugged it into the phone charger. To my dismay, the framerate of the game dropped and it started to stutter. Obviously I didn't expect it to perform perfectly on an adapter weaker than the recommended 65 W one, but I did expect it to draw from the adapter and the battery simultaneously.

This led me to do some experiments using GPD Motion Control to cap the power draw and to enable/disable CPU boost, comparing the performance of the GPD Win 4 on a 65 W and a 30 W (Anker 511 Nano) charger.

TL;DR "I HATE GRAPHS"

  1. Using a 30 W adapter with CPU Boost enabled on GPU-heavy workloads reduces performance by a lot: a 69% reduction in GPU, an 81% reduction in single-core, and a 97% reduction in multi-core performance (percentages computed as in the sketch after this list).
  2. The GPD Win 4 cannot automatically limit its power draw to the 30 W limit of the power source.
  3. Enabling Boost causes transient surges that exceed the TDP cap and cause power delivery problems on weak chargers.
  4. There is no GPU/CPU power partition in mixed workloads: every watt used by compute is a watt unavailable to graphics. At least in these tests, this runs counter to claims elsewhere that the GPU has its own power limit.
  5. To that end: for maximum frame rates, reduce CPU activity as much as possible and turn off CPU boost to free power and TDP headroom for the GPU.
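
For the curious, the percentages above are simple relative score drops. A minimal sketch of the arithmetic, with placeholder scores rather than my actual Geekbench numbers:

    # Percent reduction relative to a baseline run. Both scores below are
    # placeholders, not real Geekbench 6 results.

    def pct_reduction(baseline: float, degraded: float) -> float:
        """Percentage drop from baseline to degraded."""
        return (1.0 - degraded / baseline) * 100.0

    baseline_score = 2500   # e.g. 65 W adapter, boost enabled (hypothetical)
    degraded_score = 500    # e.g. 30 W adapter, boost enabled (hypothetical)
    print(f"{pct_reduction(baseline_score, degraded_score):.0f}% reduction")   # 80% with these numbers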

Comparing 65 W and 30 W adapters

The purpose of this test was to evaluate how the 65 W and 30 W adapters affected performance with CPU Boost enabled and disabled under different power limits in Motion Control. Geekbench 6 was run at 28, 18, and 12 W power limits, with Boost alternately enabled and disabled, on both adapters. Power usage was monitored with a Kill-A-Watt and the package power readout in Motion Control. The 18 W level was run twice to assess variation.
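
For reference, the full run matrix can be written out as a checklist. This is just a sketch; Motion Control has to be set by hand, so nothing here drives the tools:

    from itertools import product

    adapters = ["65 W", "30 W (Anker 511 Nano)"]
    power_limits_w = [28, 18, 12]          # the 18 W level was run twice to assess variation
    boost_states = ["enabled", "disabled"]

    for adapter, limit_w, boost in product(adapters, power_limits_w, boost_states):
        print(f"Adapter {adapter}: set Motion Control to {limit_w} W, boost {boost}, run Geekbench 6")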

For single-core workloads, there is no difference between the 30 W and 65 W adapters at any power limit with boost disabled. Enabling boost on the 65 W adapter nearly doubles single-core performance, while enabling boost on the 30 W adapter imposes a severe performance penalty. The penalty is present at all three power limits and does not ease until the limit is dropped to 12 W.


For multicore workloads, the 65 W adapter yielded better performance than the 30 W adapter at 28 W. At 18 W the multicore scores without boost were about the same, but the 65 W adapter still provided some benefit to the boost-enabled runs. With boost enabled, the 30 W adapter hampered performance at every power limit by a large margin, underperforming even the single-core boost-disabled score.

 

The GPU scores follow the same pattern as the other tests: the benchmarks are largely the same with boost disabled or enabled, but runs with the 30 W adapter and boost enabled show lower GPU scores. Interestingly, at the 18 W limit, the boost-enabled 65 W and 30 W scores are similar.

During the tests with boost enabled on the 30 W adapter, both the Kill-A-Watt and the Motion Control package power readout showed that the power draw at the wall was often far below the set cap.


Sustained GPU Load Test

A weakness of the Geekbench tests is that they don't generate sustained load on the system. Because my primary concern was performance during gaming, I ran Furmark 2.0 benchmarks at 720p, which prioritize GPU load for long enough for the system to get close to thermal and power equilibrium.
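
By "close to equilibrium" I just mean that the package power and temperatures stop trending. A rough sketch of that check, with made-up readings, since I eyeballed this from the monitoring rather than logging it:

    # Compare the mean of the last two windows of samples; if they agree within a
    # tolerance, the system has roughly settled. The readings below are synthetic.

    def settled(samples, window=60, tolerance_w=0.5):
        """True once the means of the last two windows differ by less than tolerance_w."""
        if len(samples) < 2 * window:
            return False
        recent = samples[-window:]
        previous = samples[-2 * window:-window]
        return abs(sum(recent) / window - sum(previous) / window) < tolerance_w

    readings = [12 + min(i / 20, 6) for i in range(240)]   # a ramp that flattens at 18 W
    print(settled(readings))                               # True once the ramp has flattened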

The Furmark tests were extremely interesting: the graphics workload suffered with the 30 W adapter regardless of boost status, even though the boost-disabled 30 W runs had lined up with the 65 W results in the other tests.



Another observation is that the 28 W tests showed wildly varying readings on the Kill-A-Watt. The wall power would momentarily spike into the mid-40 W range, collapse, and spike again. I suspect this indicates that the GPD is over-drawing the power supply and the supply is engaging some kind of overcurrent protection. If that is the case, the throttling in the benchmark might be due to voltage droop on the APU VRMs causing the CPU and/or GPU to downclock. This would explain why the other tests could not hold a steady ~30 W draw.

Under the 18 W cap, the Kill-A-Watt showed a very steady 28-32 W draw at the wall, and the benchmarks align nicely.

This probably means that the GPD is not intelligently reducing its performance to match the power source; ideal behavior would be a stable draw at or below 30 W even with the APU power cap set higher.

I suspect, but have no way of confirming without either putting test leads on the main board or using an inline USB-PD power meter, that the GPD Win 4 does not probe the capability of the power supply: it simply assumes or guesses that it has a 65 W adapter and asks the adapter for 20 V at as many amps as possible.

The Anker 511 Nano supports 20 V at 1.5 A, which curiously appears to be out of spec for USB-PD 3.0, but maybe I don't know how to read. This makes me wonder whether the GPD can run off a 15 V or a 9 V supply (assuming, of course, that it is asking for 20 V).
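
For context, the wattage math on those rails is simple volts-times-amps. The 20 V / 1.5 A figure is the Anker's advertised rail; the rest are just illustrative:

    def watts(volts, amps):
        return volts * amps

    print(watts(20, 1.5))    # 30.0 W -> the Anker 511 Nano's 20 V rail
    print(watts(20, 3.25))   # 65.0 W -> what a 65 W adapter needs to supply at 20 V
    print(watts(15, 2.0))    # 30.0 W -> the same 30 W delivered at 15 V
    print(watts(9, 3.0))     # 27.0 W -> a typical 9 V phone-charger rail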

Overall, I conclude from this battery of tests that the 30 W charger is probably fine for charging, but probably not fine for shore power, and that if you're in the market, a compact 65 W GaN charger seems safe and sane. The power cap on the APU is not strict, and bursty work, for example with Boost enabled or from the GPU, can instantly create high power draws that weaker power supplies struggle with. If you simply must use a smaller adapter, consider an aggressive power cap for stable performance; 12 W seems like a reasonable starting point for a 30 W adapter without further controls.

GPU/CPU Mixed Workload Benchmark

After testing the performance of the GPD Win 4 under different APU power caps, I got curious about how the Ryzen 8840u shares power between the GPU and CPU in mixed workloads.

To test this, I ran prime95 torture tests with small FFTs (to stress the CPU cores) and large FFTs (to stress RAM). I logged the power draw (the CPU was at a 28 W cap with boost disabled), then ran a Furmark benchmark while the prime95 load was in place. I stepped the number of prime95 workers and threads and iterated through all the permutations.
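
The bookkeeping for those permutations looked roughly like the sketch below. The Furmark score and package power were read off and typed in by hand, so record_run is a stand-in for that manual step, and the worker counts and file name are placeholders:

    import csv
    from itertools import product

    fft_sizes = ["small", "large"]        # small FFTs stress the CPU cores, large FFTs stress RAM
    worker_counts = [0, 1, 2, 4, 6, 8]    # 0 = Furmark alone; placeholder values

    def record_run(fft, workers):
        score = float(input(f"Furmark score with {workers} {fft}-FFT workers: "))
        package_w = float(input("Package power (W): "))
        return {"fft": fft, "workers": workers, "furmark_score": score, "package_w": package_w}

    with open("mixed_workload_results.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["fft", "workers", "furmark_score", "package_w"])
        writer.writeheader()
        for fft, workers in product(fft_sizes, worker_counts):
            writer.writerow(record_run(fft, workers))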

I have read in various places that the Ryzen APUs have separate power caps for the compute and graphics cores. For the 8840u I have seen it quoted that the GPU has a 15 W limit.

In this test, there is no plateau where GPU performance stays stable while the compute power draw increases. Every additional worker, whether CPU bound or memory bound, results in a decrease in Furmark performance.

 

I don't think this is a perfect test for detecting such a plateau: there is likely a way to stress the GPU the way Furmark does with less CPU load. However, I think it's a fair real-world example. Games have a CPU component, and this shows that for the 8840u, each watt of power used by compute is a watt unavailable to the GPU; there is no partitioned power budget where the GPU runs unhindered while leaving headroom for calculations.
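
To make the "no plateau" point concrete, here is a toy model with made-up numbers, not measurements: a single shared power pool (what I observe) versus a separate ~15 W GPU cap (the claimed behavior) under a 28 W package limit:

    # Toy model: GPU power available as CPU load grows, under two hypothetical schemes.

    PACKAGE_CAP_W = 28
    GPU_DEMAND_W = 25       # assume Furmark would happily use this much
    QUOTED_GPU_CAP_W = 15   # the separate GPU limit quoted elsewhere

    def shared_pool(cpu_w):
        """Every compute watt comes straight out of the GPU budget (what I observe)."""
        return min(GPU_DEMAND_W, PACKAGE_CAP_W - cpu_w)

    def partitioned(cpu_w):
        """The GPU keeps its own cap until CPU load eats into the remainder (the claimed plateau)."""
        return min(QUOTED_GPU_CAP_W, PACKAGE_CAP_W - cpu_w)

    for cpu_w in range(0, 25, 4):
        print(f"CPU {cpu_w:2d} W -> shared pool: {shared_pool(cpu_w):2d} W, partitioned: {partitioned(cpu_w):2d} W")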

I ran the tests with both big and small FFTs to see whether memory traffic would bottleneck GPU performance more. Surprisingly, for low worker counts it seems like yes, there is a small additional decrease in graphics performance, but as the number of workers increases, the memory-bottlenecked work does not pose much of a limit on performance and might actually be faster. Maybe this is due to cache misses in the compute work while Furmark is able to work out of cache? Regardless, it seems like compute memory-bandwidth usage does not meaningfully block graphics performance.

 

There is a weakness here, however: a more focused benchmark would issue memory I/O without generating much compute load, creating contention in the memory controller without power draw in the CPU cores. So some plateau in GPU performance might exist, but only in a more synthetic scenario.
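
A sketch of what that more focused benchmark might look like (untested): stream through a buffer much larger than the caches, so the traffic hits the memory controller while the arithmetic load stays small.

    import numpy as np

    # Untested idea: repeatedly copy a buffer far larger than the CPU caches so
    # every pass goes through the memory controller. numpy's copyto is essentially
    # a memcpy, so the compute load stays low. Run it alongside Furmark and watch
    # the GPU score.

    SIZE_BYTES = 256 * 1024 * 1024                     # ~256 MB, well beyond cache
    src = np.zeros(SIZE_BYTES // 8, dtype=np.float64)
    dst = np.empty_like(src)

    for _ in range(10_000):                            # bounded; stop early with Ctrl+C
        np.copyto(dst, src)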