As I described in “Raspberry Pi XBMC and DTS Audio”, the Raspberry Pi is not quite fast enough to decode the DTS audio which often accompanies 1080p and 720p media files. I decided to have a go at overclocking the Raspberry Pi to see if I could get it to decode the DTS streams and speed up the slightly sluggish XBMC interface.
The Raspberry Pi is based around a System on Chip with an ARM CPU, a Broadcom VideoCore IV GPU and 256MB of RAM. Each of these parts can be clocked individually, and the GPU can be further broken down into Core (part of the ARM which shares the GPU clockgen), H264 decoder, 3D processor and image sensor processor.
The GPU clocks must be exact integer multiples of each other, though, because they share a single clock source. The various blocks of the GPU are also known to have different overclocking headroom: the “core” can often be overclocked much further than the 3D processor, allowing you to do things like clock the core at 400MHz and the rest of the GPU at 200MHz for non-graphical use cases.
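The integer-multiple rule is easy to check before editing any settings. The following is a minimal sketch, assuming "multiples of each other" means that for every pair of GPU block clocks the larger divides evenly by the smaller; the function name is made up for illustration.

```python
def clocks_are_integer_multiples(freqs_mhz):
    """Return True if, for every pair of clocks, the larger is an exact
    integer multiple of the smaller (assumed reading of the rule)."""
    values = sorted(set(freqs_mhz))
    return all(
        high % low == 0
        for i, low in enumerate(values)
        for high in values[i + 1:]
    )


# Core at 400MHz with the H264, 3D and ISP blocks at 200MHz
print(clocks_are_integer_multiples([400, 200, 200, 200]))  # True
# Core at 400MHz with the other blocks left at 250MHz
print(clocks_are_integer_multiples([400, 250, 250, 250]))  # False
```

This only checks the arithmetic; whether a given combination is actually stable still has to be found by testing on the board.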
To overclock the RasPi you need to create a file called “config.txt” in the root directory of the first DOS partition of your SD card (often /boot under Linux) with some of the following options:
- arm_freq – frequency of ARM in MHz. Default 700.
- gpu_freq – sets core_freq, h264_freq, isp_freq and v3d_freq together.
- core_freq – frequency of GPU processor core in MHz. Default 250.
- h264_freq – frequency of hardware video block in MHz. Default 250.
- v3d_freq – frequency of 3D block in MHz. Default 250.
- isp_freq – frequency of image sensor block in MHz. Default 250.
- sdram_freq – frequency of SDRAM in MHz. Default 400.
- over_voltage – ARM/GPU core voltage adjust.
- over_voltage_sdram – sets over_voltage_sdram_c, over_voltage_sdram_i and over_voltage_sdram_p together.
- over_voltage_sdram_c – SDRAM controller voltage adjust.
- over_voltage_sdram_i – SDRAM I/O voltage adjust.
- over_voltage_sdram_p – SDRAM phy voltage adjust.
For voltage adjustments, the values -16 to 8 equate to 0.8V to 1.4V in 0.025V steps. The default is 0, or 1.2V.
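The mapping between a setting and a voltage is a straight line, so it can be written down directly. This is a small sketch using only the figures quoted above (1.2V at 0, 0.025V per step, range -16 to 8); the function and constant names are made up for illustration.

```python
DEFAULT_VOLTS = 1.2   # voltage at setting 0
STEP_VOLTS = 0.025    # change per setting step
MIN_SETTING, MAX_SETTING = -16, 8


def over_voltage_to_volts(setting):
    """Convert an over_voltage-style setting to the resulting voltage."""
    if not MIN_SETTING <= setting <= MAX_SETTING:
        raise ValueError(f"setting {setting} is outside {MIN_SETTING}..{MAX_SETTING}")
    return round(DEFAULT_VOLTS + setting * STEP_VOLTS, 3)


def volts_to_over_voltage(volts):
    """Convert a target voltage to the nearest setting step."""
    setting = round((volts - DEFAULT_VOLTS) / STEP_VOLTS)
    if not MIN_SETTING <= setting <= MAX_SETTING:
        raise ValueError(f"{volts}V is outside the adjustable range")
    return setting


print(over_voltage_to_volts(-16))  # 0.8
print(over_voltage_to_volts(0))    # 1.2
print(over_voltage_to_volts(8))    # 1.4
```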
WARNING: Overvolting your Raspberry Pi will permanently set a fuse in your SoC and invalidate your warranty. Overvolting your Raspberry Pi will seriously reduce the lifespan of the SoC.
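For example, to set the ARM to 900MHz, the GPU to 200MHz, the core to 400MHz, the RAM to 500MHz and set an overvolt of 1.25V (two 0.025V steps above the 1.2V default, so over_voltage=2) you would use:
arm_freq=900
gpu_freq=200
core_freq=400
sdram_freq=500
over_voltage=2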
To test the results of the overclock I’ll be performing 5 tests:
- GTKPerf – an X Window System graphical benchmark.
- GZip – Test the speed of GZipping the RPi’s kernel contained on a RAM disk.
- ffmpeg – Test the speed at which ffmpeg can decode an MP3 contained on a RAM disk.
- BC – Test the speed at which BC can perform a mathematical function.
- Quake III – Run the standard “timedemo”.
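Command-line tests like the GZip, ffmpeg and BC ones can be timed with a small wrapper. The following is a minimal sketch of such a harness, not the exact method used for the results here; the function name is made up, and the command in the usage line is a placeholder that simply starts and exits the Python interpreter.

```python
import subprocess
import sys
import time


def time_command(argv, runs=3):
    """Run argv `runs` times and return each run's wall-clock time in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the command's output so terminal speed does not skew the timing.
        subprocess.run(
            argv,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    # Placeholder command; substitute the benchmark to measure, for example
    # a gzip run on a file held on a RAM disk.
    print(time_command([sys.executable, "-c", "pass"]))
```

Keeping the input on a RAM disk, as in the tests above, keeps SD card speed out of the measurement.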
Overclocking the ARM
I managed to successfully overclock my Raspberry Pi’s ARM CPU to 900MHz without any overvolting, and to 1150MHz with over_voltage set to 6, or +0.15V. For some reason Quake III became unstable above 950MHz and would crash almost immediately. I suspect this is power related, as the higher CPU overclocks became unstable when mixed with a known-stable core/memory overclock. At one point I had to reduce the RAM overvolt by 0.025V to prevent the Raspberry Pi from crashing after adding 0.05V to the CPU/GPU overvolt. It is possible that my power supply is not providing enough power, though it does claim to be 1.2A @ 5V.
The chart above shows the benchmark times in seconds for the various tests at each stable overclocking speed. I also underclocked the CPU to 600 and 500MHz to get a better picture of the Raspberry Pi’s performance scaling.
The chart above shows the performance of the Raspberry Pi at various tasks at each clock speed, relative to the default clock speed of 700MHz. We can see that not all applications scale the same way on the Raspberry Pi; notably, Quake III scales particularly poorly because it’s largely bound by the speed of the GPU. BC scales much more linearly than the others because it has a limited data set and relies almost entirely on raw CPU throughput.
The chart above shows the average performance of all of the tests except Quake III compared against the default clock speed of 700MHz. We can see that on average it scales reasonably linearly, so every MHz you can get out of the ARM is beneficial. At 950MHz the Raspberry Pi’s average performance was 20% higher than the 700MHz default speed and 33.6% faster for compute-intensive tasks.
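To put the 950MHz figures in context, the gains can be compared with the size of the clock increase itself. The sketch below uses only the numbers quoted above; the helper names are made up, and "fraction of linear" is just the observed gain divided by the clock gain.

```python
def percent_gain(new, base):
    """Percentage increase of `new` over `base`."""
    return (new / base - 1.0) * 100.0


def relative_performance(baseline_seconds, seconds):
    """Relative performance from benchmark times: a shorter time gives a value above 1."""
    return baseline_seconds / seconds


clock_gain = percent_gain(950, 700)  # 700MHz default to 950MHz
for label, observed in (("average", 20.0), ("compute-intensive", 33.6)):
    fraction = observed / clock_gain
    print(f"{label}: {observed:.1f}% gain from a {clock_gain:.1f}% clock increase "
          f"({fraction:.2f} of linear)")
# average: 20.0% gain from a 35.7% clock increase (0.56 of linear)
# compute-intensive: 33.6% gain from a 35.7% clock increase (0.94 of linear)
```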
Overclocking the RAM and GPU
I had initially intended to also measure the scaling performance of the GPU in Quake III, but my Raspberry Pi refused to run Quake III with the 3D block of the GPU running at anything other than the default 250MHz.
I managed to double the speed of the “Core” element of the GPU to 500MHz and increase the RAM speed by 100MHz to 500MHz. I had to set over_voltage_sdram to 1, adding 0.025V to the SDRAM controller, I/O and phy voltages.
The results show how the two components affect performance differently. GTKPerf and Quake III both gained the most from the RAM overclock. This is because the GPU shares the RAM, so graphical tasks should receive more of a boost from RAM speed.
GZip and BC were most affected by the Core clock increase. This is apparently because this component contains the cache for the CPU, so it affects CPU-intensive operations.
It’s also well worth overclocking the GPU Core and RAM to squeeze out another 4%, if your board can do it and remain stable.
The chart below shows the rendering performance of the GPU for Quake III. I couldn’t get Quake III to run properly at any CPU clock above 950MHz, despite everything else being rock-solid until at least 1100MHz.
The entire overclock above represents a difference of about 4.5FPS, from 27.9FPS at 700MHz to 32.4FPS at 950MHz with RAM and Core overclocks. It’s worth noting that the RAM overclock alone provided a 1.5FPS, or 4.9%, performance boost.
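The frame-rate figures can be turned into percentages with a couple of lines. This sketch uses only the numbers quoted above; the last line assumes the 1.5FPS and 4.9% figures describe the same comparison, in which case the configuration the RAM overclock was measured against ran at roughly 30.6FPS.

```python
def percent_change(old, new):
    """Percentage change going from `old` to `new`."""
    return (new - old) / old * 100.0


stock_fps = 27.9        # 700MHz, default clocks
overclocked_fps = 32.4  # 950MHz with RAM and Core overclocks

print(round(overclocked_fps - stock_fps, 1))                 # 4.5
print(round(percent_change(stock_fps, overclocked_fps), 1))  # 16.1

# Assumption: 1.5FPS and 4.9% refer to the same before/after pair.
print(round(1.5 / 0.049, 1))                                 # 30.6
```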
In previous tests, overclocking the GPU Core to 500MHz gave a performance improvement with the CPU at 900MHz. That improvement appears to have disappeared at 950MHz, though it is visible when both the Core and RAM are overclocked. This is probably because the benchmarks fluctuate by 2-4% each time, and it takes several attempts and a lot of luck to get the best possible result for each configuration.
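With that much run-to-run variation, it helps to repeat each configuration several times and check whether the gap between two configurations is larger than the noise. The following is a minimal sketch of one way to do that; the function names and the sample FPS values are made up purely for illustration and are not measurements from these tests.

```python
import statistics


def spread_percent(runs):
    """Run-to-run spread as a percentage of the median result."""
    return (max(runs) - min(runs)) / statistics.median(runs) * 100.0


def gap_exceeds_noise(runs_a, runs_b):
    """True if the medians differ by more than the larger of the two spreads."""
    gap = abs(statistics.median(runs_a) - statistics.median(runs_b))
    noise = max(max(runs_a) - min(runs_a), max(runs_b) - min(runs_b))
    return gap > noise


# Hypothetical FPS results for two configurations (illustrative only).
config_a = [31.0, 31.6, 32.0]
config_b = [31.4, 32.2, 31.8]

print(f"{spread_percent(config_a):.1f}% spread in A")
print(gap_exceeds_noise(config_a, config_b))
```

Comparing medians rather than best-case results makes the comparison less dependent on one lucky run.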