[hatari-devel] Profiling Hatari code with Valgrind
Eero Tamminen
oak at helsinkinet.fi
Fri Jan 7 23:30:26 CET 2011
Hi,
On Friday 07 January 2011, Nicolas Pomarède wrote:
...
> In the case of "simpler" functions, another indicator could be to
> just build a small test program that would do 1000000 calls of
> Update_e_u_n_z old/new versions and see which one is faster (this way
> you can get a real percentage of the speedup).
...
>> It might be possible that my i3 CPU scales its speed according to
>> the load[2] though and that's why I don't see the differences. I
>> hadn't thought of that earlier. If it does scaling, then "top" isn't
>> a valid way to measure anything.
>
> It certainly does; check "cat /proc/cpuinfo" to see if the "cpu MHz" is
> changing depending on the overall load. In that case, using top is not a
> good option.
I noticed that I can also get that info from:
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq
In addition to:
grep -i mhz /proc/cpuinfo
Polling those with e.g. "cat" isn't enough though; I think the frequency
can change quite fast (several times per second). And according to:
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies
there are quite a few frequencies at which each of these virtual cores[1]
can be running.
[1] The i3 has two cores, each of which simulates an extra core with hyperthreading.
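To get a rough idea of how much the frequency moves around, one could sample the sysfs files in a loop instead of reading them once. The following is only a sketch: the function name sample_freq and its arguments are made up for illustration, fractional sleep intervals assume a sleep that accepts them (GNU coreutils does), and sampling from a shell loop can still miss short-lived frequency changes.

```shell
#!/bin/sh
# sample_freq DIR COUNT INTERVAL   (hypothetical helper, not part of Hatari)
# Reads every cpu*/cpufreq/scaling_cur_freq file under DIR, COUNT times,
# sleeping INTERVAL seconds between rounds, and prints the lowest and
# highest value (in kHz) that was seen.
sample_freq() {
    dir=$1
    count=$2
    interval=$3
    min=
    max=
    i=0
    while [ "$i" -lt "$count" ]; do
        for f in "$dir"/cpu*/cpufreq/scaling_cur_freq; do
            # Skip unreadable files (and the unexpanded glob if nothing matched).
            [ -r "$f" ] || continue
            khz=$(cat "$f")
            if [ -z "$min" ] || [ "$khz" -lt "$min" ]; then min=$khz; fi
            if [ -z "$max" ] || [ "$khz" -gt "$max" ]; then max=$khz; fi
        done
        i=$((i + 1))
        # No need to sleep after the final round.
        if [ "$i" -lt "$count" ]; then sleep "$interval"; fi
    done
    echo "min=${min:-?} max=${max:-?} kHz"
}
```

For example, "sample_freq /sys/devices/system/cpu 50 0.1" run in another terminal while the test is going. If min and max differ, the CPU was scaling during the measurement.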
I would guess that if something uses all the CPU constantly (hatari with
--fast-forward), the CPU governor would keep it at the highest frequency.
Then one needs timing though (--run-vbls with Hatari), as I mentioned
in an earlier mail, since CPU usage will/should then be constantly at 100%...
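Timing a fixed amount of emulated work could look something like the sketch below. The helper name time_run is made up, the VBL count in the usage example is an arbitrary value, and the resolution is whole seconds only (date +%s), so the run needs to be long enough for differences to show.

```shell
#!/bin/sh
# time_run CMD [ARGS...]   (hypothetical helper)
# Runs the command with its output discarded and prints the elapsed
# wall-clock time in whole seconds.
time_run() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}
```

Something like "time_run hatari --fast-forward yes --run-vbls 3000" would then be run for both the old and the new build and the results compared. The "yes" argument form and the 3000 are assumptions; check "hatari --help" for what your version accepts.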
> The best option is a profiler that would completely emulate an i5x6 cpu
> with cycle precise value for each instruction. The program will usually
> run very slowly, but in the end you get an exact cycle count of what
> happened. If I recall correctly, there are such profilers under Linux, but
> I don't remember their names.
It's called Valgrind. :-)
Its callgrind tool gives callgraphs, but by default simulates just
instructions; the cachegrind tool can also simulate cache misses etc. (but
lacks callgraph info).
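Roughly, the two would be invoked as in the sketch below. The wrapper names profile_calls and profile_cache are just for illustration; the valgrind command lines themselves are the standard ones.

```shell
#!/bin/sh
# Hypothetical wrappers around the two Valgrind tools.

# Call-graph profile; writes callgrind.out.<pid> to the current directory.
profile_calls() {
    valgrind --tool=callgrind "$@"
}

# Cache simulation; writes cachegrind.out.<pid> to the current directory.
profile_cache() {
    valgrind --tool=cachegrind "$@"
}
```

For example "profile_calls hatari --run-vbls 500" (the 500 is an arbitrary value; something is needed to bound the run because everything is much slower under Valgrind). The output files can be read with callgrind_annotate and cg_annotate respectively, or opened in kcachegrind.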
- Eero