[hatari-devel] DSP optimization?

Laurent Sallafranque laurent.sallafranque at free.fr
Sat Jan 15 20:32:42 CET 2011


I'm still doing the tests.

"callgrind_control --dump" gives me:

No active callgrind runs detected.
[Detection can fail on some systems; to work around this,
  specify the working directory of a callgrind run with '-w']


If I try with the "-w ." option, it never returns.
I don't know what's wrong.


Anyway, even if there's no gain from Eero's optimisation (which I'll 
verify with valgrind), I really prefer the way the code is written in his 
patch (I thought about something similar a few months ago).

For me, this patch should be committed. (I hope I can send the valgrind 
results tonight).
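
For anyone following the thread without the patch at hand, here is a minimal
sketch of the two access styles being compared. This is not Eero's patch: the
struct layout and the names dsp_core_t, dsp_core_ptr, step_via_pointer,
step_direct and check_equivalence are made up for illustration. Both variants
do the same work; the only difference is whether the core state is reached
through a pointer or as a static object.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, much-reduced stand-in for the real DSP core state. */
typedef struct {
    uint32_t registers[64];
    uint32_t pc;
} dsp_core_t;

/* Style A: the core is reached through a pointer (the current layout). */
static dsp_core_t core_storage;
static dsp_core_t *dsp_core_ptr = &core_storage;

uint32_t step_via_pointer(void)
{
    /* Every access first loads the pointer, then indexes from it. */
    dsp_core_ptr->pc = (dsp_core_ptr->pc + 1) & 0xFFFF;
    return dsp_core_ptr->pc;
}

/* Style B: the core is a static object whose address is fixed at link time. */
static dsp_core_t dsp_core;

uint32_t step_direct(void)
{
    /* The compiler can address the field directly, with no pointer load. */
    dsp_core.pc = (dsp_core.pc + 1) & 0xFFFF;
    return dsp_core.pc;
}

/* Runs both variants side by side; assumes both cores start at the same pc. */
void check_equivalence(int steps)
{
    uint32_t a = 0, b = 0;
    for (int i = 0; i < steps; i++) {
        a = step_via_pointer();
        b = step_direct();
    }
    assert(a == b);
}
```

Whether style B is actually faster depends on the compiler and the target CPU,
so this only shows what changes in the source, not what the profiler will say.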

Regards,

Laurent


On 15/01/2011 16:07, Eero Tamminen wrote:
> Hi,
>
> On Saturday 15 January 2011, Nicolas Pomarède wrote:
>>> Now that the Hatari DSP code no longer tracks Aranym, could we change
>>> the dsp_core to be a static array like the CPU core data is?
>>>
>>> Accessing the core through a pointer is a small overhead in about
>>> everything the DSP does, and we don't gain anything from it as Hatari will
>>> never emulate more than one DSP at a time (which, AFAIK, was the
>>> idea with the Aranym DSP code).
>> Contrary to how it may look, it's not obvious that such a patch would
>> make things faster.
>>
>> For example on 680xx cpu, "address register indirect with displacement"
>> is faster than "absolute long", which means "move.l 12(a5),d0" is faster
>> than "move.l $75120,d0".
>>
>> I don't know about recent CPUs (x86, ARM or others), but it's also possible
>> you get similar results (harder to measure with modern
>> pipelines/reordering).
>>
>> It would need to be profiled to see if there's a real difference between
>> dereferencing a pointer and directly accessing the memory address, but I
>> don't think the gain would be noticeable on CPUs that would benefit from
>> it, and it could even be negative on 680x0 or other CPUs.
> Attached are modified dsp*.[ch] files for anybody who wants to test.
>
> (The diff against them is almost as large as the files.)
>
>
> 	- Eero
>
> Because I have multiple cores which are frequency scaled, it's not so
> easy for me to see the difference.  I'm not sure whether even this
> (as root of course):
> ---
> for cpu in /sys/devices/system/cpu/*/cpufreq/scaling_governor; do
> 	echo "performance" > "$cpu"
> done
> ---
> would guarantee stable enough timings?
