
[FIXED]HQ3x/4x ASM filters border

Posted: Sun Mar 23, 2008 2:42 pm
by spacy51

The HQ3x/4x ASM implementation produces wrong interpolation at the image's border. Look at the attached screenshot; it is obvious at the top of the "VIOLET CITY" text.

This bug has already been fixed in the C version; see hq_base.h, lines 343-372.

The ASM version most likely only has something like skipLine instead of separate skipLinePlus and skipLineMinus values, which it needs in order to work correctly at the border.
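For illustration, the border handling described above could look roughly like this in C++. It is only a sketch under assumptions: `RowOffsets` and `rowOffsets` are hypothetical names, not code from hq_base.h, and the only thing taken from the post is the idea of keeping separate offsets toward the previous and the next row (skipLineMinus / skipLinePlus) instead of a single skipLine.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, NOT taken from hq_base.h.
// Byte offsets from the current row to its neighbours. On the first row the
// "minus" offset collapses to 0 and on the last row the "plus" offset does,
// so the filter re-reads the current row instead of leaving the bitmap.
struct RowOffsets {
    std::ptrdiff_t minus; // offset to the previous row (0 on the first row)
    std::ptrdiff_t plus;  // offset to the next row (0 on the last row)
};

RowOffsets rowOffsets(int y, int height, std::ptrdiff_t pitch) {
    assert(height > 0 && y >= 0 && y < height);
    RowOffsets o;
    o.minus = (y > 0) ? -pitch : 0;
    o.plus = (y < height - 1) ? pitch : 0;
    return o;
}
```

With a single skipLine value used in both directions, the first row would read one row before the bitmap and the last row one row after it, which would fit both the wrong pixels at the top border and an access violation.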

 

This bug is most probably the reason why only the HQ3x/4x filters cause an exception with my new multi-threaded filter execution.

I get the following error message:

Unhandled exception at 0x00511ba9 in VisualBoyAdvance.exe: 0xC0000005: Access violation writing location 0x02c21000.

The debugger points me to hq4x_32.asm line 960.

 

The asm version of the HQ2x filter does not have this bug.

 

I would really appreciate it if anyone knowledgeable in ASM could try to figure out a fix for it. Until this bug is fixed, I can't really submit my multi-threading patch :(



Posted: Mon Mar 24, 2008 11:28 pm
by chrono

spacy51 wrote: HQ3x/4x ASM implementation produces wrong interpolation on the image's border. Look at the attached screenshot. It is obvious at the top of the "VIOLET CITY" text.



Posted: Tue Mar 25, 2008 12:09 am
by mudlord

Thanks for the patch. :)



Posted: Tue Mar 25, 2008 7:28 am
by spacy51

Thank you so much for the patch.

 

It looks perfect now: [attachment=6]

 

 

 

Unfortunately, multi-threading still doesn't work with the HQ3x/4x filters, which means there has to be another cause.

This means there is still a faulty piece of code in the filter that accesses memory outside of the source or destination bitmap. This causes access conflicts when that other memory area is currently in use by another CPU core.

A mutex on the whole image area is not an option, because the whole thing wouldn't be faster than on a single core. Moreover, the current state may have unforeseen side effects even on single cores.
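To make the no-mutex approach concrete, here is a rough C++ sketch of how the image rows could be split into disjoint bands, one per thread. The names `RowBand` and `bandForThread` are made up for this example and are not from the VBA source; the sketch only assumes that each thread filters its own range of rows.

```cpp
#include <cassert>

// Hypothetical example, not VBA code: rows [first, end) owned by one thread.
struct RowBand {
    int first;
    int end;
};

// Splits `height` rows into `threadCount` contiguous, non-overlapping bands.
// If the rows do not divide evenly, the first few bands get one extra row.
RowBand bandForThread(int height, int threadCount, int threadIndex) {
    assert(height >= 0 && threadCount > 0);
    assert(threadIndex >= 0 && threadIndex < threadCount);
    int base = height / threadCount;
    int extra = height % threadCount;
    int first = threadIndex * base + (threadIndex < extra ? threadIndex : extra);
    int count = base + (threadIndex < extra ? 1 : 0);
    return RowBand{ first, first + count };
}
```

Because the bands never overlap, no locking is needed as long as the filter really stays inside its band. A filter that writes even one row past `end` lands in the band another core is working on, which is the kind of conflict described above.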



Posted: Tue Mar 25, 2008 7:47 am
by mudlord

spacy51 wrote: Unfortunately, multi-threading still doesn't work with the HQ3x/4x filters, which means there has to be another cause. This means there is still a faulty piece of code in the filter that accesses memory outside of the source or destination bitmap. This causes access conflicts when that other memory area is currently in use by another CPU core.

 

Gotta love multithreading and thread-safety hey? :bricks:

 

And it's funny how I see on Ngemu that noobs see it as easy..... They should try it before they say such things....



Posted: Tue Mar 25, 2008 12:33 pm
by chrono

spacy51 wrote: Multi-threading still doesn't work with the HQ3x/4x filters, which means there has to be another cause.

ASM filters are not thread-safe ;)



Posted: Tue Mar 25, 2008 3:10 pm
by spacy51

chrono wrote: ASM filters are not thread-safe ;)

But the 2xSaI, Super 2xSaI and Super Eagle ASM filters work without changes.

You did a great job on the hq4x_32 filter. It works perfectly now.

I get a speed increase from about 200% (1 core) to 250% (2 cores) now.

It looks like you had to change almost every line in the .asm file. Could you give me some info on what the problem with the code was?

Would you be so kind as to do the same magic on the other 3 versions?

HQ3x_32 (has top priority)

HQ4x_16

HQ3x_16

I'll make sure you get your name into the about box.

EDIT:

Uploaded changes, SVN469



Posted: Tue Mar 25, 2008 7:44 pm
by chrono

thread-safe filters: bilinear.cpp, hq3x_16.asm, hq3x_32.asm, hq4x_16.asm, hq4x_32.asm



Posted: Tue Mar 25, 2008 9:41 pm
by Squall Leonhart

Which are faster, the ASM or the C filters?

Secondly, can a 4xSaI filter be done in ASM?



Posted: Wed Mar 26, 2008 12:23 pm
by spacy51

I committed chrono's fixes and added an option to set the maximum number of threads to create. The option is not yet exposed in the GUI, but it can be changed in the ini. If the "maxCpuCores" option does not exist or is invalid, the number of cores is auto-detected using the CPUID instruction.
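The fallback logic can be sketched in C++ roughly as follows. This is not the committed code: the function names are hypothetical, a missing or invalid ini entry is assumed to arrive as a value below 1, and the sketch swaps the CPUID-based detection for the standard `std::thread::hardware_concurrency()` to stay short and portable.

```cpp
#include <thread>

// Hypothetical sketch, not the committed VBA code.
// iniValue: the "maxCpuCores" setting; values below 1 are ASSUMED to mean
//           "missing or invalid".
// detectedCores: result of some core detection (0 if detection failed).
int resolveMaxCpuCores(int iniValue, unsigned detectedCores) {
    if (iniValue >= 1) {
        return iniValue; // an explicit, valid setting wins
    }
    // Fall back to auto-detection, but never return fewer than one thread.
    return detectedCores >= 1 ? static_cast<int>(detectedCores) : 1;
}

// Stand-in for the CPUID-based detection mentioned in the post.
// hardware_concurrency() may return 0 when the count is unknown.
int detectCpuCores() {
    return resolveMaxCpuCores(0, std::thread::hardware_concurrency());
}
```

Keeping the decision in a small pure function makes it easy to check the three cases (valid setting, missing setting, failed detection) without touching the ini file.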

 

Somehow my speed results vary. I probably have to add a mechanism to explicitly run each thread on an individual core instead of letting Windows guess.
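One possible way to do that on Windows is the Win32 call `SetThreadAffinityMask`. The C++ sketch below is only an assumption about how such a mechanism might look, not something that is in the patch; `affinityMaskForCore` and `pinCurrentThreadToCore` are made-up names.

```cpp
#include <cassert>
#include <cstdint>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: affinity masks use one bit per logical core,
// so bit N set means "this thread may run on core N".
std::uint64_t affinityMaskForCore(unsigned coreIndex) {
    assert(coreIndex < 64);
    return std::uint64_t(1) << coreIndex;
}

#ifdef _WIN32
// Pins the calling thread to one core. Returns false if the call fails.
// Note: DWORD_PTR is only 32 bits wide in a 32-bit build, so core indices
// of 32 and above cannot be expressed there.
bool pinCurrentThreadToCore(unsigned coreIndex) {
    DWORD_PTR mask = static_cast<DWORD_PTR>(affinityMaskForCore(coreIndex));
    return SetThreadAffinityMask(GetCurrentThread(), mask) != 0;
}
#endif
```

Each filter thread would call `pinCurrentThreadToCore` with its own index once at startup. Whether this actually makes the timings more stable would still have to be measured.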