At Wed, 23 Aug 2006 10:08:27 +0100,
Andrew Haley wrote:
>  > I don't see why that would be a problem:
>
> I know.  :-)
>
>  > the function is inlined after all. Also, I've tried the same with
>  > exit (defined in the same way, but takes an integer), that works as
>  > expected.
>
> It's a matter of _when_ the function gets inlined.  If
> __builtin_constant_p is folded before the function is inlined then
> it'll return zero.  If the constant doesn't happen to get propagated
> it'll return zero.  C'est la vie.

OK, I see. Thanks for taking the time to look into this.

>  > The funny thing is that GCC realizes that it can put the constant
>  > address in the inline assembly with the "i,!r" constraint, but that
>  > __builtin_constant_p still returns 0. This is what I think looks
>  > inconsistent.
>
> Does it work if you make the whole thing a macro?

Yes, then it correctly decides that the argument is a constant. Anyway,
I can live with this behavior since it only makes some possible
optimizations fail.

However, I have another fun problem which I think is caused by my fix
to the problem. My fix looks like this (slightly simplified):

#define __emit_parameter(x) \
  if (__builtin_constant_p(x)) { \
    __asm__ volatile ( \
      ".long 0xfffd0000\n" \
      ".long %0\n" \
      : : "i,!r"(x) ); \
  } \
  else { \
    __asm__ volatile ( \
      ".short 0xfffe\n" \
      ".short %0\n" \
      : : "r"(x) ); \
  }

#define _syscall1(type,name,atype,a) \
type name(atype a) { \
  register unsigned long __v0 asm("$2"); \
  __asm__ volatile (".set push\n.set noreorder\n"); \
  __emit_parameter(a); \
  __asm__ volatile ( \
    ".short 0xffff\t#" #name "\n\t" \
    ".short %1\n" \
    : "=&r" (__v0) \
    : "i" (__NR_##name) ); \
  __asm__ volatile(".set\tpop\n"); \
  return (type) __v0; \
}

I've added another macro to create the bitpattern for the argument(s).
My problem now is that these are no longer in one asm statement, and
clever GCC then outsmarts me by moving things around.
This is not necessarily a problem, except I get code like this:

 1002244:              slti   a1,a0,8
 1002248:  .------     bne    v0,s0,10022f8 <test_malloc+0x1a0>
 100224c: /            addiu  v1,v1,4
 ...      |
 1002280: |            addiu  v0,v0,248
 1002284: |            0xfffe0002
 1002288: |   .->      0xffff0001
 100228c: |   |        lw     ra,56(sp)
 ...      |   |
 10022f8: `---+-->     0xfffe0009
 10022fc:     `---     j      1002288 <test_malloc+0x130>

01002300 <main>:
 1002300:   27bdffe8   addiu  sp,sp,-24

I.e., by jumping around, GCC can merge some sections of common code
(0xffff0001 is the puts syscall). The problem is the jump instruction
here: I don't see how that could be valid, since the delay slot is
filled by the addiu from main() (chaos breaks loose in my translated
code after this).

I've tried replacing my custom bitpattern with a normal instruction,
but the end result is still the same. To me it seems that GCC simply
produces incorrect code in this case. I would like a nop in the delay
slot here. Any takes? :-)

// Simon
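
As a small illustration of the macro-versus-function difference discussed at the top of this message, here is a minimal sketch. It assumes GCC (or a compatible compiler such as Clang); the names IS_CONST_MACRO, is_const_fn and the three wrapper functions are made up for the example and are not part of the real syscall macros. It only shows that __builtin_constant_p sees a literal directly in the macro form, while in the function form the answer depends on inlining and constant propagation.

```c
/* Sketch only: hypothetical names, not the real syscall macros. */

/* Macro form: the argument expression is handed to
 * __builtin_constant_p as written, so a literal is seen as constant. */
#define IS_CONST_MACRO(x) (__builtin_constant_p(x) ? 1 : 0)

/* Function form: x is a parameter here. Whether the builtin yields 1
 * depends on the function being inlined and the constant propagated
 * before the builtin is folded, so the result may be 0 even when the
 * caller passes a literal. */
static inline int is_const_fn(unsigned long x)
{
    return __builtin_constant_p(x) ? 1 : 0;
}

/* A literal passed straight to the macro. */
int macro_sees_literal(void)
{
    return IS_CONST_MACRO(0x1234UL);
}

/* A value read through a volatile pointer cannot be known at
 * compile time, so the macro reports "not constant". */
int macro_sees_runtime_value(volatile unsigned long *p)
{
    unsigned long v = *p;
    return IS_CONST_MACRO(v);
}

/* The same literal passed through the inline function: the result
 * is either 0 or 1 depending on optimization level and pass order. */
int function_sees_literal(void)
{
    return is_const_fn(0x1234UL);
}
```

Only the first two results are stable across optimization levels; the third is exactly the case where the outcome depends on when the builtin gets folded.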