Append optnone and noinline for -O0 optlevel #502

Merged: kvpanch merged 2 commits into main from kvpanch/o0_optnone on Apr 10, 2026

kvpanch commented Apr 9, 2026

This appears to be an NFC (no functional change) change, since the new pass manager already handles --passes=default<O0> correctly. It should still be useful as a safeguard, and possibly for the old pass manager that the backend still uses.

Contract                        Without (bytes)   With noinline+optnone (bytes)
─────────────────────────────── ───────────────── ────────────────────────────
contract.sol:C                            1,266                          1,266
Fibonacci:Iterative                       2,506                          2,506
Storage                                   3,671                          3,671
Fibonacci:Recursive                       3,794                          3,794
Computation                               5,776                          5,776
Fibonacci:Binet                           8,786                          8,786
large_div_rem                            10,705                         10,705
DivisionArithmetics                      14,831                         14,831
ERC20                                    32,288                         32,288
ERC20Tester                              34,589                         34,589

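For context, here is a minimal C sketch of what the two attributes mean at the source level. Clang spells them optnone and noinline and, by default, emits the same pair on functions it compiles at -O0. LLVM's IR verifier also rejects optnone without noinline, which is why the two travel together. The macro name NO_OPT and the function add_one are made up for illustration and are not part of this PR, which presumably attaches the attributes at the LLVM IR level rather than in C source.

```c
#include <stdint.h>

/*
 * NO_OPT is a hypothetical helper macro, not something defined by this PR.
 * optnone is Clang-specific, so other GNU-compatible compilers only get
 * noinline here, and unknown compilers get nothing.
 */
#if defined(__clang__)
#define NO_OPT __attribute__((optnone, noinline))
#elif defined(__GNUC__)
#define NO_OPT __attribute__((noinline))
#else
#define NO_OPT
#endif

/* Under Clang this function is skipped by optimization passes and is
 * never inlined into its callers, regardless of the -O level in use. */
NO_OPT uint64_t add_one(uint64_t x) {
    return x + 1;
}
```

The attributes only affect how the function is compiled, not what it computes, which fits the identical byte counts in the table above.
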
kvpanch requested review from elle-j and xermicus on Apr 9, 2026, 15:33

kvpanch commented Apr 9, 2026

It is still not clear why there were SROA and mem2reg in @elle-j's example, and why CSE optimized 2 identical loads, which is what I don't see for the simple C program: https://godbolt.org/z/zajPn5e4M

elle-j left a comment

> ...why CSE optimized 2 identical loads, which is what I don't see for the simple c program

You can see the optimization of the loads here:
https://godbolt.org/z/ax4nhbzMf

To see the single DAG node for the load (t5), see:
https://godbolt.org/z/bYba7jczr

Initial selection DAG: %bb.0 'square64:'
SelectionDAG has 10 nodes:
  t0: ch,glue = EntryToken
  t3: i64 = Constant<0>
    t2: i64,ch = CopyFromReg t0, Register:i64 %0
  t5: i64,ch = load<(load (s64) from %ir.p, align 32)> t0, t2, undef:i64
    t6: i64 = mul t5, t5
*** MachineFunction at end of ISel ***
# Machine code for function square64: IsSSA, TracksLiveness
Function Live Ins: $x10 in %0

bb.0 (%ir-block.0):
  liveins: $x10
  %0:gpr = COPY $x10
  %1:gpr = LD %0:gpr, 0 :: (load (s64) from %ir.p, align 32)
  %2:gpr = MUL %1:gpr, %1:gpr
  $x10 = COPY %2:gpr
  PseudoRET implicit $x10
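
As a source-level picture of the "two identical loads" under discussion, here is an assumed C analogue of square64. The actual input behind the Compiler Explorer links is not reproduced in this thread, so the body below is a guess that only matches the shape of the dump: one pointer argument, a 64-bit load, and a multiply of the loaded value by itself.

```c
#include <stdint.h>

/*
 * Hypothetical analogue of the square64 function in the dump above.
 * The source reads *p twice. Because p is not volatile and nothing is
 * stored between the two reads, a compiler is allowed to perform the
 * load once and reuse the value for both operands of the multiply.
 */
uint64_t square64(const uint64_t *p) {
    uint64_t a = *p; /* first load  */
    uint64_t b = *p; /* second, identical load */
    return a * b;
}
```

In the dump, both operands of mul are t5, so only one load node is present, which is consistent with the two reads having been merged into one.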

> mem2reg

I don't believe I saw this applied in any of the earlier examples; if it showed up, that was at some higher opt level.

kvpanch merged commit b3cd402 into main on Apr 10, 2026
16 checks passed