Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: optimize compilation through preprocessing and smarter caching #197

Open
klkvr opened this issue Sep 8, 2024 · 3 comments
Open

feat: optimize compilation through preprocessing and smarter caching #197

klkvr opened this issue Sep 8, 2024 · 3 comments

Comments

@klkvr
Copy link
Member

klkvr commented Sep 8, 2024

A lot of the compilation overhead is coming from the need to compile all project's tests on every change to one of the src/ contracts. Ideally, we should only compile test contracts when interface of contracts they depend on changes, which doesn't happen often.

This can be made possible by avoiding solc mechanism of bytecode linking. Instead of linking bytecode at compile-time, we can link it at runtime, and we already have cheatcodes for that (deployCode, getCode).

Basic idea is to add a preprocessor translating all type(Contract).creationCode into vm.getCode invocations, and all new Contract(...) into vm.deployCode.

However, preprocessor on its own wouldn't let us skip compilation due to how caching works. We'll mark test file as dirty even if non-interface change is made to a contract. To address this, we would need to customize our caching logic by only marking test contracts as dirty if any of their dependencies had a non-interface change.

Implementation

Proposing an initial simple implementation of this preprocessing. It can be enabled as an opt-in, and improved over time.

Updated compilation pipeline

Before each compiler invocation, if we're compiling any test files, we'll do additional solc invocation to obtain AST for them and replace all creationCode/new keywords with getCode and deployCode invocations. Compiler will only see this preprocessed source code

Updated caching

  1. For each non-test file, we'll store hash of additional "interface representation" which would be the same source file with all function bodies removed. We already have somewhat similar preprocessing step in bind-json: https://github.com/foundry-rs/foundry/blob/96105b4d240681c336e063eac0e250cc51a84414/crates/forge/bin/cmd/bind_json.rs#L63. "interface representation" is easy to construct, and is independent of non-interface changes.
  2. When resolving dirty source files, non-test files are handled as usual, and test files are marked as dirty only if one of the following is true:
    • its source hash or compilation settings have changed
    • it depends on any test files which are dirty
    • it depends on any non-test files which "interface representation" have changed

Notes

  1. All of this could be done better if obtaining solc AST was cheaper. AST we receive from solc is very useful because it provides referencedDeclaration keys which allow us to detect names of items a specific file/contract depends on which could be useful to do caching smarter by filtering out unused imports. solang_parser is much cheaper to invoke, but does not provide this.
  2. This introduces fundamental change to how foundry treats test/script files. Right now we treat all sources equally during compilation, and anything that user can do in src/, can be also done in test/. IMO this separation makes sense, as it makes it explicit which contracts are compiled for deployment vs for foundry EVM
@mattsse
Copy link
Member

mattsse commented Sep 13, 2024

This can be made possible by avoiding solc mechanism of bytecode linking. Instead of linking bytecode at compile-time, we can link it at runtime, and we already have cheatcodes for that (deployCode, getCode).

Basic idea is to add a preprocessor translating all type(Contract).creationCode into vm.getCode invocations, and all new Contract(...) into vm.deployCode.

is this a strict requirement for this feature or should we recommend using vm.getCode?
would this also preprocess non-test code?

@klkvr
Copy link
Member Author

klkvr commented Sep 13, 2024

is this a strict requirement for this feature or should we recommend using vm.getCode?

using getCode is not type-safe and won't work like this if project does not divide each contract into interface and implementation. the idea here is to provide a way to keep type-safety and avoid refactoring whille still linking bytecode dynamically

would this also preprocess non-test code?

it should not. I don't think we should mess with metadata, so src/ contracts are not touched at all

@gakonst
Copy link
Member

gakonst commented Sep 18, 2024

All of this could be done better if obtaining solc AST was cheaper. AST we receive from solc is very useful because it provides referencedDeclaration keys which allow us to detect names of items a specific file/contract depends on which could be useful to do caching smarter by filtering out unused imports. solang_parser is much cheaper to invoke, but does not provide this.

Should this be the first use case of sulk @DaniPopes / something to prio?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants