
Light forward sync mechanism #6515

Open

cheatfate wants to merge 66 commits into unstable from hybrid-sync

Conversation

cheatfate
Contributor

No description provided.


github-actions bot commented Aug 28, 2024

Unit Test Results

9 files ±0    1 355 suites ±0    43m 27s ⏱️ +2m 42s
5 147 tests ±0    4 799 passed ±0    348 skipped ±0    0 failed ±0
21 510 runs ±0    21 106 passed ±0    404 skipped ±0    0 failed ±0

Results for commit f3451ad. ± Comparison against base commit e94417c.

♻️ This comment has been updated with latest results.

@cheatfate force-pushed the hybrid-sync branch 2 times, most recently from 5be1b71 to 575d711, on September 5, 2024 07:34
@cheatfate marked this pull request as ready for review on September 9, 2024 15:24
header: ChainFileHeader,
data: openArray[byte]
): Result[BlobSidecar, string] =
if header.plainSize > uint64(high(uint32)):
Contributor

Can be more precise than this. Uncompressed blobs are 32 * 4096 = 131,072 bytes:

# https://github.com/ethereum/consensus-specs/blob/v1.4.0-beta.5/specs/deneb/polynomial-commitments.md#custom-types
Blob* = array[BYTES_PER_FIELD_ELEMENT * FIELD_ELEMENTS_PER_BLOB, byte]

Snappy might increase their size somewhat in pathological cases, but it has bounds on this.
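A minimal sketch of how that tighter bound could be derived, assuming the fixed Deneb blob size quoted above and Snappy's documented worst-case output of 32 + n + n/6 bytes; the constant names and the final check are illustrative, not the PR's actual code (a real BlobSidecar also adds a small fixed overhead for the commitment, proof, inclusion proof and signed block header):

# Hedged sketch; constants mirror the spec values quoted above.
const
  BYTES_PER_FIELD_ELEMENT = 32
  FIELD_ELEMENTS_PER_BLOB = 4096
  BlobBytes = BYTES_PER_FIELD_ELEMENT * FIELD_ELEMENTS_PER_BLOB  # 131,072 bytes

# Snappy's worst-case compressed length for n input bytes is 32 + n + n div 6,
# so even a high-entropy (incompressible) blob stays within a known bound.
func maxSnappyLen(n: int): int =
  32 + n + n div 6

echo maxSnappyLen(BlobBytes)  # 152,949 bytes, far below high(uint32)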

Contributor Author

Such cases are not pathological, because here we are mostly working with high-entropy sources (crypto hashes and/or encrypted data).

Decreased to MaxChunkSize in 361bc0b

header: ChainFileHeader,
data: openArray[byte]
): Result[ForkedSignedBeaconBlock, string] =
if header.plainSize > uint64(high(uint32)):
Contributor

Can bound more tightly: https://github.com/ethereum/consensus-specs/blob/v1.5.0-alpha.5/specs/phase0/p2p-interface.md#configuration and https://github.com/ethereum/consensus-specs/blob/v1.5.0-alpha.5/configs/mainnet.yaml and https://github.com/ethereum/consensus-specs/blob/v1.5.0-alpha.5/configs/minimal.yaml note that GOSSIP_MAX_SIZE is 10MiB (for "uncompressed gossip messages").

That limit applies to all gossiped SSZ objects, including blocks (and blobs, but blobs are simply fixed-size).
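A rough sketch of what that tighter check could look like, assuming the 10 MiB GOSSIP_MAX_SIZE quoted above; the constant and procedure names are illustrative, not the PR's actual code:

# Hedged sketch: GOSSIP_MAX_SIZE is the limit on uncompressed gossip messages
# from the phase0 p2p-interface configuration linked above.
const GOSSIP_MAX_SIZE = 10 * 1024 * 1024  # 10 MiB

# An uncompressed signed beacon block larger than the gossip limit cannot be
# valid, so such a chunk can be rejected before decompression.
func plausibleBlockSize(plainSize: uint64): bool =
  plainSize <= uint64(GOSSIP_MAX_SIZE)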

Contributor Author

Fixed in 361bc0b

buffer = Chunk.init(kind, uint64(slot), uint32(plainSize), data)
wrote = writeFile(chandle.handle, buffer).valueOr:
discard truncate(chandle.handle, origOffset)
discard fsync(chandle.handle)
Contributor

Why are some fsync calls fine to discard/ignore the return value of but others not?

Contributor Author

This happens because of an error in writeFile (the initial error, which we are going to report to the user), so this is an attempt to clean up. When the number of bytes written is less than expected, the file has become inconsistent, so we try to truncate it back to the last known good size and fsync it, but we still want to report the original error message to the user.
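A minimal sketch of that rollback-then-report pattern, reusing the handle-based writeFile, truncate and fsync calls visible in the excerpt above; the surrounding procedure, its name and the error strings are illustrative assumptions, not the PR's actual code:

# Hedged sketch; assumes Result/err/ok plus the PR's handle-based file I/O
# helpers (writeFile, truncate, fsync, ioErrorMsg) are in scope.
proc appendChunk(handle: IoHandle, origOffset: int64,
                 buffer: openArray[byte]): Result[void, string] =
  let wrote = writeFile(handle, buffer).valueOr:
    # writeFile itself failed: roll back best-effort, ignoring truncate/fsync
    # results, and report the original write error to the caller.
    discard truncate(handle, origOffset)
    discard fsync(handle)
    return err("Unable to write data: " & ioErrorMsg(error))
  if int(wrote) != len(buffer):
    # Short write: the file is now inconsistent, so restore the last
    # known-good size, but still surface the write problem.
    discard truncate(handle, origOffset)
    discard fsync(handle)
    return err("Incomplete write")
  ok()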

Contributor

@etan-status left a comment

This is a great improvement for the security of default sync compared to the existing genesis sync.

As this is timely to test properly, it would be great to have at least a happy-case test with a few epochs' worth of minimal blocks to avoid accidental regressions.

If data size is a concern, one may consider syncing 1 (or N) sync committee period (~27 hrs / 8192 slots) at a time by treating each intermediate light client sync step as its own separate sync target, applying each of them separately before syncing to the next period. The hook to obtain intermediate results is LightClientUpdateObserver. It would add quite a bit of complexity, though, so maybe waiting for demand is better for now.

beacon_chain/beacon_chain_file.nim (outdated)
beacon_chain/beacon_node_light_client.nim (outdated)
beacon_chain/gossip_processing/block_processor.nim (outdated)
beacon_chain/spec/signatures_batch.nim (outdated)
beacon_chain/sync/sync_overseer.nim (outdated)
beacon_chain/sync/sync_overseer.nim
beacon_chain/beacon_node_light_client.nim (outdated)
@@ -92,6 +89,8 @@ proc initLightClient*(
.shouldSyncOptimistically(node.currentSlot):
return
Contributor

The new sync mechanism works for all lcDataFork > LightClientDataFork.None, but here we have an additional >= Capella restriction, because that's required for the EL blockHash and setOptimisticHead.

All networks that we actively support have advanced past Capella, so letting this remain in the >= Capella section is also fine.


let blockHandler: BlockAdded(consensusFork) = onBlockAddedHandler

discard addResolvedHeadBlock(
Contributor

addResolvedHeadBlock calls dag.processNewBlockForLightClient(state, trustedBlock, parent.bid), and that may trigger a beacon state root computation.

If conf.lightClientDataImportMode is set to Full, this is not avoidable, as the user explicitly indicates interest in full historical data availability. However, if it is set to OnlyNew or OnDemand, the new sync overseer could gain some performance by setting dag.lcDataStore.cache.tailSlot to the LC sync target so that the state root computation can be avoided until the sync has caught up to the LC target.
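A minimal sketch of that optimization; the import-mode and cache field names follow this comment, while the surrounding condition and the lcSyncTarget value are illustrative assumptions, not the PR's actual code:

# Hedged sketch: only Full import mode asks for complete historical LC data.
# For OnlyNew/OnDemand, moving the cache tail up to the light client sync
# target lets addResolvedHeadBlock skip the beacon state root computation
# until backfill has caught up to that target.
if conf.lightClientDataImportMode in
    {LightClientDataImportMode.OnlyNew, LightClientDataImportMode.OnDemand}:
  dag.lcDataStore.cache.tailSlot = lcSyncTarget.slot  # lcSyncTarget is hypothetical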

Contributor

The general idea of downloading trust-minimized data into a separate ___location before processing it is great.

I wonder, though: is there an overlap between this format and .era/.erb files?
And if not, is there an advantage to using a custom format over era?

Contributor Author

ERA files are more about static storage of a limited number of blocks + state, without any need to read them from the tail.
The current file format is designed around downloading data from head to tail and applying it from tail to head.
