Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DEVPROD-9461: Move Query-related tests to replica-* variants instead of standalones and single-replicas #1257

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

wqian94
Copy link
Contributor

@wqian94 wqian94 commented Aug 28, 2024

Jira Ticket: DEVPROD-9461

Whats Changed

Replace standalone-classic-query-engine, standalone-sbe, and the corresponding single-replica variants with the latest replica-query-engine-classic and replica-query-engine-sbe variants, respectively. Also clean up other Query-related tests to move away from standalones and single-replicas where possible.

Patch Testing Results

Patch.

@wqian94 wqian94 requested a review from a team as a code owner August 28, 2024 19:11
Copy link
Contributor

@alicedoherty alicedoherty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @wqian94. This is changing a lot of tasks. From the perf perspective this looks good (and is what we want) but if possible can we get someone from query to also review this to make sure you're happy to shift all testing to replica sets?

Also there is no variant that runs with replica-80-feature-flag so I think you can remove all references to that.

src/workloads/execution/MultiPlanning.yml Outdated Show resolved Hide resolved
src/workloads/execution/MultiPlanning.yml Outdated Show resolved Hide resolved
- standalone-all-feature-flags
- standalone-classic-query-engine
- standalone-sbe
- replica-80-feature-flags
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we currently have any variants in sys-perf that run with replica-80-feature-flags

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw a bunch of tests with replica-80-feature-flags in the repo already. Do you know what's going on with those?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure - the only PR I could find related to it was https://github.com/10gen/dsi/pull/1637 but I couldn't find any that added/removed the variant and it wasn't removed as part of SERVER-89499. I can follow-up more on this if you think having a variant with the 8.0 FFs is necessary.

Copy link
Contributor Author

@wqian94 wqian94 Aug 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we do want an 8.0 FFs variant. Lemme ask around Query and see who might know more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we've committed to releasing 8.0 with the feature flags we ended up choosing, having an 8.0 feature flags variant isn't that useful. However, I'd advocate to leave the specifications intact for now and have the Product Performance team do the work to wholesale remove these and ensure that we have good substitutes intact.

I've created a ticket to track that effort here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @BlakeIsBlake for opening that ticket. As Blake said I'm happy for you to leave tests that have replica-80-feature-flags as is, and when our team gets to PERF-5857 we can easily switch them to a 9.0/all FF variant. Just bear in mind, that it may be slightly confusing that the Genny AutoRuns are implying it's running on an 8.0 FF variant when it actually isn't.

src/workloads/query/ArrayTraversal.yml Show resolved Hide resolved
src/workloads/query/BooleanSimplifier.yml Show resolved Hide resolved
src/workloads/transactions/LLTMixed.yml Outdated Show resolved Hide resolved
Copy link
Contributor

@alicedoherty alicedoherty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code changes look good but can we do some perf/noise analysis to validate that moving to replica sets still gives useful signal. Maybe you could use the variant to variant feature in the perf analyser to compare these tests x5 running on standalone vs x5 running on replica? (I think configuring and running all the FF, SBE, classic engine, etc. variants will be a pain so just the plain standalone and replica should be fine).

Also can we get someone from query on this PR?

@wqian94 wqian94 requested a review from dstorch September 3, 2024 14:53
@wqian94
Copy link
Contributor Author

wqian94 commented Sep 3, 2024

Looks like SERVER-88741 is the ticket that added the -80-feature-flags variants we're seeing. I've asked Blake Oler, who originally committed these variants, whether we should keep or remove these variants from the variants lists.

@wqian94
Copy link
Contributor Author

wqian94 commented Sep 3, 2024

Also, @alicedoherty the variant-to-variant comparisons seemed to spawn an excessive number of tasks. Since many of them failed, the perf analyzer is unable to create a comparison. I'm going to retry this week with fewer tasks and hopefully that's enough to convince us that the CoVs are acceptable.

@wqian94
Copy link
Contributor Author

wqian94 commented Sep 3, 2024

@BlakeIsBlake this is the PR in question. Do you think we should leave the *-80-feature-flags in to make it easier to replace come 9.0, or is it better to remove them now and add them in later?

@BlakeIsBlake BlakeIsBlake self-requested a review September 3, 2024 19:10
@wqian94
Copy link
Contributor Author

wqian94 commented Sep 13, 2024

Just a quick update, since I neglected to do so earlier: I'm rerunning the v2v analysis, now that a crash with the v2v analyses I was doing has been fixed.

@wqian94
Copy link
Contributor Author

wqian94 commented Sep 13, 2024

Analysis complete for classic. CoV seems slightly high for some tests, though?

@wqian94
Copy link
Contributor Author

wqian94 commented Sep 13, 2024

I don't think the SBE variant is available on master right now, so I can either compare older checkouts or just let it be for now.

@alicedoherty
Copy link
Contributor

Analysis complete for classic. CoV seems slightly high for some tests, though?

Thank you! The tests with very high CV on the 3-node replica sets look to be already very noisy on standalones. Although the percentages seem to have gone up for some, clicking into the actual values makes it look like the values are in the range of noise previously seen on standalones so I'm not too worried about them.

I don't think the SBE variant is available on master right now, so I can either compare older checkouts or just let it be for now.

We have SBE Standalone ARM AWS 2023-11 and Query Engine (SBE) 3-Node ReplSet ARM AWS 2023-11 running on master right now - what issues were you having when scheduling them?

I'd like to see a variant to variant analysis for the SBE 3-Node vs SBE Standalone (just like you did with Classic Engine) if possible. Thanks again @wqian94 for the analysis on this!

@wqian94
Copy link
Contributor Author

wqian94 commented Sep 16, 2024

The old SBE variant doesn't seem to be available for v2v comparison last I checked. I'll look again today.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants