-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mykrobe overcounts homopol deletions when making probes #148
Comments
For context. We have this example "Isoniazid": {
"predict": "R",
"called_by": {
"katG_GC1037G-GC2155074C": {
"variant": null,
"genotype": [
1,
1
],
"genotype_likelihoods": [
-711.0227088947968,
-417.9483717274669
],
"info": {
"coverage": {
"reference": {
"percent_coverage": 100.0,
"median_depth": 7,
"min_non_zero_depth": 6,
"kmer_count": 175,
"klen": 21
},
"alternate": {
"percent_coverage": 100.0,
"median_depth": 15,
"min_non_zero_depth": 13,
"kmer_count": 253,
"klen": 18
}
},
"expected_depths": [
24
],
"contamination_depths": [],
"filter": [],
"conf": 293
},
"_cls": "Call.VariantCall"
},
"katG_CC1038C-GG2155073G": {
"variant": null,
"genotype": [
1,
1
],
"genotype_likelihoods": [
-727.5071540961122,
-377.76816030583126
],
"info": {
"coverage": {
"reference": {
"percent_coverage": 100.0,
"median_depth": 7,
"min_non_zero_depth": 6,
"kmer_count": 160,
"klen": 21
},
"alternate": {
"percent_coverage": 100.0,
"median_depth": 15,
"min_non_zero_depth": 13,
"kmer_count": 253,
"klen": 18
}
},
"expected_depths": [
24
],
"contamination_depths": [],
"filter": [],
"conf": 350
},
"_cls": "Call.VariantCall"
},
"katG_CC1039C-GG2155072G": {
"variant": null,
"genotype": [
1,
1
],
"genotype_likelihoods": [
-729.808268296947,
-372.51398695693916
],
"info": {
"coverage": {
"reference": {
"percent_coverage": 100.0,
"median_depth": 7,
"min_non_zero_depth": 6,
"kmer_count": 158,
"klen": 21
},
"alternate": {
"percent_coverage": 100.0,
"median_depth": 15,
"min_non_zero_depth": 13,
"kmer_count": 253,
"klen": 18
}
},
"expected_depths": [
24
],
"contamination_depths": [],
"filter": [],
"conf": 357
},
"_cls": "Call.VariantCall"
}
}
},
The three mutations in the JSON above could concieveably be the same deletion...or a 2bp deletion I guess. GC1037G I guess it probably is a single 1bp deletion - at 1038. Although I'm a bit confused about how the mutations work. Which of the two bases is the one at the position? If I pull out these three bases, with one base flanking, from the katG reference, I get G GCC T. So if the position describes the first base, then 1039 should read CT1039C right? But if the second base describes the position, it's the same problem right? |
Hmmm, wonder why I got a different sequence. |
The probe generator is making one probe for each 1bp and 2bp deletions in katG and pncA. But if you make a all 1bp deletions in a homopolymer, they are essentially the same.
eg deleting one A from AAA could happen in 3 places and give the same sequence.
This leads to annoying things where we report >1 mutation being detected, when it is basicvally the same thing detected twice/whatever.
Workaround suggestions
The text was updated successfully, but these errors were encountered: