[BUG] Wrong Z height after long print, Z-shifting #27347

Open
1 task done
Legion23705 opened this issue Aug 11, 2024 · 13 comments

Comments

@Legion23705

Legion23705 commented Aug 11, 2024

Did you test the latest bugfix-2.1.x code?

Yes, and the problem still exists.

Bug Description

I discovered a problem after I finished a 24h print: The part which I wanted to print should have been 15mm tall, but was only 12,5mm.
After I manually moved the Z-Axis down, the nozzle touched the printbed at a displayed Z-height of 2,5mm. So the Z-height apparently has been shifted by 2,5mm during the print.

After I discovered this, I checked the mechanics of the machine: Everything sits properly, there is nothing loose, all belts are tensioned, pulleys are fixed. There is no measurable backlash on the z-axis.
When I move the Z-axis manually, even 10 times up and down by several 100mm, the shift doesn't appear.
It only seems to appear at a high number of small movements.

Then I wrote a test G-code file that mimics a complex print with many small islands and Z-hops: it rapidly performs a small Z-movement followed by a small XY-movement, 7500 cycles in total, with an additional Z+0,2 after every 500 cycles.

G91 ;set relative positioning
15x
(
(500x)
(
G1 Z0.2
G1 X0.2 Y0.2
G1 Z-0.2
G1 X-0.2 Y-0.2
)
followed by 1x
G1 Z0.2
)
G90 ;set absolute positioning
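The loop notation above is pseudocode, since plain G-code has no loop construct. A minimal sketch of how the pattern could be expanded into a flat command list is shown here. The function name `build_test_gcode` and its parameters are hypothetical, and feedrates are left out because the pseudocode does not show any; the attached file may differ in those details.

```python
def build_test_gcode(outer=15, inner=500, hop=0.2, xy=0.2):
    """Expand the looped pseudocode into a flat list of G-code lines."""
    lines = ["G91 ; set relative positioning"]
    for _ in range(outer):
        for _ in range(inner):
            lines += [
                f"G1 Z{hop}",          # Z-hop up
                f"G1 X{xy} Y{xy}",     # small XY move
                f"G1 Z-{hop}",         # Z-hop down
                f"G1 X-{xy} Y-{xy}",   # XY move back
            ]
        lines.append(f"G1 Z{hop}")     # extra lift once per outer block
    lines.append("G90 ; set absolute positioning")
    return lines


gcode = build_test_gcode()
z_moves = [line for line in gcode if line.startswith("G1 Z")]
net_z = sum(float(line[4:]) for line in z_moves)
print(len(gcode), len(z_moves), round(net_z, 3))
```

With the defaults this gives 7500 hop cycles and a net lift of 3mm, matching the "total +3Z" in the attached file name.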

I´ve run this test G-Code with 3 different Marlin firmware versions:

Version 2.1.2.1: Z+0,25mm
Version 2.1.x (bugfix) from 11. Jun. 2024: Z-0,32mm
Version 2.1.x (bugfix) from 09. Aug. 2024 (latest version): Z-0,32mm

Z+ means the axis is higher than it is supposed to be (gap between nozzle and printbed at Z=0),
Z- means the axis is lower than it is supposed to be (nozzle crashes into printbed at Z=0).
The results are repeatable, always with identical values.

As shown above, the value and the direction of the z-shift changes depending on the firmware version. This clearly shows that this problem is firmware related and cannot be caused by other mechanical or electrical reasons.
I´ve attached the test g-code and the config file.

My Z-axis runs with 1280 steps/mm, 300mm/s² acceleration and 0 jerk. X-Axis is 160,22 steps/mm, Y-Axis 128,28 steps/mm, both 5000 mm/s² acceleration and 10mm/s jerk.
When you test this G-Code, you should set your machine to equal values if possible to reproduce the results, especially for the Z-axis.

Bug Timeline

Discovered a few weeks ago

Expected behavior

I expect no position shift in the Z-Axis. Even after a long print, Z=0 has to remain at the same position as at the beginning of the print.

Actual behavior

Z-Axis shifts after a high number of Z-hops.

Steps to Reproduce

  1. Home all axes
  2. Move X and Y so that the nozzle is above the printbed
  3. Measure the distance between the nozzle and the printbed (I used sheet metal gauges)
  4. Move Z+1mm so that the endstops won´t be triggered
  5. Start gcode "Test Z+-0.2 (total +3Z) +XY rel x7500.gcode"
  6. Move Z towards Z=0 carefully and test if it touches the bed or triggers the endstop above Z=0
  7. If the nozzle doesn´t touch the printbed at Z=0, measure the distance between the nozzle and the bed and compare it to the initial value.

Version of Marlin Firmware

2.1.x Bugfix from 07.08.2024

Printer model

custom built

Electronics

BIGTREETECH Manta M8P with Raspberry Pi 4

LCD/Controller

None

Other add-ons

closed loop servo motors and external stepper drivers

Bed Leveling

None

Your Slicer

Simplify3D

Host Software

Other (explain below)

Don't forget to include

  • A ZIP file containing your Configuration.h and Configuration_adv.h.

Additional information & file uploads

Host Software: Repetier Server

Configuration.zip
Test Z+-0.2 (total +3Z) +XY rel x7500.zip

@thinkyhead
Member

Do you see any difference if you reduce your Z axis acceleration and max feedrates? Set them as low as you can tolerate and give a few tests, and maybe try some extra high values too just for comparison.

@Legion23705
Author

I've tested some settings to find potential reasons for this behavior.
I only changed the mentioned parameters in each line and reverted the others to default.

  • Decreased acceleration XY 4000 -> 1000, Z 300 -> 100, XY Jerk 10 -> 5: Same result (Z-0,28)
  • Changed steps/mm on Z 1280 -> 1000 on Marlin only, motor settings weren´t changed: Same result (displayed Z-0,35mm, real: Z-0,28)
  • Increased MINIMUM_STEPPER_PULSE_NS 3000 ->10000, MINIMUM_STEPPER_PRE_DIR_DELAY + POST_DIR_DELAY 5000 -> 10000: Same result (Z-0,28)
  • Doubled step resolution on Motor and Marlin (Z Steps/mm 1280 -> 2560): Different Result (Z-0,14)
  • Increased move distance for test program Z+-0,2mm -> +-0,4mm: Same result (Z-0,28)

It looks like there is a constant number of missed / added steps, independent of the speed (no change at smaller acceleration = smaller velocity), moved distance, pulse density, number of steps, etc. The only constant is the number of directional changes.
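The "constant number of steps" reading can be checked against the bullet values above with a little arithmetic. This is only a sketch: the helper names are made up for illustration, and it assumes the firmware displays position as counted steps divided by its own steps/mm setting.

```python
def shift_mm(lost_steps, motor_steps_per_mm):
    """Physical shift caused by a fixed number of lost motor steps."""
    return lost_steps / motor_steps_per_mm


def displayed_mm(real_mm, motor_steps_per_mm, firmware_steps_per_mm):
    """Distance the firmware reports when its steps/mm differs from the motor's."""
    return real_mm * motor_steps_per_mm / firmware_steps_per_mm


lost_steps = 0.28 * 1280                 # baseline: 0.28 mm at 1280 steps/mm

print(shift_mm(lost_steps, 1280))        # baseline resolution
print(shift_mm(lost_steps, 2560))        # resolution doubled on motor and Marlin
print(displayed_mm(0.28, 1280, 1000))    # Marlin set to 1000, motor unchanged
```

The same step count gives about 0,14mm at 2560 steps/mm and a displayed value of about 0,36mm for the 1000 steps/mm case, which is close to the measured Z-0,14 and the displayed Z-0,35mm.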

Did you try the test-gcode on one of your machines to validate if it´s a general firmware issue or a problem specifically related to my hardware?

@Legion23705
Author

I did some more tests:

  • Added an "M400" command (wait for current moves to finish) before and after every Z movement: Different result (Z+-0, no error). However, M400 increases the execution time by a factor of 4: the test program, which usually takes ~18 min, now took 70 min to finish. It looks like M400 doesn't only empty the move buffer, but also adds a delay before the next move is started.
  • Decreased acceleration Z 300 -> 30: Same result (Z-0,28) (printtime of testfile almost doubled)
  • Increased acceleration Z300 -> 1200: Same result (Z-0,28)

Apparently, the acceleration has no impact on this problem.
Only the test with M400 eliminated the problem, but due to the massive increase in execution time, this is not a suitable workaround.
However, in my opinion this indicates that it might be a problem with the path planner, buffer, or something related.

What do you think?

@Legion23705
Author

I did even more tests:
Since I had the idea that the issue is related to the motion planner or something related, I wanted to try different acceleration methods:

  • disabled "classic Jerk", enabled "Junction deviation": Same result (Z-0,28)
  • Enabled "S_Curve_Acceleration": Same result (Z-0,28)

Was anyone able to reproduce this problem on their machine?

@parallyze

Was anyone able to reproduce this problem on their machine?

Not on the Z axis... but this whole thing reminds me of something. Recently I upgraded my Prusa MK2S
using a MKS SGen L 1.0 and some TMC2209 drivers. The heatbed has specific positions where the probe
has to be when homing/levelling, took some tries to hit them in the right spot.

But:
Every once in a while when doing a bed level the nozzle would crash into the bed, missing the spots on
the heatbed by a few mm. Whenever I moved the bed using the menus it would slightly shift/lose steps
and without a home y first any action involving the probe would fail.
For me it was Input Shaping causing this strange problem. Interestingly it was very speed dependent, any
speeds > 60mm/s would make the bed fail to home and the printer halt with a homing failure. Changing
frequencies/dampening didn't really change much here.

This was using the 2.1.2.4 release. I then tried bugfix-2.1.x just to find out it simply wouldn't home
my dual z axis (z2 in e1, did work without problems using 2.1.2.4) properly - moving one axis always
much faster than the other. Ended up installing 2.1.2.1 and absolutely no problems anymore on that
printer.

Wanted to post this but forgot about it, the problem with your Z axis does sound like what I've seen
happening on the Y axis. But I'm not able to deliver much more information, I'm afraid. What I DO know:

The problem did NOT occur when disabling IS. I never enabled IS on the Z axis, wanted to see how it
worked out for others first. I also never tested any FT-Motion stuff. I probably still have the configs if
they might be of any help here... bugfix was from July, 26th when I tested.

@Legion23705
Author

I have an idea:
Since there seems to be a fixed shift per Z-move, is it somehow possible to add a fixed amount to every Z-move in the firmware? This is quite a sketchy workaround, but maybe the best I can do for now until a better solution is found.
According to my calculations, for each Z-move there is an error of Z-0.0000186666mm.
First, i had the idea to add something like this to the postprocessor after every Z-move:

G91 ; relative positioning
G1 Z-0.0000186666 F1200
G90 ; absolute positioning

But this won't work, because as soon as the next absolute Z-move is executed, the additional offset is lost.
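Why the relative correction does not survive can be seen with a toy position tracker. This is a deliberately simplified sketch, not how Marlin's parser works: `final_z` is a hypothetical helper that only understands G90, G91 and `G1 Z...`, and the offset is exaggerated to 0.5mm to make the effect visible.

```python
def final_z(commands):
    """Track the commanded Z position through G90/G91 and 'G1 Z...' lines."""
    z = 0.0
    relative = False
    for cmd in commands:
        if cmd == "G91":
            relative = True
        elif cmd == "G90":
            relative = False
        elif cmd.startswith("G1 Z"):
            value = float(cmd[4:])
            # Relative moves add to the position; absolute moves overwrite it.
            z = z + value if relative else value
    return z


with_offset = ["G90", "G1 Z1.0", "G91", "G1 Z-0.5", "G90", "G1 Z2.0"]
without_offset = ["G90", "G1 Z1.0", "G1 Z2.0"]
print(final_z(with_offset), final_z(without_offset))
```

Both sequences end at the same commanded height, so the correction would have to be applied inside the firmware or folded into every absolute coordinate by the postprocessor.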
@parallyze
Input Shaping and FT-Motion were always deactivated. I don´t think that it is something related to your problem.

@parallyze

parallyze commented Aug 17, 2024

I have an idea: Since there seems to be a fixed shift per Z-move, is it somehow possible to add a fixed amount to every Z-move in the firmware? This is quite a sketchy workaround, but maybe the best I can do for now until a better solution is found. According to my calculations, for each Z-move there is an error of Z-0.0000186666mm. First, i had the idea to add something like this to the postprocessor after every Z-move:

So far I'm wondering if anyone was able to confirm the theory of a fixed shift per z move...?

At 1280 steps/mm each step is 0.00078125mm - roughly 42 times of your calculated value. How do you know it's 0.000019mm for each move and not 0.0002mm for a single move every now and then, did you run your test with 1000/1500/2000 iterations or how did you verify this? Or did you simply divide 0.28mm/15000?
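The comparison between one motor step and the averaged per-move error can be written out as follows. This is only a sketch of the arithmetic; the 0.28mm and 15000 figures come from the test described earlier in the thread, and the variable names are chosen for illustration.

```python
STEPS_PER_MM = 1280

step_mm = 1 / STEPS_PER_MM            # smallest possible Z increment
per_move_mm = 0.28 / 15000            # total error averaged over all Z moves
moves_per_step = step_mm / per_move_mm

print(f"one step       = {step_mm:.8f} mm")
print(f"avg per move   = {per_move_mm:.10f} mm")
print(f"moves per step = {moves_per_step:.1f}")
```

Since the averaged value is far below one step, it can only be a statistical mean: individual moves are either exact or off by at least one whole step.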

@parallyze Input Shaping and FT-Motion were always deactivated. I don´t think that it is something related to your problem.

Based on the timeline we've both likely been using bugfix releases around the same time with tons of changes... while I had been losing steps on Y when IS was enabled I didn't enable IS when my Z axis was behaving strangely... the thing I didn't really notice initially was you reporting the problem with 2.1.2.1. I'm running that release on multiple printers, so I wanted to test this.

Because of the different steps/mm you're using I did not want to test your gcode on the printer I mentioned, but I did
run it on a different printer (bltouch, so x/y position with different steps/mm doesn't matter here). First I ran it at my default settings for acc/jerk/steps and then used the ones from your config:

M92 X128.22 Y80.14 Z1280 ; set steps per mm
M201 X4000 Y4000 Z300 E5000 ; sets maximum accelerations, mm/sec^2
M203 X400 Y400 Z20 E60 ; sets maximum feedrates, mm / sec
M204 P1500 R5000 T1000 ; sets acceleration (P, T) and retract acceleration (R), mm/sec^2
M205 X10 Y10 Z0 E10 ; sets the jerk limits, mm/sec

Also added 3 times moving to z=3 and z=0 at the start to verify the position when the test starts.

Usually my Z axis is running at 400 steps/mm, so it's running at > 60mm/s for this test and I did expect to see some trouble here, but no:

[Image vlcsnap-00015] z=0 position right before starting your test

[Image vlcsnap-00016] z=0 position right after your test has ended and z was moved manually to 0.0

https://www.youtube.com/watch?v=af9SPdLW4G4

While there is a TINY difference (way below 1/100th mm) it's nowhere near the ~0.25 or 0.3mm reported and I'm inclined to say that's more likely my hastily designed dial gauge mount and the decades old indicator itself which might be slightly off.

Been using 2.1.2.1 for quite a while and I'm using Z hop/lift on many prints, so I'm having a hard time to believe it's a generic problem like being off by a constant fraction of a step on each z move. But I'm also missing the equipment to reliably come up with something like 0.000019mm/move, I have to admit.

@Legion23705
Author

@parallyze
Thank you for performing the test. It looks like the Z-shift didn't appear on your machine, but could it be that there was a Z-shift and your machine was just stopped by the Z-axis endstop on the way down to Z0? At least that's the case on my machine. I always started the test at Z1, and after it was done I moved the axis slowly, with a resolution of 0,01mm, towards Z0 until the Z-endstop was triggered. This happened between Z0,28 and Z0,3.
Would you mind doing this with your setup, with the attached indicator? I mean: start at Z1, let the test run until Z4, and then move back down to Z1 to compare the value on the indicator?

At 1280 steps/mm each step is 0.00078125mm - roughly 42 times of your calculated value. How do you know it's 0.000019mm for each move and not 0.0002mm for a single move every now and then, did you run your test with 1000/1500/2000 iterations or how did you verify this? Or did you simply divide 0.28mm/15000?

It definitely is a 0,00078125mm step every now and then (after 42 Z-moves, as you said); the motor cannot perform a smaller move than a single step.

In fact I simply divided 0,28/15000.
However, this factor was confirmed by the failed print I initially had: This print had roughly 131.000 Z-moves and the part was 2,5mm smaller than it should be.
131000 x 0,00001866 = 2,44mm
The factor fits pretty well, so it seems to be a constant.
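The extrapolation from the test file to the failed print can be reproduced like this. It is a sketch of the arithmetic only: the move count and the 2.5mm figure are the rough values quoted above, and the variable names are illustrative.

```python
z_moves_test = 15000
error_test_mm = 0.28
per_move_mm = error_test_mm / z_moves_test     # averaged error per Z move

z_moves_print = 131_000                        # rough count from the failed print
predicted_mm = z_moves_print * per_move_mm
measured_mm = 2.5
relative_gap = (measured_mm - predicted_mm) / measured_mm

print(f"predicted shift: {predicted_mm:.2f} mm")
print(f"gap to measured: {relative_gap:.1%}")
```

The prediction lands within a few percent of the measured shift, which supports a roughly constant rate but cannot confirm it to the last decimal.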

Could you send me your config so that I can compare it to mine? Maybe there are other settings different which I could try to test.

@parallyze

parallyze commented Aug 17, 2024

@parallyze Thank you for performing the test. It looks like the Z-shift didn´t appear on your machine, but could it be that there was a Z-shift and your machine was just stopped by the Z-axis-endstop on the way down to Z0? At least that´s the case on my machine. I always started the test at Z1 and after it was done, I moved the axis slowly with a resolution of 0,01mm towards Z0 until the Z-endstop was triggered. This happened between Z0,28 and Z0,3 Would you mind doing this with your setup, with the attached indicator? I mean start at Z1, let the test run until Z4 and the move back down to Z1 to compare the value on the indicator?

Sorry, not really into testing again right now (already did two 18 minute runs without any indications of something wrong). But there was no Z-Shift, and I did not hit the endstop. As mentioned I was using a BLtouch, so there was no endstop to hit.
Before testing I adjusted the probe offset, the nozzle was ~1mm away from the bed at z=0, the dial gauge was the only thing touching the bed after starting the test, that's for sure.

At 1280 steps/mm each step is 0.00078125mm - roughly 42 times of your calculated value. How do you know it's 0.000019mm for each move and not 0.0002mm for a single move every now and then, did you run your test with 1000/1500/2000 iterations or how did you verify this? Or did you simply divide 0.28mm/15000?

It definitely is a 0,00078125mm step every now and then (after 42 Z-moves, as you said); the motor cannot perform a smaller move than a single step.

I said "roughly 42 times of your calculated value". That's quite different from "losing a single step every 42 z moves". And I am a bit confused now:

Since there seems to be a fixed shift per Z-move

It definitely is a 0,00078125mm - step every now and then

Are you saying there's something off on every MOVE or every now and then?

Edit:

The only constant is the number of directional changes

Have you tried changing your test script? Instead of +0.2/-0.2 in each "loop" moving 2x +0.2 and 1x -0.4 instead? That might give a clue if it's tied to the amount of moves (3 instead of 2) or the amount of directional changes (still only 1).

In fact I simply divided 0,28/15000. However, this factor was confirmed by the failed print I initially had: This print had roughly 131.000 Z-moves and the part was 2,5mm smaller than it should be. 131000 x 0,00001866 = 2,44mm The factor fits pretty well, so it seems to be a constant.

2.5mm vs 2.44mm is a difference of ~77 steps, so this sounds awkwardly rounded to fit other calculations.

Have you actually tested if any of this happens when using z moves only? Or Z+X only?

Right now I'm just getting more confused because of all the stuff not fitting together. Initially this was reported:

I´ve run this test G-Code with 3 different Marlin firmware versions:
Version 2.1.2.1: Z+0,25mm
Version 2.1.x (bugfix) from 11. Jun. 2024: Z-0,32mm
Version 2.1.x (bugfix) from 09. Aug. 2024 (latest version): Z-0,32mm

but from your second post on it's always about -0.28mm....?!

Initially you said:

X-Axis is 160,22 steps/mm, Y-Axis 128,28 steps/mm

while your config clearly states:

#define DEFAULT_AXIS_STEPS_PER_UNIT { 128.22, 80.14,

both 5000 mm/s² acceleration

does also not fit your configs or following posts.

Right now I can't see any coherence here or reliable steps to recreate the problem reported, sorry. And mind you, I'm using TR8x2/4 rods, so at 16x microstepping there's 400 steps/mm and the problem would've been even more visible. The 0.2mm z moves from your gcode equalled 0.64mm on that printer.
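The scaling mentioned here follows from running a 1280 steps/mm configuration on a 400 steps/mm axis. A minimal sketch, with `actual_travel` as a hypothetical helper name:

```python
def actual_travel(commanded, configured_steps_per_mm, true_steps_per_mm):
    """Real travel (or speed) when the configured steps/mm is not the machine's."""
    return commanded * configured_steps_per_mm / true_steps_per_mm


hop_mm = actual_travel(0.2, 1280, 400)      # the 0.2 mm Z-hop from the test file
speed_mm_s = actual_travel(20, 1280, 400)   # the Z20 max feedrate from M203

print(f"{hop_mm:.2f} mm per hop, up to {speed_mm_s:.0f} mm/s")
```

Both distance and speed are scaled by 3.2, so this run stressed the Z axis more than the original setup would.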

Could you send me your config so that I can compare it to mine? Maybe there are other settings different which I could try to test.

The config from that printer is vastly different. It's using dual Z, Trinamics, BLtouch and IS (which was disabled during the tests), so I highly doubt that'll show up anything interesting.

@Legion23705
Author

Thanks for clarification. So it seems indeed to be a problem isolated to my hardware or firmware config.

There are some mismatches between the config I´ve uploaded and the values I mentioned later (XY Steps/mm and acceleration), I forgot to mention that I changed some values in the EEPROM.
The actual values are:

  • X-Axis is 160,22 steps/mm
  • Y-Axis 128,28 steps/mm
  • XY acceleration 4000 mm/s²

Since neither the acceleration nor the move distance seemed to have an impact on the issue, I assumed it wasn´t necessary to get too much into the detail here. I apologize for the confusion.
The measurements of the error are also not accurate below a few 1/100 mm: sometimes the endstop triggers a little earlier, sometimes a little later. For all tests with Version 2.1.x (bugfix), there was a spread of +-0,02mm in the trigger position of the endstop, which in my opinion is caused by the inaccuracy of the endstop itself; it's a mechanical endstop which doesn't have perfect repeatability. Also, the deviation of the initial failed print wasn't exactly 2,5mm; it could be 2,4 or 2,6, since I didn't do such a precise measurement when I first noticed this issue.
All in all, my measurements weren't super accurate, with an error of a few %. I also don't have the tools for measuring below a few 1/100 mm.

I just wanted to say that the estimated error per Z-Move or Z-directional change which I calculated according to my test g-code is roughly the same as what I´ve seen on the initial failed print, but not to the very last decimal place.

To clarify the timeline:

  • The problem initially occurred on the Version 2.1.x (bugfix) from 11. Jun. 2024 (part was ~2,5mm too small)
  • Then I did the first test with my test gcode with the Version 2.1.x (bugfix) from 11. Jun. 2024: Error Z-0,32
  • I repeated the test with version 2.1.x (bugfix) from 09. Aug. 2024 (latest version): Error Z-0,32
  • After that, I tested the same G-Code with Version 2.1.2.1: Error Z+0,25mm (measured with sheet metal gauges, +-0,05mm accuracy, so it could also be Z+0,28mm, but the deviation is "+", not "-")
  • Afterwards, I executed all other tests with the version 2.1.x (bugfix) from 09. Aug. 2024, where I had an error of Z-0,28 +-0,02mm in most cases, as documented.

I did some more tests:

  • Changed the test gcode, so that there were only Z-moves, no XY-moves: Different result: No error (Z+-0)
  • Executed the same gcode as before (with +-0,2mm XY-moves), but the XY-motors are unplugged: Same result (Z-0,28mm)
  • Changed the test gcode, all Z-moves doubled: instead of Z+0,2, XY+0,2, Z-0,2, XY-0,2 -> Z+0,2, Z+0,2, XY+0,2, Z-0,2, Z-0,2, XY-0,2: Same result (Z-0,28mm)

So I must modify my earlier assumption: The error doesn't happen after each Z-move or Z-directional change, but each time a Z-move is directly followed by an XY-move or vice versa. This explains why there was also no error when the Z-move was enclosed by M400 commands, which I tested earlier: It looks like the error happens when a Z-move is directly followed by an XY-move, or vice versa, inside the move buffer.
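To compare G-code files under this revised hypothesis, one could count how often a Z-only move sits directly next to an XY-only move. The sketch below is a rough diagnostic, not a G-code parser: the helper names are hypothetical, it assumes whitespace-separated words, and it treats M400 as a barrier because the M400 test above showed no error.

```python
import re

MOVE = re.compile(r"G0?[01]\b")           # G0, G1, G00, G01
AXIS = re.compile(r"[XYZ](?=[-+.\d])")    # axis letters followed by a number


def move_kind(line):
    """Return 'Z', 'XY', 'other', or None for lines that move no X/Y/Z axis."""
    code = line.split(";", 1)[0].strip().upper()
    if not MOVE.match(code):
        return None
    axes = set(AXIS.findall(code))
    if not axes:
        return None
    if axes == {"Z"}:
        return "Z"
    if "Z" not in axes:
        return "XY"
    return "other"


def count_z_xy_transitions(lines):
    """Count adjacent move pairs where one is Z-only and the other XY-only."""
    count, prev = 0, None
    for line in lines:
        if line.strip().upper().startswith("M400"):
            prev = None    # assumption: buffer drained, no adjacent move
            continue
        kind = move_kind(line)
        if kind is None:
            continue
        if {prev, kind} == {"Z", "XY"}:
            count += 1
        prev = kind
    return count


original = ["G1 Z0.2", "G1 X0.2 Y0.2", "G1 Z-0.2", "G1 X-0.2 Y-0.2"]
doubled = ["G1 Z0.2", "G1 Z0.2", "G1 X0.2 Y0.2",
           "G1 Z-0.2", "G1 Z-0.2", "G1 X-0.2 Y-0.2"]
z_only = ["G1 Z0.2", "G1 Z-0.2"]
for name, cycle in [("original", original), ("doubled", doubled), ("z_only", z_only)]:
    print(name, count_z_xy_transitions(cycle * 100))
```

The original and the doubled-Z patterns produce the same transition count, while the Z-only pattern produces none, which lines up with the three results listed above.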

@Legion23705
Author

I got a little further with the investigation of this problem:
When I increase the MINIMUM_STEPPER_PULSE_NS 3000 ->15000 and MINIMUM_STEPPER_PRE_DIR_DELAY + POST_DIR_DELAY 5000 -> 15000, the error disappears. This however limits the maximum step rate to ~33 kHz, which would be really slow.
I will do some further investigation and try to find the smallest possible value for the stepper timing.
At a MINIMUM_STEPPER_PULSE_NS of 7500, the error reappears.
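The ~33 kHz figure can be reproduced with a simple model. This sketch assumes each step period consists of a high and a low phase that both last at least the minimum pulse width; the function names are hypothetical and the real limit also depends on the board and the stepper ISR.

```python
def max_step_rate_hz(min_pulse_ns):
    """Upper bound on step frequency, assuming high and low phases of equal minimum length."""
    return 1e9 / (2 * min_pulse_ns)


def max_speed_mm_s(min_pulse_ns, steps_per_mm):
    """Axis speed corresponding to that step rate."""
    return max_step_rate_hz(min_pulse_ns) / steps_per_mm


for pulse_ns in (3000, 7500, 15000):
    rate_khz = max_step_rate_hz(pulse_ns) / 1000
    z_speed = max_speed_mm_s(pulse_ns, 1280)
    print(f"{pulse_ns:>5} ns: {rate_khz:6.1f} kHz, Z up to {z_speed:5.1f} mm/s")
```

Under this assumption a 15000 ns pulse would cap a 1280 steps/mm axis at roughly 26 mm/s, which is why the longer pulse is unattractive as a general setting.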

I had the idea to set the timing for the Z-axis to 15000/15000 and leave the other axes at 3000/5000, but it looks like this isn't possible with Marlin: The timing is always set to the highest value among all configured stepper drivers.

I have another printer which has the same JMC motor, but for the X and Y axes. It runs with Repetier-Firmware and the timing is set to 3000 (step)/5000 (dir). There, I don't have any problems. So I don't think that the motor is the problem.

What I also noticed is that the noise of the Z-Motor is slightly different when it runs alone compared to when it runs together with the XY-axes:

  • When it runs with XY, there are some clicking noises every now and then, which sound like the acceleration ramp isn't uniform
  • When it runs alone, these noises are gone and the motor appears to run smoother.

To me it looks like Marlin has some problems getting the step timings right in this specific situation, which may not cause problems for fast TMC drivers, but is a problem for slower Leadshine or JMC drivers.

@parallyze

MINIMUM_STEPPER_PULSE_NS

Hmm... this is just an uneducated guess - but there's been minor changes to this, in 2.1.2.1 this was defined as microseconds: #27113

While you did have problems using 2.1.2.1 it was different from the bugfix ones. I wonder if those minor changes also did have any impact on the timings set...

@Legion23705
Author

Legion23705 commented Aug 19, 2024

Success!!
I finally was able to fix this problem without limiting the print speed:
The following settings are working without Z-shift:

  • MINIMUM_STEPPER_PULSE_NS 3000
  • MINIMUM_STEPPER_PRE_DIR_DELAY + POST_DIR_DELAY 30000

The DIR_DELAY is now 6 times longer than it usually has to be with this motor, I have no idea why.
However, this fix works for me, finally !!!
Since the MINIMUM_STEPPER_PULSE time is the same as before, I can still print with full speed. The additional DIR_DELAY only costs a few extra seconds for every print hour, depending on the geometry.
While this is not great, it´s at least usable. Maybe the developers can find out what´s the reason for this strange behavior sooner or later.

I´m currently printing the failed 24h-print again to check if the problem also disappears under real conditions, but I am very confident.

Hmm... this is just an uneducated guess - but there's been minor changes to this, in 2.1.2.1 this was defined as microseconds: #27113

While you did have problems using 2.1.2.1 it was different from the bugfix ones. I wonder if those minor changes also did have any impact on the timings set...

Yes, I remember, there was a change. The current value for STEPPER_PULSE (3000ns) is longer than the pulse time mentioned in the motor manual (2,5µs = 2500 ns), so I took this into account.

@thisiskeithb thisiskeithb changed the title Wrong Z height after long print, Z-shifting [BUG] Wrong Z height after long print, Z-shifting Aug 22, 2024