Memory leaks with combinator functions #10

joshdifabio · 2016-06-21T16:17:17Z

I have uploaded a gist which shows a problem I've encountered recently when using promises and combinators: specifically, using a combinator function (particularly first()/race()/any()) will cause redundant resolver callbacks to remain attached to whichever promises do not 'win' the race to resolve the combined promise. The current Awaitable interface does not provide any mechanism to overcome this issue (e.g. some kind of cancel() or dispose() method). What do others think about this issue?

I don't personally think combinators should implicitly abort 'losing' promises (which is what cancel() sometimes means) but I do think it might be nice to have some kind of dispose() method:

interface Awaitable
{
    public function when(callback $onResolved): Disposable;
}

interface Disposable
{
    /** Inform the awaitable that you no longer care about its result and it can remove your callback **/
    public function dispose();
}

I'm guessing people will be against such a change, but it's come up for me a couple of times so I'm interested to hear people's thoughts.

The text was updated successfully, but these errors were encountered:

kelunik · 2016-06-21T16:54:07Z

Promises can't be aborted. And we always have to attach a handler.

To support something like Disposable, we'd have to issue identifiers for all when calls, as each handler should be independent.

kelunik · 2016-06-21T16:56:03Z

Where's the real memory leak now? Promises usually don't exist for a long time after they have been resolved, as they're useless then.

kelunik · 2016-06-21T17:05:25Z

What we could do is something like when returning an identifier and providing a cancel method in Awaitable that cancels a when, but not the awaitable itself.

joshdifabio · 2016-06-21T17:08:39Z

Yeah, I guess that'd be more efficient than creating an object for every when() call, and more consistent with Loop as well.

kelunik · 2016-06-21T17:10:43Z

We don't need an object, we could just increment a counter that's used as key for private $whens.

joshdifabio · 2016-06-21T17:26:15Z

I agree, that would be a better approach.

bwoebi · 2016-06-21T18:48:13Z

I sometimes, but rarely, have that need. In these cases, it typically is just fine to handle the abortion in the callback directly, i.e.

function ($ex, $val) {
    if ($this->aborted) {
       return;
    }
    /* work with $ex and $val */
}

This usually has the advantage of not explicitly cancelling something on cancellation site, but being explicit of ignoring under certain circumstances [and you possibly still might want to log possible errors nevertheless] (it's hard to think about all the things to be cleaned up and easy to forget one).

Furthermore, you're saying it leaves the callbacks attached … but is that a problem? It isn't a memory leak as the other Promises will be resolved at some point (if they aren't that's a bug in your program). Nor is it a performance hit as calling a method to remove the callback is about just as expensive as calling a callback which immediately returns.

In your specific example the when() callback just temporarily "leaks" but the Awaitable impl should free it upon resolution. Typically not much memory is used here … in an extreme case maybe 500 bytes for a fullblown Closure [including statics from use ()] (and IMO if you have millions of Promises pending concurrently, you're not on a 4 GB RAM machine, but on something much beefier...). There's usually much more memory used for the actual action behind. For example two dns lookups on different dns servers racing against each other … each lookup is having it's own structure, watcher callback, whatever… that's typically much more expensive and still counting twoards your "leaked" memory.
In case this gets too expensive, you should consider adding a real cancel() method [not to the standard, but to the class you started the operation on] which then aborts the lookup handling and resolves the Awaitable immediately.

Thus, unless there are stronger arguments, I'm pretty much opposed to adding this. The way it is currently is much cleaner IMO.

joshdifabio · 2016-06-22T08:22:36Z

Sorry guys, I was replying via email on my phone last night and I missed a bunch of what you said.

I'm not sure if I've been clear with what I'm saying, so allow me to try and clarify with a more concise, albeit artificial, example than the one I posted in a gist yesterday:

while (true) {
    yield race(doSomething(), $someVeryLongLivedPromise);
}

In this example, $someVeryLongLivedPromise is a promise which takes a long time to be resolved; it could take hours, days or weeks to resolve. Each time race() is called, it adds a callback to each of the promises it's been provided. Imagine that doSomething() is resolved within a second or so; the callback attached to $someVeryLongLivedPromise by race() is now redundant, as the 'race' was already won by doSomething(). Over time, $someVeryLongLivedPromise will accumulate millions of redundant callbacks in this way.

To be crystal clear, if doSomething() generally resolves very quickly and $someVeryLongLivedPromise takes a long time to resolve then this example can waste a lot of memory very quickly. It will also make eventual resolution of $someVeryLongLivedPromise extremely slow as it will have to call millions of callbacks.

The suggestion is that some method could be provided for cases like this so that race() could remove the callback it attached to $someVeryLongLivedPromise once the race has been won by doSomething().

As suggested by @kelunik, something like the following would be a more efficient way of implementing this than my earlier suggestion, although I'm not sure about naming the new method cancel() as this would conflict with existing cancel() methods on React, Guzzle and Icicle promises which have a different meaning to what I'm proposing:

interface Awaitable
{
    /**
     * @return string A unique (to this awaitable) watcher ID which can be passed to cancel()
     **/
    public function when(callable $callback) : string;

    public function cancel(string $watcherId) : void;
}

However, if people don't like this idea, that's totally cool with me and I'll close the issue.

bwoebi · 2016-06-22T11:23:13Z

This sounds like a legitimate, though extremely edge case.
I've not once encountered a need for something like this.

But in this very special case you should be able to just implement your own Awaitable decorator, I'm not sure whether it belongs in the standard.

class RaceableAwaitable implements Awaitable {
    private $awaitable;
    private $racing = [];
    private $resolved = false;
    public function __construct(Awaitable $awaitable) {
        $this->awaitable = $awaitable;
        $awaitable->when(function($ex, $val) {
            $this->resolved = true;
            foreach ($this->racing as $deferred) {
                $deferred->resolve($this->awaitable);
            }
            unset($this->racing);
        });
    }
    public function race(Awaitable $awaitable) {
        if ($this->resolved) {
            return $this->awaitable;
        }
        $deferred = new Deferred;
        $hash = spl_object_hash($deferred);
        $this->racing[$hash] = $deferred;
        $awaitable->when(function() use ($awaitable, $hash); {
            if ($this->resolved) { return; }
            $deferred = $this->racing[$hash];
            unset($this->racing[$hash]);
            $deferred->resolve($awaitable);
        });
        return $deferred->awaitable();
    }
    public function when(callable $callback) { $this->awaitable->when($callback); } 
}

Then you can wrap this locally and do your calls against this custom race() method. Then you have exactly one when() call for all your races.

I do not think this case warrants adding an extra function to the fundamental Awaitable interface.
For special cases you can have special solutions.

joshdifabio · 2016-06-25T11:03:55Z

@bwoebi Thanks for taking the time to reply in detail. Having considered your point and thought about this for a while I think you're right; this is an edge case which can have its own solution and doesn't need to be baked into the spec.

kelunik · 2017-01-02T12:50:53Z

As it came up again in #38, we might want to reconsider don't care / unregister. I can see a problem with things like combinators / timeouts, where bound variables can't be GCed, because they're still bound via when(), even if the timeout has already been fired long ago.

I think it's really not a real world problem though, I can't remember having a promise where I repeatedly called combinators on, but it might be the case when using more advanced implementations such as observables for a streaming API, which are long-lived promises then.

/cc @martinschroeder

bwoebi · 2017-01-02T17:12:02Z

I still think it shouldn't be part of this standard. It is an edge case.

It should be trivial to add another method in the custom Observable implementation or to return a custom object allowing for cancellation (it doesn't violate LSP to return a value when the parent returns void).

I see no general use case for interoperability here. If you know that the Promise may take really long to resolve [IMHO Promises/Observables are the wrong tool to use for potentially infinitely active streams though. They should be used for anything which has an end and just intermittently informs about current progress/data arrived.], you also know the concrete implementation of the Promise too and may use cancellation if present.

kelunik · 2017-01-02T17:50:44Z

I see no general use case for interoperability here.

If we want to unregister no longer needed callbacks in combinators, we need it to be interoperable, because combinators operate on the interface, not specific vendor implementations.

you also know the concrete implementation of the Promise too and may use cancellation if present

I think it will be common in libraries to declare the interop promise as return type, so the underlying implementation can be switched. In that case you don't know the implementation and you shouldn't know the implementation anyway if you have interfaces.

For HTTP requests and so on, I agree, there should be a custom implementation, basically a handle, that allows abortion. But in the case of combinators, that's not the case. On the other hand, there isn't that much state that has to be stored for combinators.

IMHO Promises/Observables are the wrong tool to use for potentially infinitely active streams though

What's the use case of observables if not (potentially infinite) streams?

bwoebi · 2017-01-02T18:08:14Z

What's the use case of observables if not (potentially infinite) streams?

Finite streams.

I think it will be common in libraries to declare the interop promise as return type, so the underlying implementation can be switched.

Why should the underlying implementation be switchable by that?

If we want to unregister no longer needed callbacks in combinators, we need it to be interoperable, because combinators operate on the interface, not specific vendor implementations.

That's true, but we don't want to.
Either we don't expect a Promise to resolve in finite time or you use a combinator on it. I do not see much sense in using a combinator on Promises we don't expect to resolve soon.

joshdifabio · 2017-01-02T18:37:32Z

it doesn't violate LSP to return a value when the parent returns void

I don't think that is true. Non-null types are not subtypes of null. If the return type of when() was changed to mixed then it would be possible to return a value from when() without defying the spec.

davidwdan · 2017-01-02T18:53:42Z

A finite stream is just one of the many use cases for an Observable. It should also be able to handle infinite streams, so any observable can have 0 to infinite values.

bwoebi · 2017-01-02T18:58:28Z

@joshdifabio void ≠ null. It's just PHP which makes void functions return null. void actually means "value irrelevant".

@davidwdan infinite streams are fine as long as the time they are active is finite. Otherwise that Observable should not implement Promise - the Promise promises to resolve timely.

kelunik · 2017-01-02T19:08:05Z

@bwoebi From MDN:

A Promise represents a value which may be available now, or in the future, or never.

bwoebi · 2017-01-02T19:16:10Z

@kelunik Correct. There may never be a value available, but then it will be resolved with an exception instead.

joshdifabio · 2017-01-02T19:41:48Z

@joshdifabio void ≠ null. It's just PHP which makes void functions return null. void actually means "value irrelevant".

I don't agree that void means value irrelevant. Void means that there is no return value.

cboden · 2017-01-02T19:50:51Z

IMHO Promises/Observables are the wrong tool to use for potentially infinitely active streams though

Observables are exactly the right tool for that use case.

bwoebi · 2017-01-02T20:02:22Z

If your Observable doesn't implement Promise, sure. And thus this is not on-topic on this spec.

trowski · 2017-01-02T20:03:13Z

@bwoebi What we called Observables in Amp is different from RxObservables, which are not promises. However, a promise doesn't ever need be resolved. An infinite set in our implementation would just never resolve the promise.

bwoebi · 2017-01-02T20:04:21Z

@trowski Yeah, I realized I was confusing some people with this by assuming the Amp Observables; sorry for that.

Also, yes, it would never resolve, but I TBH consider that a bug then.

cboden · 2017-01-02T20:19:46Z

Sorry for going off topic but what is your definition/specification of "observable"? In my experience across multiple languages I've seen the term either be synonymous with Reactive eXtensions or ECMA Script 7 which was adopted from Rx.

bwoebi · 2017-01-02T20:24:48Z

Our amp observables (as in current master) allow subscribing with a callback which will be updated each time a new value is emitted. The end of emitting will be signaled by the Promise (which Observable implements) being resolved.
Perhaps we need another name to lower confusion...

trowski · 2017-01-05T00:34:29Z

Cancellation of promises as part of this specification is, in my opinion, the incorrect place. Creators of promises (or observables) should be responsible for providing a mechanism to abort the operation if possible. Since async operations vary widely and are state dependent, providing a interoperable and simplified cancellation method is impossible. What does it mean for a promise to be cancelled? Does it reject the promise? How can we report how much progress was made before the promise was cancelled? These questions can be answered by specific APIs (perhaps returning specialized implementations of promises for their particular purpose), but I feel cancellation is not a question that needs answering in an interoperable specification.

As for combinator functions, if a promise is not expected to resolve in a reasonable amount of time, it shouldn't be given to a combinator function and should be considered a bug. Most libs provide a way to enforce a timeout on promises, such as Amp\timeout().

kelunik · 2017-01-05T00:39:51Z

@trowski Amp\timeout is one such combinator that adds a handler but never removes it.

trowski · 2017-01-05T00:42:34Z

@kelunik That is true… but is really only an issue if the promise is never destroyed. Is this something we see as being a problem? Promises shouldn't last forever within a program – eventually the reference to the object should be destroyed.

kelunik · 2017-01-05T00:49:34Z

@trowski That's the question this issue is about.

trowski · 2017-01-05T01:00:18Z

@kelunik The issue was that unnecessary when callbacks remain attached after a combinator function, which should be fine if the promise(s) passed to the combinator functions are eventually destroyed, since those functions will be destroyed along with it. I would consider indefinitely retaining a reference to a promise a bug.

It seems the ability to remove callbacks from promises is unnecessary, as I've never seen this capability in any promise library.

bwoebi · 2017-01-05T18:50:45Z

@kelunik Also, why should you apply a timeout() on something you do not expect to resolve?

Anyway, it makes more sense to have some way of cancelling the task itself then (which will lead to the Promise failing and thus releasing all associated watchers) - but that is out of scope of this specification.

kelunik · 2017-01-05T18:53:45Z

@kelunik Also, why should you apply a timeout() on something you do not expect to resolve?

If I know that it resolves now, I don't need timeout(), so it not resolving isn't that uncommon.

joshdifabio closed this as completed Jun 25, 2016

joshdifabio mentioned this issue Jan 2, 2017

Cancellation of promises #38

Closed

kelunik reopened this Jan 2, 2017

joshdifabio mentioned this issue Jan 2, 2017

Replace non-RFC 2119 'MAY NOT' with 'MAY' #39

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Memory leaks with combinator functions #10

Memory leaks with combinator functions #10

joshdifabio commented Jun 21, 2016 •

edited

Loading

kelunik commented Jun 21, 2016

kelunik commented Jun 21, 2016 •

edited

Loading

kelunik commented Jun 21, 2016

joshdifabio commented Jun 21, 2016 •

edited

Loading

kelunik commented Jun 21, 2016

joshdifabio commented Jun 21, 2016 •

edited

Loading

bwoebi commented Jun 21, 2016 •

edited

Loading

joshdifabio commented Jun 22, 2016 •

edited

Loading

bwoebi commented Jun 22, 2016 •

edited

Loading

joshdifabio commented Jun 25, 2016

kelunik commented Jan 2, 2017 •

edited

Loading

bwoebi commented Jan 2, 2017

kelunik commented Jan 2, 2017

bwoebi commented Jan 2, 2017

joshdifabio commented Jan 2, 2017

davidwdan commented Jan 2, 2017

bwoebi commented Jan 2, 2017

kelunik commented Jan 2, 2017

bwoebi commented Jan 2, 2017

joshdifabio commented Jan 2, 2017

cboden commented Jan 2, 2017

bwoebi commented Jan 2, 2017

trowski commented Jan 2, 2017

bwoebi commented Jan 2, 2017

cboden commented Jan 2, 2017 •

edited

Loading

bwoebi commented Jan 2, 2017

trowski commented Jan 5, 2017

kelunik commented Jan 5, 2017

trowski commented Jan 5, 2017

kelunik commented Jan 5, 2017

trowski commented Jan 5, 2017 •

edited

Loading

bwoebi commented Jan 5, 2017

kelunik commented Jan 5, 2017

Memory leaks with combinator functions #10

Memory leaks with combinator functions #10

Comments

joshdifabio commented Jun 21, 2016 • edited Loading

kelunik commented Jun 21, 2016

kelunik commented Jun 21, 2016 • edited Loading

kelunik commented Jun 21, 2016

joshdifabio commented Jun 21, 2016 • edited Loading

kelunik commented Jun 21, 2016

joshdifabio commented Jun 21, 2016 • edited Loading

bwoebi commented Jun 21, 2016 • edited Loading

joshdifabio commented Jun 22, 2016 • edited Loading

bwoebi commented Jun 22, 2016 • edited Loading

joshdifabio commented Jun 25, 2016

kelunik commented Jan 2, 2017 • edited Loading

bwoebi commented Jan 2, 2017

kelunik commented Jan 2, 2017

bwoebi commented Jan 2, 2017

joshdifabio commented Jan 2, 2017

davidwdan commented Jan 2, 2017

bwoebi commented Jan 2, 2017

kelunik commented Jan 2, 2017

bwoebi commented Jan 2, 2017

joshdifabio commented Jan 2, 2017

cboden commented Jan 2, 2017

bwoebi commented Jan 2, 2017

trowski commented Jan 2, 2017

bwoebi commented Jan 2, 2017

cboden commented Jan 2, 2017 • edited Loading

bwoebi commented Jan 2, 2017

trowski commented Jan 5, 2017

kelunik commented Jan 5, 2017

trowski commented Jan 5, 2017

kelunik commented Jan 5, 2017

trowski commented Jan 5, 2017 • edited Loading

bwoebi commented Jan 5, 2017

kelunik commented Jan 5, 2017

joshdifabio commented Jun 21, 2016 •

edited

Loading

kelunik commented Jun 21, 2016 •

edited

Loading

joshdifabio commented Jun 21, 2016 •

edited

Loading

joshdifabio commented Jun 21, 2016 •

edited

Loading

bwoebi commented Jun 21, 2016 •

edited

Loading

joshdifabio commented Jun 22, 2016 •

edited

Loading

bwoebi commented Jun 22, 2016 •

edited

Loading

kelunik commented Jan 2, 2017 •

edited

Loading

cboden commented Jan 2, 2017 •

edited

Loading

trowski commented Jan 5, 2017 •

edited

Loading