Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory leaks with combinator functions #10

Open
joshdifabio opened this issue Jun 21, 2016 · 33 comments
Open

Memory leaks with combinator functions #10

joshdifabio opened this issue Jun 21, 2016 · 33 comments

Comments

@joshdifabio
Copy link
Contributor

joshdifabio commented Jun 21, 2016

I have uploaded a gist which shows a problem I've encountered recently when using promises and combinators: specifically, using a combinator function (particularly first()/race()/any()) will cause redundant resolver callbacks to remain attached to whichever promises do not 'win' the race to resolve the combined promise. The current Awaitable interface does not provide any mechanism to overcome this issue (e.g. some kind of cancel() or dispose() method). What do others think about this issue?

I don't personally think combinators should implicitly abort 'losing' promises (which is what cancel() sometimes means) but I do think it might be nice to have some kind of dispose() method:

interface Awaitable
{
    public function when(callback $onResolved): Disposable;
}

interface Disposable
{
    /** Inform the awaitable that you no longer care about its result and it can remove your callback **/
    public function dispose();
}

I'm guessing people will be against such a change, but it's come up for me a couple of times so I'm interested to hear people's thoughts.

@kelunik
Copy link
Member

kelunik commented Jun 21, 2016

Promises can't be aborted. And we always have to attach a handler.

To support something like Disposable, we'd have to issue identifiers for all when calls, as each handler should be independent.

@kelunik
Copy link
Member

kelunik commented Jun 21, 2016

Where's the real memory leak now? Promises usually don't exist for a long time after they have been resolved, as they're useless then.

@kelunik
Copy link
Member

kelunik commented Jun 21, 2016

What we could do is something like when returning an identifier and providing a cancel method in Awaitable that cancels a when, but not the awaitable itself.

@joshdifabio
Copy link
Contributor Author

joshdifabio commented Jun 21, 2016

Yeah, I guess that'd be more efficient than creating an object for every when() call, and more consistent with Loop as well.

@kelunik
Copy link
Member

kelunik commented Jun 21, 2016

We don't need an object, we could just increment a counter that's used as key for private $whens.

@joshdifabio
Copy link
Contributor Author

joshdifabio commented Jun 21, 2016

I agree, that would be a better approach.

@bwoebi
Copy link
Member

bwoebi commented Jun 21, 2016

I sometimes, but rarely, have that need. In these cases, it typically is just fine to handle the abortion in the callback directly, i.e.

function ($ex, $val) {
    if ($this->aborted) {
       return;
    }
    /* work with $ex and $val */
}

This usually has the advantage of not explicitly cancelling something on cancellation site, but being explicit of ignoring under certain circumstances [and you possibly still might want to log possible errors nevertheless] (it's hard to think about all the things to be cleaned up and easy to forget one).

Furthermore, you're saying it leaves the callbacks attached … but is that a problem? It isn't a memory leak as the other Promises will be resolved at some point (if they aren't that's a bug in your program). Nor is it a performance hit as calling a method to remove the callback is about just as expensive as calling a callback which immediately returns.

In your specific example the when() callback just temporarily "leaks" but the Awaitable impl should free it upon resolution. Typically not much memory is used here … in an extreme case maybe 500 bytes for a fullblown Closure [including statics from use ()] (and IMO if you have millions of Promises pending concurrently, you're not on a 4 GB RAM machine, but on something much beefier...). There's usually much more memory used for the actual action behind. For example two dns lookups on different dns servers racing against each other … each lookup is having it's own structure, watcher callback, whatever… that's typically much more expensive and still counting twoards your "leaked" memory.
In case this gets too expensive, you should consider adding a real cancel() method [not to the standard, but to the class you started the operation on] which then aborts the lookup handling and resolves the Awaitable immediately.

Thus, unless there are stronger arguments, I'm pretty much opposed to adding this. The way it is currently is much cleaner IMO.

@joshdifabio
Copy link
Contributor Author

joshdifabio commented Jun 22, 2016

Sorry guys, I was replying via email on my phone last night and I missed a bunch of what you said.

I'm not sure if I've been clear with what I'm saying, so allow me to try and clarify with a more concise, albeit artificial, example than the one I posted in a gist yesterday:

while (true) {
    yield race(doSomething(), $someVeryLongLivedPromise);
}

In this example, $someVeryLongLivedPromise is a promise which takes a long time to be resolved; it could take hours, days or weeks to resolve. Each time race() is called, it adds a callback to each of the promises it's been provided. Imagine that doSomething() is resolved within a second or so; the callback attached to $someVeryLongLivedPromise by race() is now redundant, as the 'race' was already won by doSomething(). Over time, $someVeryLongLivedPromise will accumulate millions of redundant callbacks in this way.

To be crystal clear, if doSomething() generally resolves very quickly and $someVeryLongLivedPromise takes a long time to resolve then this example can waste a lot of memory very quickly. It will also make eventual resolution of $someVeryLongLivedPromise extremely slow as it will have to call millions of callbacks.

The suggestion is that some method could be provided for cases like this so that race() could remove the callback it attached to $someVeryLongLivedPromise once the race has been won by doSomething().

As suggested by @kelunik, something like the following would be a more efficient way of implementing this than my earlier suggestion, although I'm not sure about naming the new method cancel() as this would conflict with existing cancel() methods on React, Guzzle and Icicle promises which have a different meaning to what I'm proposing:

interface Awaitable
{
    /**
     * @return string A unique (to this awaitable) watcher ID which can be passed to cancel()
     **/
    public function when(callable $callback) : string;

    public function cancel(string $watcherId) : void;
}

However, if people don't like this idea, that's totally cool with me and I'll close the issue.

@bwoebi
Copy link
Member

bwoebi commented Jun 22, 2016

This sounds like a legitimate, though extremely edge case.
I've not once encountered a need for something like this.

But in this very special case you should be able to just implement your own Awaitable decorator, I'm not sure whether it belongs in the standard.

class RaceableAwaitable implements Awaitable {
    private $awaitable;
    private $racing = [];
    private $resolved = false;
    public function __construct(Awaitable $awaitable) {
        $this->awaitable = $awaitable;
        $awaitable->when(function($ex, $val) {
            $this->resolved = true;
            foreach ($this->racing as $deferred) {
                $deferred->resolve($this->awaitable);
            }
            unset($this->racing);
        });
    }
    public function race(Awaitable $awaitable) {
        if ($this->resolved) {
            return $this->awaitable;
        }
        $deferred = new Deferred;
        $hash = spl_object_hash($deferred);
        $this->racing[$hash] = $deferred;
        $awaitable->when(function() use ($awaitable, $hash); {
            if ($this->resolved) { return; }
            $deferred = $this->racing[$hash];
            unset($this->racing[$hash]);
            $deferred->resolve($awaitable);
        });
        return $deferred->awaitable();
    }
    public function when(callable $callback) { $this->awaitable->when($callback); } 
}

Then you can wrap this locally and do your calls against this custom race() method. Then you have exactly one when() call for all your races.

I do not think this case warrants adding an extra function to the fundamental Awaitable interface.
For special cases you can have special solutions.

@joshdifabio
Copy link
Contributor Author

@bwoebi Thanks for taking the time to reply in detail. Having considered your point and thought about this for a while I think you're right; this is an edge case which can have its own solution and doesn't need to be baked into the spec.

@kelunik
Copy link
Member

kelunik commented Jan 2, 2017

As it came up again in #38, we might want to reconsider don't care / unregister. I can see a problem with things like combinators / timeouts, where bound variables can't be GCed, because they're still bound via when(), even if the timeout has already been fired long ago.

I think it's really not a real world problem though, I can't remember having a promise where I repeatedly called combinators on, but it might be the case when using more advanced implementations such as observables for a streaming API, which are long-lived promises then.

/cc @martinschroeder

@kelunik kelunik reopened this Jan 2, 2017
@bwoebi
Copy link
Member

bwoebi commented Jan 2, 2017

I still think it shouldn't be part of this standard. It is an edge case.

It should be trivial to add another method in the custom Observable implementation or to return a custom object allowing for cancellation (it doesn't violate LSP to return a value when the parent returns void).

I see no general use case for interoperability here. If you know that the Promise may take really long to resolve [IMHO Promises/Observables are the wrong tool to use for potentially infinitely active streams though. They should be used for anything which has an end and just intermittently informs about current progress/data arrived.], you also know the concrete implementation of the Promise too and may use cancellation if present.

@kelunik
Copy link
Member

kelunik commented Jan 2, 2017

I see no general use case for interoperability here.

If we want to unregister no longer needed callbacks in combinators, we need it to be interoperable, because combinators operate on the interface, not specific vendor implementations.

you also know the concrete implementation of the Promise too and may use cancellation if present

I think it will be common in libraries to declare the interop promise as return type, so the underlying implementation can be switched. In that case you don't know the implementation and you shouldn't know the implementation anyway if you have interfaces.

For HTTP requests and so on, I agree, there should be a custom implementation, basically a handle, that allows abortion. But in the case of combinators, that's not the case. On the other hand, there isn't that much state that has to be stored for combinators.

IMHO Promises/Observables are the wrong tool to use for potentially infinitely active streams though

What's the use case of observables if not (potentially infinite) streams?

@bwoebi
Copy link
Member

bwoebi commented Jan 2, 2017

What's the use case of observables if not (potentially infinite) streams?

Finite streams.

I think it will be common in libraries to declare the interop promise as return type, so the underlying implementation can be switched.

Why should the underlying implementation be switchable by that?

If we want to unregister no longer needed callbacks in combinators, we need it to be interoperable, because combinators operate on the interface, not specific vendor implementations.

That's true, but we don't want to.
Either we don't expect a Promise to resolve in finite time or you use a combinator on it. I do not see much sense in using a combinator on Promises we don't expect to resolve soon.

@joshdifabio
Copy link
Contributor Author

it doesn't violate LSP to return a value when the parent returns void

I don't think that is true. Non-null types are not subtypes of null. If the return type of when() was changed to mixed then it would be possible to return a value from when() without defying the spec.

@davidwdan
Copy link

A finite stream is just one of the many use cases for an Observable. It should also be able to handle infinite streams, so any observable can have 0 to infinite values.

@bwoebi
Copy link
Member

bwoebi commented Jan 2, 2017

@joshdifabio void ≠ null. It's just PHP which makes void functions return null. void actually means "value irrelevant".

@davidwdan infinite streams are fine as long as the time they are active is finite. Otherwise that Observable should not implement Promise - the Promise promises to resolve timely.

@kelunik
Copy link
Member

kelunik commented Jan 2, 2017

@bwoebi From MDN:

A Promise represents a value which may be available now, or in the future, or never.

@bwoebi
Copy link
Member

bwoebi commented Jan 2, 2017

@kelunik Correct. There may never be a value available, but then it will be resolved with an exception instead.

@joshdifabio
Copy link
Contributor Author

@joshdifabio void ≠ null. It's just PHP which makes void functions return null. void actually means "value irrelevant".

I don't agree that void means value irrelevant. Void means that there is no return value.

@cboden
Copy link

cboden commented Jan 2, 2017

IMHO Promises/Observables are the wrong tool to use for potentially infinitely active streams though

Observables are exactly the right tool for that use case.

@bwoebi
Copy link
Member

bwoebi commented Jan 2, 2017

If your Observable doesn't implement Promise, sure. And thus this is not on-topic on this spec.

@trowski
Copy link
Contributor

trowski commented Jan 2, 2017

@bwoebi What we called Observables in Amp is different from RxObservables, which are not promises. However, a promise doesn't ever need be resolved. An infinite set in our implementation would just never resolve the promise.

@bwoebi
Copy link
Member

bwoebi commented Jan 2, 2017

@trowski Yeah, I realized I was confusing some people with this by assuming the Amp Observables; sorry for that.

Also, yes, it would never resolve, but I TBH consider that a bug then.

@cboden
Copy link

cboden commented Jan 2, 2017

Sorry for going off topic but what is your definition/specification of "observable"? In my experience across multiple languages I've seen the term either be synonymous with Reactive eXtensions or ECMA Script 7 which was adopted from Rx.

@bwoebi
Copy link
Member

bwoebi commented Jan 2, 2017

Our amp observables (as in current master) allow subscribing with a callback which will be updated each time a new value is emitted. The end of emitting will be signaled by the Promise (which Observable implements) being resolved.
Perhaps we need another name to lower confusion...

@trowski
Copy link
Contributor

trowski commented Jan 5, 2017

Cancellation of promises as part of this specification is, in my opinion, the incorrect place. Creators of promises (or observables) should be responsible for providing a mechanism to abort the operation if possible. Since async operations vary widely and are state dependent, providing a interoperable and simplified cancellation method is impossible. What does it mean for a promise to be cancelled? Does it reject the promise? How can we report how much progress was made before the promise was cancelled? These questions can be answered by specific APIs (perhaps returning specialized implementations of promises for their particular purpose), but I feel cancellation is not a question that needs answering in an interoperable specification.

As for combinator functions, if a promise is not expected to resolve in a reasonable amount of time, it shouldn't be given to a combinator function and should be considered a bug. Most libs provide a way to enforce a timeout on promises, such as Amp\timeout().

@kelunik
Copy link
Member

kelunik commented Jan 5, 2017

@trowski Amp\timeout is one such combinator that adds a handler but never removes it.

@trowski
Copy link
Contributor

trowski commented Jan 5, 2017

@kelunik That is true… but is really only an issue if the promise is never destroyed. Is this something we see as being a problem? Promises shouldn't last forever within a program – eventually the reference to the object should be destroyed.

@kelunik
Copy link
Member

kelunik commented Jan 5, 2017

@trowski That's the question this issue is about.

@trowski
Copy link
Contributor

trowski commented Jan 5, 2017

@kelunik The issue was that unnecessary when callbacks remain attached after a combinator function, which should be fine if the promise(s) passed to the combinator functions are eventually destroyed, since those functions will be destroyed along with it. I would consider indefinitely retaining a reference to a promise a bug.

It seems the ability to remove callbacks from promises is unnecessary, as I've never seen this capability in any promise library.

@bwoebi
Copy link
Member

bwoebi commented Jan 5, 2017

@kelunik Also, why should you apply a timeout() on something you do not expect to resolve?

Anyway, it makes more sense to have some way of cancelling the task itself then (which will lead to the Promise failing and thus releasing all associated watchers) - but that is out of scope of this specification.

@kelunik
Copy link
Member

kelunik commented Jan 5, 2017

@kelunik Also, why should you apply a timeout() on something you do not expect to resolve?

If I know that it resolves now, I don't need timeout(), so it not resolving isn't that uncommon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants