Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

retries and timeouts misunderstanding? bug? or conspiracy? #16

Open
bryanchriswhite opened this issue Oct 31, 2013 · 2 comments
Open

retries and timeouts misunderstanding? bug? or conspiracy? #16

bryanchriswhite opened this issue Oct 31, 2013 · 2 comments

Comments

@bryanchriswhite
Copy link

The Problem:

I'm attempting to use asynchronous callbacks with a pmrpc call. It would seem that the call is timing out with not enough retries to get my onSuccess callback to be called.

So, to test this hypothesis, in my project I, exaggerated the delay by using a setTimeout which seemed to cause the same problem consistently. I've replicated this scenario accurately in the demo below.

I attempted to increase the number of retries first instead of the timeout because I don't want to increase the delay of the callback if I can help it - calling-back sooner is better than later; however, increasing the retries from it's default value doesn't seem to yield any different result.

Finally, I increased the timeout to something larger than the delay and my onSucces got called but onError was also being called 3 times after (with two different messages).

Questions

  1. By watching the onError function I observe only 4 onError calls (when the default is 5 retries). Why? Is there something else going on here?
  2. If the default timeout is 1000ms and the delay before callback is 3000ms then why does the onSuccess not get fired on the 4th or 5th retry?
  3. If I set the timeout to 3500ms (when the delay is still 3000ms), onError is still being called. Why? Again, is there something else going on here?
  4. For every "set" of onError calls there is always 1 with message Method not found. The requestd remote procedure does not exist or is not available. and 2 or 3 with message Application error. Destination unavailable. What's up with that? (Also, not to be a dick, but there's a typo "... The requestd remote...")

Steps to Replicate / Using the Demo:

The demo's JS has a map object containing 2 key/value pairs. The HTML has 2 buttons. The pmrpc procedure/call pair are set up so that clicking a button in the "child" (iframe) will call the "parent" pmrpc procedure and callback with the respective map value.

  1. Use the default timeout and retries. This will generate my initial results, 4 onError calls and no onSuccess calls.
  2. Set the retries to 15. This results will be the same.
  3. Set the timeout to 3500. This will allow the callback to call and will also call onError 3 times.

Demo:

jsfiddle: "Parent" document example
gist "Child" document (iframe source)

Note: I'm currently hosting the iframe document (I didn't know where else to put it for working jsfiddle example)
@izuzak
Copy link
Owner

izuzak commented Nov 1, 2013

Thanks for submitting such a detailed and all-around-awesome issue! Seriously, I'd like to print it and frame it. ⚡

Unfortunately, I'll be traveling the next week so I'm not sure if I'll have the time to look into this before I get back. Still, I'll just mention that asynchronous methods is something I've not used or tested a lot when I was developing pmrpc. It's possible there are issues with async methods. I just tried your demo and get the same behavior as you do.

The retries parameter tells pmrpc how many times it should ping the destination to see if it's ready to accept the call. The waiting period between these pings is calculated using the timeout parameter. For the first retry, pmrpc waits timeout/retries miliseconds, for the second - timeout/(retries-1), then timeout/(retries-2), and so on. In other words - pmrpc uses a backoff algorithm to wait between retries. So, if the timeout is 1000ms, and retries is 3 -- then pmrpc will ping the destination after 333ms, then after 500ms, and then after 1000ms.

I guess that retries don't make much sense with async methods, so you just need to define a long enough timeout. How long is long enough? It depends on your situation. A better way to put it would be to say that you should know when you want to declare a particular call as failed -- is it after 1s? After 10s? Or if you want to wait really long - you can put an hour as the timeout.

If you'd like, feel free to poke around and submit a pull requests that fixes anything you notice needs fixing. If not -- I'll take a better look at this in a week.

@bryanchriswhite
Copy link
Author

Thanks for the positive response on this. I found a couple of bugs in my example code that explain some of the inconsistencies in that particular example. I havn't had a large enough block of time to fully read and understand the pmrpc code yet.

I was confused though by line 479, where you're dividing the timeout by the number of retries. When you have a chance it'd be great if you could explain why this is so because i thought the timeout was the time between retries, as opposed to a total timeout for all retries as this line would suggest.

When I have a bit longer I'll edit the original issue to reflect bugfixes I made to my example code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants