
storage: timeout errors are not always wrapped when a func was retried #9720

Closed
BrennaEpp opened this issue Apr 5, 2024 · 1 comment
Labels: api: storage (Issues related to the Cloud Storage API.) · type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)

@BrennaEpp (Contributor):
In invoke.go, errors returned from the run function are wrapped when the call was retried and the context timed out (deadline exceeded).
For example, this is what an error looks like after retries were exhausted by a timeout on obj.Attrs:
retry failed with context deadline exceeded; last error: googleapi: Error 400: ...

However, there are at least two cases in which these errors are not wrapped:

  1. When MaxAttempts is reached. This is what an error looks like after the maximum number of attempts was exhausted:
    googleapi: Error 400: ...

  2. On object uploads over HTTP. This is what an error looks like after the retry deadline is exhausted on an object write:
    context deadline exceeded

Here is repro code:

		var err error
		// This bucket name is too short, which will produce a 400 error.
		obj := client.Bucket("a").Object("content")

		// Always retry, so no calls will succeed on this handle.
		// Retries everything, including the 400 error we get because of the bucket name.
		obj = obj.Retryer(WithPolicy(RetryAlways),
			WithErrorFunc(func(err error) bool { return true }),
		)

		// Timeout on obj.Attrs - this error is wrapped correctly
		timeoutCtx, cancel := context.WithTimeout(ctx, time.Second*1)
		defer cancel()

		_, err = obj.Attrs(timeoutCtx)
		if err == nil {
			t.Fatalf("obj.Attrs should have failed but succeeded")
		}
		fmt.Printf("obj.Attrs times out: \n%v\n-\n", err)

		// Max attempts on obj.Attrs - this error is not wrapped
		objMaxAttempt := obj.Retryer(WithMaxAttempts(4))

		_, err = objMaxAttempt.Attrs(ctx)
		if err == nil {
			t.Fatalf("obj.Attrs should have failed but succeeded")
		}
		fmt.Printf("obj.Attrs max attempts run out: \n%v\n-\n", err)

		// Timeout on obj write - this error is not wrapped
		timeoutCtx, cancel = context.WithTimeout(ctx, time.Second*5)
		defer cancel()

		w := obj.NewWriter(timeoutCtx)
		_, err = io.Copy(w, strings.NewReader("It was the best of times, it was the worst of times."))
		if err != nil {
			t.Errorf("io.Copy: %v", err)
		}

		err = w.Close()
		if err == nil {
			t.Fatalf("w.Close should have failed but succeeded")
		}
		fmt.Printf("w.Close times out: \n%v\n", err)

I'm also concerned about timeouts that occur inside the call at line 72 of invoke.go, in the unlucky case that the deadline expires right before that call, so the call itself checks the context and returns an error. I haven't traced this code path to confirm whether that is possible, or whether it would return an unwrapped error, but we should verify.

@BrennaEpp added the triage me label Apr 5, 2024
@product-auto-label bot added the api: storage label Apr 5, 2024
@BrennaEpp added the type: bug label and removed the api: storage and triage me labels Apr 5, 2024
@BrennaEpp added the api: storage label Apr 5, 2024
tritone added a commit to tritone/google-cloud-go that referenced this issue Apr 15, 2024
Wrap the error when retries are cut off by hitting the configured
value for MaxAttempts. This makes it easier to verify that retries
occurred.

Also adds emulator tests verifying that wrapping occurs as expected
for timeout errors and MaxAttempts.

Updates googleapis#9720
tritone added a commit that referenced this issue Apr 15, 2024
Updates #9720
@tritone (Contributor) commented Jul 1, 2024:

This should be fixed in googleapis/google-api-go-client#2657 and #9767

@tritone closed this as completed Jul 1, 2024