race condition in S3MetaRequestResponseHandlerNativeAdapter throws NullPointerException #3919

@untra

Describe the bug

We are seeing an occasional NullPointerException when using the S3TransferManager. We cannot reproduce the bug consistently, but the stack trace is always the same:

Exception in thread "Thread-4" java.lang.NullPointerException: Cannot read the array length because "array" is null
	at java.base/java.nio.ByteBuffer.wrap(ByteBuffer.java:43
	at software.amazon.awssdk.crt.s3.S3MetaRequestResponseHandlerNativeAdapter.onResponseBody(S3MetaRequestResponseHandlerNativeAdapter.java:15)
Exception in thread "main" java.util.concurrent.CompletionException: software.amazon.awssdk.core.exception.SdkClientException: Failed to send the request: A callback has reported failure.
	at software.amazon.awssdk.utils.CompletableFutureUtils.errorAsCompletionException(CompletableFutureUtils.java:62)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncExecutionFailureExceptionReportingStage.lambda$execute$0(AsyncExecutionFailureExceptionReportingStage.java:51)
	at java.base/java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:934)
	at java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:911)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
	at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162)
	at software.amazon.awssdk.utils.CompletableFutureUtils.lambda$forwardExceptionTo$0(CompletableFutureUtils.java:76)
	at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
	at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
	at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor.maybeAttemptExecute(AsyncRetryableStage.java:103)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor.maybeRetryExecute(AsyncRetryableStage.java:181)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor.lambda$attemptExecute$1(AsyncRetryableStage.java:15
	at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
	at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
	at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162)
	at software.amazon.awssdk.utils.CompletableFutureUtils.lambda$forwardExceptionTo$0(CompletableFutureUtils.java:76)
	at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
	at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
	at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage.lambda$null$0(MakeAsyncHttpRequestStage.java:103)
	at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
	at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
	at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage.lambda$executeHttpRequest$3(MakeAsyncHttpRequestStage.java:165)
	at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
	at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
	at java.base/java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:482)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:[11]
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: software.amazon.awssdk.core.exception.SdkClientException: Failed to send the request: A callback has reported failure.
	at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:102)
	at software.amazon.awssdk.core.exception.SdkClientException.create(SdkClientException.java:43)
	at software.amazon.awssdk.services.s3.internal.crt.S3CrtResponseHandlerAdapter.handleError(S3CrtResponseHandlerAdapter.java:97)
	at software.amazon.awssdk.services.s3.internal.crt.S3CrtResponseHandlerAdapter.onFinished(S3CrtResponseHandlerAdapter.java:77)
	at software.amazon.awssdk.crt.s3.S3MetaRequestResponseHandlerNativeAdapter.onFinished(S3MetaRequestResponseHandlerNativeAdapter.java:
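The top two frames suggest that `onResponseBody` was handed a null byte array and passed it straight to `ByteBuffer.wrap`. A minimal sketch of just that failure mode, using only the JDK; the `wrapOrNull` helper is a hypothetical name for illustration and is not part of the SDK or the CRT:

```kotlin
import java.nio.ByteBuffer

// Hypothetical helper, not SDK code: wraps a payload and returns null when
// the payload is missing, instead of letting ByteBuffer.wrap throw.
fun wrapOrNull(payload: ByteArray?): ByteBuffer? =
    payload?.let { ByteBuffer.wrap(it) }

fun main() {
    val missing: ByteArray? = null

    // Passing null through to ByteBuffer.wrap fails like the first frame above.
    val failure = runCatching { ByteBuffer.wrap(missing) }.exceptionOrNull()
    println("wrap(null) threw NullPointerException: ${failure is NullPointerException}")

    // The guarded variant reports the missing payload without throwing.
    println("wrapOrNull(null) is null: ${wrapOrNull(missing) == null}")
    println("bytes wrapped: ${wrapOrNull(byteArrayOf(1, 2, 3))?.remaining()}")
}
```

This only illustrates where the exception comes from; why the native side would deliver a null array is the open question of this report.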

Expected Behavior

The SDK does not throw a NullPointerException, and the download completes as intended.

Current Behavior

We are using Kotlin and can only sporadically reproduce the issue with the downloadFileFromS3Aws function below. For context, it fetches an object that definitely exists and was uploaded to S3 0-60 seconds earlier.

import org.apache.commons.io.FileUtils
import software.amazon.awssdk.auth.credentials.AwsCredentialsProvider
import software.amazon.awssdk.services.s3.S3AsyncClient
import software.amazon.awssdk.services.s3.model.GetObjectRequest
import software.amazon.awssdk.transfer.s3.S3TransferManager
import software.amazon.awssdk.transfer.s3.model.ObjectTransfer
import software.amazon.awssdk.transfer.s3.progress.TransferProgress
import java.nio.file.Path
import java.time.Instant

    // Downloads s3://bucket/key to filePath, reporting progress while waiting.
    fun downloadFileFromS3Aws(
        bucket: String,
        key: String,
        filePath: String,
        awsCredentialsProvider: AwsCredentialsProvider? = null,
        progressFormat: TransferProgress.(Instant) -> String = { "Download progress" },
        progressPrint: (String) -> Unit = { },
    ) {
        val outputFile = Path.of(filePath).toFile()
        FileUtils.createParentDirectories(outputFile)
        // A new client and transfer manager are built on every call.
        val clientBuilder = S3AsyncClient.builder()
            .credentialsProvider(awsCredentialsProvider)

        val s3TransferManager = S3TransferManager.builder()
            .s3Client(clientBuilder.build())
            .build()
        val getObjectRequest = GetObjectRequest.builder()
            .bucket(bucket).key(key).build()
        val fileDownload =
            s3TransferManager.downloadFile {
                it.getObjectRequest(getObjectRequest)
                    .destination(Path.of(filePath))
            }
        waitForS3Transfer(fileDownload, progressFormat, progressPrint)
        // Rethrows any failure as a CompletionException.
        fileDownload.completionFuture().join()
    }

    // Polls the transfer every `delay` milliseconds until the transferred ratio reaches 1.0.
    private fun waitForS3Transfer(
        transfer: ObjectTransfer,
        progressFormat: TransferProgress.(Instant) -> String = { "" },
        progressPrint: (String) -> Unit = { },
        delay: Long = 1000,
    ) {
        val start = Instant.now()
        do {
            val progress = transfer.progress()
            // Surface a failed transfer immediately instead of polling forever.
            if (transfer.completionFuture().isCompletedExceptionally) {
                transfer.completionFuture().join()
            }
            progressPrint(progressFormat(progress, start))
            Thread.sleep(delay)
        } while (progress.snapshot().ratioTransferred().orElseGet { 0.0 } < 1.0)
        // Print the final progress line once the loop exits.
        val progress = transfer.progress()
        progressPrint(progressFormat(progress, start))
    }
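A side note on `waitForS3Transfer`: the loop exits on success only once the transferred ratio reaches 1.0, and a missing ratio is treated as 0.0. A minimal sketch of an alternative that keys the loop on the completion future itself, written against a plain `CompletableFuture` so it runs without the SDK. `awaitWithProgress`, `pollMillis` and `report` are hypothetical names, not SDK API:

```kotlin
import java.util.concurrent.CompletableFuture
import java.util.concurrent.ExecutionException
import java.util.concurrent.TimeUnit
import java.util.concurrent.TimeoutException

// Hypothetical helper: blocks until `future` completes, calling `report`
// after every poll interval that elapses without completion.
fun <T> awaitWithProgress(
    future: CompletableFuture<T>,
    pollMillis: Long = 1000,
    report: () -> Unit = { },
): T {
    while (!future.isDone) {
        try {
            // Returns as soon as the future completes, without a fixed sleep.
            future.get(pollMillis, TimeUnit.MILLISECONDS)
        } catch (e: TimeoutException) {
            report() // still running: emit a progress line and poll again
        } catch (e: ExecutionException) {
            // The future failed; join() below rethrows it as a CompletionException.
        }
    }
    return future.join()
}

fun main() {
    // Stand-in for a transfer's completion future.
    val slow = CompletableFuture.supplyAsync {
        Thread.sleep(250)
        "done"
    }
    var polls = 0
    val result = awaitWithProgress(slow, pollMillis = 50) { polls++ }
    println("$result after $polls progress reports")
}
```

With the transfer manager, `future` would be `fileDownload.completionFuture()` and `report` would print the current progress. This changes how the caller waits; it is not expected to affect the NullPointerException itself.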

Reproduction Steps

Use the S3TransferManager to request objects; occasionally a request will fail with the reported stack trace.
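Since the failure is sporadic, a small stress harness may make it easier to hit. The sketch below runs a download action from several threads at once and collects the exceptions it throws. `stressDownloads`, `clients`, `rounds` and the `download` lambda are hypothetical names; in practice `download` would call `downloadFileFromS3Aws` with a real bucket and key, whereas the stand-in here only simulates failures.

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit

// Hypothetical harness: runs `download` clients * rounds times on a pool of
// `clients` threads and returns the exceptions that were thrown.
fun stressDownloads(clients: Int, rounds: Int, download: (Int) -> Unit): List<Throwable> {
    val pool = Executors.newFixedThreadPool(clients)
    try {
        val tasks = (0 until clients * rounds).map { i ->
            Callable<Throwable?> { runCatching { download(i) }.exceptionOrNull() }
        }
        // invokeAll blocks until every task has finished.
        return pool.invokeAll(tasks).mapNotNull { it.get() }
    } finally {
        pool.shutdown()
        pool.awaitTermination(1, TimeUnit.MINUTES)
    }
}

fun main() {
    // Stand-in action that fails on every 5th call; replace with a real download.
    val failures = stressDownloads(clients = 4, rounds = 5) { i ->
        if (i % 5 == 0) throw IllegalStateException("simulated failure $i")
    }
    println("${failures.size} of 20 downloads failed")
    failures.forEach { println(it) }
}
```

Collecting the exceptions rather than stopping at the first one makes it possible to see how often the NullPointerException appears relative to the number of attempts.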

Possible Solution

Smells like a race condition.

Additional Information/Context

We believe this is a race condition, in part because it is not consistently reproducible, but also because of how we use the TransferManager:

We upload files to S3 in bulk, and then multiple (2-7) clients download those files within the span of a minute. We encounter the stack trace in the clients downloading these files.

AWS Java SDK version used

2.20.46

JDK version used

openjdk 17 2021-09-14 OpenJDK Runtime Environment (build 17+35-2724) OpenJDK 64-Bit Server VM (build 17+35-2724, mixed mode, sharing)

Operating System and version

arm64 macos Ventura 13.3.1 / x86 ubuntu Amazon Linux 5.4

Labels

guidance: Question that needs advice or information.
