
Multipart copy method #1590

Closed · wants to merge 8 commits

Conversation

nmuntyanov (Contributor)

Fixes #1589

@jderusse (Member) left a comment


You should split this PR in two:

  • adding the uploadPartCopy operation in S3
  • adding a new method in SimpleS3 that uses the new S3 method

Comment on lines 100 to 107
$bytePosition = 0;
$parts = [];
for ($i = 1; $bytePosition < $contentLength; ++$i) {
    $startByte = $bytePosition;
    $endByte = $bytePosition + $partSize - 1 >= $contentLength ? $contentLength - 1 : $bytePosition + $partSize - 1;
    $parts[] = $this->doMultipartCopy($destBucket, $destKey, $uploadId, $i, "{$srcBucket}/{$srcKey}", $startByte, $endByte);
    $bytePosition += $partSize;
}
Member


Suggested change
$bytePosition = 0;
$parts = [];
for ($i = 1; $bytePosition < $contentLength; ++$i) {
    $startByte = $bytePosition;
    $endByte = $bytePosition + $partSize - 1 >= $contentLength ? $contentLength - 1 : $bytePosition + $partSize - 1;
    $parts[] = $this->doMultipartCopy($destBucket, $destKey, $uploadId, $i, "{$srcBucket}/{$srcKey}", $startByte, $endByte);
    $bytePosition += $partSize;
}
$partIndex = 1;
$startByte = 0;
while ($startByte < $contentLength) {
    $endByte = min($startByte + $partSize, $contentLength) - 1;
    $parts[] = $this->doMultipartCopy($destBucket, $destKey, $uploadId, $partIndex, "{$srcBucket}/{$srcKey}", $startByte, $endByte);
    $startByte += $partSize;
    $partIndex++;
}
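
For illustration, here is how the suggested loop slices a hypothetical 12 MB object into 5 MB parts (the sizes are made up for the example; the bytes=first-last string is the range format UploadPartCopy expects for the copy-source range):

// Standalone illustration of the suggested byte-range computation.
$contentLength = 12 * 1024 * 1024; // 12582912 bytes (hypothetical object size)
$partSize = 5 * 1024 * 1024;       // 5242880 bytes per part

$partIndex = 1;
$startByte = 0;
while ($startByte < $contentLength) {
    $endByte = min($startByte + $partSize, $contentLength) - 1;
    echo "part {$partIndex}: bytes={$startByte}-{$endByte}\n";
    $startByte += $partSize;
    ++$partIndex;
}
// Prints:
// part 1: bytes=0-5242879
// part 2: bytes=5242880-10485759
// part 3: bytes=10485760-12582911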

for ($i = 1; $bytePosition < $contentLength; ++$i) {
    $startByte = $bytePosition;
    $endByte = $bytePosition + $partSize - 1 >= $contentLength ? $contentLength - 1 : $bytePosition + $partSize - 1;
    $parts[] = $this->doMultipartCopy($destBucket, $destKey, $uploadId, $i, sprintf('%s/%s', $srcBucket, $srcKey), $startByte, $endByte);
Member


Would it make sense to run this in parallel?

Contributor Author


Makes sense, but I have no idea how to implement this.

Contributor Author


How could this be run in parallel?

Member


Avoid consuming the response: when you access a property (i.e. call a getter), the call blocks until the response is fully processed.

Once again, I don't know whether it's better to run these requests in sequence or in parallel (running too many requests in parallel may hurt performance) => you need to check the AWS recommendations for this.

To process the requests in parallel you should do something like:

for (...) {
  $responses[] = $client->uploadPartCopy(...)
}

$success = true;
foreach ($responses as $response) {
  try {
    $copyPartResult = $response->getCopyPartResult();
    $parts[] = new CompletedPart(['ETag' => $copyPartResult->getEtag(), 'PartNumber' => $partNumber]);
  } catch (\Throwable $e) {
    $success = false;
    break;
  }
}

if (!$success) {
  $this->abortMultipartUpload(['Bucket' => $bucket, 'Key' => $key, 'UploadId' => $uploadId]);
  foreach ($responses as $response) {
    try {
      $response->cancel();
    } catch (\Throwable $e) {
      // ...
    }
  }

   throw ...;
}
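
Filling in the sketch above, a minimal end-to-end version could look roughly like this. It is only a sketch, not the final implementation: it assumes the helper lives in the same client class as the snippets above, that AsyncAws\S3\ValueObject\CompletedPart is imported, and it relies on the uploadPartCopy operation this PR introduces.

// Sketch of a helper that dispatches every UploadPartCopy request first
// (keeping them pending) and only consumes the results afterwards, so the
// HTTP requests run concurrently.
private function copyObjectInParts(string $srcBucket, string $srcKey, string $destBucket, string $destKey, string $uploadId, int $contentLength, int $partSize): array
{
    $responses = [];
    $partIndex = 1;
    $startByte = 0;
    while ($startByte < $contentLength) {
        $endByte = min($startByte + $partSize, $contentLength) - 1;
        // No getter is called yet, so the request is only dispatched here.
        $responses[$partIndex] = $this->uploadPartCopy([
            'Bucket' => $destBucket,
            'Key' => $destKey,
            'UploadId' => $uploadId,
            'PartNumber' => $partIndex,
            'CopySource' => "{$srcBucket}/{$srcKey}",
            'CopySourceRange' => "bytes={$startByte}-{$endByte}",
        ]);
        $startByte += $partSize;
        ++$partIndex;
    }

    $parts = [];
    try {
        foreach ($responses as $partNumber => $response) {
            // The first getter call blocks until this part copy has finished.
            $copyPartResult = $response->getCopyPartResult();
            $parts[] = new CompletedPart(['ETag' => $copyPartResult->getEtag(), 'PartNumber' => $partNumber]);
        }
    } catch (\Throwable $e) {
        $this->abortMultipartUpload(['Bucket' => $destBucket, 'Key' => $destKey, 'UploadId' => $uploadId]);
        foreach ($responses as $response) {
            try {
                $response->cancel();
            } catch (\Throwable $ignored) {
                // The response may already be resolved; nothing more to do.
            }
        }

        throw $e;
    }

    return $parts;
}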

Contributor Author


Got it!
Seems like the only limit is 10,000 parts (hence at most 10,000 connections):
https://docs.aws.amazon.com/AmazonS3/latest/userguide/qfacts.html
Anyway, I think 10k connections, even without a body, is too much.

The AWS SDK has a concurrency option for this kind of copy.
I'll see how to implement it here.
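
As a rough idea of how such a concurrency limit could be emulated here, the part copies could be dispatched in bounded batches, resolving each batch before starting the next. In this sketch the $concurrency value, the $ranges array of [startByte, endByte] pairs and the surrounding variables are illustrative only:

// Illustrative only: cap the number of in-flight UploadPartCopy requests.
$concurrency = 16; // hypothetical limit, to be tuned against AWS guidance
$parts = [];
$partNumber = 0;

foreach (array_chunk($ranges, $concurrency) as $batch) {
    $responses = [];
    foreach ($batch as [$startByte, $endByte]) {
        ++$partNumber;
        $responses[$partNumber] = $client->uploadPartCopy([
            'Bucket' => $destBucket,
            'Key' => $destKey,
            'UploadId' => $uploadId,
            'PartNumber' => $partNumber,
            'CopySource' => "{$srcBucket}/{$srcKey}",
            'CopySourceRange' => "bytes={$startByte}-{$endByte}",
        ]);
    }

    // Consuming the results blocks until the whole batch has completed,
    // so at most $concurrency copies are in flight at any time.
    foreach ($responses as $number => $response) {
        $parts[] = new CompletedPart([
            'ETag' => $response->getCopyPartResult()->getEtag(),
            'PartNumber' => $number,
        ]);
    }
}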

@stof (Member) commented Oct 25, 2023

Closing in favor of #1592

@stof closed this Oct 25, 2023
Development

Successfully merging this pull request may close these issues.

Support UploadPartCopy in async-aws/s3