AWS Developer Tools Blog

Pausing and Resuming transfers using Transfer Manager

One of the really cool features that TransferManager now supports is pausing and resuming file uploads and downloads. You can now pause a very large file upload and resume it later without re-uploading the bytes that have already been uploaded. This also helps you survive JVM crashes, because the operation can be resumed from the point at which it stopped.

During an upload or download, TransferManager tries to capture the information that is required to resume the transfer after a pause. This information is returned as the result of the pause operation.

Here is an example of how to pause an upload:

// Initialize TransferManager.
TransferManager tm = new TransferManager();

// Upload a file to Amazon S3.
Upload myUpload = tm.upload(myBucket, myKey, myFile);

// Sleep until at least 20 MB has been transferred to Amazon S3.
long MB = 1024 * 1024;
TransferProgress progress = myUpload.getProgress();
while( progress.getBytesTransferred() < 20*MB ) Thread.sleep(2000);

// Initiate a pause with forceCancelTransfers set to true.
// This cancels the upload if the upload cannot be paused.
boolean forceCancel = true;
PauseResult<PersistableUpload> pauseResult = myUpload.tryPause(forceCancel);
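
The polling loop above never exits if the transfer finishes or fails before 20 MB has been transferred. As a minimal sketch of a more defensive wait, the helper below also checks a "done" flag on every iteration. The ProgressWait class and awaitBytes method are hypothetical names, and the two suppliers stand in for progress::getBytesTransferred and myUpload::isDone so that the sketch compiles without the SDK on the classpath.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BooleanSupplier;
import java.util.function.LongSupplier;

final class ProgressWait {

    /**
     * Polls until bytesTransferred reaches threshold or done reports true.
     * Returns true if the threshold was reached, false if the transfer ended first.
     */
    static boolean awaitBytes(LongSupplier bytesTransferred, BooleanSupplier done,
                              long threshold, long pollMillis) throws InterruptedException {
        while (bytesTransferred.getAsLong() < threshold) {
            if (done.getAsBoolean()) {
                return false; // Transfer ended before the threshold was reached.
            }
            Thread.sleep(pollMillis);
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulated transfer that advances by 8 bytes on every poll.
        AtomicLong transferred = new AtomicLong();
        boolean reached = awaitBytes(() -> transferred.addAndGet(8), () -> false, 20, 10);
        System.out.println("Threshold reached: " + reached);
    }
}
```

With the SDK, a call might look like ProgressWait.awaitBytes(progress::getBytesTransferred, myUpload::isDone, 20 * MB, 2000), and a pause would only be attempted when it returns true.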

In some cases, it is not possible to pause the upload. For example, if the upload involves client-side encryption using AmazonS3EncryptionClient, TransferManager doesn’t capture the encryption context for security reasons and will not be able to resume the upload. In such cases, you can choose to cancel the upload by passing true as the forceCancelTransfers argument of Upload#tryPause(boolean). The status of the pause operation can be retrieved using PauseResult#getPauseStatus() and is one of the following:

  • SUCCESS – The upload was successfully paused.
  • CANCELLED – The upload could not be paused and was canceled, as requested by the caller.
  • CANCELLED_BEFORE_START – The pause was attempted before the upload started, and the upload was canceled, as requested by the caller.
  • NO_EFFECT – The pause operation had no effect on the upload. The upload continues to transfer data to Amazon S3.
  • NOT_STARTED – The pause operation had no effect on the upload because the upload has not yet started.
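
To make the status handling concrete, here is a minimal sketch of how calling code might branch on each value. So that it compiles without the SDK, it declares a local SketchPauseStatus enum that only mirrors the five names listed above; in real code you would switch on the value returned by PauseResult#getPauseStatus() instead. The PauseOutcome class, the nextStep method, and the returned strings are hypothetical.

```java
// Stand-in for the SDK's pause status values (names copied from the list above).
// Assumption: real code switches on PauseResult#getPauseStatus() instead.
enum SketchPauseStatus { SUCCESS, CANCELLED, CANCELLED_BEFORE_START, NO_EFFECT, NOT_STARTED }

final class PauseOutcome {

    /** Returns a short description of what the caller might do next. */
    static String nextStep(SketchPauseStatus status) {
        switch (status) {
            case SUCCESS:
                return "persist the resume information";
            case CANCELLED:
            case CANCELLED_BEFORE_START:
                return "start the upload again from the beginning";
            case NO_EFFECT:
                return "upload is still running; let it finish";
            case NOT_STARTED:
                return "upload has not started yet; pause again later";
            default:
                throw new IllegalArgumentException("Unknown status: " + status);
        }
    }

    public static void main(String[] args) {
        for (SketchPauseStatus status : SketchPauseStatus.values()) {
            System.out.println(status + " -> " + nextStep(status));
        }
    }
}
```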

On a successful upload pause, PauseResult#getInfoToResume() returns an instance of PersistableUpload that can be used to resume the upload operation at a later time. To persist this information to a file, use the following code:

// Retrieve the persistable upload from the pause result.
PersistableUpload persistableUpload = pauseResult.getInfoToResume();

// Serialize the persistable upload to a file.
// FileOutputStream creates the file if it does not exist, and
// try-with-resources closes the stream even if serialization fails.
File f = new File("resume-upload");
try (FileOutputStream fos = new FileOutputStream(f)) {
    persistableUpload.serialize(fos);
}

While Upload#tryPause(boolean) returns a PauseResult whether the pause operation succeeds or fails, there is also an Upload#pause() method that throws a PauseException if the upload cannot be paused.

Here is an example of how to resume an upload:

// Initialize TransferManager.
TransferManager tm = new TransferManager();

// Deserialize PersistableUpload information from disk.
// try-with-resources closes the stream even if deserialization fails.
PersistableUpload persistableUpload;
try (FileInputStream fis = new FileInputStream(new File("resume-upload"))) {
    persistableUpload = PersistableTransfer.deserializeFrom(fis);
}

// Call resumeUpload with PersistableUpload.
tm.resumeUpload(persistableUpload);

TransferManager skips the parts of the file that were uploaded previously and uploads the rest to Amazon S3.

Similar to the upload example, the following example pauses an Amazon S3 object download and persists the PersistableDownload to a file.

// Initialize TransferManager.
TransferManager tm = new TransferManager();

// Download the Amazon S3 object to a file.
Download myDownload = tm.download(myBucket, myKey, new File("myFile"));

// Sleep until at least 20 MB has been downloaded.
long MB = 1024 * 1024;
TransferProgress progress = myDownload.getProgress();
while( progress.getBytesTransferred() < 20*MB ) Thread.sleep(2000);

// Pause the download.
PersistableDownload persistableDownload = myDownload.pause();

// Serialize the persistable download to a file.
// FileOutputStream creates the file if it does not exist, and
// try-with-resources closes the stream even if serialization fails.
File f = new File("resume-download");
try (FileOutputStream fos = new FileOutputStream(f)) {
    persistableDownload.serialize(fos);
}

To resume a download, use the following code:

// Initialize TransferManager.
TransferManager tm = new TransferManager();

// Deserialize PersistableDownload from disk.
// try-with-resources closes the stream even if deserialization fails.
PersistableDownload persistDownload;
try (FileInputStream fis = new FileInputStream(new File("resume-download"))) {
    persistDownload = PersistableTransfer.deserializeFrom(fis);
}

// Call resumeDownload with PersistableDownload.
tm.resumeDownload(persistDownload);

TransferManager performs a range GET operation during the resumeDownload operation to download the remaining Amazon S3 object contents. An ETag can be used to validate only a whole Amazon S3 object, so ETag validation is skipped during the resumeDownload operation. Also, resuming a download of an object encrypted using CryptoMode.StrictAuthenticatedEncryption results in an AmazonClientException, because authenticity cannot be guaranteed for a range GET operation.
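
To illustrate what a range GET for the remaining contents looks like at the HTTP level, the sketch below builds a Range header value from the number of bytes already written locally and the total object size. This is not TransferManager's internal code: the RangeSketch class and remainingRange method are hypothetical, and only the bytes=first-last syntax (zero-based, inclusive) comes from the HTTP specification.

```java
final class RangeSketch {

    /**
     * Builds the HTTP Range header value for the bytes not yet downloaded.
     * Byte positions are zero-based and the last position is inclusive.
     */
    static String remainingRange(long bytesAlreadyDownloaded, long objectSize) {
        if (bytesAlreadyDownloaded < 0 || bytesAlreadyDownloaded >= objectSize) {
            throw new IllegalArgumentException("No remaining bytes to request");
        }
        return "bytes=" + bytesAlreadyDownloaded + "-" + (objectSize - 1);
    }

    public static void main(String[] args) {
        long MB = 1024 * 1024;
        // 20 MB of a 50 MB object are already on disk.
        System.out.println(remainingRange(20 * MB, 50 * MB)); // prints "bytes=20971520-52428799"
    }
}
```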

To support resuming uploads and downloads after a JVM crash, the PersistableUpload or PersistableDownload must be serialized to disk as soon as it is available. You can achieve this by passing an instance of S3SyncProgressListener that serializes the data to disk to TransferManager#upload or TransferManager#download. The following example shows how to serialize the data to a file without calling a pause operation.

// Initialize TransferManager.
TransferManager tm = new TransferManager();

PutObjectRequest putRequest = new PutObjectRequest(myBucket, myKey, myFile);

// Upload a file to Amazon S3.
tm.upload(putRequest, new S3SyncProgressListener() {

    ExecutorService executor = Executors.newFixedThreadPool(1);

    @Override
    public void onPersistableTransfer(final PersistableTransfer persistableTransfer) {

       executor.submit(new Runnable() {
          @Override
          public void run() {
              // FileOutputStream creates the file if needed;
              // try-with-resources closes it even if serialization fails.
              try (FileOutputStream fos = new FileOutputStream(new File("resume-upload"))) {
                  persistableTransfer.serialize(fos);
              } catch (IOException e) {
                  throw new RuntimeException("Unable to persist transfer to disk.", e);
              }
          }
       });
    }
});

As the name indicates, S3SyncProgressListener is executed in the same thread as the upload/download operation. It should return control to TransferManager quickly, because any time it spends in the callback affects the performance of the transfer. Note that the example above is for illustrative purposes only. In your own progress listener, you must avoid performing blocking operations such as writing to disk directly in the callback, which is why the example hands the write off to a separate executor thread.
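
Because the resume file matters most when the JVM dies unexpectedly, it can also help to make sure that a crash in the middle of a write never leaves a half-written file behind. The following is a minimal sketch of one common approach using only java.nio: write to a temporary file in the same directory, then move it over the target. AtomicResumeFile, write, and StreamWriter are hypothetical names. With the SDK, the writer passed in would presumably be persistableTransfer::serialize, and the call would run on the executor thread rather than in the listener callback.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class AtomicResumeFile {

    /** Hypothetical callback that writes the resume information to a stream. */
    interface StreamWriter {
        void writeTo(OutputStream out) throws IOException;
    }

    /** Writes to a temp file next to target, then moves it over target in one step. */
    static void write(Path target, StreamWriter writer) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "resume-", ".tmp");
        try {
            try (OutputStream out = Files.newOutputStream(tmp)) {
                writer.writeTo(out);
            }
            // ATOMIC_MOVE throws instead of falling back to a non-atomic copy.
            // Whether an existing target is replaced is implementation specific;
            // typical POSIX file systems replace it.
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            Files.deleteIfExists(tmp); // No-op after a successful move.
        }
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempDirectory("resume-demo").resolve("resume-upload");
        write(target, out -> out.write("state".getBytes(StandardCharsets.UTF_8)));
        System.out.println(Files.size(target) + " bytes written to " + target);
    }
}
```

A reader of the resume file then sees either the previous complete version or the new complete version, never a partial one.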

Do you like the new Pause and Resume functionality supported by TransferManager? Let us know your feedback in the comments.