Secure file uploads with redux-plupload, ClamAV and S3

by Rachel Newstead

We have recently added a new feature that allows a user to upload a file from our webpage. We implemented this using redux-plupload, ClamAV and S3 to satisfy the following requirements:

  • the file should be uploaded from the client to avoid excessive memory use on the server while streaming files.
  • the upload must be secure and the file must be stored securely (and ideally encrypted at rest).
  • the file should be virus free so that it can be downloaded without worry.

Business context

We have been working on a new service to handle job application forms. The ultimate goal is to drive schools to use our shiny new electronic application form, making the process consistent, smooth and quicker for applicants. However, we also had to keep supporting existing application form types so that we could shut off the old applications service rather than implement conditional logic around which service to call. One of these application types involves an applicant uploading their CV, which is then emailed to the employer as a link when the application is submitted.

Secure file storage

We use Amazon’s S3 service to store documents. We tell Amazon to encrypt the data it stores for us (encryption at rest), and our S3 bucket is configured to reject any request that does not specify this encryption.

We upload to the S3 bucket via an HTTPS request from the client, using parameters that the server generates uniquely for each request. These include a unique file name (the key) that S3 uses to store the document and that we can later use to retrieve it, and the AWS access key Id, which identifies the secret access key held on the server. They also include a policy, which states facts about the upload request and the parameters that will be sent with it, and a signature, which is an HMAC of the base64-encoded policy computed with the secret access key. When Amazon receives the HTTPS request it will check that:

  • the signature was produced with the secret key that matches the access key Id
  • the signature matches the policy sent with the request (so the policy cannot be tampered with)
  • the file key in the policy matches the one in the request
  • the facts about the request parameters specified in the policy are met

The code to generate these parameters on the server looks like this:

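  const crypto = require('crypto');
  const uuid = require('uuid');

  const THIRTY_MINUTES_IN_MS = 30 * 60 * 1000;

  // awsConfig and config are assumed to come from the service's configuration.
  const getMultipartParams = () => {
    const now = Date.now();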

    const key = uuid.v4();
    const policy = {
      expiration: new Date(now + THIRTY_MINUTES_IN_MS).toISOString(),
      conditions: [
        { bucket: awsConfig.bucket },
        ['starts-with', '$name', ''],
        { key },
        ['starts-with', '$Content-Type', ''],
        ['starts-with', '$Content-Disposition', ''],
        ['content-length-range', 1, config.maxFileSize],
        { 'x-amz-server-side-encryption': 'AES256' },
        ['starts-with', '$redirect', ''],
      ],
    };

    // Buffer.from replaces the deprecated `new Buffer(...)` constructor.
    const base64Policy = Buffer.from(JSON.stringify(policy)).toString('base64');
    const hmac = crypto.createHmac('sha1', awsConfig.secretAccessKey);
    hmac.update(base64Policy);
    const signature = hmac.digest('base64');

    return {
      key,
      'Content-Type': '',
      AWSAccessKeyId: awsConfig.accessKeyId,
      policy: base64Policy,
      signature,
      'x-amz-server-side-encryption': 'AES256',
      redirect: '',
    };
  };
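To make the checks listed above more concrete, here is a minimal sketch of how a receiver could verify such a signature: it recomputes the HMAC over the base64 policy with the secret key and compares the result with the signature it was sent. This is only an illustration of the idea, not Amazon's actual implementation; signPolicy, isSignatureValid and the example secret and policies are hypothetical.

```javascript
const crypto = require('crypto');

// Hypothetical helper: sign a base64 policy the same way getMultipartParams does.
const signPolicy = (base64Policy, secretAccessKey) =>
  crypto.createHmac('sha1', secretAccessKey).update(base64Policy).digest('base64');

// Hypothetical receiving side: recompute the HMAC and compare it with the
// signature that arrived with the request.
const isSignatureValid = (base64Policy, signature, secretAccessKey) => {
  const expected = Buffer.from(signPolicy(base64Policy, secretAccessKey));
  const actual = Buffer.from(String(signature));
  // timingSafeEqual throws if the lengths differ, so check the lengths first.
  return expected.length === actual.length && crypto.timingSafeEqual(expected, actual);
};

const toBase64 = (policy) => Buffer.from(JSON.stringify(policy)).toString('base64');

const secret = 'example-secret'; // placeholder, not a real key
const policy = toBase64({ conditions: [{ key: 'abc' }] });
const tamperedPolicy = toBase64({ conditions: [{ key: 'xyz' }] });
const signature = signPolicy(policy, secret);

console.log(isSignatureValid(policy, signature, secret));         // true
console.log(isSignatureValid(tamperedPolicy, signature, secret)); // false
```

Because the signature depends on both the policy and the secret key, a client that alters the policy (for example to change the file key or raise the size limit) cannot produce a matching signature without the secret.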

User interface

We wanted a simple user interface where the user can choose to upload a file by clicking a button. Our application uses React and Redux, and we opted to use Plupload for the upload itself. Plupload was already being used on another part of the Tes website and a team had started working on redux-plupload. Redux-plupload is an open source library providing a set of Redux bindings for Plupload. It wraps Plupload’s actions as Redux actions, which means we can follow the familiar action-middleware-reducer pattern of Redux.

We built a stateful React component to initialise a plupload upload instance in componentDidMount:

  componentDidMount() {
    const options = {
      browse_button: this._browseButton,
      filters: [
        { title: 'Documents', extensions: 'pdf,doc,docx,rtf' },
      ],
      multi_selection: false,
      handle: this.props.field.name,
    };
    const uploaderOptions = Object.assign(options, this.props.uploader.params);
    this.props.uploader.onInit(uploaderOptions);
  }

The redux-plupload INIT action is dispatched by the this.props.uploader.onInit function. The plupload instance is initialised with some defaults such as the URL to upload to and the max_file_size parameter. When the user picks a file via the browse button, the redux-plupload FILES_ADDED action is dispatched and we first make a request to the server for multipart_params. In our case these include the S3 policy, signature and file key. They are set as unique parameters for each request via the redux-plupload SET_OPTION action. We then dispatch the redux-plupload START action, which tells plupload to upload the file to our specified URL.

The client code looks like this:

require('es6-promise').polyfill();
require('isomorphic-fetch');
const { ActionTypes: UploadActions } = require('redux-plupload');
const { ActionTypes: Actions } = require('./types');

module.exports.startUpload = (multipartParamsEndpoint, fieldName) =>
  (dispatch) => {
    const dispatchSetMultipartParamsAction = (value) => dispatch({
      type: UploadActions.SET_OPTION,
      payload: { option: 'multipart_params', value },
    });

    const dispatchStartUploaderAction = () => dispatch({ type: UploadActions.START });

    // Note: `timeout` is not a standard fetch option; whether it is honoured
    // depends on the fetch implementation in use.
    return fetch(multipartParamsEndpoint, {
      method: 'GET',
      credentials: 'same-origin',
      timeout: 2000,
      headers: { Accept: 'application/json', 'Content-Type': 'application/json' },
    })
      .then((res) => {
        if (!res.ok) {
          throw new Error(`Failed to get params for upload. Status: ${res.status}`);
        }
        return res.json();
      })
      .then((params) => dispatchSetMultipartParamsAction(params))
      .then(dispatchStartUploaderAction);
  };

Virus checking

Rather than uploading directly to S3, we proxy the request via a service running ClamAV. We tell the ClamAV service the name of the S3 bucket we want to upload to in each request. The ClamAV service first authenticates that the request has come from a known Tes service, then scans the file for viruses. If it finds a problem with the file it responds with a status code of 499. On the client we handle this response code by informing the user that there is a problem with their file. If the file is good, the ClamAV service forwards the request on to the specified S3 bucket, and then proxies the response back to us.

Testing the whole flow with a real virus was difficult, as Mac antivirus software is very good at automagically removing problematic files from your machine. We ended up writing a stub that returns the 499 error for testing.
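As a rough sketch of how the client might interpret the proxy's response: the 499 status comes from the description above, while describeUploadResult, the message wording and the treatment of every other status code are assumptions made for illustration.

```javascript
// Status the ClamAV service returns when it finds a problem with the file.
const VIRUS_DETECTED_STATUS = 499;

// Hypothetical helper: turn the HTTP status of the proxied upload into an
// outcome the UI can display. The messages are placeholders.
const describeUploadResult = (status) => {
  if (status === VIRUS_DETECTED_STATUS) {
    return { ok: false, message: 'There is a problem with your file. Please try a different one.' };
  }
  // Assumption: any 2xx or 3xx status (such as the 204 from S3) counts as success.
  if (status >= 200 && status < 400) {
    return { ok: true, message: null };
  }
  return { ok: false, message: 'The upload failed. Please try again.' };
};

console.log(describeUploadResult(204).ok); // true
console.log(describeUploadResult(499).ok); // false
```

Keeping this mapping in a small pure function also makes the virus case easy to cover in unit tests, without needing a real infected file.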

Validation

When the client receives a success response it calls the backend service to validate the upload. The backend uses its AWS credentials to make a headObject request to S3 and checks, using the Joi validation library, that the ContentLength of the file does not exceed our specified maximum file size.

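  const Promise = require('bluebird'); // promisifyAll is Bluebird's API
  const AWS = require('aws-sdk');
  const joi = require('joi');

  const validateUpload = (key) =>
      Promise.promisifyAll(new AWS.S3(awsConfig)).headObjectAsync({ Bucket: awsConfig.bucket, Key: key })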
        .then((data) => {
          const schema = joi.object({
            ContentLength: joi.number().max(config.maxFileSize),
          });
          return joi.validate(data, schema, { allowUnknown: true }, (error) => {
            if (error) {
              return { valid: false, error };
            }
            return { valid: true };
          });
        });

Successful validation dispatches a custom UPLOAD_VALIDATED action and we display the filename to the user (with the option to remove the upload if they wish).

Making it work in IE

We have a number of users on IE9 and had to do a few magic tricks to make it work there. Plupload supports multiple runtimes, and IE9 uses the HTML4 runtime rather than the HTML5 runtime that modern browsers use. In the HTML4 runtime, plupload wraps the upload component in an iframe and the upload is done through the iframe. The success or failure response is then returned to the iframe rather than the outer page.

The default S3 behaviour is to respond with a 204 when the file has been successfully uploaded. Unfortunately there is a bug in IE9 and below where 204s are internally mapped to code 1223, and the request is then aborted by the browser. To get around this problem we tell S3 to redirect the response to the current page instead of returning a 204. This is defaulted in redux-plupload. It means we always have to include ['starts-with', '$redirect', ''] in our AWS policy and set the redirect parameter to an empty string for all non-HTML4 cases. AWS will ignore the empty string and respond with its default of 204.
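The choice of redirect value can be sketched as a small function. Since redux-plupload provides this default itself, the snippet only illustrates the logic; redirectParamFor is a hypothetical name, and it assumes the active runtime is reported as the string 'html4'.

```javascript
// Hypothetical helper: pick the value of the `redirect` upload parameter.
// Assumption: the active plupload runtime is reported as 'html4' in old IE.
const redirectParamFor = (runtime, currentUrl) =>
  (runtime === 'html4' ? currentUrl : '');

// HTML4 runtime: ask S3 to redirect back to the current page instead of a 204.
console.log(redirectParamFor('html4', 'https://example.com/apply')); // https://example.com/apply

// Any other runtime: send an empty string, so S3 responds with its default 204.
console.log(JSON.stringify(redirectParamFor('html5', 'https://example.com/apply'))); // ""
```

Either value satisfies the ['starts-with', '$redirect', ''] condition, because every string starts with the empty string.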


This article is now up on Hacker News; feel free to discuss it there.