Getting Started with Kloudless Webhooks

Introduction

There are times when your application needs to keep up with changes that take place in your users’ cloud storage. Perhaps your application needs to perform an action when new files appear, or needs to update its state when files are modified/deleted. Kloudless solves this problem by making an HTTP request (a “webhook“) to a URL you define, with information on which accounts have changed. You can then query the events endpoint for more detailed information on modified files in the account.

The first release of Kloudless’ events endpoint supports Dropbox, Box, OneDrive, Evernote, Sharepoint, and OneDrive for Business. While some services like Box expose an endpoint for listing events, others such as SharePoint and OneDrive do not. Kloudless supports these services by periodically polling them for changes, allowing you to have a consistent interface to query all the services for changes.

This short walkthrough gets you up and running with Kloudless’s Webhooks. It assumes that:

  • You already have a Kloudless account and app set up (if you don’t, you can get one here)
  • You know some Python (we will be using Flask to create a basic webserver)
  • You are familiar enough with HTTP to make/understand requests
  • You can run code somewhere accessible by the Kloudless server requests

Configuring Webhooks

The webhooks for your application can be configured on the Application Details page of the developer dashboard. All you have to do is add the URL that you would like Kloudless to make requests to. Note: before the webhook is saved, Kloudless makes a test request to it to ensure that it works, so if your application isn’t up and running yet, you won’t be able to create the webhooks.

Webhooks

The Structure of the Webhooks

The requests that we make for our webhooks are extremely simple. Here is the raw HTTP of an example request Kloudless might make to the above endpoint:

POST /kloudless-webhook HTTP/1.1
Host: yourapp.example.com
Content-Length: 14
Accept-Encoding: gzip, deflate, compress
Accept: */*
User-Agent: kloudless-webhook/1.0
X-Kloudless-Signature: VPl14EUveHQ7HcAvigiRZ7dmX7QLM5iD+pnXJv2XYok=
Content-Type: application/x-www-form-urlencoded

account=593652

As you can see it is a standard http form. We provide the id of the account that we have new events for. There will only be one account per notification. The primary purpose is to inform you that new events are available from the events endpoint. We also include a signature so that you can verify that the event actually came from our service.

Our service expects a 200 OK, with the response body consisting of only your Kloudless Application ID (available on the Developer Portal’s App Details page).

A Simple WebHooks Receiver

Using Flask we can quickly build a simple web server that can receive these notifications, retrieve the corresponding events, and do something interesting. For an extremely simple server you can use for testing purposes, check out this Gist.

Here, we will describe a more useful receiver that verifies the webhook sender and launches a task. We start with the event receiver:

#!/usr/bin/env python
from flask import Flask, request, abort

import hmac
import base64
import hashlib
import threading

app = Flask('demo_app')
APP_ID = ''
API_KEY = ''
@app.route('/webhook', methods=['POST'])
def receiver():
    # Verify the Request Signature
    digest = hmac.new(API_KEY, msg=request.data, 
                      digestmod=hashlib.sha256).digest()
    sig = base64.b64encode(digest)
    if not sig == request.headers.get('X-Kloudless-Signature'):
        abort(403)

    # Since the request is verified we can actually do something
    account_id = request.form.get('account')
    if account_id:
        threading.Thread(target=process_account, args=(account_id,)).start()

    # In order to verify that the endpoint is alive you should always return
    # your application id with a 200 OK response
    return APP_ID

if __name__ == 'main':
    app.run()

This basic structure allows you to insert just about any functionality into the process_account function. So that you don’t fetch the same events over and over again, you will have to keep track of where you are in the event stream for each account. Each time you make a request to the events endpoint you get back a cursor value that marks your current place. To track our current progress through the event stream, we are going to store the cursors for each account id in [redis])(http://redis.io). You could use your preferred data store, but redis is simple and quick to get started with. If you want to fetch new events and then process added files, write a function similar to the following (this uses python requests so that it is easier to translate to other languages):

import requests
import redis

redis_client = redis.StrictRedis(host='localhost', port=6379, db=0)

def process_account(account_id):
    # if there is an existing cursor we want to make sure we use it
    cursor = redis_client.get(account_id)
    params = {"cursor": cursor}
    headers = {"Authorization": "ApiKey %s" % API_KEY}
    url = "https://api.kloudless.com/v1/accounts/%s/events" % account_id
    response = requests.get(url, headers=headers, params=params)

    if response.ok:
        data = response.json()
        events = data['objects']
        new_cursor = data['cursor']
        for event in events:
            if event['type'] == 'add': # Only care about new/modified files
                do_something_cool(event['metadata']['id'])
        # updating cursor in datastore
        redis_client.set(account_id, new_cursor)

def do_something_cool(file_id):
    print "There is a new file: %s" % file_id

In this case, you would put your interesting functionality in the do_something_cool function. For example, you could do some sort of document conversion or content auditing with the file id that you get. The event contains the full file metadata so you can make other decisions about what to do based on extension, parent folder, or mime-type.

Best Practices

There are a few things that should be done to ensure that your Kloudless Webhooks integration works as well as possible.

Handle WebHooks Quickly

The initial handling of the webhooks requests should be done as quickly as possible. Otherwise, a high volume of requests could easily overwhelm a naive implementation that does a large amount of work prior to sending a response to the webhook requests.

Manage Concurrency

When there are lots of changes occurring for a particular account, a large number of notifications could be sent to your server. Your application should make sure that this doesn’t cause problems. For certain types of applications, the duplicate actions would not cause any issues. However, if you have non-idempotent actions or actions that are particularly expensive in terms of computation time, then make sure there is locking, leasing, or mutual exclusion for that user for the duration of the action. This ensures that duplicate work isn’t being done while the action is running.

There are other design practices for event-driven applications, but they are beyond the scope of this post.

Conclusions

Hopefully this will have provided you with a good introduction to how to take advantage of Kloudless Webhooks so that you can build more effective event-driven applications around your user’s cloud storage. Please email support@kloudless.com or chat with us on Stack Overflow if you have any feedback, questions, or insights about building your application with Kloudless Webhooks.

What're your thoughts!

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s