Building End-to-End Encrypted Chat with E3Kit and Stream

Building End-to-End Encrypted Chat with E3Kit and Stream

Rebecca Yarbrough — December 11th, 2019

Thanks to our friends at Stream for authoring and originally publishing this guide on 11/11/2019.

As banking technology advances, secure real-time communication is becoming increasingly crucial to any modern banking application. It's essential to provide the technology experiences customers expect while protecting their privacy and data.

In this tutorial, we will walk through how to create a full, end-to-end encrypted chatbot solution using Stream Chat combined with Dialogflow using Virgil Security for encryption. Combining these services allows developers to create a modern chatbot experience while keeping sensitive information encrypted. The application embeds Virgil Security's E3Kit with Stream Chat React's components The react app communicates with a backend that uses Stream's webhook integration and Dialogflow to respond to the user. All source code for this application is available on GitHub.

Stream Chat, Virgil, and Dialogflow make it easy to build a solution with excellent security with all the features you expect.

Note: The original GitHub repo can be found here.

What Is End-To-End Encryption?

End-to-end encryption means that they can only the sender and recipients of a message can read it. To do this, the message is encrypted before it leaves a user's device and is only decrypted when it reaches the intended recipient’s device.

Virgil Security is a vendor that will enable us to create end-to-end encryption via public/private key technology. Virgil provides a platform and JavaScript SDK that will enable us to implement robust end-to-end secure encryption.

During this tutorial, we will create a Stream Chat app that uses Virgil's encryption to prevent anyone except the intended parties from reading messages. No one in your company, nor any cloud provider you use, can read these messages. Even if a malicious person gained access to the database containing the messages, all they would see is encrypted text, called ciphertext.

Building the Chatbot

To build this application, we're going to rely on a few libraries, Stream React Chat, Virgil E3Kit, Virgil SDK, Virgil Crypto, and Google Dialogflow. Our final product will encrypt text in the browser before sending a message to Stream Chat. The encrypted message will be relayed to our backend via Stream's webhooks. Decryption and verification will happen on the backend before passing it to Dialogflow for interpretation. Once an intent has been determined, the backend performs any necessary actions, encrypts the response, and relays it via a Stream channel.

Our chatbot will have 3 intents a user can perform, with 1 fallback in case we don't understand what is said. These are a greeting, check balances, and transfer money between accounts. Once we're done we'll have a chatbot that's capable of this:

Stream Chat Example

To accomplish this, the app performs the following process:

  1. A user authenticates with your backend.
  2. The user's app requests a Stream auth token and api key from the backend. The browser creates a Stream Chat Client for that user.
  3. The user's app requests a Virgil auth token from the backend and registers with Virgil. This generates their private and public key. The private key is stored locally, and the public key is stored in the Virgil Cloud.
  4. The user joins a Stream Chat Channel with the chatbot.
  5. The browser app asks Virgil for chatbot's public key.
  6. The user types a message and sends it to stream. Before sending, the app uses the chatbot's public key and Virgil E3Kit to encrypt the message. The message is relayed through Stream Chat to the backend via a webhook. Stream receives ciphertext, meaning they can never see the original message.
  7. When the backend receives the message, the app decrypts the message using Virgil E3Kit. Virgil verifies the message is authentic by using the sender's public key.
  8. The backend passes the decrypted text to Dialogflow to determine the user's intent. Dialogflow returns a result that contains the information necessary for the backend to decide how to respond.
  9. The backend receives the Diagflow response, decides what action to take, and creates the response text.
  10. Using Virgil E3Kit, the backend encrypts the response text and responds to the user via the Stream Chat Channel. The client decrypts the message.

This looks intimidating, but luckily Stream, Virgil, and Dialogflow do the heavy lifting for us. As a developer using these services, our responsibility is to wire them together correctly.

The code is split between the React frontend contained in the frontend folder, and the Express (Node.js) backend is found in the backend folder. See the README.md in each folder to see installing and running instructions. If you'd like to follow along with running code, make sure you get both the backend and frontend running before continuing.

Let's walk through and look at the necessary code needed for each step.

Prerequisites

Basic knowledge of React and Node.js is required to follow this tutorial. This code is intended to run locally on your machine.

You will need an account with Stream, Virgil Security, and Google Dialogflow. Dialogflow is a little tricky, so follow the instructions found in their nodejs library and how to authenticate with Google's cloud APIs Once you've created your accounts, place your credentials in backend/.env. You can use backend/.env.example as a reference for what credentials are required.

This tutorial uses the following package versions:

  1. Node 11.14.0
  2. Yarn 1.17.0
  3. Stream Chat 0.13.3
  4. Stream Chat React 0.6.26
  5. Virgil Crypto 3.2.0
  6. Virgil SDK 5.3.0
  7. Virgil E3Kit 0.5.3
  8. Dialogflow 0.12.2
  9. Express 4.17.1

Except for node and yarn, all of these dependencies are declared in backend/package.json and frontend/package.json.

Step 1.1 Set up Dialogflow

For our chatbot to respond correctly, we need to set up a few intents and one entity in our Dialogflow:

  1. Add a "Check Accounts" intent. Click on "Intents" in the nav, and click "Create Intent". For this intent, all we need is a few training phrases that indicate what sort of phrase means the user wants to check their account balances. Make sure to add some responses in the "Responses" section. This intent should look something like this:

End to end Encryption with Virgil and Stream Chat - Check accounts

  1. Add a "Transfer" entity. Click on "Entities" in the nav, and click "Create Entity". Name it "transfer" and create two synonyms, "checking to savings" and "savings to checking". This is a straightforward intent used for demonstration purposes only. Be sure to read up on how to create more sophisticated entities:

End to end Encryption with Virgil and Stream Chat - transfer

  1. Now add a "Transfer Money" intent. This one is a bit more complex, as it'll need some parameters identified to work correctly. Add two training phrases, "transfer 300 dollars from savings to checking" and "transfer 300 dollars from checking to savings". Dialogflow should automatically detect the currency entity "300 dollars" and add that parameter. Highlight the transfer words in each phrase (for example, "savings to checking") and set the entity type to "transfer". Make sure to add whatever response you'd like. Your final result should look like this:

Stream Chat transfer money

If you're set up correctly you should see these four intents in your Dialogflow console:

End-to-end Encryption with Virgil and Stream Chat intents

Step 1.2 Set up Stream Webhooks

For us to monitor and respond to a user's chat message, we need to hook into Stream via webhooks. From your Stream dashboard, navigate to Chat -> Chat overview and look for the "Chat events" section. Switch the webhook to active and add the URL for your server. For local development, you can use a service like ngrok to make your localhost routable online. The path we'll use is /v1/message to handle all Stream events. For convenience, we'll turn off auth/permission checks. In a production environment, make sure you don't bypass these and implement the necessary code to secure your Stream account. Your webhook should look like this with your, ngrok, or otherwise, URL instead of the ngrok URL.

End to end Encryption with Virgil and Stream Chat - webhook

We'll look at the implementation of /v1/message in Step 9.

Step 2. Set up the Backend to Allow User to Get Credentials

For our React frontend to interact with Stream and Virgil, the application provides three endpoints:

  1. POST /v1/authenticate: This endpoint generates an auth token that allows the React frontend to communicate with /v1/stream-credentials and /v1/virgil-credentials. To keep things simple, this endpoint allows the client to be any user. The frontend tells the backend whom it wants to authenticate as. In your application, this should be replaced with your API's authentication endpoint.
  2. POST /v1/stream-credentials: This returns the data required for the React app to establish a session with Stream. In order return this info we need to tell Stream this user exists and ask them to create a valid auth token:
// backend/src/controllers/v1/stream-credentials.js
import { chat } from '../../stream';

exports.streamCredentials = async (req, res) => {
try {
    const data = req.body;

    const user = Object.assign({}, data, {
    id: req.user.sender,
    role: 'user',
    image: `https://robohash.org/${req.user.sender}`,
    });

    const token = chat.createToken(user.id);
    await chat.updateUsers([user]);

    res.status(200).json({ user, token, apiKey: process.env.STREAM_API_KEY });
} catch (error) {
    console.log(error);
    res.status(500).json({ error: error.message });
}
};

The response payload has this shape:

{
    "apiKey": "<string>",
    "token": "<string>",
    "user": {
        "id": "<string>",
        "role": "<string>",
        "image": "<string>"
    }
}
  1. apiKey is the stream account identifier for your Stream instance. Needed to identify what account your frontend is trying to connect with.

  2. token: JWT token to authorize the frontend with Stream.

  3. user: This object contains the data that the frontend needs to connect and render the user's view.

  4. POST /v1/virgil-credentials: This returns the authentication token used to connect the frontend to Virgil. We use the Virgil Crypto SDK to generate a valid auth token for us:

// backend/src/virgil.js
const virgilCrypto = new VirgilCrypto();

const generator = new JwtGenerator({
appId: process.env.VIRGIL_APP_ID,
apiKeyId: process.env.VIRGIL_KEY_ID,
apiKey: virgilCrypto.importPrivateKey(process.env.VIRGIL_PRIVATE_KEY),
accessTokenSigner: new VirgilAccessTokenSigner(virgilCrypto)
});

exports.virgilToken = (user) => generator.generateToken(user);

// backend/src/controllers/v1/virgil-credentials.js
import { virgilToken } from '../../virgil';

exports.virgilCredentials = (req, res) => {
const virgilJwtToken = virgilToken(req.user.sender);

res.json({ token: virgilJwtToken.toString() });
};

In this case, the frontend only needs the auth token.

Step 3. User Authenticates with the Backend

Now that we have our backend set up and running, it is time to authenticate with the backend. If you're running the application, you'll be presented with a screen like so:

Stream Chat registration

This is a pure React form that takes the provided input, stores it in the state as sender, and uses that information to authenticate against the backend:

// frontend/src/StartChat.js
post("http://localhost:8080/v1/authenticate", { sender: this.state.sender })
  .then(res => res.authToken)
  .then(this._connect);

Once we have created a sender identity with an auth token, we can connect to Stream and Virgil.

Step 4. User Connects to Stream

Using the credentials from Step 3, we can request Stream credentials from the backend. Using those we connect our frontend client to Stream:

// frontend/src/StartChat.js
const response = await post("http://localhost:8080/v1/stream-credentials", {}, backendAuthToken);

const client = new StreamChat(response.apiKey);
client.setUser(response.user, response.token);

This initializes the StreamChat object from the Stream Chat React library and authenticates a user using the token generated in the backend.

Step 5. User Connects to Virgil

Once again, using the credentials acquired in Step 3 we ask the backend to generate a Virgil auth token. Using this token, we initialize the EThree object from Virgil's E3Kit SDK:

// frontend/src/StartChat.js
const response = await post("http://localhost:8080/v1/virgil-credentials", {}, backendAuthToken);
const eThree = await EThree.initialize(() => response.token);
await eThree.register();

Step 6. Create Stream Chat Channel

Once we're connected to both Stream and Virgil, we're ready to start chatting with our chatbot. To do this, the client creates a channel between them and the chatbot.

// frontend/src/StartChat.js
const channel = this.state.stream.client.channel('team', `${this.state.sender}-chatbot`, {
  image: `https://getstream.io/random_svg/?id=rapid-recipe-0&name=${this.state.sender}`,
  name: this.state.sender,
  members: [this.state.sender, 'chatbot'],
});

The client we're accessing in the state is the one created in Step 4. Calling .channel will create or join a unique channel based on the identities of the members. Only the user and the chatbot will be allowed in. However, this is not enough to protect Stream or others from viewing those users' messages. Next, we'll use Virgil to encrypt the messages.

Step 7. Lookup Virgil Public Keys

To encrypt a message before sending it through a Stream channel, we need to look up the receiver's public key:

// frontend/src/StartChat.js
const publicKeys = await this.state.virgil.eThree.lookupPublicKeys([this.state.sender, 'chatbot']);

The eThree instance in our state is from Step 5. Assuming that the sender's identity is will, this returns an object that looks like:

{
  will: {/* Public Key Info */},
  chatbot: {/* Public Key Info */}
}

Since we need to decrypt our own messages for display, and convenience, we ask for both public keys at the same time.

Step 8. Sender Encrypts Message and Sends It via Stream

We have everything we need to send a secure, end-to-end encrypted message via Stream. Time to chat! First, we need to show the user the chat room:

// frontend/src/App.js
<Chat client={this.state.stream.client} theme={'messaging light'}>
  <Channel channel={this.state.stream.channel}>
    <Window>
      <ChannelHeader/>
      <MessageList Message={this._buildMessageEncrypted}/>
      <MessageInputEncrypted virgil={this.state.virgil} channel={this.state.stream.channel}/>
    </Window>
    <Thread/>
  </Channel>
</Chat>

This renders the Stream React Chat component that creates a great out-of-the-box experience for our users. If you're following along you'll see this:

End to end Encryption with Virgil and Stream Chat - empty chat

Notice the line where we include our custom class MessageInputEncrypted. This component uses the sender's public key from Virgil to encrypt, then wrap, a Stream React MessageInput component before sending the message over the Stream channel:

// frontend/src/MessageInputEncrypted.js
export class MessageInputEncrypted extends PureComponent {
  sendMessageEncrypted = async (data) => {
    const encryptedText = await this.props.virgil.eThree.encrypt(data.text, this.props.virgil.publicKeys);
    await this.props.channel.sendMessage({
      ...data,
      text: encryptedText
    });
  };

  render = () => {
    const newProps = {
      ...this.props,
      sendMessage: this.sendMessageEncrypted
    };

    return <MessageInput {...newProps} />
  }
}

Now all Stream will see is the ciphertext!

Step 9. The Backend Receives a Webhook from Stream, Sends It to Dialogflow and Responds

Now we can react to the sender's message on the backend, figure out the user's intent via Dialogflow, perform an action if any, and respond. Since Stream sends us every even that happens for our account, we need first to decide if we should take action. We only want to do something when we have a message.new event from any user except for chatbot:

// backend/src/controllers/v1/message.js
exports.message = async (req, res) => {
  try {
    const data = req.body;
    const userId = data['user']['id'];
    if (data['type'] === 'message.new' && userId !== 'chatbot') {
      respondToUser(data);
    }
    res.status(200).json({});
  } catch (error) {
    console.log(error);
    res.status(500).json({ error: error.message });
  }
};

Once we have decided this a message to respond to, we need to decrypt the message, interpret it, decide how to respond, encrypt the response, and send it via the chat channel.

// backend/src/controllers/v1/message.js
const respondToUser = async (data) => {
    const userId = data['user']['id'];

    const eThree = await getEThree();
    const publicKey = await eThree.lookupPublicKeys(userId);
    const channel = chat.channel('team', `${userId}-chatbot`, {});

    const result = await interpretMessage(eThree, publicKey, data);
    const response = await handleMessage(userId, result);

    const encryptedText = await eThree.encrypt(response, publicKey);
    const message = {
        text: encryptedText,
        user: { id: 'chatbot' },
    };

    await channel.sendMessage(message);
};

To interpret the message, we use the Dialogflow setup configured in Step 1.1. We decrypt the user's message and send the decrypted message to Dialogflow:

// backend/src/controllers/v1/message.js
const interpretMessage = async (eThree, publicKey, data) => {
  const userId = data['user']['id'];
  const message = await eThree.decrypt(data['message']['text'], publicKey);

  const sessionClient = new dialogflow.SessionsClient();
  const sessionPath = sessionClient.sessionPath(process.env.GOOGLE_APPLICATION_PROJECT_ID, sessions[userId]);

  const responses = await sessionClient.detectIntent({
    session: sessionPath,
    queryInput: {
      text: {
        text: message,
        languageCode: 'en-US',
      },
    },
  });

  return responses[0].queryResult;
};

Once we've interpreted the message, we can decide how to respond and what actions to take. In this simple app, we have two explicit actions we care about, "Check Accounts" and "Transfer Money". Otherwise, we fallback to the fullfillmentText configured in Dialogflow. This will either be from the Default Welcome Intent or Default Fallback Intent intents.

In the case of "Check Accounts", we look up the user's account balances and respond. For "Transfer Money", we determine the direction, perform the balance transfer then respond:

// backend/src/controllers/v1/message.js
const handleMessage = (userId, result) => {
  let text = '';

  if (result.intent.displayName === 'Check Accounts') {
    text = `Here are your balances\nChecking: $${balances[userId].checking}\nSavings: $${balances[userId].savings}`;
  } else if (result.intent.displayName === 'Transfer Money') {
    const parameters = struct.decode(result.parameters);
    const transfer = parameters.transfer;
    const amount = parameters.amount.amount;
    if (transfer === 'checking to savings') {
      balances[userId].checking -= amount;
      balances[userId].savings += amount;
      text = result.fulfillmentText;
    } else if (transfer === 'savings to checking') {
      balances[userId].checking += amount;
      balances[userId].savings -= amount;
      text = result.fulfillmentText;
    } else {
      text = 'Failed to transfer, unknown accounts';
    }
  } else {
    text = result.fulfillmentText;
  }

  return text;
};

That's it for the server. Even in this constrained example, you can see the power that Stream, Virgil, and Dialogflow give you when building a secure chatbot.

Step 10. Decrypt the response message on the client

Finally, we can display the servers. To decrypt the message, we follow a similar pattern to Step 8. If you look at how we create the MessageList you'll see a custom Message component called MessageEncrypted:

// frontend/src/App.js
<MessageList Message={this._buildMessageEncrypted}/>

Since we need to provide decryption props to add props for decryption to our custom Message component, we add them to the props passed by the Stream React:

// frontend/src/App.js
_buildMessageEncrypted = (props) => {
  const newProps = {
    ...props,
    sender: this.state.sender,
    receiver: this.state.receiver,
    virgil: this.state.virgil
  };
  return <MessageEncrypted {...newProps}/>
};

Once we have the props we need, we can decrypt each message:

// frontend/src/MessageEncrypted.js
export class MessageEncrypted extends PureComponent {
  _isMounted = false;

  constructor(props) {
    super(props);
    this.state = { decryptedText: null };
  }

  componentDidMount = () => {
    this._isMounted = true;
    this._decryptText()
      .then(
        (decryptedText) => {
          if (this._isMounted) {
            this.setState({ decryptedText });
          }
        }
      );
  };

  componentWillUnmount = () => {
    this._isMounted = false;
  };

  _decryptText = async () => {
    const messageCreator = this.props.isMyMessage(this.props.message) ? this.props.sender : 'chatbot';
    return this.props.virgil.eThree.decrypt(
      this.props.message.text,
      this.props.virgil.publicKeys[messageCreator]
    );
  };

  render = () => {
    const newProps = {
      ...this.props,
      message: {
        ...this.props.message,
        text: this.state.decryptedText || ""
      }
    };

    return <MessageSimple {...newProps} />
  }
}

This class decrypts the message before rendering the MessageSimple component from Stream Chat React. To do this, we first determine if the message is our message with Stream's .isMyMessage. We then find the correct public key and ask Virgil E3Kit to decrypt it. Once that's done, we can pass the key along with the rest of the props to the Stream's MessageSimple component.

The _isMounted flag prevents updating the component after the message has been decrypted. This can occur if you're scrolling quickly, or upon page load when there are lots of messages.

Where to Go from Here

This tutorial is intended to get you up and running as fast as possible. Because of this, some critical functionality may be missing from your application. Here are some tips for what to do next with your app.

  1. Configure a more in-depth chatbot experience. Dialogflow has a ton of functionality, such as context, to build robust chatbot experiences.
  2. Build real user registration and protect identity registration. This tutorial simplified registration and retrieving valid tokens to interact with Stream and Virgil.
  3. Backup user's private keys to restore sessions and for multiple devices. Using Virgil's eThree.backupPrivateKey(pwd) will securely store the private key for restoration on any device.

---

Virgil Security, Inc. enables developers to eliminate passwords & encrypt everything, in hours, without having to become security experts.

We believe privacy is a fundamental human right. With our products, only your customer holds their data, proven with full transparency in everything from cryptographic libraries to services. Earn your customer’s trust by not asking for it.

Get started today at VirgilSecurity.com.