Boiling Data Source Apps
What can you do with a Node.js runtime on single-tenant computing resources, where customers bring their own access roles?
We have launched support for Data Source Apps (DSAs) for Boiling! They are templated JavaScript functions, callable as SQL Table Functions.
DSAs integrate JavaScript functions into BoilingData on the fly, and the results can then be queried with SQL like any other SQL Compute Cached data source. There is no need to install plugins or JAR files, nor to compile, transpile, package, upload, or release anything: you just write a JavaScript function template as a JSON-formatted string, INSERT it into the Boiling apps catalog, and it is ready for use immediately.
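To make this concrete, here is a purely illustrative sketch of what such a JSON-wrapped function template could look like. The field names (name, responsePath, template) and the random-number example are assumptions for illustration, not Boiling's actual catalog schema:

// Hypothetical DSA definition: a JavaScript function body stored as a JSON string.
// Only the idea (a JS template wrapped in JSON, INSERTed into the apps catalog)
// comes from this post; the field names below are illustrative.
{
  "name": "randomNumbers",
  "responsePath": "rows",
  "template": "async (ctx) => ({ rows: Array.from({ length: Number(ctx.params.count) }, (_, i) => ({ id: i, value: Math.random() })) })"
}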
We have released an SDK for Boiling Data Source Apps (DSAs), which can be found in our GitHub repository at https://github.com/boilingdata/boilingdata-dsa-sdk. The repository includes example applications, such as one using the Amazon Web Services (AWS) SDK and a basic random number generator. The AWS SDK example demonstrates full paging capabilities. The repository also provides a more in-depth discussion of the SDK’s features and the related security implications.
Aliases and more aliases…
Interestingly, you can create aliases for SQL Table Functions with all parameters predefined, with only some parameters defined, or even with transformations applied to them. This lets you give your virtual table aliases meaningful names that closely reflect their semantic purpose. For example:
SELECT * FROM apps.awssdk.allBuckets WHERE name LIKE '%boiling%';
SELECT * FROM apps.awssdk.demoBucketRootListing;
SELECT * FROM apps.awssdk.gluePartitions('nyctaxis');
SELECT * FROM apps.awssdk.glueTables('default');
All of the examples above use the same Boiling DSA, but instead of calling the “main” application, they go through the defined aliases.
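Conceptually, an alias simply binds some or all of a DSA's parameters to a friendlier name. The structure below is hypothetical and only illustrates that idea; it is not Boiling's actual alias registration format, and the parameter names are assumptions:

// Hypothetical alias table: each alias pre-binds parameters of the same "awssdk" DSA.
// Parameter names (api, call, Bucket, Prefix) are illustrative assumptions only.
const aliases = {
  allBuckets:            { dsa: "awssdk", params: { api: "S3",   call: "listBuckets" } },
  demoBucketRootListing: { dsa: "awssdk", params: { api: "S3",   call: "listObjectsV2", Bucket: "demo-bucket", Prefix: "" } },
  glueTables:            { dsa: "awssdk", params: { api: "Glue", call: "getTables" } },      // database name left open
  gluePartitions:        { dsa: "awssdk", params: { api: "Glue", call: "getPartitions" } },  // table name left open
};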
So, how does it really work?
Boiling processes the function template using json-template, taking into account any pre-defined SQL Table function parameters from an alias. It then creates a Function and calls it with a parameters object containing S3, Glue, and Lambda AWS SDK instances, which are based on the assumed IAM Role provided by the Boiling user.
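A rough sketch of that invocation path is shown below, with a few stated simplifications: the real pipeline uses json-template for substitution (replaced here with a naive placeholder fill), and AWS SDK v3 clients merely stand in for whatever SDK instances Boiling actually injects:

// Simplified, assumption-laden sketch of the DSA invocation path; not Boiling's implementation.
const { S3Client } = require("@aws-sdk/client-s3");
const { GlueClient } = require("@aws-sdk/client-glue");
const { LambdaClient } = require("@aws-sdk/client-lambda");

// Naive {{param}} substitution standing in for the json-template step.
const renderTemplate = (template, params) =>
  template.replace(/\{\{(\w+)\}\}/g, (_, key) => String(params[key] ?? ""));

async function runDsa(templateString, sqlParams, assumedRoleCredentials) {
  const body = renderTemplate(templateString, sqlParams);
  // Build the templated function and hand it SDK clients created from the
  // IAM role the Boiling user has configured.
  const fn = new Function("ctx", `return (${body})(ctx);`);
  const ctx = {
    s3: new S3Client({ credentials: assumedRoleCredentials }),
    glue: new GlueClient({ credentials: assumedRoleCredentials }),
    lambda: new LambdaClient({ credentials: assumedRoleCredentials }),
    params: sqlParams,
  };
  return fn(ctx); // expected to resolve to an object containing an array of row objects
}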
The output should be an array of objects, so one of the SQL Table Function parameters specifies the path to that list of objects within the response, for example a REST API response. Boiling extracts all keys from the objects and generates a table schema based on their (loose) JSON data types, then stores the results in an in-memory table. Objects with varying schemas are permitted; null values are used where keys are missing from certain rows.
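The path extraction and schema inference could be pictured roughly like this (an illustrative sketch, not the actual implementation; the type mapping is an assumption):

// Pull the row array out of e.g. a REST-style response using a dotted path,
// the way the response-path parameter described above would.
const getByPath = (obj, path) =>
  path.split(".").reduce((o, k) => (o == null ? undefined : o[k]), obj);
// getByPath({ Buckets: [ ... ] }, "Buckets") -> [ ... ]

// Collect the union of keys across all rows and map JSON types to loose column
// types; keys missing from a given row simply become NULLs in that row.
function inferSchema(rows) {
  const columns = {};
  for (const row of rows) {
    for (const [key, value] of Object.entries(row)) {
      if (value === null || value === undefined) {
        if (!(key in columns)) columns[key] = "VARCHAR"; // default until a typed value shows up
        continue;
      }
      columns[key] =
        typeof value === "number"  ? "DOUBLE"  :
        typeof value === "boolean" ? "BOOLEAN" :
        typeof value === "object"  ? "JSON"    : "VARCHAR";
    }
  }
  return columns; // e.g. { Name: "VARCHAR", CreationDate: "VARCHAR" }
}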
In the original SQL query, the reference to the Boiling DSA is replaced with a temporary table name that points to the in-memory DSA results, and the SQL is then executed in an embedded database such as DuckDB. When a query requests the same Boiling DSA with identical parameters again, the data is already available in the in-memory table for instant access. This is how Boiling handles REST API calls efficiently.
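In code form, the caching and rewrite step might look something like the sketch below (the cache key and temporary table naming are assumptions made for illustration):

// Results are cached per DSA name + parameters; the SQL reference to the DSA is
// replaced with the temporary table that holds the cached in-memory rows.
const dsaCache = new Map();

async function resolveDsa(name, params, runDsaFn) {
  const cacheKey = `${name}:${JSON.stringify(params)}`;
  if (!dsaCache.has(cacheKey)) {
    const rows = await runDsaFn(name, params);                 // first call runs the templated function
    dsaCache.set(cacheKey, { tempTable: `dsa_tmp_${dsaCache.size}`, rows });
  }
  return dsaCache.get(cacheKey);                               // identical queries reuse the in-memory table
}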
You can start using BoilingData by signing up for our application at https://app.boilingdata.com/, playing with the demo datasets, setting your own IAM role, and accessing your S3 buckets the way you like.