Create an operator for Applications
This walkthrough assumes you have already checked out the DCH hello-world app. To get started, work through the prerequisites and make sure you can run the tests locally.
The DCH hello world application uses Python. The most recent image for the operator uses version 3.9, so install Python 3.9 for your platform. We have no opinion on how you install Python.
The example code uses poetry to manage dependencies. To get started with the DCH hello world application, you need to install poetry.
Once poetry is installed, you can use poetry to install the dependencies.
The hello world comes with some simple tests. Writing tests for your service logic is a good way to make sure your operator does what you expect before deploying the application.
You can find example tests in the operator/tests folder of the hello world application repository.
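As an illustration of testing service logic in isolation, here is a minimal sketch. The function `build_greeting` and its behaviour are hypothetical stand-ins, not code from the hello world repository. The tests are plain functions using `assert`, a style that pytest-like runners can collect.

```python
# Hypothetical business-logic function; in a real operator it would live
# in the src folder, separate from the entry point script.
def build_greeting(name: str) -> str:
    """Return a greeting for the given name, rejecting empty input."""
    cleaned = name.strip()
    if not cleaned:
        raise ValueError("name must not be empty")
    return f"Hello, {cleaned}!"


def test_build_greeting_formats_name() -> None:
    # Surrounding whitespace is stripped before formatting.
    assert build_greeting("  DCH ") == "Hello, DCH!"


def test_build_greeting_rejects_empty_name() -> None:
    # A whitespace-only name must raise ValueError.
    try:
        build_greeting("   ")
    except ValueError:
        return
    raise AssertionError("expected ValueError for an empty name")
```

Because the function takes and returns plain values, the tests need no DCH runtime to run.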
To run tests, execute the following command:
The DCH application system is flexible about how the operator is built. The only requirement is an entry point script with an annotation that identifies the function that starts the operator. However, we do have recommendations for how to set up your operator.
- Use poetry as a package manager.
- Decouple data input and output from operator logic.
  - Only read and write data in the entry point script.
  - Verify that inputs exist and are valid in the entry point.
- Separate operator business logic.
  - Develop the business logic inside the src folder.
    - This makes the logic easier to test.
    - It makes the logic more portable.
    - It lets you run the operator logic locally more easily.
- Write tests for the business logic.
  - Use whatever framework or library you prefer.
- Write input and output tests for the entry point.
  - These tests must use DCH/Senaps custom library code to mimic the runtime environment.
  - These tests should only cover read/write/document validation, not the business logic.
- Define config classes that map to input documents.
- Use type hints.
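The last two recommendations can be sketched as follows. This is a minimal example under stated assumptions: the `GreetingConfig` class, its fields `name` and `repeat`, and the `parse_config` helper are hypothetical and do not come from the hello world repository. It shows a typed config class that mirrors a JSON input document, with validation done before any business logic runs.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class GreetingConfig:
    """Hypothetical config mirroring the fields of an input document."""
    name: str
    repeat: int = 1


def parse_config(document_text: str) -> GreetingConfig:
    """Parse and validate a JSON document into a typed config object."""
    if not document_text:
        raise ValueError("input document is missing or empty")
    raw = json.loads(document_text)
    if not isinstance(raw, dict):
        raise ValueError("input document must be a JSON object")
    name = raw.get("name")
    if not isinstance(name, str) or not name:
        raise ValueError("'name' must be a non-empty string")
    repeat = raw.get("repeat", 1)
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(repeat, bool) or not isinstance(repeat, int) or repeat < 1:
        raise ValueError("'repeat' must be a positive integer")
    return GreetingConfig(name=name, repeat=repeat)


config = parse_config('{"name": "DCH", "repeat": 2}')
```

Keeping the parsing in one function means the entry point can fail fast on bad input, while the business logic only ever receives a validated `GreetingConfig`.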
The entry script is the first point of interaction between DCH and the operator. You can name the script whatever you like, because you specify it in the manifest file (see below). We suggest placing it at the root of the operator folder and naming it [operator_name]_entry.py. In our example, the entry script is operator/operator_entry.py.
To decouple the operator code from the infrastructure code, we suggest that the entry script handles all the inputs and outputs of the application and delegates all the business logic to separate modules.
This section describes how to access and read inputs. The operator is deployed with access to document nodes and data streams that you can use in the operator.
The DCH application operator can get access to:
- Single Data Stream
  - Corresponds to the STREAM node type in the recipe file.
- Multiple Data Streams - This is a collection of data streams.
  - Corresponds to the STREAM_COLLECTION node type in the recipe file.
- Document - A document node contains a free-text string value, less than 16 MB.
  - Use with the DOCUMENT node type in the recipe file.
  - Use with the MODEL_METADATA node type in the recipe file.
  - Use with the BRIQL_RESULT node type in the recipe file.
This section describes how to write to outputs. The operator is deployed with access to document nodes and data streams that you can write to from the operator.
The DCH application operator can write data to:
- A Single Data Stream
  - Corresponds to the STREAM node type in the recipe file.
- Document - A document node in any format, less than 16 MB.
  - Use with the DOCUMENT node type in the recipe file.
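Since document nodes are limited to less than 16 MB, it can be useful to check the size of a value before writing it. The sketch below is an illustration only: the helper `fits_in_document` is hypothetical, and it assumes that "16 MB" means 16 MiB measured on the UTF-8 encoded text, which the platform documentation does not state here.

```python
# Assumption: "16 MB" is interpreted as 16 MiB of UTF-8 encoded text.
MAX_DOCUMENT_BYTES = 16 * 1024 * 1024


def fits_in_document(value: str, limit: int = MAX_DOCUMENT_BYTES) -> bool:
    """Return True if the UTF-8 encoding of value is smaller than limit."""
    return len(value.encode("utf-8")) < limit
```

The check measures encoded bytes rather than characters, because non-ASCII characters take more than one byte in UTF-8.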
Set the base image. Select the image based on the Python version your operator is written for, and make sure you choose an image that has the correct pandas version. See the list below to find the image you want to use.
- Python 3.5 Base Image (PYTHON3): pandas python numpy
- Python 3.6 Base Image (PYTHON3): pandas python numpy
- Python 3 StatsModels (PYTHON3): pandas python statsmodels numpy
- Python 3 Keras (PYTHON3): pandas python keras numpy
- Python 3.8 Base Image (PYTHON3): pandas python numpy
- Python 3.9 Base Image (PYTHON3): pandas python3.9 numpy
- Python 3.9 - With TensorFlow (PYTHON3): pandas python 3 python 3.9 numpy tensorflow 2.14.0
The entrypoint setting in the manifest must name the entry point script described earlier in this walkthrough.
The manifest file must list the libraries to add when the operator image is generated. For each one, specify the provider (PIP or APT) and the package name, with any version specification following that provider's pattern.
The manifest file contains a list of models. Each model specifies an operator. The id must match the operator id and version used in the recipe file. In addition, you must specify the ports used in the operator. Each port must have a name (matching the name used in the recipe file), a direction (input or output), and a port type (document, stream or stream[]). The required field must be present but should be set to false, since this is handled in the BRIQL query.
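The port rules above can be checked mechanically before uploading. The following is a rough sketch, not the platform's own validation: the key names `portName`, `direction`, `type` and `required` are assumptions made for this example and should be checked against your actual manifest file.

```python
from typing import Any, Dict, List

# Allowed values taken from the rules described above.
ALLOWED_DIRECTIONS = ("input", "output")
ALLOWED_TYPES = ("document", "stream", "stream[]")


def port_problems(port: Dict[str, Any]) -> List[str]:
    """Return human-readable problems for one port entry (empty if none).

    The key names used here are assumptions for illustration.
    """
    problems: List[str] = []
    name = port.get("portName")
    if not isinstance(name, str) or not name:
        problems.append("port needs a non-empty name")
    if port.get("direction") not in ALLOWED_DIRECTIONS:
        problems.append("direction must be 'input' or 'output'")
    if port.get("type") not in ALLOWED_TYPES:
        problems.append("type must be 'document', 'stream' or 'stream[]'")
    if "required" not in port:
        problems.append("required field must be present")
    elif port["required"] is not False:
        problems.append("required should be set to false")
    return problems


example_port = {
    "portName": "greeting",
    "direction": "output",
    "type": "document",
    "required": False,
}
issues = port_problems(example_port)
```

A check like this does not confirm that port names match the recipe file; that comparison needs the recipe as a second input.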
There are a few different ways to write multiple operators.
Define multiple entry point functions in your entry point script and map each annotation to a different "model" in the manifest file. These in turn must match the operators specified in the recipe file.
Copy the example operator folder and develop your other operators. The DCH application system supports bundling multiple operators with your application.
Use a separate operator in the recipe file. You can upload an operator in another context and still use it in your application recipe file.
The hello world application demo does not have any tests for the entry point yet, but they will be added soon.
The entry script specifies an annotation, @model("hello_world"), to tell the DCH infrastructure which function to execute first. The string in the annotation must match the operator id used in the recipe file and the manifest file.
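To illustrate how an annotation can tie an operator id to a function, here is a self-contained sketch. The `model` decorator, the `REGISTRY` dictionary, and the dictionary-shaped `context` below are local stand-ins written for this example. They are not the DCH/Senaps library, whose decorator and context object should be used in a real entry script.

```python
from typing import Callable, Dict

# Local stand-in registry, NOT the DCH/Senaps implementation.
REGISTRY: Dict[str, Callable[[dict], str]] = {}


def model(operator_id: str):
    """Stand-in decorator that records which function starts an operator."""
    def register(func: Callable[[dict], str]) -> Callable[[dict], str]:
        if operator_id in REGISTRY:
            raise ValueError(f"duplicate operator id: {operator_id}")
        REGISTRY[operator_id] = func
        return func
    return register


@model("hello_world")
def hello_world(context: dict) -> str:
    # Entry point: read inputs from the context and return the result.
    # A real entry script would delegate to separate business-logic modules.
    return f"Hello, {context['name']}!"


# The infrastructure would look the function up by the operator id.
result = REGISTRY["hello_world"]({"name": "DCH"})
```

The lookup by string is why the id in the annotation has to match the id declared elsewhere: a mismatch means no function is found for the operator.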
You can use the to read value of a document port.
You can use the to read value of a document port as JSON.
You can use the to get the stream id and the to read a stream.
You can use the to get the stream ids and the to read a collection of streams.
To write a value to a document port, use the function write_to_document(context, value, port_id).
To write to a stream, get the stream id and then use the function write_observations_series_to_stream_id(context, pandas.DataFrame, stream_id).
The operator needs a manifest file. The manifest describes the data the operator expects and is specific to the deployment platform for the code. In the manifest file you set the entrypoint for the operator, its dependencies, and the base image. The manifest file must also specify the ports the operator is connected to.