
Data Server - basics

Let's make things really simple.

  • A Data Server is a component that supplies streaming real-time data to the front end of your application.
  • You define your application's Data Server in a Kotlin script file named application-name-dataserver.kts. This file can be found inside the application-name-script-config module (application-name\application-name-script-config\src\main\resources\scripts\application-name-dataserver.kts).
  • In this file, you define specific query codeblocks, each of which is designed to supply different sets of data.
  • Each query listens to a specified table or view; when data on that source changes, it publishes the changes.
  • A query can include a number of other subtleties, such as where clauses or ranges, so that you can create code that matches your precise requirements.
  • If you use AppGen to build from your dictionary, then a basic kts file will be built automatically for you, covering all the tables and views in your data model. You can edit this file to add sophistication to the component.
  • Otherwise, you can build your kts by defining each query codeblock from scratch.

The simplest possible definition

Your application-name-dataserver.kts Kotlin script file contains all the queries you create. These are wrapped in a single dataServer statement. Our example below shows a file with a single query, which publishes changes to the table INSTRUMENT_DETAILS.

dataServer {
    query(INSTRUMENT_DETAILS)
}

And here is a simple example that has two queries:

Note that each query is a separate subscription.

dataServer {
    query(INSTRUMENT_DETAILS)

    query(COUNTERPARTY)
}

Naming

You don't have to give each query a name. If you do provide one, it will be used exactly as you specify. If you don't, then a name is allocated automatically when the data object is created. The syntax for this allocation is "ALL_{table/view name}S". So, in the example below:

  • The first query is called INSTRUMENT_DETAILS, as this name has been specified.
  • The second query is called ALL_COUNTERPARTYS, because it has not had a name specified. The allocated name is based on the COUNTERPARTY table, which it queries.
dataServer {
    // Name of the query: INSTRUMENT_DETAILS
    query("INSTRUMENT_DETAILS", INSTRUMENT_DETAILS)

    // Name of the query: ALL_COUNTERPARTYS
    query(COUNTERPARTY)
}

Specifying fields

By default, all table or view fields in a query definition will be exposed. If you don't want them all to be available, you must define the fields that are required. In the example below, we specify eight fields:

dataServer {
    query(INSTRUMENT_DETAILS) {
        fields {
            INSTRUMENT_CODE
            INSTRUMENT_ID
            INSTRUMENT_NAME
            LAST_TRADED_PRICE
            VWAP
            SPREAD
            TRADED_CURRENCY
            EXCHANGE_ID
        }
    }
}

Derived fields

You can also define derived fields in a Data Server, to supplement the fields supplied by the table or view. The input for the derived field is the Data Server query row. All fields are available for use.

In the example below, we add a ninth field to our Data Server from the previous example. The new field is a derived field:

dataServer {
    query(INSTRUMENT_DETAILS) {
        fields {
            INSTRUMENT_CODE
            INSTRUMENT_ID
            INSTRUMENT_NAME
            LAST_TRADED_PRICE
            VWAP
            SPREAD
            TRADED_CURRENCY
            EXCHANGE_ID

            derivedField("IS_USD", BOOLEAN) {
                tradedCurrency == "USD"
            }
        }
    }
}

Configuration settings

Before you define any queries, you can make configuration settings for the Data Server within the config block. These control the overall behaviour of the Data Server.

Here is an example of some configuration settings:

dataServer {
    config {
        lmdbAllocateSize = 512.MEGA_BYTE() // top-level-only setting
        // Items below can be overridden in individual query definitions
        compression = true
        chunkLargeMessages = true
        defaultStringSize = 40
        batchingPeriod = 500
        linearScan = true
        excludedEmitters = listOf("PurgeTables")
        enableTypeAwareCriteriaEvaluator = true
        serializationType = SerializationType.KRYO // Available from version 7.0.0 onwards
        serializationBufferMode = SerializationBufferMode.ON_HEAP // Available from version 7.0.0 onwards
        directBufferSize = 8096 // Available from version 7.0.0 onwards
    }
    query("SIMPLE_QUERY", SIMPLE_TABLE) {
        config {
            // Items below are only available in query-level config
            defaultCriteria = "SIMPLE_PRICE > 0"
            backwardsJoins = false
            disableAuthUpdates = false
        }
    }
}

Below, we shall examine the settings that are available for your config statement.

Global settings

Global settings can be applied at two levels:

  • Top level
  • Query level

Top-level global settings

lmdbAllocateSize This sets the size of the memory-mapped file where the in-memory cache stores the Data Server query rows. This configuration setting can only be applied at the top level. It affects the whole Data Server.

By default, the size is defined in bytes. To use MB or GB, use the MEGA_BYTE or GIGA_BYTE functions. You can see these in the example below. The default is 2 GB.

lmdbAllocateSize = 2.GIGA_BYTE()
// or
lmdbAllocateSize = 512.MEGA_BYTE()

serializationType (available from GSF version 7.0.0)

This sets the serialization approach used to convert query rows into bytes and vice versa. There are two options: SerializationType.KRYO and SerializationType.FURY.

  • The KRYO option is the default setting if serializationType is left undefined. It uses the Kryo library.
  • The FURY option uses the Fury library instead. Internal tests show that Fury can serialize and deserialize row objects 13-15 times more quickly than Kryo in our current Data Server implementation; this leads to faster LMDB read/write operations (up to twice as quick in some cases). This performance improvement has a great impact on the latency incurred when requesting rows from a Data Server query, whether it happens as part of the first subscription message (DATA_LOGON messages) or subsequent row request messages (MORE_ROWS messages). However, the serialized byte array size of a Fury object can be between 10 and 15 per cent larger than a Kryo object, so there is a small penalty to pay for using this option.
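
For instance, if you are happy to trade a slightly larger serialized payload for faster LMDB reads and writes, you can opt into Fury at the top level. Here is a minimal sketch, reusing the INSTRUMENT_DETAILS query from the earlier examples:

dataServer {
    config {
        // Available from version 7.0.0 onwards; the default is SerializationType.KRYO
        serializationType = SerializationType.FURY
    }

    query(INSTRUMENT_DETAILS)
}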

serializationBufferMode (available from GSF version 7.0.0)

This option changes the buffer type used to read/write information from/to LMDB. There are two options: SerializationBufferMode.ON_HEAP and SerializationBufferMode.OFF_HEAP.

  • The ON_HEAP option is the default setting if serializationBufferMode is left undefined. It uses a custom cache of Java HeapByteBuffer objects that exist within the JVM addressable space.
  • The OFF_HEAP option uses DirectByteBuffer objects instead, so memory can be addressed directly to access LMDB buffers natively in order to write and read data. OFF_HEAP buffers permit zero-copy serialization/deserialization to LMDB; buffers of this type are generally more efficient as they allow Java to exchange bytes directly with the operating system for I/O operations. To use OFF_HEAP, you must specify the buffer size in advance (see directBufferSize setting); consider this carefully if you do not know the size of the serialized query row object in advance.

directBufferSize

This option sets the buffer size used by the serializationBufferMode option when the SerializationBufferMode.OFF_HEAP setting is configured.

By default, the size is 8096 bytes.
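
For example, a top-level configuration that switches to off-heap buffers and enlarges the buffer could look like the sketch below; the 16384-byte value is illustrative only, and the INSTRUMENT_DETAILS query is reused from the earlier examples:

dataServer {
    config {
        // Available from version 7.0.0 onwards
        serializationBufferMode = SerializationBufferMode.OFF_HEAP
        directBufferSize = 16384 // illustrative size in bytes; the default is 8096
    }

    query(INSTRUMENT_DETAILS)
}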

Query-level global settings

The following settings can be applied at query level.

defaultCriteria This represents the default criteria for the query. Defaults to null.

disableAuthUpdates This disables real-time auth updates in order to improve the overall data responsiveness and performance of the server. Defaults to false.

backJoins Seen in older versions of the platform, this has been replaced by backwardsJoins. It is functionally the same.

backwardsJoins This enables backwards joins on a view query. Backwards joins ensure real-time updates on the fields that have been joined. These need to be configured at the join level of the query in order to work correctly. Defaults to true.

Shared settings

compression If this is set to true, it will compress the query row data before writing it to the in-memory cache. Defaults to false.

chunkLargeMessages If this is set to true, it will split large updates into smaller ones. Defaults to false.

defaultStringSize This is the size to be used for string storage in the Data Server in-memory cache. Higher values lead to higher memory use; lower values lead to lower memory use, but increase the risk of string overflow. See the onStringOverflow setting for details of how overflows are handled. Defaults to 40.

batchingPeriod This is the delay in milliseconds to wait before sending new data to Data Server clients. Defaults to 500ms.

linearScan This enables linear scan behaviour in the query definition. If false, it will reject criteria expressions that don't hit defined indexes. Defaults to true.

excludedEmitters This enables update filtering for a list of process names. Any database updates that originate from one of these processes will be ignored. Defaults to an empty list.

onStringOverflow This controls how the system responds to a string overflow. A string overflow happens when the value of a String field in an index is larger than defaultStringSize, or the size set on the field.

There are two options for handling string overflows:

  • IGNORE_ROW - rows with string overflows will be ignored. This can lead to data missing from the Data Server.
  • TRUNCATE_FIELD - indices with string overflows will be truncated. The data with the overflow will still be returned in full, and will be searchable. However, if multiple rows are truncated to the same value, any subsequent rows will lead to duplicate index exceptions during the insert, so these rows will not be available to the Data Server.
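
As a sketch of where this setting sits, you could configure it alongside defaultStringSize in the top-level config. The option names IGNORE_ROW and TRUNCATE_FIELD come from this page, but the exact enum reference used below is an assumption; check the platform reference for the precise type:

dataServer {
    config {
        defaultStringSize = 40
        // enum type name is an assumption; the documented options are IGNORE_ROW and TRUNCATE_FIELD
        onStringOverflow = OnStringOverflow.TRUNCATE_FIELD
    }

    query(INSTRUMENT_DETAILS)
}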

enableTypeAwareCriteriaEvaluator This enables the type-aware criteria evaluator. Defaults to false.

The type-aware criteria evaluator can automatically convert criteria comparisons that don't match the original type of the Data Server field; these can still be useful to end users.

For example, you might want a front-end client to perform a criteria search on a TRADE_DATE field like this: TRADE_DATE > '2015-03-01' && TRADE_DATE < '2015-03-02'. This search can be translated automatically to the right field types internally (even though TRADE_DATE is a field of type DateTime). The Genesis index search mechanism can also identify the appropriate search intervals in order to provide an optimised experience. The type-aware evaluator can transform strings to integers and perform any other sensible conversion (e.g. TRADE_ID == '1'). As a side note, this type-aware evaluator is also available in DbMon for operations like search and qsearch.

By contrast, the traditional criteria evaluator needs the field types to match the query fields in the Data Server. So the same comparison using the default criteria evaluator for TRADE_DATE would be something like: TRADE_DATE > new DateTime(1425168000000) && TRADE_DATE < new DateTime(1425254400000). This approach is less intuitive and won't work with our automatic index selection mechanism. In this case, you should use our common date expressions (more at Advanced technical details) to handle date searches.
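
As a sketch, you could enable the type-aware evaluator for a single query so that clients can send string-based comparisons like the ones above. The query name here is hypothetical, and ENHANCED_TRADE_VIEW is borrowed from the permissioning example further down this page:

dataServer {
    query("ALL_TRADES_TYPE_AWARE", ENHANCED_TRADE_VIEW) {
        config {
            // allows client criteria such as:
            //   TRADE_DATE > '2015-03-01' && TRADE_DATE < '2015-03-02'
            //   TRADE_ID == '1'
            // to be converted to the underlying field types automatically
            enableTypeAwareCriteriaEvaluator = true
        }
    }
}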

Backwards joins

Each query in a Data Server creates what is effectively an open connection between each requesting user and the Data Server. After the initial send of all the data, the Data Server only sends modifications, deletions and insertions in real time.

For queries with backwards joins enabled, the primary table and all joined tables and views are actively monitored. Any changes to these tables and views are sent automatically.

Monitoring of backwards joins can come at a cost, as it can cause significant extra processing. Do not use backwards joins unless the data in question needs to be updated in real time. For example, changes to counterparty data can usually wait until overnight. The rule of thumb is that you should only use backwards joins where the underlying data is being updated intraday.

The backwards join flag is true by default. You must explicitly set backwardsJoins = false if you wish to turn backwards joins off for your query.
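
For example, a view-based query whose joined reference data does not change intraday could switch backwards joins off. This is a sketch; the query name is hypothetical, and ENHANCED_TRADE_VIEW is reused from the permissioning example below:

dataServer {
    query("ALL_TRADES_NO_BACKWARDS_JOINS", ENHANCED_TRADE_VIEW) {
        config {
            backwardsJoins = false // joined fields will no longer update in real time
        }
    }
}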

Where clauses

If you include a where clause in a query, the request is only processed if the criteria specified in the clause are met. For example, if you have an application that deals with derivatives that have parent and child trades, you can use a where clause to confine the query to parent trades only.

Also, you can use these clauses to focus on a specific set of fields or a single field.

Finally, note that where clauses can also be used for permissioning. In the example below, users that have the TRADER or SUPPORT permission code are able to see all trades that meet the where condition.

dataServer {
    query("ALL_TRADES_LARGE_POSITION", ENHANCED_TRADE_VIEW) {
        permissioning {
            permissionCodes = listOf("TRADER", "SUPPORT")
            auth(mapName = "ENTITY_VISIBILITY") {
                ENHANCED_TRADE_VIEW.COUNTERPARTY_ID
            }
        }
        where { trade ->
            (trade.quantity * trade.price) > 1000000
        }
    }
}

Indices

A query can optionally include an indices block to define additional indexing at the query level. When an index is specified in the query, all rows returned by the query are ordered by the fields specified, in ascending order. The definition and the behaviour of an index in a query are exactly the same as when you define an index in a table.

Here is an example of a simple index:

dataServer {
    query("SIMPLE_QUERY", SIMPLE_TABLE) {
        indices {
            unique("SIMPLE_QUERY_BY_QUANTITY") {
                QUANTITY
                SIMPLE_ID
            }
        }
    }
}

There are two scenarios where an index is useful:

  • Optimising query criteria searches. In the example above, if the front-end request specifies criteria such as QUANTITY > 1000 && QUANTITY < 5000, then the Data Server automatically selects the best matching index. In our example, it would be SIMPLE_QUERY_BY_QUANTITY. This means that the platform doesn't need to scan all the query rows stored in the Data Server memory-mapped file cache; instead, it performs a very efficient indexed search.
  • Where the front-end request specifies an index. You can specify one or more indices in your Data Server query, and the front end can specify which one to use by supplying ORDER_BY as part of the DATA_LOGON process. The Data Server will use that index to query the data, which will be returned to the client in ascending order, based on the index field definition; see the sketch after the note below.

Important: Index definitions are currently limited to unique indices. As quantity does not have a unique constraint in the example definition shown above, we need to add SIMPLE_ID to the index definition to ensure we maintain uniqueness.
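
Building on the example above, here is a sketch of a query with two indices; the second index and its DATE field are hypothetical. A client could select either index by passing its name as ORDER_BY during DATA_LOGON:

dataServer {
    query("SIMPLE_QUERY", SIMPLE_TABLE) {
        indices {
            // can be selected by the client with ORDER_BY = "SIMPLE_QUERY_BY_QUANTITY"
            unique("SIMPLE_QUERY_BY_QUANTITY") {
                QUANTITY
                SIMPLE_ID
            }
            // hypothetical second index; assumes SIMPLE_TABLE has a DATE field
            unique("SIMPLE_QUERY_BY_DATE") {
                DATE
                SIMPLE_ID
            }
        }
    }
}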

Auto-generated REST endpoints

As we have mentioned before, all queries created in the dataserver file are automatically exposed as REST endpoints.

Below is an example of accessing a custom endpoint called ALL_TRADES running locally. It was generated automatically by the Genesis platform when the query ALL_TRADES was defined in the {app_name}-dataserver.kts file.

[POST] http://localhost:9064/ALL_TRADES/

You can see how to manipulate and work with auto-generated endpoints in our pages on REST endpoints.

Custom Log messages

Genesis provides a class called LOG that can be used to insert custom log messages. This class provides five methods: info, warn, error, trace and debug. To use these methods, you need to provide, as an input, a string that will be inserted into the logs.

Here is an example where you can insert an info message into the log file:

LOG.info("This is an info message")

The LOG instance is a Logger from SLF4J.
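
As a sketch of how this can be used inside the Data Server script itself, you could log from within a where block while debugging; the query, view and fields are reused from the permissioning example above, and the parameterised {} syntax is standard SLF4J:

dataServer {
    query("ALL_TRADES_LARGE_POSITION", ENHANCED_TRADE_VIEW) {
        where { trade ->
            // debug-level message; only written if the process log level allows it
            LOG.debug("Evaluating trade with quantity {}", trade.quantity)
            (trade.quantity * trade.price) > 1000000
        }
    }
}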

Note: In order to see these messages, you must set the logging level accordingly. You can do this using the logLevel command.