
BQuery Protocol Guide

BQuery is the native Karmasphere query protocol. All the Karmasphere client libraries use BQuery. Compared to DNS, these queries return a richer response, describing exactly which attributes matched, and why the feedset reached the verdict it did.

Packet wire format

A BQuery packet is a hash table (Python: dictionary, Perl: hash, Java: Map) that is Bencoded and sent over the network. Bencoding is a simple serialization format for data structures, borrowed from BitTorrent.
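As a rough illustration of what Bencoding a packet involves, the following Python sketch encodes and decodes the four Bencoding types (integers, byte strings, lists and dictionaries). The function names `bencode` and `bdecode` are chosen for this example and are not taken from any Karmasphere library. Decoded strings and dictionary keys are left as `bytes`, and text is assumed to be UTF-8.

```python
def bencode(value):
    """Encode ints, str/bytes, lists and dicts as Bencoded bytes."""
    if isinstance(value, bool):
        raise TypeError("Bencoding has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")  # assumption: text is sent as UTF-8
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, (list, tuple)):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are strings and are emitted in sorted byte order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v)
             for k, v in value.items()),
            key=lambda kv: kv[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def bdecode(data):
    """Decode one Bencoded value; strings come back as bytes."""
    value, end = _decode(data, 0)
    if end != len(data):
        raise ValueError("trailing data after Bencoded value")
    return value


def _decode(data, pos):
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead in (b"l", b"d"):
        items, pos = [], pos + 1
        while data[pos:pos + 1] != b"e":
            item, pos = _decode(data, pos)
            items.append(item)
        if lead == b"l":
            return items, pos + 1
        # A dictionary is stored as alternating keys and values.
        return dict(zip(items[0::2], items[1::2])), pos + 1
    # Anything else must be a length-prefixed string such as 4:spam.
    colon = data.index(b":", pos)
    length = int(data[pos:colon])
    start = colon + 1
    return data[start:start + length], start + length
```

For example, `bencode({"_": 12345, "fl": 1})` yields `d1:_i12345e2:fli1ee`.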

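The “official” description of Bencoding in BitTorrent’s protocol specification can be summarized as follows:

Strings are length-prefixed in base ten, followed by a colon and the string. For example, 4:spam corresponds to "spam".
Integers are represented by an "i", followed by the number in base ten, followed by an "e". For example, i3e corresponds to 3 and i-3e corresponds to -3.
Lists are encoded as an "l", followed by their Bencoded elements, followed by an "e". For example, l4:spam4:eggse corresponds to ["spam", "eggs"].
Dictionaries are encoded as a "d", followed by alternating keys and their corresponding values, followed by an "e". For example, d3:cow3:moo4:spam4:eggse corresponds to {"cow": "moo", "spam": "eggs"}. Keys must be strings and appear in sorted order.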

Each BQuery packet itself contains a number of key-value pairs. Most keys have both a short form and a long form. When constructing a packet, either the short form or the long form may be used, but not both; the short form is preferred to save network bandwidth. When receiving and decoding a packet, both forms must be checked. If the short form is found, the long form may be ignored.
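The receiving rule (check the short form first, then fall back to the long form) might be sketched in Python as below. The helper name `get_field` is hypothetical, and the sketch assumes the packet has already been decoded into a dictionary with text keys.

```python
def get_field(packet, short, long, default=None):
    """Return the value stored under the short key, else the long key.

    If the short form is present, the long form is ignored.
    """
    for key in (short, long):
        if key in packet:
            return packet[key]
    return default
```

For instance, `get_field(response, "f", "facts", default=[])` would return the fact list whichever form the sender used.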

BQuery Query Format

Here is an example query, before Bencoding:

{
    "_" : 12345,
    "a" : [ "il4l4brs2ksyrf36", "rwf8l7oj" ],
    "i" :
        [
            ["127.0.0.1", "ip4", "smtp.client-ip"],
            ["antony@karmasphere.com", "email", "smtp.env.mail-from"],
            ["karmasphere.com", "domain", "smtp.env.helo"]
        ],
    "s" : "karmasphere.email-sender",
    "fl": 1
}

Formally, a BQuery query is a Bencoded map containing the following key-value pairs:

| Short name | Long name | Value |
|---|---|---|
| _ | _ | (optional) A unique cookie for the query. This will be returned in the response. |
| a | auth | (optional) A list of Query Credentials (see below), usually a username and a password |
| i | ids | (required) A list of Identities (see below) |
| s | composites | (recommended) Either a single string, which is the name of the composite to query, or a list of strings, which are the names of several composites to query. |
| fl | flags | (optional) A bit significant number (see below). |
| f | feeds | (deprecated) A list of additional feeds by id number to query against. |
| c | combiners | (deprecated) Either a single string, which is the name of an additional combiner to query, or a list of strings, which are the names of several additional combiners to query. |
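To make the key table concrete, here is a minimal Python sketch that assembles a query map using the short key names. The function `build_query` and its parameter names are illustrative only. Treating an empty identity list as an error is an assumption of this sketch, since the guide only says the `i` key is required. The deprecated `f` and `c` keys are left out.

```python
def build_query(identities, composites=None, auth=None, cookie=None, flags=0):
    """Build a BQuery query map with short key names (not yet Bencoded)."""
    if not identities:
        # Assumption: a query without identities is not useful, so reject it.
        raise ValueError("a query requires at least one identity")
    query = {"i": [list(identity) for identity in identities]}
    if composites is not None:
        query["s"] = composites   # a single name or a list of names
    if auth is not None:
        query["a"] = list(auth)   # usually [username, password]
    if cookie is not None:
        query["_"] = cookie       # echoed back in the response
    if flags:
        query["fl"] = flags       # bit-significant number
    return query
```

Called with the three identities, credentials, cookie and flags from the example above, this returns the same map as the example query.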

Cookie: This is an arbitrary value which is returned in the response. It lets the client application keep track of query/response pairs when queries are handled asynchronously. It is recommended that a simple string or integer be used.

Query Credentials: A list containing a username and a password. These credentials are required only if you need to access non-public data on the Karmasphere server.

Identities: A list of identities. Each identity is itself a list in the following format:

identity string, identity type [, tag1 [, tag2 ...]]

Identity types are represented by strings. The numeric type ids supported in earlier versions of the protocol are now deprecated.

| Type | Description |
|---|---|
| ip4 | IPv4 address |
| ip6 | IPv6 address |
| domain | Domain name |
| email | Email address |
| url | URL |
| opaque | Opaque identity string |

Tags communicate the context of an identity, and are used by some of the special DP Operators.

| Tag name | Meaning |
|---|---|
| smtp.client-ip | IP address of an SMTP client |
| smtp.env.helo | HELO string sent by an SMTP client |
| smtp.env.mail-from | MAIL FROM: email address sent by an SMTP client |
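The identity format and tags can be illustrated with a short Python sketch that builds the identity list for one SMTP transaction. The helper names `make_identity` and `smtp_identities` are invented for this example. Choosing between `ip4` and `ip6` with the standard `ipaddress` module is a design choice of the sketch, not something the protocol prescribes.

```python
import ipaddress

# Identity types listed in the table above.
IDENTITY_TYPES = {"ip4", "ip6", "domain", "email", "url", "opaque"}


def make_identity(value, id_type, *tags):
    """Return one identity as [identity string, identity type, tag1, ...]."""
    if id_type not in IDENTITY_TYPES:
        raise ValueError(f"unknown identity type: {id_type!r}")
    return [value, id_type, *tags]


def smtp_identities(client_ip, helo, mail_from):
    """Build tagged identities for the client IP, MAIL FROM and HELO."""
    ip_type = "ip6" if ipaddress.ip_address(client_ip).version == 6 else "ip4"
    return [
        make_identity(client_ip, ip_type, "smtp.client-ip"),
        make_identity(mail_from, "email", "smtp.env.mail-from"),
        make_identity(helo, "domain", "smtp.env.helo"),
    ]
```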

Composites: The names of the feedsets against which the supplied identities are queried, given either as a single string or as a list of strings. For example, “karmasphere.email-sender”.

Feeds: This is a list of additional feed names to query. Generally, this parameter is not required.

Combiners: This is a list of additional combiners to use. Generally, this parameter is not required. Permissible values for this parameter will depend on the configuration of the slave which you are querying; local administrators may install additional combiners for you.

Flags: A bit-significant number. The following bits are defined:

| Bit | Value | Semantic |
|---|---|---|
| 0 | 1 | Include all facts in the response. |
| 1 | 2 | Include all generated DP data in the response. |
| 2 | 4 | Include a DP trace in the response. |
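Since the flags value is a bit field, the values in the table can be combined with a bitwise OR. A small Python sketch using the standard `enum.IntFlag` follows. The class and member names are made up for illustration; only the numeric values come from the table.

```python
from enum import IntFlag


class QueryFlags(IntFlag):
    """Bit values for the "fl" / "flags" key (names are illustrative)."""
    FACTS = 1      # bit 0: include all facts in the response
    DP_DATA = 2    # bit 1: include all generated DP data
    DP_TRACE = 4   # bit 2: include a DP trace


# Request facts plus a DP trace; the value placed in the packet is a plain int.
flags_value = int(QueryFlags.FACTS | QueryFlags.DP_TRACE)
```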

Response format

The BQuery response is a Bencoded map with the following keys:

| Short key | Long key | Value |
|---|---|---|
| _ | _ | The cookie which was sent with the query. This field will not be present if no cookie was sent. |
| f | facts | A list of facts (matches against each feed for each identity). |
| c | combinations | A map from combination name to combination data. |
| t | time | The time the query took, in milliseconds. |
| error | error | An error flag, set to 1 when an error has occurred. Other fields may be absent in this case. |
| message | message | An informational message, sent when an error has occurred. |

Facts: This is a list of maps with the following data:

| Key | Value |
|---|---|
| f | The feed name. |
| v | The value returned by the feed (a 32-bit signed integer). |
| i | The identity that was matched by this feed. |
| d | An optional string data value returned by the feed. |

Combinations: This is a map of maps. The outer map is keyed by DP-program, feedset or combiner name. The inner map is as follows:

| Key | Value |
|---|---|
| v | The value returned by the combiner. This is almost always a number in the range -1000 to 1000. |
| d | An optional string explaining the value. |

Here is a sample response, decoded by Perl and dumped:

{
    "_" : 12345,
    "facts" :
    [
        {
            "f" : 4000,
            "v" : -1,
            "d" : "Invalid Source IP Address (cymru)",
            "i" : "127.0.0.1"
        }
    ],
    "combiners" :
    {
        "karmasphere.email-sender" : 
        {
            "v" : -1000,
            "d" : "<f4000: if-fail(0) => return bad(1.0)>"
        }
    }
}
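One way a client might pull the verdicts and matching facts out of a decoded response is sketched below in Python. The function names are hypothetical, and the sketch assumes the response has been decoded into dictionaries with text keys. Because the sample above uses the key "combiners" while the key table lists "c" and "combinations", the sketch accepts all three spellings.

```python
def first_present(mapping, *keys, default=None):
    """Return the value of the first key found in mapping."""
    for key in keys:
        if key in mapping:
            return mapping[key]
    return default


def summarize(response):
    """Return (verdicts, matches) extracted from a decoded response.

    verdicts maps each combination name to a (value, explanation) pair;
    matches is a list of (identity, feed, value, data) tuples.
    """
    facts = first_present(response, "f", "facts", default=[])
    combos = first_present(response, "c", "combinations", "combiners", default={})
    verdicts = {name: (c.get("v"), c.get("d")) for name, c in combos.items()}
    matches = [(f.get("i"), f.get("f"), f.get("v"), f.get("d")) for f in facts]
    return verdicts, matches
```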

Here’s a sample error response, decoded by Perl and dumped:

{
    "_" : 1655274485,
    "error" : 1,
    "message" : "java.lang.IllegalArgumentException: Not an IP4 address: 'wrong': Wrong number of octets: 1"
}
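A client will usually want to check the error flag before reading any other field, since those fields may be absent. A minimal Python sketch follows. The exception class `BQueryError` and the function `raise_on_error` are names invented for this example, and the response is assumed to be decoded into a dictionary with text keys.

```python
class BQueryError(Exception):
    """Raised when a response carries the error flag (hypothetical name)."""


def raise_on_error(response):
    """Return the response unchanged, or raise if its error flag is set."""
    if response.get("error"):
        raise BQueryError(response.get("message", "unknown error"))
    return response
```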

UDP and TCP

BQuery supports both UDP and TCP. In general, UDP is much faster than TCP and should be preferred. We recommend using UDP by default and considering TCP only for large queries or responses, or if your firewall cannot handle UDP connection tracking. There is currently no automatic failover from UDP to TCP for large queries or responses.

Responses to UDP queries are sent to the originating IP address and port.
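Because the response comes back to the socket that sent the query, a UDP exchange can be sketched as a single send followed by a single receive. The Python function below is only an outline: `udp_query` is a hypothetical name, the host and port must be supplied by the caller (this guide does not state them), the payload is assumed to be an already Bencoded packet, and IPv4 is assumed.

```python
import socket


def udp_query(payload, host, port, timeout=2.0, bufsize=65535):
    """Send one Bencoded packet over UDP and return the raw reply bytes.

    Raises socket.timeout (an OSError) if no reply arrives in time.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(payload, (host, port))
        # The reply is addressed to the IP and port this socket sent from.
        data, _addr = sock.recvfrom(bufsize)
        return data
```

A timeout is worth setting because UDP gives no delivery guarantee, so a lost query or response would otherwise block the caller indefinitely.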

In TCP mode, each packet, whether query or response, MUST be prefixed with a 4-byte field containing the length of the encoded data in network byte order. Responses to TCP queries are sent back over the same TCP connection.
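The length-prefix framing can be sketched in Python with the standard `struct` module. The sketch assumes the 4-byte field is an unsigned integer that counts only the encoded data and not the prefix itself. The function names are illustrative, and `stream` is any binary file-like object, such as the result of `socket.makefile("rb")`.

```python
import struct


def frame(payload):
    """Prefix a Bencoded packet with its length as 4 bytes, big-endian."""
    return struct.pack("!I", len(payload)) + payload


def read_exact(stream, n):
    """Read exactly n bytes from a binary stream or raise EOFError."""
    buf = b""
    while len(buf) < n:
        chunk = stream.read(n - len(buf))
        if not chunk:
            raise EOFError("connection closed mid-packet")
        buf += chunk
    return buf


def read_frame(stream):
    """Read one length-prefixed packet and return its payload bytes."""
    (length,) = struct.unpack("!I", read_exact(stream, 4))
    return read_exact(stream, length)
```

Reading in a loop matters on TCP because a single read may return fewer bytes than requested.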

Multiple queries can be sent with each TCP connection. However, if multiple queries are sent without waiting for each response individually, the order of the responses may differ from the order of the queries. Clients are advised to use cookies in order to reassociate each response with its query. The standard C libkarmaclient library contains packet reordering code for an asynchronous TCP mode.
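Reassociating responses with queries by cookie could look something like the following Python sketch. It is not the libkarmaclient implementation; the class `PendingQueries` and its methods are invented here, and integer cookies from a simple counter are one possible choice.

```python
import itertools


class PendingQueries:
    """Track in-flight queries by cookie so replies can arrive in any order."""

    def __init__(self):
        self._cookies = itertools.count(1)
        self._pending = {}

    def register(self, query):
        """Return a copy of query with a fresh cookie under the "_" key."""
        cookie = next(self._cookies)
        stamped = {**query, "_": cookie}
        self._pending[cookie] = stamped
        return stamped

    def resolve(self, response):
        """Return (query, response) for a reply; KeyError if unknown."""
        return self._pending.pop(response["_"]), response
```

Each query is stamped before it is encoded and sent, and `resolve` is called on every decoded response as it arrives.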

Client libraries

Karmasphere provides a number of developer libraries which implement the BQuery protocol.