Avro

This connector requires data in AVRO format; other formats may lead to errors.

Each record from Kafka must match the following structure; the connector verifies this by checking the AVRO record against the schema:

```
ARRAY<STRUCT<columnID INT, text VARCHAR, quote BOOLEAN>>
```

The Array represents one event (which corresponds to one line in the CSV file), with each STRUCT in the Array representing a column of the event (a field in the CSV file, like the caseId or activity).
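As an illustration, a stream matching this structure might be declared in ksqlDB as follows. This is a minimal sketch: the stream name EVENTS_STREAM and the topic name igrafx_events are assumptions for the example, not names required by the connector.

```
-- Sketch only: stream and topic names are hypothetical.
-- The DATAARRAY structure and the COLUMNID/TEXT/QUOTE field names
-- are the parts the connector actually relies on.
CREATE STREAM EVENTS_STREAM (
    DATAARRAY ARRAY<STRUCT<COLUMNID INT, TEXT VARCHAR, QUOTE BOOLEAN>>
) WITH (
    KAFKA_TOPIC = 'igrafx_events',
    PARTITIONS = 1,
    VALUE_FORMAT = 'AVRO'
);
```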

One record from Kafka therefore corresponds to one event, and the connector aggregates multiple events. Once a threshold is reached, the aggregated events are written to the same file, which is then sent to the iGrafx API.

To correctly write a field to the CSV file, the following are needed:

  • The column number (columnId),
  • The value (text),
  • Whether or not the field is quoted (quote).

For example, the following data from a Kafka topic (illustrated here in JSON format but actually in AVRO):

```
{
    "DATAARRAY": [
        {"QUOTE": true, "TEXT": "activity1", "COLUMNID": 1},
        {"QUOTE": false, "TEXT": "caseId1", "COLUMNID": 0},
        {"QUOTE": false, "TEXT": "endDate1", "COLUMNID": 3}
    ]
}
```

will be written as the following line in the CSV file:

```
caseId1,"activity1",null,endDate1
```

given that the following connector properties are set:

  • csv.separator = ,
  • csv.quote = "
  • csv.defaultTextValue = null
  • csv.fieldsNumber = 4

Fields are ordered by their columnId, activity1 is quoted because its quote parameter is true, and column 2, which is absent from the record, is filled with the default value null.

Note: The field names DATAARRAY, QUOTE, TEXT, and COLUMNID must be used exactly as written in ksqlDB to correctly read AVRO data from a Kafka topic.
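For instance, the example record above could be produced from ksqlDB with an INSERT statement. This is again a sketch reusing the hypothetical EVENTS_STREAM from earlier; only the structure and field names follow from the connector's requirements.

```
-- Sketch only: inserts the example event into the hypothetical stream.
-- Column 2 is deliberately omitted; the connector fills it with
-- csv.defaultTextValue when writing the CSV line.
INSERT INTO EVENTS_STREAM (DATAARRAY) VALUES (
    ARRAY[
        STRUCT(COLUMNID := 0, TEXT := 'caseId1', QUOTE := false),
        STRUCT(COLUMNID := 1, TEXT := 'activity1', QUOTE := true),
        STRUCT(COLUMNID := 3, TEXT := 'endDate1', QUOTE := false)
    ]
);
```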

Any null value (a null event, a null column within an event, or a null parameter within a column) is considered an error and will halt the Task.