Import Data

How to import labels

DataGym can import your labeled data. This document describes the requirements and how the import works.

The requirements:

  • If you use our API, you need a valid API Token. How to create one is described here.

  • The project id is also required. You can find it in the URL of the project's detail page or retrieve it via the API, using the endpoint that lists all projects.

  • In addition, the internal ids of all affected images are required. You can retrieve them via the API, using the endpoint that gets a dataset by its id.

To use the import feature, the import data must match the label configuration.

The import:

You can import the labeled data either via our Python package, which uses the public API, or by uploading a JSON-formatted file from the project overview page.

The import data format is similar to the export format and is described in the following sections.

Example label configuration:

The following configuration serves as an example to illustrate the import. It is the same configuration used in the Export Data section of this documentation:

Within the exported data, the label_classes look like the following snippet from the Export Data section:

label_classes.json
{
  "label_classes": [
    {
      "class_name": "bus",
      "geometry_type": "rectangle"
    },
    {
      "class_name": "car",
      "geometry_type": "polygon"
    },
    {
      "class_name": "car_type",
      "classification_type": "select"
    },
    {
      "class_name": "weather",
      "classification_type": "radio"
    },
    {
      "class_name": "daytime",
      "classification_type": "select"
    }
  ]
}

The import requires the class names used here. The geometry type or classification type of each class also determines the import format.

Classifications of type option are exported as type radio if they contain up to three options; otherwise, they are exported as type select.

The section "label_classes" is not part of the import JSON, but it is useful for understanding how the import works.

Let's look at the expected data:
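The rule above can be sketched as a small function. This only illustrates the rule as stated on this page; exported_classification_type is a hypothetical helper name and not part of DataGym.

```python
def exported_classification_type(option_count: int) -> str:
    """Return the exported type of an option classification.

    Hypothetical helper that mirrors the rule described above.
    """
    # Up to three options -> "radio"; more than three -> "select".
    return "radio" if option_count <= 3 else "select"


print(exported_classification_type(2))
print(exported_classification_type(5))
```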

The import format:

In its simplest form, the import looks like this:

[
  {
    "internal_media_ID": "<media_id>",
    "keepData": false,
    "global_classifications": {},
    "labels": {}
  }
]

A successful import requires a valid media id in the internal_media_ID node.

Each of the other three properties is optional.
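If you generate the import file with a script, the simplest form shown above could be built as follows. This is a minimal sketch using only the Python standard library; minimal_import_entry is a hypothetical helper name, and "<media_id>" stays a placeholder for a real internal media id.

```python
import json


def minimal_import_entry(media_id: str) -> dict:
    """Build one import entry that carries only the required media id."""
    return {
        "internal_media_ID": media_id,
        # The three properties below are optional; they are shown with
        # the same values as in the snippet above.
        "keepData": False,
        "global_classifications": {},
        "labels": {},
    }


# The import document is a JSON array of such entries.
payload = [minimal_import_entry("<media_id>")]
import_json = json.dumps(payload, indent=2)
print(import_json)
```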

Schema validation

To check whether your import JSON is valid, you can validate it against this JSON Schema.

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "Json Import Schema",
  "description": "Validate the user input",
  "definitions": {
    "rectangle": {
      "type": "array",
      "items": {
        "$ref": "#/definitions/point_def_rectangle"
      },
      "minItems": 1,
      "maxItems": 1
    },
    "polygon": {
      "type": "array",
      "items": {
        "$ref": "#/definitions/point_def_not_rectangle"
      },
      "minItems": 3
    },
    "line": {
      "type": "array",
      "items": {
        "$ref": "#/definitions/point_def_not_rectangle"
      },
      "minItems": 2
    },
    "point": {
      "type": "array",
      "items": {
        "$ref": "#/definitions/point_def_not_rectangle"
      },
      "minItems": 1,
      "maxItems": 1
    },
    "numberOrString": {
      "oneOf": [
        {
          "type": "number",
          "minimum": 0
        },
        {
          "type": "string",
          "pattern": "^[0-9]+$"
        }
      ]
    },
    "point_def_not_rectangle": {
      "type": "object",
      "properties": {
        "x": {
          "$ref": "#/definitions/numberOrString"
        },
        "y": {
          "$ref": "#/definitions/numberOrString"
        }
      },
      "minProperties": 2,
      "maxProperties": 2,
      "required": [
        "x",
        "y"
      ]
    },
    "point_def_rectangle": {
      "type": "object",
      "properties": {
        "x": {
          "$ref": "#/definitions/numberOrString"
        },
        "y": {
          "$ref": "#/definitions/numberOrString"
        },
        "w": {
          "$ref": "#/definitions/numberOrString"
        },
        "h": {
          "$ref": "#/definitions/numberOrString"
        }
      },
      "required": [
        "x",
        "y",
        "w",
        "h"
      ]
    },
    "classification": {
      "type": "array",
      "items": [
        {
          "oneOf": [
            {
              "type": "string"
            },
            {
              "type": "null"
            }
          ]
        },
        {
          "oneOf": [
            {
              "type": "string"
            },
            {
              "type": "object",
              "patternProperties": {
                "": {
                  "$ref": "#/definitions/classification"
                }
              }
            }
          ]
        }
      ],
      "additionalItems": {
        "type": "string"
      }
    },
    "geometry": {
      "type": "array",
      "items": {
        "oneOf": [
          {
            "$ref": "#/definitions/point_def_not_rectangle"
          },
          {
            "$ref": "#/definitions/point_def_rectangle"
          }
        ]
      }
    }
  },
  "type": "array",
  "minItems": 1,
  "items": {
    "type": "object",
    "properties": {
      "internal_media_ID": {
        "type": "string"
      },
      "keepData": {
        "type": "boolean"
      },
      "global_classifications": {
        "type": "object",
        "patternProperties": {
          "": {
            "$ref": "#/definitions/classification"
          }
        }
      },
      "labels": {
        "type": "object",
        "patternProperties": {
          "": {
            "type": "array",
            "items": {
              "type": "object",
              "properties": {
                "geometry": {
                  "$ref": "#/definitions/geometry"
                },
                "rectangle": {
                  "$ref": "#/definitions/rectangle"
                },
                "polygon": {
                  "$ref": "#/definitions/polygon"
                },
                "point": {
                  "$ref": "#/definitions/point"
                },
                "line": {
                  "$ref": "#/definitions/line"
                },
                "image_segmentation": {
                  "type": "array"
                },
                "classifications": {
                  "type": "object",
                  "patternProperties": {
                    "": {
                      "$ref": "#/definitions/classification"
                    }
                  }
                },
                "nested_geometries": {
                  "patternProperties": {
                    "": {
                      "type": "array",
                      "items": {
                        "type": "object",
                        "properties": {
                          "geometry": {
                            "$ref": "#/definitions/geometry"
                          },
                          "rectangle": {
                            "$ref": "#/definitions/rectangle"
                          },
                          "polygon": {
                            "$ref": "#/definitions/polygon"
                          },
                          "point": {
                            "$ref": "#/definitions/point"
                          },
                          "line": {
                            "$ref": "#/definitions/line"
                          },
                          "classifications": {
                            "type": "object",
                            "patternProperties": {
                              "": {
                                "$ref": "#/definitions/classification"
                              }
                            }
                          },
                          "additionalProperties": false
                        }
                      }
                    }
                  }
                }
              },
              "additionalProperties": false
            }
          }
        }
      }
    }
  }
}

For validation, you can use one of the validators listed on the official JSON Schema page, which are available for several programming languages and for the console, or one of the many online JSON Schema validators.
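As one possible approach in Python, the third-party jsonschema package can perform this check. The sketch below assumes that package is installed and, to stay short, validates against an abbreviated excerpt of the schema (SCHEMA_EXCERPT, a stand-in defined here); in practice you would load the full schema from above instead. schema_errors is a hypothetical helper name.

```python
try:
    import jsonschema  # third-party package: pip install jsonschema
except ImportError:
    jsonschema = None


def schema_errors(payload, schema):
    """Return readable validation messages; an empty list means valid."""
    if jsonschema is None:
        raise RuntimeError("the jsonschema package is not installed")
    validator = jsonschema.Draft7Validator(schema)
    return [
        "/".join(str(part) for part in error.path) + ": " + error.message
        for error in validator.iter_errors(payload)
    ]


# Abbreviated stand-in for the schema above, used only for illustration.
SCHEMA_EXCERPT = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "array",
    "minItems": 1,
    "items": {
        "type": "object",
        "properties": {
            "internal_media_ID": {"type": "string"},
            "keepData": {"type": "boolean"},
        },
    },
}

if jsonschema is not None:
    # A valid entry, then one whose media id has the wrong type.
    print(schema_errors([{"internal_media_ID": "<media_id>"}], SCHEMA_EXCERPT))
    print(schema_errors([{"internal_media_ID": 42}], SCHEMA_EXCERPT))
```

Collecting all messages with iter_errors, rather than stopping at the first failure, makes it easier to fix a large import file in one pass.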

Properties explained

Let's have a look at the global_classifications property:

This property may look like the following snippet:

global_classifications.json
{
  "global_classifications": {
    "weather": [
      "sunny",
      {
        "daytime": [
          "day"
        ]
      }
    ]
  }
}

This example can also be found on the Export Data page. Nested classifications are possible. If a value is not defined, use null.

The possible object keys and their values are defined in the label configuration.
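The nested structure from the snippet above can be assembled programmatically. The following is a minimal sketch; classification is a hypothetical helper name, and the class names and values are taken from the example configuration.

```python
import json


def classification(value, nested=None):
    """Build the value list for one classification.

    `value` is the selected option, or None (JSON null) if undefined.
    `nested` optionally maps nested class names to their value lists.
    """
    entry = [value]
    if nested:
        entry.append(nested)
    return entry


global_classifications = {
    "weather": classification("sunny", {"daytime": classification("day")}),
}
print(json.dumps({"global_classifications": global_classifications}, indent=2))
```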

Let's have a look at the labels object:

The snippet may look like the following:

labels_snippet.json
{
  "labels" : {
    "bus": [],
    "car": []
  }
}

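This object holds all classes of geometries defined within the label configuration and found within your image. Each entry is a list of geometries. Each geometry in such a list consists of two entries: the geometry itself, stored under a key named after its geometry type (for example polygon), and a classifications object.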

Let's define a car to look at:

car.json
{
  "car": [
    {
      "polygon": [
        {
          "x": 57,
          "y": 412
        },
        {
          "x": 54,
          "y": 463
        },
        {
          "x": 130,
          "y": 524
        }
      ],
      "classifications": {
        "car_type": [
          null
        ]
      }
    }
  ]
}

The car's geometry type is defined as a polygon. See label_classes.json or the label configuration for reference. To define a polygon, a list of points is required. You need to enter at least three points for a valid polygon, but you can enter as many as needed.
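If your points are available as coordinate pairs, a polygon entry such as the one in car.json could be built like this. It is a minimal sketch; polygon_geometry is a hypothetical helper name, and the check reflects the three-point minimum described above.

```python
def polygon_geometry(points, classifications=None):
    """Build one polygon entry from a sequence of (x, y) pairs."""
    if len(points) < 3:
        raise ValueError("a polygon needs at least three points")
    return {
        "polygon": [{"x": x, "y": y} for x, y in points],
        "classifications": classifications or {},
    }


# Reproduces the car from car.json; None is serialized as JSON null.
car = polygon_geometry(
    [(57, 412), (54, 463), (130, 524)],
    {"car_type": [None]},
)
labels = {"car": [car]}
print(labels)
```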

line.json
{
  "street": [
    {
      "line": [
        {
          "x": 57,
          "y": 412
        },
        {
          "x": 54,
          "y": 463
        }
      ],
      "classifications": {}
    }
  ]
}

This is an example of a line. Lines are defined as polylines, so they need at least two points but can have as many as needed.

point.json
{
  "poi": [
    {
      "point": [
        {
          "x": 57,
          "y": 412
        }
      ],
      "classifications": {}
    }
  ]
}

A point geometry must contain exactly one point with x and y values.
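The point-count rules for polygons, lines, and points can be checked before uploading. The sketch below mirrors the minItems and maxItems values from the JSON Schema on this page; POINT_LIMITS and point_count_ok are hypothetical names, and the check does not replace a full schema validation.

```python
# (minimum, maximum) number of points per geometry type; None = no upper limit.
POINT_LIMITS = {
    "polygon": (3, None),
    "line": (2, None),
    "point": (1, 1),
}


def point_count_ok(geometry_type, points):
    """Return True if the number of points is allowed for the geometry type."""
    minimum, maximum = POINT_LIMITS[geometry_type]
    if len(points) < minimum:
        return False
    return maximum is None or len(points) <= maximum


print(point_count_ok("point", [{"x": 57, "y": 412}]))
print(point_count_ok("line", [{"x": 57, "y": 412}]))
```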

A bus may look like the following snippet:

bus.json
{
  "bus": [
    {
      "rectangle": [
        {
          "x": 272,
          "y": 302,
          "w": 324,
          "h": 169
        }
      ],
      "classifications": {}
    }
  ]
}

The geometry property now defines a rectangle with the start coordinates x and y and the properties w and h for its width and height. In addition, the classifications object is empty because bus has no classifications.
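Some tools store boxes as two opposite corners instead of a start point with width and height. The following sketch converts such a box into the rectangle format shown above. It assumes that x and y denote the corner with the smallest coordinates, which this page does not state explicitly; rectangle_from_corners is a hypothetical helper name.

```python
def rectangle_from_corners(x1, y1, x2, y2):
    """Build one rectangle entry from two opposite corner points.

    Assumption: x and y are the corner with the smallest coordinates.
    """
    return {
        "rectangle": [
            {
                "x": min(x1, x2),
                "y": min(y1, y2),
                "w": abs(x2 - x1),
                "h": abs(y2 - y1),
            }
        ],
        "classifications": {},
    }


# Corners chosen so that the result matches the bus in bus.json.
bus = rectangle_from_corners(272, 302, 596, 471)
print({"bus": [bus]})
```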