Import Data
How to import labels
DataGym can import your labeled data. This document describes how the import works and what it requires.
If you use our API, you need a valid API Token. How to create one is described here.
The project id is also required. You can find it in the URL of the project's detail page or retrieve it via the API, for example with the endpoint that lists all projects.
In addition, the internal ids of all affected images are required. They can be retrieved via the API, for example with the endpoint that gets a dataset by its id.
To use the import feature, the import data must match the label configuration.
You can import the labeled data either via our Python package, which uses the public API, or by uploading a JSON-formatted file from the project overview page.
The import data format is similar to the export format and is described in the following sections.
As an example to visualize the import, the following configuration is used. It's the same configuration used in the Export Data section of this documentation:
Within the exported data the label_classes look like the following snippet from the export data section:
The class names used there are required for the import. The geometry type or classification type of each class also determines the import format.
Classifications of type option are exported as type radio if they contain up to three options; otherwise, they are exported as type select.
The section "label_classes" is not part of the import JSON, but it is useful for understanding how the import works.
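The radio/select export rule described above can be expressed as a small helper. This is only a minimal sketch of the stated rule; the function name `exported_option_type` is hypothetical and not part of DataGym.

```python
def exported_option_type(options):
    """Return the export type for a classification of type 'option'.

    Follows the rule stated above: up to three options -> 'radio',
    more than three -> 'select'. The function name is hypothetical.
    """
    return "radio" if len(options) <= 3 else "select"


print(exported_option_type(["red", "green", "blue"]))           # radio
print(exported_option_type(["red", "green", "blue", "black"]))  # select
```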
Let's look at the expected data:
In its simplest form, the import looks like this:
Property | Description |
internal_media_ID | The internal id to identify the media. |
keepData | If keepData is false, all existing labels for the current image are deleted when the new labels are uploaded. If keepData is true, the new labels are added to the existing labels for the current image. The default value is true. |
global_classifications | Uses the label_classes to describe the image. |
labels | Uses the class names to describe a geometry within the image. |
A successful import requires a valid media id in the internal_media_ID node.
Each of the other three properties is optional.
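As a hedged illustration, the properties above can be assembled in Python and serialized to JSON. The sketch assumes that the import file is a JSON array with one object per image; the media id is a placeholder, and the helper `build_entry` is hypothetical and not part of the DataGym package.

```python
import json


def build_entry(media_id, keep_data=None, global_classifications=None, labels=None):
    """Build one import entry; optional properties are left out when not given."""
    entry = {"internal_media_ID": media_id}  # the only required property
    if keep_data is not None:
        entry["keepData"] = keep_data
    if global_classifications is not None:
        entry["global_classifications"] = global_classifications
    if labels is not None:
        entry["labels"] = labels
    return entry


# Placeholder id; use the internal id of an image in your dataset.
minimal = build_entry("<internal-media-id>")

# keepData=False: the labels that already exist for the image are deleted.
replacing = build_entry("<internal-media-id>", keep_data=False, labels={})

# Assumption: the import file holds a JSON array of entries.
import_json = json.dumps([minimal, replacing], indent=2)
print(import_json)
```

Leaving out `keepData` relies on the documented default of true, so new labels are added to the existing ones.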
To check whether your import JSON is valid, you can validate it against this JSON Schema.
For validation, you can use one of the validators listed on the official JSON Schema page, which are available for several programming languages and for the console, or one of the many online JSON Schema validators.
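Before running a full schema validation, a quick structural pre-check can catch the most common mistakes. The sketch below only checks rules stated on this page (a required `internal_media_ID`, a boolean `keepData`, and object-valued `global_classifications` and `labels`). It does not replace the JSON Schema, and the function name `precheck_entry` is hypothetical.

```python
def precheck_entry(entry):
    """Return a list of problems found in one import entry (empty if none)."""
    problems = []
    if "internal_media_ID" not in entry:
        problems.append("internal_media_ID is required")
    if "keepData" in entry and not isinstance(entry["keepData"], bool):
        problems.append("keepData must be true or false")
    for key in ("global_classifications", "labels"):
        if key in entry and not isinstance(entry[key], dict):
            problems.append(f"{key} must be an object")
    return problems


print(precheck_entry({"internal_media_ID": "<internal-media-id>"}))  # []
print(precheck_entry({"keepData": "yes", "labels": []}))
```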
Let's have a look at the global_classifications property:
This property may look like the following snippet:
This example can also be found on the Export Data page. Nested classifications are possible. If a value is not defined, use null.
The possible object keys and their values are defined in the label configuration.
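To illustrate the null rule, the sketch below serializes a classification object from Python, where `None` becomes `null` in JSON. The keys `image_quality` and `weather` are hypothetical. The real keys come from your label configuration, and the shape of the values (shown here as a list of selected options) is an assumption of this example.

```python
import json

global_classifications = {
    "image_quality": ["good"],  # hypothetical class name and value
    "weather": None,            # no value defined -> serialized as null
}

snippet = json.dumps({"global_classifications": global_classifications})
print(snippet)
# {"global_classifications": {"image_quality": ["good"], "weather": null}}
```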
Let's have a look at the labels object:
The snippet may look like the following:
This object holds all classes of geometries defined within the label configuration and found within your image. Each entry is a list of geometries. Each geometry in such a list can contain the following entries:
Property | Description |
geometry | This section defines the area of interest within the image. The structure depends on the type of geometry. (Deprecated, use the appropriate entry from one of the properties below) |
polygon | Use this property if you want to import a polygon |
line | Use this property if you want to import a line |
point | Use this property if you want to import a point |
rectangle | Use this property if you want to import a rectangle |
classifications | This section is similar to the global_classifications section described above. |
nested_geometries | This section holds an array of individual geometries (like those above) |
image_segmentation | Importing image segmentation values is not supported yet. |
Let's define a car to look at:
The car's geometry type is defined as a polygon. See the label_classes.json or the label configuration for reference. To define a polygon, a list of points is required. You need to enter at least three points for a valid polygon, but you can enter as many as needed.
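A car entry along these lines could be built as follows. This is a sketch: the coordinates and the classification `car_color` are made up, and each point is assumed to be an object with `x` and `y` values.

```python
import json

car = {
    # A polygon is a list of points; at least three are needed.
    "polygon": [
        {"x": 10, "y": 10},
        {"x": 120, "y": 10},
        {"x": 120, "y": 60},
        {"x": 10, "y": 60},
    ],
    # Hypothetical classification; real keys come from the label configuration.
    "classifications": {"car_color": ["blue"]},
}

assert len(car["polygon"]) >= 3, "a polygon needs at least three points"

labels = {"car": [car]}  # each class name maps to a list of geometries
print(json.dumps({"labels": labels}, indent=2))
```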
This is an example of a line. Lines are defined as polylines, so they need at least two points but can have as many as needed.
A point geometry contains exactly one point with x and y values.
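The line and point rules can be illustrated in the same way. The class names `lane` and `headlight` are hypothetical, and wrapping the single point in a list is an assumption of this sketch.

```python
lane = {
    # A line is a polyline: two points or more.
    "line": [{"x": 0, "y": 200}, {"x": 150, "y": 180}, {"x": 300, "y": 175}],
    "classifications": {},
}

headlight = {
    # A point geometry holds exactly one point (the list wrapper is assumed).
    "point": [{"x": 42, "y": 57}],
    "classifications": {},
}

labels = {"lane": [lane], "headlight": [headlight]}
print(len(lane["line"]) >= 2, len(headlight["point"]) == 1)  # True True
```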
A bus may look like the following snippet:
The geometry now defines a rectangle with the start coordinates x and y and the properties w and h for its width and height. In addition, the classifications object is empty because bus has no classifications.
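Putting the geometry rules together, the sketch below builds a bus entry and checks a geometry against the point-count rules described in this section. The helper `check_geometry` is hypothetical, the coordinates are made up, and the exact rectangle layout (a list holding one object with `x`, `y`, `w`, and `h`) is an assumption.

```python
MIN_POINTS = {"polygon": 3, "line": 2}  # minimum number of points per type


def check_geometry(geometry):
    """Return a list of rule violations for one geometry entry."""
    problems = []
    for kind, minimum in MIN_POINTS.items():
        if kind in geometry and len(geometry[kind]) < minimum:
            problems.append(f"{kind} needs at least {minimum} points")
    if "point" in geometry and len(geometry["point"]) != 1:
        problems.append("point needs exactly one point")
    for rect in geometry.get("rectangle", []):
        missing = {"x", "y", "w", "h"} - set(rect)
        if missing:
            problems.append(f"rectangle is missing {sorted(missing)}")
    return problems


bus = {
    "rectangle": [{"x": 30, "y": 40, "w": 200, "h": 90}],
    "classifications": {},  # bus has no classifications
}

print(check_geometry(bus))                              # []
print(check_geometry({"polygon": [{"x": 1, "y": 1}]}))  # ['polygon needs at least 3 points']
```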