
Preparing to run the OpenAPI collector

Setting up prerequisites for running the collector

Make sure that the machine on which you run the collector meets the following hardware and software requirements.

Table 1.

Item | Requirement

Hardware
Note: The following specs are based on running one collector process at a time. Adjust the hardware if you run multiple collectors at the same time.
RAM | 8 GB
CPU | 2 GHz processor

Software
Docker | Click here to get Docker.

data.world specific objects
Dataset | You must have a ddw-catalogs dataset set up to hold your catalog files when you are done running the collector. If you are using Catalog Toolkit, follow these instructions to prepare the datasets for collectors.

Network connection
Allowlist IPs and domains



Preparing the OpenAPI spec file for the collector

Before running the OpenAPI v3 collector, you will need to make sure your API specifications are set up correctly. The collector uses metadata in your OpenAPI specification to generate unique IRIs (Internationalized Resource Identifiers) for the API itself and for its constituent parts (such as endpoints, parameters, and schemas). These IRIs are critical because they determine how resources are identified and linked across updates.

To prepare the file:

  1. Open your API’s OpenAPI v3 spec file (YAML or JSON).

  2. Add a stable identifier:

    1. Include a custom property called x-api-kos-id in the root info object of your specification.

    2. Set its value to a unique, stable identifier string for this API (for example, my-service-v1).

      openapi: 3.0.1
      info:
        title: My Service
        version: "1.0"
        x-api-kos-id: my-service-v1

    The collector uses this ID to distinguish resources across different versions of your API. If the x-api-kos-id changes, the collector will treat it as a new API and generate new IRIs for its resources.

  3. If you don’t provide x-api-kos-id, the collector falls back to the title property from the spec. Titles are often edited for cosmetic or branding reasons, so relying only on the title means a renamed API can be treated as a new API, with new IRIs and duplicate resources for what is really the same API.

  4. Ensure uniqueness across APIs. Each API spec should have its own unique x-api-kos-id value. This prevents collisions where different APIs might otherwise share the same title.
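The checks described in the steps above can be sketched as a small pre-flight script. This is only an illustration, not part of the collector: the helper names `effective_id` and `check_specs` are hypothetical, the specs are assumed to be already parsed from JSON, and the fallback logic simply mirrors the behavior described above (prefer `x-api-kos-id`, otherwise use `title`).

```python
import json
from collections import defaultdict


def effective_id(spec: dict) -> tuple[str, bool]:
    """Return (identifier, is_explicit) for a parsed OpenAPI spec.

    Assumption: mirrors the documented behavior of preferring
    info.x-api-kos-id and falling back to info.title when absent.
    """
    info = spec.get("info", {})
    explicit = info.get("x-api-kos-id")
    if explicit:
        return str(explicit), True
    return str(info.get("title", "")), False


def check_specs(specs: dict[str, dict]) -> list[str]:
    """Hypothetical helper: list problems found across named specs."""
    problems = []
    seen = defaultdict(list)  # identifier -> names of specs that use it
    for name, spec in specs.items():
        ident, is_explicit = effective_id(spec)
        if not is_explicit:
            problems.append(
                f"{name}: no x-api-kos-id; would fall back to title {ident!r}"
            )
        seen[ident].append(name)
    for ident, names in seen.items():
        if len(names) > 1:
            problems.append(
                f"identifier {ident!r} is shared by: {', '.join(sorted(names))}"
            )
    return problems


if __name__ == "__main__":
    # Two illustrative specs; the file names are made up for this example.
    orders = json.loads(
        '{"openapi": "3.0.1", "info": {"title": "My Service", '
        '"version": "1.0", "x-api-kos-id": "my-service-v1"}}'
    )
    billing = json.loads(
        '{"openapi": "3.0.1", "info": {"title": "My Service", "version": "2.3"}}'
    )
    for problem in check_specs({"orders.json": orders, "billing.json": billing}):
        print(problem)
```

Running a check like this before the collector flags specs that rely on the title fallback, as well as any identifier that more than one spec would resolve to.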