Skip to main content

Assign tags to database tables

Goal of this automation

With this automation we will check names of tables and add corresponding tags to them. For example, all tables with sales in the name will be automatically tagged as sales.

Note

Before you begin - Ensure your organization has cataloged database tables and they appear in your catalog.

flow_eu_automation_comp.png

STEP 1: Prepare the seed file and add to a dataset

First we will prepare the seed file:

  • Create a seed file that defines the words that the automation will look for in the table names and the corresponding tags that it will assign to the tables when the word. In this example, all table names that have the string sales will have a tag sales, and so on. You can add to this seed file any assignments you wish.

    tables-tag.seed.csv

    string,tag
    sales,sales
    order,order
    sys,system
    tmp,temp
    temp,temp

Now, we will create a dataset and add the seed file to it:

  1. Create a dataset and name it ddw-eureka-assign-tags-to-tables.

  2. Upload the seed file to the ddw-eureka-assign-tags-to-tables dataset.

STEP 2: Add dataset and SPARQL query to the Eureka Automations Project

Note

These tasks are done in the ddw-eureka-automations project created here.

  1. Go to the project ddw-eureka-automations.

  2. On the Project details page, click the Launch Workspace button.

  3. Link the dataset ddw-eureka-assign-tags-to-tables created in STEP 1 to the project by clicking Add > Dataset > .

  4. Create a new SPARQL Query by clicking +Add > SPARQL Query.

  5. Add the following query. Replace yourorgname with the name of your organization.

    PREFIX : <https://yourorgname.linked.data.world/d/ddw-eureka-assign-tags-to-tables/>
    PREFIX dwec: <https://dwec.data.world/v0/> 
    PREFIX dct:  <http://purl.org/dc/terms/>
    
    CONSTRUCT{
        ?table dwec:textTag ?tag
    }
    where {
        ?table a dwec:DatabaseTable;
            dct:identifier ?tableID;
        .
    
        ?x a  :tbl-tables_tag_seed;
            :col-tables_tag_seed-string ?tableString;
            :col-tables_tag_seed-tag ?tag;
        .
        FILTER regex(?tableID, ?tableString, "i") 
    }
  6. Save the query with:

    1. Name: ea-assign-tags-to-tables.result.ttl.rq

    2. Description: Given a seed file of strings to match and corresponding tag, apply tags to database table.

STEP 3: Run your automation

Note

This file goes in theddw-catalogs dataset.

  1. Execute the SPARQL query added in Step 2.

  2. From the Results page, select Download the results > Save to dataset.

  3. Set file name. The query name will automatically appear. Just delete the suffix .rq

  4. Save the file to the ddw-catalogs dataset of your organization. You must set the file to auto-sync every hour.

    Note

    SuccessπŸ‘ The file is added to your dataset.

View the results! πŸŽ‰

Go to your organization profile page and from the Resources tab, view the tags available for filtering.

Note

πŸŽ‰ Congrats!

Your tables are automatically tagged based on the terms scanned in the table names. You can use these tags to filter your resources lists.

STEP 4: Add the Eureka Automation SPARQL query

  • Download the following file. Replace yourorgname with the name of your organization and upload it to the Apply column count tag dataset.

    eureka-assign-tags-to-tables.result.ttl.rq

    PREFIX : <https://yourorgname.linked.data.world/d/assign-tags-to-tables/>
    PREFIX dwec: <https://dwec.data.world/v0/> 
    PREFIX dct:  <http://purl.org/dc/terms/>
    
    CONSTRUCT{
        ?table dwec:textTag ?tag
    }
    where {
        ?table a dwec:DatabaseTable;
            dct:identifier ?tableID;
        .
    
        ?x a  :tbl-tables_tag_seed;
            :col-tables_tag_seed-string ?tableString;
            :col-tables_tag_seed-tag ?tag;
        .
        FILTER regex(?tableID, ?tableString, "i") 
    }

STEP 5: Run your automation

  1. Execute the SPARQL query added in Step 5.

  2. From the Results page, select Download the results > Save to dataset.

  3. Set file name. The query name will automatically appear. Just delete the suffix .rq

  4. Save the file to the ddw-catalogs dataset of your organization. You must set the file to auto-sync every hour.

    Note

    This file goes in theddw-catalogs dataset.

View the results! πŸŽ‰

Go to your organization profile page and from the Resources tab, view the tags available for filtering.

Note

πŸŽ‰ Congrats!

Your tables are automatically tagged based on the terms scanned in the table names. You can use these tags to filter your resources lists.