Generate relationships between tables and business glossary terms
Goal of this Eureka automation
This automation runs against the all cataloged database tables and all business glossary terms and generates relationships between them.
Note
Before you begin - Ensure your organization has cataloged database tables and business glossary terms and they appear in your catalog.
STEP 1: Add automation SPARQL query to the Eureka Automations Project
Note
The SPARQL query goes in the ddw-eureka-automations project created here.
Go to the project ddw-eureka-automations.
On the Project details page, click the Launch Workspace button.
Create a new SPARQL Query by clicking +Add > SPARQL Query.
Add the following query. Replace yourorgname with the name of your organization.
PREFIX : <https://yourorgname.linked.data.world/d/generate-relationships/> PREFIX fn: <http://data.world/function/functions#> PREFIX prov: <http://www.w3.org/ns/prov#> PREFIX dwec: <https://dwec.data.world/v0/> construct { ?table_IRI dwec:subject ?term_IRI. } where { ?btType rdfs:subClassOf* dwec:BusinessTerm . ?term_IRI a ?btType ; dct:title ?BizTerm ; . MINUS{?term_IRI prov:wasGeneratedBy dwec:Eureka} BIND(LCASE(?BizTerm) AS ?BizTerm1) # standardize to lower case BIND(REPLACE(?BizTerm1, "[^a-z0-9]", "") AS ?BizTerm_ClusterKey) # collapse whitespace and punctuation ?table_IRI a dwec:DatabaseTable ; dct:identifier ?table_ID ; . BIND(LCASE(?table_ID) AS ?table_ID1) # standardize to lower case BIND(REPLACE(?table_ID1, "[^a-z0-9]", "") AS ?table_ClusterKey) # collapse whitespace and punctuation ##StringMatching BIND(fn:jaroWinklerDistance(?BizTerm_ClusterKey, ?table_ClusterKey) AS ?jw) BIND(fn:levenshteinDistance(?BizTerm_ClusterKey, ?table_ClusterKey) AS ?lev) FILTER(?jw > .8) }
Save the query with:
Name:Â ea-generate-relationships.result.ttl.rq
Description: Given a business glossary, automatically create relationships between Database Tables and Business Glossary terms.
STEP 2: Run your Eureka automation
Execute the SPARQL query from the previous step.
From the Results page, select Download the results > Save to dataset.
Set file name. The query name will automatically appear. Just delete the suffix .rq
Save the file to the ddw-catalogs dataset of your organization. You must set the file to auto-sync every hour.
Note
This file goes in the ddw-catalogs dataset.
View the results! 🎉
Browse to the Glossary of your organization.
Note
🎉 Congrats! You will see the list of newly added glossary terms.
Browse to the tables you used for creating a list of glossary terms.
Note
🎉 Congrats! You will see the that the table now shows related business terms.