Documentation

Overview of the custom_types.ttl file

In this article we introduce the composition of the custom_types.ttl file and the vocabulary used to describe it. Then we go on to discuss the format of the file and indicate what should not be changed, what must be changed, what can be changed, and how. If you do not currently have a ddw-ontologies dataset with a custom_types.ttl file in it, you can request one from support. Your custom types file can be called anything you'd like and stored in any of your organization's datasets that you would like.

Note

Though you can name your .ttl file anything you would like and put it in any dataset you would like, we recommend using ddw-ontologies as the name for your dataset and storing the .ttl file there.

A .ttl file is divided into two parts: the prologue at the top followed by the body. The prologue is where short forms of the prefixes for the iris (addresses or references for resources) are defined. These short forms are used in the body so that you don't have to type out the long form of the prefix each time you describe a resource.

Figure 1. .ttl prologue
.ttl prologue


Most of the prefixes in the custom_types.ttl prologue refer to standard ontologies and should be left alone. The one exception is the hg prefix which should be changed to reference your organization's ddw-ontologies dataset:

CTttl_2_hg_prefix.png

Important

Because you are defining the prefixes in the prologue, you can rename them, for example changing hg to something else. However, if you do, you will also need to change all the references to that prefix elsewhere in the file:

Ctttl_3_Must_change_if_prefix_changes.png

The body is made up of information about resources. This information is in a form called a triple . A triple has a subject-predicate-object format, which is a standardized way of describing something. For example, one triple might be, "Data science is fun for everyone" "Data science" is the subject, "is fun for" is the predicate, and "everyone" is the object.

Triples for the same resource can be grouped together in a stanza. The benefit of grouping the triples for a resource is that

  • The subject in the triple does not have to be rewritten for every triple in the stanza

  • You can see all the triples for the same resource at once.

The first stanza of the body of the data.world custom_types.ttl file describes the data.world ontology. It needs to remain unchanged for custom types to run.

CTttl_4_leave_at_the_top_needs_to_be_there_to_run_correctly.png

The second stanza contains an example of a custom data type defined by data.world:

CTttl_second_stanza.png

The following table shows the triples in the stanza separated into subjects, predicates and objects. The elements that can be modified are shown in bold. Notice that only the subject and the objects that are text strings can be changed, and the prefix for the subject should only be changed if it was changed in the prologue:

Table 6. Second stanza editable elements

Line

Subject

Predicate

Object

line 1

hg:ssn

a

csvw:Datatype

line 2

dc:title

"SSN"

line 3

dc:description

"Social Security Number"

line 4

dwo:isDefinedIn

hg:

line 5

csvw:base

xsd:string

line 5

csvw:format

"[0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9]"

line 7

dwo:contraintsOptional

true



The third stanza shows a masked custom data type:

CTttl_third_stanza.png

As in the previous example, the elements that can be modified are in bold:

Table 7. Third stanza editable elements

Line

Subject

Predicate

Object

line1

hg:masked_ssn

a

csvw:Datatype

line 2

dc:title

"Masked SSN"

line 3

dc:description

"A Social Security number (SSN) is a ..."

line 4

dwo:isDefinedIn

hg:

line 5

csvw:base

dwo:maskedString

line 6

csvw:format

"[0-9]{3}-[0-9]{2}-[0-9]{4}"

line 7

dwo:mask

[ a dwo:RegexMask; dwo:matchPattern "([0-9]{3})-([0-9]{2})-([0-9]{4})"; dwo:rewritePattern "XXX-XX-$3"]



The example is complicated a bit by additional predicates and objects nested in the object of the triple on the seventh line. However, as before, only the subject and objects that are text strings can be modified. Everything else should stay the same. For more information about Regex and how to use it, visit the Oracle Java documentation.