The opportunity
It is not clear in which order to read traces when there is more than one in a single semantic correspondence statement. Some traces are, in fact, qualifiers of other traces.
Is it possible to:
- somehow differentiate the traces and
- apply a reading order?
Discussion
The discussion has been split into parts to make it easier to follow.
Assumptions
The SWIM Information Definition Specification assumes that we are not yet in an environment that is fully supported by semantic technologies. Therefore, humans are the more realistic target for the semantic correspondence statements. This leads to the following statements about traces:
- The traces are to be read by humans but may be read by machines (machine processable).
- The traces are to be created and inspected by humans. There is nothing to stop machines creating the mappings but any such mappings will still need human inspection.
- The maintenance and migration of traces can be managed by machines (or humans) based on scripts.
The need for a machine readable list of mappings has been recorded in the AIRM CCB to ease the migration of mappings between versions of the AIRM. This can be summarised as the need for "machine processable" maintenance of traces.
Different types of traces
The table below outlines the names and definitions to be used for the different types of trace. The names are inspired by the wording of the SWIM Information Definition Specification's requirements. This approach makes it clear which requirement is being satisfied by the trace.
Trace name | Definition | Requirement | Trace required |
---|---|---|---|
"information concept" trace | trace from the information concept in the information definition to the AIRM concept that has an equivalent or wider meaning | SWIM-INFO-016 Mapping of information concepts | requires one concept trace |
 | | SWIM-INFO-017 Mapping of data concepts | requires one concept trace and one data type trace |
"narrowing" trace | trace to an AIRM concept to fully describe the narrowing of the concept being mapped | SWIM-INFO-018 Additional traces to clarify the mapping | allows any number of additional narrowing traces |
Source and target of traces
The Interoperability Architecture provides good guidance on the best place to start when looking to establish a semantic correspondence. Basically, the best place to start is usually the adjacent box within the grid.
The usual start point depends on the type of information definition being traced.
Type of information definition | Best place to start | Trace name |
---|---|---|
information exchange requirements | The best place to start is the AIRM Conceptual Model. However, information exchange requirements can vary in the level of detail included. Therefore, if no suitable AIRM concept is found there, the AIRM Logical Model may also be useful. The specification doesn't rule out mapping to the AIRM Contextual Model but this is not a good practice. | "information concept" trace |
service message | The best place to start is the AIRM Logical Model. Although it is difficult to give generic advice that is applicable in all cases, the following guidance may be of help: classes have one trace (SWIM-INFO-016); attributes have two traces (SWIM-INFO-017); the "data type" trace is dependent on the "data concept" trace. Difference between information and data concepts. If no suitable AIRM concept is found there, the Conceptual Model may be used for the "data concept" trace. The specification doesn't rule out mapping to the AIRM Contextual Model but this is not a good practice. Note: AIRM has internal traces that are inherited by any mapping. | |
 | Clarifying traces should be in the same model as the trace being qualified. | "clarifying" trace |
Reading order of traces
General reading order is:
- "information concept" trace
- "clarifying" traces (0..*)
or
- "data concept" trace
- "data type" trace (1)
- "clarifying" traces (0..*)
All traces have an AND relationship.
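The general reading order can be sketched in a few lines of Python. This is only an illustration: the `READING_ORDER` table, the `in_reading_order` function and the placeholder AIRM targets are assumptions made for this example and are not defined by the specification.

```python
# Hypothetical ranks for the general reading order described above.
# A root trace ("information concept" or "data concept") is read first,
# then the "data type" trace, then any "clarifying" traces.
READING_ORDER = {
    "information concept": 0,
    "data concept": 0,
    "data type": 1,
    "clarifying": 2,
}


def in_reading_order(traces):
    """Return (kind, target) pairs sorted into the general reading order.

    sorted() is stable, so clarifying traces keep the order in which
    they were recorded.
    """
    return sorted(traces, key=lambda trace: READING_ORDER[trace[0]])


# Placeholder targets; real mappings would point at AIRM concepts.
mapping = [
    ("clarifying", "AIRM concept C"),
    ("data type", "AIRM data type B"),
    ("data concept", "AIRM concept A"),
    ("clarifying", "AIRM concept D"),
]

for kind, target in in_reading_order(mapping):
    print(f"{kind}: {target}")
```

Because the sort is stable, the sketch does not impose any order among the clarifying traces themselves; whether such an order is needed is one of the open questions in this discussion.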
The following rules apply to the traces:
- The root trace is mandatory. This is either an "information concept" trace or a "data concept" trace.
- A "data type" trace is mandatory when the root trace is a "data concept" trace.
- "Clarifying" traces cannot exist in their own right.
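The three rules above lend themselves to a simple automated check. The following Python sketch is one possible reading of them; the function name `check_mapping`, the kind labels and the wording of the messages are assumptions made for illustration only.

```python
# Kinds of trace that can act as the root of a mapping.
ROOT_KINDS = ("information concept", "data concept")


def check_mapping(kinds):
    """Return a list of rule violations for the trace kinds in one mapping.

    `kinds` is a list of strings such as "data concept" or "clarifying".
    An empty list means the mapping satisfies the three rules.
    """
    problems = []
    # Rule 1: the root trace is mandatory.
    if not any(kind in ROOT_KINDS for kind in kinds):
        problems.append("missing root trace")
        # Rule 3: clarifying traces cannot exist in their own right.
        if "clarifying" in kinds:
            problems.append("clarifying traces cannot exist in their own right")
    # Rule 2: a "data concept" root needs a "data type" trace.
    if "data concept" in kinds and "data type" not in kinds:
        problems.append('a "data concept" trace needs a "data type" trace')
    return problems


print(check_mapping(["information concept", "clarifying"]))
print(check_mapping(["data concept"]))
print(check_mapping(["clarifying"]))
```

A check like this could support the human inspection of mappings, but it only covers the structural rules and says nothing about whether the chosen AIRM concepts are semantically correct.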
What logic should determine the order of multiple traces? Possible options (possibly in combination):
- from most necessary to least necessary ("which trace is the minimum needed to understand the concept?")
- from logical to technical ("what does the data contain - in which format does it come?")
- from technical to logical ("which format does the data have - what does it contain?")
How many traces are enough?
- Would it be helpful to give advice on how many traces are acceptable, for example "up to 5 traces"?
- If too many traces are needed, should a change request be considered instead?
- Multiple traces: Conceptual vs. Logical Model - which to prefer?
Level of semantic correspondence
Advanced users may like to add extra detail concerning the level of semantic correspondence achieved. The requirements talk about "equivalent or wider meaning". The table below contains the old AIRM Rulebook names and the skos equivalents.
Definition being traced to is... | Annotations that can make this more explicit in SESAR documents | in skos |
---|---|---|
Equivalent |
|
|
Wider | Specialised: The definition in the information definition is a special case of the definition found in the AIRM. | skos:narrowMatch: used to state a hierarchical mapping link between two concepts. |
The skos names are preferred. Skos has rich support in semantic technologies. However, SESAR documents use slightly different names based on the old AIRM Rulebook.
Both options are therefore valid.
Narrowing traces are only needed if the main trace is "specialised" or "narrowMatch".
Traces cannot be annotated as "generalised" as this breaks the requirement.
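The two statements above about annotation levels can be expressed as a small check. This Python sketch is an assumption-laden illustration: `check_level`, the level strings and the returned messages are hypothetical, and only the levels named in this section are covered.

```python
# Levels for which narrowing traces are expected (names taken from the text above).
NARROWING_LEVELS = {"specialised", "narrowMatch", "skos:narrowMatch"}
# Levels that break the "equivalent or wider meaning" requirement.
FORBIDDEN_LEVELS = {"generalised"}


def check_level(level, narrowing_trace_count):
    """Return "ok" or a short message describing the problem.

    `level` is the annotation on the main trace and
    `narrowing_trace_count` is the number of narrowing traces in the mapping.
    """
    if level in FORBIDDEN_LEVELS:
        return "not allowed: the AIRM concept must have an equivalent or wider meaning"
    if narrowing_trace_count > 0 and level not in NARROWING_LEVELS:
        return "unexpected: narrowing traces are only needed for specialised/narrowMatch"
    return "ok"


print(check_level("specialised", 2))
print(check_level("equivalent", 1))
print(check_level("generalised", 0))
```

The sketch treats narrowing traces on an equivalent mapping as something to flag rather than as an error, since the text only says they are not needed.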
Annotating traces
It is possible to add further notes to the mapping (the container for one or more traces). This comes in handy when, for example, tracing legacy interfaces that have data type constraints leading to loss of information.
The table below gives two alternatives for recording the traces in XSD.
Using element names is the preferred option as it can be used more easily in rules. The element name contains semantic hints even if the attributes are not added.
However, the attribute option is also supported as there are a lot of traces developed that do not use element names. Support for this option should be deprecated in the future.
<dataConceptTrace>
<dataTypeTrace>
<trace keyword="dataConceptTrace">
<trace keyword="dataTypeTrace">
"clarifying" trace
<trace keyword="clarifyingTrace">
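Since support for the attribute option may be deprecated, existing traces could be migrated to the element-name option by script. The Python sketch below shows one way this might be done with the standard library. The `<mapping>` wrapper, the trace contents, the `clarifyingTrace` element name and the function `keyword_to_element` are assumptions made for this example; they are not taken from an agreed schema.

```python
import xml.etree.ElementTree as ET


def keyword_to_element(root):
    """Rename every <trace keyword="X"> element to <X>, in place.

    The keyword attribute is removed because the element name now
    carries the same information.
    """
    # Materialise the list first so renaming does not disturb the iteration.
    for element in list(root.iter("trace")):
        keyword = element.attrib.pop("keyword", None)
        if keyword:
            element.tag = keyword
    return root


# Hypothetical legacy fragment using the attribute option.
legacy = """
<mapping>
  <trace keyword="dataConceptTrace">AIRM concept A</trace>
  <trace keyword="dataTypeTrace">AIRM data type B</trace>
  <trace keyword="clarifyingTrace">AIRM concept C</trace>
</mapping>
"""

root = keyword_to_element(ET.fromstring(legacy))
print(ET.tostring(root, encoding="unicode"))
```

A `<trace>` element without a `keyword` attribute is left unchanged, so such traces would still need human inspection after the migration.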
Full example
If we apply all of this:
see if "degree" has a skos name instead.
check for a compressed notation
note attribute should become a <comment> element.
or: