FAIR Data is a set of principles applied to data to make it:
- Findable: based on common, human-readable language that is independent of standards, silos and constraints (semantics)
- Accessible: both inside and outside company boundaries, as an overlay on top of existing infrastructure without re-architecting existing systems
- Interoperable: a safe space for secure and trusted data sharing, creating a true data ecosystem in which you control what can and cannot be shared
- Reusable: integrate data only once then easily share all or part of your data with trusted parties, removing the need for traditional point-to-point integrations and centralised data storage
You can read more about the origins of FAIR data on the GO FAIR website.
IOTICS provides the infrastructure and the tooling that supports the “FAIRification” process. In other words, users can use IOTICS to make their data FAIR.
In this paper we analyse how IOTICS relates to the FAIR data principles and how the FAIRification process can be implemented.
The core concepts in IOTICS are “data interactions” and “digital twins”.
A digital twin is a virtual representation of a “real” asset and it provides a single access point to both metadata of the asset and its data.
“Data interactions” occur when two twins exchange or share data with each other.
Because digital twins make the underlying data assets FAIR, IOTICS enables interactions to be dynamic and autonomous.
In other words, digital twins in IOTICS form a network of FAIR data points interacting dynamically and autonomously with each other.
Autonomous data interactions happen because IOTICS supports a “find and bind” pattern whereby a twin wanting some data can search the network, find relevant twins matching the search criteria, describe them to determine whether they’re useful or not, then bind to them to receive the data.
The find and bind pattern can be programmed into each twin, which makes the twin an autonomous agent in the network.
Although the FAIR data principles are generic and apply to any kind of (meta)data and to any data consumer (both humans and machines), the emphasis in IOTICS is on streaming data representing the “now” view of the asset.
Let’s dive into how IOTICS allows users to make their data FAIR by analysing the FAIR data principles one by one, in detail.
Findability refers to being able to find metadata and data unambiguously by humans and machines.
✅ IOTICS uses the W3C DID specification. Each twin has a globally unique and persistent identifier.
A DID is globally resolvable to a DID document providing the cryptographic identity of the twin, its keys and permissions. A globally resolvable identifier provides a simple mechanism to disambiguate
Persistence refers to the fact that the same DID refers to the same data point once assigned. This is important in that it allows disambiguation and reliability. In IOTICS, once a DID is assigned to a twin, it can’t be changed.
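The identifier properties described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not IOTICS code: the regular expression is a simplified form of the generic `did:<method>:<method-specific-id>` syntax from W3C DID Core, the names `TwinDid`, `parse_did` and `minimal_did_document` are hypothetical, and `did:example:twin123` is a made-up identifier. A real DID document would also carry keys and permissions, which are left out here.

```python
import re
from dataclasses import dataclass

# Simplified form of the generic DID syntax: did:<method>:<method-specific-id>
_DID_PATTERN = re.compile(r"^did:([a-z0-9]+):([A-Za-z0-9._:%-]+)$")


@dataclass(frozen=True)  # frozen: once created, the identifier cannot be changed
class TwinDid:
    method: str
    identifier: str

    def __str__(self) -> str:
        return f"did:{self.method}:{self.identifier}"


def parse_did(value: str) -> TwinDid:
    """Split a DID string into its method and method-specific identifier."""
    match = _DID_PATTERN.match(value)
    if match is None:
        raise ValueError(f"not a valid DID: {value!r}")
    return TwinDid(method=match.group(1), identifier=match.group(2))


def minimal_did_document(did: TwinDid) -> dict:
    """Skeleton of the document a DID resolves to (keys and permissions omitted)."""
    return {"@context": "https://www.w3.org/ns/did/v1", "id": str(did)}


# Hypothetical identifier, for illustration only.
did = parse_did("did:example:twin123")
document = minimal_did_document(did)
```

Making the dataclass frozen mirrors the persistence requirement: any attempt to reassign `did.method` or `did.identifier` raises an error.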
✅ In IOTICS, “data” is generated by the underlying “real” asset. It is made available in feeds (topics) that another twin can subscribe to. A twin may have as many feeds as needed, and each feed can have as many values as desired.
The twin, the feeds and the values in a feed are all enriched with metadata. The twin decides how rich that metadata is.
Importantly, the metadata plane and the data plane are completely separate. This means that a digital twin can exist and express its metadata even if the underlying real asset is not available or “attached” to the twin itself.
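The separation between the two planes can be pictured with a small in-memory model. The sketch below is an assumption-laden illustration, not the IOTICS data model: the classes `Twin` and `Feed`, the `asset_attached` flag and all field names are hypothetical. It shows a twin answering a metadata request before any asset is attached, and data flowing only once one is.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Feed:
    feed_id: str
    metadata: Dict[str, str]      # metadata plane: describes the feed
    value_labels: List[str]       # names of the values carried by the feed
    subscribers: List[Callable[[dict], None]] = field(default_factory=list)


@dataclass
class Twin:
    did: str
    metadata: Dict[str, str]
    feeds: Dict[str, Feed] = field(default_factory=dict)
    asset_attached: bool = False  # hypothetical flag: is a real asset producing data?

    def describe(self) -> dict:
        """Metadata plane: answerable even when no asset is attached."""
        return {
            "did": self.did,
            "metadata": dict(self.metadata),
            "feeds": {fid: dict(f.metadata) for fid, f in self.feeds.items()},
        }

    def publish(self, feed_id: str, sample: dict) -> None:
        """Data plane: only meaningful while an asset is attached."""
        if not self.asset_attached:
            raise RuntimeError("no asset attached: nothing to publish")
        for callback in self.feeds[feed_id].subscribers:
            callback(sample)


twin = Twin(did="did:example:sensor1", metadata={"label": "Boiler sensor"})
twin.feeds["temperature"] = Feed("temperature", {"unit": "celsius"}, ["reading"])

description = twin.describe()  # works although no asset is attached yet

received: List[dict] = []
twin.feeds["temperature"].subscribers.append(received.append)
twin.asset_attached = True
twin.publish("temperature", {"reading": 21.5})
```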
Metadata, therefore, can be searched in IOTICS.
✅ IOTICS adopts a “find and bind” pattern. Search requests are sent to IOTICS to be matched against metadata, and responses are sent back to the requestor as an array of “twin descriptions”, each including a subset of the available metadata. Consumer twins can then choose to “describe” one or more twins for a full metadata report, or bind directly to one or more feeds to get data.
A successful execution of a bind operation completes the interaction.
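The search, describe and bind steps can be sketched as follows. This is a toy in-memory model written to illustrate the flow, not the IOTICS API: the `Network` class, its method names and the metadata keys are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Sample = Dict[str, float]


@dataclass
class Twin:
    did: str
    metadata: Dict[str, str]
    feed_subscribers: Dict[str, List[Callable[[Sample], None]]] = field(default_factory=dict)


class Network:
    """Hypothetical in-memory registry standing in for the twin network."""

    def __init__(self) -> None:
        self._twins: Dict[str, Twin] = {}

    def register(self, twin: Twin) -> None:
        self._twins[twin.did] = twin

    def search(self, **criteria: str) -> List[dict]:
        """Find: return short descriptions (a metadata subset) of matching twins."""
        return [
            {"did": t.did, "label": t.metadata.get("label", "")}
            for t in self._twins.values()
            if all(t.metadata.get(k) == v for k, v in criteria.items())
        ]

    def describe(self, did: str) -> dict:
        """Describe: return the full metadata and the feed ids of one twin."""
        twin = self._twins[did]
        return {"did": did, "metadata": dict(twin.metadata), "feeds": list(twin.feed_subscribers)}

    def bind(self, did: str, feed_id: str, callback: Callable[[Sample], None]) -> None:
        """Bind: subscribe the callback to one feed of the given twin."""
        self._twins[did].feed_subscribers[feed_id].append(callback)

    def publish(self, did: str, feed_id: str, sample: Sample) -> None:
        for callback in self._twins[did].feed_subscribers[feed_id]:
            callback(sample)


network = Network()
network.register(Twin("did:example:sensor1", {"label": "Boiler sensor", "type": "thermometer"}, {"temperature": []}))
network.register(Twin("did:example:pump1", {"label": "Pump", "type": "pump"}, {"status": []}))

received: List[Sample] = []
for hit in network.search(type="thermometer"):   # find
    full = network.describe(hit["did"])          # describe
    if "temperature" in full["feeds"]:
        network.bind(hit["did"], "temperature", received.append)  # bind

network.publish("did:example:sensor1", "temperature", {"celsius": 71.0})
```

The consumer never needs to know the producer in advance: it only states search criteria and decides, from the descriptions it gets back, which feeds to bind to.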
✅ IOTICS exposes a web API to the application. Each digital twin description is reachable via a URL on this web API. Publicly available twin metadata can therefore be listed and made available to internet search engines.
Once users find the data they require, they need to know how it can be accessed, possibly including authentication and authorisation.
✅ IOTICS uses HTTP/WebSocket/STOMP or gRPC.
The IOTICS delegation model, DIDs and brokered interactions provide the necessary means for authentication and authorisation.
As discussed in F2, data and metadata sit on separate planes, and metadata is available at the application’s discretion, independently of the existence of the underlying real asset.
The data usually need to be integrated with other data. In addition, the data need to interoperate with applications or workflows for analysis, storage, and processing.
IOTICS uses semantic web technology, specifically RDF.
A twin is mapped to a web resource identified by its DID which is a URI.
Every piece of metadata is expressed as a “fact” and implemented as an RDF triple.
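To make the “fact as triple” idea concrete, the sketch below represents two facts about a twin and serialises them as N-Triples lines. It uses only the Python standard library and handles simple string literals only. The `rdf:type` and `rdfs:label` predicate IRIs are the standard W3C ones; the twin DID and the `https://example.org/ontology#Thermometer` class are hypothetical, as are the helper names `Iri`, `Triple` and `to_ntriples`.

```python
from typing import NamedTuple, Union

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


class Iri(str):
    """A resource identifier, as opposed to a plain literal string."""


class Triple(NamedTuple):
    subject: Iri
    predicate: Iri
    obj: Union[Iri, str]


def to_ntriples(triple: Triple) -> str:
    """Serialise one fact as an N-Triples line (simple string literals only)."""
    if isinstance(triple.obj, Iri):
        obj = f"<{triple.obj}>"
    else:
        escaped = triple.obj.replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"'
    return f"<{triple.subject}> <{triple.predicate}> {obj} ."


# The twin is the subject of every fact; its DID is used as the resource URI.
twin = Iri("did:example:sensor1")
facts = [
    Triple(twin, Iri(RDF_TYPE), Iri("https://example.org/ontology#Thermometer")),
    Triple(twin, Iri(RDFS_LABEL), "Boiler sensor"),
]

for fact in facts:
    print(to_ntriples(fact))
```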
IOTICS uses semantic web tech. As mentioned in I1, each piece of metadata is expressed as a subject/predicate/object triple; the subject can be a link to a term that is part of an ontology.
IOTICS applications can refer to custom ontologies, publicly available ontologies or ontologies hosted by IOTICS.
IOTICS allows the use of linked data ✅
The ultimate goal of FAIR is to optimise the reuse of data. To achieve this, metadata and data should be well-described so that they can be replicated and/or combined in different settings.
IOTICS encourages the producer to use “product thinking” when creating twins, together with RDF and semantic web tech. The usefulness of data is subjective, but creating data in context is achievable using linked data, which IOTICS supports directly by means of triple stores ✅
A licence can be linked to the digital twin using RDF (for example this) ✅
Data provenance can be encapsulated using RDF and cryptographically bound to the twin by using the IOTICS DID spec (example here) ✅
There’s an emerging standard for signing RDF metadata, and IOTICS is working to implement it.
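As an illustration of how a consumer might check the reuse terms of a twin before binding to it, the sketch below looks up a licence and an attribution among the twin’s triples. The predicates are the real Dublin Core `dcterms:license` and PROV-O `prov:wasAttributedTo` IRIs, but their use here is an assumption rather than a statement of what IOTICS stores; the DIDs, the choice of licence and the helper names are hypothetical. Signing of the metadata is not shown.

```python
from typing import List, Optional, Tuple

DCTERMS_LICENSE = "http://purl.org/dc/terms/license"
PROV_ATTRIBUTED_TO = "http://www.w3.org/ns/prov#wasAttributedTo"

Triple = Tuple[str, str, str]  # (subject, predicate, object), all IRIs in this sketch


def object_of(triples: List[Triple], subject: str, predicate: str) -> Optional[str]:
    """Return the first object stated for (subject, predicate), or None."""
    for s, p, o in triples:
        if s == subject and p == predicate:
            return o
    return None


def reuse_terms(triples: List[Triple], twin: str) -> dict:
    """Collect the licence and attribution stated for a twin, if any."""
    return {
        "licence": object_of(triples, twin, DCTERMS_LICENSE),
        "attributed_to": object_of(triples, twin, PROV_ATTRIBUTED_TO),
    }


# Hypothetical twin metadata, for illustration only.
twin = "did:example:sensor1"
metadata: List[Triple] = [
    (twin, DCTERMS_LICENSE, "https://creativecommons.org/licenses/by/4.0/"),
    (twin, PROV_ATTRIBUTED_TO, "did:example:operator1"),
]

terms = reuse_terms(metadata, twin)
unlicensed = reuse_terms(metadata, "did:example:unknown")
```

A consumer could refuse to bind when `licence` is `None`, which keeps the reuse decision machine-checkable.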
IOTICS abides by semantic web tech, and domain standards can be encoded in RDF (we have done this conceptually by implementing WaterML as RDF). It’s then possible to share the ontologies and formats as appropriate. ✅
The principles refer to three types of entities: data (or any digital object), metadata (information about that digital object), and infrastructure. For instance, principle F4 defines that both metadata and data are registered or indexed in a searchable resource (the infrastructure component).