From 120cf1289ace3c2a5a6e0a6fede0d76827215fd9 Mon Sep 17 00:00:00 2001 From: jvita Date: Tue, 21 Sep 2021 12:49:21 -0500 Subject: [PATCH 1/9] Update optimade.rst Initial work preparing Collections endpoint --- optimade.rst | 43 +++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 41 insertions(+), 2 deletions(-) diff --git a/optimade.rst b/optimade.rst index e4a72765d..6cc6a5405 100644 --- a/optimade.rst +++ b/optimade.rst @@ -149,7 +149,7 @@ The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SH For example, a :entry:`structures` entry is comprised by data that pertain to a single structure. **Entry type** - Entries are categorized into types, e.g., :entry:`structures`, :entry:`calculations`, :entry:`references`. + Entries are categorized into types, e.g., :entry:`structures`, :entry:`calculations`, :entry:`references`, :entry:`collections`. Entry types MUST be named according to the rules for identifiers. **Entry property** @@ -195,6 +195,9 @@ The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SH The key used in response formats that return data in associative-array-type data structures. This is particularly relevant for the default JSON-based response format. In this case, **field** refers to the name part of the name-value pairs of JSON objects. + +**Collection** + A Collection defines a relationship between a group of Entry resources. A Collection can be used to store metadata that applies to all of the entries in the group, and to aggregate metadata from each entry in the group. Data types ---------- @@ -323,7 +326,7 @@ Index Meta-Database A database provider MAY publish a special Index Meta-Database base URL. The main purpose of this base URL is to allow for automatic discoverability of all databases of the provider. Thus, it acts as a meta-database for the database provider's implementation(s). 
The index meta-database MUST only provide the :endpoint:`info` and :endpoint:`links` endpoints, see sections `Info Endpoints`_ and `Links Endpoint`_. -It MUST NOT expose any entry listing endpoints (e.g., :endpoint:`structures`). +It MUST NOT expose any entry listing endpoints (e.g., :endpoint:`structures` and :endpoint:`collections`). These endpoints do not need to be queryable, i.e., they MAY be provided as static JSON files. However, they MUST return the correct and updated information on all currently provided implementations. @@ -1075,6 +1078,7 @@ Example: }, "available_endpoints": [ "structures", + "collections", "calculations", "info", "links" @@ -2317,6 +2321,41 @@ structure\_features - **Examples**: - A structure having implicit atoms and using assemblies: :val:`["assemblies", "implicit_atoms"]` + +Collections Entries +------------------- +- **Description**: The `collections` endpoint is used to define groups of Entry resources. It can be used to store metadata that applies to all of the entries in the group, or metadata that is generated by aggregating fields from each of the entries in the group. The group of entries is defined using `relationships`_ as described in the `Relationships`_ section. + +An example use case would be to define a relationship between a collection of Structure entries that are all conceptually similar (e.g., "A collection of FCC Al structures containing a single vacancy defect"). + +:entry:`collections` entries have the properties described in the section `Properties Used by Multiple Entry Types`_ as well as the following properties: `relationships`_, `additional_metadata`_, and `aggregated_fields`_. + +relationships +~~~~~~~~~~~~~ +- **Description**: relationships to other Entry resources, as described in `Relationships`_. +- **Type**: a `JSON API Relationships `__ object. +- **Requirements/Conventions**: + - **Support**: MUST be supported by all implementations, MUST NOT be :val:`null`. 
+ - **Query**: MUST be a queryable property with support for all mandatory filter features. + +additional_metadata +~~~~~~~~~~~~~~~~~~~ +- **Description**: Additional metadata that applies to all of the entries in `relationships`. +- **Type**: a dictionary of string-string pairs +- **Requirements/Conventions**: + - **Support**: OPTIONAL support in implementations, i.e., MAY be :val:`null`. + - **Query**: support for queries on this property is OPTIONAL. If supported, only a subset of the filter features MAY be supported. + - The keys should be short strings describing the type of metadata being supplied. + - The values can be any string, which may be human-readable. + +aggregated_fields +~~~~~~~~~~~~~~~~~~~~~~ +- **Description**: Names of fields that were generated by aggregating over the corresponding fields in each of the entries specified in `relationships`. +- **Type**: a list of strings +- **Requirements/Conventions**: + - **Support**: OPTIONAL support in implementations, i.e., MAY be :val:`null`. + - **Query**: support for queries on this property is OPTIONAL. If supported, only a subset of the filter features MAY be supported. + - Strings provided in this list should correspond to other queryable fields within the `collections` entry. Calculations Entries -------------------- From 4adae80c96726e85ae655e50ec707e7caedbb022 Mon Sep 17 00:00:00 2001 From: Josh Vita Date: Wed, 22 Sep 2021 11:35:40 -0500 Subject: [PATCH 2/9] Test to see if edits work as expected --- optimade.rst | 14 +++----------- 1 file changed, 3 insertions(+), 11 deletions(-) diff --git a/optimade.rst b/optimade.rst index 6cc6a5405..438f81d67 100644 --- a/optimade.rst +++ b/optimade.rst @@ -2324,19 +2324,11 @@ structure\_features Collections Entries ------------------- -- **Description**: The `collections` endpoint is used to define groups of Entry resources. 
It can be used to store metadata that applies to all of the entries in the group, or metadata that is generated by aggregating fields from each of the entries in the group. The group of entries is defined using `relationships`_ as described in the `Relationships`_ section. +- **Description**: The `collections` endpoint is used to define groups of Entry resources. It can be used to store metadata that applies to all of the entries in the group, or metadata that is generated by aggregating fields from each of the entries in the group. The group of entries is defined using :field:`relationships` as described in the `Relationships`_ section. -An example use case would be to define a relationship between a collection of Structure entries that are all conceptually similar (e.g., "A collection of FCC Al structures containing a single vacancy defect"). +An example use case would be to define a relationship between a collection of Structure entries that are all conceptually related (e.g., "A collection of FCC Al structures containing a single vacancy defect"). -:entry:`collections` entries have the properties described in the section `Properties Used by Multiple Entry Types`_ as well as the following properties: `relationships`_, `additional_metadata`_, and `aggregated_fields`_. - -relationships -~~~~~~~~~~~~~ -- **Description**: relationships to other Entry resources, as described in `Relationships`_. -- **Type**: a `JSON API Relationships `__ object. -- **Requirements/Conventions**: - - **Support**: MUST be supported by all implementations, MUST NOT be :val:`null`. - - **Query**: MUST be a queryable property with support for all mandatory filter features. +:entry:`collections` entries have the properties described in the section `Properties Used by Multiple Entry Types`_ as well as the following properties: :field:`relationships`, :field:`additional_metadata`, and :field:`aggregated_fields`. 
additional_metadata ~~~~~~~~~~~~~~~~~~~ From d9312feead904e94072c7ffa498b5da952204958 Mon Sep 17 00:00:00 2001 From: jvita Date: Wed, 22 Sep 2021 11:37:01 -0500 Subject: [PATCH 3/9] Update optimade.rst --- optimade.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/optimade.rst b/optimade.rst index 438f81d67..261a8937e 100644 --- a/optimade.rst +++ b/optimade.rst @@ -2328,7 +2328,7 @@ Collections Entries An example use case would be to define a relationship between a collection of Structure entries that are all conceptually related (e.g., "A collection of FCC Al structures containing a single vacancy defect"). -:entry:`collections` entries have the properties described in the section `Properties Used by Multiple Entry Types`_ as well as the following properties: :field:`relationships`, :field:`additional_metadata`, and :field:`aggregated_fields`. +:entry:`collections` entries have the properties described in the section `Properties Used by Multiple Entry Types`_ as well as the following properties: `additional_metadata`_ and `aggregated_fields`_. additional_metadata ~~~~~~~~~~~~~~~~~~~ From abccba030f3ccba894447d25f8ba7624b2128a60 Mon Sep 17 00:00:00 2001 From: jvita Date: Wed, 22 Sep 2021 13:25:59 -0500 Subject: [PATCH 4/9] Update optimade.rst --- optimade.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/optimade.rst b/optimade.rst index 261a8937e..21d2aa960 100644 --- a/optimade.rst +++ b/optimade.rst @@ -2324,7 +2324,7 @@ structure\_features Collections Entries ------------------- -- **Description**: The `collections` endpoint is used to define groups of Entry resources. It can be used to store metadata that applies to all of the entries in the group, or metadata that is generated by aggregating fields from each of the entries in the group. The group of entries is defined using :field:`relationships` as described in the `Relationships`_ section. +A Collection is used to define groups of Entry resources. 
It can be used to store metadata that applies to all of the entries in the group, or metadata that is generated by aggregating fields from each of the entries in the group. The group of entries is defined using :field:`relationships` as described in the `Relationships`_ section. An example use case would be to define a relationship between a collection of Structure entries that are all conceptually related (e.g., "A collection of FCC Al structures containing a single vacancy defect"). From 8308a505ed1a44354b5a1d9b9e241242abe6f50d Mon Sep 17 00:00:00 2001 From: Josh Vita Date: Fri, 24 Sep 2021 10:09:20 -0500 Subject: [PATCH 5/9] Minor edits and cleanup --- optimade.rst | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/optimade.rst b/optimade.rst index 438f81d67..2fd2ceff5 100644 --- a/optimade.rst +++ b/optimade.rst @@ -195,7 +195,7 @@ The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SH The key used in response formats that return data in associative-array-type data structures. This is particularly relevant for the default JSON-based response format. In this case, **field** refers to the name part of the name-value pairs of JSON objects. - + **Collection** A Collection defines a relationship between a group of Entry resources. A Collection can be used to store metadata that applies to all of the entries in the group, and to aggregate metadata from each entry in the group. @@ -2321,7 +2321,7 @@ structure\_features - **Examples**: - A structure having implicit atoms and using assemblies: :val:`["assemblies", "implicit_atoms"]` - + Collections Entries ------------------- - **Description**: The `collections` endpoint is used to define groups of Entry resources. It can be used to store metadata that applies to all of the entries in the group, or metadata that is generated by aggregating fields from each of the entries in the group. 
The group of entries is defined using :field:`relationships` as described in the `Relationships`_ section. @@ -2333,7 +2333,7 @@ An example use case would be to define a relationship between a collection of St additional_metadata ~~~~~~~~~~~~~~~~~~~ - **Description**: Additional metadata that applies to all of the entries in `relationships`. -- **Type**: a dictionary of string-string pairs +- **Type**: a dictionary - **Requirements/Conventions**: - **Support**: OPTIONAL support in implementations, i.e., MAY be :val:`null`. - **Query**: support for queries on this property is OPTIONAL. If supported, only a subset of the filter features MAY be supported. From c7a00ddc62c26f69082e65dbd085a2bccf24d478 Mon Sep 17 00:00:00 2001 From: Rickard Armiento Date: Tue, 21 Jun 2022 10:00:28 +0200 Subject: [PATCH 6/9] Modified according to PR review comments, workshop discussions --- optimade.rst | 96 +++++++++++++++++++++++++++++++++++++++------------- 1 file changed, 73 insertions(+), 23 deletions(-) diff --git a/optimade.rst b/optimade.rst index 3a0ed0dda..4fb40ca2e 100644 --- a/optimade.rst +++ b/optimade.rst @@ -149,7 +149,7 @@ The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SH For example, a :entry:`structures` entry is comprised by data that belong to a single structure. **Entry type** - Entries are categorized into types, e.g., :entry:`structures`, :entry:`calculations`, :entry:`references`, :entry:`collections`. + Entries are categorized into types, e.g., :entry:`structures`, :entry:`calculations`, :entry:`references`. Entry types MUST be named according to the rules for identifiers. **Entry property** @@ -196,9 +196,6 @@ The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SH This is particularly relevant for the default JSON-based response format. In this case, **field** refers to the name part of the name-value pairs of JSON objects. 
-**Collection** - A Collection defines a relationship between a group of Entry resources. A Collection can be used to store metadata that applies to all of the entries in the group, and to aggregate metadata from each entry in the group. - Data types ---------- @@ -326,7 +323,7 @@ Index Meta-Database A database provider MAY publish a special Index Meta-Database base URL. The main purpose of this base URL is to allow for automatic discoverability of all databases of the provider. Thus, it acts as a meta-database for the database provider's implementation(s). The index meta-database MUST only provide the :endpoint:`info` and :endpoint:`links` endpoints, see sections `Info Endpoints`_ and `Links Endpoint`_. -It MUST NOT expose any entry listing endpoints (e.g., :endpoint:`structures` and :endpoint:`collections`). +It MUST NOT expose any entry listing endpoints (e.g., :endpoint:`structures`). These endpoints do not need to be queryable, i.e., they MAY be provided as static JSON files. However, they MUST return the correct and updated information on all currently provided implementations. @@ -2805,30 +2802,83 @@ structure\_features Collections Entries ------------------- -A Collection is used to define groups of Entry resources. It can be used to store metadata that applies to all of the entries in the group, or metadata that is generated by aggregating fields from each of the entries in the group. The group of entries is defined using :field:`relationships` as described in the `Relationships`_ section. -An example use case would be to define a relationship between a collection of Structure entries that are all conceptually related (e.g., "A collection of FCC Al structures containing a single vacancy defect"). +Collections entries are used to define a set of Entries of any types. +For example, a collection of Structure entries can be used to indicate that they are conceptually related, such as structures representing aluminium unit cells with point defects. 
+The set of entries that belong to the collection is defined using a :val:`contents` relationship (see the `Relationships`_ section). +A collection can contain other collections. +Furthermore, implementations are suggested to add database-specific properties for additional metadata they want to store about the collections. +An OPTIMADE response representing a collection with all referenced entries included via the JSON API field :field:`included` (or equivalent in other response formats) can be used as a universal format for storage or transfer of a subset of (or all) data in an OPTIMADE database. -:entry:`collections` entries have the properties described in the section `Properties Used by Multiple Entry Types`_ as well as the following properties: `additional_metadata`_ and `aggregated_fields`_. +The following example shows how the :val:`contents` relationship from a collection to other entries defines the contents of a collection: -additional_metadata -~~~~~~~~~~~~~~~~~~~ -- **Description**: Additional metadata that applies to all of the entries in `relationships`. -- **Type**: a dictionary +.. code:: jsonc + + { + "data": { + "type": "collections", + "id": "example.com:collections:42", + "attributes": { + "name": "Results set for vacancies in FCC Al" + "category": "results_set" + }, + "relationships": { + "structures": { + "data": [ + { "type": "structures", "id": "example.com:structures:4711" }, + { "type": "structures", "id": "example.com:structures:4712" }, + { "type": "structures", "id": "example.com:structures:4713" } + { "type": "calculations", "id": "example.com:calculations:1899" } + ] + } + } + }, + } + +Collections entries have the properties described above in section `Properties Used by Multiple Entry Types`_, as well as the following additional properties: + +name +~~~~ + +- **Description**: A name for the collection +- **Type**: String - **Requirements/Conventions**: - - **Support**: OPTIONAL support in implementations, i.e., MAY be :val:`null`. 
- - **Query**: support for queries on this property is OPTIONAL. If supported, only a subset of the filter features MAY be supported. - - The keys should be short strings describing the type of metadata being supplied. - - The values can be any string, which may be human-readable. -aggregated_fields -~~~~~~~~~~~~~~~~~~~~~~ -- **Description**: Names of fields that were generated by aggregating over the corresponding fields in each of the entries specified in `relationships`. -- **Type**: a list of strings + - **Support**: OPTIONAL support in implementations, i.e., MAY be :val:`null`. + - **Query**: Support for queries on this property is OPTIONAL. + +- **Examples**: + + - :val:`"Results set for vacancies in FCC Al"` + +description +~~~~~~~~~~~ + +- **Description**: A longer text that describes the collection +- **Type**: String - **Requirements/Conventions**: - - **Support**: OPTIONAL support in implementations, i.e., MAY be :val:`null`. - - **Query**: support for queries on this property is OPTIONAL. If supported, only a subset of the filter features MAY be supported. - - Strings provided in this list should correspond to other queryable fields within the `collections` entry. + + - **Support**: OPTIONAL support in implementations, i.e., MAY be :val:`null`. + - **Query**: Support for queries on this property is OPTIONAL. + +- **Examples**: + + - :val:`"This collection contains structures used in an investigation into point defects in Al"` + +category +~~~~~~~~ + +- **Description**: A free-form text categorizing the collection. + It is suggested that individual collections with similar purposes are assigned the same category to aid browsing and searching. +- **Type**: String +- **Requirements/Conventions**: + + - **Support**: OPTIONAL support in implementations, i.e., MAY be :val:`null`. + - **Query**: Support for queries on this property is OPTIONAL. 
+ +- **Examples**: + + - :val:`"results_set"` Calculations Entries -------------------- From e8177908126fd7d879f88d397d83d86bbed2bea8 Mon Sep 17 00:00:00 2001 From: Rickard Armiento Date: Tue, 28 Jun 2022 07:52:40 +0200 Subject: [PATCH 7/9] Changes due to review: drop "'contents' relationship" --- optimade.rst | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/optimade.rst b/optimade.rst index 4fb40ca2e..d8d025be3 100644 --- a/optimade.rst +++ b/optimade.rst @@ -2805,12 +2805,12 @@ Collections Entries Collections entries are used to define a set of Entries of any types. For example, a collection of Structure entries can be used to indicate that they are conceptually related, such as structures representing aluminium unit cells with point defects. -The set of entries that belong to the collection is defined using a :val:`contents` relationship (see the `Relationships`_ section). +The set of entries that belong to the collection is defined using relationships from the collection to each entry (OPTIMADE relationships are defined in `Relationships`_). A collection can contain other collections. Furthermore, implementations are suggested to add database-specific properties for additional metadata they want to store about the collections. An OPTIMADE response representing a collection with all referenced entries included via the JSON API field :field:`included` (or equivalent in other response formats) can be used as a universal format for storage or transfer of a subset of (or all) data in an OPTIMADE database. -The following example shows how the :val:`contents` relationship from a collection to other entries defines the contents of a collection: +The following example shows how relationships from a collection to other entries defines the contents of the collection: .. 
code:: jsonc @@ -2828,6 +2828,10 @@ The following example shows how the :val:`contents` relationship from a collecti { "type": "structures", "id": "example.com:structures:4711" }, { "type": "structures", "id": "example.com:structures:4712" }, { "type": "structures", "id": "example.com:structures:4713" } + ] + }, + "calculations": { + "data": [ { "type": "calculations", "id": "example.com:calculations:1899" } ] } From 5e3dc8af3a153ad0aa24d97aac81fde13c0db987 Mon Sep 17 00:00:00 2001 From: Rickard Armiento Date: Wed, 6 Jul 2022 20:10:36 +0200 Subject: [PATCH 8/9] Suggestions from @ml-evs in PR review Co-authored-by: Matthew Evans <7916000+ml-evs@users.noreply.github.com> --- optimade.rst | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/optimade.rst b/optimade.rst index de7c29cd1..d11ff8357 100644 --- a/optimade.rst +++ b/optimade.rst @@ -2813,8 +2813,8 @@ Collections Entries ------------------- Collections entries are used to define a set of Entries of any types. -For example, a collection of Structure entries can be used to indicate that they are conceptually related, such as structures representing aluminium unit cells with point defects. -The set of entries that belong to the collection is defined using relationships from the collection to each entry (OPTIMADE relationships are defined in `Relationships`_). +For example, a collection of Structure entries can be used to indicate that they are conceptually related, such as structures representing aluminium unit cells with point defects, or structures that comprise a training set for a particular interatomic potential. +The set of entries that belong to the collection is defined using one-to-many relationships from the collection to each entry (OPTIMADE relationships are defined in `Relationships`_). A collection can contain other collections. Furthermore, implementations are suggested to add database-specific properties for additional metadata they want to store about the collections. 
An OPTIMADE response representing a collection with all referenced entries included via the JSON API field :field:`included` (or equivalent in other response formats) can be used as a universal format for storage or transfer of a subset of (or all) data in an OPTIMADE database. @@ -2826,7 +2826,7 @@ The following example shows how relationships from a collection to other entries { "data": { "type": "collections", - "id": "example.com:collections:42", + "id": "42", "attributes": { "name": "Results set for vacancies in FCC Al" "category": "results_set" @@ -2863,6 +2863,7 @@ name - **Examples**: - :val:`"Results set for vacancies in FCC Al"` + - :val:`"Training set for the 'Si-001' interatomic potential."` description ~~~~~~~~~~~ @@ -2877,6 +2878,7 @@ description - **Examples**: - :val:`"This collection contains structures used in an investigation into point defects in Al"` + - :val:`"Training set of structures, forces and energies used to construct the interatomic potential 'Si-001' for silicon, with associated bibliographic references."` category ~~~~~~~~ @@ -2892,6 +2894,7 @@ category - **Examples**: - :val:`"results_set"` + - :val:`"training_set"` Calculations Entries -------------------- From ad271edc9c722493fa1ff9d25399f1c231f2e3aa Mon Sep 17 00:00:00 2001 From: Matthew Evans <7916000+ml-evs@users.noreply.github.com> Date: Wed, 6 Jul 2022 21:38:03 +0100 Subject: [PATCH 9/9] Remove double-space --- optimade.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/optimade.rst b/optimade.rst index d11ff8357..aa5147148 100644 --- a/optimade.rst +++ b/optimade.rst @@ -2863,7 +2863,7 @@ name - **Examples**: - :val:`"Results set for vacancies in FCC Al"` - - :val:`"Training set for the 'Si-001' interatomic potential."` + - :val:`"Training set for the 'Si-001' interatomic potential."` description ~~~~~~~~~~~
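
As a rough illustration of how a client might consume the relationships introduced in this patch series, the following Python sketch groups the entries referenced by a collection by entry type. The function name `collection_members` and the embedded payload are assumptions made for illustration only; the payload mirrors the example from the patches, with separators added so that it parses as strict JSON.

```python
import json
from collections import defaultdict

# Hypothetical sample payload, modelled on the example in the patches above
# (written as strict JSON so that json.loads accepts it).
SAMPLE_RESPONSE = """
{
  "data": {
    "type": "collections",
    "id": "42",
    "attributes": {
      "name": "Results set for vacancies in FCC Al",
      "category": "results_set"
    },
    "relationships": {
      "structures": {
        "data": [
          { "type": "structures", "id": "example.com:structures:4711" },
          { "type": "structures", "id": "example.com:structures:4712" },
          { "type": "structures", "id": "example.com:structures:4713" }
        ]
      },
      "calculations": {
        "data": [
          { "type": "calculations", "id": "example.com:calculations:1899" }
        ]
      }
    }
  }
}
"""


def collection_members(response):
    """Group the IDs of entries referenced by a collection by entry type.

    `response` is a decoded single-entry response for a collection.
    This helper is illustrative and not part of any OPTIMADE library.
    """
    members = defaultdict(list)
    # A collection without relationships is treated as empty.
    relationships = response["data"].get("relationships") or {}
    for relationship in relationships.values():
        # Each relationship carries a list of {"type", "id"} references.
        for ref in relationship.get("data") or []:
            members[ref["type"]].append(ref["id"])
    return dict(members)


members = collection_members(json.loads(SAMPLE_RESPONSE))
for entry_type, ids in members.items():
    print(entry_type, ids)
```

Because the grouping key is the `type` of each reference, a collection that contains other collections would presumably surface them under a `collections` key in the same result.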