-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Document 'XCollection' serialization specifics #509
Conversation
Notes that in the proposed writeup Since "XCollection" is a subclass of
There are also a number of other optional properties like I can see quite a few benefits to having these present in the serialized data, so I personally don't see any issue - but I thought I would point this out to make sure we're on the same page on how the "XCollection" is serialized. |
Signed-off-by: Gary O'Neall <[email protected]>
Signed-off-by: Gary O'Neall <[email protected]>
…n team Signed-off-by: Gary O'Neall <[email protected]>
This PR describes X-Collection / SerializableCollection / LogicalDocument. PR #500 defines SerializedElements / PhysicalDocument. It is now called PhysicalDocument for clarity, and it is a subclass of Artifact as agreed in the 10/16 serialization meeting. PR #500 also includes the Payload non-element datatype class that defines serialized data. If there are "quite a few benefits to having these ( |
serialization/README.md
Outdated
|
||
Within the model, we have an "XCollection" which represents the common properties of the collection of elements across all data formats. | ||
|
||
We can refer to the actual serialized bytes as and Artifact in the model. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest slight tweak to "We can represent/characterize the actual serialized bytes using an Artifact element in the model."
serialization/README.md
Outdated
|
||
In that case, the "XCollection" will be in the serialized data with ONLY the properties not supported by the serialization format. | ||
|
||
There is at most one "XCollection" in the serialization. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest slight tweak:
"There is at most one "XCollection" element defined in any given instance of serialization."
There can be any number of references via id but only a single XCollection defined within a given instance of serialization.
serialization/README.md
Outdated
|
||
The serialized "XCollection" shall not contain properties where that property value can be derived from the the serialized data itself. | ||
|
||
The "XCollection" section in the specification markdown file describes how each of the properties are serialized. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest slight tweak:
"The specification markdown files describing rules for each serialization format will contain an "XCollection" section describing how each of the properties of "XCollection" are serialized for the given serialization format."
The RDF graph of an instance of the SPDX model shall contain all Element nodes (i.e. objects that are subclasses of Element) as a list on top-level under the "@graph" key. | ||
This means that all references to Element nodes have to use the URI of the referenced Element. | ||
Inlining/Embedding of Element nodes into other nodes is not allowed. | ||
On the other hand, non-Element nodes (like those of type "ExternalReference" or similar complex data classes) have to be inlined (TODO: we may want to make an exception for CreationInfo, depending on the outcome of the surrounding discussion). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is that the consensus result regarding serialized compaction for CreationInfo is that WHEN SERIALIZED each instance of CreationInfo containing different data would be serialized (with its @type and @id) as an object within the '@graph' JSON-LD structure and referenced by any Elements with that same CreationInfo. When deserialized each serialized CreationInfo MUST be expanded back out to all of the elements it applies to such that each element MUST include its own CreationInfo (with separate @id from any other elements CreationInfo).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sbarnum - Let's separate this out into a separate issue we can tackle after merging this. I did not intend to change the treatment of CreationInfo, just copying what was there before in the rdf.md file. I also have some concerns in how it is currently worded.
The RDF graph of an instance of the SPDX model shall contain all Element nodes (i.e. objects that are subclasses of Element) as a list on top-level under the "@graph" key. | ||
This means that all references to Element nodes have to use the URI of the referenced Element. | ||
Inlining/Embedding of Element nodes into other nodes is not allowed. | ||
On the other hand, non-Element nodes (like those of type "ExternalReference" or similar complex data classes) have to be inlined (TODO: we may want to make an exception for CreationInfo, depending on the outcome of the surrounding discussion). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It also needs calling out explicitly here that each embedded non-Element node MUST have an explicit id. This is true for ALL serialization formats.
The serialized embedding within the json-ld object structure is a method of compaction to make the content easier to read (and process by some parsers) but it does not inherently represent the native deserialized logical content which is a graph where EVERY class instance (Element nodes and non-Element nodes) are separate nodes in the graph connected using the globally unique id of each node which MUST be present on all nodes.
There is a possible serialization compaction approach where we COULD (though I would not recommend it) assert that in situations where a producer is minting new elements to be directly serialized without creation anywhere internally first which would require ids already be assigned they could use blank node ids (like has been discussed for CreationInfo) within a specific serialization instance. These blank nodes would then be REQUIRED to be skolemized to create full globally unique ids (like done for CreationInfo) upon any deserialization. This could potentially save some space within the serialized content but it would likely be minimal and the extra complexity in handling and potential for misinterpretation makes this approach a questionable option to pursue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above - i just copy/pasted what was there before.
serialization/json-ld.md
Outdated
|
||
## XCollection | ||
|
||
The following XCollection properties are mapped to JSON-LD syntax specifications. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest small tweak:
"The following XCollection properties are mapped to native JSON-LD mechanisms defined within the JSON-LD syntax specifications."
Also renames XCollection to SpdxDocument Signed-off-by: Gary O'Neall <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR looks good to me.
It accurately covers the consensus results of serialization WG discussions on these topics and aligns with the rest of the model.
There is one issue identified in comments regarding like 23 of the json-ld.md file that needs to be raised and addressed to clarify serialization rules for non-Element classes.
Discussed on 7 November 2023 call - merging |
No description provided.