
Error while uploading same SBOM Second time #2131

Closed
g-sahil22 opened this issue Nov 7, 2022 · 24 comments · Fixed by #3357
Labels: defect (Something isn't working), p2 (Non-critical bugs, and features that help organizations to identify and reduce risk), pending release

@g-sahil22


Current Behavior:

When I upload the SBOM to a new project in Dependency-Track for the first time, it uploads successfully without any error logs and reports the correct number of components. When I upload the same SBOM to the same project again, it produces error logs and reports an incorrect component count.

Logs when I upload the SBOM the first time:
[screenshot of log output]
Number of components: 1686

Logs when I upload the SBOM the second time:
[screenshot of log output]
Number of components: 1611

Steps to Reproduce:

Step 1: Create a new project
Step 2: Upload the SBOM to the new project
Step 3: Monitor the logs and the number of components shown in Dependency-Track
Step 4: Upload the same SBOM to the same project again
Step 5: Monitor the logs and the number of components shown in Dependency-Track

Expected Behavior:

Report the same number of components, without any error logs, when the same SBOM is uploaded a second time to the same project.

Environment:

  • Dependency-Track Version: 4.6.2
  • Distribution: Docker
  • BOM Format & Version: JSON, Cyclonedx 1.4
  • Database Server: H2 and PostgreSQL
  • Browser: Chrome
@jakobpinggera commented Nov 7, 2022

We are experiencing the same issue. Interestingly, it does not seem to be deterministic whether the issue arises.

I uploaded the same BOM file twice (to a project that had already processed BOM files in the past). On the first try, the exception occurred. A minute later, the same file was processed successfully.

2022-11-07 13:14:16,580 [] INFO [org.dependencytrack.tasks.BomUploadProcessingTask] Processing CycloneDX BOM uploaded to project: 2eaa3deb-1c38-4d24-a306-4c8d3081e393
2022-11-07 13:14:22,020 [] INFO [org.dependencytrack.tasks.BomUploadProcessingTask] Identified 0 new components
2022-11-07 13:14:22,021 [] INFO [org.dependencytrack.tasks.BomUploadProcessingTask] Processing CycloneDX dependency graph for project: 2eaa3deb-1c38-4d24-a306-4c8d3081e393
2022-11-07 13:14:22,583 [] ERROR [org.dependencytrack.tasks.BomUploadProcessingTask] Error while processing bom
javax.jdo.JDOObjectNotFoundException: Object with id "org.dependencytrack.model.Component:0" not found !
        at org.datanucleus.api.jdo.JDOAdapter.getJDOExceptionForNucleusException(JDOAdapter.java:634)
        at org.datanucleus.api.jdo.JDOPersistenceManager.getObjectById(JDOPersistenceManager.java:1726)
        at org.dependencytrack.persistence.ComponentQueryManager.recursivelyDelete(ComponentQueryManager.java:414)
        at org.dependencytrack.persistence.ComponentQueryManager.reconcileComponents(ComponentQueryManager.java:517)
        at org.dependencytrack.persistence.QueryManager.reconcileComponents(QueryManager.java:771)
        at org.dependencytrack.tasks.BomUploadProcessingTask.inform(BomUploadProcessingTask.java:138)
        at alpine.event.framework.BaseEventService.lambda$publish$0(BaseEventService.java:101)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.base/java.lang.Thread.run(Unknown Source)
Caused by: org.datanucleus.exceptions.NucleusObjectNotFoundException: Object with id "org.dependencytrack.model.Component:0" not found !
        at org.datanucleus.store.rdbms.request.FetchRequest.execute(FetchRequest.java:473)
        at org.datanucleus.store.rdbms.RDBMSPersistenceHandler.fetchObject(RDBMSPersistenceHandler.java:354)
        at org.datanucleus.state.StateManagerImpl.loadFieldsFromDatastore(StateManagerImpl.java:1608)
        at org.datanucleus.state.StateManagerImpl.validate(StateManagerImpl.java:5570)
        at org.datanucleus.ExecutionContextImpl.findObject(ExecutionContextImpl.java:3446)
        at org.datanucleus.ExecutionContextImpl.findObject(ExecutionContextImpl.java:2928)
        at org.datanucleus.api.jdo.JDOPersistenceManager.getObjectById(JDOPersistenceManager.java:1721)
        ... 8 common frames omitted
2022-11-07 13:15:01,134 [] INFO [org.dependencytrack.tasks.BomUploadProcessingTask] Processing CycloneDX BOM uploaded to project: 2eaa3deb-1c38-4d24-a306-4c8d3081e393
2022-11-07 13:15:06,350 [] INFO [org.dependencytrack.tasks.BomUploadProcessingTask] Identified 0 new components
2022-11-07 13:15:06,351 [] INFO [org.dependencytrack.tasks.BomUploadProcessingTask] Processing CycloneDX dependency graph for project: 2eaa3deb-1c38-4d24-a306-4c8d3081e393
2022-11-07 13:15:06,716 [] INFO [org.dependencytrack.tasks.BomUploadProcessingTask] Processed 434 components and 0 services uploaded to project 2eaa3deb-1c38-4d24-a306-4c8d3081e393
2022-11-07 13:15:13,825 [] INFO [org.dependencytrack.tasks.scanners.InternalAnalysisTask] Starting internal analysis task
2022-11-07 13:15:15,072 [] INFO [org.dependencytrack.tasks.scanners.InternalAnalysisTask] Internal analysis complete
2022-11-07 13:15:15,074 [] WARN [org.dependencytrack.tasks.scanners.OssIndexAnalysisTask] An API username or token has not been specified for use with OSS Index. Using anonymous access
2022-11-07 13:15:15,075 [] INFO [org.dependencytrack.tasks.scanners.OssIndexAnalysisTask] Starting Sonatype OSS Index analysis task
2022-11-07 13:15:17,695 [] INFO [org.dependencytrack.tasks.scanners.OssIndexAnalysisTask] Analyzing 89 component(s)
2022-11-07 13:15:19,848 [] INFO [org.dependencytrack.tasks.scanners.OssIndexAnalysisTask] Analyzing 82 component(s)
2022-11-07 13:15:21,835 [] INFO [org.dependencytrack.tasks.scanners.OssIndexAnalysisTask] Analyzing 98 component(s)
2022-11-07 13:15:23,204 [] INFO [org.dependencytrack.tasks.scanners.OssIndexAnalysisTask] Analyzing 100 component(s)
2022-11-07 13:15:23,877 [] INFO [org.dependencytrack.tasks.scanners.OssIndexAnalysisTask] Analyzing 33 component(s)
2022-11-07 13:15:23,877 [] INFO [org.dependencytrack.tasks.scanners.OssIndexAnalysisTask] Sonatype OSS Index analysis complete
2022-11-07 13:15:23,878 [] INFO [org.dependencytrack.policy.PolicyEngine] Evaluating 434 component(s) against applicable policies
2022-11-07 13:15:25,771 [] INFO [org.dependencytrack.policy.PolicyEngine] Policy analysis complete
2022-11-07 13:15:25,772 [] INFO [org.dependencytrack.tasks.metrics.ProjectMetricsUpdateTask] Executing metrics update for project 2eaa3deb-1c38-4d24-a306-4c8d3081e393
2022-11-07 13:15:28,239 [] INFO [org.dependencytrack.tasks.metrics.ProjectMetricsUpdateTask] Completed metrics update for project 2eaa3deb-1c38-4d24-a306-4c8d3081e393 in 00:02:466

Environment:

Dependency-Track Version: 4.6.1
Distribution: Docker
BOM Format & Version: JSON, Cyclonedx 1.4
Database Server: PostgreSQL
Browser: Chrome

@valentijnscholten (Contributor)

I am seeing the same exception in my logs from time to time. I haven't been able to reproduce it.

2022-11-05 00:22:26,676 ERROR [BomUploadProcessingTask] Error while processing bom
javax.jdo.JDOObjectNotFoundException: Object with id "org.dependencytrack.model.Component:0" not found !
        at org.datanucleus.api.jdo.JDOAdapter.getJDOExceptionForNucleusException(JDOAdapter.java:634)
        at org.datanucleus.api.jdo.JDOPersistenceManager.getObjectById(JDOPersistenceManager.java:1726)
        at org.dependencytrack.persistence.ComponentQueryManager.recursivelyDelete(ComponentQueryManager.java:414)
        at org.dependencytrack.persistence.ComponentQueryManager.reconcileComponents(ComponentQueryManager.java:517)
        at org.dependencytrack.persistence.QueryManager.reconcileComponents(QueryManager.java:771)
        at org.dependencytrack.tasks.BomUploadProcessingTask.inform(BomUploadProcessingTask.java:138)
        at alpine.event.framework.BaseEventService.lambda$publish$0(BaseEventService.java:101)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)

@nscuro added the defect and p2 labels and removed the in triage label Nov 9, 2022
@nscuro (Member) commented Nov 9, 2022

FWIW, this happens when the BOM includes duplicate components. The reconcileComponents method relies on the persistence layer to determine whether two components are equal. It's possible for more than one Java component instance to end up mapped to the same database row. When one such component is deleted in Java, the ORM deletes the backing DB row, but the other Java instances mapped to that row are not updated. Trying to delete those as well then throws the Object with id ... not found exception.

Another symptom of this is a decreasing component count when uploading the same BOM multiple times: with every upload, more of those duplicate components are removed.
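
To make the failure mode concrete, here is a toy simulation in plain Java (no DataNucleus involved, purely illustrative): two in-memory instances reference the same backing row, the first delete removes the row, and the second lookup finds nothing.

    import java.util.HashMap;
    import java.util.Map;
    import java.util.NoSuchElementException;

    // Toy simulation (plain Java, no ORM): two in-memory instances map to the
    // same backing "row". The first delete removes the row; the second lookup
    // finds nothing, which is where the ORM raises JDOObjectNotFoundException.
    public class DoubleDeleteSketch {
        public static void main(String[] args) {
            Map<Long, String> rows = new HashMap<>();
            rows.put(0L, "duplicate component"); // one database row
            long instanceA = 0L, instanceB = 0L; // two Java instances, same row

            rows.remove(instanceA); // first delete succeeds
            if (!rows.containsKey(instanceB)) { // second delete has nothing left
                throw new NoSuchElementException(
                        "Object with id \"org.dependencytrack.model.Component:0\" not found");
            }
        }
    }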

@g-sahil22 (Author)

Hi @nscuro, thanks for the information. One question: is this issue related to the SBOM itself?

@ariyonaty

We're noticing behavior similar to what @nscuro mentioned when uploading the same BOM multiple times. The first upload to Dependency-Track matches the number of components listed in the BOM. However, each subsequent upload of the same BOM (to the same project name/version) results in a decreased count.

In particular, this only occurs with merged BOMs (created using cyclonedx-cli), such as when merging a language-specific BOM with a container-image BOM. Manual inspection shows duplicate components in the merged BOM.

Environment:
Dependency-Track Version: 4.5.0

@g-sahil22 (Author)

Hi @nscuro, when I delete all the components and then re-upload the SBOM, it works. And yes, the previous project contained a number of duplicate entries.

@flemminglau

I attempted to delete all components, but I'm left with one or two that cannot be deleted, and not the same ones on each attempt.

Is there a workaround for this?
Something I can do to my CDX file to prevent this from happening?
(Except removing all duplicates :-( )

@flemminglau commented Dec 6, 2022

Edited: my original suggestion to run unique on the purl did nothing; it only seemed to work due to the apparent randomness of the issue.

So instead, deduplicate on name and version, which can be done via:

jq '.components |= unique_by([.name, .version])' bom.json > bom-unique.json

This seems to work around the problem.
(The original file failed on 5 out of 9 attempts to upload the same file; the deduplicated one succeeded 9 out of 9 times.)

@rkg-mm (Contributor) commented Dec 12, 2022

Since we ran into the same issue in #2265, we did some further analysis.

The issue seems to be the following:

First upload:
Every component entry from the BOM is written to the database 1:1, with no deduplication.

Second upload (same or different BOM):

  1. For every component, the method matchSingleIdentity searches for an equivalent component in the database. For duplicates, it always finds the first match.
  2. Multiple entries of the same component in the new BOM file are therefore all matched to the first existing equivalent component, ideally the same one.
  3. All other duplicates already in the DB are deleted, since they are not flagged to be kept.
  4. Since the components in the new BOM file are already known, they are not persisted, but are all mapped to the same DB row now.
  5. During the deletion, the references to the duplicates somehow get mixed up, since they might already be deleted. I'm not 100% sure what's going on there; that seems to be persistence-layer specifics.

Therefore, it makes sense that some people reported the problem here only with merged BOMs (a merge can produce duplicates, since the merge tool does not deduplicate either), and that I hit it with @cyclonedx/cyclonedx-npm, since that tool generates a ton of duplicates (they have different ref entries in the XML, but the data used by matchSingleIdentity is identical for all of them, which makes sense in my opinion).

Basically I see two issues here:

  1. The first upload does not deduplicate.
  2. Further uploads DO deduplicate, but can't handle existing duplicates well.

We would provide a fix (this is affecting many of our projects, as I just noticed, since I switched to the new npm CycloneDX module), but I am unsure how to fix it. In my opinion, every upload should FIRST deduplicate the BOM before hitting the database (see the sketch below). That would eliminate the issue, ensure every upload has the same result instead of different component counts after a second upload, and also fix components showing up X times in the same project.
Is this what others would expect too?
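
A minimal sketch of what such upfront de-duplication could look like, assuming a simplified component model (the record, key fields, and class names here are illustrative, not Dependency-Track's actual code):

    import java.util.ArrayList;
    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Objects;

    // Hypothetical sketch: de-duplicate parsed BOM components on their identity
    // coordinates before any database work, so every upload starts from a
    // duplicate-free component list.
    public class BomDedupeSketch {

        record Component(String group, String name, String version, String purl) {}

        static List<Component> dedupe(List<Component> parsed) {
            Map<List<String>, Component> byIdentity = new LinkedHashMap<>();
            for (Component c : parsed) {
                // The key mirrors the coordinates used for matching on re-upload.
                List<String> key = List.of(
                        Objects.toString(c.group(), ""),
                        Objects.toString(c.name(), ""),
                        Objects.toString(c.version(), ""),
                        Objects.toString(c.purl(), ""));
                byIdentity.putIfAbsent(key, c); // keep the first occurrence only
            }
            return new ArrayList<>(byIdentity.values());
        }

        public static void main(String[] args) {
            List<Component> parsed = List.of(
                    new Component(null, "lodash", "4.17.21", "pkg:npm/lodash@4.17.21"),
                    new Component(null, "lodash", "4.17.21", "pkg:npm/lodash@4.17.21"));
            System.out.println(dedupe(parsed).size()); // prints 1
        }
    }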

Then it's just the question of how to fix existing data...

@nscuro (Member) commented Dec 13, 2022

Doing the de-duplication upfront should definitely solve this.

But de-duping components can be complex, as evidenced by the thought work that @jkowalleck has been doing in CycloneDX/cyclonedx-node-npm#307. For DT specifically, when de-duplicating 2 or more components, the dependency graph needs to reflect this, otherwise it will have broken edges.

The golden question is: "When are two components identical?"
At the very least, for DT purposes, the following properties should match:

  • Type
  • Group (if exists)
  • Name
  • Version
  • CPE (if exists)
  • PURL (if exists)
  • SWID Tag ID (if exists)
  • Hashes (if exists)

But even those don't cover the whole picture. License, dependencies, supplier/author/provider, etc. are all things that could make a component "different".
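
As an illustration, an identity key over exactly these properties could look like the following record, which derives equals() and hashCode() from its fields (a hypothetical sketch, not DT's actual ComponentIdentity class):

    // Hypothetical identity key covering the properties listed above. Two
    // components are "identical" for DT purposes when their keys are equal;
    // nullable fields should be normalized (e.g. to "") beforehand, and the
    // single "hashes" field stands in for the full set of hash values.
    record ComponentIdentityKey(
            String type, String group, String name, String version,
            String cpe, String purl, String swidTagId, String hashes) {}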

@jkowalleck (Contributor)

re: #2131 (comment)

The point of view that DependencyTrack might take is dependency-focused. Therefore, it could be acceptable to de-dupe on a handful of properties and apply the result to the dependency graph, too.

@flemminglau

Also consider, regarding de-duplication:
Different instances of the exact same component used in multiple places may need to receive different audit treatment.
The dependencies may come in via completely different transitive dependencies, which makes it unreasonable to apply a single audit treatment to a single common instance. (One may be a "False Positive" while the other is "Not Affected".)

But maybe that is going too far.

(As far as we can see, there is in general a challenge in applying (multiple) non-deduplicated VEX documents to a deduplicated project. I don't see a logical solution to this.)

@rkg-mm (Contributor) commented Dec 13, 2022

re: #2131 (comment)

The point of view that DependencyTrack might take is dependency-focused. Therefore, it could be acceptable to de-dupe on a handful of properties and apply the result to the dependency graph, too.

I would also take the purpose of Dependency-Track into account here. We use PURLs etc. to find vulnerabilities; if those are equal, there is no point in keeping 2 instances. License information is the 2nd use case, so possibly take the license ID into account, too. I don't see a big chance the license will differ, but who knows.

Regarding @flemminglau's comment:
I see your point, but that's nothing Dependency-Track supports today either. You don't really see the difference between components in the audit views. You can possibly see it in the new dependency graph, but that gets difficult. I'd rather have one instance of the component and see all occurrences in the graph at once than have 5 instances and have to open the graph 5 times to see the occurrences, then try to figure out which graph belongs to which entry in the audit view.

And yes, in any case the graph also needs to reflect the deduplication: all instances in the graph must be redirected to the one instance that is kept.

Edit: when adding components manually, we should also ensure that no duplicates are added, to prevent further errors.

@rkg-mm (Contributor) commented Dec 13, 2022

Current logic for matching on re-upload seems to be:

"project == :project && ((purl != null && purl == :purl) || (purlCoordinates != null && purlCoordinates == :purlCoordinates) || (swidTagId != null && swidTagId == :swidTagId) || (cpe != null && cpe == :cpe) || (group == :group && name == :name && version == :version))"

@valentijnscholten (Contributor)

Seems like this is quite a common error. So for 4.7 it might be good to at least add a try-catch around the delete statement. If something we want to delete has already been deleted, it's probably OK, at least for now in this specific case, to ignore that exception.
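
A minimal sketch of that suggested try-catch, using JDO's standard API (this is a sketch of the idea, not the actual patch that landed):

    import javax.jdo.JDOObjectNotFoundException;
    import javax.jdo.PersistenceManager;

    // If the row was already removed via a duplicate in-memory instance,
    // treat the failed lookup as a no-op.
    static void deleteIgnoringMissing(PersistenceManager pm, Object componentOid) {
        try {
            Object persisted = pm.getObjectById(componentOid); // validates existence
            pm.deletePersistent(persisted);
        } catch (JDOObjectNotFoundException e) {
            // Already deleted through another instance mapped to the same row;
            // safe to ignore in this specific reconciliation case.
        }
    }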

@rkg-mm (Contributor) commented Dec 13, 2022

A workaround PR has been created that would fix the failure of BOM processing.
Nonetheless, I think we should still consider the deduplication. It could be made optional via a parameter (default active, maybe?) for anyone who explicitly does not want it during upload.

@syalioune (Contributor)

My two cents: since the deduplication logic performed on the second upload does not seem to be questioned by anyone, I would apply the same filter on the first upload as a first approach. From there, it can surely be improved with additional fields.

@jonny64 commented Feb 27, 2023

#2131 (comment)

Same situation: SBOM uploaded, component count decreased, and we cannot delete those two components.

Caused by: org.datanucleus.exceptions.NucleusObjectNotFoundException: Object with id "org.dependencytrack.model.Component:81" not found !
        at org.datanucleus.store.rdbms.request.FetchRequest.execute(FetchRequest.java:473)
        at org.datanucleus.store.rdbms.RDBMSPersistenceHandler.fetchObject(RDBMSPersistenceHandler.java:354)
        at org.datanucleus.state.StateManagerImpl.loadFieldsFromDatastore(StateManagerImpl.java:1608)
        at org.datanucleus.state.StateManagerImpl.loadUnloadedFieldsInFetchPlan(StateManagerImpl.java:3931)
        at org.datanucleus.state.StateManagerImpl.isLoaded(StateManagerImpl.java:4135)
        ... 74 common frames omitted

What should we do to clean those components out of the database?

@rkg-mm (Contributor) commented Feb 27, 2023

@jonny64 there should be a workaround in place since 4.7. Which version are you on?

@jonny64 commented Mar 3, 2023

@rkg-mm 4.6.2. Update to 4.7 and re-upload the BOM to fix it?

@rkg-mm (Contributor) commented Mar 3, 2023

@jonny64 yes 4.7 should fix this.

@nscuro (Member) commented Jul 5, 2023

I am working on this area of the code right now, and I am confident that I have a fix at hand for this problem.

What the persistence layer does today is basically de-duplication based on ComponentIdentity. I have shifted this logic so that it now happens before we even touch the database. It also handles re-wiring of the dependency graph, so that de-duplicated components do not break the graph.
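
Conceptually, the graph re-wiring could look like this sketch, where each dropped duplicate's BOM ref is redirected to the ref of the kept component (names and types are illustrative, not the actual implementation):

    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.Map;
    import java.util.Set;

    // Hypothetical sketch: redirect every edge that points at (or from) a
    // dropped duplicate to the component that was kept, so no edge dangles.
    static Map<String, Set<String>> rewireGraph(Map<String, Set<String>> graph,
                                                Map<String, String> duplicateToKeptRef) {
        Map<String, Set<String>> rewired = new HashMap<>();
        for (Map.Entry<String, Set<String>> entry : graph.entrySet()) {
            String node = duplicateToKeptRef.getOrDefault(entry.getKey(), entry.getKey());
            Set<String> deps = rewired.computeIfAbsent(node, k -> new HashSet<>());
            for (String dep : entry.getValue()) {
                deps.add(duplicateToKeptRef.getOrDefault(dep, dep)); // redirect edges
            }
        }
        return rewired;
    }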

I'd love to test this with BOMs that are known to be problematic. Anyone in this thread willing to share some with me? If you're uncomfortable sharing publicly, you can also DM them to me in the OWASP Slack (my name there is nscur0), or email them to me (address is in my GitHub profile).

@melba-lopez (Contributor)

@nscuro has this been addressed per DependencyTrack/hyades-apiserver#218? And would this be a potential fix for 4.10?

@github-actions (bot)

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 26, 2024