Provide necessary support for SpotBugs SARIF #1952

michaelcfanning · 2020-06-30T22:14:34Z

e.g., can we help provide a schema-driven Java OM?

KengoTODA · 2020-07-01T01:55:36Z

Biggest problem is that our descriptor is written in HTML like this so we cannot provide it in TEXT and MD.

It's also nice if we can generate necessary classes/interfaces from JSON schema.
But it's not mandatory; we can code classes manually.

KengoTODA · 2020-07-02T02:56:18Z

Biggest problem is that our descriptor is written in HTML like this so we cannot provide it in TEXT and MD.

I remember that markdown can contain HTML elements, I'll directly put our descriptors into text.markdown property as an experiment.

KengoTODA · 2020-07-02T22:37:17Z

You can find the .sarif report generated by the latest SpotBugs (not released yet) at here:
spotbugs/spotbugs#1184 (comment)

When you find problems, please come spotbugs/spotbugs#1185 to comment, or mention me at here. Thanks!

KengoTODA · 2020-07-08T10:43:19Z

I remember that markdown can contain HTML elements, I'll directly put our descriptors into text.markdown property as an experiment.

I want to confirm this really work or not. However, even when I put fullDescription.markdown property, the SARIF viewer plugin for VS Code doesn't display it. How I can verify that the current JSON file is valid or not?

Here is viewers' screenshot from my local:

Here is the JSON data in tool.driver.rules:

{"messageStrings":{"default":{"text":"{0} is Serializable; consider declaring a serialVersionUID"}},"id":"SE_NO_SERIALVERSIONID","shortDescription":{"text":"Class is Serializable, but doesn't define serialVersionUID"},"fullDescription":{"markdown":"\n\n  <p> This class implements the <code>Serializable<\/code> interface, but does\n  not define a <code>serialVersionUID<\/code> field.&nbsp;\n  A change as simple as adding a reference to a .class object\n    will add synthetic fields to the class,\n   which will unfortunately change the implicit\n   serialVersionUID (e.g., adding a reference to <code>String.class<\/code>\n   will generate a static field <code>class$java$lang$String<\/code>).\n   Also, different source code to bytecode compilers may use different\n   naming conventions for synthetic variables generated for\n   references to class objects or inner classes.\n   To ensure interoperability of Serializable across versions,\n   consider adding an explicit serialVersionUID.<\/p>\n\n    "},"helpUri":"https://spotbugs.readthedocs.io/en/latest/bugDescriptions.html#SE_NO_SERIALVERSIONID","properties":{"tags":["BAD_PRACTICE"]}}

Or you may check the while main.sarif.txt.
Thanks in advance!

KengoTODA · 2020-07-12T11:30:01Z

What is the difference between tool.driver.notifications and invocations.toolConfigurationNotifications?
I'm using driver.notifications to put errors and exceptions like below, is it OK?

{
  "version":"2.1.0",
  "$schema":"https://schemastore.azurewebsites.net/schemas/json/sarif-2.1.0-rtm.4.json",
  "runs":[
    {
      "tool":{
        "driver":{
          "name":"SpotBugs",
          "version":"Development",
          "language":"en",
          "notifications":[
            {
              "level":"error",
              "descriptor":{
                "id":"spotbugs-missing-classes"
              },
              "message":{
                "text":"Classes needed for analysis were missing: [com.github.spotbugs.MissingClass]"
              }
            }
          ]
        }
      }
    }
  ]
}

michaelcfanning · 2020-07-12T17:51:12Z

tool.driver.notifications are a set of reporting descriptors that provide metadata on the notifications a component might raise.

invocations.toolConfigurationNotifications and invocations.toolExecutionNotifications are instances of those notifications that actually occurred at runtime. We call the analysis equivalents rules and results. It's confusing here because we use the same term for notification metadata as well as the notifications.

toolConfigurationNotifications are errors that resolve to some sort of configuration problem. That is, they reflect a problem that can be resolved by the user properly configuring the tool (in your example, by ensuring that all classes relevant to analysis are present. toolExecutionNotifications are errors in the engine itself (i.e., bugs that need to be addressed by the tool developer). You'd use this property to store an unhandled exception or other engine correctness problem.

KengoTODA · 2020-07-12T23:34:28Z

OK thanks. I'm still not sure when I need to use tool.driver.notifications, I know that toolConfigurationNotifications is better place to put information about missing classes.

microsoft/sarif-sdk#1952 (comment)

michaelcfanning · 2020-07-13T13:32:29Z

When you want to report an actual notification that occurred to a user, such as when classes are missing, use toolConfigurationNotifications to send those messages. tool.driver.notifications is a place to add metadata about each class of notification, for example, this is where you would store a full description of the notification.

@lgolding, you are an effective explainer of the standard, want to jump in here? :)

ghost · 2020-07-13T17:40:59Z

@KengoTODA, @michaelcfanning is correct (except that, looking at your file, I see that you have toolExecutionNotifications, not toolConfigurationNotifications.)

The SARIF Tutorials has a section on Notifications that might help. Here is an example (which I will add to the tutorial):

{
  "version": "2.1.0",
  "runs": [
    {                                          // A run object.
      "tool": {
        ...
      },
      "results": [
        ...
      ],
      "invocations": [
        {                                      // An invocation object
          "toolExecutionNotifications": [
            {                                  // A notification object.
              "level": "error",
              "descriptor": {
                "id": "spotbugs-missing-classes"
              },
              "message": {
                "text": "Classes needed for analysis were missing: [apply, iterator, ...]"
              }
            },
          ]
          "toolConfigurationNotifications": [
            {                                  // A notification object.
              ...
            },
          ]
        }
      ]
    }
  ]
}

Now, the SARIF toolComponent object does have a notification property, but it doesn't hold notifications: it holds descriptions of (metadata about) notifications. So if, for example, you wanted to provide a full description of what the notification with id NOT10001 meant, you could put it in tool.driver.notifications.

The property tool.driver.notifications used to be called ...notificationDescriptors. Towards the end of the standardization process, we went through an exercise of shortening many of the property names. In this case, it seems that the shortened name caused confusion.

ghost · 2020-07-13T18:31:57Z

@KengoTODA I added a Notifications sample to the SARIF Tutorials. I hope you find it helpful.

I'm going through your sample SARIF output in detail and will provide more feedback later today. Thank you for doing this!

KengoTODA · 2020-07-13T19:32:40Z

Thanks. This is the latest example spotbugs.sarif, that was made for this PR. At least it passed the SARIF validator.

I will check your comments carefully, to try to understand which property I should use for this case.

ghost · 2020-07-13T20:10:31Z

Kengo, thank you for the latest spotbugs.sarif. It saved me a lot of work :-)

This file is almost valid according to the schema. The only error is that the location.logicalLocation property is actually named logicalLocations (plural). Note that its value is an array.

The spec (§3.28.4) explains the reason for this:

NOTE: logicalLocations is an array because some logical locations can be expressed in more than one way. For example, the logical location of an element in an HTML document might be expressed by an XML Path expression such as /html/body/img[1] or by a CSS selector such as #logo.

After I fixed that, the latest version of the SARIF validator tool (which we will publish very soon) finds only two problems:

The $schema property should have the value https://schemastore.azurewebsites.net/schemas/json/sarif-2.1.0-rtm.5.json, which is the final official version. The only difference between -rtm.4 and -rtm.5 was to correct a typo in the name of one property that you are not using.
Rule messages should be expressed in complete sentences, so they should end with a period (for example, "{0} may expose internal representation by returning {1}."

I will now look by hand to see if I can make any more recommendations.

ghost · 2020-07-14T21:16:09Z

Hi @KengoTODA, here are some more comments on the SpotBugs SARIF output.

NOTE: Many of the things that I mention here, we will enhance our validator to report automatically.

Do you plan to include codeFlows in your output? spotbugs/discuss#69 mentioned potential improvements to SpotBugs output related to code flows, and SARIF's code flow feature would make it easier to do.
In your previous sample output, you included tool notifications, including exception objects. That was very helpful. Do you plan to put them back?
Your logicalLocations.fullyQualifiedName (for example, edu/umd/cs/findbugs/workflow/RejarClassesForAnalysis.java:[line 220]) is actually a physical location: a file name plus a line number. You already have that information in physicalLocation. The spec defines "logical location" like this:

logical location

location specified by reference to a programmatic construct, without specifying the artifact within which that construct occurs

Example: A class name, a method name, a namespace.

The idea of logicalLocation.fullyQualifiedName is to have (for example) a namespace + a class + a method name, like System.Collections.Generic.Dictionary.Keys. In that case, you might use Keys as the name property, and the whole string as the fullyQualifiedName property.
The spec describes a rule id as a "stable", "opaque" identifier (see §3.49.3):

In the case of a rule, id SHALL contain a stable identifier for the rule and SHOULD be opaque.

By "opaque" the spec means a purely symbolic name, like SPOT1001. We suggest you assign ids such as these to your rules, and place your existing human-readable rule names in the name property:
```
"tool": {
    "driver": {
        ...
        "rules": [
            {
                "id": "SPOT1001",
                "name": "EI_EXPOSE_REP",
                ...
            }
        ]
    }
}
```
The property names in originalUriBaseIds are numbers. The spec (§3.3.4) does say that this is legal:

The uriBaseId property can be any string; it does not need to have any particular syntax or follow any particular naming convention.

However, it is helpful to use a symbolic name that is meaningful to the reader. In examples, we frequently use names like SOURCE_ROOT, BIN_ROOT, or REPO_ROOT.
Since tool.driver.notifications is an empty array, you can omit it.
In a region object, if endLine is the same as startLine, you don't have to mention endLine (it defaults to startLine).
We recommend enclosing "dynamic content" (the placeholders like {0} in your message strings) in single quotes. The quotes make the variable content easier to spot, and we've found single quotes give a nice clean appearance. To be clear, the spec does not require this, but we did add a rule to our SARIF validator tool to recommend it.

ghost · 2020-07-15T16:12:05Z

@KengoTODA, I want to say a little more about SARIF "opaque" rule ids. When the spec says that a rule id must be "opaque", it does not mean that it must be incomprehensible to human readers, like "SPOT1001". It just means that a generic SARIF consumer -- that is, a consumer with no special knowledge of the tool -- is not allowed to assume anything about the internal structure of the identifier. For example, a generic SARIF consumer is not allowed to assume that because the rule id starts with "EI_", it has to do with "exposing internal representation" (as I see from the documentation of EI_EXPOSE_REP, EI_EXPOSE_REP2, etc.)

SARIF opaque rule identifiers should be short (to take up less UI space) and search-engine-friendly (to make it easy for users to learn more about the rule). Your strings like EI_EXPOSE_REP are fine in those regards. I should not have encouraged you to invent additional "purely symbolic" identifiers like "SPOT1001".

If your rules have additional, human-readable names, you could put them in the rule's name property, for example:

"rules": [
  {
    "id": "EI_EXPOSE_REP",
    "name": "ExposesInternalRepresentation",

@michaelcfanning FYI

ghost · 2020-07-26T20:45:45Z

@KengoTODA, we have now published the latest SARIF validator, which you can use to assess your SARIF output. To use the validator:

Install the .NET Core SDK 3.1: https://dotnet.microsoft.com/download/dotnet-core/3.1

This adds the .NET Core command line tool dotnet to your path.
In a new terminal window, run the command:
```
dotnet tool install --global Sarif.Multitool
```
This adds the SARIF Multitool command line tool sarif to your path.
In a new terminal window, run the validate command on your SpotBugs SARIF output:
```
sarif validate SpotBugs.sarif
```
This produces a list of results on the console. You can save those results as a SARIF log file:
```
sarif validate SpotBugs.sarif --output SpotBugs-validation.sarif
```

The SARIF validator checks for:

JSON syntax errors.
Violations of the SARIF schema.
Violations of certain conditions that can't be expressed in the schema, including:
- Actual spec requirements.
- Recommended best practices.

We continue to add new rules to the validator and to refine existing rules. To get the latest version:

dotnet tool update --global Sarif.Multitool

Hope this helps!

KengoTODA · 2020-07-26T23:00:03Z

Thanks, I've tried it. It seems that it needs not SDK 3.1 but SDK 2.1, so I installed it from https://dotnet.microsoft.com/download/dotnet-core/2.1

Here is the log which was reported when I run sarif tool with SDK 3.1:

$ sarif validate Desktop/spotbugs.sarif                                                                                                    
It was not possible to find any compatible framework version
The framework 'Microsoft.NETCore.App', version '2.1.0' was not found.
  - The following frameworks were found:
      3.1.6 at [/usr/local/share/dotnet/shared/Microsoft.NETCore.App]

You can resolve the problem by installing the specified framework and/or SDK.

The specified framework can be found at:
  - https://aka.ms/dotnet-core-applaunch?framework=Microsoft.NETCore.App&framework_version=2.1.0&arch=x64&rid=osx.10.15-x64

microsoft/sarif-sdk#1952 (comment)

* fix: fix JSON1005 reported by SARIF validator error JSON1005: at "runs[0].results[0].locations[0].logicalLocation": The schema does not define a property named "logicalLocation", and the schema does not permit additional properties. * fix: fix SARIF1011 reported by SARIF validator $schema: The property value https://schemastore.azurewebsites.net/schemas/json/sarif-2.1.0-rtm.4.json does not refer to the final version of the SARIF 2.1.0 schema. If you are using an earlier version of the SARIF format, consider upgrading your analysis tool to produce the final version. If this file does in fact conform to the final version of the schema, upgrade the tool to populate the property with a URL that refers to the final version of the schema. * refactor: extract the logic to decide exit code, to reuse from the SARIF reporter * docs: note that the SARIF support is experimental * fix: move notifications from tool.driver to invocations microsoft/sarif-sdk#1952 (comment)

KengoTODA · 2020-08-02T04:29:03Z

Sorry for my lazy reply, I'll check your comments one by one.

Do you plan to include codeFlows in your output? spotbugs/discuss#69 mentioned potential improvements to SpotBugs output related to code flows, and SARIF's code flow feature would make it easier to do.

Issued as spotbugs/spotbugs#1241

In your previous sample output, you included tool notifications, including exception objects. That was very helpful. Do you plan to put them back?

It should exist even in the current version (SpotBugs 4.1.1). I need to improve the sample output to cover features in the format.

KengoTODA · 2020-09-05T22:04:15Z

Trying to generate POJO from JSON schema. It marks fields in MessageStrings object 'additional properties' so the generated JSON file keeps nothing. So we cannot keep the message template with placeholder explained as 3.49.11. Here is generated Java code:
https://gist.github.com/KengoTODA/357e483984c8ed5442da7ed4b7331907

Still not sure where this problem comes from.

eddynaka · 2020-11-26T16:11:31Z

@michaelcfanning , i think we can close this, since we were able to solve the issues in the sarif version from spotbugs. What do you think?

KengoTODA · 2020-12-02T07:44:05Z

@eddynaka in my case, in which project I should make an issue? is it https://github.com/oasis-tcs/sarif-spec/ ?

eddynaka · 2020-12-02T08:08:22Z

@KengoTODA I thought this was the issue that we closed in spotbugs

KengoTODA added a commit to spotbugs/spotbugs that referenced this issue Jul 13, 2020

fix: move notifications from tool.driver to invocations

b75bbc8

microsoft/sarif-sdk#1952 (comment)

KengoTODA added a commit to spotbugs/spotbugs that referenced this issue Jul 13, 2020

fix: move notifications from tool.driver to invocations

4fda772

microsoft/sarif-sdk#1952 (comment)

KengoTODA mentioned this issue Jul 13, 2020

Move notifications from tool.driver to invocations spotbugs/spotbugs#1215

Closed

KengoTODA mentioned this issue Jul 27, 2020

Fix the SARIF report spotbugs/spotbugs#1235

Merged

KengoTODA added a commit to spotbugs/spotbugs that referenced this issue Jul 28, 2020

fix: move notifications from tool.driver to invocations

5aa6f2b

microsoft/sarif-sdk#1952 (comment)

eddynaka closed this as completed Dec 1, 2020

eddynaka reopened this Dec 2, 2020

KengoTODA mentioned this issue Feb 17, 2021

Please consider org.json:json alternative spotbugs/spotbugs#1279

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Provide necessary support for SpotBugs SARIF #1952

Provide necessary support for SpotBugs SARIF #1952

michaelcfanning commented Jun 30, 2020

KengoTODA commented Jul 1, 2020

KengoTODA commented Jul 2, 2020

KengoTODA commented Jul 2, 2020

KengoTODA commented Jul 8, 2020

KengoTODA commented Jul 12, 2020

michaelcfanning commented Jul 12, 2020 •

edited

Loading

KengoTODA commented Jul 12, 2020

michaelcfanning commented Jul 13, 2020

ghost commented Jul 13, 2020

ghost commented Jul 13, 2020

KengoTODA commented Jul 13, 2020

ghost commented Jul 13, 2020

ghost commented Jul 14, 2020

ghost commented Jul 15, 2020 •

edited by ghost

Loading

ghost commented Jul 26, 2020

KengoTODA commented Jul 26, 2020

KengoTODA commented Aug 2, 2020

KengoTODA commented Sep 5, 2020 •

edited

Loading

eddynaka commented Nov 26, 2020

KengoTODA commented Dec 2, 2020

eddynaka commented Dec 2, 2020

Provide necessary support for SpotBugs SARIF #1952

Provide necessary support for SpotBugs SARIF #1952

Comments

michaelcfanning commented Jun 30, 2020

KengoTODA commented Jul 1, 2020

KengoTODA commented Jul 2, 2020

KengoTODA commented Jul 2, 2020

KengoTODA commented Jul 8, 2020

KengoTODA commented Jul 12, 2020

michaelcfanning commented Jul 12, 2020 • edited Loading

KengoTODA commented Jul 12, 2020

michaelcfanning commented Jul 13, 2020

ghost commented Jul 13, 2020

ghost commented Jul 13, 2020

KengoTODA commented Jul 13, 2020

ghost commented Jul 13, 2020

ghost commented Jul 14, 2020

ghost commented Jul 15, 2020 • edited by ghost Loading

ghost commented Jul 26, 2020

KengoTODA commented Jul 26, 2020

KengoTODA commented Aug 2, 2020

KengoTODA commented Sep 5, 2020 • edited Loading

eddynaka commented Nov 26, 2020

KengoTODA commented Dec 2, 2020

eddynaka commented Dec 2, 2020

michaelcfanning commented Jul 12, 2020 •

edited

Loading

ghost commented Jul 15, 2020 •

edited by ghost

Loading

KengoTODA commented Sep 5, 2020 •

edited

Loading