-
Notifications
You must be signed in to change notification settings - Fork 160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add four additional performance rules #691
Conversation
…ed up. No test coverage yet
… shared helper to eliminate non-Windows sarif spew
…he command line is empty
foreach (DisposableEnumerableView<Symbol> omView in pdb.CreateObjectModuleIterator()) | ||
{ | ||
Symbol om = omView.Value; | ||
if (om.GetObjectModuleDetails().COMDATFoldingEnabled) | ||
{ | ||
// There should be exactly one symbol in the binary containing knowledge about the linker command-line. If that one symbol | ||
// says that OPT:ICF was enablelsd then the whole PE had it enabled too. | ||
return true; | ||
} | ||
} |
Check notice
Code scanning / CodeQL
Missed opportunity to use Select
foreach (DisposableEnumerableView<Symbol> omView in pdb.CreateObjectModuleIterator()) | ||
{ | ||
Symbol om = omView.Value; | ||
if (om.GetObjectModuleDetails().OptimizeReferencesEnabled) | ||
{ | ||
// There should be exactly one symbol in the binary containing knowledge about the linker command-line. If that one symbol | ||
// says that OPT:REF was enabled then the whole PE had it enabled too. | ||
return true; | ||
} | ||
} |
Check notice
Code scanning / CodeQL
Missed opportunity to use Select
foreach (DisposableEnumerableView<Symbol> omView in pdb.CreateObjectModuleIterator()) | ||
{ | ||
Symbol om = omView.Value; | ||
ObjectModuleDetails omDetails = om.GetObjectModuleDetails(); | ||
|
||
if (omDetails.Language != Language.C && | ||
omDetails.Language != Language.Cxx && | ||
omDetails.Language != Language.MASM) | ||
{ | ||
continue; | ||
} | ||
|
||
if (!omDetails.HasDebugInfo) | ||
{ | ||
continue; | ||
} | ||
|
||
bool isMSVC = (omDetails.WellKnownCompiler == WellKnownCompilers.MicrosoftC || | ||
omDetails.WellKnownCompiler == WellKnownCompilers.MicrosoftCxx); | ||
if (isMSVC) | ||
{ | ||
if (!omDetails.EliminateDuplicateStringsEnabled) | ||
{ | ||
CompilandRecord record = om.CreateCompilandRecord(); | ||
if (!string.IsNullOrEmpty(record.Library)) | ||
{ | ||
compilandsLibraryWithoutStringPooling.Add(omDetails); | ||
} | ||
else | ||
{ | ||
compilandsBinaryWithoutStringPooling.Add(omDetails); | ||
} | ||
} | ||
else | ||
{ | ||
int i = 0; | ||
i++; | ||
} | ||
} | ||
} |
Check notice
Code scanning / CodeQL
Missed opportunity to use Select
@@ -919,6 +919,14 @@ public bool IsMostlyOptimized(Pdb pdb) | |||
Symbol om = omView.Value; | |||
ObjectModuleDetails moduleDetails = om.GetObjectModuleDetails(); | |||
|
|||
// Many non-code-file modules will pass through here. Referenced libraries will show up as LINK, for example. Exclude those | |||
// from consideration to reduce skew. | |||
if ((moduleDetails.Language != Language.C) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change greatly improved the accuracy of the "is optimized?" check. At least for a small sample project the majority of modules were libraries and those do not have relevant compiler options. #Resolved
if (ArgumentStartsWith(argument, "debug")) | ||
// There are multiple /debug options so use StartsWith. There is also /debugtype: so we must be careful not | ||
// to over-match against that. | ||
if (ArgumentStartsWith(argument, "debug:") || ArgumentEquals(argument, "debug")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fixes a pre-existing but where /debugtype over-matched.
{ | ||
optLbr = true; | ||
// "The /OPT arguments may be specified together, separated by commas. For example, instead of | ||
// /OPT:REF /OPT:NOICF, you can specify /OPT:REF,NOICF." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was a bug that this parsing did not already support specifying multiple /OPT options with commas.
…h recommended by an expert
This pull request introduces 1 alert when merging 027e37b into 257e9cc - view on LGTM.com new alerts:
|
…ased on CodeQL feedback
} | ||
|
||
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason to nest these if statements? The outer if
doesn't check anything additional to the inner ones. #WontFix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I referenced a pre-existing rule that calls out objs/libs by name and it had this same logic (BA2004). The return statement does prevent a Pass from being logged so this does seem to be meaningful. The if blocks are independent so if/elseif/else is not appropriate.
@marmegh @michaelcfanning it is now hackathon week so I have some time to revisit these changes and make any necessary edits to get them ready for merging. Please let me know if there is anything that you'd like to see. Thanks. |
src/BinaryParsers/PEBinary/ProgramDatabase/CompilerCommandLine.cs
Outdated
Show resolved
Hide resolved
src/BinaryParsers/PEBinary/ProgramDatabase/CompilerCommandLine.cs
Outdated
Show resolved
Hide resolved
I've added a handful of comments, mostly around cleaning up comments and user facing strings. Next step for me is to enlist in your branch and step through the code. Then, I'll ask Michael to give a final review before we merge. This looks about 99% done, minor cleanups and we should be good to merge. @dmachaj In reply to: 1251303961 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
…m/microsoft/binskim into user/dmachaj/mo-rules-mo-problems
/// <remarks> | ||
/// This isn't as simple as looking for the well known compiler | ||
/// values because both rust and clang binaries can link with | ||
/// C runtime libraries compiled with MSVC. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For what it is worth this whole block was cut/paste from BA2027.EnableSourceLink.cs. I needed the same answer and a valid implementation existed and could be made more widely visible.
src/BinaryParsers/PEBinary/ProgramDatabase/CompilerCommandLine.cs
Outdated
Show resolved
Hide resolved
src/BinaryParsers/PEBinary/ProgramDatabase/LinkerCommandLine.cs
Outdated
Show resolved
Hide resolved
|
||
private const string AnalyzerName = RuleIds.EliminateDuplicateStrings + "." + nameof(EliminateDuplicateStrings); | ||
|
||
public override AnalysisApplicability CanAnalyzePE(PEBinary target, Sarif.PropertiesDictionary policy, out string reasonForNotAnalyzing) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are not all identical (although there is a lot of overlap). The notable difference is that some rules only apply to Release builds whereas others apply to both Release and Debug.
For what it is worth I would much prefer a flat helper instead of any sort of class hierarchy for this sort of thing. Some sort of CanAnalyzePEForPerformance(bool allowDebug)
helper would do the trick. I just need a reasonable place to put it.
{ | ||
if (compilandsLibraryWithoutStringPooling.Count > 0) | ||
{ | ||
GenerateCompilandsAndLog(context, compilandsLibraryWithoutStringPooling); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's nothing wrong with this pattern but I note that currently we can only configure a single severity for this check. A common pattern is for results that are directly relevant to a binary are 'errors' (because these are immediately fixable/under control of the user) while library dependencies remain as 'warnings' (because it is harder to solicit fixes for dependencies).
Note action here, just a thought to keep in mind... #Pending
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I agree that the compiland list is a tricky one. I tried doing the same thing for another rule (LTCG and the /GL option?) but the spew for things outside your control was too large so the signal/noise ratio was not good enough.
src/BinSkim.Rules/PERules/BA6006.EnableLinkTimeCodeGeneration.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/ReleaseHistory.md
Outdated
@@ -5,6 +5,8 @@ | |||
* BUGFIX: Fix `error ERR997.ExceptionLoadingPdb : '[filename]' was not evaluated because its PDB could not be loaded (E_PDB_NOT_FOUND).` when reading PE file built with `PDBPageSize:8192` or greater, by upgrading msdia140.dll from `14.27.28826.96` to `14.32.31326.0`. [685](https://github.com/microsoft/binskim/pull/685) | |||
* FEATURE: Add BA3031.EnableClangSafeStack, rename BA3030.UseCheckedFunctionsWithGcc to BA3030.UseGccCheckedFunctions [#663](https://github.com/microsoft/binskim/pull/663) | |||
* Bump Sarif.Sdk by updating submodule from [fc9a9df to 698adb6](https://github.com/microsoft/sarif-sdk/compare/fc9a9dfb865096b5aaa9fa3651854670940f7459...698adb6365a242c6bb75adde56e3bd4be39c21d7). [#674](https://github.com/microsoft/binskim/pull/674) | |||
* Introduce first performance rule BA6001.DisableIncrementalLinkingInReleaseBuilds [#667](https://github.com/microsoft/binskim/pull/667) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review. I'm just getting back from a short vacation so I'll need some time to look through the feedback in detail. In the meantime I've merged the head of main into my branch so it is at least up to date in that regard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, besides a minor point on switching to a different logging overload for failure cases (that specifies the severity explicitly), my feedback is mostly around renaming. If you're under the gun and want me to simply go in and try to address my concerns, let me know, it wouldn't take long.
Of course, if you want time/inclination to discuss naming conventions, let's go. :) It's surprising how much thinking has to go into this topic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no particular hurry on my side. I would like to get to it today it will just be a few hours because I have other catching up to do.
Naming things is always a challenge :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I think I addressed all of your feedback. The one exception is whether there should be more code sharing for the "CanAnalyzePE" logic in the performance rules. We can discuss that one further and decide if further changes are appropriate. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, two small nits, update the rules metadata and expand the rule ids in the release notes to meet our standard, encapsulated in back-ticks, both the opaque and readable rule names are explicit. Once this is done, we'll merge.
We can go tinker in future with various things as required, based on testing, output review, etc.
Closes #677
Closes #679
Closes #680
Closes #681
Why is this change being made?
This change is continuing the pattern established with rule BA6001. Four additional performance rules have been added, all of which are at the Warning level.
Briefly summarize what changed
The following new rules were added with this change. BA6003 is a bit more controversial so I am saving it for a future PR. The pattern established by BA6001 was followed and the changes were mostly contained within the new rule implementations. The other changes were largely in the Compiler/Linker command line parser objects.
I also refactored the "Is compiled with MSVC?" detection out of the SourceLink rule and into a shared location so that all of the perf rules can reference it. The compiler/linker options are MSVC specific so it doesn't make sense to flag them for other compilers.
BA6002.EliminateDuplicateStrings
Applies to both Release and Debug. Looks for the
/GF
compiler option.BA6004.EnableCOMDATFolding
Applies to Release builds. Warns if it is not enabled. Also has a bonus check for Debug builds that enable COMDAT folding (which makes debugging harder).
BA6005.EnableOptimizeReferences
Applies to both Release and Debug. Looks for
/opt:ref
linker option.BA6006.EnableLinkTimeCodeGeneration
Applies to Release builds.
How was this change tested?
This change was tested with a combination of unit and functional tests. The linker and compiler command line parsing are finnicky, and I kept having subtle errors crop up during local testing, so I added unit test coverage to these classes. They turn a string into some bools so they are amenable to good unit testing.
The functional tests were expanded with a Pass/Fail/NotApplicable for each new rule. Adding those tests found a few more subtle problems that were fixed. Yay for testing.