How to diff Pylint output

Pylint and other code linters are great tools that warn you about smells and errors in your code. In most real-world applications, Pylint will find hundreds of issues, which can feel overwhelming. This is because it only looks at the current state of the code and outputs all the issues in the whole project. Usually, what you are really looking for is the difference in issues that your changes make. This means you need a diff of the issue output between the current branch and the main or integration branch.

The diff should contain both new issues and fixed issues. New ones require your attention, while fixed ones can give a slight boost in morale, because you feel that you did something right in your changes.
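
To make the idea of "new" and "fixed" issues concrete, here is a minimal sketch in Python. It assumes both inputs are lists of issue dictionaries as produced by `pylint --output-format=json`. The function names and the choice of key (path, object, symbol, message, without the line number) are assumptions made for illustration, not how any particular tool works.

```python
from collections import Counter


def issue_key(issue):
    # The line number is left out on purpose, so an issue that merely
    # moved because code was inserted above it is not counted as new.
    return (issue["path"], issue["obj"], issue["symbol"], issue["message"])


def diff_issues(base_issues, current_issues):
    """Return (new, fixed) issue keys between two Pylint runs."""
    base_counts = Counter(issue_key(issue) for issue in base_issues)
    current_counts = Counter(issue_key(issue) for issue in current_issues)
    # Counter subtraction keeps only positive counts, so identical
    # issues that occur several times are handled as a multiset.
    new = sorted((current_counts - base_counts).elements())
    fixed = sorted((base_counts - current_counts).elements())
    return new, fixed
```

Ignoring line numbers trades precision for less noise: two identical messages in the same function are indistinguishable, which is why the sketch counts occurrences instead of using plain sets.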

There are some possible solutions for the problem:

Run Pylint on changed files only

You can run Pylint on changed files only, e.g. using a pre-commit hook. This has three drawbacks:

  • It will report all issues in the files that are staged for commit, not just new ones.

  • It will not report issues in other files caused by your changes (e.g. a renamed function that is still called by its old name in an unchanged file).

  • It will not report fixed issues.
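
For reference, such a hook could look roughly like the following sketch. It assumes that `git` and `pylint` are available on the PATH and that the script runs from the repository root. The function names are hypothetical.

```python
import subprocess


def python_files(name_only_output):
    """Pick the Python files out of `git diff --name-only` output."""
    paths = (line.strip() for line in name_only_output.splitlines())
    return [path for path in paths if path.endswith(".py")]


def lint_staged_files():
    """Run Pylint on staged Python files and return its exit code."""
    result = subprocess.run(
        # ACM = added, copied or modified; deleted files are skipped.
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True,
        text=True,
        check=True,
    )
    files = python_files(result.stdout)
    if not files:
        return 0
    return subprocess.run(["pylint", *files]).returncode
```

Calling `lint_staged_files()` from a pre-commit script and exiting with its return value would block the commit whenever Pylint reports anything in the staged files, old issues included.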

Implement the diffing in your CI

This sounds pretty easy to implement. Basically, you just have to:

  • On every build, run the linter and store its output for later diffs.

  • On the build for a branch, get the stored output for the integration branch and diff it against the output for the current branch.

I implemented this a couple of times and found some major drawbacks. For one, you need some external storage for the linter output, because CI systems usually don’t let you store files on a build and access them from later builds. Additionally, this strategy only gives you a dumb text diff of the output, which may contain a lot of noise (e.g. when the order of issues in the output changed).
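
The noise problem is easy to reproduce. The sketch below uses Python's standard `difflib` module to build a plain text diff of two linter outputs. The function name and the sample output lines are made up for illustration.

```python
import difflib


def text_diff(base_output, branch_output):
    """Return only the added and removed lines of a unified text diff."""
    diff = difflib.unified_diff(
        base_output.splitlines(),
        branch_output.splitlines(),
        lineterm="",
        n=0,  # no context lines
    )
    return [
        line
        for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


# Hypothetical linter output: the same two issues, in a different order.
base = "a.py:1:0: W0611: Unused import os\nb.py:2:0: C0103: Invalid name"
branch = "b.py:2:0: C0103: Invalid name\na.py:1:0: W0611: Unused import os"
changes = text_diff(base, branch)
```

Here `changes` contains one removed and one added line, although both runs found exactly the same issues.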

Only show issues on lines that are different

This is what diff-cov-lint does. It basically filters the Pylint output (and coverage reports) to only show changed lines. I see the following problems with this approach:

  • It will report all issues on changed lines, even old issues.

  • It will not report fixed issues.

  • It will not report issues in other files caused by your changes (e.g. a renamed function that is still called by its old name in an unchanged file).
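
The filtering idea itself can be sketched as follows. This is not diff-cov-lint's actual implementation, just an illustration with hypothetical function names. It assumes the diff was created with `git diff -U0` (no context lines, so every hunk covers only changed lines) and that the issues come from `pylint --output-format=json`, run from the repository root so that paths match.

```python
import re

# Matches hunk headers such as "@@ -3 +3,2 @@" or "@@ -10,0 +12 @@".
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@")


def changed_lines(unified_diff):
    """Map each file path to the line numbers changed in its new version."""
    changed = {}
    path = None
    for line in unified_diff.splitlines():
        # Simplification: an added line whose content starts with "++ "
        # would be mistaken for a file header here.
        if line.startswith("+++ "):
            target = line[4:].strip()
            if target == "/dev/null":  # the file was deleted
                path = None
            else:
                path = target[2:] if target.startswith("b/") else target
            continue
        match = HUNK_HEADER.match(line)
        if match and path is not None:
            start = int(match.group(1))
            count = 1 if match.group(2) is None else int(match.group(2))
            changed.setdefault(path, set()).update(range(start, start + count))
    return changed


def issues_on_changed_lines(issues, changed):
    """Keep only the issues reported on a changed line."""
    return [
        issue
        for issue in issues
        if issue["line"] in changed.get(issue["path"], set())
    ]
```

The filter only looks at where an issue is reported, not when it was introduced, so an old issue on a line you happened to touch still passes through.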

Use PyCodeQual

This approach is pretty simple. You just need to set up PyCodeQual to follow your GitHub repository. Every time you create a pull request or add commits to it, PyCodeQual will run the analysis on the latest commit, calculate a diff of Pylint issues (and other linters & metrics), and show it to you in the pull request’s Checks tab.

[Image: github-pr-check-detail.png]
