tl;dr: Implementing SwiftLint using SwiftSyntax instead of SourceKitten would make it run over 20x slower.
Update: Since writing this post, I learnt that SwiftSyntax’s upcoming byte tree deserialization mode will speed this up considerably. I hope to post a follow-up article on this shortly.
I have for some time been looking forward to reimplementing some of SwiftLint’s simpler syntax-only rules with SwiftSyntax. If you’re not familiar with it, the recent NSHipster article gives a great overview. My motivation for integrating it into SwiftLint was that it would be nice to use an officially maintained library directly to obtain the syntax tree rather than the open source but community-maintained SourceKitten library. I was also under the false impression that SwiftSyntax would be significantly faster than SourceKit/SourceKitten.
SourceKitten gets its syntax tree by dynamically loading SourceKit and making cross-process XPC calls to a SourceKit daemon. In a typical uncached lint run, SwiftLint spends a significant amount of time waiting on this syntax tree for each file being linted. Because SwiftSyntax is code-generated from the same syntax definition files as the Swift compiler, I had (incorrectly) assumed that calculating a Swift file’s syntax tree using SwiftSyntax was done entirely in-process by the library, which would have led to significant performance gains by avoiding the cross-process XPC call made by SourceKitten for equivalent functionality.
In reality, SwiftSyntax delegates all parsing & lexing to the swiftc binary, launching the process, reading its output from stdout and deserializing the JSON response into its SourceFileSyntax Swift type. This is repeated for each file being parsed 😱.
Launching a new instance of the Swift compiler for each file parsed is orders of magnitude slower than SourceKitten’s XPC call to a long-lived SourceKit daemon.
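To get a rough feel for the fixed cost of spawning a process per file, here is a minimal sketch using only Foundation. It is not how SwiftSyntax or SourceKitten are implemented: it repeatedly launches a trivial executable and reports the average wall-clock time per launch. The helper name `averageLaunchTime` is made up for illustration, and the path `/usr/bin/true` is an assumption about the machine it runs on. Because that binary does no work, the number is only a lower bound on what launching the compiler per file would cost.

```swift
import Foundation

/// Hypothetical helper: launches `executable` once per iteration, waits for it
/// to exit, and returns the average wall-clock seconds per launch.
func averageLaunchTime(executable: URL,
                       arguments: [String] = [],
                       iterations: Int) throws -> TimeInterval {
    precondition(iterations > 0, "iterations must be positive")
    let start = Date()
    for _ in 0..<iterations {
        let process = Process()
        process.executableURL = executable
        process.arguments = arguments
        // Discard output so that pipe handling does not skew the timing.
        process.standardOutput = FileHandle.nullDevice
        process.standardError = FileHandle.nullDevice
        try process.run()
        process.waitUntilExit()
    }
    return Date().timeIntervalSince(start) / Double(iterations)
}

// Assumption: a no-op binary lives at this path (adjust for your system).
let trueBinary = URL(fileURLWithPath: "/usr/bin/true")
let perLaunch = try averageLaunchTime(executable: trueBinary, iterations: 100)
print(String(format: "%.2f ms per launch", perLaunch * 1000))
```

Multiplying the per-launch figure by the number of files gives a floor for the overhead of a launch-per-file design before any parsing or JSON deserialization happens. A long-lived daemon pays its startup cost only once.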
I discovered this after reimplementing a very simple SwiftLint rule with a SwiftSyntax-based implementation: Fallthrough. This opt-in rule is a perfect proof-of-concept for integrating SwiftSyntax into SwiftLint because it literally just finds all occurrences of the fallthrough keyword and reports a violation at that location. I measured the time it took to lint a folder of ~100 Swift files from Lyft’s iOS codebase with only the fallthrough rule whitelisted.
I compiled SwiftLint twice with swift build -c release, once from master and once from this fallthrough-swift-syntax branch, and named the binaries swiftlint-master and swiftlint-swift-syntax. I then benchmarked both binaries using the excellent hyperfine utility.
The SwiftSyntax version was 22x slower than the existing SourceKitten version.
Note that I ran SwiftLint with its caching mechanism and logging disabled to accurately measure the time it took just to perform the lint, rather than the overhead from logging or skipping the lint entirely by just returning cached results. That said, logging only added 3ms to 10ms in my tests.
Ultimately, this means SwiftLint will be keeping its SourceKitten-based implementation for the foreseeable future, unless SwiftSyntax removes its reliance on costly compiler invocations and drastically improves its performance. I really hope the Swift team can somehow find a way to move parsing and lexing into SwiftSyntax itself, making the library much more appealing to use.