blog.sourced.tech

Detecting licenses in code with Go and ML

Why not just query GitHub's API to get the licenses?

We were not satisfied with its detection quality: many projects that actually contain a license file in a non-standard format are missed, and some are misclassified.

What they came up with is go-license-detector, which detects 99% of licenses in a test dataset (compared to GitHub's 75%) in a fraction of the time. And the winner is... MIT.
