An open-source search engine for open-source code in the climate field
This is a proof-of-concept for a search engine for open-source code in the climate sector.
To the best of the maintainer's knowledge, this project is currently unique because it combines the following aspects:
- it is 100% open-source (including the underlying data),
- it references repositories across hosting platforms,
- it has the highest repository coverage to date (since it combines data from the largest open-source indexes in the field),
- it is self-maintaining (i.e., the underlying data is regularly updated automatically to prevent dead links).
Contribution and feedback
The project is open-source and welcomes contributions: you can check its GitHub repository here, provide quick feedback here,
or get in touch with the maintainer here.
The project is still a proof-of-concept, with a very rudimentary underlying search engine and basic visual styling (as you may have noticed).
Indexing and scraping methodology
The indexing and scraping methodology is fully documented in the code repository.
It is nevertheless worth knowing that:
- indexing for search is based only on the descriptions and readme files of the repositories (this is quite restrictive),
- search currently matches keywords only (e.g., adjacent terms are not considered at this stage).
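
To make these two limitations concrete, here is a minimal sketch of what keyword-only search over descriptions and readme files could look like. It is not the project's actual implementation: the function names (`tokenize`, `build_index`, `search`), the record fields (`name`, `description`, `readme`), the sample repositories, and the choice to require every query keyword to match are all assumptions made for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric keywords."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(repos):
    """Map each keyword to the names of the repositories that contain it.

    Only the description and the readme are indexed (hypothetical fields).
    """
    index = defaultdict(set)
    for repo in repos:
        text = f"{repo['description']} {repo['readme']}"
        for token in tokenize(text):
            index[token].add(repo["name"])
    return index


def search(index, query):
    """Return the repositories containing every keyword of the query.

    Matching is exact: no stemming, synonyms, or related terms (assumption:
    all keywords must match, which the real engine may not require).
    """
    tokens = tokenize(query)
    if not tokens:
        return set()
    matches = [index.get(token, set()) for token in tokens]
    return set.intersection(*matches)


# Hypothetical sample data, for illustration only.
repos = [
    {
        "name": "solar-forecast",
        "description": "Forecasting solar power output",
        "readme": "Uses weather data to predict photovoltaic production.",
    },
    {
        "name": "carbon-tracker",
        "description": "Track carbon emissions of compute jobs",
        "readme": "Estimates emissions from power usage.",
    },
]

index = build_index(repos)
solar_hits = search(index, "solar power")
emission_hits = search(index, "emission")
```

In this sketch, `solar_hits` contains only `solar-forecast`, while `emission_hits` is empty: the indexed word is "emissions", and exact keyword matching does not relate it to "emission". Content that appears only in source files or documentation pages would not be found either, since only descriptions and readme files are indexed.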