rust-lang / rust-repos

Dataset of Rust source code repositories

127 stars, 41 forks, 127 watching, MIT License
rust-infra
HTTPS git clone https://github.com/rust-lang/rust-repos.git
SSH git clone git@github.com:rust-lang/rust-repos.git
CLI gh repo clone rust-lang/rust-repos
๐Ÿ“ .github
๐Ÿ“ data
๐Ÿ“ src
๐Ÿ“„ .gitignore
๐Ÿ“„ Cargo.lock
๐Ÿ“„ Cargo.toml
๐Ÿ“„ ci-update.sh
๐Ÿ“„ LICENSE
๐Ÿ“„ README.md
๐Ÿ“„ README.md

Rust repositories list

This repository contains a scraped list of all the public GitHub repos with source code written in the Rust programming language. The source code for the scraper is also included.

Everything in this repository, unless otherwise specified, is released under the MIT license.

Running the scraper

To run the scraper, set the GITHUB_TOKEN environment variable to a valid GitHub API token (no permissions are required) and pass the data directory as the first argument:

$ GITHUB_TOKEN=foobar cargo run --release -- data

The scraper automatically saves its state to disk, so it can be interrupted and will resume where it left off. This also allows incremental updates of the list.
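
The general idea of saving state so that a run can resume can be sketched as follows. This is a minimal illustration, not the scraper's actual implementation: the `state.txt` file name, the single numeric position, and the `load_state` and `save_state` helpers are all assumptions made for the example.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Load the last saved position, or start from 0 if no usable state file exists.
fn load_state(path: &Path) -> u64 {
    fs::read_to_string(path)
        .ok()
        .and_then(|s| s.trim().parse().ok())
        .unwrap_or(0)
}

/// Save the position by writing a temporary file and renaming it over the old one,
/// so an interruption in the middle of a write does not leave a truncated state file.
fn save_state(path: &Path, position: u64) -> io::Result<()> {
    let tmp = path.with_extension("tmp");
    fs::write(&tmp, position.to_string())?;
    fs::rename(&tmp, path)
}

fn main() -> io::Result<()> {
    let state_file = Path::new("state.txt"); // hypothetical file name
    let start = load_state(state_file);
    for position in start..start + 3 {
        // ... fetch and record one batch of repositories here ...
        save_state(state_file, position + 1)?;
    }
    println!("resumed at {}, now at {}", start, load_state(state_file));
    Ok(())
}
```

Running this sketch twice in the same directory continues from the position the first run reached, which is the same property that makes incremental updates possible.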

Using the data

The data is available in the data/github.csv file, in CSV format. For each repository, the file records its GitHub GraphQL ID, its name, and whether it contains a Cargo.toml and a Cargo.lock.

All the repositories in the dataset are marked by GitHub as using the Rust language. Since the list relies on GitHub's language detection, some results might be inaccurate.
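
As a rough sketch of how data/github.csv might be consumed, the following reads the file with only the standard library and counts repositories that have a Cargo.lock. The column order (`id,name,has_cargo_toml,has_cargo_lock`), the header row, and the `true`/`false` encoding are assumptions based on the description above; check the file itself before relying on them. The `Repo` struct and `parse_repos` function are names invented for this example.

```rust
use std::fs;

/// One row of the dataset. Field names are assumptions based on the README's description.
#[derive(Debug, PartialEq)]
struct Repo {
    id: String,
    name: String,
    has_cargo_toml: bool,
    has_cargo_lock: bool,
}

/// Parse CSV text, assuming a header row followed by
/// `id,name,has_cargo_toml,has_cargo_lock` rows with `true`/`false` values
/// and no quoted or comma-containing fields.
fn parse_repos(csv: &str) -> Vec<Repo> {
    csv.lines()
        .skip(1) // skip the assumed header row
        .filter_map(|line| {
            let fields: Vec<&str> = line.trim().split(',').collect();
            if fields.len() != 4 {
                return None; // ignore blank or malformed lines
            }
            Some(Repo {
                id: fields[0].to_string(),
                name: fields[1].to_string(),
                has_cargo_toml: fields[2] == "true",
                has_cargo_lock: fields[3] == "true",
            })
        })
        .collect()
}

fn main() {
    // Fall back to an empty dataset if the file is missing.
    let text = fs::read_to_string("data/github.csv").unwrap_or_default();
    let repos = parse_repos(&text);
    let with_lock = repos.iter().filter(|r| r.has_cargo_lock).count();
    println!("{} repositories, {} with a Cargo.lock", repos.len(), with_lock);
}
```

For anything beyond a quick look, a dedicated CSV parsing library would handle quoting and malformed rows more robustly than splitting on commas.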