rust-lang / rust-repos

Dataset of Rust source code repositories


Rust repositories list

This repository contains a scraped list of all the public GitHub repos with source code written in the Rust programming language. The source code for the scraper is also included.

Everything in this repository, unless otherwise specified, is released under the MIT license.

Running the scraper

To run the scraper, set the GITHUB_TOKEN environment variable to a valid GitHub API token (no permissions are required) and pass the data directory as the first argument:

$ GITHUB_TOKEN=foobar cargo run --release -- data

The scraper automatically saves its state to disk, so it can be interrupted and will resume where it left off. This also allows incremental updates of the list.
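
The resume behaviour described above can be illustrated with a minimal checkpointing sketch. This is not the scraper's actual code: the state file layout, the numeric cursor, and the names `load_cursor`, `save_cursor`, and `run` are all assumptions made for illustration.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Reads the last saved cursor from `state_file` (hypothetical format:
/// a single integer). Returns `None` if no state has been saved yet.
pub fn load_cursor(state_file: &Path) -> io::Result<Option<u64>> {
    match fs::read_to_string(state_file) {
        Ok(text) => Ok(text.trim().parse::<u64>().ok()),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(e) => Err(e),
    }
}

/// Writes the cursor to a temporary file, then renames it over the state
/// file, so an interruption during the write is less likely to leave a
/// truncated state file behind.
pub fn save_cursor(state_file: &Path, cursor: u64) -> io::Result<()> {
    let tmp = state_file.with_extension("tmp");
    fs::write(&tmp, cursor.to_string())?;
    fs::rename(&tmp, state_file)
}

/// Visits every ID after the saved cursor and below `upper`,
/// checkpointing after each one so a later run can resume from there.
pub fn run<F: FnMut(u64)>(state_file: &Path, upper: u64, mut visit: F) -> io::Result<()> {
    let start = load_cursor(state_file)?.map_or(0, |c| c + 1);
    for id in start..upper {
        visit(id);
        save_cursor(state_file, id)?;
    }
    Ok(())
}
```

Because progress is persisted after every item, running the same function again with a larger `upper` only visits the new items, which is one way incremental updates could work.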

Using the data

The data is available in the data/github.csv file, in CSV format. Each row contains the GitHub GraphQL ID of the repository, its name, and whether it contains a Cargo.toml and a Cargo.lock.
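
As a rough sketch of how the file could be consumed, the following reads it using only the standard library. It assumes four unquoted, comma-separated fields per row in the order described above, with the two flags written as `true` or `false`. The struct, its field names, and the function names are illustrative, not part of this repository.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader};
use std::path::Path;

/// One row of the dataset. Field names are assumptions for illustration.
#[derive(Debug, PartialEq)]
pub struct Repo {
    pub id: String,
    pub name: String,
    pub has_cargo_toml: bool,
    pub has_cargo_lock: bool,
}

/// Parses one line, assuming the layout `id,name,<bool>,<bool>`.
/// Returns `None` for lines that do not match, such as a header row.
pub fn parse_row(line: &str) -> Option<Repo> {
    let fields: Vec<&str> = line.trim().split(',').collect();
    if fields.len() != 4 {
        return None;
    }
    Some(Repo {
        id: fields[0].to_string(),
        name: fields[1].to_string(),
        has_cargo_toml: fields[2].parse::<bool>().ok()?,
        has_cargo_lock: fields[3].parse::<bool>().ok()?,
    })
}

/// Loads every parseable row from the CSV file at `path`.
pub fn load_repos(path: &Path) -> io::Result<Vec<Repo>> {
    let reader = BufReader::new(File::open(path)?);
    let mut repos = Vec::new();
    for line in reader.lines() {
        if let Some(repo) = parse_row(&line?) {
            repos.push(repo);
        }
    }
    Ok(repos)
}
```

Calling `load_repos(Path::new("data/github.csv"))` and filtering on `has_cargo_lock` would then give, for example, the repositories that commit a lockfile. If the file turns out to use quoted fields, a full CSV parser would be a safer choice than splitting on commas.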

All the repositories contained in the dataset are marked by GitHub as using Rust. Because the list relies on GitHub's language detection, some results might be inaccurate.