With GitStats you can pull git data in a uniform way from GitHub and GitLab. For the time-being you can get data on:
- organizations,
- repositories,
- commits,
- issues,
- users,
- release logs,
- files tree (structure),
- text files content,
- pull requests.
You can also prepare basic statistics with get_*_stats() functions for commits and issues.
Installation
From CRAN:
install.packages("GitStats")From GitHub:
devtools::install_github("r-world-devs/GitStats")Examples:
Setup your GitStats:
library(GitStats)
git_stats <- create_gitstats() |>
set_gitlab_host(
repos = "mbtests/gitstatstesting"
) |>
set_github_host(
orgs = "r-world-devs",
repos = "openpharma/DataFakeR"
) Get commits:
commits <- git_stats |>
get_commits(
since = "2022-01-01"
)
commits
#> # A tibble: 3,434 × 11
#> repo_name id committed_date author author_login author_name additions
#> <chr> <chr> <dttm> <chr> <chr> <chr> <int>
#> 1 gitstats… 7f48… 2024-09-10 11:12:59 Macie… maciekbanas Maciej Ban… 0
#> 2 gitstats… 9c66… 2024-09-10 10:35:37 Macie… maciekbanas Maciej Ban… 0
#> 3 gitstats… fca2… 2024-09-10 10:31:24 Macie… maciekbanas Maciej Ban… 0
#> 4 gitstats… e8f2… 2023-03-30 14:15:33 Macie… maciekbanas Maciej Ban… 1
#> 5 gitstats… 7e87… 2023-02-10 09:48:55 Macie… maciekbanas Maciej Ban… 1
#> 6 gitstats… 62c4… 2023-02-10 09:17:24 Macie… maciekbanas Maciej Ban… 2
#> 7 gitstats… 55cf… 2023-02-10 09:07:54 Macie… maciekbanas Maciej Ban… 92
#> 8 shinyGiz… C_kw… 2023-05-08 09:43:31 Kryst… krystian8207 Krystian I… 18
#> 9 shinyGiz… C_kw… 2023-04-28 12:30:40 Kamil… <NA> Kamil Kozi… 18
#> 10 shinyGiz… C_kw… 2023-03-01 15:05:10 Kryst… krystian8207 Krystian I… 296
#> # ℹ 3,424 more rows
#> # ℹ 4 more variables: deletions <int>, organization <chr>, repo_url <chr>,
#> # api_url <glue>
commits |>
get_commits_stats(
time_aggregation = "month",
group_var = author
)
#> # A tibble: 374 × 4
#> stats_date githost author stats
#> <dttm> <chr> <chr> <int>
#> 1 2022-01-01 00:00:00 github Admin_mschuemi 1
#> 2 2022-01-01 00:00:00 github Gowtham Rao 5
#> 3 2022-01-01 00:00:00 github Krystian Igras 1
#> 4 2022-01-01 00:00:00 github Martijn Schuemie 1
#> 5 2022-02-01 00:00:00 github Hadley Wickham 3
#> 6 2022-02-01 00:00:00 github Martijn Schuemie 2
#> 7 2022-02-01 00:00:00 github Maximilian Girlich 13
#> 8 2022-02-01 00:00:00 github Reijo Sund 1
#> 9 2022-02-01 00:00:00 github eitsupi 1
#> 10 2022-03-01 00:00:00 github Maximilian Girlich 14
#> # ℹ 364 more rowsGet repositories with specific code:
git_stats |>
get_repos(
with_code = "shiny",
add_contributors = FALSE
)
#> # A tibble: 9 × 17
#> repo_id repo_name repo_fullpath default_branch stars forks created_at
#> <chr> <chr> <chr> <chr> <int> <int> <dttm>
#> 1 R_kgDO… GitAI r-world-devs… main 8 2 2024-11-07 11:51:03
#> 2 R_kgDO… hypothes… r-world-devs… master 2 0 2023-04-13 13:52:24
#> 3 R_kgDO… shinyGiz… r-world-devs… dev 21 0 2022-04-20 10:04:32
#> 4 R_kgDO… shinyTim… r-world-devs… master 3 0 2023-02-21 16:41:59
#> 5 R_kgDO… shinyCoh… r-world-devs… dev 10 0 2022-05-22 19:04:12
#> 6 R_kgDO… cohortBu… r-world-devs… dev 9 2 2022-05-22 18:31:55
#> 7 R_kgDO… GitStats r-world-devs… master 9 2 2023-01-09 14:02:20
#> 8 R_kgDO… shinyQue… r-world-devs… master 3 0 2024-09-20 18:59:56
#> 9 R_kgDO… queryBui… r-world-devs… master 1 1 2024-09-20 14:54:12
#> # ℹ 10 more variables: last_activity_at <dttm>, languages <chr>,
#> # issues_open <int>, issues_closed <int>, organization <chr>, repo_url <chr>,
#> # commit_sha <chr>, api_url <chr>, githost <chr>, last_activity <drtn>Get files:
git_stats |>
get_files(
pattern = "\\.md",
depth = 2L
)
#> # A tibble: 68 × 10
#> repo_id repo_name organization file_path file_content file_size file_id
#> <chr> <chr> <chr> <chr> <chr> <int> <chr>
#> 1 43398933 gitstatst… mbtests README.md "# GitStats… 122 fe2407…
#> 2 R_kgDOHNMr2w shinyGizmo r-world-devs NEWS.md "# shinyGiz… 2186 c13994…
#> 3 R_kgDOHNMr2w shinyGizmo r-world-devs README.md "\n# shinyG… 2337 585f62…
#> 4 R_kgDOHNMr2w shinyGizmo r-world-devs cran-com… "## Test en… 1700 cfcaf4…
#> 5 R_kgDOHYNOFQ cohortBui… r-world-devs NEWS.md "# cohortBu… 1369 8c5cf3…
#> 6 R_kgDOHYNOFQ cohortBui… r-world-devs README.md "\n# cohort… 13382 ebc919…
#> 7 R_kgDOHYNrJw shinyCoho… r-world-devs NEWS.md "# shinyCoh… 2213 8b0e9a…
#> 8 R_kgDOHYNrJw shinyCoho… r-world-devs README.md "\n# shinyC… 3355 e49dfe…
#> 9 R_kgDOHYNxtw cohortBui… r-world-devs README.md "\n# cohort… 3472 d4687c…
#> 10 R_kgDOIvtxsg GitStats r-world-devs LICENSE.… "# MIT Lice… 1075 141471…
#> # ℹ 58 more rows
#> # ℹ 3 more variables: repo_url <chr>, commit_sha <chr>, api_url <chr>Get pull requests:
git_stats |>
get_pull_requests(
since = "2022-01-01"
)
#> # A tibble: 411 × 10
#> repo_name number created_at merged_at state author
#> <chr> <chr> <dttm> <dttm> <chr> <chr>
#> 1 gitstatstesting 2 2026-02-25 09:34:18 NA closed <NA>
#> 2 gitstatstesting 1 2026-02-19 11:58:14 NA open <NA>
#> 3 shinyGizmo 1 2022-04-22 12:49:10 2022-04-22 12:58:32 merged krysti…
#> 4 shinyGizmo 2 2022-05-18 08:21:46 2022-06-02 11:16:13 merged galach…
#> 5 shinyGizmo 10 2022-06-13 09:13:59 2022-06-13 09:25:45 merged krysti…
#> 6 shinyGizmo 11 2022-06-13 10:22:16 2022-06-13 20:08:35 merged krysti…
#> 7 shinyGizmo 13 2022-06-13 21:34:05 2022-06-17 08:49:33 merged stla
#> 8 shinyGizmo 16 2022-06-15 13:48:08 2022-06-15 14:12:52 merged krysti…
#> 9 shinyGizmo 17 2022-06-17 10:55:10 2022-06-17 10:56:09 merged krysti…
#> 10 shinyGizmo 19 2022-07-05 10:01:44 2022-07-05 11:05:30 merged krysti…
#> # ℹ 401 more rows
#> # ℹ 4 more variables: source_branch <chr>, target_branch <chr>,
#> # organization <chr>, api_url <glue>Print GitStats to see what it stores:
git_stats
#> A GitStats object for 2 hosts:
#> Hosts: https://gitlab.com/api/v4, https://api.github.com
#> Scanning scope:
#> Organizations: [1] r-world-devs
#> Repositories: [2] mbtests/gitstatstesting, openpharma/DataFakeR
#> Storage:
#> Repositories: 9
#> Commits: 3434 [date range: 2022-01-01 - 2026-02-26]
#> Files: 68 [file pattern: \.md]
#> Pull_requests: 411 [date range: 2022-01-01 - 2026-02-26]See also
GitStats is used to facilitate workflow of the GitAI R package, a tool for gathering AI-based knowledge about git repositories: https://r-world-devs.github.io/GitAI/
Acknowledgement
Special thanks to James Black, Karolina Marcinkowska, Kamil Koziej, Matt Secrest, Krystian Igras, Kamil Wais, Adam Forys - for the support in the package development.
