AGENTS.md

Working reference for AI coding agents contributing to this repository. For environment setup (Node.js, package manager, editor configuration), see README.md.

Project Overview
Technical Context
Project Structure
Build And Test Commands
Contribution Instructions
Code Guidelines

Project Overview

@adguard/github-stats is a CLI toolkit that polls activity events from the GitHub REST API for a given repository and stores them as daily JSONL files with sidecar metadata. It aggregates per-contributor statistics for an arbitrary time window and either prints them to the console or publishes them to a Slack channel as formatted Block Kit messages.

The package ships three executables:

github-poll — collect today's events from the GitHub Events API and append them (deduplicated) to the local JSONL collection.
github-stats — compute and print contributor and repository statistics for a given time period.
github-publish — compute statistics for a time period and publish them to a Slack channel.

It is intended to be invoked from CI (e.g. GitHub Actions — see examples/) on a regular schedule.
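
The deduplicated append performed by github-poll can be pictured with a minimal sketch. It assumes every event carries a unique `id` field, as GitHub event objects do. The names `mergeEvents` and `toJsonl` are hypothetical and are not taken from src/poll-events/.

```javascript
// Hypothetical sketch: merge freshly polled events into the events already
// stored for the day, skipping any event whose id is already present.
function mergeEvents(storedEvents, polledEvents) {
    const seenIds = new Set(storedEvents.map((event) => event.id));
    const merged = [...storedEvents];
    polledEvents.forEach((event) => {
        if (!seenIds.has(event.id)) {
            seenIds.add(event.id);
            merged.push(event);
        }
    });
    return merged;
}

// Serialize events as JSONL: one JSON document per line.
function toJsonl(events) {
    return events.map((event) => JSON.stringify(event)).join('\n');
}

const stored = [{ id: '1', type: 'IssuesEvent' }];
const polled = [{ id: '1', type: 'IssuesEvent' }, { id: '2', type: 'PushEvent' }];
// eslint-disable-next-line no-console
console.log(toJsonl(mergeEvents(stored, polled)));
```

Running the merge twice with the same polled batch yields the same result, which is the property a scheduled poller needs when consecutive runs overlap.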

Technical Context

Language/Version: JavaScript (ES Modules in src/, transpiled by Babel, bundled to CommonJS in bin/). Targets modern Node.js (Node 22.17.0+, enforced via the engines field in package.json).
Primary Dependencies: @octokit/core (GitHub REST), @slack/web-api (Slack publishing), date-fns (date math), lodash (utilities), stream-json / stream-chain (streaming JSONL parsing), dotenv (env loading).
Storage: Local filesystem. Events are written as one JSONL file per day (YYYY-MM-DD.jsonl) plus a YYYY-MM-DD-metadata.json sidecar under the directory pointed to by COLLECTION_PATH (e.g. stats-data/).
Testing: Jest (with babel-jest). Tests live under tests/ mirroring the src/ structure and use fixtures from tests/test-files/.
Target Platform: Node.js CLI, run locally or from CI runners (GitHub Actions). No browser support.
Project Type: Single-package CLI tool (multi-binary).
Performance Goals: N/A. Runs are short-lived (seconds to a couple of minutes) and bounded by the GitHub Events API, which returns at most MAX_NUMBER_OF_MOST_RECENT_EVENTS (300) events per repo.
Constraints:
- GitHub Events API only exposes the 300 most recent events per repo, so github-poll must be scheduled frequently enough to avoid gaps.
- Unauthenticated requests are limited to 60/hour; provide GITHUB_TOKEN in production.
- Daily JSONL files older than EVENT_EXPIRATION_DAYS (30) are pruned during stat preparation.
Scale/Scope: Single repository per invocation, single Slack channel per publish run. Designed for the AdGuard filter repositories (see TEAM_MEMBERS in src/constants.js) but configurable via env vars.
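
The storage layout and the expiration rule above can be sketched as follows. This is an illustration under stated assumptions, not the code in src/: the helper names are hypothetical, days are computed in UTC, and a file counts as expired only when it is strictly more than EVENT_EXPIRATION_DAYS old.

```javascript
const EVENT_EXPIRATION_DAYS = 30; // value quoted in the constraints above
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Build the two file names used for a given day (UTC is an assumption).
function dailyFileNames(date) {
    const day = date.toISOString().slice(0, 10); // YYYY-MM-DD
    return { events: `${day}.jsonl`, metadata: `${day}-metadata.json` };
}

// Decide whether a daily events file is old enough to prune.
function isExpired(fileName, now, expirationDays = EVENT_EXPIRATION_DAYS) {
    const match = /^(\d{4}-\d{2}-\d{2})\.jsonl$/.exec(fileName);
    if (!match) {
        return false; // not a daily events file, so leave it alone
    }
    const fileDay = Date.parse(`${match[1]}T00:00:00Z`);
    const ageInDays = Math.floor((now.getTime() - fileDay) / MS_PER_DAY);
    return ageInDays > expirationDays;
}
```

Keeping the date in the file name means pruning needs only a directory listing; no file has to be opened to decide whether it is stale.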

Project Structure

.
├── bin/                          # Rollup build output (CJS, with shebang)
│   ├── github-poll.js            # Entry: poll GitHub events
│   ├── github-publish.js         # Entry: publish stats to Slack
│   └── github-stats.js           # Entry: print stats to console
├── .github/
│   └── workflows/                # Production GitHub Actions (poll, stats, publish)
├── examples/                     # Sample GitHub Actions workflows for downstream users
├── scripts/                      # (reserved; currently empty)
├── src/
│   ├── github-poll.js            # CLI entry: read env, call pollEvents
│   ├── github-publish.js         # CLI entry: prepareStats + publishStats
│   ├── github-stats.js           # CLI entry: prepareStats + printStats
│   ├── constants.js              # Shared constants (team, thresholds, ...)
│   ├── poll-events/              # GitHub polling + JSONL append
│   ├── prepare-stats/            # Aggregation: contributors, repo, activity
│   ├── print-stats/              # Console formatting
│   ├── publish-stats/            # Slack publishing orchestration
│   ├── publish-utils/            # Slack client + Block Kit formatters
│   │   └── format-utils/         # Block Kit builders, pruning
│   └── tools/                    # Shared helpers (fs, gh, streams, print)
├── tests/                        # Jest tests mirroring src/ layout
│   ├── publish-utils/            # Tests for Slack block formatters
│   └── test-files/               # Fixtures
├── stats-data/                   # Default COLLECTION_PATH (JSONL + metadata)
├── babel.config.js               # Babel preset-env config
├── rollup.config.js              # Bundle src/*.js → bin/ as CJS w/ shebang
├── .eslintrc                     # ESLint (airbnb-base + project overrides)
├── .eslintignore
├── .env-example                  # Template for local .env
├── package.json
├── README.md                     # User-facing usage and install docs
└── AGENTS.md                     # This file

Build And Test Commands

This project uses Yarn as its package manager (yarn.lock is the committed lockfile and all GitHub Actions workflows under .github/workflows/ invoke yarn). Use yarn for day-to-day work; npm run <script> also works because the scripts are plain wrappers, but match the existing tooling unless you have a reason not to.

| Command | Purpose |
| --- | --- |
| `yarn install` | Install dependencies |
| `yarn build` | Build CLI bundles (`src/` → `bin/`) |
| `yarn test` | Run unit tests |
| `yarn lint` | Lint the entire repo |
| `yarn lint --fix` | Auto-fix lint issues |
| `yarn lint-staged` | Run lint-staged (used by Husky) |
| `yarn poll` | Poll events (locally) |
| `yarn stats` | Print stats (locally) |
| `yarn publish` | Publish to Slack (locally) |

There is no separate formatter or type checker — ESLint (airbnb-base) is the only static-analysis gate, and yarn build (Rollup + Babel) is the closest thing to a type/parse check.

Contribution Instructions

You MUST verify your changes with the linter and the build.

Use the following commands:
- yarn lint to run ESLint over src/, tests/, and config files
- yarn lint --fix to auto-fix lint issues where possible
- yarn build to ensure the Rollup bundle still produces valid CommonJS output in bin/
You MUST update the unit tests under tests/ for any changed code paths, and add new tests for new behavior.
You MUST run yarn test and confirm all Jest suites pass before considering a task done.
You MUST keep the bin/ directory in sync with src/ by running yarn build whenever you change source files that are bundled into a CLI entry. Commit the regenerated bin/ artifacts together with the source change (the repository ships them as runnable binaries).
When making changes to the project structure (adding/removing directories, moving modules, renaming entry points), ensure the Project Structure section in this AGENTS.md is updated and remains valid.
If a prompt essentially asks you to refactor or improve existing code, check whether the underlying rule can be phrased as a code guideline. If it can, add it to the relevant Code Guidelines subsection in this AGENTS.md.
After completing the task you MUST verify that the code you've written follows the Code Guidelines in this file.

Code Guidelines

System Design

Design for a command-line tool:

The tool runs and exits — no long-lived daemons. Each CLI entry (github-poll, github-stats, github-publish) performs the requested work and exits with code 0 on success, non-zero on failure.
Prefer stdout for primary output (stats, results) and stderr for diagnostics and errors. Keep the channels separate so that output can be piped or captured by CI.
Fail fast with clear messages — validate required env vars (COLLECTION_PATH, REPO, plus SLACK_* for publish) at the very top of each entry script, log a human-readable error to stderr, and exit with a non-zero status.
Configuration is via environment variables (loaded via dotenv). Document every supported variable in README.md and in .env-example.
Keep startup time short — avoid heavy top-level work in modules that may be imported but not used by every entry point.
Treat the local filesystem as the only persistent state. Writes to COLLECTION_PATH MUST be idempotent: re-running github-poll on the same day MUST deduplicate events, not duplicate them.
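
The fail-fast rule for required environment variables might look like the sketch below. The helper names (`findMissingEnv`, `assertEnv`) and the `REQUIRED_FOR_PUBLISH` list are hypothetical; only the variable names themselves come from this document, and treating both SLACK_* variables as mandatory is an assumption.

```javascript
// Hypothetical list: variables assumed to be required by github-publish.
const REQUIRED_FOR_PUBLISH = ['COLLECTION_PATH', 'REPO', 'SLACK_OAUTH_TOKEN', 'SLACK_CHANNEL_ID'];

// Return the names of required variables that are unset or empty.
function findMissingEnv(env, requiredNames) {
    return requiredNames.filter((name) => !env[name]);
}

// Throw a single, human-readable error that names every missing variable.
function assertEnv(env, requiredNames) {
    const missing = findMissingEnv(env, requiredNames);
    if (missing.length > 0) {
        throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
    }
}

// At the top of an entry script (sketch, left as a comment so it does not exit here):
// try {
//     assertEnv(process.env, REQUIRED_FOR_PUBLISH);
// } catch (error) {
//     console.error(error.message);
//     process.exit(1);
// }
```

Reporting all missing variables at once saves a CI round trip per variable compared with failing on the first one.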

Architecture

Universal principles the codebase follows:

Separation of Concerns — polling, aggregation, console output, and Slack publishing each live in their own module directory under src/.
Single Responsibility Principle — each file does one thing (e.g. prepare-contributors.js, format-repo-stat.js, publish-message.js). Less critical for tiny helper modules under src/tools/, but still followed.
Dependency Direction — CLI entry scripts depend on feature modules, feature modules depend on tools/ and publish-utils/; never the reverse. Constants flow from src/constants.js to everyone.
Explicit Boundaries — every feature module exposes its public API through an index.js barrel. Importers MUST go through index.js rather than reaching into internal files.
Data Flow Clarity — events flow GitHub API → JSONL files → prepare-stats → {print-stats, publish-stats}. There are no back edges.
Minimize Coupling, Maximize Cohesion — Slack-specific logic is confined to publish-stats/ (orchestration) and publish-utils/ (client and Block Kit formatters); aggregation under prepare-stats/ knows nothing about Slack or the console.
Make Invalid States Impossible — less applicable in untyped JavaScript, but enforced through small, focused functions and constants in src/constants.js (event types, action names, label names) instead of magic strings.
Observability Built-in — github-poll logs structured progress and rate-limit warnings to stdout/stderr. New long-running steps SHOULD log similarly via console.log / console.warn.
Keep It Boring — prefer well-understood Node patterns (async/await, plain objects) over clever or novel solutions.

This project follows a layered architecture. From top (entry) to bottom (infrastructure):

CLI entries (src/github-{poll,stats,publish}.js, bin/*)
        ↓
Feature modules (src/poll-events, src/prepare-stats,
                 src/print-stats, src/publish-stats)
        ↓
Domain utilities (src/publish-utils, src/publish-utils/format-utils)
        ↓
Shared tools & constants (src/tools, src/constants.js)
        ↓
External services (GitHub REST API, Slack Web API, local filesystem)

A higher layer may call any layer below it. No layer may depend on a layer above it. Sibling feature modules MUST NOT import from each other; if they need shared logic, lift it into src/tools/ or src/publish-utils/.
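
The layering rule can be expressed as a small check, shown here only to make the rule concrete. Nothing like this is known to exist in the repository: `LAYERS`, `layerOf`, `featureModuleOf`, and `isImportAllowed` are hypothetical names, and permitting same-layer imports outside the feature-module layer is an assumption.

```javascript
// Layers from top (index 0) to bottom, as path prefixes relative to the repo root.
const LAYERS = [
    ['src/github-poll.js', 'src/github-stats.js', 'src/github-publish.js'],
    ['src/poll-events/', 'src/prepare-stats/', 'src/print-stats/', 'src/publish-stats/'],
    ['src/publish-utils/'],
    ['src/tools/', 'src/constants.js'],
];
const FEATURE_LAYER = 1;

// Index of the layer a file belongs to, or -1 if it matches no layer.
function layerOf(filePath) {
    return LAYERS.findIndex((prefixes) => prefixes.some((prefix) => filePath.startsWith(prefix)));
}

// The feature module directory a file lives in, if any.
function featureModuleOf(filePath) {
    return LAYERS[FEATURE_LAYER].find((prefix) => filePath.startsWith(prefix));
}

function isImportAllowed(importerPath, importedPath) {
    const from = layerOf(importerPath);
    const to = layerOf(importedPath);
    if (from === -1 || to === -1) {
        return false; // unknown location: reject rather than guess
    }
    if (to !== from) {
        return to > from; // downward imports only
    }
    if (from === FEATURE_LAYER) {
        // Sibling feature modules must not import from each other.
        return featureModuleOf(importerPath) === featureModuleOf(importedPath);
    }
    return true; // assumption: other same-layer imports are fine
}

// eslint-disable-next-line no-console
console.log(isImportAllowed('src/github-stats.js', 'src/prepare-stats/index.js')); // true
// eslint-disable-next-line no-console
console.log(isImportAllowed('src/prepare-stats/index.js', 'src/publish-stats/index.js')); // false
```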

Code Quality

Module system: Source under src/ and tests/ is authored as ES Modules (import / export). Babel + Rollup transpile to CommonJS for the bin/ output.
Linter: ESLint with eslint-config-airbnb-base. Project-specific overrides live in .eslintrc. Do NOT relax the airbnb rules globally; if a rule is genuinely wrong for a single line, use a scoped // eslint-disable-next-line <rule> with a reason rather than editing .eslintrc.
No formatter / no type checker: There is no Prettier, no TypeScript. Match the surrounding style; let ESLint catch deviations.
Error handling: Top-level CLI entry scripts SHOULD wrap their main IIFE in try/catch, log via console.error, and process.exit(1) on failure. Library functions SHOULD throw normal Error instances; let the caller decide how to recover.
Logging: Use console.log for progress, console.warn for recoverable issues (rate-limit warnings, missing optional data), and console.error for fatal errors. Disable the no-console ESLint rule inline (// eslint-disable-next-line no-console) at each call site, as done elsewhere in the codebase.
Naming: Files and directories use kebab-case (prepare-contributors-stat.js). Exported identifiers use camelCase for functions/values and PascalCase for classes (e.g. Contributor). Constants are UPPER_SNAKE_CASE and live in src/constants.js.
Imports: Each feature module exposes its public surface through index.js. Importers from outside the module MUST import from the module's directory (e.g. from './prepare-stats'), not from internal files.
JSDoc: Document non-trivial functions and exported constants with JSDoc, following the style already present in src/constants.js.
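
The error-handling guideline for entry scripts can be sketched with a small wrapper. This is a variation on the main-IIFE pattern described above, not the code used in src/: `runCli` is a hypothetical name, and the injectable `exit` and `logError` parameters exist only so the wrapper can be exercised without terminating the process.

```javascript
// Default reporter: fatal errors go to stderr.
const defaultLogError = (message) => {
    // eslint-disable-next-line no-console
    console.error(message);
};

// Default exit: terminate the process with the given status code.
const defaultExit = (code) => process.exit(code);

// Run an async main function; on failure, log the message and exit with 1.
async function runCli(main, { exit = defaultExit, logError = defaultLogError } = {}) {
    try {
        await main();
    } catch (error) {
        logError(error instanceof Error ? error.message : String(error));
        exit(1);
    }
}

// Usage in an entry script (sketch):
// runCli(async () => {
//     /* validate env, do the work */
// });
```

Library functions called from `main` simply throw; the wrapper is the single place where an exception becomes a non-zero exit status.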

Testing

Framework: Jest with babel-jest. There is no explicit Jest configuration, so Jest's defaults apply. Tests are discovered under tests/.
Layout: Tests mirror the src/ directory layout under tests/ (e.g. src/publish-utils/format-utils/format-repo-stat.js → tests/publish-utils/repo-stat-to-blocks.test.js).
Naming: Test files end in .test.js.
Fixtures: Static input/output fixtures live under tests/test-files/. Reuse existing fixtures when possible; add new ones rather than mutating shared ones.
What to test: Focus on pure functions in prepare-stats/ and publish-utils/format-utils/ — they have well-defined inputs/outputs and no I/O. Network and filesystem boundaries (@octokit/core, @slack/web-api) SHOULD be mocked with jest.mock rather than hit live.
Coverage: No formal threshold is enforced. New behavior MUST come with at least one test that exercises the success path and one for any meaningful failure path.
Verification: yarn test MUST pass with no failing or pending suites before a change is considered done.

Dependency Management

Pin all dependency versions explicitly — replace caret/tilde ranges with exact versions in package.json for any new or updated dependency. Do not allow the package manager to silently move to an untested release.
Prefer vanilla solutions — Node's standard library (node:fs/promises, node:path, node:stream, global fetch on Node 18+) usually suffices. Only add a dependency when it provides meaningful value over a hand-rolled solution.
Reputable sources only — new dependencies MUST come from well-established, actively maintained projects (high weekly downloads, recent commits, recognized maintainers).
Avoid unpopular libraries — do NOT add niche or obscure packages. They are a supply-chain risk and tend to become unmaintained.
Minimize dependency count — every new dependency increases attack surface and bundle size. Justify each addition in the PR description.
Use the latest stable version — when adding a dependency, check the npm registry for the latest stable release rather than copying a version number from another project or from memory.

Rationale: Fewer, well-vetted dependencies reduce security vulnerabilities, supply-chain risks, and long-term maintenance cost.

Known exceptions (to be fixed):

lodash is largely replaceable with built-in language features for the small set of helpers actually used; consider migrating.
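
Which lodash helpers the codebase actually uses is not stated here. Purely as an illustration, the sketch below assumes helpers in the style of `groupBy` and `sumBy` and shows how each reduces to a few lines of built-in array methods. The sample data is made up.

```javascript
// Built-in equivalent of a groupBy-style helper: bucket items by a computed key.
function groupBy(items, getKey) {
    return items.reduce((groups, item) => {
        const key = getKey(item);
        return { ...groups, [key]: [...(groups[key] || []), item] };
    }, {});
}

// Built-in equivalent of a sumBy-style helper: total a numeric property.
function sumBy(items, getValue) {
    return items.reduce((total, item) => total + getValue(item), 0);
}

// Made-up sample data for illustration only.
const activity = [
    { login: 'alice', count: 2 },
    { login: 'bob', count: 1 },
    { login: 'alice', count: 3 },
];
const byLogin = groupBy(activity, (entry) => entry.login);
// eslint-disable-next-line no-console
console.log(Object.keys(byLogin), sumBy(byLogin.alice, (entry) => entry.count));
```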

Configuration & Documentation

Runtime configuration: All configuration is provided via environment variables, loaded by dotenv at the top of each CLI entry in src/. The supported variables are: COLLECTION_PATH, GITHUB_TOKEN, REPO, SINCE, UNTIL, SLACK_OAUTH_TOKEN, SLACK_CHANNEL_ID. See README.md#configuration for semantics.
Local overrides: A .env file in the repo root is loaded at startup. .env is gitignored; commit changes to .env-example instead so contributors can copy it.
No secrets in code: Tokens (GITHUB_TOKEN, SLACK_OAUTH_TOKEN) MUST come from the environment. Never hardcode them, log them, or commit them. Constants that are public (team handles, thresholds, excluded usernames) belong in src/constants.js.
Documentation to keep in sync:
- README.md — when CLI behavior, env vars, or install instructions change.
- .env-example — when a new env var is introduced or a name changes.
- examples/ — when the GitHub Actions integration surface changes.
- This AGENTS.md — when project structure, build/test commands, conventions, or architecture change.
- CHANGELOG.md — for any user-visible change.

Markdown Formatting

All Markdown files MUST comply with the rules defined in .markdownlint.json. Run yarn lint to check for violations; yarn lint --fix auto-fixes most of them.

Other

Bundled CLI artifacts: The contents of bin/ are generated by Rollup. Do NOT hand-edit them. The hashed sibling files (fs-utils-*.js, prepare-stats-*.js) are Rollup chunk artifacts; removing or renaming them by hand will break the published binaries — re-run yarn build to regenerate them cleanly.
Daily JSONL files: stats-data/*.jsonl and the matching *-metadata.json sidecars are runtime data, not code. Treat them as fixtures only; do not refactor through them.
Husky / lint-staged: A pre-commit hook (configured under .husky/) runs lint-staged, which lints staged {src,tests,scripts}/**/*.js files. Do not bypass it with --no-verify unless you have a specific reason and note it in the PR.
