GitHub & Engineering
Track engineering activity, open source engagement, and repository health across millions of companies with public GitHub presence.
GitHub signals surface engineering activity patterns — what companies are building, how fast their open-source projects are growing, and where they're investing technically.
We track public GitHub organizations mapped to company domains, monitoring repository-level metrics (stars, forks, watchers) over 30, 60, and 180-day windows. Signals fire when growth exceeds thresholds (>20% star growth in 90 days), when patterns indicate strategic investment (new AI/ML repos, infrastructure tooling, platform ecosystem plays), or when portfolio-wide momentum suggests a company is in active engineering build-out mode. Each signal includes both portfolio-level metrics and individual repository details.
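As a rough illustration of the threshold check described above, the sketch below computes percentage growth between two star counts and compares it with the 20% figure. The function names and snapshot values are hypothetical, and the actual detection pipeline may work differently.

```python
def growth_pct(previous: int, current: int) -> float:
    """Percentage change from `previous` to `current`."""
    if previous <= 0:
        raise ValueError("previous count must be positive to compute growth")
    return (current - previous) / previous * 100.0


def exceeds_threshold(previous: int, current: int, threshold_pct: float = 20.0) -> bool:
    """True when growth is strictly greater than the threshold."""
    return growth_pct(previous, current) > threshold_pct


# Hypothetical snapshots: 10,000 stars at window start, 12,500 at window end.
print(growth_pct(10_000, 12_500))         # 25.0
print(exceeds_threshold(10_000, 12_500))  # True
```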
See real delivered data → Sample Files
Subtypes represent the specific engineering pattern detected — from rapid open-source growth to AI/ML investment to platform ecosystem plays.
Available Subtypes (9)
| Subtype Enum | Description |
|---|---|
| githubRapidGrowth | Repository experiencing fast star/fork growth (>20% in 90 days) |
| githubHighAdoption | High fork ratio indicates developers actively building on this project |
| githubNewProjectTraction | New repository gaining significant early traction |
| githubPortfolioMomentum | Company's GitHub portfolio shows strong overall engineering velocity |
| githubMajorOSSPlayer | Company maintains flagship repositories (10k+ stars) |
| githubEnterpriseSignal | Signals of enterprise-grade development (security, compliance, scale) |
| githubPlatformEcosystem | Company building a developer platform/ecosystem |
| githubAIMLInvestment | Active investment in AI/ML repositories and tooling |
| githubInfraInvestment | Investment in infrastructure, DevOps, and cloud tooling |
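One way to work with subtypes downstream is to group delivered entries by their `signal_subtype` value. The sketch below assumes signals have already been loaded as Python dicts with the `signal_subtype` and `company.domain` fields; the sample entries and domains are made up for illustration.

```python
from collections import defaultdict


def group_domains_by_subtype(signals):
    """Map each signal_subtype to the list of company domains that fired it."""
    grouped = defaultdict(list)
    for signal in signals:
        grouped[signal["signal_subtype"]].append(signal["company"]["domain"])
    return dict(grouped)


# Hypothetical, heavily trimmed entries.
sample = [
    {"signal_subtype": "githubAIMLInvestment", "company": {"domain": "example.com"}},
    {"signal_subtype": "githubRapidGrowth", "company": {"domain": "example.org"}},
    {"signal_subtype": "githubAIMLInvestment", "company": {"domain": "example.net"}},
]

by_subtype = group_domains_by_subtype(sample)
print(by_subtype["githubAIMLInvestment"])  # ['example.com', 'example.net']
```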
Example Signal
What a single entry looks like in a delivered signal file:
```json
{
  "signal_id": "b9c1d3e5-f7a2-4b6c-8d0e-2f4a6b8c0d2e",
  "batch_id": "2026-04-20-00-00-00",
  "signal_type": "github",
  "signal_subtype": "githubAIMLInvestment",
  "detected_at": "2026-04-20T03:15:44Z",
  "association": "company",
  "company": {
    "name": "Vercel",
    "domain": "vercel.com", // match on domain
    "linkedin_url": "linkedin.com/company/vercel", // or match on LinkedIn URL
    "industries": ["Software Development", "Internet"],
    "employee_count_low": 501,
    "employee_count_high": 1000,
    "description": "Frontend cloud platform for developers..."
  },
  "contact": [],
  "data": {
    "summary": "Vercel's AI SDK repositories exploded 340% in stars over 60 days, signaling aggressive developer adoption of their AI tooling layer...",
    "detail": "Vercel is rapidly expanding its AI developer toolkit: the ai and ai-chatbot repos are gaining mass adoption while new inference packages signal a platform play beyond hosting...",
    "relevance": 0.92, // 0.0-1.0; higher = more actionable for outreach
    "confidence": "high", // how certain this signal is accurate
    "sentiment": "positive",
    "technologies_mentioned": ["AI SDK", "Next.js", "Edge Functions", "Streaming", "LLM orchestration"],
    "referenced_repos": ["vercel/ai", "vercel/ai-chatbot", "vercel/next.js"],
    "portfolio_metrics": {
      "repository_count": 184,
      "concentration": {
        "top_3_star_share": 0.72
      },
      "growth": {
        "stars_pct": {
          "30d": 8.4,
          "60d": 22.1,
          "180d": 67.3
        },
        "forks_pct": {
          "30d": 5.2,
          "60d": 14.8,
          "180d": 41.6
        }
      },
      "velocity": {
        "avg_stars_per_repo_30d": 47.2
      }
    },
    "top_repositories": [
      {
        "name": "ai",
        "full_name": "vercel/ai",
        "url": "https://github.com/vercel/ai",
        "description": "Build AI-powered applications with React, Svelte, Vue, and Solid",
        "first_seen_at": "2023-06-01T00:00:00Z",
        "current": {
          "stars": 14200,
          "forks": 2100,
          "watchers": 14200
        },
        "growth_pct": {
          "stars": { "d30": 12.4, "d60": 34.2, "d180": 128.6 },
          "forks": { "d30": 8.1, "d60": 19.7, "d180": 84.3 }
        },
        "readme": {
          "text": "Build AI-powered applications with React, Svelte, Vue, and Solid. The AI SDK provides a unified API for generating text, structured objects, and tool calls with LLMs...",
          "source_url": "https://github.com/vercel/ai/blob/main/README.md"
        }
      },
      {
        "name": "ai-chatbot",
        "full_name": "vercel/ai-chatbot",
        "url": "https://github.com/vercel/ai-chatbot",
        "description": "A full-featured, hackable Next.js AI chatbot built by Vercel",
        "first_seen_at": "2023-05-15T00:00:00Z",
        "current": {
          "stars": 8900,
          "forks": 2400,
          "watchers": 8900
        },
        "growth_pct": {
          "stars": { "d30": 9.8, "d60": 28.4, "d180": 95.2 },
          "forks": { "d30": 11.2, "d60": 31.6, "d180": 112.4 }
        },
        "readme": {
          "text": "An open-source AI chatbot app template built with Next.js, the Vercel AI SDK, and various LLM providers...",
          "source_url": "https://github.com/vercel/ai-chatbot/blob/main/README.md"
        }
      },
      {
        "name": "next.js",
        "full_name": "vercel/next.js",
        "url": "https://github.com/vercel/next.js",
        "description": "The React Framework",
        "first_seen_at": "2016-10-25T00:00:00Z",
        "current": {
          "stars": 128400,
          "forks": 27100,
          "watchers": 128400
        },
        "growth_pct": {
          "stars": { "d30": 1.2, "d60": 2.8, "d180": 8.4 },
          "forks": { "d30": 0.9, "d60": 2.1, "d180": 6.2 }
        },
        "readme": {
          "text": "Next.js is a React framework for building full-stack web applications. You use React Components to build user interfaces...",
          "source_url": "https://github.com/vercel/next.js/blob/canary/README.md"
        }
      }
    ]
  }
}
```
Field Reference
Standard envelope and entity fields are shared across all signals — see Schema and Resolution. The fields below are specific to this signal:
Signal-Specific Fields
The data object contains everything unique to this signal type — the intelligence extracted from GitHub activity analysis.
| Field | Type | Description |
|---|---|---|
| summary | string | One-line headline describing the engineering signal (e.g., "Vercel's AI SDK repositories exploded 340% in stars"). Designed for notifications and list views. Typically 15–25 words |
| detail | string | Multi-sentence analysis explaining what the GitHub activity means commercially. Covers what's being built, why it matters for the company's strategy, and what it signals about technical investment. Typically 2–4 sentences |
| relevance | float (0.0–1.0) | How actionable this signal is for outreach. Factors in growth velocity, repository significance, and strategic alignment. Useful for prioritization |
| confidence | string | How certain this signal reflects real strategic investment vs. viral noise. high, medium, or low. Useful for filtering: one-hit viral repos score lower than sustained portfolio growth |
| sentiment | string | Whether the engineering activity is favorable (positive), concerning (negative), or neutral. Useful for segmenting outreach tone |
| technologies_mentioned | array[string] | Technologies, frameworks, and tools identified across active repositories. Useful for tech-stack targeting |
| referenced_repos | array[string] | Repository full names (org/repo) that drove this signal. Quick reference without the full repo detail |
| portfolio_metrics | object | Aggregate metrics across the company's entire GitHub portfolio |
| portfolio_metrics.repository_count | integer | Total public repositories in the organization. Context for understanding scale |
| portfolio_metrics.concentration.top_3_star_share | float | Share of total stars held by the top 3 repos. High concentration (>0.7) means a few flagship projects; low means distributed engineering |
| portfolio_metrics.growth.stars_pct | object | Portfolio-wide star growth: 30d, 60d, 180d as percentage change. Measures developer interest acceleration |
| portfolio_metrics.growth.forks_pct | object | Portfolio-wide fork growth: 30d, 60d, 180d as percentage change. Forks indicate developers building on top of these projects |
| portfolio_metrics.velocity.avg_stars_per_repo_30d | float | Average new stars per active repo in the last 30 days. Normalizes for portfolio size |
| top_repositories | array[object] | The 3–5 most significant repositories driving this signal. Each contains full metrics and growth data |
| top_repositories[].name | string | Repository name (short) |
| top_repositories[].full_name | string | Full repository path (org/repo) |
| top_repositories[].url | string (URL) | GitHub URL. Useful for research and validation |
| top_repositories[].description | string | Repository description from GitHub |
| top_repositories[].first_seen_at | string (ISO 8601) | When we first tracked this repo. New repos (<6 months) with traction signal strategic bets |
| top_repositories[].current | object | Current absolute metrics: stars, forks, watchers |
| top_repositories[].growth_pct | object | Growth rates with stars and forks sub-objects, each containing d30, d60, d180 percentage changes |
| top_repositories[].readme | object | Contains text (README content, truncated) and source_url. Useful for understanding what the repo does without visiting GitHub |
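The fields above lend themselves to simple prioritization on the consumer side. The following is a minimal sketch, assuming signals are loaded as dicts shaped like the example entry. The 0.8 relevance cut-off, the helper names, and the trimmed sample data are illustrative assumptions; only the 0.7 concentration figure comes from the table.

```python
# Lower rank sorts first; values follow the documented high/medium/low scale.
CONFIDENCE_RANK = {"high": 0, "medium": 1, "low": 2}


def prioritize(signals, min_relevance=0.8):
    """Keep signals at or above min_relevance, best confidence and relevance first."""
    kept = [s for s in signals if s["data"]["relevance"] >= min_relevance]
    return sorted(
        kept,
        key=lambda s: (CONFIDENCE_RANK[s["data"]["confidence"]], -s["data"]["relevance"]),
    )


def is_flagship_concentrated(signal, cutoff=0.7):
    """True when the top 3 repos hold more than `cutoff` of all stars."""
    metrics = signal["data"]["portfolio_metrics"]
    return metrics["concentration"]["top_3_star_share"] > cutoff


def fastest_growing_repo(signal, window="d60"):
    """Full name of the top repository with the highest star growth in `window`."""
    repos = signal["data"]["top_repositories"]
    return max(repos, key=lambda r: r["growth_pct"]["stars"][window])["full_name"]


# Trimmed sample based on the example entry, plus one hypothetical low-relevance entry.
vercel = {
    "company": {"domain": "vercel.com"},
    "data": {
        "relevance": 0.92,
        "confidence": "high",
        "portfolio_metrics": {"concentration": {"top_3_star_share": 0.72}},
        "top_repositories": [
            {"full_name": "vercel/ai", "growth_pct": {"stars": {"d60": 34.2}}},
            {"full_name": "vercel/ai-chatbot", "growth_pct": {"stars": {"d60": 28.4}}},
            {"full_name": "vercel/next.js", "growth_pct": {"stars": {"d60": 2.8}}},
        ],
    },
}
other = {"company": {"domain": "example.com"}, "data": {"relevance": 0.55, "confidence": "low"}}

ranked = prioritize([other, vercel])
print([s["company"]["domain"] for s in ranked])  # ['vercel.com']
print(is_flagship_concentrated(vercel))          # True
print(fastest_growing_repo(vercel))              # vercel/ai
```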
Timing & Delivery
- detected_at is when the growth threshold was crossed. GitHub metrics are computed over rolling 30/60/180-day windows from this date.
- One signal per subtype per company per month. A company can fire multiple subtypes (e.g., both githubAIMLInvestment and githubPortfolioMomentum) but won't repeat the same subtype within 30 days.
- Each delivery arrives in a timestamped folder. Treat all signals in a new folder as recent; there is no need to diff against prior deliveries.
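If you accumulate signals across several deliveries, the one-subtype-per-company-per-30-days rule can be sanity-checked on your side. The sketch below is one possible check, not part of the delivery itself: `find_repeats` and the sample entries are hypothetical, and it assumes `detected_at` is always an ISO 8601 UTC timestamp as in the example.

```python
from datetime import datetime, timedelta


def parse_detected_at(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2026-04-20T03:15:44Z."""
    # Swap the trailing "Z" so fromisoformat also works before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def find_repeats(signals, window_days=30):
    """Return (domain, subtype) keys that recur within `window_days`."""
    last_seen = {}
    repeats = []
    for signal in sorted(signals, key=lambda s: parse_detected_at(s["detected_at"])):
        key = (signal["company"]["domain"], signal["signal_subtype"])
        detected = parse_detected_at(signal["detected_at"])
        previous = last_seen.get(key)
        if previous is not None and detected - previous < timedelta(days=window_days):
            repeats.append(key)
        last_seen[key] = detected
    return repeats


# Hypothetical entries: the same subtype twice in 11 days, plus a different subtype.
history = [
    {"company": {"domain": "example.com"}, "signal_subtype": "githubRapidGrowth",
     "detected_at": "2026-04-20T03:15:44Z"},
    {"company": {"domain": "example.com"}, "signal_subtype": "githubPortfolioMomentum",
     "detected_at": "2026-04-20T03:15:44Z"},
    {"company": {"domain": "example.com"}, "signal_subtype": "githubRapidGrowth",
     "detected_at": "2026-05-01T00:00:00Z"},
]

print(find_repeats(history))  # [('example.com', 'githubRapidGrowth')]
```

An empty result is the expected case; a non-empty one would suggest the same delivery was ingested twice or that company matching differs between batches.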
Coverage
- Refresh: Monthly
- Coverage: 15,000,000+ companies with mapped GitHub organizations
- Best for: Developer tool sales, identifying engineering investment themes, tech-stack targeting, competitive intelligence on build-vs-buy decisions
