GitHub & Engineering

Track engineering activity, open source engagement, and repository health across millions of companies with public GitHub presence.

GitHub signals surface engineering activity patterns — what companies are building, how fast their open-source projects are growing, and where they're investing technically.

We track public GitHub organizations mapped to company domains, monitoring repository-level metrics (stars, forks, watchers) over 30-, 60-, and 180-day windows. Signals fire when growth exceeds thresholds (>20% star growth in 90 days), when patterns indicate strategic investment (new AI/ML repos, infrastructure tooling, platform ecosystem plays), or when portfolio-wide momentum suggests a company is in active engineering build-out mode. Each signal includes both portfolio-level metrics and individual repository details.
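As a rough illustration of the threshold rule described above, the sketch below computes percentage star growth between two snapshots and checks it against the 20% figure. The function names and the way the earlier snapshot is obtained are assumptions for illustration only; the actual detection pipeline is not documented here.

```python
def pct_growth(previous: int, current: int) -> float:
    """Percentage change from `previous` to `current`.

    Returns 0.0 when there is no baseline to compare against.
    """
    if previous <= 0:
        return 0.0
    return (current - previous) * 100.0 / previous


def exceeds_star_threshold(stars_then: int, stars_now: int,
                           threshold_pct: float = 20.0) -> bool:
    """True when star growth is strictly above the threshold (20% by default)."""
    return pct_growth(stars_then, stars_now) > threshold_pct


# Hypothetical snapshots: 1,000 stars at the start of the window, 1,250 now.
growth = pct_growth(1000, 1250)
fires = exceeds_star_threshold(1000, 1250)
```

The strict comparison mirrors the ">20%" wording; a repository at exactly 20% would not fire under this sketch.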

See real delivered data: Sample Files

Subtypes represent the specific engineering pattern detected — from rapid open-source growth to AI/ML investment to platform ecosystem plays.

Available Subtypes (9)
| Subtype Enum | Description |
| --- | --- |
| githubRapidGrowth | Repository experiencing fast star/fork growth (>20% in 90 days) |
| githubHighAdoption | High fork ratio indicates developers actively building on this project |
| githubNewProjectTraction | New repository gaining significant early traction |
| githubPortfolioMomentum | Company's GitHub portfolio shows strong overall engineering velocity |
| githubMajorOSSPlayer | Company maintains flagship repositories (10k+ stars) |
| githubEnterpriseSignal | Signals of enterprise-grade development (security, compliance, scale) |
| githubPlatformEcosystem | Company building a developer platform/ecosystem |
| githubAIMLInvestment | Active investment in AI/ML repositories and tooling |
| githubInfraInvestment | Investment in infrastructure, DevOps, and cloud tooling |
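Two of the subtype descriptions above rest on simple repository arithmetic: a "fork ratio" and a 10k+ star flagship cutoff. The sketch below shows one way to compute both from the stars and forks counts in a signal. Defining fork ratio as forks divided by stars is an assumption, since the exact formula is not documented, and the helper names are hypothetical.

```python
FLAGSHIP_STARS = 10_000  # taken from the "10k+ stars" description


def fork_ratio(stars: int, forks: int) -> float:
    """Forks per star (assumed definition); 0.0 when the repo has no stars."""
    return forks / stars if stars else 0.0


def is_flagship(stars: int) -> bool:
    """True when a repository meets the 10k+ star flagship cutoff."""
    return stars >= FLAGSHIP_STARS


# Star and fork counts from the example signal's top_repositories.
chatbot_ratio = round(fork_ratio(8900, 2400), 2)     # vercel/ai-chatbot
nextjs_ratio = round(fork_ratio(128400, 27100), 2)   # vercel/next.js
```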

Example Signal

What a single entry looks like in a delivered signal file:

{
  "signal_id": "b9c1d3e5-f7a2-4b6c-8d0e-2f4a6b8c0d2e",
  "batch_id": "2026-04-20-00-00-00",
  "signal_type": "github",
  "signal_subtype": "githubAIMLInvestment",
  "detected_at": "2026-04-20T03:15:44Z",
  "association": "company",
  "company": {
    "name": "Vercel",
    "domain": "vercel.com",                     // match on domain
    "linkedin_url": "linkedin.com/company/vercel",  // or match on LinkedIn URL
    "industries": ["Software Development", "Internet"],
    "employee_count_low": 501,
    "employee_count_high": 1000,
    "description": "Frontend cloud platform for developers..."
  },
  "contact": [],
  "data": {
    "summary": "Vercel's AI SDK repositories exploded 340% in stars over 60 days, signaling aggressive developer adoption of their AI tooling layer...",
    "detail": "Vercel is rapidly expanding its AI developer toolkit — the ai and ai-chatbot repos are gaining mass adoption while new inference packages signal a platform play beyond hosting...",
    "relevance": 0.92,                          // 0.0-1.0; higher = more actionable for outreach
    "confidence": "high",                       // how certain this signal is accurate
    "sentiment": "positive",
    "technologies_mentioned": ["AI SDK", "Next.js", "Edge Functions", "Streaming", "LLM orchestration"],
    "referenced_repos": ["vercel/ai", "vercel/ai-chatbot", "vercel/next.js"],
    "portfolio_metrics": {
      "repository_count": 184,
      "concentration": {
        "top_3_star_share": 0.72
      },
      "growth": {
        "stars_pct": {
          "30d": 8.4,
          "60d": 22.1,
          "180d": 67.3
        },
        "forks_pct": {
          "30d": 5.2,
          "60d": 14.8,
          "180d": 41.6
        }
      },
      "velocity": {
        "avg_stars_per_repo_30d": 47.2
      }
    },
    "top_repositories": [
      {
        "name": "ai",
        "full_name": "vercel/ai",
        "url": "https://github.com/vercel/ai",
        "description": "Build AI-powered applications with React, Svelte, Vue, and Solid",
        "first_seen_at": "2023-06-01T00:00:00Z",
        "current": {
          "stars": 14200,
          "forks": 2100,
          "watchers": 14200
        },
        "growth_pct": {
          "stars": { "d30": 12.4, "d60": 34.2, "d180": 128.6 },
          "forks": { "d30": 8.1, "d60": 19.7, "d180": 84.3 }
        },
        "readme": {
          "text": "Build AI-powered applications with React, Svelte, Vue, and Solid. The AI SDK provides a unified API for generating text, structured objects, and tool calls with LLMs...",
          "source_url": "https://github.com/vercel/ai/blob/main/README.md"
        }
      },
      {
        "name": "ai-chatbot",
        "full_name": "vercel/ai-chatbot",
        "url": "https://github.com/vercel/ai-chatbot",
        "description": "A full-featured, hackable Next.js AI chatbot built by Vercel",
        "first_seen_at": "2023-05-15T00:00:00Z",
        "current": {
          "stars": 8900,
          "forks": 2400,
          "watchers": 8900
        },
        "growth_pct": {
          "stars": { "d30": 9.8, "d60": 28.4, "d180": 95.2 },
          "forks": { "d30": 11.2, "d60": 31.6, "d180": 112.4 }
        },
        "readme": {
          "text": "An open-source AI chatbot app template built with Next.js, the Vercel AI SDK, and various LLM providers...",
          "source_url": "https://github.com/vercel/ai-chatbot/blob/main/README.md"
        }
      },
      {
        "name": "next.js",
        "full_name": "vercel/next.js",
        "url": "https://github.com/vercel/next.js",
        "description": "The React Framework",
        "first_seen_at": "2016-10-25T00:00:00Z",
        "current": {
          "stars": 128400,
          "forks": 27100,
          "watchers": 128400
        },
        "growth_pct": {
          "stars": { "d30": 1.2, "d60": 2.8, "d180": 8.4 },
          "forks": { "d30": 0.9, "d60": 2.1, "d180": 6.2 }
        },
        "readme": {
          "text": "Next.js is a React framework for building full-stack web applications. You use React Components to build user interfaces...",
          "source_url": "https://github.com/vercel/next.js/blob/canary/README.md"
        }
      }
    ]
  }
}
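A minimal sketch of reading an entry like the one above in Python. It assumes delivered files are plain JSON without the explanatory `//` comments, which look like documentation annotations rather than file content. Note that the portfolio-level growth keys are `30d`/`60d`/`180d`, while the per-repository keys are `d30`/`d60`/`d180`. The `fastest_growing` helper is hypothetical.

```python
import json

# Trimmed copy of the example entry, comments removed so it parses as JSON.
RAW = """
{
  "signal_subtype": "githubAIMLInvestment",
  "company": {"name": "Vercel", "domain": "vercel.com"},
  "data": {
    "relevance": 0.92,
    "portfolio_metrics": {
      "growth": {"stars_pct": {"30d": 8.4, "60d": 22.1, "180d": 67.3}}
    },
    "top_repositories": [
      {"full_name": "vercel/ai",
       "growth_pct": {"stars": {"d30": 12.4, "d60": 34.2, "d180": 128.6}}},
      {"full_name": "vercel/ai-chatbot",
       "growth_pct": {"stars": {"d30": 9.8, "d60": 28.4, "d180": 95.2}}},
      {"full_name": "vercel/next.js",
       "growth_pct": {"stars": {"d30": 1.2, "d60": 2.8, "d180": 8.4}}}
    ]
  }
}
"""


def fastest_growing(signal: dict, window: str = "d60") -> str:
    """Full name of the top repository with the highest star growth in `window`."""
    repos = signal["data"]["top_repositories"]
    best = max(repos, key=lambda r: r["growth_pct"]["stars"][window])
    return best["full_name"]


signal = json.loads(RAW)
portfolio_60d = signal["data"]["portfolio_metrics"]["growth"]["stars_pct"]["60d"]
leader = fastest_growing(signal)
```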

Field Reference

Standard envelope and entity fields are shared across all signals — see Schema and Resolution. The fields below are specific to this signal:

Signal-Specific Fields

The data object contains everything unique to this signal type — the intelligence extracted from GitHub activity analysis.

| Field | Type | Description |
| --- | --- | --- |
| summary | string | One-line headline describing the engineering signal (e.g., "Vercel's AI SDK repositories exploded 340% in stars"). Designed for notifications and list views. Typically 15–25 words |
| detail | string | Multi-sentence analysis explaining what the GitHub activity means commercially. Covers what's being built, why it matters for the company's strategy, and what it signals about technical investment. Typically 2–4 sentences |
| relevance | float (0.0–1.0) | How actionable this signal is for outreach. Factors in growth velocity, repository significance, and strategic alignment. Useful for prioritization |
| confidence | string | How certain this signal reflects real strategic investment vs. viral noise. high, medium, or low. Useful for filtering: one-hit viral repos score lower than sustained portfolio growth |
| sentiment | string | Whether the engineering activity is favorable (positive), concerning (negative), or neutral. Useful for segmenting outreach tone |
| technologies_mentioned | array[string] | Technologies, frameworks, and tools identified across active repositories. Useful for tech-stack targeting |
| referenced_repos | array[string] | Repository full names (org/repo) that drove this signal. Quick reference without the full repo detail |
| portfolio_metrics | object | Aggregate metrics across the company's entire GitHub portfolio |
| portfolio_metrics.repository_count | integer | Total public repositories in the organization. Context for understanding scale |
| portfolio_metrics.concentration.top_3_star_share | float | Share of total stars held by the top 3 repos. High concentration (>0.7) means a few flagship projects; low means distributed engineering |
| portfolio_metrics.growth.stars_pct | object | Portfolio-wide star growth: 30d, 60d, 180d as percentage change. Measures developer interest acceleration |
| portfolio_metrics.growth.forks_pct | object | Portfolio-wide fork growth: 30d, 60d, 180d as percentage change. Forks indicate developers building on top of these projects |
| portfolio_metrics.velocity.avg_stars_per_repo_30d | float | Average new stars per active repo in the last 30 days. Normalizes for portfolio size |
| top_repositories | array[object] | The 3–5 most significant repositories driving this signal. Each contains full metrics and growth data |
| top_repositories[].name | string | Repository name (short) |
| top_repositories[].full_name | string | Full repository path (org/repo) |
| top_repositories[].url | string (URL) | GitHub URL. Useful for research and validation |
| top_repositories[].description | string | Repository description from GitHub |
| top_repositories[].first_seen_at | string (ISO 8601) | When we first tracked this repo. New repos (<6 months) with traction signal strategic bets |
| top_repositories[].current | object | Current absolute metrics: stars, forks, watchers |
| top_repositories[].growth_pct | object | Growth rates with stars and forks sub-objects, each containing d30, d60, d180 percentage changes |
| top_repositories[].readme | object | Contains text (README content, truncated) and source_url. Useful for understanding what the repo does without visiting GitHub |
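The fields above lend themselves to a simple triage pass: drop signals below a confidence floor, sort the rest by relevance, and flag repositories first seen within roughly six months. The sketch below is one possible approach, not a recommended scoring method. The ranking of confidence levels, the 183-day cutoff, and the function names are all assumptions.

```python
from datetime import datetime, timedelta, timezone

# Assumed ordering of the documented confidence values.
CONFIDENCE_RANK = {"low": 0, "medium": 1, "high": 2}


def prioritize(signals: list, min_confidence: str = "medium") -> list:
    """Keep signals at or above `min_confidence`, highest relevance first."""
    floor = CONFIDENCE_RANK[min_confidence]
    kept = [s for s in signals
            if CONFIDENCE_RANK.get(s["data"]["confidence"], -1) >= floor]
    return sorted(kept, key=lambda s: s["data"]["relevance"], reverse=True)


def is_new_repo(repo: dict, as_of: datetime, max_age_days: int = 183) -> bool:
    """True when first_seen_at is within ~6 months (183 days, assumed) of `as_of`."""
    first_seen = datetime.fromisoformat(repo["first_seen_at"].replace("Z", "+00:00"))
    return as_of - first_seen < timedelta(days=max_age_days)


# Hypothetical inputs for illustration.
signals = [
    {"data": {"relevance": 0.50, "confidence": "high"}},
    {"data": {"relevance": 0.92, "confidence": "medium"}},
    {"data": {"relevance": 0.99, "confidence": "low"}},
]
ranked = [s["data"]["relevance"] for s in prioritize(signals)]
as_of = datetime(2026, 4, 20, tzinfo=timezone.utc)
```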

Timing & Delivery

  • detected_at is when the growth threshold was crossed. GitHub metrics are computed over rolling 30/60/180-day windows looking back from this date.
  • One signal per subtype per company per month. A company can fire multiple subtypes (e.g., both githubAIMLInvestment and githubPortfolioMomentum) but won't repeat the same subtype within 30 days.
  • Each delivery arrives in a timestamped folder. Treat all signals in a new folder as recent — no need to diff against prior deliveries.
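To make the timing fields concrete, the sketch below parses `detected_at` and derives the start of each rolling window, and parses a `batch_id` shaped like the example value `2026-04-20-00-00-00`. It assumes the windows look back from `detected_at` and that every `batch_id` follows the example's format; the time zone of `batch_id` is not documented, so it is left naive. Both helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def window_starts(detected_at: str, windows=(30, 60, 180)) -> dict:
    """Map each window length in days to its assumed start timestamp."""
    end = datetime.fromisoformat(detected_at.replace("Z", "+00:00"))
    return {days: end - timedelta(days=days) for days in windows}


def parse_batch_id(batch_id: str) -> datetime:
    """Parse a batch_id in the assumed YYYY-MM-DD-HH-MM-SS format (naive datetime)."""
    return datetime.strptime(batch_id, "%Y-%m-%d-%H-%M-%S")


starts = window_starts("2026-04-20T03:15:44Z")
batch_time = parse_batch_id("2026-04-20-00-00-00")
# Comparing against an aware datetime requires attaching a zone explicitly;
# UTC here is an assumption.
batch_time_utc = batch_time.replace(tzinfo=timezone.utc)
```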

Coverage

  • Refresh: Monthly
  • Coverage: 15,000,000+ companies with mapped GitHub organizations
  • Best for: Developer tool sales, identifying engineering investment themes, tech-stack targeting, competitive intelligence on build-vs-buy decisions
