GitHub Usage Tips
How to activate GitHub signals for your end users.
GitHub signals surface engineering investment patterns from public repository activity. Rather than static technographic data ("uses Python"), these signals show active development direction — what companies are building, which technologies they're adopting, and how fast those projects are growing.
The data is delivered as structured JSON in weekly flat files.
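As a starting point for ingestion, here is a minimal sketch of a loader. The exact file layout is not specified here, so the code assumes (hypothetically) that a flat file holds either one JSON array of signals or one JSON object per line; `parse_signals` is an illustrative name, not part of any delivered tooling.

```python
import json


def parse_signals(text):
    """Parse flat-file text into a list of signal dicts.

    Assumption: the payload is either a JSON array or JSON Lines
    (one complete JSON object per non-empty line).
    """
    stripped = text.strip()
    if not stripped:
        return []
    if stripped.startswith("["):
        # Whole payload is a single JSON array of signal objects.
        return json.loads(stripped)
    # Otherwise treat every non-empty line as its own JSON object.
    return [json.loads(line) for line in stripped.splitlines() if line.strip()]
```

If the files turn out to use a different layout (for example, one pretty-printed object per file), only this function needs to change.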
Common Use Cases
- Technographic enrichment — Add real-time tech stack data to a B2B database, showing not just what companies use but where they're actively investing
- Product intelligence — Understand what products a company is building based on their public repositories and README content
- Buyer persona enrichment — Infer technical buyer personas from the technologies and frameworks a company is adopting
- Integration/partnership mapping — Identify which platforms and tools a company integrates with based on their SDK and plugin development
The example below walks through a common activation pattern with screenshots.
Example: Vector Database Prospecting
This example shows how a vector database company (Pinecone, Weaviate, etc.) would use GitHub signals to find qualified prospects.
The Filter Setup
- Signal Type: AI/ML Investment
- Confidence: High
- Technologies: RAG
This returns companies actively building RAG (retrieval-augmented generation) pipelines — Notion, Linear, Intercom, Shopify, Retool, Zapier — with live GitHub repos proving real investment.
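The filter setup above could be applied to loaded signals roughly as follows. This is a sketch, not a reference implementation: it assumes the "AI/ML Investment" signal type corresponds to the `signal_subtype` value `githubAIMLInvestment` (as in the sample signal), and `matches_filter` plus the "Example Co" record are made up for illustration.

```python
def matches_filter(signal, subtype, confidence, technology):
    """Return True when a signal satisfies all three filter values."""
    data = signal.get("data", {})
    # Compare technologies case-insensitively so "rag" and "RAG" both match.
    techs = {t.lower() for t in data.get("technologies_mentioned", [])}
    return (
        signal.get("signal_subtype") == subtype
        and data.get("confidence") == confidence
        and technology.lower() in techs
    )


# Two trimmed-down signals; the second is a hypothetical record.
signals = [
    {"signal_subtype": "githubAIMLInvestment",
     "company": {"name": "Intercom"},
     "data": {"confidence": "high",
              "technologies_mentioned": ["Python", "LangChain", "RAG"]}},
    {"signal_subtype": "githubAIMLInvestment",
     "company": {"name": "Example Co"},
     "data": {"confidence": "medium",
              "technologies_mentioned": ["RAG"]}},
]

prospects = [
    s["company"]["name"]
    for s in signals
    if matches_filter(s, "githubAIMLInvestment", "high", "RAG")
]
```

Here only the first record passes, because the second has medium confidence.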
Sample Signal
{
  "signal_id": "b8c3d912-5f6e-4c7b-8d9e-2a3b4c5d6e7f",
  "signal_type": "github-initiative",
  "signal_subtype": "githubAIMLInvestment",
  "association": "company",
  "company": {
    "name": "Intercom",
    "domain": "intercom.com",
    "linkedin_url": "https://linkedin.com/company/intercom",
    "industries": ["Software", "Customer Support", "SaaS"]
  },
  "data": {
    "summary": "Building Fin AI agent with massive knowledge base RAG",
    "detail": "Intercom's Fin repositories show heavy investment in retrieval-augmented generation for their AI support agent. LangChain integration with custom embeddings pipeline.",
    "relevance": 0.87,
    "confidence": "high",
    "sentiment": "positive",
    "technologies_mentioned": ["Python", "LangChain", "RAG", "Embeddings", "OpenAI"],
    "referenced_repos": ["fin-ai-agent", "knowledge-embeddings"],
    "portfolio_metrics": {
      "repository_count": 24,
      "growth": {
        "stars_pct": { "30d": 0.71, "60d": 0.89, "180d": 1.45 },
        "forks_pct": { "30d": 0.52, "60d": 0.68, "180d": 1.12 }
      }
    },
    "top_repositories": [
      {
        "name": "fin-ai-agent",
        "full_name": "intercom/fin-ai-agent",
        "url": "https://github.com/intercom/fin-ai-agent",
        "description": "RAG-powered AI agent for customer support",
        "current": { "stars": 890, "forks": 124 },
        "growth_pct": { "stars": { "30d": 0.71 }, "forks": { "30d": 0.52 } }
      }
    ]
  },
  "detected_at": "2026-01-24T10:15:00Z",
  "batch_id": "gh-20260124-def456"
}
What Makes This Useful
The technologies_mentioned field provides stack-level specificity. Seeing RAG, LangChain, and Embeddings together indicates that the company needs vector storage infrastructure, not just that it is "doing AI."
The 30-day star growth on the top repository (growth_pct.stars.30d, 0.71 or 71% in this case) shows the project has momentum, distinguishing active investment from abandoned experiments.
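To show how these fields might feed a prospect list, here is a small sketch that flattens a signal into a display row. Field paths follow the sample signal above; the function name `to_prospect_row` and the output keys are illustrative assumptions.

```python
def to_prospect_row(signal):
    """Flatten a signal into a small display row (illustrative shape)."""
    data = signal["data"]
    # Use the first listed repository; fall back to an empty dict if none.
    top = (data.get("top_repositories") or [{}])[0]
    growth = top.get("growth_pct", {}).get("stars", {}).get("30d")
    return {
        "company": signal["company"]["name"],
        "headline": data["summary"],
        "stack": ", ".join(data.get("technologies_mentioned", [])),
        # Growth is a fraction in the sample (0.71), so format as a percent.
        "stars_growth_30d": None if growth is None else f"{growth:.0%}",
    }


# Trimmed copy of the sample signal.
sample = {
    "company": {"name": "Intercom"},
    "data": {
        "summary": "Building Fin AI agent with massive knowledge base RAG",
        "technologies_mentioned": ["Python", "LangChain", "RAG"],
        "top_repositories": [
            {"name": "fin-ai-agent",
             "growth_pct": {"stars": {"30d": 0.71}}},
        ],
    },
}
row = to_prospect_row(sample)
```

A row like this keeps the headline, stack, and momentum side by side, which is the combination the section argues makes the signal actionable.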
Key Fields
| Field | Description |
|---|---|
| data.summary | Signal headline, suitable for display |
| data.technologies_mentioned | Specific technologies: RAG, LangChain, Kubernetes, etc. |
| data.confidence | Signal quality (high, medium, low) |
| data.top_repositories[].current.stars | Project traction (current star count) |
| data.top_repositories[].growth_pct.stars.30d | 30-day star growth rate (0.71 = 71%) |
Filter Dimensions
| Filter | Field | Values |
|---|---|---|
| Signal type | signal_subtype | githubAIMLInvestment (AI/ML Investment), githubInfraInvestment (Infrastructure), githubPlatformEcosystem (Platform Ecosystem) |
| Confidence | data.confidence | high, medium, low |
| Technologies | data.technologies_mentioned | RAG, LangChain, Kubernetes, Go, TypeScript, etc. |
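When exposing these dimensions in a UI, it can help to know which values actually occur in a batch. The sketch below counts values per filter dimension; `facet_counts` is a hypothetical helper and the two records are made-up test data.

```python
from collections import Counter


def facet_counts(signals):
    """Count how many signals carry each value of the three filter fields."""
    facets = {
        "signal_subtype": Counter(),
        "confidence": Counter(),
        "technologies": Counter(),
    }
    for signal in signals:
        data = signal.get("data", {})
        if signal.get("signal_subtype"):
            facets["signal_subtype"][signal["signal_subtype"]] += 1
        if data.get("confidence"):
            facets["confidence"][data["confidence"]] += 1
        # set() ensures a technology is counted once per signal.
        facets["technologies"].update(set(data.get("technologies_mentioned", [])))
    return facets


batch = [
    {"signal_subtype": "githubAIMLInvestment",
     "data": {"confidence": "high", "technologies_mentioned": ["RAG", "Python"]}},
    {"signal_subtype": "githubInfraInvestment",
     "data": {"confidence": "high", "technologies_mentioned": ["Kubernetes", "Python"]}},
]
facets = facet_counts(batch)
```

The counts can populate dropdown options and show users how many results each choice would return.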
Broader Coverage Patterns
The example above uses narrow filters (AI/ML Investment + RAG technology), which surface highly qualified but lower-volume signals. For broader coverage, consider these patterns:
By Signal Type
| Pattern | Filter | Coverage | Best For |
|---|---|---|---|
| All AI/ML activity | githubAIMLInvestment (no tech filter) | High | AI infrastructure, MLOps |
| Infrastructure buildout | githubInfraInvestment | High | DevOps, cloud tooling |
| Fast-growing projects | githubRapidGrowth | Medium | Trend spotting, early adopters |
| Platform builders | githubPlatformEcosystem | Medium | Developer tools, integrations |
| Major OSS presence | githubMajorOSSPlayer | Low | Enterprise deals, partnerships |
By Field Combinations
Technology-based targeting — Use data.technologies_mentioned to filter by stack:
- Python, LangChain, OpenAI: AI/ML ecosystem
- Kubernetes, Terraform, Docker: Infrastructure
- TypeScript, React, Next.js: Frontend/fullstack
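One way to apply these stack groupings is to label each signal by the groups it overlaps with. This is a sketch under assumptions: the group keys (`ai_ml`, `infrastructure`, `frontend`) and the "any overlap counts" rule are illustrative choices, not documented filter semantics.

```python
# Stack groupings taken from the list above; the keys are hypothetical labels.
STACKS = {
    "ai_ml": {"python", "langchain", "openai"},
    "infrastructure": {"kubernetes", "terraform", "docker"},
    "frontend": {"typescript", "react", "next.js"},
}


def stack_labels(signal, min_overlap=1):
    """Return the stack groups sharing at least min_overlap technologies."""
    mentioned = {
        t.lower()
        for t in signal.get("data", {}).get("technologies_mentioned", [])
    }
    return sorted(
        name for name, techs in STACKS.items()
        if len(mentioned & techs) >= min_overlap
    )


sample = {"data": {"technologies_mentioned":
                   ["Python", "LangChain", "RAG", "Embeddings", "OpenAI"]}}
labels = stack_labels(sample)
```

Raising `min_overlap` trades coverage for precision: requiring two or more matching technologies filters out companies that merely mention one common tool.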
Growth-based targeting — Use portfolio_metrics.growth.stars_pct.30d to find momentum:
- > 0.5 (50%+ growth): Rapid adoption, likely funded/prioritized
- > 0.2 (20%+ growth): Active investment
- Any positive: At least not abandoned
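The thresholds above translate into a simple tiering function. The sketch reads the 30-day star growth from the path shown in the sample signal; the tier names are made up for illustration.

```python
def growth_tier(signal):
    """Map 30-day portfolio star growth to a coarse momentum tier."""
    growth = (
        signal.get("data", {})
        .get("portfolio_metrics", {})
        .get("growth", {})
        .get("stars_pct", {})
        .get("30d")
    )
    if growth is None:
        return "unknown"            # metric missing from the signal
    if growth > 0.5:
        return "rapid"              # 50%+ growth
    if growth > 0.2:
        return "active"             # 20%+ growth
    if growth > 0:
        return "positive"           # any positive growth
    return "flat_or_declining"


def with_growth(value):
    """Build a minimal signal carrying only the growth metric (test helper)."""
    return {"data": {"portfolio_metrics": {"growth": {"stars_pct": {"30d": value}}}}}


tier = growth_tier(with_growth(0.71))
```

With the sample signal's value of 0.71, the result is the top tier.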
Repository-based targeting — Use data.referenced_repos or top_repositories to find specific project types:
- SDK/plugin repos indicate platform plays
- Infrastructure repos (terraform-*, k8s-*) indicate ops investment
- AI repos (llm-*, embeddings-*, rag-*) indicate AI investment
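Repository names can be bucketed with glob matching, assuming the prefixes above are meant as patterns such as `terraform-*` and `rag-*`. In this sketch the `*-sdk` and `*-plugin` patterns for platform plays are hypothetical, as are the label names.

```python
from fnmatch import fnmatchcase

# Infrastructure and AI patterns follow the list above;
# the platform patterns are illustrative guesses.
PATTERNS = {
    "infrastructure": ["terraform-*", "k8s-*"],
    "ai": ["llm-*", "embeddings-*", "rag-*"],
    "platform": ["*-sdk", "*-plugin"],
}


def classify_repo(name):
    """Return the sorted list of labels whose glob patterns match the name."""
    lowered = name.lower()
    return sorted(
        label
        for label, globs in PATTERNS.items()
        if any(fnmatchcase(lowered, pattern) for pattern in globs)
    )


labels = classify_repo("terraform-modules")
```

Prefix patterns are a coarse heuristic. The sample signal's repositories (fin-ai-agent, knowledge-embeddings) match none of them, so name matching is probably best combined with data.technologies_mentioned rather than used alone.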
Coverage vs. Precision Tradeoffs
| Approach | Volume | Precision | Use When |
|---|---|---|---|
| Narrow (subtype + specific tech + high growth) | Low | High | Account-based targeting |
| Medium (subtype + confidence = high) | Medium | Medium | Default for most UIs |
| Broad (any subtype, growth > 0) | High | Lower | Technographic enrichment, market research |
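The three approaches in the table can be expressed as presets over the same signal fields. This is a sketch under assumptions: "high growth" is taken to mean 30-day star growth above 0.5 (borrowing the rapid-adoption threshold), and the preset names and `matches_preset` signature are illustrative.

```python
def stars_growth_30d(signal):
    """Read 30-day portfolio star growth, defaulting to 0.0 when absent."""
    growth = signal.get("data", {}).get("portfolio_metrics", {}).get("growth", {})
    return growth.get("stars_pct", {}).get("30d", 0.0)


def matches_preset(signal, preset, subtype=None, technology=None):
    """Apply one of the narrow / medium / broad presets from the table."""
    if preset not in ("narrow", "medium", "broad"):
        raise ValueError(f"unknown preset: {preset}")
    data = signal.get("data", {})
    if preset == "broad":
        # Any subtype, as long as growth is positive.
        return stars_growth_30d(signal) > 0
    if signal.get("signal_subtype") != subtype:
        return False
    if preset == "medium":
        return data.get("confidence") == "high"
    # narrow: subtype + specific technology + high growth (assumed > 0.5)
    return (
        technology in data.get("technologies_mentioned", [])
        and stars_growth_30d(signal) > 0.5
    )


signal = {
    "signal_subtype": "githubAIMLInvestment",
    "data": {
        "confidence": "high",
        "technologies_mentioned": ["RAG", "LangChain"],
        "portfolio_metrics": {"growth": {"stars_pct": {"30d": 0.71}}},
    },
}
is_narrow_match = matches_preset(signal, "narrow", "githubAIMLInvestment", "RAG")
```

Keeping the presets in one function makes it straightforward to start users on the medium default and let them widen or narrow from there.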
Questions?
Contact [email protected] for integration support.
