2 months ago

Improved

Improved Delivery Coverage

by ReadMe API

TL;DR: Folder names stay the same and represent when the file was uploaded (UTC). Each folder holds the signals delivered since the previous folder, and those signals should be unique. We now ship stragglers that were previously lost to processing delays. Every signal already has a unique signal_id, so customers who deduplicate on it will automatically get more data with no code changes.

What's changing

Better coverage, more signals.

Our pipeline now tracks each entity individually to ensure signals aren't lost between delivery cycles. Previously, processing delays could cause signals to fall through the cracks permanently. Now they ship in the next available delivery.

This was a silent data loss problem.

Signals that hit processing delays (LLM retries, late third-party data, API failures) would simply never appear. With this change, those signals land in the next batch. Since signal_id is already globally unique, customers who deduplicate on it benefit automatically.

We considered changing folder naming but decided against it. The current format works, and signal-level date fields (e.g. data.filing_date on SEC filings, data.posted_date on LinkedIn posts) handle time-series filtering better than any folder name could. See individual signal schema pages for the relevant date field per signal type.

How to use deliveries

Pull the latest folder. Everything inside is new since the previous folder.
Deduplicate on signal_id. Globally unique, never repeated across deliveries.
For time-series filtering, use the signal's own date field, not the folder name.
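
The steps above could be sketched roughly as follows. This is a minimal illustration, not an official client: it assumes the latest folder's output.jsonl has already been downloaded and is being read line by line, and the helper names (load_new_signals, dated_on_or_after) and the seen_ids set are hypothetical. It also assumes the signal's date field holds a string that begins with YYYY-MM-DD.

```python
import json
from typing import Iterable, Iterator


def load_new_signals(lines: Iterable[str], seen_ids: set) -> Iterator[dict]:
    """Yield each signal whose signal_id has not been seen before."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the JSONL file
        record = json.loads(line)
        signal_id = record["signal_id"]
        if signal_id in seen_ids:
            continue  # already ingested, drop the duplicate
        seen_ids.add(signal_id)
        yield record


def dated_on_or_after(record: dict, date_field: str, cutoff: str) -> bool:
    """Time-series filter on the signal's own date field (e.g. "filing_date").

    Assumption: the value starts with YYYY-MM-DD, so comparing the first
    10 characters as strings orders dates correctly.
    """
    value = (record.get("data") or {}).get(date_field)
    return value is not None and value[:10] >= cutoff
```

Persisting seen_ids between runs (a database table, a key-value store) is left to the caller; the folder name is never consulted for filtering.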

What stays the same

Delivery schedule, frequency, and buckets
File format (output.jsonl + output.parquet)
Authentication and service accounts
Signal schema and field names
Folder name format (YYYY-MM-DD-HH-MM-SS)
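
Because the folder name format is unchanged, picking the latest folder can stay a simple timestamp parse. The sketch below assumes you already have a list of folder names from your own bucket listing; parse_folder_name and latest_folder are hypothetical helper names.

```python
from datetime import datetime, timezone

FOLDER_FORMAT = "%Y-%m-%d-%H-%M-%S"  # YYYY-MM-DD-HH-MM-SS


def parse_folder_name(name: str) -> datetime:
    """Convert a delivery folder name into a timezone-aware UTC datetime."""
    parsed = datetime.strptime(name.strip("/"), FOLDER_FORMAT)
    return parsed.replace(tzinfo=timezone.utc)  # folder names are upload times in UTC


def latest_folder(names: list) -> str:
    """Return the most recent folder name from a list of folder names."""
    return max(names, key=parse_folder_name)
```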

3 months ago

Improved

March 2026: Schema Standardization Across All Signal Types

by ReadMe API

We've standardized schemas across all signal types in the Signal Database, improving field consistency and adding richer company and contact metadata. These changes align with AN-8873 and AN-8861.

What Changed

Schema Standardization (All Signals)

Company objects now include a consistent set of enrichment fields across all signal types:

company.employee_count_low / company.employee_count_high — employee count range
company.industries — industry classifications
company.linkedin_url — LinkedIn company page URL
company.description — company description

These fields were previously available on some signals but not others. They are now present across all signal types (nullable when data is unavailable).

Field Renames

Several field names have been standardized for consistency:

Signal Type	Old Field	New Field
Earnings Transcripts	`company.financial_symbol`	`company.ticker`
Twitter/X (Company)	`company.company_size_low`	`company.employee_count_low`
Twitter/X (Company)	`company.company_size_high`	`company.employee_count_high`
Work Milestones	`contact.title`	`contact.job_title`
LinkedIn Comments	`contact.full_name`	`contact.name`
Product Reviews (G2)	`relevance_score` (top-level)	`data.relevance`
Product Reviews (G2)	`insight.headline` (top-level)	`data.headline`
Employee Growth	`data.relevance_score`	`data.relevance`
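
If you keep records delivered under the old field names, one way to bring them in line with the table above is a small rename pass. This is only a sketch: the dictionary keys are the display names from the table (not necessarily the signal_type values in your data), and apply_renames and its helpers are hypothetical names. Dotted paths are treated as nested object keys.

```python
_MISSING = object()

RENAMES_BY_SIGNAL = {
    "Earnings Transcripts": {"company.financial_symbol": "company.ticker"},
    "Twitter/X (Company)": {
        "company.company_size_low": "company.employee_count_low",
        "company.company_size_high": "company.employee_count_high",
    },
    "Work Milestones": {"contact.title": "contact.job_title"},
    "LinkedIn Comments": {"contact.full_name": "contact.name"},
    "Product Reviews (G2)": {
        "relevance_score": "data.relevance",
        "insight.headline": "data.headline",
    },
    "Employee Growth": {"data.relevance_score": "data.relevance"},
}


def _pop_path(record: dict, path: str):
    """Remove and return the value at a dotted path, or _MISSING if absent."""
    *parents, leaf = path.split(".")
    node = record
    for key in parents:
        node = node.get(key)
        if not isinstance(node, dict):
            return _MISSING
    if leaf not in node:
        return _MISSING
    return node.pop(leaf)


def _set_path(record: dict, path: str, value) -> None:
    """Set a value at a dotted path, creating intermediate objects as needed."""
    *parents, leaf = path.split(".")
    node = record
    for key in parents:
        child = node.get(key)
        if not isinstance(child, dict):
            child = node[key] = {}
        node = child
    node[leaf] = value


def apply_renames(record: dict, renames: dict) -> dict:
    """Move old field names to their new locations in place; return the record."""
    for old_path, new_path in renames.items():
        value = _pop_path(record, old_path)
        if value is not _MISSING:
            _set_path(record, new_path, value)
    return record
```

Records that already use the new names pass through unchanged, so the same pass can run over mixed old and new deliveries.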

Structural Changes (SEC 20-F & 6-K)

For 20-F and 6-K signals, signal_category and metrics have moved from top-level fields into the data object:

signal_category → data.signal_category
metrics.* → data.metrics.*
New field: data.llm_call_category — internal LLM classification category
New field: data.fiscal_year_end — fiscal year end period
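
For consumers holding 20-F or 6-K records in the old layout, the move into the data object could be mirrored with a helper like the one below. It is a sketch under the assumption that both fields were plain top-level keys; to_new_layout is a hypothetical name, and the two new fields are not synthesized because their values only come from the pipeline.

```python
def to_new_layout(record: dict) -> dict:
    """Return a copy with top-level signal_category/metrics moved under data."""
    upgraded = dict(record)  # shallow copy so the caller's record is untouched
    data = dict(upgraded.get("data") or {})
    for key in ("signal_category", "metrics"):
        if key in upgraded:
            # Keep an existing data.<key> if the record is already in the new layout.
            data.setdefault(key, upgraded.pop(key))
    upgraded["data"] = data
    return upgraded
```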

Reddit Improvements (AN-8861)

Reddit signals now include several new fields:

signal_category — high-level category (e.g. buying-intent, product, sentiment, risk, competitive, market, pain-point) derived deterministically from signal_subtype
data.post_text — full text of the source Reddit post (required on every record)
data.topics_tags — keyword tags extracted from the discussion (now populated; was previously empty)
signal_name — human-readable signal label
batch_id — processing batch identifier
New subtype: brandReputation (category: sentiment)
Removed: data.evidence_urls, which had a 0% fill rate, has been removed. data.topics_tags also had a 0% fill rate but is now populated rather than removed.

Contact Enrichment (Twitter/X, Work Milestones)

Contact-level signals now include richer contact metadata:

contact.first_name, contact.last_name, contact.email — added to Work Milestones and Twitter/X (Contact)
contact.linkedin_url — added to Twitter/X (Contact) and YouTube (Contact)

Bucket Discovery

All service accounts can now list bucket names and folder timestamps across the entire project using gsutil ls -p autobound-signal-delivery — including buckets you are not licensed for. This lets you discover what signal types are available before requesting access. Reading file contents still requires objectViewer on the specific bucket.

Delivery Timeline

Already live (March 24 delivery): Reddit, SEC 20-F, SEC 6-K, Earnings Transcripts, News, Work Milestones
Next delivery cycle (April 2026): Product Reviews, Twitter/X, YouTube, LinkedIn Comments, Employee Growth, Website Intelligence, Patents

Schema Documentation

All signal schema pages have been updated. See the Signal Catalog for links to each signal's schema documentation.

Signals shipping with the new schema in April will include a notice at the top of their doc page until the first delivery lands.

3 months ago

Added

GCS Bucket Discovery Now Available

by ReadMe API

All service accounts can now list signal bucket names in the delivery project.

gsutil ls -p autobound-signal-delivery

This returns the names of all available signal buckets. Object-level access still requires per-bucket permissions (unchanged). Buckets you aren't licensed for will return 403 on read.
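
To use the listing from a script, the command's output can be parsed into plain bucket names. The sketch below assumes gsutil is installed and authenticated with your service account, and that each bucket is printed on its own line as gs://<bucket-name>/, which is gsutil's usual listing format. bucket_names and list_signal_buckets are hypothetical helper names.

```python
import subprocess


def bucket_names(ls_output: str) -> list:
    """Extract bucket names from the text printed by `gsutil ls -p <project>`."""
    names = []
    for line in ls_output.splitlines():
        line = line.strip()
        if line.startswith("gs://"):
            names.append(line[len("gs://"):].rstrip("/"))
    return names


def list_signal_buckets(project: str = "autobound-signal-delivery") -> list:
    """Run gsutil and return the bucket names visible in the delivery project."""
    result = subprocess.run(
        ["gsutil", "ls", "-p", project],
        capture_output=True,
        text=True,
        check=True,  # raise if gsutil exits with an error
    )
    return bucket_names(result.stdout)
```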

This replaces the previous workflow where you had to check our Delivery docs to find bucket URIs.

Scope: Only bucket names are visible. No object contents, no IAM policies, and no other GCP resources.

3 months ago

Improved

Weekly Refresh Now Live for SEC, News, and Hiring Signals

by ReadMe API

All SEC filing signals, news, and hiring signals are now on weekly refresh — completing the migration from quarterly/monthly.

Signal	Previous	Current
10-K, 10-Q, 8-K, 20-F, 6-K	Monthly	Weekly
Earnings Transcripts	Monthly	Weekly
News	Monthly	Weekly
Hiring Velocity	Monthly	Weekly
Hiring Trends	Monthly	Weekly

New signal types (Podcast, Form D, HackerNews, ProductHunt, Conference, Federal Contracts) launch on a daily cadence.

See Signal Catalog for current refresh frequencies.

3 months ago

Improved

Hiring Velocity & Trends Improvements

by ReadMe API

Based on partner feedback, we shipped several improvements to the hiring velocity and trends signals:

Time window alignment — Velocity and trends calculations now use consistent lookback periods, resolving discrepancies in company counts between datasets
1:1 department mapping — Each open role maps to a single primary department, eliminating double-counting
Full distribution exposed — Removed the top-5 category cap. All departments, locations, seniority levels, and contract types are now returned, sorted by count descending
Interpretation guide — Published a Hiring Velocity Interpretation Guide covering how velocity, trends, and breakdown metrics are calculated

These changes are reflected in all deliveries starting March 10, 2026.

3 months ago

Added

7 New Signal Types Now Live

by ReadMe API

We launched 7 new signal types in Q1 2026, bringing the total to 32+ signal categories.

Company Signals

Signal	Refresh	Description
SEC Form D Funding	2× Daily	Pre-announcement funding signals from SEC EDGAR Form D filings. Catches fundraising before press releases.
New Business Formations	Daily	Secretary of State filings from all 50 US states. ~10,000+ new registrations per day.
Federal Contract Awards	Daily	US government contract awards from USASpending.gov. Tech/services NAICS codes, >$100K threshold.
Conference & CFP Events	Daily	Upcoming tech conferences with CFP deadlines, sponsor tiers, and audience matching.
HackerNews Signals	Daily	Show HN launches, trending discussions, and company mentions with B2B relevance scoring.
ProductHunt Launches	Daily	New product launches with upvotes, maker profiles, and AI/B2B classification.

Contact Signals

Signal	Refresh	Description
Podcast Appearances	Daily	Executive podcast guest appearances with episode topics, key insights, and outreach hooks.

All new signals follow the standard schema pattern and are available via GCS delivery. Contact [email protected] to add these to your subscription.

5 months ago

Deprecated

Bucket URI Migration (v1, v2 Schema Updates)

by Daniel Wiener

We've migrated 15 signal categories to new GCS bucket URIs with corrected, standardized schemas. The new buckets use a -v1, -v2, or -v3 suffix.

What's Changing

Signal Type	Old URI (Deprecated)	New URI
SEC 10-K	`gs://autobound-10k/`	`gs://autobound-10k-v1/`
SEC 10-Q	`gs://autobound-10q/`	`gs://autobound-10q-v1/`
SEC 20-F	`gs://autobound-20f/`	`gs://autobound-20f-v1/`
SEC 6-K	`gs://autobound-6k/`	`gs://autobound-6k-v1/`
Employee Growth	`gs://autobound-employee-growth/`	`gs://autobound-employee-growth-v1/`
GitHub	`gs://autobound-github/`	`gs://autobound-github-v1/`
Glassdoor (Company)	`gs://autobound-glassdoor-company/`	`gs://autobound-glassdoor-company-v2/`
Hiring Velocity	`gs://autobound-hiring-velocity/`	`gs://autobound-hiring-velocity-v1/`
LinkedIn Comments (Contact)	`gs://autobound-linkedin-comments-contact/`	`gs://autobound-linkedin-comments-contact-v1/`
LinkedIn Post (Company)	`gs://autobound-linkedin-post-company/`	`gs://autobound-linkedin-post-company-v2/`
LinkedIn Post (Contact)	`gs://autobound-linkedin-post-contact/`	`gs://autobound-linkedin-post-contact-v3/`
News	`gs://autobound-news/`	`gs://autobound-news-v2/`
Product Reviews (G2)	`gs://autobound-product-reviews/`	`gs://autobound-product-reviews-v1/`
Reddit (Company)	`gs://autobound-reddit-company/`	`gs://autobound-reddit-company-v1/`
Website Intelligence	`gs://autobound-website-intelligence/`	`gs://autobound-website-intelligence-v1/`
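
If your pipeline configuration stores bucket URIs as strings, the table above can be applied mechanically. The mapping below is copied from the table; migrate_uri is a hypothetical helper, offered as a sketch rather than a supported tool. URIs for buckets that were not migrated are returned unchanged.

```python
NEW_BUCKET = {
    "autobound-10k": "autobound-10k-v1",
    "autobound-10q": "autobound-10q-v1",
    "autobound-20f": "autobound-20f-v1",
    "autobound-6k": "autobound-6k-v1",
    "autobound-employee-growth": "autobound-employee-growth-v1",
    "autobound-github": "autobound-github-v1",
    "autobound-glassdoor-company": "autobound-glassdoor-company-v2",
    "autobound-hiring-velocity": "autobound-hiring-velocity-v1",
    "autobound-linkedin-comments-contact": "autobound-linkedin-comments-contact-v1",
    "autobound-linkedin-post-company": "autobound-linkedin-post-company-v2",
    "autobound-linkedin-post-contact": "autobound-linkedin-post-contact-v3",
    "autobound-news": "autobound-news-v2",
    "autobound-product-reviews": "autobound-product-reviews-v1",
    "autobound-reddit-company": "autobound-reddit-company-v1",
    "autobound-website-intelligence": "autobound-website-intelligence-v1",
}


def migrate_uri(uri: str) -> str:
    """Rewrite a deprecated gs:// URI to its versioned bucket, if it has one."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        return uri
    # Split "bucket/path/to/object" into the bucket and the rest of the path.
    bucket, sep, rest = uri[len(prefix):].partition("/")
    return prefix + NEW_BUCKET.get(bucket, bucket) + sep + rest
```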

Why We Made This Change

As part of a broader delivery infrastructure cleanup, we've migrated these signal categories to new buckets with corrected and standardized schemas that align with our Signal Schema documentation. This ensures consistent field naming, data types, and structure across all signal categories.

📘 One-Time Migration: This cleanup is a one-time effort to standardize our delivery infrastructure. We do not anticipate additional bucket URI changes going forward. Once you've updated to the new URIs, your integration should remain stable.

Deprecation Timeline

Date	Action
January 2026	New versioned buckets are live and receiving data
January 2026	Old buckets stop receiving new data
February 2026	Old bucket URIs will be deprecated and access removed

⚠️ Action Required: Update your data pipelines to use the new bucket URIs before February 2026.

Historical Data Note

Due to the new delivery mechanism, historical data in the new buckets may not extend the full 3-6 months initially. If you require historical backfill for specific signal categories, please contact us at [email protected].

Buckets Not Affected

The following buckets remain at their current URIs with no changes:

gs://autobound-8k/ — SEC 8-K current reports
gs://autobound-company-database/ — Company database
gs://autobound-contact-database/ — Contact database
gs://autobound-earnings-transcripts/ — Earnings call transcripts
gs://autobound-financials/ — Financial data
gs://autobound-hiring-trends/ — Hiring trends
gs://autobound-intent/ — Intent signals
gs://autobound-manifests/ — Data manifests
gs://autobound-patents/ — Patent filings
gs://autobound-seo-traffic/ — SEO & traffic signals
gs://autobound-tech-used/ — Technology stack
gs://autobound-twitter-company-posts/ — Twitter/X posts (company-level)
gs://autobound-work-milestones/ — Work milestones
gs://autobound-x-company/ — Twitter/X posts (company-level)
gs://autobound-x-contact/ — Twitter/X posts (contact-level)
gs://autobound-youtube-company/ — YouTube activity (company-level)
gs://autobound-youtube-contact/ — YouTube activity (contact-level)

Migration Checklist

Use this checklist to ensure a smooth migration:

Identify which of the 15 migrated signal types you currently use
Update bucket URIs in your data pipeline configuration
Test access to new buckets with your service account credentials
Verify data schema compatibility with your downstream systems
Update any monitoring or alerting that references old bucket names
Complete migration before February 2026 deprecation date

Questions?

If you have questions about this migration or need assistance updating your pipelines, contact us at [email protected].

5 months ago

Improved

Earnings Transcript Schema Updates

by Daniel Wiener

We've made the following updates to the Earnings Transcript signal schema to improve signal quality and provide richer context for sales outreach.

New Fields Added: Speaker Attribution

We've added speaker attribution to help you reference who said what in your outreach. Instead of just having quotes, you now know if it was the CEO, CFO, or another executive—making your outreach more credible and personalized.

Field	Type	Description
`data.evidence_speakers`	`array[object]`	Speaker attribution for each quote in the evidence array
`data.evidence_speakers[].speaker_name`	`string`	Full name of the speaker (e.g., "Satya Nadella")
`data.evidence_speakers[].speaker_title`	`string`	Title of the speaker (e.g., "Chief Executive Officer")
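
Pairing each quote with its speaker might look like the sketch below. It assumes evidence and evidence_speakers are index-aligned, which the field description suggests but does not state outright; attributed_quotes is a hypothetical helper name. Quotes without a matching speaker entry get None for both speaker fields.

```python
def attributed_quotes(data: dict) -> list:
    """Pair each evidence quote with the speaker entry at the same index."""
    quotes = data.get("evidence") or []
    speakers = data.get("evidence_speakers") or []
    results = []
    for index, quote in enumerate(quotes):
        # Assumption: speakers[index] describes quotes[index].
        speaker = speakers[index] if index < len(speakers) and speakers[index] else {}
        results.append(
            {
                "quote": quote,
                "speaker_name": speaker.get("speaker_name"),
                "speaker_title": speaker.get("speaker_title"),
            }
        )
    return results
```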

Example usage:

"In your Q3 earnings call, your CEO Michael O'Sullivan mentioned you're 'aggressively going after' the performance gap with competitors. We help retailers close that gap—worth a quick chat?"

Format Changes

Field	Old Format	New Format
`data.earnings_date`	ISO 8601 (`2025-01-29T17:00:00Z`)	Space-separated (`2025-01-29 17:00:00`)

Migration Notes

evidence_speakers: This is an additive, non-breaking change. The new field provides speaker attribution for quotes in the evidence array. The evidence field itself remains unchanged as array[string]. If you don't need speaker info, you can ignore this field.
earnings_date: This is a breaking change if you're parsing dates with strict ISO 8601 parsers. Update your date parsing logic to handle the new YYYY-MM-DD HH:MM:SS format (no T separator, no timezone suffix).
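
A tolerant parser that accepts both the old and new earnings_date formats could look like this. It is a sketch: parse_earnings_date is a hypothetical name, and treating the new space-separated value as UTC is an assumption, since the new format carries no timezone suffix.

```python
from datetime import datetime, timezone

_EARNINGS_DATE_FORMATS = (
    "%Y-%m-%d %H:%M:%S",   # new format: 2025-01-29 17:00:00
    "%Y-%m-%dT%H:%M:%SZ",  # old format: 2025-01-29T17:00:00Z
)


def parse_earnings_date(value: str) -> datetime:
    """Parse data.earnings_date in either format into a UTC datetime."""
    for fmt in _EARNINGS_DATE_FORMATS:
        try:
            parsed = datetime.strptime(value, fmt)
        except ValueError:
            continue  # try the next known format
        # Assumption: values without a suffix are still UTC.
        return parsed.replace(tzinfo=timezone.utc)
    raise ValueError(f"Unrecognized earnings_date format: {value!r}")
```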

Affected Data

All historical earnings transcript data (2025-01 through 2026-01) has been regenerated with the new schema
Data delivered prior to January 11, 2026 uses the old schema (no evidence_speakers, ISO 8601 dates)
Action required: If you received earnings transcript data before January 11, 2026, contact your account manager to request a re-sync to get the updated schema

Questions?

Contact [email protected] if you have questions about this update or need assistance with migration.

5 months ago

Added

Moderation Scores Expanded to Product Reviews

by Daniel Wiener

We've expanded our content moderation scoring to include product review signals, ensuring brand-safe data for customer-facing use cases.

Affected Signals

G2 Reviews
Glassdoor Reviews
Reddit, LinkedIn, Twitter/X, YouTube (already supported)

How to Use

Filter by moderation_score (0.0–1.0, higher = safer):

{
  "signal_type": "g2-review",
  "data": {
    "moderation_score": 0.95,
    "review_text": "Great product for enterprise teams...",
    ...
  }
}

Recommended Thresholds

Use Case	Threshold
Email personalization (passing into an LLM)	≥ 0.8
Customer-facing display	≥ 0.9
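
Applied in code, the thresholds above reduce to a single comparison. In this sketch the use-case keys and the is_brand_safe helper are hypothetical names, and treating a record with no moderation_score as unsafe is a conservative assumption rather than documented behavior.

```python
THRESHOLDS = {
    "llm_personalization": 0.8,  # email personalization (passing into an LLM)
    "customer_facing": 0.9,      # customer-facing display
}


def is_brand_safe(record: dict, use_case: str) -> bool:
    """Return True if data.moderation_score meets the threshold for the use case."""
    score = (record.get("data") or {}).get("moderation_score")
    # Assumption: a missing score is treated as not safe.
    return score is not None and score >= THRESHOLDS[use_case]
```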

5 months ago

Added

Manifest Files for S3/GCS Drops

by Daniel Wiener

Manifest files are now generated for all signal data drops, enabling event-driven data pipelines.

After each data delivery, a manifest file is written to a dedicated manifest bucket. Each signal type gets its own manifest file per delivery date.

For full details on manifest location, schema, and usage, see our Manifest Files guide.

Quick Example

File: news_2026-04-07.json

{
  "signal_type": "news",
  "delivery_date": "2026-04-07",
  "status": "complete",
  "destination": "internal",
  "deliveries": [
    {
      "delivery_timestamp": "2026-04-07T00:00:00Z",
      "data_path": "gs://autobound-news-v3/2026-04-07-00-00-00/",
      "files": [
        {
          "file_name": "output.jsonl",
          "file_path": "gs://autobound-news-v3/2026-04-07-00-00-00/output.jsonl",
          "format": ".jsonl",
          "size_bytes": 79510730,
          "record_count": 16766
        },
        {
          "file_name": "output.parquet",
          "file_path": "gs://autobound-news-v3/2026-04-07-00-00-00/output.parquet",
          "format": ".parquet",
          "size_bytes": 39160557,
          "record_count": 16766
        }
      ],
      "record_count": 33532
    }
  ],
  "total_record_count": 33532,
  "total_file_count": 2,
  "pipeline_run_id": null,
  "created_at": "2026-04-07T14:36:33Z"
}
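
An event-driven consumer could read a manifest like the one above and pull out the files to load. The sketch below relies only on fields shown in the example; completed_files is a hypothetical helper, and skipping any manifest whose status is not "complete" is an assumption, since other status values are not listed here.

```python
def completed_files(manifest: dict, fmt: str = ".parquet") -> list:
    """Return file paths of the requested format from a completed manifest."""
    if manifest.get("status") != "complete":
        return []  # assumption: only "complete" manifests are safe to load
    paths = []
    for delivery in manifest.get("deliveries", []):
        for file_info in delivery.get("files", []):
            if file_info.get("format") == fmt:
                paths.append(file_info["file_path"])
    return paths
```

Loading the manifest itself is a plain json.load on the downloaded file; choosing one format avoids ingesting the same records twice from output.jsonl and output.parquet.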
