Delivery Dedup Ledger & Straggler Handling (Part 2)
News Signals Now on Daily Cadence
by ReadMe API
TL;DR: News signals now deliver daily instead of weekly. Same schema, same fields, just faster.
What's changing
News signals have moved from a weekly to a daily delivery cadence. During the transition period, weekly and daily deliveries ran in parallel to ensure continuity. The weekly cadence is now fully retired.
Customers receive same-day news signals instead of waiting for the weekly batch, enabling faster response to breaking developments at target accounts.
What stays the same
- Same signal schema and fields
- Same format and structure
- Same filtering and query behavior
- No integration changes required
⚠️ Breaking — new fields added to data and data.evidence[] objects.
Product Review signals now include source URLs and richer evidence metadata for each individual review.
New Fields
| Field | Location | Type | Description |
|---|---|---|---|
| source_page_url | data | string (URL) | Direct link to the review page (e.g. G2 product page) |
| product_name | data | string | LLM-identified product being reviewed |
| review_url | data.evidence[] | string (URL) | Link to the individual review |
| star_rating | data.evidence[] | float | Numeric rating (1-5) |
| review_date | data.evidence[] | string (date) | When the review was posted (YYYY-MM-DD) |
Additional Enrichments
| Field | Location | Type | Description |
|---|---|---|---|
| switching_intent | data | object | Detected intent to switch vendors (detected, signal_phrase, urgency) |
| quantified_impact | data | object | Numerical business impact mentioned (has_numbers, metrics) |
| decision_maker_complaint | data | object | Whether reviewer is a decision-maker (is_decision_maker, title) |
| competitors_mentioned | data | array | Competitor products referenced in the review |
Example
{
  "data": {
    "source_page_url": "https://www.g2.com/products/scalepad-backup-radar/reviews",
    "product_name": "ScalePad Backup Radar",
    "switching_intent": {
      "detected": true,
      "signal_phrase": "looking for alternatives",
      "urgency": "medium"
    },
    "competitors_mentioned": ["Veeam", "Datto"],
    "evidence": [
      {
        "quote": "The reporting is limited and we have been exploring other options...",
        "reviewer_name": "IT Director",
        "review_url": "https://www.g2.com/products/scalepad-backup-radar/reviews#review-12710470",
        "star_rating": 2,
        "review_date": "2026-04-29"
      }
    ]
  }
}
Review excerpts in the evidence array are now longer and more detailed.
Effective: May 12, 2026 delivery onward (autobound-product-reviews-v2 bucket).
Action required: If your schema strictly validates the data or evidence[] objects, add handling for the new fields. All new fields are nullable.
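As an illustration of the null-tolerant handling described above, here is a minimal Python sketch. The function name `summarize_review_signal` and the shape of its return value are hypothetical; only the field names come from the tables in this entry. It assumes a record has already been parsed into a dict and that any of the new fields may be missing or null.

```python
from typing import Any


def summarize_review_signal(record: dict[str, Any]) -> dict[str, Any]:
    """Collect the new Product Review fields from one record, tolerating nulls."""
    # Every new field is nullable, so fall back to empty containers throughout.
    data = record.get("data") or {}
    switching = data.get("switching_intent") or {}
    evidence = data.get("evidence") or []

    # Only keep ratings that are actually present on an evidence entry.
    ratings = [e["star_rating"] for e in evidence if e.get("star_rating") is not None]

    return {
        "product_name": data.get("product_name"),
        "source_page_url": data.get("source_page_url"),
        "switching_detected": bool(switching.get("detected")),
        "competitors": data.get("competitors_mentioned") or [],
        "review_urls": [e["review_url"] for e in evidence if e.get("review_url")],
        "min_star_rating": min(ratings) if ratings else None,
    }
```

Reading every field through `.get()` means the same code path can handle records delivered before and after May 12, 2026.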
⚠️ Breaking — new timestamp fields added, previously-null field now populated.
LinkedIn Comments (contact) signals now include actual posting timestamps. Previously, detected_at (when Autobound ingested the signal, typically 1-3 weeks after posting) was the only time reference.
New & Updated Fields
| Field | Type | Change | Description |
|---|---|---|---|
| data.comment_posted_at | string (ISO 8601) | New | When the comment was actually posted on LinkedIn |
| data.parent_post.posted_date | string (ISO 8601) | Now populated | When the parent post was published (field existed but was always null) |
Example
{
  "data": {
    "comment_posted_at": "2026-04-27T00:03:51.002Z",
    "parent_post": {
      "posted_date": "2026-04-24T00:21:40.232Z"
    }
  }
}
All other fields are unchanged.
Effective: May 12, 2026 delivery onward (autobound-linkedin-comments-contact-v2 bucket).
Action required: If you filter on posted_date IS NULL or parse posted_date as a date-only string, update your logic. The field is now a full ISO 8601 datetime with milliseconds.
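One way to absorb this change is a parser that accepts null, date-only, and full datetime values alike. The sketch below is illustrative: `parse_posted_date` is a hypothetical helper, and treating values without a timezone as UTC is an assumption, not something this changelog specifies.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_posted_date(value: Optional[str]) -> Optional[datetime]:
    """Parse posted_date or comment_posted_at into an aware datetime, or None."""
    if not value:
        # Older deliveries carried null here.
        return None
    # datetime.fromisoformat() only accepts a trailing "Z" on Python 3.11+,
    # so rewrite it as an explicit UTC offset first.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption: date-only or naive values are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed
```

For example, `parse_posted_date("2026-04-24T00:21:40.232Z")` returns an aware datetime with the milliseconds preserved, and `parse_posted_date(None)` returns `None` rather than raising.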
Upstream seniority categories on Hiring Velocity and Hiring Trends signals have been updated. First delivery containing these changes: April 7, 2026 (2026-04-07-00-00-00 folder).
Seniority Categories (Breaking)
Removed: non_manager, owner
Added: junior, mid_senior
The full data.seniority[].category enum is now: c_level, director, founder, head, junior, manager, mid_senior, partner, president, vice_president.
A new value not_set may also appear for roles where seniority could not be determined.
Customers filtering or grouping by seniority category should update to the new values.
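A small validation helper can make the enum change explicit in an ingestion pipeline. This is a sketch only: the function `classify_seniority` and its return labels are hypothetical, while the category values are copied from the lists above. The changelog does not say how the removed values map onto the new ones, so the sketch flags them rather than remapping them.

```python
# Current enum values for data.seniority[].category, as listed above.
SENIORITY_CATEGORIES = frozenset({
    "c_level", "director", "founder", "head", "junior",
    "manager", "mid_senior", "partner", "president", "vice_president",
})

# Values removed on April 7, 2026; they can still appear in older deliveries.
REMOVED_CATEGORIES = frozenset({"non_manager", "owner"})


def classify_seniority(category: str) -> str:
    """Label a category as current, not_set, removed, or unrecognized."""
    if category in SENIORITY_CATEGORIES:
        return "current"
    if category == "not_set":
        return "not_set"
    if category in REMOVED_CATEGORIES:
        return "removed"
    return "unrecognized"
```

Returning a label instead of raising lets a pipeline log unexpected values without dropping the whole record.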
Title Normalization (Additive)
Two new fields have been added to Hiring Velocity and Hiring Trends signals:
| Field | Type | Description |
|---|---|---|
| sample_normalized_titles | string[] | Cleaned/standardized job titles |
| sample_translated_titles | string[] | English translations for non-English postings |
These appear in the data object alongside existing sample_titles. Additive — no existing fields modified.
News Signals: 10 New Event Subtypes
10 new subtypes have been added to news signals, expanding coverage of corporate events, financials, and risk signals. First delivery containing these subtypes: April 28, 2026 (2026-04-28-00-00-00 folder in autobound-news-v3).
| Subtype | Category | Description |
|---|---|---|
| spins_off_company | Strategic | Company spins off a subsidiary |
| spins_off_division | Strategic | Company spins off a division |
| rebrands_to | Strategic | Company rebrands to a new name/identity |
| splits_into | Strategic | Company splits into multiple entities |
| declares_bankruptcy | Risk | Company files for bankruptcy |
| loses_client | Revenue | Company loses a customer |
| ends_partnership_with | Strategic | Company ends a partnership |
| has_valuation | Financial | Company valuation reported |
| has_earnings | Financial | Company earnings reported |
| has_revenue | Financial | Company revenue reported |
These are additive — existing subtypes and schemas are unchanged. Total news subtypes: 40 (up from 30).
Customers ingesting news signals should update any subtype validation or switch logic to accept the new values. See the News schema page for the full subtype list.
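If your pipeline keeps an allow-list of subtypes, the update can be as small as extending that list. The sketch below is a hedged illustration: `is_accepted_subtype` and the `existing_subtypes` argument are hypothetical names standing in for whatever validation you already have; the ten subtype values and their categories are taken from the table above.

```python
# The ten subtypes added on April 28, 2026, mapped to their categories.
NEW_SUBTYPE_CATEGORIES = {
    "spins_off_company": "Strategic",
    "spins_off_division": "Strategic",
    "rebrands_to": "Strategic",
    "splits_into": "Strategic",
    "declares_bankruptcy": "Risk",
    "loses_client": "Revenue",
    "ends_partnership_with": "Strategic",
    "has_valuation": "Financial",
    "has_earnings": "Financial",
    "has_revenue": "Financial",
}


def is_accepted_subtype(subtype: str, existing_subtypes: set[str]) -> bool:
    """Accept a subtype if it was already allowed or is one of the new ten."""
    return subtype in existing_subtypes or subtype in NEW_SUBTYPE_CATEGORIES
```

Keeping the new values in a separate mapping makes it easy to route them by category, for example sending `Risk` subtypes to a different queue.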
Summary
Our S3 delivery pipeline experienced a 29-day sync outage from February 25 to March 26, 2026. During this period, signal data continued to be produced and delivered to GCS buckets normally, but was not propagated to S3 mirrors. This affected all S3 delivery customers.
All missing data has been fully backfilled as of April 16, 2026. We have also upgraded the sync architecture to prevent recurrence.
What happened
- Feb 25: The GCS→S3 sync scheduler was inadvertently removed during infrastructure maintenance
- Feb 25 – Mar 26: S3 mirrors stopped receiving updates. GCS (primary) was unaffected
- Mar 26: Sync pipeline was restored. New data began flowing to S3 again
- Apr 16: Full backfill of all 121 missing deliveries completed across 36 buckets (242 files)
Impacted buckets (36)
| Bucket | Missing deliveries |
|---|---|
| autobound-10k-v1 | 4 |
| autobound-10q-v1 | 4 |
| autobound-20f-v1 | 1 |
| autobound-20f-v2 | 1 |
| autobound-6k-v1 | 3 |
| autobound-6k-v2 | 1 |
| autobound-8k | 4 |
| autobound-conference-cfp | 9 |
| autobound-earnings-transcripts | 3 |
| autobound-earnings-transcripts-v2 | 1 |
| autobound-federal-contract-award | 8 |
| autobound-financials | 1 |
| autobound-github-v1 | 1 |
| autobound-glassdoor-company-v2 | 1 |
| autobound-hackernews | 12 |
| autobound-hiring-trends | 4 |
| autobound-hiring-velocity-v1 | 4 |
| autobound-linkedin-comments-contact-v1 | 1 |
| autobound-linkedin-post-company-v2 | 1 |
| autobound-linkedin-post-contact-v3 | 2 |
| autobound-news-v2 | 3 |
| autobound-news-v3 | 1 |
| autobound-patents | 1 |
| autobound-podcast-appearance | 17 |
| autobound-product-reviews-v1 | 1 |
| autobound-producthunt | 9 |
| autobound-reddit-company-v2 | 1 |
| autobound-sec-form-d-funding | 12 |
| autobound-seo-traffic | 1 |
| autobound-twitter-company-posts | 1 |
| autobound-twitter-contact-posts | 1 |
| autobound-website-intelligence-v1 | 1 |
| autobound-work-milestones | 3 |
| autobound-work-milestones-v2 | 1 |
| autobound-youtube-company | 1 |
| autobound-youtube-contact | 1 |
What we changed
- Full mirror architecture: S3 now mirrors every folder in every signal bucket, not just the current day. Any historical gap is automatically caught and filled on the next sync run.
- GCP-native scheduling: The sync trigger has been moved from an external scheduler to GCP Cloud Scheduler, eliminating the single point of failure that caused this outage.
- Increased capacity: Cloud Run job memory upgraded from 8GB to 16GB to handle large signal files (e.g., website-intelligence at 13.7GB).
- Backfill manifest: A backfill manifest (backfill-2026-04-16.json) has been uploaded to s3://autobound-s3-manifests/syncs/, documenting all recovered deliveries.
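The idea behind the full mirror check can be sketched as a set difference between the folders present in the primary bucket and those present in its mirror. This is not the actual sync implementation, which is not described here. The function `missing_folders` is hypothetical, and it takes plain lists of folder names so that listing GCS or S3 stays out of scope.

```python
from typing import Iterable


def missing_folders(primary: Iterable[str], mirror: Iterable[str]) -> list[str]:
    """Return folders present in the primary bucket but absent from the mirror."""
    # Folder names such as 2026-04-07-00-00-00 sort chronologically as strings.
    return sorted(set(primary) - set(mirror))


gcs_folders = ["2026-02-24-00-00-00", "2026-02-25-00-00-00", "2026-03-04-00-00-00"]
s3_folders = ["2026-02-24-00-00-00"]
print(missing_folders(gcs_folders, s3_folders))
# prints ['2026-02-25-00-00-00', '2026-03-04-00-00-00']
```

Comparing the complete folder lists on every run, rather than only the current day, is what lets a historical gap surface on the next sync.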
Action required
None. All missing data is now available in your S3 buckets in the same folder structure and file naming convention as regular deliveries. No changes to your ingestion pipeline are needed.
If you notice any remaining gaps, please reach out in your Slack Connect channel.
Product Reviews Schema Update — v2 Migration
We've updated the Product Reviews signal schema as part of the v2 migration. This update brings structural improvements and a significant increase in data coverage.
Schema Changes
New fields added:
| Field | Type | Description |
|---|---|---|
| batch_id | string | Unique identifier for the delivery batch |
| signal_name | string | Standardized signal name |
| association | string | Entity association type |
| detected_at | datetime | Timestamp when the signal was detected |
| headline | string | Human-readable signal headline |
Fields removed:
| Field | Notes |
|---|---|
| insight | Replaced by headline, which provides a cleaner, more actionable summary |
| relevance_score | Deprecated in v2 schema |
Record Count Increase
The v2 migration includes a +20% increase in record count due to expanded data source coverage and improved entity matching.
Migration Notes
- All v2 fields follow the standardized signal schema documented in Schema Reference
- The headline field replaces insight with a more concise, actionable format
- New fields are additive and existing integration patterns are otherwise unaffected; the only breaking change is the removal of the two fields noted above
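During a migration window, a reader that handles both v1 and v2 records can prefer headline and fall back to insight. The sketch below is an assumption-laden illustration: `review_summary` is a hypothetical helper, and because this entry does not state where the two fields sit in the record, it checks both the top level and the data object.

```python
from typing import Any, Optional


def review_summary(record: dict[str, Any]) -> Optional[str]:
    """Return the v2 headline if present, otherwise the v1 insight, else None."""
    data = record.get("data") or {}
    # Assumption: the field may sit at the top level or inside data.
    for field in ("headline", "insight"):
        value = record.get(field) or data.get(field)
        if value:
            return value
    return None
```

Once all consumers read v2 deliveries only, the insight fallback can be removed.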
We are expanding the news signal schema with structured second company data and additional enrichment fields.
This is an upcoming change, targeted for early June 2026. We will communicate the exact rollout date in advance. No action is needed until then.
Second company enrichment
News signals for event categories that involve two companies (M&A, partnerships, investments, integrations, litigation, talent movement) will include structured data for the second party. Currently, the second company is only available in the article text.
Two new fields in the data object identify the second company:
| Field | Type | Description |
|---|---|---|
| data.related_company_name | string | Name of the second company involved in the event |
| data.related_company_domain | string | Domain of the second company |
Which subtypes include second company data
| Subtype | company (primary) | Second company |
|---|---|---|
| Acquisition | Acquirer | Company acquired |
| Merger | Company A | Merge partner |
| Sells Assets | Seller | Buyer |
| New Customer | Vendor | The new client |
| Files Lawsuit | Plaintiff | Defendant |
| Invests Into | Investor | Company invested in |
| Integration | Company A | Integration partner |
| Partnership | Company A | Partner |
| Competitor Identified | Company A | The competitor |
| Executive Departure | (person-level) | Company departed from |
| Executive Retirement | (person-level) | Company retired from |
Additional fields
The following fields are being added to the news signal schema. All fields are nullable and only present when relevant to the signal subtype.
Top-level fields
| Field | Type | Description |
|---|---|---|
| signal_id | string | Unique identifier for the signal record |
| signal_type | string | Signal category (always news for news signals) |
| signal_subtype | string | Specific event type (e.g. acquires, partnership, increases_headcount_by) |
| signal_name | string | Human-readable signal name |
| detected_at | string | ISO 8601 timestamp when the signal was detected |
| batch_id | string | Delivery batch identifier |
| association | string | How the signal is associated with the company |
Company fields
| Field | Type | Description |
|---|---|---|
| company.name | string | Company name |
| company.domain | string | Company domain |
| company.linkedin_url | string | LinkedIn company page URL |
| company.industries | array of strings | Industry classifications |
| company.employee_count_low | integer | Lower bound of employee count range |
| company.employee_count_high | integer | Upper bound of employee count range |
Data fields
| Field | Type | Description |
|---|---|---|
| data.title | string | Article headline |
| data.summary | string | Short human-readable excerpt of the event |
| data.body | string | Full article text |
| data.overview | string | Company or event overview |
| data.url | string | Source article URL |
| data.image_url | string | Article image URL |
| data.author | string | Article author name |
| data.published_at | string | ISO 8601 date when the article was published |
| data.effective_date | string | Date the event takes or took effect |
| data.event | string | Name of the event attended (for event-related signals) |
| data.amount | integer | Monetary amount in USD (funding, acquisition value, revenue, etc.) |
| data.confidence | double | Reliability score between 0 and 1. A value of 1 indicates highest certainty. |
| data.is_planned | boolean | true if the event is planned but not yet completed |
| data.headcount | integer | Number of people involved (hiring, layoffs) |
| data.contact | string | Person name mentioned in the event |
| data.job_title | string | Job title referenced in the event |
| data.job_title_tags | array of strings | Normalized job title tags (e.g. marketing, directors) |
| data.ticker | string | Stock ticker symbol |
| data.location | string | Location as text (built from location_data when available) |
| data.location_data | array of objects | Structured location data (see below) |
| data.financing_type | string | Type of financing (e.g. Series B funding) |
| data.financing_type_tags | array of strings | Normalized financing category tags (e.g. equity) |
| data.product | string | Product name mentioned |
| data.product_tags | array of strings | Normalized product tags |
| data.product_data.name | string | Cleaned product name |
| data.product_data.full_text | string | Full product mention as extracted from text |
| data.product_data.release_type | string | Product release type (e.g. major) |
| data.product_data.fuzzy_match | boolean | true if the product name may not have been extracted cleanly |
| data.assets | string | Assets involved (e.g. properties, facilities) |
| data.assets_tags | array of strings | Normalized asset tags |
| data.award | string | Award or recognition name |
| data.recognition | string | Name of the recognition the company received |
| data.vulnerability | string | Security or operational issue identified |
| data.related_company_name | string | Name of the second company (see above) |
| data.related_company_domain | string | Domain of the second company (see above) |
Location data object
Each entry in data.location_data contains:
| Field | Type | Description |
|---|---|---|
| city | string | City name |
| state | string | State or province |
| zip_code | string | Postal code |
| country | string | Country name |
| region | string | Geographic region (e.g. Northern America) |
| continent | string | Continent (e.g. Americas) |
| fuzzy_match | boolean | true if location data may not have been extracted accurately |
Example
{
  "signal_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "signal_type": "news",
  "signal_subtype": "acquires",
  "signal_name": "Acquisition",
  "detected_at": "2026-05-15T14:30:00Z",
  "batch_id": "batch_20260515_001",
  "association": "direct",
  "company": {
    "name": "Acquirer Corp",
    "domain": "acquirercorp.com",
    "linkedin_url": "https://linkedin.com/company/acquirer-corp",
    "industries": ["Technology", "Enterprise Software"],
    "employee_count_low": 1001,
    "employee_count_high": 5000
  },
  "data": {
    "title": "Acquirer Corp Completes Acquisition of Target Inc",
    "summary": "Acquirer Corp has acquired Target Inc for $50M to expand its AI capabilities.",
    "body": "Enterprise software company Acquirer Corp announced today that it has completed the acquisition of Target Inc...",
    "url": "https://techcrunch.com/2026/05/15/acquirer-corp-acquires-target-inc",
    "author": "Sarah Johnson",
    "image_url": "https://techcrunch.com/images/acquirer-target-deal.jpg",
    "published_at": "2026-05-15T10:00:00Z",
    "effective_date": "2026-05-15",
    "amount": 50000000,
    "confidence": 0.95,
    "is_planned": false,
    "location": "San Francisco, California, 94105, United States, Northern America, Americas",
    "location_data": [
      {
        "city": "San Francisco",
        "state": "California",
        "zip_code": "94105",
        "country": "United States",
        "region": "Northern America",
        "continent": "Americas",
        "fuzzy_match": false
      }
    ],
    "related_company_name": "Target Inc",
    "related_company_domain": "targetinc.com"
  }
}
What stays the same
- Signal types without a second party (e.g., Funding, IPO, Launches, Headcount changes) will not include data.related_company_name or data.related_company_domain.
- Delivery schedule and file format are unchanged.
- All fields are nullable and will only be present when relevant to the specific signal subtype.
What to update
If you ingest news signals, add handling for the new data.related_company_name and data.related_company_domain fields on the 11 subtypes listed above, and ensure your schema accommodates the additional data fields documented in this changelog.
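As a rough sketch of that handling, the helper below reads the two second-company fields when they are present and returns nothing otherwise. The name `related_company` and its return shape are hypothetical. It keys on field presence rather than on signal_subtype, since the exact subtype identifiers for all 11 rows are not listed in this entry.

```python
from typing import Any, Optional


def related_company(record: dict[str, Any]) -> Optional[dict[str, Optional[str]]]:
    """Return the second company's name and domain, or None if neither is set."""
    data = record.get("data") or {}
    name = data.get("related_company_name")
    domain = data.get("related_company_domain")
    # Both fields are nullable and absent on single-company subtypes.
    if name is None and domain is None:
        return None
    return {"name": name, "domain": domain}
```

Applied to the example record above, this returns `{"name": "Target Inc", "domain": "targetinc.com"}`; for a funding signal without a second party it returns `None`.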
