Signal Confidence Scoring
Probabilistic assessment of how likely a data point is to be accurate, based on source quality, evidence strength, recency, and cross-validation.
Confidence scores enable risk-weighted decisions: high-confidence data supports immediate action, while low-confidence data requires verification before outreach to avoid embarrassing errors.
Expanded Definition
Confidence scoring combines multiple factors: source authority (regulatory > news > social media), evidence triangulation (3+ independent sources > single source), recency (fresh data under 6 months old > stale data over 2 years old), verification method (direct confirmation > cross-source validation > single observation), and field decay rate (stable fields maintain confidence longer).
Typical scoring tiers:
- High (95%+ confident): multiple strong sources, recent, cross-validated
- Medium (70% to under 95%): single strong source or multiple weak sources, moderately recent
- Low (<70%): unverified, single weak source, or aged data
Scores should update dynamically as evidence accumulates or decays.
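The tier boundaries can be expressed as a small lookup. This is a minimal sketch, assuming scores are stored as fractions between 0 and 1 and that a score of exactly 0.95 counts as high; the function name `confidence_tier` is hypothetical.

```python
def confidence_tier(score: float) -> str:
    """Map a confidence score in [0, 1] to a tier name."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= 0.95:   # High: 95% and above
        return "high"
    if score >= 0.70:   # Medium: 70% up to, but not including, 95%
        return "medium"
    return "low"        # Low: below 70%


for example in (0.97, 0.80, 0.40):
    print(example, confidence_tier(example))
```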
Signals & Evidence
Confidence scoring factors:
- Source strength: Primary (regulatory, legal) > secondary (news, databases) > tertiary (social, websites)
- Triangulation: 3+ independent sources > 2 sources > 1 source
- Recency: <6 months (high), 6-24 months (medium), >24 months (low for high-decay fields)
- Verification: Direct confirmation > cross-source match > inference
- Field stability: Low-decay fields retain confidence longer than high-decay
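One way to combine these factors is a weighted average of per-factor values. The sketch below is illustrative only: the weights, the factor values (for example 0.7 for a secondary source), and the names `Signal` and `confidence_score` are all assumptions, not calibrated figures. It also assumes that low-decay fields older than 24 months stay at the medium recency value, and it does not model any special handling for authoritative single sources.

```python
from dataclasses import dataclass

# Hypothetical factor values and weights, chosen only for illustration.
SOURCE_STRENGTH = {"primary": 1.0, "secondary": 0.7, "tertiary": 0.4}
VERIFICATION = {"direct": 1.0, "cross_source": 0.8, "inference": 0.5}
WEIGHTS = {"source": 0.30, "triangulation": 0.25,
           "recency": 0.25, "verification": 0.20}


@dataclass
class Signal:
    source_tier: str    # "primary", "secondary", or "tertiary"
    n_sources: int      # number of independent sources
    age_months: float   # months since the data point was last confirmed
    verification: str   # "direct", "cross_source", or "inference"
    high_decay: bool    # True for fields that go stale quickly


def triangulation_factor(n_sources: int) -> float:
    """3+ sources > 2 sources > 1 source (or fewer)."""
    if n_sources >= 3:
        return 1.0
    if n_sources == 2:
        return 0.8
    return 0.6


def recency_factor(age_months: float, high_decay: bool) -> float:
    """<6 months high, 6-24 months medium, >24 months low if high-decay."""
    if age_months < 6:
        return 1.0
    if age_months <= 24:
        return 0.8
    # Assumption: low-decay fields hold the medium value past 24 months.
    return 0.5 if high_decay else 0.8


def confidence_score(s: Signal) -> float:
    """Weighted average of the four factor values, in [0, 1]."""
    return (
        WEIGHTS["source"] * SOURCE_STRENGTH[s.source_tier]
        + WEIGHTS["triangulation"] * triangulation_factor(s.n_sources)
        + WEIGHTS["recency"] * recency_factor(s.age_months, s.high_decay)
        + WEIGHTS["verification"] * VERIFICATION[s.verification]
    )


strong = Signal("primary", 3, 2, "cross_source", True)
weak = Signal("tertiary", 1, 30, "inference", True)
print(round(confidence_score(strong), 3), round(confidence_score(weak), 3))
```

With these assumed weights, the triangulated, recent primary-source signal scores about 0.96, while the aged, single tertiary-source signal scores about 0.495. A real deployment would need to calibrate the weights against verified outcomes.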
Decision Framework
- Action thresholds: High confidence = act immediately; medium confidence = verify if high-stakes; low confidence = always verify before action
- Resource allocation: Invest more verification effort on critical fields (decision authority, mandates) than stable identifiers
- Confidence updates: Re-score as new evidence emerges or data ages
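The action thresholds above can be sketched as a simple rule. The function name `next_action`, the `high_stakes` flag, and the returned labels are hypothetical; the sketch assumes that medium-confidence data may be acted on directly when the decision is not high-stakes.

```python
def next_action(tier: str, high_stakes: bool) -> str:
    """Return "act" or "verify" for a confidence tier."""
    if tier == "high":
        return "act"                                # act immediately
    if tier == "medium":
        return "verify" if high_stakes else "act"   # verify only if high-stakes
    if tier == "low":
        return "verify"                             # always verify first
    raise ValueError(f"unknown tier: {tier!r}")


print(next_action("medium", high_stakes=True))
```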
Common Misconceptions
- "Confidence = certainty" → Even high confidence can be wrong; it is a probability, not a guarantee.
- "Single strong source = high confidence" → Usually medium; high confidence requires triangulation, except when the source is authoritative (for example, regulatory filings).
- "Confidence is permanent" → It decays as data ages and improves with fresh verification.
Key Takeaways
- Confidence scoring prevents acting on weak data: use thresholds (high/medium/low) to determine when verification is required
- Combine source quality, triangulation, recency, and field decay rate to calculate realistic confidence
- Update confidence scores dynamically as evidence ages, accumulates, or is contradicted