OSINT

Signal Confidence Scoring

Probabilistic assessment of how likely a data point is to be accurate, based on source quality, evidence strength, recency, and cross-validation.

Confidence scores enable risk-weighted decisions—high-confidence data supports immediate action; low-confidence data requires verification before outreach to avoid embarrassing errors.

Expanded Definition

Confidence scoring combines multiple factors: source authority (regulatory > news > social media), evidence triangulation (3+ independent sources > single source), recency (fresh data under 6 months old > stale data over 2 years old), verification method (direct confirmation > cross-source validation > single observation), and field decay rate (stable fields maintain confidence longer).

Scores typically fall into three tiers. High (95% or above) means multiple strong sources, recent, and cross-validated. Medium (70% to just under 95%) means a single strong source or multiple weak sources, moderately recent. Low (below 70%) means unverified, single weak source, or aged data. Scores should update dynamically as evidence accumulates or decays.
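
The tier boundaries above can be sketched as a small lookup. This is a minimal illustration, assuming scores are expressed as fractions between 0 and 1 and that a score of exactly 95% counts as High; the function name confidence_tier is hypothetical.

```python
def confidence_tier(score: float) -> str:
    """Map a confidence score in [0.0, 1.0] to a tier label."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0 and 1, got {score}")
    if score >= 0.95:   # High: 95% or above
        return "high"
    if score >= 0.70:   # Medium: 70% up to (but not including) 95%
        return "medium"
    return "low"        # Low: below 70%


for score in (0.97, 0.80, 0.40):
    print(score, confidence_tier(score))
```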

Signals & Evidence

Confidence scoring factors:

  • Source strength: Primary (regulatory, legal) > secondary (news, databases) > tertiary (social, websites)
  • Triangulation: 3+ independent sources > 2 sources > 1 source
  • Recency: <6 months (high), 6-24 months (medium), >24 months (low for high-decay fields)
  • Verification: Direct confirmation > cross-source match > inference
  • Field stability: Low-decay fields retain confidence longer than high-decay
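
One way to combine these factors is a weighted average, sketched below. The factor values, the weights, and all names (Signal, confidence_score, and so on) are illustrative assumptions rather than calibrated figures; only the orderings (primary > secondary > tertiary, 3+ sources > 2 > 1, and so on) come from the list above.

```python
from dataclasses import dataclass

# Assumed factor values and weights: placeholders that respect the
# orderings in the list above, not calibrated numbers.
SOURCE_STRENGTH = {"primary": 1.0, "secondary": 0.7, "tertiary": 0.4}
VERIFICATION = {"direct": 1.0, "cross_source": 0.8, "inference": 0.5}
WEIGHTS = {"source": 0.35, "triangulation": 0.25,
           "recency": 0.20, "verification": 0.20}


@dataclass
class Signal:
    source_type: str          # "primary", "secondary", or "tertiary"
    independent_sources: int  # number of independent sources agreeing
    age_months: float         # time since the data point was last observed
    verification: str         # "direct", "cross_source", or "inference"
    high_decay_field: bool    # True for fields that go stale quickly


def triangulation_factor(n: int) -> float:
    """3+ independent sources > 2 sources > 1 source."""
    if n >= 3:
        return 1.0
    if n == 2:
        return 0.8
    return 0.6


def recency_factor(age_months: float, high_decay_field: bool) -> float:
    """<6 months high, 6-24 months medium, >24 months low for high-decay fields."""
    if age_months < 6:
        return 1.0
    if age_months <= 24:
        return 0.8
    # Low-decay fields retain confidence longer, so they stay at the medium value.
    return 0.5 if high_decay_field else 0.8


def confidence_score(signal: Signal) -> float:
    """Weighted average of the four factors; result lies in [0, 1]."""
    factors = {
        "source": SOURCE_STRENGTH[signal.source_type],
        "triangulation": triangulation_factor(signal.independent_sources),
        "recency": recency_factor(signal.age_months, signal.high_decay_field),
        "verification": VERIFICATION[signal.verification],
    }
    return sum(WEIGHTS[name] * value for name, value in factors.items())


strong = Signal("primary", 3, 3, "direct", True)
weak = Signal("tertiary", 1, 36, "inference", True)
print(round(confidence_score(strong), 2), round(confidence_score(weak), 2))
```

A weighted average is only one possible design. A multiplicative model, where any single weak factor drags the whole score down, would be a stricter alternative.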

Decision Framework

  • Action thresholds: High confidence = act immediately; medium confidence = verify if high-stakes; low confidence = always verify before action
  • Resource allocation: Invest more verification effort on critical fields (decision authority, mandates) than stable identifiers
  • Confidence updates: Re-score as new evidence emerges or data ages
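
The action thresholds above can be expressed as a simple policy function. This is a sketch under one assumption not stated in the list: medium-confidence data that is not high-stakes may be acted on without verification. The name next_step and the "act"/"verify" labels are hypothetical.

```python
def next_step(tier: str, high_stakes: bool) -> str:
    """Return "act" or "verify" for a given confidence tier."""
    if tier == "high":
        return "act"  # high confidence: act immediately
    if tier == "medium":
        # medium confidence: verify only when the decision is high-stakes
        return "verify" if high_stakes else "act"
    if tier == "low":
        return "verify"  # low confidence: always verify before action
    raise ValueError(f"unknown tier: {tier!r}")


for tier in ("high", "medium", "low"):
    print(tier, next_step(tier, high_stakes=True), next_step(tier, high_stakes=False))
```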

Common Misconceptions

  • "Confidence = certainty" → Even high-confidence data can be wrong; a score is a probability, not a guarantee.
  • "Single strong source = high confidence" → Usually medium; high confidence requires triangulation, except for authoritative sources such as regulatory filings.
  • "Confidence is permanent" → It decays as data ages and improves with fresh verification.

Key Takeaways

  • Confidence scoring prevents acting on weak data—use thresholds (high/medium/low) to determine when verification is required
  • Combine source quality, triangulation, recency, and field decay rate to calculate realistic confidence
  • Update confidence scores dynamically as evidence ages, accumulates, or is contradicted
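
Dynamic updating can be modeled in many ways. The sketch below assumes exponential decay with a per-field half-life, which is one common choice and not something this entry prescribes. The function name and the half-life values are hypothetical placeholders.

```python
def decayed_confidence(initial: float, age_months: float,
                       half_life_months: float) -> float:
    """Halve the confidence every half_life_months since the last verification."""
    if half_life_months <= 0:
        raise ValueError("half_life_months must be positive")
    if age_months < 0:
        raise ValueError("age_months must be non-negative")
    return initial * 0.5 ** (age_months / half_life_months)


# Hypothetical half-lives: a high-decay field versus a stable identifier.
HALF_LIFE_MONTHS = {"job_title": 12.0, "registration_number": 120.0}

for field, half_life in HALF_LIFE_MONTHS.items():
    print(field, round(decayed_confidence(0.96, 24, half_life), 3))
```

Fresh verification corresponds to resetting age_months to zero, which restores the initial score. Accumulating or contradicting evidence would require re-scoring the data point rather than only aging it.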