Data Overview

Primary Platform: Telegram
Content Type: Public Channels
Output Format: Structured AAT Frames
Access: Open / On Request

What's in the Dataset

The Vector dataset contains two primary layers of data:

Raw Message Data

  • Message text (original language)
  • Channel metadata (name, subscriber count, creation date)
  • Timestamps and message IDs
  • Forwarding chains and source attribution
  • View counts and engagement metrics (where available)
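As an illustration of how the forwarding information might be used, the sketch below counts how often messages move from one channel to another. It assumes each record carries the `channel` and `forwarded_from` fields shown in the Schema section, and that `forwarded_from` is null or absent for non-forwarded messages (an assumption, not something the dataset documentation states). The function name `forwarding_edges` and the sample records are hypothetical.

```python
from collections import Counter


def forwarding_edges(records):
    """Count (source, destination) channel pairs across forwarded messages."""
    edges = Counter()
    for rec in records:
        source = rec.get("forwarded_from")
        if source:  # assumption: missing or null means the message was not forwarded
            edges[(source, rec["channel"])] += 1
    return edges


# Hypothetical records, for illustration only
records = [
    {"channel": "channel_b", "forwarded_from": "channel_a"},
    {"channel": "channel_c", "forwarded_from": "channel_a"},
    {"channel": "channel_b", "forwarded_from": "channel_a"},
    {"channel": "channel_a", "forwarded_from": None},
]

edges = forwarding_edges(records)
for (source, destination), count in edges.most_common():
    print(f"{source} -> {destination}: {count}")
```

A plain `Counter` keyed by channel pairs is enough for a first look; a graph library would only be needed for multi-hop chains.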

Extracted Frames

  • Actor-action-target (AAT) triples per message
  • Entity types and normalized entity names
  • Sentiment and stance labels
  • Narrative cluster assignments
  • Confidence scores for each extraction

Schema

Each record in the processed dataset follows this structure:

{
  "message_id": "ch_12345_msg_67890",
  "channel": "example_channel",
  "timestamp": "2025-01-15T14:32:00Z",
  "text": "Original message text...",
  "language": "ru",
  "forwarded_from": "source_channel",
  "frames": [
    {
      "actor": "NATO",
      "action": "expanding",
      "target": "Eastern Europe",
      "sentiment": "negative",
      "confidence": 0.87
    }
  ],
  "narrative_cluster": "nato_expansion_threat",
  "views": 15420
}
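
A minimal sketch of reading a record in this shape and keeping only the higher-confidence frames, assuming records are delivered as JSON text. The names `Frame`, `parse_record`, and `confident_frames`, and the 0.8 cutoff, are illustrative choices and not part of the Vector project.

```python
import json
from dataclasses import dataclass


@dataclass
class Frame:
    actor: str
    action: str
    target: str
    sentiment: str
    confidence: float


def parse_record(raw: str) -> dict:
    """Parse one JSON record and convert its frames into Frame objects."""
    record = json.loads(raw)
    record["frames"] = [
        Frame(
            actor=f["actor"],
            action=f["action"],
            target=f["target"],
            sentiment=f["sentiment"],
            confidence=float(f["confidence"]),
        )
        for f in record.get("frames", [])
    ]
    return record


def confident_frames(record: dict, threshold: float = 0.8) -> list:
    """Return frames at or above `threshold` (a hypothetical cutoff)."""
    return [f for f in record["frames"] if f.confidence >= threshold]


# The example record from the schema above
SAMPLE = """
{
  "message_id": "ch_12345_msg_67890",
  "channel": "example_channel",
  "timestamp": "2025-01-15T14:32:00Z",
  "text": "Original message text...",
  "language": "ru",
  "forwarded_from": "source_channel",
  "frames": [
    {"actor": "NATO", "action": "expanding", "target": "Eastern Europe",
     "sentiment": "negative", "confidence": 0.87}
  ],
  "narrative_cluster": "nato_expansion_threat",
  "views": 15420
}
"""

record = parse_record(SAMPLE)
for frame in confident_frames(record):
    print(frame.actor, frame.action, frame.target, frame.confidence)
```

Frame fields are read by name rather than unpacked wholesale, so any additional keys a frame may carry are ignored instead of raising an error.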

Access & Licensing

The Vector dataset is available under a research-use license. Access is provided in two tiers:

Open Access

Aggregated statistics, narrative cluster summaries, and anonymized trend data are freely available through the dashboard and published reports.

Research Access

Full message-level data with extracted frames is available to verified researchers, journalists, and institutions upon request.

To request research access, reach out via the contact page with a brief description of your intended use case.

Responsible Use

  • The dataset is intended for research, journalism, and counter-disinformation purposes only.
  • Redistribution of raw data requires prior written consent.
  • Users must not use the data to target, harass, or dox individuals.
  • Publications that use the data should cite the Vector project and the dataset version used.