Overview
Score Distribution
Engagement follows a long-tailed distribution: most posts score under 50, while a tiny fraction of viral content captures the bulk of attention.

Category Breakdown
Categories extracted from confession title text — 91% of posts have a hashtag prefix like #studies, #romance, or #campus.
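A minimal sketch of how a hashtag prefix could be turned into a category label. The exact parsing rule is not given here, so the regular expression, the function names, and the "uncategorized" fallback label are assumptions made for illustration only.

```python
import re
from collections import Counter

# Assumed pattern: a hashtag at the very start of the text, e.g. "#studies ...".
HASHTAG_PREFIX = re.compile(r"^\s*#(\w+)")


def extract_category(text: str) -> str:
    """Return the lowercased leading hashtag, or "uncategorized" if there is none."""
    match = HASHTAG_PREFIX.match(text)
    return match.group(1).lower() if match else "uncategorized"


def category_counts(texts):
    """Count how many posts fall into each extracted category."""
    return Counter(extract_category(t) for t in texts)
```

Posts without a leading hashtag (the remaining 9% under this reading) would land in the fallback bucket.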

Topic Regions (ML Clusters)
Mean Shift + HDBSCAN clustering on the embeddings reveals four natural topic regions.
Embedding Landscape
UMAP projection of all 72K confessions — a continuous gradient with viral posts concentrated in specific regions.

Feature Importance: What Drives Scores
Correlation and boost of each feature against engagement score. Baseline avg score: 23.2.
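As a rough illustration of the two measures named above, the sketch below computes a Pearson correlation and a "boost" for one feature. The definition of boost is not stated here; the code assumes it means the relative change in mean score for posts where the feature is present (value > 0) compared with the baseline mean over all posts. Function names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between a feature column and the scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column has no defined correlation
    return cov / sqrt(var_x * var_y)


def boost(feature_values, scores):
    """ASSUMED definition: (mean score with feature) / (baseline mean) - 1."""
    baseline = sum(scores) / len(scores)
    with_feature = [s for f, s in zip(feature_values, scores) if f > 0]
    if not with_feature or baseline == 0:
        return 0.0
    return (sum(with_feature) / len(with_feature)) / baseline - 1.0
```

Under this assumed definition, a boost of 0.5 would mean posts with the feature average 50% above the baseline.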
Best Time to Post
Average score by time of day and day of week.
Hour of Day
Day of Week
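The hour-of-day and day-of-week averages could be computed along these lines. This is a sketch only: it assumes each post is a (timestamp, score) pair and that timestamps are already in the local time zone of interest. All names are illustrative.

```python
from collections import defaultdict
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def average_by(posts, key):
    """Average score per bucket, where key maps a datetime to its bucket."""
    totals = defaultdict(lambda: [0.0, 0])  # bucket -> [score sum, post count]
    for posted_at, score in posts:
        bucket = totals[key(posted_at)]
        bucket[0] += score
        bucket[1] += 1
    return {k: total / count for k, (total, count) in totals.items()}


def by_hour(posts):
    return average_by(posts, lambda dt: dt.hour)


def by_weekday(posts):
    # datetime.weekday() returns 0 for Monday through 6 for Sunday.
    return average_by(posts, lambda dt: DAY_NAMES[dt.weekday()])
```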
Category Viral Rates
Which categories go viral most often. Viral rate = % of posts in that category in the top 25% by score.
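The viral rate defined above can be sketched as follows, assuming posts arrive as (category, score) pairs. How ties at the cutoff are handled is not specified here; this version treats every post scoring at or above the cutoff as viral, so slightly more than 25% of posts can qualify when scores tie.

```python
from collections import defaultdict
from math import ceil


def viral_cutoff(scores, top_fraction=0.25):
    """Lowest score that still falls inside the top fraction of posts."""
    ranked = sorted(scores, reverse=True)
    k = max(1, ceil(len(ranked) * top_fraction))
    return ranked[k - 1]


def viral_rates(posts, top_fraction=0.25):
    """Percentage of posts per category that score at or above the cutoff."""
    cutoff = viral_cutoff([score for _, score in posts], top_fraction)
    totals = defaultdict(int)
    virals = defaultdict(int)
    for category, score in posts:
        totals[category] += 1
        if score >= cutoff:  # assumption: ties at the cutoff count as viral
            virals[category] += 1
    return {c: 100.0 * virals[c] / totals[c] for c in totals}
```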
Post Length vs Engagement
Longer posts perform significantly better: long posts (>80 words) average 38.6, nearly double the medium-post average.
Top 10 Most Engaged Posts
Monthly Activity

Copypasta Watch
Repeated templates and copypastas circulating in the channel. 1,373 posts (827 unique texts) are exact duplicates — 1.2% of all confessions.
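One way to arrive at counts like these is to group posts by their exact text. The sketch below assumes that a "duplicate post" is any post whose text occurs more than once (so the first occurrence is counted too) and that no normalisation such as whitespace or case folding is applied. Both are assumptions, not a statement of how the figures above were produced.

```python
from collections import Counter


def duplicate_stats(texts):
    """Return (posts whose text repeats, number of distinct repeated texts)."""
    counts = Counter(texts)
    repeated = {text: n for text, n in counts.items() if n > 1}
    duplicate_posts = sum(repeated.values())  # includes first occurrences
    return duplicate_posts, len(repeated)
```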
Methodology
Embedding: text-embedding-3-small (512d).
Reduction: UMAP (15 neighbours, min distance 0.1).
Clustering: Mean Shift (bandwidth=2.14) + HDBSCAN.
Virality: Score = reactions + 2× replies + 3× forwards. Top 25% = viral.
Feature impact: Per-post features computed from text (emoji count, curse words, relationship keywords, etc.). Correlation measured against engagement score across all 72K posts.
Source: t.me/NUSConfessIT via Telegram API. May 2026.
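The weighted score from the Virality line can be written out directly. The function and parameter names are illustrative. Only the weights (1 for reactions, 2 for replies, 3 for forwards) come from the formula stated above.

```python
def engagement_score(reactions: int, replies: int, forwards: int) -> int:
    """Score = reactions + 2 x replies + 3 x forwards."""
    return reactions + 2 * replies + 3 * forwards
```

For example, a post with 10 reactions, 3 replies, and 2 forwards would score 10 + 6 + 6 = 22.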