Telegram Chat Analytics Bot

A Telegram bot that collects group chat statistics, builds analytics via API, and provides a web interface using Next.js.

📦 Replication and Launch

The bot requires administrator privileges in the group to read messages reliably. Without admin rights it may miss messages, leaving the analytics incomplete.

  1. Clone and Prepare Environment
    git clone <repo_url>
    cd tbt
    cp .env.example .env   # see below for what to fill
  2. Configure Variables (substitute your own API keys; at least one LLM key is required):
    TELEGRAM_BOT_TOKEN=123:ABC             # your bot token
    GEMINI_API_KEY=AIza...                 # key from Google AI Studio (optional)
    GROQ_API_KEY=grq_...                   # optional, Groq LLM
    OPENROUTER_API_KEY=or-...              # optional, OpenRouter
    OPENROUTER_MODEL=meta-llama/llama-3.1-70b-instruct:free
    OPENROUTER_REFERRER=https://github.com/yourname/telegram-analytics
    OPENROUTER_APP_NAME=tbt-analyzer
    MISTRAL_API_KEY=sk-...                 # optional, Mistral
    DEEPSEEK_API_KEY=sk-...                # optional, DeepSeek
    QWEN_API_KEY=sk-...                    # optional, Alibaba Qwen (DashScope)
    QWEN_API_BASE=https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions
    LLAMA_API_KEY=sk-...                   # optional, Llama API
    LLAMA_API_BASE=https://api.together.xyz/v1/chat/completions
    GROQ_MODEL=llama-3.1-70b-versatile     # models can be overridden
    MISTRAL_MODEL=mistral-small-latest
    DEEPSEEK_MODEL=deepseek-chat
    QWEN_MODEL=qwen-max
    LLAMA_MODEL=meta-llama/Llama-3-70b-chat-hf
    DB_USER=postgres
    DB_PASSWORD=postgres
    DB_NAME=chat_analytics
    STATS_CACHE_TTL=1200                   # (optional) stats cache in seconds
  3. Launch Entire Infrastructure
    docker-compose up --build
    • the bot starts polling and works with any valid token; just add it to a group chat.
    • the web UI is available at http://localhost:3000.
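The required/optional split above can be checked programmatically before launch. Below is a hypothetical helper (not part of the repo) whose names mirror the .env keys listed in step 2:

```typescript
// Hypothetical startup check: TELEGRAM_BOT_TOKEN is mandatory,
// plus at least one LLM provider key so /analyze can work.
const LLM_KEYS = [
  "GEMINI_API_KEY", "GROQ_API_KEY", "OPENROUTER_API_KEY",
  "MISTRAL_API_KEY", "DEEPSEEK_API_KEY", "QWEN_API_KEY", "LLAMA_API_KEY",
];

export function validateEnv(env: Record<string, string | undefined>): string[] {
  const problems: string[] = [];
  if (!env.TELEGRAM_BOT_TOKEN) {
    problems.push("TELEGRAM_BOT_TOKEN is required");
  }
  if (!LLM_KEYS.some((key) => env[key])) {
    problems.push(`set at least one of: ${LLM_KEYS.join(", ")}`);
  }
  return problems; // empty array means the config is usable
}
```

Calling `validateEnv(process.env)` on startup and logging the returned problems fails fast instead of surfacing missing keys only when a command is used.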

🛠️ Manual Launch for Development

# bot service
cd bot
npm install
npm run dev      # works with local PostgreSQL/Redis (or via docker-compose)

# web interface
cd ../web
npm install
npm run dev      # Next.js dev server

✨ Functionality

| Feature | What It Does | Why | How to Use |
| --- | --- | --- | --- |
| `/stats` with inline buttons | Shows the top-10 authors, overall stats, and a "Top 10 / User stats" toggle with time filters (today, week, month, all time). | Quickly gauge chat activity and drill into a specific user right in Telegram. | Call `/stats` in the chat; switch ranges and modes with the buttons. For a specific person, use the buttons or `/userstats @username [today\|week\|month\|all]`. |
| `/analyze @username` or reply | Takes the user's last 100 messages, sends them to Gemini, and returns a structured result (style, tone, topics, activity, avg length, habits, frequent words). | Instantly get a "communication portrait" of a user. | `/analyze @nickname` or reply to the user's message. |
| `/top_words` (custom feature) | Generates the top-20 words across the entire chat (Postgres `ts_stat`). | Spot what people talk about most; quickly surface trending topics. | `/top_words` in any chat with the bot. |
| Next.js web page | A single page with a `@username` field and an "Analyze" button; the result is shown as a card on the same page. | The same analytics without Telegram, convenient for people without chat access. | `npm run dev` in `web/`, open http://localhost:3000, enter a nickname. |
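For reference, the `ts_stat` idea behind `/top_words` can be sketched as follows. The table and column names (`messages`, `text`) are assumptions based on the schema description later in this README, and the formatting helper is illustrative, not the repo's actual code:

```typescript
// ts_stat aggregates lexemes over the tsvector produced by the inner query;
// nentry is the total number of occurrences of each word.
export const TOP_WORDS_SQL = `
  SELECT word, nentry
  FROM ts_stat($$ SELECT to_tsvector('simple', text) FROM messages $$)
  ORDER BY nentry DESC
  LIMIT 20
`;

// Turn pg rows into a numbered reply the bot could send back to the chat.
export function formatTopWords(rows: { word: string; nentry: number }[]): string {
  return rows
    .map((row, i) => `${i + 1}. ${row.word} (${row.nentry})`)
    .join("\n");
}
```

The `'simple'` text-search configuration avoids language-specific stemming, which keeps the output readable for mixed-language chats.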

🧠 LLM Logic and Fallback

  • Primary provider: Gemini (default gemini-2.0-flash).
  • If the Gemini key is missing or a request fails, the bot falls back sequentially: Groq → OpenRouter → Mistral → DeepSeek → Qwen → Llama. Setting up at least one key is enough.
  • All requests go through a shared rate limiter: at most 4 requests per minute for the entire bot/web stack, to avoid burning quotas.
  • Custom model per provider via *_MODEL vars and optionally *_API_BASE.
  • The bot always takes the user's last ~100 messages and requests strict JSON against a schema for predictable output. If a provider returns plain text instead of JSON, the bot shows the raw response rather than an error.
  • DeepSeek and some alternatives may require a positive balance. If the API returns Insufficient Balance, the bot tries the next provider.
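The sequential fallback above boils down to a loop over providers that swallows per-provider failures. A minimal sketch, with an illustrative `Provider` shape rather than the repo's real interfaces:

```typescript
// Each provider exposes a name and an async call; the concrete signature
// here is an assumption for illustration.
type Provider = { name: string; call: (prompt: string) => Promise<string> };

export async function analyzeWithFallback(
  providers: Provider[],
  prompt: string,
): Promise<{ provider: string; result: string }> {
  let lastError: unknown;
  for (const p of providers) {
    try {
      // Any failure (missing key, Insufficient Balance, network error)
      // moves on to the next provider in the chain.
      return { provider: p.name, result: await p.call(prompt) };
    } catch (err) {
      lastError = err;
    }
  }
  throw new Error(`all providers failed, last error: ${String(lastError)}`);
}
```

Because only the final failure throws, one misconfigured provider never blocks `/analyze` as long as a later provider in the chain succeeds.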

🌐 Web Interface

  • Uses Next.js App Router.
  • / page has form with validation, loading state, error block, and result card.
  • API route /api/analyze mirrors bot server logic: finds user, fetches messages, calls Gemini, returns JSON { summary, formatted }.
  • Styling without Tailwind; uses next/font (Space Grotesk), mobile-responsive.
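The form validation mentioned above could look roughly like this; the helper name and the exact rules are assumptions, based on Telegram's documented username format, not the page's actual code:

```typescript
// Hypothetical client-side check for the @username field.
// Telegram usernames are 5-32 characters: letters, digits, underscores.
export function normalizeUsername(input: string): string | null {
  const trimmed = input.trim().replace(/^@/, "");
  return /^[A-Za-z0-9_]{5,32}$/.test(trimmed) ? trimmed : null;
}
```

Returning the normalized name (or `null`) lets the page both gate the "Analyze" button and send a clean value to `/api/analyze`.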

📊 Stats Implementation in Bot

  • All data is stored in PostgreSQL (users and messages tables). On each message, the bot upserts the user and saves the text.
  • Redis caches heavy queries: top-10, totals, and per-range user stats. TTL is set via STATS_CACHE_TTL (default 20 minutes).
  • /stats opens inline menu:
    • First row: range switch.
    • Second: mode "Top 10" or "User stats".
    • Below: buttons with nicknames (top by activity) for quick user switching.
  • /userstats for direct user stats search (@username or reply), same ranges.
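The Redis caching described above follows the classic cache-aside pattern. A sketch with a narrowed cache interface (an assumption, so it can be backed by redis in production or an in-memory map in tests):

```typescript
// Minimal key-value cache surface; a redis client can be adapted to it.
interface KVCache {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

// Return the cached value if present; otherwise run the heavy query,
// store its JSON-serialized result with a TTL, and return it.
export async function cachedQuery<T>(
  cache: KVCache,
  key: string,
  ttlSeconds: number,
  query: () => Promise<T>,
): Promise<T> {
  const hit = await cache.get(key);
  if (hit !== null) return JSON.parse(hit) as T;
  const fresh = await query();
  await cache.set(key, JSON.stringify(fresh), ttlSeconds);
  return fresh;
}
```

Keying by mode and range (e.g. `stats:top10:week`) means each inline-button press after the first is served from Redis until the TTL expires.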

🧪 Tests

  • Vitest + DB mocks.
  • Run: cd bot && npm test (the bot service's npm test covers the required scenarios).
  • Coverage:
    1. Message reordering in getLastMessagesByUser.
    2. getTopUsers query building with day filters.
    3. Totals aggregation and defaults in getTotalStats.
    4. getUserStats parsing (avg length, dates).
    5. Arg parsing helpers (extractArgs, findRangeArg, findUsernameArg) for commands.
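The helpers named in point 5 can be imagined roughly as follows; these are simplified re-implementations for illustration, and the real versions in bot/src/stats.ts may differ in detail:

```typescript
// Supported time ranges, matching the /stats inline buttons.
const RANGES = ["today", "week", "month", "all"] as const;
export type Range = (typeof RANGES)[number];

// "/userstats @bob week" -> ["@bob", "week"]
export function extractArgs(text: string): string[] {
  return text.trim().split(/\s+/).slice(1);
}

// First argument that is a known range keyword, if any.
export function findRangeArg(args: string[]): Range | undefined {
  return args.find((a): a is Range => (RANGES as readonly string[]).includes(a));
}

// First @-prefixed argument, returned without the leading "@".
export function findUsernameArg(args: string[]): string | undefined {
  return args.find((a) => a.startsWith("@"))?.slice(1);
}
```

Pure string-in/value-out helpers like these are what make the Vitest suite possible without a real database or Telegram connection.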

🤖 AI Usage

| Where AI Helped | What It Provided | Manual Tweaks |
| --- | --- | --- |
| Starter Dockerfile/compose | Quick 4-service environment | Env linking, dependencies, and health checks done manually |
| SQL stats queries | Top-10 and totals sketches | Final queries optimized; indexes and cache added |
| Next.js API / Gemini | Basic prompt examples | Added JSON schema, error handling, formatting |
| Vitest mocks | `pool.query` mock template | Tests and scenarios written manually |

Context-aware code assistance via Gemini, Codex, Windsurf, Claude, Perplexity.

AI saved roughly 24 hours overall; the code was then rewritten to match the task requirements.

🧱 Architecture and Decisions

  1. Monorepo: bot/ and web/ side by side; docker-compose brings up Postgres + Redis + both services. This simplifies local dev and matches the task.
  2. Raw SQL + pg: Enables precise queries and ts_stat for custom feature.
  3. Redis cache: frequent queries (tops/totals) are cached and reused, avoiding a DB hit on every inline-button press.
  4. Separate stats.ts: Extracted keyboard/arg parsing funcs for isolated testing, keeps index.ts lean.
  5. LLM schema-first: Prompt always demands valid JSON for structured, predictable UI.
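The schema-first decision pairs naturally with the "raw response instead of error" behavior from the LLM section. An illustrative sketch of that parse-or-fallback step (not the repo's actual code):

```typescript
// The prompt demands strict JSON, but models sometimes reply with prose
// or wrap the JSON in markdown fences; keep the raw text as a fallback.
export type AnalysisResult =
  | { kind: "structured"; data: Record<string, unknown> }
  | { kind: "raw"; text: string };

export function parseLlmReply(reply: string): AnalysisResult {
  // Strip markdown code-fence markers (``` or ```json) before parsing.
  const cleaned = reply.replace(/`{3}(json)?/g, "").trim();
  try {
    return { kind: "structured", data: JSON.parse(cleaned) };
  } catch {
    return { kind: "raw", text: reply };
  }
}
```

A tagged union like this lets the UI render a proper card for structured results and a plain text block for raw ones, with no error path in between.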

📂 Repository Structure

/tbt
├── docker-compose.yml         # postgres + redis + bot + web
├── schema.sql                 # migration
├── task.txt                   # original task
├── bot/
│   ├── src/
│   │   ├── index.ts           # Telegraf bot
│   │   ├── stats.ts           # inline keyboards + helpers
│   │   ├── gemini.ts          # Gemini API integration
│   │   ├── models/            # raw SQL
│   │   └── redis.ts / db.ts
│   └── tests/                 # Vitest
└── web/
    ├── app/page.tsx           # single analysis form page
    ├── app/api/analyze/       # Next.js API Route
    └── src/lib/               # Gemini + pg clients

📣 How to Verify

  1. docker-compose up --build — make sure Postgres/Redis are up and the bot logs Bot started.
  2. Add bot to group chat, chat, test /stats, /userstats, /top_words, /analyze.
  3. cd web && npm run dev — open http://localhost:3000, enter @username, compare with Telegram analytics.
  4. cd bot && npm test — unit tests without real DB.

📣 Contacts

n@ndaotec.com
