<rss version="2.0">
  <channel>
    <title>Tech on Tracking information about the Russian War against Ukraine</title>
    <link>https://benborges.xyz/categories/tech/</link>
    <description></description>
    
    <language>en</language>
    
    <lastBuildDate>Wed, 24 Dec 2025 16:13:04 +0100</lastBuildDate>
    
    <item>
      <title></title>
      <link>https://benborges.xyz/2025/12/24/osintukraine-v-prototype-is-live.html</link>
      <pubDate>Wed, 24 Dec 2025 16:13:04 +0100</pubDate>
      
      <guid>http://benb.micro.blog/2025/12/24/osintukraine-v-prototype-is-live.html</guid>
      <description>&lt;p&gt;OSINTukraine v2 prototype is live. We&amp;rsquo;re looking for feedback, bug reports, and data inconsistencies: &lt;a href=&#34;https://v2.osintukraine.com&#34;&gt;v2.osintukraine.com&lt;/a&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>OSINT Intelligence Platform - Architecture Views</title>
      <link>https://benborges.xyz/2025/12/05/osint-intelligence-platform-architecture-views.html</link>
      <pubDate>Fri, 05 Dec 2025 01:48:00 +0100</pubDate>
      
      <guid>http://benb.micro.blog/2025/12/05/osint-intelligence-platform-architecture-views.html</guid>
      <description>&lt;p&gt;Pipeline View (Data Plane)&lt;/p&gt;
&lt;p&gt;&amp;ldquo;How a Telegram message becomes actionable intelligence&amp;rdquo;&lt;/p&gt;
&lt;p&gt;This view shows the journey of data through the platform - from raw Telegram messages and RSS feeds to enriched, searchable intelligence. External sources flow through ingestion services, get queued in
Redis, processed for spam filtering and LLM classification, enriched with AI tagging, entity matching, and sanctions screening, then stored in PostgreSQL and served through a REST API to the frontend.&lt;/p&gt;
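&lt;p&gt;As a rough sketch, the stages described above read like this in Python. Every function here is a hypothetical stand-in for illustration, not the platform&amp;rsquo;s actual API:&lt;/p&gt;

```python
# Illustrative pipeline stages; every name here is a hypothetical
# stand-in for the real services, not the platform's actual API.
def is_spam(msg):
    return "casino" in msg["text"].lower()          # toy spam heuristic

def classify_with_llm(msg):
    return dict(msg, category="military", tags=["strike"])  # stub LLM call

def enrich(msg):
    msg["entities"] = []    # entity matching + sanctions screening stub
    return msg

def process(msg):
    if is_spam(msg):
        return None         # cheap filter runs before the expensive LLM
    return enrich(classify_with_llm(msg))

result = process({"text": "Explosion reported near the front line"})
```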
&lt;hr&gt;
&lt;p&gt;Infrastructure View (Control Plane)&lt;/p&gt;
&lt;p&gt;&amp;ldquo;How the platform runs in production&amp;rdquo;&lt;/p&gt;
&lt;p&gt;This view shows all 40+ services organized by operational layers: edge proxy (Caddy), authentication (Ory Kratos/Oathkeeper), web applications, backbone coordination (API + Redis queues + Router), worker
pools (ingestion, processing, enrichment), data stores (PostgreSQL, MinIO), and observability stack (Prometheus, Grafana, alerting).&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://cdn.uploads.micro.blog/110176/2025/osint-pieline.png&#34; alt=&#34;Auto-generated description: A complex flowchart with multiple interconnected colored boxes and directional arrows represents a process or system.&#34;&gt;&lt;img src=&#34;https://cdn.uploads.micro.blog/110176/2025/osint-infra.png&#34; alt=&#34;A complex flowchart with interconnected boxes and labeled sections in various colors depicts a structured process or system.&#34;&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>So I have a GPU problem....</title>
      <link>https://benborges.xyz/2025/12/03/so-i-have-a-gpu.html</link>
      <pubDate>Wed, 03 Dec 2025 18:27:57 +0100</pubDate>
      
      <guid>http://benb.micro.blog/2025/12/03/so-i-have-a-gpu.html</guid>
      <description>&lt;p&gt;This is just a rough estimate based on averages from my development setup, without a GPU:&lt;/p&gt;
&lt;p&gt;Current Stats:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;11 active channels&lt;/li&gt;
&lt;li&gt;1,389 messages over 10 days by telegram_date (Nov 23 &amp;ndash; Dec 3)&lt;/li&gt;
&lt;li&gt;Processing rate: ~130-176 msg/hour (average ~135 msg/hr)&lt;/li&gt;
&lt;li&gt;Spam rate: 7.5% filtered as spam, 92.5% kept&lt;/li&gt;
&lt;li&gt;Average: ~139 messages/day across all 11 channels = ~12.6 msg/channel/day&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Scaling to 200 channels from Feb 2022:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;February 2022 to December 2025 = ~46 months = ~1,400 days&lt;/li&gt;
&lt;li&gt;200 channels&lt;/li&gt;
&lt;li&gt;Messages per channel per day: 12.6&lt;/li&gt;
&lt;li&gt;Total messages: 200 × 12.6 × 1,400 = 3,528,000 messages (before spam filter)&lt;/li&gt;
&lt;li&gt;After 7.5% spam filter: ~3,263,400 messages need LLM processing&lt;/li&gt;
&lt;/ul&gt;
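&lt;p&gt;The volume estimate can be reproduced in a few lines of Python (a sketch using the averages from this post; ~1,400 days is the span from Feb 2022 to Dec 2025):&lt;/p&gt;

```python
# Back-of-envelope backfill volume, using the averages observed above.
CHANNELS = 200
MSGS_PER_CHANNEL_PER_DAY = 12.6   # observed across 11 channels over 10 days
DAYS = 1400                       # roughly Feb 2022 through Dec 2025
SPAM_RATE = 0.075                 # fraction dropped before LLM processing

raw = CHANNELS * MSGS_PER_CHANNEL_PER_DAY * DAYS
to_process = raw * (1 - SPAM_RATE)
print(f"raw: {raw:,.0f}, after spam filter: {to_process:,.0f}")
# raw: 3,528,000, after spam filter: 3,263,400
```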
&lt;p&gt;Processing Time Calculation:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Current rate: ~135 messages/hour (this is with 1 processor worker)&lt;/li&gt;
&lt;li&gt;Time at current rate: 3,263,400 / 135 = 24,173 hours = 1,007 days = ~2.8 years&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;But wait - the processing rate includes the LLM bottleneck of 22-26s per message.&lt;/p&gt;
&lt;p&gt;Let me recalculate with the LLM being the constraint:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;LLM takes ~24s per message (midpoint of 22-26s)&lt;/li&gt;
&lt;li&gt;1 processor can do: 3600/24 = 150 messages/hour&lt;/li&gt;
&lt;li&gt;That&amp;rsquo;s close to observed 135/hr (some overhead)&lt;/li&gt;
&lt;/ul&gt;
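&lt;p&gt;Treating the ~24 s LLM call as the unit of work, worker counts map to wall-clock time roughly as below (a sketch; 150 msg/hr is the theoretical per-worker ceiling, so real times at the observed ~135 msg/hr run about 10% longer):&lt;/p&gt;

```python
# Wall-clock estimate if the LLM call is the only cost per message.
LLM_SECONDS = 24                     # midpoint of the observed 22-26 s
TO_PROCESS = 3_263_400               # 200 ch x 12.6 msg/day x ~1,400 d x 0.925
per_worker_per_hour = 3600 / LLM_SECONDS   # 150 msg/hr theoretical ceiling

for workers in (1, 2, 4, 8, 16):
    days = TO_PROCESS / (per_worker_per_hour * workers) / 24
    print(f"{workers:2d} workers: ~{days:,.0f} days")
```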
&lt;p&gt;Scaling Options:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;With 1 processor worker: ~1,007 days&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;With 4 processor workers: 1,007/4 = ~252 days = ~8 months&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;With 8 processor workers: 1,007/8 = ~126 days = ~4 months&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;With 8 workers, the processing time drops significantly. Telegram&amp;rsquo;s rate limits add another constraint, but a small one: the getHistory API allows ~300 requests per minute at ~100 messages per request. For 200 channels over ~1,400 days that is about 35,000 requests, or roughly two hours of fetching at 300 requests per minute. Telegram retrieval is therefore fast compared to the LLM processing bottleneck.&lt;/p&gt;
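&lt;p&gt;The fetch-time figure works out as follows (a sketch; the ~300 req/min limit and ~100 messages per request are the assumptions stated above):&lt;/p&gt;

```python
# Telegram history fetch time under the assumed rate limits.
RAW_MESSAGES = 3_528_000      # pre-spam-filter total from the estimate
MSGS_PER_REQUEST = 100        # assumed messages returned per history call
REQUESTS_PER_MINUTE = 300     # assumed API rate limit

requests = RAW_MESSAGES / MSGS_PER_REQUEST          # about 35,000 calls
hours = requests / REQUESTS_PER_MINUTE / 60
print(f"{requests:,.0f} requests, about {hours:.1f} hours of fetching")
```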
&lt;hr&gt;
&lt;p&gt;📊 Backfill Scaling Calculation: 200 Channels from Feb 2022&lt;/p&gt;
&lt;p&gt;★ Insight: This calculation reveals why LLM processing is the true bottleneck. Telegram fetching is fast (~300 requests/min), but the qwen2.5:3b model running on CPU takes ~24 s per message for classification, and that constraint dominates scaling.&lt;/p&gt;
&lt;p&gt;Current Baseline (from actual data):&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Active Channels&lt;/td&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Messages (10 days)&lt;/td&gt;
&lt;td&gt;1,389&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Avg msgs/channel/day&lt;/td&gt;
&lt;td&gt;12.6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Processing Rate&lt;/td&gt;
&lt;td&gt;~135 msg/hr&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Spam Filter Rate&lt;/td&gt;
&lt;td&gt;7.5%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LLM Time per Message&lt;/td&gt;
&lt;td&gt;~24 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Scaled Scenario: 200 Channels, Feb 2022 → Dec 2025&lt;/p&gt;
&lt;p&gt;Timeline: February 2022 → December 2025 = ~46 months = ~1,400 days&lt;/p&gt;
&lt;p&gt;Total Messages Estimate:
200 channels × 12.6 msg/channel/day × 1,400 days
= 3,528,000 messages (raw)&lt;/p&gt;
&lt;p&gt;After Spam Filter (7.5% removed):
= 3,263,400 messages need LLM processing&lt;/p&gt;
&lt;p&gt;Processing Time Estimates:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;Processor Workers&lt;/th&gt;
&lt;th&gt;Time to Backfill&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Current&lt;/td&gt;
&lt;td&gt;1 worker&lt;/td&gt;
&lt;td&gt;24,173 hours = 1,007 days = ~2.8 years&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scaled&lt;/td&gt;
&lt;td&gt;2 workers&lt;/td&gt;
&lt;td&gt;~504 days = ~1.4 years&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scaled&lt;/td&gt;
&lt;td&gt;4 workers&lt;/td&gt;
&lt;td&gt;~252 days = ~8 months&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Practical&lt;/td&gt;
&lt;td&gt;8 workers&lt;/td&gt;
&lt;td&gt;~126 days = ~4 months&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Maximum&lt;/td&gt;
&lt;td&gt;16 workers&lt;/td&gt;
&lt;td&gt;~63 days = ~9 weeks&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Bottleneck Analysis:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Telegram Fetching: ~2 hours (NOT a bottleneck)&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;~35,000 API requests at 300/min&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;LLM Classification: THE BOTTLENECK&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;24 seconds × 3,263,400 messages ≈ 78.3M seconds&lt;/li&gt;
&lt;li&gt;= ~906 days with 1 worker&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Database/Media: Fast (parallel, async)&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Realistic Recommendation:&lt;/p&gt;
&lt;p&gt;With 8 processor workers (reasonable for my server):
→ ~4 months to complete the full backfill&lt;/p&gt;
&lt;p&gt;With GPU-accelerated LLM (RTX 3060 or better):
→ Could cut time by 5-10x → 2-4 weeks&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;★ Insight: Key optimizations to consider:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Scale workers: docker-compose up -d --scale processor-worker=8&lt;/li&gt;
&lt;li&gt;GPU LLM: Switch Ollama to GPU mode (if hardware available) → 5-10x speedup&lt;/li&gt;
&lt;li&gt;Skip old content: Many Feb 2022 messages may be less relevant; consider backfilling only the last 6-12 months for most channels&lt;/li&gt;
&lt;li&gt;Tiered approach: Backfill high-priority channels fully, others only recent history&lt;/li&gt;
&lt;/ol&gt;
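&lt;p&gt;Options 3 and 4 could be combined into a simple policy, sketched below; the helper name and channel names are hypothetical, not part of the platform:&lt;/p&gt;

```python
# Hypothetical tiered-backfill policy: priority channels get the full
# archive, everything else only recent history.
from datetime import datetime, timedelta

WAR_START = datetime(2022, 2, 24)

def backfill_since(channel, priority_channels, recent_days=365):
    # Full history for high-priority channels, recent-only for the rest.
    if channel in priority_channels:
        return WAR_START
    return datetime.now() - timedelta(days=recent_days)
```

&lt;p&gt;With a policy like this, only the priority tier pays the full LLM cost; for the rest, the backlog shrinks by roughly the ratio of recent_days to the full ~1,400-day window.&lt;/p&gt;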
</description>
    </item>
    
    <item>
      <title></title>
      <link>https://benborges.xyz/2025/06/11/investigation-on-telegram-hosting-infrastructure.html</link>
      <pubDate>Wed, 11 Jun 2025 18:37:25 +0100</pubDate>
      
      <guid>http://benb.micro.blog/2025/06/11/investigation-on-telegram-hosting-infrastructure.html</guid>
      <description>&lt;p&gt;Investigation on Telegram Hosting Infrastructure&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://www.occrp.org/en/investigation/telegram-the-fsb-and-the-man-in-the-middle&#34;&gt;www.occrp.org/en/invest&amp;hellip;&lt;/a&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title></title>
      <link>https://benborges.xyz/2025/03/28/to-thousands-of-american-media.html</link>
      <pubDate>Fri, 28 Mar 2025 07:10:46 +0100</pubDate>
      
      <guid>http://benb.micro.blog/2025/03/28/to-thousands-of-american-media.html</guid>
      <description>&lt;p&gt;To the thousands of American media outlets, indie YouTubers, and Substack &amp;ldquo;media&amp;rdquo;: #Signal is not vulnerable; your device is. A phishing attack on your email account can compromise your entire phone, which could then give an attacker access to your Signal account.&lt;/p&gt;
&lt;p&gt;Your phone is vulnerable. Not #Signal.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title></title>
      <link>https://benborges.xyz/2025/03/27/the-best-explainer-on-signalgate.html</link>
      <pubDate>Thu, 27 Mar 2025 20:57:40 +0100</pubDate>
      
      <guid>http://benb.micro.blog/2025/03/27/the-best-explainer-on-signalgate.html</guid>
      <description>&lt;p&gt;The best &lt;a href=&#34;https://youtu.be/YXsT3DOxBbY?si=fclDkfcnc6k-r5CS&#34;&gt;explainer&lt;/a&gt; on #Signalgate #Signalgang and its lack of consequences&lt;/p&gt;
&lt;p&gt;And to be precise: #Signal is secure. What&amp;rsquo;s insecure is their phones, and digital communication in general on insecure devices, which all our devices are by nature.&lt;/p&gt;
</description>
    </item>
    
  </channel>
</rss>