A Day of Hardening, Data, and Machine Learning Across Four Projects

Today was one of those marathon sessions where you look up and realize you’ve touched every corner of your infrastructure. Four projects, dozens of files, and a security posture that went from “probably fine” to “genuinely hardened.” Here’s the rundown.

The Security Sweep: FilmFestKit & ReelShorts

The biggest chunk of work went into a comprehensive security audit across our two production web platforms: FilmFestKit (a film festival management SaaS) and ReelShorts (a short film streaming platform).

What We Found & Fixed

Command Injection in ReelShorts: This was the scariest one. The video processing pipeline (transcoding, watermarking, thumbnail generation) was using shell command execution with string-concatenated FFmpeg commands. A crafted filename could have broken out of the command. We replaced every instance with execFile and array-style arguments, plus added path validation that blocks directory traversal and null bytes.

SQL Injection in ReelShorts: Because the platform uses raw pg queries instead of an ORM, there were spots where user input could sneak into query strings. Every query now uses parameterized placeholders ($1, $2), and dynamic fields like sort columns are validated against strict whitelists.

CSRF Protection on Both Platforms: Swapped in the modern csrf-csrf library (the old csurf is deprecated) with a double-submit cookie pattern. Cookies use the __Host- prefix, SameSite=strict, and httpOnly. Both platforms now have dedicated /api/csrf-token endpoints.

JWT Overhaul: On ReelShorts, access and refresh tokens were using the same secret. We separated them into distinct secrets and added token-type verification. Both platforms now blacklist tokens on logout via Redis with matching TTLs, so a stolen token can’t be replayed after the user logs out.

Account Lockout: ReelShorts now locks accounts after 5 failed login attempts for 30 minutes, with attempt tracking stored in Redis (and an in-memory fallback if Redis goes down).
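To make the account lockout rule concrete, here is a minimal sketch of the in-memory fallback logic. ReelShorts itself is a Node service, so this Python version only illustrates the idea and is not the production code. The class name LockoutTracker and its methods are hypothetical. The post does not say how long failed attempts are remembered, so this sketch assumes the counter resets only on a successful login or when a lock expires.

```python
import time

MAX_ATTEMPTS = 5            # from the post: lock after 5 failed logins
LOCKOUT_SECONDS = 30 * 60   # from the post: lock for 30 minutes


class LockoutTracker:
    """Hypothetical in-memory stand-in for the Redis-backed attempt tracking."""

    def __init__(self):
        self._failures = {}      # account id -> consecutive failed attempts
        self._locked_until = {}  # account id -> timestamp when the lock ends

    def is_locked(self, account, now=None):
        now = time.time() if now is None else now
        until = self._locked_until.get(account)
        if until is None:
            return False
        if now >= until:
            # The lock has expired: forget it and start counting from zero.
            del self._locked_until[account]
            self._failures.pop(account, None)
            return False
        return True

    def record_failure(self, account, now=None):
        """Count one failed login; return True if the account is now locked."""
        now = time.time() if now is None else now
        if self.is_locked(account, now):
            return True
        count = self._failures.get(account, 0) + 1
        self._failures[account] = count
        if count >= MAX_ATTEMPTS:
            self._locked_until[account] = now + LOCKOUT_SECONDS
            return True
        return False

    def record_success(self, account):
        # Assumption: a successful login resets the failure counter.
        self._failures.pop(account, None)
```

A login handler would call is_locked before checking the password, then record_failure or record_success depending on the result. With Redis available, the same counters would more likely live in keys with a TTL so that expiry is handled by the store.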
WebSocket Auth: The Socket.io layer on ReelShorts was wide open. Now it requires JWT verification on connection and validates that users can only join rooms matching their own userId. Chat messages get sanitized to strip HTML characters.

The Rest of the Checklist:

- Trust proxy locked to 1 hop (nginx only) to prevent IP spoofing
- Privilege escalation blocked on FilmFestKit by stripping platform_admin and access_level from tenant settings updates
- XSS fixed in FilmFestKit email templates with proper HTML escaping and ReDoS-safe regex
- Environment secrets validated at startup (32-char minimum for JWT/CSRF secrets) and redacted in all logs
- A new logSanitizer utility on ReelShorts that scrubs passwords, bearer tokens, credit card numbers, and SSNs from log output
- Stack traces hidden in production responses, correlated via request IDs for debugging
- Multi-tier rate limiting: stricter on auth endpoints, separate limits for read vs. write APIs
- Helmet security headers with a proper Content Security Policy, HSTS with preload, and frame-src 'none'

WTracker (DramValue): Whisky Market Intelligence

Completely different vibe here. WTracker is a whisky secondary market price tracking platform built on FastAPI + SQLAlchemy, and today was about filling it with data and building out the frontend.

New Features

- Market Overview Dashboard: A new analytics page with Chart.js visualizations showing monthly trading volume, average winning bids, and lot counts across auction houses. Built a new MarketStat SQLAlchemy model to store monthly aggregates per auction house.
- Brands Browser: Paginated, filterable page for browsing whisky brands by category (Scotch, Bourbon, Irish, Japanese, etc.) with search.
- Professional Dark Theme: Full Tailwind CSS redesign with gold/amber accents, responsive nav, toast notifications, and mobile support.
Data Imports

Pulled in three Kaggle datasets:

- 562 cask auction records with inflation-adjusted hammer prices and distillery metadata
- 1,157 distilleries and 4,880 brands from a world whisky dataset, with country-to-category normalization
- 27 auction houses of market data (2005–2023) with monthly trading statistics

All imports handle deduplication and normalize categories intelligently (e.g., Islay to Scotch, Ireland to Irish).

FalseFestTrack: Scam Festival Detection Gets ML

This one might be the most interesting. FalseFestTrack helps filmmakers identify fraudulent film festivals, and today we leveled it up significantly.

Machine Learning Model

Trained Random Forest and XGBoost classifiers on 1,305+ festivals to predict legitimacy tiers (A/B/C/Warning). The models use 40+ engineered features including:

- Accreditation signals (FIAPF, Oscar-qualifying, BAFTA, Film Fest Alliance)
- Name pattern analysis (detecting “chain festival” naming schemes, prestigious city exploitation)
- Category inflation (50+ categories is a red flag)
- Fee analysis (high entry fees, escalation ratios)
- Submission window length (9+ months = suspicious)

The trained model is exported to festival_predictor.pkl for production integration.
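As a rough illustration of what a few of those engineered features could look like, here is a minimal sketch. It covers only a handful of the 40+ features and is not the actual FalseFestTrack pipeline. The Festival record, its field names, and the extract_features function are hypothetical. The 50-category and 9-month thresholds come from the list above; treating 9 months as 270 days and defining the escalation ratio as final fee divided by earliest fee are assumptions.

```python
from dataclasses import dataclass
from datetime import date

# Accreditation signals named in the post.
ACCREDITATIONS = ("FIAPF", "Oscar-qualifying", "BAFTA", "Film Fest Alliance")


@dataclass
class Festival:
    """Hypothetical input record; the real schema is not shown in the post."""
    name: str
    category_count: int
    earliest_fee: float        # entry fee at the first deadline
    latest_fee: float          # entry fee at the final deadline
    submissions_open: date
    submissions_close: date
    accreditations: frozenset = frozenset()


def extract_features(f: Festival) -> dict:
    """Turn one festival record into a flat dict of numeric features."""
    window_days = (f.submissions_close - f.submissions_open).days
    features = {
        "category_count": f.category_count,
        # 50+ categories is a red flag (threshold from the post).
        "category_inflation": int(f.category_count >= 50),
        "submission_window_days": window_days,
        # 9+ months is suspicious; 270 days is an approximation of 9 months.
        "long_window": int(window_days >= 270),
        # Assumed definition: how much the fee grows between deadlines.
        "fee_escalation_ratio": (
            f.latest_fee / f.earliest_fee if f.earliest_fee > 0 else 0.0
        ),
    }
    for acc in ACCREDITATIONS:
        key = "has_" + acc.lower().replace("-", "_").replace(" ", "_")
        features[key] = int(acc in f.accreditations)
    return features
```

Rows built this way can be stacked into a feature matrix for a Random Forest or XGBoost classifier. Keeping both the raw value (category_count) and the thresholded flag lets a tree-based model find its own cut points while still seeing the hand-picked ones.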
Enhanced Scoring Algorithm

Updated the hand-crafted scoring with new red flags discovered through Reddit research:

- Monthly festival pattern (-20 pts): classic award mill behavior
- Guaranteed selection (-25 pts): real festivals are competitive
- Post-submission upsells (-15 pts): pay-for-laurels schemes
- Chain festival naming (-15 pts): “London International Filmmaker Festival” pattern
- Film Fest Alliance membership (+10 pts): new positive signal

New Data Sources

Integrated three external datasets:

- 1,022 US festivals from a curated Google Sheet with Oscar-qualifying status
- 4,194 festivals from the Zenodo Film Circulation research library with academic metadata
- 133 major market festivals from the Cinando network dataset

All imports use slug-based deduplication and enrich existing records without overwriting.

The Takeaway

Four very different projects, one consistent theme: don’t ship what you wouldn’t trust with your own data. The security fixes on FilmFestKit and ReelShorts were overdue. Command injection and SQL injection aren’t theoretical risks; they’re the kind of thing that ends up in a breach notification. Meanwhile, WTracker and FalseFestTrack are getting the data foundations they need to actually be useful tools. Not a bad day’s work.
