Software & AI | February 3, 2026

Redis in Our Stack: Caching, Sessions, and Real-Time Queues That Make the Difference

When CRM81 started growing and LetsAI needed to handle hundreds of parallel AI jobs, we needed something faster than the database. Redis became the invisible glue of our stack: it caches the most requested data, stores user sessions, drives job queues, and powers real-time notifications. Here's how we actually use it.


Smart Caching: From 800ms to 12ms on the Most Frequent Queries

CRM81 has a dashboard showing aggregated statistics: number of open deals, total pipeline value, recent activities, team performance. Every time a user opens the dashboard, the system needs to run 6-8 PostgreSQL queries with GROUP BY, JOIN, and subqueries. The result? 800ms of waiting before seeing the numbers. Multiplied by 50 users opening the dashboard every morning at 9:00 AM, the database was on its knees.

The solution was simple: Redis as a cache layer. When a user loads the dashboard, the backend first checks whether the aggregated data is cached (60-second TTL). If yes, it responds in 12ms. If not, it runs the queries, saves the result in Redis, and responds. Writes (new deal, contact update) surgically invalidate only the affected cache keys.

The pattern we use is cache-aside with event-driven invalidation. We don't invalidate the entire cache on every write, only the specific keys: if you update a deal value, we invalidate the total pipeline key, not the recent activities one. This gives us a 94% hit rate on the dashboard, meaning 94 out of 100 requests never touch PostgreSQL.

We also implemented a cache warming system: a job pre-populates the cache every morning at 8:55 AM, five minutes before the access peak. Result: even the first user of the day sees the dashboard in 12ms.
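The cache-aside read path and the surgical invalidation can be sketched like this. To keep the example self-contained, a plain Map stands in for Redis (in production these would be GET, SETEX, and DEL calls on a Redis client); the key and function names are illustrative, not our actual code.

```javascript
// Cache-aside sketch. A plain Map stands in for Redis so the example is
// self-contained; in production these would be GET / SETEX / DEL calls on
// a Redis client. Key and function names are illustrative.
const cache = new Map(); // key -> { value, expiresAt }

function cacheGet(key) {
  const entry = cache.get(key);
  if (!entry || entry.expiresAt < Date.now()) return null; // miss or expired
  return entry.value;
}

function cacheSet(key, value, ttlSeconds) {
  cache.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000 });
}

// Read path: serve from cache if present, otherwise compute and store.
async function getCachedStat(key, ttlSeconds, compute) {
  const hit = cacheGet(key);
  if (hit !== null) return hit;  // the fast path: PostgreSQL is never touched
  const fresh = await compute(); // the slow path: run the aggregate queries
  cacheSet(key, fresh, ttlSeconds);
  return fresh;
}

// Event-driven, surgical invalidation: updating a deal drops only the
// pipeline total for that team; recent activities stay cached.
function onDealUpdated(teamId) {
  cache.delete(`dashboard:pipeline:${teamId}`);
}
```

Note how the TTL and the invalidation back each other up: even if an invalidation event is ever missed, the 60-second expiry bounds how stale a dashboard number can get.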

Pub/Sub and Job Queues: The Real-Time Engine of LetsAI

LetsAI is our platform that lets companies create custom AI workflows. A user can launch a job that analyzes 500 documents, generates reports, and sends notifications, all in the background. The problem: how do you manage hundreds of concurrent jobs without losing any, and how do you notify users of progress in real time? Redis solves both.

For job queues, we use Redis Lists with the reliable-queue pattern built on BRPOPLPUSH (deprecated since Redis 6.2 in favor of BLMOVE). Node.js workers pull jobs from the queue, process them, and move them to a "completed" queue. If a worker crashes mid-processing, the job stays in the processing queue and gets reassigned after a timeout. Zero lost jobs, even under heavy load.

For real-time notifications, we use Redis Pub/Sub. When a job changes state (in_progress, completed, failed), the worker publishes a message on a Redis channel. The Node.js server is subscribed to that channel and forwards the update to the user's browser via WebSocket. End-to-end latency is under 50ms: the user sees the progress bar update almost instantly.

We evaluated RabbitMQ and Apache Kafka as alternatives. RabbitMQ is more robust for mission-critical queues, but adds a separate service to manage. Kafka is overkill for our volumes. Redis was already in our stack for caching, so using it for queues too meant zero additional infrastructure.
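The reliable-queue flow can be sketched as follows. Plain arrays stand in for the three Redis lists so the example runs standalone; the Redis commands each step corresponds to (LPUSH, BLMOVE, LREM) are noted in comments, and all names are illustrative rather than our production code.

```javascript
// Reliable-queue sketch mirroring the BLMOVE pattern. Plain arrays stand in
// for Redis lists; the equivalent Redis command is noted at each step.
const pending = [];    // jobs waiting         ("jobs:pending" list)
const processing = []; // jobs a worker holds  ("jobs:processing" list)
const completed = [];  // finished jobs        ("jobs:completed" list)

function enqueue(job) {
  pending.unshift(job); // LPUSH jobs:pending
}

function takeJob(now = Date.now()) {
  // BLMOVE jobs:pending jobs:processing RIGHT LEFT: in Redis this move is
  // atomic, so a crashing worker never loses the job it holds.
  const job = pending.pop();
  if (job) {
    job.startedAt = now;
    processing.unshift(job);
  }
  return job || null;
}

async function runWorker(handler) {
  let job;
  while ((job = takeJob())) {
    await handler(job);
    processing.splice(processing.indexOf(job), 1); // LREM jobs:processing
    completed.unshift(job);                        // LPUSH jobs:completed
  }
}

// Recovery: anything stuck in "processing" past the timeout is requeued,
// which is what reassigns the jobs of a crashed worker.
function requeueStale(timeoutMs, now = Date.now()) {
  for (let i = processing.length - 1; i >= 0; i--) {
    if (now - processing[i].startedAt > timeoutMs) {
      pending.unshift(processing.splice(i, 1)[0]);
    }
  }
}
```

The key design point is that a job is never in zero lists: it moves from pending to processing to completed, so a crash at any step leaves it recoverable.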

Sessions and Rate Limiting: Redis as the API Guardian

CRM81 and LetsAI user sessions are managed entirely in Redis. When a user logs in, we create a Redis hash with the session data (user_id, role, permissions, login timestamp) and set an 8-hour TTL. The session ID goes in an HTTP-only cookie. On every request, the Node.js middleware checks the session in Redis in under 1ms, much faster than a database query.

The architectural advantage is enormous: if we scale horizontally with multiple Node.js instances behind a load balancer, sessions work automatically because they're centralized in Redis, not held in local memory. A user can be served by any instance without sticky-session issues.

We also implemented rate limiting with Redis to protect our APIs. We use the sliding-window pattern with ZADD and ZRANGEBYSCORE: each request adds a timestamp to the user's sorted set, and we count how many requests fall within the last 60-second window. If the count exceeds the limit (100 req/min for standard users, 500 for enterprise), the API responds with 429 Too Many Requests. All of this happens in 0.5ms thanks to Redis.

A practical tip: configure maxmemory-policy to allkeys-lru. If Redis runs out of memory, it automatically starts evicting the least recently used keys. Better to lose a cache entry than to crash. In two years we've never had a Redis emergency.
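The sliding-window check can be sketched as follows. A per-user array of timestamps stands in for the Redis sorted set so the example is self-contained; against real Redis the same steps would be a ZREMRANGEBYSCORE, ZADD, and a count over the window, typically pipelined. All names and the parameter choices are illustrative.

```javascript
// Sliding-window rate limiter sketch. A per-user timestamp array stands in
// for a Redis sorted set (member = timestamp, score = timestamp).
const windows = new Map(); // userId -> array of request timestamps (ms)

function allowRequest(userId, limit, windowMs, now = Date.now()) {
  const ts = windows.get(userId) || [];
  // ZREMRANGEBYSCORE equivalent: drop timestamps older than the window.
  const recent = ts.filter((t) => t > now - windowMs);
  if (recent.length >= limit) {
    windows.set(userId, recent);
    return false; // over the limit -> respond 429 Too Many Requests
  }
  recent.push(now); // ZADD equivalent: record this request
  windows.set(userId, recent);
  return true;
}
```

Unlike a fixed-window counter, the window slides with each request, so a burst straddling a minute boundary can never double the effective limit.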


Securvita S.r.l. — i3k.eu