Using Rails to Handle Streaming Data at Scale
Learn how to process real-time streaming data efficiently using Ruby on Rails.
Modern applications increasingly need to process data in real time, sometimes at rates of millions of events per second. Can Rails manage streaming data at scale?
While Rails is traditionally a request-response framework, it can process real-time data streams efficiently with:
✅ WebSockets for live updates
✅ Message queues like Kafka & RabbitMQ
✅ Background jobs for parallel processing
✅ Optimized database writes to prevent bottlenecks
In this guide, we’ll build a real-time, scalable Rails architecture for handling streaming data efficiently. 🚀
1. Understanding Streaming Data in Rails
Streaming data means continuous, real-time data ingestion, processing, and storage. Examples include:
- Stock market price updates 📈
- Live sports scores 🏆
- Social media feeds 📢
- IoT sensor data 📡
Unlike batch processing, streaming requires low-latency, high-throughput handling.
🚀 How Rails Can Handle Streaming
- WebSockets → For real-time bidirectional communication
- Kafka / RabbitMQ → For high-throughput event processing
- ActiveJob & Sidekiq → For background task execution
- Database Optimization → To prevent slow inserts
Let’s implement these step by step.
2. Implementing Real-Time WebSockets in Rails
Rails has built-in WebSocket support using ActionCable.
📌 Install Redis for ActionCable
```bash
bundle add redis
```
📌 Configure WebSockets in cable.yml
```yaml
development:
  adapter: redis
  url: redis://localhost:6379/1
```
📌 Create a WebSocket Channel
```bash
rails generate channel Streaming
```
Modify app/channels/streaming_channel.rb:

```ruby
class StreamingChannel < ApplicationCable::Channel
  def subscribed
    stream_from "streaming_data"
  end
end
```
Now, broadcast data in real-time:
ActionCable.server.broadcast("streaming_data", { message: "New event received" })
This allows clients to receive live updates instantly.
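In practice, broadcasts usually come from application code rather than the console. A minimal sketch, assuming events land in an Event model (which we create in section 4), broadcasts each new record as it is committed:

```ruby
# app/models/event.rb — a sketch; assumes an Event model with a `data` column
class Event < ApplicationRecord
  # Broadcast after the transaction commits, so subscribers never see
  # records that might still roll back
  after_create_commit do
    ActionCable.server.broadcast("streaming_data", { message: data })
  end
end
```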
3. Scaling with Kafka or RabbitMQ for High-Throughput Streaming
For large-scale streaming, Rails alone isn’t enough. We use message queues like Kafka or RabbitMQ to handle high-throughput event processing.
📌 Kafka vs. RabbitMQ
| Feature | Kafka | RabbitMQ |
|----------|--------|-----------|
| Best for | High-throughput logs | Real-time event processing |
| Scaling | Horizontally | Vertically |
| Use case | Analytics, logs, metrics | Chat, notifications |
📌 Install Kafka for Rails
```bash
brew install kafka
bundle add ruby-kafka
```
📌 Producer: Sending Events to Kafka
require "kafka"
kafka = Kafka.new(["localhost:9092"])
kafka.deliver_message("User signed up!", topic: "events")
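`deliver_message` is synchronous and pays a network round-trip per call. For higher volume, ruby-kafka's buffering producer batches messages before sending. A sketch (the `events` array is hypothetical):

```ruby
producer = kafka.producer

# Buffer a batch locally, then flush it in a single request
events.each { |event| producer.produce(event.to_json, topic: "events") }
producer.deliver_messages

producer.shutdown # delivers any remaining buffered messages
```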
📌 Consumer: Processing Kafka Events
```ruby
kafka.each_message(topic: "events") do |message|
  puts "Received event: #{message.value}"
end
```
This decouples ingestion from processing: Kafka absorbs the event firehose, and Rails consumes it asynchronously at its own pace.
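Note that `each_message` on the client runs a single consumer. To scale out, ruby-kafka supports consumer groups: run the same code in several processes and Kafka balances the topic's partitions across them. A sketch, with a hypothetical group name:

```ruby
consumer = kafka.consumer(group_id: "rails-event-processors")
consumer.subscribe("events")

# Each process in the group receives a share of the topic's partitions
consumer.each_message do |message|
  puts "Received event: #{message.value}"
end
```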
4. Using Background Jobs for Parallel Data Processing
Streaming data requires parallel processing to avoid blocking requests.
📌 Use Sidekiq for Background Jobs
```bash
bundle add sidekiq
```
Modify config/application.rb:

```ruby
config.active_job.queue_adapter = :sidekiq
```
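Sidekiq itself needs Redis; by default it looks at REDIS_URL. If you want to configure the connection explicitly, a minimal initializer sketch:

```ruby
# config/initializers/sidekiq.rb — assumes a REDIS_URL environment variable
Sidekiq.configure_server do |config|
  config.redis = { url: ENV.fetch("REDIS_URL", "redis://localhost:6379/0") }
end

Sidekiq.configure_client do |config|
  config.redis = { url: ENV.fetch("REDIS_URL", "redis://localhost:6379/0") }
end
```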
📌 Define a Background Job

```ruby
class ProcessEventJob < ApplicationJob
  queue_as :default

  def perform(event_data)
    Event.create!(data: event_data) # Store the event in the database
  end
end
```
📌 Trigger Jobs from Kafka Consumer
```ruby
kafka.each_message(topic: "events") do |message|
  ProcessEventJob.perform_later(message.value)
end
```
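Enqueueing one job per message costs a Redis round-trip each time. On Rails 7.1+, `ActiveJob.perform_all_later` enqueues a batch at once; a sketch, where `messages` is a hypothetical pre-fetched batch of Kafka messages:

```ruby
jobs = messages.map { |message| ProcessEventJob.new(message.value) }
ActiveJob.perform_all_later(*jobs) # bulk-enqueue in a single round-trip
```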
Now, Rails can process thousands of events in parallel. 🚀
5. Optimizing Database Writes for Streaming Data
Handling large event streams can overload the database. Optimize writes to prevent slow inserts.
📌 Use Bulk Inserts for Performance
```ruby
Event.insert_all([
  { data: "Event 1", created_at: Time.current, updated_at: Time.current },
  { data: "Event 2", created_at: Time.current, updated_at: Time.current }
])
```

Note that `insert_all` writes all rows in a single SQL statement and skips model validations and callbacks.
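You can combine this with the background jobs from section 4: instead of one job (and one INSERT) per event, have a job persist a whole batch. A sketch with a hypothetical ProcessEventBatchJob:

```ruby
class ProcessEventBatchJob < ApplicationJob
  queue_as :default

  def perform(event_batch)
    now = Time.current
    rows = event_batch.map do |data|
      { data: data, created_at: now, updated_at: now }
    end
    Event.insert_all(rows) # one INSERT for the whole batch
  end
end
```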
📌 Index Event Columns for Faster Queries
The shorthand `data:string:index` would add a new data column as well as the index; since the column already exists, generate an empty migration and add the index yourself:

```bash
rails generate migration AddIndexToEventsData
```
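Then fill in the generated migration (a sketch; the class name follows the command above):

```ruby
class AddIndexToEventsData < ActiveRecord::Migration[7.1]
  def change
    # For a JSONB column, consider a GIN index (using: :gin) instead
    add_index :events, :data
  end
end
```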
📌 Partition Large Tables
For millions of records, use PostgreSQL partitioning:
```sql
CREATE TABLE events (
  id BIGSERIAL,
  event_time TIMESTAMP NOT NULL,
  data JSONB NOT NULL,
  PRIMARY KEY (id, event_time) -- the partition key must be part of the primary key
) PARTITION BY RANGE (event_time);
```

Queries that filter on event_time then touch only the relevant partitions, and old partitions can be detached or dropped cheaply instead of running slow bulk deletes.
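PostgreSQL does not create partitions automatically: each time range needs one before rows can be inserted. A sketch of a Rails migration that adds a monthly partition (names and dates are illustrative):

```ruby
class AddEventsPartitionFor2025m01 < ActiveRecord::Migration[7.1]
  def up
    execute <<~SQL
      CREATE TABLE events_2025_01 PARTITION OF events
        FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');
    SQL
  end

  def down
    execute "DROP TABLE events_2025_01;"
  end
end
```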
6. Real-Time Dashboard: Streaming Data to Clients
Use Rails with React/Vue.js & WebSockets to visualize data live.
📌 Example: Stream Kafka Events to the Frontend

```ruby
kafka.each_message(topic: "events") do |message|
  ActionCable.server.broadcast("streaming_data", { message: message.value })
end
```
📌 Frontend: Listen for Live Updates
```javascript
import { createConsumer } from "@rails/actioncable";

const cable = createConsumer("ws://localhost:3000/cable");

const channel = cable.subscriptions.create("StreamingChannel", {
  // Called whenever the server broadcasts to "streaming_data"
  received(data) {
    console.log("New event received:", data.message);
  }
});
```
Now, users get real-time data without reloading. 🚀
Conclusion
Rails can handle streaming data at scale with:
✅ WebSockets for real-time updates
✅ Kafka or RabbitMQ for high-throughput event processing
✅ Background jobs for async processing
✅ Optimized database writes for efficiency
With these strategies, you can build high-performance, real-time Rails applications that keep pace with high-volume event streams. 🚀
What streaming tech do you use in Rails? Drop a comment below!