Using Rails to Handle Streaming Data at Scale
Learn how to process real-time streaming data efficiently using Ruby on Rails.
Modern applications increasingly need to process data in real time, sometimes at rates of millions of events per second. Can Rails manage streaming data at scale?
While Rails is traditionally a request-response framework, it can process real-time data streams efficiently with:
✅ WebSockets for live updates
✅ Message queues like Kafka & RabbitMQ
✅ Background jobs for parallel processing
✅ Optimized database writes to prevent bottlenecks
In this guide, we’ll build a real-time, scalable Rails architecture for handling streaming data efficiently. 🚀
1. Understanding Streaming Data in Rails
Streaming data means continuous, real-time data ingestion, processing, and storage. Examples include:
- Stock market price updates 📈
- Live sports scores 🏆
- Social media feeds 📢
- IoT sensor data 📡
Unlike batch processing, streaming requires low-latency, high-throughput handling.
🚀 How Rails Can Handle Streaming
- WebSockets → For real-time bidirectional communication
- Kafka / RabbitMQ → For high-throughput event processing
- ActiveJob & Sidekiq → For background task execution
- Database Optimization → To prevent slow inserts
Let’s implement these step by step.
2. Implementing Real-Time WebSockets in Rails
Rails has built-in WebSocket support using ActionCable.
📌 Install Redis for ActionCable
```bash
bundle add redis
```
📌 Configure WebSockets in cable.yml
```yaml
development:
  adapter: redis
  url: redis://localhost:6379/1
```
📌 Create a WebSocket Channel
```bash
rails generate channel Streaming
```
Modify app/channels/streaming_channel.rb:

```ruby
class StreamingChannel < ApplicationCable::Channel
  def subscribed
    stream_from "streaming_data"
  end
end
```
Now, broadcast data in real-time:
ActionCable.server.broadcast("streaming_data", { message: "New event received" })
This allows clients to receive live updates instantly.
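In practice, broadcasts usually come from application code rather than the console. A minimal sketch, assuming events land in an Event model (which we create in section 4), broadcasts each new record as it is committed:

```ruby
# app/models/event.rb — a sketch; assumes an Event model with a `data` column
class Event < ApplicationRecord
  # Broadcast after the transaction commits, so subscribers never see
  # records that might still roll back
  after_create_commit do
    ActionCable.server.broadcast("streaming_data", { message: data })
  end
end
```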
3. Scaling with Kafka or RabbitMQ for High-Throughput Streaming
For large-scale streaming, Rails alone isn’t enough. We use message queues like Kafka or RabbitMQ to handle high-throughput event processing.
📌 Kafka vs. RabbitMQ
| Feature | Kafka | RabbitMQ |
|----------|--------|-----------|
| Best for | High-throughput logs | Real-time event processing |
| Scaling | Horizontally | Vertically |
| Use case | Analytics, logs, metrics | Chat, notifications |
📌 Install Kafka for Rails
```bash
brew install kafka
bundle add ruby-kafka
```
📌 Producer: Sending Events to Kafka
require "kafka"
kafka = Kafka.new(["localhost:9092"])
kafka.deliver_message("User signed up!", topic: "events")
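`deliver_message` is synchronous and pays a network round-trip per call. For higher volume, ruby-kafka's buffering producer batches messages before sending. A sketch (the `events` array is hypothetical):

```ruby
producer = kafka.producer

# Buffer a batch locally, then flush it in a single request
events.each { |event| producer.produce(event.to_json, topic: "events") }
producer.deliver_messages

producer.shutdown # delivers any remaining buffered messages
```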
📌 Consumer: Processing Kafka Events
```ruby
kafka.each_message(topic: "events") do |message|
  puts "Received event: #{message.value}"
end
```
This decouples ingestion from processing: Kafka absorbs the event firehose, and Rails consumes it asynchronously at its own pace.
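Note that `each_message` on the client runs a single consumer. To scale out, ruby-kafka supports consumer groups: run the same code in several processes and Kafka balances the topic's partitions across them. A sketch, with a hypothetical group name:

```ruby
consumer = kafka.consumer(group_id: "rails-event-processors")
consumer.subscribe("events")

# Each process in the group receives a share of the topic's partitions
consumer.each_message do |message|
  puts "Received event: #{message.value}"
end
```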
4. Using Background Jobs for Parallel Data Processing
Streaming data requires parallel processing to avoid blocking requests.
📌 Use Sidekiq for Background Jobs
```bash
bundle add sidekiq
```
Modify config/application.rb:

```ruby
config.active_job.queue_adapter = :sidekiq
```
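Sidekiq itself needs Redis; by default it looks at REDIS_URL. If you want to configure the connection explicitly, a minimal initializer sketch:

```ruby
# config/initializers/sidekiq.rb — assumes a REDIS_URL environment variable
Sidekiq.configure_server do |config|
  config.redis = { url: ENV.fetch("REDIS_URL", "redis://localhost:6379/0") }
end

Sidekiq.configure_client do |config|
  config.redis = { url: ENV.fetch("REDIS_URL", "redis://localhost:6379/0") }
end
```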
📌 Define a Background Job

```ruby
class ProcessEventJob < ApplicationJob
  queue_as :default

  def perform(event_data)
    Event.create!(data: event_data) # Store the event in the database
  end
end
```
📌 Trigger Jobs from Kafka Consumer
```ruby
kafka.each_message(topic: "events") do |message|
  ProcessEventJob.perform_later(message.value)
end
```
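Enqueueing one job per message costs a Redis round-trip each time. On Rails 7.1+, `ActiveJob.perform_all_later` enqueues a batch at once; a sketch, where `messages` is a hypothetical pre-fetched batch of Kafka messages:

```ruby
jobs = messages.map { |message| ProcessEventJob.new(message.value) }
ActiveJob.perform_all_later(*jobs) # bulk-enqueue in a single round-trip
```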
Now, Rails can process thousands of events in parallel. 🚀
5. Optimizing Database Writes for Streaming Data
Handling large event streams can overload the database. Optimize writes to prevent slow inserts.
📌 Use Bulk Inserts for Performance
```ruby
Event.insert_all([
  { data: "Event 1", created_at: Time.current, updated_at: Time.current },
  { data: "Event 2", created_at: Time.current, updated_at: Time.current }
])
```

Note that `insert_all` writes all rows in a single SQL statement and skips model validations and callbacks.
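You can combine this with the background jobs from section 4: instead of one job (and one INSERT) per event, have a job persist a whole batch. A sketch with a hypothetical ProcessEventBatchJob:

```ruby
class ProcessEventBatchJob < ApplicationJob
  queue_as :default

  def perform(event_batch)
    now = Time.current
    rows = event_batch.map do |data|
      { data: data, created_at: now, updated_at: now }
    end
    Event.insert_all(rows) # one INSERT for the whole batch
  end
end
```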
📌 Index Event Columns for Faster Queries
The shorthand `data:string:index` would add a new data column as well as the index; since the column already exists, generate an empty migration and add the index yourself:

```bash
rails generate migration AddIndexToEventsData
```
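Then fill in the generated migration (a sketch; the class name follows the command above):

```ruby
class AddIndexToEventsData < ActiveRecord::Migration[7.1]
  def change
    # For a JSONB column, consider a GIN index (using: :gin) instead
    add_index :events, :data
  end
end
```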
📌 Partition Large Tables
For millions of records, use PostgreSQL partitioning:
```sql
CREATE TABLE events (
  id BIGSERIAL,
  event_time TIMESTAMP NOT NULL,
  data JSONB NOT NULL,
  PRIMARY KEY (id, event_time) -- the partition key must be part of the primary key
) PARTITION BY RANGE (event_time);
```

Queries that filter on event_time then touch only the relevant partitions, and old partitions can be detached or dropped cheaply instead of running slow bulk deletes.
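PostgreSQL does not create partitions automatically: each time range needs one before rows can be inserted. A sketch of a Rails migration that adds a monthly partition (names and dates are illustrative):

```ruby
class AddEventsPartitionFor2025m01 < ActiveRecord::Migration[7.1]
  def up
    execute <<~SQL
      CREATE TABLE events_2025_01 PARTITION OF events
        FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');
    SQL
  end

  def down
    execute "DROP TABLE events_2025_01;"
  end
end
```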
6. Real-Time Dashboard: Streaming Data to Clients
Use Rails with React/Vue.js & WebSockets to visualize data live.
📌 Example: Stream Kafka Events to the Frontend

```ruby
kafka.each_message(topic: "events") do |message|
  ActionCable.server.broadcast("streaming_data", { message: message.value })
end
```
📌 Frontend: Listen for Live Updates
```javascript
import { createConsumer } from "@rails/actioncable";

const cable = createConsumer("ws://localhost:3000/cable");

const channel = cable.subscriptions.create("StreamingChannel", {
  // Called whenever the server broadcasts to "streaming_data"
  received(data) {
    console.log("New event received:", data.message);
  }
});
```
Now, users get real-time data without reloading. 🚀
Conclusion
Rails can handle streaming data at scale with:
✅ WebSockets for real-time updates
✅ Kafka or RabbitMQ for high-throughput event processing
✅ Background jobs for async processing
✅ Optimized database writes for efficiency
With these strategies, you can build high-performance, real-time Rails applications that keep pace with high-volume event streams. 🚀
What streaming tech do you use in Rails? Drop a comment below!