Load Testing Socket.IO with Artillery: Real-Time at Scale
Master load testing WebSocket and Socket.IO applications using Artillery. From basic connection testing to complex real-time scenarios with custom metrics.
When our sales team started missing critical opportunity updates and customer actions on quotes due to system delays, I was tasked with building a real-time notification system for our in-house sales application. The system needed to ensure prompt updates whenever opportunities changed status or customers interacted with quotes—actions that directly impact revenue. Traditional HTTP polling wasn't cutting it, and load testing WebSocket connections required a completely different approach.
Why Socket.IO Load Testing is Different
Unlike REST APIs where you send a request and get a response, Socket.IO applications maintain persistent connections with ongoing bidirectional communication. You need to test:
- Connection establishment - Can your server handle rapid connection spikes?
- Message throughput - How many messages per second can you process?
- Broadcasting performance - What happens when one message goes to thousands of clients?
- Connection persistence - How long can connections stay active under load?
- Memory usage - Do you have connection or message memory leaks?
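Broadcasting is usually where the numbers grow fastest, because every inbound event is multiplied by the size of the room it targets. The sketch below shows only that arithmetic. The function name, parameters, and figures are illustrative assumptions, not measurements from the system described here.

```javascript
// Rough estimate of the outbound load caused by room broadcasts.
// All names and numbers are hypothetical and only illustrate the multiplication.
function estimateFanOut({ inboundPerSecond, roomSize, payloadBytes }) {
  // Each inbound event is re-sent once per client in the target room
  const outboundPerSecond = inboundPerSecond * roomSize
  const outboundBytesPerSecond = outboundPerSecond * payloadBytes
  return { outboundPerSecond, outboundBytesPerSecond }
}

// 20 events/s broadcast to a 250-client room with ~200-byte payloads
const estimate = estimateFanOut({ inboundPerSecond: 20, roomSize: 250, payloadBytes: 200 })
console.log(estimate) // { outboundPerSecond: 5000, outboundBytesPerSecond: 1000000 }
```

A modest inbound rate can therefore become thousands of outbound frames per second, which is why broadcast paths deserve their own test scenario.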
Setting Up the Test Server
First, let's create a basic Socket.IO server that we can load test:
const express = require('express')
const http = require('http')
const socketIo = require('socket.io')

const app = express()
const server = http.createServer(app)
const io = socketIo(server, {
  cors: {
    origin: "*",
    methods: ["GET", "POST"]
  }
})

// Track sales team connections
let activeReps = 0
let notificationsSent = 0

io.on('connection', (socket) => {
  activeReps++
  console.log(`Sales rep connected. Active reps: ${activeReps}`)

  // Authenticate sales rep and join their team room
  socket.on('join-sales-team', (data) => {
    const { repId, teamId, territory } = data

    // Join team-specific rooms for targeted notifications
    socket.join(`team_${teamId}`)
    socket.join(`territory_${territory}`)
    socket.join(`rep_${repId}`)

    socket.emit('authenticated', {
      message: 'Connected to sales notification system',
      repId,
      timestamp: Date.now()
    })
  })

  // Handle different notification types
  socket.on('new-lead', (leadData) => {
    notificationsSent++

    // Broadcast to relevant territory
    io.to(`territory_${leadData.territory}`).emit('lead-notification', {
      type: 'NEW_LEAD',
      leadId: leadData.id,
      clientName: leadData.clientName,
      value: leadData.estimatedValue,
      territory: leadData.territory,
      timestamp: Date.now()
    })
  })

  // Deal status updates
  socket.on('deal-update', (dealData) => {
    notificationsSent++

    // Notify specific rep and their manager
    io.to(`rep_${dealData.assignedRep}`).emit('deal-notification', {
      type: 'DEAL_UPDATE',
      dealId: dealData.id,
      status: dealData.status,
      clientName: dealData.clientName,
      value: dealData.value,
      timestamp: Date.now()
    })
  })

  // Client interaction notifications
  socket.on('client-interaction', (interactionData) => {
    notificationsSent++

    // Notify team about client activity
    io.to(`team_${interactionData.teamId}`).emit('client-activity', {
      type: 'CLIENT_INTERACTION',
      clientId: interactionData.clientId,
      clientName: interactionData.clientName,
      interactionType: interactionData.type, // email, call, meeting
      repId: interactionData.repId,
      timestamp: Date.now()
    })
  })

  socket.on('disconnect', () => {
    activeReps--
    console.log(`Sales rep disconnected. Active reps: ${activeReps}`)
  })
})

// Health check with sales metrics
app.get('/health', (req, res) => {
  res.json({
    status: 'ok',
    activeReps,
    notificationsSent,
    uptime: process.uptime(),
    systemLoad: process.cpuUsage()
  })
})

server.listen(3000, () => {
  console.log('Sales notification server running on port 3000')
})

Basic Artillery Configuration
Artillery has built-in Socket.IO support. Here's a basic configuration that simulates users connecting and sending messages:
config:
  target: 'http://localhost:3000'
  phases:
    - duration: 60
      arrivalRate: 10
      name: "Morning shift ramp-up"
    - duration: 180
      arrivalRate: 50
      name: "Peak sales hours"
    - duration: 60
      arrivalRate: 80
      name: "End-of-quarter push"
  engines:
    socketio: {}
  # A list variable resolves to one randomly chosen entry when referenced as {{ name }}
  variables:
    territories:
      - "north"
      - "south"
      - "east"
      - "west"
    teams:
      - "enterprise"
      - "smb"
      - "inbound"
      - "outbound"
scenarios:
  - name: "Sales rep simulation"
    weight: 100
    engine: socketio
    flow:
      - connect:
          namespace: "/"
      - think: 1
      # Authenticate as sales rep
      - emit:
          channel: "join-sales-team"
          data:
            repId: "{{ $randomNumber(1, 500) }}"
            teamId: "{{ teams }}"
            territory: "{{ territories }}"
      # Wait for authentication
      - think: 2
      # Simulate new lead creation
      - emit:
          channel: "new-lead"
          data:
            id: "{{ $randomNumber(10000, 99999) }}"
            clientName: "Test Client {{ $randomNumber(1, 1000) }}"
            estimatedValue: "{{ $randomNumber(5000, 100000) }}"
            territory: "{{ territories }}"
      # Stay connected to receive notifications
      - think: 30

Run this test with:

npm install -g artillery
artillery run socketio-basic.yml

Advanced Scenarios
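Before adding heavier scenarios, it is worth estimating how much load the basic phases above actually generate. The sketch below copies the phase values from that configuration and assumes each virtual user stays connected for roughly the sum of its think steps (1 + 2 + 30 seconds), treating emits as instantaneous. The helper names are hypothetical.

```javascript
// Phase values copied from the basic configuration
const phases = [
  { name: 'Morning shift ramp-up', duration: 60, arrivalRate: 10 },
  { name: 'Peak sales hours', duration: 180, arrivalRate: 50 },
  { name: 'End-of-quarter push', duration: 60, arrivalRate: 80 }
]

// Assumption: session length is the sum of the think steps (1 + 2 + 30 s)
const sessionSeconds = 1 + 2 + 30

// Total virtual users created over the whole run
function totalVirtualUsers(phaseList) {
  return phaseList.reduce((sum, p) => sum + p.duration * p.arrivalRate, 0)
}

// Little's law: steady-state concurrency is arrival rate times time in system
function steadyStateConnections(arrivalRate, seconds) {
  return arrivalRate * seconds
}

console.log(totalVirtualUsers(phases)) // 14400
console.log(steadyStateConnections(80, sessionSeconds)) // 2640
```

So an arrival rate of 80 per second is not 80 connections: with 33-second sessions the server holds roughly 2,640 open sockets during the last phase. Longer think times in later scenarios raise that figure proportionally.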
Real applications have complex user behaviors. Here's an advanced configuration that simulates three usage patterns for the sales application (active reps, managers monitoring notifications, and bulk update bursts):
config:
  target: 'http://localhost:3000'
  phases:
    - duration: 30
      arrivalRate: 25
      name: "Early morning shift"
    - duration: 240
      arrivalRate: 125
      name: "Peak business hours"
    - duration: 60
      arrivalRate: 200
      name: "Quarter-end crunch"
  # Socket.IO client options are read from config.socketio
  socketio:
    transports: ['websocket']
  # A list variable resolves to one randomly chosen entry when referenced as {{ name }}
  variables:
    repNames:
      - "Sarah_Johnson"
      - "Mike_Chen"
      - "Emma_Rodriguez"
      - "David_Kim"
      - "Lisa_Thompson"
    territories:
      - "northeast"
      - "southeast"
      - "midwest"
      - "west_coast"
      - "southwest"
    teams:
      - "enterprise_sales"
      - "smb_sales"
      - "inside_sales"
      - "field_sales"
      - "channel_partners"
    dealStatuses:
      - "qualified"
      - "proposal_sent"
      - "negotiation"
      - "closed_won"
      - "closed_lost"
scenarios:
  - name: "Active sales rep workflow"
    weight: 60
    engine: socketio
    flow:
      - connect:
          namespace: "/"
      - think: 1
      # Join sales team with realistic rep data
      - emit:
          channel: "join-sales-team"
          data:
            repId: "{{ $randomNumber(100, 999) }}"
            teamId: "{{ teams }}"
            territory: "{{ territories }}"
            repName: "{{ repNames }}"
      # Wait for authentication confirmation
      - think: "{{ $randomNumber(1, 3) }}"
      # Simulate various sales activities
      - loop:
          # Create new leads
          - emit:
              channel: "new-lead"
              data:
                id: "{{ $randomNumber(10000, 99999) }}"
                clientName: "{{ $randomString() }} Corp"
                estimatedValue: "{{ $randomNumber(10000, 500000) }}"
                territory: "{{ territories }}"
                source: "website"
          - think: "{{ $randomNumber(5, 15) }}"
          # Update deal status
          - emit:
              channel: "deal-update"
              data:
                id: "{{ $randomNumber(1000, 9999) }}"
                assignedRep: "{{ $randomNumber(100, 999) }}"
                status: "{{ dealStatuses }}"
                clientName: "{{ $randomString() }} Industries"
                value: "{{ $randomNumber(25000, 1000000) }}"
          - think: "{{ $randomNumber(10, 30) }}"
          # Log client interactions
          - emit:
              channel: "client-interaction"
              data:
                clientId: "{{ $randomNumber(500, 5000) }}"
                clientName: "{{ $randomString() }} LLC"
                teamId: "{{ teams }}"
                type: "email"
                repId: "{{ $randomNumber(100, 999) }}"
                notes: "Follow-up call scheduled"
        count: "{{ $randomNumber(3, 8) }}"
      # Stay connected for extended period (simulating work day)
      - think: "{{ $randomNumber(300, 600) }}"
  - name: "Manager monitoring notifications"
    weight: 25
    engine: socketio
    flow:
      - connect:
          namespace: "/"
      - think: 1
      # Manager joins a team room for monitoring
      - emit:
          channel: "join-sales-team"
          data:
            repId: "{{ $randomNumber(1, 50) }}"
            teamId: "{{ teams }}"
            territory: "{{ territories }}"
            role: "manager"
      # Stay connected to monitor team activity
      - think: "{{ $randomNumber(600, 1200) }}"
  - name: "High-frequency notification burst"
    weight: 15
    engine: socketio
    flow:
      - connect:
          namespace: "/"
      - think: 1
      # Simulate system integration pushing bulk updates
      - loop:
          - emit:
              channel: "deal-update"
              data:
                id: "{{ $randomNumber(1000, 9999) }}"
                assignedRep: "{{ $randomNumber(100, 999) }}"
                status: "{{ dealStatuses }}"
                clientName: "Bulk Import {{ $randomNumber(1, 1000) }}"
                value: "{{ $randomNumber(1000, 50000) }}"
        count: "{{ $randomNumber(10, 25) }}"
      # Quick disconnect after batch processing
      - think: 5

Custom Metrics and Functions
Artillery allows custom JavaScript functions for advanced testing scenarios. This is where you can implement application-specific logic:
// artillery-functions.js
module.exports = {
  // Record the send time so round-trip latency can be computed later
  trackMessageLatency: (context, events, done) => {
    context.vars.messageStartTime = Date.now()
    return done()
  },

  // Measure time from message send to response
  measureResponseTime: (context, events, done) => {
    const responseTime = Date.now() - context.vars.messageStartTime
    // Artillery v2 custom metrics: 'histogram' for timings, 'counter' for tallies
    events.emit('histogram', 'message.response_time', responseTime)
    return done()
  },

  // Generate realistic user data
  generateUserData: (context, events, done) => {
    const users = ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve']
    const actions = ['typing', 'idle', 'active']

    context.vars.user = users[Math.floor(Math.random() * users.length)]
    context.vars.status = actions[Math.floor(Math.random() * actions.length)]
    context.vars.sessionId = `session_${Math.random().toString(36).slice(2, 11)}`
    return done()
  },

  // Validate server responses
  validateResponse: (requestParams, response, context, events, done) => {
    if (response.data && response.data.timestamp) {
      events.emit('counter', 'valid_responses', 1)
    } else {
      events.emit('counter', 'invalid_responses', 1)
    }
    return done()
  }
}

Then reference these functions in your Artillery configuration:
config:
  target: 'http://localhost:3000'
  phases:
    - duration: 60
      arrivalRate: 50
  # Socket.IO client options are read from config.socketio
  socketio:
    transports: ['websocket']
  processor: "./artillery-functions.js"
scenarios:
  - name: "Advanced chat simulation"
    weight: 100
    engine: socketio
    flow:
      # Setup user data
      - function: "generateUserData"
      # Connect to server
      - connect:
          namespace: "/"
      # Listen for welcome message
      - on:
          channel: "welcome"
          function: "validateResponse"
      # Track message latency
      - function: "trackMessageLatency"
      # Send message and measure response time
      - emit:
          channel: "message"
          data:
            user: "{{ user }}"
            text: "Hello from {{ sessionId }}"
            status: "{{ status }}"
      # Listen for message broadcast
      - on:
          channel: "message"
          function: "measureResponseTime"
      # Simulate realistic user behavior
      - loop:
          - think: "{{ $randomNumber(2, 8) }}"
          - emit:
              channel: "message"
              data:
                user: "{{ user }}"
                text: "Message {{ $loopCount }} from {{ sessionId }}"
        count: "{{ $randomNumber(3, 10) }}"
      # Stay connected
      - think: 30

Real-Time Performance Monitoring
While Artillery runs your load test, you need to monitor your application's performance. Here's a custom monitoring script I use:
// performance-monitor.js
const io = require('socket.io-client')
const EventEmitter = require('events')

class SocketIOMonitor extends EventEmitter {
  constructor(url, options = {}) {
    super()
    this.url = url
    this.options = options
    this.metrics = {
      connections: 0,
      messages: {
        sent: 0,
        received: 0,
        errors: 0
      },
      latency: [],
      errors: []
    }
  }

  async startMonitoring(duration = 60000) {
    const socket = io(this.url, this.options)

    socket.on('connect', () => {
      this.metrics.connections++
      console.log('Monitor connected')

      // Send test messages periodically and time the acknowledgement.
      // Assumes the server registers a matching handler that calls the ack,
      // e.g. socket.on('latency-check', (data, ack) => ack()).
      // A custom event name avoids 'ping'/'pong', which some Socket.IO
      // versions reserve for heartbeats.
      const interval = setInterval(() => {
        const messageStart = Date.now()
        this.metrics.messages.sent++
        socket.emit('latency-check', { timestamp: messageStart }, () => {
          this.metrics.latency.push(Date.now() - messageStart)
          this.metrics.messages.received++
        })
      }, 1000)

      // Stop after duration
      setTimeout(() => {
        clearInterval(interval)
        socket.disconnect()
        this.generateReport()
      }, duration)
    })

    socket.on('connect_error', (error) => {
      this.metrics.errors.push({
        type: 'connection',
        message: error.message,
        timestamp: Date.now()
      })
    })

    socket.on('disconnect', () => {
      console.log('Monitor disconnected')
    })
  }

  generateReport() {
    const samples = this.metrics.latency
    const { sent, received } = this.metrics.messages

    // Guard against empty data so the report never shows NaN or Infinity
    const hasSamples = samples.length > 0
    const avgLatency = hasSamples ? samples.reduce((a, b) => a + b, 0) / samples.length : 0
    const maxLatency = hasSamples ? Math.max(...samples) : 0
    const minLatency = hasSamples ? Math.min(...samples) : 0
    const successRate = sent > 0 ? (received / sent) * 100 : 0

    console.log('\n=== Performance Report ===')
    console.log(`Messages sent: ${sent}`)
    console.log(`Messages received: ${received}`)
    console.log(`Success rate: ${successRate.toFixed(2)}%`)
    console.log(`Average latency: ${avgLatency.toFixed(2)}ms`)
    console.log(`Min latency: ${minLatency}ms`)
    console.log(`Max latency: ${maxLatency}ms`)
    console.log(`Errors: ${this.metrics.errors.length}`)

    this.emit('report', {
      messagesSent: sent,
      messagesReceived: received,
      successRate,
      latency: {
        average: avgLatency,
        min: minLatency,
        max: maxLatency
      },
      errors: this.metrics.errors
    })
  }
}

// Usage
const monitor = new SocketIOMonitor('http://localhost:3000')
monitor.startMonitoring(60000)

Containerized Load Testing
For consistent testing environments, I recommend using Docker. Here's a complete setup that includes your app, Redis for scaling, and Artillery for testing:
# docker-compose.yml
version: '3.8'
services:
  app:
    build: .
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=production
      - PORT=3000
    deploy:
      resources:
        limits:
          memory: 512M
          cpus: '0.5'
        reservations:
          memory: 256M
          cpus: '0.25'
  redis:
    image: redis:alpine
    ports:
      - "6379:6379"
  artillery:
    image: artilleryio/artillery:latest
    volumes:
      - ./load-tests:/artillery
    command: run /artillery/socketio-test.yml
    depends_on:
      - app
    environment:
      # The test script must read this variable for its target; a hardcoded
      # localhost target will not reach the app container from here
      - TARGET_URL=http://app:3000

Production-Ready Socket.IO Server
Here's how I structure Socket.IO servers for production load testing. This includes clustering, Redis adapter, and proper error handling:
// production-socketio-server.js
const cluster = require('cluster')
const numCPUs = require('os').cpus().length
const Redis = require('ioredis')
const redisAdapter = require('socket.io-redis')

if (cluster.isMaster) {
  console.log(`Master ${process.pid} is running`)

  // Fork workers
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork()
  }

  cluster.on('exit', (worker, code, signal) => {
    console.log(`Worker ${worker.process.pid} died`)
    cluster.fork() // Restart worker
  })
} else {
  const express = require('express')
  const http = require('http')
  const socketIo = require('socket.io')

  const app = express()
  const server = http.createServer(app)

  // Note: with the polling transport enabled, clustered workers need sticky
  // sessions so every request of a session reaches the same worker
  const io = socketIo(server, {
    transports: ['websocket', 'polling'],
    pingTimeout: 60000,
    pingInterval: 25000,
    upgradeTimeout: 30000,
    allowUpgrades: true
  })

  // Redis adapter for scaling
  const redis = new Redis(process.env.REDIS_URL || 'redis://localhost:6379')
  io.adapter(redisAdapter({
    pubClient: redis,
    subClient: redis.duplicate()
  }))

  // Per-worker connection counter; the shared count lives in Redis
  let connectionCount = 0

  io.on('connection', (socket) => {
    connectionCount++

    // Store connection info in Redis
    redis.hset('connections', socket.id, JSON.stringify({
      connectedAt: Date.now(),
      workerId: process.pid
    }))

    // Optimized message handling
    socket.on('message', async (data) => {
      try {
        // Rate limiting check
        const messageKey = `messages:${socket.id}`
        const messageCount = await redis.incr(messageKey)
        if (messageCount === 1) {
          // Start the 1-minute window on the first message only, so later
          // messages do not keep extending it
          await redis.expire(messageKey, 60)
        }

        if (messageCount > 100) { // Max 100 messages per minute
          socket.emit('rate_limited', {
            message: 'Too many messages, please slow down'
          })
          return
        }

        // Broadcast message
        io.emit('message', {
          id: Date.now(),
          user: data.user,
          text: data.text,
          timestamp: Date.now()
        })
      } catch (error) {
        console.error('Message handling error:', error)
        // Custom event name: 'error' is reserved in some Socket.IO versions
        socket.emit('message_error', { message: 'Message processing failed' })
      }
    })

    socket.on('disconnect', () => {
      connectionCount--
      redis.hdel('connections', socket.id)
    })
  })

  // Health check with detailed metrics
  app.get('/health', async (req, res) => {
    try {
      const connections = await redis.hlen('connections')
      const memoryUsage = process.memoryUsage()

      res.json({
        status: 'ok',
        worker: process.pid,
        connections: connections,
        memory: {
          rss: Math.round(memoryUsage.rss / 1024 / 1024) + ' MB',
          heapUsed: Math.round(memoryUsage.heapUsed / 1024 / 1024) + ' MB'
        },
        uptime: process.uptime()
      })
    } catch (error) {
      res.status(500).json({ status: 'error', message: error.message })
    }
  })

  const PORT = process.env.PORT || 3000
  server.listen(PORT, () => {
    console.log(`Worker ${process.pid} listening on port ${PORT}`)
  })
}

Key Metrics to Watch
When load testing real-time sales notifications, monitor these critical metrics:
- Connection Rate - Sales reps connecting per second during shift changes
- Active Connections - Total concurrent sales reps online
- Notification Throughput - Lead/deal notifications delivered per second
- Memory Usage - Watch for connection leaks during long sales sessions
- CPU Usage - Event loop blocking during notification bursts
- Network I/O - Bandwidth during high-value deal notifications
- Error Rates - Failed notifications (critical for sales)
- Delivery Latency - Time from deal update to rep notification
- Room Management - Performance of team/territory-based notifications
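For delivery latency in particular, averages tend to hide the slow notifications that reps actually notice, so percentiles are usually more informative. The sketch below uses the nearest-rank method on an array of latency samples in milliseconds. The function names and the sample values are illustrative assumptions.

```javascript
// Nearest-rank percentile over latency samples (milliseconds).
// Returns null when there are no samples.
function percentile(samples, p) {
  if (samples.length === 0) return null
  const sorted = [...samples].sort((a, b) => a - b)
  const rank = Math.ceil((p / 100) * sorted.length)
  return sorted[Math.max(rank, 1) - 1]
}

function summarizeLatency(samples) {
  return {
    p50: percentile(samples, 50),
    p95: percentile(samples, 95),
    p99: percentile(samples, 99),
    max: percentile(samples, 100)
  }
}

// Hypothetical samples: nine fast deliveries and one slow outlier
const samples = [12, 15, 11, 14, 250, 13, 16, 12, 18, 17]
console.log(summarizeLatency(samples)) // { p50: 14, p95: 250, p99: 250, max: 250 }
```

The mean of these samples is 37.8 ms, which describes neither the typical delivery (14 ms) nor the outlier (250 ms). Reporting p50 alongside p95 or p99 keeps both visible.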
Common Pitfalls and Solutions
After load testing dozens of Socket.IO applications, here are the most common issues I've encountered:
- File descriptor limits - Increase ulimit for high connection counts
- Memory leaks - Always clean up event listeners on disconnect
- CPU blocking - Use clustering to distribute load across cores
- Redis bottlenecks - Use Redis Cluster for high-throughput scenarios
- Sticky sessions - Configure load balancers for WebSocket support
- Heartbeat tuning - Adjust ping/pong intervals based on your use case
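The listener cleanup point is the one that most often shows up as slow memory growth under load. The sketch below illustrates the pattern with Node's built-in EventEmitter standing in for both a socket and a shared, long-lived event source. `dealBus` and `attachConnection` are hypothetical names, and a real Socket.IO socket would take the place of the fake one.

```javascript
const EventEmitter = require('events')

// Shared, long-lived emitter (stand-in for an app-wide event source; hypothetical)
const dealBus = new EventEmitter()

// Attach a per-connection listener and remove it when the socket disconnects
function attachConnection(socket) {
  const onDealChanged = (deal) => socket.emit('deal-notification', deal)
  dealBus.on('deal-changed', onDealChanged)

  socket.once('disconnect', () => {
    // Without this line, every past connection stays referenced by dealBus
    dealBus.off('deal-changed', onDealChanged)
  })
}

// Simulate 1,000 connect/disconnect cycles with plain emitters as fake sockets
for (let i = 0; i < 1000; i++) {
  const fakeSocket = new EventEmitter()
  attachConnection(fakeSocket)
  fakeSocket.emit('disconnect')
}

console.log(dealBus.listenerCount('deal-changed')) // 0
```

Listeners registered on the socket itself go away with the socket. It is the listeners a connection registers on longer-lived objects (timers, shared emitters, Redis subscribers) that need explicit removal.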
Production Lessons Learned
After load testing Socket.IO applications handling millions of connections, here are my key insights:
- Start simple - Test basic connection/disconnection before complex scenarios
- Ramp up gradually - Sudden load spikes hide gradual memory leaks
- Test connection persistence - Many apps fail after hours, not minutes
- Monitor the client side - Network issues affect client reconnection logic
- Test failure scenarios - How does your app behave when Redis goes down?
- Use multiple Artillery instances - Distribute load generation across machines
- Test different transports - WebSocket vs polling performance varies
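For the long-running tests mentioned above, one simple way to spot a leak is to sample heap usage at intervals and fit a line through the samples: a slope that stays positive after connections have levelled off deserves a closer look. This is a minimal sketch with hypothetical helper names. Sampling frequency and what counts as a worrying slope depend on your application.

```javascript
// One heap sample: seconds since process start and bytes of heap in use
function takeSample() {
  return { t: process.uptime(), heapUsed: process.memoryUsage().heapUsed }
}

// Least-squares slope of heap usage over time, in bytes per second
function heapGrowthPerSecond(samples) {
  const n = samples.length
  if (n < 2) return 0
  const meanT = samples.reduce((sum, s) => sum + s.t, 0) / n
  const meanH = samples.reduce((sum, s) => sum + s.heapUsed, 0) / n
  let numerator = 0
  let denominator = 0
  for (const { t, heapUsed } of samples) {
    numerator += (t - meanT) * (heapUsed - meanH)
    denominator += (t - meanT) ** 2
  }
  return denominator === 0 ? 0 : numerator / denominator
}

// Inside the server during a soak test, one might collect samples like this:
//   const history = []
//   setInterval(() => history.push(takeSample()), 10000)
// and log heapGrowthPerSecond(history) from the /health endpoint.
```

Garbage collection makes individual samples noisy, so the trend over many minutes matters more than any single reading.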
Load testing our sales notification system with Artillery helped us identify critical bottlenecks before launch. During Q4 (our busiest quarter), we successfully handled over 500 concurrent sales reps receiving real-time notifications about leads, deals, and client interactions. The key was creating realistic test scenarios that matched actual sales workflows - from morning shift ramp-ups to end-of-quarter notification storms.