Production Rails applications need health check endpoints. Load balancers poll them. Kubernetes uses them for pod readiness. Monitoring services depend on them. Yet many teams either skip them entirely or expose too much information to the public internet.
Rails 8 provides built-in health check functionality through Rails::HealthController (available since Rails 7.1), but real-world applications need more: database connectivity checks, Redis availability, queue system health, and disk space monitoring. This guide covers building comprehensive health checks that satisfy ops requirements without becoming security liabilities.
The Built-in Health Check
Rails 8 ships with a minimal health check endpoint. Newly generated applications already include the route; in an upgraded application, add it to the routes file:
# config/routes.rb
Rails.application.routes.draw do
  get "up" => "rails/health#show", as: :rails_health_check
  # ... rest of routes
end

This endpoint returns a 200 status when the application has booted without exceptions, and a 500 otherwise. Load balancers can hit /up to verify the application server responds. However, a running Rails process doesn't mean the application actually works: the database could be down, Redis unavailable, or disk space exhausted.
Building a Comprehensive Health Check
Production systems need deeper checks. Create a custom health controller that verifies each critical dependency:
# app/controllers/health_controller.rb
class HealthController < ActionController::API
  # Public endpoint for load balancers - minimal info
  def show
    checks = run_health_checks

    if checks.values.all? { |v| v[:healthy] }
      render json: { status: "healthy" }, status: :ok
    else
      render json: { status: "unhealthy" }, status: :service_unavailable
    end
  end

  # Detailed endpoint - protect with authentication or IP restriction
  def detailed
    checks = run_health_checks
    all_healthy = checks.values.all? { |v| v[:healthy] }

    render json: {
      status: all_healthy ? "healthy" : "unhealthy",
      timestamp: Time.current.iso8601,
      checks: checks
    }, status: all_healthy ? :ok : :service_unavailable
  end

  private

  def run_health_checks
    {
      database: check_database,
      redis: check_redis,
      queue: check_queue,
      disk: check_disk_space,
      memory: check_memory
    }
  end

  def check_database
    ActiveRecord::Base.connection.execute("SELECT 1")
    { healthy: true, message: "Connected" }
  rescue StandardError => e
    { healthy: false, message: e.message.truncate(100) }
  end

  def check_redis
    return { healthy: true, message: "Not configured" } unless defined?(Redis)

    redis = Redis.new(url: ENV.fetch("REDIS_URL", "redis://localhost:6379"))
    redis.ping
    { healthy: true, message: "Connected" }
  rescue StandardError => e
    { healthy: false, message: e.message.truncate(100) }
  ensure
    # Close the per-check connection so repeated polls don't leak sockets
    redis&.close
  end

  def check_queue
    # Solid Queue processes record periodic heartbeats; report healthy only
    # if at least one process has checked in during the last 5 minutes
    if SolidQueue::Process.where("last_heartbeat_at > ?", 5.minutes.ago).exists?
      { healthy: true, message: "Workers active" }
    else
      { healthy: false, message: "No recent worker heartbeat" }
    end
  rescue StandardError => e
    { healthy: false, message: e.message.truncate(100) }
  end

  def check_disk_space
    stat = Sys::Filesystem.stat("/")
    available_percent = (stat.blocks_available.to_f / stat.blocks * 100).round(1)

    if available_percent < 10
      { healthy: false, message: "#{available_percent}% available" }
    else
      { healthy: true, message: "#{available_percent}% available" }
    end
  rescue StandardError
    # If the check itself cannot run (for example, the gem is missing),
    # report healthy rather than taking the instance out of rotation
    { healthy: true, message: "Check unavailable" }
  end

  def check_memory
    return { healthy: true, message: "Check unavailable" } unless File.exist?("/proc/meminfo")

    meminfo = File.read("/proc/meminfo")
    total = meminfo[/MemTotal:\s+(\d+)/, 1].to_i
    available = meminfo[/MemAvailable:\s+(\d+)/, 1].to_i
    available_percent = (available.to_f / total * 100).round(1)

    if available_percent < 10
      { healthy: false, message: "#{available_percent}% available" }
    else
      { healthy: true, message: "#{available_percent}% available" }
    end
  rescue StandardError
    { healthy: true, message: "Check unavailable" }
  end
end

Add the sys-filesystem gem to the Gemfile for disk space checks, or remove that check if running in containers where disk monitoring happens elsewhere.
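The split between the public and the detailed response can be looked at in isolation from Rails. The following is a minimal plain-Ruby sketch, assuming the same `{ healthy:, message: }` result shape the controller uses; the `HealthSummary` module and its method names are hypothetical and exist only for this illustration.

```ruby
require "json"
require "time"

# Hypothetical helper (not part of Rails): derives both response shapes
# from one hash of check results.
module HealthSummary
  module_function

  # True only when every check reports healthy: true.
  def healthy?(checks)
    checks.values.all? { |result| result[:healthy] }
  end

  # Public payload: overall status only, no dependency names or messages.
  def public_payload(checks)
    { status: healthy?(checks) ? "healthy" : "unhealthy" }
  end

  # Detailed payload: overall status plus a timestamp and per-check results.
  def detailed_payload(checks, now: Time.now.utc)
    public_payload(checks).merge(timestamp: now.iso8601, checks: checks)
  end

  # HTTP status code a load balancer would see.
  def http_status(checks)
    healthy?(checks) ? 200 : 503
  end
end

checks = {
  database: { healthy: true, message: "Connected" },
  redis: { healthy: false, message: "Connection refused" }
}

puts JSON.generate(HealthSummary.public_payload(checks)) # prints {"status":"unhealthy"}
puts HealthSummary.http_status(checks)                   # prints 503
```

Keeping the public payload to a single status field means a failing dependency changes the status code without revealing which dependency failed.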
Routing and Security
Health check endpoints present a security consideration. The basic endpoint reveals whether your application runs—fine for public access. The detailed endpoint exposes infrastructure information—restrict it to internal networks or authenticated requests:
# config/routes.rb
Rails.application.routes.draw do
  # Public health check for load balancers
  get "health" => "health#show"

  # Detailed health check - consider protecting with constraints
  constraints ->(req) { internal_request?(req) } do
    get "health/detailed" => "health#detailed"
  end
  # ... rest of routes
end

# Helper method for IP-based restriction
def internal_request?(request)
  allowed_ips = %w[127.0.0.1 ::1 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16]
  allowed_ips.any? { |ip| IPAddr.new(ip).include?(request.remote_ip) }
rescue IPAddr::InvalidAddressError
  false
end

For Kubernetes deployments, expose separate endpoints for liveness and readiness probes:
# config/routes.rb
Rails.application.routes.draw do
  # Liveness: Is the process running?
  get "health/live" => "health#live"

  # Readiness: Can this instance handle traffic?
  get "health/ready" => "health#ready"
end

# app/controllers/health_controller.rb
class HealthController < ActionController::API
  # Liveness check - just verify the process responds
  def live
    head :ok
  end

  # Readiness check - verify dependencies are available
  def ready
    ActiveRecord::Base.connection.execute("SELECT 1")
    head :ok
  rescue StandardError
    head :service_unavailable
  end
end

Adding Timeouts
Health checks should fail fast. A hanging database connection shouldn't make the health check hang for 30 seconds. Wrap checks in timeouts:
# app/controllers/health_controller.rb
class HealthController < ActionController::API
  TIMEOUT_SECONDS = 3

  private

  def check_database
    Timeout.timeout(TIMEOUT_SECONDS) do
      ActiveRecord::Base.connection.execute("SELECT 1")
    end
    { healthy: true, message: "Connected" }
  rescue Timeout::Error
    { healthy: false, message: "Connection timeout" }
  rescue StandardError => e
    { healthy: false, message: e.message.truncate(100) }
  end
end

Apply the same pattern to Redis and other external service checks. Three seconds works well for most deployments: long enough for a slow query under load, short enough to fail promptly when services are down.
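One way to apply the same pattern to several checks without repeating the rescue clauses is a small wrapper. This is a sketch under stated assumptions: `guarded_check` is a hypothetical helper name, and the plain-Ruby `message[0, 100]` stands in for ActiveSupport's `truncate` so the example runs outside Rails.

```ruby
require "timeout"

TIMEOUT_SECONDS = 3

# Hypothetical helper: runs the block under a deadline and converts the
# outcome into the { healthy:, message: } shape used by the checks above.
def guarded_check(success_message, timeout: TIMEOUT_SECONDS)
  Timeout.timeout(timeout) { yield }
  { healthy: true, message: success_message }
rescue Timeout::Error
  { healthy: false, message: "Timed out after #{timeout}s" }
rescue StandardError => e
  # Cap the message length so long driver errors don't reach the response
  { healthy: false, message: e.message[0, 100] }
end

# A dependency that hangs is reported as unhealthy once the deadline passes.
slow = guarded_check("Connected", timeout: 0.1) { sleep 1 }
puts slow[:message] # prints Timed out after 0.1s

# A dependency that answers in time is reported with the success message.
fast = guarded_check("Connected") { :pong }
puts fast[:message] # prints Connected
```

Ruby's `Timeout.timeout` interrupts the block by raising an exception in the running thread, so where a client library offers its own connect and read timeout settings, those are generally the safer first line of defense, with a wrapper like this as a backstop.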
Caching Health Check Results
Under heavy load, health check endpoints can become a burden themselves. If Kubernetes polls each of 20 pods every 10 seconds, that adds up to roughly two health check requests per second, each one touching the database. Cache results briefly:
# app/controllers/health_controller.rb
class HealthController < ActionController::API
  def show
    checks = Rails.cache.fetch("health_check_result", expires_in: 5.seconds) do
      run_health_checks
    end

    if checks.values.all? { |v| v[:healthy] }
      render json: { status: "healthy" }, status: :ok
    else
      render json: { status: "unhealthy" }, status: :service_unavailable
    end
  end
end

Five seconds of caching prevents hammering dependencies while still detecting failures quickly. The cache only pays off when requests arrive more often than the expiry, for example when several load balancers and monitors poll the same instance or when the cache store is shared between instances. Adjust based on monitoring requirements.
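The effect of a short expiry can be simulated in plain Ruby. In this sketch, `TtlCache` is a hypothetical in-memory stand-in written for illustration, not the `Rails.cache` implementation, and the injected clock is an assumption that keeps the simulation deterministic.

```ruby
# Hypothetical stand-in for a cache with fetch-and-expire semantics.
class TtlCache
  def initialize(clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @clock = clock
    @entries = {}
    @mutex = Mutex.new
  end

  # Returns the cached value while it is fresh; otherwise runs the block,
  # stores its result for expires_in seconds, and returns it.
  def fetch(key, expires_in:)
    @mutex.synchronize do
      now = @clock.call
      entry = @entries[key]
      return entry[:value] if entry && now < entry[:expires_at]

      value = yield
      @entries[key] = { value: value, expires_at: now + expires_in }
      value
    end
  end
end

# Simulate one poll per second for 12 seconds against a 5-second expiry.
now = 0.0
cache = TtlCache.new(clock: -> { now })
runs = 0

12.times do |second|
  now = second.to_f
  cache.fetch("health_check_result", expires_in: 5) do
    runs += 1 # stands in for run_health_checks hitting the dependencies
    { database: { healthy: true, message: "Connected" } }
  end
end

puts runs # prints 3
```

Twelve polls trigger only three rounds of real checks in this simulation. The same mechanism also means an unhealthy result is served for up to the full expiry after the dependency recovers.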
Common Mistakes
Several patterns cause problems in production health checks:
- Checking non-critical services: If the email service is down, should the app return unhealthy? Usually not—degrade gracefully instead
- No timeouts: A health check that hangs defeats its purpose
- Exposing stack traces: Error messages should be truncated and sanitized
- Checking too much: Each check adds latency and failure modes
- Skipping authentication on detailed endpoints: Attackers use infrastructure information for reconnaissance
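The first and third points in the list can be sketched in plain Ruby. Everything named here is hypothetical: the `CRITICAL_CHECKS` list, the intermediate `:degraded` state, and the helper names are assumptions chosen for illustration, and which checks count as critical depends on the application.

```ruby
# Hypothetical classification: only these checks should take an instance
# out of rotation. Everything else is reported but tolerated.
CRITICAL_CHECKS = %i[database].freeze

# Returns :healthy, :degraded (only non-critical failures), or :unhealthy.
def overall_status(checks, critical: CRITICAL_CHECKS)
  failed = checks.reject { |_name, result| result[:healthy] }.keys
  return :healthy if failed.empty?

  (failed & critical).empty? ? :degraded : :unhealthy
end

# A degraded instance keeps serving traffic, so it still answers 200.
def http_status_for(status)
  status == :unhealthy ? 503 : 200
end

# Keeps only the first line of an error message and caps its length, so
# multi-line details never reach the response body.
def sanitize_message(error, limit: 100)
  error.message.lines.first.to_s.strip[0, limit]
end

checks = {
  database: { healthy: true, message: "Connected" },
  email: { healthy: false, message: "SMTP timeout" }
}

status = overall_status(checks)
puts status                  # prints degraded
puts http_status_for(status) # prints 200
```

The sanitizer here only shortens messages; it does not strip hostnames or credentials that a driver might embed in the first line, so detailed output still belongs behind the access restrictions described earlier.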
Summary
Production-ready health checks balance thoroughness with simplicity. Start with the built-in Rails health check for basic load balancer integration. Add database connectivity verification for readiness probes. Include detailed checks for infrastructure monitoring, but protect those endpoints from public access. Apply timeouts to every external call, and cache results to prevent health checks from becoming a performance problem themselves.
The goal remains straightforward: quickly answer whether this application instance can handle traffic, without exposing sensitive details or creating new failure modes in the process.