Add manual Docker socket mount fix and comprehensive troubleshooting

Problem: CapRover's serviceUpdateOverride in captain-definition doesn't
always apply automatically, causing "Docker socket NOT found" errors.

Solution: Manual docker service update command to apply the mount.

Changes:
- Added CAPROVER_TROUBLESHOOTING.md with complete step-by-step fix
  - Manual docker service update command
  - Verification steps
  - Common issues and solutions
  - SELinux/AppArmor troubleshooting

- Created fix-caprover-docker-mount.sh automated script
  - Finds service automatically
  - Applies mount
  - Verifies configuration
  - Shows service status

- Enhanced backend/app.py diagnostics
  - Lists /var/run directory contents
  - Shows Docker-related files
  - Better error messages explaining the issue
  - Explicit note when mount is missing

- Updated backend/requirements.txt
  - Docker SDK 7.0.0 -> 7.1.0 (fixes URL scheme error)

- Updated CAPROVER_DEPLOYMENT.md
  - Prominent warning about serviceUpdateOverride limitation
  - New Step 4: Verify and Apply Docker Socket Mount
  - Quick fix command prominently displayed
  - Links to troubleshooting guide
  - Updated troubleshooting section with manual fix

- Updated QUICKSTART.md
  - Warning after backend deployment instructions
  - Quick fix command for both deployment options
  - Links to troubleshooting guide

This provides users with immediate solutions when encountering the
"Cannot connect to Docker" error, which is now properly diagnosed
and can be fixed with a single command.

https://claude.ai/code/session_01NfGGGQ9Zn6ue7PRZpAoB2N
Claude
2026-01-30 19:48:53 +00:00
parent 97790045ff
commit f8d2320236
6 changed files with 503 additions and 32 deletions

CAPROVER_DEPLOYMENT.md

@@ -71,6 +71,25 @@ The `backend/captain-definition` file contains critical configuration:
5. **Replicas**: Set to 1 (multiple replicas can't share the same socket)
### ⚠️ IMPORTANT: serviceUpdateOverride Limitation
**The `serviceUpdateOverride` in `captain-definition` may not be applied automatically by CapRover.** This is a known limitation with some CapRover versions.
**If you see "Docker socket NOT found" in your logs**, you MUST manually apply the Docker socket mount after deployment.
**Quick Fix** (run on your CapRover server):
```bash
# SSH into your CapRover server
ssh root@your-server.com
# Apply the mount (replace with your service name)
docker service update \
--mount-add type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock \
srv-captain--terminalbackend
```
**See [CAPROVER_TROUBLESHOOTING.md](CAPROVER_TROUBLESHOOTING.md) for detailed instructions.**
### Security Considerations
**IMPORTANT**: Granting Docker socket access to a container is a security-sensitive operation. The container effectively has root access to the host system.
@@ -116,27 +135,79 @@ caprover deploy
Or manually:
1. Create a tarball: `tar -czf backend.tar.gz .`
1. Create a tarball: `tar -cf backend.tar .`
2. Upload via CapRover dashboard
3. Wait for deployment to complete
#### 4. Verify Deployment
#### 4. **CRITICAL: Verify and Apply Docker Socket Mount**
Check the application logs in CapRover dashboard. You should see:
After deployment, check if the Docker socket is mounted:
**a) Check Application Logs** (in CapRover dashboard):
Look for:
```
=== Docker Environment Diagnosis ===
DOCKER_HOST: unix:///var/run/docker.sock
✓ Docker socket exists at /var/run/docker.sock
Socket permissions: 0o140777
Readable: True
Writable: True
Current user: root (UID: 0, GID: 0)
✓ Successfully connected to Docker using Unix socket
✓ Docker connection verified on startup
```
If you see errors, check the "Troubleshooting" section below.
If you see:
```
✗ Docker socket NOT found at /var/run/docker.sock
```
Then the `serviceUpdateOverride` wasn't applied. **You must manually apply it.**
**b) Manually Apply the Mount** (run on your CapRover server):
```bash
# SSH into your CapRover server
ssh root@your-server.com
# Find your service name
docker service ls | grep terminalbackend
# Should show something like: srv-captain--terminalbackend
# Apply the Docker socket mount
docker service update \
--mount-add type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock \
srv-captain--terminalbackend
```
**c) Verify the Mount Was Applied**:
```bash
docker service inspect srv-captain--terminalbackend \
--format '{{json .Spec.TaskTemplate.ContainerSpec.Mounts}}' | python3 -m json.tool
```
Should show:
```json
[
{
"Type": "bind",
"Source": "/var/run/docker.sock",
"Target": "/var/run/docker.sock"
}
]
```
**d) Wait for Service Restart**:
The service will automatically restart with the new configuration. Monitor:
```bash
docker service ps srv-captain--terminalbackend
```
**e) Check Logs Again**:
In CapRover dashboard, refresh the logs. You should now see:
```
✓ Docker socket exists at /var/run/docker.sock
✓ Docker connection verified on startup
```
**See [CAPROVER_TROUBLESHOOTING.md](CAPROVER_TROUBLESHOOTING.md) for detailed troubleshooting.**
#### 5. Test the API
@@ -184,33 +255,72 @@ caprover deploy
### "Cannot connect to Docker" Error
If you see this error, check the following:
**This is the most common issue!** CapRover's `serviceUpdateOverride` often doesn't apply automatically.
1. **Verify captain-definition**: Ensure `serviceUpdateOverride` is present and correct
#### Quick Fix (Run on CapRover Server)
2. **Check logs for diagnostics**:
```bash
# SSH into your CapRover server
ssh root@your-server.com
# Apply the Docker socket mount manually
docker service update \
--mount-add type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock \
srv-captain--terminalbackend
# Verify it worked
docker service inspect srv-captain--terminalbackend \
--format '{{json .Spec.TaskTemplate.ContainerSpec.Mounts}}' | python3 -m json.tool
```
**📖 See [CAPROVER_TROUBLESHOOTING.md](CAPROVER_TROUBLESHOOTING.md) for complete step-by-step instructions.**
#### Diagnostic Checklist
If the quick fix doesn't work, check:
1. **Check logs in CapRover dashboard** for:
```
=== Docker Environment Diagnosis ===
Docker socket NOT found at /var/run/docker.sock
```
Look for:
- Socket existence
- Permissions (should be readable and writable)
- User info (should be root)
3. **Common issues**:
2. **Verify socket exists on host**:
```bash
ls -la /var/run/docker.sock
```
**Socket not found**:
- The mount configuration isn't being applied
- Redeploy the app after updating `captain-definition`
3. **Check service is running as root**:
```bash
docker service inspect srv-captain--terminalbackend \
--format '{{.Spec.TaskTemplate.ContainerSpec.User}}'
```
Should return: `root`
**Permission denied**:
- User isn't root
- Socket permissions are wrong
- Check that `"User": "root"` is in captain-definition
4. **Check Docker version compatibility**:
```bash
docker version
```
**Connection refused**:
- Docker daemon isn't running on the host
- Check CapRover host: `docker info`
5. **Review SELinux/AppArmor** if on RHEL/Ubuntu:
```bash
getenforce # SELinux only; should be Permissive or Disabled for testing
```
#### Common Issues
**Socket not found**:
- ✅ **Solution**: Manually apply mount (see Quick Fix above)
- The `serviceUpdateOverride` wasn't applied by CapRover
**Permission denied**:
- ✅ **Solution**: Ensure service runs as root:
```bash
docker service update --user root srv-captain--terminalbackend
```
**Connection refused / "Not supported URL scheme http+docker"**:
- ✅ **Solution**: Update docker library version in `requirements.txt` to `docker==7.1.0`
- Redeploy the application
### Viewing Logs

CAPROVER_TROUBLESHOOTING.md Normal file

@@ -0,0 +1,259 @@
# CapRover Docker Socket Troubleshooting Guide
This guide helps resolve the "Cannot connect to Docker" error in CapRover deployments.
## Problem: Docker Socket Not Mounted
### Symptoms
In your CapRover application logs, you see:
```
✗ Docker socket NOT found at /var/run/docker.sock
This means the Docker socket mount is NOT configured in CapRover
The serviceUpdateOverride in captain-definition may not be applied
```
### Root Cause
CapRover's `serviceUpdateOverride` in `captain-definition` **may not always be applied automatically**. This is a known limitation with some CapRover versions or configurations.
## Solution: Manual Docker Service Update
You need to manually update the Docker Swarm service to mount the Docker socket.
### Step 1: SSH into Your CapRover Server
```bash
ssh root@your-caprover-server.com
```
### Step 2: Find Your Service Name
List all services to find your backend service:
```bash
docker service ls
```
CapRover names each service `srv-captain--<app-name>`, so an app called `terminalbackend` appears as `srv-captain--terminalbackend`.
### Step 3: Check Current Mounts
```bash
docker service inspect srv-captain--terminalbackend \
--format '{{json .Spec.TaskTemplate.ContainerSpec.Mounts}}' | python3 -m json.tool
```
If this returns `null` or an empty array, the mount isn't configured.
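The same check can be scripted. The following is a minimal Python sketch, assuming the JSON printed by the `docker service inspect` command above is passed in as a string; the helper name `has_socket_mount` is hypothetical and not part of this repository.

```python
import json

SOCKET_PATH = "/var/run/docker.sock"


def has_socket_mount(mounts_json: str, socket_path: str = SOCKET_PATH) -> bool:
    """Return True if the inspect output lists a bind mount of the Docker socket."""
    mounts = json.loads(mounts_json)  # the text "null" parses to None
    if not mounts:  # covers both None and an empty list
        return False
    return any(
        m.get("Type") == "bind"
        and m.get("Source") == socket_path
        and m.get("Target") == socket_path
        for m in mounts
    )


if __name__ == "__main__":
    sample = (
        '[{"Type": "bind", "Source": "/var/run/docker.sock", '
        '"Target": "/var/run/docker.sock"}]'
    )
    print(has_socket_mount(sample))
    print(has_socket_mount("null"))
```

A `False` result corresponds to the "mount isn't configured" case described above.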
### Step 4: Apply the Docker Socket Mount
Run this command to mount the Docker socket:
```bash
docker service update \
--mount-add type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock \
srv-captain--terminalbackend
```
**Important**: Replace `srv-captain--terminalbackend` with your actual service name.
### Step 5: Verify the Mount
```bash
docker service inspect srv-captain--terminalbackend \
--format '{{json .Spec.TaskTemplate.ContainerSpec.Mounts}}' | python3 -m json.tool
```
You should see:
```json
[
{
"Type": "bind",
"Source": "/var/run/docker.sock",
"Target": "/var/run/docker.sock"
}
]
```
### Step 6: Wait for Service Restart
Docker Swarm will automatically restart your service with the new configuration. Monitor the status:
```bash
docker service ps srv-captain--terminalbackend --no-trunc
```
### Step 7: Check Logs
In CapRover dashboard, go to your app and check logs. You should now see:
```
✓ Docker socket exists at /var/run/docker.sock
Socket permissions: 0o140777
Readable: True
Writable: True
✓ Successfully connected to Docker using Unix socket
✓ Docker connection verified on startup
```
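The `Socket permissions: 0o140777` value packs the file type and the permission bits into one number. A minimal sketch using Python's standard `stat` module shows how to read it, assuming the logged value is the raw `st_mode` of the socket; the helper name `describe_mode` is hypothetical.

```python
import stat


def describe_mode(mode: int) -> dict:
    """Split a raw st_mode value into its file type and permission bits."""
    return {
        "is_socket": stat.S_ISSOCK(mode),           # file-type bits say "socket"
        "permissions": oct(stat.S_IMODE(mode)),     # permission bits only
        "ls_style": stat.filemode(mode),            # same form that `ls -l` prints
    }


if __name__ == "__main__":
    print(describe_mode(0o140777))
```

The leading `0o140000` marks the file as a socket, and the trailing `777` means read, write, and execute for owner, group, and others.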
### Step 8: Test the API
```bash
curl https://terminalbackend.wardcrew.com/api/health
```
Should return:
```json
{"status":"healthy"}
```
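For a scripted check, here is a minimal Python sketch that uses only the standard library. The function names `is_healthy` and `check_health` are hypothetical, and the sketch assumes the endpoint returns the JSON body shown above.

```python
import json
import urllib.request


def is_healthy(body: str) -> bool:
    """Return True if the body is a JSON object whose status is "healthy"."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "healthy"


def check_health(base_url: str, timeout: float = 5.0) -> bool:
    """Fetch <base_url>/api/health and evaluate the response body."""
    url = f"{base_url}/api/health"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return is_healthy(resp.read().decode("utf-8"))
```

Calling `check_health("https://terminalbackend.wardcrew.com")` performs the same request as the `curl` command. Note that `urlopen` raises an exception on connection failures and HTTP error statuses instead of returning `False`.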
## Automated Script
We've provided a script to automate this process. Download it from the repository:
```bash
# On your CapRover server
wget https://raw.githubusercontent.com/johndoe6345789/docker-swarm-termina/main/fix-caprover-docker-mount.sh
chmod +x fix-caprover-docker-mount.sh
# Run it
./fix-caprover-docker-mount.sh srv-captain--terminalbackend
```
## Alternative Solution: Use CapRover's Service Update Feature
Some CapRover versions support manual service configuration through the UI:
1. Go to CapRover dashboard
2. Navigate to your app
3. Click on "⚙️ Edit Default Nginx Configurations" (or similar settings)
4. Look for advanced Docker/Swarm settings
5. Add the mount configuration
However, the availability of this feature varies by CapRover version.
## Why serviceUpdateOverride Doesn't Always Work
The `captain-definition` file's `serviceUpdateOverride` field is designed to apply custom Docker Swarm configurations. However:
1. **Timing Issue**: It may only apply on initial deployment, not on updates
2. **CapRover Version**: Older versions may not fully support this feature
3. **Validation**: CapRover may skip configurations it deems risky
4. **Security**: Some CapRover installations restrict privileged configurations
## Persistence
Once you've manually applied the mount using `docker service update`, it will **persist across app updates** as long as you don't:
- Delete and recreate the app in CapRover
- Manually remove the mount
- Use a CapRover feature that resets service configuration
## Additional Troubleshooting
### Issue: "Permission denied" errors
**Solution**: Ensure the service runs as root:
```bash
docker service update \
--user root \
srv-captain--terminalbackend
```
### Issue: Socket exists but connection still fails
**Diagnosis**: Check socket permissions on the host:
```bash
ls -la /var/run/docker.sock
```
Should be:
```
srw-rw---- 1 root docker /var/run/docker.sock
```
**Solution**: Fix permissions:
```bash
chmod 666 /var/run/docker.sock # Temporary - not recommended for production
# OR
chmod 660 /var/run/docker.sock
chown root:docker /var/run/docker.sock
```
### Issue: "Not supported URL scheme http+docker"
This error indicates a docker-py library issue.
**Solution**: Update the docker library version in `requirements.txt`:
```
docker==7.1.0
```
Then redeploy the app.
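To confirm which version is actually installed in the running environment, a minimal Python sketch along these lines may help. The helper names are hypothetical, and the parser assumes a plain numeric `X.Y.Z` version string (no pre-release suffixes).

```python
from importlib.metadata import PackageNotFoundError, version

MINIMUM = (7, 1, 0)


def parse_version(text: str) -> tuple:
    """Convert a plain 'X.Y.Z' version string into a tuple of ints."""
    return tuple(int(part) for part in text.split(".")[:3])


def docker_sdk_is_recent(installed: str, minimum: tuple = MINIMUM) -> bool:
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= minimum


if __name__ == "__main__":
    try:
        installed = version("docker")  # distribution name used in requirements.txt
        status = "OK" if docker_sdk_is_recent(installed) else "older than 7.1.0"
        print(f"docker SDK {installed}: {status}")
    except PackageNotFoundError:
        print("docker SDK is not installed in this environment")
```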
### Issue: Can't find service name
**Solution**: List all services with details:
```bash
docker service ls --format "table {{.Name}}\t{{.Mode}}\t{{.Replicas}}"
```
Look for services starting with `srv-captain--`
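If the listing is long, the filtering can be scripted. This is a minimal Python sketch, assuming the input is the output of `docker service ls --format "{{.Name}}"` (one service name per line); the helper name `caprover_apps` is hypothetical.

```python
PREFIX = "srv-captain--"


def caprover_apps(service_names: str) -> list:
    """Return the app names of all services that carry the CapRover app prefix."""
    return [
        line.strip()[len(PREFIX):]
        for line in service_names.splitlines()
        if line.strip().startswith(PREFIX)
    ]


if __name__ == "__main__":
    listing = "captain-captain\nsrv-captain--terminalbackend\n"
    print(caprover_apps(listing))
```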
### Issue: Mount applied but service won't start
**Diagnosis**: Check service logs:
```bash
docker service logs srv-captain--terminalbackend --tail 100 --follow
```
**Common causes**:
- SELinux blocking socket access
- AppArmor policies
- Container runtime restrictions
**Solution**: Temporarily switch SELinux to permissive mode to test (AppArmor, where used, is managed with separate tools):
```bash
# SELinux
setenforce 0
# After testing, re-enable
setenforce 1
```
## Production Recommendations
For production deployments:
1. **Use Docker Socket Proxy**: Instead of mounting the raw socket, use a proxy like [tecnativa/docker-socket-proxy](https://github.com/Tecnativa/docker-socket-proxy)
2. **Limit API Access**: Configure proxy to only allow specific Docker API endpoints
3. **Network Isolation**: Deploy backend on a dedicated private network
4. **Audit Logging**: Enable Docker audit logging for socket access
5. **Regular Updates**: Keep Docker, CapRover, and your application updated
## Support
If you're still experiencing issues:
1. **Check CapRover CLI version**: `caprover --version` (this reports the CLI version, which can differ from the server version)
2. **Check Docker version**: `docker version`
3. **Review CapRover logs**: `docker service logs captain-captain --tail 100`
4. **Test Docker socket on host**: `docker ps` (should work without errors)
Open an issue with:
- CapRover version
- Docker version
- Complete application logs
- Output of `docker service inspect srv-captain--terminalbackend`

QUICKSTART.md

@@ -19,7 +19,21 @@ Get up and running with Docker Swarm Terminal in minutes.
- Go to "Deployment" tab
- Upload the `.tar` file (uncompressed - required by CapRover)
3. Wait for deployment to complete
4. Check logs for: `✓ Docker connection verified on startup`
4. **Check logs for: `✓ Docker connection verified on startup`**
**⚠️ IMPORTANT**: If you see `✗ Docker socket NOT found`, you must manually apply the mount:
```bash
# SSH into your CapRover server
ssh root@your-server.com
# Apply the Docker socket mount
docker service update \
--mount-add type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock \
srv-captain--terminalbackend
```
See [CAPROVER_TROUBLESHOOTING.md](CAPROVER_TROUBLESHOOTING.md) for details.
### Frontend
@@ -50,6 +64,14 @@ caprover deploy
caprover logs terminalbackend --follow
```
**⚠️ IMPORTANT**: If logs show `✗ Docker socket NOT found`, manually apply the mount on your CapRover server:
```bash
docker service update \
--mount-add type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock \
srv-captain--terminalbackend
```
### Frontend
```bash

backend/app.py

@@ -39,6 +39,22 @@ def diagnose_docker_environment():
    logger.info(f"DOCKER_CERT_PATH: {docker_cert_path}")
    logger.info(f"DOCKER_TLS_VERIFY: {docker_tls_verify}")
    # Check what's in /var/run
    logger.info("Checking /var/run directory contents:")
    try:
        if os.path.exists('/var/run'):
            var_run_contents = os.listdir('/var/run')
            logger.info(f" /var/run contains: {var_run_contents}")
            # Check for any Docker-related files
            docker_related = [f for f in var_run_contents if 'docker' in f.lower()]
            if docker_related:
                logger.info(f" Docker-related files/dirs found: {docker_related}")
        else:
            logger.warning(" /var/run directory doesn't exist")
    except Exception as e:
        logger.error(f" Error reading /var/run: {e}")
    # Check Docker socket
    socket_path = '/var/run/docker.sock'
    logger.info(f"Checking Docker socket at {socket_path}")
@@ -63,6 +79,8 @@ def diagnose_docker_environment():
logger.warning(f"⚠ Socket exists but lacks proper permissions!")
else:
logger.error(f"✗ Docker socket NOT found at {socket_path}")
logger.error(f" This means the Docker socket mount is NOT configured in CapRover")
logger.error(f" The serviceUpdateOverride in captain-definition may not be applied")
# Check current user
import pwd

backend/requirements.txt

@@ -1,4 +1,4 @@
Flask==3.0.0
Flask-CORS==6.0.0
python-dotenv==1.0.0
docker==7.0.0
docker==7.1.0

fix-caprover-docker-mount.sh

@@ -0,0 +1,62 @@
#!/bin/bash
# Script to manually apply Docker socket mount to CapRover service
# Run this on your CapRover server if serviceUpdateOverride doesn't work
set -e
# Service name - update this to match your CapRover app name
SERVICE_NAME="${1:-srv-captain--terminalbackend}"
echo "=== CapRover Docker Socket Mount Fix ==="
echo "Service name: $SERVICE_NAME"
echo ""
# Check if service exists (exact name lookup; grep would also match substrings)
if ! docker service inspect "$SERVICE_NAME" > /dev/null 2>&1; then
    echo "❌ Error: Service '$SERVICE_NAME' not found"
    echo ""
    echo "Available services:"
    docker service ls --format "{{.Name}}"
    echo ""
    echo "Usage: $0 <service-name>"
    echo "Example: $0 srv-captain--terminalbackend"
    exit 1
fi
echo "✓ Service found"
echo ""
# Show current service configuration
echo "Current service mounts:"
docker service inspect "$SERVICE_NAME" --format '{{json .Spec.TaskTemplate.ContainerSpec.Mounts}}' | python3 -m json.tool || echo "No mounts configured"
echo ""
# Update service with Docker socket mount
echo "Applying Docker socket mount..."
docker service update \
--mount-add type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock \
"$SERVICE_NAME"
echo ""
echo "✓ Mount applied successfully"
echo ""
# Verify the update
echo "Verifying updated configuration:"
docker service inspect "$SERVICE_NAME" --format '{{json .Spec.TaskTemplate.ContainerSpec.Mounts}}' | python3 -m json.tool
echo ""
# Check service status
echo "Service status:"
docker service ps "$SERVICE_NAME" --no-trunc
echo ""
echo "=== Next Steps ==="
echo "1. Wait for the service to restart (check logs in CapRover dashboard)"
echo "2. Look for this in logs: '✓ Docker socket exists at /var/run/docker.sock'"
echo "3. Test the API: curl https://your-backend-domain.com/api/health"
echo ""
echo "If you still see errors, check:"
echo " - The service is running as root (not restricted by CapRover)"
echo " - SELinux/AppArmor isn't blocking socket access"
echo " - Docker socket exists on host: ls -la /var/run/docker.sock"