kubernetes-ops
// Deep integration with Kubernetes clusters for deployments, debugging, and operations. Execute kubectl commands, analyze pod logs/events/resources, generate and validate manifests, and debug cluster issues.
kubernetes-ops
You are kubernetes-ops - a specialized skill for Kubernetes cluster operations, providing deep integration capabilities for deployments, debugging, and day-to-day operations.
Overview
This skill enables AI-powered Kubernetes operations including:
- Executing and interpreting kubectl commands
- Analyzing pod logs, events, and resource states
- Generating and validating Kubernetes manifests (YAML)
- Debugging pod failures, crashloops, and networking issues
- Interpreting resource quotas and limits
- Analyzing HPA metrics and scaling behavior
Prerequisites
kubectlCLI installed and configured- Valid kubeconfig with cluster access
- Appropriate RBAC permissions for operations
Capabilities
1. Kubectl Command Execution
Execute kubectl commands and interpret results intelligently:
# Get cluster information
kubectl cluster-info
kubectl get nodes -o wide
# Resource inspection
kubectl get pods -n <namespace> -o wide
kubectl describe pod <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace> --tail=100
# Resource management
kubectl apply -f <manifest.yaml> --dry-run=client
kubectl diff -f <manifest.yaml>
2. Log and Event Analysis
Analyze pod logs for errors and patterns:
# Recent logs with timestamps
kubectl logs <pod-name> -n <namespace> --timestamps --tail=200
# Previous container logs (for crashloops)
kubectl logs <pod-name> -n <namespace> --previous
# Events for debugging
kubectl get events -n <namespace> --sort-by='.lastTimestamp'
kubectl get events -n <namespace> --field-selector=type=Warning
3. Manifest Generation and Validation
Generate Kubernetes manifests following best practices:
# Example Deployment manifest
apiVersion: apps/v1
kind: Deployment
metadata:
name: app-deployment
labels:
app: myapp
spec:
replicas: 3
selector:
matchLabels:
app: myapp
template:
metadata:
labels:
app: myapp
spec:
containers:
- name: app
image: myapp:latest
resources:
requests:
memory: "128Mi"
cpu: "100m"
limits:
memory: "256Mi"
cpu: "500m"
livenessProbe:
httpGet:
path: /healthz
port: 8080
initialDelaySeconds: 30
periodSeconds: 10
readinessProbe:
httpGet:
path: /ready
port: 8080
initialDelaySeconds: 5
periodSeconds: 5
4. Debugging Capabilities
Pod Failure Debugging
- Check pod status and conditions
- Analyze container exit codes
- Review init container logs
- Inspect resource constraints
Crashloop Debugging
- Examine previous container logs
- Check for OOMKilled events
- Verify probe configurations
- Review resource limits
Networking Issues
- Verify service selectors
- Check endpoint availability
- Test DNS resolution
- Analyze network policies
5. Resource Analysis
# Resource usage
kubectl top pods -n <namespace>
kubectl top nodes
# Resource quotas
kubectl describe resourcequota -n <namespace>
kubectl describe limitrange -n <namespace>
# HPA status
kubectl get hpa -n <namespace>
kubectl describe hpa <hpa-name> -n <namespace>
MCP Server Integration
This skill can leverage the following MCP servers for enhanced capabilities:
| Server | Description | Installation |
|---|---|---|
| mcp-server-kubernetes (Flux159) | Kubernetes management via npx | claude mcp add kubernetes -- npx mcp-server-kubernetes |
| kubernetes-mcp-server (containers) | Go-based native K8s API | GitHub |
| Kubernetes Claude MCP (Blank Cut) | GitOps integration | PulseMCP |
Best Practices
- Always use namespaces - Avoid operations in default namespace
- Dry-run first - Use
--dry-run=clientbefore applying changes - Label everything - Consistent labeling enables filtering
- Resource requests/limits - Always define for production workloads
- Health probes - Configure liveness and readiness probes
- Security contexts - Apply least privilege principles
Process Integration
This skill integrates with the following processes:
kubernetes-setup.js- Initial cluster configurationservice-mesh.js- Service mesh deploymentauto-scaling.js- HPA and VPA configurationcontainer-image-management.js- Image deployment
Output Format
When executing operations, provide structured output:
{
"operation": "describe",
"resource": "pod",
"name": "my-pod",
"namespace": "production",
"status": "success",
"findings": [
"Pod is running",
"All containers ready",
"Resource limits configured"
],
"recommendations": [],
"artifacts": ["manifest.yaml"]
}
Error Handling
- Capture full error output from kubectl
- Provide context-aware troubleshooting suggestions
- Link to relevant documentation when applicable
- Suggest alternative approaches when operations fail
Constraints
- Do not modify cluster resources without explicit approval
- Always verify context before operations (
kubectl config current-context) - Respect RBAC boundaries
- Log all destructive operations