🚀 Prometheus

Reference

Helm

Bash
snap install helm --classic
  • repo

    • List

      Bash
      helm repo list
      
    • Add

      Bash
      helm repo add [repo-name] [repo-url]
      
    • Update

      Bash
      helm repo update
      
  • release

    • List

      Bash
      helm list -A
      
    • Uninstall

      Bash
      helm uninstall [release] -n [namespace]
      
    • Install

      Bash
      helm install [release] [repo-name]/[chart] -n [namespace] --create-namespace
      
  • upgrade

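    • View values

      Bash
      # default values shipped with the chart
      helm show values [repo-name]/[chart]
      
      Bash
      # values supplied by the user for an installed release
      helm get values [release] -n [namespace]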
      
    • Modify

      Bash
      helm upgrade [release] [repo-name]/[chart] -n [namespace] -f values.yaml
      
    • rollback

      Bash
      # list revisions to find the revision number
      helm history [release] -n [namespace]
      
      Bash
      helm rollback [release] [revision] -n [namespace]
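
The commands above can be strung together into one workflow. The following is a minimal sketch, not a definitive script: the repository name, URL, chart, release, and namespace are hypothetical placeholders, and the `run` wrapper is an assumption added here so that the script only prints each command unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Dry-run wrapper: prints each command instead of running it unless DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical names, chosen only for illustration.
REPO_NAME="example-repo"
REPO_URL="https://charts.example.com"
CHART="example-chart"
RELEASE="demo"
NAMESPACE="demo"

run helm repo add "$REPO_NAME" "$REPO_URL"
run helm repo update
run helm install "$RELEASE" "$REPO_NAME/$CHART" -n "$NAMESPACE" --create-namespace
run helm upgrade "$RELEASE" "$REPO_NAME/$CHART" -n "$NAMESPACE" -f values.yaml
run helm history "$RELEASE" -n "$NAMESPACE"
# Roll back to revision 1 (the initial install) as an example.
run helm rollback "$RELEASE" 1 -n "$NAMESPACE"
run helm uninstall "$RELEASE" -n "$NAMESPACE"
```

Run it as-is to review the printed commands, then rerun with `DRY_RUN=0` once the placeholders point at a real repository and chart.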
      



Prometheus

  1. Add the prometheus-community repository

    Bash
    helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
    
  2. Install Prometheus

    Bash
    helm install prometheus prometheus-community/kube-prometheus-stack --namespace prometheus --create-namespace
    
  3. Expose the services

    YAML
    # values-prometheus.yaml
    prometheus:
      service:
        type: NodePort
    
    grafana:
      service:
        type: NodePort
    
    alertmanager:
      service:
        type: NodePort
    
    Bash
    helm upgrade prometheus prometheus-community/kube-prometheus-stack -n prometheus -f values-prometheus.yaml
    
  4. Check the services

    Bash
    kubectl get svc -n prometheus
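
Once the Services are of type NodePort, it can help to turn the `kubectl get svc` output into URLs you can open. The sketch below is one possible way to do that, assuming `kubectl` has access to the cluster and that the InternalIP of the first node is reachable from your machine. The function names are hypothetical helpers introduced here, not part of any tool.

```shell
#!/usr/bin/env bash
# Sketch: print a URL for every NodePort Service in a namespace.
# Nothing runs until list_nodeport_urls is called.

# Compose a URL from a node address and a port.
nodeport_url() {
  printf 'http://%s:%s\n' "$1" "$2"
}

# InternalIP of the first node (assumption: reachable from your machine).
first_node_ip() {
  kubectl get nodes \
    -o jsonpath='{.items[0].status.addresses[?(@.type=="InternalIP")].address}'
}

# Print "<service-name> <url>" for each port of each NodePort Service in namespace $1.
list_nodeport_urls() {
  ns="$1"
  ip="$(first_node_ip)"
  kubectl get svc -n "$ns" \
    -o jsonpath='{range .items[?(@.spec.type=="NodePort")]}{.metadata.name}{" "}{.spec.ports[*].nodePort}{"\n"}{end}' |
  while read -r name ports; do
    for port in $ports; do
      echo "$name $(nodeport_url "$ip" "$port")"
    done
  done
}
```

For the installation above, the call would be `list_nodeport_urls prometheus`. A Service with several ports produces one line per port.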
    



GPU Operator

  1. Add the repository

    Bash
    helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
    
  2. Install GPU Operator

    Bash
    helm install gpu-operator nvidia/gpu-operator -n gpu-operator --create-namespace
    
  3. Expose the service

    YAML
    # values-gpu-operator.yaml
    dcgmExporter:
      service:
        type: NodePort
    
    Bash
    helm upgrade gpu-operator nvidia/gpu-operator -n gpu-operator -f values-gpu-operator.yaml
    
  4. Check the services

    Bash
    kubectl get svc -n gpu-operator
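
To confirm that the exposed dcgm-exporter Service actually serves GPU metrics, you can fetch its `/metrics` endpoint through the NodePort. The following is a rough sketch under stated assumptions: `curl` is installed, the node address and NodePort are taken from the `kubectl get svc` output above, and `DCGM_FI_DEV_GPU_UTIL` is among the metrics your exporter is configured to emit. Both function names are hypothetical helpers.

```shell
#!/usr/bin/env bash
# Sketch: fetch dcgm-exporter metrics and keep only the samples of one metric.

# Keep the sample lines for metric $1, dropping "# HELP" / "# TYPE" comments
# and metrics whose names merely share the same prefix.
filter_metric() {
  grep "^$1[{ ]"
}

# Fetch /metrics from http://$1:$2 (requires curl and a reachable node).
fetch_metrics() {
  curl -fsS "http://$1:$2/metrics"
}
```

A possible invocation is `fetch_metrics [node-ip] [node-port] | filter_metric DCGM_FI_DEV_GPU_UTIL`, with both placeholders filled in from your cluster.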