How node-feature-discovery actually labels GPU nodes

GPU Feature Discovery, running on top of Node Feature Discovery, labels nodes with the GPU model string reported by the driver, and pod selectors must match that string exactly to schedule workloads.

The Kubernetes scheduler cannot see GPU hardware directly. It relies on labels injected into the node object by external controllers. Node Feature Discovery (NFD) is the standard mechanism for this injection, and NVIDIA's GPU Feature Discovery (GFD) extends it to NVIDIA hardware. NFD detects the NVIDIA PCI device; GFD queries the driver for the product name and passes the resulting labels to NFD, which writes them to the node. These labels become the source of truth for node affinity rules in pod specs.

Operators often assume the label is a stable identifier like gpu-type=a100. In practice, the label value is derived from the product name the driver reports (the same name nvidia-smi prints), with spaces replaced by hyphens. If a driver or GFD update changes that string, the label value changes. If the pod’s nodeAffinity does not match the new string, the scheduler cannot place the pod and it stays Pending. The scheduler is doing exactly what it was told: looking for a node whose label matches the affinity rule. The pod’s FailedScheduling event reports that no node matched the node affinity or selector, but nothing explains why the expected label value is absent.

The discovery and labeling flow

The process begins with the GFD DaemonSet running on each GPU node. It queries the NVIDIA driver for each device's product name and generates a set of labels. GFD does not patch the Node object itself; it hands the labels to NFD, which applies them. NFD's own labels use the feature.node.kubernetes.io/ prefix, while the labels GFD generates use the nvidia.com/ prefix.

The specific key for the GPU model is nvidia.com/gpu.product. The value is the full model string, such as NVIDIA-A100-SXM4-80GB. This is not a normalized value like A100. It includes the form factor (SXM4) and memory size (80GB). This granularity is intentional; it allows selectors to distinguish between an A100 with 40GB VRAM and one with 80GB VRAM, which have different performance characteristics for large models.
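
As a minimal sketch of what exact matching means on the pod side, the manifest below pins a pod to nodes whose nvidia.com/gpu.product label equals the full model string. The pod name, container name, and image are placeholders, and the model string is the example value used in this article; substitute whatever your nodes actually report.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: a100-job                      # placeholder name
spec:
  nodeSelector:
    # Compared as an exact string; "A100" or "NVIDIA-A100" would not match.
    nvidia.com/gpu.product: NVIDIA-A100-SXM4-80GB
  containers:
    - name: trainer                   # placeholder container
      image: registry.example.com/trainer:latest   # placeholder image
      resources:
        limits:
          nvidia.com/gpu: 1
```

nodeSelector is an equality check on label values, so any difference in the string, including a missing SXM4 segment, leaves the pod Pending.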

Once the label is on the node, the kube-scheduler can use it. However, operators often want to use a shorter or more stable label for their pod specs. This is where the NodeFeatureRule CRD comes in. It allows NFD to read the discovered feature label and write a secondary, custom label. This custom label is what the pod spec typically references.

The following table shows the relationship between the GFD label, an optional derived label, and the pod spec. The key example.com/gpu-class and its value are illustrative names, not labels that NFD or GFD define.

Source	Label Key	Label Value	Purpose
GFD (applied by NFD)	`nvidia.com/gpu.product`	`NVIDIA-A100-SXM4-80GB`	Raw discovery
NodeFeatureRule	`example.com/gpu-class`	`a100-80gb`	Derived scheduling label
Pod Spec	either key above, in `nodeAffinity` or `nodeSelector`	the matching value	Constraint

The NodeFeatureRule is an optional translation layer. Without it, pods must reference nvidia.com/gpu.product and the full model string, which changes whenever the reported product name changes. With it, the cluster can expose a shorter label whose value the operator controls. The nvidia-device-plugin does not manage any of these labels; it advertises the extended resource nvidia.com/gpu. GFD and NFD manage the metadata labels.

Writing the NodeFeatureRule

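The NodeFeatureRule CRD lives in the nfd.k8s-sigs.io/v1alpha1 API group. It defines rules that match discovered features and apply new labels. The matchFeatures block specifies the conditions as a list of feature names, each with a set of matchExpressions. The labels block specifies the output.

A common mistake is assuming the In operator accepts shell-style wildcards. It does not. In compares the discovered value against each listed string exactly, so a pattern like NVIDIA-A100* is treated as a literal. If the driver reports NVIDIA-A100-SXM4-80GB, a rule looking for NVIDIA-A100 will not match. Pattern matching is available only through the separate InRegexp operator. Exact matching prevents accidental broad matches, but it creates fragility when a driver update changes the string.

The following YAML shows a NodeFeatureRule that maps the raw model string to a scheduling-friendly label. It assumes GFD's labels reach NFD through the local feature source, where rules can read them as the local.label feature; the output key example.com/gpu-class is an illustrative name.

apiVersion: nfd.k8s-sigs.io/v1alpha1
kind: NodeFeatureRule
metadata:
  name: nvidia-gpu-product
spec:
  rules:
    - name: "NVIDIA A100 80GB"
      labels:
        # Illustrative output label; choose a key your cluster allows.
        example.com/gpu-class: "a100-80gb"
      matchFeatures:
        # Assumption: GFD's labels are visible as the local.label feature.
        - feature: local.label
          matchExpressions:
            # "In" is an exact string comparison against each listed value.
            nvidia.com/gpu.product: {op: In, value: ["NVIDIA-A100-SXM4-80GB"]}

This rule tells NFD to look for the exact string NVIDIA-A100-SXM4-80GB in the node's nvidia.com/gpu.product value. If found, it writes example.com/gpu-class=a100-80gb. The pod spec then uses example.com/gpu-class in its nodeSelector or nodeAffinity.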

To verify the labels exist on the node, use kubectl get nodes with label columns. Replace example.com/gpu-class with whatever key your NodeFeatureRule writes.

kubectl get nodes -L nvidia.com/gpu.product,example.com/gpu-class

The output should show a value in the nvidia.com/gpu.product column for every GPU node. If the NodeFeatureRule is active, the derived label column should be populated as well. If it is empty, the NodeFeatureRule did not match the node’s features. This usually means the value in the rule does not match the value on the node exactly.

Failure modes and string drift

The most common failure mode is the exact-match requirement. Operators often write NodeFeatureRule values with wildcards to be safe, such as NVIDIA-A100*. NodeFeatureRule match expressions do not support globbing. The In operator matches exact strings, Exists checks only for presence, and pattern matching requires the InRegexp operator with a regular expression. If the rule specifies "NVIDIA-A100-SXM4-80GB" but the node reports "NVIDIA-A100-80GB", the rule does not match, and nothing is reported as an error. The node does not get the custom label, and the pod stays Pending.

Silence is the danger here. The kube-scheduler does not treat this as an error because the pod’s affinity rule is valid; it just cannot find a matching node, and the FailedScheduling event says only that no node matched the affinity or selector. The nfd-worker logs on the node might show the discovery, but rules are evaluated by nfd-master, and a rule that simply does not match is not an error there either. Debugging requires comparing the NodeFeatureRule spec against the actual node labels.

String drift occurs when a driver or GFD update changes the model string. An update might change NVIDIA-A100-80GB to NVIDIA-A100-SXM4-80GB to distinguish the form factor. If the NodeFeatureRule is pinned to the old string, the new nodes will not get the derived label. This is not a bug in NFD; it is configuration drift. The NodeFeatureRule is static, but the driver output is dynamic.
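
One way to make a rule tolerate this kind of drift is the InRegexp operator that NodeFeatureRule match expressions provide. The sketch below is not a definitive configuration: it assumes GFD's nvidia.com/gpu.product label is visible to rules as the local.label feature, the output key example.com/gpu-class is a made-up name, and the regular expression covers only the two example strings discussed here.

```yaml
apiVersion: nfd.k8s-sigs.io/v1alpha1
kind: NodeFeatureRule
metadata:
  name: a100-80gb-drift-tolerant      # hypothetical rule name
spec:
  rules:
    - name: "A100 80GB, any form factor"
      labels:
        # Illustrative derived label; the value is controlled by the operator,
        # so it stays stable even if the reported product name changes.
        example.com/gpu-class: "a100-80gb"
      matchFeatures:
        # Assumption: GFD labels are readable through the local.label feature.
        - feature: local.label
          matchExpressions:
            nvidia.com/gpu.product:
              op: InRegexp
              # Anchored so that only A100 strings ending in 80GB match,
              # e.g. NVIDIA-A100-80GB and NVIDIA-A100-SXM4-80GB.
              value: ["^NVIDIA-A100-(.*-)?80GB$"]
```

The trade-off is breadth: a looser pattern survives more renames but can also match SKUs the workload was never tested on, so the expression should stay as narrow as the fleet allows.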

Another failure mode is confusing resource requests with labels. The nvidia-device-plugin allocates the resource nvidia.com/gpu. This is a count, not a type. A pod requesting nvidia.com/gpu: 1 will schedule on any node with at least one GPU, regardless of the model. To schedule on a specific model, the pod must also have nodeAffinity or nodeSelector constraints. The resource request ensures capacity; the affinity constraint ensures capability.
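
The split between capacity and capability can be sketched in a single pod manifest. The resource limit asks for one GPU of any model, while the nodeAffinity term restricts which models are acceptable. The pod name, container name, and image are placeholders, and the two model strings are the example values from this article.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: a100-training                 # placeholder name
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: nvidia.com/gpu.product
                operator: In          # exact string match against each value
                values:
                  - NVIDIA-A100-SXM4-80GB
                  - NVIDIA-A100-80GB
  containers:
    - name: trainer                   # placeholder container
      image: registry.example.com/trainer:latest   # placeholder image
      resources:
        limits:
          nvidia.com/gpu: 1           # capacity: one GPU, model not implied
```

Listing more than one value under In is the only way to accept several model strings in a pod spec, because node affinity operators have no pattern-matching form.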

Decision frame

The question the next time a GPU pod stays Pending is not “did NFD discover the hardware.” It is “does the string in the pod’s selector, and in any NodeFeatureRule feeding it, match the node’s label value exactly.” Pod nodeSelector and nodeAffinity compare label values as exact strings, and so does the In operator in a NodeFeatureRule; neither interprets wildcards. If the driver reports NVIDIA-A100-SXM4-80GB and the rule expects NVIDIA-A100-80GB, the label will not be applied. Check the node’s actual labels with kubectl get nodes --show-labels before adjusting the scheduler or autoscaler.