Create post artifact collect #78722

Open
eifrach wants to merge 1 commit into openshift:main from eifrach:collect_artifact_post

Conversation

@eifrach eifrach commented May 3, 2026

Summary by CodeRabbit

Release Notes

  • New Features

    • Added automated artifact collection from compute cluster tests, gathering diagnostic data and logs from test environments for improved post-test analysis and troubleshooting.
  • Tests

    • Integrated artifact collection into the test workflow's post-execution phase to ensure comprehensive diagnostics are captured after each test run.

Signed-off-by: Eran Ifrach <eifrach@redhat.com>
@openshift-ci openshift-ci Bot requested review from dgoodwin and stbenjam May 3, 2026 10:41
openshift-ci Bot commented May 3, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: eifrach

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 3, 2026
coderabbitai Bot commented May 3, 2026

Walkthrough

A new CI operator step reference for artifact collection is introduced into the telcov10n functional test workflow. The step copies artifacts from a bastion host using SSH after preparing an Ansible inventory and extracting SSH credentials. This step is integrated into the workflow's post phase and removes duplicate artifact-collection logic from the config script.

Changes

Artifact Collection Step

| Layer | File(s) | Summary |
|---|---|---|
| Step Reference Definition | ci-operator/step-registry/telcov10n/functional/compute-nto/collect-artifacts/telcov10n-functional-compute-nto-collect-artifacts-ref.yaml | Declares the telcov10n-functional-compute-nto-collect-artifacts step to run commands from the eco-ci-cd:eco-ci-cd image with 4h timeout and 100m CPU. |
| Implementation | ci-operator/step-registry/telcov10n/functional/compute-nto/collect-artifacts/telcov10n-functional-compute-nto-collect-artifacts-commands.sh | Bash script that enforces strict mode, skips on marker file, prepares inventory under /eco-ci-cd/inventories/ocp-deployment, extracts SSH parameters from inventory YAML, and copies artifacts from bastion via SCP. |
| Metadata & Ownership | ci-operator/step-registry/telcov10n/functional/compute-nto/collect-artifacts/telcov10n-functional-compute-nto-collect-artifacts-ref.metadata.json, ci-operator/step-registry/telcov10n/functional/compute-nto/collect-artifacts/OWNERS | Metadata file references the step YAML and lists approvers; OWNERS file delegates to parent. |
| Workflow Integration | ci-operator/step-registry/telcov10n/functional/compute-nto/ocp-setup/telcov10n-functional-compute-nto-ocp-setup-workflow.yaml | Adds the new collect-artifacts step to the post phase before html-report. |
| Cleanup | ci-operator/step-registry/telcov10n/functional/compute-nto/config/telcov10n-functional-compute-nto-config-commands.sh | Removes the bastion SSH setup and artifact copy logic that is now handled by the dedicated step; adds Ansible playbook invocation. |

Sequence Diagram

sequenceDiagram
    participant Step as Artifact Collector
    participant SharedDir as Shared Directory
    participant Inventory as Inventory System
    participant SSH as SSH/Key Extraction
    participant Bastion as Bastion Host
    participant ArtifactDir as Artifact Directory

    Step->>SharedDir: Check for skip.txt
    alt skip.txt exists
        Step->>Step: Exit early
    else
        Step->>SharedDir: Read inventory files
        Step->>Inventory: Prepare /eco-ci-cd/inventories/ocp-deployment
        Step->>Inventory: Copy group_vars and host_vars
        Step->>SharedDir: Read cluster_name
        Step->>SSH: Extract ansible_ssh_private_key
        Step->>SSH: Extract BASTION_IP and BASTION_USER from YAML
        Step->>SSH: Write key to /tmp/temp_ssh_key with restricted perms
        Step->>Bastion: SCP /tmp/artifacts/* using key
        Bastion->>ArtifactDir: Copy to ${ARTIFACT_DIR}
    end
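
The early-exit and key-handling steps in the diagram could be sketched roughly as follows. This is a minimal illustration rather than the PR's actual script: the helper names `should_skip` and `write_private_key` are hypothetical, and only the `skip.txt` marker and the idea of a key file with restricted permissions come from the walkthrough.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; helper names are illustrative and not taken from the PR.

# Succeed when the skip marker exists in the given shared directory.
should_skip() {
    local shared_dir="$1"
    [ -f "${shared_dir}/skip.txt" ]
}

# Write key material from stdin to a file that only the owner can read or write.
write_private_key() {
    local key_path="$1"
    # umask 077 covers a newly created file; chmod covers a pre-existing one.
    ( umask 077 && cat > "${key_path}" )
    chmod 600 "${key_path}"
}
```

A step script might call this as `if should_skip "${SHARED_DIR}"; then echo "skip marker found"; exit 0; fi` before doing any inventory work.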

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

🚥 Pre-merge checks | ✅ 11 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Title check | ❓ Inconclusive | The title 'Create post artifact collect' is vague and lacks clarity. It omits key details about what is being created (a new CI step reference for artifact collection in the telcov10n functional compute-nto workflow) and uses incomplete phrasing ('post artifact collect' is not standard terminology). | Revise the title to be more specific and descriptive, such as 'Add telcov10n artifact collection step to post phase' or 'Create artifact collection step for compute-nto workflow'. |
✅ Passed checks (11 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | The custom check for Stable and Deterministic Test Names is not applicable to this PR. The PR adds bash scripts, YAML CI configuration files, JSON metadata, and an OWNERS file for a CI operator step registry. There are no Ginkgo tests present in any of the changed files. |
| Test Structure And Quality | ✅ Passed | The custom check for Ginkgo test code quality is not applicable to this pull request. The PR exclusively modifies CI infrastructure configuration files including shell scripts for artifact collection and orchestration, YAML workflow definitions, JSON metadata files, and OWNERS files. No Ginkgo test files or Go test code exists in this PR. |
| Microshift Test Compatibility | ✅ Passed | This pull request does not add any new Ginkgo e2e tests. The changes consist exclusively of CI/CD infrastructure files including a Bash artifact collection script, YAML step references, and JSON metadata. The MicroShift Test Compatibility check is designed specifically to validate e2e tests written in Go that use MicroShift-incompatible APIs, and is therefore not applicable to this pull request since no Go test files are being added. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | This PR contains only CI infrastructure files and automation scripts with no Ginkgo e2e tests, making the SNO compatibility check not applicable. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | PR adds CI/CD artifact collection infrastructure without introducing Kubernetes scheduling constraints that assume HA topology. |
| Ote Binary Stdout Contract | ✅ Passed | This PR contains no OpenShift Tests Extension test binaries or Go source code, only CI/CD infrastructure files. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | No Ginkgo e2e test patterns found in CI/CD infrastructure files; custom check not applicable. |

Review rate limit: 0/1 reviews remaining, refill in 60 minutes.

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-merge-bot

[REHEARSALNOTIFIER]
@eifrach: the pj-rehearse plugin accommodates running rehearsal tests for the changes in this PR. Expand 'Interacting with pj-rehearse' for usage details. The following rehearsable tests have been affected by this change:

| Test name | Repo | Type | Reason |
|---|---|---|---|
| periodic-ci-openshift-kni-eco-ci-cd-main-compute-nto-e2e-telcov10n-nto-tests-v1-crun-bm-4-16 | N/A | periodic | Registry content changed |
| periodic-ci-openshift-kni-eco-ci-cd-main-nightly-compute-nto-e2e-telcov10n-nto-tests-v2-t0-bm-4-22-nightly | N/A | periodic | Registry content changed |
| periodic-ci-openshift-kni-eco-ci-cd-main-compute-nto-e2e-telcov10n-nto-tests-v2-t0-crun-bm-4-21 | N/A | periodic | Registry content changed |
| periodic-ci-openshift-kni-eco-ci-cd-main-compute-nto-e2e-telcov10n-nto-tests-v2-t0-crun-bm-4-19 | N/A | periodic | Registry content changed |
| periodic-ci-openshift-kni-eco-ci-cd-main-compute-nto-e2e-telcov10n-nto-tests-upgrade-4-19 | N/A | periodic | Registry content changed |
| periodic-ci-openshift-kni-eco-ci-cd-main-compute-nto-e2e-telcov10n-nto-tests-v1-runc-bm-4-14 | N/A | periodic | Registry content changed |
| periodic-ci-openshift-kni-eco-ci-cd-main-compute-nto-e2e-telcov10n-nto-tests-v1-crun-bm-4-17 | N/A | periodic | Registry content changed |
| periodic-ci-openshift-kni-eco-ci-cd-main-compute-nto-e2e-telcov10n-nto-tests-v2-t0-crun-bm-4-20 | N/A | periodic | Registry content changed |
| periodic-ci-openshift-kni-eco-ci-cd-main-compute-nto-e2e-telcov10n-nto-tests-v1-t0-crun-bm-4-18 | N/A | periodic | Registry content changed |
| periodic-ci-openshift-kni-eco-ci-cd-main-compute-nto-e2e-telcov10n-nto-tests-v2-t0-crun-bm-4-18 | N/A | periodic | Registry content changed |

Prior to this PR being merged, you will need to either run and acknowledge or opt to skip these rehearsals.

Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
ci-operator/step-registry/telcov10n/functional/compute-nto/collect-artifacts/telcov10n-functional-compute-nto-collect-artifacts-commands.sh (2)

20-26: ⚡ Quick win

Quote variables and use cut -f3- to avoid word-splitting and filename truncation.

Three related issues flagged by shellcheck (SC2231/SC2086):

  1. ${SHARED_DIR}/* in the for loop is unquoted — filenames with spaces or globs in SHARED_DIR will be split.
  2. $file, $DEST_DIR, and $DEST_FILE in lines 22–24 are unquoted — same risk.
  3. cut -d'_' -f3 extracts only the 3rd underscore-delimited field. A filename like host_vars_some_name would yield DEST_FILE=some, silently dropping _name. Use -f3- to take everything from the 3rd field onwards.
🛠 Proposed fix
-    for file in ${SHARED_DIR}/*; do 
+    for file in "${SHARED_DIR}"/*; do
         if [[ "$file" == *"group_vars_"* || "$file" == *"host_vars_"* ]]; then
-            DEST_DIR=$( basename $file | cut -d'_' -f1,2 )
-            DEST_FILE=$( basename $file | cut -d'_' -f3 )
-            cp $file ${ECO_CI_CD_INVENTORY_PATH}/$DEST_DIR/$DEST_FILE
+            DEST_DIR=$( basename "$file" | cut -d'_' -f1,2 )
+            DEST_FILE=$( basename "$file" | cut -d'_' -f3- )
+            cp "$file" "${ECO_CI_CD_INVENTORY_PATH}/$DEST_DIR/$DEST_FILE"
         fi
     done
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@ci-operator/step-registry/telcov10n/functional/compute-nto/collect-artifacts/telcov10n-functional-compute-nto-collect-artifacts-commands.sh`
around lines 20 - 26, The loop and variable usages must be quoted and DEST_FILE
should preserve trailing underscores/fields: change the for-loop to iterate over
"${SHARED_DIR}"/* and wrap all expansions in quotes (e.g., "$file", "$DEST_DIR",
"$DEST_FILE") to avoid word-splitting; compute DEST_DIR with basename "$file" |
cut -d'_' -f1,2 and compute DEST_FILE with basename "$file" | cut -d'_' -f3-
(note the -f3- to keep the rest of the name), and ensure the cp invocation uses
quoted arguments like cp "$file"
"${ECO_CI_CD_INVENTORY_PATH}/$DEST_DIR/$DEST_FILE".
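
The difference between `-f3` and `-f3-` can be checked in isolation. The sketch below reuses the `host_vars_some_name` example from this comment; the helper name `split_inventory_name` and the `/shared` path are hypothetical and exist only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split "<a>_<b>_<rest>" into "<a>_<b>/<rest>".
split_inventory_name() {
    local name dest_dir dest_file
    name="$(basename "$1")"
    dest_dir="$(printf '%s\n' "$name" | cut -d'_' -f1,2)"
    # -f3- keeps every field from the third onwards, so later underscores survive.
    dest_file="$(printf '%s\n' "$name" | cut -d'_' -f3-)"
    printf '%s/%s\n' "$dest_dir" "$dest_file"
}

split_inventory_name "/shared/host_vars_some_name"    # prints host_vars/some_name
printf '%s\n' "host_vars_some_name" | cut -d'_' -f3   # prints some (suffix dropped)
```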

54-54: ⚡ Quick win

SSH key extraction with grep -A 100 is fragile and may silently truncate large keys.

The -A 100 flag caps the captured output at 100 lines after the match. A standard RSA 4096-bit key body is ~52 lines, but an RSA 8192-bit key exceeds 100 lines — meaning the -----END ... KEY----- marker would be cut off, producing an invalid key file. Beyond truncation, every subsequent YAML field in the inventory that falls within the 100-line window is appended to the file; while PEM parsers typically stop at the END marker, relying on trailing garbage being silently ignored is brittle.

A more reliable alternative is to extract the value using a dedicated YAML-aware tool or awk:

🛠 More robust key extraction (awk)
-grep ansible_ssh_private_key -A 100 "${ECO_CI_CD_INVENTORY_PATH}/group_vars/all" | sed 's/ansible_ssh_private_key: //g' | sed "s/'//g" > "/tmp/temp_ssh_key"
+awk '/ansible_ssh_private_key:/{found=1; sub(/ansible_ssh_private_key: /, ""); gsub(/'\''/, "")} found{print; if(/END .* KEY/) exit}' \
+    "${ECO_CI_CD_INVENTORY_PATH}/group_vars/all" > "/tmp/temp_ssh_key"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@ci-operator/step-registry/telcov10n/functional/compute-nto/collect-artifacts/telcov10n-functional-compute-nto-collect-artifacts-commands.sh`
at line 54, The current extraction using "grep ansible_ssh_private_key -A 100
... > /tmp/temp_ssh_key" can truncate or pull unrelated YAML because of the
fixed "-A 100" window; replace this with a YAML-aware or line-aware extractor
(e.g., use awk to locate the "ansible_ssh_private_key:" key and stream all
following lines until the "-----END ... KEY-----" marker, stripping leading
"ansible_ssh_private_key: " and surrounding quotes, then write to
/tmp/temp_ssh_key) or use yq to read the ansible_ssh_private_key value directly
and output it to /tmp/temp_ssh_key; update the snippet that contains the grep
command in telcov10n-functional-compute-nto-collect-artifacts-commands.sh to use
the chosen robust extractor and ensure the resulting file permissions are set
appropriately (chmod 600).
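
To see how a marker-bounded extraction behaves, the following sketch runs an awk filter against a throwaway inventory file. The file layout (a single-quoted multi-line value followed by further YAML fields) is an assumption inferred from the quote-stripping in the original command, the function name `extract_private_key` is hypothetical, and the key body is placeholder text.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: the inventory layout below is assumed, not taken from the PR.

# Print the PEM block stored under ansible_ssh_private_key, stopping at the END marker.
extract_private_key() {
    awk -v q="'" '
        /ansible_ssh_private_key:/ { found = 1; sub(/.*ansible_ssh_private_key: */, "") }
        found {
            gsub(q, "")                 # drop the surrounding single quotes
            print
            if (/END .* KEY/) exit      # stop before any later YAML fields
        }
    ' "$1"
}

# Demo with a placeholder key (not real key material).
demo_inventory="$(mktemp)"
cat > "${demo_inventory}" <<'EOF'
ansible_user: example
ansible_ssh_private_key: '-----BEGIN OPENSSH PRIVATE KEY-----
PLACEHOLDERLINE1
PLACEHOLDERLINE2
-----END OPENSSH PRIVATE KEY-----'
ansible_port: 22
EOF
extract_private_key "${demo_inventory}"
```

Unlike a fixed `-A 100` window, the output length here depends only on where the END marker sits, so neither key size nor trailing fields affect the result. If the real inventory indents the key lines or uses a YAML block scalar, the filter would need adjusting.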
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: a119c790-e5f3-4c2f-babe-4296c5d97c41

📥 Commits

Reviewing files that changed from the base of the PR and between 47e52e1 and 76d38c9.

📒 Files selected for processing (6)
  • ci-operator/step-registry/telcov10n/functional/compute-nto/collect-artifacts/OWNERS
  • ci-operator/step-registry/telcov10n/functional/compute-nto/collect-artifacts/telcov10n-functional-compute-nto-collect-artifacts-commands.sh
  • ci-operator/step-registry/telcov10n/functional/compute-nto/collect-artifacts/telcov10n-functional-compute-nto-collect-artifacts-ref.metadata.json
  • ci-operator/step-registry/telcov10n/functional/compute-nto/collect-artifacts/telcov10n-functional-compute-nto-collect-artifacts-ref.yaml
  • ci-operator/step-registry/telcov10n/functional/compute-nto/config/telcov10n-functional-compute-nto-config-commands.sh
  • ci-operator/step-registry/telcov10n/functional/compute-nto/ocp-setup/telcov10n-functional-compute-nto-ocp-setup-workflow.yaml
💤 Files with no reviewable changes (1)
  • ci-operator/step-registry/telcov10n/functional/compute-nto/config/telcov10n-functional-compute-nto-config-commands.sh

Comment on lines +62 to +63
scp -r -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i /tmp/temp_ssh_key \
"${BASTION_USER}@${BASTION_IP}":/tmp/artifacts/* "${ARTIFACT_DIR}"
⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

scp will fail (and abort the post step) if /tmp/artifacts/ is absent or empty on the bastion.

The remote glob /tmp/artifacts/* is expanded by the remote shell. If the directory does not exist or is empty — which is a real scenario when the test step was skipped or failed early — scp returns a non-zero exit code and set -e terminates the entire post step. Since artifact collection is best-effort in a post step, consider a graceful fallback.

🛠 Proposed fix
 echo "Copy logs and artifacts to artifacts directory"
-scp -r -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i /tmp/temp_ssh_key \
-    "${BASTION_USER}@${BASTION_IP}":/tmp/artifacts/* "${ARTIFACT_DIR}"
+scp -r -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i /tmp/temp_ssh_key \
+    "${BASTION_USER}@${BASTION_IP}":/tmp/artifacts/* "${ARTIFACT_DIR}" || \
+    echo "WARNING: scp returned non-zero (no artifacts or bastion unreachable); continuing."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@ci-operator/step-registry/telcov10n/functional/compute-nto/collect-artifacts/telcov10n-functional-compute-nto-collect-artifacts-commands.sh`
around lines 62 - 63, The scp command will fail the post step when
/tmp/artifacts on the bastion is missing or empty; change the artifact-copy
logic to first SSH to the bastion and check that /tmp/artifacts exists and is
non-empty (e.g., using test -d and a non-empty ls check) and only then perform
the transfer, otherwise skip with success; update the block that currently runs
scp -r ... "${BASTION_USER}@${BASTION_IP}":/tmp/artifacts/* "${ARTIFACT_DIR}" to
perform the remote existence/non-empty check using ${BASTION_USER}@${BASTION_IP}
and /tmp/artifacts before invoking scp (or switch to an ssh+tar stream that is
only run when the check passes) so the post step remains best-effort and does
not abort on missing artifacts.
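
One way the check-then-copy approach described above might look is sketched here. It is an illustration under assumptions, not the PR's code: the function name `collect_artifacts` and its argument order are hypothetical, while the ssh/scp options mirror those already used in the step.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a best-effort copy; the function name and arguments are illustrative.

# Copy remote artifacts only when the remote directory exists and is non-empty.
# Always returns 0 so a post step running under `set -e` is not aborted.
collect_artifacts() {
    local remote="$1" remote_dir="$2" dest="$3" key="$4"
    # The check runs on the bastion: the directory exists and `ls -A` lists at least one entry.
    if ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i "${key}" \
        "${remote}" "test -d '${remote_dir}' && test -n \"\$(ls -A '${remote_dir}')\""; then
        scp -r -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i "${key}" \
            "${remote}:${remote_dir}/*" "${dest}" \
            || echo "WARNING: scp failed; continuing without artifacts."
    else
        echo "No artifacts found in ${remote_dir} on ${remote}; skipping copy."
    fi
    return 0
}
```

A call such as `collect_artifacts "${BASTION_USER}@${BASTION_IP}" /tmp/artifacts "${ARTIFACT_DIR}" /tmp/temp_ssh_key` would then replace the bare `scp` invocation. Note that an unreachable bastion is reported the same way as a missing directory, which may or may not be the desired behavior.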

eifrach commented May 3, 2026

/pj-rehearse periodic-ci-openshift-kni-eco-ci-cd-main-compute-nto-e2e-telcov10n-nto-tests-v1-t0-crun-bm-4-18

@openshift-merge-bot

@eifrach: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

openshift-ci Bot commented May 3, 2026

@eifrach: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| ci/rehearse/periodic-ci-openshift-kni-eco-ci-cd-main-compute-nto-e2e-telcov10n-nto-tests-v1-t0-crun-bm-4-18 | 76d38c9 | link | unknown | /pj-rehearse periodic-ci-openshift-kni-eco-ci-cd-main-compute-nto-e2e-telcov10n-nto-tests-v1-t0-crun-bm-4-18 |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
