TJWaterFrontend_Refine/memery.md
jiang a1442fc062
ci: retry registry pushes
Retry Docker image pushes to the Gitea registry so transient EOF failures during blob upload do not fail the whole CD run on the first attempt.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-24 16:07:59 +08:00


CI build notes

2026-04-24

  • Observed failure while reproducing workflow checkout locally: the Checkout code step ran git remote add origin ... unconditionally. In a workspace that already had an origin remote, the job failed with "error: remote origin already exists." and exited before docker build.
  • Why this matters for act_runner: self-hosted Gitea runners can reuse working directories or start from repositories that already contain Git metadata, so checkout logic must be idempotent.
  • Applied fix: changed .gitea/workflows/package.yml to initialize Git only when needed, use git remote set-url origin ... when origin already exists, and force-clean the workspace after checking out FETCH_HEAD.
  • Safety improvement for remote validation: tags ending with -test now run the build verification path only. They skip registry login, image push, latest updates, and the deploy webhook so act_runner can be tested without deployment side effects.
  • Root cause found on the real act_runner: although the runner was registered with ubuntu:docker://gitea/runner-images:ubuntu-22.04, the workflow used runs-on: ubuntu, and the job log showed Start image=ubuntu:latest. That default image does not include the expected toolset, which explains the "git: not found" failure on the remote runner.
  • Applied fix for label selection: changed both jobs to runs-on: "ubuntu:docker://gitea/runner-images:ubuntu-22.04" so Gitea resolves the exact runner image instead of falling back to ubuntu:latest.
  • Follow-up from server validation: Gitea then reported "No matching online runner with label: ubuntu:docker://gitea/runner-images:ubuntu-22.04". The runner advertises the short label ubuntu-22.04, so the workflow was updated again to use runs-on: ubuntu-22.04, which should map to docker://gitea/runner-images:ubuntu-22.04 on the runner side.
  • Next remote failure on act_runner: Docker rejected the tag gitea.waternetwork.cn/OrgTJWater/TJWaterFrontend_Refine:v2026.04.24-test3 with "repository name must be lowercase". The workflow had normalized the registry host but not the repository path from github.repository.
  • Applied fix for image naming: lowercased REPOSITORY_PATH during image metadata normalization so image tags remain valid even when the Gitea owner or repository name contains uppercase letters.
  • Latest remote failure on act_runner: a *-test run still reached Notify Deploy Server and failed with "curl: (3) URL using bad/illegal format or missing URL". That showed the shell-level IS_TEST_TAG guard was not reliable enough for cross-step skip control on this runner.
  • Applied fix for test-tag skipping: moved registry login and deploy webhook skipping to workflow-level if: conditions based on endsWith(github.ref_name, '-test'), and made the image-push branch check the tag name directly instead of relying on IS_TEST_TAG from a previous step.
  • Follow-up from server validation: the runner still executed Notify Deploy Server for v2026.04.24-test5, so Gitea step-level if: with endsWith(...) was not sufficient in this environment.
  • Applied hardening: replaced those step-level conditions with direct shell case "${{ github.ref_name }}" in *-test) guards inside the login, push, and deploy steps. This avoids relying on Gitea expression behavior for test-tag skipping.
  • Workflow mode changed for full CD verification: per latest request, all *-test bypass logic was removed again so the workflow always runs registry login, image push, and deploy webhook. Full deployment validation now depends on using a normal v* tag and observing the real CD result instead of synthetic skip branches.
  • Next full-CD failure on act_runner: image build completed, but pushing to the Gitea registry failed on blob upload commit with "failed to do request: Put ... EOF". This is past the workflow logic stage and points to a transient or infrastructure-side registry upload failure.
  • Applied push hardening: wrapped both docker push "${IMAGE_NAME}:${IMAGE_TAG}" and docker push "${IMAGE_NAME}:latest" in a 3-attempt retry helper with a short backoff to absorb transient registry EOF failures.
  • Current local result: npm run lint, npm run test -- --runInBand, npm run build, docker build ..., and npm run build inside gitea/runner-images:ubuntu-22.04 all completed successfully after the workflow adjustment.
  • Non-blocking note: local Jest run reported a haste-map naming collision between package.json and .next/standalone/package.json; tests still passed, and this does not affect the current image-build workflow.
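
Sketch of the idempotent checkout described in the notes above. This is a minimal illustration, not the actual contents of .gitea/workflows/package.yml: the function names prepare_origin and checkout_ref, their arguments, and the exact git clean flags are assumptions.

```shell
# Hypothetical helper: make sure an "origin" remote points at repo_url,
# whether or not the workspace already contains Git metadata.
prepare_origin() {
  local repo_url="$1"

  # Initialize Git only when the workspace is not already a repository.
  if [ ! -d .git ]; then
    git init -q
  fi

  # Reuse an existing origin instead of failing with
  # "error: remote origin already exists."
  if git remote get-url origin >/dev/null 2>&1; then
    git remote set-url origin "$repo_url"
  else
    git remote add origin "$repo_url"
  fi
}

# Hypothetical helper: fetch one ref, check it out, and drop leftovers
# from a previous job. The clean flags are an assumption.
checkout_ref() {
  local ref="$1"
  git fetch origin "$ref"
  git checkout --force FETCH_HEAD
  git clean -ffdx
}
```

Calling prepare_origin twice in the same directory with different URLs leaves a single origin remote pointing at the second URL, which is the property a reused act_runner workspace needs.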
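
Sketch of the image-name normalization from the notes above. The function name normalize_image_name and its two arguments are hypothetical; the workflow itself is described only as lowercasing REPOSITORY_PATH, so this shows one possible way to do that with tr.

```shell
# Hypothetical helper: join registry host and repository path, then
# lowercase the result so Docker accepts it as a repository name.
normalize_image_name() {
  local registry_host="$1"
  local repository_path="$2"
  printf '%s/%s\n' "$registry_host" "$repository_path" | tr '[:upper:]' '[:lower:]'
}

# Example with the owner/repository quoted in the notes:
#   IMAGE_NAME="$(normalize_image_name gitea.waternetwork.cn OrgTJWater/TJWaterFrontend_Refine)"
#   docker build -t "${IMAGE_NAME}:${IMAGE_TAG}" .
```

Only the repository name needs lowercasing; the tag after the colon may keep its own casing, so it is appended after normalization.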
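
One detail about the test-tag guards in the notes above may be worth checking. A suffix test such as the shell pattern *-test (or an endsWith(..., '-test') comparison) matches only names that end in exactly -test. The tags quoted in these notes, v2026.04.24-test3 and v2026.04.24-test5, carry a numeric suffix and would not match. The sketch below contrasts the two patterns; both function names are hypothetical, and whether this was the actual cause of the observed behavior is not established by the notes.

```shell
# Hypothetical helper: strict suffix match, same shape as the
# case "..." in *-test) guard described in the notes.
is_test_tag_strict() {
  case "$1" in
    *-test) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: also accept a numeric suffix such as -test5.
is_test_tag_loose() {
  case "$1" in
    *-test|*-test[0-9]*) return 0 ;;
    *) return 1 ;;
  esac
}
```

With these definitions, is_test_tag_strict rejects v2026.04.24-test5 while is_test_tag_loose accepts it; both reject a normal tag such as v2026.04.24.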
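
Sketch of a 3-attempt retry helper like the one described for the push step above. The function name retry, the RETRY_DELAY variable, and the linear backoff are assumptions; the notes only state that both pushes were wrapped in a retry helper with a short backoff.

```shell
# Hypothetical helper: run a command up to max_attempts times.
# RETRY_DELAY (seconds, default 5) is an assumed knob; the wait grows
# linearly with the attempt number.
retry() {
  local max_attempts="$1"
  shift
  local delay="${RETRY_DELAY:-5}"
  local attempt=1

  while true; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "failed after ${attempt} attempts: $*" >&2
      return 1
    fi
    echo "attempt ${attempt}/${max_attempts} failed, retrying: $*" >&2
    sleep $((delay * attempt))
    attempt=$((attempt + 1))
  done
}

# Usage in the push step:
#   retry 3 docker push "${IMAGE_NAME}:${IMAGE_TAG}"
#   retry 3 docker push "${IMAGE_NAME}:latest"
```

A retry of this kind only absorbs transient failures. If the registry keeps returning EOF on every attempt, the step still fails, which points back at the registry or proxy configuration rather than the workflow.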