diff --git a/.claude/skills/peach-check/SKILL.md b/.claude/skills/peach-check/SKILL.md index 9a3eb99b2a7..55f51f9e30c 100644 --- a/.claude/skills/peach-check/SKILL.md +++ b/.claude/skills/peach-check/SKILL.md @@ -3,7 +3,7 @@ name: peach-check description: Use before a SonarJS release or when the nightly Peach Main Analysis workflow shows failures that need triage. Classifies each failure as a critical analyzer bug or a safe-to-ignore infrastructure problem. -allowed-tools: Bash(gh run list:*), Bash(gh api:*), Bash(gh run rerun:*), Bash(mkdir:*),Bash(sed --sandbox:*), Read, Agent +allowed-tools: Bash(gh run list:*), Bash(gh api:*), Bash(mkdir:*), Bash(jq:*), Bash(sed --sandbox:*), Read, Agent --- # Peach Main Analysis Check @@ -29,6 +29,16 @@ Before running this skill, ensure: separate Bash call. Chaining bypasses the per-tool permission prompts that allow the user to review each action individually. +A common violation is labelling output by prepending `echo "=== name ===" &&` to a command. Do +not do this. Job names belong in your prose response, not in the Bash call. Write the label as +plain text, then issue the command on its own. + +**Parallel execution is separate from chaining.** Issuing multiple independent Bash calls in +the same response message is the correct way to run jobs concurrently — it does not violate the +no-chaining rule. The no-chaining rule is about what goes *inside* a single Bash call; parallel +execution is about how many Bash calls appear in a single response. Both rules apply together: +separate calls, issued at the same time. + ## Invocation ``` @@ -58,42 +68,91 @@ gh run list \ This prints the `databaseId`, `conclusion`, and `createdAt` of the most recent completed run (meaning finished running, not necessarily passed — a completed run can have failed jobs, which is what we're looking for). Record `databaseId` as `RUN_ID`. 
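As a hedged sketch of this bookkeeping step (the JSON payload below is fabricated, and `--json databaseId,conclusion,createdAt` is an assumption about how the `gh run list` call above is invoked; gh normalizes `--json` output to camelCase), the id can be captured with `jq`:

```shell
# Fabricated stand-in for `gh run list --json databaseId,conclusion,createdAt` output.
runs='[{"databaseId":123456789,"conclusion":"failure","createdAt":"2026-04-14T02:00:00Z"}]'

# gh returns runs newest-first, so .[0] is the most recent completed run.
RUN_ID=$(printf '%s' "$runs" | jq -r '.[0].databaseId')
echo "$RUN_ID"   # prints 123456789
```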
-**Step 1b — Rerun if the run was cancelled** - -If the run `conclusion` is `"cancelled"`, the run did not finish normally — some jobs were cut short before they could produce results. Rerun the cancelled/failed jobs automatically: - -```bash -gh run rerun RUN_ID --repo SonarSource/peachee-js --failed -``` +**Step 1b — Stop if the run was cancelled** -Then print: +If the run `conclusion` is `"cancelled"`, the run did not finish normally and is not usable for +release triage. Print: ``` ⚠️ Run RUN_ID (DATE) was cancelled before completion. -Rerun triggered for all failed/cancelled jobs. Check back once the rerun completes. +Rerun recommended for all failed/cancelled jobs. Check back once the rerun completes. ``` Then stop — do not attempt to triage the incomplete results. **Step 2 — Collect all failed jobs** -The run has ~250 jobs across 3 pages. Fetch all three pages and collect jobs where -`conclusion == "failure"`: +The run has ~250 jobs across 3 pages. Fetch the run jobs with the Actions API and extract failed +jobs from the merged result. Do not use `gh run view --json jobs` for Peach Main Analysis because +the matrix is large. + +First, create the output directory so all artifacts land in a predictable location: ```bash -gh api "repos/SonarSource/peachee-js/actions/runs/RUN_ID/jobs?per_page=100&page=1" \ - --jq '[.jobs[] | select(.conclusion == "failure") | {name, id, completedAt}]' -gh api "repos/SonarSource/peachee-js/actions/runs/RUN_ID/jobs?per_page=100&page=2" \ - --jq '[.jobs[] | select(.conclusion == "failure") | {name, id, completedAt}]' -gh api "repos/SonarSource/peachee-js/actions/runs/RUN_ID/jobs?per_page=100&page=3" \ - --jq '[.jobs[] | select(.conclusion == "failure") | {name, id, completedAt}]' +mkdir -p target/peach-logs ``` -Each command outputs only the failed jobs for that page. `completedAt` may be `null` — see Step 7 for handling. 
+Then download the paginated jobs list: + +```bash +gh api "repos/SonarSource/peachee-js/actions/runs/RUN_ID/jobs?per_page=100" --paginate > target/jobs.json +``` + +Then slurp the paginated output with `jq -s` before querying it: + +```bash +jq -s ' + { + total_jobs: (map(.jobs | length) | add), + failed_jobs: (map([.jobs[] | select(.conclusion == "failure")] | length) | add), + jobs: (map(.jobs) | add) + } +' target/jobs.json +``` + +Important: `gh api --paginate` emits one JSON object per page. Always slurp with `jq -s` or merge +pages explicitly before querying `.jobs`. + +Before counting, sampling, or triaging, fetch metadata for every failed job. + +Exclude the job named `diff-validation-aggregated` from the analyzed job set immediately: + +- do not include it in analyzed failure counts +- do not include it in the mass-failure ratio +- do not classify it +- do not emit it as an ignored finding +- mention it at most once as an excluded-by-design workflow job if that context is useful + +For each remaining failed job, record: + +- job id +- job name +- completion time +- job URL +- `failing_step_name` +- `owning_phase` (`pre-scan`, `analyze`, or `post-scan`) + +If the job metadata shows multiple failed steps, use the earliest failed step that actually ran as +the phase owner. Treat later failed report/cleanup steps as downstream noise unless they are the +only failed steps. + +Important: the literal GitHub step name is not always the failure phase owner. A job can show only +`Analyze project: failure` in step metadata but still be a `post-scan` failure when the +JavaScript sensor already completed and the stack trace later shows `ReportPublisher.upload`, +`/api/ce/submit`, or another report-submission failure. + +Before deeper triage, check whether the failure belongs to Diff Val monitoring rather than the +analysis itself: + +- If the failing step name contains `Diff Val` or `diff-val`, classify the job immediately as + `IGNORE`. 
+- These jobs are monitoring / post-processing only. They are not release blockers for SonarJS. +- Per-project Diff Val failures stay in scope as `IGNORE` findings. +- The final `diff-validation-aggregated` job is out of scope and already excluded entirely. **Step 3 — Early exit if no failures** -If there are no failed jobs, print: +If there are no failed jobs left after exclusions, print: ``` ✓ All jobs passed in run RUN_ID (DATE). Safe to proceed with release. @@ -103,7 +162,7 @@ Then stop. **Step 4 — Mass failure detection** -If **≥80% of jobs failed** (e.g. 200+ out of 253), this indicates a single shared root cause. +If **≥80% of analyzed jobs failed** after exclusions, this indicates a single shared root cause. Do not triage every job individually. Instead: @@ -131,35 +190,61 @@ Instead: **Step 5 — Read the classification guide and triage all logs** -Read `docs/peach-main-analysis.md` once to load the failure categories and decision flowchart. +Read `docs/peach-main-analysis.md` (at the repository root) once to load the failure categories and decision flowchart. + +Use the metadata already collected in Step 2 to determine `failing_step_name` and `owning_phase` +before downloading logs. Only download logs for jobs that still need log-based classification. -Create the work directory where logs will be stored for inspection: +If the metadata is missing or incomplete for a job, fetch it with: ```bash -mkdir -p target/peach-logs +gh api "repos/SonarSource/peachee-js/actions/jobs/JOB_ID" ``` -Then triage each failed job using a graduated approach. Work through phases as needed — stop as +Use this to confirm whether the job failed in `Checkout project`, `Install dependencies`, +`Analyze project`, `Report analyzer version`, or another phase boundary. + +When multiple steps are marked failed: +- If `Analyze project` was skipped, classify from the earlier failed pre-scan step. 
+- If an earlier step failed and later report/post steps also failed, attribute the job to the + earliest real failure. +- Do not classify from `Report analyzer version` when the project was never analyzed. +- If `Analyze project` is the only failed GitHub step but the log shows + `JavaScript/TypeScript/CSS analysis [javascript] (done)` before a later + `ReportPublisher.upload` / `/api/ce/submit` failure, classify it as `post-scan`, not `analyze`. + +Then triage each remaining failed job using a graduated approach. Work through phases as needed — stop as soon as a job can be classified. Run all jobs in parallel within each phase. -**Phase 1 — Download log and filter for failure signals (always, all jobs in parallel)** +If the failing step is a Diff Val / diff-val monitoring step (`Setup Diff Val`, +`Diff Val Snapshot generation`, `Diff Val aggregated snapshot generation`, or similar), classify +it immediately as `IGNORE` and stop triage for that job. The final +`diff-validation-aggregated` job should not reach this step because it is already excluded. + +**Phase 1 — Download log and filter for failure signals (only for jobs not already classified from metadata)** Download the log to disk, then filter for key failure signals. Saving to disk avoids re-downloading in Phase 2 and leaves logs available for manual inspection after the run. Do NOT use `tail -40` — cleanup steps often run after the scan step fails (e.g. always-run SHA extraction), pushing the exit code out of the tail window. A multi-line `sed -n` script is more reliable and easier to maintain than one long regular expression. `--sandbox` prevents sed from executing shell commands -via the `e` command, which is a risk when processing untrusted log content: +via the `e` command, which is a risk when processing untrusted log content. 
+ +Write each job's name as plain text to identify the output, then issue each command as a +standalone Bash call with no prefix: ```bash gh api "repos/SonarSource/peachee-js/actions/jobs/JOB_ID/logs" \ > target/peach-logs/JOB_ID.log sed --sandbox -n ' +/\[36;1m/b /Process completed with exit code/p /EXECUTION FAILURE/p /OutOfMemoryError/p /502 Bad Gateway/p /503 Service Unavailable/p +/Diff Val/p +/diff-val/p /Artifact has expired/p /All 3 attempts failed/p /ERR_PNPM/p @@ -168,14 +253,39 @@ sed --sandbox -n ' /notarget/p /Invalid value of sonar/p /does not exist for/p +/SocketTimeoutException/p +/ReportPublisher\.upload/p ' target/peach-logs/JOB_ID.log ``` +Do not treat the first `Process completed with exit code ...` line in the raw log as the owning +failure by default. Nested commands can emit intermediate non-fatal exit codes that the workflow +handles and then continues past. In particular, early `Artifact has expired (HTTP 410)` lines may +appear before the real later failure. Trust the job step metadata first, then use the final +failing section of the log to determine ownership. + Use the decision flowchart and failure categories from `docs/peach-main-analysis.md` to classify the filtered output. If the filtered lines show exit code 3 (EXECUTION FAILURE from the SonarQube scanner), always continue to Phase 2 — Phase 1 does not surface Java stack traces, so the SonarJS plugin involvement cannot be ruled out from Phase 1 alone. +Many jobs can be classified immediately from Phase 1: + +- project misconfiguration +- dependency install failure +- Peach unavailable +- artifact expired +- clone/network failures +- cancelled or incomplete run evidence + +Also watch for checkout failures before analysis, for example: + +- `fatal: could not read Username for 'https://github.com'` +- repeated checkout retries followed by `All 3 attempts failed` + +These are pre-scan failures. 
If the upstream GitHub repository appears removed or inaccessible,
+call that out explicitly rather than leaving it as a generic auth failure.
+
 **Phase 2 — Sensor and stack trace filter (for exit code 3 failures)**

 When Phase 1 shows exit code 3, run this to find the last sensor that ran and surface any
@@ -183,17 +293,25 @@ SonarJS plugin stack trace. The log is already on disk from Phase 1 — no re-do

 ```bash
 sed --sandbox -n '
+/\[36;1m/b
 /Sensor /p
 /EXECUTION FAILURE/p
+/Node\.js process running out of memory/p
+/sonar\.javascript\.node\.maxspace/p
+/sonar\.javascript\.node\.debugMemory/p
 /OutOfMemoryError/p
+/ReportPublisher\.upload/p
+/api\/ce\/submit/p
+/SocketTimeoutException/p
 /Process completed with exit code/p
 /org\.sonar\.plugins\.javascript/p
 ' target/peach-logs/JOB_ID.log
 ```

 This surfaces both the last sensor that ran and any `org.sonar.plugins.javascript` frames in the
-stack trace. Apply the classification rules in `docs/peach-main-analysis.md` and run this only
-for jobs that need it, all concurrently.
+stack trace, plus Node-heap exhaustion hints and the post-scan report-upload timeout pattern.
+Apply the classification rules in `docs/peach-main-analysis.md` and run this only for jobs that
+need it, all concurrently.

 **Phase 3 — Full log (only when Phase 2 is still ambiguous)**
@@ -231,26 +349,35 @@ evidence `Agent returned no output`.

 **Step 7 — Check for clustered failures**

 If 2 or more jobs share the same category, check whether they failed within a
-5-minute window. Use `completedAt` timestamps if available; otherwise extract the timestamp prefix
-from log lines (format: `2026-MM-DDTHH:MM:SS.`). If clustered, record a general note for the
-summary, for example:
+5-minute window. Note: the raw jobs API payload uses snake_case (`completed_at`), so a jq query
+for the camelCase `completedAt` always returns `null`; extract timestamps from log lines instead (format: `2026-MM-DDTHH:MM:SS.`).
If clustered, +record a general note for the summary, for example: > ⚠️ N jobs failed with the same pattern within a 5-minute window — likely caused by a single infrastructure event. **Step 8 — Print summary** -Sort rows by verdict: CRITICAL first, then NEEDS-MANUAL-REVIEW, then IGNORE. -Place the Category column first. After the verdict counts and release recommendation, list any +Findings should be grouped by shared cause, not emitted as a flat one-row-per-job list. +Within each cause group, list the affected jobs and short evidence. + +Do not emit `diff-validation-aggregated` as a finding. At most, add a short note such as +`Excluded by design: diff-validation-aggregated`. + +After the grouped findings, print verdict counts and the release recommendation. Then list any general notes collected during log analysis (for example clustered failures or mass-failure -observations): +observations). ``` ## Peach Main Analysis — Run RUN_ID (DATE) -| Category | Job | Verdict | Evidence | -|---------------------------|-------------|-----------------------|----------------------------------------------| -| Analyzer crash | gutenberg | 🔴 CRITICAL | IllegalArgumentException: invalid line offset | -| Dep install failure | builderbot | ✅ IGNORE | ERR_PNPM_OUTDATED_LOCKFILE | -| Dep install failure | hono | ✅ IGNORE | ETARGET: No matching version for @hono/... 
| +Excluded by design: diff-validation-aggregated + +### IGNORE — Peach report upload timeout +- closure-library — `ReportPublisher.upload` to `/api/ce/submit` timed out after JS analysis completed +- nx — `ReportPublisher.upload` to `/api/ce/submit` timed out after JS analysis completed + +### IGNORE — Diff Val monitoring failure +- go-view — `Diff Val Snapshot generation` +- ioredis — `Diff Val Snapshot generation` ### Summary - 🔴 CRITICAL: N jobs — investigate before release @@ -268,8 +395,11 @@ The release recommendation is: - **NOT SAFE** — one or more CRITICAL jobs - **REVIEW NEEDED** — zero CRITICAL but one or more NEEDS-MANUAL-REVIEW jobs +If every failed job is either a Diff Val monitoring failure or another `IGNORE` category, the +release recommendation is still **SAFE**. + **Step 9 — Update docs if a new failure pattern was found** If any job was classified as NEEDS-MANUAL-REVIEW and you identified its root cause during this -session, update `docs/peach-main-analysis.md` with a new category entry. This keeps the +session, update `docs/peach-main-analysis.md` (at the repository root) with a new category entry. This keeps the classification guide current for future runs. diff --git a/.gitignore b/.gitignore index 9b8dfee4dea..0f7fe47e123 100644 --- a/.gitignore +++ b/.gitignore @@ -78,3 +78,6 @@ lcov.info .claude/* !.claude/*.md !.claude/skills/ + +.codex/ +.mcp.json diff --git a/docs/peach-main-analysis.md b/docs/peach-main-analysis.md index eb198d538f8..10ac921eaa4 100644 --- a/docs/peach-main-analysis.md +++ b/docs/peach-main-analysis.md @@ -28,30 +28,114 @@ log shows one of these names in the failing sensor context: Any other sensor name (e.g. `Sensor Declarative Rule Engine for Shell`, `Java sensor`, `Security SonarQube`) belongs to a **different plugin** and is not a SonarJS issue. +### Last Sensor Wins + +When classifying a scanner failure, use the **last sensor that started before the stack trace** +as the owner of the crash. 
+ +- If `Sensor JavaScript/TypeScript/CSS analysis [javascript]` is the last active sensor and the + stack trace contains `org.sonar.plugins.javascript`, treat it as a SonarJS failure. +- If the JavaScript sensor finished successfully and a later non-SonarJS sensor started, the + later sensor owns the failure even if JavaScript analysis ran earlier in the job. +- Do not blame SonarJS just because the log contains earlier JavaScript sensor lines. The + failing sensor context matters, not the first sensor that appeared in the job. + +Example: + +```text +Sensor JavaScript/TypeScript/CSS analysis [javascript] (done) +Sensor JsSecuritySensorV2 [jasmin] +... +java.lang.OutOfMemoryError: Java heap space +``` + +This is **not** a SonarJS analyzer failure. The last active sensor is `JsSecuritySensorV2 +[jasmin]`, so the failure belongs to Jasmin/security. + ### Decision Flowchart ``` 1. At which step did the job fail? - ├─ Pre-scan step (checkout, vault secrets, dependency install) → IGNORE - ├─ During sonar-scanner execution → go to step 2 + ├─ Pre-scan step (checkout, vault secrets, clone, cache, dependency install) → IGNORE + ├─ During `Analyze project` / sonar-scanner execution → go to step 2 + ├─ Job is `diff-validation-aggregated` → EXCLUDE from analyzed-job counts and findings + ├─ `Diff Val` / `diff-val` monitoring step → IGNORE + ├─ After analysis completed (report upload / post-scan step) → usually IGNORE └─ Unclear / no recognizable step → NEEDS-MANUAL-REVIEW -2. What is the scanner exit code? +Before classifying from the step name, check whether multiple steps are marked failed. +Use the earliest failed step that actually ran as the phase owner. +If `Analyze project` was skipped, do not classify from later `Report analyzer version` noise. 
+Also, do not assume that the GitHub Actions step name `Analyze project` means the owning phase is +`analyze`: if the JavaScript sensor already finished and the stack trace later shows +`ReportPublisher.upload` or `/api/ce/submit`, the real owner is `post-scan`. + +2. Only now inspect the scanner exit code. ├─ Exit code 3 → read the error message: │ ├─ "The folder X does not exist" or "Invalid value of sonar.X" → IGNORE (project misconfiguration) │ └─ Java stack trace present → go to step 3 ├─ Exit code 137 → CRITICAL (out-of-memory, escalate) └─ Other exit code → NEEDS-MANUAL-REVIEW -3. Which component crashed? +3. Which component crashed? Use the last sensor that started before the error. ├─ Stack trace is in `ReportPublisher.upload` → IGNORE (Peach server / report upload timeout) + ├─ Log says `Node.js process running out of memory` or suggests `sonar.javascript.node.maxspace` → CRITICAL (SonarJS bridge Node heap exhausted) ├─ Stack trace contains `org.sonar.plugins.javascript` frames but no sensor name → CRITICAL (SonarJS plugin initialization failure) - ├─ Sensor name is one of the SonarJS sensors listed above → CRITICAL (SonarJS analyzer crash) - └─ Sensor name is something else (DRE Shell, Java, Security...) → IGNORE (different plugin, not our problem) + ├─ Last active sensor is one of the SonarJS sensors listed above → CRITICAL (SonarJS analyzer crash) + └─ Last active sensor is something else (DRE Shell, Java, Security...) → IGNORE (different plugin, not our problem) ``` +Do not classify by exit code before you know the failing phase. For example, exit code `1` +during `npm install` is a dependency/setup problem, not a scanner problem. +Likewise, do not classify from the first matching exit-code line in the log alone: nested commands +can emit intermediate non-fatal `Process completed with exit code 1` lines that the job then +recovers from before the real terminal failure. 
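The earliest-failed-step rule above can be sketched with `jq` on job metadata (the payload is fabricated; the `steps[].name` / `status` / `conclusion` shape follows the GitHub Actions jobs API):

```shell
# Fabricated job metadata: checkout failed, analysis was skipped, a later report step also "failed".
job='{"steps":[
  {"name":"Checkout project","status":"completed","conclusion":"failure"},
  {"name":"Analyze project","status":"completed","conclusion":"skipped"},
  {"name":"Report analyzer version","status":"completed","conclusion":"failure"}
]}'

# Skipped steps never own the phase; the earliest step with conclusion "failure" does.
printf '%s' "$job" | jq -r '[.steps[] | select(.conclusion == "failure")][0].name'
# prints: Checkout project
```

Classifying from `Checkout project` yields pre-scan → IGNORE, matching step 1 of the flowchart.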
+ ## Failure Categories +### IGNORE: Diff Val Monitoring Failure + +**Verdict:** IGNORE — the failure happened in differential-validation monitoring, not in project +analysis. + +**How to identify:** +- The failing step name contains `Diff Val` or `diff-val` +- Examples: + - `Setup Diff Val` + - `Diff Val Snapshot generation` + - `Diff Val aggregated snapshot generation` + - `Upload diff-val artifacts` +- These steps run after analysis or as workflow-level post-processing to compare daily snapshots + for monitoring purposes +- Failures here often come from the same Peach API flakiness seen elsewhere, but they do not say + anything about SonarJS analyzer correctness +- This category applies to per-project jobs only. The standalone workflow job + `diff-validation-aggregated` is excluded entirely from analyzed-job counts and detailed findings. + +**Detection patterns:** +- Classify from step metadata first; do not require log inspection +- Phase 1 log hints may include `/502 Bad Gateway/`, `/503 Service Unavailable/`, + `/timeout/`, `/difference is found/`, or non-zero exit codes from diff-val tooling + +**Example outcomes:** +``` +Diff Val Snapshot generation +Process failed ... SnapshotHttpException: HTTP request failed with error code '502' ... +##[error]Process completed with exit code 1. +``` + +``` +Diff Val aggregated snapshot generation +Application run failed ... ExitDiffAppException: The difference is found for projects: ... +##[error]Process completed with exit code 2. +``` + +**Action:** None for SonarJS release triage. Ignore and silence these failures in the detailed +review output. If needed, track them separately as Peach monitoring noise. Do not use this +category for `diff-validation-aggregated`; exclude that workflow job entirely instead. + +--- + ### CRITICAL: SonarJS Plugin Failure **Verdict:** CRITICAL — must be investigated before any release. 
@@ -94,6 +178,43 @@ EXECUTION FAILURE --- +### CRITICAL: SonarJS Node Heap Exhaustion + +**Verdict:** CRITICAL — investigate analyzer memory use before release. + +**How to identify:** +- Failure occurs during the SonarScanner execution step +- The last active sensor is `JavaScript/TypeScript/CSS analysis [javascript]` +- The log contains one or more of: + - `The analysis will stop due to the Node.js process running out of memory` + - `sonar.javascript.node.maxspace` + - `sonar.javascript.node.debugMemory` +- The Java stack trace often ends with `WebSocket connection closed abnormally` and + `Analysis of JS/TS files failed` + +**Detection patterns:** +- Phase 2: `/Node\.js process running out of memory/`, + `/sonar\.javascript\.node\.(maxspace|debugMemory)/`, + `/org\.sonar\.plugins\.javascript/` + +**Example log excerpt (`tape`, 2026-04-07):** +``` +Sensor JavaScript/TypeScript/CSS analysis [javascript] +ERROR The analysis will stop due to the Node.js process running out of memory (heap size limit 4288 MB) +ERROR You can see how Node.js heap usage evolves during analysis with "sonar.javascript.node.debugMemory=true" +ERROR Try setting "sonar.javascript.node.maxspace" to a higher value to increase Node.js heap size limit +... +java.lang.IllegalStateException: Analysis of JS/TS files failed + at org.sonar.plugins.javascript.analysis.WebSensor.execute(WebSensor.java:175) +EXECUTION FAILURE +##[error]Process completed with exit code 3. +``` + +**Action:** Raise the heap size for the affected project and investigate whether the analyzer +has a memory regression or pathological input pattern. + +--- + ### CRITICAL: Out-of-Memory / Runner Killed **Verdict:** CRITICAL — escalate for investigation. 
@@ -122,7 +243,7 @@ EXECUTION FAILURE **How to identify:** - Failure occurs during the SonarScanner execution step - Scanner exits with code 3 and a Java stack trace is present -- The failing sensor name is **not** one of the SonarJS sensors listed above +- The **last active** sensor name is **not** one of the SonarJS sensors listed above - Common non-SonarJS sensor names seen on Peach: - `Sensor Declarative Rule Engine for Shell` — belongs to **sonar-iac** - `Sensor Declarative Rule Engine for Terraform` — belongs to **sonar-iac** @@ -132,7 +253,7 @@ EXECUTION FAILURE **Detection patterns:** - Phase 1: `/EXECUTION FAILURE/`, `/Process completed with exit code/` — exit code 3 with no misconfiguration signal escalates to Phase 2 -- Phase 2: `/Sensor /` — last sensor name is not a SonarJS sensor; `/org\.sonar\.plugins\.javascript/` absent → IGNORE +- Phase 2: `/Sensor /` — last sensor name is not a SonarJS sensor and no later SonarJS sensor appears after it; `/org\.sonar\.plugins\.javascript/` absent → IGNORE **Example log excerpt (gutenberg, 2026-03-11):** ``` @@ -147,7 +268,7 @@ EXECUTION FAILURE The class `com.A.A.D.H` is from the sonar-iac plugin (obfuscated). No `org.sonar.plugins.javascript` frame is present — this is not a SonarJS crash. -**Action:** None for the SonarJS team. Optionally notify the team responsible for the failing sensor (e.g. sonar-iac team for DRE Shell failures). +**Action:** None for the SonarJS team. Optionally notify the team responsible for the failing sensor (e.g. sonar-iac team for DRE Shell failures). If the project should stay green on Peach, create or update a Peach tracking task. --- @@ -174,7 +295,41 @@ The class `com.A.A.D.H` is from the sonar-iac plugin (obfuscated). No `org.sonar ##[error]Process completed with exit code 3. ``` -**Action:** None. The analyzed project's sonar-project.properties references a path that no longer exists. Not a SonarJS issue. +**Action:** No SonarJS release blocker. 
The analyzed project's sonar-project.properties references a path that no longer exists. Create or update a Peach tracking task if the project should stay green. + +--- + +### IGNORE: Upstream Repository Removed / Inaccessible + +**Verdict:** IGNORE — the target project can no longer be fetched from GitHub, so analysis never +started and this is not a SonarJS analyzer issue. + +**How to identify:** +- Failure occurs during `Checkout project` +- `Analyze project` is skipped +- The checkout URL points to a repository or owner that is no longer accessible +- The log shows repeated clone retries ending with messages such as: + - `fatal: could not read Username for 'https://github.com': No such device or address` + - `All 3 attempts failed` + +This can happen when the upstream owner or repository has been removed from GitHub, for example +after abuse or malware takedowns. + +**Detection patterns:** +- Phase 1: `/fatal: could not read Username for 'https:\/\/github\.com'/` +- Phase 1: `/All 3 attempts failed/` + +**Example log excerpt (`vote-coin-demo`, 2026-04-14):** +``` +Checking out vote-coin-demo at ed7b9fd5d9504311598bd05d622fc4244c78848f from https://github.com/scholtz/vote-coin-demo +fatal: could not read Username for 'https://github.com': No such device or address +Attempt 3 failed with exit code 128 +All 3 attempts failed +##[error]Process completed with exit code 1. +``` + +**Action:** No SonarJS release blocker. Remove or replace the Peach project entry, or track it as a +Peach maintenance issue if the matrix should remain green. --- @@ -190,7 +345,8 @@ The class `com.A.A.D.H` is from the sonar-iac plugin (obfuscated). 
No `org.sonar **Detection patterns:** - Phase 1: `/EXECUTION FAILURE/`, `/Process completed with exit code/` — exit code 3 with no misconfiguration signal escalates to Phase 2 -- Phase 2: `/org\.sonar\.plugins\.javascript/` absent; `/Sensor /` present (analysis ran) but no SonarJS frame → escalate to Phase 3 to confirm `ReportPublisher.upload` in stack trace +- Phase 2: `/ReportPublisher\.upload/`, `/api\/ce\/submit/`, or `/SocketTimeoutException/` +- Phase 2: `/org\.sonar\.plugins\.javascript/` absent; `/Sensor /` present and the SonarJS sensor already finished → IGNORE **Example log excerpt (fossflow, 2026-03-13):** ``` @@ -204,6 +360,10 @@ Caused by: java.net.SocketTimeoutException: timeout Multiple jobs failing this way within the same ~2-minute window is a strong indicator that the Peach server was temporarily unavailable or overloaded. +This can still appear under the GitHub step name `Analyze project`. If +`Sensor JavaScript/TypeScript/CSS analysis [javascript] (done)` appears before the stack trace, +the analysis completed and the job should be classified as `post-scan`, not as an analyzer crash. + **Action:** None for the SonarJS team. Re-run the workflow if needed; the failures are unrelated to the analyzer. 
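A minimal sketch of that check on a fabricated log excerpt (GNU sed is assumed for `--sandbox`; the stack-trace line is illustrative, not a verbatim scanner frame):

```shell
# Fabricated excerpt: the JS sensor completed, then the report upload timed out.
printf '%s\n' \
  'Sensor JavaScript/TypeScript/CSS analysis [javascript] (done)' \
  '    at org.sonar.scanner.report.ReportPublisher.upload(ReportPublisher.java)' \
  'Caused by: java.net.SocketTimeoutException: timeout' \
  > /tmp/demo-postscan.log

# Both signals together mean analysis finished; the failure is post-scan report upload.
sed --sandbox -n '/(done)/p; /ReportPublisher\.upload/p' /tmp/demo-postscan.log
```

Seeing the `(done)` line above the `ReportPublisher.upload` frame is what justifies `post-scan` over `analyze`.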
--- @@ -214,6 +374,7 @@ Multiple jobs failing this way within the same ~2-minute window is a strong indi **How to identify:** - Failure occurs during the dependency install step (npm/pnpm/yarn install) +- The `Analyze project` step never starts - Error messages such as: - `ERR_PNPM_OUTDATED_LOCKFILE` — pnpm lockfile out of sync with package.json - `npm error notarget No matching version found for ` — package version doesn't exist @@ -221,7 +382,8 @@ Multiple jobs failing this way within the same ~2-minute window is a strong indi - `Cannot find module` — missing dependency **Detection patterns:** -- Phase 1: `/ERR_PNPM/`, `/ERESOLVE/`, `/ETARGET/`, `/notarget/`, `/Process completed with exit code/` — exit code 1 +- Phase 1: `/ERR_PNPM/`, `/ERESOLVE/`, `/ETARGET/`, `/notarget/`, `/Process completed with exit code/` +- Confirm from surrounding log lines that the failure is in the install step and not in `Analyze project` **Example log excerpt (pnpm lockfile mismatch):** ``` @@ -239,7 +401,7 @@ npm error notarget a package version that doesn't exist. ##[error]Process completed with exit code 1. ``` -**Action:** None. The analyzed project needs to update its dependencies. Not a SonarJS issue. +**Action:** No SonarJS release blocker. The analyzed project needs to update its dependencies. Create or update a Peach tracking task if the project should stay green. --- @@ -315,6 +477,52 @@ INFO EXECUTION FAILURE --- +### IGNORE: Scanner Bootstrap / Plugin Download Timeout + +**Verdict:** IGNORE — the scanner timed out while provisioning its engine or downloading plugins +from Peach; the analyzer never started. 
+ +**How to identify:** +- Failure occurs during the `Analyze project` step +- No `Sensor ...` line appears before the stack trace, so no analyzer sensor owned the failure +- Error message contains one of: + - `Fail to download plugin [javascript]` + - `Call to URL [https://peach.sonarsource.com/api/v2/analysis/engine] failed` + - `ScannerEngineLauncherFactory` or `PluginFiles.downloadBinaryTo` + - `SocketTimeoutException: timeout`, `Connection timed out`, or `failed: closed` +- Scanner exits with code 1 or 3 +- No `org.sonar.plugins.javascript` frame is present from the plugin itself; the failure is in + scanner bootstrap / download code, not a SonarJS sensor + +**Detection patterns:** +- Phase 1: `/Process completed with exit code/` — exit code 1 or 3 inside `Analyze project` +- Phase 2/3: no `/Sensor /` match before the stack trace, plus bootstrap markers such as + `/Fail to download plugin \[javascript\]/`, `/api\/v2\/analysis\/engine/`, + `/ScannerEngineLauncherFactory/`, or `/PluginFiles\.downloadBinaryTo/` + +**Example log excerpt (`strapi`, 2026-04-13):** +``` +ERROR Error during SonarScanner Engine execution +java.lang.IllegalStateException: Fail to download plugin [javascript] into ... +Caused by: java.net.SocketTimeoutException: timeout +INFO EXECUTION FAILURE +##[error]Process completed with exit code 3. +``` + +**Example log excerpt (`open-lovable`, 2026-04-13):** +``` +ERROR Error during SonarScanner CLI execution +java.lang.IllegalStateException: Call to URL [https://peach.sonarsource.com/api/v2/analysis/engine] failed: closed +Caused by: java.io.IOException: Connection timed out +##[error]Process completed with exit code 1. +``` + +**Action:** None for the SonarJS team. Re-run the workflow if needed; if this pattern becomes +frequent, treat it as Peach / scanner bootstrap infrastructure noise rather than an analyzer +regression. 
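As a sketch (fabricated excerpt; GNU sed assumed for `--sandbox`), the distinguishing signal is a bootstrap marker with no `Sensor ...` line before it:

```shell
# Fabricated excerpt: the scanner never reached any sensor.
printf '%s\n' \
  'ERROR Error during SonarScanner Engine execution' \
  'java.lang.IllegalStateException: Fail to download plugin [javascript] into /tmp/cache' \
  'Caused by: java.net.SocketTimeoutException: timeout' \
  > /tmp/demo-bootstrap.log

# Only the bootstrap marker prints; no sensor line means no analyzer owned the failure.
sed --sandbox -n '/Sensor /p; /Fail to download plugin \[javascript\]/p' /tmp/demo-bootstrap.log
```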
+
+---
+
 ### IGNORE: Artifact Expired
 
 **Verdict:** IGNORE — the SonarJS JAR artifact used by the workflow has expired; the analyzer never ran.
 
@@ -364,11 +572,16 @@ Identifies which step failed and what exit code was produced. Run on every faile
 
 ```bash
 sed --sandbox -n '
+/\[36;1m/b                                  # skip GitHub Actions script-preview lines (ANSI-colored)
 /Process completed with exit code/p         # universal — exit code value drives the flowchart
 /EXECUTION FAILURE/p                        # scanner ran and failed (exit code 3)
 /OutOfMemoryError/p                         # OOM / Runner Killed
 /502 Bad Gateway/p                          # Peach Server Unreachable (502)
 /503 Service Unavailable/p                  # Peach Server Unreachable (503)
+/Diff Val/p                                 # Diff Val monitoring failure
+/diff-val/p                                 # Diff Val monitoring failure
+/Fail to download plugin \[javascript\]/p   # Scanner Bootstrap / Plugin Download Timeout
+/api\/v2\/analysis\/engine/p                # Scanner Bootstrap / Plugin Download Timeout
 /Artifact has expired/p                     # Artifact Expired
 /All 3 attempts failed/p                    # Git Clone / Network Timeout
 /ERR_PNPM/p                                 # Dependency Install Failure (pnpm)
@@ -377,27 +590,58 @@ sed --sandbox -n '
 /notarget/p                                 # Dependency Install Failure (npm version not found)
 /Invalid value of sonar/p                   # Project Misconfiguration
 /does not exist for/p                       # Project Misconfiguration
+/SocketTimeoutException/p                   # Peach report upload timeout (post-scan)
+/ReportPublisher\.upload/p                  # Peach report upload timeout (post-scan)
 ' target/peach-logs/JOB_ID.log
 ```
 
 If Phase 1 shows exit code 3 with `EXECUTION FAILURE` but none of the misconfiguration patterns,
 escalate to Phase 2 — a Java stack trace may be present that Phase 1 does not surface.
 
+Do not assume the first printed exit code is the terminal failure: a log can contain an early,
+non-fatal subcommand failure such as `gh: Artifact has expired (HTTP 410)` with its own
+`Process completed with exit code 1` line, while the job continues and later hits the real
+failure. Classify from the last exit-code line, not the first.
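Because a job log can contain more than one `Process completed with exit code` line, the flowchart branch should be decided by the last one. A minimal sketch of that selection, assuming fabricated log lines and using plain `sed -n` for portability instead of the skill's `sed --sandbox`:

```shell
# Fabricated sample: an early non-fatal artifact-download failure prints an
# exit-code line, then the job continues and fails for a different reason.
cat > /tmp/exitcode-sample.log <<'EOF'
gh: Artifact has expired (HTTP 410)
##[error]Process completed with exit code 1.
INFO EXECUTION FAILURE
##[error]Process completed with exit code 3.
EOF

# Keep only the final exit-code line; it decides the flowchart branch.
sed -n '/Process completed with exit code/p' /tmp/exitcode-sample.log | tail -n 1
```

Here the survivor is the exit-code-3 line, so the job routes to the `EXECUTION FAILURE` branch rather than the artifact-expired one.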
 ### Phase 2 — Sensor and stack trace detection
 
 Identifies which sensor was running and whether the SonarJS plugin is involved. Run only for
-jobs where Phase 1 showed exit code 3 without a clear misconfiguration signal.
+jobs where Phase 1 showed exit code 3 without a clear misconfiguration signal. Apply the
+**last sensor wins** rule: the last sensor that started before the error owns the failure, unless
+the stack trace shows the failure occurred in a later phase such as `ReportPublisher.upload`.
 
 ```bash
 sed --sandbox -n '
+/\[36;1m/b                                  # skip GitHub Actions script-preview lines (ANSI-colored)
 /Sensor /p                                  # last sensor name — is it a SonarJS sensor?
 /EXECUTION FAILURE/p                        # scanner failure marker
 /OutOfMemoryError/p                         # OOM inside scanner
+/Node\.js process running out of memory/p   # SonarJS bridge Node heap exhaustion
+/sonar\.javascript\.node\.maxspace/p        # SonarJS heap tuning hint
+/sonar\.javascript\.node\.debugMemory/p     # SonarJS memory debugging hint
+/ReportPublisher\.upload/p                  # Peach report-upload timeout / post-scan failure
+/api\/ce\/submit/p                          # Peach report submission endpoint
+/SocketTimeoutException/p                   # timeout during report upload
 /Process completed with exit code/p         # exit code confirmation
 /org\.sonar\.plugins\.javascript/p          # SonarJS plugin frame in stack trace → CRITICAL
 ' target/peach-logs/JOB_ID.log
 ```
 
+This surfaces both the last sensor that ran and any `org.sonar.plugins.javascript` frames in the
+stack trace, plus the explicit Node-heap hints used by the `tape` failure pattern and the
+post-scan report-upload timeout pattern.
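The last-sensor-wins rule can be sketched on a fabricated log. The sensor names, class name, and stack frame below are illustrative only, and plain `sed -n` stands in for the skill's `sed --sandbox`:

```shell
# Fabricated sample: two sensors start; the second is a JavaScript sensor and
# a SonarJS plugin frame appears in the stack trace.
cat > /tmp/sensor-sample.log <<'EOF'
INFO: Sensor IaC CloudFormation Sensor [iac]
INFO: Sensor JavaScript analysis [javascript]
java.lang.IllegalStateException: Analysis failed
	at org.sonar.plugins.javascript.bridge.BridgeServer.start(BridgeServer.java:100)
INFO EXECUTION FAILURE
EOF

# The last sensor that started before the error owns the failure.
sed -n '/Sensor /p' /tmp/sensor-sample.log | tail -n 1

# Any org.sonar.plugins.javascript frame escalates the verdict to CRITICAL.
sed -n '/org\.sonar\.plugins\.javascript/p' /tmp/sensor-sample.log
```

The first command returns the JavaScript sensor line and the second returns a plugin frame, so this fabricated failure would classify as CRITICAL rather than IGNORE.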
+ +--- + +## Output Guidance + +When summarizing a run for SonarJS release triage: + +- Treat Diff Val / `diff-validation-aggregated` failures as silenced `IGNORE` items +- Do not emit one detailed line per ignored Diff Val failure unless they are the only failures +- Prefer a short roll-up note such as `Ignored 4 Diff Val monitoring failures` +- If every failed job is either a Diff Val monitoring failure or another `IGNORE` category, the + release verdict is `SAFE` + --- ## How to Run the Check
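When running the check end to end, the Diff Val roll-up line recommended under Output Guidance can be computed mechanically. A sketch with a fabricated failed-jobs file — the file path, job names, and ids are invented; a real run would feed the Step 2 `gh api` output through the same `jq` filter:

```shell
# Fabricated failed-jobs listing, shaped like the Step 2 output.
cat > /tmp/failed-jobs.json <<'EOF'
[
  {"name": "diff-validation-aggregated (1)", "id": 101},
  {"name": "analyze (strapi)", "id": 102},
  {"name": "diff-validation-aggregated (2)", "id": 103}
]
EOF

# Count Diff Val failures and emit one roll-up line instead of one detailed
# line per ignored job.
jq -r '[.[] | select(.name | test("diff-val"; "i"))] | length
       | "Ignored \(.) Diff Val monitoring failures"' /tmp/failed-jobs.json
```

For this sample the filter prints `Ignored 2 Diff Val monitoring failures`, leaving only the `analyze (strapi)` job for detailed triage.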