36 Matching Annotations
  1. Last 7 days
    1. FindingRemediationV3 already uses a Claude SDK agent with Write/Edit/Bash tools in a writable workspace. Production-tested.

      most of this code should be reusable

    2. and opens a PR via the existing GitHubPRExecutor

      right now the pr is opened exclusively via the api endpoint, and only updated asynchronously after a rebase. the workflow should decouple these too, and should allow user review

    3. Partial success policy: If 3 of 5 targets succeed and 2 fail, should we open a PR with the successful upgrades or abort entirely?

      this is an entirely new agent decision making process

    4. Kagami clone may still be slow even with bundles. Should we set a size threshold for clone-based vs. a future overlayfs path

      overlayfs will always be the best call here. fsx persists between node reboots, which is why the worktree creation is actually very recoverable.

    5. Multi-plan execution: Can a user execute multiple plans for the same codebase simultaneously? (Proposal: no — one active execution per codebase to avoid branch conflicts.)

      yes 100% they can and they will. each one may have different permutations. if they change their minds they should make another one.

    6. Plan staleness: How old can a plan be before we require re-planning? Commit SHA validation catches code changes, but should we also check SBOM scan freshness?

      it should have no extraneous external file changes, and it should allow us to create a ref with the permissions at the time of execution

    7. Workflow execution history visible in Temporal Web UI with heartbeat details, activity retries, and failure reasons. First line of debugging for execution issues.

      don't ship patch contents between the activities hahaha. we do this now, I have a ticket for it
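
One way to read the comment above: activities hand each other a small reference to the patch instead of the patch body, so the contents never land in workflow history. A minimal sketch, assuming a filesystem that every worker can reach; `PatchRef`, `store_patch`, and `load_patch` are hypothetical names, not existing code.

```python
import hashlib
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class PatchRef:
    """Small, serializable pointer to a patch stored outside workflow history."""
    path: str
    sha256: str
    size_bytes: int


def store_patch(patch_text: str, root: Path) -> PatchRef:
    """Write the patch under `root` (assumed shared storage) and return a reference."""
    data = patch_text.encode("utf-8")
    digest = hashlib.sha256(data).hexdigest()
    target = root / f"{digest}.patch"
    target.write_bytes(data)
    return PatchRef(path=str(target), sha256=digest, size_bytes=len(data))


def load_patch(ref: PatchRef) -> str:
    """Read the patch back and verify it still matches the recorded digest."""
    data = Path(ref.path).read_bytes()
    if hashlib.sha256(data).hexdigest() != ref.sha256:
        raise ValueError("patch contents do not match the recorded digest")
    return data.decode("utf-8")
```

The producing activity would return the `PatchRef`, and the consuming activity would call `load_patch` on it. The digest check is there so a stale or overwritten file fails loudly rather than silently applying the wrong patch.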

    8. Check .github/PULL_REQUEST_TEMPLATE.md in clone. 2. Org template: check .github/PULL_REQUEST_TEMPLATE.md in .github repo. 3. Nebari default: built-in template with CVE summary, changes, and test guidance.

      way out of scope, let's stick with the nebari default imo and move the pr template shit to a separate task

    9. Progress Tracking

      I would again add the plan generation in here so we're getting terminal states of everything. design for the ui! encode the steps! state machine!

    10. Execution rejected if base_commit_sha doesn't match current HEAD on the target branch. Prevents applying stale plans to changed code.

      this is when we rebase, which we should be doing, otherwise nothing will actually get accepted. if we don't, we end up with a pr that has many, many unrelated changes, and clients have complained about that many times

    11. Each execution gets its own temp directory on local disk. No shared state with other executions, other codebases, or other tenants.

      there should be some error recovery, I was told the disks aren't persisted

    12. Input: remediationId (UUID), upgradeOption ("minimal" | "moderate" | "comprehensive"), optional branchName, asDraft. Starts Temporal workflow, returns workflowId. Validates status = "plan_ready" and no active execution.

      this isn't the pr open thing yet right? right now, we generate the plan, then the patch, then let user review patch, then iterate on patch, then open pr immediately. that system should be the same imo so it's consistent

    13. Clone from Kagami | git transfer progress (objects received / total) | ~30s. Run Executor Agent | Agent tool calls (each tool invocation = heartbeat) | Per tool call

      what happens if a deploy lands in the middle of this? do we reclone? that's the main error case we currently see

    14. RETRY

      the agent should make a decision and output whether or not the pr feedback is actionable. this won't work for all ci mechanisms. we should limit to just 1, like GitHub for now.

      this is a can of worms. as soon as I started bringing this up they wanted all of them. that includes Jenkins, self-hosted nonsense, bespoke stuff, etc. we should do it, but maybe one at a time. I think we have the wiring in place to accept comments.

      reusing the existing pr stuff is completely worth doing too so they all benefit from this!

    15. changes_generated

      we went from a patch lifecycle to a pr lifecycle. is this a terminal state after running the remediation, before a pr is opened? there are missing steps above this

    16. execution_status — per-target: success / failed / skipped

      feedback from a user should be included, and there's no running state, which the current ui is also missing

      started, failed, running, etc are probably good. then you can encode planning as well.

      terminal states make sense too. planner_running, plan_generated, patch_running, patch_generated, then *_failed states
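
The state names listed above could be encoded as an explicit state machine along these lines. This is only a sketch: `planner_failed` and `patch_failed` are my expansion of "*_failed", and the allowed transitions (retry after a failure, re-running the patch step after user feedback) are assumptions, not settled design.

```python
from enum import Enum


class RemediationState(str, Enum):
    PLANNER_RUNNING = "planner_running"
    PLAN_GENERATED = "plan_generated"
    PLANNER_FAILED = "planner_failed"
    PATCH_RUNNING = "patch_running"
    PATCH_GENERATED = "patch_generated"
    PATCH_FAILED = "patch_failed"


S = RemediationState

# Assumed transition table; adjust once the real lifecycle is agreed on.
ALLOWED_TRANSITIONS = {
    S.PLANNER_RUNNING: {S.PLAN_GENERATED, S.PLANNER_FAILED},
    S.PLAN_GENERATED: {S.PATCH_RUNNING},
    S.PLANNER_FAILED: {S.PLANNER_RUNNING},   # assumption: planning can be retried
    S.PATCH_RUNNING: {S.PATCH_GENERATED, S.PATCH_FAILED},
    S.PATCH_GENERATED: {S.PATCH_RUNNING},    # assumption: iterate after user feedback
    S.PATCH_FAILED: {S.PATCH_RUNNING},       # assumption: patching can be retried
}


def transition(current: RemediationState, nxt: RemediationState) -> RemediationState:
    """Return the next state, or raise if the move is not in the table."""
    if nxt not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {nxt.value}")
    return nxt
```

Because the enum values are plain strings, the same names can be stored in the database and rendered in the ui without a mapping layer.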

    17. execution_workflow_id — Temporal workflow ID

      this shouldn't be necessary, you can encode the remediation id in the workflow id and that provides the same thing
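
A minimal sketch of encoding the remediation id in the workflow id, so no separate `execution_workflow_id` column is needed. The `remediation-execution` prefix and both function names are hypothetical.

```python
import uuid

# Hypothetical prefix; any stable, unique-per-workflow-type string would do.
WORKFLOW_ID_PREFIX = "remediation-execution"


def workflow_id_for(remediation_id: uuid.UUID) -> str:
    """Derive the workflow id deterministically from the remediation id."""
    return f"{WORKFLOW_ID_PREFIX}:{remediation_id}"


def remediation_id_from(workflow_id: str) -> uuid.UUID:
    """Recover the remediation id from a workflow id built by workflow_id_for."""
    prefix, sep, raw = workflow_id.partition(":")
    if prefix != WORKFLOW_ID_PREFIX or not sep:
        raise ValueError(f"not a remediation workflow id: {workflow_id!r}")
    return uuid.UUID(raw)
```

Since the id is derived rather than stored, anything holding a remediation id can look up its workflow directly, and vice versa.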

    18. pr_url, pr_number — direct access

      we shouldn't denormalize these, you can open multiple prs or update a remediation and that will make this quickly out of sync

    19. npm, pip, poetry, cargo all have different mechanics. An agent with shell access adapts without bespoke parsers per ecosystem.

      we talked about why we can't do this and why it's not a good idea. we have 0 sandboxing functionality, and most of our clients use their own internal package hosting. all package managers have some level of arbitrary code execution. if we're not controlling the installs and version numbers ourselves, we risk exposure to completely unknown supply chain attacks when this is done fully agentically. I would stick to simple version bumps in the manifests, and fixing code that needs to be fixed.
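
As an illustration of a "simple version bump in the manifest" that never invokes a package manager, here is a sketch for a `package.json`. The function name is hypothetical, it only handles `dependencies` and `devDependencies`, and it keeps a leading `^` or `~` if one was there; it does not touch the lockfile.

```python
import json


def bump_package_json(manifest_text: str, package: str, new_version: str) -> str:
    """Rewrite the version of `package` in a package.json string.

    Pure text-in, text-out: nothing is installed or executed.
    """
    manifest = json.loads(manifest_text)
    changed = False
    for section in ("dependencies", "devDependencies"):
        deps = manifest.get(section, {})
        if package in deps:
            old = deps[package]
            # Preserve a caret/tilde range prefix if the old spec had one.
            prefix = old[0] if old and old[0] in ("^", "~") else ""
            deps[package] = prefix + new_version
            changed = True
    if not changed:
        raise KeyError(f"{package} not found in manifest")
    return json.dumps(manifest, indent=2) + "\n"
```

For example, bumping `lodash` from `^4.17.20` to `4.17.21` yields `^4.17.21` in the output. Re-serializing with `json.dumps` normalizes whitespace, so a real implementation might prefer a targeted text edit to keep the diff small.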