The Copilot setup that survives an audit
How I run GitHub Copilot Enterprise on GPT-5.3 Codex so the coding agent moves fast and still cannot do anything our auditors would flag.
I am going to be honest about my bias before we start. I do not get excited by demos where an agent writes a whole app in one prompt. I have watched too many of those become someone else's incident review. My job at a regulated fintech is less glamorous: I have to let two hundred engineers move quickly while making sure that when an examiner asks who changed what, when, and under whose approval, I can answer in under a minute. So when people ask why we standardized on GitHub Copilot Enterprise instead of the flashier tools, the answer is dull and correct. It has the governance story already built, and I do not have to bolt one on after the fact.
This is the build I run. The model underneath is GPT-5.3 Codex, which gives us completions that are well tested and, more importantly for me, predictable. Two MCP servers: github and sentry. Two agents: the Copilot coding agent that files pull requests, and a code-review agent that reads them. Then a thin layer of hooks and policy that I will walk through, because the policy is the actual product here. The model is a commodity. The guardrails are not.
Custom instructions are the policy, not the vibe
Copilot Chat answers, coding-agent changes, and code-review comments in our repos are all shaped by a single checked-in file: .github/copilot-instructions.md. Inline completions do not read it, which is one more reason the CI gates later in this post matter. I treat the file the way I treat a control document, not a README. It is reviewed, it is versioned, and changes to it go through the same PR process as code. People want to stuff it with personality and clever framing. I resist that. It states what is mandatory and what is forbidden, in plain language the model follows reliably.
# Org coding policy (enforced for all Copilot output)
## Hard rules (never violate)
- Never write secrets, tokens, or connection strings into source. Read them from the secrets manager.
- Never disable a lint rule or a type check to make code pass. Fix the cause.
- Never introduce a new third-party dependency without an entry in deps/APPROVED.md.
- Do not weaken auth, logging, or input validation to "simplify" a change.
## Required on every change
- Validate all external input at the boundary with our zod schemas.
- Log security-relevant actions through audit-logger, not console.
- New service code ships with a test in the same folder.
- PII fields must be tagged with @pii so the redaction layer can find them.
## How to behave
- Read the files you are about to change before editing them.
- Keep one logical change per pull request. Small diffs review faster and audit cleaner.
- If a request conflicts with these rules, stop and say so. Do not work around the rule.
Two things I want to call out. First, the line that tells the model to stop and refuse when a request conflicts with policy. People assume an instructions file only adds capability. It also subtracts it, and the subtraction is the part auditors care about. Second, we keep the approved-dependency list in the repo, deps/APPROVED.md, so the rule is checkable in CI rather than living in someone's head. A rule you cannot verify is a suggestion.
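To make "checkable in CI" concrete, here is a minimal sketch of what an allowlist check could look like. It is not the actual scripts/check-deps-allowlist.mjs. The format of deps/APPROVED.md is an assumption (one package name per Markdown bullet, optionally in backticks), and `parseApproved` and `findUnapproved` are hypothetical names.

```javascript
// Sketch of a dependency allowlist check.
// ASSUMPTION: APPROVED.md lists one package per bullet, e.g. "- zod" or "- `@scope/pkg`".

// Collect package names from Markdown bullet lines; headings and prose are ignored.
function parseApproved(markdownText) {
  const approved = new Set();
  for (const line of markdownText.split(/\r?\n/)) {
    const match = line.match(/^\s*[-*]\s+`?([^\s`]+)`?/);
    if (match) approved.add(match[1]);
  }
  return approved;
}

// Return every declared dependency that is missing from the allowlist, sorted by name.
function findUnapproved(packageJsonText, approvedMarkdownText) {
  const pkg = JSON.parse(packageJsonText);
  const approved = parseApproved(approvedMarkdownText);
  const declared = Object.keys({
    ...(pkg.dependencies || {}),
    ...(pkg.devDependencies || {}),
  });
  return declared.filter((name) => !approved.has(name)).sort();
}

// Inline example inputs; a real script would read both files from disk
// and exit non-zero when the returned list is not empty.
const examplePackageJson = JSON.stringify({
  dependencies: { zod: "^3.23.0", "left-pad": "^1.3.0" },
  devDependencies: { eslint: "^9.0.0" },
});
const exampleApproved = "# Approved dependencies\n- zod\n- `eslint`\n";

console.log(findUnapproved(examplePackageJson, exampleApproved)); // [ 'left-pad' ]
```

Keeping the comparison in a pure function means the same logic can run in CI and in a unit test without touching the filesystem.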
MCP, kept deliberately small
Most MCP write-ups read like a shopping list. Mine is the opposite. Every server I connect is an expansion of what the agent can reach, which means it is an expansion of my threat model and my audit surface. So I run exactly two, and I can justify both in a sentence. The github server lets the coding agent open and update pull requests through our normal review flow. The sentry server lets it read production error context when it is fixing a bug, so it is reasoning about the real failure rather than guessing.
{
  "servers": {
    "github": {
      "type": "http",
      "url": "https://api.githubcopilot.com/mcp/",
      "headers": {
        "Authorization": "Bearer ${input:github_pat}"
      }
    },
    "sentry": {
      "type": "http",
      "url": "https://mcp.sentry.dev/mcp",
      "headers": {
        "Authorization": "Bearer ${input:sentry_token}"
      }
    }
  },
  "inputs": [
    { "id": "github_pat", "type": "promptString", "password": true },
    { "id": "sentry_token", "type": "promptString", "password": true }
  ]
}
Note that credentials are prompted inputs marked password: true, never literals in the file. The file is committed; the secrets are not. The github token is scoped to read code and write pull requests, nothing else, because the agent has no business deleting branches or editing repo settings. I review the granted scopes quarterly. It is tedious. It is also the cheapest insurance I buy all year.
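The "no literals in the file" rule can itself be verified mechanically. The sketch below assumes the config shape shown above (servers, headers, inputs). `auditMcpConfig` and both regular expressions are hypothetical helpers, not part of Copilot or any MCP tooling.

```javascript
// Sketch: flag MCP server headers that carry a literal credential.
// ASSUMPTION: the config uses the "servers" / "headers" / "inputs" shape shown above.

const SENSITIVE_HEADER = /authorization|token|key|secret/i;
const INPUT_REF = /\$\{input:([A-Za-z0-9_-]+)\}/;

function auditMcpConfig(config) {
  const problems = [];
  const inputs = new Map((config.inputs || []).map((input) => [input.id, input]));

  for (const [name, server] of Object.entries(config.servers || {})) {
    for (const [header, value] of Object.entries(server.headers || {})) {
      // Only credential-like headers are checked; others may legitimately be literal.
      if (!SENSITIVE_HEADER.test(header)) continue;

      const ref = INPUT_REF.exec(String(value));
      if (!ref) {
        problems.push(`${name}: "${header}" holds a literal value`);
        continue;
      }
      const input = inputs.get(ref[1]);
      if (!input) {
        problems.push(`${name}: input "${ref[1]}" is not declared`);
      } else if (input.password !== true) {
        problems.push(`${name}: input "${ref[1]}" is not marked password: true`);
      }
    }
  }
  return problems;
}

const exampleConfig = {
  servers: {
    github: {
      type: "http",
      url: "https://api.githubcopilot.com/mcp/",
      headers: { Authorization: "Bearer ${input:github_pat}" },
    },
  },
  inputs: [{ id: "github_pat", type: "promptString", password: true }],
};

console.log(auditMcpConfig(exampleConfig)); // empty array: nothing to report
```

A check like this could run next to the secret scan, so a pasted token fails the build even if it does not match a known secret pattern.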
| Component | What I run | Why it earns its place |
|---|---|---|
| Model | GPT-5.3 Codex | Predictable, well-tested completions |
| MCP | github, sentry | PR flow plus real error context, nothing more |
| Agents | copilot-coding-agent, code-review-agent | One files PRs, one reads them before a human does |
| Hooks | pre-commit lint, required CI checks | Block at commit and again at merge |
| Governance | SSO, audit logs, human approval on agent PRs | The part that survives an examiner |
The two agents, and the wall between them
The coding agent does the work. You assign it an issue, it spins up its own environment, makes the change, and opens a pull request. The thing I insist on, and the thing I will not negotiate, is that a pull request from the agent is treated exactly like a pull request from a junior engineer who started last week. It is reviewed. It is approved by a human. It runs the full check suite. The agent does not get a special lane.
Our code-review agent runs first, before any human looks. It is not there to approve anything. It cannot approve. It posts comments and flags the obvious problems so that by the time a human reviewer arrives, the noise is already filtered and they can spend their attention on judgment calls. Machine reviews structure, human reviews intent. That division has held up well.
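The wall between the two agents reduces to one rule: an approval counts only if it comes from a human who is not the pull request's author. A minimal sketch of that rule follows. The review objects are modeled on the GitHub REST "list reviews" response (user.login, user.type, state), so verify the field names against your own data. `hasHumanApproval` and the logins are hypothetical.

```javascript
// Sketch: does a pull request carry an approval from a human reviewer?
// ASSUMPTION: reviews arrive in chronological order and look like
// { user: { login, type }, state }, with type "User" for humans and "Bot" for apps.

function hasHumanApproval(reviews, prAuthorLogin) {
  // Track each reviewer's latest decisive review.
  // A plain comment does not override an earlier approval or change request.
  const latestByReviewer = new Map();
  for (const review of reviews) {
    if (review.state === "COMMENTED") continue;
    latestByReviewer.set(review.user.login, review);
  }

  return [...latestByReviewer.values()].some(
    (review) =>
      review.state === "APPROVED" &&
      review.user.type === "User" &&
      review.user.login !== prAuthorLogin
  );
}

// Hypothetical logins, for illustration only.
const botOnly = [
  { user: { login: "review-bot[bot]", type: "Bot" }, state: "APPROVED" },
];
const withHuman = [
  { user: { login: "review-bot[bot]", type: "Bot" }, state: "COMMENTED" },
  { user: { login: "alice", type: "User" }, state: "APPROVED" },
];

console.log(hasHumanApproval(botOnly, "coding-agent[bot]"));   // false
console.log(hasHumanApproval(withHuman, "coding-agent[bot]")); // true
```

In practice branch protection enforces this. The function only spells out what "no special lane" means in code.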
Hooks: block early, block again
I run two gates. The first is a pre-commit hook that lints locally, so problems die on the developer's machine before they ever reach a branch. The second is required CI status checks at the pull request, so nothing merges unless lint, types, tests, and the dependency allowlist all pass. Belt and suspenders. The local hook is a courtesy that saves a round trip. The CI checks are the law, because the local hook can be skipped and the server check cannot.
name: required-checks
on:
  pull_request:
    branches: [main]
jobs:
  guardrails:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 20, cache: npm }
      - run: npm ci
      - name: Lint
        run: npm run lint
      - name: Types
        run: npm run typecheck
      - name: Tests
        run: npm test -- --ci
      - name: Dependency allowlist
        run: node scripts/check-deps-allowlist.mjs
      - name: Secret scan
        run: npx gitleaks detect --no-banner --redact
The guardrails job is marked required in branch protection, so a failure in any one of these steps blocks the merge. That last step, gitleaks, is non-negotiable in our setup. The instructions file tells the model not to write secrets, and the model is good about it, but trust without verification is not a control. The scan is the verification.
The governance layer nobody screenshots
This is the part that does not demo well and is the entire reason we chose Copilot Enterprise. Single sign-on is mandatory, so access maps to identities I already manage and revoke through the normal joiner-mover-leaver process. Audit logs capture who did what, which I export into our SIEM on a schedule. And the policy controls let me set what is allowed at the org level instead of hoping every team configures it correctly on their own. I have run pilots of tools where governance was a roadmap promise. I will not do that again in a regulated environment.
- SSO required, so every Copilot session ties to a managed identity I can disable in seconds.
- Audit logs exported to the SIEM, so the who-changed-what question has a real answer.
- Block on policy violations at the org level, not per repo and per developer's good intentions.
- Agent pull requests always require a human approval. No exception lane for the bot.
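Once the audit events are exported, the "who changed what, when" question becomes a filter and a sort. The sketch below shows only that idea. The event fields (`@timestamp` in epoch milliseconds, `actor`, `action`, `repo`) follow the general shape of a GitHub audit log export but are assumptions here, the sample events are invented, and `whoChangedWhat` is a hypothetical helper rather than a SIEM query.

```javascript
// Sketch: answer "who changed what, and when" for one repository and time window.
// ASSUMPTION: each event has "@timestamp" (epoch ms), "actor", "action", and "repo".

function whoChangedWhat(events, { repo, since, until }) {
  return events
    .filter((event) => event.repo === repo)
    .filter((event) => event["@timestamp"] >= since && event["@timestamp"] < until)
    .sort((a, b) => a["@timestamp"] - b["@timestamp"])
    .map((event) => ({
      when: new Date(event["@timestamp"]).toISOString(),
      actor: event.actor,
      action: event.action,
    }));
}

// Invented sample events, for illustration only.
const sampleEvents = [
  { "@timestamp": Date.UTC(2025, 0, 3), actor: "alice", action: "pull_request.merged", repo: "acme/payments" },
  { "@timestamp": Date.UTC(2025, 0, 2), actor: "bob", action: "protected_branch.update", repo: "acme/payments" },
  { "@timestamp": Date.UTC(2025, 0, 2), actor: "carol", action: "pull_request.merged", repo: "acme/ledger" },
];

console.log(
  whoChangedWhat(sampleEvents, {
    repo: "acme/payments",
    since: Date.UTC(2025, 0, 1),
    until: Date.UTC(2025, 0, 4),
  })
);
```

The output is chronological and limited to one repository, which is usually the form an examiner's question takes.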
Move fast and break things is a fine motto until the thing you break is a customer's money. We move fast and prove things instead.
If your team is new to Copilot, that GitHub walkthrough is a reasonable starting point before you layer policy on top. Watch it for the mechanics, then come back and add the controls, because the default experience is intentionally permissive and your job is to constrain it.
References I actually keep open
Two documents live in my browser bookmarks bar, not buried in a folder. The custom-instructions reference, because I revisit it whenever someone proposes a new rule and I need to confirm the precedence and file locations. And the AGENTS.md standard, because we are converging on a single instruction format across tools and I would rather follow an open standard than invent house style that the next architect has to reverse engineer.
- Adding repository custom instructions for GitHub Copilot (docs.github.com). The official reference for copilot-instructions.md, path-specific instructions, and how AGENTS.md is honored. This is the source of truth for the policy file.
- AGENTS.md (agents.md). The open AGENTS.md standard, stewarded under the Linux Foundation, for one Markdown file that guides coding agents across tools. We are standardizing on it.
That is the whole build. A predictable model, two MCP servers I can defend, two agents separated by a wall, gates at commit and at merge, and a governance layer that was there before I needed it. It is not the setup that wins a demo. It is the setup that lets two hundred engineers ship without me losing sleep, and in my line of work that is the only benchmark that pays. If you want to stand it up, start here: gh extension install github/gh-copilot, then enable SSO and required checks before you write a single line with it.