AI-Powered SQL Migration Script Generator
- Mar 6
- 21 min read
Updated: May 10
[UPDATED]
There's a principle that stuck with me recently. The core idea was simple: stop describing what you want to build — start building it. If you can explain what you want clearly enough, AI can build it. But the catch is that "clearly enough" is doing a lot of heavy lifting in that sentence.
Without structure, AI coding tools produce confident-looking output that quietly falls apart. They ignore your conventions, invent patterns you didn't ask for, and generate code that contradicts what's already in your project. You spend more time cleaning up after the AI than it would have taken to write the code yourself.
I experienced this firsthand. And then I figured out how to fix it — not with better prompts, but with better structure. The result was a fully automated GitHub Actions workflow that uses Claude's API to generate production-ready SQL Server migration scripts every time a pull request is opened. No manual script writing. No inconsistent formatting. No missed idempotency checks.
This is the story of how I built it, what I learned about working with AI, and why the SKILL.md framework changed everything.
Since the original post, the workflow has grown three new pieces — a vendored static SQL analyzer, a pre-flight CI gate, and dependency-ordered Repeatable migrations. I've folded those into the relevant sections below. The bones of the system haven't changed, but it's more robust and harder to misuse than v2.1.
The Problem I Was Solving
Our database has 21 schemas and hundreds of objects — stored procedures, tables, views, functions, triggers, synonyms. Every time an engineer made a schema change, someone had to manually write a migration script for it. That meant:
Remembering the correct Flyway naming convention (V{major}.{minor}__{date}.ProjectDB.branchname.sql for versioned, R__{date}.ProjectDB.branchname.sql for repeatable)
Writing idempotency checks for every DDL statement — and getting them right
Knowing which objects are repeatable (views, stored procedures, functions) vs. versioned (tables, indexes, synonyms)
Handling deleted files by generating the correct DROP statements with existence checks
Calculating the next available version number without conflicting with other in-flight branches
Following SQL best practices: ONLINE = ON for index operations, FILLFACTOR settings, OPTIMIZE_FOR_SEQUENTIAL_KEY for identity PKs, proper constraint ordering
Every step was a chance for human error. And in a production database that powers the core business platform, human error isn't just inconvenient — it's expensive.
Before I went down the AI path, I tried the conventional approach first. Schema compare tools — Devart, Azure Data Studio — seemed like the natural solution. Point at two databases, diff, generate scripts. It worked, until it didn't. A version upgrade to Devart broke the PowerShell integration we used to automate the compare. Azure Data Studio's schema compare had its own quirks with synonym handling and cross-database references. The pattern was always the same: the tool worked great in demos, then crumbled when it hit our real-world complexity. And when it broke, debugging was opaque — proprietary internals, closed-source logic, no way to see why a particular diff was generated wrong.
I wanted engineers to modify their source SQL files, open a PR, and have the migration scripts appear automatically — validated, formatted, and ready for review.
Why My First Attempts Failed
My first instinct was straightforward: call the Claude API with the changed files and ask it to generate migration scripts. It worked — sort of. The scripts looked right at first glance, but they were inconsistent. Sometimes Claude used CREATE OR ALTER for tables (wrong — tables need versioned migrations with IF NOT EXISTS checks). Sometimes it forgot the GO batch separators. Sometimes it generated scripts for objects that hadn't actually changed.
The problem wasn't Claude's capability. The problem was that I was asking it to generate scripts without telling it how we do things here.
That's when I realized the answer wasn't a single clever prompt. It was a layered framework — operating rules, a skill definition, and a library of reference files — that gave the AI everything it needed before it wrote a single line of SQL.
The SKILL.md Framework (Applied to Database CI/CD)
The framework is built around the idea that AI tools fail not because they're incapable, but because they have no context. Every time you start a session, your AI assistant wakes up with amnesia. It doesn't know your project structure, your conventions, or the decisions you made last week. The fix is to write that context down in structured files that the AI reads before doing anything.
Three layers of context, but far more than three files:
Layer | File(s) | Purpose | Analogy |
Operating Layer | Conventions, architecture, anti-patterns — always loaded | Employee handbook | |
Skill Layer | sql-gener8-migration/SKILL.md | Migration type rules, permission placement, synonym replacement, self-validation | Project brief + playbook |
Reference Layer | 8 reference files | Detailed patterns, constraints, and standards for specific domains | Technical manuals |
Let me walk through how each layer shaped the final workflow.
CLAUDE.md — The Operating Layer
This is the backbone. Our CLAUDE.md tells Claude Code everything it needs to know about our database — before it writes a single line of SQL.
Project Architecture
The database has 49 functional schemas organized by domain:
DatabaseProject/
dbo/ # Default schema
messaging/ # Messaging analytics
voice/ # Voice communication
finance/ # Financial/billing
analytics/ # Business intelligence
etl/ # ETL processes
staging/ # Data staging
sync/ # Data synchronization
alerts/ # Alert/notification system
accounts/ # Customer/account management
carrier/ # Carrier/operator integration
erp/ # ERP system integration
procurement/ # Procurement integration
metadata/ # Metadata tracking
config/ # Operating mode config
mapping/ # Mapping/reference
conversion/ # Conversion tracking
transactions/ # Transaction system
utility/ # Utility functions
audit/ # Audit trail/logging
security/ # Access control/permissions
reporting/ # Report generation
notifications/ # Notification delivery
inventory/ # Inventory management
compliance/ # Regulatory compliance
routing/ # Message/call routing
provisioning/ # Service provisioning
rating/ # Rate/pricing engine
billing/ # Billing cycles
payments/ # Payment processing
disputes/ # Dispute resolution
orders/ # Order management
catalog/ # Product/service catalog
workflow/ # Workflow automation
scheduling/ # Job scheduling
archive/ # Data archival
integration/ # Third-party integration
monitoring/ # System monitoring
logging/ # Application logging
cache/ # Caching layer
geo/ # Geographic/location data
auth/ # Authentication
portal/ # Customer portal
support/ # Support/ticketing
docs/ # Document management
api/ # API management
batch/ # Batch processing
network/ # Network management
quality/ # Quality assuranceEach schema has subdirectories for Tables/, Stored Procedures/, Views/, Functions/, Triggers/, and Synonyms/. Claude needs to know this structure to categorize changes correctly.
SQL Anti-Patterns Table
The CLAUDE.md includes a table of 13 SSDT-derived anti-pattern rules that Claude must enforce when generating or reviewing SQL. These map directly to Visual Studio's static analysis rules and catch issues that schema compare tools silently ignore:
Rule | What It Catches |
SR0001 | No SELECT * in procedures — always list explicit columns |
SR0008 | Use SCOPE_IDENTITY() instead of @@IDENTITY |
SR0009 | Use CHAR(1)/CHAR(2) instead of VARCHAR(1)/VARCHAR(2) |
SR0010 | No deprecated = or = join syntax |
SR0016 | No sp_ prefix on stored procedures |
Beyond SSDT rules, the operating layer also enforces modern SQL Server 2019+ features: OPTIMIZE_FOR_SEQUENTIAL_KEY = ON for identity-based primary keys, RESUMABLE = ON for index operations, STRING_AGG() over FOR XML PATH, and SET XACT_ABORT ON for all transaction blocks.
Idempotency Patterns
This is where the specificity matters most. Instead of saying "make scripts idempotent," the CLAUDE.md spells out exactly which pattern to use for each object type:
-- Tables:
IF NOT EXISTS (SELECT 1 FROM sys.objects
WHERE object_id = OBJECT_ID(N'schema.TableName') AND type = 'U')
BEGIN
CREATE TABLE schema.TableName (...)
END
GO
-- Indexes (with online + resumable):
IF NOT EXISTS (SELECT 1 FROM sys.indexes
WHERE name = 'IX_Name' AND object_id = OBJECT_ID('schema.Table'))
BEGIN
CREATE NONCLUSTERED INDEX IX_Name ON schema.Table(Column)
WITH (FILLFACTOR = 95, ONLINE = ON (WAIT_AT_LOW_PRIORITY
(MAX_DURATION = 5 MINUTES, ABORT_AFTER_WAIT = SELF)),
RESUMABLE = ON, MAX_DURATION = 60 MINUTES)
END
GO
-- Stored Procedures (repeatable):
CREATE OR ALTER PROCEDURE schema.ProcName
AS
BEGIN
SET NOCOUNT ON;
...
END
GOWhat NOT to Do
Just as important as telling the AI what to do is telling it what not to do:
Never manually create migration scripts — the CI/CD workflow generates them
Never use CREATE OR ALTER for tables (tables are versioned, not repeatable)
Never use RAISERROR — use THROW instead
Never grant CONTROL ON SCHEMA to the CI/CD role
Never use MERGE statements — use UPDATE + IF @@ROWCOUNT = 0 INSERT instead
Never use @@IDENTITY — use SCOPE_IDENTITY()
Column modifications require a 3-step process: DROP dependent objects, MODIFY column, RECREATE dependent objects
This level of specificity is what makes the difference between "AI-generated scripts that look right" and "AI-generated scripts that actually work in production."
SKILL.md — The Skill Layer
While CLAUDE.md covers the entire project, the skill file sql-gener8-migration/SKILL.md focuses specifically on how migration generation operates. This is the central prompt — the single document that tells Claude exactly how to transform a set of changed SQL files into deployment-ready migration scripts.
The skill is defined with triggers (generate migration, migration script, flyway migration, sql-gener8-migration) so Claude Code activates it automatically when an engineer asks for migration scripts. Here's what it covers:
Migration type rules — repeatable (R__) for views, stored procedures, functions, triggers; versioned (V__) for tables, indexes, synonyms, UDTs, deleted objects
Permission placement — object-level GRANTs go only in repeatable migrations; schema-level GRANTs go only in version migrations for new schemas
Synonym variable replacement — SSDT project variables like $(LINKED_DB) get replaced with actual database names in migration output
Modified file handling — for status M files, Claude analyzes the diff (not the full file) and generates only the specific ALTER/DROP/ADD operations for what changed
Deleted file handling — scan for embedded triggers, generate DROP statements in correct order with IF EXISTS guards
New schema detection — automatically adds CREATE SCHEMA + role_app_db_cicd GRANT statements before any objects in that schema
31-check self-validation checklist — every generated script must pass all checks before the AI responds
This file is the spec. And as I learned the hard way: vague specs produce vague code. The more precise this document is, the less time you spend fixing what the AI generates.
Reference Files — The Knowledge Base
The third layer is a set of 8 reference files that live alongside SKILL.md in the references/ directory. Seven are loaded into the Claude prompt during migration generation; the eighth (dependency-scan.md) drives the workflow's impact analysis logic. Each one codifies a specific domain of knowledge.
File | Purpose |
Database name, schema list, SSDT variable-to-database mappings — the source of truth for this specific repository | |
Complete catalog of IF EXISTS / IF NOT EXISTS patterns for every object type (tables, indexes, constraints, triggers, synonyms, UDTs) | |
The 3-step DROP / ALTER / RECREATE process for modifying columns with dependent constraints and indexes | |
FILLFACTOR, ONLINE, RESUMABLE, OPTIMIZE_FOR_SEQUENTIAL_KEY rules — every index option and when to apply it | |
The CI/CD role's permission model — what role_app_db_cicd can and cannot do, and why CONTROL ON SCHEMA is excluded | |
Data type standards, naming conventions, SSDT anti-patterns (SR0001 through SR0016), SARGable predicates, dynamic SQL safety | |
Rules for object dependency scanning — how the workflow detects references to deleted tables and synonyms (drives workflow logic, not loaded into Claude prompt) | |
dependency-ordering.md (new in v2.2) | Topological sort rules for emitting Repeatable migrations in dependency order — functions before views, views before stored procedures, stored procedures before triggers |
The reference layer means I never had to cram everything into a single massive prompt. Each file is focused, maintainable, and version-controlled alongside the code it governs. When index standards change, I update index-standards.md — not a 3,000-line prompt buried in a YAML workflow.
Three Layers of SQL Enforcement (new in v2.2)
The original workflow had one validation layer: the 31-check self-validation embedded in SKILL.md, run by Claude as it generated each script. That's a guardrail against AI mistakes. But it doesn't catch human mistakes — the engineer who pushes a stored procedure with SELECT * or a table without a primary key. The repeatable migration generator simply mirrors the source file. If the source SQL is broken, the migration carries the breakage forward.
I added two more enforcement layers in v2.2. The result is the same SQL ruleset checked in three places:
Layer | Where | When it runs | Behavior on CRITICAL |
Local pre-commit | .githooks/pre-commit (opt-in) | On every git commit against staged *.sql | Blocks the commit |
Pre-flight CI gate | Step 4 of the migration generation workflow | Inside the workflow, before any Claude API call | Aborts migration generation |
Authoritative CI gate | Standalone sql-valid8.yml workflow | Every PR; every push to master | Fails the required check |
All three layers run the same vendored analyzer — a 52-rule, multi-dialect SQL static analyzer that ships in the repo at .githooks/analyze_sql.py. It auto-detects MSSQL / PostgreSQL / Oracle / MySQL / SQLite from syntax. Severity is classified as CRITICAL / HIGH / MEDIUM / LOW. CRITICAL findings block the merge; the rest are surfaced as warnings.
Why three layers, when one would technically be enough?
Local hook = fast feedback. Engineers find out about violations on commit, not after pushing and waiting for CI.
Pre-flight gate = cost control. If the source SQL is already broken, there's no point spending an LLM call to generate a migration from it.
Authoritative CI gate = enforcement. git commit --no-verify skips the local hook, but it can't bypass the CI check. The PR doesn't merge if the gate fails.
The analyzer self-updates from the upstream marketplace on every CI run, committing back to master when the version changes. Engineers get the latest rules on the next git pull — no manual update.
The Workflow Itself — How It All Comes Together
Here's what happens when an engineer opens a PR with SQL changes. The pipeline is now an 18-step GitHub Actions workflow (it was 17 before v2.2; the pre-flight gate is the new step):
1. Checkout and Branch Extraction
The workflow checks out the full repository history and extracts the branch name and base branch from the PR context.
2. Change Detection
The workflow diffs the PR branch against master and identifies every changed .sql file in the database project directory:
git diff --name-status -M90% "origin/master...HEAD" -- "ProjectDB/**/*.sql"The -M90% flag handles file renames — if a file was moved or renamed, git tracks it instead of treating it as a delete + add.
3. Pre-flight sql-valid8 Gate (new in v2.2)
Before any Claude API call, the workflow runs the vendored analyzer against the changed SQL. CRITICAL findings abort migration generation immediately and post a workflow summary listing the violations. This catches the obvious mistakes — missing primary keys, bare data types, deprecated TEXT/NTEXT/IMAGE — before the LLM gets involved. Cheaper, faster, and the engineer sees the failure as a clear analyzer report rather than a confusing AI-generated artifact.
4. Categorization
Each changed file is classified based on its directory path:
Object Type | Migration Type | Prefix |
Views, Stored Procedures, Functions, Triggers | Repeatable | R__ |
Tables (structural changes), Indexes, Synonyms, User Defined Types | Versioned | V{major}.{minor}__ |
Tables (permissions only) | Repeatable | R__ |
The categorization isn't just based on the directory. The workflow reads the file content to distinguish between a table file that contains CREATE TABLE (versioned) vs. one that only contains GRANT statements (repeatable). This is a nuance that a simpler approach would miss.
5. Impact Analysis with Dependency Scanning
This is where the workflow goes beyond simple script generation. Before calling Claude, it analyzes the impact of every change across the codebase. When tables or synonyms are deleted, the workflow detects code references using schema-qualified git grep with line-level filtering:
Unbracketed search: git grep -n -F -w "schema.ObjectName" catches FROM core.Supplier, JOIN core.Suppliers
Bracketed search: git grep -n -F "[schema].[ObjectName]" catches [voice].[Supplier]
Each matching line is filtered to exclude false positives — lines starting with -- (SQL comments) and lines matching CREATE (OR ALTER)? PROCEDURE|FUNCTION|VIEW|TRIGGER (object declarations, not references). Self-references, migration files, and security files are also excluded.
When dependencies are found, the risk level auto-escalates: 1–2 references bumps LOW to MEDIUM, 3+ bumps to HIGH. The PR comment shows a collapsible section listing every affected object and file. Reviewers see the blast radius before they approve anything.
6–7. Cleanup and Version Calculation
Old migration scripts from previous pushes on the same branch are deleted (one set of scripts per branch, always). The workflow scans the Migration/ folder for existing versions and calculates the next available version number.
8. Context Preparation
The workflow builds a structured prompt containing the changed file contents (or diffs for modified files), deleted file contents retrieved from origin/master, the branch name, database name, categorization results, and the reference files.
9. Claude API Call
The API call uses Claude Opus 4.7. In v2.1 I was passing temperature: 0 for deterministic output, but the parameter is now deprecated upstream and was dropped in v2.2 — determinism comes from the prompt itself and the validation checklist, not from a model parameter. Claude returns its response in a structured XML format with separate sections for repeatable and version migrations.
10. Repeatable Migration Assembly (with dependency ordering)
For repeatable objects (views, stored procedures, functions, triggers), the workflow reads the source files directly and builds the repeatable migration by converting CREATE to CREATE OR ALTER and appending the object-level GRANT statements.
New in v2.2: the order objects are emitted is no longer alphabetical. I learned this the hard way after a deploy where a stored procedure compiled fine but failed at runtime with Invalid object name 'cc.vwOrderTotal'. The view existed in the source files — the alphabetical sort just emitted the SP first, and SQL Server's deferred name resolution let the SP CREATE succeed even though the view didn't exist yet at that point in the batch. The first call to the SP after deploy was the failure.
The fix: a topological sort with a tier model. Functions emit first (referenced by everything), then views, then stored procedures, then triggers. Inside a tier, alphabetical. Cycles get flagged as a warning — cycles in views/functions are usually a real bug, not something to silently rearrange. The workflow renders the inferred dependency graph as a Mermaid diagram in the job summary so reviewers can spot accidental cycles before they hit QA.
11. SQL Validation
After generation, every script goes through the 31-check self-validation. Critical errors block the PR. Warnings are flagged but don't block.
12–15. File Writing, Security Audit, Role Permissions, Project File Update
Generated scripts are written to the Migration/ folder. The workflow audits security permissions, detects new schemas and auto-grants CI/CD role permissions, and updates the Visual Studio project file (.sqlproj) with wildcard patterns for new schema directories.
16–17. Deleted File Cleanup and Commit
The .sqlproj is synced for deleted files (stale <Build Include> entries get removed), and all changes are committed directly to the PR branch. If the push is rejected because another bot commit landed first, the workflow fetches and rebases automatically, retrying up to 3 times.
18. PR Comment
A detailed comment is posted showing validation status, script names, impact analysis, dependency warnings, and what was generated. The reviewer sees everything without having to run anything manually.
The 31-Check Self-Validation
Early versions of the workflow had basic validation — check for a PRIMARY KEY, verify BEGIN/END matching, confirm data types have lengths. That caught the obvious mistakes. But production taught me that the subtle mistakes were the expensive ones.
The current validation is a 31-check self-validation checklist embedded directly in SKILL.md. Claude must verify every check passes before it emits a response. If any check fails, it fixes the issue before responding. The checks are split into two categories:
Core Checks (1–12)
# | Check |
1 | BEGIN/END matching — every BEGIN has a corresponding END |
2 | GO separators between DDL batches; never inside a BEGIN…END block |
3 | Valid data types with length/precision — no bare VARCHAR() or DECIMAL |
4 | Correct filename format for both V__ and R__ migrations |
5 | Permission placement — object-level GRANTs only in repeatable, schema-level only in version |
6 | No duplicate objects — each CREATE/ALTER appears only once |
7 | Idempotent guards on all version migration objects |
8 | CREATE OR ALTER for all repeatable objects |
9 | No SSDT variable syntax remains — all $(...) replaced with plain database names |
10 | Reference data inserts use per-row guards, not a single wrapping IF NOT EXISTS |
11 | Index options: FILLFACTOR, ONLINE, RESUMABLE, OPTIMIZE_FOR_SEQUENTIAL_KEY where applicable |
12 | Security constraints — no GRANT to unauthorized roles, no DROP SCHEMA, no TRUNCATE TABLE |
SSDT Anti-Pattern Checks (13–31)
# | Check |
13 | No deprecated TEXT, NTEXT, or IMAGE data types |
14 | No @@IDENTITY — use SCOPE_IDENTITY() |
15 | No SELECT * in stored procedures (except in EXISTS) |
16 | No deprecated = / = join syntax |
17 | SET NOCOUNT ON in all stored procedures |
18 | TRY-CATCH with ROLLBACK for all transaction blocks |
19 | Dynamic SQL uses sp_executesql with parameters, never EXEC(@variable) |
20 | No cursors — use set-based operations |
21 | SARGable predicates — no functions applied to columns in WHERE clauses |
22 | No sp_ prefix on stored procedures |
23 | No leading wildcard LIKE '%value' patterns |
24 | CHAR(1)/CHAR(2) instead of VARCHAR(1)/VARCHAR(2) |
25 | No MERGE statements |
26 | OPTIMIZE_FOR_SEQUENTIAL_KEY on IDENTITY PKs |
27 | Covering indexes with proper key vs. INCLUDE column placement |
28 | SET XACT_ABORT ON with all transactions |
29 | Schema qualification on all object references |
30 | Explicit INSERT column lists — never bare VALUES |
31 | EXISTS instead of COUNT(*) > 0 for existence checks |
Critical errors block the PR. The engineer sees exactly what failed in a PR comment. Warnings are flagged but don't block. This is the guardrail layer — AI-generated scripts need validation, not blind trust. And in v2.2, the same ruleset is enforced again externally by the static analyzer described above — so even if Claude's self-check missed something, the analyzer catches it before merge.
Object Dependency Scanning
One of the most valuable additions to the workflow came from a real production incident: someone deleted a table without realizing three stored procedures still referenced it. The procedures compiled fine in SSDT (they were in different schemas), but failed at runtime.
Now, whenever tables or synonyms are deleted, the workflow automatically scans the entire codebase for references. The search uses schema-qualified git grep with line-level filtering:
Lines starting with -- (SQL comments) are excluded — -- Description: core.Supplier is not a real dependency
Lines matching CREATE (OR ALTER)? PROCEDURE|FUNCTION|VIEW|TRIGGER are excluded — these are object name declarations, not table references
Self-references, migration files, and security files are filtered out
For modified tables, the workflow detects dropped columns from the diff and searches for column-level references across stored procedures and views.
The risk auto-escalates based on reference count: 1–2 references found bumps a LOW risk to MEDIUM, 3 or more bumps it to HIGH. The PR comment renders the dependencies in a collapsible section so reviewers see the blast radius at a glance without cluttering the main summary.
This catches cross-schema dependencies that SSDT's build process misses entirely — because SSDT only validates within the project boundary, while git grep searches everything.
Automated Config Drift Detection
There's a subtle problem with any system that maintains configuration files alongside source code: they drift. A new schema gets added in a PR, but nobody updates repo-config.md. A new synonym variable mapping gets introduced, but the reference file still has the old list. The migration generator works from stale data and produces wrong output.
I solved this with a separate GitHub Actions workflow: sync-skill-config.yml. It runs on every push to master that touches SQL files. Here's what it does:
Scans the filesystem to collect the actual directory tree, synonym file contents, and a per-schema inventory of tables, views, SPs, functions, triggers, UDTs, and synonyms
Reads the current repo-config.md and role_app_db_cicd.sql files
Calls Claude to compare the filesystem state against what the config files claim
If drift is detected — missing schemas, new synonym mappings, stale variable references, missing schema-level GRANTs — it pushes the corrected files to a long-lived bot branch and opens (or updates) a single PR with auto-merge enabled
Update in v2.2: the original implementation created a new feature/skill-migration-generation-v{N} branch on every drift event, which led to version sprawl — old branches piling up, manual review for every config change, and reviewers having to understand a dozen flavors of "auto-detected, please update description." The current implementation uses a single long-lived branch (bot/sql-skill-config-sync) that gets force-pushed on every drift event. One transient PR is reused or recreated, and auto-merge is enabled (squash + branch delete) so it merges itself the moment required status checks pass. Net effect: at most one config-sync PR exists at any time, master stays in sync without manual review for non-functional config drift, and there's no version sprawl.
The workflow also surfaces orphans — schemas listed in config but no longer present on disk, or variable mappings referenced by no synonym file. These go into a separate <DEPRECATED> section in the PR body, but they're never auto-removed. Deletion stays a human decision because an "orphan" is sometimes really a renamed or moved schema, not a dead one.
The result: the reference files that drive migration generation are always in sync with the actual repository state. No manual maintenance. No "forgot to update the config" surprises three PRs later.
What Made It Work: Structure Over Prompting
The insight that changed everything for me was this: the developers getting real results from AI aren't using secret prompt tricks. They're giving the AI enough context — written down, structured, and consistent.
When I was prompting Claude ad-hoc, I got inconsistent results. When I put the rules in CLAUDE.md, the skill definition in SKILL.md, and the detailed patterns in the reference files, the output quality jumped dramatically. Not because Claude suddenly got smarter, but because it finally had the context it needed to do the job right.
A few specific things that made the biggest difference:
Explicit Idempotency Patterns
Instead of saying "make scripts idempotent," I showed Claude exactly which IF NOT EXISTS pattern to use for each object type. Copy-paste-level specificity. The AI stopped guessing and started following the template.
Negative Rules
Telling the AI what NOT to do was just as important as telling it what to do. "Never use CREATE OR ALTER for tables" prevented a mistake that would have broken Flyway's version tracking.
Real Examples
The reference files include actual patterns from our project. Not hypothetical examples — real ones that had been deployed to production. This gave Claude a reference point that was grounded in our codebase, not generic SQL Server documentation.
Phased Approach
The workflow doesn't try to do everything in one shot. It detects changes first, analyzes impact, categorizes them, generates scripts, then validates. Each phase is isolated. When something fails, you know exactly where and why.
Belt and Suspenders (new in v2.2)
I used to think one good validation layer was enough. After a few production-adjacent near-misses, I added two more — a local pre-commit hook and a pre-flight CI gate. They run the same ruleset; the redundancy is the point. If one layer is bypassed (someone runs git commit --no-verify), the next one catches it. If the AI's self-validation misses something, the static analyzer catches it. Defense in depth, applied to LLM output.
The Tech Stack
Database | SQL Server 2022 (21 schemas, hundreds of objects) |
Migration tool | Flyway |
CI/CD orchestration | GitHub Actions + Jenkins |
Migration script generation | Claude API (Opus 4.7; temperature dropped in v2.2) |
Prompt framework | sql-gener8-migration SKILL.md + 8 reference files |
SQL static analysis (new in v2.2) | Vendored 52-rule analyzer at .githooks/analyze_sql.py — pre-commit + pre-flight CI + authoritative CI gate |
AI coding assistant | Claude Code (for building the workflow itself) |
Build / compile | MSBuild / SSDT |
Automation scripts | PowerShell |
Config sync | sync-skill-config.yml workflow (long-lived bot branch + auto-merging PR) |
Version tracking | Custom VersionHistory table + Flyway schema history |
Notifications | Slack |
Building the Workflow with Claude Code
Here's the part that I find most worth sharing: I used Claude Code to build the workflow that calls Claude's API. AI building AI tooling. It sounds recursive, but it worked — and it worked because of structure.
The process looked like this:
I wrote the CLAUDE.md first. Before any code existed, I documented every convention, every naming rule, every idempotency pattern. This took time — but it was time I would have spent explaining the same things to a human developer anyway.
I wrote the skill definition (SKILL.md) and reference files. What the workflow should do, step by step. What triggers it. What it outputs. What blocks a PR vs. what warns. I co-authored these with Claude Code, asking it to challenge the spec for gaps and edge cases before implementation. The reference files evolved from a single large document into 8 focused files as the system grew.
Then I let Claude Code implement. With the operating layer and skill layer in place, the implementation was fast. Claude Code generated the GitHub Actions YAML, the PowerShell scripts for change detection and categorization, the API call logic, the impact analysis, and the validation rules. I reviewed, iterated, and refined — but the heavy lifting was done.
The entire workflow — from first commit to working automation — took days, not weeks. That's the speed that structure unlocks.
Results
Since rolling this out:
Zero manual migration script writing. Engineers modify source SQL files. The pipeline generates the migration scripts automatically.
Consistent quality. Every script follows the same idempotency patterns, naming conventions, and formatting. No more "it depends on who wrote it."
Faster code reviews. Reviewers focus on business logic and correctness — not on whether someone remembered a GO statement or an IF EXISTS check.
Fewer deployment failures. The 31-check validation layer catches issues before they reach staging, let alone production.
Three layers of SQL enforcement (v2.2). Local pre-commit hook + pre-flight CI gate + authoritative CI gate — the same 52-rule analyzer running in three places. CRITICAL findings can't reach master.
Dependency-correct Repeatable migrations (v2.2). Topological sort means stored procedures emit after the views they reference. No more "Invalid object name" failures from alphabetical sort order.
Version conflicts resolved automatically. When two branches generate the same version number, the pipeline detects and reassigns transparently.
Object dependencies surfaced automatically. When a table or synonym is deleted, the PR comment shows every stored procedure, view, and function that references it — before the reviewer has to go looking.
Config drift detected and fixed via auto-merging PRs (v2.2). The sync-skill-config workflow catches when the filesystem diverges from the reference files and creates a corrective PR that merges itself once required checks pass.
What I Learned
AI-generated scripts need guardrails, not trust
Claude generates solid migration scripts, but the validation layer is non-negotiable. The AI occasionally misses edge cases — a trigger that needs COALESCE for null safety, a column modification that forgets to drop the dependent constraint first. The 31-check self-validation catches these before a human reviewer ever sees the PR. And in v2.2, the external static analyzer catches them again. Belt and suspenders.
Write the context document before writing code
The CLAUDE.md and reference files took real effort to write. But every hour spent on those documents saved many more hours of correcting AI output, debugging inconsistent scripts, and explaining the same things in different prompts. If you can explain your conventions clearly enough for a new team member, you can write them down for AI.
Specificity beats cleverness
"Make it idempotent" is a vague instruction. "Use this exact IF NOT EXISTS pattern for tables, this pattern for indexes, and CREATE OR ALTER for stored procedures" is a spec. The more precise your instructions, the less time you spend on rework. If you find yourself correcting the AI repeatedly, your spec wasn't specific enough.
Determinism comes from structure, not from a temperature dial
The original v2.0 workflow set temperature: 0 on the API call to get repeatable output. Then temperature was deprecated upstream and dropped in v2.2. I expected the output to get more variable. It didn't — because by then, determinism wasn't coming from the model parameter. It was coming from the prompt: explicit patterns, an enforced ruleset, and a 31-check self-validation that fixes the same things the same way every time. If your output drifts when you remove temperature 0, your prompt isn't doing the work.
The phased approach prevents fragile systems
Building the workflow in phases — detect, analyze impact, categorize, generate, validate, commit — meant each layer was testable independently. When something broke, I knew exactly where to look. Trying to build it all at once would have produced a fragile monolith.
Schema compare tools fail silently; AI fails transparently
When Devart's schema compare broke after a version upgrade, I spent days debugging opaque PowerShell integration errors with no visibility into what went wrong internally. When Claude generates a wrong migration script, I can read the prompt, see what context was missing, and fix it in the reference files. The AI approach is version-controlled, auditable, and transparent in ways that proprietary tools never were. When it breaks, I know why — and the fix is a markdown edit, not a vendor support ticket.
Start with Structure
If there's one takeaway from this entire project, it's this: AI-assisted coding works when you provide structure. Not perfect prompts. Not clever tricks. Structure.
Start with a CLAUDE.md that covers your project's conventions and architecture. Add a SKILL.md for the specific capability you're building. Break out detailed patterns into focused reference files. Then let the AI implement against that foundation.
The gap between "AI that generates plausible-looking code" and "AI that generates production-ready code" isn't model capability. It's context. Write the context down. Make it structured. Make it specific. The results will speak for themselves.
The workflow described in this post is the sql-gener8-migration skill, running in production today across multiple database repositories, generating migration scripts for SQL Server databases with 21 schemas each. The skill framework — SKILL.md, 8 reference files, the sync-skill-config workflow, and the vendored sql-valid8 analyzer — is version-controlled alongside the database code it governs. If you're building something similar for database CI/CD or have questions about structuring AI context files, feel free to reach out.
![[Part 2] SQL Server 2022 Crashes — When the Fix Doesn’t Hold: A Second Root Cause Hiding Behind the First](https://static.wixstatic.com/media/46731a_fa73ba43b8dc406ca299be3be01f50bc~mv2.png/v1/fill/w_980,h_512,al_c,q_90,usm_0.66_1.00_0.01,enc_avif,quality_auto/46731a_fa73ba43b8dc406ca299be3be01f50bc~mv2.png)
![[Part 1] SQL Server 2022 Engine Crashes on Windows Server 2025 — Tracing the Cause to an EDR Heap Hook Incompatibility](https://static.wixstatic.com/media/46731a_bec6851d7d7947b18114fb31eb9a2aa9~mv2.jpg/v1/fill/w_980,h_980,al_c,q_85,usm_0.66_1.00_0.01,enc_avif,quality_auto/46731a_bec6851d7d7947b18114fb31eb9a2aa9~mv2.jpg)

Comments