AI-Powered SQL Migration Script Generator

Mar 6
21 min read

Updated: May 10

[UPDATED]

There's a principle that stuck with me recently. The core idea was simple: stop describing what you want to build — start building it. If you can explain what you want clearly enough, AI can build it. But the catch is that "clearly enough" is doing a lot of heavy lifting in that sentence.

Without structure, AI coding tools produce confident-looking output that quietly falls apart. They ignore your conventions, invent patterns you didn't ask for, and generate code that contradicts what's already in your project. You spend more time cleaning up after the AI than it would have taken to write the code yourself.

I experienced this firsthand. And then I figured out how to fix it — not with better prompts, but with better structure. The result was a fully automated GitHub Actions workflow that uses Claude's API to generate production-ready SQL Server migration scripts every time a pull request is opened. No manual script writing. No inconsistent formatting. No missed idempotency checks.

This is the story of how I built it, what I learned about working with AI, and why the SKILL.md framework changed everything.

Since the original post, the workflow has grown three new pieces — a vendored static SQL analyzer, a pre-flight CI gate, and dependency-ordered Repeatable migrations. I've folded those into the relevant sections below. The bones of the system haven't changed, but it's more robust and harder to misuse than v2.1.

The Problem I Was Solving

Our database has 21 schemas and hundreds of objects — stored procedures, tables, views, functions, triggers, synonyms. Every time an engineer made a schema change, someone had to manually write a migration script for it. That meant:

Remembering the correct Flyway naming convention (V{major}.{minor}__{date}.ProjectDB.branchname.sql for versioned, R__{date}.ProjectDB.branchname.sql for repeatable)
Writing idempotency checks for every DDL statement — and getting them right
Knowing which objects are repeatable (views, stored procedures, functions) vs. versioned (tables, indexes, synonyms)
Handling deleted files by generating the correct DROP statements with existence checks
Calculating the next available version number without conflicting with other in-flight branches
Following SQL best practices: ONLINE = ON for index operations, FILLFACTOR settings, OPTIMIZE_FOR_SEQUENTIAL_KEY for identity PKs, proper constraint ordering

Every step was a chance for human error. And in a production database that powers the core business platform, human error isn't just inconvenient — it's expensive.

Before I went down the AI path, I tried the conventional approach first. Schema compare tools — Devart, Azure Data Studio — seemed like the natural solution. Point at two databases, diff, generate scripts. It worked, until it didn't. A version upgrade to Devart broke the PowerShell integration we used to automate the compare. Azure Data Studio's schema compare had its own quirks with synonym handling and cross-database references. The pattern was always the same: the tool worked great in demos, then crumbled when it hit our real-world complexity. And when it broke, debugging was opaque — proprietary internals, closed-source logic, no way to see why a particular diff was generated wrong.

I wanted engineers to modify their source SQL files, open a PR, and have the migration scripts appear automatically — validated, formatted, and ready for review.

Why My First Attempts Failed

My first instinct was straightforward: call the Claude API with the changed files and ask it to generate migration scripts. It worked — sort of. The scripts looked right at first glance, but they were inconsistent. Sometimes Claude used CREATE OR ALTER for tables (wrong — tables need versioned migrations with IF NOT EXISTS checks). Sometimes it forgot the GO batch separators. Sometimes it generated scripts for objects that hadn't actually changed.

The problem wasn't Claude's capability. The problem was that I was asking it to generate scripts without telling it how we do things here.

That's when I realized the answer wasn't a single clever prompt. It was a layered framework — operating rules, a skill definition, and a library of reference files — that gave the AI everything it needed before it wrote a single line of SQL.

The SKILL.md Framework (Applied to Database CI/CD)

The framework is built around the idea that AI tools fail not because they're incapable, but because they have no context. Every time you start a session, your AI assistant wakes up with amnesia. It doesn't know your project structure, your conventions, or the decisions you made last week. The fix is to write that context down in structured files that the AI reads before doing anything.

Three layers of context, but far more than three files:

Layer	File(s)	Purpose	Analogy
Operating Layer	CLAUDE.md	Conventions, architecture, anti-patterns — always loaded	Employee handbook
Skill Layer	sql-gener8-migration/SKILL.md	Migration type rules, permission placement, synonym replacement, self-validation	Project brief + playbook
Reference Layer	8 reference files	Detailed patterns, constraints, and standards for specific domains	Technical manuals

Let me walk through how each layer shaped the final workflow.

CLAUDE.md — The Operating Layer

This is the backbone. Our CLAUDE.md tells Claude Code everything it needs to know about our database — before it writes a single line of SQL.

Project Architecture

The database has 49 functional schemas organized by domain:

DatabaseProject/
    dbo/              # Default schema
    messaging/        # Messaging analytics
    voice/            # Voice communication
    finance/          # Financial/billing
    analytics/        # Business intelligence
    etl/              # ETL processes
    staging/          # Data staging
    sync/             # Data synchronization
    alerts/           # Alert/notification system
    accounts/         # Customer/account management
    carrier/          # Carrier/operator integration
    erp/              # ERP system integration
    procurement/      # Procurement integration
    metadata/         # Metadata tracking
    config/           # Operating mode config
    mapping/          # Mapping/reference
    conversion/       # Conversion tracking
    transactions/     # Transaction system
    utility/          # Utility functions
    audit/            # Audit trail/logging
    security/         # Access control/permissions
    reporting/        # Report generation
    notifications/    # Notification delivery
    inventory/        # Inventory management
    compliance/       # Regulatory compliance
    routing/          # Message/call routing
    provisioning/     # Service provisioning
    rating/           # Rate/pricing engine
    billing/          # Billing cycles
    payments/         # Payment processing
    disputes/         # Dispute resolution
    orders/           # Order management
    catalog/          # Product/service catalog
    workflow/         # Workflow automation
    scheduling/       # Job scheduling
    archive/          # Data archival
    integration/      # Third-party integration
    monitoring/       # System monitoring
    logging/          # Application logging
    cache/            # Caching layer
    geo/              # Geographic/location data
    auth/             # Authentication
    portal/           # Customer portal
    support/          # Support/ticketing
    docs/             # Document management
    api/              # API management
    batch/            # Batch processing
    network/          # Network management
    quality/          # Quality assurance

Each schema has subdirectories for Tables/, Stored Procedures/, Views/, Functions/, Triggers/, and Synonyms/. Claude needs to know this structure to categorize changes correctly.

SQL Anti-Patterns Table

The CLAUDE.md includes a table of 13 SSDT-derived anti-pattern rules that Claude must enforce when generating or reviewing SQL. These map directly to Visual Studio's static analysis rules and catch issues that schema compare tools silently ignore:

Rule	What It Catches
SR0001	No SELECT * in procedures — always list explicit columns
SR0008	Use SCOPE_IDENTITY() instead of @@IDENTITY
SR0009	Use CHAR(1)/CHAR(2) instead of VARCHAR(1)/VARCHAR(2)
SR0010	No deprecated = or = join syntax
SR0016	No sp_ prefix on stored procedures

Beyond SSDT rules, the operating layer also enforces modern SQL Server 2019+ features: OPTIMIZE_FOR_SEQUENTIAL_KEY = ON for identity-based primary keys, RESUMABLE = ON for index operations, STRING_AGG() over FOR XML PATH, and SET XACT_ABORT ON for all transaction blocks.

Idempotency Patterns

This is where the specificity matters most. Instead of saying "make scripts idempotent," the CLAUDE.md spells out exactly which pattern to use for each object type:

-- Tables:
IF NOT EXISTS (SELECT 1 FROM sys.objects
    WHERE object_id = OBJECT_ID(N'schema.TableName') AND type = 'U')
BEGIN
    CREATE TABLE schema.TableName (...)
END
GO

-- Indexes (with online + resumable):
IF NOT EXISTS (SELECT 1 FROM sys.indexes
    WHERE name = 'IX_Name' AND object_id = OBJECT_ID('schema.Table'))
BEGIN
    CREATE NONCLUSTERED INDEX IX_Name ON schema.Table(Column)
        WITH (FILLFACTOR = 95, ONLINE = ON (WAIT_AT_LOW_PRIORITY
            (MAX_DURATION = 5 MINUTES, ABORT_AFTER_WAIT = SELF)),
            RESUMABLE = ON, MAX_DURATION = 60 MINUTES)
END
GO

-- Stored Procedures (repeatable):
CREATE OR ALTER PROCEDURE schema.ProcName
AS
BEGIN
    SET NOCOUNT ON;
    ...
END
GO

What NOT to Do

Just as important as telling the AI what to do is telling it what not to do:

Never manually create migration scripts — the CI/CD workflow generates them
Never use CREATE OR ALTER for tables (tables are versioned, not repeatable)
Never use RAISERROR — use THROW instead
Never grant CONTROL ON SCHEMA to the CI/CD role
Never use MERGE statements — use UPDATE + IF @@ROWCOUNT = 0 INSERT instead
Never use @@IDENTITY — use SCOPE_IDENTITY()
Column modifications require a 3-step process: DROP dependent objects, MODIFY column, RECREATE dependent objects

This level of specificity is what makes the difference between "AI-generated scripts that look right" and "AI-generated scripts that actually work in production."

SKILL.md — The Skill Layer

While CLAUDE.md covers the entire project, the skill file sql-gener8-migration/SKILL.md focuses specifically on how migration generation operates. This is the central prompt — the single document that tells Claude exactly how to transform a set of changed SQL files into deployment-ready migration scripts.

The skill is defined with triggers (generate migration, migration script, flyway migration, sql-gener8-migration) so Claude Code activates it automatically when an engineer asks for migration scripts. Here's what it covers:

Migration type rules — repeatable (R__) for views, stored procedures, functions, triggers; versioned (V__) for tables, indexes, synonyms, UDTs, deleted objects
Permission placement — object-level GRANTs go only in repeatable migrations; schema-level GRANTs go only in version migrations for new schemas
Synonym variable replacement — SSDT project variables like $(LINKED_DB) get replaced with actual database names in migration output
Modified file handling — for status M files, Claude analyzes the diff (not the full file) and generates only the specific ALTER/DROP/ADD operations for what changed
Deleted file handling — scan for embedded triggers, generate DROP statements in correct order with IF EXISTS guards
New schema detection — automatically adds CREATE SCHEMA + role_app_db_cicd GRANT statements before any objects in that schema
31-check self-validation checklist — every generated script must pass all checks before the AI responds

This file is the spec. And as I learned the hard way: vague specs produce vague code. The more precise this document is, the less time you spend fixing what the AI generates.

Reference Files — The Knowledge Base

The third layer is a set of 8 reference files that live alongside SKILL.md in the references/ directory. Seven are loaded into the Claude prompt during migration generation; the eighth (dependency-scan.md) drives the workflow's impact analysis logic. Each one codifies a specific domain of knowledge.

File	Purpose
repo-config.md	Database name, schema list, SSDT variable-to-database mappings — the source of truth for this specific repository
idempotent-patterns.md	Complete catalog of IF EXISTS / IF NOT EXISTS patterns for every object type (tables, indexes, constraints, triggers, synonyms, UDTs)
column-modification.md	The 3-step DROP / ALTER / RECREATE process for modifying columns with dependent constraints and indexes
index-standards.md	FILLFACTOR, ONLINE, RESUMABLE, OPTIMIZE_FOR_SEQUENTIAL_KEY rules — every index option and when to apply it
security-constraints.md	The CI/CD role's permission model — what role_app_db_cicd can and cannot do, and why CONTROL ON SCHEMA is excluded
sql-standards.md	Data type standards, naming conventions, SSDT anti-patterns (SR0001 through SR0016), SARGable predicates, dynamic SQL safety
dependency-scan.md	Rules for object dependency scanning — how the workflow detects references to deleted tables and synonyms (drives workflow logic, not loaded into Claude prompt)
dependency-ordering.md (new in v2.2)	Topological sort rules for emitting Repeatable migrations in dependency order — functions before views, views before stored procedures, stored procedures before triggers

The reference layer means I never had to cram everything into a single massive prompt. Each file is focused, maintainable, and version-controlled alongside the code it governs. When index standards change, I update index-standards.md — not a 3,000-line prompt buried in a YAML workflow.

Three Layers of SQL Enforcement (new in v2.2)

The original workflow had one validation layer: the 31-check self-validation embedded in SKILL.md, run by Claude as it generated each script. That's a guardrail against AI mistakes. But it doesn't catch human mistakes — the engineer who pushes a stored procedure with SELECT * or a table without a primary key. The repeatable migration generator simply mirrors the source file. If the source SQL is broken, the migration carries the breakage forward.

I added two more enforcement layers in v2.2. The result is the same SQL ruleset checked in three places:

Layer	Where	When it runs	Behavior on CRITICAL
Local pre-commit	.githooks/pre-commit (opt-in)	On every git commit against staged *.sql	Blocks the commit
Pre-flight CI gate	Step 4 of the migration generation workflow	Inside the workflow, before any Claude API call	Aborts migration generation
Authoritative CI gate	Standalone sql-valid8.yml workflow	Every PR; every push to master	Fails the required check

All three layers run the same vendored analyzer — a 52-rule, multi-dialect SQL static analyzer that ships in the repo at .githooks/analyze_sql.py. It auto-detects MSSQL / PostgreSQL / Oracle / MySQL / SQLite from syntax. Severity is classified as CRITICAL / HIGH / MEDIUM / LOW. CRITICAL findings block the merge; the rest are surfaced as warnings.

Why three layers, when one would technically be enough?

Local hook = fast feedback. Engineers find out about violations on commit, not after pushing and waiting for CI.
Pre-flight gate = cost control. If the source SQL is already broken, there's no point spending an LLM call to generate a migration from it.
Authoritative CI gate = enforcement. git commit --no-verify skips the local hook, but it can't bypass the CI check. The PR doesn't merge if the gate fails.

The analyzer self-updates from the upstream marketplace on every CI run, committing back to master when the version changes. Engineers get the latest rules on the next git pull — no manual update.

The Workflow Itself — How It All Comes Together

Here's what happens when an engineer opens a PR with SQL changes. The pipeline is now an 18-step GitHub Actions workflow (it was 17 before v2.2; the pre-flight gate is the new step):

1. Checkout and Branch Extraction

The workflow checks out the full repository history and extracts the branch name and base branch from the PR context.

2. Change Detection

The workflow diffs the PR branch against master and identifies every changed .sql file in the database project directory:

git diff --name-status -M90% "origin/master...HEAD" -- "ProjectDB/**/*.sql"

The -M90% flag handles file renames — if a file was moved or renamed, git tracks it instead of treating it as a delete + add.

3. Pre-flight sql-valid8 Gate (new in v2.2)

Before any Claude API call, the workflow runs the vendored analyzer against the changed SQL. CRITICAL findings abort migration generation immediately and post a workflow summary listing the violations. This catches the obvious mistakes — missing primary keys, bare data types, deprecated TEXT/NTEXT/IMAGE — before the LLM gets involved. Cheaper, faster, and the engineer sees the failure as a clear analyzer report rather than a confusing AI-generated artifact.

4. Categorization

Each changed file is classified based on its directory path:

Object Type	Migration Type	Prefix
Views, Stored Procedures, Functions, Triggers	Repeatable	R__
Tables (structural changes), Indexes, Synonyms, User Defined Types	Versioned	V{major}.{minor}__
Tables (permissions only)	Repeatable	R__

The categorization isn't just based on the directory. The workflow reads the file content to distinguish between a table file that contains CREATE TABLE (versioned) vs. one that only contains GRANT statements (repeatable). This is a nuance that a simpler approach would miss.

5. Impact Analysis with Dependency Scanning

This is where the workflow goes beyond simple script generation. Before calling Claude, it analyzes the impact of every change across the codebase. When tables or synonyms are deleted, the workflow detects code references using schema-qualified git grep with line-level filtering:

Unbracketed search: git grep -n -F -w "schema.ObjectName" catches FROM core.Supplier, JOIN core.Suppliers
Bracketed search: git grep -n -F "[schema].[ObjectName]" catches [voice].[Supplier]

Each matching line is filtered to exclude false positives — lines starting with -- (SQL comments) and lines matching CREATE (OR ALTER)? PROCEDURE|FUNCTION|VIEW|TRIGGER (object declarations, not references). Self-references, migration files, and security files are also excluded.

When dependencies are found, the risk level auto-escalates: 1–2 references bumps LOW to MEDIUM, 3+ bumps to HIGH. The PR comment shows a collapsible section listing every affected object and file. Reviewers see the blast radius before they approve anything.

6–7. Cleanup and Version Calculation

Old migration scripts from previous pushes on the same branch are deleted (one set of scripts per branch, always). The workflow scans the Migration/ folder for existing versions and calculates the next available version number.

8. Context Preparation

The workflow builds a structured prompt containing the changed file contents (or diffs for modified files), deleted file contents retrieved from origin/master, the branch name, database name, categorization results, and the reference files.

9. Claude API Call

The API call uses Claude Opus 4.7. In v2.1 I was passing temperature: 0 for deterministic output, but the parameter is now deprecated upstream and was dropped in v2.2 — determinism comes from the prompt itself and the validation checklist, not from a model parameter. Claude returns its response in a structured XML format with separate sections for repeatable and version migrations.

10. Repeatable Migration Assembly (with dependency ordering)

For repeatable objects (views, stored procedures, functions, triggers), the workflow reads the source files directly and builds the repeatable migration by converting CREATE to CREATE OR ALTER and appending the object-level GRANT statements.

New in v2.2: the order objects are emitted is no longer alphabetical. I learned this the hard way after a deploy where a stored procedure compiled fine but failed at runtime with Invalid object name 'cc.vwOrderTotal'. The view existed in the source files — the alphabetical sort just emitted the SP first, and SQL Server's deferred name resolution let the SP CREATE succeed even though the view didn't exist yet at that point in the batch. The first call to the SP after deploy was the failure.

The fix: a topological sort with a tier model. Functions emit first (referenced by everything), then views, then stored procedures, then triggers. Inside a tier, alphabetical. Cycles get flagged as a warning — cycles in views/functions are usually a real bug, not something to silently rearrange. The workflow renders the inferred dependency graph as a Mermaid diagram in the job summary so reviewers can spot accidental cycles before they hit QA.

11. SQL Validation

After generation, every script goes through the 31-check self-validation. Critical errors block the PR. Warnings are flagged but don't block.

12–15. File Writing, Security Audit, Role Permissions, Project File Update

Generated scripts are written to the Migration/ folder. The workflow audits security permissions, detects new schemas and auto-grants CI/CD role permissions, and updates the Visual Studio project file (.sqlproj) with wildcard patterns for new schema directories.

16–17. Deleted File Cleanup and Commit

The .sqlproj is synced for deleted files (stale <Build Include> entries get removed), and all changes are committed directly to the PR branch. If the push is rejected because another bot commit landed first, the workflow fetches and rebases automatically, retrying up to 3 times.

18. PR Comment

A detailed comment is posted showing validation status, script names, impact analysis, dependency warnings, and what was generated. The reviewer sees everything without having to run anything manually.

The 31-Check Self-Validation

Early versions of the workflow had basic validation — check for a PRIMARY KEY, verify BEGIN/END matching, confirm data types have lengths. That caught the obvious mistakes. But production taught me that the subtle mistakes were the expensive ones.

The current validation is a 31-check self-validation checklist embedded directly in SKILL.md. Claude must verify every check passes before it emits a response. If any check fails, it fixes the issue before responding. The checks are split into two categories:

Core Checks (1–12)

#	Check
1	BEGIN/END matching — every BEGIN has a corresponding END
2	GO separators between DDL batches; never inside a BEGIN…END block
3	Valid data types with length/precision — no bare VARCHAR() or DECIMAL
4	Correct filename format for both V__ and R__ migrations
5	Permission placement — object-level GRANTs only in repeatable, schema-level only in version
6	No duplicate objects — each CREATE/ALTER appears only once
7	Idempotent guards on all version migration objects
8	CREATE OR ALTER for all repeatable objects
9	No SSDT variable syntax remains — all $(...) replaced with plain database names
10	Reference data inserts use per-row guards, not a single wrapping IF NOT EXISTS
11	Index options: FILLFACTOR, ONLINE, RESUMABLE, OPTIMIZE_FOR_SEQUENTIAL_KEY where applicable
12	Security constraints — no GRANT to unauthorized roles, no DROP SCHEMA, no TRUNCATE TABLE

SSDT Anti-Pattern Checks (13–31)

#	Check
13	No deprecated TEXT, NTEXT, or IMAGE data types
14	No @@IDENTITY — use SCOPE_IDENTITY()
15	No SELECT * in stored procedures (except in EXISTS)
16	No deprecated = / = join syntax
17	SET NOCOUNT ON in all stored procedures
18	TRY-CATCH with ROLLBACK for all transaction blocks
19	Dynamic SQL uses sp_executesql with parameters, never EXEC(@variable)
20	No cursors — use set-based operations
21	SARGable predicates — no functions applied to columns in WHERE clauses
22	No sp_ prefix on stored procedures
23	No leading wildcard LIKE '%value' patterns
24	CHAR(1)/CHAR(2) instead of VARCHAR(1)/VARCHAR(2)
25	No MERGE statements
26	OPTIMIZE_FOR_SEQUENTIAL_KEY on IDENTITY PKs
27	Covering indexes with proper key vs. INCLUDE column placement
28	SET XACT_ABORT ON with all transactions
29	Schema qualification on all object references
30	Explicit INSERT column lists — never bare VALUES
31	EXISTS instead of COUNT(*) > 0 for existence checks

Critical errors block the PR. The engineer sees exactly what failed in a PR comment. Warnings are flagged but don't block. This is the guardrail layer — AI-generated scripts need validation, not blind trust. And in v2.2, the same ruleset is enforced again externally by the static analyzer described above — so even if Claude's self-check missed something, the analyzer catches it before merge.

Object Dependency Scanning

One of the most valuable additions to the workflow came from a real production incident: someone deleted a table without realizing three stored procedures still referenced it. The procedures compiled fine in SSDT (they were in different schemas), but failed at runtime.

Now, whenever tables or synonyms are deleted, the workflow automatically scans the entire codebase for references. The search uses schema-qualified git grep with line-level filtering:

Lines starting with -- (SQL comments) are excluded — -- Description: core.Supplier is not a real dependency
Lines matching CREATE (OR ALTER)? PROCEDURE|FUNCTION|VIEW|TRIGGER are excluded — these are object name declarations, not table references
Self-references, migration files, and security files are filtered out

For modified tables, the workflow detects dropped columns from the diff and searches for column-level references across stored procedures and views.

The risk auto-escalates based on reference count: 1–2 references found bumps a LOW risk to MEDIUM, 3 or more bumps it to HIGH. The PR comment renders the dependencies in a collapsible section so reviewers see the blast radius at a glance without cluttering the main summary.

This catches cross-schema dependencies that SSDT's build process misses entirely — because SSDT only validates within the project boundary, while git grep searches everything.

Automated Config Drift Detection

There's a subtle problem with any system that maintains configuration files alongside source code: they drift. A new schema gets added in a PR, but nobody updates repo-config.md. A new synonym variable mapping gets introduced, but the reference file still has the old list. The migration generator works from stale data and produces wrong output.

I solved this with a separate GitHub Actions workflow: sync-skill-config.yml. It runs on every push to master that touches SQL files. Here's what it does:

Scans the filesystem to collect the actual directory tree, synonym file contents, and a per-schema inventory of tables, views, SPs, functions, triggers, UDTs, and synonyms
Reads the current repo-config.md and role_app_db_cicd.sql files
Calls Claude to compare the filesystem state against what the config files claim
If drift is detected — missing schemas, new synonym mappings, stale variable references, missing schema-level GRANTs — it pushes the corrected files to a long-lived bot branch and opens (or updates) a single PR with auto-merge enabled

Update in v2.2: the original implementation created a new feature/skill-migration-generation-v{N} branch on every drift event, which led to version sprawl — old branches piling up, manual review for every config change, and reviewers having to understand a dozen flavors of "auto-detected, please update description." The current implementation uses a single long-lived branch (bot/sql-skill-config-sync) that gets force-pushed on every drift event. One transient PR is reused or recreated, and auto-merge is enabled (squash + branch delete) so it merges itself the moment required status checks pass. Net effect: at most one config-sync PR exists at any time, master stays in sync without manual review for non-functional config drift, and there's no version sprawl.

The workflow also surfaces orphans — schemas listed in config but no longer present on disk, or variable mappings referenced by no synonym file. These go into a separate <DEPRECATED> section in the PR body, but they're never auto-removed. Deletion stays a human decision because an "orphan" is sometimes really a renamed or moved schema, not a dead one.

The result: the reference files that drive migration generation are always in sync with the actual repository state. No manual maintenance. No "forgot to update the config" surprises three PRs later.

What Made It Work: Structure Over Prompting

The insight that changed everything for me was this: the developers getting real results from AI aren't using secret prompt tricks. They're giving the AI enough context — written down, structured, and consistent.

When I was prompting Claude ad-hoc, I got inconsistent results. When I put the rules in CLAUDE.md, the skill definition in SKILL.md, and the detailed patterns in the reference files, the output quality jumped dramatically. Not because Claude suddenly got smarter, but because it finally had the context it needed to do the job right.

A few specific things that made the biggest difference:

Explicit Idempotency Patterns

Instead of saying "make scripts idempotent," I showed Claude exactly which IF NOT EXISTS pattern to use for each object type. Copy-paste-level specificity. The AI stopped guessing and started following the template.

Negative Rules

Telling the AI what NOT to do was just as important as telling it what to do. "Never use CREATE OR ALTER for tables" prevented a mistake that would have broken Flyway's version tracking.

Real Examples

The reference files include actual patterns from our project. Not hypothetical examples — real ones that had been deployed to production. This gave Claude a reference point that was grounded in our codebase, not generic SQL Server documentation.

Phased Approach

The workflow doesn't try to do everything in one shot. It detects changes first, analyzes impact, categorizes them, generates scripts, then validates. Each phase is isolated. When something fails, you know exactly where and why.

Belt and Suspenders (new in v2.2)

I used to think one good validation layer was enough. After a few production-adjacent near-misses, I added two more — a local pre-commit hook and a pre-flight CI gate. They run the same ruleset; the redundancy is the point. If one layer is bypassed (someone runs git commit --no-verify), the next one catches it. If the AI's self-validation misses something, the static analyzer catches it. Defense in depth, applied to LLM output.

The Tech Stack

Database	SQL Server 2022 (21 schemas, hundreds of objects)
Migration tool	Flyway
CI/CD orchestration	GitHub Actions + Jenkins
Migration script generation	Claude API (Opus 4.7; temperature dropped in v2.2)
Prompt framework	sql-gener8-migration SKILL.md + 8 reference files
SQL static analysis (new in v2.2)	Vendored 52-rule analyzer at .githooks/analyze_sql.py — pre-commit + pre-flight CI + authoritative CI gate
AI coding assistant	Claude Code (for building the workflow itself)
Build / compile	MSBuild / SSDT
Automation scripts	PowerShell
Config sync	sync-skill-config.yml workflow (long-lived bot branch + auto-merging PR)
Version tracking	Custom VersionHistory table + Flyway schema history
Notifications	Slack

Building the Workflow with Claude Code

Here's the part that I find most worth sharing: I used Claude Code to build the workflow that calls Claude's API. AI building AI tooling. It sounds recursive, but it worked — and it worked because of structure.

The process looked like this:

I wrote the CLAUDE.md first. Before any code existed, I documented every convention, every naming rule, every idempotency pattern. This took time — but it was time I would have spent explaining the same things to a human developer anyway.
I wrote the skill definition (SKILL.md) and reference files. What the workflow should do, step by step. What triggers it. What it outputs. What blocks a PR vs. what warns. I co-authored these with Claude Code, asking it to challenge the spec for gaps and edge cases before implementation. The reference files evolved from a single large document into 8 focused files as the system grew.
Then I let Claude Code implement. With the operating layer and skill layer in place, the implementation was fast. Claude Code generated the GitHub Actions YAML, the PowerShell scripts for change detection and categorization, the API call logic, the impact analysis, and the validation rules. I reviewed, iterated, and refined — but the heavy lifting was done.

The entire workflow — from first commit to working automation — took days, not weeks. That's the speed that structure unlocks.

Results

Since rolling this out:

Zero manual migration script writing. Engineers modify source SQL files. The pipeline generates the migration scripts automatically.
Consistent quality. Every script follows the same idempotency patterns, naming conventions, and formatting. No more "it depends on who wrote it."
Faster code reviews. Reviewers focus on business logic and correctness — not on whether someone remembered a GO statement or an IF EXISTS check.
Fewer deployment failures. The 31-check validation layer catches issues before they reach staging, let alone production.
Three layers of SQL enforcement (v2.2). Local pre-commit hook + pre-flight CI gate + authoritative CI gate — the same 52-rule analyzer running in three places. CRITICAL findings can't reach master.
Dependency-correct Repeatable migrations (v2.2). Topological sort means stored procedures emit after the views they reference. No more "Invalid object name" failures from alphabetical sort order.
Version conflicts resolved automatically. When two branches generate the same version number, the pipeline detects and reassigns transparently.
Object dependencies surfaced automatically. When a table or synonym is deleted, the PR comment shows every stored procedure, view, and function that references it — before the reviewer has to go looking.
Config drift detected and fixed via auto-merging PRs (v2.2). The sync-skill-config workflow catches when the filesystem diverges from the reference files and creates a corrective PR that merges itself once required checks pass.

What I Learned

AI-generated scripts need guardrails, not trust

Claude generates solid migration scripts, but the validation layer is non-negotiable. The AI occasionally misses edge cases — a trigger that needs COALESCE for null safety, a column modification that forgets to drop the dependent constraint first. The 31-check self-validation catches these before a human reviewer ever sees the PR. And in v2.2, the external static analyzer catches them again. Belt and suspenders.

Write the context document before writing code

The CLAUDE.md and reference files took real effort to write. But every hour spent on those documents saved many more hours of correcting AI output, debugging inconsistent scripts, and explaining the same things in different prompts. If you can explain your conventions clearly enough for a new team member, you can write them down for AI.

Specificity beats cleverness

"Make it idempotent" is a vague instruction. "Use this exact IF NOT EXISTS pattern for tables, this pattern for indexes, and CREATE OR ALTER for stored procedures" is a spec. The more precise your instructions, the less time you spend on rework. If you find yourself correcting the AI repeatedly, your spec wasn't specific enough.

Determinism comes from structure, not from a temperature dial

The original v2.0 workflow set temperature: 0 on the API call to get repeatable output. Then temperature was deprecated upstream and dropped in v2.2. I expected the output to get more variable. It didn't — because by then, determinism wasn't coming from the model parameter. It was coming from the prompt: explicit patterns, an enforced ruleset, and a 31-check self-validation that fixes the same things the same way every time. If your output drifts when you remove temperature 0, your prompt isn't doing the work.

The phased approach prevents fragile systems

Building the workflow in phases — detect, analyze impact, categorize, generate, validate, commit — meant each layer was testable independently. When something broke, I knew exactly where to look. Trying to build it all at once would have produced a fragile monolith.

Schema compare tools fail silently; AI fails transparently

When Devart's schema compare broke after a version upgrade, I spent days debugging opaque PowerShell integration errors with no visibility into what went wrong internally. When Claude generates a wrong migration script, I can read the prompt, see what context was missing, and fix it in the reference files. The AI approach is version-controlled, auditable, and transparent in ways that proprietary tools never were. When it breaks, I know why — and the fix is a markdown edit, not a vendor support ticket.

Start with Structure

If there's one takeaway from this entire project, it's this: AI-assisted coding works when you provide structure. Not perfect prompts. Not clever tricks. Structure.

Start with a CLAUDE.md that covers your project's conventions and architecture. Add a SKILL.md for the specific capability you're building. Break out detailed patterns into focused reference files. Then let the AI implement against that foundation.

The gap between "AI that generates plausible-looking code" and "AI that generates production-ready code" isn't model capability. It's context. Write the context down. Make it structured. Make it specific. The results will speak for themselves.

The workflow described in this post is the sql-gener8-migration skill, running in production today across multiple database repositories, generating migration scripts for SQL Server databases with 21 schemas each. The skill framework — SKILL.md, 8 reference files, the sync-skill-config workflow, and the vendored sql-valid8 analyzer — is version-controlled alongside the database code it governs. If you're building something similar for database CI/CD or have questions about structuring AI context files, feel free to reach out.