The Night Audit That Found 308 Issues Across 12 Projects
- Running 12 projects solo means things slip through, no matter how careful you are
- A 5-phase audit (MAP, SCAN, FIX, VERIFY, REPORT) found 308 issues across every repo in one overnight run
- 250+ of those issues were auto-fixed, leaving only a diff review before committing
- 14 categories covered: security, accessibility, brand compliance, dead code, SEO, dependencies, and more
- Living reports let you compare audits over time to see if your codebase is getting healthier or rotting
Things Slip Through
I run 12 active projects. Different stacks. Different stages of maturity. Some are Next.js apps on Vercel. Some are Shopify themes with Liquid templates. Some are CLI tools. One is an entire design system.
When you work alone on that many codebases, entropy is not a risk. It is a guarantee.
A dependency goes stale in one repo while you are focused on another. An accessibility attribute gets dropped during a refactor. A spacing value creeps in that does not match the design system. A blog post references a product count that was accurate 3 months ago but is wrong now.
None of these are catastrophic on their own. But they compound. After 6 months of shipping fast across 12 repos, the accumulated debt becomes a drag. Small bugs. Inconsistencies that erode trust. SEO descriptions that no longer match reality. Security advisories you missed because you were heads-down on a feature.
I tried scheduling manual review sessions. Fridays, 2 hours, go through each repo. It lasted 3 weeks. By week 4, I skipped it because I was behind on a launch. By week 6, I had forgotten the practice existed.
That is when I built the night audit.
The Idea: Run It While You Sleep
The concept is simple. Before I close my laptop at night, I kick off an automated audit that crawls every project, categorizes every issue it finds, fixes what it can, and generates a report I read over coffee the next morning.
No manual checklist. No "I will get to it this weekend." Just a systematic sweep that runs whether I am disciplined enough to do it manually or not.
The first time I ran it, it found 308 issues across 12 projects. That number shocked me. I thought I was keeping things clean. I was wrong.
Here is what the process looks like.
Phase 1: MAP
Before you can audit anything, you need to know what you are auditing. MAP builds a complete picture of the ecosystem.
It scans every repository and catalogs:
- Which frameworks and languages each project uses
- How many files, components, and pages exist
- Which dependencies are installed and at what versions
- What the deployment targets are
- How the projects relate to each other (shared code, submodules, cross-references)
This matters because a one-size-fits-all audit would miss context. Checking for React best practices in a Liquid template is useless. Checking for Shopify-specific security issues in a CLI tool is noise. MAP ensures that each project gets audited with the right ruleset.
The mapping phase takes about 2 minutes across all 12 repos. It produces a structured manifest that the remaining phases reference. Think of it as the audit knowing what it is looking at before it starts looking.
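To make that concrete, here is a rough TypeScript sketch of what one manifest entry might hold. The field names and detection heuristics are just for illustration, not the actual manifest format:

```ts
import { existsSync, readFileSync } from "node:fs";
import { join } from "node:path";

// Hypothetical shape of one MAP manifest entry. Field names are illustrative.
interface ProjectManifest {
  name: string;
  path: string;
  framework: "nextjs" | "shopify-liquid" | "cli" | "unknown";
  dependencies: Record<string, string>; // package name -> installed version range
  deployTarget?: string;                // e.g. "vercel"
}

// Very rough framework detection from files present at the repo root.
function detectFramework(repoPath: string): ProjectManifest["framework"] {
  if (
    existsSync(join(repoPath, "next.config.js")) ||
    existsSync(join(repoPath, "next.config.mjs"))
  ) {
    return "nextjs";
  }
  if (existsSync(join(repoPath, "layout", "theme.liquid"))) return "shopify-liquid";
  return "unknown";
}

function mapProject(name: string, repoPath: string): ProjectManifest {
  const pkgPath = join(repoPath, "package.json");
  const pkg = existsSync(pkgPath) ? JSON.parse(readFileSync(pkgPath, "utf8")) : {};
  return {
    name,
    path: repoPath,
    framework: detectFramework(repoPath),
    dependencies: { ...(pkg.dependencies ?? {}), ...(pkg.devDependencies ?? {}) },
    deployTarget: existsSync(join(repoPath, "vercel.json")) ? "vercel" : undefined,
  };
}
```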
Phase 2: SCAN
SCAN is where the work happens. It runs checks across 14 categories:
Security: Exposed credentials, API keys in code, outdated dependencies with known vulnerabilities, insecure configurations, missing rate limiting.
Accessibility: Missing alt text, insufficient color contrast, missing ARIA labels, keyboard navigation gaps, form elements without labels.
Brand compliance: Wrong colors (#fff instead of #F5F5F7), wrong fonts, incorrect spacing values, em dashes in content, banned words in copy.
SEO: Missing meta descriptions, titles over 60 characters, missing Open Graph tags, broken internal links, missing structured data.
Dead code: Unused imports, unreachable code paths, orphaned components, CSS classes referenced nowhere, functions called by nothing.
Dependencies: Outdated packages, packages with security advisories, unnecessary dependencies that inflate bundle size.
Performance: Unoptimized images, missing lazy loading, render-blocking resources, excessive bundle sizes.
Content accuracy: Stale numbers (product counts, blog post counts, "years of experience"), outdated screenshots, broken external links.
Code quality: Inconsistent naming conventions, missing error handling, console.log statements left in production code.
Configuration: Missing environment variables, wrong API versions, inconsistent deployment settings across projects.
Documentation: Outdated README files, missing setup instructions, stale comments that describe code that no longer exists.
Testing: Missing test coverage for critical paths, broken tests that pass silently, test files that import modules that have been renamed.
Duplication: Copy-pasted code blocks across repos, duplicated utility functions, repeated CSS that should be shared.
Infrastructure: SSL certificate issues, DNS misconfigurations, deployment pipeline gaps, missing health checks.
Each category runs its checks in parallel. A single category might have 20-40 individual checks. Across 14 categories, that is hundreds of validation points per repository, multiplied by 12 repositories.
The first run takes about 15 minutes. Subsequent runs are faster because the MAP phase caches project structure.
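For a sense of what one check looks like, and how a category's checks can fan out concurrently, here is a simplified sketch. The console.log check is real enough to run, but it is an illustration, not the scanner itself:

```ts
import { promises as fs } from "node:fs";

interface Finding {
  project: string;
  category: string;
  file: string;
  line: number;
  message: string;
  autoFixable: boolean;
}

// One tiny example check: flag console.log calls left in source files.
async function checkConsoleLogs(project: string, files: string[]): Promise<Finding[]> {
  const findings: Finding[] = [];
  for (const file of files) {
    const lines = (await fs.readFile(file, "utf8")).split("\n");
    lines.forEach((text, index) => {
      if (text.includes("console.log(")) {
        findings.push({
          project,
          category: "code-quality",
          file,
          line: index + 1,
          message: "console.log left in production code",
          autoFixable: true,
        });
      }
    });
  }
  return findings;
}

// Each category check returns its own findings; the categories run concurrently.
async function scanProject(project: string, files: string[]): Promise<Finding[]> {
  const categories: Promise<Finding[]>[] = [
    checkConsoleLogs(project, files),
    // ...the other thirteen categories would be listed here
  ];
  return (await Promise.all(categories)).flat();
}
```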
Phase 3: FIX
This is the phase that saves hours. SCAN finds issues. FIX resolves the ones that have clear, safe, automated solutions.
Not every issue can be auto-fixed. If a component is missing a critical accessibility attribute, the fix depends on context that requires human judgment. Those get flagged for manual review.
But many issues have exactly one correct fix (one such fix is sketched after this list):
- Wrong color value? Replace with the correct one.
- Missing alt text on a decorative image? Add alt="".
- Outdated dependency with no breaking changes? Update it.
- Console.log left in production code? Remove it.
- Stale number in a meta description? Update to current count.
- Em dash in blog content? Replace with a comma or period.
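As a sketch of how deterministic a fix like the color swap can be (illustrative code, not the tool's fixer):

```ts
import { promises as fs } from "node:fs";

// Hypothetical brand-color fix: replace a banned hex value with the approved one.
// A real fixer would also skip comments and strings that intentionally contain
// the old value; this is the simplest possible version.
const BANNED = /#fff\b/gi;
const APPROVED = "#F5F5F7";

async function fixBrandColor(file: string): Promise<boolean> {
  const before = await fs.readFile(file, "utf8");
  const after = before.replace(BANNED, APPROVED);
  if (after === before) return false;      // nothing to fix in this file
  await fs.writeFile(file, after, "utf8"); // writes to the working tree only; nothing is committed
  return true;
}
```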
On the first full run, 250+ out of 308 issues were auto-fixable. That is an 81% auto-fix rate. The remaining 58 required human decisions. Mostly structural issues, missing test coverage, or accessibility improvements that needed design context.
The FIX phase creates all changes as uncommitted modifications. Nothing gets pushed automatically. I review the diffs the next morning, confirm they look right, and commit. The audit does the work. I make the final call.
This is a deliberate design choice. Fully automated fixing without human review is how you introduce 50 new bugs while fixing 250 old ones. The audit is the workhorse. The human is the quality gate.
Phase 4: VERIFY
After FIX runs, VERIFY re-scans every modified file to confirm the fixes are correct and did not introduce new issues.
This sounds obvious but it matters. An auto-fix that updates a dependency might break an import. A color replacement might hit a false positive in a comment. A removed console.log might accidentally delete the line below it if the parser gets confused.
VERIFY catches these regressions. On the first run, it found 3 fixes that introduced minor issues (two formatting problems and one import that needed updating after a dependency bump). Those got flagged and I handled them manually.
After VERIFY passes, the audit has high confidence that the proposed fixes are clean. Not perfect confidence. But high enough that reviewing the diffs takes 15 minutes instead of 2 hours.
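The core of the check fits in a few lines: re-scan the files that FIX touched and flag anything that was not in the pre-fix findings. A standalone sketch, with an invented Issue shape:

```ts
// Minimal sketch of the VERIFY idea. The Issue shape is invented for illustration.
interface Issue {
  file: string;
  category: string;
  message: string;
}

const issueKey = (i: Issue) => `${i.file}|${i.category}|${i.message}`;

// Given the findings from before the fixes and a fresh re-scan of only the
// files FIX modified, anything new is a regression to handle by hand.
function regressions(beforeFix: Issue[], afterFix: Issue[]): Issue[] {
  const known = new Set(beforeFix.map(issueKey));
  return afterFix.filter((issue) => !known.has(issueKey(issue)));
}
```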
Phase 5: REPORT
The final phase generates the audit report. This is not a raw dump of findings. It is a structured document organized by severity, category, and project.
The report includes (a rough storage sketch follows this list):
- **Executive summary:** Total issues found, total auto-fixed, total requiring manual attention, comparison to previous audit if one exists
- **Critical findings:** Security issues, exposed credentials, broken production features (address these first)
- **Category breakdown:** Each of the 14 categories with counts and specific findings
- **Per-project view:** Issues grouped by repository, so you can focus on one project at a time
- **Fix log:** Every auto-fix applied, with before/after diffs
- **Manual review queue:** Issues that need human judgment, with context and suggested approaches
- **Trend data:** If you have run previous audits, how the numbers compare over time
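Here is a sketch of how a run like that could be stored, so later runs have something to compare against. The shape is illustrative, not the actual report schema:

```ts
import { promises as fs } from "node:fs";

// Hypothetical shape of one stored audit run. Field names are illustrative.
interface AuditReport {
  timestamp: string;                                    // ISO date of the run
  totals: { found: number; autoFixed: number; manual: number };
  byCategory: Record<string, number>;                   // e.g. { security: 12, accessibility: 47 }
  byProject: Record<string, number>;                    // issue counts per repository
  fixLog: { file: string; description: string }[];      // every auto-fix applied
  manualQueue: { file: string; description: string }[]; // needs human judgment
}

// Each run is written to its own timestamped file so later runs can be compared.
async function saveReport(report: AuditReport, dir = "audits"): Promise<string> {
  await fs.mkdir(dir, { recursive: true });
  const path = `${dir}/${report.timestamp.replace(/:/g, "-")}.json`; // colons stripped for portable filenames
  await fs.writeFile(path, JSON.stringify(report, null, 2), "utf8");
  return path;
}
```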
The first report was humbling. 308 issues is a lot when you thought your codebases were in decent shape. But the breakdown helped prioritize. 12 were security-related (fixed those first). 47 were accessibility gaps (fixed those second). The rest were quality and consistency issues that I worked through over the following week.
What 308 Issues Actually Looked Like
Let me share some specifics from that first run, because the categories are less interesting than the actual findings.
Security (12 issues): Two API tokens that were hardcoded in test files instead of using environment variables. Four dependencies with published CVEs. Six configuration files missing proper access controls.
Accessibility (47 issues): 23 images missing meaningful alt text. 8 interactive elements without ARIA labels. 11 color contrast failures against the dark background. 5 forms with unlabeled inputs.
Brand compliance (31 issues): 14 instances of #fff instead of #F5F5F7. 9 spacing values not on the approved scale. 5 em dashes in blog content. 3 uses of "we" instead of "I" in public copy.
Dead code (42 issues): 18 unused imports across 6 projects. 12 functions that nothing called. 7 CSS classes applied to elements that no longer existed. 5 entire components that were orphaned after refactors.
Content accuracy (28 issues): Product counts that were 3 months stale. Blog post counts that did not match reality. Two pages claiming "10+ years" when the actual number is nearly 20. Several meta descriptions referencing features that had been renamed.
None of these would have caused an outage. All of them were eroding quality. The kind of slow rot that you do not notice until a customer does.
Living Reports and Audit Comparison
The report from a single audit is useful. The comparison between audits is where the real value lives.
After the first audit, I ran a second one a week later (after fixing the critical findings from round one). The comparison showed:
- Security issues: 12 down to 0
- Accessibility issues: 47 down to 11 (the remaining 11 needed design decisions)
- Brand compliance: 31 down to 2
- Total issues: 308 down to 89
That trajectory matters. It tells you whether your codebase is getting healthier or rotting. Without comparison data, every audit feels like a fresh pile of problems. With comparison data, you see progress.
The audit stores each run as a timestamped report. You can compare any two runs. You can track specific categories over time. You can spot trends (are accessibility issues creeping back up? are dependency updates falling behind again?).
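The comparison itself is simple once each run is stored as per-category counts. A sketch of the diffing step, with invented names:

```ts
// Sketch of comparing two runs by their per-category counts, e.g. the
// byCategory maps from two stored reports. Names are illustrative.
interface CategoryDelta {
  from: number;
  to: number;
  delta: number;
}

function compareRuns(
  previous: Record<string, number>,
  current: Record<string, number>,
): Record<string, CategoryDelta> {
  const categories = new Set([...Object.keys(previous), ...Object.keys(current)]);
  const diff: Record<string, CategoryDelta> = {};
  for (const category of categories) {
    const from = previous[category] ?? 0;
    const to = current[category] ?? 0;
    diff[category] = { from, to, delta: to - from };
  }
  return diff;
}

// compareRuns({ security: 12, accessibility: 47 }, { security: 0, accessibility: 11 })
// -> { security: { from: 12, to: 0, delta: -12 }, accessibility: { from: 47, to: 11, delta: -36 } }
```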
I run the audit roughly once a week now. Sometimes more often before a launch. The numbers keep going down, and the manual review queue gets shorter every time. That feedback loop is addictive.
Why This Cannot Be a CI Pipeline
Someone will read this and think "just put it in CI." I tried. It does not work the same way, for three reasons.
First, CI runs on a single repo. This audit runs across 12 repos simultaneously, comparing cross-project patterns and catching inconsistencies between them. Duplicated utility functions across repos. Diverging brand implementations. Version mismatches in shared dependencies.
Second, CI checks on every commit. That creates alert fatigue. An audit that runs nightly (or weekly) gives you a consolidated view. One report. One review session. One batch of fixes. Not 30 separate CI failures across 15 pull requests.
Third, the auto-fix capability. CI tells you something is wrong. This audit tells you something is wrong, fixes it, verifies the fix, and presents you with a clean diff to approve. The difference in time-to-resolution is massive.
CI is great for catching regressions in active development. The night audit is great for catching the slow decay that happens across an entire ecosystem while you are focused elsewhere.
Running It Yourself
The methodology is not secret. MAP your ecosystem. SCAN across categories. FIX what has safe automated solutions. VERIFY the fixes. REPORT with comparison data.
You could build this with shell scripts and some patience. The hard part is not the concept. It is the coverage. Getting 14 categories of checks to work correctly across different project types. Handling edge cases in auto-fixes. Building the comparison engine that tracks trends.
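The top-level loop really is small; all the difficulty hides inside the phases. A hypothetical skeleton, with each phase passed in as a plain function:

```ts
// Hypothetical skeleton of the five phases wired in sequence. Each phase is a
// function you supply; the real work lives inside whatever you plug in here.
async function nightAudit<Manifest, Findings, Fixes, Regressions>(
  repos: string[],
  phases: {
    map: (repos: string[]) => Promise<Manifest>;
    scan: (manifest: Manifest) => Promise<Findings>;
    fix: (findings: Findings) => Promise<Fixes>;
    verify: (fixes: Fixes) => Promise<Regressions>;
    report: (result: { findings: Findings; fixes: Fixes; regressions: Regressions }) => Promise<void>;
  },
): Promise<void> {
  const manifest = await phases.map(repos);        // MAP: know what you are auditing
  const findings = await phases.scan(manifest);    // SCAN: find the issues
  const fixes = await phases.fix(findings);        // FIX: apply the safe ones, uncommitted
  const regressions = await phases.verify(fixes);  // VERIFY: did the fixes hold
  await phases.report({ findings, fixes, regressions }); // REPORT: one document to read over coffee
}
```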
I spent about 3 weeks getting the first version right. Another 2 weeks refining the auto-fix logic after it introduced a few false positives. And ongoing tuning as new patterns emerge.
If you want the finished product, FULLMOON is available as a Claude Code skill. It runs the full 5-phase audit with all 14 categories, parallel scanning, auto-fix, verification, and living reports with audit comparison. 49 EUR.
It is the single tool in my stack that I would rebuild from scratch if I lost it. Nothing else catches what it catches. Nothing else saves as much time while I am not working.
This article contains affiliate links. If you sign up through them, I may earn a small commission at no extra cost to you. (Ad)