Architect Your Development Lifecycle with Context

The fastest-moving teams treat documentation as infrastructure, not an afterthought. SDLC.md gives you structured markdown templates that capture requirements, architecture decisions, and domain knowledge in a format both humans and AI understand.

Drop a CLAUDE.md or SDLC.md file into your repository root. AI coding assistants automatically consume it, gaining deep understanding of your project structure, conventions, and constraints before writing a single line of code.

Stop re-explaining your architecture in every prompt. Start versioning your context alongside your code and watch your entire team - human and AI - move faster with shared understanding.

SDLC Context Best Practices

Proven patterns for structuring your software development lifecycle documentation so AI assistants and team members get maximum value from every file.

Structure by Phase

Organize context files by SDLC phase - planning, design, implementation, testing, deployment. Each phase gets its own markdown section with clear ownership and review cadence.

Version Decision Records

Capture every architecture decision as a versioned ADR in markdown. Include the context, options considered, decision rationale, and consequences. Future developers and AI assistants need the why, not just the what.

Define Stakeholder Context

Document who owns each phase, who reviews, and who approves. Clear ownership in your SDLC context prevents decisions from falling through the cracks and helps AI assistants route questions appropriately.

Keep Context Fresh

Stale documentation is worse than no documentation - it actively misleads. Set review cadences per file type: architecture docs quarterly, API specs with every release, onboarding guides monthly.

Make Context Discoverable

Use consistent naming conventions and a root-level index file that maps context files to their purpose. AI assistants and new team members should find the right file in seconds, not minutes.
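
A root-level index can be as simple as a short table mapping each file to its purpose and review cadence. A hypothetical sketch (file names are illustrative, not prescribed):

```markdown
# Context Index

| File | Purpose | Review cadence |
|------|---------|----------------|
| SDLC.md | Full lifecycle context for AI assistants | Quarterly |
| docs/adr/ | Architecture decision records | On change |
| docs/api-spec.md | API contracts and versioning | Every release |
| docs/onboarding.md | New developer setup and conventions | Monthly |
```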

Layer Your Context

Build context in layers - project overview at the top, domain details in the middle, implementation specifics at the bottom. AI assistants perform best when context flows from broad to narrow.

Test Your Context

Give your context files to a new team member or AI assistant and measure how quickly they produce correct output. If the context does not enable accurate work, iterate on it just as you would on code.

Guard Sensitive Context

Mark sections that contain security constraints, compliance requirements, or access control rules. AI assistants need to know what they cannot do as much as what they should do.

The SDLC Principle

Context is not documentation - it is infrastructure. The teams that move fastest are the ones who treat their markdown files with the same rigor as their code: versioned, reviewed, tested, and continuously improved. When your SDLC context is strong, AI assistants write better code, new developers onboard faster, and architectural decisions compound rather than contradict.

The SDLC Template

# SDLC.md - Software Development Lifecycle Context
<!-- Template for AI coding assistants (CLAUDE.md, .cursorrules, etc.) -->
<!-- Provides full project lifecycle context: architecture, stack, workflows, deployment -->
<!-- Last updated: YYYY-MM-DD -->

## Project Overview

**Project**: Meridian - Customer Analytics Platform
**Version**: 3.2.1
**Status**: Production
**Repository**: https://github.com/acme-corp/meridian
**Team**: Platform Engineering (6 developers, 2 QA)

### Elevator Pitch

Meridian is a real-time customer analytics platform that ingests event data from web and mobile clients, processes it through a streaming pipeline, and serves interactive dashboards to business users. It replaces our legacy batch-processing system with sub-second query performance across 2TB+ of event data.

### Key Business Context

- Serves 340 internal business users across marketing, product, and sales
- Processes 12M events/day from 3 client applications
- SLA: 99.9% uptime, dashboard queries under 2 seconds
- Revenue impact: directly supports $4.2M ARR through customer insights

## Architecture Decisions (ADRs)

### ADR-001: ClickHouse for Analytics Storage
**Status**: Accepted
**Date**: 2026-02-09
**Context**: Our PostgreSQL analytics tables hit 800M rows. Aggregate queries were taking 30+ seconds even with materialized views and proper indexing. Business users complained about dashboard load times daily.
**Decision**: Adopt ClickHouse as the primary analytics data store. PostgreSQL remains for transactional data (users, configs, permissions). Events flow from Kafka into ClickHouse via a custom consumer service.
**Consequences**: Query performance improved from 30s to under 500ms for 95th percentile. Trade-off is that ClickHouse does not support UPDATE/DELETE well, so we use an append-only model with deduplication views. Team needed 3 weeks of ClickHouse training.
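
The append-only trade-off shows up in the read path. In Meridian this logic lives in a ClickHouse deduplication view; the conceptual model can be sketched in TypeScript as "keep only the latest version of each event ID":

```typescript
// Conceptual sketch of append-only storage with dedup-on-read.
// The real system does this in a ClickHouse view, not application code.
interface EventRow {
  eventId: string;
  version: number; // higher version supersedes lower (e.g. ingestion timestamp)
  payload: string;
}

// Appends never mutate existing rows; corrections arrive as new versions.
function dedupe(rows: EventRow[]): EventRow[] {
  const latest = new Map<string, EventRow>();
  for (const row of rows) {
    const seen = latest.get(row.eventId);
    if (!seen || row.version > seen.version) latest.set(row.eventId, row);
  }
  return [...latest.values()];
}
```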

### ADR-002: Migrate from REST to GraphQL for Dashboard API
**Status**: Accepted
**Date**: 2026-02-09
**Context**: Dashboard frontend was making 8-12 REST calls per page load to assemble widget data. This created waterfall latency and tight coupling between frontend components and backend endpoints.
**Decision**: Implement a GraphQL gateway (Apollo Server) that sits in front of the existing services. REST endpoints remain for mobile and third-party integrations.
**Consequences**: Frontend page loads dropped from 12 requests to 1-2. Reduced average dashboard load time by 60%. Added complexity to the backend - the GraphQL resolver layer requires its own testing and monitoring. Schema changes require coordination between frontend and backend teams.
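
The gateway pattern behind this ADR can be sketched in a few lines: one resolver fans out to the underlying services concurrently, so the browser issues a single request instead of assembling widget data itself. A simplified TypeScript sketch (the service functions are stand-ins, not the real Meridian API):

```typescript
// Hypothetical stand-ins for the services the gateway fronts.
async function fetchWidgets(dashboardId: string): Promise<string[]> {
  return [`widget-for-${dashboardId}`];
}
async function fetchFilters(dashboardId: string): Promise<string[]> {
  return [`filter-for-${dashboardId}`];
}

// A GraphQL-style resolver: one client request, concurrent backend fan-out.
// Under REST, the browser made these calls itself, one per widget.
async function resolveDashboard(dashboardId: string) {
  const [widgets, filters] = await Promise.all([
    fetchWidgets(dashboardId),
    fetchFilters(dashboardId),
  ]);
  return { dashboardId, widgets, filters };
}
```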

### ADR-003: Event-Driven Architecture with Kafka
**Status**: Accepted
**Date**: 2026-02-09
**Context**: The old system used synchronous API calls between services. When the analytics service was slow or down, it cascaded failures to the event ingestion API, causing data loss.
**Decision**: Introduce Apache Kafka as the central event bus. All services publish and consume events asynchronously. Events are persisted in Kafka for 7 days, allowing replay if a consumer falls behind or fails.
**Consequences**: Services are fully decoupled. We can add new consumers without modifying producers. Operational complexity increased - Kafka cluster requires dedicated monitoring and occasional partition rebalancing. Added ~200ms latency for event processing (acceptable for analytics use case).
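
The replay guarantee can be illustrated with a minimal in-memory log: consumers track their own offsets, so a consumer that fell behind or failed resumes from where it left off rather than losing events. A TypeScript sketch of the idea (not the real Kafka client):

```typescript
// Minimal in-memory sketch of Kafka-style replay semantics.
// Real code uses a Kafka client; this illustrates offset-based consumption.
class EventLog {
  private log: string[] = []; // retained events (Kafka: 7-day retention)

  publish(event: string): void {
    this.log.push(event); // append-only; consumers never mutate the log
  }

  // Each consumer reads from its own offset; a slow or restarted consumer
  // replays everything it has not yet processed.
  readFrom(offset: number): { events: string[]; nextOffset: number } {
    return { events: this.log.slice(offset), nextOffset: this.log.length };
  }
}
```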

## Tech Stack

### Frontend
- **Framework**: React 18 with TypeScript
- **State Management**: TanStack Query (server state) + Zustand (UI state)
- **Charting**: Recharts for dashboards, D3.js for custom visualizations
- **Styling**: Tailwind CSS v4 with component library (internal)
- **Build**: Vite 5, deployed as static assets to CloudFront

### Backend
- **Runtime**: Node.js 20 LTS
- **API Layer**: Apollo Server (GraphQL) + Express (REST)
- **Event Processing**: Custom Kafka consumers in Node.js
- **Task Queue**: BullMQ with Redis for scheduled reports and exports
- **Authentication**: Auth0 with SAML SSO for enterprise customers

### Data Layer
- **Transactional DB**: PostgreSQL 16 (AWS RDS) - users, configs, permissions
- **Analytics DB**: ClickHouse cluster (3 nodes) - event data, aggregations
- **Cache**: Redis 7 (ElastiCache) - session data, query cache, rate limiting
- **Message Bus**: Apache Kafka (MSK) - event streaming between services

### Infrastructure
- **Cloud**: AWS (us-east-1 primary, us-west-2 disaster recovery)
- **Container Orchestration**: EKS (Kubernetes 1.29)
- **CI/CD**: GitHub Actions with ArgoCD for GitOps deployments
- **Monitoring**: Datadog (APM, logs, metrics), PagerDuty (alerting)
- **IaC**: Terraform for AWS resources, Helm charts for K8s

## Development Workflows

### Local Setup
```bash
# Clone and install
git clone [email protected]:acme-corp/meridian.git
cd meridian
nvm use 20
npm install

# Start dependencies (Postgres, Redis, ClickHouse, Kafka)
docker compose up -d

# Environment setup
cp .env.example .env
# Edit .env - get secrets from 1Password vault "Meridian Dev"

# Run database migrations
npm run db:migrate
npm run db:seed

# Start development (all services)
npm run dev
# Frontend: http://localhost:5173
# GraphQL Playground: http://localhost:4000/graphql
# REST API: http://localhost:4000/api/v1
```

### Branch Strategy
- `main` - Production-ready code, deploys automatically
- `staging` - Pre-production integration, deploys to staging environment
- `feature/*` - New features, branch from `main`
- `hotfix/*` - Production fixes, branch from `main`, merge to both `main` and `staging`

### Code Review Process
1. Create feature branch from `main`
2. Implement changes with tests (minimum 80% coverage on new code)
3. Open PR with description template filled out (what, why, how to test)
4. Automated checks must pass: lint, type-check, unit tests, integration tests
5. Require 1 approval from code owner for the affected area
6. Squash and merge - PR title becomes the commit message

### Commit Convention
```
feat: add retention cohort chart to dashboard
fix: resolve timezone offset in event timestamps
perf: add ClickHouse projection for top-events query
refactor: extract shared auth middleware to @meridian/auth
docs: update API changelog for v3.2 release
test: add integration tests for GraphQL mutations
chore: upgrade TanStack Query to v5
```

## Deployment Procedures

### Staging Deployment
```bash
# Automated on merge to staging branch
git checkout staging
git merge feature/my-feature
git push origin staging
# GitHub Actions: lint -> test -> build -> deploy to EKS staging
# URL: https://staging.meridian.internal.acme.com
# Slack notification sent to #meridian-deploys
```

### Production Deployment
```bash
# Merge staging to main (after QA sign-off)
git checkout main
git merge staging
git push origin main
# GitHub Actions: full test suite -> build -> push to ECR -> ArgoCD sync
# ArgoCD performs rolling update (zero-downtime)
# Canary: 10% traffic for 15 minutes, then full rollout
# URL: https://meridian.acme.com
```

### Rollback Procedure
```bash
# Option 1: ArgoCD rollback (fastest, under 2 minutes)
argocd app rollback meridian-prod --revision [previous-revision]

# Option 2: Git revert (creates audit trail)
git revert [commit-hash]
git push origin main
# ArgoCD auto-syncs the revert

# Option 3: Emergency - direct image rollback
kubectl set image deployment/meridian-api api=meridian-api:[previous-tag] -n production
```

### Database Migrations
```bash
# Migrations run automatically during deployment
# For manual execution:
npm run db:migrate          # Apply pending migrations
npm run db:migrate:status   # Check migration status
npm run db:migrate:rollback # Rollback last migration

# ClickHouse migrations are separate:
npm run ch:migrate          # Apply ClickHouse DDL changes
```

## Critical Dependencies

- **Auth0** (authentication) - If Auth0 is down, no users can log in. Mitigation: JWT tokens are validated locally with cached JWKS, existing sessions continue to work for up to 1 hour.
- **Kafka (MSK)** (event pipeline) - If Kafka is down, events queue in the ingestion API (up to 10K events in memory buffer, then rejected with 503). Events are not lost if Kafka recovers within the buffer window.
- **ClickHouse** (analytics queries) - If ClickHouse is down, dashboards show cached data (up to 5 minutes stale) and a degraded mode banner. Transactional features (user management, project config) are unaffected.
- **Redis** (caching/sessions) - If Redis is down, query performance degrades (every query hits ClickHouse directly) and rate limiting is disabled. Sessions fall back to JWT-only validation.
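
The ingestion API's back-pressure behavior described above can be sketched as a bounded buffer: accept events into memory while Kafka is unreachable, reject with 503 once the buffer fills, and drain on recovery. A simplified TypeScript sketch (the 10K limit matches the description; class and method names are illustrative):

```typescript
// Simplified sketch of the ingestion API's Kafka outage buffer.
// Buffer size matches the documented 10K limit; names are illustrative.
const MAX_BUFFERED_EVENTS = 10_000;

class IngestionBuffer {
  private buffer: string[] = [];

  // Returns an HTTP-style status: 202 accepted, 503 buffer full.
  enqueue(event: string): 202 | 503 {
    if (this.buffer.length >= MAX_BUFFERED_EVENTS) return 503;
    this.buffer.push(event);
    return 202;
  }

  // Called when Kafka recovers: drain buffered events in arrival order.
  drain(): string[] {
    const pending = this.buffer;
    this.buffer = [];
    return pending;
  }
}
```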

## Known Issues and Technical Debt

- [ ] **ClickHouse dedup lag** - Deduplication views can lag up to 30 seconds during high-throughput periods. Users occasionally see duplicate events in real-time dashboards. Priority: Medium. Workaround: client-side dedup in the dashboard query layer.
- [ ] **GraphQL N+1 in nested resolvers** - The `project.members.recentActivity` resolver generates N+1 queries when loading team dashboards with 20+ members. Priority: High. Fix planned: implement DataLoader batching in Sprint 47.
- [ ] **Legacy REST endpoints** - 14 REST endpoints are still used by the mobile app (v2.x). These duplicate logic that now lives in GraphQL resolvers. Priority: Low. Plan: deprecate after mobile v3 ships (Q3 2026).
- [ ] **Test coverage gap in Kafka consumers** - Consumer error handling paths have ~40% test coverage. Priority: Medium. Integration tests are hard to write because they require a running Kafka instance.
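
The planned DataLoader fix for the N+1 issue collapses one-query-per-member into a single batched query per tick. A hand-rolled sketch of the batching pattern (production code would use the `dataloader` package; this minimal version just shows the mechanism):

```typescript
// Hand-rolled sketch of the DataLoader batching pattern.
// Loads requested in the same tick are coalesced into one batch call.
type BatchFn<K, V> = (keys: K[]) => Promise<V[]>;

class TinyLoader<K, V> {
  private queue: { key: K; resolve: (v: V) => void }[] = [];
  private scheduled = false;

  constructor(private batchFn: BatchFn<K, V>) {}

  load(key: K): Promise<V> {
    return new Promise((resolve) => {
      this.queue.push({ key, resolve });
      if (!this.scheduled) {
        this.scheduled = true;
        // Flush after the current tick, once all resolvers have enqueued.
        queueMicrotask(() => this.flush());
      }
    });
  }

  private async flush(): Promise<void> {
    const batch = this.queue;
    this.queue = [];
    this.scheduled = false;
    // One backend query for the whole batch instead of one per key.
    const values = await this.batchFn(batch.map((b) => b.key));
    batch.forEach((b, i) => b.resolve(values[i]));
  }
}
```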

## Environment Configuration

| Variable | Required | Description |
|----------|----------|-------------|
| `DATABASE_URL` | Yes | PostgreSQL connection string |
| `CLICKHOUSE_URL` | Yes | ClickHouse HTTP endpoint |
| `REDIS_URL` | Yes | Redis connection string |
| `KAFKA_BROKERS` | Yes | Comma-separated Kafka broker addresses |
| `AUTH0_DOMAIN` | Yes | Auth0 tenant domain |
| `AUTH0_CLIENT_ID` | Yes | Auth0 application client ID |
| `AUTH0_CLIENT_SECRET` | Yes | Auth0 application client secret |
| `NODE_ENV` | No | Environment mode (development/staging/production) |
| `LOG_LEVEL` | No | Logging verbosity (debug/info/warn/error) |
| `GRAPHQL_INTROSPECTION` | No | Enable GraphQL introspection (disabled in prod) |

## Contact and Resources

- **Tech Lead**: [Name] - [Email/Slack handle]
- **Product Manager**: [Name] - [Email/Slack handle]
- **On-Call Rotation**: See #meridian-oncall channel topic in Slack
- **Runbook**: [Link to operational runbook]
- **Architecture Diagram**: [Link to Miro/Lucidchart]
- **Monitoring Dashboard**: [Link to Datadog dashboard]
- **Incident Response**: Page on-call via PagerDuty, escalation in #meridian-incidents

Why Markdown Matters for AI-Native Development

Context as Infrastructure

AI operates on context, not abstraction. The competitive advantage isn't your code - it's how effectively you architect context. Version it. Standardize it. Make it infrastructure. Your requirements, architecture decisions, and domain knowledge belong in markdown files within version control.

Markdown as Substrate

LLMs are optimized for markdown - fewer tokens, cleaner parsing, human-readable yet machine-native. Your strategic plans, architecture decisions, and product roadmaps should live in .md files. Word docs are legacy. Notion is transitional. Markdown is the substrate of AI-native organizations.

Requirements as Code

Requirements are code now - written in a language both humans and AI understand. This is agentic development: your IDE becomes a thinking partner. Context evolves with your code in the repo, not in disconnected wikis that decay with every hotfix. Junior engineers write code. Senior engineers architect context.

"The fastest-moving teams aren't winning by writing better prompts. They're winning by recognizing a fundamental shift: AI thrives on well-structured context. SDLC.md helps you architect that context as a first-class concern - standardized, versioned, and integrated with your development workflow."


About SDLC.md

Our Mission

Built by developers who believe context is the new competitive advantage in AI-native development.

We are passionate about helping teams understand that AI doesn't just need code - it needs context. The best codebases are those where knowledge is structured, versioned, and accessible. Markdown files are not just documentation - they are executable specifications that AI can understand and act upon.

Our goal is to show the world that .md files are infrastructure. When you treat context as a first-class concern - versioned in git, reviewed in pull requests, tested for effectiveness - you unlock a new level of development velocity. This is the future of software development: where humans architect context and AI executes implementation.

Why Markdown Matters

AI-Native

LLMs parse markdown better than any other format. Fewer tokens, cleaner structure, better results.

Version Control

Context evolves with code. Git tracks changes, PRs enable review, history preserves decisions.

Human Readable

No special tools needed. Plain text that works everywhere. Documentation humans actually read.

Have feedback? Found a bug? Want to contribute? We'd love to hear from you.