diff --git a/apps/backend/BACKEND_ANALYSIS.md b/apps/backend/BACKEND_ANALYSIS.md new file mode 100644 index 0000000..4914dc0 --- /dev/null +++ b/apps/backend/BACKEND_ANALYSIS.md @@ -0,0 +1,451 @@ +# Backend Code Analysis Report + +> **Date:** 2026-02-23 +> **Scope:** `apps/backend/` β€” 6 Lambda microservices (users, projects, donors, expenditures, reports, auth), database schema, Docker infrastructure, and CLI tooling. + +--- + +## Table of Contents + +1. [Executive Summary](#executive-summary) +2. [Architecture Overview](#architecture-overview) +3. [Critical Issues](#critical-issues) +4. [Code Duplication](#code-duplication) +5. [Security Concerns](#security-concerns) +6. [Inconsistent Patterns](#inconsistent-patterns) +7. [Testing Gaps](#testing-gaps) +8. [Database & Schema Issues](#database--schema-issues) +9. [Build, Packaging & DevOps](#build-packaging--devops) +10. [Recommendations Summary](#recommendations-summary) + +--- + +## Executive Summary + +The backend is a microservice architecture composed of 6 AWS Lambda functions (users, projects, donors, expenditures, reports, auth) backed by PostgreSQL and containerized with Docker for local development. The code is functional but exhibits several significant patterns that will impede maintainability, reliability, and security as the project grows. + +**Key findings:** + +| Category | Severity | Count | +|---|---|---| +| Security vulnerabilities | πŸ”΄ High | 4 | +| Code duplication (copy-pasted files) | 🟠 Medium | 6 files Γ— 6 services | +| Inconsistent API patterns | 🟠 Medium | 5 | +| Testing gaps | 🟑 Low–Medium | 4 | +| Package configuration issues | 🟑 Low | 5 | + +--- + +## Architecture Overview + +``` +apps/backend/ +β”œβ”€β”€ docker-compose.yml # Orchestrates all services +β”œβ”€β”€ Makefile # Developer convenience commands +β”œβ”€β”€ db/ +β”‚ └── db_setup.sql # Schema + seed data (single file) +└── lambdas/ + β”œβ”€β”€ auth/ # Authentication (Cognito + DB) + β”œβ”€β”€ donors/ # Donor management (read-only) + β”œβ”€β”€ expenditures/ # Expenditure tracking (authed) + β”œβ”€β”€ projects/ # Project CRUD + β”œβ”€β”€ reports/ # Reports (stub, no routes) + β”œβ”€β”€ users/ # User CRUD + └── tools/ # CLI scaffolding tool +``` + +Each service is a standalone Node.js application with its own `package.json`, `Dockerfile`, database connection, type definitions, dev server, and Swagger utilities. + +--- + +## Critical Issues + +### 1. πŸ”΄ Unsanitized User Input Passed Directly to Database (projects/handler.ts) + +**File:** `lambdas/projects/handler.ts`, lines 33–38 + +```typescript +// PUT /projects/{id} +const body = event.body ? JSON.parse(event.body) as Record : {}; +const updatedProject = await db + .updateTable("branch.projects") + .set(body) // ← Arbitrary user input passed directly to .set() + .where("project_id", "=", Number(id)) +``` + +**Problem:** The entire parsed request body is passed to `.set(body)` without any validation or field whitelisting. An attacker could overwrite any column in the `projects` table, including `project_id` or `created_at`. + +**Fix:** Whitelist allowed fields before passing to the database update. + +--- + +### 2. πŸ”΄ No Input Validation on PATCH /users (users/handler.ts) + +**File:** `lambdas/users/handler.ts`, lines 95–118 + +```typescript +let email = body.email as string; +let name = body.name as string; +let isAdmin = body.isAdmin as boolean; + +await db.updateTable('branch.users') + .set({ email, name, is_admin: isAdmin }) + .where('user_id', '=', Number(userId)) + .execute(); +``` + +**Problems:** +- No email format validation (unlike the auth service which validates email format). +- No type checking β€” `body.email` could be anything; the `as string` cast doesn't enforce types. +- All fields are set even if only one was provided in the request, potentially setting fields to `undefined`. +- No validation that `isAdmin` is actually a boolean. + +--- + +### 3. πŸ”΄ Debug Logging Left in Production Code (users/handler.ts) + +**File:** `lambdas/users/handler.ts`, lines 17 and 69 + +```typescript +console.log('DEBUG - rawPath:', rawPath, 'normalizedPath:', normalizedPath, 'method:', method); +// ... +console.log(users); +``` + +These debug statements log raw request data and full database results. This can expose sensitive information in production logs. + +--- + +### 4. πŸ”΄ Missing `cognito_sub` in Users Service DB Types + +**File:** `lambdas/users/db-types.d.ts` + +The `BranchUsers` interface is missing the `cognito_sub` field that exists in the SQL schema and in the auth service's `db-types.d.ts`. This means the users service is working with an incomplete type definition, which could lead to data issues. + +--- + +## Code Duplication + +The most significant maintenance burden is the extensive copy-paste duplication across all 6 services. + +### Identically Duplicated Files + +| File | Copies | Lines per Copy | Notes | +|---|---|---|---| +| `db.ts` | 5 | 19 | Identical database connection setup | +| `db-types.d.ts` | 5 | ~78 | Generated types (mostly identical, except auth has `cognito_sub`) | +| `swagger-utils.ts` | 6 | 31 | Identical Swagger UI utility | +| `dev-server.ts` | 6 | ~175 | Identical local development server | +| `Dockerfile` | 6 | 15 | Near-identical (only health check path differs) | +| `tsconfig.json` | 6 | ~20 | Identical TypeScript config | +| `jest.config.js` | 6 | ~13 | Identical Jest config | +| `json()` function | 6 | 11 | Identical helper in every handler.ts | + +**Total duplicated code: ~2,000+ lines that could be shared.** + +### Recommendation + +The `shared/` directory at the repository root exists but is empty. These common files should be extracted into a shared package: + +``` +shared/ +β”œβ”€β”€ db.ts # Single database connection module +β”œβ”€β”€ db-types.d.ts # Single type definitions file +β”œβ”€β”€ swagger-utils.ts # Swagger utilities +β”œβ”€β”€ dev-server.ts # Development server +β”œβ”€β”€ response.ts # json() helper + CORS headers +β”œβ”€β”€ event-parser.ts # Event normalization logic +β”œβ”€β”€ tsconfig.base.json # Base TypeScript config +└── jest.config.base.js # Base Jest config +``` + +--- + +## Security Concerns + +### 5. Wildcard CORS Headers + +**All handlers** return: +```typescript +'Access-Control-Allow-Origin': '*' +``` + +This allows any website to make requests to the API. In production, this should be restricted to the actual frontend domain. + +### 6. No OPTIONS Handler for CORS Preflight + +None of the services handle `OPTIONS` requests. Browsers send `OPTIONS` preflight requests for cross-origin POST/PUT/PATCH/DELETE requests. Currently these return `404`, which means CORS preflight will fail in browser environments. + +### 7. Hardcoded Default Credentials + +**Files:** All `db.ts` files, test files, and `docker-compose.yml` + +```typescript +user: process.env.DB_USER ?? 'branch_dev', +password: process.env.DB_PASSWORD ?? 'password', +``` + +While acceptable for local development, having `password` as a fallback password and hardcoded in multiple test files creates risk if code is accidentally deployed without proper environment variables. + +### 8. No Rate Limiting + +No services implement rate limiting. The auth service (`/register`, `/login`) is particularly vulnerable to brute-force attacks. + +### 9. No Request Body Size Limits + +No services enforce maximum request body size, making them vulnerable to denial-of-service via oversized payloads. + +--- + +## Inconsistent Patterns + +### 10. Response Format Varies Per Service + +Each service returns data in a different shape: + +| Service | Success Response Shape | +|---|---| +| **Users** (POST/PATCH/DELETE) | `{ ok, route, pathParams, body: { ... } }` | +| **Users** (GET /users) | `{ users: [...] }` | +| **Projects** (GET) | Raw array `[...]` | +| **Projects** (POST) | Raw object `{ project_id, name, ... }` | +| **Donors** (GET) | Raw array `[...]` | +| **Expenditures** (POST) | `{ ok, route, body: { ... } }` | + +**Recommendation:** Adopt a consistent response envelope across all services, e.g.: +```json +{ "data": { ... }, "meta": { "timestamp": "..." } } +``` + +### 11. Inconsistent Path Matching Logic + +Services use different approaches to match routes: + +- **Users handler:** Checks `normalizedPath` for some routes, uses `normalizedPath.startsWith('/') && normalizedPath.split('/').length === 2` for parameterized routes. +- **Projects handler:** Checks `rawPath` for GET, but `normalizedPath` for POST. +- **Donors handler:** Checks `rawPath === '/'` for GET. +- **Expenditures handler:** Checks `normalizedPath` consistently. + +This inconsistency makes it hard to reason about routing behavior and could lead to bugs. + +### 12. Event Type Annotations + +All handlers import `APIGatewayProxyEvent` but then type the event parameter as `any`: + +```typescript +import { APIGatewayProxyEvent, APIGatewayProxyResult } from 'aws-lambda'; + +export const handler = async (event: any): Promise => { +``` + +This defeats the purpose of TypeScript's type safety. A union type or custom event type should be defined. + +### 13. Inconsistent Validation Approaches + +| Service | Validation Approach | +|---|---| +| **Users** | Inline validation in handler | +| **Projects** | `ProjectValidationUtils` class with `ValidationResult` return type | +| **Expenditures** | `ExpenditureValidationUtils` class with `Error` return type | +| **Auth** | Inline validation in handler | +| **Donors** | No validation (read-only) | + +Two different validation patterns exist (returning `Error` objects vs. returning `ValidationResult` objects), and some services do validation inline. A unified validation approach should be adopted. + +### 14. Inconsistent `let` vs `const` Usage + +**Users handler (lines 105–107):** +```typescript +let email = body.email as string; +let name = body.name as string; +let isAdmin = body.isAdmin as boolean; +``` + +These should be `const` since they are never reassigned. The same pattern appears in the PATCH handler. + +### 15. Duplicate ROUTES-END Markers (projects/handler.ts) + +The projects handler has two `// <<< ROUTES-END` markers (lines 43 and 93), with the POST route placed between them. This means the CLI tool's route insertion logic could break. + +--- + +## Testing Gaps + +### 16. `jest` is a Production Dependency + +**All `package.json` files** list `jest` under `dependencies` instead of `devDependencies`: + +```json +"dependencies": { + "jest": "^30.2.0", + ... +} +``` + +This means jest and all its dependencies are included in production Lambda packages, increasing cold start time and package size. + +### 17. Integration Tests Hardcode Connection Details + +**Files:** `users/test/users.test.ts`, `projects/test/crud.test.ts`, `auth/test/auth.e2e.test.ts`, `expenditures/test/expenditures.e2e.test.ts` + +```typescript +const pool = new Pool({ + host: 'localhost', + port: Number(5432), + user: 'branch_dev', + password: 'password', + database: 'branch_db', +}); +``` + +These should use environment variables, consistent with how the application code handles configuration. + +### 18. Missing Test Coverage + +| Service | Unit Tests | Integration/E2E Tests | Notes | +|---|---|---|---| +| Users | βœ… POST only | βœ… CRUD | Missing unit tests for GET, PATCH, DELETE | +| Projects | βœ… POST validation | βœ… GET, PUT | Missing unit tests for GET, PUT | +| Donors | ❌ None | ❌ None | **No tests at all** | +| Expenditures | βœ… Good | βœ… Good | Best test coverage | +| Reports | ⚠️ Trivial | ❌ None | Only 1 health check test | +| Auth | βœ… Registration validation | ⚠️ Partial | Login/verify/resend/logout untested | + +### 19. Projects Unit Test Hits Real Database + +**File:** `lambdas/projects/test/projects.unit.test.ts` + +This file is labeled as a "unit test" but doesn't mock the database β€” it connects directly to PostgreSQL. This means: +- Tests fail without a running database +- Tests are slow +- Tests are not isolated + +--- + +## Database & Schema Issues + +### 20. Destructive Schema Reset on Every Init + +**File:** `db/db_setup.sql`, line 1: + +```sql +DROP SCHEMA IF EXISTS branch CASCADE; +``` + +This drops the entire schema and recreates it. There is no migration system in place. This is dangerous if accidentally run against a production database. + +**Recommendation:** Implement a proper migration system (e.g., Kysely migrations, Flyway, or a simple numbered migration approach). + +### 21. Missing Database Indexes + +The schema only has indexes from primary keys and unique constraints. Common query patterns like filtering expenditures by `project_id` or looking up users by `email` would benefit from explicit indexes: + +```sql +CREATE INDEX idx_expenditures_project_id ON expenditures(project_id); +CREATE INDEX idx_project_memberships_user_id ON project_memberships(user_id); +CREATE INDEX idx_project_donations_project_id ON project_donations(project_id); +``` + +(The `email` column on `users` already has a unique constraint which creates an implicit index.) + +### 22. No `updated_at` Column + +Tables have `created_at` but no `updated_at` timestamp. This makes it impossible to track when records were last modified, which is important for auditing and caching. + +--- + +## Build, Packaging & DevOps + +### 23. All Package Names are `"lambda-local"` + +Every service's `package.json` has `"name": "lambda-local"`. This causes confusion when debugging and makes it impossible to distinguish services in logs or dependency trees. They should be named descriptively (e.g., `"@branch/users-lambda"`). + +### 24. Production Dockerfile Uses Dev Server + +```dockerfile +CMD ["npm", "run", "dev"] # runs ts-node dev-server.ts +``` + +The Dockerfile uses the development server command. For production Lambda deployment, the code should be compiled to JavaScript and bundled properly. + +### 25. No Multi-Stage Docker Build + +The Dockerfile copies all source files (including tests, dev dependencies) into the production image: + +```dockerfile +COPY . . +``` + +A multi-stage build should be used to reduce image size and exclude unnecessary files. + +### 26. Inconsistent Dependency Versions Across Services + +| Dependency | Users | Projects | Donors | Expenditures | Auth | +|---|---|---|---|---|---| +| `pg` | ^8.16.3 | ^8.16.3 | ^8.17.2 | ^8.16.3 | ^8.17.2 | +| `kysely` | ^0.28.8 | ^0.28.8 | ^0.28.8 | ^0.28.8 | ^0.28.10 | +| `@types/pg` | ^8.15.5 | ^8.15.6 | ^8.16.0 | ^8.15.6 | ^8.16.0 | + +Inconsistent versions can lead to subtle bugs and make upgrades harder. + +### 27. `aws-lambda` Package in Expenditures Dependencies + +**File:** `lambdas/expenditures/package.json` + +```json +"dependencies": { + "aws-lambda": "^1.0.7", // ← This is a deprecated runtime package, not the types +} +``` + +This is likely a mistake. The `aws-lambda` npm package is a deprecated Lambda runtime and should not be used. Only `@types/aws-lambda` (in devDependencies) is needed. + +### 28. Unused `dotenv` Dependency in Auth + +**File:** `lambdas/auth/package.json` + +`dotenv` is listed as a dependency but is never imported or used in any auth source file. + +--- + +## Recommendations Summary + +### Priority 1 β€” Security (Fix Immediately) + +| # | Issue | Effort | +|---|---|---| +| 1 | Whitelist fields in PUT /projects `.set()` call | Small | +| 2 | Add input validation to PATCH /users | Small | +| 3 | Remove debug `console.log` statements from users handler | Trivial | +| 6 | Add OPTIONS handler for CORS preflight | Small | + +### Priority 2 β€” Correctness & Reliability + +| # | Issue | Effort | +|---|---|---| +| 4 | Sync `db-types.d.ts` across services (add `cognito_sub` to users) | Small | +| 10 | Standardize API response format | Medium | +| 11 | Standardize route matching logic | Medium | +| 15 | Fix duplicate ROUTES-END markers in projects | Trivial | + +### Priority 3 β€” Code Quality & Maintainability + +| # | Issue | Effort | +|---|---|---| +| Dup | Extract shared code into `shared/` package | Medium-Large | +| 12 | Define proper event types instead of `any` | Small | +| 13 | Unify validation approach across services | Medium | +| 14 | Fix `let` β†’ `const` for immutable variables | Trivial | +| 16 | Move `jest` to `devDependencies` | Trivial | +| 23 | Give each service a unique package name | Trivial | + +### Priority 4 β€” Testing & DevOps + +| # | Issue | Effort | +|---|---|---| +| 17 | Use environment variables in test database connections | Small | +| 18 | Add tests for donors service and expand coverage | Medium | +| 19 | Mock database in projects "unit" tests | Small | +| 20 | Implement database migration system | Medium-Large | +| 24-25 | Fix Dockerfile for production use | Medium | +| 26 | Align dependency versions across services | Small |