This project provides a robust, high-performance, and production-grade serverless API for accessing nutritional data from the USDA's FoodData Central. Built on Cloudflare Workers and TypeScript, it features a resilient architecture with a Cloudflare D1-powered caching layer, structured logging, and a full suite of tests.
Now featuring a zero-cost, highly efficient natural language processing system:
- Parse complex food queries with quantity, units, and preparation methods
- Smart food recognition with local fuzzy matching
- Context-aware processing for preparation methods
- Standard measurements (g, kg, lb, oz)
- Informal measurements (pinch, dash, handful)
- Fraction support (1/2, quarter, half)
- Range handling (2-3 tablespoons)
- Local fuzzy string matching for food recognition
- Food substitution suggestions
- Preparation method impact analysis
- Nutritional context awareness
"100g of chedar cheese"
→ Suggests "cheddar cheese" with alternatives
"2-3 tablespoons olive oil"
→ Handles range and provides context
"grilled chicken breast"
→ Includes preparation method impact
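The quantity handling listed above (plain numbers, fractions, word fractions, and ranges) can be sketched roughly as follows. This is a minimal illustration, not the project's parser: `parseQuantity` and `WORD_FRACTIONS` are hypothetical names, and resolving a range to its midpoint is an assumption based on the `2-3 tablespoons` example, which the API reports as `2.5`.

```typescript
// Hypothetical helper for illustration; the project's real parser may differ.
const WORD_FRACTIONS = new Map<string, number>([
  ["half", 0.5],
  ["quarter", 0.25],
]);

/** Parses a quantity token such as "100", "1/2", "half", or "2-3". */
function parseQuantity(token: string): number | null {
  const t = token.trim().toLowerCase();

  // Word fractions ("half", "quarter").
  const word = WORD_FRACTIONS.get(t);
  if (word !== undefined) return word;

  // Ranges ("2-3"): assumed to resolve to the midpoint.
  const range = t.match(/^(\d+(?:\.\d+)?)\s*-\s*(\d+(?:\.\d+)?)$/);
  if (range) return (parseFloat(range[1]) + parseFloat(range[2])) / 2;

  // Numeric fractions ("1/2"); a zero denominator is rejected.
  const fraction = t.match(/^(\d+)\s*\/\s*(\d+)$/);
  if (fraction) {
    const denominator = parseInt(fraction[2], 10);
    return denominator === 0 ? null : parseInt(fraction[1], 10) / denominator;
  }

  // Plain integers and decimals.
  return /^\d+(?:\.\d+)?$/.test(t) ? parseFloat(t) : null;
}
```

Returning `null` for unrecognized tokens lets the caller decide whether to fall back to a default amount or report an invalid query.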
Now featuring a sophisticated yet cost-efficient natural language processing system:
- Parse complex food queries with smart entity recognition
- Handle informal measurements and fractions
- Support for preparation methods and modifiers
- Efficient local fuzzy matching
- Preparation method impact analysis
- Food category recognition
- Intelligent substitution suggestions
- Serving recommendations
- Smart typo detection
- Helpful suggestions for invalid queries
- Context-aware error messages
- Alternative recommendations
All features implemented with zero external dependencies and no ongoing costs!
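To illustrate how local fuzzy matching can catch typos such as `chedar cheese` without any external dependency, here is a small sketch. It assumes Levenshtein edit distance and a similarity threshold of 70; the README does not say which algorithm or threshold the project uses, and the names `levenshtein`, `similarity`, and `bestMatch` are hypothetical.

```typescript
// Sketch only: the project's matcher may use a different algorithm.

/** Levenshtein edit distance, computed with a single rolling row. */
function levenshtein(a: string, b: string): number {
  const row: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0]; // value of D[i-1][j-1]
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // value of D[i-1][j]
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[b.length];
}

/** Similarity as a 0-100 percentage relative to the longer string. */
function similarity(a: string, b: string): number {
  const longest = Math.max(a.length, b.length);
  if (longest === 0) return 100;
  return Math.round((1 - levenshtein(a, b) / longest) * 100);
}

interface Match {
  word: string;
  similarity: number;
}

/** Returns the closest known food name, or null if nothing clears the threshold. */
function bestMatch(
  query: string,
  knownFoods: readonly string[],
  threshold = 70, // assumed cut-off
): Match | null {
  let best: Match | null = null;
  for (const word of knownFoods) {
    const score = similarity(query.toLowerCase(), word.toLowerCase());
    if (score >= threshold && (best === null || score > best.similarity)) {
      best = { word, similarity: score };
    }
  }
  return best;
}
```

Because the comparison runs in memory against a local word list, it adds no network calls or per-request cost.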
Dramatic performance improvements with minimal setup:
- Up to 20 foods in a single API call instead of 20 separate calls
- Automatic request batching with intelligent queuing
- 90% reduction in API calls for multi-item queries
- Zero configuration required - works automatically
- <5ms response time for most common queries
- ~80% cache hit rate with just 100 entries
- One-time seeding of popular foods
- Automatic query frequency tracking
- Before Phase 2: 150ms avg, 2-3 API calls per request
- After Phase 2: <10ms for 80% of queries, 88% fewer API calls
- Cost Savings: Massive reduction in API usage and compute time
See PHASE_2_QUICKSTART.md for deployment instructions.
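The batching idea (up to 20 foods per upstream call) amounts to splitting a list of lookups into fixed-size groups. The sketch below shows only that grouping step; `toBatches` and `MAX_BATCH_SIZE` are hypothetical names, and the queuing and upstream request logic are deliberately left out.

```typescript
// Illustrative only: groups items so each group fits in one upstream call.
const MAX_BATCH_SIZE = 20; // taken from the "up to 20 foods" figure above

function toBatches<T>(items: readonly T[], size: number = MAX_BATCH_SIZE): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

With this grouping, 45 queued lookups would need three upstream calls instead of 45.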
POST /api/natural-language-search
Content-Type: application/json
{
  "query": "100g of chiken brest"
}
Response:
{
  "parsed": {
    "quantity": 100,
    "unit": "g",
    "foodName": "chicken breast",
    "quantityInGrams": 100
  },
  "suggestions": [
    {
      "word": "chicken breast",
      "similarity": 85,
      "category": "meat",
      "alternatives": ["turkey breast", "tofu"]
    }
  ],
  "nutritionalContext": {
    "category": "meat",
    "preparation": {
      "suggested": ["grilled", "baked", "pan-fried"],
      "impact": {
        "grilled": {
          "calories": -5,
          "notes": ["Reduced fat content", "Minimal nutrient loss"]
        }
      }
    }
  }
}
POST /api/natural-language-search
Content-Type: application/json
{
  "query": "2-3 tablespoons of extra virgin olive oil for cooking"
}
Response:
{
  "parsed": {
    "quantity": 2.5,
    "unit": "tablespoons",
    "foodName": "extra virgin olive oil",
    "quantityInGrams": 37.5,
    "preparation": "cooking"
  },
  "nutritionalContext": {
    "category": "oils",
    "preparation": {
      "method": "cooking",
      "notes": [
        "Better alternatives for high-heat cooking: regular olive oil, avocado oil",
        "Extra virgin olive oil best used unheated for dressings and finishing"
      ]
    }
  }
}
- Architecture Overview
- API Documentation
- Getting Started & Deployment
- Testing
- Code Quality
- API Usage Examples
- Getting Started Guide
- Pricing Model
The API is designed for high availability and low latency by leveraging a serverless architecture on Cloudflare Workers and a persistent caching layer with Cloudflare D1.
- Cloudflare Worker (TypeScript): The core application logic runs on Cloudflare's global network, ensuring requests are handled close to the user.
- itty-router: A lightweight, high-performance router for handling API endpoints within the worker.
- Cloudflare D1: Serves as a persistent, external cache to store responses from the USDA API. This dramatically reduces latency for repeated requests and lessens the load on the upstream API.
- Structured Logging: All log output is in a machine-readable JSON format, which is essential for effective monitoring and debugging in a production environment.
The caching logic is central to the API's performance and resilience. It implements a stale-while-revalidate strategy.
- Incoming Request: A user requests data for a specific `food_id`.
- Cache Check (Read): The worker first queries the D1 database using the `food_id` as the cache key.
- Cache Hit: If a fresh (not expired) record is found, the worker immediately returns the cached data. This is indicated by an `X-Cache-Status: HIT` header.
- Cache Stale: If the data is found but has passed its `ttl` (Time-to-Live), it is considered "stale." The worker returns the stale data immediately (`X-Cache-Status: STALE`) and simultaneously triggers a background fetch to the USDA API to refresh the cache. This ensures the user gets a fast response while the cache is updated asynchronously.
- Cache Miss: If no record is found, the worker calls the external USDA FoodData Central API.
- Fetch & Parse: The worker fetches the raw data, validates it against a Zod schema, and transforms it into a clean, standardized JSON format.
- Cache Write: The newly fetched data is written to the D1 database with a `ttl` and a `stale_while_revalidate` period.
- Response: The worker returns the freshly fetched data to the user with an `X-Cache-Status: MISS` header.
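The hit/stale/miss decision in the flow above can be expressed as a small pure function. This is a sketch under stated assumptions: the `CacheRecord` shape (including `storedAt`) is hypothetical, durations are taken to be in seconds, and a record older than `ttl + staleWhileRevalidate` is treated as a miss, which the description above does not spell out.

```typescript
type CacheStatus = "HIT" | "STALE" | "MISS";

// Assumed record shape; the real D1 schema may differ.
interface CacheRecord {
  storedAt: number; // epoch milliseconds when the row was written
  ttl: number; // seconds during which the record is fresh
  staleWhileRevalidate: number; // extra seconds it may be served while refreshing
}

/** Decides which X-Cache-Status value applies to a cached record at time `now`. */
function classify(record: CacheRecord | null, now: number): CacheStatus {
  if (record === null) return "MISS";
  const ageSeconds = (now - record.storedAt) / 1000;
  if (ageSeconds <= record.ttl) return "HIT";
  if (ageSeconds <= record.ttl + record.staleWhileRevalidate) return "STALE";
  return "MISS"; // assumption: too old to serve even as stale
}
```

On `STALE`, the worker would return the cached body and schedule the refresh in the background; on `MISS`, it would fetch from the USDA API before responding.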
A comprehensive endpoint to verify that the worker and all its dependencies (USDA API, D1) are running and responsive.
- Endpoint: `GET /health`
- Success Response (`200 OK`):
{
  "status": "ok",
  "checks": {
    "usdaApi": { "status": "ok", "message": "USDA API is reachable." },
    "d1": { "status": "ok", "message": "D1 is reachable." },
    "apiKeyDb": { "status": "ok", "message": "API key D1 database is reachable (Cloudflare D1)." }
  }
}
- Error Response (`503 Service Unavailable`):
{
  "status": "error",
  "checks": {
    "usdaApi": { "status": "error", "message": "USDA API is unreachable." },
    "d1": { "status": "ok", "message": "D1 is reachable." },
    "apiKeyDb": { "status": "ok", "message": "API key D1 database is reachable." }
  }
}
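The two responses above suggest a simple aggregation rule: the overall status is `ok` only when every dependency check is `ok`, otherwise the endpoint answers `503`. The sketch below illustrates that rule; `summarizeHealth` and its types are hypothetical and the individual checks are passed in rather than performed.

```typescript
type CheckStatus = "ok" | "error";

interface Check {
  status: CheckStatus;
  message: string;
}

interface HealthResult {
  httpStatus: 200 | 503;
  body: { status: CheckStatus; checks: Record<string, Check> };
}

/** Combines individual dependency checks into one health response. */
function summarizeHealth(checks: Record<string, Check>): HealthResult {
  const allOk = Object.values(checks).every((check) => check.status === "ok");
  return {
    httpStatus: allOk ? 200 : 503,
    body: { status: allOk ? "ok" : "error", checks },
  };
}
```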
Retrieves detailed nutritional information for a specific food item by its FDC ID.
- Endpoint: `GET /food/:id`
- URL Parameters:
  - `id` (required): The FoodData Central ID of the food item.
- Success Response (`200 OK`):
  - The response is a structured JSON object containing the most essential nutrients.
  - Example (`GET /food/746782`):
{
  "fdcId": 746782,
  "description": "Cheese, cheddar, sharp",
  "calories": { "value": 404, "unit": "KCAL" },
  "protein": { "value": 24.9, "unit": "G" },
  "fat": { "value": 33.14, "unit": "G" },
  "carbohydrates": { "value": 1.28, "unit": "G" }
}
Searches for foods based on a query string. This endpoint is useful for finding foods by name or brand.
- Endpoint: `GET /v1/search`
- Authentication: Required (API Key)
- Query Parameters:
  - `query` (required): The search term (e.g., "cheddar cheese").
  - `dataType` (optional): The type of food data (e.g., "Branded", "Foundation").
  - `pageSize` (optional): The number of results to return (default: 10).
The /v1/search endpoint returns detailed nutritional information. The primaryFood object contains two main sets of data regarding serving size and nutrients:
- Reference Data (Based on USDA Standard):
  - `referenceServing`: This object always describes the standard 100g serving size used by the USDA FoodData Central database.
    - `size`: Always `100`.
    - `unit`: Always `"g"`.
  - `referenceNutrients`: This object contains the detailed nutritional values (protein, fat, calories, vitamins, etc.) corresponding exactly to the 100g `referenceServing`. This provides a consistent baseline for comparison across different foods.
- Calculated Data (Based on Your Query):
  - `calculatedAmount`: This object provides details about the specific amount calculated based on your input query (`quantity`, `unit`, `totalGramWeight`).
    - If your query included a quantity and unit (e.g., `"3 apples"`, `"200g rice"`), this section details how the total gram weight was determined (e.g., which portion size was matched, the weight per unit, and the final `totalGramWeight`).
    - If your query did not include a quantity and unit (e.g., `"apple"`), this section defaults to reflecting the 100g reference amount (`totalGramWeight: 100`).
  - `calculatedNutrients`: This object contains the nutritional values scaled to match the `totalGramWeight` shown in `calculatedAmount`.
    - For a query like `"3 apples"`, these nutrients will reflect the total for ~600g (or whatever the calculated weight is).
    - For a query like `"apple"`, these nutrients will be identical to `referenceNutrients` (reflecting the 100g default).
Why Both? This structure gives you flexibility:
- Use `referenceNutrients` if you always need data per 100g for comparisons.
- Use `calculatedNutrients` if you need the nutritional information for the specific amount requested in the user's query.
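The relationship between the two nutrient sets is a linear scaling by `totalGramWeight / 100`. The sketch below shows that arithmetic; `scaleNutrients` is a hypothetical name, and rounding to two decimal places is an assumption rather than documented behavior.

```typescript
interface Nutrient {
  value: number;
  unit: string;
}

type Nutrients = Record<string, Nutrient>;

/** Scales per-100g reference values to the requested gram weight. */
function scaleNutrients(reference: Nutrients, totalGramWeight: number): Nutrients {
  const factor = totalGramWeight / 100;
  const scaled: Nutrients = {};
  for (const [name, nutrient] of Object.entries(reference)) {
    scaled[name] = {
      // Rounded to two decimals (assumed) to avoid floating-point noise.
      value: Math.round(nutrient.value * factor * 100) / 100,
      unit: nutrient.unit,
    };
  }
  return scaled;
}
```

For a reference value of 61 kcal per 100g, a `totalGramWeight` of 600 yields 366 kcal, and a weight of 100 returns the reference value unchanged.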
Example 1: Query `apple`
{
  "query": "apple",
  "parsed": { "quantity": null, "unit": null, "food": "apple" },
  "primaryFood": {
    // ... other fields
    "referenceServing": { "size": 100, "unit": "g" },
    "referenceNutrients": { "calories": { "value": 61, /* ... */ } },
    "calculatedAmount": { "totalGramWeight": 100, /* ... */ },
    "calculatedNutrients": { "calories": { "value": 61, /* ... */ } } // Same as reference
    // ...
  }
}
Example 2: Query `3 apples`
{
  "query": "3 apples",
  "parsed": { "quantity": 3, "unit": "apple", "food": "apple" },
  "primaryFood": {
    // ... other fields
    "referenceServing": { "size": 100, "unit": "g" },
    "referenceNutrients": { "calories": { "value": 61, /* ... */ } }, // Per 100g
    "calculatedAmount": { "totalGramWeight": 600, /* based on 3 * 200g/apple */ },
    "calculatedNutrients": { "calories": { "value": 366, /* Scaled: 61 * 6 */ } } // Scaled to 600g
    // ...
  }
}
Performs a search using a natural language query to identify a food and its quantity.
- Endpoint: `POST /v1/natural-language-search`
- Authentication: Required (API key, Free or Pro tier)
- Body:
  - `text` (string, required): A natural language query (e.g., "100g of cheddar cheese").
  - `maxResults`, `confidence`, `filterForSuggestions` (optional): Advanced controls for USDA lookups.
- Success Response (`200 OK`):
{
  "query": "100g of cheddar cheese",
  "foods": [
    {
      "description": "Cheese, cheddar, sharp",
      "category": "Branded",
      "nutrients": {
        "Protein": { "value": 22.87, "unit": "G" },
        "Fat": { "value": 33.82, "unit": "G" },
        "Carbohydrates": { "value": 2.77, "unit": "G" },
        "Energy": { "value": 411, "unit": "KCAL" }
      }
    }
  ]
}
Unlock the Workers AI-powered parser for more nuanced, multi-item meal descriptions.
- Endpoint: `POST /v2/ai-natural-language-search`
- Authentication: Requires a Pro tier API key
- Body:
  - `text` (string, required): Meal description (max 500 characters)
  - Optional knobs: `maxResults`, `confidence`, `filterForSuggestions`
- What you get:
  - AI-interpreted items with unit normalization and gram estimates
  - USDA-backed search results with confidence scores
  - Response meta showing cache status and model identifier (`@cf/meta/llama-2-7b-chat-int8`)
- Generate a Pro key: `GET /_admin/generate-key?tier=pro`
Follow these steps to set up and deploy the worker.
### Prerequisites
- A Cloudflare account.
- Node.js and `npm` installed.
- The Wrangler CLI installed and authenticated.
- Create the D1 Database:
  - In the Cloudflare dashboard, create a new D1 database.
  - Bind it to your worker in `wrangler.toml` with the binding name `DB`.
- Run the Schema:
  - Use Wrangler to execute the `schema.sql` file to create the necessary tables for caching and API key management in Cloudflare D1.
# Example: apply schema.sql to your D1 database (the database is passed as a positional argument)
wrangler d1 execute API_KEYS_DB --file=schema.sql
Secrets are used to store sensitive data like API keys and credentials. They are encrypted and cannot be viewed after being set.
# 1. USDA API Key (get one from https://api.nal.usda.gov/)
wrangler secret put USDA_API_KEY
# 2. Cloudflare D1 for API key management
Create the D1 database and bind it in `wrangler.toml` as `API_KEYS_DB`. The project stores API key metadata and validation data in Cloudflare D1. Optionally create a KV namespace called `API_KEY_CACHE_KV` for short-lived API key lookup caching.
# 3. Admin token for protected endpoints
wrangler secret put ADMIN_TOKEN
For local development, create a `.dev.vars` file in the project root and add your secrets there.
- Install Dependencies:
npm install
- Run Locally:
npm run dev
- Deploy to Cloudflare:
npm run deploy
The project includes a comprehensive test suite using vitest.
- Unit & Integration Tests: Located in the `tests/` directory, they cover individual functions and the complete request/response flow by mocking external services.
To run the full test suite:
npm test
All API endpoints validate incoming data using zod schemas. This ensures:
- Type safety (e.g., string, number, object)
- Required fields are present
- Length and format constraints
- Consistent error responses
Example validation (TypeScript):
import { z } from 'zod';
// Rejects empty queries and queries longer than 100 characters.
const NaturalLanguageSearchSchema = z.object({
  query: z.string().min(1).max(100),
});
All validation errors return a structured JSON error response with details.
All user-supplied inputs used in database queries are sanitized using a strict allowlist of safe characters. This prevents injection attacks.
Example sanitization:
import { sanitize } from './utils/sanitizer';
const safeKeyId = sanitize(keyId);
Sanitization is applied before any database query, including API key lookups, quota/rate checks, and admin actions.
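An allowlist sanitizer of the kind described above can be very small. The following is a sketch, not the contents of `./utils/sanitizer`: the function name `sanitizeKeyId` and the specific allowlist (ASCII letters, digits, underscore, hyphen) are assumptions chosen for illustration.

```typescript
// Assumed allowlist: ASCII letters, digits, underscore, and hyphen.
const DISALLOWED_CHARS = /[^A-Za-z0-9_-]/g;

/** Removes every character that is not on the allowlist. */
function sanitizeKeyId(input: string): string {
  return input.replace(DISALLOWED_CHARS, "");
}
```

Stripping characters works best as a second line of defense; binding values through D1 prepared statements (`prepare(...).bind(...)`) keeps user input out of the SQL text altogether.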
- All secrets and credentials are stored using Cloudflare Secrets (never in code or committed env files; `.dev.vars` is for local development only).
- Structured logging redacts sensitive headers and tokens.
- All error responses use a consistent `ErrorResponse` model.
- Rate limiting and quota enforcement are applied to all endpoints.
- ESLint: Enforces code quality and best practices.
- Prettier: Ensures consistent code formatting.
To check for linting errors:
npm run lint
To automatically format all code:
npm run format
This project emits structured JSON logs intended for machine parsing by observability systems. When deploying to production, follow these guidelines to protect user privacy and to control costs:
- Redact sensitive headers and tokens before emitting logs (the worker uses `sanitizeHeaders` to redact `Authorization`, cookie headers, and similar values).
- Avoid logging full request bodies unless strictly necessary; if you must log request bodies, mask PII (emails, phone numbers, SSNs) and truncate long content.
- Implement log retention policies in your logging backend (for example: keep detailed logs for 30 days, aggregated metrics for 365 days).
- Consider sampling high-volume, low-value logs (such as repeated 400-level client errors) to reduce cost and noise.
- Ensure logs are transmitted over TLS and stored encrypted at rest in your logging backend.
These guidelines reduce the risk of accidental PII exposure and help maintain cost-effective observability.
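As an illustration of the header redaction guideline, the sketch below replaces sensitive header values before they reach a log line. It is not the project's `sanitizeHeaders` implementation: the name `redactHeaders`, the `[REDACTED]` placeholder, and the inclusion of `x-api-key` in the sensitive set are assumptions.

```typescript
// Assumed set of sensitive header names, compared case-insensitively.
const SENSITIVE_HEADERS = new Set(["authorization", "cookie", "set-cookie", "x-api-key"]);

/** Returns a copy of the headers that is safe to include in structured logs. */
function redactHeaders(headers: Record<string, string>): Record<string, string> {
  const safe: Record<string, string> = {};
  for (const [name, value] of Object.entries(headers)) {
    safe[name] = SENSITIVE_HEADERS.has(name.toLowerCase()) ? "[REDACTED]" : value;
  }
  return safe;
}
```

Returning a copy rather than mutating the input keeps the original headers intact for request handling.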
Here are some examples of how to use the API in different programming languages.
JavaScript (Node.js):
// Node 18+ ships a global fetch; require() works only with node-fetch v2 (v3 is ESM-only).
const fetch = require('node-fetch');
const apiKey = 'YOUR_API_KEY';
const foodId = '746782'; // Example: Cheddar Cheese
fetch(`https://your-worker.your-domain.workers.dev/food/${foodId}`, {
  headers: {
    'x-api-key': apiKey,
  },
})
  .then(response => response.json())
  .then(data => console.log(data))
  .catch(error => console.error('Error:', error));
Python:
import requests
api_key = 'YOUR_API_KEY'
food_id = '746782'  # Example: Cheddar Cheese
url = f'https://your-worker.your-domain.workers.dev/food/{food_id}'
headers = {
    'x-api-key': api_key
}
response = requests.get(url, headers=headers)
if response.status_code == 200:
    print(response.json())
else:
    print(f"Error: {response.status_code}, {response.text}")
- Obtain an API Key: Contact our sales team at sales@example.com to get your API key.
- Making Requests: All requests must include your API key in the `x-api-key` header.
- Response Format: All successful responses will be in JSON format. Errors will also be returned as JSON with an appropriate status code.
We offer the following tiers for our API:
| Tier | Price | Requests/Month |
|---|---|---|
| Free | $0/month | 1,000 |
| Pro | $50/month | 100,000 |
| Enterprise | Custom | Custom |
- Premium Features: The `POST /v2/ai-natural-language-search` endpoint and future AI add-ons are available to Pro keys (and above) only. Requests from Free keys return `403 Forbidden`.
For more details, please visit our pricing page at example.com/pricing.
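The tier restriction on premium endpoints can be pictured as a simple lookup before the request is handled. This is a hedged sketch: `Tier`, `PRO_ONLY_PATHS`, and `authorizeTier` are hypothetical names, and the actual middleware may be structured differently.

```typescript
type Tier = "free" | "pro" | "enterprise";

// Paths limited to Pro keys and above, per the pricing notes.
const PRO_ONLY_PATHS = new Set(["/v2/ai-natural-language-search"]);

/** Returns the HTTP status a tier check would produce for a request path. */
function authorizeTier(tier: Tier, path: string): 200 | 403 {
  if (PRO_ONLY_PATHS.has(path) && tier === "free") return 403;
  return 200;
}
```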
This API enforces both global and endpoint-specific rate limits based on your API key tier (e.g., free, pro).
- Global Tier Limit: Each API key tier has a default global limit (e.g., 100 requests/min for free tier).
- Endpoint-Specific Limit: Some endpoints (e.g., `/food/search`) may have stricter limits (e.g., 20 requests/min for free tier).
- The middleware checks for an endpoint-specific limit first; if none is set, it falls back to the global tier limit.
rateLimits: {
  free: {
    global: { maxRequests: 100, windowMs: 60000 },
    endpoints: {
      '/food/search': { maxRequests: 20, windowMs: 60000 },
      '/admin/stats': { maxRequests: 5, windowMs: 60000 }
    }
  },
  pro: {
    global: { maxRequests: 1000, windowMs: 60000 },
    endpoints: {
      '/food/search': { maxRequests: 200, windowMs: 60000 }
    }
  }
}
Every response includes headers to help you track your usage:
- `X-RateLimit-Limit`: Maximum requests allowed in the window
- `X-RateLimit-Remaining`: Requests remaining in the current window
- `X-RateLimit-Reset`: Time (in seconds) until the window resets
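The lookup order (endpoint-specific limit first, then the global tier limit) and the usage headers can be sketched as follows. The limit values are copied from the configuration above; `resolveLimit` and `rateLimitHeaders` are hypothetical names, and how the middleware counts requests is not shown.

```typescript
interface Limit {
  maxRequests: number;
  windowMs: number;
}

interface TierLimits {
  global: Limit;
  endpoints: Record<string, Limit>;
}

type RateTier = "free" | "pro";

const rateLimits: Record<RateTier, TierLimits> = {
  free: {
    global: { maxRequests: 100, windowMs: 60000 },
    endpoints: {
      "/food/search": { maxRequests: 20, windowMs: 60000 },
      "/admin/stats": { maxRequests: 5, windowMs: 60000 },
    },
  },
  pro: {
    global: { maxRequests: 1000, windowMs: 60000 },
    endpoints: { "/food/search": { maxRequests: 200, windowMs: 60000 } },
  },
};

/** Prefers an endpoint-specific limit and falls back to the tier's global limit. */
function resolveLimit(tier: RateTier, path: string): Limit {
  const { endpoints, global } = rateLimits[tier];
  return Object.prototype.hasOwnProperty.call(endpoints, path) ? endpoints[path] : global;
}

/** Builds the usage headers from a limit, the requests used, and the time left in the window. */
function rateLimitHeaders(limit: Limit, used: number, msUntilReset: number): Record<string, string> {
  return {
    "X-RateLimit-Limit": String(limit.maxRequests),
    "X-RateLimit-Remaining": String(Math.max(0, limit.maxRequests - used)),
    "X-RateLimit-Reset": String(Math.ceil(msUntilReset / 1000)),
  };
}
```

Clamping the remaining count at zero avoids reporting negative values when a client overshoots its limit.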
If you exceed your rate limit, you will receive:
{
  "statusCode": 429,
  "error": "Rate limit exceeded. Please try again in 30 seconds.",
  "details": [
    { "field": "Retry-After", "value": "30" }
  ]
}
- Query Tips & Best Practices - How to write effective queries, use modifiers, handle synonyms, and debug unmatched items
- Debug Logging Reference - Understanding the modifier detection and scoring logs
- API Reference (OpenAPI) - Complete API specification with all endpoints and schemas
- Deployment Guide - How to deploy to Cloudflare Workers
- Production Deployment - Production best practices and security
- Phase 9 Implementation - Details on modifier logic and synonym handling
- Validation & Rate Limiting - Request validation and rate limit configuration
- Advanced Examples - Complex query examples and use cases
- Simplified API Guide - Quick start for common use cases
