This project focuses on analyzing real-world e-commerce business problems using PostgreSQL. The goal is to move beyond tutorial-style SQL queries and perform business-driven analytics using a normalized relational database.
The analysis covers revenue trends, customer behavior, retention, and churn risk, simulating the work of a production data analyst.
- Database: PostgreSQL
- Tools: pgAdmin
- Language: SQL
The project uses a normalized schema with four tables:
- customers – customer demographics and geography\
- orders – transaction-level order data\
- order_items – item-level order details\
- products – product catalog information
customers(customer_id, customer_name, country)
orders(order_id, customer_id, invoice_date, total_amount)
order_items(order_item_id, order_id, product_id, quantity, unit_price)
products(product_id, description, category)
How does revenue change month over month? Used to identify growth patterns and seasonality.
Which products contribute the most to total revenue? Helps guide inventory and pricing decisions.
Which regions have the highest number of customers? Supports market expansion and regional marketing strategies.
Which customers generate the highest lifetime revenue? Identifies high-value customers for loyalty programs.
How many customers return after their first purchase? Measures customer retention performance.
How much revenue comes from the top 10% of customers? Demonstrates the Pareto (80/20) principle in business.
Which customers are at risk of churn based on inactivity? Enables proactive retention strategies.
- JOINs across multiple tables
- Aggregations (SUM, COUNT, AVG)
- CASE WHEN for business logic
- Common Table Expressions (CTEs)
- Window functions (NTILE)
- Date functions (DATE_TRUNC)
- KPI calculations (CLV, retention, churn)
-
A small percentage of customers contributes a large share of revenue
-
Repeat customers drive significantly more value than one-time buyers
-
Certain products dominate total sales, indicating inventory focus areas
-
Inactivity-based churn signals can be identified using simple SQL logic