This project analyzes a relational retail sales database using SQL. The goal is to practice querying multiple business tables and answering practical questions about customers, orders, products, stores, payments, shipments, and data quality.
Unlike a single-table dataset, this database is structured across several related tables. This makes it useful for practicing SQL fundamentals and understanding how tables connect through shared IDs.
The objective of this project is to explore retail sales operations and answer business questions related to order activity, customer behavior, product sales, store performance, payment activity, shipment status, and documented data quality issues.
This project is designed as a beginner-to-intermediate SQL portfolio project. It focuses on clear SQL concepts that are easy to explain and defend, rather than on advanced techniques.
The project uses a SQLite database with the following tables:
- `Customers`
- `Orders`
- `Order_Items`
- `Products`
- `Stores`
- `Payments`
- `Shipments`
- `Data_Quality_Issues`
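To illustrate how these tables connect through shared IDs, here is a minimal sketch using Python's built-in `sqlite3` module and an in-memory stand-in for two of the tables. The column names (`customer_id`, `customer_name`, `order_status`) and the sample rows are assumptions for illustration; the real database may name them differently.

```python
import sqlite3

# Hypothetical, simplified schema: the real column names may differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (
    customer_id   INTEGER PRIMARY KEY,
    customer_name TEXT
);
CREATE TABLE Orders (
    order_id     INTEGER PRIMARY KEY,
    customer_id  INTEGER REFERENCES Customers(customer_id),
    order_status TEXT
);
INSERT INTO Customers VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO Orders VALUES
    (101, 1, 'Completed'),
    (102, 1, 'Cancelled'),
    (103, 2, 'Completed');
""")

# The shared customer_id column is what links each order to its customer.
rows = conn.execute("""
    SELECT o.order_id, c.customer_name, o.order_status
    FROM Orders AS o
    INNER JOIN Customers AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()

for row in rows:
    print(row)
```

Each order row carries only the customer's ID; the join looks up the matching customer record to bring in the name.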
- SQL
- SQLite
- DB Browser for SQLite or any SQLite-compatible tool
- GitHub for project documentation
This project focuses on beginner-to-intermediate SQL skills:
- `SELECT`
- `WHERE`
- `ORDER BY`
- `LIMIT`
- `COUNT`, `SUM`, `AVG`, `MIN`, `MAX`
- `GROUP BY`
- `HAVING`
- `INNER JOIN`
- Multiple-table joins
- Table aliases
- Basic data quality review
This version intentionally does not use `CASE WHEN` so the focus stays on filtering, aggregation, grouping, and joins.
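As a rough sketch of how aggregation, grouping, `HAVING`, and a join fit together, the example below totals item sales per product category and keeps only categories above 10,000. It runs against an in-memory stand-in, not the project database; the columns `category`, `quantity`, and `unit_price` and all sample values are assumed for illustration.

```python
import sqlite3

# Hypothetical columns and sample data; the real schema may differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Products (
    product_id INTEGER PRIMARY KEY,
    category   TEXT
);
CREATE TABLE Order_Items (
    order_item_id INTEGER PRIMARY KEY,
    product_id    INTEGER,
    quantity      INTEGER,
    unit_price    REAL
);
INSERT INTO Products VALUES (1, 'Electronics'), (2, 'Home'), (3, 'Electronics');
INSERT INTO Order_Items VALUES
    (1, 1, 10, 900.0),  -- Electronics: 9000
    (2, 3,  4, 500.0),  -- Electronics: 2000
    (3, 2, 20, 100.0);  -- Home:        2000
""")

# GROUP BY builds one row per category; HAVING filters those grouped rows,
# whereas WHERE would filter individual order items before grouping.
category_sales = conn.execute("""
    SELECT p.category,
           SUM(oi.quantity * oi.unit_price) AS total_sales
    FROM Order_Items AS oi
    INNER JOIN Products AS p ON p.product_id = oi.product_id
    GROUP BY p.category
    HAVING SUM(oi.quantity * oi.unit_price) > 10000
    ORDER BY total_sales DESC
""").fetchall()

print(category_sales)
```

With this sample data only the Electronics category passes the `HAVING` filter, since Home totals 2,000.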
The SQL file contains 27 queries organized around the following business questions:
- What sample records are available in each main table?
- How many records are available in each main table?
- Which orders were completed?
- Which orders were placed through the Online sales channel?
- How many orders are there by order status?
- Which sales channels are used most often?
- Which products have the highest list prices?
- Which product categories have the highest total item sales?
- Which product categories generated more than 10,000 in item sales?
- Which electronics products were sold in orders?
- Which products sold the highest quantity?
- How many customers are in each customer segment?
- Which customers placed orders?
- Which online orders include customer details?
- Which customers bought which products?
- Which customers generated the highest total sales?
- Which customers placed more than one order?
- Which stores are associated with each order?
- Which stores generated the highest total sales?
- Which stores processed more than 10 orders?
- What is the total payment amount collected?
- What payment details are connected to each order?
- Which payment methods are most common?
- What shipment details are connected to each order?
- How many shipments are in each shipping status?
- How can customers, orders, order items, and products be connected to understand purchase activity?
- What data quality issues are documented in the database?
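The question about connecting customers, orders, order items, and products can be sketched as a chain of `INNER JOIN`s, each following one shared ID. The example below uses an in-memory stand-in with assumed column names (`customer_name`, `product_name`, `quantity`) and made-up rows; it is not one of the 27 queries in the SQL file.

```python
import sqlite3

# Hypothetical, simplified schema and data for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers   (customer_id INTEGER PRIMARY KEY, customer_name TEXT);
CREATE TABLE Orders      (order_id INTEGER PRIMARY KEY, customer_id INTEGER);
CREATE TABLE Products    (product_id INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE Order_Items (
    order_item_id INTEGER PRIMARY KEY,
    order_id      INTEGER,
    product_id    INTEGER,
    quantity      INTEGER
);
INSERT INTO Customers VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO Orders VALUES (101, 1), (102, 2);
INSERT INTO Products VALUES (1, 'Laptop'), (2, 'Desk Lamp');
INSERT INTO Order_Items VALUES
    (1, 101, 1, 2),
    (2, 101, 2, 1),
    (3, 102, 2, 3);
""")

# Customers -> Orders -> Order_Items -> Products, one shared ID per join.
purchases = conn.execute("""
    SELECT c.customer_name, o.order_id, p.product_name, oi.quantity
    FROM Customers AS c
    INNER JOIN Orders      AS o  ON o.customer_id = c.customer_id
    INNER JOIN Order_Items AS oi ON oi.order_id   = o.order_id
    INNER JOIN Products    AS p  ON p.product_id  = oi.product_id
    ORDER BY c.customer_name, p.product_name
""").fetchall()

for purchase in purchases:
    print(purchase)
```

Table aliases (`c`, `o`, `oi`, `p`) keep the join conditions short and make it clear which table each column comes from.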
Retail-Sales-SQL-Analysis/
├── README.md
├── database/
│   └── retail_sales.sqlite
└── sql/
    └── Retail_Sales_Analysis.sql
The SQL file is organized into the following sections:
- Data exploration
- Portfolio size
- Filtering with `WHERE`
- Sorting and `LIMIT`
- Aggregations
- `GROUP BY`
- `HAVING`
- Basic joins
- Joins with filters
- Multiple-table joins
- Customer, store, and product sales analysis
- Payment and shipment analysis
- Data quality review
- Download or clone this repository.
- Open the SQLite database file from the `database` folder.
- Open the SQL file from the `sql` folder.
- Run each query section by section.
- Review how each query answers a specific business question.
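As an alternative to a GUI tool, the steps above can also be sketched in Python with the standard-library `sqlite3` module. The helper `count_rows` below is a hypothetical name introduced for illustration; it lists every user table and its row count, which is a quick way to confirm the database opened correctly. The demo runs on an in-memory stand-in so it works without the repository files.

```python
import sqlite3


def count_rows(conn):
    """Return {table_name: row_count} for every user table in the database."""
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    counts = {}
    for (name,) in tables:
        # Names come from sqlite_master itself, so double-quoting is enough here.
        counts[name] = conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
    return counts


# In-memory stand-in with made-up rows. For the real file, run from the
# repository root and use: sqlite3.connect("database/retail_sales.sqlite")
demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE Customers (customer_id INTEGER PRIMARY KEY);
CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER);
INSERT INTO Customers VALUES (1), (2);
INSERT INTO Orders VALUES (101, 1), (102, 1), (103, 2);
""")

print(count_rows(demo))
```

Individual queries copied from the SQL file can be run the same way with `conn.execute(query).fetchall()`.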
The database was created for practice purposes and is designed to support SQL learning with a realistic retail structure. The main value of this project is demonstrating how to query related tables and turn raw database records into business-friendly results.