SQL for Data Analysis

📌 Introduction

SQL (Structured Query Language) is one of the most powerful tools for Data Analysis. It enables Business Analysts (BA) to query large datasets directly from relational databases to uncover insights, trends, and support decision-making.


🚀 Key SQL Concepts for Data Analysis

SQL Concept

Purpose in Data Analysis

SELECT & Filtering (WHERE)

Fetch specific data and apply filters.

Aggregate Functions (SUM, AVG, COUNT)

Summarize data (totals, averages, counts).

GROUP BY & HAVING

Group and filter data at an aggregate level.

JOINs (INNER, LEFT, RIGHT, FULL)

Combine data from multiple tables.

Subqueries & CTEs

Break down complex queries for better understanding.

Window Functions

Perform row-level analytics like ranking, running totals.

CASE Statements

Apply conditional logic to data.

ORDER BY & LIMIT

Sort and limit data for better visualization.


🔑 1. Selecting and Filtering Data

  • Basic Query:

sqlCopyEditSELECT name, salary, department_id
FROM employees
WHERE salary > 50000;
  • Applying Multiple Conditions:

sqlCopyEditSELECT name, salary
FROM employees
WHERE salary > 50000 AND department_id = 2;

🔑 2. Aggregating Data (SUM, AVG, COUNT, MIN, MAX)

sqlCopyEditSELECT department_id, AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id;
  • Total salary per department:

sqlCopyEditSELECT department_id, SUM(salary) AS total_salary
FROM employees
GROUP BY department_id;

🔑 3. Filtering Aggregated Data (HAVING)

sqlCopyEditSELECT department_id, COUNT(*) AS employee_count
FROM employees
GROUP BY department_id
HAVING COUNT(*) > 5;

🔑 4. Joining Tables

sqlCopyEditSELECT e.name, e.salary, d.department_name
FROM employees e
INNER JOIN departments d ON e.department_id = d.id;
  • LEFT JOIN Example:

sqlCopyEditSELECT e.name, d.department_name
FROM employees e
LEFT JOIN departments d ON e.department_id = d.id;

🔑 5. Subqueries for Dynamic Filtering

sqlCopyEditSELECT name, salary
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);

🔑 6. Common Table Expressions (CTEs) for Better Query Structure

sqlCopyEditWITH HighEarners AS (
    SELECT name, salary FROM employees WHERE salary > 60000
)
SELECT * FROM HighEarners;

🔑 7. Window Functions for Advanced Analysis

  • Ranking Employees within Departments:

sqlCopyEditSELECT name, department_id, 
       RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS SalaryRank
FROM employees;
  • Running Total of Salaries:

sqlCopyEditSELECT name, salary, 
       SUM(salary) OVER (ORDER BY salary) AS RunningTotal
FROM employees;

🔑 8. Conditional Logic (CASE Statements)

sqlCopyEditSELECT name, salary,
       CASE 
           WHEN salary > 70000 THEN 'High'
           WHEN salary BETWEEN 50000 AND 70000 THEN 'Medium'
           ELSE 'Low'
       END AS SalaryCategory
FROM employees;

🔑 9. Sorting and Limiting Results

  • Sorting:

sqlCopyEditSELECT name, salary
FROM employees
ORDER BY salary DESC;
  • Limiting Results (Top 5 Earners):

sqlCopyEditSELECT name, salary
FROM employees
ORDER BY salary DESC
LIMIT 5;

Sample Data Analysis Scenarios using SQL

Business Question

SQL Feature Used

What is the average salary in each department?

GROUP BY + AVG()

Who are the top 5 highest-paid employees?

ORDER BY DESC + LIMIT

How many employees are in each department?

GROUP BY + COUNT()

What departments do not have any employees?

LEFT JOIN + NULL check

Which employees earn above the company average salary?

Subquery with AVG()

Rank employees by salary within each department

RANK() with OVER(PARTITION BY)

Find total salary cost and breakdown by departments

SUM() + GROUP BY

Segment employees into salary brackets

CASE statements


🎯 Why SQL for Data Analysis is Crucial for Business Analysts

Benefit

Explanation

Direct Access to Data

Quickly pull raw data without relying on developers.

Data-Driven Decision Making

Support business decisions with accurate and timely data.

Identify Trends and Patterns

Analyze historical and current data to find business insights.

Build Reports and Dashboards

Prepare datasets for tools like Power BI, Tableau, Excel.

Validate Requirements and Results

Cross-check data during system analysis and UAT testing.


📊 Example: Analyzing Employee Dataset

sqlCopyEdit-- Find top 3 departments by total salary cost
SELECT d.department_name, SUM(e.salary) AS total_salary
FROM employees e
JOIN departments d ON e.department_id = d.id
GROUP BY d.department_name
ORDER BY total_salary DESC
LIMIT 3;

💡 Pro Tips for Business Analysts

  1. Start with Clear Questions: Know what you need to analyze before writing SQL.

  2. Break Down Complex Queries: Use CTEs or subqueries for readability.

  3. Document Queries: Comment and explain your queries for future reference.

  4. Optimize for Performance: Avoid unnecessary columns, use indexes if needed.

  5. Collaborate with DBAs and Developers: For complex joins or optimizations.


Conclusion

SQL for Data Analysis is a must-have skill for any Business Analyst working with data-centric environments. By leveraging advanced SQL techniques, you can transform raw data into actionable insights, directly supporting business goals.

Last updated