Skip to main content
Glama
data-analysis-business-intelligence-2025.md17.3 kB
# Data Analysis & Business Intelligence 2025 **Updated**: 2025-11-24 | **Focus**: Excel, SQL, Tableau, Business Metrics --- ## Excel for Data Analysis ```excel # ESSENTIAL FORMULAS # VLOOKUP (Find value in table) =VLOOKUP(lookup_value, table_array, col_index_num, [range_lookup]) Example: Find employee salary =VLOOKUP(A2, EmployeeTable, 3, FALSE) # A2 = Employee ID to find # EmployeeTable = Named range with employee data # 3 = Column number (Salary is 3rd column) # FALSE = Exact match # XLOOKUP (Better than VLOOKUP, Excel 365) =XLOOKUP(lookup_value, lookup_array, return_array, [if_not_found]) Example: =XLOOKUP(A2, Employees[ID], Employees[Salary], "Not Found") --- # INDEX-MATCH (More flexible than VLOOKUP) =INDEX(return_range, MATCH(lookup_value, lookup_range, 0)) Example: Find salary by employee name =INDEX(Employees[Salary], MATCH(A2, Employees[Name], 0)) # Advantage: Can look left (VLOOKUP can't) --- # SUMIF / SUMIFS (Conditional sum) =SUMIF(range, criteria, [sum_range]) =SUMIFS(sum_range, criteria_range1, criteria1, [criteria_range2, criteria2], ...) Example: Total sales for "Electronics" category =SUMIF(Categories, "Electronics", Sales) Example: Total sales for "Electronics" in "West" region =SUMIFS(Sales, Categories, "Electronics", Regions, "West") --- # COUNTIF / COUNTIFS (Conditional count) =COUNTIF(range, criteria) Example: Count orders over $1000 =COUNTIF(OrderValues, ">1000") Example: Count "Electronics" orders in "West" region =COUNTIFS(Categories, "Electronics", Regions, "West") --- # AVERAGEIF / AVERAGEIFS (Conditional average) =AVERAGEIF(range, criteria, [average_range]) Example: Average price for "Electronics" =AVERAGEIF(Categories, "Electronics", Prices) --- # IF / IFS (Conditional logic) =IF(logical_test, value_if_true, value_if_false) Example: Discount based on quantity =IF(B2>=100, "10%", IF(B2>=50, "5%", "0%")) # IFS (Cleaner for multiple conditions, Excel 2016+) =IFS(B2>=100, "10%", B2>=50, "5%", TRUE, "0%") --- # TEXT FUNCTIONS # CONCATENATE / TEXTJOIN =CONCATENATE(A2, " ", B2) # Older =A2 & " " & B2 # Shortcut =TEXTJOIN(" ", TRUE, A2:C2) # Join with delimiter, ignore blanks Example: Full name =A2 & " " & B2 # "John" + " " + "Doe" = "John Doe" # LEFT, RIGHT, MID (Extract text) =LEFT(text, num_chars) =RIGHT(text, num_chars) =MID(text, start_num, num_chars) Example: Extract area code from phone number =LEFT(A2, 3) # "555-1234" → "555" # TRIM (Remove extra spaces) =TRIM(text) # UPPER, LOWER, PROPER (Change case) =UPPER("hello") → "HELLO" =LOWER("HELLO") → "hello" =PROPER("john doe") → "John Doe" --- # DATE FUNCTIONS # TODAY, NOW =TODAY() # Current date =NOW() # Current date & time # YEAR, MONTH, DAY =YEAR(A2) # Extract year from date =MONTH(A2) # Extract month (1-12) =DAY(A2) # Extract day (1-31) # DATEDIF (Date difference) =DATEDIF(start_date, end_date, "unit") # Units: "Y" (years), "M" (months), "D" (days) Example: Age calculation =DATEDIF(A2, TODAY(), "Y") # Years old # EOMONTH (End of month) =EOMONTH(start_date, months) Example: Last day of current month =EOMONTH(TODAY(), 0) --- # PIVOT TABLE FUNCTIONS (Use in formulas) # GETPIVOTDATA (Extract value from pivot table) =GETPIVOTDATA("Sales", PivotTable, "Region", "West", "Product", "Laptop") # Returns sales for Laptops in West region --- ADVANCED TECHNIQUES: # ARRAY FORMULAS (Excel 365) # Spill: Formula returns multiple values automatically Example: Unique values =UNIQUE(A2:A100) # Returns unique list Example: Filter =FILTER(A2:D100, C2:C100="Electronics") # Returns rows where category = Electronics Example: Sort =SORT(A2:D100, 3, -1) # Sort by column 3 descending # XLOOKUP + FILTER (Dynamic lookup) =XLOOKUP(A2, FILTER(Employees[ID], Employees[Active]="Yes"), FILTER(Employees[Salary], Employees[Active]="Yes")) # Only look up active employees --- # CONDITIONAL FORMATTING (Visual insights) 1. Highlight cells > $10,000: - Select range → Conditional Formatting → Highlight Cells Rules → Greater Than 2. Data bars (visual bar chart in cells): - Select range → Conditional Formatting → Data Bars 3. Color scales (gradient based on value): - Select range → Conditional Formatting → Color Scales 4. Icon sets (arrows, traffic lights): - Select range → Conditional Formatting → Icon Sets 5. Custom formula: - Highlight duplicate rows: =$A2=$A3 - Highlight if multiple conditions: =AND($B2>1000, $C2="West") --- # PIVOT TABLES (Quick analysis) 1. Select data range → Insert → Pivot Table 2. Drag fields: - Rows: Categories - Columns: Regions - Values: SUM of Sales 3. Filters: Date range 4. Slicers (visual filters): Insert → Slicer → Select fields CALCULATED FIELDS (Custom metrics): - Click Pivot Table → Analyze → Fields, Items & Sets → Calculated Field - Example: Profit Margin = 'Profit' / 'Revenue' --- # DATA VALIDATION (Ensure data quality) 1. Select cells → Data → Data Validation 2. Allow: List (dropdown) - Source: =Categories # Named range - Or: "Option1,Option2,Option3" 3. Allow: Whole Number - Between: 0 and 100 4. Custom formula: - =COUNTIF($A$2:$A$100, A2)=1 # No duplicates ``` --- ## SQL for Business Analytics ```sql -- BUSINESS METRICS QUERIES -- 1. REVENUE ANALYSIS -- Total revenue by month SELECT DATE_TRUNC('month', order_date) as month, SUM(total_amount) as monthly_revenue, COUNT(DISTINCT order_id) as order_count, SUM(total_amount) / COUNT(DISTINCT order_id) as avg_order_value FROM orders WHERE order_date >= '2025-01-01' GROUP BY DATE_TRUNC('month', order_date) ORDER BY month; -- Revenue by product category SELECT c.category_name, SUM(oi.quantity * oi.unit_price) as revenue, SUM(oi.quantity * (oi.unit_price - p.cost)) as profit, SUM(oi.quantity * (oi.unit_price - p.cost)) / NULLIF(SUM(oi.quantity * oi.unit_price), 0) * 100 as profit_margin_pct FROM order_items oi JOIN products p ON oi.product_id = p.product_id JOIN categories c ON p.category_id = c.category_id GROUP BY c.category_name ORDER BY revenue DESC; --- -- 2. CUSTOMER ANALYSIS -- Customer Lifetime Value (CLV) SELECT customer_id, customer_name, COUNT(order_id) as total_orders, SUM(total_amount) as lifetime_value, AVG(total_amount) as avg_order_value, MIN(order_date) as first_order_date, MAX(order_date) as last_order_date, MAX(order_date) - MIN(order_date) as customer_tenure_days FROM orders o JOIN customers c USING (customer_id) GROUP BY customer_id, customer_name ORDER BY lifetime_value DESC LIMIT 100; -- Top 100 customers -- Customer Segmentation (RFM: Recency, Frequency, Monetary) WITH rfm AS ( SELECT customer_id, MAX(order_date) as last_order_date, CURRENT_DATE - MAX(order_date) as recency_days, COUNT(order_id) as frequency, SUM(total_amount) as monetary FROM orders GROUP BY customer_id ) SELECT customer_id, recency_days, frequency, monetary, CASE WHEN recency_days <= 30 AND frequency >= 5 AND monetary >= 1000 THEN 'Champions' WHEN recency_days <= 30 AND frequency >= 3 THEN 'Loyal Customers' WHEN recency_days <= 60 AND frequency >= 2 THEN 'Potential Loyalists' WHEN recency_days > 180 THEN 'At Risk' ELSE 'Needs Attention' END as customer_segment FROM rfm ORDER BY monetary DESC; --- -- 3. COHORT ANALYSIS (Retention) -- Monthly cohort retention WITH cohorts AS ( SELECT customer_id, DATE_TRUNC('month', MIN(order_date)) as cohort_month FROM orders GROUP BY customer_id ), cohort_activity AS ( SELECT c.cohort_month, DATE_TRUNC('month', o.order_date) as activity_month, COUNT(DISTINCT o.customer_id) as active_customers FROM cohorts c JOIN orders o ON c.customer_id = o.customer_id GROUP BY c.cohort_month, DATE_TRUNC('month', o.order_date) ) SELECT cohort_month, activity_month, active_customers, EXTRACT(MONTH FROM AGE(activity_month, cohort_month)) as months_since_first_order FROM cohort_activity ORDER BY cohort_month, activity_month; --- -- 4. PRODUCT PERFORMANCE -- Best/worst sellers SELECT p.product_name, SUM(oi.quantity) as units_sold, SUM(oi.quantity * oi.unit_price) as revenue, COUNT(DISTINCT o.order_id) as orders_with_product, AVG(oi.quantity) as avg_quantity_per_order FROM products p JOIN order_items oi ON p.product_id = oi.product_id JOIN orders o ON oi.order_id = o.order_id WHERE o.order_date >= CURRENT_DATE - INTERVAL '90 days' GROUP BY p.product_name ORDER BY revenue DESC; -- Product affinity (frequently bought together) SELECT p1.product_name as product_1, p2.product_name as product_2, COUNT(*) as times_bought_together FROM order_items oi1 JOIN order_items oi2 ON oi1.order_id = oi2.order_id AND oi1.product_id < oi2.product_id -- Avoid duplicates JOIN products p1 ON oi1.product_id = p1.product_id JOIN products p2 ON oi2.product_id = p2.product_id GROUP BY p1.product_name, p2.product_name HAVING COUNT(*) >= 10 ORDER BY times_bought_together DESC LIMIT 20; --- -- 5. FUNNEL ANALYSIS (Conversion rates) WITH funnel AS ( SELECT COUNT(DISTINCT session_id) as visitors, COUNT(DISTINCT CASE WHEN event_type = 'view_product' THEN session_id END) as product_views, COUNT(DISTINCT CASE WHEN event_type = 'add_to_cart' THEN session_id END) as add_to_cart, COUNT(DISTINCT CASE WHEN event_type = 'checkout' THEN session_id END) as checkouts, COUNT(DISTINCT CASE WHEN event_type = 'purchase' THEN session_id END) as purchases FROM events WHERE event_date = CURRENT_DATE ) SELECT visitors, product_views, add_to_cart, checkouts, purchases, ROUND(product_views::NUMERIC / visitors * 100, 2) as view_rate, ROUND(add_to_cart::NUMERIC / product_views * 100, 2) as add_cart_rate, ROUND(checkouts::NUMERIC / add_to_cart * 100, 2) as checkout_rate, ROUND(purchases::NUMERIC / checkouts * 100, 2) as purchase_rate, ROUND(purchases::NUMERIC / visitors * 100, 2) as overall_conversion_rate FROM funnel; ``` --- ## Tableau for Visualization ```markdown DASHBOARD DESIGN PRINCIPLES: 1. PURPOSE: - Who is the audience? (Executive, analyst, operations) - What decisions will they make? (Strategic, tactical) - What actions should they take? (Approve, investigate, optimize) 2. LAYOUT: - F-pattern (users read top-left first, then scan down) - Most important KPI: Top-left - Supporting details: Below and right - White space (don't cram, let data breathe) 3. CHARTS: - Line chart: Trends over time - Bar chart: Compare categories - Scatter plot: Correlation - Heatmap: Patterns in large dataset - Avoid: Pie charts (hard to compare), 3D (distorts data) 4. COLOR: - Use sparingly (1-2 accent colors) - Colorblind-friendly (avoid red-green) - Consistent meaning (red = bad, green = good) - Highlight important data (rest in gray) 5. INTERACTIVITY: - Filters (date range, category) - Actions (click chart to filter others) - Tooltips (show details on hover) - Parameters (user inputs: "Show top N") --- COMMON CALCULATIONS (Calculated Fields): # Profit Margin ([Revenue] - [Cost]) / [Revenue] # Year-over-Year Growth (SUM([Sales]) - LOOKUP(SUM([Sales]), -12)) / LOOKUP(SUM([Sales]), -12) # -12 = 12 months ago # Running Total RUNNING_SUM(SUM([Sales])) # Rank RANK(SUM([Sales]), 'desc') # Percentile PERCENTILE([Sales], 0.75) # 75th percentile # If/Then Logic IF [Sales] > 10000 THEN "High" ELSEIF [Sales] > 5000 THEN "Medium" ELSE "Low" END # Date Calculations DATEDIFF('day', [Order Date], [Ship Date]) # Days to ship DATEADD('month', 3, [Order Date]) # 3 months later --- LEVEL OF DETAIL (LOD) EXPRESSIONS: # FIXED (Ignore filters) {FIXED [Customer ID]: SUM([Sales])} # Total sales per customer # INCLUDE (Add dimension) {INCLUDE [Product]: AVG([Discount])} # Avg discount including product # EXCLUDE (Remove dimension) {EXCLUDE [Region]: SUM([Sales])} # Total sales excluding region Use case: Calculate % of total SUM([Sales]) / {FIXED: SUM([Sales])} --- DASHBOARD EXAMPLES: 1. EXECUTIVE DASHBOARD: - KPIs: Revenue, Profit, Orders (vs last month/year) - Revenue trend (line chart, last 12 months) - Top 5 products (bar chart) - Sales by region (map) - Filters: Date range, region 2. SALES PERFORMANCE: - Sales by rep (bar chart, color by quota attainment) - Win rate (pie chart or single number) - Pipeline value by stage (funnel chart) - Deals closed this month (table) - Filters: Rep, team, product line 3. CUSTOMER ANALYTICS: - Customer count (total, new, churned) - CLV distribution (histogram) - Retention cohort (heatmap) - Top customers (table with sparklines) - Filters: Segment, acquisition channel 4. PRODUCT ANALYTICS: - Units sold (bar chart by product) - Revenue & profit (dual-axis line chart) - Product affinity (network diagram) - Inventory levels (heatmap) - Filters: Category, date range --- BEST PRACTICES: ✅ DO: - Keep it simple (5-7 charts max per dashboard) - Use consistent date ranges (all charts same period) - Add context (comparison to prior period, target) - Test on mobile (responsive design) - Document data sources (last updated, definitions) ❌ DON'T: - Use 3D charts (distorts perception) - Overuse color (rainbow effect confusing) - Mix chart types unnecessarily - Show all data (aggregate, filter, summarize) - Forget to label axes, add titles ``` --- ## Key Performance Indicators (KPIs) ```markdown E-COMMERCE: REVENUE METRICS: - Total Revenue - Average Order Value (AOV) = Revenue / # Orders - Revenue Per Visitor (RPV) = Revenue / # Visitors CONVERSION METRICS: - Conversion Rate = Orders / Visitors × 100% - Add-to-Cart Rate = Add to Cart / Product Views × 100% - Cart Abandonment Rate = (Carts - Orders) / Carts × 100% CUSTOMER METRICS: - Customer Acquisition Cost (CAC) = Marketing Spend / New Customers - Customer Lifetime Value (CLV) = Avg Order Value × Purchase Frequency × Customer Lifespan - CLV:CAC Ratio (should be >3:1) - Repeat Purchase Rate = Repeat Customers / Total Customers × 100% - Churn Rate = Lost Customers / Total Customers × 100% --- SAAS (Software as a Service): GROWTH: - Monthly Recurring Revenue (MRR) - Annual Recurring Revenue (ARR) = MRR × 12 - New MRR (new customers) - Expansion MRR (upgrades, upsells) - Churn MRR (lost revenue) - Net New MRR = New + Expansion - Churn RETENTION: - Customer Churn Rate = Lost Customers / Total Customers - Revenue Churn Rate = Lost MRR / Total MRR - Net Revenue Retention (NRR) = (Starting MRR + Expansion - Churn) / Starting MRR × 100% * >100% = Good (expansion > churn) UNIT ECONOMICS: - CAC Payback Period = CAC / (ARPU × Gross Margin) * <12 months = Good - LTV:CAC Ratio = CLV / CAC * >3:1 = Healthy --- MARKETING: TRAFFIC: - Website Visits - Unique Visitors - Page Views - Bounce Rate = Single-Page Sessions / Total Sessions ENGAGEMENT: - Time on Site - Pages Per Session - Social Media Engagement Rate = (Likes + Comments + Shares) / Followers × 100% CAMPAIGN PERFORMANCE: - Click-Through Rate (CTR) = Clicks / Impressions × 100% - Cost Per Click (CPC) = Ad Spend / Clicks - Cost Per Acquisition (CPA) = Ad Spend / Conversions - Return on Ad Spend (ROAS) = Revenue / Ad Spend * ROAS 4:1 = $4 revenue per $1 spent EMAIL: - Open Rate = Opens / Delivered × 100% * 15-25% = Average - Click-Through Rate = Clicks / Delivered × 100% * 2-5% = Average - Unsubscribe Rate = Unsubscribes / Delivered × 100% * <0.5% = Good --- FINANCE: PROFITABILITY: - Gross Profit = Revenue - Cost of Goods Sold (COGS) - Gross Margin = Gross Profit / Revenue × 100% - Operating Profit = Gross Profit - Operating Expenses - Net Profit Margin = Net Profit / Revenue × 100% LIQUIDITY: - Current Ratio = Current Assets / Current Liabilities * >1 = Solvent (can pay short-term debts) - Quick Ratio = (Current Assets - Inventory) / Current Liabilities * >1 = Can pay debts without selling inventory EFFICIENCY: - Inventory Turnover = COGS / Average Inventory * Higher = Better (selling inventory faster) - Days Sales Outstanding (DSO) = (Accounts Receivable / Revenue) × 365 * Lower = Faster collections GROWTH: - Revenue Growth Rate = (Current Revenue - Prior Revenue) / Prior Revenue × 100% - Burn Rate = Cash Decrease / # Months * How fast spending cash (startups) - Runway = Cash Balance / Burn Rate * # Months until run out of cash ``` --- ## Key Takeaways 1. **Excel mastery** - VLOOKUP, pivot tables, conditional formatting (80% of analyst work) 2. **SQL proficiency** - JOINs, window functions, CTEs (query databases directly) 3. **Visualization** - Tableau/Power BI (communicate insights, not just numbers) 4. **Business acumen** - Understand metrics (CAC, CLV, churn), tie to decisions 5. **Storytelling** - Present findings (problem → analysis → recommendation → action) --- ## References - "Storytelling with Data" - Cole Nussbaumer Knaflic - "Data Science for Business" - Foster Provost - Mode Analytics SQL Tutorial **Related**: `excel-advanced.md`, `sql-optimization.md`, `tableau-dashboards.md`

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/seanshin0214/persona-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server