Harith Zahid

RDBMS Performance & Optimization Fundamentals

A beginner-friendly guide to making your database queries faster using Knex.js and MySQL.


Table of Contents

  1. What Are Indexes?
  2. The N+1 Query Problem
  3. Essential Query Optimization Tips
  4. Connection Pooling Basics
  5. Checking Query Performance
  6. Quick Reference

1. What Are Indexes?

Indexes are like a book's index - they help the database find data quickly without reading everything.

Without vs With Indexes

// Searching 1 million users for an email

// WITHOUT index: MySQL checks all 1 million rows (slow!)
// WITH index: MySQL checks ~20 rows (fast!)

Creating Basic Indexes

// migrations/001_create_users.js
exports.up = function(knex) {
  return knex.schema
    .createTable('users', (table) => {
      table.increments('user_id').primary();
      table.string('email').notNullable();
      table.string('username').notNullable();
      table.string('status').defaultTo('active');
      table.timestamps(true, true);
      
      // Add indexes on columns you search frequently
      table.index('email');      // Fast email lookups
      table.unique('username');  // Fast + ensures uniqueness
      table.index('status');     // Fast filtering by status
    })
    .createTable('posts', (table) => {
      table.increments('post_id').primary();
      table.integer('user_id').unsigned().notNullable();
      table.string('title').notNullable();
      table.text('content');
      table.timestamps(true, true);
      
      // IMPORTANT: Always index foreign keys!
      table.index('user_id');
    });
};

When to Add Indexes

āœ… Always index these:

  • Primary keys (automatic)
  • Foreign keys (for JOINs)
  • Columns you search with WHERE
  • Columns you sort with ORDER BY

āŒ Don't over-index:

  • Every single column (slows down inserts/updates)
  • Small tables (under 1000 rows)

2. The N+1 Query Problem

This is the most common performance mistake! It happens when you make one query, then loop through results making more queries.

āŒ The Problem: N+1 Queries

// BAD: 1 query + 10 more queries = 11 total queries!
async function getUsersWithPosts() {
  // Query 1: Get users
  const users = await knex('users').limit(10);
  
  // Queries 2-11: Get posts for each user (N+1 problem!)
  for (const user of users) {
    user.posts = await knex('posts')
      .where('user_id', user.user_id);
  }
  
  return users;
}
// With 100 users this becomes 101 queries! 😱

āœ… The Solution: Use whereIn()

// GOOD: Only 2 queries total!
async function getUsersWithPosts() {
  // Query 1: Get users
  const users = await knex('users').limit(10);
  
  // Query 2: Get ALL posts at once
  const userIds = users.map(u => u.user_id);
  const posts = await knex('posts').whereIn('user_id', userIds);
  
  // Group posts by user (in JavaScript)
  const postsByUser = {};
  posts.forEach(post => {
    if (!postsByUser[post.user_id]) {
      postsByUser[post.user_id] = [];
    }
    postsByUser[post.user_id].push(post);
  });
  
  // Attach posts to users
  users.forEach(user => {
    user.posts = postsByUser[user.user_id] || [];
  });
  
  return users;
}
// Always 2 queries, no matter how many users! šŸŽ‰

3. Essential Query Optimization Tips

Tip 1: Only SELECT What You Need

// āŒ BAD: Gets all columns
const posts = await knex('posts')
  .select('*')
  .where('is_published', true);

// āœ… GOOD: Only gets what you need
const posts = await knex('posts')
  .select('post_id', 'title', 'created_at')
  .where('is_published', true);

Why? Less data = faster queries and less network transfer.

Tip 2: Always Use Pagination

// āŒ BAD: Gets everything at once
const posts = await knex('posts').select('*');
// Crashes your app with 1 million posts!

// āœ… GOOD: Get 20 posts at a time
async function getPosts(page = 1, perPage = 20) {
  const offset = (page - 1) * perPage;
  
  return knex('posts')
    .select('*')
    .limit(perPage)
    .offset(offset)
    .orderBy('created_at', 'desc');
}

Tip 3: Batch Insert/Update Operations

// āŒ BAD: 100 separate queries
async function createPosts(posts) {
  for (const post of posts) {
    await knex('posts').insert(post);
  }
}

// āœ… GOOD: 1 query
async function createPosts(posts) {
  return knex('posts').insert(posts);
}

Tip 4: Check If Records Exist (Don't Count)

// āŒ BAD: Counts all matching rows
async function userHasPosts(userId) {
  const result = await knex('posts')
    .where('user_id', userId)
    .count('* as count')
    .first();
  
  return result.count > 0;
}

// āœ… GOOD: Stops at first match
async function userHasPosts(userId) {
  const result = await knex('posts')
    .where('user_id', userId)
    .first();
  
  return !!result;
}

4. Connection Pooling Basics

Connection pooling reuses database connections instead of creating new ones every time.

Why It Matters

Without pooling: Each query creates a new connection (~100ms overhead)
With pooling: Reuses existing connections (almost no overhead)

Basic Setup

// knexfile.js
module.exports = {
  development: {
    client: 'mysql2',
    connection: {
      host: '127.0.0.1',
      user: 'your_user',
      password: 'your_password',
      database: 'your_database'
    },
    pool: {
      min: 2,   // Keep at least 2 connections ready
      max: 10   // Allow up to 10 connections
    }
  }
};

That's it! Knex handles the rest automatically.


5. Checking Query Performance

Use EXPLAIN to See How Queries Run

// Check how MySQL executes your query
const result = await knex.raw(`
  EXPLAIN SELECT * FROM users WHERE email = 'test@example.com'
`);

console.log(result[0]);

What to look for:

  • type: ALL = BAD (scanning entire table)
  • type: ref or const = GOOD (using index)

If you see type: ALL, add an index!

Measure Query Time

async function slowQueryExample() {
  const start = Date.now();
  
  const users = await knex('users')
    .where('status', 'active')
    .limit(100);
  
  const duration = Date.now() - start;
  console.log(`Query took ${duration}ms`);
  
  if (duration > 100) {
    console.warn('āš ļø This query is slow! Add an index?');
  }
  
  return users;
}

6. Quick Reference

Performance Checklist

Before deploying to production, check:

  • āœ… Foreign keys have indexes
  • āœ… WHERE/ORDER BY columns have indexes
  • āœ… No N+1 queries (no queries inside loops!)
  • āœ… Using SELECT specific columns, not SELECT *
  • āœ… Using pagination with LIMIT
  • āœ… Batch inserts instead of individual inserts
  • āœ… Connection pool is configured

Common Index Patterns

// migrations/indexes.js
exports.up = function(knex) {
  return knex.schema.table('your_table', (table) => {
    // Foreign keys
    table.index('user_id');
    table.index('category_id');
    
    // Search columns
    table.index('email');
    table.index('status');
    
    // Sorting columns
    table.index('created_at');
  });
};

Quick Fixes for Slow Queries

  1. Query is slow? → Add an index
  2. Too many queries? → Fix N+1 with whereIn()
  3. Returning too much data? → Use LIMIT and SELECT specific columns
  4. Multiple queries in sequence? → Run them in parallel with Promise.all()

The Golden Rules

  1. Index foreign keys - Critical for JOINs to be fast
  2. No queries in loops - Always causes N+1 problems
  3. Always paginate - Never load everything at once
  4. SELECT only what you need - Faster queries and less data transfer
  5. Measure, don't guess - Use EXPLAIN and timing to find problems

Next Steps

  1. Add indexes to your foreign keys
  2. Find and fix any N+1 queries in your code
  3. Add pagination to any endpoints returning lists
  4. Use EXPLAIN on your slowest queries

Remember: Start simple! Add basic indexes and fix obvious problems first. You can optimize further once you measure actual performance in production.