The SELECT Statement: Reading Data from a Relational Database

If you ever plan to ask a question of a database, you will almost certainly begin with the word SELECT. It is the verb of SQL — the command that retrieves rows, the gateway through which every report, dashboard, and analytical insight ultimately passes. Mastering it is not optional; it is the price of entry to working with relational data.

Why SELECT Is the Heart of SQL

SQL was designed around a deceptively simple promise: you describe what data you want, and the database figures out how to get it. Of the four classical operations on data — create, read, update, and delete — reading is by far the most frequent. The SELECT statement is the read operation, and it is the single command you will type more often than any other.

What makes SELECT remarkable is its expressive power. With one statement you can pull a single value, combine a dozen tables, filter millions of rows, group them, sort them, and return only the top ten. Yet despite this power, the surface syntax is approachable enough that a beginner can write a useful query within minutes. This article walks through that surface — and then dives one level deeper, into the logical execution model that quietly governs everything SELECT does.

The Basic Syntax: SELECT … FROM …

At its absolute simplest, a query consists of two clauses: SELECT, which names the columns you want, and FROM, which names the table they live in.

-- Return every column of every row SELECT * FROM employees; -- Return only the columns you actually need SELECT first_name, last_name, hire_date FROM employees;

The asterisk (*) is shorthand for "every column." It is convenient for quick exploration but considered poor practice in production code: if a column is later added to the table, your query silently begins returning more data, which can break applications and degrade performance. Naming columns explicitly is almost always the right choice.

Renaming Columns with AS

You can rename a column in the result set using the AS keyword — a column alias. It is invaluable when a column name is cryptic, when you want a friendlier label for a report, or when you compute a value that has no natural name.

SELECT first_name AS "First Name", last_name AS "Last Name", salary * 12 AS annual_salary FROM employees;

The AS keyword is optional in most dialects, but writing it explicitly makes the intent unmistakable and improves readability.

Filtering Rows with WHERE

Most of the time you do not want every row in a table — you want only rows that satisfy some condition. That is the job of the WHERE clause. Think of it as a sieve: only rows for which the condition evaluates to true pass through into the result.

SELECT first_name, last_name, salary FROM employees WHERE salary > 50000;

Comparison and Logical Operators

SQL supports the comparison operators you would expect: =, <> or !=, <, >, <=, and >=. Combine conditions with AND, OR, and NOT. As in arithmetic, AND binds more tightly than OR, so use parentheses generously to make precedence explicit.

SELECT * FROM employees WHERE (department = 'Engineering' OR department = 'Research') AND salary >= 60000 AND NOT status = 'Terminated';

BETWEEN, IN, LIKE, and NULL Tests

Several specialized predicates make common filters more readable:

BETWEEN a AND b — matches values in the inclusive range [a, b].
IN (...) — matches any value from a list, replacing long chains of OR equalities.
LIKE — simple pattern matching. % matches any sequence of characters; _ matches exactly one character.
IS NULL / IS NOT NULL — tests for missing values. You cannot use = NULL because NULL means "unknown," and any comparison with it yields unknown rather than true.

SELECT first_name, last_name, hire_date FROM employees WHERE hire_date BETWEEN '2020-01-01' AND '2023-12-31' AND department IN ('Sales', 'Marketing', 'Support') AND last_name LIKE 'M%' AND termination_date IS NULL;

Ordering Results with ORDER BY

A relational table has no inherent ordering. If you want predictable results, you must ask for it explicitly with ORDER BY. Append ASC for ascending order (the default) or DESC for descending. You can sort by multiple columns — later columns act as tiebreakers when earlier ones are equal.

SELECT last_name, first_name, salary FROM employees ORDER BY salary DESC, last_name ASC;

Without ORDER BY, the database is free to return rows in whatever order is most efficient — and that order can change between executions. Results must not be relied upon without an explicit sort.

Limiting Results: LIMIT and FETCH FIRST

For large tables, returning every matching row is rarely what you want. Two clauses restrict the result set to a chosen number of rows — though the syntax varies across dialects:

Database	Preferred syntax	Notes
PostgreSQL	`LIMIT n OFFSET m` or `FETCH FIRST n ROWS ONLY`	Both supported; `FETCH FIRST` is the SQL standard form.
MySQL	`LIMIT n OFFSET m`	Does not support `FETCH FIRST`.
SQL Server	`TOP n` or `OFFSET m ROWS FETCH NEXT n ROWS ONLY`	No `LIMIT`; `OFFSET/FETCH` requires `ORDER BY`.
SQLite	`LIMIT n OFFSET m`	No `FETCH FIRST`.
Amazon Redshift	`LIMIT n OFFSET m`	PostgreSQL-derived; `FETCH FIRST` also supported.

-- PostgreSQL / MySQL / SQLite / Amazon Redshift SELECT * FROM employees ORDER BY salary DESC LIMIT 10; -- ANSI standard (PostgreSQL, SQL Server, Oracle, DB2) SELECT * FROM employees ORDER BY salary DESC FETCH FIRST 10 ROWS ONLY; -- SQL Server idiomatic SELECT TOP 10 * FROM employees ORDER BY salary DESC;

Always pair row-limiting with ORDER BY. Without an explicit sort, "the top 10" is meaningless — the database may return any 10 rows, and the choice can change between executions.

The Logical Order of SELECT Execution

Here we arrive at the most important — and most counter-intuitive — idea in this article. SQL does not execute in the order it is written. You write SELECT first, but the database evaluates it nearly last. The clauses of a query are processed in a fixed logical order that the SQL standard prescribes and that every major engine implements.

Figure 1 — The logical processing order of a SELECT statement. The engine evaluates FROM first and SELECT near the end, even though we write them in the opposite order.

Why does this matter? Because the order tells you, with no guesswork, what is "visible" at each stage. Aliases and expressions defined in SELECT do not yet exist when WHERE runs, so this query fails in standard SQL:

-- ERROR: column "discounted" does not exist at WHERE stage SELECT price * 0.9 AS discounted FROM products WHERE discounted > 100;

You must either repeat the expression directly in WHERE, or wrap the query in a subquery or CTE. By the time ORDER BY runs, however, the alias does exist — which is why ORDER BY discounted works perfectly in the same query.

The same logic explains the difference between WHERE and HAVING. WHERE filters individual rows before grouping; HAVING filters the groups produced by GROUP BY, so it can reference aggregates such as COUNT(*) or AVG(salary) that simply do not exist at the WHERE stage.

Mental model: each step produces a "virtual table" that is fed into the next. FROM assembles raw rows → WHERE trims them → GROUP BY collapses them into groups → HAVING trims the groups → SELECT shapes the output → DISTINCT deduplicates → ORDER BY sorts → LIMIT cuts. Every clause sees only what previous clauses produced.

Expressions and Computed Columns

The SELECT list is not limited to bare column names. You can include arithmetic, string operations, and function calls, producing computed columns on the fly without touching the underlying data.

SELECT product_name, price, quantity, price * quantity AS line_total, price * quantity * 1.20 AS total_with_tax, UPPER(product_name) AS name_upper, LOWER(category) AS category_lower FROM order_lines;

String concatenation is one area where dialects diverge. The SQL standard uses the double-pipe operator ||, supported by PostgreSQL, SQLite, Oracle, and Amazon Redshift. MySQL treats || as logical OR and uses CONCAT() instead. SQL Server uses + and also supports CONCAT(). When portability matters, CONCAT() is the safest choice.

-- Standard SQL / PostgreSQL / SQLite / Oracle / Amazon Redshift SELECT first_name || ' ' || last_name AS full_name FROM employees; -- Portable across most dialects SELECT CONCAT(first_name, ' ', last_name) AS full_name FROM employees;

Removing Duplicates with DISTINCT

By default, SELECT returns every row that matches your conditions, including exact duplicates. Placing DISTINCT immediately after SELECT collapses identical result rows into one.

-- Every department in the table, listed once SELECT DISTINCT department FROM employees; -- DISTINCT applies to the whole row, not just one column SELECT DISTINCT department, country FROM employees;

The second query returns every unique combination of department and country. Because deduplication typically requires sorting or hashing, applying DISTINCT to large result sets carries a real cost — reach for it only when you genuinely need it.

Putting It All Together

Here is a query that exercises nearly every concept in this article. Read it once in written order, then read it again in logical execution order — and notice how the second reading explains why every clause is correct.

SELECT DISTINCT department, UPPER(country) AS country_code, COUNT(*) AS headcount, AVG(salary) AS avg_salary FROM employees WHERE hire_date >= '2020-01-01' AND termination_date IS NULL AND department IN ('Engineering', 'Sales', 'Support') GROUP BY department, country HAVING COUNT(*) > 5 ORDER BY avg_salary DESC FETCH FIRST 10 ROWS ONLY;

FROM assembles the rows; WHERE trims them to active recent hires in three departments; GROUP BY aggregates by department and country; HAVING keeps only sufficiently large groups; SELECT computes the output columns and assigns aliases; DISTINCT deduplicates; ORDER BY can now use the alias avg_salary because SELECT has already run; FETCH FIRST trims to the top ten. Every clause receives a tidy virtual table from the one before it, and produces a tidier one for the one after.

The SELECT statement will accompany you for the rest of your life as a database practitioner. Whenever something behaves unexpectedly, return to the logical execution order. Nine times out of ten, the answer is hiding there.

Main References

PostgreSQL Global Development Group. PostgreSQL Documentation: SELECT.
https://www.postgresql.org/docs/current/sql-select.html
Oracle Corporation. MySQL 8.0 Reference Manual: SELECT Statement.
https://dev.mysql.com/doc/refman/8.0/en/select.html
Microsoft. SELECT (Transact-SQL) — SQL Server Documentation.
https://learn.microsoft.com/en-us/sql/t-sql/queries/select-transact-sql
SQLite Consortium. SQLite: The SELECT Statement.
https://www.sqlite.org/lang_select.html
PostgreSQL Global Development Group. PostgreSQL Documentation: Sorting Rows (ORDER BY).
https://www.postgresql.org/docs/current/queries-order.html

The SELECT Statement: Reading Data from a Relational Database

Why SELECT Is the Heart of SQL

The Basic Syntax: SELECT … FROM …

Renaming Columns with AS

Filtering Rows with WHERE

Comparison and Logical Operators

BETWEEN, IN, LIKE, and NULL Tests

Ordering Results with ORDER BY

Limiting Results: LIMIT and FETCH FIRST

The Logical Order of SELECT Execution

Expressions and Computed Columns

Removing Duplicates with DISTINCT

Putting It All Together

Main References

SQL from Scratch: History, Theory, and Why It Remains the Language of Data

What Type Is That Column?