Choosing the Right Data Type in SQL Is More Important Than You Think

by Arif Ikhsanudin, Backend Developer

The type choice you'll regret

You're building a financial application. You store monetary amounts as FLOAT. Everything works in development. In production, a settlement report shows $10,000.00 becoming $9,999.999999998 after a series of calculations. You've lost money — literally — due to floating-point rounding. Or you store timestamps as VARCHAR to "keep it flexible," and three months later your ORDER BY created_at returns chronologically wrong results because '2024-10-01' sorts before '2024-09-15' lexicographically... except it doesn't, so the bug shows up in a different comparison involving '2024-1-15' vs '2024-09-01'.

Data types are not just a storage detail. They define what operations are valid, what constraints are enforceable, and what the database will do when you ask it to compare, sort, or aggregate your data.

Numbers: the float trap

FLOAT and DOUBLE PRECISION are binary floating-point types (IEEE 754). They cannot exactly represent most decimal fractions. 0.1 + 0.2 is not 0.3 in binary floating-point — it's 0.30000000000000004.

For monetary values, rates, and anything where decimal precision matters, use DECIMAL(precision, scale) (also called NUMERIC):

-- Wrong: binary float, imprecise for currency
amount FLOAT

-- Wrong: integer cents works but loses semantic clarity and causes errors at boundaries
amount_cents BIGINT

-- Right: exact decimal arithmetic
amount DECIMAL(19, 4)  -- 15 integer digits, 4 decimal places

DECIMAL(19, 4) handles amounts up to $999,999,999,999,999 with 4 decimal places — sufficient for most financial use cases including multi-currency with fractional conversions.

FLOAT is appropriate for scientific measurements, ML feature values, and other contexts where you care about magnitude and relative precision rather than exact decimal representation.

Timestamps: always use time zones

-- PostgreSQL: TIMESTAMPTZ stores UTC, displays in session time zone
-- TIMESTAMP stores whatever you give it, no time zone info
created_at TIMESTAMP      -- ambiguous: what time zone?
created_at TIMESTAMPTZ    -- unambiguous: always UTC internally

In PostgreSQL, TIMESTAMPTZ (timestamp with time zone) stores the moment in UTC and converts to the session time zone on display. TIMESTAMP has no time zone information — it stores whatever you insert. If your application servers are in different time zones, or if you ever change your server's time zone, TIMESTAMP columns will silently contain wrong data.

Use TIMESTAMPTZ for all event timestamps. Use DATE for dates where time of day is irrelevant (birthdays, invoice dates). Never store timestamps as VARCHAR or Unix epoch integers unless you have a very specific reason.

Text: size matters

TEXT (or CLOB) in PostgreSQL is variable-length with no practical size limit. VARCHAR(n) enforces a maximum length. CHAR(n) pads to exactly n characters.

-- Use TEXT when length is genuinely unbounded
body TEXT,
notes TEXT

-- Use VARCHAR(n) when length is a business rule
email VARCHAR(255),      -- RFC 5321 limits to 254 characters
country_code CHAR(2),    -- ISO 3166-1 alpha-2: always exactly 2 chars
postal_code VARCHAR(20)  -- Variable by country, but bounded

CHAR(n) is almost always wrong — it pads with spaces, wastes storage for shorter values, and causes surprising equality comparison behavior. Use VARCHAR(n) for fixed-domain strings and TEXT for unbounded ones.

For emails specifically: store lowercased (case-insensitive comparison is expensive if not normalized at write time) and add a UNIQUE constraint.

Booleans vs status enums

-- Boolean: simple, but can't express a third state cleanly
is_active BOOLEAN

-- Enum: explicit set of valid states
status VARCHAR(20) CHECK (status IN ('draft', 'active', 'suspended', 'deleted'))

PostgreSQL supports native ENUM types, but VARCHAR with a CHECK constraint is more portable and easier to migrate (adding an enum value in PostgreSQL requires ALTER TYPE, which can cause issues in some migration tools). MySQL's ENUM has different semantics and storage characteristics.

The rule: use BOOLEAN for genuinely binary state. Use a constrained VARCHAR for anything with more than two states. Don't use TINYINT(1) as a boolean — it's a MySQL-ism that leaks into ORMs and confuses everyone.

Integer sizes

-- SMALLINT: 2 bytes, -32768 to 32767
-- INT / INTEGER: 4 bytes, -2.1B to 2.1B
-- BIGINT: 8 bytes, -9.2Q to 9.2Q

For primary keys: use BIGINT. The 4-byte INT limit (2.1 billion) sounds large but high-volume tables hit it. The storage cost difference (4 bytes per row) is trivial at scale. SERIAL and BIGSERIAL in PostgreSQL, AUTO_INCREMENT on BIGINT in MySQL.

For foreign keys: match exactly the type of the referenced primary key. A FK mismatch causes implicit casts that defeat indexes.

For small-domain counts or status codes: SMALLINT is appropriate and signals to developers that the value range is intentionally limited.

JSON: JSONB vs JSON in PostgreSQL

JSON stores the raw text. JSONB stores a parsed binary representation that supports indexing, key-existence operators, and path queries.

-- JSON: stores text, can't be indexed (except full-text)
metadata JSON

-- JSONB: parsed, indexable, supports @>, ?, #> operators
metadata JSONB

-- GIN index on JSONB for key/value queries
CREATE INDEX idx_products_metadata ON products USING GIN (metadata);

-- Query: find products where metadata contains a specific key-value pair
SELECT id FROM products WHERE metadata @> '{"category": "electronics"}';

Use JSONB unless you specifically need to preserve JSON key order or whitespace (almost never). The write cost of JSONB parsing is negligible for most workloads.

The type decision rule

For each column, ask: what is the actual domain of this data? What operations need to be correct (arithmetic, comparison, sorting)? What constraints should the database enforce? The type that accurately represents the domain and supports the needed operations is the right choice — not the most convenient one.

Scale Your Backend - Need an Experienced Backend Developer?

We provide backend engineers who join your team as contractors to help build, improve, and scale your backend systems.

We focus on clean backend design, clear documentation, and systems that remain reliable as products grow. Our goal is to strengthen your team and deliver backend systems that are easy to operate and maintain.

We work from our own development environments and support teams across US, EU, and APAC timezones. Our workflow emphasizes documentation and asynchronous collaboration to keep development efficient and focused.

  • Production Backend Experience. Experience building and maintaining backend systems, APIs, and databases used in production.
  • Scalable Architecture. Design backend systems that stay reliable as your product and traffic grow.
  • Contractor Friendly. Flexible engagement for short projects, long-term support, or extra help during releases.
  • Focus on Backend Reliability. Improve API performance, database stability, and overall backend reliability.
  • Documentation-Driven Development. Development guided by clear documentation so teams stay aligned and work efficiently.
  • Domain-Driven Design. Design backend systems around real business processes and product needs.

Tell us about your project

Our offices

  • Copenhagen
    1 Carlsberg Gate
    1260, København, Denmark
  • Magelang
    12 Jalan Bligo
    56485, Magelang, Indonesia

More articles

System Design Is Not About Drawing Pretty Diagrams

Most system design conversations produce polished diagrams that look great in a slide deck and fall apart in production. The diagram is not the design — the decisions behind it are.

Read more

How Experienced Engineers Detect Project Risk Early

Some engineers seem to spot trouble before it even starts. It’s not magic—it’s a method honed through experience.

Read more

Asynchronous Java With CompletableFuture — Patterns That Stay Readable

CompletableFuture makes async composition possible in Java, but its API surface is large and the error handling semantics are non-obvious. Here are the patterns that produce maintainable async code and the pitfalls that produce callback soup.

Read more

Stop Over-Engineering. Your Future Self Will Thank You.

Over-engineering feels like thoroughness while you are doing it. It feels like a trap six months later. The discipline of building only what is needed is harder than it sounds and more valuable than most engineers admit.

Read more