Skip to content

Text Filters

Goal: Filter text reliably — handle casing, whitespace, escape characters, and pattern matching.


Single quotes for strings

NAME = 'Atlanta'
STATE = 'GA'
COUNTY = 'Fulton'

Double quotes ("...") are for field names, not values.

Escaping single quotes

If your value contains an apostrophe, double it:

NAME = 'O''Hare'             -- → matches "O'Hare"
NAME = 'St. Clair''s'        -- → matches "St. Clair's"

Casing

Source Case sensitive?
File geodatabase Yes
Shapefile Yes
SQL Server (default collation) No
PostgreSQL Yes
Oracle Yes

For case-insensitive filter:

UPPER(STATE) = 'GA'         -- works in most sources
LOWER(NAME) = 'atlanta'

Note: this disables index use, so it's slower on big tables.

Whitespace

Leading/trailing spaces silently break equality:

'Atlanta' <> ' Atlanta'      -- different!

Defensive query:

TRIM(NAME) = 'Atlanta'

(Most enterprise sources support TRIM. File gdb may not — clean the field first with Calculate Field: !NAME!.strip().)

Equality vs LIKE

NAME = 'Atlanta'             -- exact match
NAME LIKE 'Atlanta'          -- same — no wildcards = exact
NAME LIKE 'Atlanta%'         -- starts with "Atlanta"
NAME LIKE '%Atlanta%'        -- contains "Atlanta"

→ Detail: LIKE Operator.

Length

CHAR_LENGTH(NAME) > 10        -- enterprise dbs
LEN(NAME) > 10                -- SQL Server
LENGTH(NAME) > 10             -- PostgreSQL, Oracle

File gdb has limited support — calculate a length field first.

Concatenating

Source Operator
Standard SQL \|\| ('A' \|\| 'B')
SQL Server + ('A' + 'B')
File gdb in Pro Not in WHERE — concatenate via Calculate Field

Empty string vs NULL

IS NULL = ''
NULL
''

Many fields have a mix. Defensive:

NOTES IS NULL OR NOTES = ''

Common patterns

Names matching multiple variants

NAME IN ('St. Louis','Saint Louis','Saint-Louis')

Cleaning before comparing

If your imported data has odd whitespace or casing, calculate clean fields:

# Calculate Field — Python expression
!NAME!.strip().lower()

Then your queries become:

NAME_CLEAN = 'atlanta'

Contains a digit (some sources)

NAME LIKE '%[0-9]%'           -- SQL Server
NAME ~ '[0-9]'                -- PostgreSQL regex

File gdb doesn't support regex. Workaround: calculate a HAS_DIGIT flag in Python.

Phone formatting

PHONE LIKE '404-%-____'      -- 404-XXX-XXXX

Email domain

EMAIL LIKE '%@gmail.com'

Big perf tip

For large datasets, prefix-anchored LIKE patterns ('A%') can use an index. Anything starting with % cannot.

If you query LIKE '%ville%' regularly on a huge enterprise table, ask your DBA for a full-text index.


Practice

Text drills

  1. Counties starting with "M".
  2. Counties not containing "County" in the name.
  3. Cities whose name is exactly 5 characters (in dialects supporting LIKE '_____').
  4. Owners whose name contains "Inc" or "LLC".
  5. Email addresses on Gmail or Yahoo.

Sample answers: 1) NAME LIKE 'M%' 2) NAME NOT LIKE '%County%' 3) NAME LIKE '_____' 4) OWNER LIKE '%Inc%' OR OWNER LIKE '%LLC%' 5) EMAIL LIKE '%@gmail.com' OR EMAIL LIKE '%@yahoo.com'

→ Next: Date Filters.