April 2, 2026 · Dan Rodney · 6 min read

Solution - Like and Wildcards in SQL

Master SQL wildcards and LIKE operators safely

Wildcards Can Be Dangerous

Think of each wildcard as a little bomb that could explode in your face. They can include way more than you intended, creating false positives in your results.

Essential SQL Tools for Database Navigation

Entity Relationship (ER) Diagrams

Get a bird's-eye view of all tables and their relationships. DBeaver creates these with just a double-click, while other tools require manual creation.

Database Schema Explorer

View all tables and columns without opening multiple windows. Essential for understanding data structure in unfamiliar databases.

Query Result Verification

Always perform the sniff test on wildcard results. Check for false positives, especially as your dataset grows larger.

Using Wildcards in SQL LIKE Statements

Pros

- Find patterns when exact values are unknown
- Flexible matching for partial text searches
- Essential for real-world data variations

Cons

- Can create false positives with unexpected matches
- Performance impact on large datasets
- Risk of including unintended results

Wildcard Best Practice

Use as few wildcards as possible. Don't look for excuses to add them - only use them when you genuinely need to handle unknown quantities in your data.

Gmail Search: Safe vs Risky Wildcard Patterns

| Approach | Pattern | Risk Level | Example Match | Issues |
|---|---|---|---|---|
| Exact match | '%@gmail.com' | Low | user@gmail.com | None - most secure |
| Partial match | '%gmail%' | High | huntingmail.com | False positives |
| Missing @ sign | '%gmail.com%' | Medium | somethingingmail.com | Potential overmatching |

Recommended: Always include '@gmail.com' to ensure you only match actual Gmail addresses.
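The difference between the risky and safe patterns can be sketched with Python's built-in sqlite3 module. The table and email addresses below are made up for illustration; they are not from the lesson's database.

```python
import sqlite3

# Throwaway in-memory database with a hypothetical users table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT)")
conn.executemany("INSERT INTO users (email) VALUES (?)", [
    ("alice@gmail.com",),
    ("bob@huntingmail.com",),
    ("carol@somethingingmail.com",),
])

# Risky: matches any address merely containing the letters "gmail"
risky = [r[0] for r in conn.execute(
    "SELECT email FROM users WHERE email LIKE '%gmail%'")]

# Safe: no trailing wildcard, so the address must end with '@gmail.com'
safe = [r[0] for r in conn.execute(
    "SELECT email FROM users WHERE email LIKE '%@gmail.com'")]
```

Here `risky` picks up all three addresses, while `safe` returns only the genuine Gmail address.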

Safe Wildcard Query Development Process

1. Start with the positive query. Write your LIKE statement to find exactly what you want first, without any NOT operators.

2. Verify results manually. Check that your wildcards aren't catching unintended matches - perform the sniff test.

3. Add NOT if needed. Only after confirming your positive query works correctly, add NOT to exclude those results.

4. Test on larger datasets. Patterns that work on small databases may create problems when scaled up to real-world data sizes.
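The steps above can be sketched in code. This uses sqlite3 with a made-up products table; note that SQLite's LIKE is case-insensitive for ASCII text by default, which is why the lowercase pattern matches the capitalized titles.

```python
import sqlite3

# Hypothetical products table for the demo
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (title TEXT)")
conn.executemany("INSERT INTO products (title) VALUES (?)", [
    ("Rubber Duck",), ("Steel Plate",), ("Rubber Band",),
])

# Step 1: write the positive query first
positive = [r[0] for r in conn.execute(
    "SELECT title FROM products WHERE title LIKE '%rubber%'")]

# Step 2: inspect `positive` by hand for false positives (the sniff test)

# Step 3: only after verifying, invert the condition with NOT
negative = [r[0] for r in conn.execute(
    "SELECT title FROM products WHERE title NOT LIKE '%rubber%'")]
```

The positive query surfaces exactly the rows that NOT will later exclude, so any wildcard mistake shows up before it silently empties (or bloats) your result set.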

IN vs OR: Multiple Value Filtering

| Method | Code Example | Performance | Readability |
|---|---|---|---|
| IN operator | ship_state IN ('FL', 'TX') | Equal | High - cleaner syntax |
| OR operator | ship_state = 'FL' OR ship_state = 'TX' | Equal | Medium - more verbose |

Recommended: Use IN for multiple equality checks - it's more concise and equally performant.
IN Operator Limitation

The IN operator only works with equality comparisons. You cannot use wildcards with IN - use individual LIKE statements with OR instead.


Wildcard Character Usage Patterns

Percentage Sign (%)

Represents zero or more characters. Use '%plate%' to find 'plate' anywhere in the text, or '%plate' to find text ending with 'plate'.

Underscore (_)

Represents exactly one character. Perfect for 'gr_y' to match both 'gray' and 'grey' with precision and clarity.

NOT with LIKE

Exclude patterns from results. Can be placed before LIKE or before the column name with identical functionality.

Handling Color Variations: Gray vs Grey

| Method | Pattern | Clarity | Risk Level |
|---|---|---|---|
| IN operator | IN ('gray', 'grey') | Explicit and clear | None |
| Underscore wildcard | LIKE 'gr_y' | Somewhat clear | Very low |
| Percentage wildcard | LIKE 'gr%y' | Unclear intent | High - false positives |

Recommended: Use IN operator for known variations - it's explicit and prevents unexpected matches like 'grainy' or 'greeny'
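A quick sqlite3 sketch makes the risk levels concrete. The swatches table and its values are invented for illustration.

```python
import sqlite3

# Hypothetical table of color values
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE swatches (color TEXT)")
conn.executemany("INSERT INTO swatches (color) VALUES (?)", [
    ("gray",), ("grey",), ("greeny",), ("grainy",),
])

# Explicit list: no surprises
explicit = [r[0] for r in conn.execute(
    "SELECT color FROM swatches WHERE color IN ('gray', 'grey')")]

# _ matches exactly one character: still precise
one_char = [r[0] for r in conn.execute(
    "SELECT color FROM swatches WHERE color LIKE 'gr_y'")]

# % matches zero or more characters: overmatches
any_chars = [r[0] for r in conn.execute(
    "SELECT color FROM swatches WHERE color LIKE 'gr%y'")]
```

Both `explicit` and `one_char` return only gray and grey, while `any_chars` also drags in greeny and grainy.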

This lesson is a preview from our SQL Course Online (includes software) and SQL Certification Online (includes software & exam). Enroll in a course for detailed lessons, live instructor support, and project-based training.

Understanding database challenges begins with navigation fundamentals. When searching for users with Gmail addresses, your first obstacle is identifying the correct table. Database familiarity develops over time—much like learning a new city's geography. Initially, you won't remember which streets lead where or what amenities exist in each neighborhood. Similarly, you'll need frequent references to your database schema until table names and relationships become second nature.

This learning curve intensifies when working with databases designed by other developers. Their naming conventions and organizational logic may differ significantly from your preferences, requiring additional time to internalize the structure. However, this initial investment in understanding pays dividends as your queries become more efficient and accurate.

For comprehensive database visualization, Entity Relationship (ER) diagrams provide invaluable oversight. Rather than opening individual tables sequentially, double-clicking the tables section in your ER diagram reveals all tables with their columns simultaneously. This bird's-eye view enables rapid relationship identification and column scanning without the screen real estate limitations of multiple open windows.

ER diagrams serve as your database roadmap, displaying table relationships and column structures in a unified interface. This visualization proves essential for complex queries involving multiple table joins or when exploring unfamiliar database schemas. The ability to see interconnections at a glance reduces query development time and minimizes relationship errors.

All modern database management systems support ER diagram generation, though implementation varies. DBeaver excels in simplicity—double-clicking generates comprehensive diagrams instantly. SQL Server Management Studio requires manual creation: right-click to create new diagrams and select desired tables individually. While DBeaver offers the most streamlined approach, every professional database tool provides relationship visualization capabilities, recognizing their fundamental importance for database comprehension and query optimization.

Examining our ER diagram reveals that user email addresses reside in the users table's email column. For Gmail user identification, we need all user data filtered by email domain. This requires selecting all columns from the users table where email contains "@gmail.com"—displaying complete user profiles exclusively for Gmail addresses.

Our training database contains merely 100 users, facilitating learning but misrepresenting production environments. Real-world databases typically contain thousands, hundreds of thousands, or millions of user records. This scale difference creates exponentially more opportunities for data variation and edge cases that can derail poorly constructed queries.

Consider potential false positives when using wildcard searches for "gmail": addresses like "huntingmail.com" or "workingmailbox.com" contain "gmail" but aren't Gmail addresses. These variations become increasingly problematic as dataset size grows. Each wildcard functions as a little bomb, potentially including far more results than intended.

Wildcards live up to their name by matching unexpectedly broad patterns. The fundamental principle: use minimum necessary wildcards. Avoid adding wildcards "just in case"—this approach inevitably captures unintended results. Your goal should be maximum precision with minimum wildcard usage, not comprehensive wildcard coverage.


For Gmail address identification, the safest approach specifies "%@gmail.com" without trailing wildcards. This pattern captures any characters before the Gmail domain while ensuring exact domain matching. This method eliminates false positives from domains containing "gmail" as a substring while maintaining complete Gmail address coverage.

Production databases amplify wildcard risks through increased data diversity. Always validate results for false positives—unintended matches that satisfy your pattern but fail your actual requirements. Treat wildcards with respect; use them only when necessary, not as default solutions. The larger your dataset, the more critical this precision becomes.

When filtering orders by multiple states—Florida or Texas—SQL provides several syntactic approaches. The IN ('FL', 'TX') operator offers the most efficient code, checking if the ship_state value exists within the specified list. This approach minimizes typing and enhances readability compared to multiple OR conditions.

Alternatively, traditional OR logic works identically: ship_state = 'FL' OR ship_state = 'TX'. Both methods produce identical results with equivalent performance. The IN operator simply reduces code verbosity, particularly valuable when checking against larger value sets. Choose based on team coding standards and personal preference—both approaches are professionally acceptable.

Note that IN operates exclusively with equality matching—wildcards aren't supported. For pattern matching across multiple values, use separate LIKE conditions connected with OR operators. This distinction becomes crucial when building complex filtering logic involving both exact matches and pattern matching.
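Both points can be demonstrated in one sqlite3 sketch (table and state values are illustrative): IN and chained OR equality checks return identical rows, and pattern matching across values falls back to LIKE conditions joined with OR, because IN ('F%', 'T%') would only match the literal strings 'F%' and 'T%'.

```python
import sqlite3

# Hypothetical orders table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (ship_state TEXT)")
conn.executemany("INSERT INTO orders (ship_state) VALUES (?)", [
    ("FL",), ("TX",), ("NY",),
])

with_in = [r[0] for r in conn.execute(
    "SELECT ship_state FROM orders WHERE ship_state IN ('FL', 'TX')")]

with_or = [r[0] for r in conn.execute(
    "SELECT ship_state FROM orders "
    "WHERE ship_state = 'FL' OR ship_state = 'TX'")]

# Pattern version: IN cannot take wildcards, so use LIKE ... OR LIKE ...
with_like = [r[0] for r in conn.execute(
    "SELECT ship_state FROM orders "
    "WHERE ship_state LIKE 'F%' OR ship_state LIKE 'T%'")]
```

All three queries return the Florida and Texas rows; the choice between the first two is purely about readability.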

Retrieving the five most recent New York orders demonstrates SQL's structured syntax requirements. First, filter for New York orders using the WHERE clause. Then apply ORDER BY created_at DESC to sort by creation date in descending order (newest first). Finally, LIMIT 5 restricts results to the top five records. This clause order is mandatory—LIMIT must follow ORDER BY in SQL's structured syntax.
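The mandatory clause order can be sketched with sqlite3; the ids and dates below are invented for the demo.

```python
import sqlite3

# Hypothetical orders table with creation dates
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER, ship_state TEXT, created_at TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    (1, "NY", "2026-01-05"), (2, "NY", "2026-03-01"), (3, "FL", "2026-02-10"),
    (4, "NY", "2026-02-20"), (5, "NY", "2026-01-15"), (6, "NY", "2026-03-15"),
    (7, "NY", "2026-02-01"),
])

# Fixed clause order: WHERE, then ORDER BY, then LIMIT
recent = [r[0] for r in conn.execute(
    "SELECT id FROM orders WHERE ship_state = 'NY' "
    "ORDER BY created_at DESC LIMIT 5")]
```

The Florida order is filtered out, the rest are sorted newest first, and only then does LIMIT cut the list to five.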

Multi-condition filtering, such as products containing "plate" priced above $20, requires the AND operator. Begin with the products table, applying WHERE price > 20 AND title ILIKE '%plate%'. The greater-than operator handles price filtering, while ILIKE with wildcards manages case-insensitive text matching. Consider whether "more than $20" includes exactly $20—business requirements often assume inclusive ranges despite imprecise language.
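A sqlite3 sketch of the combined filter follows. SQLite has no ILIKE, but its LIKE is already case-insensitive for ASCII text, so it stands in here; the product names and prices are made up.

```python
import sqlite3

# Hypothetical products table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (title TEXT, price REAL)")
conn.executemany("INSERT INTO products VALUES (?, ?)", [
    ("Dinner Plate", 25.0),
    ("License Plate Frame", 12.0),   # matches the pattern, fails the price test
    ("Plated Necklace", 40.0),       # "Plated" also contains "plate"
    ("Coffee Mug", 22.0),            # passes the price test, fails the pattern
])

# Both conditions must hold, joined with AND
result = [r[0] for r in conn.execute(
    "SELECT title FROM products "
    "WHERE price > 20 AND title LIKE '%plate%'")]
```

Note that "Plated Necklace" sneaks in because '%plate%' matches any title containing those letters, another reminder to sniff-test wildcard results.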

Condition order flexibility allows logical query building. You might filter by price first to reduce the dataset, then apply text matching. SQL's ability to execute partial queries by selecting specific lines enables iterative development—test individual conditions before combining them. This approach proves invaluable for debugging complex queries and verifying intermediate results.


Excluding products containing "rubber" demonstrates negative filtering with NOT operators. Start by identifying rubber products using WHERE title ILIKE '%rubber%', verify the results capture intended items, then add NOT: WHERE title NOT ILIKE '%rubber%'. This positive-then-negative approach prevents typos and logic errors from producing empty result sets.

The NOT operator can precede either the column name or the LIKE operator with identical results. However, establishing the positive case first ensures your pattern matching works correctly before inversion. This methodology becomes essential with wildcards, where subtle errors can drastically alter results.

Handling spelling variations like "gray" versus "grey" presents multiple solution approaches. The IN ('gray', 'grey') method provides maximum clarity and precision. Alternative approaches using LIKE 'gr_y' (underscore for single character) or LIKE 'gr%y' (percentage for multiple characters) work in small databases but introduce unnecessary risks.

Code clarity trumps character savings. The wildcard approaches lack intuitive meaning and could match unintended words if the language evolves or data diversity increases. The IN operator explicitly declares your intentions, making the code self-documenting and maintainable. Your future self and teammates will appreciate the explicit clarity over clever shortcuts.

Remember that you're writing code for humans, not just computers. Even solo developers benefit from clear code—you'll revisit your queries weeks or months later without the original context. Prioritize explicit, readable solutions over abbreviated clever tricks. Use double dashes (--) for comments when additional explanation helps clarify business logic or edge case handling.

Filtering line items by return status demonstrates exact matching for controlled vocabulary fields. Status values typically derive from dropdown menus or predefined lists, ensuring consistent spelling and capitalization. Simply use WHERE status = 'Returned' without wildcards or case-insensitive matching, as the data integrity is guaranteed by the application layer.

For analyzing the most expensive returns by total value (price × quantity), you'll need calculated columns and additional sorting logic. This type of analysis often requires SQL aggregate functions and custom column creation to derive meaningful business insights from transactional data.
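One way to sketch this, assuming a hypothetical line_items schema with price and quantity columns, is to compute the total as a derived column in the SELECT and sort by it:

```python
import sqlite3

# Invented line_items schema for illustration
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE line_items "
    "(id INTEGER, status TEXT, price REAL, quantity INTEGER)")
conn.executemany("INSERT INTO line_items VALUES (?, ?, ?, ?)", [
    (1, "Returned", 10.0, 3),   # total 30
    (2, "Shipped", 50.0, 2),    # excluded: not a return
    (3, "Returned", 8.0, 5),    # total 40
])

# Exact match is safe here because status is a controlled vocabulary;
# the derived column `total` drives the sort
returns = [tuple(r) for r in conn.execute(
    "SELECT id, price * quantity AS total FROM line_items "
    "WHERE status = 'Returned' ORDER BY total DESC")]
```

The most expensive return (id 3, total 40) comes first even though its unit price is the lowest, which is exactly why the calculated column matters.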

Key Takeaways

1. Entity Relationship diagrams provide essential bird's-eye views of database structure, with DBeaver offering the easiest creation method through double-clicking
2. Wildcards are powerful but dangerous - they can create false positives by matching unintended patterns, especially in larger datasets
3. Always use the minimum number of wildcards necessary and include specific known parts like '@gmail.com' to prevent overmatching
4. The IN operator is more concise than multiple OR statements for equality comparisons, but cannot be used with LIKE wildcards
5. SQL query structure must follow a specific order: WHERE before ORDER BY before LIMIT, and this sequence cannot be rearranged
6. Test wildcard queries with positive patterns first, verify results manually, then add NOT operators only after confirming accuracy
7. Use underscore (_) for exact single character replacements and percentage (%) for variable length patterns, choosing based on precision needs
8. The sniff test is crucial when using wildcards - always verify that results don't include unintended matches before deploying queries on production data
