Using the Keyword Query Language in NLPatent

The updated keyword query interface

Our updated keyword query interface includes a new feature that lets you type in keyword queries directly. This feature is particularly helpful when you want to:

    • Rapidly edit existing keyword queries
    • Include a large number of keywords in your queries
    • Construct long, complex queries where the form interface might be too cumbersome

Any input in the text input field will update the form interface in real time, and vice versa. This helps you get familiar with the keyword language: you can first construct a query inside the form and see how the text query evolves along with your changes. For more information on how to use the form interface, see Understanding the Keyword Interface in NLPatent Apps.

Before getting started, if you are not already familiar with boolean logic, we recommend reading our A Primer on Boolean Logic article, as it will help you understand our keyword query language and how it works behind the scenes.

Phrases

The simplest keyword query searches for a phrase consisting of multiple words separated by spaces.

For example, the query mobile phone will return results that contain the phrase mobile phone at least once anywhere in the title, description, abstract or claims of a patent.

Searches are case insensitive with regard to the keyword inputs, so a search for 5g will return results containing either 5G or 5g.

Glossary

  • word: a single word, with no spaces
  • phrase: words separated by spaces
  • keyword: word or phrase used in a keyword query

Wildcards

Wildcard matching allows you to expand your keyword queries by accommodating variations of specific keywords. Understanding how to use single and multiple-character wildcards can significantly enhance the flexibility of your queries.

Single character wildcard: ?

Employ the ? symbol to replace a single character. For example:  

  • wom?n would match both woman and women 
  • reven?e would match both revenge and revenue

Optional single character wildcard: !

The ! wildcard symbol is similar to the previous one, but the character at that position can also be completely missing. For example:

  • colo!r would match both colour and color

Multiple character wildcard: *

Use the * symbol to replace zero or more characters. This can be used at the end or in the middle of a term. For example: 

  • electri* would match electricity, electrical, electric, and so on.
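The three wildcards above can be illustrated with a short Python sketch that translates a wildcarded word into a regular expression. This is only an illustration of the matching rules as described here, not NLPatent's actual implementation; the function names wildcard_to_regex and matches are hypothetical, and we assume a wildcard stands in for letters, digits or underscores.

```python
import re

def wildcard_to_regex(word: str) -> re.Pattern:
    """Translate a wildcarded word into a case-insensitive regular expression."""
    parts = []
    for ch in word:
        if ch == "?":
            parts.append(r"\w")    # exactly one character
        elif ch == "!":
            parts.append(r"\w?")   # zero or one character
        elif ch == "*":
            parts.append(r"\w*")   # zero or more characters
        else:
            parts.append(re.escape(ch))  # literal character
    return re.compile("".join(parts), re.IGNORECASE)

def matches(word: str, candidate: str) -> bool:
    """Return True if the whole candidate word matches the wildcarded word."""
    return wildcard_to_regex(word).fullmatch(candidate) is not None

print(matches("wom?n", "women"))         # True
print(matches("colo!r", "color"))        # True
print(matches("electri*", "electrical")) # True
print(matches("wom?n", "womn"))          # False: ? requires one character
```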

Wildcard usage in phrases

Wildcards can be used inside of phrases as well. For instance, searching for colo!r pattern would match both "colour pattern" and "color pattern".

Wildcard specificity

To ensure the speed of our keyword query system, we impose a limit of 128 alternatives matched per word with wildcards. If the wildcarded word provided matches more than this number of alternatives, the keyword query will fail and you will get an error asking you to refine your keyword terms.

For example, if you were to search for *ing or s*, each of these would match too many alternatives and result in an error.

This limit applies per word, not per phrase, so you can still craft phrases with multiple wildcarded words, as long as each wildcarded word matches no more than 128 alternatives.
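To make the limit concrete, here is a rough Python sketch of how a wildcarded word could be expanded against a list of known words and rejected when it is too broad. The vocabulary, the expand function and the error message are all hypothetical; we use Python's standard fnmatch module, which understands * and ? but not the ! wildcard, so this is an approximation rather than a description of how our system works internally.

```python
from fnmatch import fnmatchcase

MAX_ALTERNATIVES = 128  # the per-word limit described above

def expand(pattern: str, vocabulary: list[str]) -> list[str]:
    """Return the vocabulary words matching pattern, or raise if there are too many."""
    pattern = pattern.lower()
    hits = sorted({w.lower() for w in vocabulary if fnmatchcase(w.lower(), pattern)})
    if len(hits) > MAX_ALTERNATIVES:
        raise ValueError(f"'{pattern}' matches {len(hits)} alternatives; please refine it")
    return hits

# Hypothetical vocabulary: three real words plus 200 synthetic words starting with "s".
vocabulary = ["electric", "electrical", "electricity"] + [f"s{i}" for i in range(200)]

print(expand("electri*", vocabulary))  # ['electric', 'electrical', 'electricity']

try:
    expand("s*", vocabulary)
except ValueError as err:
    print(err)  # 's*' matches 200 alternatives; please refine it
```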

Avoid non-wildcard special characters

We do not perform searches for non-wildcard special characters inside phrases, such as hyphens, apostrophes or underscores. Therefore, these are not allowed in your phrases.

For example, you will get an error and will not be allowed to submit your keyword query if it contains phrases like train's or empty-handed.

For characters which separate words, such as hyphens or underscores, you can use spaces instead. A search for full stack will match full stack, full-stack, full_stack, and any other special character between the two given words.
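One way to picture this behaviour is to imagine the text being split into words at every special character before matching. The Python sketch below follows that idea; the tokenize and contains_phrase helpers are hypothetical, and splitting on everything other than ASCII letters and digits is our simplifying assumption for the example.

```python
import re

def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it on anything that is not a letter or digit."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]

def contains_phrase(text: str, phrase: str) -> bool:
    """Check whether the words of phrase occur consecutively in text."""
    words, target = tokenize(text), tokenize(phrase)
    last_start = len(words) - len(target)
    return any(words[i:i + len(target)] == target for i in range(last_start + 1))

for text in ["a full stack developer", "a full-stack developer", "a full_stack developer"]:
    print(contains_phrase(text, "full stack"))  # True for all three

print(contains_phrase("a fullstack developer", "full stack"))  # False
```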

Chaining Keywords Together Using Boolean Logic

There may be scenarios where you want to string together several keywords to include or exclude specific terms. To achieve this, you can chain keywords together using boolean logic operators. Below we will explain how each of these works.

The AND operator

If you want to chain multiple phrases and ensure that all of them appear in the results, you can chain the keywords using AND.

For example, if you want to ensure that mobile phone, cell tower, and 5g antenna all appear together in each returned result, you can chain them the following way:

mobile phone AND cell tower AND 5g antenna

The OR operator

If you want to construct a keyword query where at least one of multiple keywords appears in each result, you can chain the keywords together using OR .

For example, if you wanted to get results containing artificial intelligence or its abbreviation AI, you could construct your query the following way to get results that contain at least one of these two keywords:

artificial intelligence OR AI

The NOT operator

If you want to exclude results containing certain keywords, you can use NOT to do so.

For example, to exclude all results that contain a word starting with hunt, you could use the wildcarded word hunt* in the following query:

NOT hunt*

Using parentheses and multiple boolean operators simultaneously

For more complex queries, you might want to use several if not all of the available boolean operators. We follow the standard order of precedence for boolean operators: NOT will be applied first, then AND connectors will be applied, and lastly OR connectors.

For example, let us consider the following query:

mobile AND NOT phone OR NOT cell AND 5g

If we take a look at the form structure that results from this query, we can see that the negations are applied first, directly to the keywords after them; then there are two AND subgroups; and finally the outermost root OR group.

In the case of the above query, you will obtain results where either:

  • mobile appears in the patent but phone does not,
  • OR 5g appears in the patent but cell does not.
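Python's own not, and and or operators happen to follow the same order of precedence, so the grouping can be checked with a small sketch. Here each argument is a hypothetical flag saying whether that keyword appears in a patent; the code compares the query as written against the explicitly grouped version for all 16 combinations.

```python
from itertools import product

def implicit(mobile: bool, phone: bool, cell: bool, g5: bool) -> bool:
    # Written without parentheses, exactly like the query above.
    return mobile and not phone or not cell and g5

def explicit(mobile: bool, phone: bool, cell: bool, g5: bool) -> bool:
    # The grouping described above: NOT first, then AND, then OR.
    return (mobile and (not phone)) or ((not cell) and g5)

# Both versions agree for every combination of keyword presence.
assert all(implicit(*v) == explicit(*v) for v in product([False, True], repeat=4))

print(implicit(mobile=True, phone=False, cell=True, g5=False))  # True: mobile without phone
print(implicit(mobile=True, phone=True, cell=True, g5=True))    # False: neither branch holds
```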

Using parentheses to change grouping order

However, sometimes we might want to apply a negation to a whole group, or have the subgroups use the OR connector and apply AND to the outermost group only.

In such situations, we can use parentheses to organise our queries into groups manually, allowing us to declare a custom grouping for the boolean logic in our keyword query.

For example, we could put parentheses around phone OR NOT cell to completely change how the query works:

mobile AND NOT (phone OR NOT cell) AND 5g

If we take a look now at the form interface, we will see that the way rules are grouped looks quite different:

This query will return results where all of the following must apply simultaneously per returned result:

  • mobile must appear in the result,
  • and 5g must appear in the result
  • and neither of the following may be true:
    • phone appears in the result,
    • or cell does not appear in the result (i.e. cell must appear in the result, due to the double negation)
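The double negation can be untangled with De Morgan's law: NOT (phone OR NOT cell) is the same as NOT phone AND cell. The short Python sketch below checks this over every combination of keyword presence; the boolean arguments are hypothetical flags saying whether each keyword appears in a result.

```python
from itertools import product

def as_written(mobile: bool, phone: bool, cell: bool, g5: bool) -> bool:
    # mobile AND NOT (phone OR NOT cell) AND 5g
    return mobile and not (phone or not cell) and g5

def simplified(mobile: bool, phone: bool, cell: bool, g5: bool) -> bool:
    # mobile AND NOT phone AND cell AND 5g
    return mobile and not phone and cell and g5

combinations = list(product([False, True], repeat=4))
assert all(as_written(*v) == simplified(*v) for v in combinations)

# Only one combination satisfies the query: mobile, cell and 5g present, phone absent.
print([v for v in combinations if as_written(*v)])  # [(True, False, True, True)]
```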

Shorthand Notation

For most of our keyword query operators, we support shorthand notation so that you can type in your query faster.

For example, instead of typing in mobile AND phone, you can type in mobile & phone. When writing more complex queries, the shorthand notation can save you time, but its usage is completely optional.

For the boolean operators in particular, we have the following shorthand notation:

  • AND can be shortened to &
  • OR can be shortened to |
  • NOT can be shortened to ~

Casing

As previously mentioned, our searches are case insensitive with regard to the input words and phrases that you provide. This also applies to our operators! You do not have to type in any of the operators in upper case, so feel free to mix and match if that helps you write your query faster.

For example, AND is equivalent to and and also equivalent to aNd .
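As a rough illustration of how the shorthand symbols and mixed-case operators all map onto the same three operators, here is a small Python sketch that rewrites a query into its long, upper-case form. The normalize function is hypothetical and deliberately simple: it only handles the boolean operators and does not treat quoted phrases specially.

```python
import re

SHORTHAND = {"&": "AND", "|": "OR", "~": "NOT"}
OPERATORS = {"AND", "OR", "NOT"}

def normalize(query: str) -> str:
    """Rewrite shorthand symbols and mixed-case operators in their long form."""
    # A token is either a single shorthand symbol or a run of other non-space characters.
    tokens = re.findall(r"[&|~]|[^\s&|~]+", query)
    out = []
    for tok in tokens:
        if tok in SHORTHAND:
            out.append(SHORTHAND[tok])
        elif tok.upper() in OPERATORS:
            out.append(tok.upper())
        else:
            out.append(tok)  # keywords are left exactly as typed
    return " ".join(out)

print(normalize("mobile & phone"))    # mobile AND phone
print(normalize("~hunt* | ai"))       # NOT hunt* OR ai
print(normalize("mobile aNd phone"))  # mobile AND phone
```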

Proximity searches

There are cases where checking that two keywords both appear in a result is not sufficient, and you might want only the results where those two words are more clearly linked together.

In such cases, you can use proximity searches in your keyword queries. These allow you to check if two words are within a maximum distance of one another in the result.

We support two types of proximity searches: ordered, using the WITHINF operator, and unordered, using the WITHIN operator.

Unordered Proximity Searches

Unordered proximity searches will likely be the more common in your queries. They check whether two words are within a set maximum distance of one another in the result, in any order.

For example, if you want to get results where 5g and antenna appear within at most 7 words of one another, you can type in the following query:

5G WITHIN 7 antenna

Ordered Proximity Searches

Ordered proximity searches are functionally identical to unordered ones, but they impose the additional condition that the first word must appear before the second one.

For example, if you want to query for results where mobile appears at least once before phone and within at most three words of it, you should type in the following query:

mobile WITHINF 3 phone
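The difference between the two operators can be sketched in a few lines of Python. This is a simplified model, not our actual matching engine: the within function is hypothetical, the text is assumed to be already lowercased and split into words, and we assume that the distance between two words is the difference between their positions.

```python
def positions(tokens: list[str], word: str) -> list[int]:
    """Return every index at which word occurs in the token list."""
    return [i for i, t in enumerate(tokens) if t == word.lower()]

def within(tokens: list[str], first: str, second: str,
           max_distance: int, ordered: bool = False) -> bool:
    """Model WITHIN (ordered=False) and WITHINF (ordered=True)."""
    for i in positions(tokens, first):
        for j in positions(tokens, second):
            gap = j - i  # positive when `second` comes after `first`
            if ordered and 0 < gap <= max_distance:
                return True
            if not ordered and 0 < abs(gap) <= max_distance:
                return True
    return False

tokens = "the antenna array supports 5g mobile phone networks".split()

print(within(tokens, "5g", "antenna", 7))                   # True: 3 words apart, any order
print(within(tokens, "5g", "antenna", 7, ordered=True))     # False: antenna comes first
print(within(tokens, "mobile", "phone", 3, ordered=True))   # True: mobile precedes phone
```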

Using proximity searches in more complex queries

Feel free to use proximity searches in more complex queries using boolean logic.

Queries like mobile phone AND NOT 5g WITHINF 10 antenna are completely valid!


Narrowing Your Query to Specific Patent Fields

By default, if you specify a keyword directly in a query, it will apply across four different patent fields at once: the title, abstract, description and claims of the patent.

However, in certain instances you may only be interested in matches within some of these fields. You can do so by using the FIELD_NAME=(...) notation, where FIELD_NAME can be one of the following:

  • TITLE
  • ABSTRACT
  • DESCRIPTION
  • CLAIMS
  • FULL_TEXT (this applies the query to all of the four fields above at once)

For example, if you want to apply one of the previously discussed queries, mobile phone AND cell tower AND 5g antenna, only to the title field, you can do so the following way:

TITLE=(mobile phone AND cell tower AND 5g antenna)

If you take a look at the form interface, you might notice that you can type in keyword queries with boolean logic in a rule's text input field as well!

You can chain multiple field names together using spaces, and you can also use such field queries in your boolean logic queries:

ABSTRACT TITLE=(mobile phone AND cell tower) AND NOT DESCRIPTION=(5g)

In the query above, we are filtering for results where mobile phone and cell tower must appear simultaneously in either the patent's abstract or title, and the patent's description must not contain 5g.
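To make the field logic more tangible, the query above can be modelled in Python against a patent represented as a dictionary. Everything here is hypothetical: the patent record, the has_phrase helper and the use of plain substring matching are simplifications for illustration. We also assume that each keyword may match in any of the listed fields; in this example both keywords appear in the title, so the result does not depend on that assumption.

```python
def has_phrase(patent: dict[str, str], fields: list[str], phrase: str) -> bool:
    """True if phrase occurs (case-insensitively) in at least one of the fields."""
    return any(phrase.lower() in patent[f].lower() for f in fields)

def matches(patent: dict[str, str]) -> bool:
    # ABSTRACT TITLE=(mobile phone AND cell tower) AND NOT DESCRIPTION=(5g)
    in_abstract_or_title = (
        has_phrase(patent, ["abstract", "title"], "mobile phone")
        and has_phrase(patent, ["abstract", "title"], "cell tower")
    )
    return in_abstract_or_title and not has_phrase(patent, ["description"], "5g")

patent = {
    "title": "Mobile phone with cell tower handover",
    "abstract": "A handset that switches between base stations.",
    "description": "The handset supports LTE networks.",
    "claims": "1. A handset comprising a radio.",
}

print(matches(patent))  # True

# The same patent is excluded once its description mentions 5G.
print(matches({**patent, "description": "The handset supports 5G networks."}))  # False
```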

Using Quotes for Reserved Words and Numbers

Reserved words or numbers may not be used directly as a word in a keyword because they're used to construct the surrounding query.

Here is a full list of the reserved words from our keyword query language (casing does not matter):

  • TITLE , TI , ABSTRACT , AB , CLAIMS , CL , DESCRIPTION , DSC , FULL_TEXT , FT , AND , OR , WITHIN , W , WITHINF , WF , NOT .

This also applies to numbers that are not within a word, e.g. 100 . If the number does appear inside a word, e.g. formula100 , you do not need to quote it.

To use reserved words or numbers inside your phrases, you can wrap your phrase in double or single quotes. A search for mobile AND phone will return results that contain both mobile and phone, while a search for "mobile phone" will return results that contain the phrase mobile phone.

Similarly for numbers, typing in 300 spartans will result in an error being shown, but you can avoid this by quoting the phrase with single or double quotes: "300 spartans" will be allowed.
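If you build queries programmatically, a small helper can decide when a phrase needs quoting. The Python sketch below is an illustration based only on the rules in this section; the function names are hypothetical, and the set of reserved words is copied from the list above.

```python
RESERVED = {
    "TITLE", "TI", "ABSTRACT", "AB", "CLAIMS", "CL", "DESCRIPTION", "DSC",
    "FULL_TEXT", "FT", "AND", "OR", "WITHIN", "W", "WITHINF", "WF", "NOT",
}

def needs_quotes(phrase: str) -> bool:
    """True if any word is a reserved word (in any casing) or a standalone number."""
    return any(w.upper() in RESERVED or w.isdigit() for w in phrase.split())

def quote_if_needed(phrase: str) -> str:
    """Wrap the phrase in double quotes only when the rules above require it."""
    return f'"{phrase}"' if needs_quotes(phrase) else phrase

print(quote_if_needed("300 spartans"))   # "300 spartans"
print(quote_if_needed("rock and roll"))  # "rock and roll"
print(quote_if_needed("formula100"))     # formula100
print(quote_if_needed("mobile phone"))   # mobile phone
```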