Using the Keyword Query Language in NLPatent
Our updated keyword query interface includes a new feature that lets users type in keyword queries directly. This feature is particularly helpful when you want to:
- Rapidly edit existing keyword queries
- Include a large number of keywords in your queries
- Construct long, complex queries that would be too cumbersome to build in the form interface.
Any input in the text input field will update the form interface in real time, and vice versa. This helps you get familiar with the keyword language: you can first construct a query in the form and watch how the text query evolves along with your changes. For more information on how to use the form interface, see Understanding the Keyword Interface in NLPatent Apps.
Before getting started, if you are not already familiar with boolean logic, we recommend reading our article A Primer on Boolean Logic, as it will help you understand our keyword query language and how it works behind the scenes.
Phrases
The simplest keyword query is a search for a phrase consisting of multiple words separated by spaces.
For example, the query mobile phone will return results that contain the phrase mobile phone at least once anywhere in the title, description, abstract or claims of a patent.
Searches are case insensitive with regard to the keyword inputs, so a search for 5g will return results containing either 5G or 5g.
Glossary
- word: a single word, with no spaces
- phrase: words separated by spaces
- keyword: word or phrase used in a keyword query
Wildcards
Wildcard matching allows you to expand your keyword queries by accommodating variations of specific keywords. Understanding how to use single and multiple-character wildcards can significantly enhance the flexibility of your queries.
Single character wildcard: ?
Employ the ? symbol to replace a single character. For example:
- wom?n would match both woman and women
- reven?e would match both revenge and revenue
Optional single character wildcard: !
The ! wildcard symbol is similar to the previous one, but the character at that position can also be completely missing. For example:
- colo!r would match both colour and color
Multiple character wildcard: *
Use the * symbol to replace zero or more characters. This can be used at the end or in the middle of a term. For example:
- electri* will filter for electricity, electrical, electric, and so on.
Wildcard usage in phrases
Wildcards can be used inside phrases as well. For instance, searching for colo!r pattern would match both "colour pattern" and "color pattern".
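The three wildcards can be pictured as a translation into a regular expression. The sketch below is only an illustration of the rules described above, not how NLPatent implements them: the function names are hypothetical, and it assumes that a wildcard stands for a letter, digit or underscore (Python's \w).

```python
import re

def wildcard_to_regex(keyword: str) -> re.Pattern:
    """Translate a wildcarded word or phrase into a regular expression (illustrative only)."""
    parts = []
    for ch in keyword:
        if ch == "?":
            parts.append(r"\w")       # exactly one character
        elif ch == "!":
            parts.append(r"\w?")      # one character, or none at all
        elif ch == "*":
            parts.append(r"\w*")      # zero or more characters
        elif ch == " ":
            parts.append(r"\s+")      # words of a phrase are separated by whitespace
        else:
            parts.append(re.escape(ch))
    # Keyword searches are case insensitive
    return re.compile("".join(parts), re.IGNORECASE)

def matches(keyword: str, text: str) -> bool:
    """Return True when the whole text is matched by the wildcarded keyword."""
    return wildcard_to_regex(keyword).fullmatch(text) is not None

for keyword, candidate in [("wom?n", "women"), ("colo!r", "color"), ("electri*", "electricity")]:
    print(keyword, candidate, matches(keyword, candidate))
```

The same helper also handles phrases, so matches("colo!r pattern", "colour pattern") is true under these assumptions.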
Wildcard specificity
To ensure the speed of our keyword query system, we impose a limit of 128 alternatives matched per word with wildcards. If a wildcarded word matches more than this number of alternatives, the keyword query will fail and you will get an error asking you to refine your keyword terms.
For example, if you were to search for *ing or s*, each of these would match too many alternatives and result in an error.
This limit applies per word, not per phrase, so you can still craft phrases with multiple wildcarded words, as long as each wildcarded word matches no more than 128 alternatives.
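To make the per-word limit concrete, here is a minimal sketch that expands one wildcarded word against a vocabulary and rejects it when it matches too many alternatives. The vocabulary, the function name and the exception class are hypothetical; only the figure of 128 comes from the text above.

```python
import re

MAX_ALTERNATIVES = 128  # per-word limit stated above

class TooManyAlternativesError(ValueError):
    """Raised when a wildcarded word is not specific enough (hypothetical name)."""

def expand_wildcard(word: str, vocabulary: set) -> list:
    """Return every vocabulary word matched by a single wildcarded word."""
    translation = {"?": r"\w", "!": r"\w?", "*": r"\w*"}
    regex = "".join(translation.get(ch, re.escape(ch)) for ch in word)
    pattern = re.compile(regex, re.IGNORECASE)
    alternatives = sorted(w for w in vocabulary if pattern.fullmatch(w))
    if len(alternatives) > MAX_ALTERNATIVES:
        raise TooManyAlternativesError(
            f"{word!r} matches {len(alternatives)} alternatives; please refine it"
        )
    return alternatives

# Toy vocabulary: 200 made-up words starting with "s", plus two real ones
vocabulary = {f"s{i}" for i in range(200)} | {"woman", "women"}
print(expand_wildcard("wom?n", vocabulary))
```

With this toy vocabulary, expand_wildcard("s*", vocabulary) raises the error, because 200 words start with s.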
Avoid non-wildcard special characters
We do not perform searches for non-wildcard special characters inside phrases, such as hyphens, apostrophes or underscores. Therefore, these are not allowed in your phrases.
For example, you will get an error and will not be able to submit your keyword query if it contains phrases like train's or empty-handed.
For characters that separate words, such as hyphens or underscores, you can use spaces instead. A search for full stack will match full stack, full-stack, full_stack, and the same two words separated by any other special character.
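One way to picture why a space in the query matches any separator is to normalise both the phrase and the text before comparing them. The following is a rough sketch under that assumption, with hypothetical function names and ASCII-only handling; the actual matching engine may work differently.

```python
import re

def normalise(text: str) -> str:
    """Lower-case the text and collapse every run of non-alphanumeric characters into one space."""
    # ASCII-only for simplicity: accented letters would also be treated as separators here
    return re.sub(r"[^0-9a-z]+", " ", text.lower()).strip()

def contains_phrase(phrase: str, text: str) -> bool:
    """Check whether the phrase occurs in the text as a run of whole words."""
    # Padding with spaces prevents matches that start or end in the middle of a word
    return f" {normalise(phrase)} " in f" {normalise(text)} "

for text in ["full stack", "a full-stack developer", "full_stack engineer", "fullstack"]:
    print(text, contains_phrase("full stack", text))
```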
Chaining Keywords Together Using Boolean Logic
There may be scenarios where you want to string together several keywords to include or exclude specific terms. To achieve this, you can chain keywords together using boolean logic operators. Below we explain how each of these works.
The AND operator
If you want to chain multiple phrases and ensure that all of them appear in the results, you can chain the keywords using AND.
For example, if you want to ensure that mobile phone, cell tower, and 5g antenna all appear together in each returned result, you can chain them the following way:
mobile phone AND cell tower AND 5g antenna
The OR operator
If you want to construct a keyword query where at least one of multiple keywords appears in each result, you can chain the keywords together using OR.
For example, if you wanted to get results containing artificial intelligence or its abbreviation AI, you could construct your query the following way to get results that contain at least one of these two keywords:
artificial intelligence OR AI
The NOT operator
If you want to exclude results containing certain keywords, you can use NOT to do so.
For example, if you want to exclude all results that contain a word starting with hunt, you can use the wildcarded word hunt* and type in the following query:
NOT hunt*
Using parentheses and multiple boolean operators simultaneously
For more complex queries, you might want to use several, if not all, of the available boolean operators. We follow the standard order of precedence for boolean operators: NOT will be applied first, then AND connectors will be applied, and lastly OR connectors.
For example, let us consider the following query:
mobile AND NOT phone OR NOT cell AND 5g
If we were to take a look at the form structure that results from this query, it will look like this:
We can see that the negations are applied first, directly to the keywords after them, then there are two AND subgroups, and finally the outermost root OR group.
In the case of the above query, you will obtain results where either:
- mobile appears in the patent but not phone,
- OR, 5g appears in the patent but not cell.
Using parentheses to change grouping order
However, sometimes we might want to apply a negation to a whole group, or have the subgroups use the OR connector and then apply the AND to the outermost group only.
In such situations, we can use parentheses to organise our queries into groups manually, allowing us to declare a custom grouping for the boolean logic in our keyword query.
For example, we could put parentheses around phone OR NOT cell to completely change how the query works:
mobile AND NOT (phone OR NOT cell) AND 5g
If we take a look now at the form interface, we will see that the way rules are grouped looks quite different:
This query will return results where all of the following apply simultaneously to each returned result:
- mobile must appear in the result,
- and 5g must appear in the result,
- and neither of the following holds:
  - phone appears in the result,
  - cell does not appear in the result (i.e. cell must appear in the result, due to the double negation).
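Python's own not, and and or follow the same order of precedence, so the effect of the parentheses can be checked with a small truth-table style sketch. The two query strings in the comments are reconstructed from the descriptions in this section and are an assumption, as are the helper names; a patent is reduced here to a plain string of words.

```python
def flags(text: str) -> dict:
    """Record which of the four example keywords occur in the text."""
    words = set(text.lower().split())
    return {name: name in words for name in ("mobile", "phone", "cell", "5g")}

def without_parentheses(text: str) -> bool:
    f = flags(text)
    # mobile AND NOT phone OR NOT cell AND 5g
    # NOT binds first, then AND, then OR: two AND subgroups joined by a root OR
    return f["mobile"] and not f["phone"] or not f["cell"] and f["5g"]

def with_parentheses(text: str) -> bool:
    f = flags(text)
    # mobile AND NOT (phone OR NOT cell) AND 5g
    # The negation now applies to the whole parenthesised group
    return f["mobile"] and not (f["phone"] or not f["cell"]) and f["5g"]

for text in ["mobile cell", "5g antenna", "mobile 5g cell", "mobile 5g"]:
    print(text, without_parentheses(text), with_parentheses(text))
```

Of these four texts, only "mobile 5g cell" satisfies the parenthesised form, because it contains mobile, 5g and cell but not phone.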
Shorthand Notation
For most of our keyword query operators, we allow shorthand notation so that you can type in your query faster.
For example, instead of typing in mobile AND phone, you can type in mobile & phone. When writing more complex queries, the shorthand notation can save you time, but its usage is completely optional.
For the boolean operators in particular, we have the following shorthand notation:
- AND can be shortened to &
- OR can be shortened to |
- NOT can be shortened to ~
Casing
As previously mentioned, our searches are case insensitive with regard to the input words and phrases that you provide. This also applies to our operators! You do not have to type in any of the operators in upper case, so feel free to mix and match if that helps you write your query faster.
For example, AND is equivalent to and, and also equivalent to aNd.
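The shorthand and casing rules amount to a normalisation step: every spelling of an operator maps to the same canonical form. Below is a small sketch of that idea. The function name is hypothetical, it only covers the three boolean operators, and it deliberately ignores quoted phrases, so it should be read as an illustration and not as the actual parser.

```python
import re

SHORTHAND = {"&": "AND", "|": "OR", "~": "NOT"}
OPERATORS = {"AND", "OR", "NOT"}

def normalise_operators(query: str) -> str:
    """Rewrite shorthand and mixed-case boolean operators in their canonical upper-case form."""
    # Split into shorthand symbols, parentheses, and runs of any other non-space characters
    tokens = re.findall(r"[&|~()]|[^\s&|~()]+", query)
    result = []
    for token in tokens:
        if token in SHORTHAND:
            result.append(SHORTHAND[token])
        elif token.upper() in OPERATORS:
            result.append(token.upper())
        else:
            result.append(token)  # keywords are left exactly as typed
    return " ".join(result)

print(normalise_operators("mobile & phone"))
print(normalise_operators("mobile aNd ~phone | cell"))
```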
Proximity searches
There are cases where checking whether two keywords both appear in a result is not sufficient, and you want only the results where those two words are more clearly linked together.
In such cases, you can use proximity searches in your keyword queries. These allow you to check whether two words are within a maximum distance of one another in the result.
We support two types of proximity searches: ordered, using the WITHINF operator, and unordered, using the WITHIN operator.
Unordered Proximity Searches
Unordered proximity searches will likely be the more common kind in your queries. They check whether two words are within a set maximum distance of one another in the result, in any order.
For example, if you want to get results where 5g and antenna appear within at most 7 words of one another, you can type in the following query:
5g WITHIN 7 antenna
Ordered Proximity Searches
Ordered proximity searches work the same way as unordered ones, but they impose the additional condition that the first word must appear before the second one.
For example, if you want to query for results where mobile appears at least once before phone and within at most three words of it, you should type in the following query:
mobile WITHINF 3 phone
Using proximity searches in more complex queries
Feel free to use proximity searches in more complex queries using boolean logic.
Queries like mobile phone AND NOT 5g WITHINF 10 antenna are completely valid!
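The difference between the ordered and unordered operators can be sketched with word positions. This example assumes that distance is counted as the difference between word positions (adjacent words are at distance 1); how NLPatent counts distance exactly is not stated here, and the function name and signature are hypothetical.

```python
def within(text: str, first: str, second: str, max_distance: int, ordered: bool = False) -> bool:
    """Check whether two words occur within max_distance words of one another."""
    words = text.lower().split()
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    for i in first_positions:
        for j in second_positions:
            gap = j - i
            if ordered and gap <= 0:
                continue  # ordered search: the first word must come before the second
            if 0 < abs(gap) <= max_distance:
                return True
    return False

text = "a 5g base station uses a phased antenna array"
print(within(text, "5g", "antenna", 7))                 # unordered, like 5g WITHIN 7 antenna
print(within(text, "antenna", "5g", 7, ordered=True))   # ordered, like antenna WITHINF 7 5g
```

In this text 5g and antenna are six positions apart, so the unordered check succeeds, while the ordered check with antenna first fails because antenna comes after 5g.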
Narrowing Your Query to Specific Patent Fields
By default, if you specify a keyword directly in a query, it will apply across four different patent fields at once: the title, abstract, description and claims of the patent.
However, in certain instances you may only be interested in matches within some of these fields. You can do so by using the FIELD_NAME=(...) notation, where FIELD_NAME can be one of the following:
- TITLE
- ABSTRACT
- DESCRIPTION
- CLAIMS
- FULL_TEXT (this applies the query to all four of the fields above at once)
For example, if you want to apply one of the previously discussed queries, mobile phone AND cell tower AND 5g antenna, only to the title field, you can do so the following way:
TITLE=(mobile phone AND cell tower AND 5g antenna)
If you take a look at the form interface, you might notice that you can type in keyword queries with boolean logic in a rule's text input field as well!
You can chain multiple field names together using spaces, and you can also use such field queries in your boolean logic queries:
ABSTRACT TITLE=(mobile phone AND cell tower) AND NOT DESCRIPTION=(5g)
In the query above, we are filtering for results where mobile phone and cell tower must appear simultaneously in either the patent's abstract or title, and the patent's description must not contain 5g.
Using Quotes for Reserved Words and Numbers
Reserved words or numbers may not be used directly as a word in a keyword because they're used to construct the surrounding query.
Here is a full list of the reserved words from our keyword query language (casing does not matter):
TITLE, TI, ABSTRACT, AB, CLAIMS, CL, DESCRIPTION, DSC, FULL_TEXT, FT, AND, OR, WITHIN, W, WITHINF, WF, NOT.
This also applies to numbers that are not within a word, e.g. 100. If the number does appear inside a word, e.g. formula100, you do not need to quote it.
To use reserved words or numbers inside your phrases, you can wrap your phrase in double or single quotes. A search for mobile AND phone will return results that contain both mobile and phone, while a search for "mobile AND phone" will return results that contain the phrase mobile and phone.
Similarly for numbers, typing in 300 spartans will result in an error being shown, but you can avoid this by quoting the phrase with single or double quotes: "300 spartans" will be allowed.
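If you build queries programmatically, a small helper can decide when a phrase needs quoting. The sketch below applies the two rules described in this section (reserved words in any casing, and numbers standing on their own); the function names are hypothetical and the helper is not part of NLPatent.

```python
RESERVED_WORDS = {
    "TITLE", "TI", "ABSTRACT", "AB", "CLAIMS", "CL", "DESCRIPTION", "DSC",
    "FULL_TEXT", "FT", "AND", "OR", "WITHIN", "W", "WITHINF", "WF", "NOT",
}

def needs_quotes(phrase: str) -> bool:
    """Return True when the phrase contains a reserved word or a standalone number."""
    for word in phrase.split():
        # Reserved words are case insensitive; digits inside a word (formula100) are fine
        if word.upper() in RESERVED_WORDS or word.isdigit():
            return True
    return False

def quote_if_needed(phrase: str) -> str:
    """Wrap the phrase in double quotes only when that is required."""
    return f'"{phrase}"' if needs_quotes(phrase) else phrase

for phrase in ["300 spartans", "formula100", "mobile and phone", "mobile phone"]:
    print(quote_if_needed(phrase))
```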