Understanding the Keyword Interface in NLPatent Apps
This is a general article that covers how to use keywords across all NLPatent apps. While the keyword functions all share the same user interface, their effects on results differ based on where they are used.
For more information on the effects of keyword queries based on where they are used, have a look at What does a keyword query do? in our FAQ article.
NLPatent seamlessly integrates keyword search with its advanced semantic search capabilities, offering users the best of both AI and traditional boolean search approaches. Keywords may be employed to ensure specific terms of art are captured in both the NLPatent Search and Monitor apps.
See our Introduction to Boolean Logic article to better understand how boolean logic works if you are not familiar with it already.
How Keywords Work in NLPatent
You will see the keywords user interface (UI) across several places in our apps:
It consists of two sections. The first section is a Text Input Field where you can directly write a keyword search query using our keyword query language. This field allows for fast input of keyword queries, and is particularly useful for more complex queries. We also apply syntax highlighting to the input for increased readability.
The second section is the Keyword Form, where keyword rules can be created and grouped. The form provides a helpful live visualisation of keyword queries typed in the first section, as well as a more beginner-friendly way of crafting keyword queries.
Example Inputs
To get started, open the Example inputs dropdown by clicking on it, then select one of the provided examples to familiarise yourself with the interface:
Using the Form Interface
The form interface is synced with the text input field, so updates in one will reflect in the other in real time. The live sync feature should help with learning the keyword query language.
To get started, edit the existing default rule or add more rules and groups using the provided buttons. Rules and groups can be nested, and they can also be dragged and dropped.
Four types of rules are supported in NLPatent:
- A contains rule simply checks that the phrase provided is contained inside the selected fields. Example: searching for mobile phone.
- A contains query rule is an advanced search tool for checking if a provided logical query is included in the selected fields. This allows you to use the keyword query language boolean operators, such as AND, OR, NOT, WITHIN, or WITHINF, and groups (parentheses) inside the rule's input field. Example: searching for ai OR artificial intelligence.
- An ordered proximity rule checks that two phrases are within a customisable maximum distance of one another and in the same order, in the selected patent fields.
- An unordered proximity rule does the same as the ordered one, but the ordering does not matter.
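The difference between the two proximity rules can be illustrated with a minimal Python sketch. This is not NLPatent's implementation: the tokenisation, the function names (tokenize, find_phrase, within) and the way distance is counted (adjacent words are at distance 1) are all assumptions made for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words (assumed tokenisation)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def find_phrase(tokens, phrase_tokens):
    """Return every index at which phrase_tokens starts inside tokens."""
    size = len(phrase_tokens)
    return [i for i in range(len(tokens) - size + 1)
            if tokens[i:i + size] == phrase_tokens]


def within(text, first, second, max_distance, ordered):
    """Check that two phrases occur within max_distance words of each other.

    With ordered=True, `first` must appear before `second`.
    Distance is counted so that adjacent words are at distance 1 (an assumption).
    """
    tokens = tokenize(text)
    a, b = tokenize(first), tokenize(second)
    for i in find_phrase(tokens, a):
        for j in find_phrase(tokens, b):
            if j >= i + len(a):                      # first phrase precedes second
                distance = j - (i + len(a)) + 1
            elif i >= j + len(b) and not ordered:    # reversed order, only if allowed
                distance = i - (j + len(b)) + 1
            else:
                continue
            if distance <= max_distance:
                return True
    return False


text = "a silicon based chip design"
print(within(text, "silicon", "chip", 5, ordered=True))   # True
print(within(text, "chip", "silicon", 5, ordered=True))   # False
print(within(text, "chip", "silicon", 5, ordered=False))  # True
```

The ordered check rejects the second call because chip appears after silicon in the text, while the unordered check accepts either order.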
You can select one or more fields for your rules, or Full Text if you want to search across all fields at once.
Both rules and groups can be negated, and groups will link child subgroups and rules with the selected boolean operator. The AND boolean operator ensures that all filters declared by the group's children apply to a result, while OR requires at least one of the children's filters to apply to a result.
For example, in the query from the screenshot below, where a parent group uses OR to link two rules, one querying for mobile and the other for 5g, we would return results in which either mobile or 5g appears.
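The way groups combine their children can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary layout (phrase, operator, children, negated) is hypothetical and is not the format NLPatent uses internally, and a contains rule is simplified to a case-insensitive substring test.

```python
def matches(node, text):
    """Evaluate a nested tree of rules and groups against a piece of text.

    A node is either a rule ({"phrase": ...}) or a group
    ({"operator": "AND" | "OR", "children": [...]}). Both may carry
    "negated": True. This layout is hypothetical.
    """
    if "phrase" in node:
        # Simplified "contains" rule: case-insensitive substring test.
        result = node["phrase"].lower() in text.lower()
    else:
        children = (matches(child, text) for child in node["children"])
        # AND needs every child to match, OR needs at least one.
        result = all(children) if node["operator"] == "AND" else any(children)
    return not result if node.get("negated", False) else result


# The example from the text: a parent OR group linking two rules.
query = {
    "operator": "OR",
    "children": [{"phrase": "mobile"}, {"phrase": "5g"}],
}

print(matches(query, "A 5G base station"))   # True
print(matches(query, "A mobile handset"))    # True
print(matches(query, "A wired router"))      # False
```

Switching the operator to AND would require both mobile and 5g to be present, and setting negated on a rule or group inverts its outcome.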
Formatting Complex Queries
To help with writing long queries, we allow users to split the input across multiple lines and indent it, just as with regular code. We also provide a Format button which automatically applies this formatting.
Keyword Query Text Input Error Messages
While typing a query directly, you will be guided by helpful messages and errors containing documentation relevant to your input, which should assist you in writing a correct query. To view the relevant documentation, simply hover your mouse over the words underlined with a dotted line:
Phrases
You can search for both single words and phrases containing multiple words separated by spaces. In both cases your input will be searched for as is.
For example, if you have a contains rule filtering for quantum, you will get results containing quantum. The same goes for phrases: if your rule is set to filter for mobile phone, we will return results containing this exact phrase.
Note: in the rest of the document, when using the term keyword, we will be referring to the words/phrases you are querying for.
Wildcards
Wildcard matching allows you to expand your keyword queries by accommodating variations of specific keywords. Understanding how to use single and multiple-character wildcards can significantly enhance the flexibility of your queries.
Single character wildcard: ?
Employ the ? symbol to replace a single character. For example:
- wom?n would match both woman and women
- reven?e would match both revenge and revenue
Optional single character wildcard: !
The ! wildcard symbol is similar to the previous one, but the character at that position can also be completely missing. For example:
- colo!r would match both colour and color
Multiple character wildcard: *
Use the * symbol to replace zero or more characters. This can be used at the end or in the middle of a term. For example:
- electri* will filter for electricity, electrical, electric, and so on.
Wildcard usage in phrases
You are free to use wildcards inside phrases as well. For instance, searching for colo!r pattern would match both "colour pattern" and "color pattern".
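One way to reason about the three wildcards is to translate them into regular expressions. The Python sketch below does this under a few assumptions: wildcards stand in for word characters only, matching is case-insensitive, and words in a phrase may be separated by any run of non-word characters. The helper names wildcard_to_regex and phrase_matches are hypothetical, and NLPatent's actual matching may differ.

```python
import re

# Assumed translation of each wildcard into a regular expression fragment.
WILDCARDS = {
    "?": r"\w",    # exactly one character
    "!": r"\w?",   # one character, or none at all
    "*": r"\w*",   # zero or more characters
}


def wildcard_to_regex(phrase):
    """Translate a (possibly wildcarded) phrase into a compiled regular expression."""
    words = []
    for word in phrase.split():
        # Wildcards are replaced, every other character is matched literally.
        words.append("".join(WILDCARDS.get(ch, re.escape(ch)) for ch in word))
    # Words must appear in sequence, separated by non-word characters.
    return re.compile(r"\b" + r"\W+".join(words) + r"\b", re.IGNORECASE)


def phrase_matches(phrase, text):
    """Return True if the wildcarded phrase occurs anywhere in the text."""
    return wildcard_to_regex(phrase).search(text) is not None


print(phrase_matches("wom?n", "two women"))                   # True
print(phrase_matches("colo!r", "the color red"))              # True
print(phrase_matches("electri*", "an electrical circuit"))    # True
print(phrase_matches("colo!r pattern", "a colour pattern"))   # True
```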
Wildcard specificity
To ensure the speed of our keyword query system, we impose a limit of 128 alternatives matched per word with wildcards. If the wildcarded word you provided matches more than this number of alternatives, the keyword query will fail and you will get an error asking you to refine your keyword terms.
For example, if you were to search for *ing or s*, each of these would match too many alternatives and would result in an error.
This limit applies per word, not per phrase, so you can still craft phrases with wildcarded words even if the phrase itself would match more than the alternative count limit.
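The per-word nature of the limit can be illustrated with a small Python sketch. Only the figure of 128 comes from this article. Everything else is an assumption: the vocabulary is a made-up set of terms standing in for whatever index the real system consults, and expand_word, expand_phrase and TooManyAlternativesError are hypothetical names.

```python
import re

MAX_ALTERNATIVES = 128  # limit stated in the article

# Assumed translation of each wildcard into a regular expression fragment.
WILDCARDS = {"?": r"\w", "!": r"\w?", "*": r"\w*"}


class TooManyAlternativesError(ValueError):
    """Raised when one wildcarded word matches too many vocabulary terms."""


def expand_word(word, vocabulary):
    """Return the vocabulary terms matched by a single wildcarded word."""
    pattern = re.compile(
        "".join(WILDCARDS.get(ch, re.escape(ch)) for ch in word),
        re.IGNORECASE,
    )
    alternatives = sorted(term for term in vocabulary if pattern.fullmatch(term))
    if len(alternatives) > MAX_ALTERNATIVES:
        raise TooManyAlternativesError(
            f"'{word}' matches {len(alternatives)} alternatives; please refine it"
        )
    return alternatives


def expand_phrase(phrase, vocabulary):
    """Expand each word separately, so the limit applies per word, not per phrase."""
    return [expand_word(word, vocabulary) for word in phrase.split()]
```

In this sketch, a phrase made of two wildcarded words that each match 100 terms is accepted, even though the phrase as a whole could stand for 10,000 combinations, because neither word exceeds the limit on its own.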
Avoid non-wildcard special characters
We do not perform searches for non-wildcard special characters inside phrases, such as hyphens, apostrophes or underscores. Therefore, these are not allowed in your phrases. For characters which separate words, such as hyphens or underscores, you can use spaces instead. A search for full stack will match full stack, full-stack, full_stack, and any other variant with a special character between the two given words.
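A simple way to picture this behaviour is to imagine that both the phrase and the patent text are normalised before comparison, with every special character treated as a word separator. The Python sketch below is a hypothetical illustration of that idea, not a description of how NLPatent indexes text. It handles ASCII letters and digits only, and the names normalise and contains_phrase are assumptions.

```python
import re


def normalise(text):
    """Lowercase the text and replace each run of non-alphanumeric characters with one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def contains_phrase(phrase, text):
    """Check for the phrase as a sequence of whole words in the normalised text."""
    # Padding with spaces ensures only whole words are matched.
    return f" {normalise(phrase)} " in f" {normalise(text)} "


print(contains_phrase("full stack", "a full-stack developer"))   # True
print(contains_phrase("full stack", "a full_stack developer"))   # True
print(contains_phrase("full stack", "a fullstack developer"))    # False
```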
Using Quotes for Reserved Words and Numbers
Reserved words or numbers may not be used directly as a word in a keyword because they're used to construct the surrounding query.
Here is a full list of the reserved words from our keyword query language (casing does not matter): TITLE, TI, ABSTRACT, AB, CLAIMS, CL, DESCRIPTION, DSC, FULL_TEXT, FT, AND, OR, WITHIN, W, WITHINF, WF, NOT.
This also applies to numbers that are not within a word, e.g. 100. If the number does appear inside a word, e.g. formula100, you do not need to quote it.
To use reserved words or numbers inside your phrases, you can wrap your phrase in double or single quotes. A search for mobile AND phone will return results that contain both mobile and phone, while a search for "mobile phone" will return results that contain the phrase mobile phone.
Similarly for numbers, typing in 300 spartans will result in an error being shown, but you can avoid this by quoting the phrase with single or double quotes: "300 spartans" will be allowed.
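The quoting rule can be summarised as a small check: a phrase needs quotes if any of its words is a reserved word (in any casing) or a standalone number. The Python sketch below encodes the reserved word list given above. The helper names needs_quotes and quote_if_needed are hypothetical and are not part of any NLPatent API.

```python
# Reserved words as listed in the article; casing does not matter.
RESERVED_WORDS = {
    "TITLE", "TI", "ABSTRACT", "AB", "CLAIMS", "CL", "DESCRIPTION", "DSC",
    "FULL_TEXT", "FT", "AND", "OR", "WITHIN", "W", "WITHINF", "WF", "NOT",
}


def needs_quotes(phrase):
    """Return True if the phrase contains a reserved word or a standalone number."""
    return any(
        word.upper() in RESERVED_WORDS or word.isdigit()
        for word in phrase.split()
    )


def quote_if_needed(phrase):
    """Wrap the phrase in double quotes only when the rule above requires it."""
    return f'"{phrase}"' if needs_quotes(phrase) else phrase


print(quote_if_needed("300 spartans"))   # "300 spartans"
print(quote_if_needed("formula100"))     # formula100
print(quote_if_needed("mobile phone"))   # mobile phone
```

Note that formula100 is left unquoted because the number appears inside a word, matching the rule described earlier.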
Practical Examples
- You are looking for patents that contain the words AI and ML in all parts of the patents. You want to ensure that both the acronym and the fully spelled-out term are considered. What would the keyword input look like?
- You are looking for patents in semiconductors. You know that you are looking for semiconductors that contain silicon. Specifically, you’d like to look for patents that contain the word silicon within 5 words of and before chip, and within 3 words of semiconductor in any order. What would your search string look like?
- You’re looking for a composite alloy of aluminum and copper, for an application in building materials. You don't want gold to be involved in the alloy, and you want to only search in patent descriptions. Keeping in mind the American vs. British spelling of aluminum (aluminum vs aluminium respectively), how would you craft a string that ensures the words alloy along with aluminium and nickel are mentioned within the patent description, but not gold?
Here we have two options: a longer one where we define rules for each word, and a shorter approach where we type a keyword query directly inside the rule text input field.