Understanding the Keyword Interface in NLPatent Apps

This is a general article that covers how to use keywords across all NLPatent apps. While the keyword functions all share the same user interface, their effects on results differ depending on where they are used.

For more information on the effects of keyword queries based on where they are used, have a look at What does a keyword query do? in our FAQ article.

NLPatent seamlessly integrates keyword search with its advanced semantic search capabilities, offering users the best of both AI and traditional boolean search approaches. Keywords may be employed to ensure specific terms of art are captured in both the NLPatent Search and Monitor apps.

See our Introduction to Boolean Logic article to better understand how boolean logic works if you are not familiar with it already. 

How Keywords Work in NLPatent

You will see the keywords user interface (UI) across several places in our apps:

It consists of two sections. The first section is a Text Input Field where you can directly write a keyword search query using our keyword query language. This field allows for fast input of keyword queries, and is particularly useful for more complex ones. We also apply syntax highlighting to the input for increased readability.

The second section is the Keyword Form, where keyword rules can be created and grouped. The form provides a helpful live visualisation of keyword queries typed in the first section, as well as a more beginner-friendly way to craft keyword queries.

Example Inputs

To get started, open the Example inputs dropdown by clicking on it and selecting one of the provided examples to familiarise yourself with the interface:

Using the Form Interface

The form interface is synced with the text input field, so updates in one will reflect in the other in real time. The live sync feature should help with learning the keyword query language.

To get started, edit the existing default rule or add more rules and groups using the provided buttons. Rules and groups can be nested, and they can also be dragged and dropped.


Four types of rules are supported in NLPatent:

  1. A contains rule simply checks that the phrase provided is contained inside the selected fields.

    Example: searching for mobile phone.

  2. A contains query rule is an advanced search tool for checking if a provided logical query is included in the selected fields.

    This allows you to use the keyword query language boolean operators, such as AND, OR, NOT, WITHIN, or WITHINF, and groups (parentheses) inside the rule's input field.

    Example: searching for ai OR artificial intelligence.

  3. An ordered proximity rule checks that two phrases are within a customisable maximum distance of one another and in the same order, in the selected patent fields.
  4. An unordered proximity rule does the same as the ordered one, but the ordering does not matter.
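
The difference between the two proximity rules can be pictured with a short Python sketch. This is not NLPatent's implementation: the function name `within`, the whitespace tokenisation, and the definition of distance as the difference between word positions are assumptions made purely for illustration, and only single words (not phrases) are handled.

```python
def within(text, first, second, max_distance, ordered):
    """Return True if `first` and `second` occur within `max_distance` words.

    Hypothetical helper: distance is assumed to be the difference between
    word positions, and only single-word terms are supported.
    """
    words = text.lower().split()
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    for i in first_positions:
        for j in second_positions:
            gap = j - i  # positive when `second` comes after `first`
            if ordered and 0 < gap <= max_distance:
                return True
            if not ordered and 0 < abs(gap) <= max_distance:
                return True
    return False


text = "a chip made of silicon"
print(within(text, "silicon", "chip", 5, ordered=True))   # False
print(within(text, "silicon", "chip", 5, ordered=False))  # True
```

In the sample text, chip appears before silicon, so only the unordered check succeeds.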

You can select one or more fields for your rules, or Full Text if you want to search across all fields at once.

Both rules and groups can be negated, and groups will link child subgroups and rules with the selected boolean operator. The AND boolean operator ensures that all filters declared by the group's children apply to a result, while OR requires at least one of the children's filters to apply to a result.

For example, the query in the screenshot below has a parent group that uses OR to link two rules, one querying for mobile and the other for 5g. It returns results in which either mobile or 5g appears:
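
The same group logic can be sketched in Python. The nested-dictionary layout, the `evaluate` function, and the plain substring test used for a contains rule are all hypothetical and chosen for illustration; they are not how NLPatent stores or runs queries.

```python
def evaluate(node, text):
    """Evaluate a rule or group against `text` (hypothetical structure)."""
    if "contains" in node:  # a contains rule, simplified to a substring test
        result = node["contains"].lower() in text.lower()
    else:  # a group linking its children with AND or OR
        results = [evaluate(child, text) for child in node["children"]]
        result = all(results) if node["operator"] == "AND" else any(results)
    # Both rules and groups can be negated.
    return not result if node.get("negated", False) else result


query = {
    "operator": "OR",
    "children": [{"contains": "mobile"}, {"contains": "5g"}],
}

print(evaluate(query, "A 5G base station"))  # True
print(evaluate(query, "A mobile handset"))   # True
print(evaluate(query, "A wired telephone"))  # False
```

Switching `"operator"` to `"AND"` would require both mobile and 5g to be present, and adding `"negated": True` to a rule or group inverts its result.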

Formatting Complex Queries

To help out in writing long queries, we allow users to split the input across multiple lines and indent it, just as with regular code. We also provide a Format button which can automatically apply the aforementioned formatting.

Keyword Query Text Input Error Messages

While typing a query directly, you will be guided by helpful messages and errors that include documentation relevant to your input, which should assist you in writing a correct query. To view the relevant documentation, simply hover your mouse over the words underlined with a dotted line:

Phrases

You can search for both words and phrases containing multiple words separated by spaces. In both cases your input will be searched for as is.

For example, if you have a contains rule filtering for quantum, you will get results containing quantum. The same goes for phrases: if your rule is set to filter for mobile phone, we will return results containing this exact phrase.

Note: in the rest of the document, the term keyword refers to the words or phrases you are querying for.

Wildcards

Wildcard matching allows you to expand your keyword queries by accommodating variations of specific keywords. Understanding how to use single and multiple-character wildcards can significantly enhance the flexibility of your queries.

Single character wildcard: ?

Employ the ? symbol to replace a single character. For example:  

  • wom?n would match both woman and women 
  • reven?e would match both revenge and revenue

Optional single character wildcard: !

The ! wildcard symbol is similar to the previous one, but the character at that position can also be completely missing. For example:

  • colo!r would match both colour and color

Multiple character wildcard: *

Use the * symbol to replace zero or more characters. This can be used at the end or in the middle of a term. For example: 

  • electri* will filter for electricity, electrical, electric, and so on.

Wildcard usage in phrases

You are free to use wildcards inside phrases as well. For instance, searching for colo!r pattern would match both "colour pattern" and "color pattern".
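
One way to build intuition for the three wildcards is to translate them into regular expressions. The Python sketch below does that under stated assumptions: the helper names `wildcard_to_regex` and `matches` are hypothetical, a "character" is taken to be a regex word character, and matching is case-insensitive. It illustrates the behaviour described above rather than NLPatent's actual matching engine.

```python
import re


def wildcard_to_regex(keyword):
    """Translate a wildcarded keyword into a compiled regex (illustrative)."""
    translation = {
        "?": r"\w",   # exactly one character
        "!": r"\w?",  # one character, or none at all
        "*": r"\w*",  # zero or more characters
        " ": r"\s+",  # words in a phrase are separated by whitespace
    }
    pattern = "".join(translation.get(ch, re.escape(ch)) for ch in keyword)
    return re.compile(pattern, re.IGNORECASE)


def matches(keyword, candidate):
    """Return True if the whole candidate matches the wildcarded keyword."""
    return wildcard_to_regex(keyword).fullmatch(candidate) is not None


print(matches("wom?n", "women"))                    # True
print(matches("colo!r", "color"))                   # True
print(matches("electri*", "electricity"))           # True
print(matches("colo!r pattern", "colour pattern"))  # True
print(matches("wom?n", "womn"))                     # False
```

The last call shows the difference between ? and !: the ? wildcard needs a character at that position, so womn is rejected.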

Wildcard specificity

To ensure the speed of our keyword query system, we impose a limit of 128 alternatives matched per word with wildcards. If the wildcarded word you provided matches more than this number of alternatives, the keyword query will fail and you will get an error asking you to refine your keyword terms.

For example, if you were to search for *ing or s*, each of these would match too many alternatives and result in an error.

This limit applies per word, not per phrase, so you can still craft phrases with wildcarded words even if the phrase itself would match more than the alternative count limit.
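
The per-word limit can be pictured as a check made while a wildcarded word is expanded against the indexed vocabulary. The Python sketch below is a rough illustration only: the `expand` function, the error text, and the tiny made-up vocabulary are assumptions, and only the figure of 128 comes from this article.

```python
import re

MAX_ALTERNATIVES = 128  # limit stated in this article


def expand(word, vocabulary):
    """Return vocabulary terms matched by a wildcarded word, or raise if too many."""
    pattern = "".join(
        {"?": r"\w", "!": r"\w?", "*": r"\w*"}.get(ch, re.escape(ch)) for ch in word
    )
    regex = re.compile(pattern, re.IGNORECASE)
    alternatives = [term for term in vocabulary if regex.fullmatch(term)]
    if len(alternatives) > MAX_ALTERNATIVES:
        raise ValueError(f"'{word}' matches too many alternatives; refine it")
    return alternatives


# Hypothetical vocabulary: 200 made-up terms starting with "s", plus three real ones.
vocabulary = [f"s{n}" for n in range(200)] + ["electric", "electrical", "electricity"]

print(expand("electri*", vocabulary))  # ['electric', 'electrical', 'electricity']

try:
    expand("s*", vocabulary)
except ValueError as error:
    print(error)  # 's*' matches too many alternatives; refine it
```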

Avoid non-wildcard special characters

We do not perform searches for non-wildcard special characters inside phrases, such as hyphens, apostrophes or underscores. Therefore, these are not allowed in your phrases. For characters which separate words, such as hyphens or underscores, you can use spaces instead. A search for full stack will match full stack, full-stack, full_stack, and any other variant with a special character between the two words.
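
The effect described here resembles normalising text so that every special character acts as a word separator. The Python sketch below shows that idea; the helper names `normalise` and `contains_phrase` are hypothetical, and the way NLPatent actually tokenises patent text is not documented in this article.

```python
import re


def normalise(text):
    """Lower-case text and treat every run of special characters as one space."""
    return " ".join(re.split(r"[\W_]+", text.lower())).strip()


def contains_phrase(phrase, text):
    """Return True if the normalised phrase appears as whole words in the text."""
    return f" {normalise(phrase)} " in f" {normalise(text)} "


for sample in ["full stack", "full-stack", "full_stack"]:
    # Each line printed ends with True.
    print(sample, contains_phrase("full stack", f"a {sample} developer"))
```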

Using Quotes for Reserved Words and Numbers

Reserved words or numbers may not be used directly as a word in a keyword because they're used to construct the surrounding query.

Here is a full list of the reserved words from our keyword query language (casing does not matter):

  • TITLE , TI , ABSTRACT , AB , CLAIMS , CL , DESCRIPTION , DSC , FULL_TEXT , FT , AND , OR , WITHIN , W , WITHINF , WF , NOT .

This also applies to numbers that are not within a word, e.g. 100. If the number does appear inside a word, e.g. formula100, you do not need to quote it.

To use reserved words or numbers inside your phrases, you can wrap your phrase in double or single quotes. A search for mobile AND phone will return results that contain both mobile and phone, while a search for "mobile phone" will return results that contain the phrase mobile phone.

Similarly for numbers, typing in 300 spartans will result in an error being shown, but you can avoid this by quoting the phrase with single or double quotes: "300 spartans" will be allowed.
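
If you generate keyword phrases programmatically, a small check along the following lines can decide when quoting is needed. This is a sketch under assumptions: the function names are hypothetical, the reserved words are copied from the list above, and a "number" is taken to mean a word made up entirely of digits.

```python
RESERVED_WORDS = {
    "TITLE", "TI", "ABSTRACT", "AB", "CLAIMS", "CL", "DESCRIPTION", "DSC",
    "FULL_TEXT", "FT", "AND", "OR", "WITHIN", "W", "WITHINF", "WF", "NOT",
}


def needs_quotes(phrase):
    """Return True if any word is a reserved word or a standalone number."""
    return any(
        word.upper() in RESERVED_WORDS or word.isdigit()  # casing does not matter
        for word in phrase.split()
    )


def quote_if_needed(phrase):
    """Wrap the phrase in double quotes only when it would otherwise be rejected."""
    return f'"{phrase}"' if needs_quotes(phrase) else phrase


print(quote_if_needed("300 spartans"))  # "300 spartans"
print(quote_if_needed("formula100"))    # formula100
print(quote_if_needed("claims chart"))  # "claims chart"
```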

Practical Examples

  1. You are looking for patents that contain the terms AI and ML across all parts of the patent. You want to ensure that both the acronym and the fully spelled-out term are considered. What would the keyword input look like?

  2. You are looking for patents in semiconductors. You know that you are looking for semiconductors that contain silicon. Specifically, you’d like to look for patents that contain the word silicon within 5 words of and before chip, and within 3 words of semiconductor in any order. What would your search string look like?

  3. You’re looking for a composite alloy of aluminum and copper, for an application in building materials. You don't want gold to be involved in the alloy, and you want to search only in patent descriptions. Keeping in mind the American vs. British spelling of aluminum (aluminum vs. aluminium respectively), how would you craft a string that ensures the words alloy, along with aluminium and nickel, are mentioned within the patent description, but not gold?

    Here we have two options, a longer one where we define rules for each word, and a shorter approach where we type in a keyword query directly inside the rule text input field.