2. Authority | Document Count | Sequence Count | Database
USA | 320,873 | 215,305,722 | Gold+
EPO | 108,362 | 37,883,488 | Gold+
WIPO | 144,292 | 74,293,342 | Gold+
Japan | 104,108 | 27,355,841 | Gold+
China | 78,683 | 1,029,562 | Platinum
India | 6,446 | 69,071 | Platinum
Canada | 57,671 | 24,026,839 | Gold+
Brazil | 2,134 | 39,001 | Platinum
Others | 81,148 | 3,913,808 | Gold+
Total | 903,717 | 383,916,674 |
Country Coverage
World’s Largest Sequence Database
https://www.gqlifesciences.com/genomequest/capabilities-features/
3. GQ Gold+ vs Platinum
Topic | Gold+ | Platinum
Traditional | All Patents ST.25 Listings from US, EPO, WIPO, Korea, Japan | All Patents ST.25 Listings from US, EPO, WIPO, Korea, Japan
Traditional and Manual Curation | GQ-Pat Sequences (including non-ST.25) from US, EPO, WIPO, Korea, Japan plus the following Authorities: AT, AU, BE, CA, CH, DE, ES, FR, GB, LU, NL, NO, TW | GQ-Pat Sequences (including non-ST.25) from US, EPO, WIPO, Korea, Japan plus the following Authorities: AT, AU, BE, CA, CH, DE, ES, FR, GB, LU, NL, NO, TW; BRIC Country Documents: CN, BR, IN, RU + Emerging Country Documents
Features | Extended Legal Status (ELS) | Extended Legal Status (ELS)
Features | Normalized Patent Assignee; Parent | Normalized Patent Assignee; Parent
Features | Unique Family Sequence (UFS) | Unique Family Sequence (UFS)
Features | | Access to PDF Downloads
Features | | Family Portrait Report
6. Results Pre & Post filtering
560K sequences -> Filter -> 2K sequences
7. Getting to Your Results
ALGORITHMS
• Searches can be done in a broad inclusive manner by
selecting the correct algorithm and a few basic settings
FILTERS
• Broad searches can be narrowed quickly based on
homology data, legal status, and many other criteria
VIEWS
• Views allow you to tailor the display to your liking – with
specific columns and intelligent grouping
9. Filters
• Filter your search based on specific legal status,
homology, authority, or many other categories
• Save your favorite, frequently used filters
• Save multiple filters: different filters for different searches
• Filters are categorized for fast access
• Categories include alignment properties, subject text, subject dates, subject
properties etc.
• Filters reduce reported hits based on your criteria
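As a rough illustration of how saved filters narrow a result set, the sketch below applies a named group of criteria to a list of hit records. The record fields (`authority`, `legal_status`, `subject_coverage`), the filter name, and the patent IDs are hypothetical stand-ins chosen for this example; they are not GenomeQuest field names.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    patent_id: str           # hypothetical identifier
    authority: str           # e.g. "US", "EP"
    legal_status: str        # e.g. "granted", "lapsed"
    subject_coverage: float  # percent of the subject covered by the alignment


# A "saved filter" is modeled here as a named list of predicates.
saved_filters = {
    "granted_us_full_coverage": [
        lambda h: h.authority == "US",
        lambda h: h.legal_status == "granted",
        lambda h: h.subject_coverage >= 100.0,
    ],
}


def apply_filter(hits, name):
    """Keep only the hits that satisfy every criterion of the named filter."""
    predicates = saved_filters[name]
    return [h for h in hits if all(p(h) for p in predicates)]


hits = [
    Hit("US-0000001", "US", "granted", 100.0),
    Hit("US-0000002", "US", "lapsed", 100.0),
    Hit("EP-0000003", "EP", "granted", 80.0),
]
filtered = apply_filter(hits, "granted_us_full_coverage")
print([h.patent_id for h in filtered])
```

Keeping each filter as a named collection of criteria mirrors the idea of saving several filters and reusing them across different searches.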
10. Views & Grouping
• Choose how to display your data on Results page
• Tailored views are also used for Excel Table Export
• Add Columns to View with Display List
• Display fields are similar to filter fields
• Display categories similar to filter categories
• Save your favorite, frequently used views
• Save multiple views – different views for different searches
• Group based on specific criteria
• Patent ID, Patent Family, Patent Assignee
• Display all records in group, or subset for streamlined analysis
14. • Sequence Search
• Filter
• Export Results to LQ
• Mark to distinguish sequence searches
• LQ text search
• (ttl_abst_clm:IL-17*^5 OR ttl_abst_clm:IL17*^5) AND antibod*
• Mark to distinguish text search
• Unite!
• Filter
• Highlight key hits
• Export
• Filter within Excel
Sample Workflow
26. Alignment Types
LOCAL ALIGNMENT
Part of the Query matches part of the Subject. BLAST, FASTA, and Smith & Waterman.
GLOBAL ALIGNMENT
All of the Query matches all of the Subject. Needleman & Wunsch and algorithms like it.
BEST FIT ALIGNMENT
All of the Query is fitted into the Subject. GenePAST. Ideal for patent sequence searching.
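The difference between the three alignment types can be sketched with one small dynamic-programming routine. This is a textbook illustration only: the scoring values are arbitrary, and "best fit" is modeled here as a semi-global alignment (the whole query must align, while leading and trailing subject residues are free). That modeling is an assumption made for this example; it does not reproduce GenePAST, BLAST, or FASTA.

```python
def align_score(query, subject, mode, match=1, mismatch=-1, gap=-1):
    """Best alignment score for mode 'local', 'global', or 'fit'."""
    n, m = len(query), len(subject)
    H = [[0] * (m + 1) for _ in range(n + 1)]

    # Skipping query residues is penalized unless the alignment is local.
    if mode in ("global", "fit"):
        for i in range(1, n + 1):
            H[i][0] = i * gap
    # Skipping leading subject residues is penalized only for global.
    if mode == "global":
        for j in range(1, m + 1):
            H[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if query[i - 1] == subject[j - 1] else mismatch
            best = max(H[i - 1][j - 1] + pair,  # align two residues
                       H[i - 1][j] + gap,       # gap in the subject
                       H[i][j - 1] + gap)       # gap in the query
            if mode == "local":
                best = max(best, 0)             # a local alignment may restart
            H[i][j] = best

    if mode == "global":
        return H[n][m]                      # both sequences fully consumed
    if mode == "fit":
        return max(H[n])                    # query consumed, subject end is free
    return max(max(row) for row in H)       # best-scoring local segment


short_query = align_score("ACGT", "TTACGTTT", "fit")
short_query_global = align_score("ACGT", "TTACGTTT", "global")
print(short_query, short_query_global)
```

With a short query embedded in a longer subject, the global score is dragged down by the unaligned subject ends while the fit score is not, which is the behavior the slide describes as suited to patent sequence searching.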
27. Alignment | Subject % ID | Query % ID | Subject % Coverage | Query % Coverage
1 | 100% | 100% | 100% | 100%
2 | 100% | 50% | 100% | 50%
3 | 50% | 100% | 50% | 100%
4 | 50% | 50% | 50% | 50%
5 | 95% | 95% | 100% | 100%
Alignment % identity, corrected for the ratio of the alignment length to either the query or subject length.
Query/Subject % Identity Definition
This example assumes 100% alignment identity, the longer lines are 100 residues, the shorter lines are 50 residues.
• By filtering for 100% subject coverage you can capture CDR to CDR matches
• With variability % ID can drop, so % coverage is the preferable filter
• This is a key feature to understand – these filters are very powerful
In the last table row (95% identity), the alignment contains 5 mismatches.
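One way to read the definition above is sketched below: the alignment % identity is scaled by alignment length over query (or subject) length, and coverage is alignment length over query (or subject) length. The function and argument names are invented for this example, and the sketch assumes an ungapped alignment, so that matches divided by alignment length is the alignment % identity.

```python
def identity_and_coverage(matches, alignment_length, query_length, subject_length):
    """Query/subject % identity and % coverage for one ungapped alignment."""
    alignment_pct_id = 100.0 * matches / alignment_length
    return {
        # alignment % identity corrected by alignment length / sequence length
        "query_pct_id": alignment_pct_id * alignment_length / query_length,
        "subject_pct_id": alignment_pct_id * alignment_length / subject_length,
        "query_pct_coverage": 100.0 * alignment_length / query_length,
        "subject_pct_coverage": 100.0 * alignment_length / subject_length,
    }


# 50-residue subject fully matched inside a 100-residue query
short_subject = identity_and_coverage(50, 50, query_length=100, subject_length=50)

# Two 100-residue sequences aligned end to end with 5 mismatches
five_mismatches = identity_and_coverage(95, 100, query_length=100, subject_length=100)

print(short_subject)
print(five_mismatches)
```

The second case shows why coverage is the steadier filter: five mismatches lower both % identity values, while both coverage values stay at 100%.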
28. Key Fields
Legal Status
Extended Legal Status And National Phase Legal Status
US PAIR Legal Status
• PAIR legal status: updates from US PAIR occur monthly
Live Links to Reports, Alignments
• Links on analysis page carry over to Excel Reports
• Simple, easy sharing among groups
29. Short sequences need GenePAST or Motif
searches (BLAST may miss patents)
• For short Query sequences – or for
easy analysis of variants, GenePAST
is the preferred algorithm.
30. MOTIF on full length – Direct Strike
The long sequence gives hits comprising all three CDRs in the specific order
provided. “.*” represents “any number of unspecified residues, including zero”.
Motif searches require a 100% match in “defined” residues.
>37-motif
DLSIH.*GFDPQDGETIYAQKFQG.*GSSSSWFDP
>9-motif
RASQGISSWLA.*GASNLES.*QQANSFPWT
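The motif syntax shown above coincides with ordinary regular-expression syntax, so its matching behavior can be approximated with Python's `re` module. This is only a sketch of the semantics described on the slide, not of the search engine itself; the subject sequences and the filler residues between the CDRs are made up for illustration.

```python
import re

cdr1, cdr2, cdr3 = "DLSIH", "GFDPQDGETIYAQKFQG", "GSSSSWFDP"

# Joining the CDRs with ".*" reproduces the ">37-motif" line from the slide.
motif = ".*".join([cdr1, cdr2, cdr3])
pattern = re.compile(motif)

# Hypothetical subject sequences; the framework residues are invented.
in_order = "MK" + cdr1 + "WVRQ" + cdr2 + "RVTM" + cdr3 + "WGQG"
subjects = {
    "in_order": in_order,
    "adjacent": cdr1 + cdr2 + cdr3,                      # ".*" may match zero residues
    "one_mismatch": in_order.replace("GFDPQ", "GFDPE"),  # defined residue changed
    "reordered": cdr3 + "AAAA" + cdr2 + "AAAA" + cdr1,   # wrong CDR order
}

results = {name: bool(pattern.search(seq)) for name, seq in subjects.items()}
for name, hit in results.items():
    print(name, hit)
```

A single changed residue inside a CDR is enough to lose the hit, which is why the deck points to GenePAST when variation in the CDRs should still be found.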
31. Unique Family Sequence UFS
• Merge all identical sequences within a family
• Based on strict criteria: identical sequence, patent family, sequence length
• Examine a sequence’s status across authorities
• Group By UFS can replace group by family for finer resolution of unique hits
• UFS Identifier = MD5Sum + Sequence Length + Family ID
• UFS IDs can be transient
Normalized Sequence/Patent Family
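The slide names the three ingredients of a UFS identifier (MD5 sum, sequence length, family ID) but not how they are combined. The sketch below is one hypothetical way to build such a key; the normalization step, the field order, the "-" separator, and the family IDs are assumptions made for illustration.

```python
import hashlib


def ufs_identifier(sequence, family_id):
    """Build an illustrative UFS-style key from a sequence and a family ID."""
    # Assumed normalization: trim whitespace and ignore letter case.
    normalized = sequence.strip().upper()
    digest = hashlib.md5(normalized.encode("ascii")).hexdigest()
    # Assumed layout: <md5>-<length>-<family id>
    return f"{digest}-{len(normalized)}-{family_id}"


same_family_a = ufs_identifier("GSSSSWFDP", "FAM1")
same_family_b = ufs_identifier("gsssswfdp", "FAM1")   # same sequence, same family
other_family = ufs_identifier("GSSSSWFDP", "FAM2")    # same sequence, other family

print(same_family_a == same_family_b, same_family_a == other_family)
```

Identical sequences in the same family collapse to one key, while the same sequence in another family stays distinct, which matches the strict criteria listed above.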
32. Methodology – Searching CDRs
All 3 CDRs (or primer/amplicon sets) in subject or patent
MOTIF – exact match
GenePAST – variations
By requiring a group size equal to three in the post-search grouping, we show only patents that
contain all three CDRs
• Supplying your queries as FASTA sequences
allows multiple queries at once
• GenePAST will allow you to view
patent hits with variability in the CDRs
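The group-size rule can be sketched as follows: hits are grouped by patent, and a patent is kept only when the number of distinct queries that hit it equals three. The query names and patent IDs are invented, and counting distinct queries per patent is this example's interpretation of "group size".

```python
from collections import defaultdict

# Hypothetical (query name, patent ID) pairs from a three-query CDR search.
hits = [
    ("CDR1", "P1"), ("CDR2", "P1"), ("CDR3", "P1"),
    ("CDR1", "P1"),                  # a repeated hit must not inflate the count
    ("CDR1", "P2"), ("CDR3", "P2"),
    ("CDR2", "P3"),
]


def patents_with_all_queries(hits, required=3):
    """Return patents hit by exactly `required` distinct queries."""
    queries_by_patent = defaultdict(set)
    for query_name, patent_id in hits:
        queries_by_patent[patent_id].add(query_name)
    return sorted(p for p, qs in queries_by_patent.items() if len(qs) == required)


complete = patents_with_all_queries(hits)
print(complete)
```

Lowering `required` to two would also surface patents that contain only a subset of the queried CDRs.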
34. Database Selection
Tree Structure and Virtual Databases
• Tree structure allows easy database search setup
• Multiple virtual databases can be chosen
• Virtual databases can be shared among teams
• Save your own databases from
keyword or IP searches – and
search within results
35. Patent Statistics Report
• For multiple queries, quickly display patents
that contain all or a subset of the queries