Sequence Search/Comparison/Analysis
Stephen Allen, Solutions Consultant
Authority   Document Count   Sequence Count   Database
USA                320,873      215,305,722   Gold+
EPO                108,362       37,883,488   Gold+
WIPO               144,292       74,293,342   Gold+
Japan              104,108       27,355,841   Gold+
China               78,683        1,029,562   Platinum
India                6,446           69,071   Platinum
Canada              57,671       24,026,839   Gold+
Brazil               2,134           39,001   Platinum
Others              81,148        3,913,808   Gold+
Total              903,717      383,916,674
Country Coverage
World’s Largest Sequence Database
https://www.gqlifesciences.com/genomequest/capabilities-features/
GQ Gold+ vs Platinum
Traditional (Gold+ and Platinum)
• All patent ST.25 listings from US, EPO, WIPO, Korea, Japan
Traditional and Manual Curation (Gold+ and Platinum)
• GQ-Pat sequences (including non-ST.25) from US, EPO, WIPO, Korea, Japan plus the following authorities: AT, AU, BE, CA, CH, DE, ES, FR, GB, LU, NL, NO, TW
Traditional and Manual Curation (Platinum only)
• BRIC country documents: CN, BR, IN, RU + emerging country documents
Features (Gold+ and Platinum)
• Extended Legal Status (ELS)
• Normalized Patent Assignee; Parent
• Unique Family Sequence (UFS)
Features (Platinum only)
• Access to PDF downloads
• Family Portrait Report
Results
Results Pre & Post filtering
560K sequences → Filter → 2K sequences
Getting to Your Results
ALGORITHMS
• Searches can be run broadly and inclusively by selecting the correct algorithm and a few basic settings
FILTERS
• Broad searches can be narrowed quickly based on homology data, legal status, and many other criteria
VIEWS
• Views allow you to tailor the display to your liking, with specific columns and intelligent grouping
Search Setup
https://www.gqlifesciences.com/blog/category/genepast/
Filters
• Filter your search based on specific legal status, homology, authority, or many other categories
• Save your favorite, frequently used filters
• Save multiple filters: different filters for different searches
• Filters are categorized for fast access
• Categories include alignment properties, subject text, subject dates, subject properties, etc.
• Filters reduce reported hits based on your criteria
Views & Grouping
• Choose how to display your data on the Results page
• Tailored views are also used for Excel Table Export
• Add columns to a view with the Display List
• Display fields are similar to filter fields
• Display categories are similar to filter categories
• Save your favorite, frequently used views
• Save multiple views: different views for different searches
• Group based on specific criteria
• Patent ID, Patent Family, Patent Assignee
• Display all records in a group, or a subset for streamlined analysis
Details and Alignments
LifeQuest
Consolidated Sequence & Text Searching
Filter with LQ markup
Filter by Stars
Filter by Color
Sample Workflow
• Sequence Search
• Filter
• Export Results to LQ
• Mark to distinguish sequence searches
• LQ text search
• (ttl_abst_clm:IL-17*^5 OR ttl_abst_clm:IL17*^5) AND antibod*
• Mark to distinguish text search
• Unite!
• Filter
• Highlight key hits
• Export
• Filter within Excel
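The "mark, then unite" steps of the sample workflow can be pictured with a small sketch. This is only an illustration of the idea, not LifeQuest's implementation: the function name `unite`, the mark labels, and the patent IDs are all hypothetical.

```python
def unite(sequence_hits, text_hits):
    """Map each patent ID to the set of marks for the searches that found it.

    Hypothetical helper: the mark labels "sequence" and "text" stand in for
    the star/color markup described on the slides.
    """
    marks = {}
    for patent_id in sequence_hits:
        marks.setdefault(patent_id, set()).add("sequence")
    for patent_id in text_hits:
        marks.setdefault(patent_id, set()).add("text")
    return marks


# Made-up patent IDs, for illustration only.
sequence_hits = {"US-1", "EP-2", "WO-3"}
text_hits = {"EP-2", "WO-3", "JP-4"}

united = unite(sequence_hits, text_hits)

# "Highlight key hits": keep the patents found by both kinds of search.
key_hits = sorted(p for p, m in united.items() if m == {"sequence", "text"})
print(key_hits)
```

Keeping the marks as a set per patent means the later filter step can ask for either source, or for both, without rerunning the searches.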
Non Sequence IP
Claims & Alignments
Quickly add columns
Post Filtering
Post Filter sequence searches, text searches, or combined searches
Additional Linkouts
Contact us at:
Stephen.Allen@aptean.com
Ellen.Sherin@aptean.com
Bill Perkins@Aptean.com
Questions?
LifeQuest
• Unite Sequence Based & Text Based Searches
• Create Virtual Sequence Database from LQ Results
Nested – Savable Filters
Complex Boolean filters
• Nested filters for fine tuning
• Save standard filters for easy application
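A nested Boolean filter can be thought of as predicates combined with AND and OR. The sketch below is a minimal illustration under that assumption; the field names (`subject_pct_coverage`, `authority`) and helper functions are hypothetical and are not GenomeQuest field identifiers.

```python
from typing import Any, Callable, Dict

Hit = Dict[str, Any]
Predicate = Callable[[Hit], bool]


def all_of(*predicates: Predicate) -> Predicate:
    """AND: every nested predicate must accept the hit."""
    return lambda hit: all(p(hit) for p in predicates)


def any_of(*predicates: Predicate) -> Predicate:
    """OR: at least one nested predicate must accept the hit."""
    return lambda hit: any(p(hit) for p in predicates)


def field_at_least(name: str, minimum: float) -> Predicate:
    return lambda hit: hit[name] >= minimum


def field_equals(name: str, value: Any) -> Predicate:
    return lambda hit: hit[name] == value


# A "saved" filter: full subject coverage AND (authority is US OR EP).
saved_filter = all_of(
    field_at_least("subject_pct_coverage", 100),
    any_of(field_equals("authority", "US"), field_equals("authority", "EP")),
)

# Made-up hits, for illustration only.
hits = [
    {"id": "hit-1", "subject_pct_coverage": 100, "authority": "US"},
    {"id": "hit-2", "subject_pct_coverage": 80, "authority": "EP"},
    {"id": "hit-3", "subject_pct_coverage": 100, "authority": "JP"},
    {"id": "hit-4", "subject_pct_coverage": 100, "authority": "EP"},
]
kept = [hit["id"] for hit in hits if saved_filter(hit)]
print(kept)
```

Because each combinator returns another predicate, a saved filter can itself be nested inside a larger one.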
Alerts: See what’s new
Supplementary Slides
Please contact stephen.allen@aptean.com with any questions
Alignment Types
LOCAL ALIGNMENT
Part of the Query matches part of the Subject. BLAST, FASTA, and Smith-Waterman.
GLOBAL ALIGNMENT
All of the Query matches all of the Subject. Needleman-Wunsch and algorithms like it.
BEST FIT ALIGNMENT
All of the Query is fitted into the Subject. GenePAST. Ideal for patent sequence searching.
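To make "all of the Query is fitted into the Subject" concrete, here is a deliberately simplified sketch: it slides the whole query along the subject without gaps and keeps the placement with the most matching residues. It is not the GenePAST algorithm, which the slides do not describe; the function name is hypothetical, and the example subject is assembled from the CDR strings in the motif slide plus made-up filler.

```python
def best_fit_ungapped(query: str, subject: str):
    """Return (offset, matches) for the best gap-free placement of the
    entire query inside the subject. Simplified illustration only."""
    if len(query) > len(subject):
        raise ValueError("query must not be longer than subject")
    best_offset, best_matches = 0, -1
    for offset in range(len(subject) - len(query) + 1):
        window = subject[offset:offset + len(query)]
        matches = sum(q == s for q, s in zip(query, window))
        if matches > best_matches:  # strict '>' keeps the first best placement
            best_offset, best_matches = offset, matches
    return best_offset, best_matches


# Hypothetical subject: three CDR strings joined by "XX" filler.
subject = "RASQGISSWLA" + "XX" + "GASNLES" + "XX" + "QQANSFPWT"

print(best_fit_ungapped("GASNLES", subject))  # exact CDR
print(best_fit_ungapped("GASNLET", subject))  # one substitution
```

Every query residue is scored in each placement, so a variant with one substitution is still placed at the same position with one fewer match. A local alignment, by contrast, is free to report only the best-matching portion of a short query.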
Query/Subject % Identity Definition
Alignment % identity, corrected for the ratio of the alignment length to either the query or subject length.
Subject % ID   Query % ID   Subject % Coverage   Query % Coverage
100%           100%         100%                 100%
100%           50%          100%                 50%
50%            100%         50%                  100%
50%            50%          50%                  50%
95%            95%          100%                 100%   (5 mismatches)
This example assumes 100% alignment identity except in the last row (5 mismatches); the longer lines in each alignment diagram are 100 residues, the shorter lines are 50 residues.
• By filtering for 100% subject coverage you can capture CDR to CDR matches
• With variability % ID can drop, so % coverage is the preferable filter
• This is a key feature to understand: these filters are very powerful
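The query/subject percentages above can be reproduced with a short calculation. This sketch assumes a gap-free alignment, so that "alignment % identity corrected for the ratio of the alignment length to the query (or subject) length" reduces to matches divided by that sequence's length. The function and key names are hypothetical.

```python
def corrected_metrics(matches, alignment_length, query_length, subject_length):
    """Query/subject % identity and % coverage for a gap-free alignment.

    (matches / alignment_length) * (alignment_length / length)
    simplifies to matches / length.
    """
    return {
        "subject_pct_id": 100.0 * matches / subject_length,
        "query_pct_id": 100.0 * matches / query_length,
        "subject_pct_coverage": 100.0 * alignment_length / subject_length,
        "query_pct_coverage": 100.0 * alignment_length / query_length,
    }


# 100-residue query fully containing a 50-residue subject, all aligned residues identical.
print(corrected_metrics(matches=50, alignment_length=50,
                        query_length=100, subject_length=50))

# Two 100-residue sequences aligned end to end with 5 mismatches.
print(corrected_metrics(matches=95, alignment_length=100,
                        query_length=100, subject_length=100))
```

In the second case coverage stays at 100% while identity drops to 95%, which is why coverage is the steadier filter when variability is expected.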
Key Fields
Legal Status
Extended Legal Status And National Phase Legal Status
US PAIR Legal Status
• PAIR legal status: updates from US PAIR occur monthly
Live Links to Reports, Alignments
• Links on the analysis page carry over to Excel reports
• Simple, easy sharing among groups
Short sequences need GenePAST or Motif searches (BLAST may miss patents)
• For short Query sequences, or for easy analysis of variants, GenePAST is the preferred algorithm.
MOTIF on full length – Direct Strike
The long sequence gives hits comprising all three CDRs in the specific order provided. “.*” represents “any number of unspecified residues, including zero”. Motif searches require a 100% match in “defined” residues.
>37-motif
DLSIH.*GFDPQDGETIYAQKFQG.*GSSSSWFDP
>9-motif
RASQGISSWLA.*GASNLES.*QQANSFPWT
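The two motifs above happen to be valid regular expressions, so their behavior can be explored with Python's `re` module. This is a sketch for intuition only: it assumes ".*" behaves as it does in a standard regex and says nothing about how the GenomeQuest motif engine is implemented. The subject sequences are made up from the CDR strings plus arbitrary filler.

```python
import re
from typing import List

MOTIFS = {
    "37-motif": "DLSIH.*GFDPQDGETIYAQKFQG.*GSSSSWFDP",
    "9-motif": "RASQGISSWLA.*GASNLES.*QQANSFPWT",
}


def motif_hits(subject: str) -> List[str]:
    """Names of the motifs that occur anywhere in the subject sequence."""
    return [name for name, pattern in MOTIFS.items()
            if re.search(pattern, subject)]


# Hypothetical subjects (filler residues are arbitrary).
in_order = "AAAA" + "RASQGISSWLA" + "WYQQK" + "GASNLES" + "GVPSRF" + "QQANSFPWT" + "FGQ"
adjacent = "RASQGISSWLA" + "GASNLES" + "QQANSFPWT"        # zero residues between CDRs
out_of_order = "GASNLES" + "AA" + "RASQGISSWLA" + "AA" + "QQANSFPWT"
one_substitution = in_order.replace("GASNLES", "GASNLET")

for subject in (in_order, adjacent, out_of_order, one_substitution):
    print(motif_hits(subject))
```

The out-of-order and one-substitution subjects return no hits, which matches the slide's points that the CDRs must appear in the order given and that defined residues must match exactly.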
Unique Family Sequence (UFS)
• Merge all identical sequences within a family
• Based on strict criteria: identical sequence, patent family, sequence length
• Examine a sequence’s status across authorities
• Group By UFS can replace group by family for finer resolution of unique hits
• UFS Identifier = MD5Sum + Sequence Length + Family ID
• UFS IDs can be transient
Normalized Sequence/Patent Family
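The UFS identifier recipe (MD5Sum + Sequence Length + Family ID) can be sketched as follows. The slide gives only the three ingredients, so the hyphen-joined layout, the uppercase normalization, and the family IDs here are assumptions made for illustration.

```python
import hashlib


def ufs_identifier(sequence: str, family_id: str) -> str:
    """Combine MD5 of the sequence, its length, and the patent family ID.

    The "-" separator and uppercase normalization are assumptions; the
    slides do not specify the actual identifier format.
    """
    normalized = sequence.strip().upper()
    digest = hashlib.md5(normalized.encode("ascii")).hexdigest()
    return f"{digest}-{len(normalized)}-{family_id}"


# Hypothetical family IDs.
same_family_a = ufs_identifier("GASNLES", "FAM123")
same_family_b = ufs_identifier("gasnles", "FAM123")
other_family = ufs_identifier("GASNLES", "FAM999")

print(same_family_a == same_family_b)  # identical sequence, same family
print(same_family_a == other_family)   # identical sequence, different family
```

Identical sequences in the same family collapse to one identifier, while the same sequence in another family stays distinct, which is what lets grouping by UFS resolve hits more finely than grouping by family alone.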
Methodology – Searching CDRs
All 3 CDRs (or primer/amplicon sets) in subject or patent
MOTIF – exact match
GenePAST – variations
By requiring a group size equal to three in the post-search grouping, we show patents that contain all three CDRs
• FASTA sequences for your search allow multiple queries at once
• GenePAST will allow you to view patent hits with variability in the CDRs
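The "group size equal to three" step can be sketched as grouping hits by patent and keeping the patents that were hit by every query. The function name, the (patent ID, query ID) hit layout, and the IDs are hypothetical. The sketch counts distinct queries per patent, which is an assumption about how the group size should be read.

```python
from collections import defaultdict


def patents_with_all_queries(hits, required_queries):
    """Return patent IDs whose hits cover every required query.

    `hits` is an iterable of (patent_id, query_id) pairs (assumed layout).
    """
    queries_by_patent = defaultdict(set)
    for patent_id, query_id in hits:
        queries_by_patent[patent_id].add(query_id)
    return sorted(
        patent_id
        for patent_id, queries in queries_by_patent.items()
        if required_queries <= queries  # subset test: all queries present
    )


# Made-up hits from a three-CDR multi-query search.
hits = [
    ("US-A", "CDR1"), ("US-A", "CDR2"), ("US-A", "CDR3"),
    ("EP-B", "CDR1"), ("EP-B", "CDR1"), ("EP-B", "CDR3"),
    ("WO-C", "CDR2"),
]
print(patents_with_all_queries(hits, {"CDR1", "CDR2", "CDR3"}))
```

Using a set per patent means a patent that matches the same CDR several times (EP-B here) is not mistaken for one that contains all three.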
Conservative Substitutions
Subjects comprising all 3 CDRs
Up to 1 substitution
Subject and Query Gaps
• Gaps in CDRs and primers can be ignored using the Query/Subject gap filter
• Variations, i.e. the number of differences, can be adjusted without calculating % identity
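A quick calculation shows why counting differences can be more convenient than a % identity cutoff for short sequences. The sketch compares two CDR strings from the motif slide against hypothetical one-substitution variants; the function names are illustrative and it handles equal-length, gap-free comparisons only.

```python
def count_differences(query: str, segment: str) -> int:
    """Number of mismatched positions between two equal-length sequences."""
    if len(query) != len(segment):
        raise ValueError("sequences must have equal length (no gaps handled)")
    return sum(q != s for q, s in zip(query, segment))


def percent_identity(query: str, segment: str) -> float:
    return 100.0 * (len(query) - count_differences(query, segment)) / len(query)


short_cdr, short_variant = "GASNLES", "GASNLET"                    # 7 residues
long_cdr, long_variant = "GFDPQDGETIYAQKFQG", "GFDPQDGETIYAQKFQA"  # 17 residues

for query, variant in ((short_cdr, short_variant), (long_cdr, long_variant)):
    print(len(query),
          count_differences(query, variant),
          round(percent_identity(query, variant), 1))
```

One substitution is about 85.7% identity for the 7-residue CDR but about 94.1% for the 17-residue one, so a single "up to 1 difference" setting expresses the intent directly where a single % identity threshold would not.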
Database Selection
Tree Structure and Virtual Databases
• Tree structure allows easy database search setup
• Multiple virtual databases can be chosen
• Virtual databases can be shared among teams
• Save your own databases from keyword or IP searches, and search within results
Patent Statistics Report
• For multiple queries, quickly display patents that contain all or a subset of the queries
GenomeQuest 101
