Interview with
Carol Scott
PhD, Bioengineering
Bioinformatics Scientist and Curator
Conserved Domain Database
A project of the U.S. National Library of Medicine at the
National Institutes of Health,
National Center for Biotechnology Information
Katie Rapp
LBSC 690
March 1, 2011
Protein is Everything!
 Every living thing is made up of unique, identifiable proteins
 Examples: human hemoglobin, insulin, and proteins in
fungi, bacteria, and plants
 Proteins are made of different combinations of amino acids
 20 naturally occurring amino acids; they are like beads in a necklace,
and their order determines the type of protein
 Proteins do the work inside cells
 Examples: Hemoglobin carries oxygen in the blood, insulin regulates
glucose metabolism
Problems with Proteins
 Proteins do the work inside cells, so when there are
problems, such as diseases, they are often caused by a
defective protein
 Example: Sickle Cell Anemia (one change in one amino acid in
hemoglobin and you go from healthy to ill)
 Medical researchers study proteins at the molecular level in
order to find cures for diseases
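
The sickle cell example can be illustrated with a short sketch. It assumes the widely documented substitution of glutamic acid (E) by valine (V) at position 6 of the mature beta-globin chain, and it uses only the first eight residues in one-letter code. The function name `find_substitutions` is hypothetical and chosen for illustration only.

```python
# First eight residues of the mature human beta-globin chain (one-letter codes).
NORMAL_BETA = "VHLTPEEK"
# Same stretch with the sickle cell substitution (E -> V at position 6).
SICKLE_BETA = "VHLTPVEK"


def find_substitutions(reference: str, variant: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference residue, variant residue) per mismatch."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have the same length")
    return [
        (i, ref, var)
        for i, (ref, var) in enumerate(zip(reference, variant), start=1)
        if ref != var
    ]


changes = find_substitutions(NORMAL_BETA, SICKLE_BETA)
for position, ref, var in changes:
    print(f"position {position}: {ref} -> {var}")  # prints "position 6: E -> V"
```

A single mismatch in the whole stretch is the "one change in one amino acid" the slide refers to.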
Conserved Domains –
Motivation behind the database
 The amino acid chains that make up proteins are coiled and
folded. Blocks of coiled and folded amino acids that recur in
different proteins are referred to as “conserved domains.”
 Conserved domains have specific functions and 3-dimensional
shapes
 It is useful for researchers to be able to compare related
conserved domains in different proteins, but there was no
real way to do this in the past
Conserved Domain Database -
Development
 This database was developed to meet the needs of
researchers
 Project began in 2001; Carol Scott has worked on it since
2002
 Curators worked with software developers to produce a highly
interactive database
Conserved Domain Database
Curators
 Carol Scott and other curators create the data in the
database from lists of amino acid sequences found in other
databases
 They take amino acid sequences from millions of proteins
and link them based on structural and functional similarities
 They work with programmers to create the interface and
visual output of the database
 Curators also find and provide links to information about each
protein: journal articles, other resources, and related proteins
Conserved Domain Database -
Challenges
 Not all amino acid sequence information is reliable – curators
must pick and choose where they get the basic data to put
into their database
 The process of creating the comparisons in the CDD is very
complex and time-consuming
 Software exists to help find these comparisons, but much
work must be done manually based on knowledge of the
chemical attributes of the amino acids
 The project is currently facing budgetary cutbacks that
affect staffing and perhaps the future of the database
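
As a rough illustration of what "knowledge of the chemical attributes of the amino acids" can mean in practice, the sketch below groups the 20 standard amino acids into four broad chemical classes and checks whether a substitution stays within one class. The grouping is one common simplification (textbooks differ on borderline residues), and the names `CHEMICAL_CLASS` and `is_conservative` are hypothetical. It is not the method the CDD curators use.

```python
# One simplified grouping of the 20 standard amino acids by side chain.
# Assumption: textbooks place borderline residues (e.g. G, C, Y, H) differently.
CHEMICAL_CLASS = {
    **dict.fromkeys("AVLIMFWP", "nonpolar"),
    **dict.fromkeys("GSTCYNQ", "polar"),
    **dict.fromkeys("DE", "acidic"),
    **dict.fromkeys("KRH", "basic"),
}


def is_conservative(residue_a: str, residue_b: str) -> bool:
    """True when both residues fall in the same chemical class."""
    return CHEMICAL_CLASS[residue_a] == CHEMICAL_CLASS[residue_b]


print(is_conservative("L", "I"))  # leucine vs isoleucine: same class
print(is_conservative("E", "V"))  # glutamic acid vs valine: different classes
```

A lookup like this captures only the crudest chemical similarity, which is one reason manual judgment remains necessary.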
Conserved Domain Database
Results
 Enables scientists to search on specific amino acid chains of
interest to them
 Uses include genetic studies, mutation studies, and studies of the
size, shape and function of proteins
 They can find and compare similar chemical alignments in
different proteins
 These alignments can provide insight into the functions of
different parts of protein molecules
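
To show how an alignment can hint at function, the sketch below scores each column of a small alignment by how many sequences share the most common residue. The three fragments are made up for illustration and do not come from the CDD; `column_conservation` is a hypothetical helper, and real conservation scoring is considerably more sophisticated.

```python
from collections import Counter

# Made-up, pre-aligned fragments from three hypothetical proteins ("-" marks a gap).
ALIGNMENT = [
    "GKSTL-AV",
    "GKSSLQAI",
    "GKTTLQAV",
]


def column_conservation(alignment: list[str]) -> list[float]:
    """Fraction of sequences sharing the most common residue in each column."""
    scores = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]  # gaps never count as matches
        top_count = Counter(residues).most_common(1)[0][1] if residues else 0
        scores.append(top_count / len(column))
    return scores


scores = column_conservation(ALIGNMENT)
# 1-based positions where every sequence carries the same residue.
conserved = [i for i, s in enumerate(scores, start=1) if s == 1.0]
print(conserved)
```

Positions that stay identical across otherwise different proteins are candidates for residues that matter to the domain's function.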
Conserved Domain Database
Output – 3-Dimensional Structures
Conserved Domain Database
Output - Superfamilies
Conserved Domain Database
Users – Who Are They?
 The database is freely accessible to anyone over the internet
 It is used frequently by researchers around the world
 Users include anyone studying proteins – everyone from high
school and college students up to very high level researchers
at NIH, pharmaceutical companies, genetic researchers,
bioengineering firms, etc.
 The database can be used to spur further research into areas
where defects in proteins could be repaired using genetic
engineering
Conserved Domain Database
 Questions?
