2. Drug Research: Common Challenges
EXPENSIVE! - Drug research is expensive. A new drug takes
around 15 years and $1.2 billion from concept to market.
LOW SUCCESS RATE! - The success rate is extremely low,
with most candidate molecules abandoned midway. Two
out of three submissions to regulatory authorities
fail.
LOW GROWTH RATE! - The typical growth rate has fallen
from 13% to 5%. Resources are limited.
DUPLICATION OF EFFORT! - Companies often end up
duplicating research effort as they fail to determine if similar
research is taking place somewhere else.
OMICS EMIT UNMANAGEABLE DATA! - Newer technologies
operate at the gene and cell level. The resulting data is
voluminous, arrives in various formats and piles up at a blistering
pace. Drug research faces challenges in leveraging these
technologies in a timely, effective and efficient
manner.
Accommodator Consultancy Services
Lucknow
3. Our Offerings in IT in Life Sciences
TEXT MINING SOLUTIONS
DATA WAREHOUSE SOLUTIONS
DATA MINING SOLUTIONS
DATABASE DEVELOPMENT SOLUTIONS
BIG DATA ANALYTICS
CANCER SOLUTIONS
4. TEXT MINING SOLUTIONS
Philosophy – Researchers should be able to find new information in the
various scientific reports and papers published around the world, absorb
that information into their ongoing work, and give direction to that work
by gathering and analyzing trends.
Areas Covered:
Patents
Research papers
Publications
Specialized websites such as PubChem and PubMed, covering millions of articles
Social media sites such as Facebook, Twitter, Instagram, blogging forums, etc.
Internal collection of documents and information.
Deliverable: A set of programs that run automatically and prepare
reports and documents with relevant summarized and detailed information
downloaded from the above-mentioned sources, based on input keywords and
events.
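A minimal sketch of the keyword-driven reporting idea described in the Deliverable, assuming the documents have already been downloaded into memory. The `Document` fields, the function names and the sample data are hypothetical and only illustrate the flow; a real pipeline would add source-specific download and parsing steps.

```python
from dataclasses import dataclass


@dataclass
class Document:
    source: str  # illustrative label, e.g. "patent", "paper", "internal"
    title: str
    text: str


def keyword_hits(doc: Document, keywords: list[str]) -> int:
    """Count case-insensitive keyword occurrences in title and text."""
    haystack = f"{doc.title} {doc.text}".lower()
    return sum(haystack.count(kw.lower()) for kw in keywords)


def first_matching_sentence(doc: Document, keywords: list[str]) -> str:
    """Return the first sentence that mentions any keyword (a crude summary)."""
    for sentence in doc.text.split("."):
        if any(kw.lower() in sentence.lower() for kw in keywords):
            return sentence.strip() + "."
    return ""


def build_report(docs: list[Document], keywords: list[str]) -> list[str]:
    """Keep documents with at least one hit, ranked by hit count (stable)."""
    scored = [(keyword_hits(d, keywords), d) for d in docs]
    relevant = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
    return [
        f"[{d.source}] {d.title} ({hits} hits): "
        f"{first_matching_sentence(d, keywords)}"
        for hits, d in relevant
    ]


# Made-up sample documents, for illustration only.
docs = [
    Document("patent", "Kinase inhibitor scaffold",
             "A novel kinase inhibitor is claimed. Synthesis is described."),
    Document("paper", "Gut flora survey",
             "We survey gut microbiota. No drugs are discussed."),
    Document("internal", "Assay notes",
             "The kinase assay was repeated. Kinase activity fell."),
]
for line in build_report(docs, ["kinase"]):
    print(line)
```

Simple substring counting is used here only to keep the sketch short; stemming, synonyms and entity recognition would be natural next steps.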
5. WHY PATENT MINING
Patent disclosures of novel bioactive chemical structures related to drug
discovery exceed those in journals by at least five-fold.
Patents encompass academic as well as commercial global medicinal chemistry
output.
They cover targets, assays, mechanisms of action, disease descriptions and in vivo
data.
~70% of data is initially patent-only; some is never disclosed elsewhere.
They include synthetic descriptions and other useful enabling information.
They precede journal or meeting reports by ~1.5 to 5 years.
They can be complementary to papers (e.g. a larger SAR matrix).
They intersect with papers at chemistry, target, disease, author and citation levels.
IP exploitable for Neglected Tropical Disease research is becoming "open".
7. DATA MINING SOLUTIONS
Definition - Data Mining is an interdisciplinary subfield of computer science that
discovers patterns in large data sets involving methods at the intersection of
artificial intelligence, machine learning, statistics, and database systems.
Philosophy – The overall goal of the data mining process is to extract
information from a data set and transform it into an understandable structure for
further use.
Areas Covered:
Virtual HTS and HCS Data
Predictive Toxicology
Life sciences and health related issues trending on social media
FDA datasets
Microbiomes
Chemogenomics
Predicting and preventing diseases through gene analysis
Both big and small molecules
Deliverable: Converting raw data into actionable information after detecting
patterns and trends, and applying a number of verified algorithms.
Benefits: Improves prediction in early-stage drug safety testing. Data mining
(as opposed to conventional statistical analysis) can uncover patterns and
relationships in large data volumes that are completely unexpected. These patterns
can be used to extrapolate and predict.
9. DATA MINING CASE STUDIES
@ Roche:
Used DM techniques to build models for diagnosing the diabetes high-risk group.
The models analyzed existing sample sets (including Diabetes II patients and healthy
subjects) to identify the factors (age, sex, race, height, weight, BMI value, ADA value)
that may cause Diabetes II, and to predict the probability of a subject developing
Diabetes II in the next seven and a half years, so that preventive measures can be taken.
Traditional statistical methods were not as accurate as DM methods.
@ GSK:
Data Mining Human Gut Microbiota for therapeutic targets. This could lead to a
systems-level understanding of the global physiology of the host–microbiota
superorganism in health and disease. Such knowledge will provide a platform for the
identification and development of new therapeutic strategies for chronic diseases
possibly involving microbial as well as human-host targets that improve upon existing
probiotics, prebiotics or antibiotics.
Used text analytics to analyze public discussion boards on BabyCenter.com and
WhattoExpect.com, to learn what factors motivate parents to either go ahead with or delay
vaccinating their children against diseases like measles and mumps.
Data mining was used to identify a previously unrecognized drug interaction (pravastatin
and paroxetine) that suggested a manifold rise in blood glucose level. However, such an
exercise needs careful formulation of the problem statement by experts to be
useful.
10. DATA MINING CASE STUDIES
@ Bayer:
GI adverse effects of short-term aspirin use. Meta-analysis comparing AEs with
similar drugs, for marketing and drug improvement.
@ Pfizer:
Uses mining to determine if certain AEs are being reported with greater frequency
than expected.
Uses large-scale semantic Web-based data mining and network methods to seek to
uncover previously undiscovered historical links between chemical compounds,
drugs, biological pathways, targets, genes and diseases.
By using big data to bring together genomic data, clinical trials and EMR data,
Pfizer was able to develop the precision drug 'Xalkori', which proves very effective for
the roughly 5% of cancer patients who carry a mutation in their ALK gene.
Data mining identified this subsection of the population, which had a
healthy lifestyle yet was affected by cancer.
It funded a study that would use genomic data mining to identify antigens in NTS
(non-typhoidal salmonella) that may be used as targets for vaccine development.
@ Johnson & Johnson:
Has built an open source data management system called tranSMART. The idea is
to combine genomic data sets, from internal and external sources, using the
platform's data standards and processing capabilities. This facilitates data mining
which provides immense opportunities.
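The Pfizer item above mentions checking whether certain AEs are reported more often than expected. One common way to quantify this is a disproportionality measure such as the proportional reporting ratio (PRR). The sketch below is an illustration under that assumption, not a description of any company's actual method; the function name and all counts are made up.

```python
def proportional_reporting_ratio(a: int, b: int, c: int, d: int) -> float:
    """PRR from a 2x2 table of spontaneous adverse-event reports.

    a: reports of the event of interest for the drug of interest
    b: reports of all other events for the drug of interest
    c: reports of the event of interest for all other drugs
    d: reports of all other events for all other drugs
    """
    if a + b == 0 or c + d == 0 or c == 0:
        raise ValueError("PRR is undefined for an empty row or zero comparator events")
    drug_rate = a / (a + b)        # share of the drug's reports that mention the event
    comparator_rate = c / (c + d)  # same share across all other drugs
    return drug_rate / comparator_rate


# Made-up counts, for illustration only.
prr = proportional_reporting_ratio(a=20, b=980, c=50, d=9950)
print(f"PRR = {prr:.2f}")
```

A PRR well above 1 only flags a signal for expert review; it does not establish that the drug causes the event.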
11. DATA MINING CASE STUDIES
@ Novartis:
In HTS, used the Ontology Based Pattern Identification (OPI) algorithm to predict
patterns, through which they were able to find 1,500 scaffold families with significant
structure-HTS activity profile relationships.
@ AstraZeneca:
Uses data-mining tools to identify plausible preclinical gastrointestinal (GI) effects
that may be associated with nausea and that could be of potential use in its
prediction. A total of 86 marketed drugs were used in this analysis, and the main
outcome was a confirmation that nausogenic and non-nausogenic drugs can be
clearly separated based on their preclinical GI observations.
12. CHEMOGENOMICS DATA MINING
Chemogenomics is rapidly emerging as a way of helping discover new disease therapies and
uncovering new uses for existing drugs.
There are large structure activity databases set up by pharmaceutical companies and
commercial vendors. These databases can be mined to derive insights into common properties or
structural features among ligands linked to common features of the receptors to which they bind.
These insights can then be used for the rational compilation of screening sets or the knowledge-based
synthesis of chemical libraries to accelerate lead finding.
Can be used to reposition drugs and find new applications for existing
drugs/molecules/compounds.
Four Canadian government research funding agencies will spend around US$6.7 million to
create a cloud computing facility and data mining tools that will enable researchers to access and
use data from the International Cancer Genome Consortium.
DM could lead to a systems-level understanding of the global physiology of the host–microbiota
superorganism in health and disease. Such knowledge will provide a platform for the identification
and development of new therapeutic strategies for chronic diseases possibly involving microbial as
well as human-host targets that improve upon existing probiotics, prebiotics or antibiotics.
We can collect and organize known GPCR and non-GPCR ligands, and mining models can be
trained on their properties. New compounds can then be classified automatically as ligand or
non-ligand.
Design and knowledge-based synthesis of chemical libraries targeting a subfamily of purinergic
GPCRs. Chemical scaffolds can be synthesized.
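A minimal sketch of the ligand / non-ligand classification idea above, assuming each compound is represented by a set of structural feature IDs (a stand-in for a real chemical fingerprint). It uses Tanimoto similarity with a k-nearest-neighbour vote, which is one simple option among many. The feature sets, labels and function names are hypothetical.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def classify(query: frozenset, training: list, k: int = 3) -> str:
    """Label a compound by majority vote among its k most similar neighbours.

    training is a list of (feature_set, label) pairs, where label is
    "ligand" or "non-ligand".
    """
    ranked = sorted(training, key=lambda item: tanimoto(query, item[0]), reverse=True)
    top = ranked[:k]
    ligand_votes = sum(1 for _, label in top if label == "ligand")
    return "ligand" if ligand_votes * 2 > len(top) else "non-ligand"


# Made-up training set: feature IDs are arbitrary integers.
training = [
    (frozenset({1, 2, 3, 4}), "ligand"),
    (frozenset({1, 2, 3, 5}), "ligand"),
    (frozenset({2, 3, 4, 6}), "ligand"),
    (frozenset({7, 8, 9}), "non-ligand"),
    (frozenset({8, 9, 10}), "non-ligand"),
    (frozenset({7, 9, 11}), "non-ligand"),
]
print(classify(frozenset({1, 2, 3, 6}), training))
print(classify(frozenset({7, 8, 10}), training))
```

In practice the feature sets would come from a cheminformatics toolkit and the model would be validated on held-out compounds before use.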
13. DATA WAREHOUSE SOLUTIONS
Definition – Central repository created by integrating data from disparate
sources, with past and current data for both operational and strategic decision
making and senior management reporting such as annual comparisons of budget
per scientist.
Goal – To give users appropriate access to a homogenized, comprehensive
and consistent view of the organization, supporting forecasting and
decision-making processes at the enterprise level.
Areas Covered:
Bioinformatics research
Finance
HR
Marketing
Disease Management, etc.
Deliverable: Central repository of useful and actionable data integrated from
multiple departments and sources and available to end users for operational and
strategic decision making in an efficient and effective manner.
Benefits: Better use of internal resources. Reduction in the critical time path for
statistical analysis. Standard exchange of data with CROs, partners and
regulatory agencies. Cross-trial analysis and leveraged use of historical data.
Globalization and knowledge sharing. Facilitates open source drug development.
Compliance with regulatory authorities.
15. DWH @ NOVARTIS
Prominent DWH – FDA’s Janus, Johnson and Johnson, Pfizer,
Novartis’ Avalon, GSK and Roche
DWH Use Cases:
16. DWH USE CASES
Novartis:
Tell me everything about a given structure.
Collect comprehensive data of corporate interest in a single place.
Data grouped by chemical structure.
Standardized data dictionary to describe data.
Chemical structure conventions are unified.
Computed descriptors would be available.
Given a substructure, give me useful calculated descriptors.
Assays, physical properties and calculated descriptors are represented uniformly.
Will support changing the row model between batch, compound and bioactive.
Find all compounds in stock with some publicly known activity.
Integrate structured in-house data with external data.
Set the row model by active substance.
Predefined task-based query to automate this kind of query.
FDA Janus:
Janus creates an integrated data platform for most commercial tools for review, analysis and reporting.
It reduces overall cost of information gathering and submissions, development process as well as review and analysis
of information.
It provides a common data model that is based on the SDTM standard to represent four classes of clinical data
submitted to regulatory agencies: tabulation datasets, patient profiles, listings, etc.
It provides central access to standardized data, and provides common data views across collaborative partners.
It supports cross-trial analyses for data mining, helps detect clinical trends and address clinical hypotheses, and
allows more advanced, robust analysis. This makes it possible to contrast and compare data from multiple clinical
trials to help improve efficacy and safety.
It facilitates a more efficient review process and ability to locate and query data more easily through automated
processes and data standards.
It provides a potentially broader data view for all clinical trials with proper security, de-identified patient data, and
proper agreements in place to share data.
17. ERP v/s DWH
People often confuse ERP and DWH. They
differ as shown below:
ERP                                      | DWH
Detailed                                 | Summarized
Facilitates data entry & storage         | Facilitates quick analysis
Used by Operations                       | Used by Strategists
End users need to be trained             | Generalist end users
No ad hoc reporting                      | Facilitates ad hoc reports
ERP for biochemical data less available  | Easily integrates and stores biochemical data
18. DATABASE SOLUTIONS
Drug discovery analytics is traditionally performed on
Relational Database Management Systems (RDBMS). As new
discoveries arrive, however, a traditional RDBMS is no longer
always the optimal choice, since discoveries require newer technologies.
Commercial RDBMS products have kept pace by introducing
newer features (such as columnstore indexes).
We design the RDBMS to consolidate data from
disparate sources to facilitate analytics. We also convert
existing DBMS systems to leverage newly introduced
features.
We also undertake performance enhancements,
provide additional security and carry out other maintenance
tasks.
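A minimal sketch of why a column-oriented layout, the idea behind features such as columnstore indexes, suits analytics. It only illustrates the storage idea with plain Python structures and does not show how any particular RDBMS implements it. The table contents and function names are hypothetical.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"compound_id": "C1", "assay": "A", "ic50_nm": 12.0},
    {"compound_id": "C2", "assay": "A", "ic50_nm": 8.0},
    {"compound_id": "C3", "assay": "B", "ic50_nm": 40.0},
    {"compound_id": "C4", "assay": "B", "ic50_nm": 20.0},
]


def to_columns(records: list) -> dict:
    """Pivot row records into one list per column."""
    columns: dict = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def mean_from_rows(records: list, field: str) -> float:
    """Row layout: every record is visited even though one field is needed."""
    return sum(record[field] for record in records) / len(records)


def mean_from_columns(columns: dict, field: str) -> float:
    """Column layout: only the single column of interest is read."""
    values = columns[field]
    return sum(values) / len(values)


columns = to_columns(rows)
print(mean_from_rows(rows, "ic50_nm"))
print(mean_from_columns(columns, "ic50_nm"))
```

Both functions return the same average; the difference is how much data has to be read, which is what matters when a table has many columns and millions of rows.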
19. BIG DATA SOLUTIONS
Definition – A collection of data sets so large and complex that it becomes difficult
to process using on-hand database management tools or traditional data processing
applications. The challenges include capture, curation, storage, search, sharing,
transfer, analysis and visualization. The trend to larger data sets is due to the
additional information derivable from analysis of a single large set of related data, as
compared to separate smaller sets with the same total amount of data, allowing
correlations to be found to "spot business trends, determine quality of research,
prevent diseases, link legal citations, combat crime, and determine real-time roadway
traffic conditions."
Philosophy – To handle the huge data generated by omics, regular computers
are networked and set up so that data is protected against loss and the
individual processors work in synergy to solve bigger problems. Companies have
started offering cloud storage for big data, including publicly available data sets.
Areas Covered:
Finding cause of diseases
Repositioning of drugs
Prescription of more effective drugs and procedures.
Deliverable - We collect information about possible sources of data for the related
research area. We analyze the data for volume, variety and velocity. We build a small
pilot prototype of the big data setup using source big data on the cloud. We set up
programs to collect and process the data and then try to test the hypothesis.
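The Philosophy item above describes splitting one large problem across many ordinary machines. A minimal single-machine sketch of that map-and-reduce pattern follows, assuming the task is counting short subsequences (k-mers) in sequence data that has been split into shards. The shard contents and function names are hypothetical, and on a real cluster (for example Hadoop) each map step would run on a different node.

```python
from collections import Counter
from functools import reduce


def map_shard(sequence: str, k: int) -> Counter:
    """Map step: count every k-mer inside one shard."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def merge_counts(left: Counter, right: Counter) -> Counter:
    """Reduce step: merge two partial counts."""
    return left + right


def count_kmers(shards: list, k: int) -> Counter:
    # The map calls run one after another here; on a cluster they would run in parallel.
    # Note: k-mers that span a shard boundary are not counted in this sketch.
    partial_counts = [map_shard(shard, k) for shard in shards]
    return reduce(merge_counts, partial_counts, Counter())


# Made-up sequence shards, for illustration only.
totals = count_kmers(["ACGTAC", "GTACGT"], k=2)
print(totals.most_common())
```

Because each shard is processed independently, adding machines lets the same program handle proportionally more data.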
20. BIG DATA SOLUTIONS
Use Case 1: Researchers found that previously undetected mutations in a single
gene (called LMX1B) triggered focal segmental glomerulosclerosis (FSGS), a
disease that scars the kidneys’ filtering system. This was possible after genome data
was collected and compared for healthy and diseased individuals.
Use Case 2: A big data approach has already predicted the efficacy of drug
repurposing for treating colitis (a form of inflammatory bowel disease), small-cell
lung cancer and other conditions, according to Scott Saywell, vice president,
corporate development, NuMedii.
Use Case 3: For patients, the use of big data analytics in drug development results
in less trial and error when physicians prescribe drugs. This tighter targeting of
drugs to disease also results in fewer side effects.
According to a new draft policy by the Dept. of Biotechnology, Govt. of India, genome-
based prescription and treatment will be a top priority in the next few years.
The draft policy envisages shifting half of the hospitals currently engaged in
treating human diseases toward predicting and preventing diseases using
genomic tools.
It also aims to provide all available genetic screening tests to the general public at
affordable prices.
Big Data has made genome data processing and analysis possible: the genome
(and other omics) data for just one individual would form a stack taller than an
80-story building if printed on paper.
21. Cancer Solutions
• We collaborate with CDRI and ITRI to provide cancer patient data
for further research.
• We do research on the National Cancer Data Repository, provide consultancy
on cancer drugs and assist in cancer research, with the goal of
personalized cancer solutions.
• Any other assistance you would need on this subject.
22. Value that Accommodator Consultancy would add
We have vast experience in data analysis, text and data mining,
and technologies compatible with biochemical
substances, having delivered successful projects throughout the
world. We will take the IT and statistics worries away from you
so you can concentrate on pure research.
We have the skills to work with large volumes of data
and Big Data (Hadoop) source systems.
We have vast experience in developing, using and configuring different
kinds of bioinformatics software.
Our team consists of a chemist, a data warehouse and data mining
professional, and a senior cancer surgeon.
We firmly believe in providing great value in our service/product
offering.
23. Questions/Comments?
In the interest of keeping the material short, only a simple summary
has been provided. Please do not hesitate to ask questions or
request clarification for further details.
Our contact details:
Ankur Khanna: Director Technical
945 166 8432
Dr Vibhor Mahendru: Director Business Development
800 536 5132
THANK YOU