An overview of how we use Amazon Web Services at Mendeley. I go into more detail on generating PDF previews for the 13 TB of files behind our site, and on scaling Solr search to handle variable load on EC2.
Amazon Web Services at Mendeley
Dan Harvey, Data Architect
twitter: @email@example.com
Overview
• What do we do?
• System design
• AWS details
• Future plans
• Summary
Mendeley helps researchers work smarter
1) Install Mendeley Desktop
2) Manage your research papers
   • Automatic data extraction
   • External database integration
   • Automatic bibliography generation
   • Tagging and annotation
3) Mendeley aggregates research data in the cloud
By doing this, Mendeley makes science more collaborative and transparent
Mendeley in numbers
• 1 million users
• 130 million research articles
• 40 million unique
• 14 million unique files uploaded
• 13 TB in total
System Overview
[Diagram labels: Amazon Web Services (S3, EC2, EMR); Syncing; Browsing; Docs; Web Servers; Usage Logs; MySQL; Data Services; Map Reduce; HBase; HDFS]
File Storage
• Sync to and from clients
  – Backed onto S3
• How to render 13 TB of PDFs?
PDF Previews
• Elastic Beanstalk
• Java servlet
  – Load & render
  – Store into S3
• Quick to prototype
  – Fast iterations
  – No infrastructure to set up
  – Developers in control
  – No upfront cost in hardware
• No dependency on the rest of our infrastructure
Adapt to take advantage
• Improve delivery
  – CloudFront
  – Faster worldwide
• Re-working for cost saving
  – SQS
  – Spot instances
  – Render when it's cheapest!
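The "render when it's cheapest" idea can be sketched as a worker that leaves jobs queued until the spot price drops below a ceiling. This is a minimal illustration only, not the rendering service described in the slides: the in-memory `deque` stands in for an SQS queue, the price list stands in for polled spot prices, and `drain_when_cheap`, `max_price` and `batch_size` are hypothetical names chosen for this example.

```python
from collections import deque
from typing import Callable, Deque, List


def drain_when_cheap(
    queue: Deque[str],
    spot_prices: List[float],
    max_price: float,
    render: Callable[[str], None],
    batch_size: int = 2,
) -> List[str]:
    """Render queued jobs only in intervals where the spot price is acceptable.

    queue       -- pending job keys (stand-in for an SQS queue; an assumption)
    spot_prices -- one observed price per polling interval (made-up values)
    max_price   -- price ceiling above which no rendering happens
    render      -- callable doing the load -> render -> store work for one job
    batch_size  -- how many jobs to take per cheap interval
    """
    rendered: List[str] = []
    for price in spot_prices:
        if price > max_price:
            continue  # too expensive: jobs simply stay in the queue
        for _ in range(min(batch_size, len(queue))):
            job = queue.popleft()
            render(job)
            rendered.append(job)
    return rendered


if __name__ == "__main__":
    jobs = deque(["a.pdf", "b.pdf", "c.pdf"])
    done = drain_when_cheap(
        jobs, [0.30, 0.08, 0.25, 0.05], max_price=0.10, render=lambda key: None
    )
    print(done, list(jobs))
```

Because the queue decouples uploads from rendering, an expensive interval only delays previews; no work is lost.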
Article Search
• 40 million papers
• Gives 40 GB index in Solr
• Variable load
• Moved to EC2
  – Elastic Load Balancer
  – Auto-scale instances
[Figure: two-week load graph]
Solr Instance Layout
• Master
  – Single instance
  – Matched to indexing load
  – Backed onto EBS
• Slaves
  – HTTP sync to master
  – Pre-built AMI images
  – EC2 auto scaling
[Diagram: one Solr master feeding three Solr slaves behind an Elastic Load Balancer]
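The sizing decision behind scaling the slave tier can be sketched as a small calculation. This shows only the arithmetic, under assumed numbers; real EC2 auto scaling is driven by alarms and scaling policies rather than a function call. `desired_slaves`, `per_slave_qps` and the default bounds are hypothetical and do not come from the slides.

```python
import math


def desired_slaves(
    queries_per_sec: float,
    per_slave_qps: float,
    min_slaves: int = 1,
    max_slaves: int = 10,
) -> int:
    """Return how many Solr slaves are needed for the current query load.

    per_slave_qps is an assumed, measured capacity of one slave instance.
    The result is clamped so at least min_slaves stay up behind the load
    balancer and the group never grows past max_slaves.
    """
    if per_slave_qps <= 0:
        raise ValueError("per_slave_qps must be positive")
    needed = math.ceil(queries_per_sec / per_slave_qps)
    return max(min_slaves, min(max_slaves, needed))


if __name__ == "__main__":
    for load in (0, 120, 10_000):
        print(load, desired_slaves(load, per_slave_qps=50))
```

Only the slave count varies with query load; the single master is sized for indexing, as the slide notes.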
Desktop Client
• Client Downloads
  – From S3
  – Adding CloudFront
• Crash Reports
  – Stack traces into S3
  – Analytic reports on top
  – More focused bug fixing
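One way the "analytic reports on top" of stored stack traces could work is to group traces by their top frames and count the groups, so the most frequent crashes get fixed first. The sketch below assumes plain-text traces with one frame per line; the slides do not describe the actual report format or tooling, and `crash_signature` and `top_crashes` are hypothetical helpers.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def crash_signature(stack_trace: str, depth: int = 2) -> str:
    """Reduce a trace to its first `depth` non-empty frames."""
    frames = [line.strip() for line in stack_trace.splitlines() if line.strip()]
    return " | ".join(frames[:depth])


def top_crashes(traces: Iterable[str], n: int = 3) -> List[Tuple[str, int]]:
    """Count traces per signature and return the n most frequent."""
    return Counter(crash_signature(t) for t in traces).most_common(n)


if __name__ == "__main__":
    # Made-up traces for illustration only.
    sample = [
        "Foo::bar\nMain::run\nstart",
        "Foo::bar\nMain::run\nother",
        "Baz::qux\nMain::run",
    ]
    for signature, count in top_crashes(sample):
        print(count, signature)
```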
The future
• Aim to buy no more hardware
• More Java on Elastic Beanstalk
• SQS - replace queues
• EMR - log analysis
• SimpleDB & S3 for data stores
Problems Faced
• Accounting usage
  – Mix of users on account
  – Start early with this!
  – IAM helps
• Orchestration
  – CloudFormation
  – Elastic Beanstalk
  – Finding we need more
Summary
• Not all or nothing
• Focus on your problem, not "undifferentiated heavy lifting" - Werner Vogels
• Learn the building blocks provided
• Modular system design helps
Mendeley Binary Battle
• $10,001 prize + $1,000 AWS vouchers
• Collaboration with PLoS
• Prizes for the best use of the API
• Judging panel includes
  – Werner Vogels
  – Tim O'Reilly
We're hiring
http://mendeley.com/careers/ or chat to me after
• Lead Mobile Developer, iOS
• Web Developer, PHP/MySQL
• Software Engineer, Java