Tech Talk Live
Alfresco Performance Tuning – Part 1
Speaker Bio
Luis Cabaceira – Principal Consultant at Alfresco
Agenda
1 - General best practices on tuning
2 - Common mistakes
3 - Sizing, what to expect from a single server
4 – Solr tuning
5 – JVM tuning (Part 2)
6 – Caches (Part 2)
7 - Alfresco is running slow: where to start? (Part 2)
1 - General Best Practices on Tuning
Disable unused services and features
• Disable virtual file-systems
• cifs.enabled=false, ftp.enabled=false
• webdav.enabled=false, nfs.enabled=false, imap.enabled=false
• Disable thumbnails and document previews
• system.thumbnail.generate=false
• Disable the Share web preview (in share-config-custom.xml)
<config evaluator="string-compare" condition="DocumentDetails" replace="true">
<document-details>
<!-- display web previewer on document details page -->
<display-web-preview>false</display-web-preview>
</document-details>
</config>
• Disable replication
• replication.enabled=false
• transferservice.receiver.enabled=false
1 - General Best Practices on Tuning
Disable unused services and features (cont.)
• Disable cloud-sync features
• syncService.mode=OFF
• sync.mode=OFF
• sync.pullJob.enabled=false
• sync.pushJob.enabled=false
• Disable user quotas
• system.usages.enabled=false
• system.usages.clearBatchSize=0
• Disable eager creation of home folders
• home.folder.creation.eager=false
• Disable activities feed
• activities.feed.notifier.enabled=false
• activities.feed.cleaner.enabled=false
• activities.post.cleaner.enabled=false
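The flags on the two slides above are plain key=value settings, so a small script can report which ones are still set differently in a given configuration. This is a minimal sketch, assuming a simple `key=value` file such as alfresco-global.properties. The names `parse_properties` and `audit_properties` and the sample text are invented for the example. The recommended values are simply the ones from the slides; trim the table to the services you really do not use.

```python
# Values taken from the slides; remove entries for services you still need.
RECOMMENDED = {
    "cifs.enabled": "false",
    "ftp.enabled": "false",
    "webdav.enabled": "false",
    "nfs.enabled": "false",
    "imap.enabled": "false",
    "system.thumbnail.generate": "false",
    "replication.enabled": "false",
    "transferservice.receiver.enabled": "false",
    "syncService.mode": "OFF",
    "sync.mode": "OFF",
    "sync.pullJob.enabled": "false",
    "sync.pushJob.enabled": "false",
    "system.usages.enabled": "false",
    "activities.feed.notifier.enabled": "false",
    "activities.feed.cleaner.enabled": "false",
    "activities.post.cleaner.enabled": "false",
}


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # or ! comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def audit_properties(text, recommended=RECOMMENDED):
    """Return {key: (current_value_or_None, recommended_value)} for each mismatch.

    A key that is absent from the file is reported with None, because this
    script does not know the product's built-in defaults.
    """
    props = parse_properties(text)
    return {
        key: (props.get(key), want)
        for key, want in recommended.items()
        if props.get(key, "").lower() != want.lower()
    }


if __name__ == "__main__":
    # Hypothetical excerpt of a properties file.
    sample = "cifs.enabled=false\nftp.enabled=true\nsystem.thumbnail.generate=false\n"
    for key, (have, want) in sorted(audit_properties(sample).items()):
        print(f"{key}: found {have!r}, slides suggest {want}")
```

Running the script against a copy of your own file gives a checklist of settings to review. It does not change anything.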
1 - General Best Practices on Tuning
Golden Rules for the Repository
• Limit the group hierarchy to 5 levels of nested groups
• Use an inheritance-based permission model
• Limit the maximum number of nodes in a folder
• Keep the number of sites under control: do you really need 10,000 sites?
• Keep the user-to-group membership ratio low
• Limit the depth of the folder hierarchy
2 – Common Mistakes
• Not keeping extended configurations and customizations separate in the shared directory. Do not put them in the configuration root; if you do, you will lose them during upgrades.
• Not testing the backup strategy.
• Insufficient monitoring and insufficient troubleshooting tools.
• Making changes to the system without first testing them thoroughly on test and pre-production machines.
• Forgetting to adjust the system sizing for increased users and sessions:
• Increase the database connection pool
• Tune maxThreads on Tomcat
• Tune the JVM and GC
• Run benchmarks and stress tests
• Using a database shared with other applications
• Network/infrastructure constraints
• Not following the SPM
2 – Common Mistakes
• Customizations / Custom Code mistakes
• Not closing search result sets in try..catch..finally blocks (memory leaks)
• Incorrect usage of policies/behaviors (collisions, poor code quality)
• Using the lower-case versions of Alfresco beans (for example nodeService instead of NodeService)
• Direct access to the database (use Spring and the existing DAOs instead)
• Use of private APIs
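The first bullet above is about resource handling: a search result set holds resources until it is closed, so the close call belongs in a `finally` block, where it runs even if processing fails. The sketch below shows only the pattern, in Python. `FakeResultSet`, `run_query` and `first_names` are stand-ins invented for the example, not Alfresco classes. In Alfresco Java code the same shape applies to the `ResultSet` returned by `SearchService.query(...)`.

```python
class FakeResultSet:
    """Stand-in for a search result set that must be closed explicitly."""

    open_count = 0  # number of result sets currently open

    def __init__(self, rows):
        self.rows = rows
        FakeResultSet.open_count += 1

    def close(self):
        FakeResultSet.open_count -= 1


def run_query(rows):
    """Hypothetical query function; returns an open result set."""
    return FakeResultSet(rows)


def first_names(rows, limit):
    """Return up to `limit` names, closing the result set in every case."""
    results = None
    try:
        results = run_query(rows)
        return [row["name"] for row in results.rows][:limit]
    finally:
        # Runs on success and on error, so the result set is never leaked.
        if results is not None:
            results.close()
```

The `results is not None` check matters: if the query itself throws, there is nothing to close, and calling `close()` on an unset variable would mask the original error.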
2 – Common Mistakes
• Customizations / Custom Code mistakes
• Using TransactionService directly instead of RetryingTransactionHelper
• Not using CMIS query language when using SearchService
• Improper exception handling
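The idea behind a retrying transaction helper is that work which fails on a transient conflict (for example a concurrent update) is rolled back and run again instead of surfacing the error to the caller. The following is a rough Python sketch of that idea, not Alfresco's implementation. `ConcurrencyError`, `run_in_retrying_transaction`, the retry limit and the back-off are all hypothetical, and the callback stands in for the unit of work you would hand to the helper.

```python
import time


class ConcurrencyError(Exception):
    """Hypothetical transient failure, e.g. two writers updating the same node."""


def run_in_retrying_transaction(callback, max_retries=3, wait_seconds=0.0):
    """Run callback; on a transient conflict, retry up to max_retries more times."""
    attempt = 0
    while True:
        try:
            # A real helper would begin a transaction here and commit after callback().
            return callback()
        except ConcurrencyError:
            # A real helper would roll back here before trying again.
            attempt += 1
            if attempt > max_retries:
                raise
            time.sleep(wait_seconds * attempt)  # simple linear back-off
```

Only the transient error type is retried; any other exception propagates immediately, which is why the callback should be safe to run more than once.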
3 – Sizing - what to expect from a single server
Let's assume we are running Alfresco on a single server with the following hardware:
64-bit Red Hat Linux, 16 GB RAM, 2 quad-core 3.2 GHz CPUs, local SSD disk
We will have 3 web applications running in the same JVM and container (i.e. Tomcat):
• Alfresco Repository
• Alfresco Share UI
• Solr
According to our internal benchmarks, and highly dependent on the specifics of each use case, this server should be able to handle 200 concurrent users or up to 2000 casual users.
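The two figures above imply roughly one concurrent user for every ten casual users on this reference server. As back-of-the-envelope arithmetic only, the sketch below turns that into a first guess at how many reference-sized servers a user population might need. It assumes load scales linearly with users and that casual and concurrent users can be added together; the slides claim neither, and real deployments rarely behave that way. The function name and the model are illustrative; a real capacity plan should come from benchmarks of your own use case.

```python
import math

# Figures quoted on the slide for the reference server.
CONCURRENT_PER_SERVER = 200
CASUAL_PER_SERVER = 2000


def estimate_servers(concurrent_users=0, casual_users=0):
    """Naive linear estimate of reference-sized servers (an assumption, not a benchmark)."""
    # Convert casual users to concurrent-equivalents using the slide's 2000:200 ratio.
    casual_as_concurrent = casual_users * CONCURRENT_PER_SERVER / CASUAL_PER_SERVER
    load = concurrent_users + casual_as_concurrent
    return max(1, math.ceil(load / CONCURRENT_PER_SERVER))
```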
3 – Sizing - what to expect from a single server
The following factors will affect the sizing and architecture:
• Use case
• Concurrent users
• Document types, sizes and distribution ratios
• Architecture (virtualization? failover? replication? integrations? component stack)
• Authority structure
• Operations
• Components, protocols and APIs
• Batch operations
• Response time requirements
3 – Sizing – 2 common use cases
3 – Sizing – Divide and Conquer
• Know which processes are running on your server, when and where they run, and which resources they affect.
• Do it with appropriate monitoring
• JavaMelody as a simple approach (DEMO)
• https://github.com/miguel-rodriguez/alfresco-monitoring
• Use support tools for troubleshooting (DEMO)
• https://github.com/Alfresco/alfresco-support-tools
• Have specific servers dedicated to specific tasks
• Offload the user facing nodes
3 – Sizing – Capacity Plan
3 – Sizing – Monitor your resources
4 – Solr Tuning
Golden Rules for Solr
• Do you search on deleted content? If not, disable the archive core.
• Go to solrHome and edit solr.xml, commenting out the archive core
• Also disable the archive core backup scheduled task
• Do you search on content or only on metadata? If only on metadata, you can disable full-text indexing
• alfresco.index.transformContent=false
• Alfresco can make use of Transactional Metadata Queries (db fetch)
• Is SSL really needed? Inside the intranet, it should be disabled to reduce complexity.
• Optimize your ACL policy: reuse permissions, use inheritance, and use groups
4 – Solr Tuning
Golden Rules for Solr Indexing
• Keep indexes local (do not use shared folders or NFS) and use fast hardware (RAID, SSD, ...)
• Tune the mergeFactor: 25 is ideal for indexing, while 2 is ideal for search.
• Tune your RAM buffer size (ramBufferSizeMB) in solrconfig.xml; the default is 32 MB
• Analyze your indexing processes (check Alfresco repository health)
• Tune the transformations that occur on the repository side, and set a transformation timeout.
4 – Solr Tuning
Golden Rules for Solr Indexing
• Closely monitor the Solr JVM (especially GC and heap usage)
• Enable GC logs, analyze GC performance, tune the GC algorithm
• Do you need tracking to happen every 15 seconds?
• Use a dedicated tracking Alfresco instance; there are several architecture options
• Increase your index batch counts to get more results from your indexing webscript
• In each core's solrcore.properties, raise the batch count to 2000
• Impacting factors in indexing
• JVM memory and CPU usage on the repository layer (text extraction/transformations)
• JVM memory, CPU, disk I/O and disk cache size on the Solr layer
• Number of threads for indexing, Solr caches
4 – Solr Tuning
Golden Rules for Solr Search
• Keep indexes local (do not use shared folders or NFS) and use fast hardware (RAID, SSD, ...)
• Tune the mergeFactor: 2 is ideal for search.
• Increase your query caches and the RAM buffer
• Avoid path search queries, which are slow. Avoid * searches and ALL searches.
• Avoid using sort; you can sort your results on the client side using JavaScript or any client-side framework of your choice.
• Search is CPU intensive rather than RAM intensive, so increase CPU power.
• Upgrade your Alfresco release with the latest service packs and hotfixes. These contain the latest Solr improvements and bug fixes, which can have a great impact on overall search performance.
4 – Solr Tuning
Solr Caches
Tracking the usage of the Solr caches can help you tune them for your use case.
• http://<solr_server>:<solr_port>/solr/alfresco/admin/stats.jsp#cache
The URL above shows statistics on cache usage. If a cache has many evictions, look into increasing its size so that all elements can fit (but don't overdo it: adjust it and see what fits your setup). Likewise, it is a good idea to decrease the size of caches that have a lot of unused slots.
The goal is to get the hit ratio as close to 1.00 as possible (1.00 being a 100% hit ratio).
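The tuning loop described above can be sketched as a small function that takes the counters for one cache, read off the stats page by hand, and suggests a direction. The parameter names follow counters commonly reported for Solr caches (lookups, hits, evictions, size). The function name `cache_advice` and the thresholds (a 0.90 hit ratio target, 50% unused slots) are arbitrary assumptions for illustration, not Alfresco or Solr recommendations.

```python
def cache_advice(lookups, hits, evictions, size, max_size,
                 target_hit_ratio=0.90, unused_fraction=0.50):
    """Return (hit_ratio, advice) for one cache; advice is 'increase', 'decrease' or 'keep'."""
    if lookups == 0:
        # No traffic yet, so there is nothing to base a decision on.
        return 0.0, "keep"
    hit_ratio = hits / lookups
    if evictions > 0 and hit_ratio < target_hit_ratio:
        # Entries are being pushed out and lookups miss: the cache is too small.
        return hit_ratio, "increase"
    if evictions == 0 and size < max_size * (1 - unused_fraction):
        # Never full and mostly empty: the configured size can be reduced.
        return hit_ratio, "decrease"
    return hit_ratio, "keep"
```

For example, a cache with 1000 lookups, 600 hits and 250 evictions at its full size of 512 comes back as "increase". Re-check the counters after each change rather than jumping straight to a large value.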
4 – Solr Tuning
Solr usage on Alfresco Share
• Good Solr indexing and search performance positively affects overall Share performance. Share relies on Solr in the following situations:
• Full Text Search (search field in top right corner)
• Advanced Search
• Filters
• Tags
• Categories (implemented as facets)
• Dashlets such as Recently Modified Documents
• Wildcard searches for People, Groups, Sites (uses database search if not wildcard)

Más contenido relacionado

La actualidad más candente

How to migrate from Alfresco Search Services to Alfresco SearchEnterprise
How to migrate from Alfresco Search Services to Alfresco SearchEnterpriseHow to migrate from Alfresco Search Services to Alfresco SearchEnterprise
How to migrate from Alfresco Search Services to Alfresco SearchEnterpriseAngel Borroy López
 
Scale your Alfresco Solutions
Scale your Alfresco Solutions Scale your Alfresco Solutions
Scale your Alfresco Solutions Alfresco Software
 
Alfresco Content Modelling and Policy Behaviours
Alfresco Content Modelling and Policy BehavioursAlfresco Content Modelling and Policy Behaviours
Alfresco Content Modelling and Policy BehavioursJ V
 
Alfresco search services: Now and Then
Alfresco search services: Now and ThenAlfresco search services: Now and Then
Alfresco search services: Now and ThenAngel Borroy López
 
From zero to hero Backing up alfresco
From zero to hero Backing up alfrescoFrom zero to hero Backing up alfresco
From zero to hero Backing up alfrescoToni de la Fuente
 
Alfresco DevCon 2019 - Alfresco Identity Services in Action
Alfresco DevCon 2019 - Alfresco Identity Services in ActionAlfresco DevCon 2019 - Alfresco Identity Services in Action
Alfresco DevCon 2019 - Alfresco Identity Services in ActionFrancesco Corti
 
Jose portillo dev con presentation 1138
Jose portillo   dev con presentation 1138Jose portillo   dev con presentation 1138
Jose portillo dev con presentation 1138Jose Portillo
 
Alfresco Security Best Practices 2014
Alfresco Security Best Practices 2014Alfresco Security Best Practices 2014
Alfresco Security Best Practices 2014Toni de la Fuente
 
Architectural changes in the repo in 6.1 and beyond
Architectural changes in the repo in 6.1 and beyondArchitectural changes in the repo in 6.1 and beyond
Architectural changes in the repo in 6.1 and beyondStefan Kopf
 
Scouter와 influx db – grafana 연동 가이드
Scouter와 influx db – grafana 연동 가이드Scouter와 influx db – grafana 연동 가이드
Scouter와 influx db – grafana 연동 가이드Ji-Woong Choi
 
alfresco-global.properties-COMPLETO-3.4.6
alfresco-global.properties-COMPLETO-3.4.6alfresco-global.properties-COMPLETO-3.4.6
alfresco-global.properties-COMPLETO-3.4.6alfrescosedemo
 
Alfresco Security Best Practices Guide
Alfresco Security Best Practices GuideAlfresco Security Best Practices Guide
Alfresco Security Best Practices GuideToni de la Fuente
 
Discovering the 2 in Alfresco Search Services 2.0
Discovering the 2 in Alfresco Search Services 2.0Discovering the 2 in Alfresco Search Services 2.0
Discovering the 2 in Alfresco Search Services 2.0Angel Borroy López
 
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...Henning Jacobs
 
Bulk Export Tool for Alfresco
Bulk Export Tool for AlfrescoBulk Export Tool for Alfresco
Bulk Export Tool for AlfrescoRichard McKnight
 
Building infrastructure as code using Terraform - DevOps Krakow
Building infrastructure as code using Terraform - DevOps KrakowBuilding infrastructure as code using Terraform - DevOps Krakow
Building infrastructure as code using Terraform - DevOps KrakowAnton Babenko
 

La actualidad más candente (20)

How to migrate from Alfresco Search Services to Alfresco SearchEnterprise
How to migrate from Alfresco Search Services to Alfresco SearchEnterpriseHow to migrate from Alfresco Search Services to Alfresco SearchEnterprise
How to migrate from Alfresco Search Services to Alfresco SearchEnterprise
 
Scale your Alfresco Solutions
Scale your Alfresco Solutions Scale your Alfresco Solutions
Scale your Alfresco Solutions
 
Alfresco Content Modelling and Policy Behaviours
Alfresco Content Modelling and Policy BehavioursAlfresco Content Modelling and Policy Behaviours
Alfresco Content Modelling and Policy Behaviours
 
Alfresco search services: Now and Then
Alfresco search services: Now and ThenAlfresco search services: Now and Then
Alfresco search services: Now and Then
 
From zero to hero Backing up alfresco
From zero to hero Backing up alfrescoFrom zero to hero Backing up alfresco
From zero to hero Backing up alfresco
 
Alfresco DevCon 2019 - Alfresco Identity Services in Action
Alfresco DevCon 2019 - Alfresco Identity Services in ActionAlfresco DevCon 2019 - Alfresco Identity Services in Action
Alfresco DevCon 2019 - Alfresco Identity Services in Action
 
Alfresco CMIS
Alfresco CMISAlfresco CMIS
Alfresco CMIS
 
Alfresco Certificates
Alfresco Certificates Alfresco Certificates
Alfresco Certificates
 
Jose portillo dev con presentation 1138
Jose portillo   dev con presentation 1138Jose portillo   dev con presentation 1138
Jose portillo dev con presentation 1138
 
Alfresco Security Best Practices 2014
Alfresco Security Best Practices 2014Alfresco Security Best Practices 2014
Alfresco Security Best Practices 2014
 
Architectural changes in the repo in 6.1 and beyond
Architectural changes in the repo in 6.1 and beyondArchitectural changes in the repo in 6.1 and beyond
Architectural changes in the repo in 6.1 and beyond
 
Scouter와 influx db – grafana 연동 가이드
Scouter와 influx db – grafana 연동 가이드Scouter와 influx db – grafana 연동 가이드
Scouter와 influx db – grafana 연동 가이드
 
alfresco-global.properties-COMPLETO-3.4.6
alfresco-global.properties-COMPLETO-3.4.6alfresco-global.properties-COMPLETO-3.4.6
alfresco-global.properties-COMPLETO-3.4.6
 
Alfresco Security Best Practices Guide
Alfresco Security Best Practices GuideAlfresco Security Best Practices Guide
Alfresco Security Best Practices Guide
 
Discovering the 2 in Alfresco Search Services 2.0
Discovering the 2 in Alfresco Search Services 2.0Discovering the 2 in Alfresco Search Services 2.0
Discovering the 2 in Alfresco Search Services 2.0
 
Terraform
TerraformTerraform
Terraform
 
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
 
Bulk Export Tool for Alfresco
Bulk Export Tool for AlfrescoBulk Export Tool for Alfresco
Bulk Export Tool for Alfresco
 
Upgrading to Alfresco 6
Upgrading to Alfresco 6Upgrading to Alfresco 6
Upgrading to Alfresco 6
 
Building infrastructure as code using Terraform - DevOps Krakow
Building infrastructure as code using Terraform - DevOps KrakowBuilding infrastructure as code using Terraform - DevOps Krakow
Building infrastructure as code using Terraform - DevOps Krakow
 

Similar a Alfresco tuning part1

Sizing your alfresco platform
Sizing your alfresco platformSizing your alfresco platform
Sizing your alfresco platformLuis Cabaceira
 
Benchmarking Solr Performance at Scale
Benchmarking Solr Performance at ScaleBenchmarking Solr Performance at Scale
Benchmarking Solr Performance at Scalethelabdude
 
Performance tuning Grails Applications GR8Conf US 2014
Performance tuning Grails Applications GR8Conf US 2014Performance tuning Grails Applications GR8Conf US 2014
Performance tuning Grails Applications GR8Conf US 2014Lari Hotari
 
MySQL Performance Tuning at COSCUP 2014
MySQL Performance Tuning at COSCUP 2014MySQL Performance Tuning at COSCUP 2014
MySQL Performance Tuning at COSCUP 2014Ryusuke Kajiyama
 
Silicon Valley Code Camp 2015 - Advanced MongoDB - The Sequel
Silicon Valley Code Camp 2015 - Advanced MongoDB - The SequelSilicon Valley Code Camp 2015 - Advanced MongoDB - The Sequel
Silicon Valley Code Camp 2015 - Advanced MongoDB - The SequelDaniel Coupal
 
071410 sun a_1515_feldman_stephen
071410 sun a_1515_feldman_stephen071410 sun a_1515_feldman_stephen
071410 sun a_1515_feldman_stephenSteve Feldman
 
Got Problems? Let's Do a Health Check
Got Problems? Let's Do a Health CheckGot Problems? Let's Do a Health Check
Got Problems? Let's Do a Health CheckLuis Guirigay
 
Deploying and managing Solr at scale
Deploying and managing Solr at scaleDeploying and managing Solr at scale
Deploying and managing Solr at scaleAnshum Gupta
 
Performance tuning Grails applications SpringOne 2GX 2014
Performance tuning Grails applications SpringOne 2GX 2014Performance tuning Grails applications SpringOne 2GX 2014
Performance tuning Grails applications SpringOne 2GX 2014Lari Hotari
 
Mtc learnings from isv & enterprise (dated - Dec -2014)
Mtc learnings from isv & enterprise (dated - Dec -2014)Mtc learnings from isv & enterprise (dated - Dec -2014)
Mtc learnings from isv & enterprise (dated - Dec -2014)Govind Kanshi
 
Mtc learnings from isv & enterprise interaction
Mtc learnings from isv & enterprise  interactionMtc learnings from isv & enterprise  interaction
Mtc learnings from isv & enterprise interactionGovind Kanshi
 
Day 7 - Make it Fast
Day 7 - Make it FastDay 7 - Make it Fast
Day 7 - Make it FastBarry Jones
 
Real-time Big Data Analytics Engine using Impala
Real-time Big Data Analytics Engine using ImpalaReal-time Big Data Analytics Engine using Impala
Real-time Big Data Analytics Engine using ImpalaJason Shih
 
MySQL Tech Tour 2015 - Manage & Tune
MySQL Tech Tour 2015 - Manage & TuneMySQL Tech Tour 2015 - Manage & Tune
MySQL Tech Tour 2015 - Manage & TuneMark Swarbrick
 
(ATS4-PLAT08) Server Pool Management
(ATS4-PLAT08) Server Pool Management(ATS4-PLAT08) Server Pool Management
(ATS4-PLAT08) Server Pool ManagementBIOVIA
 
MySQL Performance Metrics that Matter
MySQL Performance Metrics that MatterMySQL Performance Metrics that Matter
MySQL Performance Metrics that MatterMorgan Tocker
 
Collaborate 2011-tuning-ebusiness-416502
Collaborate 2011-tuning-ebusiness-416502Collaborate 2011-tuning-ebusiness-416502
Collaborate 2011-tuning-ebusiness-416502kaziul Islam Bulbul
 
Maria DB Galera Cluster for High Availability
Maria DB Galera Cluster for High AvailabilityMaria DB Galera Cluster for High Availability
Maria DB Galera Cluster for High AvailabilityOSSCube
 
MariaDB Galera Cluster
MariaDB Galera ClusterMariaDB Galera Cluster
MariaDB Galera ClusterAbdul Manaf
 

Similar a Alfresco tuning part1 (20)

Sizing your alfresco platform
Sizing your alfresco platformSizing your alfresco platform
Sizing your alfresco platform
 
Benchmarking Solr Performance at Scale
Benchmarking Solr Performance at ScaleBenchmarking Solr Performance at Scale
Benchmarking Solr Performance at Scale
 
Performance tuning Grails Applications GR8Conf US 2014
Performance tuning Grails Applications GR8Conf US 2014Performance tuning Grails Applications GR8Conf US 2014
Performance tuning Grails Applications GR8Conf US 2014
 
MySQL Performance Tuning at COSCUP 2014
MySQL Performance Tuning at COSCUP 2014MySQL Performance Tuning at COSCUP 2014
MySQL Performance Tuning at COSCUP 2014
 
Silicon Valley Code Camp 2015 - Advanced MongoDB - The Sequel
Silicon Valley Code Camp 2015 - Advanced MongoDB - The SequelSilicon Valley Code Camp 2015 - Advanced MongoDB - The Sequel
Silicon Valley Code Camp 2015 - Advanced MongoDB - The Sequel
 
071410 sun a_1515_feldman_stephen
071410 sun a_1515_feldman_stephen071410 sun a_1515_feldman_stephen
071410 sun a_1515_feldman_stephen
 
Got Problems? Let's Do a Health Check
Got Problems? Let's Do a Health CheckGot Problems? Let's Do a Health Check
Got Problems? Let's Do a Health Check
 
Deploying and managing Solr at scale
Deploying and managing Solr at scaleDeploying and managing Solr at scale
Deploying and managing Solr at scale
 
Fastest Servlets in the West
Fastest Servlets in the WestFastest Servlets in the West
Fastest Servlets in the West
 
Performance tuning Grails applications SpringOne 2GX 2014
Performance tuning Grails applications SpringOne 2GX 2014Performance tuning Grails applications SpringOne 2GX 2014
Performance tuning Grails applications SpringOne 2GX 2014
 
Mtc learnings from isv & enterprise (dated - Dec -2014)
Mtc learnings from isv & enterprise (dated - Dec -2014)Mtc learnings from isv & enterprise (dated - Dec -2014)
Mtc learnings from isv & enterprise (dated - Dec -2014)
 
Mtc learnings from isv & enterprise interaction
Mtc learnings from isv & enterprise  interactionMtc learnings from isv & enterprise  interaction
Mtc learnings from isv & enterprise interaction
 
Day 7 - Make it Fast
Day 7 - Make it FastDay 7 - Make it Fast
Day 7 - Make it Fast
 
Real-time Big Data Analytics Engine using Impala
Real-time Big Data Analytics Engine using ImpalaReal-time Big Data Analytics Engine using Impala
Real-time Big Data Analytics Engine using Impala
 
MySQL Tech Tour 2015 - Manage & Tune
MySQL Tech Tour 2015 - Manage & TuneMySQL Tech Tour 2015 - Manage & Tune
MySQL Tech Tour 2015 - Manage & Tune
 
(ATS4-PLAT08) Server Pool Management
(ATS4-PLAT08) Server Pool Management(ATS4-PLAT08) Server Pool Management
(ATS4-PLAT08) Server Pool Management
 
MySQL Performance Metrics that Matter
MySQL Performance Metrics that MatterMySQL Performance Metrics that Matter
MySQL Performance Metrics that Matter
 
Collaborate 2011-tuning-ebusiness-416502
Collaborate 2011-tuning-ebusiness-416502Collaborate 2011-tuning-ebusiness-416502
Collaborate 2011-tuning-ebusiness-416502
 
Maria DB Galera Cluster for High Availability
Maria DB Galera Cluster for High AvailabilityMaria DB Galera Cluster for High Availability
Maria DB Galera Cluster for High Availability
 
MariaDB Galera Cluster
MariaDB Galera ClusterMariaDB Galera Cluster
MariaDB Galera Cluster
 

Último

Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDGMarianaLemus7
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfjimielynbastida
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Neo4j
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 

Último (20)

Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdf
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 

Alfresco tuning part1

  • 1. Tech Talk Live Alfresco Performance Tuning – Part 1
  • 2. Speaker Bio Luis Cabaceira – Principal Consultant at Alfresco
  • 3. Agenda 1 - General best practices on tuning 2 - Common mistakes 3 - Sizing, what to expect from a single server 4 – Solr tuning 5 – Jvm tuning ( Part 2 ) 6 – Caches (Part 2 ) 7 - Alfresco is running slow.. where to start ? (Part 2)
  • 4. 1 - General Best Practices on Tuning Disable Un-used services and features • Disable virtual file-systems • cifs.enabled=false, ftp.enabled=false • webdav.enabled=false, nfs.enabled=false, imap.enabled=false • Disable thumbnails and documents previews • system.thumbnail.generate=false • Disable share web-preview (on share-config-custom) <config evaluator="string-compare" condition="DocumentDetails" replace="true"> <document-details> <!-- display web previewer on document details page --> <display-web-preview>false</display-web-preview> </document-details> </config> • Disable replication • replication.enabled=false • transferservice.receiver.enabled=false
  • 5. 1 - General Best Practices on Tuning Disable Un-used services and features (cont.) • Disable cloud-sync features • syncService.mode=OFF • sync.mode=OFF • sync.pullJob.enabled=false • sync.pushJob.enabled=false • Disable user quotas • system.usages.enabled=false • system.usages.clearBatchSize=0 • Disable eager creation of home folders • system.usages.enabled=false • Disable activities feed • activities.feed.notifier.enabled=false • activities.feed.cleaner.enabled=false • activities.post.cleaner.enabled=false
  • 6. 1 - General Best Practices on Tuning Golden Rules for the Repository • Limit the group hierarchy to 5 levels (nested groups) • Use an inheritance-based permission model • Limit the maximum number of nodes in a folder • Keep some control over the number of sites: do you really need 10000 sites? • Keep a low ratio of group memberships per user • Limit the depth of the folder hierarchy
  • 7. 2 – Common Mistakes • Not keeping extended configurations and customizations separate in the shared directory. Do not put them in the configuration root. If you do, you will lose them during upgrades. • Not testing the backup strategy. • Insufficient monitoring and insufficient troubleshooting tools • Making changes to the system without testing them thoroughly on a test and pre-production machine first. • Forgetting to adjust the system sizing for increased users and sessions • Increase the database connection pool • Tune maxThreads on Tomcat • Tune the JVM and GC • Run benchmarks and stress tests • Using a shared database with other applications • Network/infrastructure constraints • Not following the SPM
  • 8. 2 – Common Mistakes • Customizations / Custom Code mistakes • Not closing search result sets in try..catch..finally blocks (memory leaks) • Incorrect usage of policies/behaviors (collisions, poor code quality) • Using lower case versions of Alfresco beans • Direct access to the database (should use Spring and the existing DAOs) • Usage of private APIs
  • 9. 2 – Common Mistakes • Customizations / Custom Code mistakes • Using Transaction Service instead of RetryingTransactionHelper • Not using CMIS query language when using SearchService • Improper exception handling
  • 10. 3 – Sizing - what to expect from a single server Let's assume we're running Alfresco on a single server with the following hardware: Red Hat Linux 64 bits, 16 GB RAM, 2 quad-core CPUs at 3.2 GHz, local SSD disk. We will have 3 web applications running on the same JVM and container (i.e. Tomcat) • Alfresco Repository • Alfresco Share UI • Solr According to our internal benchmarks, and highly dependent on the specifics of each use case, this server should be able to handle 200 concurrent users or up to 2000 casual users
  • 11. 3 – Sizing - what to expect from a single server The following factors will affect the sizing and architecture. • Use case • Concurrent users • Document types, sizes and distribution ratios • Architecture (virtualization? fail-safe? replication? integrations? component stack) • Authority structure • Operations • Components, protocols and APIs • Batch operations • Response time requirements
  • 12. 3 – Sizing , 2 common use cases
  • 13. 3 – Sizing Divide and Conquer • Know when, where and which processes are running on your server and which resources those processes are influencing. • Do it with appropriate monitoring • Javamelody as a simple approach (DEMO) • https://github.com/miguel-rodriguez/alfresco-monitoring • Use support tools for troubleshooting (DEMO) • https://github.com/Alfresco/alfresco-support-tools • Have specific servers dedicated to specific tasks • Offload the user facing nodes
  • 14. 3 – Sizing – Capacity Plan
  • 15. 3 – Sizing – Monitor your resources
  • 16. 4 – Solr Tuning Golden Rules for Solr • Do you search on deleted content? If not, disable the archive core. • Go to solrHome and edit the solr.xml file, commenting out the archive core • Also disable the archive core backup scheduled task • Do you search on content or only on metadata? If only on metadata, you can disable full-text indexing • alfresco.index.transformContent=false • Alfresco can make use of Transactional Metadata Queries (db fetch) • Is SSL really needed? Inside the intranet, it should be disabled to reduce complexity. • Optimize your ACL policy: re-use your permissions, use inheritance and use groups
  • 17. 4 – Solr Tuning Golden Rules for Solr Indexing • Have local indexes (don't use shared folders or NFS) and use fast hardware (RAID, SSD, ..) • Tune the mergeFactor: 25 is ideal for indexing, while 2 is ideal for search. • Tune your RAM buffer size (ramBufferSizeMB) in solrconfig.xml, 32 MB by default • Analyze your indexing processes (check the Alfresco repository health) • Tune the transformations that occur on the repository side and set a transformation timeout.
  • 18. 4 – Solr Tuning Golden Rules for Solr Indexing • Closely monitor the Solr JVM (especially GC and heap usage) • Enable GC logs, analyze GC performance, tune the GC algorithm • Do you need tracking to happen every 15 seconds? • Use a dedicated tracking Alfresco instance (several architecture options) • Increase your index batch counts to get more results from your indexing webscript • In each core's solrcore.properties, raise the batch count to 2000 • Impacting factors in indexing • JVM memory and CPU usage on the repository layer (text extraction / transformations) • JVM memory, CPU, disk I/O and disk cache size on the Solr layer • Number of threads for indexing, Solr caches
  • 19. 4 – Solr Tuning Golden Rules for Solr Search • Have local indexes (don't use shared folders or NFS) and use fast hardware (RAID, SSD, ..) • Tune the mergeFactor: 2 is ideal for search. • Increase your query caches and the RAM buffer • Avoid path search queries, which are slow. Avoid * search, avoid ALL search • Avoid using sort: you can sort your results on the client side using JS or any client-side framework of your choice. • Search is CPU intensive rather than RAM intensive, so increase CPU power. • Upgrade your Alfresco release with the latest service packs and hotfixes. Those contain the latest Solr improvements and bug fixes, which can have a great impact on the overall search performance
  • 20. 4 – Solr Tuning Solr Caches Tracking the usage of the Solr caches can help to tune them for your use case. • http://<solr_server>:<solr_port>/solr/alfresco/admin/stats.jsp#cache The URL above shows statistics on cache usage. If you have many evictions, you should look into increasing the size of that cache so all elements can fit (but don't overdo it: adjust it and see what fits your setup). It is likewise a good idea to decrease the size of caches that have a lot of unused slots. The goal should be to get the hit rate as close to 1.00 as possible (1.00 being a 100% hit ratio)
  • 21. 4 – Solr Tuning Solr usage on Alfresco Share • Solr indexing and search performance will positively affect the overall Share performance. Share relies on Solr in the following situations: • Full Text Search (search field in top right corner) • Advanced Search • Filters • Tags • Categories (implemented as facets) • Dashlets such as the Recently Modified Documents • Wildcard searches for People, Groups, Sites (uses database search if not wildcard)

Editor's notes

  1. Disabling features that are not being used releases important resources for active tasks, contributing to increased performance. Transformations: when users open the document details page in the Share interface, a full document preview is generated. This involves calls to various third-party tools such as OpenOffice, Ghostscript and ImageMagick to create a Flash version of the document. If previews are not being used, we can prevent their creation by including the XML snippet shown on the slide in the share-config-custom.xml file.
  2. Disabling features that are not being used releases important resources for active tasks, contributing to increased performance. User quotas: checking user quotas adds some overhead to Alfresco. When not needed, this feature can also be disabled. User home folders: Alfresco automatically creates a home folder for each new user. If your users are not using this folder for any business-related task, you can disable the automatic creation of home folders for new users. Activities feed: if this feature is not necessary and is not being used, disabling it prevents regular checks on the activities and again saves system resources. activities.feed.notifier.enabled=false activities.feed.cleaner.enabled=false activities.post.cleaner.enabled=false
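One way to audit a configuration against the flags discussed in notes 1 and 2 is a small script. The sketch below is only an illustration: the property names are the ones listed on the slides, while `parse_properties`, `still_enabled` and the sample file content are hypothetical. It treats a flag that is absent from the file as "not disabled", which is an assumption, since the product default for each property is not checked here.

```python
# Property names come from the slides; the checker itself is a hypothetical helper.
FLAGS_TO_DISABLE = [
    "cifs.enabled",
    "ftp.enabled",
    "webdav.enabled",
    "nfs.enabled",
    "imap.enabled",
    "system.thumbnail.generate",
    "replication.enabled",
    "system.usages.enabled",
    "activities.feed.notifier.enabled",
    "activities.feed.cleaner.enabled",
    "activities.post.cleaner.enabled",
]


def parse_properties(text):
    """Parse simple key=value lines, ignoring blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def still_enabled(text):
    """Return the listed flags that are not explicitly set to false."""
    props = parse_properties(text)
    return [k for k in FLAGS_TO_DISABLE if props.get(k, "").lower() != "false"]


# Excerpt of a hypothetical alfresco-global.properties file.
sample = """
# virtual file systems
cifs.enabled=false
ftp.enabled=false
system.usages.enabled=true
"""

remaining = still_enabled(sample)
for flag in remaining:
    print("not disabled:", flag)
```

Only disable a flag reported by such a check after confirming that the feature really is unused in your deployment.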
  3. 1, 2 - ACL checks are known to slow down performance when the maximum group hierarchy depth exceeds 5 levels. Our advice is, when possible, to limit the maximum group hierarchy depth to 5. 3 - When using Share or another client to browse a repository folder, Alfresco needs to perform a series of actions (permission checking, etc.) before it actually renders or delivers the content. The more nodes that reside inside a specific folder, the slower the response time. We recommend, when possible, limiting the maximum number of document nodes inside a folder to 2000. 4 - The number of sites on the system has some influence on performance, especially when checking site membership for users. Although, of the 3 limits suggested here, this is the factor with the least impact, we recommend keeping the number of sites below 5000. 5 - The number of groups a user belongs to has an impact on performance when rendering some client pages (especially some Share dashlets, like the My Sites dashlet). Alfresco runs some complex queries (based on the user's group memberships) to determine the assets to render on some Share pages. We advise keeping the number of groups a user belongs to low. When possible, and to optimize Share rendering performance, a user should not belong to more than 5 or 6 groups. 6 - The depth of the folder hierarchy also has an impact when browsing and performing document actions under a certain folder. We recommend, when possible, limiting the maximum depth of a folder hierarchy to 15 levels.
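The limits in note 3 can be collected into a simple check. This is a minimal sketch: the numeric limits are the ones quoted in the note, but the metric names, the `check_limits` helper and the measured values are hypothetical, and how you obtain the measurements from your repository is left out.

```python
# Recommended limits quoted in note 3; the metric names are hypothetical.
LIMITS = {
    "group_hierarchy_depth": 5,
    "nodes_per_folder": 2000,
    "sites": 5000,
    "groups_per_user": 6,
    "folder_depth": 15,
}


def check_limits(metrics):
    """Return {metric: (value, limit)} for every measured value above its limit."""
    return {
        name: (metrics[name], limit)
        for name, limit in LIMITS.items()
        if name in metrics and metrics[name] > limit
    }


# Made-up measurements for illustration only.
measured = {
    "group_hierarchy_depth": 7,
    "nodes_per_folder": 1500,
    "sites": 12000,
    "groups_per_user": 4,
    "folder_depth": 15,
}

violations = check_limits(measured)
for name, (value, limit) in sorted(violations.items()):
    print(f"{name}: {value} exceeds the recommended limit of {limit}")
```

The limits are recommendations rather than hard constraints, so a reported violation is a prompt to investigate, not proof of a problem.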
  4. Not closing resources: certain resources in Alfresco (specifically search result objects) are not cleaned up automatically by Alfresco and must therefore be cleaned up correctly by extension code (i.e. in a "finally" block). Failing to clean up such resources results in leaks, not only of memory but also, in some cases, of "real" operating system resources (such as file handles).
     Lower case beans: the "lower case" versions of the Alfresco service beans (i.e. those whose name starts with a lowercase letter, e.g. nodeService) are configured to bypass Alfresco's security, transaction and auditing checks, with no recourse for the administrator to turn them back on. There have been persistent (but incorrect) rumors in the Alfresco community that these versions of the services perform significantly better than the official ("upper case") versions, but that hasn't been the case since at least Alfresco v2.x.
     Content policies: content policies are wired in at a very low level in the repository and as a result can be called many hundreds of times a second in some cases (e.g. when content is being manipulated via CIFS). In addition, they are executed synchronously within each Alfresco transaction. The result is that even slightly poor performance in a custom content policy or behavior can have a profound impact on Alfresco performance. For that reason it is critically important that custom content policies / behaviors are either fast (conduct minimal computation and only perform minimal I/O to the repository) or are made asynchronous.
     Direct access to the database: Alfresco's database schema and the SQL the product uses have been carefully tuned. Uncontrolled access to the same tables can interfere with Alfresco's normal operation, impacting both performance and (in some cases) stability. Note that this includes reads (SELECTs), as these can block concurrent write operations in some circumstances.
     Use of private APIs: only the public Alfresco Java APIs may be used in a certified extension; the private APIs should not be used and are not supported. http://docs.alfresco.com/4.2/concepts/java-public-api-list.html
  5. Using TransactionService instead of RetryingTransactionHelper: as the name implies, RetryingTransactionHelper contains retry logic for certain recoverable, expected database exceptions (deadlocks, basically). It also uses Spring's "template" pattern to ensure a transaction is always completed (committed or rolled back) correctly, regardless of what happens in the logic inside the transaction. The "raw" TransactionService provides neither of these benefits and for that reason is considered unsafe for use in extensions.
     CMIS query language: Alfresco's SearchService API supports different "languages" (XPath, Lucene, SOLR and CMIS), which roughly correlate to the different underlying search implementations in Alfresco. Of these languages, only CMIS is fully abstracted away from the underlying implementation, and so it is the only language that provides some guarantee of consistent behavior, regardless of how a given Alfresco instance has been configured (SOLR vs Lucene, or MDQ, for example). Note however that SearchService doesn't fully implement CMIS-QL: the "SELECT" clause in CMIS queries sent to the SearchService is not processed (it is silently ignored). SearchService, regardless of the query language used, always returns sets of matching NodeRefs.
     Improper exception handling: exceptions should either be caught and recovered from, or allowed to flow up the call stack. It is almost never appropriate to "swallow" an exception (catch it and do nothing), and excessive wrapping of exceptions inside other exceptions should be minimized (it makes triage more difficult). Catching or throwing java.lang.Error instances, and catching java.lang.Throwable, is also inappropriate: these classes (java.lang.Error, specifically) indicate fatal JVM problems and therefore cannot be safely caught or thrown.
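The retry behavior described in note 5 can be illustrated in a language-neutral way. The sketch below is Python, not Alfresco's Java API: `run_with_retries` and `RecoverableDbError` are hypothetical names, and the commit and rollback steps are only marked by comments. It shows the two ideas from the note: retry only on recoverable errors, and let everything else flow up the call stack.

```python
import time


class RecoverableDbError(Exception):
    """Hypothetical stand-in for a recoverable database error such as a deadlock."""


def run_with_retries(work, max_retries=3, wait_seconds=0.0):
    """Run work(); retry only on RecoverableDbError, at most max_retries times."""
    attempt = 0
    while True:
        try:
            result = work()
            # A real helper would commit the transaction here.
            return result
        except RecoverableDbError:
            # A real helper would roll back here before trying again.
            attempt += 1
            if attempt > max_retries:
                raise
            time.sleep(wait_seconds)
        # Any other exception is not caught and propagates to the caller.


calls = {"count": 0}


def flaky_update():
    """Fails twice with a simulated deadlock, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RecoverableDbError("simulated deadlock")
    return "committed"


outcome = run_with_retries(flaky_update)
print(outcome, "after", calls["count"], "attempts")
```

In real extension code the equivalent role is played by RetryingTransactionHelper, as the note states; this sketch only mirrors the control flow.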
  6. Alfresco sizing, like that of other systems, has many subtleties that are hard to fully systematize. Each deployment has specifics that demand consideration when estimating sizing for the different layers (Share front-end, repository, indexing, transformation, storage). For the numbers presented above we are assuming that certain fairly generic steps and concerns are common to the different use cases. Changes to those assumptions can drastically change the predictions above. For memory calculations, consider the repository L2 cache, plus initial VM overhead, plus basic Alfresco system memory. This means that you can run the Alfresco repository and web client, with many users accessing the system, on a basic single-CPU server. However, you must add memory as your user base grows, and add CPUs depending on the complexity of the tasks you expect your users to perform and on how many concurrent users are accessing the client. The terms concurrent users and casual users are used. Concurrent users are users who are constantly accessing the system through Alfresco with only a small pause between requests (3-10 seconds maximum), with continuous access 24/7. Casual users are users who occasionally access the system through the Alfresco or WebDAV/CIFS interfaces with a large gap between requests (for example, occasional document access during the working day).
  7. Common use cases: Alfresco use cases vary considerably because of the elasticity of the platform, and although we can enumerate some generic common use cases, the details in which each real implementation differs may be very important from an architecture and sizing perspective. Generally, Alfresco solutions can be classified into one of the 2 following cases: • Collaboration • Backend Repository. Authority structure: we know from recent benchmark comparisons that the authority structure has a direct and important impact on performance, especially that of SOLR. So when sizing SOLR it is important to consider the weight of search operations in the solution, the types of searches being executed, and the repository size and characteristics, but also, and equally important, the authority structure of the corresponding use case. Collaboration use cases will therefore in general be more demanding (keeping the other factors constant) than backend use cases with simpler authority structures.
  8. Common use cases (continued). Injection rate: the repository growth or document upload rate. The document types and sizes will impact transformation, text extraction and indexing. The impact of repository size on performance grows with the ratio of search operations expected in the solution. The injection rate has an obvious impact on the server with regard to near-future repository sizes, but also on the capability of the different solution layers to handle the throughput, which stresses not only the content storage and database (metadata extraction/upload/rules) but also the indexing layer. Depending on the amount of document injection, this may imply reserving dedicated nodes for the injection (clustered or not with the front-end service nodes) and scaling the Solr layer up or out. The requirements around uploading and downloading large or small documents, and indexing and transforming different types of documents, will vary. This can have architecture consequences, for example a preference for certain protocols and mechanisms for large files (FTP, bulk ingestion) or the use of dedicated caching solutions for large document downloads. It also has consequences at the sizing level: indexing large documents is very different from indexing smaller ones, and it also differs between document types. Repository size is also dynamic, and the project may expect, even in the near term, to go through different phases: first migrating a legacy repository, then new content roll-out, archival, etc. So sizing estimates should cover the different phases, especially in the short term.
  9. One of the secrets of a successful architecture is to know exactly what processes are occurring, when and where, and which resources those processes are influencing. Having this information gives the architect the power to "Divide and Conquer". Working with a fairly flexible technology, the architect can wisely divide the overall processes across the resources (servers), achieving the necessary balance. Consider, for example, the scheduled jobs that Alfresco executes: on a distributed architecture there are many advantages in offloading some of those jobs to a specific server, releasing important resources on the servers that are actually serving user requests. From an Alfresco perspective, offloading (disabling) these scheduled jobs on the front-end servers is no more than configuring their cron expressions to execute very far in the future, and having a dedicated server (normally separate from the cluster) execute these jobs.
  10. Capacity planning is the science and art of estimating the space, computer hardware, software and connection infrastructure resources that will be needed over some future period of time. It is a means of predicting the types, quantities and timing of critical resource capacities that are needed within an infrastructure to meet accurately forecasted workloads. Predicting and sizing a system depends on a good understanding of user behavior. The diagram represents the stages of a capacity-planning scenario. Validate your predictions with stress tests, and compare the obtained data with the live data from production, adjusting your stress test scenarios to the real user behavior. Performing a regular analysis of the monitoring/capacity planning data will help you know exactly when and how you need to scale your architecture, allowing for incremental and continuous improvement. The more data that gets indexed in Elasticsearch along the application life cycle, the more accurate your capacity predictions will be. The data represents the "real" application usage and how that usage affects the various layers of your application. This plays a very important role when modeling and sizing the architecture for future business requirements. The Peak Period Methodology is an efficient way to implement a capacity planning strategy: it analyzes vital performance information when the system is under the most load/stress, and furthermore it represents YOUR system. The peak period may be an hour, a day, 15 minutes or any other period that is used to analyze the collected utilization statistics. Assumptions may be estimated based on business requirements or on specific benchmarks of a similar implementation.
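Finding the peak period in collected utilization statistics can be sketched as a sliding-window search. This is an illustration under stated assumptions: `peak_window` is a hypothetical helper, the samples are made-up CPU percentages taken at a fixed interval, and "peak" is taken to mean the window with the highest average, which is one possible reading of the methodology rather than its definition.

```python
def peak_window(samples, window):
    """Return (start_index, average) of the busiest run of `window` consecutive samples."""
    if window <= 0 or window > len(samples):
        raise ValueError("window must be between 1 and len(samples)")
    current = sum(samples[:window])
    best_sum, best_start = current, 0
    for i in range(window, len(samples)):
        # Slide the window one sample to the right.
        current += samples[i] - samples[i - window]
        start = i - window + 1
        if current > best_sum:
            best_sum, best_start = current, start
    return best_start, best_sum / window


# Made-up CPU utilization samples (percent), one per fixed interval.
cpu_percent = [20, 25, 30, 70, 90, 85, 40, 35]

start, average = peak_window(cpu_percent, 3)
print(f"peak window starts at sample {start}, average {average:.1f}%")
```

The same function can be run with different window lengths (15 minutes, an hour, a day) depending on the sampling interval of your monitoring data.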
  11. The diagram shows the most important factors in the deployment that should be analyzed on a regular basis. Note that certain inspection targets incur overhead when they are being inspected. These targets might not be appropriate for long-term monitoring or may require tuning to minimize impact. Pay attention to Alfresco transformations and text extraction: Alfresco executes a high number of transformations while working with documents, including text extraction (for indexing), preview generation, thumbnail generation and renditions. It is wise to regularly monitor the health and performance of the transformations: check the longest running transformations, measure transformation times and use transformation limits when applicable. HTTP requests and responses: debugging HTTP requests can yield useful information to help you tune your applications. Find out which components are making the page take so long to load, or make sure the JSON your web script is returning looks like you expect it to. There are a couple of tools that can be used to debug HTTP requests and responses, like Charles or Fiddler; other tools are included in the browsers (Firebug, Chrome inspector).
  12. Disable all full-text indexing: we can disable all full-text indexing activities and tune our search layer for performance on metadata-based searches. We can make use of a recent Alfresco feature called Transactional Metadata Queries. To disable full-text search we need to configure the workspace-SpacesStore Solr core. Edit its solrcore.properties file and set the following property: alfresco.index.transformContent=false
     Disable the archive core: if you are not planning on searching for deleted content, you can safely disable the indexing of archived content. Alfresco never searches for content inside files that are deleted/archived. To disable the indexing of archived content, go to solrHome and edit the solr.xml file that resides in the root of this folder, commenting out the archive core as in the XML below.
     <?xml version='1.0' encoding='UTF-8'?><solr sharedLib="lib" persistent="true"> <cores adminPath="/admin/cores" adminHandler="org.alfresco.solr.AlfrescoCoreAdminHandler"> <!-- <core name="archive" instanceDir="archive-SpacesStore"/>--> <core name="alfresco" instanceDir="workspace-SpacesStore"/> </cores> </solr>
     Note that we are commenting out the Solr core named "archive". This prevents Solr from indexing archived content, which saves disk space, Solr memory, CPU during indexing and overall resources. We also need to disable the archive core backup scheduled task. We do this by setting its cron expression to a date very far in the future. You should do this on every Alfresco node (including the new tracking instance).
     # disabling the archive backup as we are not using archive search
     solr.backup.archive.cronExpression=* * * * * ? 2199
     solr.backup.archive.numberToKeep=0
     Optimize your ACL policy: re-use your permissions, use inheritance and use groups. Don't set up specific permissions for users or groups at folder level. Try to re-use your ACLs.
  13. ramBufferSizeMB: ramBufferSizeMB sets the amount of RAM that may be used by Solr indexing for buffering added documents and deletions before they are flushed to disk. Increasing this to 64 or even 128 has generally been found to improve performance, but this depends on the amount of free memory you have available. Analyze the indexing process: during the indexing process, plug in a monitoring tool (YourKit) to check the repository health. Sometimes, during indexing, the repository layer executes heavy, IO/CPU/memory-intensive operations, such as transforming content to text in order to send it to Solr for indexing. This can become a bottleneck when, for example, the transformations are not working properly or the GC cycles are taking a lot of time.
  14. GC tuning: Solr operations are memory intensive, so tuning the garbage collector is an important step towards good performance. The jClarity Censum tool is really good for analyzing GC logs, but there are others.
     Tracking frequency: consider whether you really need tracking to happen every 15 seconds (the default). This can be configured in the Solr configuration files with the cron frequency property: alfresco.cron=0/15 * * * * ? * This property can heavily affect performance, for example during bulk injection of documents or during a Lucene to Solr migration. You can change it to 30 seconds or more when you are re-indexing. This allows more time for the indexing threads to perform their actions before they get more work on their queue.
     Index batch counts: increase your index batch counts to get more results from your indexing webscript on the repository side. In each core's solrcore.properties, raise the batch count to 2000 or more: alfresco.batch.count=2000
     Disk cache: for index updates, Solr relies on fast bulk reads and writes. One way to satisfy these requirements is to ensure that a large disk cache is available. Use local indexes and the fastest disks possible. In a nutshell, you want to have enough memory available in the OS disk cache so that the important parts of your index, or ideally your entire index, fit into the cache. Let's say that you have a Solr index size of 8GB. If your OS, Solr's Java heap and all other running programs require 4GB of memory, then an ideal memory size for that server is at least 12GB. You might be able to make it work with 8GB total memory (leaving 4GB for disk cache), but that also might NOT be enough.
     Troubleshooting common indexing problems. Database: if it is a database performance issue, adding more connections to the connection pool can normally increase performance. I/O: I/O problems commonly occur in virtualized environments. If you are running on a Linux-based system, use hdparm to check disk read speed (there are equivalent tools for Windows), for example: sudo hdparm -Tt /dev/sda The rule for troubleshooting is to test and measure the initial performance, apply some tuning and parameter changes, then retest and measure again until the necessary performance is reached. I strongly advise plugging a profiling tool such as YourKit into both the repository and Solr servers to help with the troubleshooting.
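The memory arithmetic in note 14 can be written down as a small helper. This is a minimal sketch: `recommended_ram_gb` and its `cached_fraction` parameter are hypothetical, and the numbers are the ones from the note (an 8GB index plus 4GB for the OS, Solr heap and other programs).

```python
def recommended_ram_gb(index_gb, other_memory_gb, cached_fraction=1.0):
    """Total RAM needed so that `cached_fraction` of the index fits in the OS disk cache."""
    if not 0.0 < cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be in (0, 1]")
    return other_memory_gb + index_gb * cached_fraction


# Ideal case from the note: the whole 8GB index fits in the disk cache.
ideal = recommended_ram_gb(index_gb=8, other_memory_gb=4)

# Tight case from the note: 8GB total leaves 4GB of cache, i.e. half of the index.
tight = recommended_ram_gb(index_gb=8, other_memory_gb=4, cached_fraction=0.5)

print(f"ideal: {ideal:.0f}GB, tight: {tight:.0f}GB")
```

Whether caching only part of the index is enough depends on which parts are actually read during indexing and search, so the tight figure should be validated by measurement.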
  15. Make sure you are using only one transformation subsystem. Check alfresco-global.properties and see whether you are using OooDirect or JodConverter; never enable both subsystems. Typical issues with searching: it can happen that you are searching and indexing at the same time. This causes concurrent access to the indexes, which is known to cause performance issues. There are some workarounds for this situation. To start, plug in a profiler and look for commit issues (I/O locks); this will let you check whether you are facing this problem.
  16. Solr makes some statistics available by default. Analyzing those statistics can help you tune your Solr caches for your use case. The cache results should be read as follows. If you have many evictions, you should look into increasing the size of that cache so all elements can fit (but don't overdo it: adjust it and see what fits your setup). It is likewise a good idea to decrease the size of caches that have a lot of unused slots. The goal should be to get the hit rate as close to 1.00 as possible (1.00 being a 100% hit ratio).
  17. If your project relies on the Share client offered by Alfresco, you should know that tuning your Solr indexing and search performance will positively affect the overall Share performance.
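The reading rules in note 16 can be turned into a rough decision helper. The sketch below is an assumption-laden illustration: `cache_advice`, `configured_size` and both thresholds are hypothetical, and the counters (lookups, hits, evictions, size) are assumed to have been copied by hand from the statistics page.

```python
def cache_advice(lookups, hits, evictions, size, configured_size,
                 eviction_ratio_threshold=0.1, unused_threshold=0.5):
    """Return (hit_ratio, action) where action is 'increase', 'decrease' or 'keep'.

    Both thresholds are arbitrary illustrative defaults, not Solr or Alfresco values.
    """
    hit_ratio = hits / lookups if lookups else 0.0
    eviction_ratio = evictions / lookups if lookups else 0.0
    if eviction_ratio > eviction_ratio_threshold:
        # Many evictions: the cache is too small for the working set.
        action = "increase"
    elif size < configured_size * unused_threshold:
        # Many unused slots: the cache is larger than it needs to be.
        action = "decrease"
    else:
        action = "keep"
    return hit_ratio, action


# Made-up statistics for three caches.
busy = cache_advice(lookups=1000, hits=600, evictions=300, size=512, configured_size=512)
idle = cache_advice(lookups=1000, hits=990, evictions=0, size=40, configured_size=512)
fine = cache_advice(lookups=1000, hits=950, evictions=5, size=400, configured_size=512)

for name, (ratio, action) in (("busy", busy), ("idle", idle), ("fine", fine)):
    print(f"{name}: hit ratio {ratio:.2f}, suggestion: {action}")
```

As the note says, adjust in small steps and re-check the statistics after each change rather than applying a suggestion blindly.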