2. Southeast Dreamin’ 2019
• Dates: March 21-22, 2019
• Where: Atlanta, GA
• Venue: Marriott Buckhead
• Website: bit.ly/sed2019
Calls for Sponsors and Speakers are open!
bit.ly/sed2019sponsor
bit.ly/sed2019cfp
3. Agenda
• Salesforce platform governor limits
• Definition of LDV
• Data Analysis
• Data Modeling best practices
• Reporting
• Technical debt
4. Governor Limits
• Salesforce Limits
– Apex CPU time limit exceeded
– SOQL records returned: 50,000
– TooManyLockFailure error
– Too many SOQL queries: 101
– Unable to activate entity
– Report session timed-out
5. What is cloud computing? Trust
It’s a shared multi-tenant environment that is accessible to
users with access to the internet.
7. What is a Large Data Volume implementation?
• Salesforce definition:
– This paper is for experienced application architects who work with Salesforce
deployments that contain large data volumes. A “large data volume” is an
imprecise, elastic term, but if your deployment has tens of thousands of users,
tens of millions of records, or hundreds of gigabytes of total record storage,
then you can use the information in this paper. A lot of that information also
applies to smaller deployments, and if you work with those, you might still
learn something from this document and its best practices.
• Details
– No mention of size in Gigabytes.
– 100 million records: is that specific to a single object?
– Where is the cutoff before it’s large?
13. What happens when you save a record?
Overview of the Order of Execution of a Save – Last Updated June 15 2015
Reference: https://www.salesforce.com/us/developer/docs/apexcode/Content/apex_triggers_order_of_execution.htm
[Flowchart: boxes are listed below in execution order. Step numbers from the diagram appear only where the text states them; "comment N" refers to the numbered comments at the end.]
• Loads the original record’s value. trigger.old keeps the original values.
• Updates trigger.new with values from the request.
• Branch on the origin of the request:
– Request came from Standard UI edit page: Standard System Validations (comment 1).
– Request DID NOT come from Standard UI, e.g. upsert via Apex or a Webservice call.
• Execute all before triggers (comment 2).
• Standard System Validations (comment 3), Custom Validations, Duplicate Rules.
• Saves the record to the database. No commit yet.
• Execute all after triggers (comment 2).
• Executes Assignment Rules (Lead & Case only).
• Executes Auto-Response Rules (Lead & Case only).
• Workflows (comment 4). Workflow Evaluation actions: Field Updates (FU), Email Alerts, Create Tasks (comment 7), Outbound Message, Flow Trigger (comment 8).
• If there are Field Updates (FU? YES):
– Updates trigger.new with values from the updated records, including new field updates.
– Execute before update triggers (comment 2).
– Standard System Validations (comment 3). NO Custom Validations, Duplicate Rules.
– Saves the record to the database. No commit yet.
– Execute after update triggers (comment 2).
– Workflow Re-Evaluation.
• All of these steps are repeated during up to 5 Workflow Re-Evaluations (6 iterations in total). Re-Evaluations apply to Workflow FU only!
• In total you can have up to 13x calls of Before Triggers, 13x After Triggers, 6x Field Updates, and 6x Flow Triggers.
• Execute Escalation Rules (Case only).
• Execute Entitlement Rules (Case only).
• Calculates Roll-up Summary Fields on Parent Records. Saves Parent Record (comment 7).
• Calculates Roll-up Summaries on Grand-parent Records. Saves Grand-parent Record (comment 7).
• Executes Criteria Based Sharing (CBS) evaluation. All changes to the sharing table are calculated.
• Commits all DML operations to the Database (comment 5).
• Executes Post-commit logic, such as sending email (comment 6).
During a recursive save, Salesforce skips steps 8 (assignment rules) through 17 (roll-up summary in the grandparent record).
Comments
1) Standard System Validations (step 2b)
• Compliance with layout-specific rules
• Required values at the layout level and field-definition level
• Valid field formats
• Maximum field length
2) Trigger (step 3, 6, 13b, 13e)
• If you have more than one trigger for the same object, the order will be random. -> Consider one trigger per object only to get ahead of this behavior
3) System Validations (step 4, 13c)
• Required values at the layout level and field-definition level
• Valid field formats
• Maximum field length
4) Workflow Rules (step 10+)
• Maximum of 5 re-evaluations and recursions, maximum of 6 iterations
• A particular Workflow runs only once
• Field Updates are executed before all other WF actions
• An Approval Workflow is treated as workflow
5) Commit to Database (step 19)
• Commit of ALL new, updated, and deleted records
• Commit of ALL new, updated, and deleted Sharing rules
6) Post Commit Logic (PCL) (step 20)
• Sending Email
• Outbound (OB) Messages placed on OB message queue
• Time based workflow actions
• Calculate Index, such as Search Index
• Render file previews
7) Save of another record
• For any other record created, deleted or updated within triggers, workflows (tasks), and flows, an entire save will be called and executed at the same point in time.
8) Flow Trigger Input
• Input parameters are taken at the moment the flow trigger starts, not at the moment the workflow evaluation takes place.
Any feedback? Please contact me, Marc Kirsch: mkirsch@vlocity.com
15. Data Analysis checklist
• Data Integrity
– Duplicates
– Email format
– Phone format
– Text field with carriage returns, images
• Data quantity
– SQL group by clause
– SQL group by date last modified
• Data outputs
– Revenue reports
– Mailing lists
– General Ledger integration
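The data integrity items in the checklist above can be sketched in a few lines of Python. This is a minimal illustration only: it assumes exported rows are dictionaries with hypothetical `Email`, `Phone`, and `Description` keys, and the two regular expressions are deliberately loose assumptions, not Salesforce's own validation rules.

```python
import re
from collections import Counter

# Deliberately loose patterns (assumptions for illustration only).
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
PHONE_RE = re.compile(r"^\+?[\d\s().-]{7,20}$")


def integrity_report(rows):
    """Count common data-integrity problems in a list of exported rows."""
    # Normalise emails so "A@x.com" and "a@x.com" count as duplicates.
    emails = Counter(r["Email"].strip().lower() for r in rows if r.get("Email"))
    bad_email = 0
    bad_phone = 0
    carriage_returns = 0
    for r in rows:
        if r.get("Email") and not EMAIL_RE.match(r["Email"].strip()):
            bad_email += 1
        if r.get("Phone") and not PHONE_RE.match(r["Phone"].strip()):
            bad_phone += 1
        text = r.get("Description") or ""
        if "\r" in text or "\n" in text:
            carriage_returns += 1
    return {
        "duplicate_emails": sorted(e for e, n in emails.items() if n > 1),
        "bad_email": bad_email,
        "bad_phone": bad_phone,
        "carriage_returns": carriage_returns,
    }


sample = [
    {"Email": "a@example.com", "Phone": "404-555-0100", "Description": "ok"},
    {"Email": "A@example.com", "Phone": "n/a", "Description": "line1\r\nline2"},
    {"Email": "not-an-email", "Phone": "", "Description": None},
]
report = integrity_report(sample)
print(report)
```

Running a report like this against a staged export before any load gives a quick count of how much cleanup each category needs.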
17. Field Type Considerations
Salesforce field type: data architect’s perspective
• Lookup: Joins to records.
• Formula: SQL function.
• Rollup Summary: SQL functions that aggregate records. Cannot be shut off.
• Filtered lookup: SQL select on a joined record. Filtered fields in a managed package cannot be deactivated.
18. Object relationship
Relationship type: LDV issue
• Master-detail / Lookup: More than 10,000 child records pointing to one master record is the definition of lookup skew.
• Ownership lookup: Causes problems when inserting records under a single user or reassigning records to another user.
• Junction object: Both master records must exist before creating the record in the junction object.
19. When lookup skew performance issues arise
• Export the Salesforce data into a database.
• Focus on the lookup fields.
• Is there a hierarchy?
• Use SQL: GROUP BY <lookup field> HAVING COUNT(*) > 10,000
• Assess business impact.
• Break up the skew.
– Dates
– Identified segment.
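The GROUP BY / HAVING check from this slide can be sketched against an exported copy of the data. The example below is a minimal sketch using Python's built-in sqlite3 module; the `contact` table, the `account_id` column, and the sample rows are hypothetical stand-ins for whatever the export actually contains.

```python
import sqlite3


def find_lookup_skew(conn, table, lookup_field, threshold=10_000):
    """Return (parent_id, child_count) pairs whose child count exceeds threshold."""
    # Table and column names are interpolated, so only pass trusted identifiers.
    sql = (
        f"SELECT {lookup_field}, COUNT(*) AS child_count "
        f"FROM {table} "
        f"GROUP BY {lookup_field} "
        f"HAVING COUNT(*) > ? "
        f"ORDER BY child_count DESC"
    )
    return conn.execute(sql, (threshold,)).fetchall()


# Hypothetical export: 12,000 contacts on one account, 50 on another.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contact (id INTEGER PRIMARY KEY, account_id TEXT)")
rows = [("ACC-BIG",)] * 12_000 + [("ACC-SMALL",)] * 50
conn.executemany("INSERT INTO contact (account_id) VALUES (?)", rows)

skewed = find_lookup_skew(conn, "contact", "account_id")
print(skewed)
```

Only the parent with more than 10,000 children is returned; those are the records to review for business impact before breaking up the skew.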
20. What about ownership skew?
• How complex is the organization sharing model?
– Private vs public
– Complex sharing rules
– Group calculations
• Is it an integration user?
– If yes, then assign ownership to the record upon creation.
• Is this a one-time data migration or ownership change?
– Defer sharing is awesome but don’t forget it’s deferred.
21. What is so bad about Junction objects?
• More integration points
– Master records must exist.
– Instead of 1 load, there are 3 data load operations.
• Lookup skew
• Ownership skew
23. Reasons for Slow Reports
• Reasons for slow reports
– Querying too many objects
– Dealing with intricate lookups
– Too many fields
• If you can’t view a report and want to edit it to avoid the time-out, append
?edit=1 to the report URL. Doing so gets you into edit mode, where you can
adjust the criteria.
24. Report Tuning
• Try these tips to get your reports to run more efficiently.
– When filtering, use the equals or not equal to operators instead of contains or
does not contain. For example, use Account Owner equals John James, not
Account Owner contains John. Choose AND rather than OR for filter logic.
– To narrow your report date range, use time frame filters. For example, use Last 30
Days instead of Current FY.
– Set time frame filters by choosing a Date Field and Range to view. Only records
for that time frame are shown.
25. Report Tuning
• Try these tips to get your reports to run more efficiently.
– Reduce the number of fields in the report by removing unnecessary columns or
fields.
– If you receive an error message saying that your activity report has too many
results, filter on a picklist, text, or date field. Alternatively, rerun the report using
a different activity data type such as “Activities with Accounts” or “Activities
with Opportunities”.
– Add time filters, scope filters, and filter criteria to the report to further narrow the
results.
26. More Options
• Still doesn’t work?
– Filter on Standard Indexed Fields
– Work with Salesforce to determine if custom indexes should be created on
fields you filter by
• Do the math…
27. Force.com Query Optimizer
• The Force.com Query Optimizer
– An engine that sits between your SOQL, reports, and list views and the database
itself.
– Because of salesforce.com’s multitenancy, the optimizer gathers its own
statistics instead of relying on the underlying database statistics.
– Using both these statistics and pre-queries, the optimizer generates the most
optimized SQL to fetch your data. It looks at each filter in your WHERE clause to
determine which index, if any, should drive your query.
28. Force.com Query Optimizer
• It’s a Numbers Game…
– To determine if an index should be used to drive a query, the Force.com query
optimizer checks the number of records targeted by the filter against selectivity
thresholds.
29. Standard Index Selectivity
• Standard Index Selectivity
– The threshold is 30 percent of the first million targeted records and 15 percent
of all records after that first million.
– It maxes out at 1 million total targeted records, which you could reach only if
you had more than 5.6 million total records.
– So if you had 2.5 million accounts, and your SOQL contained a filter on a
standard index, that index would drive your query if the filter targeted fewer
than 525,000 accounts.
– (30% of 1 to 1 million targeted records) + (15% of 1 million to 2.5 million
targeted records) = 300,000 + 225,000 = 525,000
30. Custom Index Selectivity
• Custom Index Selectivity
– The selectivity threshold is 10 percent of the first million targeted records and
5 percent of all records after that first million.
– The selectivity threshold for a custom index maxes out at 333,333 targeted
records, which you could reach only if you had more than 5.6 million records.
– So if you had 2.5 million accounts, and your SOQL contained a filter on a
custom index, that index would drive your query if the filter targeted fewer
than 175,000 accounts.
– (10% of 1 to 1 million targeted records) + (5% of 1 million to 2.5 million
targeted records) = 100,000 + 75,000 = 175,000
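The arithmetic on the two selectivity slides can be captured in a small helper. This is a sketch of the thresholds exactly as quoted above (30%/15% capped at 1,000,000 for standard indexes, 10%/5% capped at 333,333 for custom indexes); the function and dictionary names are made up for illustration and are not part of any Salesforce API.

```python
ONE_MILLION = 1_000_000

# (percent of first million, percent of the remainder, cap), as quoted on the slides.
THRESHOLDS = {
    "standard": (30, 15, 1_000_000),
    "custom": (10, 5, 333_333),
}


def selectivity_threshold(total_records, index_type):
    """Maximum number of targeted records for which the index can drive the query."""
    first_pct, rest_pct, cap = THRESHOLDS[index_type]
    first = min(total_records, ONE_MILLION)
    rest = max(total_records - ONE_MILLION, 0)
    # Integer arithmetic avoids floating-point rounding on large counts.
    return min((first * first_pct + rest * rest_pct) // 100, cap)


def index_can_drive_query(targeted_records, total_records, index_type):
    """True when the filter targets fewer records than the threshold."""
    return targeted_records < selectivity_threshold(total_records, index_type)


print(selectivity_threshold(2_500_000, "standard"))  # 525000
print(selectivity_threshold(2_500_000, "custom"))    # 175000
```

With 2.5 million accounts this reproduces the 525,000 and 175,000 figures from the slides, and any object above roughly 5.67 million records hits the caps.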
31. Non-Selective SOQL Queries
• Common Causes of Non-Selective SOQL Queries
– Having too much data (LDV)
– Performing large data loads
• Large data loads and deletions can affect query performance. The Force.com query
optimizer uses the total number of records as part of the calculation for its selectivity
threshold.
• When the Force.com query optimizer judges returned records against its thresholds,
all of the records that appear in the Recycle Bin or are marked for physical delete do
still count against your total number of records.
– Using Leading % Wildcards
• A LIKE condition with a leading % wildcard does not use an index
• Within a report/list view, the CONTAINS clause translates into ‘%string%’.
32. Non-Selective SOQL Queries (cont)
• Common Causes of Non-Selective SOQL Queries
– Using NOT and !=
• When your filter uses != or NOT (including NOT EQUALS/CONTAINS in reports), the
Force.com query optimizer can’t use the index to drive the query, even if the
field is indexed.
• Instead of: SELECT Id FROM Case WHERE Status != 'Closed'
• Use: SELECT Id FROM Case WHERE Status IN ('New', 'On Hold', 'Pending', 'ReOpened')
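One way to apply this rewrite consistently is to generate the positive IN list from the known picklist values. The sketch below assumes the five Case status values shown are the complete picklist for the org, which is an assumption for illustration; `soql_literal` and `positive_filter` are hypothetical helper names.

```python
def soql_literal(value):
    """Quote a string for SOQL, escaping backslashes and single quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def positive_filter(field, all_values, excluded):
    """Build 'field IN (...)' listing every value except the excluded ones."""
    excluded_set = set(excluded)
    keep = [v for v in all_values if v not in excluded_set]
    if not keep:
        raise ValueError("no values left to filter on")
    values = ", ".join(soql_literal(v) for v in keep)
    return f"{field} IN ({values})"


# Assumed complete list of Case status values for this org.
case_statuses = ["New", "On Hold", "Pending", "ReOpened", "Closed"]
query = "SELECT Id FROM Case WHERE " + positive_filter(
    "Status", case_statuses, ["Closed"]
)
print(query)
```

The generated filter only stays equivalent to the != version while the picklist list is kept up to date, so new status values must be added to it.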
33. Non-Selective SOQL Queries (cont)
• Common Causes of Non-Selective SOQL Queries
– Using Complex Joins
• OR Condition
– For Force.com to use an index for an OR condition, all of the fields in the condition must be
indexed and meet the selectivity threshold. If the fields in the OR condition are in multiple
objects, and one or more of those fields does not meet a selectivity threshold, the query can
be expensive.
• Formula fields
– Filters on formula fields that are non-deterministic can’t be indexed and result in additional
joins.
– If you have large data volumes and are planning to use this formula field in several queries,
creating a separate field to hold the value will perform better than following either of the
previous common practices. You’ll need to create a workflow rule or trigger to update this
second field, have this new field indexed, and use it in your queries.
34. Reporting Guidelines for clients with LDV
• Ensure that your queries are selective.
• Understand your schema and have proper indexes created if needed.
• Apply as many filters as possible to reduce the result set.
• Minimize the number of records in the Recycle Bin.
• Remember that NOT operations and LIKE conditions with a leading %
wildcard do not use indexes, and complex joins might perform better as
separate queries.
• If the object has more than 5.6 million records and reports don’t work, you
may need to explore off-platform options.
36. Reporting options
• Data Feed
– ETL Tool
– API callout from Salesforce
• SQL world
– SQL Views/Tables
– SQL Procedure Tables
– SQL Functions
• Digestion
– Security (PCI)
– Results
• Monetary Costs $$$$
• Technical debt
• Ease of Use
37. #NoSQL vs SQL
NoSQL (MongoDB)
• Dynamic schema: This gives you flexibility to change your data schema
without modifying any of your existing data.
• Scalability: MongoDB is horizontally scalable, which helps reduce the
workload and scale your business with ease.
• Manageability: The database doesn’t require a database administrator.
Since it is fairly user-friendly in this way, it can be used by both
developers and administrators.
• Speed: It’s high-performing for simple queries.
• Flexibility: You can add new columns or fields in MongoDB without
affecting existing rows or application performance.
SQL (MySQL)
• Maturity: MySQL is an extremely established database, meaning that
there’s a huge community, extensive testing and quite a bit of stability.
• Compatibility: MySQL is available for all major platforms, including
Linux, Windows, Mac, BSD and Solaris. It also has connectors to languages
like Node.js, Ruby, C#, C++, Java, Perl, Python and PHP, meaning that
it’s not limited to SQL query language.
• Cost-effective: The database is open source and free.
• Replicable: The MySQL database can be replicated across multiple nodes,
meaning that the workload can be reduced and the scalability and
availability of the application can be increased.
• Sharding: While sharding cannot be done on most SQL databases, it can be
done on MySQL servers. This is both cost-effective and good for business.
41. Data load strategy for migration and integration
• Stage the data.
• Triggers off
– Custom setting
– Custom metadata types
– Custom labels
• Deactivate functionality
– Workflow rules
– Process builders
– Validation rules
• Load order is set by hierarchy
– Master (Account, Campaign)
– Child (Opportunity)
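Deriving the load order from the hierarchy can be sketched as a topological sort: every object is loaded only after the objects it looks up to. The example uses Python's standard graphlib module (Python 3.9 or later); the schema below is a hypothetical illustration that extends the Account, Campaign, and Opportunity example with a Contact and a junction object.

```python
from graphlib import TopologicalSorter

# Each object maps to the parent objects it looks up to (illustrative schema).
parents = {
    "Account": set(),
    "Campaign": set(),
    "Contact": {"Account"},
    "Opportunity": {"Account", "Campaign"},
    "CampaignMember": {"Campaign", "Contact"},  # junction object
}

# static_order() yields every object after all of its parents.
load_order = list(TopologicalSorter(parents).static_order())
print(load_order)
```

Masters come out before their children and the junction object comes out after both of its masters. A circular lookup in the schema raises graphlib.CycleError, which is a useful signal that the load needs a second pass to fill in those lookups.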
42. Asynchronous process
• Apex jobs are not included in your service level agreement (SLA).
• Pushing the operation out to a separate process (Lightning Event).
– Is that acceptable?
• Extending the code base beyond the trigger and classes invoked by the
save operation.
• Clutters the scheduled job queue.
• Prone to return more than 50,000 records in a SOQL query.
• https://developer.salesforce.com/blogs/engineering/2014/05/4-steps-successful-asynchronous-processing-in-force-com.html
• https://developer.salesforce.com/docs/atlas.en-us.216.0.integration_patterns_and_practices.meta/integration_patterns_and_practices/integ_pat_middleware_definitions.htm
43. External Data Sources
Pros
• External data objects can be shown as a related list.
• Field sets can be used for the Visualforce page.
• Can be used in Apex classes to manage external data.
• Leverage Salesforce security.
Cons
• Salesforce Connect cost
• Off-platform database
– Data model
– Indexes
– Security
– Network
• OData provider
– Specific to Salesforce
• Limitations
– Volume
• Sandbox refreshes
44. Big Object
Pros
• Free storage for billions of records.
• Accessible via SOQL
– Visualforce page
– Reporting tools
• Secured via permission set
Cons
• Generally Available as of Winter ’18
• No field sets
• Must use the Metadata API or the Custom Big Object Creator to define the
object.
45. Please explain?
Questions
• Use case?
• How many objects?
• What are the criteria?
• Is there a plan to archive?
• What are the common fields used in the report?
Actions
• Normal form
– 1 to 1 object
– 1 to many
– Many to many
• Rollup field on Account/Contact
– Last Gift Date
– No Email
• Volume
• Indexes
46. LDV tool kit
• User requirements.
• Change Enablement – Org Management
– Dev Sandbox > Dev Pro > Full Sandbox > Production
• Test data generator.
– Mockaroo $50/year for 100,000 records
– GenerateData.com
• ERD visualization tool
• ETL Tool
• Cloud environment
• Cloudtoolkit
– Schema Lister
– Switch
47. Takeaways
• Salesforce owns the platform
– Do not ask for more CPU
– Be prepared to justify indexes/skinny tables.
• Query criteria
• Explain Plans
• Volume
• Expectations
– Understand the tools
– Seek help
– Confirm success requirement
– Know your audience
• End to End testing
• ISV Partners/Developers
– Own the User Experience
– Data model is malleable.
• Declarative (Watch OUT!)
– Process Builders
– Rollup Summary
– Workflow rules
– Other components
48. LDV Architecture
• Go Off Platform
– Data Warehouse
• Read-Only
• Write-Only
– Extract Transform Tool or API
• Processing data
• Extracting data
• Feeding data to reporting platform
– Purchase Reporting/BI tool
• Conga/Apsona - High touch donors
• Visualization
Salesforce data model includes multiple objects