Forcelandia 2016 PK Chunking
1. PK Chunking
Divide and conquer massive objects in Salesforce
Daniel Peter
Lead Applications Engineer, Kenandy
@danieljpeter
Bay Area Salesforce Developer User Group
2. Takeaways: How to avoid these errors
Query not “selective” enough:
•Non-selective query against large object type (more than 100000 rows).
Query takes too long:
•No response from the server
•Time limit exceeded
•Your request exceeded the time limit for processing
Too much data returned in query:
•Too many query rows: 50001
•Remoting response size exceeded maximum of 15 MB.
8. Salesforce Ids (prereq)
•Composite key containing multiple pieces of
data.
•Uses base 62 numbering instead of the more
common base 10.
•Fastest way to find a database row.
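A rough sketch of splitting a 15-character Id into its pieces. The 3-character key prefix and the pod identifier in characters 4 and 5 match the later slides; treating character 6 as reserved is an assumption, and splitId is a made-up name used only for illustration.

```javascript
// Split a 15-character Salesforce Id into its component parts.
// The "reserved" slot is an assumption; the other slices follow the deck.
function splitId(id) {
  if (typeof id !== "string" || id.length !== 15) {
    throw new Error("Expected a 15-character Id");
  }
  return {
    keyPrefix: id.slice(0, 3),    // identifies the object type
    pod: id.slice(3, 5),          // pod identifier (characters 4 and 5)
    reserved: id.slice(5, 6),     // assumed reserved character
    recordNumber: id.slice(6, 15) // 9-character base 62 number
  };
}

var parts = splitId("a00J000000BWNYk");
console.log(parts.keyPrefix, parts.pod, parts.recordNumber);
```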
13. How does PK Chunking work?
Analogy: fetching people in a city.
14. Fetching people in a city: problems
Non-selective
Request:
“get me all the people who are female”
Response:
“yer trippin’!”
15. Fetching people in a city: problems
Timeout
Request:
“find me a 7 foot tall person in a pink tuxedo in Beijing”
Response:
(after searching all day) “I can’t find any! I give up!”
16. Finding people in a city: problems
Too many people found
Request:
“find me all the men in San Francisco with beards”
Response:
(after searching for 10 mins) “The bus is full!”
18. Fetching people in a city: solutions
Non-selective
Request:
“get me all the people who are female, in your small search area”
Response:
“¡Con mucho gusto!”
19. Fetching people in a city: solutions
Timeout
Request:
“find me a 7 foot tall person in a pink tuxedo in Beijing, in your small
search area”
Response:
SP1: “Didn’t find any, sorry!”
SP2: “Didn’t find any, sorry!”
SP3: “Found one!”
SP4: “Didn’t find any, sorry!”
20. Finding people in a city: solutions
Too many people found
Request:
“find me all the men in San Francisco with beards, in your small search
area”
Response:
SP1: 30 people in our bus
SP2: Didn’t find any
SP3: 50 people in our bus
23. QLPK
Salesforce SOAP or REST API – AJAX toolkit works great.
Create and leverage a server-side cursor. Similar to an Apex query
locator (Batch Apex).
Analogy: Print me a phone book of everyone in the city so I can flip
through it.
25. QLPK – AJAX Toolkit Response
Chunk the database, in size of your choice, by offsetting the
queryLocator:
01gJ000000KnRpDIAV-50000
01gJ000000KnRpDIAV-100000
…
01gJ000000KnRpDIAV-39950000
01gJ000000KnRpDIAV-40000000
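A minimal sketch of generating offset locators like the ones listed above. It assumes the queryLocator Id and the total record count come back from the initial query; buildLocators and its parameters are hypothetical names, not part of the AJAX Toolkit.

```javascript
// Build "<locatorId>-<offset>" strings, one per chunk boundary.
// The last locator always points at the total size, so the final chunk may be smaller.
function buildLocators(locatorId, totalSize, chunkSize) {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  var locators = [];
  for (var offset = chunkSize; offset < totalSize; offset += chunkSize) {
    locators.push(locatorId + "-" + offset);
  }
  locators.push(locatorId + "-" + totalSize);
  return locators;
}

var locators = buildLocators("01gJ000000KnRpDIAV", 40000000, 50000);
console.log(locators.length, locators[locators.length - 1]);
```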
26. QLPK – The Chunks
800 chunks
x 50,000 records
40,000,000 total records
Analogy: we have exact addresses for clusters of 50k
people to give to 800 different search parties.
27. QLPK – How to use in a query?
Perform 800 queries with the Id ranges in the WHERE clause:
SELECT Id, Autonumber__c, Some_Number__c
FROM Large_Object__c
WHERE Some_Number__c > 10 AND Some_Number__c < 20
AND Id >= 'a00J000000BWNYk'
AND Id <= 'a00J000000BWO4z'
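One way to stamp out the per-chunk query text, sketched with the object and field names from the slide. buildChunkQuery is a hypothetical helper; the Id check is there only because the values are concatenated into the query string.

```javascript
// Build the SOQL for one chunk, bounded by the first and last Id of that chunk.
function buildChunkQuery(firstId, lastId) {
  var idPattern = /^[0-9A-Za-z]{15}$/;
  if (!idPattern.test(firstId) || !idPattern.test(lastId)) {
    throw new Error("Chunk boundaries must be 15-character Ids");
  }
  return "SELECT Id, Autonumber__c, Some_Number__c " +
    "FROM Large_Object__c " +
    "WHERE Some_Number__c > 10 AND Some_Number__c < 20 " +
    "AND Id >= '" + firstId + "' " +
    "AND Id <= '" + lastId + "'";
}

console.log(buildChunkQuery("a00J000000BWNYk", "a00J000000BWO4z"));
```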
29. QLPK – Parallelism
Yeah it’s 800 queries, but…
They all went out at once, and they might all come
back at once.
Analogy: We hired 800 search parties and unleashed
them on the city at the same time.
31. Base62PK
Get the first and last Id of the database and
extrapolate the ranges in between.
Analogy: Give me the highest and lowest address of
everyone in the city and I will make a phonebook with
every possible address in it. Then we will break that
into chunks.
32. Base62PK – first and last Id
Get the first Id
SELECT Id FROM Large_Object__c ORDER BY Id ASC LIMIT 1
Get the last Id
SELECT Id FROM Large_Object__c ORDER BY Id DESC LIMIT 1
Even on H-U-G-E databases these return F-A-S-T. No problem.
33. Base62PK – extrapolate
1. Chop off the last 9 digits of the 15 digit first/last Ids.
Decompose.
2. Convert the 9 digit base 62 numbers into a Long Integer.
3. Add the chunk size to the first number until you hit or
exceed the last number.
4. Last chunk may be smaller.
5. Convert those Long Integers back to base 62 and
recompose the 15 digit Ids
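The five steps above can be sketched in JavaScript as follows. This is an illustration under assumptions, not the code from the talk: the digit order 0-9, A-Z, a-z is assumed, BigInt stands in for the Long Integer because 9 base 62 digits can exceed JavaScript's safe integer range, and all function names are made up.

```javascript
// Assumed base 62 digit order: 0-9, then A-Z, then a-z.
const ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

function base62ToBigInt(text) {
  let value = 0n;
  for (const ch of text) {
    const digit = ALPHABET.indexOf(ch);
    if (digit < 0) throw new Error("Not a base 62 digit: " + ch);
    value = value * 62n + BigInt(digit);
  }
  return value;
}

function bigIntToBase62(value, width) {
  let text = "";
  for (let rest = value; rest > 0n; rest /= 62n) {
    text = ALPHABET[Number(rest % 62n)] + text;
  }
  return text.padStart(width, "0");
}

// Returns [{first, last}, ...] covering every possible Id from firstId to lastId.
function buildChunks(firstId, lastId, chunkSize) {
  const prefix = firstId.slice(0, 6); // step 1: keep the first 6 characters as-is
  if (lastId.slice(0, 6) !== prefix) {
    throw new Error("First and last Id have different prefixes (pod Ids?)");
  }
  const size = BigInt(chunkSize);
  if (size <= 0n) throw new Error("chunkSize must be positive");
  const last = base62ToBigInt(lastId.slice(6, 15)); // step 2
  const chunks = [];
  // steps 3 and 4: walk forward by chunkSize; the last chunk may be smaller
  for (let start = base62ToBigInt(firstId.slice(6, 15)); start <= last; start += size) {
    const end = start + size - 1n < last ? start + size - 1n : last;
    // step 5: convert back to base 62 and recompose the 15-character Ids
    chunks.push({
      first: prefix + bigIntToBase62(start, 9),
      last: prefix + bigIntToBase62(end, 9)
    });
  }
  return chunks;
}

console.log(buildChunks("a00J000000BWNYk", "a00J000000BWO4z", 1000));
```

Each element has the same first/last shape as the chunkList entries used by the remoting code later in the deck.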
35. Base62PK – issues
•Digits 4 and 5 of the Salesforce Id are the pod
identifier. If the Ids in your org have different
pod Ids, this technique will break unless it is
enhanced to handle them.
•Fragmented Ids lead to sparsely populated
ranges. You will search entire ranges of Ids
which have no records.
37. So which do I pick?
Heterogeneous Pod Ids | Homogeneous Pod Ids
Low Id Fragmentation (<1.5x)
Medium Id Fragmentation (1.5x - 3x)
High Id Fragmentation (>3x)
QLPK X X X
Base62PK X X
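A hedged sketch of estimating which fragmentation band an object falls into. The deck does not define the ratio; this assumes it means the size of the Id range divided by the number of records actually present. The function names are hypothetical, and the digit order 0-9, A-Z, a-z is assumed.

```javascript
const BASE62_DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

// Decode the last 9 characters of a 15-character Id into a BigInt.
function idToNumber(id) {
  let value = 0n;
  for (const ch of id.slice(6, 15)) {
    const digit = BASE62_DIGITS.indexOf(ch);
    if (digit < 0) throw new Error("Not a base 62 digit: " + ch);
    value = value * 62n + BigInt(digit);
  }
  return value;
}

// Assumed definition: possible Ids in the range / records that exist.
function fragmentation(firstId, lastId, recordCount) {
  const span = idToNumber(lastId) - idToNumber(firstId) + 1n;
  const ratio = Number(span) / recordCount; // approximate for very large spans
  const band = ratio < 1.5 ? "low" : ratio <= 3 ? "medium" : "high";
  return { ratio: ratio, band: band };
}

console.log(fragmentation("a00J000000BWNYk", "a00J000000BWO4z", 1000));
```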
38. How do I implement?
•Needs to be orchestrated via language like JS in
your page, or another platform (Heroku)
•Doesn’t work on Lightning Component
Framework (yet). No support for real parallel
controller actions. (boxcarred)
•Has to be Visualforce or Lightning / Visualforce
hybrid.
39. How do I implement?
•Use RemoteActions to get the chunk queries
back into your page.
•Can be granular or aggregate queries!
•Process each chunk query appropriately when it
comes back. EX: update totals on a master
object or push into a master array.
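40.
// Fire one remoting call per chunk. buffer: false sends each call in its own request.
function queryChunks() {
  for (var i = 0; i < chunkList.length; i++) {
    queryChunk(i);
  }
}

function queryChunk(chunkIndex) {
  var chunk = chunkList[chunkIndex];
  Visualforce.remoting.Manager.invokeAction(
    '{!$RemoteAction.Base62PKext.queryChunk}',
    chunk.first, chunk.last,
    function (result, event) {
      if (!event.status) {
        // Failed or timed out: result is not usable, so report it and stop here.
        console.error('Chunk ' + chunkIndex + ' failed: ' + event.message);
        return;
      }
      // Push this chunk's rows into the master array.
      for (var i = 0; i < result.length; i++) {
        objectAnums.push(result[i].Autonumber__c);
      }
      // Once every chunk has reported back, hand off to the completion step.
      queryChunkCount++;
      if (queryChunkCount == chunkList.length) {
        allQueryChunksComplete();
      }
    },
    {escape: false, buffer: false}
  );
}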
42. Landmines
Timeouts – retries
•Cache warming means if you first fail, try and try again!
Concurrency
•Beware: ConcurrentPerOrgApexLimit exceeded
•Keep your individual chunk queries lean. < 5 secs.
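A small sketch of the "try and try again" advice, written callback-style to match Visualforce remoting. withRetries is a hypothetical helper; a real page would likely also cap how many chunk queries are in flight at once to stay under the concurrency limit.

```javascript
// Retry a callback-style action up to maxAttempts times.
// action(cb) must call cb(error, result) exactly once per attempt.
// done(error, result, attempts) is called once with the final outcome.
function withRetries(action, maxAttempts, done) {
  var attempt = 0;
  function tryOnce() {
    attempt++;
    action(function (error, result) {
      if (!error) {
        done(null, result, attempt);
      } else if (attempt < maxAttempts) {
        tryOnce(); // the failed attempt may have warmed the cache
      } else {
        done(error, null, attempt);
      }
    });
  }
  tryOnce();
}

// Usage with a stand-in action that times out twice, then succeeds.
var calls = 0;
withRetries(function (cb) {
  calls++;
  if (calls < 3) { cb(new Error("timeout")); } else { cb(null, ["row"]); }
}, 5, function (error, result, attempts) {
  console.log(error ? "gave up" : "ok after " + attempts + " attempts");
});
```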
44. How did you figure this out?
Had to meet requirements for Kenandy’s largest customer. $2.5B / yr
manufacturer.
High visibility project.
Necessity is the mother of invention!
46. How did you figure this out?
Debug logs from real execution
47. Why doesn’t Salesforce do this?
They do!
(kinda)
The Bulk API uses a similar technique, but it is more
asynchronous and wrapped in a message container to
track progress.
48. More Info
Article on Salesforce Developers Blog
https://developer.salesforce.com/blogs/developer-relations/2015/11/pk-chunking-techniques-massive-orgs.html
GitHub repo
https://github.com/danieljpeter/pkChunking
Bulk API documentation
https://developer.salesforce.com/docs/atlas.en-us.api_asynch.meta/api_asynch/async_api_headers_enable_pk_chunking.htm