PK stands for Primary Key. In Salesforce, every object's record ID field is its primary key, and that field is always indexed, so queries that filter on it are fast. A chunk, in database terms, is a small subset of a table's records. PK chunking, then, splits a large query into manageable chunks based on record IDs, which improves bulk query performance and reliability.
When to Use PK Chunking?
We should enable PK chunking when querying tables with more than 10 million records, or when a bulk query consistently times out. PK chunking is a supported feature of the Salesforce Bulk API, so the API does all the work of splitting the query into manageable chunks.
How to enable PK Chunking?
To enable automatic PK chunking for a bulk query job, we use the PK chunking request header (Sforce-Enable-PKChunking). PK chunking splits bulk queries on large tables into chunks based on the record IDs, or primary keys, of the queried records. Each chunk is processed as a separate batch that counts toward our daily batch limit, and we must download each batch's results separately. PK chunking works only with queries that don't include subqueries or conditions other than WHERE.
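As a sketch, the header is simply attached when the Bulk API job is created. The `build_bulk_headers` helper below is hypothetical (not part of any Salesforce library), but the header names themselves (`X-SFDC-Session`, `Sforce-Enable-PKChunking`) are the documented Bulk API request headers:

```python
# Sketch: assembling request headers for a Bulk API query job with
# PK chunking enabled. build_bulk_headers is a hypothetical helper.

def build_bulk_headers(session_id, chunk_size=None, start_row=None):
    headers = {
        "X-SFDC-Session": session_id,
        "Content-Type": "application/xml; charset=UTF-8",
    }
    parts = []
    if chunk_size is not None:
        parts.append(f"chunkSize={chunk_size}")
    if start_row is not None:
        parts.append(f"startRow={start_row}")
    if parts:
        headers["Sforce-Enable-PKChunking"] = "; ".join(parts)
    else:
        # Setting the header to TRUE enables PK chunking with the
        # default chunk size of 100,000.
        headers["Sforce-Enable-PKChunking"] = "TRUE"
    return headers

# session_id here is a placeholder, not a real token
print(build_bulk_headers("sessionId", chunk_size=25000))
```

These headers would then be sent with the POST request that creates the bulk job.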
How Does PK Chunking Work?
In general terms, we should first query the target table to identify a number of chunks of records with sequential IDs. Then we should submit separate queries to extract the data in each chunk and finally combine the results.
PK chunking works by adding record ID boundaries to the query with a WHERE clause, limiting the query results to a smaller chunk of the total results. The remaining results are fetched with extra queries that contain successive boundaries. The number of records within the ID boundaries of each chunk is referred to as the chunk size. The first query retrieves records between a specified starting ID and the starting ID plus the chunk size. The next query retrieves the next chunk of records, and so on.
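The successive bounded queries described above can be sketched as follows. Real Salesforce record IDs are alphanumeric and the Bulk API computes the boundaries internally; this illustration simplifies them to integers just to show the pattern of WHERE clauses:

```python
# Illustrative sketch of PK chunking: each chunk is the original query
# restricted to a successive ID range via a WHERE clause. Integer IDs
# are a simplification; the Bulk API does this on real record IDs.

def chunked_queries(soql, min_id, max_id, chunk_size):
    """Yield one ID-bounded query per chunk of the ID range."""
    start = min_id
    while start <= max_id:
        end = start + chunk_size
        yield f"{soql} WHERE Id >= {start} AND Id < {end}"
        start = end

for q in chunked_queries("SELECT Id, Name FROM Account", 0, 250_000, 100_000):
    print(q)
```

Each generated query corresponds to one batch, and the union of all chunk results equals the result of the original unbounded query.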
When a query is successfully chunked, the original batch’s status shows as NOT_PROCESSED. If the chunking fails, the original batch’s status shows as FAILED, but any chunked batches that were successfully queued during the chunking attempt are processed as normal.
The default chunk size is 100,000, and the maximum size is 250,000. The default starting ID is the first record in the table. However, we can specify a different starting ID to restart a job that failed between chunked batches.
Sforce-Enable-PKChunking: chunkSize=25000; startRow=00130000000xEftMGH
The PK chunking header above sets a chunk size of 25,000 and starts chunking from the record with ID 00130000000xEftMGH.