How to use a custom iterator in an Apex batch class

Apex batch classes are used to process large volumes of records that would hit governor limits if processed synchronously. A batch class splits a large set of records into small chunks and processes each chunk as a separate transaction. A batch class can work on up to 50 million sObject records when the start method returns a QueryLocator. When start returns an Iterable instead, the usual limit of 50,000 rows retrieved by SOQL queries still applies. Using a list of sObjects is a common pattern and you can find a sample batch class template here. There are cases where an Iterator is more meaningful; you can find some use cases below. There are two ways of using the iterator approach:

  1. Use a custom iterator, defined in a class that implements the Iterable interface
  2. Use a standard Apex collection such as a List as the iterator, because lists implement the Iterable interface behind the scenes
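
The first option can be sketched roughly as follows. The class names AnimalIdIterable and AnimalIdIterator are hypothetical and chosen only for illustration; the only platform pieces assumed are the standard Iterable and Iterator interfaces. A batch start method declared as returning Iterable<String> could return an instance of such a class instead of a plain list.

```apex
// Hypothetical example class, not part of the original post.
public class AnimalIdIterable implements Iterable<String> {
    private final List<String> ids;

    public AnimalIdIterable(List<String> ids) {
        this.ids = ids;
    }

    // Required by Iterable: hand out a fresh iterator over the IDs.
    public Iterator<String> iterator() {
        return new AnimalIdIterator(ids);
    }

    // Hypothetical inner class that walks the list one element at a time.
    public class AnimalIdIterator implements Iterator<String> {
        private final List<String> ids;
        private Integer position = 0;

        public AnimalIdIterator(List<String> ids) {
            this.ids = ids;
        }

        // True while there are elements left to return.
        public Boolean hasNext() {
            return position < ids.size();
        }

        // Return the current element and advance the position.
        public String next() {
            if (!hasNext()) {
                throw new NoSuchElementException('No more animal IDs.');
            }
            return ids[position++];
        }
    }
}
```

For a plain list of strings this adds nothing over returning the list itself; a custom class starts to pay off when the elements have to be computed, filtered, or merged from several sources as they are handed out.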

Use cases

  • If your batch class needs to operate on data obtained from two unrelated objects, you can create a custom iterable class and return the combined data from both objects as a single iterable from the start method.
  • Say you have an API that returns ~5k product IDs, and you then need to hit another API once per product to get more details. You can return the list of product IDs from the start method to achieve this.
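
As a rough illustration of the first use case, the sketch below combines records from two objects into one generic list and returns it from start. The class name CombinedRecordsBatch, the choice of Account and Lead, and the LIMIT values are all assumptions made for this example; the post itself does not prescribe them.

```apex
// Hypothetical batch class: objects, fields and limits are illustrative only.
public class CombinedRecordsBatch implements Database.Batchable<SObject> {

    // Collect records from two unrelated objects into one generic list.
    public Iterable<SObject> start(Database.BatchableContext bc) {
        List<SObject> combined = new List<SObject>();
        combined.addAll((List<SObject>) [SELECT Id, Name FROM Account LIMIT 1000]);
        combined.addAll((List<SObject>) [SELECT Id, Name FROM Lead LIMIT 1000]);
        return combined;
    }

    // Each chunk may contain a mix of both types, so branch on the sObject type.
    public void execute(Database.BatchableContext bc, List<SObject> scope) {
        for (SObject record : scope) {
            if (record.getSObjectType() == Account.SObjectType) {
                System.debug('Processing account ' + record.Id);
            } else if (record.getSObjectType() == Lead.SObjectType) {
                System.debug('Processing lead ' + record.Id);
            }
        }
    }

    public void finish(Database.BatchableContext bc) {
        // Nothing to clean up in this sketch.
    }
}
```

Because both queries run inside start, their rows count against the normal SOQL row limit for that one transaction, which is why the sketch caps each query.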

/*******************************************************************************************
* @Name IteratorExampleBatch
* @Author FirstName LastName <author@email.com>
* @Date 01/13/2020
* @Group Apex Batches
* @Description This batch class demonstrates use of iterator in apex batch class.
*******************************************************************************************/
/* MODIFICATION LOG
* Version Developer Date Description
*-------------------------------------------------------------------------------------------
* 1.0 Firstname 01/13/2020 Initial Creation
*******************************************************************************************/
global class IteratorExampleBatch implements Database.Batchable<String>, Database.AllowsCallouts {
    //Make sure you add the endpoint below to remote site settings
    private final String API_BASE = 'https://th-apex-http-callout.herokuapp.com/';
    /**************************************************************************************
    * @Description This method gets all animals, collects the corresponding animal IDs and
    * sends them to execute for processing
    * @Param bc - BatchableContext
    * @Return Iterable<String> - List of animals whose details need to be retrieved
    **************************************************************************************/
    global Iterable<String> start(Database.BatchableContext bc) {
        List<String> animalIdList = new List<String>();
        //Move the API call code to a separate class for reusability
        Http http = new Http();
        HttpRequest request = new HttpRequest();
        request.setEndpoint(API_BASE + 'animals');
        request.setMethod('GET');
        HttpResponse response = http.send(request);
        if (response.getStatusCode() == 200) {
            Map<String, Object> resultMap = (Map<String, Object>) JSON.deserializeUntyped(response.getBody());
            for (Object animal : (List<Object>) resultMap.get('animals')) {
                animalIdList.add((String) animal);
            }
        }
        //A List is already an Iterable, so it can be returned directly
        return animalIdList;
    }
    /**************************************************************************************
    * @Description This method gets details of each individual animal by hitting the API
    * @Param bc - BatchableContext
    * @Param animalIdList - Chunk of animal IDs that needs to be processed
    * @Return NA
    **************************************************************************************/
    global void execute(Database.BatchableContext bc, List<String> animalIdList) {
        //Do some heavy processing here. In this case get more details about each animal :)
        //Move the API call code to a separate class for reusability
        Http http = new Http();
        HttpRequest request = new HttpRequest();
        request.setMethod('GET');
        for (String animalId : animalIdList) {
            System.debug('Getting details about ' + animalId);
            request.setEndpoint(API_BASE + 'animals/' + animalId);
            HttpResponse response = http.send(request);
            if (response.getStatusCode() == 200) {
                Map<String, Object> resultMap = (Map<String, Object>) JSON.deserializeUntyped(response.getBody());
                System.debug('Got details about animal ' + response.getBody());
                //Bulkify the results and store in records
            } else {
                System.debug('Unable to get details about animal => ' + response.getStatusCode() + ' ' + response.getBody());
            }
        }
    }
    /**************************************************************************************
    * @Description This method executes any post-processing operations
    * @Param bc - BatchableContext
    * @Return NA
    **************************************************************************************/
    global void finish(Database.BatchableContext bc) {
        // execute any post-processing operations
    }
}

Explanation

Apex batch classes can return an iterable to process large volumes of data that are not queried directly from the database/sObjects. The iterable can be a class implementing the Iterable interface or a standard Apex data structure such as a list. In the example above we get a list of animals, and then the details of each individual animal, from an API hosted on Heroku. It is used by Salesforce for Trailhead modules. Here "baseUrl/animals" gives a list of animals and "baseUrl/animals/animalId" gives the details of a single animal.

Suppose the API that lists animals returns 1000 animal IDs. We cannot fetch the details of every animal individually in a single transaction, because the governor limit on callouts is currently 100 per transaction. So the work has to be split across multiple transactions, and the best way to do this is an Apex batch. Here the start method makes an API call to get the list of all animals, processes the result, and returns the list of all animal IDs. Because this is a batch, Salesforce then calls the execute method multiple times, each time with a subset of the full list whose size is the batch size we specify when starting the batch. This keeps us within governor limits as long as we choose a suitable batch size (at most 100 in our case, since execute makes one callout per animal ID and the limit is 100 callouts per transaction).
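
The arithmetic behind the batch size can be sketched with a small helper. CalloutBudget and its method names are hypothetical and exist only for this illustration; Limits.getLimitCallouts() and Limits.getCallouts() are the standard methods that report the callout limit and the callouts already used in the current transaction.

```apex
// Hypothetical helper class, shown only to illustrate the reasoning above.
public class CalloutBudget {

    // Number of execute() invocations needed for a record count and batch size
    // (integer division rounded up).
    public static Integer executionsNeeded(Integer totalRecords, Integer batchSize) {
        return (totalRecords + batchSize - 1) / batchSize;
    }

    // Callouts still available in the current transaction.
    public static Integer remainingCallouts() {
        return Limits.getLimitCallouts() - Limits.getCallouts();
    }
}
```

With 1000 animal IDs and a batch size of 10, executionsNeeded(1000, 10) gives 100 separate execute transactions, each making 10 callouts. A check on remainingCallouts() inside the loop in execute would be one way to stop early if the batch size were ever set too high.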


How to use

  • Add https://th-apex-http-callout.herokuapp.com/ to remote site settings to allow callouts to this URL.
  • Create an Apex class named "IteratorExampleBatch" and paste in the code from above.
  • Execute the batch using the "Developer Console" => "Execute Anonymous" option. You can use Database.executeBatch(new IteratorExampleBatch(), 10); to run the batch with a batch size of 10.
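
A slightly longer Execute Anonymous snippet, given here as a sketch, also captures the job ID returned by Database.executeBatch and looks up the job in the standard AsyncApexJob object. The variable names are arbitrary.

```apex
// Start the batch with a batch size of 10 and keep the returned job ID.
Id jobId = Database.executeBatch(new IteratorExampleBatch(), 10);

// Look up the job to see how far it has progressed.
AsyncApexJob job = [
    SELECT Status, JobItemsProcessed, TotalJobItems, NumberOfErrors
    FROM AsyncApexJob
    WHERE Id = :jobId
];
System.debug('Status: ' + job.Status
    + ', processed ' + job.JobItemsProcessed + ' of ' + job.TotalJobItems
    + ' batches, errors: ' + job.NumberOfErrors);
```

The batch runs asynchronously, so a query made right after submission will usually show a job that has not finished yet; running the query again later, or checking the Apex Jobs page in Setup, shows the final result.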

4 comments:

  1. Can we make this scheduled? Please share your thoughts.

    1. You can. Just implement the Schedulable interface in the batch, or call it from another Schedulable class that executes this batch.

  2. Don't you think at line 36, we should return something like animalIdList.iterator();

    Please correct me if I am wrong
