# Sunday, 23 January 2011
Salesforce Administrators and Developers are routinely required to manipulate large amounts of data in a single task.

Examples of batch processes include:
  • Deleting all Leads that are older than 2 years
  • Replacing all occurrences of an old value with a new value
  • Updating user records with a new company name

The tools and options available are:
  1. Browser-based Admin features / Execute anonymous in System Log
  2. Data loader
  3. Generalized Batch Apex (Online documentation)
  4. Specialized Batch Apex

Option 1 (browser-based Admin features) covers the features and tools available when logging in to Salesforce directly through a web browser. Transactions are typically synchronous and subject to governor limits.

Option 2 (Data Loader) gives Admins an Excel-like workflow: download data with the Apex Data Loader, manipulate it on a local PC, then upload it back to Salesforce. It is slightly more powerful than the browser-based tools, doesn't require programming skills, and is subject to the web service API governor limits (which are more generous). However, it requires slightly more manual effort and introduces the possibility of human error when mass updating records.

Option 3 (Generalized Batch Apex) introduces asynchronous batch processes that can manipulate up to 50 million records in a single batch. It doesn't require programming (if you use the 3 utility classes provided at the end of this post) and can be executed directly through the web browser, but it is limited to the general use cases those utility classes support.

Option 4 (Specialized Batch Apex) requires Apex programming and provides the most control of batch processing of records (such as updating several object types within a batch or applying complex data enrichment before updating fields).

Batch Apex Class Structure:

The basic structure of a batch apex class looks like:

global class BatchVerbNoun implements Database.Batchable<sObject>{
    global String query; //SOQL selecting the records to process (for example, set in a constructor)

    global Database.QueryLocator start(Database.BatchableContext BC){
        return Database.getQueryLocator(query); //May return up to 50 Million records
    }
  
    global void execute(Database.BatchableContext BC, List<sObject> scope){       
        //Batch gets broken down into several smaller chunks
        //This method gets called for each chunk of work, passing in the scope of records to be processed
    }
   
    global void finish(Database.BatchableContext BC){   
        //This method gets called once when the entire batch is finished
    }
}

An Apex Developer simply fills in the blanks. The start() and finish() methods are each executed once, while the execute() method gets called 1 to N times, once per batch.

Batch Apex Lifecycle:

The Database.executeBatch() method is used to start a batch process. This method takes 2 parameters: an instance of the batch class and the scope.

BatchUpdateFoo batch = new BatchUpdateFoo();
Database.executeBatch(batch, 200);

The scope parameter defines the maximum number of records to be processed in each batch. For example, if the start() method returns 150,000 records and scope is defined as 200, then the overall batch will be broken down into 150,000/200 = 750 batches. In this scenario, the execute() method would be called 750 times, each time receiving 200 records.
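
The batch count is a ceiling division of the record count by the scope. A minimal sketch of that arithmetic, which could be run in Execute Anonymous (the variable names are illustrative and not part of any Salesforce API):

```apex
Integer totalRecords = 150000; //Records returned by start()
Integer scopeSize = 200;       //Second argument to Database.executeBatch()

//Integer division truncates, so add scopeSize - 1 before dividing to round up
Integer batches = (totalRecords + scopeSize - 1) / scopeSize;
System.debug('execute() calls: ' + batches);
```

When the record count is not an exact multiple of the scope, the final call to execute() receives only the remaining records.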

A note on batch sizes: Even though batch processes have significantly more access to system resources, governor limits still apply. A batch that executes a single DML operation may shoot for a batch scope of 500+. Batch executions that initiate a cascade of trigger operations will need to use a smaller scope. 200 is a good general starting point.

Calling executeBatch() places the batch in a queue; when the job is picked up, the start() method is called to determine the size of the batch. There is no guarantee that the batch will start as soon as executeBatch() is called, but in my experience about 90% of batches start processing within 1 minute.

You can log in to Settings/Monitor/Apex Jobs to view batch progress.
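
Progress can also be checked from code. The sketch below assumes the BatchUpdateFoo class from the earlier example exists in the org; it keeps the job ID returned by Database.executeBatch() and reads the matching AsyncApexJob record.

```apex
//Start the batch and keep the job ID that executeBatch() returns
Id jobId = Database.executeBatch(new BatchUpdateFoo(), 200);

//Read the job record; re-run this query later with the same ID to see progress
AsyncApexJob job = [select Id, Status, JobItemsProcessed, TotalJobItems, NumberOfErrors
    from AsyncApexJob where Id = :jobId];
System.debug(job.Status + ': ' + job.JobItemsProcessed + ' of ' + job.TotalJobItems
    + ' batches processed, ' + job.NumberOfErrors + ' errors');
```

Right after submission the status will typically still be Queued, so the query is most useful when repeated a little later.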


Unit Testing Batch Apex:
The asynchronous nature of batch Apex makes it notoriously difficult to unit test and debug. At Facebook, we use a general Logger utility that logs debug info to a custom object (adding to the governor limit footprint). The online documentation for batch Apex provides some unit test examples, but the utility classes in this post use a shorthand approach to achieving test coverage.
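
For comparison with the shorthand approach, here is a sketch of a test that also asserts on the result. It assumes the BatchUpdateField utility class from the end of this post is saved in the org; the test class name and the company values are hypothetical, chosen only to be unlikely to match existing data.

```apex
@isTest
private class BatchUpdateFieldTest {
    static testMethod void updatesCompanyOnTestLeads() {
        //Create a small, known set of records (fewer than the scope, so execute() runs once)
        List<Lead> leads = new List<Lead>();
        for (Integer i = 0; i < 10; i++) {
            leads.add(new Lead(LastName = 'Test ' + i, Company = 'Slate Rock Test Co'));
        }
        insert leads;

        String query = 'select Id, Company from Lead where Company = \'Slate Rock Test Co\'';

        //stopTest() forces the asynchronous batch to finish before the assertion runs
        Test.startTest();
        Database.executeBatch(new BatchUpdateField(query, 'Company', 'Bedrock Quarry Test Co'), 200);
        Test.stopTest();

        Integer updated = [select count() from Lead
            where Id in :leads and Company = 'Bedrock Quarry Test Co'];
        System.assertEquals(10, updated);
    }
}
```

Keeping the test data smaller than the scope respects the rule that a test may run execute() only once.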

Batch Apex Best Practices:
  • Use extreme care if you are planning to invoke a batch job from a trigger. You must be able to guarantee that the trigger will not add more batch jobs than the five that are allowed. In particular, consider API bulk updates, import wizards, mass record changes through the user interface, and all cases where more than one record can be updated at a time.
  • When you call Database.executeBatch, Salesforce.com only places the job in the queue at the scheduled time. Actual execution may be delayed based on service availability.
  • When testing your batch Apex, you can test only one execution of the execute method. You can use the scope parameter of the executeBatch method to limit the number of records passed into the execute method to ensure that you aren't running into governor limits.
  • The executeBatch method starts an asynchronous process. This means that when you test batch Apex, you must make certain that the batch job is finished before testing against the results. Use the Test methods startTest and stopTest around the executeBatch method to ensure it finishes before continuing your test.
  • Use Database.Stateful with the class definition if you want to share variables or data across job transactions. Otherwise, all instance variables are reset to their initial state at the start of each transaction.
  • Methods declared as future are not allowed in classes that implement the Database.Batchable interface.
  • Methods declared as future cannot be called from a batch Apex class.
  • You cannot call the Database.executeBatch method from within any batch Apex method.
  • You cannot use the getContent and getContentAsPDF PageReference methods in a batch job.
  • In the event of a catastrophic failure such as a service outage, any operations in progress are marked as Failed. You should run the batch job again to correct any errors.
  • When a batch Apex job is run, email notifications are sent either to the user who submitted the batch job, or, if the code is included in a managed package and the subscribing organization is running the batch job, the email is sent to the recipient listed in the Apex Exception Notification Recipient field.
  • Each method execution uses the standard governor limits for an anonymous block, Visualforce controller, or WSDL method.
  • Each batch Apex invocation creates an AsyncApexJob record. Use the ID of this record to construct a SOQL query to retrieve the job’s status, number of errors, progress, and submitter. For more information about the AsyncApexJob object, see AsyncApexJob in the Web Services API Developer's Guide.
  • All methods in the class must be defined as global.
  • For a sharing recalculation, Salesforce.com recommends that the execute method delete and then re-create all Apex managed sharing for the records in the batch. This ensures the sharing is accurate and complete.
  • If you discover a bug during a batch execution while developing a batch Apex class, Don't Panic. Simply log in to the admin console, monitor Apex Jobs, and abort the running batch.
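
The Database.Stateful point in the list above can be sketched as follows. BatchCountLeads is a hypothetical class written for illustration: it keeps a running total in an instance variable, which survives across execute() calls only because the class also implements Database.Stateful.

```apex
global class BatchCountLeads implements Database.Batchable<sObject>, Database.Stateful {
    //Preserved between transactions because of Database.Stateful
    global Integer recordsSeen = 0;

    global Database.QueryLocator start(Database.BatchableContext BC){
        return Database.getQueryLocator('select Id from Lead');
    }

    global void execute(Database.BatchableContext BC, List<sObject> scope){
        //Add this chunk's size to the running total
        recordsSeen += scope.size();
    }

    global void finish(Database.BatchableContext BC){
        System.debug('Leads seen across all batches: ' + recordsSeen);
    }
}
```

Without Database.Stateful, recordsSeen would be reset to 0 at the start of each transaction, so finish() would not see the total.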


Utility Batch Apex Classes:

The following batch Apex classes can be copied and pasted into any Salesforce org and called from the System Log (or Apex) using the "Execute Anonymous" feature. The general structure of these utility classes is:
  • Accept task-specific input parameters
  • Execute the batch
  • Email the admin with batch results once complete
To execute these utility batch Apex classes:
1. Save the class in your org as an Apex class with the same name as the example.

2. Open the System Log and click on the Execute Anonymous input text field.

3. Paste the invocation snippet from the comment at the top of the class (along with your own input parameters) into the Execute Anonymous textarea, then click "Execute".


BatchUpdateField.cls
/*
Run this batch from Execute Anonymous tab in Eclipse Force IDE or System Log using the following

string query = 'select Id, CompanyName from User';
BatchUpdateField batch = new BatchUpdateField(query, 'CompanyName', 'Bedrock Quarry');
Database.executeBatch(batch, 100); //Make sure to execute in batch sizes of 100 to avoid DML limit error
*/
global class BatchUpdateField implements Database.Batchable<sObject>{
    global final String Query;
    global final String Field;
    global final String Value;
   
    global BatchUpdateField(String q, String f, String v){
        Query = q;
        Field = f;
        Value = v;
    }
   
    global Database.QueryLocator start(Database.BatchableContext BC){
        return Database.getQueryLocator(query);
    }
   
    global void execute(Database.BatchableContext BC, List<sObject> scope){   
        for(sobject s : scope){
            s.put(Field,Value);
         }
        update scope;
    }
   
    global void finish(Database.BatchableContext BC){   
        AsyncApexJob a = [Select Id, Status, NumberOfErrors, JobItemsProcessed,
            TotalJobItems, CreatedBy.Email
            from AsyncApexJob where Id = :BC.getJobId()];
       
        string message = 'The batch Apex job processed ' + a.TotalJobItems + ' batches with '+ a.NumberOfErrors + ' failures.';
       
        // Send an email to the Apex job's submitter notifying of job completion. 
        Messaging.SingleEmailMessage mail = new Messaging.SingleEmailMessage();
        String[] toAddresses = new String[] {a.CreatedBy.Email};
        mail.setToAddresses(toAddresses);
        mail.setSubject('Salesforce BatchUpdateField ' + a.Status);
        mail.setPlainTextBody(message); //Reuse the summary built above
        Messaging.sendEmail(new Messaging.SingleEmailMessage[] { mail });   
    }
   
    public static testMethod void tests(){
        Test.startTest();
        string query = 'select Id, CompanyName from User limit 100'; //One batch only: a test may run execute() once
        BatchUpdateField batch = new BatchUpdateField(query, 'CompanyName', 'Bedrock Quarry');
        Database.executeBatch(batch, 100);
        Test.stopTest();
    }
}
BatchSearchReplace.cls
/*
Run this batch from Execute Anonymous tab in Eclipse Force IDE or System Log using the following

string query = 'select Id, Company from Lead';
BatchSearchReplace batch = new BatchSearchReplace(query, 'Company', 'Sun', 'Oracle');
Database.executeBatch(batch, 100); //Make sure to execute in batch sizes of 100 to avoid DML limit error
*/
global class BatchSearchReplace implements Database.Batchable<sObject>{
    global final String Query;
    global final String Field;
    global final String SearchValue;
    global final String ReplaceValue;
   
    global BatchSearchReplace(String q, String f, String sValue, String rValue){
        Query = q;
        Field = f;
        SearchValue = sValue;
        ReplaceValue = rValue;
    }
   
    global Database.QueryLocator start(Database.BatchableContext BC){
        return Database.getQueryLocator(query);
    }
   
    global void execute(Database.BatchableContext BC, List<sObject> scope){
        for(sobject s : scope){
            string currentValue = String.valueof( s.get(Field) );
            if(currentValue != null && currentValue == SearchValue){
                s.put(Field, ReplaceValue);
            }
         }
        update scope;
    }
   
    global void finish(Database.BatchableContext BC){   
        AsyncApexJob a = [Select Id, Status, NumberOfErrors, JobItemsProcessed,
            TotalJobItems, CreatedBy.Email
            from AsyncApexJob where Id = :BC.getJobId()];
       
        string message = 'The batch Apex job processed ' + a.TotalJobItems + ' batches with '+ a.NumberOfErrors + ' failures.';
       
        // Send an email to the Apex job's submitter notifying of job completion. 
        Messaging.SingleEmailMessage mail = new Messaging.SingleEmailMessage();
        String[] toAddresses = new String[] {a.CreatedBy.Email};
        mail.setToAddresses(toAddresses);
        mail.setSubject('Salesforce BatchSearchReplace ' + a.Status);
        mail.setPlainTextBody(message); //Reuse the summary built above
        Messaging.sendEmail(new Messaging.SingleEmailMessage[] { mail });   
    }
   
    public static testMethod void tests(){
        Test.startTest();
        string query = 'select Id, Company from Lead limit 100'; //One batch only: a test may run execute() once
        BatchSearchReplace batch = new BatchSearchReplace(query, 'Company', 'Foo', 'Bar');
        Database.executeBatch(batch, 100);
        Test.stopTest();
    }
}
BatchRecordDelete.cls
/*
Run this batch from Execute Anonymous tab in Eclipse Force IDE or System Log using the following

string query = 'select Id from ObjectName where field=criteria';
BatchRecordDelete batch = new BatchRecordDelete(query);
Database.executeBatch(batch, 200); //Make sure to execute in batch sizes of 200 to avoid DML limit error
*/
global class BatchRecordDelete implements Database.Batchable<sObject>{
    global final String Query;
   
    global BatchRecordDelete(String q){
        Query = q;   
    }
   
    global Database.QueryLocator start(Database.BatchableContext BC){
        return Database.getQueryLocator(query);
    }
   
    global void execute(Database.BatchableContext BC, List<sObject> scope){
        delete scope;
    }
   
    global void finish(Database.BatchableContext BC){   
        AsyncApexJob a = [Select Id, Status, NumberOfErrors, JobItemsProcessed,
            TotalJobItems, CreatedBy.Email
            from AsyncApexJob where Id = :BC.getJobId()];
       
        string message = 'The batch Apex job processed ' + a.TotalJobItems + ' batches with '+ a.NumberOfErrors + ' failures.';
       
        // Send an email to the Apex job's submitter notifying of job completion. 
        Messaging.SingleEmailMessage mail = new Messaging.SingleEmailMessage();
        String[] toAddresses = new String[] {a.CreatedBy.Email};
        mail.setToAddresses(toAddresses);
        mail.setSubject('Salesforce BatchRecordDelete ' + a.Status);
        mail.setPlainTextBody(message); //Reuse the summary built above
        Messaging.sendEmail(new Messaging.SingleEmailMessage[] { mail });   
    }
   
    public static testMethod void tests(){
        Test.startTest();
        string query = 'select Id, CompanyName from User where CompanyName=\'foo\''; //SOQL string literals take single quotes
        BatchRecordDelete batch = new BatchRecordDelete(query);
        Database.executeBatch(batch, 100);
        Test.stopTest();
    }
}
Friday, 11 February 2011 10:25:04 (Pacific Standard Time, UTC-08:00)
Mike, another great post - and timely as I struggle to figure out why we have such huge batch delete problems with our KJ implementation at GreatVines. We (at their request) delete around 18,000 invoice records each night so they can be reloaded. Invoices are Master-Detail to Account. In the three months since this nightly process has run we have hit around 5 to 10% failure rate with one of the following exceptions: "Unable to Lock Row" (presumably the parent account record) or the latest "Unable to access query cursor data; too many cursors are in use" and even "Unable to write to any of the ACS stores in the alloted time".

To solve the first one we reduced our batch size from 200 to 1. Developer support informed us that this would reduce record collision where multiple invoices in a single batch belonged to the same parent. Now that we are running with a single record batch size, the latter two exceptions hit frequently. This has been incredibly frustrating. I don't know how much testing was done with batch DML involving records in a MD relationship, something to look out for.

Any ideas you think we should try?
Tuesday, 22 February 2011 18:17:39 (Pacific Standard Time, UTC-08:00)
Jim - I run into the record locking issue frequently. I suspect that if an end user touches any record in a batch, then the whole batch rolls back. Re-running large batches seems common.

Running batch sizes of 1 is certainly less than ideal. Sounds like deleting/re-loading mass records is causing your issue. Could an external ID be used to upsert as an alternative approach?
Mike Leach
Wednesday, 16 March 2011 08:26:49 (Pacific Standard Time, UTC-08:00)
Hi,

In your entry regarding executing batch jobs in the System Log, you said:

. Paste any of the following batch apex classes (along with corresponding input parameters) into the Execute Anonymous textarea, then click "Execute".

Can we actually paste a class into the Execute Anonymous text area? When I try to paste the class and call executeBatch (right above the class), I get this error: "Global type must be contained inside of a global class".

Any ideas?

AKB
Wednesday, 16 March 2011 10:14:18 (Pacific Standard Time, UTC-08:00)
@AKB- Try hovering over the code example above and clicking the 'View Source' option in the upper right corner of the source.

Then paste the view source into an Apex class of the same name as the example source.
Mike Leach
Wednesday, 16 March 2011 10:28:40 (Pacific Standard Time, UTC-08:00)
I totally agree that if I create an Apex class that's batchable, then I can call executeBatch on that class from the System Log. My question was whether we can code the class itself in the Execute Anonymous text area and call executeBatch from there.
AKB
Wednesday, 16 March 2011 13:15:42 (Pacific Standard Time, UTC-08:00)
AKB- I can also confirm that Apex batches can be executed from System Log, but I haven't tried constructing the class in System Log and executing. Let me know what you find out.
Mike Leach