Configurable Chunk Size for Embedded Data

Overview

This feature fetches embedded data from the database in configurable chunks instead of loading all records in a single query. Processing the records in smaller chunks reduces memory usage and lowers the risk of execution timeouts on large datasets.

The chunk size is configurable via the extension settings, allowing administrators to adjust it according to server capacity and dataset size.
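To illustrate the idea, the following is a minimal, language-neutral sketch of chunked fetching written in Python with an in-memory SQLite database. It is not the extension's actual PHP implementation: the table name `embedded_data`, the columns `uid` and `payload`, and the helper names are assumptions made for this example, and the extension may page through records differently (for example with an offset instead of the last seen `uid`).

```python
import sqlite3
from typing import Iterator, List, Tuple

DEFAULT_CHUNK_SIZE = 1000  # mirrors the documented default value


def fetch_in_chunks(
    conn: sqlite3.Connection, chunk_size: int = DEFAULT_CHUNK_SIZE
) -> Iterator[List[Tuple[int, str]]]:
    """Yield rows of the hypothetical `embedded_data` table chunk by chunk."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    last_uid = 0  # assumes uids are positive integers
    while True:
        # Fetch at most `chunk_size` rows that come after the last row seen.
        rows = conn.execute(
            "SELECT uid, payload FROM embedded_data "
            "WHERE uid > ? ORDER BY uid LIMIT ?",
            (last_uid, chunk_size),
        ).fetchall()
        if not rows:
            break  # no more records
        yield rows
        last_uid = rows[-1][0]


def build_demo_db(record_count: int) -> sqlite3.Connection:
    """Create an in-memory table with `record_count` demo rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE embedded_data (uid INTEGER PRIMARY KEY, payload TEXT)"
    )
    conn.executemany(
        "INSERT INTO embedded_data (uid, payload) VALUES (?, ?)",
        [(i, f"row-{i}") for i in range(1, record_count + 1)],
    )
    return conn


if __name__ == "__main__":
    demo = build_demo_db(2500)
    for chunk in fetch_in_chunks(demo, chunk_size=1000):
        print(len(chunk))  # only one chunk is held in memory at a time
```

With 2,500 demo rows and a chunk size of 1000, the loop processes three chunks, the last one containing the remaining 500 rows.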

Default Configuration

By default, database chunking is enabled with a chunk size of 1000 records.

If required, this value can be modified in the extension settings to better suit the execution environment.

Configuration Steps

  1. Navigate to Settings → Extension Configuration in the TYPO3 backend.

  2. Select ns_t3cs from the extension list.

  3. Go to the Training tab.

  4. Locate the Chunk Size [Used to fetch embedded data from database] field.

  5. Adjust the value according to your server capacity and dataset size.

  6. Click Save ‘ns_t3cs’ configuration to apply the changes.

Figure: Extension Configuration - Chunk Size Setting (screenshot showing the Chunk Size setting in the AI Engine tab)

Recommendations

Chunk Size Guidelines:

  • Small datasets (< 10,000 records): Use the default value (1000) or a lower value (500-800)

  • Medium datasets (10,000 - 50,000 records): Use 1000-1500

  • Large datasets (> 50,000 records): Use 1500-2000, but monitor server memory usage
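The guidelines above can be expressed as a small lookup. This is only a sketch of the documented ranges: the function name `recommended_chunk_range` is hypothetical and not part of the extension, and the returned range for small datasets combines the default (1000) with the lower suggested values (500-800).

```python
from typing import Tuple


def recommended_chunk_range(record_count: int) -> Tuple[int, int]:
    """Return the (min, max) chunk size suggested by the guidelines."""
    if record_count < 0:
        raise ValueError("record_count must not be negative")
    if record_count < 10_000:
        # Small dataset: default (1000) or lower (500-800)
        return (500, 1000)
    if record_count <= 50_000:
        # Medium dataset
        return (1000, 1500)
    # Large dataset: monitor server memory usage at these sizes
    return (1500, 2000)


if __name__ == "__main__":
    for count in (5_000, 25_000, 120_000):
        print(count, recommended_chunk_range(count))
```

The ranges are starting points only; the value that works best still depends on the memory available to the server.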

Note

Adjusting the chunk size can significantly impact performance. Smaller chunks use less memory but may take longer to process. Larger chunks process faster but require more memory. Monitor your server’s memory usage when adjusting this value.
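A rough way to reason about this trade-off is to compare the number of fetches with the number of rows held in memory at once. The sketch below assumes one database query per chunk and ignores per-row size; the function name `chunk_plan` is hypothetical and used only for illustration.

```python
import math
from typing import Dict


def chunk_plan(total_records: int, chunk_size: int) -> Dict[str, int]:
    """Estimate fetch count and peak row count for a given chunk size."""
    if total_records < 0:
        raise ValueError("total_records must not be negative")
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    return {
        # One fetch per (possibly partial) chunk
        "fetches": math.ceil(total_records / chunk_size),
        # At most one chunk is held in memory at a time
        "max_rows_in_memory": min(total_records, chunk_size),
    }


if __name__ == "__main__":
    for size in (500, 1000, 2000):
        print(size, chunk_plan(50_000, size))
```

For 50,000 records, a chunk size of 500 needs 100 fetches with at most 500 rows in memory, while a chunk size of 2000 needs 25 fetches with up to 2,000 rows in memory.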

Benefits

  • Reduced Memory Usage: Processing data in smaller chunks helps prevent memory exhaustion

  • Prevents Timeouts: Smaller batches reduce the risk of PHP execution timeouts

  • Better Performance: A chunk size matched to the server's available memory can improve overall processing speed

  • Flexible Configuration: Administrators can adjust settings based on their specific environment