# Script task performance

Script tasks in Collibra workflows provide a powerful and flexible mechanism for automating actions. However, as a workflow developer, you must carefully consider the performance implications of your code before publishing it to production.

{% hint style="info" %}
A workflow that performs well in a testing environment may exhibit different behavior in production due to variations in data volume, complexity, and concurrent workflow executions.
{% endhint %}

If you experience slowness after publishing a new or updated workflow, analyze whether the workflow changes could have caused the issue. Revert the changes if necessary.

## Factors influencing performance

The performance of a script task depends primarily on the complexity of the Groovy script. The following factors contribute to script complexity:

* Lines of code: A higher volume of code generally translates to longer execution times.
* Data volume and API calls: Processing large datasets or making numerous API calls can significantly impact performance.
* Looping constructs:
  * `for` loops iterating over a large number of items.
  * `while` loops whose exit condition may never be met, or recursive methods that never terminate.
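
As a minimal sketch of two ways to limit these costs, the following plain Groovy snippet groups items into batches so that fewer calls are made, and caps a `while` loop with a maximum number of attempts. The names `ids`, `saveBatch`, `checkDone`, and `maxAttempts` are hypothetical stand-ins for illustration; they are not part of the Collibra API.

```groovy
// Hypothetical work list; in a real script this could be a list of asset IDs.
def ids = (1..10).toList()

// Group items so that each (hypothetical) bulk call handles several at once.
int bulkCalls = 0
def saveBatch = { List batch -> bulkCalls++ }   // stand-in for one API call
ids.collate(4).each { saveBatch(it) }           // 10 items become 3 batches

// Cap a while loop so that a condition that is never met cannot spin forever.
int maxAttempts = 5
int attempts = 0
boolean done = false
def checkDone = { false }   // stand-in for a condition that never becomes true
while (!done && attempts < maxAttempts) {
    done = checkDone()
    attempts++
}

println "bulk calls: ${bulkCalls}, attempts: ${attempts}, done: ${done}"
```

Whether batching is possible depends on the API you call, and the right attempt limit depends on your use case.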

Compilation time also affects performance. Before a script task can run, its script must be compiled into machine-executable instructions. This step adds overhead, and longer scripts take longer to compile.

To improve performance, the Groovy engine caches compiled scripts for reuse. Cached scripts execute faster than non-cached scripts. However, scripts tend to take longer to execute under the following conditions:

* The first time a workflow runs after publishing.
* The first time a workflow runs after an application restart.
* When the Java garbage collector clears the Groovy engine cache, typically due to low memory.
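
The effect of caching can be illustrated with a small, self-contained sketch that uses the standard `GroovyShell` class. It is not how Collibra implements its cache; it only shows the general compile-once, run-many idea. The names `compileOnce`, `runCached`, and `compilations` are made up for this example.

```groovy
// Illustration only: a tiny compile-once cache built on GroovyShell.
def shell = new GroovyShell()
def compiledScripts = [:]      // script source -> compiled Script
int compilations = 0

def compileOnce = { String source ->
    if (!compiledScripts.containsKey(source)) {
        compilations++                              // the expensive step
        compiledScripts[source] = shell.parse(source)
    }
    compiledScripts[source]
}

def runCached = { String source, Map variables ->
    Script script = compileOnce(source)
    script.binding = new Binding(variables)         // fresh variables per run
    script.run()
}

def first = runCached('x * 2', [x: 21])    // compiles, then runs
def second = runCached('x * 2', [x: 5])    // reuses the compiled script

println "results: ${first}, ${second}; compilations: ${compilations}"
```

If the cache entry is dropped, as in the garbage collection case above, the next run pays the compilation cost again.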

## The impact of loops on memory usage

By default, script tasks execute synchronously and commit all changes in a single transaction. If an error occurs, the workflow rolls back to the last committed state, which may be one of the following:

* The beginning of the workflow.
* The last completed user task.
* The last completed asynchronous task.

When processing large amounts of data in `for` loops, all the data is retained in memory until a commit occurs. This is true even if you are iterating through paged API results.

To prevent memory issues in such scenarios, use asynchronous mode. For more information, see [Process execution](/workflows/designing-workflows/processes/process-execution.md).

However, asynchronous mode introduces additional complexity, particularly for advanced exception handling. If an error occurs, already committed changes are not undone unless you implement code to reverse those changes.
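
One possible way to structure such reversal code is to record an undo action for every change and replay the actions in reverse order when an error occurs. The sketch below assumes an in-memory list (`store`) in place of real committed data; `createItem` and `undoStack` are hypothetical names, and real compensation logic would call the relevant APIs instead.

```groovy
// Hypothetical in-memory stand-in for data changed by earlier, committed steps.
def store = []
// Undo actions, recorded in the order the changes were made.
def undoStack = []

def createItem = { String name ->
    if (name == null) {
        throw new IllegalArgumentException('name is required')
    }
    store << name
    undoStack << { store.remove(name) }   // how to reverse this change
}

boolean compensated = false
try {
    ['a', 'b', null].each { createItem(it) }   // the third item fails
} catch (IllegalArgumentException e) {
    // Reverse the recorded changes, newest first.
    undoStack.reverse().each { it.call() }
    undoStack.clear()
    compensated = true
}

println "compensated: ${compensated}, remaining items: ${store}"
```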

Memory issues can significantly degrade performance, primarily because the system may swap data to disk, which is slower. Additionally, memory problems can impact the compilation time of subsequent script tasks. When the Java garbage collector frees up space by deleting previously compiled scripts, these scripts must be recompiled, further slowing execution.

## The impact of using the groovy-lib mechanism on compilation time

If you use the **groovy-lib** mechanism to create reusable functions across workflows, note that, by default, the entire content of the **groovy-lib** folder is included in each script task before compilation. Because compilation time increases with the number of lines of code, including reusable functions that a script does not use can negatively affect performance.

To mitigate this issue, enable the **Don’t attach Groovy libs by default** option in Collibra Console. For scripts that require reusable functions, explicitly add a `// #importFile` statement at the beginning of the script to load the relevant files:

```
// #importFile resourcePrinter.groovy
// #importFile processDetailsPrinter.groovy
```

Rules for using `#importFile`:

* `#importFile` statements must come first in the script, even before the imports section. Any whitespace at the beginning of the script is ignored.
* Whitespace characters are allowed before and after `//`.
* There cannot be any whitespace character between `#` and `importFile`.
* Whitespace characters are allowed between `#importFile` and the Groovy file name.
* If a referenced Groovy file is not found in the **groovy-lib** folder, it is silently ignored.
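
The whitespace rules above can be approximated with a regular expression, which may help when checking scripts before publishing. This is a rough sketch, not the parser Collibra uses: the pattern and the `importedFile` helper are assumptions made for illustration, and the pattern assumes at least one whitespace character before a file name that itself contains no whitespace.

```groovy
import java.util.regex.Pattern

// Assumed approximation of the rules: optional whitespace around "//",
// no whitespace between "#" and "importFile", whitespace before the file name.
Pattern importLine = Pattern.compile('^\\s*//\\s*#importFile\\s+(\\S+)\\s*$')

// Returns the referenced file name, or null if the line is not an import line.
def importedFile = { String line ->
    def matcher = importLine.matcher(line)
    matcher.matches() ? matcher.group(1) : null
}

def lines = [
    '// #importFile resourcePrinter.groovy',
    '  //#importFile   processDetailsPrinter.groovy',
    '// # importFile resourcePrinter.groovy'      // space after "#": rejected
]
lines.each { println "${it.inspect()} -> ${importedFile(it)}" }
```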

## Conclusion

Script tasks can significantly enhance workflow automation in Collibra, but their performance depends on multiple factors, including script complexity, memory usage, and compilation time. By following best practices, such as optimizing loops, using asynchronous mode where appropriate, and managing the **groovy-lib** mechanism effectively, you can improve workflow performance and ensure smoother execution in production environments.

