Collibra Connect upsert operation

This tutorial contains instructions on how to add data from a CSV file to Collibra Data Governance Center (Collibra DGC) by performing an upsert operation using Collibra Connect. An upsert is an operation where a resource is either updated if it already exists in Collibra DGC or created if it does not exist.
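The update-or-create behavior can be illustrated with a minimal sketch. It uses a plain in-memory dictionary as a stand-in for Collibra DGC; the `upsert` function and the `glossary` store are hypothetical and are not part of Collibra Connect.

```python
def upsert(store, key, attributes):
    """Update the entry if the key already exists, otherwise create it.

    Returns "updated" or "created" so the caller can see which branch ran.
    """
    if key in store:
        store[key].update(attributes)
        return "updated"
    store[key] = dict(attributes)
    return "created"


# In-memory stand-in for the target system, keyed by (domain, name).
glossary = {}
first = upsert(glossary, ("New Business Terms", "FAQ"), {"type": "Acronym"})
second = upsert(
    glossary,
    ("New Business Terms", "FAQ"),
    {"definition": "Frequently Asked Questions"},
)
print(first, second)
```

The first call creates the entry because the key is unknown. The second call finds the same key and only merges the new attribute into the existing entry.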

Prerequisites

  • Access to a Collibra DGC environment
  • MuleSoft Anypoint Studio installed with the CollibraDGC connector

Prepare the CSV file

The source of the upsert operation is a CSV file. We are using a short list of acronyms in this example as sample data.

For the operation to succeed, you must provide the following mandatory elements.

To create a new business term:

  • The business term name.
  • The domain the business term belongs to.

    The domain can be identified by name or the universally unique identifier (UUID).

If the specified domain does not exist, it is created, and you must provide the following mandatory elements.

To create a new domain:

  • The domain name.
  • The domain type.
  • The community the domain belongs to.

    The community can be identified by name or UUID.

If the specified community does not exist, it will be created and you have to provide:

  • The community name.

For more information, see the Upserting assets by name section of the Collibra Connect documentation.
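Before running the flow, it can help to check each row against the mandatory elements listed above. The following is a minimal sketch, assuming the conservative case where neither the domain nor the community exists yet, so all four columns are required. The column names are the ones used in this tutorial's sample file, and `missing_columns` is a hypothetical helper, not part of Collibra Connect.

```python
# Columns required when the asset, its domain and its community may all be new.
MANDATORY_COLUMNS = ("Name", "Domain", "Domain Type", "Community")


def missing_columns(row):
    """Return the mandatory columns that are absent or blank in a row."""
    return [
        column
        for column in MANDATORY_COLUMNS
        if not (row.get(column) or "").strip()
    ]


complete = {
    "Name": "FAQ",
    "Domain": "New Business Terms",
    "Domain Type": "Glossary",
    "Community": "Data Governance Council",
}
incomplete = {"Name": "FAQ", "Domain": " "}

print(missing_columns(complete))
print(missing_columns(incomplete))
```

A row with an empty result can be sent as-is. Any column name in the result points to a value that has to be filled in first.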

The following table and text show the structure of the sample file:

Name    Domain                Domain Type    Community
DGC     New Business Terms    Glossary       Data Governance Council
FAQ     New Business Terms    Glossary       Data Governance Council
GDPR    New Business Terms    Glossary       Data Governance Council

Name,Domain,Domain Type,Community
FAQ,New Business Terms,Glossary,Data Governance Council
DGC,New Business Terms,Glossary,Data Governance Council
GDPR,New Business Terms,Glossary,Data Governance Council

Save the comma-separated text above as a CSV file. You need it later in the project.
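If you prefer to generate the file instead of typing it, the sample data can be written with Python's standard csv module. This is only a sketch: the functions `write_sample` and `read_sample` are hypothetical helpers, and the file name is up to you.

```python
import csv

HEADER = ["Name", "Domain", "Domain Type", "Community"]
ROWS = [
    ["FAQ", "New Business Terms", "Glossary", "Data Governance Council"],
    ["DGC", "New Business Terms", "Glossary", "Data Governance Council"],
    ["GDPR", "New Business Terms", "Glossary", "Data Governance Council"],
]


def write_sample(path):
    """Write the header and the three sample rows to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(HEADER)
        writer.writerows(ROWS)


def read_sample(path):
    """Read the file back as a list of dictionaries keyed by column name."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

Calling `write_sample` with a path of your choice creates the file, and `read_sample` on the same path lets you confirm that the header and rows were written as expected.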

Create a MuleSoft project

Open Anypoint Studio and create a new project:

  1. Click File > New > Mule Project.

  2. Enter a name for your project and click Next. We are using csv_upsert in this example.

    Blank spaces are not supported as part of the project name.

  3. Leave all the default values and click Next.

  4. Click Finish.

The project is created. The Package Explorer, Palette and Connections Explorer are populated.

Create the connect flow

  1. From the Palette, drag the File connector to the Canvas.

  2. From the Palette, drag the Transform Message component to the Canvas.

  3. From the Palette, drag the CollibraDGC connector to the Canvas.

Configure the components

File connector configuration

A File connector placed at the beginning of a flow is set to behave like an inbound endpoint.

To configure the File connector:

  1. Select the File connector on the Canvas.
  2. In the General section, specify the path to the CSV file. We are using src/main/resources/input in this example.

    To create the input folder:

    1. In the Package Explorer, right-click the resources folder and select New > Folder.
    2. Type the name of the folder you wish to create and click Finish.
  3. In the General section, click the green + sign to create a new Connector Configuration.

  4. In the Global Element Properties window, click OK.

  5. In the Metadata section, click Add metadata and then click the edit icon to add a new metadata type.

  6. In the Select metadata type window, click Add.
  7. Type the name of the metadata type and click Create type. We are using Asset_by_name in this example.

  8. From the Type drop-down select CSV.

  9. In the Sample File field, select the file you have created.

    The metadata is retrieved from the file.

  10. In the Select metadata type window, click Select.

CollibraDGC connector configuration

The CollibraDGC connector is used to establish a direct connection to the Collibra Data Governance Center.

To configure the CollibraDGC connector:

  1. Select the CollibraDGC connector on the Canvas.
  2. In the General section, click the green + sign to create a new Connector Configuration.
  3. In the Global Element Properties window, in the General tab, enter the connection details of your Collibra DGC environment:
    • Username
    • Password
    • Base Application Url

    Use the Test Connection... feature to test the connection to Collibra DGC.

  4. Click OK to save the configuration.
  5. In the General section, from the Operation drop-down list select Upsert assets by name.

  6. From the Asset Type Id drop-down list, select Acronym.

The Transform Message component takes the output of the File connector and transforms it into the input that the CollibraDGC connector expects. Because you have configured both connectors, their metadata is already loaded into the Transform Message component. To configure it, drag each field from the input section onto the corresponding field in the output section.

As you create the connections, the Transform Message component builds the required code.
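Conceptually, the mapping renames each CSV column to a field of the connector input. The sketch below shows that idea in Python rather than in the code that Anypoint Studio generates. The target field names in `FIELD_MAP` are assumptions made for illustration; the real names come from the CollibraDGC connector metadata shown in the output section.

```python
import csv
import io

# Hypothetical target field names; take the real ones from the connector metadata.
FIELD_MAP = {
    "Name": "name",
    "Domain": "domain",
    "Domain Type": "domainType",
    "Community": "community",
}


def transform(csv_text):
    """Turn CSV text into a list of records keyed by the target field names."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {target: row[source] for source, target in FIELD_MAP.items()}
        for row in reader
    ]


sample = (
    "Name,Domain,Domain Type,Community\n"
    "FAQ,New Business Terms,Glossary,Data Governance Council\n"
)
records = transform(sample)
print(records)
```

Each input row becomes one record, which mirrors the one-to-one field connections you draw in the Transform Message component.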

Save your project.

Run the application

To start the application in your development environment, right-click on the Canvas and select Run project csv_upsert.

If there are no errors, the console will show that the application has started.

To perform the upsert operation, copy the CSV file to the src/main/resources/input folder.

You can use the Package Explorer to paste or drop the file.
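The copy step can also be scripted. The following sketch uses the standard shutil module and assumes it is run from the project root, so that the relative path src/main/resources/input from this tutorial resolves correctly. The function name `drop_into_input` is hypothetical.

```python
import shutil
from pathlib import Path


def drop_into_input(csv_path, input_dir="src/main/resources/input"):
    """Copy a CSV file into the folder that the File connector reads from.

    Creates the folder if it does not exist and returns the path of the copy.
    """
    target_dir = Path(input_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy(csv_path, target_dir))
```

Pass the path of your saved CSV file as the first argument. The second argument only needs to change if you configured a different input folder in the File connector.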

If there are no errors, the contents of the CSV file are upserted to Collibra DGC, in the Business Glossary application, under the All Business Assets view.

In case of errors, check the console output as it contains details about what went wrong.

To stop the application, right-click on the Canvas and select Stop project csv_upsert.

Next Steps

Add a Definition column to the CSV file, reload the CSV metadata, and map the new column to the corresponding field of the CollibraDGC connector input.

Name,Definition,Domain,Domain Type,Community
FAQ,Frequently Asked Questions,New Business Terms,Glossary,Data Governance Council
DGC,Data Governance Center,New Business Terms,Glossary,Data Governance Council
GDPR,General Data Protection Regulation,New Business Terms,Glossary,Data Governance Council
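If the original file is already in place, the new column can be added with a short script instead of by hand. This is a minimal sketch using the standard csv module. The `add_definitions` helper is hypothetical, and it inserts the Definition column directly after Name to match the layout above.

```python
import csv
import io

DEFINITIONS = {
    "FAQ": "Frequently Asked Questions",
    "DGC": "Data Governance Center",
    "GDPR": "General Data Protection Regulation",
}


def add_definitions(csv_text, definitions):
    """Return CSV text with a Definition column inserted after Name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames)
    fieldnames.insert(1, "Definition")

    output = io.StringIO()
    writer = csv.DictWriter(output, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Names without a known definition get an empty cell.
        row["Definition"] = definitions.get(row["Name"], "")
        writer.writerow(row)
    return output.getvalue()


original = (
    "Name,Domain,Domain Type,Community\n"
    "FAQ,New Business Terms,Glossary,Data Governance Council\n"
)
print(add_definitions(original, DEFINITIONS))
```

After regenerating the file, reload the CSV metadata in the File connector so that the new column appears in the Transform Message input.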

Stay tuned for the next tutorial, which enables you to connect to an external system, such as Salesforce.

Additional resources