How to install and use Talend Open Studio in Linux

  • August 23, 2021
IT Engineer and Technical Author. Karim specializes in Linux and is a prolific blogger who writes for various websites.

Talend is an open-source data integration platform that is widely used in today’s world of big data. It helps data scientists turn large amounts of raw data into business insights, and its freely available tools help automate tasks and speed up their processes. Talend is a cross-platform application available for Linux, Windows, Mac and Solaris. It comes in Enterprise and Community editions and is backed by a strong user community.

In this article we are going to show you how to install and use it on a Linux operating system, which is CentOS 7 with GUI in our case.

Prerequisites:

Before starting, we assume that you have a Linux system running CentOS 7 Desktop. Make sure to install the latest updates and security-harden your system.

To install the system updates, run a ‘yum’ update as a user with ‘sudo’ rights.
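A minimal sketch of this step, assuming the stock ‘yum’ package manager that CentOS 7 ships with. The function name update_system is only an illustrative wrapper of ours, not part of any tool:

```shell
# update_system: apply all pending package updates, security fixes included.
# Assumes CentOS 7 with yum; -y answers "yes" to the confirmation prompt.
update_system() {
    sudo yum -y update
}
```

Call update_system from the same shell session, or simply run the ‘sudo yum -y update’ line on its own.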

Once the updates are complete, we can move to the next step to get the Talend package.

Download Talend Package:

To get the download package for the latest Talend Open Studio, go to https://www.talend.com/free-trial/ and register by providing some information about yourself. Then click the button to get the free trial.

You will receive the Talend download link at the email address you provided during registration.

Talend Download

Click to download the application, or simply click Access Software, and your Talend package will be downloaded. Move it to your system and extract the archive using the ‘unzip’ command.
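The extraction could look roughly like this. It is a sketch only: the function name extract_talend and the default target folder ~/talend are our own choices, and the archive name depends on the build you downloaded:

```shell
# extract_talend ARCHIVE [DEST]: unpack the downloaded Talend zip into DEST.
# ARCHIVE is the file you downloaded; DEST defaults to ~/talend (our assumption).
extract_talend() {
    local archive="$1"
    local dest="${2:-$HOME/talend}"
    # -q keeps unzip quiet, -d selects the directory to extract into.
    mkdir -p "$dest" && unzip -q "$archive" -d "$dest"
}
```

For example, ‘extract_talend ~/Downloads/your-talend-archive.zip’, where your-talend-archive.zip is a placeholder for the real file name.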

Configure Talend JVM Parameters:

To use Talend, make sure that you have Java installed on your system. If Java is not already installed, you can install it using the ‘yum’ command.
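One way to express this check-then-install logic is sketched here. The package name java-1.8.0-openjdk is an assumption on our side (OpenJDK 8 from the CentOS 7 repositories, matching the 1.8.0 series used in this article), and ensure_java is just an illustrative name:

```shell
# ensure_java: install Java only when no java binary is found on the PATH.
# Assumption: OpenJDK 8 from the CentOS 7 repositories is an acceptable JVM.
ensure_java() {
    if command -v java >/dev/null 2>&1; then
        echo "Java is already installed"
    else
        sudo yum -y install java-1.8.0-openjdk
    fi
}
```

Running ensure_java is safe to repeat, because it does nothing once a ‘java’ binary is available.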

If you already have Java installed, you can check its version with ‘java -version’.

In our case, the version is 1.8.0_292.
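If you only want the version string, for instance to use it in a script, a small helper such as the following can extract it. This is a sketch that assumes the usual first line of ‘java -version’ output, which carries the version in double quotes. The name java_version is hypothetical:

```shell
# java_version: print only the version string reported by `java -version`.
# `java -version` writes to stderr, so stderr is redirected into the pipe.
java_version() {
    java -version 2>&1 | sed -n 's/.* version "\([^"]*\)".*/\1/p' | head -n 1
}
```

On a system like ours, calling java_version should print the same 1.8.0_292 string mentioned above.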

Let’s switch to the extracted Talend folder and update the JVM Xms and Xmx values as required in its .ini file for Linux.

Here we have set the Xmx value to 4 GB and MaxMetaspaceSize to 1 GB. Save and close the file, then start the Talend application as described in the next step.
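If you prefer to script this edit rather than use a text editor, the change can be sketched as below. It assumes the Eclipse-style layout with one JVM option per line after ‘-vmargs’ at the end of the file. The function name set_talend_jvm is ours, and the .ini file name in the usage line is an assumption based on the usual Talend folder layout:

```shell
# set_talend_jvm INI_FILE: set -Xmx to 4 GB and MaxMetaspaceSize to 1 GB.
# Assumes one JVM option per line, with the -vmargs section last in the file.
set_talend_jvm() {
    local ini="$1"
    cp "$ini" "$ini.bak"    # keep a backup of the original settings
    # Rewrite existing values, reading from the backup into the live file.
    sed -e 's/^-Xmx.*/-Xmx4096m/' \
        -e 's/^-XX:MaxMetaspaceSize=.*/-XX:MaxMetaspaceSize=1024m/' \
        "$ini.bak" > "$ini"
    # Append either option if the file did not define it at all.
    grep -q '^-Xmx' "$ini" || echo '-Xmx4096m' >> "$ini"
    grep -q '^-XX:MaxMetaspaceSize=' "$ini" || echo '-XX:MaxMetaspaceSize=1024m' >> "$ini"
}
```

For example, from inside the extracted folder: ‘set_talend_jvm TOS_DI-linux-gtk-x86_64.ini’. The original file is kept next to it with a .bak suffix.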

Starting Talend Application:

To start the Talend application, execute its shell script, and its initial setup will start running as shown below.
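The launch step can be sketched as follows. The launcher name TOS_DI-linux-gtk-x86_64.sh is an assumption based on the usual Talend Open Studio for Data Integration layout, so check the extracted folder for the actual script name. start_talend is an illustrative helper of ours:

```shell
# start_talend DIR: launch Talend Open Studio from its extracted folder DIR.
# Assumption: the launcher script is named TOS_DI-linux-gtk-x86_64.sh.
start_talend() {
    local script="TOS_DI-linux-gtk-x86_64.sh"
    # Run in a subshell so the caller's working directory is left unchanged;
    # chmod makes sure the launcher is executable before it is run.
    ( cd "$1" && chmod +x "$script" && "./$script" )
}
```

Pass it the folder you extracted the package into, for example ‘start_talend ~/talend’ if that is where the launcher script lives.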

Starting Talend Application

Accept the license agreement to move on to creating a new Talend project, as shown below.

Accept the License agreement

Once you click the Finish button, Talend goes through the libraries and VM setup in the background to initialize a fresh workspace for you to start using.

Start using Talend

If you are new to Talend Open Studio, go through its quick tour to understand how it is used.

Using Talend Open Studio for Data Integration:

Now that the Talend application is configured and running, it can make your life as a data scientist easier. You can create and load your own custom jobs using its Job Design tab, as shown below.

Talend application configured

Conclusion:

By now you should be familiar with the installation and configuration of Talend Open Studio for Data Integration. We have used TOS 7.3.1, the latest version available at the time of writing, which is a useful application widely used by data scientists and data integration engineers. It makes it easy to access and manage large amounts of data and to store them in an organized way. I hope you found this article helpful. Thank you.
