The Copy data activity in Data Factory lets you copy data between data stores located on-premises and in the cloud. After copying the data, you can use other activities to further transform and analyze it.
Let's create a copy activity to copy sample data into a Lakehouse.
Log in to the Microsoft Fabric portal; you will see the Fabric home page.
Now, click the Data Factory tile.
Next, we create a pipeline. To do this, click the Data pipeline option.
Once you click Data pipeline, the New pipeline window opens, where you provide a name for the pipeline.
Now, we will add a Copy data activity to the pipeline.
Click Add to canvas under the Copy data option, or use the Copy data card shown at the center of the page.
Now, provide a name for your activity.
After that, click on Source tab.
Select Sample dataset; the Sample datasets window opens.
Here you can select any dataset; let's choose the COVID-19 Data Lake dataset.
Now you can see all the datasets available under COVID-19 Data Lake.
Select the COVID Tracking Project dataset and click the OK button.
So far, we have defined the source dataset.
Now, click the Destination tab to define the destination.
Since we want to copy the sample data into a Lakehouse, you can either select an existing Lakehouse or create a new one.
In our case, we will create a new Lakehouse. To do this, click the New button.
Then provide a name for the new Lakehouse and click the Create button.
Once you click the Create button, a Lakehouse is created in the workspace.
You can quickly verify this: go to the workspace and look for the newly created Lakehouse.
You can see that the Lakehouse named lk_dev has been created, along with its Tables and Files folders.
Let's go back to the pipeline. Now that we have a newly created Lakehouse, select it and provide the name of the table into which we want to copy the data.
Since no table exists in the Lakehouse yet, we will create a new one during the copy.
Check the Edit option and enter the table name CovidData.
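Conceptually, this step tells the copy activity to create the destination table on the fly and then load the source rows into it. As a rough local analogy only (not how Fabric implements it internally), here is a minimal Python/sqlite3 sketch; the table name CovidData comes from the tutorial, while the sample rows and columns are made up for illustration:

```python
import sqlite3

# Illustrative stand-in rows for the source dataset (not the real sample data).
source_rows = [
    ("2021-03-07", "CA", 3501394),
    ("2021-03-07", "NY", 1681169),
    ("2021-03-07", "TX", 2686818),
]

conn = sqlite3.connect(":memory:")

# Like checking "Edit" and supplying a new table name: create the
# destination table during the copy if it does not already exist.
conn.execute(
    "CREATE TABLE IF NOT EXISTS CovidData (date TEXT, state TEXT, positive INTEGER)"
)

# The "copy" step itself: bulk-insert every source row into the destination.
conn.executemany("INSERT INTO CovidData VALUES (?, ?, ?)", source_rows)
conn.commit()

copied = conn.execute("SELECT COUNT(*) FROM CovidData").fetchone()[0]
print(copied)  # number of rows copied
```

The same idea scales up in the real activity: the destination schema is derived from the source, the table is created once, and the rows are appended in bulk.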
Now, click the Mapping tab. Here you can adjust the schema; for example, you can change a column's data type or delete columns that you do not want to copy.
Click the Import schemas button to see the source data schema.
Once you are done, let's execute the pipeline. Click the Run button.
You are prompted to save and run the pipeline; click the Save and run button.
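The Mapping tab's options, importing the source schema, changing a column's data type, and dropping columns you don't want to copy, can be pictured as a small transform applied to each source record. A minimal Python sketch of that idea (the column names below are hypothetical, not the actual COVID Tracking Project schema):

```python
# Hypothetical imported source rows; every value arrives as a string.
source_rows = [
    {"date": "2021-03-07", "state": "CA", "positive": "3501394", "notes": "n/a"},
    {"date": "2021-03-07", "state": "NY", "positive": "1681169", "notes": "n/a"},
]

# Mapping choices: keep only these columns (so "notes" is dropped),
# and cast "positive" from string to int (a data-type change).
keep = ["date", "state", "positive"]
casts = {"positive": int}

def apply_mapping(row):
    """Project to the kept columns and apply per-column type changes."""
    return {col: casts.get(col, str)(row[col]) for col in keep}

mapped = [apply_mapping(r) for r in source_rows]
print(mapped[0])  # {'date': '2021-03-07', 'state': 'CA', 'positive': 3501394}
```

In the actual activity, these choices are saved as part of the copy definition and applied row by row as the data is written to the destination.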
You can now see that the pipeline executed successfully.
Next, go to the Lakehouse to check whether the data has been copied.
You can see that the table CovidData has been created in the Lakehouse. You can also view the table's columns by expanding it.
You can preview the copied data there as well.