Put simply, an ETL pipeline is a process for moving data from one place to another, usually from a data source to a data warehouse. A data source can be anything from a directory on your computer to a web page that hosts files. The process typically runs in three stages: extract, transform, and load. The extract stage retrieves the raw data from the source. The transform stage then reshapes that raw data to match a predefined format. Finally, the load stage writes the processed data into the data warehouse.
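To make the three stages concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the inline CSV text stands in for a file from a data source, an in-memory SQLite database stands in for the warehouse, and the `signups` table, column names, and function names are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical raw data standing in for a file found in a source directory.
RAW_CSV = """name,signup_date,amount
 Alice ,2020-05-01,10.50
bob,2020-05-02,7
"""


def extract(source):
    """Extract: read the raw rows from the source as dictionaries."""
    return list(csv.DictReader(source))


def transform(rows):
    """Transform: coerce each raw row into the (name, signup_date, amount) format."""
    return [
        (row["name"].strip().title(), row["signup_date"], float(row["amount"]))
        for row in rows
    ]


def load(records, conn):
    """Load: write the processed records into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS signups (name TEXT, signup_date TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO signups VALUES (?, ?, ?)", records)
    conn.commit()


def run_pipeline(source, conn):
    """Run the three stages in order: extract, then transform, then load."""
    load(transform(extract(source)), conn)


# An in-memory SQLite database plays the role of the warehouse (an assumption).
warehouse = sqlite3.connect(":memory:")
run_pipeline(io.StringIO(RAW_CSV), warehouse)
rows = warehouse.execute(
    "SELECT name, signup_date, amount FROM signups ORDER BY name"
).fetchall()
```

Keeping each stage as its own function means the source or the warehouse can be swapped out without touching the transformation logic.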
Source : http://www.datasciencecentral.com/xn/detail/6448529:BlogPost:951770
Date : May 15, 2020 at 12:35AM
Tag(s) : #DATA ENG