Large-scale machine learning projects require vast amounts of data. Training models on such large datasets is tedious, and it carries a risk of redundancy: there is little point in training a model on data it has already been trained on. A responsible developer would also want to know whether a model has already seen a given dataset in order to track biases in the model.
Date : February 10, 2020 at 08:52AM
Tag(s) : #AI ENG