This is the collection of raw structured and unstructured data from all relevant sources using a variety of methods, ranging from manual input and web scraping to real-time data capture from systems and devices.
This can encompass everything from data cleansing, deduplication, and reformatting to combining data into a data warehouse, data lake, or other unified store for analysis, using ETL (extract, transform, load) or other data integration tools.
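As a rough illustration of the cleansing, deduplication, and reformatting steps mentioned above, here is a minimal sketch using only the Python standard library. The records, field names, and the helper functions `parse_date` and `clean` are hypothetical and exist only for this example; a real pipeline would more likely rely on an ETL tool or a data-frame library.

```python
from datetime import datetime

# Hypothetical raw records, as they might arrive from two different sources.
raw_records = [
    {"id": "1", "name": " Alice ", "signup": "2023-01-05"},
    {"id": "2", "name": "Bob", "signup": "05/02/2023"},
    {"id": "1", "name": "alice", "signup": "2023-01-05"},  # duplicate of id 1
    {"id": "3", "name": "", "signup": "2023-03-10"},       # missing name
]

def parse_date(text):
    """Reformat: accept ISO or day/month/year input and return an ISO date string."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None  # unrecognized format

def clean(records):
    """Cleanse, deduplicate (by id), and reformat a list of raw records."""
    seen_ids = set()
    cleaned = []
    for rec in records:
        name = rec["name"].strip().title()
        if not name:
            continue  # cleansing: drop rows with no usable name
        rec_id = int(rec["id"])
        if rec_id in seen_ids:
            continue  # deduplication: keep the first record per id
        seen_ids.add(rec_id)
        cleaned.append(
            {"id": rec_id, "name": name, "signup": parse_date(rec["signup"])}
        )
    return cleaned

cleaned_records = clean(raw_records)
for row in cleaned_records:
    print(row)
```

Keeping the first record per `id` is only one possible deduplication rule; which copy to keep, and which field identifies a duplicate, depends on the data.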
Data scientists look for biases, trends, ranges, and distributions of values in the data to see if it's suitable for predictive analytics.
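A first look at ranges and distributions can be sketched with Python's built-in `statistics` module. The sample values and the helper names `summarize` and `histogram` are assumptions made for this example, and the 1.5 × IQR rule used to flag outliers is one common convention rather than the only option.

```python
import statistics
from collections import Counter

# Hypothetical sample values; the last one is deliberately extreme.
values = [12, 15, 15, 18, 21, 22, 22, 22, 30, 95]

def summarize(xs):
    """Return basic range and distribution statistics for a list of numbers."""
    q1, _, q3 = statistics.quantiles(xs, n=4)  # quartile cut points
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "min": min(xs),
        "max": max(xs),
        "mean": statistics.mean(xs),
        "median": statistics.median(xs),
        "stdev": statistics.stdev(xs),
        # Values outside the fences are flagged for a closer look, not removed.
        "outliers": [x for x in xs if x < low_fence or x > high_fence],
    }

def histogram(xs, bucket_width=10):
    """Count values per bucket, keyed by each bucket's lower bound."""
    counts = Counter((x // bucket_width) * bucket_width for x in xs)
    return dict(sorted(counts.items()))

summary = summarize(values)
buckets = histogram(values)
print(summary)
print(buckets)
```

In this sample the mean sits well above the median, which hints at the skew introduced by the single extreme value.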
Data scientists use statistical analysis, predictive analytics, regression, algorithms, and other techniques to extract insights from the data.
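To make the regression step concrete, the sketch below fits a straight line by ordinary least squares and uses it for a prediction. The data points and the function names `fit_line` and `predict` are hypothetical; in practice a statistics or machine learning library would usually do this work.

```python
# Hypothetical observations: x could be advertising spend, y resulting sales.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 4.0, 6.2, 7.9, 10.1]

def fit_line(x_values, y_values):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x_values)
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    sxx = sum((x - mean_x) ** 2 for x in x_values)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Predict y for a new x using the fitted line."""
    return intercept + slope * x

slope, intercept = fit_line(xs, ys)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"prediction for x=6: {predict(6, slope, intercept):.2f}")
```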
Finally, the insights are presented as reports, charts, and other data visualizations. Data scientists can use dedicated visualization tools or data science programming languages such as R or Python, which have libraries for producing visuals.