What They Do
- Data acquisition: Identifying and gathering data that’s useful for the project
- Data preparation: Cleaning and organizing the data
- Data analysis: Identifying patterns, trends, and relationships in the data
- Data visualization: Creating interactive visualizations to explore trends and variation in the data
- Model development: Creating statistical and predictive models that run against the data sets
- Data testing: Creating, validating, and updating algorithms and models
- Business recommendations: Making recommendations to stakeholders based on data analysis
- Data communication: Creating data visualizations, dashboards, and reports to share findings
The Steps of Data Science
- Ask the right questions - To understand the business problem.
- Explore and collect data - From databases, web logs, customer feedback, etc.
- Extract the data - Transform the data to a standardized format.
- Clean the data - Remove erroneous values from the data.
- Find and replace missing values - Check for missing values and replace them with a suitable value (e.g. an average value).
- Normalize data - Scale the values into a practical, comparable range (e.g. 140 cm is shorter than 1.8 m, yet the number 140 is larger than the number 1.8, so values must be brought to a common scale before they are compared).
- Analyze data, find patterns and make future predictions.
- Represent the result - Present the result with useful insights in a way the “company” can understand.
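Some of the steps above (standardizing the format, cleaning, replacing missing values, and normalizing) can be sketched in a few lines of plain Python. This is only a minimal illustration under assumptions: the height records, the `to_cm` helper, and the rule that negative heights count as erroneous are all hypothetical, and min-max scaling is just one common way to normalize.

```python
from statistics import mean

# Hypothetical raw height records in mixed units; None marks a missing value.
raw_heights = [(140, "cm"), (1.8, "m"), (None, "cm"), (165, "cm"), (-20, "cm")]


def to_cm(value, unit):
    """Convert a height to centimetres so all values share one unit."""
    if value is None:
        return None
    return value * 100 if unit == "m" else value


# Extract: transform the data to a standardized format (centimetres).
heights_cm = [to_cm(value, unit) for value, unit in raw_heights]

# Clean: treat impossible values (assumed here: heights <= 0) as missing.
cleaned = [h if h is not None and h > 0 else None for h in heights_cm]

# Replace missing values with the average of the known values.
known = [h for h in cleaned if h is not None]
average = mean(known)
filled = [h if h is not None else average for h in cleaned]

# Normalize: min-max scale every value into the range [0, 1].
lowest, highest = min(filled), max(filled)
normalized = [(h - lowest) / (highest - lowest) for h in filled]

for original, scaled in zip(filled, normalized):
    print(f"{original:7.2f} cm -> {scaled:.3f}")
```

After the conversion, 1.8 m becomes 180 cm and correctly ranks above 140 cm, which is the point of the scaling example in the list. Replacing missing values with the average is simple, but it can hide real variation in the data, so it is a choice to revisit for each data set.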
What is Data?
Data is a collection of information.