6 September 2023
ONLINE
Europe/Vienna timezone

Agenda & Content

1 day – 6 September 2023

09:45  Join in
10:00  Part 1 – Data Analysis and Data Preparation for Machine Learning
12:00  Lunch break
13:00  Part 2 – Data Analysis and Data Preparation for Machine Learning
16:00  End of course

Participants will learn how to

  • Get data into a suitable form
  • Visualize data
  • Clean data
  • Transform data
  • Analyze data
  • Handle data that does not fit in memory

Program

  • Overview 
    Participants learn why data needs to be pre-processed before being passed to ML methods. They also learn what the typical challenges are in data wrangling.
     
  • Pandas
    Participants get to know this powerful Python library and find out how to load data into a data frame, get a feel for its contents, and transform it into the most suitable form.
     
  • NumPy
    Much of the Python ML ecosystem is built on this library for numerical operations. This is why participants will get to know the most important aspects of its API and what can be achieved with it.
     
  • Matplotlib
    Humans are visual beings, which is why we prefer looking at graphs rather than endless tables of data. Matplotlib is a Python library for creating all kinds of graphs, which makes data much easier to understand. Participants will learn how to create the most common graphs with Matplotlib.
     
  • Dask
    In ML problems, we often reach a point where the data does not fit into memory. Even when it does fit, we would like some operations to run faster. Dask addresses both issues by dividing the data into smaller, more manageable chunks and running computations on those chunks in parallel. This makes it possible to handle data that is larger than memory and also speeds up computations. Participants will get to know this tool and see its similarities with the previously introduced libraries.
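
The chunk-and-combine idea described for Dask can be illustrated with a minimal sketch that uses only the Python standard library. This is not Dask itself and not course material: the function names (`iter_chunks`, `partial_sum_and_count`, `chunked_mean`) and the parameters `chunk_size` and `workers` are hypothetical, chosen only for illustration. The sketch computes a mean by reducing each chunk to a small summary and combining the summaries, so only a few chunks are held in memory at a time. Threads are used here to show the structure; for pure-Python arithmetic they are unlikely to give a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def iter_chunks(values, chunk_size):
    """Yield successive lists of at most chunk_size items from any iterable."""
    iterator = iter(values)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def partial_sum_and_count(chunk):
    """Reduce one chunk to a small summary: (sum, number of items)."""
    return sum(chunk), len(chunk)


def chunked_mean(values, chunk_size=1000, workers=4):
    """Compute the mean of an iterable by combining per-chunk summaries.

    Hypothetical helper for illustration only; it mimics the
    split / compute / combine pattern, not Dask's actual API.
    """
    total, count = 0, 0
    chunks = iter_chunks(values, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while True:
            # Pull at most `workers` chunks so memory use stays bounded.
            batch = list(islice(chunks, workers))
            if not batch:
                break
            # Process the chunks of this batch concurrently.
            for chunk_sum, chunk_len in pool.map(partial_sum_and_count, batch):
                total += chunk_sum
                count += chunk_len
    if count == 0:
        raise ValueError("cannot compute the mean of an empty input")
    return total / count


if __name__ == "__main__":
    # The input is a lazy range, so it is never materialized as one big list.
    print(chunked_mean(range(1, 10_001), chunk_size=1000))
```

The key design choice is that each chunk is reduced to a summary (sum and count) that can be merged later; operations that cannot be decomposed this way need a different strategy.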