We held a workshop on Data Wrangling and EDA with UpGrad. Slides for wrangling & EDA and the analyses on Cricket & Bank Marketing are online.
Here’s the video of the data wrangling session (1 hour).
Among other things, this session analyses the fascinating OKCupid dataset, where thousands of users have answered over 2,000 questions, covering:
This blog post is not about the talk, however. Before we started, we asked the audience what they would like to know, and this post is about what they asked.
Many people were interested in how projects are executed. For example, How do I execute a project? What happens behind the scenes? Can you show us the discarded analyses / visuals? What challenges do you face with big data? This workshop wasn’t meant for that, but we’ll plan one to talk about this.
Another popular theme was what resources are required for analysis. For example, What technologies do we use? What tools should we learn? What statistical / machine learning techniques should we know? This is a topic in itself, but we will reiterate that you can be a good analyst without learning statistics and with just Excel.
A third set of questions was about how we analyse data. For example, How do we clean data? How do we analyse data in a new domain? How do we determine the dependency between variables? This was the focus of the session.
A fourth set of questions emerged after the talk and in one-on-one conversations. They were all about how to get started. For example, I’m new to data analytics. How do I get started? I’m experienced in other fields, but want to enter analytics. How do I get started? I’m looking for projects in analytics. How do I get started?
Our advice is: