Python Day 41 – Pandas!!

Today I worked on a different course curated by my fantastic friend. We started out by going over a broad view of the data science process, then delved a bit into obtaining data and pandas.

I learned:

A majority of data science is collecting and cleaning data. Pandas is a library that makes datasets easier to read and manipulate within Python. One can add a new column to a data frame with df['new column'] = ['val 1', 'val 2', ...], with one value per row. When working with larger datasets it’s very useful to use df.head() and df.tail() to quickly look at the first/last 5 rows of data.
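To make the column and head/tail points concrete, here is a minimal sketch. The data frame and its column names are made up for illustration and are not from the course.

```python
import pandas as pd

# Hypothetical data: seven rows, so head() and tail() return different slices.
df = pd.DataFrame({"name": ["a", "b", "c", "d", "e", "f", "g"]})

# Add a new column; the list needs one value per existing row.
df["new column"] = [10, 20, 30, 40, 50, 60, 70]

first_rows = df.head()  # first 5 rows by default
last_rows = df.tail()   # last 5 rows by default

print(first_rows)
print(last_rows)
```

Both methods also accept a row count, e.g. df.head(3), when 5 rows is too many or too few.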

One can sort the data by a column’s values with the .sort_values(by='name') method.
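A small sketch of sorting, again with invented names and ages. Note that sort_values returns a new, sorted data frame and leaves the original untouched unless the result is assigned back.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"name": ["Cara", "Abe", "Bea"], "age": [31, 25, 40]})

# Sort alphabetically by the 'name' column (ascending is the default).
by_name = df.sort_values(by="name")

# Sort by 'age' from largest to smallest.
by_age_desc = df.sort_values(by="age", ascending=False)

print(by_name)
print(by_age_desc)
```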

I’m still confused on:

When using .sort_values, it can’t seem to properly order numbers that have different numbers of digits. For instance, it’ll return [27,3,55], which is out of order. But when I put in '03' instead of '3' it’ll sort it properly like [03,27,55].

How does one get around this?
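One likely explanation, assuming the numbers were typed in as quoted strings (like '3'), is that pandas compares them character by character as text, so '27' lands before '3' because '2' comes before '3'. Below is a minimal sketch of that guess with a made-up 'score' column; converting the column with pd.to_numeric is one possible way to get numeric ordering.

```python
import pandas as pd

# Assumption: the values are stored as strings, not integers.
df = pd.DataFrame({"score": ["27", "3", "55"]})

# Sorting text compares characters left to right, so "27" < "3" < "55".
sorted_as_text = df.sort_values(by="score")["score"].tolist()

# Convert the strings to numbers, then sort on the numeric column.
df["score_num"] = pd.to_numeric(df["score"])
sorted_as_numbers = df.sort_values(by="score_num")["score_num"].tolist()

print(sorted_as_text)
print(sorted_as_numbers)
```

Whether this matches the actual data here is unverified; checking df.dtypes would show whether the column is stored as text (object) or as a numeric type.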

Alas, this shall be a conundrum for tomorrow’s Isaac.

Godspeed future me 🙏🏼
