Python Day 61 – shifting and rolling data
Today I finished the module on time series! I learned: One can shift data forward or backward by a set number of periods (or a time offset) with .shift(). One can use ‘rolling windows’ to average out data over a period of time. This can be done with a window that spans both past and future data (by setting center=True), but is usually done with just past…
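A minimal sketch of shifting and rolling, assuming a small made-up daily series (the dates and values are hypothetical, picked only so the arithmetic is easy to follow):

```python
import pandas as pd

# Hypothetical daily readings, invented for illustration.
idx = pd.date_range("2021-01-01", periods=5, freq="D")
s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], index=idx)

# Move every value one row later; the first row has nothing to pull from, so it is NaN.
shifted = s.shift(1)

# Trailing window: each mean uses the current value and the two before it.
trailing = s.rolling(window=3).mean()

# Centered window: each mean uses one value before, the current one, and one after.
centered = s.rolling(window=3, center=True).mean()

print(pd.DataFrame({"raw": s, "shifted": shifted, "trailing": trailing, "centered": centered}))
```

The trailing version leaves the first two rows empty, while the centered version leaves one empty row at each end, which is one way to see the difference between the two settings.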
Tag Archives: pandas
Python Day 58 – more datetime stuffs
I figured out why it wasn’t letting me subtract dates before! One value was a date (year, month, day) and the other was a date AND time (including hours, minutes, seconds). The difference really came down to using .datetime versus .date. So I made them both .datetime and it worked!
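Here is a rough sketch of that mismatch, assuming the values were plain objects from Python's standard datetime module (the post doesn't say exactly which types were involved, and the dates below are made up):

```python
from datetime import date, datetime

d = date(2021, 3, 1)               # date only
dt = datetime(2021, 3, 5, 14, 30)  # date and time

mismatch_error = None
try:
    dt - d                         # mixing the two types is not supported
except TypeError as err:
    mismatch_error = err

# Promote the date to a datetime at midnight so both sides match.
d_as_dt = datetime.combine(d, datetime.min.time())
delta = dt - d_as_dt

print(mismatch_error)
print(delta)
```

Going the other way (calling .date() on the datetime) also works if the time of day doesn't matter.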
Python Day 57 – timestamps
Today I continued with the data wrangling and learned about timestamps. I learned: Timestamps are the most basic form of time series data in Pandas. Apparently they show up in almost all data (usually recording when the data event happened). When loaded from a file they typically start out as strings, but you can convert them to datetime format (so we can…
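A small sketch of the string-to-datetime conversion, assuming a hypothetical column called event_time with made-up values:

```python
import pandas as pd

# Hypothetical event log; the timestamps start out as plain strings.
df = pd.DataFrame({"event_time": ["2021-03-01 08:15:00", "2021-03-02 17:40:00"]})

# Convert the strings to real datetime values.
df["event_time"] = pd.to_datetime(df["event_time"])

# Once converted, the .dt accessor exposes pieces such as the hour.
hours = df["event_time"].dt.hour

print(df.dtypes)
print(hours.tolist())
```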
Python Day 54 – intro to regex
Went through some tutorials for regex today… I learned: \d matches any single digit from 0 to 9. A period (.) is a wildcard that matches any single character (except a newline, by default). To match a literal period, use \. To match one of a specific set of characters, use []. For example, [abc]an matches a, b, or c followed by ‘an’ (such as ‘can’ or ‘ban’). To…
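These patterns can be tried out with Python's built-in re module. The sample strings below are made up for illustration:

```python
import re

# \d matches one digit at a time.
digits = re.findall(r"\d", "room 42b")

# . matches any single character between 'a' and 'c'.
wildcard = re.findall(r"a.c", "abc a-c ac")

# \. matches only a literal period.
literal_dot = re.findall(r"3\.14", "3.14 3x14")

# [abc]an matches 'a', 'b', or 'c' followed by 'an'.
char_set = re.findall(r"[abc]an", "can man ban fan")

print(digits)       # ['4', '2']
print(wildcard)     # ['abc', 'a-c']
print(literal_dot)  # ['3.14']
print(char_set)     # ['can', 'ban']
```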
Python Day 53 – Pandas array operators
Update from yesterday’s problem: to take the mean, one simply needs to select the column first, for instance data['column_name'].mean(). Today… I learned: How to use basic string operations on columns with pandas. One just needs to select a column and then add the .str accessor before whatever string operation they want to do.
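A quick sketch of both ideas, assuming a made-up table with hypothetical name and score columns:

```python
import pandas as pd

# Hypothetical data with some messy strings.
data = pd.DataFrame({"name": ["  ada ", "GRACE"], "score": [90, 80]})

# Select the column first, then take its mean.
avg = data["score"].mean()

# .str applies a string method to every value in the column; calls can be chained.
clean = data["name"].str.strip().str.title()

print(avg)
print(clean.tolist())
```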
Python Day 51 – formatting .txt in Pandas
Success! In a matter of minutes I solved the problem that was vexing me yesterday. I sense a pattern here… yay for a fresh mind. I learned: When getting data from a .txt file, one can still use the pd.read_csv function. The issue I was running into yesterday was that it was reading the column…
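The excerpt cuts off before saying what the fix was, so the sketch below is not a reconstruction of it. It only illustrates one common situation, assuming a tab-separated text file (simulated here with an in-memory string so it runs without any file on disk):

```python
import io
import pandas as pd

# Stand-in for the contents of a hypothetical tab-separated .txt file.
raw_text = "name\tscore\nAda\t90\nGrace\t80\n"

# pd.read_csv is not limited to commas; sep tells it which delimiter to split on.
df = pd.read_csv(io.StringIO(raw_text), sep="\t")

print(df)
```

Without sep="\t", each line would land in a single column, since the default delimiter is a comma.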
Python Day 50 – Pandas practice problems
Today I practiced importing various data into Python with Pandas and looking at it in different ways. I learned: To use pd.read_csv("folder/data_name.csv") to load a dataset from an outside source, and then .sort_values(by='category', ascending=False) to sort the values however I wish. Also, to chart data, use the df3.plot(x…
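A minimal sorting sketch. The small DataFrame here is made up and stands in for whatever pd.read_csv would return, so the example runs without a file:

```python
import pandas as pd

# Hypothetical data standing in for a loaded CSV.
df = pd.DataFrame({"category": ["b", "a", "c"], "value": [2, 3, 1]})

# Sort by the 'category' column, largest (last alphabetically) first.
sorted_df = df.sort_values(by="category", ascending=False)

print(sorted_df)
```

sort_values returns a new DataFrame and leaves df unchanged, so the result needs to be assigned to a name to be kept.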
Python Day 48 – the roller coaster of SQL
I started off the day very frustrated. The smallest syntax errors were messing me up. I learned the hard way to always put a comma between the columns I SELECT in SQL, and to make sure to put a backslash at the end of a line when I’m splitting the query across lines. After these initial frustrations I was almost ready to put the computer…
Python Day 46 – more SQL
Today I continued working on SQL… I learned: To count the rows in a table (or in each group), one can use COUNT(*), and attach “AS new_category_name” to give the resulting count column a name! This was useful in the example data of Nobel laureates to count how many prizes each country won.
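A sketch of the counting query, run through Python's built-in sqlite3 module. The table name, column names, and rows are all placeholders invented for illustration, not the actual Nobel dataset, and GROUP BY is an assumption about how the per-country count was done:

```python
import sqlite3

# In-memory database with a few placeholder rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE laureates (name TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO laureates VALUES (?, ?)",
    [("laureate_1", "Country A"), ("laureate_2", "Country A"), ("laureate_3", "Country B")],
)

# COUNT(*) counts the rows in each group; AS names the resulting column.
rows = conn.execute(
    "SELECT country, COUNT(*) AS prize_count "
    "FROM laureates "
    "GROUP BY country "
    "ORDER BY prize_count DESC"
).fetchall()
conn.close()

print(rows)  # [('Country A', 2), ('Country B', 1)]
```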
Python Day 45 – progress on SQL!
Today I didn’t think I’d get much done, but I actually felt like I understood the bare basics of SQL! I learned: Use * to select all columns from a table. To load query results into a pandas DataFrame, use the .read_sql() function. I also learned the basics of SELECT, FROM, and conditional statements (for instance how to…
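A small sketch of pulling SQL results into pandas, assuming an in-memory SQLite database with a hypothetical scores table (the post doesn't say which database was used):

```python
import sqlite3
import pandas as pd

# Placeholder database and table, made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("Ada", 90), ("Grace", 80), ("Linus", 70)],
)
conn.commit()

# SELECT * returns every column of the table as a DataFrame.
all_rows = pd.read_sql("SELECT * FROM scores", conn)

# A WHERE clause keeps only the rows that meet the condition.
high = pd.read_sql("SELECT name FROM scores WHERE score >= 80 ORDER BY name", conn)
conn.close()

print(all_rows)
print(high)
```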