Today I finished the module on time series!
I learned:
One can shift data by a set number of periods with .shift(). One can use ‘rolling windows’ (.rolling()) to average data over a period of time. The window can be centered on each point so that it uses both past and future data (center=True), but it usually covers just the current and past data (center=False, the default). I also learned that one can plot data directly from Pandas with the .plot() method, without the explicit matplotlib import I was doing earlier (Pandas still uses matplotlib behind the scenes).
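To make that concrete for myself, here is a small sketch of shifting and rolling averages. The week of daily numbers and the column name value are made up for illustration; they are not from the module.

```python
import pandas as pd

# Made-up example data: one reading per day for a week
idx = pd.date_range("2024-01-01", periods=7, freq="D")
df = pd.DataFrame({"value": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]}, index=idx)

# Shift values forward by one row (one day); the first row becomes NaN
df["shifted"] = df["value"].shift(1)

# Trailing 3-day average: the current row plus the two before it
df["trailing"] = df["value"].rolling(window=3).mean()

# Centered 3-day average: one row before, the current row, one row after
df["centered"] = df["value"].rolling(window=3, center=True).mean()

print(df)

# df.plot() would draw every column as a line (matplotlib must be installed)
```

The trailing average is empty for the first two days because there is not enough history yet, while the centered one is empty on the first and last day.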
On a conceptual level, I’m starting to grasp the difference between series and dataframes. A dataframe is a table with rows and columns, while a series is a single column of values together with its index. The importance of this comes when one calls .resample() on their data: on its own it returns a Resampler object rather than a dataframe or series, so methods such as .sort_values() are not available yet. To get a regular dataframe back while resampling, one can add .mean() or .interpolate() to the end of the line. This is useful when trying to create averages over custom periods (for instance by week instead of by minute). This is also how to sort values after using the .max() method on a series, which solves the issue I had yesterday!
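A rough sketch of the resampling point, again with made-up data (two weeks of daily numbers, a column I named value, and a "7D" period that I picked as an assumption for the example):

```python
import pandas as pd

# Made-up example data: one reading per day for two weeks (values 0..13)
idx = pd.date_range("2024-01-01", periods=14, freq="D")
df = pd.DataFrame({"value": list(range(14))}, index=idx)

# .resample() alone gives a Resampler object, not a dataframe or series
resampler = df.resample("7D")
print(type(resampler))

# Adding an aggregation such as .mean() turns it back into a dataframe
weekly_mean = resampler.mean()
print(weekly_mean)

# The same works on a single column: .max() gives back a series,
# which can then be sorted like any other series
weekly_max = df["value"].resample("7D").max()
print(weekly_max.sort_values(ascending=False))
```

Once the aggregation is applied, the usual sorting methods are available again.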
Things are making sense!
I am still confused about:
How to run all the code in a newly opened Jupyter notebook. That’s really all I’m confused about at this point haha. I think it’s gotten to the point where I’ll phone a friend for this one. So hopefully this will be sorted out by tomorrow’s blog!