Published On: March 29th, 2023
Categories: AI News

Measuring The Speed of New Pandas 2.0 Against Polars and Datatable — ...

Image by author via Midjourney

People have been complaining about Pandas’ speed ever since they tried reading their first gigabyte-sized dataset with read_csv and realized they had to wait for – gasp – five seconds. And yes, I was one of those complainers.

Five seconds might not sound like a lot, but when loading the dataset alone takes that long, it usually means every subsequent operation will take about as long. And since speed is one of the most essential things in quick-and-dirty data exploration, you can get very frustrated.

For this reason, folks at PyData recently announced the planned release of Pandas 2.0 with the freshly minted PyArrow backend. For those unfamiliar, PyArrow is the Python interface to Apache Arrow, a nifty library designed for high-performance, memory-efficient manipulation of arrays.

People sincerely hope the new backend will bring considerable speed-ups…
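To make the idea concrete, here is a minimal sketch of how one might time read_csv with the default backend and with the PyArrow backend. It is not the benchmark from the original article: the synthetic file, its column names, and the helper functions make_sample_csv and timed_read are all hypothetical, and the PyArrow branch assumes Pandas 2.0 or later with the optional pyarrow package installed.

```python
import importlib.util
import tempfile
import time
from pathlib import Path

import numpy as np
import pandas as pd


def make_sample_csv(path, n_rows=100_000, seed=0):
    """Write a synthetic CSV (hypothetical data, for illustration only)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame(
        {
            "id": np.arange(n_rows),
            "value": rng.normal(size=n_rows),
            "group": rng.choice(["a", "b", "c"], size=n_rows),
        }
    )
    df.to_csv(path, index=False)


def timed_read(path, **read_csv_kwargs):
    """Return (DataFrame, elapsed seconds) for a single pd.read_csv call."""
    start = time.perf_counter()
    df = pd.read_csv(path, **read_csv_kwargs)
    return df, time.perf_counter() - start


with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "sample.csv"
    make_sample_csv(csv_path)

    # Default parser and NumPy-backed dtypes.
    df_default, t_default = timed_read(csv_path)
    print(f"default backend: {t_default:.3f}s")

    # The PyArrow path needs Pandas >= 2.0 and the optional pyarrow package,
    # so it is skipped when either is missing.
    has_pyarrow = importlib.util.find_spec("pyarrow") is not None
    is_pandas_2 = int(pd.__version__.split(".")[0]) >= 2
    if has_pyarrow and is_pandas_2:
        df_arrow, t_arrow = timed_read(
            csv_path, engine="pyarrow", dtype_backend="pyarrow"
        )
        print(f"pyarrow backend: {t_arrow:.3f}s")
        print(df_arrow.dtypes)
```

A single run on a small file like this says little about real workloads; repeating the reads several times on a gigabyte-sized dataset would give a fairer picture.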

Continue reading this article at:

https://towardsdatascience.com/measuring-the-speed-of-new-pandas-2-0-against-polars-and-datatable-still-not-good-enough-e44dc78f6585?source=rss----7f60cf5620c9---4

Feed Name : Towards Data Science – Medium