to_sql function takes forever to insert in Oracle database
I am using pandas to do some analysis on an Excel file, and once that analysis is complete, I want to insert the resultant dataframe into a database. The dataframe is around 300,000 rows and 27 columns.
I am using the pd.to_sql method to insert the dataframe into the database. With a MySQL database, the insertion takes around 60-90 seconds. However, when I insert the same dataframe with the same function into an Oracle database, the process takes around 2-3 hours to complete.
Relevant code can be found below:

data_frame.to_sql(name='RSA_DATA', con=get_engine(), if_exists='append',
                  index=False, chunksize=config.CHUNK_SIZE)
I tried different chunksize values (from 50 to 3000), but the difference in time was only on the order of 10 minutes.
Any solution to the above problem?
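
For reference, a minimal sketch of that timing comparison, assuming the get_engine() helper and data_frame from the question; each run appends the full dataframe, so it should target a throwaway table:

import time

# Compare wall-clock insert time for a few chunk sizes.
# NOTE: every iteration appends all 300,000 rows again, so run
# this against a scratch table, not production data.
for size in (50, 500, 3000):
    start = time.monotonic()
    data_frame.to_sql(name='RSA_DATA', con=get_engine(), if_exists='append',
                      index=False, chunksize=size)
    print(f'chunksize={size}: {time.monotonic() - start:.0f} s')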
Issue Analytics
- Created 7 years ago
- Comments: 10 (2 by maintainers)
As mentioned by @wuhaochen, I have also run into this problem. For me the issue was that Oracle was creating CLOB columns for all of the string columns of the pandas dataframe. I sped up the code by explicitly setting the dtype parameter of to_sql() and using VARCHAR dtypes for the string columns. I think this should be the default behavior of to_sql, as creating CLOBs is counter-intuitive. to_sql() is still practically broken when working with Oracle without the workaround recommended above.