When we were loading data into partitioned tables, we got our AppDev folks to provide the data in separate files, one per partition. Data loading just FLEW. Sorry if this doesn't help your situation, but it may be something to discuss for future loads.
I was wondering if anyone has experience bulk loading data into partitioned tables. I have run some tests, and a bulk load (insert append) into a partitioned table is actually about 40% more costly. For example, loading an 80-million-row table takes around 8 minutes, whereas with a plain heap table it takes only 5.
Test setup:
- LMT with 16 MB uniform extents
- No ASSM
- Parallel DML
- Parallel query, degree 16
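For reference, a minimal sketch of the kind of load described above (direct-path insert with parallel DML at degree 16); the table names `t_part` and `stage` are hypothetical placeholders, not from the original post:

```sql
-- Parallel DML must be enabled at the session level before the insert,
-- or the PARALLEL hint on the insert will not take effect for DML.
ALTER SESSION ENABLE PARALLEL DML;

-- Direct-path (APPEND) parallel insert into the partitioned table.
-- Table and source names here are hypothetical.
INSERT /*+ APPEND PARALLEL(t_part, 16) */ INTO t_part
SELECT /*+ PARALLEL(stage, 16) */ *
FROM stage;

COMMIT;
```

Note that after a direct-path insert, the loaded data is not visible to the same session until the transaction commits.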