Thread EDB Server will use only a single CPU core
Fri, Oct 7 2011 1:58 AM

TonyWood

I'm trying to speed up the loading of a retail database so that our operators can migrate retail legacy systems to Elevate within a tight time-frame, i.e. overnight. The last step is to load a target ElevateDB database of about 250 tables and (worst case) about 20 million rows, using 'IMPORT TABLE' queries that load from CSV files.

With a single client instance (i.e. EDB Manager) this script takes about 75 minutes, with edbsrvr running at 6% on a 16 CPU system. I have tried splitting the load between multiple instances of a VB console client program (i.e. not using the EDB Manager), taking the foreign key dependencies into account, each talking to the server via ODBC. These instances are single-threaded separate processes. Disappointingly, edbsrvr appears to keep using just a single CPU core (i.e. 6% of total) when serving concurrent requests from 16 clients, although the EDB v2 SQL manual says that

'The ElevateDB Server is multi-threaded and uses one thread per session connection'

Thanks in advance for any guidance on this.
Tony Wood





Fri, Oct 7 2011 8:52 AM

Raul

Team Elevate


Those are technically two different aspects. EDB Server is multi-threaded: if you use Task Manager to show how many threads the process has, EDB Server starts with 5 in my case and the count goes up by 1 every time a client (EDB Manager) connects. Check the thread count of your EDB Server instance.

The issue you are seeing is workload distribution across CPUs, and that depends on many other factors, including whether you are actually CPU-bound (and not disk- or network-bound) and whether the work can be distributed (no data affinity etc.).

On my Core i7 (8 virtual CPUs in Windows) I'm definitely seeing edbmanager being run on at least 5 of those cores, but I don't have anything to load it with to see how it behaves: doing an import or SQL update takes under a second and spikes nothing.

Raul

Fri, Oct 7 2011 8:53 AM

Raul

Team Elevate

Sorry - meant edbserver being run on at least 5 cores...

Raul

<<
On my core i7 (8 virtual cpus in windows) i'm definitely seeing edbmanager being run on at least 5 of those cores but i don't have anything to load at it to see if it process - doing an import or sql update takes sub-seconds and spikes nothing
>>
Tue, Oct 11 2011 3:26 PM

Tim Young [Elevate Software]

Elevate Software, Inc.

Email timyoung@elevatesoft.com

Tony,

<< With a single client instance (i.e. EDB Manager) this script takes about
75 minutes, with edbsrvr running at 6% on a 16 cpu system. I have tried
splitting the load between multiple instances of VB console application
client programs (i.e not using the EDB Manager) [taking into account the
foreign key dependencies] talking to the server via ODBC. These instances
are single threaded separate processes. Disappointingly, it appears that
edbsrvr continues to use just a single cpu core (i.e. 6% of total), when
serving concurrent requests from 16 clients, although the EDB v2 SQL manual
says that >>

That is not correct.  The EDB Server leaves the scheduling of the threads on
multiple processors/cores to the operating system, and the operating system
should schedule the threads on different cores if the current core is busy.

See here for more information:

http://www.msfn.org/board/topic/142203-server-2008-r2-thread-scheduling/

--
Tim Young
Elevate Software
www.elevatesoft.com
Tue, Oct 18 2011 12:43 AM

TonyWood

I can't figure out where I'm going wrong with this, then. No matter what I try, I've NEVER seen an instance of edbsrvr use more than 100% of the processor cycles of a single CPU core on a multi-core system. I've seen instances sit happily for a while (several minutes) at 99-100% while loading tables from CSV files, but not more.

I have tried multiple clients talking to multiple edbsrvrs on different ports (where the number of clients and servers equals the number of CPU cores, on both 8 and 16 core machines). This is the best config so far, as you may get 2 edbsrvrs at 100%, or more typically one at 100% and one at approx. 50%. It seems as though no more than 2 servers will get significant CPU time.

I have tried multiple clients talking to a single edbsrvr. Again, this single edbsrvr runs at 100% at most, so there is no improvement over using a single client. However, the fact that it runs at 100% suggests that the processing is not limited by waiting for I/O (i.e. disk, network access etc.). Resource Monitor tells me that, running like this, edbsrvr has 13-14 threads.

I've tried rewriting the clients from .NET VB to Cygwin / Perl in case there was anything inherently non-parallel in the client code; it didn't make any difference.

The bottom line is that after several weeks of trying with multiple clients, I've been unable to do any better time-wise than a single client loading SQL sequentially into a single edbsrvr.

(When I say the process has 100% of a single CPU core, it displays in Task Manager at 13% on an 8 core machine (I guess this is 12.5% rounded up) and 6% on a 16 core machine.)
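As a quick check of the arithmetic in the parenthetical above, here is a minimal Python sketch. The function name `task_manager_percent` is made up for illustration; it only assumes that Task Manager reports a process's CPU use as a share of all logical cores combined.

```python
def task_manager_percent(busy_cores: float, total_cores: int) -> float:
    """Whole-machine CPU percentage for a process keeping `busy_cores` cores fully busy.

    Assumption: the displayed figure is the share of ALL logical cores,
    so one saturated core on an N-core machine shows as 100/N percent.
    """
    return 100.0 * busy_cores / total_cores


for cores in (8, 16):
    # One fully busy core: 12.5% on 8 cores, 6.25% on 16 cores.
    print(f"{cores} cores: {task_manager_percent(1, cores)}%")
```

So a reading of about 13% on the 8 core machine and about 6% on the 16 core machine is what one fully saturated core would look like; a process spreading work over all cores would read closer to 100%.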

I'm running ElevateDB 2.05b11 on 64-bit Windows 7 with a 'remote' server at 127.0.0.1.
Any clues as to where the bottleneck may be?
Thanks
Tony Wood
Tue, Oct 18 2011 3:50 AM

Roy Lambert

NLH Associates

Team Elevate

Tony


I don't have a wizzy machine like yours so I can't try anything out. However, trusting in what Raul and Tim say makes me wonder if it's the type of operation you're carrying out that's the problem.

I'm guessing here. Tim will tell us if I'm totally wrong. Inserting data into a table requires a write lock, and each insert is wrapped in an implicit transaction. With the process underway on one core, the others simply never get the chance to grab the table in between.

Roy Lambert
Tue, Oct 18 2011 6:51 PM

TonyWood

Thanks for the response Roy.

I understand your point but, to go into a little more detail on my setup, I'm using different clients to load different tables from CSV, i.e. two different clients will never write to the same table. Also, I've split the load process up into 'generations', where the first generation of tables to be loaded has no foreign key constraints, so I can see no reason why these can't be loaded in parallel.
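The 'generations' idea can be sketched as a layered ordering of the foreign key graph. This is only an illustration in Python, not the loader actually used here: the table names and the `load_generations` helper are hypothetical, and the mapping is assumed to list, for each table, the tables its foreign keys reference.

```python
from __future__ import annotations


def load_generations(depends_on: dict[str, set[str]]) -> list[list[str]]:
    """Group tables so that each generation only references earlier generations."""
    remaining = {table: set(deps) for table, deps in depends_on.items()}
    loaded: set[str] = set()
    generations: list[list[str]] = []
    while remaining:
        # A table is ready once every table it references has been loaded.
        ready = sorted(t for t, deps in remaining.items() if deps <= loaded)
        if not ready:
            # Circular (or self-referencing) foreign keys, or a reference
            # to a table missing from the mapping.
            raise ValueError(f"cannot order tables: {sorted(remaining)}")
        generations.append(ready)
        loaded.update(ready)
        for table in ready:
            del remaining[table]
    return generations


# Hypothetical schema: table -> tables referenced by its foreign keys.
schema = {
    "Store": set(),
    "Product": set(),
    "Sale": {"Store"},
    "SaleLine": {"Sale", "Product"},
}
print(load_generations(schema))
# [['Product', 'Store'], ['Sale'], ['SaleLine']]
```

Tables within one generation have no foreign keys between them, so in principle each can be handed to a separate client; the next generation starts only after the previous one has finished.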

Anyway, cheers from sunny Melbourne, Aus.
Tony Wood
Wed, Oct 19 2011 3:43 AM

Roy Lambert

NLH Associates

Team Elevate

Tony

>I understand your point but, to go into a little more detail on my setup, I'm using different clients to load different tables from CSV, i.e. two different clients will never write to the same table. Also, I've split the load process up into 'generations', where the first generation of tables to be loaded has no foreign key constraints, so I can see no reason why these can't be loaded in parallel.

In those circumstances neither do I.

Roy Lambert [Team Elevate]
Thu, Oct 20 2011 5:26 PM

Raul

Team Elevate

Tony Wood,

Tim is likely the only one able to answer this question properly, as it requires fairly detailed edbsrvr and EDB knowledge.

However, I will speculate - beware that what follows is likely wrong:

First, are you using edbsrvr as shipped by Elevate?
If you check it in Task Manager and use the "Set affinity" option, does it show that all CPUs are selected? This is just to ensure it's not limited in which CPUs it can access.

It is quite possible that your scenario is running into a bottleneck either in edbsrvr or in the EDB engine itself. As has been mentioned, edbsrvr is multi-threaded, but I don't know, for example, whether all those threads have individual access to actually write data or whether that is handled by one or a few main engine threads. Hence your scenario, where you are basically acting as a data pump, might result in database writes queuing up. Alternatively, there will be some lock management and other control going on inside the edbsrvr/engine, so again you might have hit the maximum throughput.
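To put rough numbers on the "writes queuing up" speculation, here is a small Python sketch of Amdahl's law. The serial fractions are made-up values chosen purely for illustration, not measurements of edbsrvr; only the 75 minute single-client time and the 16 clients come from this thread.

```python
def amdahl_speedup(serial_fraction: float, workers: int) -> float:
    """Upper bound on speedup when `serial_fraction` of the work cannot run in parallel."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)


BASELINE_MINUTES = 75  # single-client load time reported earlier in the thread
CLIENTS = 16           # number of concurrent clients tried

# Hypothetical serial fractions; the real value for edbsrvr is unknown.
for serial in (0.0, 0.5, 0.9, 1.0):
    speedup = amdahl_speedup(serial, CLIENTS)
    print(f"serial fraction {serial:.1f}: at most {speedup:.2f}x, "
          f"about {BASELINE_MINUTES / speedup:.1f} minutes")
```

If nearly all of the work were funnelled through a single engine thread or lock (a serial fraction close to 1.0), the bound would stay close to 1x no matter how many clients are added, which would be consistent with seeing no improvement over a single client.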

Running individual apps against their own edbsrvr does seem to indicate that there is a bottleneck somewhere (since disk and network should have limited these similarly to a single edbsrvr, and you're not seeing that).

Do you need to use edbsrvr? It is after all just another EDB application (though written by the author of EDB), so running in direct file access mode would let you cut out the overhead of IP communication and theoretically get maximum performance. You mentioned you can load individual tables independently, so opening a local DB file directly with exclusive access might turn out faster. It might be worthwhile to switch to a local session in your loader app and see what kind of speed you can achieve. Possibly a hybrid solution might work, depending on your data: run the DB locally and the smaller (if there is such a thing) remotely back to the central edbsrvr.

Raul


Fri, Oct 21 2011 2:59 AM

Roy Lambert

NLH Associates

Team Elevate

Raul


>Tim is likely the only one able to answer this question properly, as it requires fairly detailed edbsrvr and EDB knowledge.

Hear hear <vbg>

>However, I will speculate - beware that what follows is likely wrong:
>
>First, are you using edbsrvr as shipped by Elevate?
>If you check it in Task Manager and use the "Set affinity" option, does it show that all CPUs are selected? This is just to ensure it's not limited in which CPUs it can access.
>
>It is quite possible that your scenario is running into a bottleneck either in edbsrvr or in the EDB engine itself. As has been mentioned, edbsrvr is multi-threaded, but I don't know, for example, whether all those threads have individual access to actually write data or whether that is handled by one or a few main engine threads. Hence your scenario, where you are basically acting as a data pump, might result in database writes queuing up. Alternatively, there will be some lock management and other control going on inside the edbsrvr/engine, so again you might have hit the maximum throughput.

This is sort of what I was saying, but since Tony's running several instances of the server it's unlikely to be the problem. Then again, I could be wrong.

>Running individual apps against their own edbsrvr does seem to indicate that there is a bottleneck somewhere (since disk and network should have limited these similarly to a single edbsrvr, and you're not seeing that).
>
>Do you need to use edbsrvr? It is after all just another EDB application (though written by the author of EDB), so running in direct file access mode would let you cut out the overhead of IP communication and theoretically get maximum performance. You mentioned you can load individual tables independently, so opening a local DB file directly with exclusive access might turn out faster. It might be worthwhile to switch to a local session in your loader app and see what kind of speed you can achieve. Possibly a hybrid solution might work, depending on your data: run the DB locally and the smaller (if there is such a thing) remotely back to the central edbsrvr.

That's a neat idea.

Roy Lambert [Team Elevate]