Simple network math

One of the most frequent support questions is: "How quickly can I collect 10000 real email addresses?"

There is no general answer, for the following reasons:

The results you get from EmEx 3 depend on your search sources.

EmEx 3 processes the given documents and the links they contain. If the required information is present there, EmEx will extract it. If the information is absent, there is no chance of finding and extracting it. Perhaps only one document will contain the required information, or it may be possible to collect it from thousands of pages. So everything depends on you: how you set up the software, what you specify as the search source, and whether the required information is actually present in that source.

The processing speed depends on your internet connection and on your provider's (ISP) channel.

It is simple math, nothing complicated.

Suppose your connection speed is 512 Kb/s (kilobits per second). In theory, this means you can download files and documents at a maximum of 64 KB/s (kilobytes per second). An average modern HTML page weighs up to 150 KB, so downloading such a page takes 2-3 seconds.
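The arithmetic above can be checked with a few lines of Python. This is only a sketch of the ideal case: it reads the connection speed as kilobits per second and the page weight as kilobytes, and the function name page_download_seconds is made up for this illustration.

```python
def page_download_seconds(link_kbit_per_s: float, page_kbyte: float) -> float:
    """Ideal time to download one page, ignoring latency and protocol overhead."""
    # 8 bits per byte: 512 kilobits/s is 64 kilobytes/s
    link_kbyte_per_s = link_kbit_per_s / 8
    return page_kbyte / link_kbyte_per_s


# 150 KB page over a 512 Kb/s connection
print(page_download_seconds(512, 150))  # 2.34375
```

The result falls inside the 2-3 second range quoted above.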

If we run 10 streams at the same time, we still cannot exceed 64 KB/s in total, because that is the limit of our channel.
Under these ideal conditions, processing one page takes 2-3 seconds on average, so processing 100 pages takes 200-300 seconds.
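A rough model shows why extra streams do not shorten the total download time once the channel is the bottleneck. It assumes the streams share the channel equally and the pages are divided evenly between them; both are simplifications, and batch_seconds is a name invented for this sketch.

```python
def batch_seconds(pages: int, page_kbyte: float,
                  link_kbit_per_s: float, streams: int = 1) -> float:
    """Ideal time to download `pages` pages over a shared channel."""
    channel_kbyte_per_s = link_kbit_per_s / 8          # 512 kilobits/s -> 64 kilobytes/s
    per_stream_rate = channel_kbyte_per_s / streams    # streams split the channel equally
    seconds_per_page = page_kbyte / per_stream_rate    # each page downloads more slowly
    pages_per_stream = pages / streams                 # but each stream has fewer pages
    return pages_per_stream * seconds_per_page


for n in (1, 10):
    print(n, "stream(s):", f"{batch_seconds(100, 150, 512, n):.1f}", "seconds")
```

With 10 streams each page takes ten times longer, but each stream handles a tenth of the pages, so the total stays at roughly 234 seconds either way.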

Take into account that the server needs some time, and some outgoing traffic (~10-20%), before it starts transferring the requested page to you. Between the page request and the start of the download, the request must be sent to the server (HTTP request) and the answer received (HTTP headers). Only after that does the transfer of the page to your computer begin.

There are also other factors: how heavily the channel is already being used and how quickly the servers respond.

In the end, with a 512 Kb/s connection we get 200-300 seconds in the best case, and 500 seconds or more in reality.
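The gap between the ideal and the realistic figure can be sketched in the same way. The numbers here are illustrative assumptions, not measurements: 15% traffic overhead (the middle of the 10-20% range mentioned above) and 2.5 seconds of waiting per page for the server to respond, with pages requested one after another. The function name realistic_seconds is also hypothetical.

```python
def realistic_seconds(ideal_seconds: float, pages: int,
                      overhead_fraction: float, wait_per_page_s: float) -> float:
    """Ideal transfer time plus traffic overhead and per-page server wait."""
    transfer = ideal_seconds * (1 + overhead_fraction)  # extra request/header traffic
    waiting = pages * wait_per_page_s                   # time before each download starts
    return transfer + waiting


# Assumed inputs: ~234 s ideal time for 100 pages, 15% overhead, 2.5 s wait per page
estimate = realistic_seconds(234.375, 100, 0.15, 2.5)
print(f"{estimate:.0f} seconds")  # about 520
```

Change the assumed overhead and wait time to match your own connection and the servers you scan; the result is very sensitive to the per-page wait.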

We provide several ways to increase the speed of EmEx: the distributed scan mode, the use of proxy servers, and so on. Test them, use them, and find the solution that works best for you.