Download speed is very slow

Topics: Developer Forum, Project Management Forum, User Forum
Aug 20, 2009 at 3:21 AM

I am using the latest version of the source code.

I wrote a multi-threaded crawler using the APIs provided by this library. I started crawling user information, including each user's name, the user's photos, and the comments associated with each photo. The problem is the download speed: in one hour, I only crawled 20 users. Is it possible to speed it up?


I also found a suspected bug that needs to be confirmed. It has occurred several times at line 54 of LockFile.cs, when I run two instances of the program. It seems that the "lock file" (FlickrNet.lock) is being accessed by multiple processes at once. I need more time to confirm the conditions under which this bug occurs.

Aug 20, 2009 at 9:24 AM

On line 67 if you change int i = GenerateReallySlowInt(); to int i = ReallyQuickInt() then it will be much faster.

But seriously, how am I meant to know when you don't tell us what you are doing? Which API methods are you calling? How many photos do these users have?

Generally, trying to crawl the Flickr database as fast as you can is a good way to get yourself banned - 1 query per second is the generally agreed limit.
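To make the one-query-per-second guideline concrete for a multi-threaded crawler, here is a minimal sketch in Python. It is not part of FlickrNet; the `Throttle` class and its `clock` and `sleep` parameters are hypothetical names chosen for illustration. The idea is that every worker thread calls `wait()` before each API request, so the combined rate stays under the limit.

```python
import threading
import time


class Throttle:
    """Hypothetical helper: spaces calls at least `min_interval` seconds apart,
    shared across all threads that use the same instance."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock      # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_allowed = 0.0

    def wait(self):
        """Block until this caller's reserved slot; return the delay in seconds."""
        # Reserve the next free slot while holding the lock, then sleep outside
        # the lock so other threads can reserve their own later slots.
        with self._lock:
            now = self._clock()
            slot = max(now, self._next_allowed)
            self._next_allowed = slot + self._min_interval
        delay = slot - now
        if delay > 0:
            self._sleep(delay)
        return delay
```

Each worker would share one `Throttle(1.0)` instance and call `throttle.wait()` immediately before every request.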


Aug 20, 2009 at 5:17 PM

Here are the routines I call:

1. flickr.PeopleFindByUsername to get the user info

2. flickr.ContactsGetPublicList to get the contacts

3. flickr.PeopleGetPublicGroups to get the groups the user belongs to

4. flickr.PhotosSearch to retrieve the photos belonging to this user

        4.1 flickr.TagsGetListPhoto to get the tags associated with this photo

        4.2 flickr.PhotosCommentsGetList to get the comments

Yeah, it's a good idea to wait for a short time between two queries. I also figured out that lots of people have as many as 3000 photos uploaded, which is probably why it takes a long time.
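A rough way to see why a 3000-photo user takes so long is to count the requests the routine above makes. The sketch below is only an estimate under stated assumptions: three fixed calls per user (steps 1 to 3), one search call per page of results, and two calls per photo (steps 4.1 and 4.2). The page size of 100 is an assumption, and paging of contacts and groups is ignored. The function names are hypothetical.

```python
import math


def estimate_requests(photo_count, per_page=100):
    """Estimate API requests needed to crawl one user with the routine above."""
    fixed = 3                                                 # steps 1, 2 and 3
    search_pages = max(1, math.ceil(photo_count / per_page))  # step 4, assumed page size
    per_photo = 2 * photo_count                               # steps 4.1 and 4.2
    return fixed + search_pages + per_photo


def estimate_seconds(photo_count, requests_per_second=1.0, per_page=100):
    """Estimate crawl time for one user at a fixed request rate."""
    return estimate_requests(photo_count, per_page) / requests_per_second
```

Under these assumptions, `estimate_requests(3000)` gives 6033 requests, which is roughly 100 minutes for a single user at one request per second.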

By the way, what do you mean by:

On line 67 if you change int i = GenerateReallySlowInt(); to int i = ReallyQuickInt() then it will be much faster.

I can't find the "GenerateReallySlowInt" function call anywhere in the project. I downloaded the source code from this website; it is version 2.2.0, released on Mar 10 2009.




Aug 20, 2009 at 8:48 PM

I also wonder whether it is possible to retrieve all the tags related to a person without iterating over each photo. It seems that there is no such API.

Otherwise, I will have to put a limit on the number of photos to be crawled for each person.


Aug 21, 2009 at 12:48 PM

I was being sarcastic about the GenerateReallySlowInt() method - sorry, sarcasm obviously doesn't come across well in text.


If you pass PhotoSearchExtras.Tags in to the PhotosSearch method then you can get all the tags for each photo at the same time.
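For illustration, here is roughly what that looks like at the level of the underlying Flickr REST method, `flickr.photos.search`, with `extras=tags`. This is a Python sketch rather than FlickrNet code, and it makes no network call: it only builds the request parameters and reads tags out of an already-parsed JSON response. The function names are hypothetical, and the response shape (a space-separated `tags` string on each photo) is assumed from the JSON format of the REST API.

```python
def build_search_params(api_key, user_id, page=1, per_page=100):
    """Build query parameters for flickr.photos.search that also request tags."""
    return {
        "method": "flickr.photos.search",
        "api_key": api_key,
        "user_id": user_id,
        "extras": "tags",          # ask for each photo's tags in the same response
        "per_page": str(per_page),
        "page": str(page),
        "format": "json",
        "nojsoncallback": "1",
    }


def tags_by_photo(response):
    """Map photo id -> list of tags from a parsed search response (assumed shape)."""
    photos = response.get("photos", {}).get("photo", [])
    # A photo without tags yields an empty list.
    return {p["id"]: p.get("tags", "").split() for p in photos}
```

Getting the tags this way removes the separate per-photo tag request, so only the comments call remains per photo.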

The slowest method there is probably going to be PhotosCommentsGetList. I don't know what the purpose of your program is, but think carefully about whether you really need to call this method in advance for all photos.