Problems when querying information for a large number of photos

Topics: Developer Forum
Nov 26, 2012 at 5:44 PM

Hi, thank you for sharing a good open source kit.

For a research project, I want to get the text information (tag names, description) and the geographic coordinates of all photos that lie within a geographic area I define. The extent is usually quite large, for example 112.0-117.0 E, 39.0-42.0 N, so a lot of photos are involved; in this case around 300,000. I have run into two problems.

The first problem: as we know, the Flickr API returns results only page by page, so I call the Flickr.PhotosSearch function repeatedly, changing the page parameter each time. However, after a few calls I lose control of the loop, and the program never finishes, even if the loop runs only 10 or 20 times.

Here is my code:

    using System;
    using System.Linq;   // needed for ElementAt
    using FlickrNet;

    string apikey = "mykey";
    Flickr flickr = new Flickr(apikey);

    PhotoSearchOptions searchOptions = new PhotoSearchOptions();
    searchOptions.MinUploadDate = new DateTime(2000, 1, 1, 1, 0, 0);
    searchOptions.BoundaryBox = new BoundaryBox(112, 39, 117, 42);
    searchOptions.HasGeo = true;
    searchOptions.Page = 1;
    searchOptions.PerPage = 50;

    // First request, used here only to learn the total number of pages
    PhotoCollection resColl = flickr.PhotosSearch(searchOptions);
    int nPages = resColl.Pages;

    // Read the query result page by page
    for (int i = 1; i <= nPages; i++)
    {
        searchOptions.Page = i;
        resColl.Clear();
        resColl = flickr.PhotosSearch(searchOptions);

        int curNum = resColl.Count;
        for (int j = 0; j < curNum; j++)
        {
            string photoid = resColl.ElementAt(j).PhotoId;
            // here I read information about photos
        }
    }

What I mean is that the outer for loop in this code never ends, even when I manually set nPages to a fairly small constant such as 10. Is it possible that the program loses its network connection to the Flickr server after calling PhotosSearch several times in a row?

The second problem is that, in the code above, the variable resColl (an instance of the PhotoCollection class) contains at most 250 entries, even though the search should return more than 250. Is there a size limit on this class?

Coordinator
Nov 26, 2012 at 8:24 PM

flickr.photos.search will return only 4000 photos before it begins to repeat results, so retrieving 300,000 photos with a single query won't work.
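To make that limit concrete, here is a rough sketch of how the page loop could be capped so it never asks for pages past the 4000-photo mark. The class and method names (SearchLimits, UsefulPageCount) are made up for illustration and are not part of FlickrNet; the 4000 figure is taken from the statement above.

```csharp
using System;

public static class SearchLimits
{
    // Taken from the reply above: one search yields at most 4000 distinct photos.
    public const int MaxResultsPerQuery = 4000;

    // Returns how many pages are worth requesting for a single query,
    // given the page count Flickr reports and the PerPage setting in use.
    public static int UsefulPageCount(int reportedPages, int perPage)
    {
        if (perPage <= 0)
            throw new ArgumentOutOfRangeException(nameof(perPage));

        // Ceiling division: number of pages needed to cover 4000 results.
        int cap = (MaxResultsPerQuery + perPage - 1) / perPage;
        return Math.Min(Math.Max(reportedPages, 0), cap);
    }
}
```

With PerPage = 50, as in the code above, this would stop the loop at page 80 regardless of how many pages the first response reports.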

The maximum value for PerPage is 500. Flickr sometimes doesn't return the full 500 images, but I'd hope you would see more than 250.

Consider splitting your search up into Year, or even Month blocks to get the required result.
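One way to split the search into month blocks is to generate the date windows up front and run one paged search per window. This is only a sketch: DateWindows.Monthly is a hypothetical helper, not something FlickrNet provides, and it does not call Flickr itself.

```csharp
using System;
using System.Collections.Generic;

public static class DateWindows
{
    // Yields consecutive [Start, End) windows covering [from, to).
    // Each window ends at the first day of the next calendar month,
    // or at 'to' if that comes sooner.
    public static IEnumerable<(DateTime Start, DateTime End)> Monthly(DateTime from, DateTime to)
    {
        DateTime start = from;
        while (start < to)
        {
            DateTime nextMonth = new DateTime(start.Year, start.Month, 1).AddMonths(1);
            DateTime end = nextMonth < to ? nextMonth : to;
            yield return (start, end);
            start = end;
        }
    }
}
```

For each window you would set searchOptions.MinUploadDate to Start and the matching upper bound (MaxUploadDate, assuming your FlickrNet version exposes it) to End, then page through the results as before. If a single month still reports more than 4000 photos, the same idea can be applied with smaller windows. Photos uploaded exactly on a boundary may show up in two windows, so it is probably worth de-duplicating by PhotoId.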

You should also consider throttling your queries, as Flickr has a limit of 1 query per second.
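A simple way to throttle is to wait before each request until a minimum interval has passed since the previous one. The RequestThrottle class below is a hypothetical helper written for this thread, not part of FlickrNet, and it assumes all requests are made from a single thread.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public sealed class RequestThrottle
{
    private readonly TimeSpan _minInterval;
    private readonly Stopwatch _sinceLast = new Stopwatch();

    public RequestThrottle(TimeSpan minInterval)
    {
        _minInterval = minInterval;
    }

    // Blocks until at least minInterval has elapsed since the previous
    // call to Wait returned. The very first call returns immediately.
    // Not thread-safe: intended for a single sequential request loop.
    public void Wait()
    {
        if (_sinceLast.IsRunning)
        {
            TimeSpan remaining = _minInterval - _sinceLast.Elapsed;
            if (remaining > TimeSpan.Zero)
                Thread.Sleep(remaining);
        }
        _sinceLast.Restart();
    }
}
```

In the paging loop you would create one new RequestThrottle(TimeSpan.FromSeconds(1)) before the loop and call Wait() immediately before each flickr.PhotosSearch(searchOptions) call.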

I hope that is of some help.

Sam

Nov 26, 2012 at 8:56 PM

Thanks very much. Your reply is of great help to me.