I would like to query roughly 20,000 RIC/time-period pairs, in each instance picking up the tick data: top of the book, trades, sizes and timestamps. Each RIC might have 200 time periods, each about 30 seconds long. What is the best-practice way to query this via the API?
AWS improves the data download time (not the extraction time), but the benefit is noticeable only for medium to large data sets (for small data sets the faster download is offset by the overhead), as detailed in this article.
I'm sorry, but there is no API call that lets you specify combinations of time ranges and instruments; all API calls apply the same time range to the entire instrument list.
An analysis of your required combinations of instruments and time frames should allow you to group a fair number of instruments (hopefully several hundred or more) for a defined time frame (a few hours or up to a day), request data for that combination, and then pick out what you require. I agree this is not optimal: you will retrieve more data than required and then discard it. For your use case, though, this should deliver better performance than making numerous very small queries.
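As a rough illustration of that grouping idea, here is a minimal Python sketch. It does not call the API at all; it only shows how the (RIC, start, end) pairs could be bucketed into one bulk request per calendar day and how the surplus ticks could be discarded afterwards. The function names, the request dictionary layout and the tick fields ("ric", "time") are all hypothetical, and it assumes no window spans midnight.

```python
from collections import defaultdict
from datetime import datetime


def group_by_day(pairs):
    """Bucket (ric, start, end) windows into one bulk request per calendar day."""
    by_day = defaultdict(list)
    for ric, start, end in pairs:
        by_day[start.date()].append((ric, start, end))

    requests = []
    for day, windows in sorted(by_day.items()):
        requests.append({
            # One instrument list and one time range per request,
            # because the API applies the same range to every instrument.
            "rics": sorted({ric for ric, _, _ in windows}),
            "start": min(start for _, start, _ in windows),
            "end": max(end for _, _, end in windows),
            "windows": windows,  # kept so the result can be filtered later
        })
    return requests


def keep_wanted(ticks, windows):
    """Drop ticks that fall outside every requested window for their RIC."""
    wanted = defaultdict(list)
    for ric, start, end in windows:
        wanted[ric].append((start, end))
    return [
        tick for tick in ticks
        if any(start <= tick["time"] <= end
               for start, end in wanted.get(tick["ric"], []))
    ]
```

Each entry returned by group_by_day would drive one extraction request; keep_wanted would then be applied to whatever that request returns.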
EDIT Oct 2018:
IMPORTANT CHANGE in API capabilities: it is now possible to use API calls specifying combinations of time ranges and instruments.
For more information, see this thread.
This cannot be done using a single API call; you will have to make multiple calls, one for each time period. Multiple RICs for a particular time period should be added to a single request.
You can issue further queries in parallel while waiting for one to complete.
Follow the extraction limit guidelines specified here: LINK
Thanks for that, that's what I feared. With 50 concurrent queries, it has taken 6 minutes to process 5 minutes of data when I've tried over the last week through the REST API. That makes 40 hours for 20,000 combinations. Hmmm. I can't believe there isn't a batch mode and that it's going to be quicker to download whole months of data. Maybe MiFID etc. is a very well-kept secret.
Thanks for replying though and I hope I've missed something.
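For reference, the 40-hour figure follows from running the 20,000 queries in batches of 50, with each batch taking about 6 minutes. A small Python sketch of that arithmetic (the function name is made up, and it assumes every batch takes the same time):

```python
import math


def estimated_hours(n_queries, concurrent, minutes_per_batch):
    """Wall-clock estimate when queries run in fixed-size concurrent batches."""
    batches = math.ceil(n_queries / concurrent)
    return batches * minutes_per_batch / 60


print(estimated_hours(20000, 50, 6))  # 40.0
```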
It's possible to get very good results with the following approach:
use multiple account IDs
multi-thread the calls (using a semaphore to guarantee that the maximum number of threads is always running)
group your identifiers by time range (one call for all identifiers for a day, an hour, etc.)
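The multi-account, semaphore-bounded threading described above could be sketched in Python roughly as follows. This is only an outline under assumptions: run_all and its parameters are hypothetical names, extract stands in for whatever function actually performs one extraction request for a given account, and the per-account limit should come from the published extraction limit guidelines rather than from this example.

```python
import itertools
import threading
from concurrent.futures import ThreadPoolExecutor


def run_all(jobs, accounts, extract, per_account_limit):
    """Run extract(account, job) for every job, capping in-flight calls per account."""
    # One semaphore per account: at most per_account_limit calls at a time.
    slots = {a: threading.BoundedSemaphore(per_account_limit) for a in accounts}
    # Spread the jobs across the accounts round-robin.
    assigned = list(zip(itertools.cycle(accounts), jobs))

    def worker(item):
        account, job = item
        with slots[account]:  # blocks while this account is at its limit
            return extract(account, job)

    # Enough threads to keep every account's slots busy.
    max_workers = per_account_limit * len(accounts)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Results come back in the same order as jobs.
        return list(pool.map(worker, assigned))
```

Here each job would be one grouped request (all identifiers for a day or an hour), so the thread pool stays saturated while no single account exceeds its concurrency allowance.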