
Getting Starmine Data in Python

Hi. I'm trying to get Starmine ARM data in Python systematically. I'm able to get a response using requests.get. The code snippet is:

fileURL = 'https://s3.amazonaws.com/a204558-arm-bucket-regional-prod/2022/11/17/2022-11-17_09-00-00_europe.csv?x-request-Id=534f1ed4-5126-487c-a41c-a606d54498c8&x-package-id=48f4-4f09-cbdca722-bff2-0691c073cf94&x-client-app-id=911e653ec9874f7295783f403f36797519fa9505&x-file-name=2022-11-17_09-00-00_europe.csv&x-fileset-id=4006-0c6a-5fe0a547-9795-c1075b72a85d&x-bucket-name=STARMINE_PREDICTIVE_ANALYTICS_ARM_EMEA&x-uuid=GEDTC-529992&x-file-Id=4e2d-4f56-531a819f-9cf0-54bb1ca03293&x-fileset-name=arm-live-europe-2022-11-17_09-00-00-prod&x-event-external-name=cfs-claimCheck-download&X-Amz-Security-Token=IQoJb3JpZ2luX2VjEHoaCWV1LXdlc3QtMSJGMEQCIBfH0O5n%2BUTusr2mmjjStbqHga6q2bZ8hFUtHuEae8eLAiAS8QbfknCSgpcbgq%2B4gV3A%2BvSTSVH%2Fk7JQGpXmra7WMSqaAggTEAMaDDY0MjE1NzE4MTMyNiIM48Te0Ty7qMqCMMvcKvcBSjFgnubGF6uqmr35wS8BjeupyUxBSiuqoIQB5L7zdRawO72TmOWKFATdWofXzK%2BAcPTHqWG%2FFjncZGw739YlA4zkYT9HziD2H6NRGM5NoYUT0Yi3OCB6HRMp%2FrrKYBdCj5QGpyWD0Ur5%2BsWuAY1qPNcEdq80Qtujq4sDMOyvDKkzeRLrdgzqZQnl8ZJ37m64R6ca64DoMes7UPd23VU31u5DZr4WeypWQe4SOxIt%2FfJmHS8rGYF7L%2BYIaPaKltufuph9LjBoMMV2jrRntFFeMvMHARFSL%2FDD4x1TTkcWq2I9PKXQaR3lm%2B%2B6FPgRXa7ANV%2BhsZP9cjDaqK2fBjqeAXhplS4PFE4vPLjgWWoNNtMEp4EWd%2FnCfAO68IEiuEU6PX85JYRfivdMswdlYiO3Hi0eJoyuGtNGtg%2B890SWt56aTZ%2Fx9i4CIVuNlec43ZGPXN8NDCufuyuMBED9Rcul3QZUaeL16sCfwDUJyB%2BQOEdJCgLolTJieA2w0DZxK1onxnLEVv9Z0NdeGnqUo4U1MlB8YoHLxhp13zy9Y4wa&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20230214T092858Z&X-Amz-SignedHeaders=host&X-Amz-Expires=21600&X-Amz-Credential=ASIAZLA4M7GHG2NN4U7N%2F20230214%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Signature=e5c5859728113c6c340b1a6ad6ef301f8ac0018ec66b5e7563924cce82b40604'

import json
import requests
from urllib.parse import urlparse, parse_qs

url_obj = urlparse(fileURL)

parsed_params = parse_qs(url_obj.query)

parsed_url = url_obj._replace(query=None).geturl()

response = requests.get(parsed_url, params=parsed_params, stream=True)


I want to get the data in this response object in a dataframe. However, when I try running the below command:

response_json = json.loads(response.text)

I get the below error:

[Attached screenshot: 1676381249622.png]


Could you please advise what can be done to get this data in a dataframe?

I'm using an RDP endpoint to get the fileURL I pasted above. When I run the above snippet against this fileURL, I hit this issue. If you run the code at your end, you'll see that response.text returns text, so it's not a None object. A snippet of response.text is shown below.

[Attached screenshot: 1682610572492.png]


Even so, json.loads still fails on it.


The JSON command I'm running (as mentioned above) is simply "response_json = json.loads(response.text)".

I'm using user credentials for authentication (username, password, and client_id = app_key).

Perhaps it might be easier to get on a call to help resolve this faster.

Thanks!

#technology#contentstarmine

Hi,

Please be informed that a reply has been verified as correct in answering the question, and marked as such.

Thanks,

AHS

1 Answer


@vibhor.gupta

Thanks for reaching out to us.

The file is a CSV (comma-separated values) file, not JSON, so json.loads cannot parse it.

You can search for how to load a CSV string into a data frame. I found this solution.
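
For illustration, here is a minimal sketch of one common way to turn CSV text into a data frame, assuming pandas is installed. The helper name csv_text_to_dataframe and the sample columns are made up for this example; the real ARM file has its own columns, and in the question's code the input would be response.text.

```python
import io

import pandas as pd


def csv_text_to_dataframe(csv_text: str) -> pd.DataFrame:
    """Parse CSV text (for example, response.text) into a DataFrame."""
    # StringIO wraps the string in a file-like object that read_csv accepts.
    return pd.read_csv(io.StringIO(csv_text))


# Made-up sample standing in for response.text (hypothetical columns).
sample_text = "instrument,score\nAAA,42\nBBB,87\n"
df = csv_text_to_dataframe(sample_text)
print(df)
```

With the response object from the question, the call would be df = csv_text_to_dataframe(response.text) in place of json.loads(response.text).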

I hope that this information is of help.



This is helpful, thanks.
