How to remove formatting characters amid text body of news when retrieving real time news data using MRN_STORY

Question

Liheng.Wang

Hi, a customer called me with this question: how can they remove the invisible formatting characters that sit among the printable words of the text? Sometimes the reassembled news story body looks like a mess because of those special formatting characters. How can they typeset those news body text words in order?

elektron refinitiv-realtime elektron-sdk ema-api rrt elektron-message-api mrn

Sep 23, 2019 at 07:04 PM


Alex Putkov.1 Sep 30, 2019 at 08:19 PM

@Liheng.Wang

Thank you for your participation in the forum. Is the reply below satisfactory in resolving your query? If yes please click the 'Accept' text next to the reply. This will guide all community members who have a similar question. Otherwise please post again offering further insight into your question.

Thanks,

-AHS

Alex Putkov.1 Oct 07, 2019 at 11:23 PM

@Liheng.Wang

Please be informed that a reply has been verified as correct in answering the question, and has been marked as such.

Thanks,

-AHS

Answer 1 · 2019-09-23T19:55:11Z

nick.zincone

Hi @Liheng.Wang,

Can you elaborate on what you mean by "invisible formatting characters"? The format of the story body text is indicated by the mimeType defined within the JSON data structure (plain text).

Stories do contain <CR><LF> (carriage return/line feed) sequences, which are used for display terminals. In addition, stories can be natively represented in other language variants. Can you also elaborate on what you mean by "typeset those news body text words in order"? Do you mean you want to filter out certain ASCII characters like <CR>, <TAB>, <LF>, etc.? If so, you will need to parse the body of the text and apply your own filtering.
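
As a rough illustration of "apply your own filtering", here is a minimal Python sketch, assuming the story body has already been reassembled and decoded into a string. The function name `strip_control_chars` and the choice to keep line feeds and tabs are assumptions made for this example, not part of any MRN or EMA API. It filters by Unicode category rather than by ASCII range, so non-English story text is left untouched.

```python
import unicodedata

# Control characters to keep (an assumption for this sketch; adjust as needed).
KEEP = frozenset({"\n", "\t"})


def strip_control_chars(text: str, keep=KEEP) -> str:
    """Remove Unicode control characters (category 'Cc'), except those in `keep`.

    This drops <CR>, NUL, vertical tab, etc. Printable characters in any
    language are preserved because only the 'Cc' category is filtered.
    """
    return "".join(
        ch for ch in text
        if ch in keep or unicodedata.category(ch) != "Cc"
    )


if __name__ == "__main__":
    # Hypothetical body text, not real MRN data.
    sample = "Headline\r\nBody\x00 text\tend"
    print(repr(strip_control_chars(sample)))
```

Passing `keep=frozenset()` would remove every control character, including line feeds and tabs, if the application wants to rebuild all of the layout itself.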


Sep 23, 2019 at 07:55 PM


Liheng.Wang Sep 24, 2019 at 07:09 AM

@nick.zincone.1, thank you so much for the answer. Yes, that is exactly what I mean. The customer said those ASCII characters are not good enough for formatting the news body, and sometimes they make the news look like a mess. So they want to remove all of these characters and then reconstruct the layout on their own from the plain text. So the question is: how can they identify all of these characters and filter them out? I believe the problem they have is how to turn the news body into purely plain text.

wasin.w Liheng.Wang Sep 25, 2019 at 09:46 AM

Hello @Liheng.Wang
Those control characters come from the MRN data feed, not from the API or TREP. They are part of the news story content, and the API just passes them to the client application "as is". If the client wants to remove those ASCII characters, the client needs to implement its own filter to detect and remove them at the application level.
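
If the goal is to re-typeset the story rather than only delete characters, one possible application-level approach is to normalize the line endings and re-flow each paragraph. The sketch below is hypothetical: `reflow_story` is an illustrative name, and it assumes that blank lines separate paragraphs, which may not hold for every story. Pre-formatted content such as tables aligned with spaces would lose its alignment under this treatment.

```python
import re


def reflow_story(body: str) -> str:
    """Normalize line endings and re-flow each paragraph onto a single line.

    Assumption for this sketch: paragraphs are separated by blank lines.
    """
    # Treat <CR><LF> and a bare <CR> as a plain line feed.
    text = body.replace("\r\n", "\n").replace("\r", "\n")
    # Split into paragraphs wherever a blank (or whitespace-only) line occurs.
    paragraphs = re.split(r"\n[ \t]*\n+", text)
    # Collapse runs of spaces, tabs and single line breaks into one space.
    cleaned = (" ".join(p.split()) for p in paragraphs)
    return "\n\n".join(p for p in cleaned if p)


if __name__ == "__main__":
    # Hypothetical body text, not real MRN data.
    raw = "First line\r\ncontinues here.\r\n\r\n    Second\tparagraph.\r\n"
    print(reflow_story(raw))
```

The result keeps one blank line between paragraphs, so the client application can apply its own wrapping or styling afterwards.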
