iOS - Simple NSData category to parse XML with Cyrillic


I have to parse an XML string held in an NSData object. Does anyone know a simple category for this? I have one for JSON, but I'm forced to use XML. I tried XMLReader; its interface looks clean, but I found some issues:

  1. Mysterious newline characters and spaces everywhere:

    "comment_count" = {text = "\n              \n              21";}; 
  2. My Cyrillic symbols look like this:

    "description_text" = {text = "\n              \u041f\u0438\u043a\u0430\u0431\u0443\u0448}; 

Example:

<?xml version="1.0" encoding="utf-8" ?>
<news>
    <xml_count>43</xml_count>
    <hot_count>449</hot_count>
    <item type="text">
        <id>1469845</id>
        <rating>147</rating>
        <pluses>171</pluses>
        <minuses>24</minuses>
        <title>
            <![CDATA[Обновление огромного архива Пикабу!]]>
        </title>
        <comment_count>26</comment_count>
        <comment_link>http://pikabu.ru/story/obnovlenie_ogromnogo_arkhiva_pikabu_1469845</comment_link>
        <author>icq677555</author>
        <description_text>
            <![CDATA[Пикабушники, я обновил свой огромный архив текстовых постов из горячего!]]>
        </description_text>
    </item>
</news>

I realized what's going on. Your data samples are NSDictionary instances printed in the debugger. The issues you found are:

  1. XML is designed as an annotated text format, so its whitespace handling (spaces, newlines) doesn't fit your data usage. You can either trim the resulting strings ([stringVar stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]]), adapt XMLReader, or use the XML parser at http://ios.biomsoft.com/2011/09/11/simple-xml-to-nsdictionary-converter/ (which trims by default).

  2. The funny output for the Cyrillic characters is the proper escaping of non-ASCII characters in the debugger output (which uses the old-style property list format). It's just an artifact of the debugger output. Your variables contain the proper characters.
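
Both points can be checked with a small sketch. The helper names TrimmedText and LogNodeBothWays are made up for illustration, and the sample value is only shaped like the output in the question; everything else is plain Foundation.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: strips leading and trailing spaces and newlines
// from a text value returned by the parser.
static NSString *TrimmedText(NSString *raw) {
    return [raw stringByTrimmingCharactersInSet:
            [NSCharacterSet whitespaceAndNewlineCharacterSet]];
}

// Hypothetical demo: logs the same value once through its container
// and once directly.
static void LogNodeBothWays(void) {
    // Assumed sample, modelled on the "description_text" node in the question.
    NSDictionary *node = @{ @"text" : @"\n              Пикабушники" };

    // Logging the dictionary uses the old-style property list format,
    // so non-ASCII characters appear as escape sequences.
    NSLog(@"%@", node);

    // Logging the string itself shows the actual Cyrillic characters,
    // here without the leading whitespace.
    NSLog(@"%@", TrimmedText(node[@"text"]));
}
```

Calling LogNodeBothWays() from any Foundation program should show the escaped form for the dictionary and readable text for the string, which is why the escapes are harmless.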

BTW: While JSON contains implicit type information (strings are quoted, numbers are never quoted, etc.), XML without a schema file does not. So all simple values are parsed as strings, even if they are numbers.
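
If you need a number, you have to convert the string yourself. A minimal sketch, assuming the node layout shown in the question (a dictionary with a "text" key); IntegerFromNode is a hypothetical name, not part of XMLReader.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: reads an integer from an XMLReader-style text node.
// Returns 0 if the key is missing or the text is not a decimal number.
static NSInteger IntegerFromNode(NSDictionary *node) {
    NSString *text = node[@"text"];
    // Trim first so stray newlines and spaces cannot affect the conversion.
    NSString *trimmed = [text stringByTrimmingCharactersInSet:
                         [NSCharacterSet whitespaceAndNewlineCharacterSet]];
    return [trimmed integerValue];
}
```

For example, IntegerFromNode(parsed[@"comment_count"]) would give an NSInteger for the comment count, where parsed stands for whichever dictionary holds that node.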

Update:

The XML parser you're using still contains the old whitespace handling code described in "Pesky new lines and whitespace in XML reader class" (though the comment says otherwise). Apply the fix mentioned at the bottom of that answer, namely change the line:

[dictInProgress setObject:textInProgress forKey:kXMLReaderTextNodeKey];

to:

[dictInProgress setObject:[textInProgress stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]] forKey:kXMLReaderTextNodeKey];
