Showing posts with label Encoding. Show all posts

Saturday, February 23, 2013

How MSBuild could make a file unparseable

Problem: A text file was being parsed by an assembly that is part of a system under development. In the development environment, this worked great. After MSBuild had deployed the text file and the necessary assemblies to a different location, replacing a couple of strings inside the text file along the way, parsing no longer worked.
  • Looking at the file in a text editor (I use Notepad++) confirmed that the edited version looked fine.
  • Editing the file in the developer environment manually (not running MSBuild on it) worked fine - the file was parseable afterwards.
Obviously, MSBuild was the culprit. But how could MSBuild make the text file unparseable? The command touching the file was:
<MSBuild.ExtensionPack.FileSystem.File 
     TaskAction="Replace" 
     RegExPattern="Something" Replacement="SomethingElse"
     Files="%(filename)"/>
Only when I traced the code that parses the file did it become clear to me that the file now contained some extra characters at its very beginning. Then it occurred to me:

Solution: MSBuild changed the file encoding. To make sure MSBuild used the right encoding, I had to add one more key/value pair to the MSBuild tag mentioned above:
TextEncoding="Windows-1252"
I found this by inspecting the file encoding on the source and destination text files. My source file was reported as ANSI, whereas my destination file was reported as UTF-8. It was however not as simple as putting "ANSI" as the TextEncoding, as described in this excellent StackOverflow article, which also led me on the right path to "Windows-1252".
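The same inspection can be sketched in a few lines of Python. This is only an illustration, and it rests on an assumption I did not verify at the time: that the "extra characters" were a UTF-8 byte order mark (BOM) written in front of the content. The function name detect_bom and the sample text are made up for the example.

```python
import codecs

# Known byte order marks. UTF-32 comes first because its little-endian
# BOM starts with the same two bytes as the UTF-16 little-endian BOM.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


# Hypothetical file content, first as a plain Windows-1252 ("ANSI") file,
# then as it would look after being rewritten as UTF-8 with a BOM.
source_bytes = "Something æøå".encode("cp1252")
rewritten_bytes = "SomethingElse æøå".encode("utf-8-sig")

print(detect_bom(source_bytes))
print(detect_bom(rewritten_bytes))

# A parser that still assumes Windows-1252 sees the BOM as three
# extra characters at the start of the file.
print(repr(rewritten_bytes.decode("cp1252")[:3]))
```

To check a real file, read it in binary mode, for example open(path, "rb").read(4), and pass the bytes to detect_bom.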

In my opinion, MSBuild should have retained the original encoding of the files it touches instead of defaulting to something unwanted. But then again, that's what keeps bread on my table... Thanks, MS...

Saturday, September 10, 2011

PHP scripts stopped working when moved

Problem: After I had moved some PHP scripts from one web server to another, they stopped working, or only worked partially. One of the web pages showed all the HTML and the upper part of the PHP-generated output, but strangely, after a certain point the rest of the PHP code had just been ignored.

Solution: I loaded one of the script files in the open source DevPHP editor and chose Format on the menu. The Mac (CR) option was checked. I changed it to Windows (CRLF), uploaded the file again and reran it. Voilà, it worked!

Obviously, some PHP parsers are pickier than others about how they read line breaks.
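If you would rather check and fix this without a particular editor, here is a minimal Python sketch. It only counts the three line-ending styles and rewrites everything as CRLF, which is roughly what I assume the editor's Format option does. The function names and the sample script are invented for the example.

```python
def describe_line_endings(data: bytes) -> dict:
    """Count Windows (CRLF), old Mac (CR only) and Unix (LF only) line endings."""
    crlf = data.count(b"\r\n")
    # Every CRLF pair contains one CR and one LF, so subtract the pairs
    # to get the number of lone CRs and lone LFs.
    cr_only = data.count(b"\r") - crlf
    lf_only = data.count(b"\n") - crlf
    return {"CRLF": crlf, "CR": cr_only, "LF": lf_only}


def to_crlf(data: bytes) -> bytes:
    """Convert any mix of line endings to Windows (CRLF)."""
    # Normalise to LF first so that existing CRLF pairs are not doubled.
    normalised = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return normalised.replace(b"\n", b"\r\n")


# Hypothetical script saved with Mac (CR) line endings.
script = b"<?php\r// a comment\recho 'hello';\r?>"

print(describe_line_endings(script))
print(describe_line_endings(to_crlf(script)))
```

Read and write the real file in binary mode ("rb" and "wb"), otherwise Python may translate the line endings itself and hide the problem.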

Tuesday, May 24, 2011

Edit and display local characters in .bat scripts

Problem: Sometimes I need to write a batch script that echoes instructions in my native language, Norwegian, including its special characters æ, ø and å. The only thing is, if you edit the .bat or .cmd file in Notepad, these characters will turn into garbled characters because of a character encoding issue.

Solution: My favorite plain text editor, the free Notepad++, lets me select the encoding for the file I am editing. The menu item Encoding > Character sets > Western European > OEM 850 works well for the Norwegian special characters, and possibly others.
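To see why the choice of encoding matters, here is a small Python sketch of what happens to the bytes. It assumes that the command prompt interprets the script as code page 850 and that a file saved as "ANSI" ends up as Windows-1252; both are assumptions about a typical Western European Windows setup. The helper as_seen_in_console is made up for the illustration.

```python
TEXT = "æøå"


def as_seen_in_console(text: str, saved_as: str, console: str = "cp850") -> str:
    """Encode text the way the editor saved it, then decode it the way
    the console (assumed here to use code page 850) would read it."""
    return text.encode(saved_as).decode(console)


# The same three letters are stored as different bytes in the two encodings.
print(TEXT.encode("cp1252").hex(" "))
print(TEXT.encode("cp850").hex(" "))

# Saved as Windows-1252 ("ANSI"): the console shows other characters.
print(as_seen_in_console(TEXT, saved_as="cp1252"))

# Saved as OEM 850: the console shows the intended letters.
print(as_seen_in_console(TEXT, saved_as="cp850"))
```

The letters survive only when the encoding used for saving matches the one used for reading, which is what selecting OEM 850 in the editor achieves.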