Removing ^M in imported Windows files
When you import a file from Windows (or even from the old Macintosh OS), you will most likely see ^M at the end of each line. Systems based on ASCII or a compatible character set use either LF (Line Feed, 0x0A, \n) or CR (Carriage Return, 0x0D, \r) individually, or CR followed by LF (CR+LF, 0x0D 0x0A, \r\n). Below is a quick list of which operating systems use which convention:
- LF : UNIX and UNIX-Like systems, Linux, AIX, Xenix, Mac OS X, BeOS, Amiga, RISC OS...
- CR+LF : CP/M, MP/M, DOS, OS/2, Microsoft Windows (all versions)
- CR : Commodore machines, Apple II family and Mac OS through version 9
The different newline conventions often cause text files that have been transferred between systems of different types to be displayed incorrectly. For example, files originating on Unix or Apple Macintosh systems may appear as a single long line on a Windows system. Conversely, when viewing a file from a Windows computer on a Unix system, the extra CR may be displayed as ^M at the end of each line or as a second line break.
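Before converting anything, it can help to check which convention a file actually uses. The following is a minimal sketch, not a standard tool: the helper name count_eol and the sample file names are made up for illustration. It counts the CR and LF bytes in a file using only tr and wc.

```shell
# Build three tiny sample files, one per newline convention.
printf 'one\ntwo\n'     > sample_unix.txt   # LF only
printf 'one\r\ntwo\r\n' > sample_dos.txt    # CR+LF
printf 'one\rtwo\r'     > sample_mac.txt    # CR only

# Hypothetical helper: report how many CR and LF bytes a file contains.
# tr -cd keeps only the listed character; wc -c counts the bytes left over.
count_eol() {
    cr=$(tr -cd '\r' < "$1" | wc -c | tr -d ' ')
    lf=$(tr -cd '\n' < "$1" | wc -c | tr -d ' ')
    echo "CR=$cr LF=$lf"
}

count_eol sample_unix.txt   # prints CR=0 LF=2
count_eol sample_dos.txt    # prints CR=2 LF=2
count_eol sample_mac.txt    # prints CR=2 LF=0
```

Equal, non-zero CR and LF counts suggest a DOS/Windows file, CR with no LF suggests an old Mac file, and LF with no CR is the Unix convention.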
Relatively small files can be converted with a text editor. For larger files on Windows NT/2000/XP, you can use the following command:
TYPE unix_file | FIND "" /V > dos_file
On Unix, a DOS/Windows text file can be converted to Unix format simply by using the dos2unix tool, or by removing all ASCII CR characters with the tr command:
tr -d '\r' < inputfile > outputfile
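To see what the tr command above does, here is a small sketch that builds a CR+LF sample file, converts it, and compares the results. The sample contents are invented for the demonstration; inputfile and outputfile are the same placeholder names used above.

```shell
# Create a small CR+LF sample file.
printf 'first\r\nsecond\r\n' > inputfile

# Delete every CR byte, exactly as in the command above.
tr -d '\r' < inputfile > outputfile

# Dump both files byte by byte; the \r entries appear only in the first dump.
od -c inputfile
od -c outputfile

# The converted file is two bytes shorter: one CR removed per line.
wc -c < inputfile    # 15 bytes
wc -c < outputfile   # 13 bytes
```

Note that tr -d '\r' removes every CR in the file, not only those at the end of a line. For ordinary text files that is usually what you want.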
You can add aliases to your shell startup script to create easy-to-remember variations of the tr command for each purpose.
Using bash as an example, edit .bashrc and add these lines.
alias cvtCR="tr '\r' '\n'"
alias cvtCRLF="tr -d '\r'"
You now have two new commands that you can type from the command line.
To try the commands out right away, without opening a new terminal, you need to tell bash to re-read its startup file by typing
source ~/.bashrc
Now you are ready to try out the new commands.
To convert an old Mac file you would type
cvtCR < MACFILE > UNIXFILE
This will read a file named MACFILE and create a file named UNIXFILE that has all of the \r characters converted to \n. For DOS files, you just want to remove the darn \r characters, so in cvtCRLF the -d option tells tr to delete them.
If you want to fix up a sample file by hand, you can use your favorite editor: vi! Be careful: in vi you enter the carriage return symbol (^M) by pressing CTRL+V followed by ENTER, for example in a substitution such as :%s/^M$//
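One caveat about the cvtCR and cvtCRLF aliases: bash does not expand aliases in non-interactive scripts by default, so they only help at the command line. A possible workaround, sketched here as an assumption rather than as part of the original recipe, is to define shell functions with the same names and the same tr calls. The file names DOSFILE and UNIXFILE2 and the sample contents are made up for the demonstration.

```shell
# Function equivalents of the two aliases; each reads stdin and writes stdout.
cvtCR()   { tr '\r' '\n'; }   # old Mac (CR)        -> Unix (LF)
cvtCRLF() { tr -d '\r'; }     # DOS/Windows (CR+LF) -> Unix (LF)

# Sample inputs with the same text in the two foreign conventions.
printf 'alpha\rbeta\r'     > MACFILE
printf 'alpha\r\nbeta\r\n' > DOSFILE

cvtCR   < MACFILE > UNIXFILE
cvtCRLF < DOSFILE > UNIXFILE2

# Both conversions should yield the same LF-only file.
cmp UNIXFILE UNIXFILE2 && echo "both conversions match"
```

Functions behave the same way in scripts and in an interactive shell, which is why they are a common substitute for aliases when the commands need to be reused in scripts.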