<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ascii">
<META NAME="Generator" CONTENT="MS Exchange Server version 6.0.6619.12">
<TITLE>Grep "quickie" needed -- searching for hi-bit characters</TITLE>
</HEAD>
<BODY>
<P><B><FONT FACE="Courier New">I've got an odd one here -- I know how I'd do this on an HP with some proprietary tools I've used for the last 15 years, but this is on a *nix system, so I need to know how to do it with grep.</FONT></B></P>
<P><B><FONT FACE="Courier New">We have some files that were transferred from one machine to another (one of which was a PC), and somewhere in the process it appears that some local-language "multi-byte" characters were translated into several separate bytes each, which in turn buggered up the record length. Fortunately, these are easy to spot visually: each byte of a mangled character has a value between 128 and 255 and generally looks like "line noise" when the file is cat'd to the screen. Unfortunately, the files involved are thousands of lines long, so a purely visual search is out of the question.</FONT></B></P>
<P><B><FONT FACE="Courier New">What regex would I give grep to find lines containing any character whose byte value is greater than 127 (that is, with the high bit set)?</FONT></B>
</P>
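<P><FONT FACE="Courier New">For illustration, here is one way such a search might look. This is only a sketch and assumes GNU grep built with -P (Perl-compatible regex) support, which not every *nix grep has. The function name find_hibit is hypothetical. LC_ALL=C is set so that grep compares raw bytes rather than locale characters, and -a keeps grep from treating the file as binary.</FONT></P>
<PRE>
```shell
# Hypothetical helper: print every line (with its line number) that
# contains at least one byte in the range 128-255.
# Assumption: GNU grep with -P support is installed.
find_hibit() {
    # LC_ALL=C : match on raw bytes, not multi-byte locale characters
    # -a       : treat the input as text even if it looks binary
    # -n       : prefix each matching line with its line number
    # -P       : Perl-compatible regex, so \x80-\xFF is a byte range
    LC_ALL=C grep -a -n -P '[\x80-\xFF]' "$@"
}
```
</PRE>
<P><FONT FACE="Courier New">Called as find_hibit somefile (the file name is a placeholder), this should list only the affected lines, so they can be checked against the expected record length. With more than one file argument, grep also prefixes each match with the file name.</FONT></P>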
</BODY>
</HTML>