[SGVLUG] Grep "quickie" needed -- searching for hi-bit characters

Emerson, Tom (*IC) Tom.Emerson at wbconsultant.com
Fri Jan 4 16:13:56 PST 2008


> -----Original Message----- Of Claude Felizardo
> On Jan 4, 2008 3:57 PM, Emerson, Tom (*IC) 
> >
> > What would I use as a regex to find characters with a byte (ascii) 
> > value > 127?
> 
> Sounds like you should be using sed or perl.
> Can't think of the regex right now, but if it's supposed to be 
> regular text, what about just running the files through strings?

I need to find the lines that contain "odd" characters so I can edit
those characters out.  Out of many thousands of lines of input, maybe
half a dozen contain "bad" characters.  I don't think "strings" will
help here, as the rest of the lines will have "good" text in them.
(These are movie titles, included in the data "for the benefit of
humans" reviewing the file.  The file /format/ is fixed-field, however,
so when "whatever" translated the "local" characters into multi-byte
values, it shifted the remaining fields and caused a loader error.)
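
For illustration only, here is a minimal Python sketch of the search
described above: it reports each line that contains a byte with a value
above 127, along with the byte offsets within that line.  It is not
something proposed in the thread; the function name find_high_bit_lines
and the sample data are hypothetical, and it assumes the input is
handled as raw bytes so that multi-byte sequences are not decoded first.

```python
import re

# Matches any single byte outside 7-bit ASCII (0x80 through 0xFF).
HIGH_BIT = re.compile(rb"[\x80-\xff]")


def find_high_bit_lines(data: bytes):
    """Return (line_number, byte_offsets) for each line with a byte > 127.

    Hypothetical helper for illustration; line numbers start at 1 and
    offsets are zero-based byte positions within the line.
    """
    hits = []
    for lineno, line in enumerate(data.splitlines(), start=1):
        offsets = [m.start() for m in HIGH_BIT.finditer(line)]
        if offsets:
            hits.append((lineno, offsets))
    return hits


# Made-up fixed-field sample: the accented title on line 2 occupies
# two bytes for one character, which is what shifts the later fields.
sample = b"PLAIN TITLE  0001\nAm\xc3\xa9lie       0002\nOTHER TITLE  0003\n"
for lineno, offsets in find_high_bit_lines(sample):
    print(f"line {lineno}: high-bit bytes at offsets {offsets}")
```

To run this against a real file, read it in binary mode (open(path,
"rb")) and pass the contents to the function.  With GNU grep built with
PCRE support, something like LC_ALL=C grep -n -P '[\x80-\xFF]' file
should give a similar list of line numbers, though that depends on the
grep build in use.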

 

