Replace text with part of the text using a regex in bash with Perl

By Abigail Rogers

For example, I have this output:

string1 anynameveryveryverylong string2
string1 othernameveryveryverylong string2

I want to truncate the name to its first ten characters:

string1 anynamever string2
string1 othernamev string2

A pseudo-regex for this could be:

perl -pe "s/([^\t]+\t)([^\t]+)\t/\1\2{10}\t/g"

How do I get this?

2 Answers

perl -pe 's/^(\S+\s+)(\S{10})\S*/$1$2/'
  • ^ matches at the start of the string
  • \S means non-whitespace
  • + means repeated at least once
  • \s means whitespace
  • {10} means repeated exactly 10 times
  • * means repeated zero or more times

I.e., keep the first word and the first 10 characters of the following word, and drop the remaining characters of the second word.

Your pseudo-regex has one substantial problem: the {10} is placed in the replacement part, but the replacement is just a string. Quantifiers such as {10} are only interpreted in the pattern part.

Some more choices:

  1. Perl with autosplitting on tabs:

    $ perl -F"\t" -lae '$F[1]=substr($F[1],0,10); print join "\t",@F' file
    string1 anynamever string2
    string1 othernamev string2
  2. awk

    $ awk -F"\t" -v OFS="\t" '{$2=substr($2,1,10)}1' file
    string1 anynamever string2
    string1 othernamev string2
  3. sed (GNU sed; \S and \t are GNU extensions)

    $ sed -E 's/(\S+\t\S{10})[^\t]+/\1/' file
    string1 anynamever string2
    string1 othernamev string2
  4. One more Perl

    $ perl -pe 's/(\S+\t\S{10})[^\t]+/$1/' file
    string1 anynamever string2
    string1 othernamev string2
