One-liner tricks using Perl/Bash

Small hacks can save a lot of time. The regexes used here can be adapted as needed. If a command does not work in your shell, try swapping single and double quotes.

Count the number of times a specific character appears in each line

  • This counts the number of double quotation marks in each line and prints the count.
    perl -ne '$cnt = tr/"//; print "$cnt\n"' inputFileName.txt

Add string to beginning of each line

  • Adds a string (slc) to the beginning of each line.
    perl -pe 's/^/slc/' in.txt > out.txt

Add string to end of each line

  • Appends a string to each line (replace STRING with the text to append).
    perl -pe 's/$/STRING/' in.txt > out.txt

Print only alternate values in a list

  • Sometimes we need only every other value from a list. Perl's special variable $| can hold only 0 or 1, so decrementing it flips it between the two. The command below prints alternate items from the list a..z, starting with a.
    perl -E 'say grep --$|, a..z'

Print only some columns of a file

  • Columns separated by a space
    cut -d' ' -f 1-4 fileWithLotsOfColumns.txt > fileWithOnlyFirst4Cols.txt

Print all columns except the first

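  • Write the result to a different file; redirecting to the input file would empty it before cut reads it. Note that --complement is a GNU cut option.
    cut -d' ' -f 1 --complement filename > outputFile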
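Where GNU cut is not available, an open-ended field range gives the same result. A small sketch on a made-up row:

```shell
# Sample row with four space-separated columns (made up for illustration).
row='id alpha beta gamma'
# -f 2- selects field 2 through the last field, which drops column 1.
rest=$(printf '%s\n' "$row" | cut -d' ' -f 2-)
echo "$rest"
```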

Replace a pattern with another one inside the file with backup

  • Replace all occurrences of pattern1 (e.g. [0-9]) with pattern2. The original file is kept as inputFile.bak.
    perl -p -i.bak -w -e 's/pattern1/pattern2/g' inputFile

Print only lines without uppercase letters

  • Go through the file (one word per line) and print only the words that contain no uppercase letters.
    perl -ne 'print unless m/[A-Z]/' allWords.txt > allWordsOnlyLowercase.txt

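Print one word per line

  • Go through the file, split each line at every space, and print the words one per line. The -l switch strips the line ending on input and adds it back on output, which avoids stray blank lines.
    perl -lne 'print join("\n", split(/ /, $_))' someText.txt > wordsPerLine.txt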

Kill all screen sessions (no remorse)

  • screen has no command that kills all sessions regardless of what they are doing, so here is a Perl one-liner that really kills ALL screen sessions without remorse. Each PID is printed on its own line so that xargs can pass them to kill.
    screen -ls | perl -ne 'print "$1\n" if /^\s+(\d+)\./' | xargs kill -9
  • The killall command may also do the job.

Return all unique words in a text document (divided by spaces), sorted by their counts (how often they appear)

  • Assuming no punctuation marks:
    perl -ne 'print join("\n", split(/\s+/, $_)); print("\n")' documents.txt > wordsOnePerLine.txt
    sort wordsOnePerLine.txt | uniq -c | sort -n > wordCountsSorted.txt

Delete all special characters

  • Delete every character that is not a letter or whitespace (which includes line ends) by replacing it with nothing.
    perl -pe 's/[^a-zA-Z\s]+//g' text_withSpecial.txt > text_lettersOnly.txt

Lower case everything

  • perl -pe 'tr/A-Z/a-z/' textWithUpperCase.txt > textwithoutuppercase.txt

Combine lower-casing with word counting and sorting

  • perl -pe 'tr/A-Z/a-z/' sentences.txt | perl -lne 'print join("\n", split(/ /, $_))' | sort | uniq -c | sort -n

Print only one column

  • Print only the second column of the data when tab is the separator.
    perl -lne '@F = split(/\t/, $_); print $F[1]' columnFileWithTabs.txt > justSecondColumn.txt

Print only text between tags

  • perl -ne 'while (m/<a>(.*?)<\/a>/g) { print "$1\n" }' textFile
  • Extracting multiple multiline patterns between a start and an end tag
    • Here, we want to extract everything between <parse> and </parse>.
    • #!/usr/bin/perl -w
      local $/;   # no record separator: read the whole file at once
      open(DAT, "yourFile.xml") || die("Could not open file!");
      my $content = <DAT>;
      while ($content =~ m/<parse>(.*?)<\/parse>/sg) {   # /s lets . match newlines
          print "$1\n";
      }

Sort lines by their length

  • perl -e 'print sort { length $a <=> length $b } <>' textFile

Print second column, unless it contains a number

  • perl -lane 'print $F[1] unless $F[1] =~ m/[0-9]/' wordCounts.txt

Trim/collapse white spaces and replace new lines by something else

  • Replaces each newline with " ;" and then collapses every run of whitespace into a single space.
    echo "The cat sat    on    the  mat
    asd  sad  das   " | perl -pe 's/\n/ ;/' | perl -pe 's/\s+/ /g'

Get the average of one column from certain lines

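  • Select the lines that match a criterion, extract the 30th comma-separated column ($F[29]), and average it. The -l switch strips the line ending so that a last column does not produce blank lines, which would skew the count.
    grep "another criterion" thisDataFile.txt | perl -lne '@F = split(/,/, $_); print $F[29]' | awk '{ sum += $1 } END { print "Average = ", sum / NR }'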

How to sort a file by a column

  • Columns are separated by a space; we sort numerically (-n) by the 10th column only (-k10,10).
  • sort -t' ' -n -k10,10 eSet1_both.txt
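The effect of -n is easiest to see next to a plain text sort. A small sketch on made-up rows, sorting by the second column; LC_ALL=C is set only to make the text comparison predictable.

```shell
rows=$(printf 'b 10\na 9\nc 100\n')
# Numeric sort on column 2: 9 < 10 < 100.
numeric=$(printf '%s\n' "$rows" | LC_ALL=C sort -t' ' -n -k2,2)
# Without -n the keys are compared as text: "10" < "100" < "9".
lexical=$(printf '%s\n' "$rows" | LC_ALL=C sort -t' ' -k2,2)
echo "$numeric"
```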

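Replace specific space but also copy a group of matches

  • Matches a group of digits at the beginning of a line, followed by a space and a double quote, and replaces that space with a tab while keeping the digits ($1) and the quote.
  • perl -p -i.bak -w -e 's/^([0-9]+) "/$1\t"/' someFile.txt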

