921

Replacing strings in files based on certain search criteria is a very common task. How can I

  • replace string foo with bar in all files in the current directory?
  • do the same recursively for sub directories?
  • replace only if the file name matches another string?
  • replace only if the string is found in a certain context?
  • replace if the string is on a certain line number?
  • replace multiple strings with the same replacement
  • replace multiple strings with different replacements
Braiam
terdon
  • 2
    This is intended to be a canonical Q&A on this subject (see this [meta discussion](http://meta.unix.stackexchange.com/q/2708/22222)), please feel free to edit my answer below or add your own. – terdon Feb 01 '14 at 17:08
  • Great `grep -rl` (then piped to `sed`) answer here: https://unix.stackexchange.com/questions/472476/grep-global-find-replace/472482#472482 – Gabriel Staples Mar 20 '20 at 00:24

10 Answers

1241

1. Replacing all occurrences of one string with another in all files in the current directory:

These are for cases where you know that the directory contains only regular files and that you want to process all non-hidden files. If that is not the case, use the approaches in 2.

All sed solutions in this answer assume GNU sed. If using FreeBSD or macOS, replace -i with -i ''. Also note that the use of the -i switch with any version of sed has certain filesystem security implications and is inadvisable in any script which you plan to distribute in any way.

  • Non recursive, files in this directory only:

     sed -i -- 's/foo/bar/g' *
     perl -i -pe 's/foo/bar/g' ./* 
    

(The perl one will fail for file names ending in | or a space.)

  • Recursive, regular files (including hidden ones) in this and all subdirectories

     find . -type f -exec sed -i 's/foo/bar/g' {} +
    

    If you are using zsh:

     sed -i -- 's/foo/bar/g' **/*(D.)
    

    (may fail if the list is too big, see zargs to work around).

    Bash globs can't select regular files directly, so a loop is needed (the parentheses run it in a subshell, so the options are not set globally):

     ( shopt -s globstar dotglob;
         for file in **; do
             if [[ -f $file ]] && [[ -w $file ]]; then
                 sed -i -- 's/foo/bar/g' "$file"
             fi
         done
     )
    

    The files are selected when they are actual files (-f) and they are writable (-w).
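
Before running any of the in-place commands above, it can help to preview the change. The following is only a minimal sketch, assuming a sed and diff that behave like the GNU ones; the temporary directory and the file names a.txt and b.txt are made up for the illustration. Without -i, sed writes to standard output and leaves the file alone, so diff can show what would change:

```shell
# Sketch: preview what 's/foo/bar/g' would change, without writing anything.
# The directory and file names below are hypothetical sample data.
workdir=$(mktemp -d)
printf 'foo one\nno match\n' > "$workdir/a.txt"
printf 'nothing here\n'      > "$workdir/b.txt"

# For every regular file, compare the file with sed's output.
# diff exits non-zero when the inputs differ, hence the "|| true".
preview=$(
    find "$workdir" -type f -exec sh -c '
        for f do
            sed "s/foo/bar/g" "$f" | diff -u "$f" - || true
        done' sh {} +
)
printf '%s\n' "$preview"
```

Files that would not change produce no output, and nothing on disk is modified until you rerun the command with -i.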

2. Replace only if the file name matches another string / has a specific extension / is of a certain type etc:

  • Non-recursive, files in this directory only:

    sed -i -- 's/foo/bar/g' *baz*    ## all files whose name contains baz
    sed -i -- 's/foo/bar/g' *.baz    ## files ending in .baz
    
  • Recursive, regular files in this and all subdirectories

    find . -type f -name "*baz*" -exec sed -i 's/foo/bar/g' {} +
    

    If you are using bash (the parentheses run the commands in a subshell, so the options are not set globally):

    ( shopt -s globstar dotglob
        sed -i -- 's/foo/bar/g' **/*baz*
        sed -i -- 's/foo/bar/g' **/*.baz
    )
    

    If you are using zsh:

    sed -i -- 's/foo/bar/g' **/*baz*(D.)
    sed -i -- 's/foo/bar/g' **/*.baz(D.)
    

The -- serves to tell sed that no more flags will be given in the command line. This is useful to protect against file names starting with -.

  • If a file is of a certain type, for example, executable (see man find for more options):

    find . -type f -executable -exec sed -i 's/foo/bar/g' {} +
    

zsh:

    sed -i -- 's/foo/bar/g' **/*(D*)
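
The same find predicates can also be used to leave whole directories out. As several comments below point out, the recursive commands will happily rewrite files under .git and similar directories. Here is a hedged sketch of one way to skip them with -prune, assuming GNU find and GNU sed; the directory layout is invented for the example:

```shell
# Sketch: recursive replace that skips version-control directories.
# The sample tree below is hypothetical.
workdir=$(mktemp -d)
mkdir -p "$workdir/.git" "$workdir/src"
printf 'foo\n' > "$workdir/.git/config"
printf 'foo\n' > "$workdir/src/main.txt"

# -prune stops find from descending into directories named .git, .hg or .svn;
# every other regular file is handed to sed.
find "$workdir" \
    -type d \( -name .git -o -name .hg -o -name .svn \) -prune \
    -o -type f -exec sed -i -- 's/foo/bar/g' {} +

cat "$workdir/src/main.txt"    # changed
cat "$workdir/.git/config"     # left alone
```

In real use, replace "$workdir" with . and adjust the list of directory names to taste.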

3. Replace only if the string is found in a certain context

  • Replace foo with bar only if there is a baz later on the same line:

     sed -i 's/foo\(.*baz\)/bar\1/' file
    

In sed, using \( \) saves whatever is in the parentheses and you can then access it with \1. There are many variations of this theme, to learn more about such regular expressions, see here.

  • Replace foo with bar only if foo is found in the 3rd column (field) of the input file (assuming whitespace-separated fields):

     gawk -i inplace '{gsub(/foo/,"bar",$3); print}' file
    

(needs gawk 4.1.0 or newer).

  • For a different field just use $N where N is the number of the field of interest. For a different field separator (: in this example) use:

     gawk -i inplace -F':' '{gsub(/foo/,"bar",$3);print}' file
    

Another solution using perl:

    perl -i -ane '$F[2]=~s/foo/bar/g; $" = " "; print "@F\n"' file

NOTE: both the awk and perl solutions will affect spacing in the file (remove the leading and trailing blanks, and convert sequences of blanks to one space character in those lines that match). For a different field, use $F[N-1] where N is the field number you want and for a different field separator use (the $"=":" sets the output field separator to :):

    perl -i -F':' -ane '$F[2]=~s/foo/bar/g; $"=":";print "@F"' file

  • Replace foo with bar only on the 4th line:

     sed -i '4s/foo/bar/g' file
     gawk -i inplace 'NR==4{gsub(/foo/,"bar")};1' file
     perl -i -pe 's/foo/bar/g if $.==4' file
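
A quick way to check context-dependent expressions like these before adding -i is to pipe a few sample lines through sed. This is just an illustrative sketch with made-up input:

```shell
# Sketch: try the section 3 expressions on hypothetical sample text.
sample=$(printf 'foo before baz\nfoo alone\nbaz before foo\nfoo again baz\n')

# Capture group: foo is replaced only when baz follows it on the same line.
with_context=$(printf '%s\n' "$sample" | sed 's/foo\(.*baz\)/bar\1/')

# Address: the substitution is applied to line 4 only.
line_four=$(printf '%s\n' "$sample" | sed '4s/foo/bar/g')

printf '%s\n' "--- with context ---" "$with_context"
printf '%s\n' "--- line 4 only ---"  "$line_four"
```

The first command changes lines 1 and 4 (the only ones where baz comes after foo); the second changes line 4 only.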
    

4. Multiple replace operations: replace with different strings

  • You can combine sed commands:

     sed -i 's/foo/bar/g; s/baz/zab/g; s/Alice/Joan/g' file
    

Be aware that order matters (sed 's/foo/bar/g; s/bar/baz/g' will substitute foo with baz).

  • or Perl commands

     perl -i -pe 's/foo/bar/g; s/baz/zab/g; s/Alice/Joan/g' file
    
  • If you have a large number of patterns, it is easier to save your patterns and their replacements in a sed script file:

     #! /usr/bin/sed -f
     s/foo/bar/g
     s/baz/zab/g
    
  • Or, if you have too many pattern pairs for the above to be feasible, you can read pattern pairs from a file (two space separated patterns, $pattern and $replacement, per line):

     while read -r pattern replacement; do   
         sed -i "s/$pattern/$replacement/" file
     done < patterns.txt
    
  • That will be quite slow for long lists of patterns and large data files, so you might want to read the patterns and create a sed script from them instead. The following assumes that a <space> delimiter separates a list of MATCH<space>REPLACE pairs occurring one per line in the file patterns.txt:

     sed 's| *\([^ ]*\) *\([^ ]*\).*|s/\1/\2/g|' <patterns.txt |
     sed -f- ./editfile >outfile
    

The above format is largely arbitrary and, for example, doesn't allow for a <space> in either MATCH or REPLACE. The method is very general though: basically, if you can create an output stream which looks like a sed script, then you can source that stream as a sed script by specifying sed's script file as - (stdin).

  • You can combine and concatenate multiple scripts in similar fashion:

     SOME_PIPELINE |
     sed -e'#some expression script'  \
         -f./script_file -f-          \
         -e'#more inline expressions' \
     ./actual_edit_file >./outfile
    

A POSIX sed will concatenate all scripts into one in the order they appear on the command-line. None of these need end in a \newline.

  • grep can work the same way:

     sed -e'#generate a pattern list' <in |
     grep -f- ./grepped_file
    
  • When working with fixed-strings as patterns, it is good practice to escape regular expression metacharacters. You can do this rather easily:

     sed 's/[]$&^*\./[]/\\&/g
          s| *\([^ ]*\) *\([^ ]*\).*|s/\1/\2/g|
     ' <patterns.txt |
     sed -f- ./editfile >outfile
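
To see what the generated script looks like, here is a small worked run of the escaping variant. It is only a sketch: the contents of patterns.txt and editfile are invented, and reading the script with -f - relies on GNU sed behaviour.

```shell
# Sketch: build a sed script from fixed-string pairs and apply it.
# patterns.txt and editfile hold hypothetical sample data.
workdir=$(mktemp -d)
printf '%s\n' 'a.b x&y' '1*2 three' > "$workdir/patterns.txt"
printf '%s\n' 'a.b axb 1*2 112'     > "$workdir/editfile"

# Step 1: escape regex/replacement metacharacters, then turn each
# "MATCH REPLACE" line into an s/MATCH/REPLACE/g command.
generated=$(sed 's/[]$&^*\./[]/\\&/g
     s| *\([^ ]*\) *\([^ ]*\).*|s/\1/\2/g|' < "$workdir/patterns.txt")
printf '%s\n' "$generated"

# Step 2: feed the generated script to a second sed on standard input.
result=$(printf '%s\n' "$generated" | sed -f - "$workdir/editfile")
printf '%s\n' "$result"
```

Thanks to the escaping, a.b matches only the literal a.b (not axb) and 1*2 only the literal 1*2 (not 112).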
    

5. Multiple replace operations: replace multiple patterns with the same string

  • Replace any of foo, bar or baz with foobar

     sed -Ei 's/foo|bar|baz/foobar/g' file
    
  • or

     perl -i -pe 's/foo|bar|baz/foobar/g' file
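
One reason to prefer the alternation over three chained s commands is that it works in a single pass, so text that has just been inserted is not scanned again. A small sketch with made-up input shows the difference:

```shell
# Sketch: single-pass alternation versus chained substitutions.
# The input line is hypothetical sample text.
merged=$(printf 'foo bar baz qux\n' | sed -E 's/foo|bar|baz/foobar/g')

# Chained commands rescan the line, so the "bar" inside a freshly
# inserted "foobar" is replaced again by the second command.
chained=$(printf 'foo bar baz qux\n' |
    sed 's/foo/foobar/g; s/bar/foobar/g; s/baz/foobar/g')

printf '%s\n' "$merged"     # foobar foobar foobar qux
printf '%s\n' "$chained"    # foofoobar foobar foobar qux
```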
    

6. Replace file paths in multiple files

Another use case, which relies on a different delimiter (| instead of /) so that the slashes in the paths need no escaping:

     sed -i 's|path/to/foo|path/to/bar|g' *
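
The alternative delimiter is particularly handy when the paths come from variables. A minimal sketch, assuming GNU sed; the variable names old and new and the file app.conf are made up for the example:

```shell
# Sketch: replace one path with another, both held in shell variables.
# File name and contents are hypothetical.
workdir=$(mktemp -d)
printf 'include path/to/foo/lib.conf\n' > "$workdir/app.conf"

old='path/to/foo'
new='path/to/bar'

# With | as the delimiter, the slashes in $old and $new need no escaping.
sed -i -- "s|$old|$new|g" "$workdir/app.conf"
cat "$workdir/app.conf"
```

This still treats $old as a regular expression and breaks if either value contains | or &, so it is only suitable for values you control.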
Shashank Gb
terdon
  • 3
    @StéphaneChazelas thanks for the edit, it did indeed fix several things. However, please don't remove information that is relevant to bash. Not everyone uses `zsh`. By all means add `zsh` info but there is no reason to remove the bash stuff. Also, I know that using the shell for text processing is not ideal but there are cases where it is needed. I edited in a better version of my original script that will create a `sed` script instead of actually using the shell loop to parse. This can be useful if you have several hundred pairs of patterns for example. – terdon Jan 16 '15 at 15:10
  • 2
    @terdon, your bash one is incorrect. bash before 4.3 will follow symlinks when descending. Also bash has no equivalent for the `(.)` globbing qualifier so can't be used here. (you're missing some -- as well). The for loop is incorrect (missing -r) and means making several passes in the files and adds no benefit over a sed script. – Stéphane Chazelas Jan 16 '15 at 15:16
  • @StéphaneChazelas I don't see why following symlinks is a problem here. If the links are in the directory in most cases I would want to follow them. I'm afraid I don't understand what you mean about the `(.)` which is, I guess, some `zsh` magic I am unfamiliar with. I fixed the shell loop (and added a few `--` though not in all examples), its point is in dealing with hundreds of patterns and will only read the pattern file once. It produces a sed script. – terdon Jan 16 '15 at 15:53
  • 7
    @terdon What does `--` after `sed -i` and before the substitute command indicate? – Geek Sep 28 '15 at 11:29
  • 7
    @Geek that's a POSIX thing. It signifies the end of options and lets you pass arguments starting with `-`. Using it ensures that the commands will work on files with names like `-foo`. Without it, the `-f` would be parsed as an option. – terdon Sep 28 '15 at 11:42
  • @terdon One more follow-up question. In the last example `sed -Ei 's/foo|bar|baz/foobar/g' file` I think that the E stands for extended regular expression. Is this a Posix feature? – Geek Sep 30 '15 at 04:57
  • @Geek no, it's [not POSIX](http://pubs.opengroup.org/onlinepubs/9699919799/utilities/sed.html). It is also not GNU. As far as I can tell [it's a BSD thing](http://nixdoc.net/man-pages/FreeBSD/man1/sed.1.html) (so, by extension, OSX as well). That was added by Stéphane Chazelas (see edit 10), not me, but I'm sure he's right. He tends to be. Presumably, the BSD sed won't interpret the `|` correctly without the `-E`. – terdon Sep 30 '15 at 22:56
  • For me, the #! /usr/bin/sed -i doesn't work. You need to give sed the -f argument to make it treat the file as scriptfile. Unfortunately you can't give it -i as well, so that script can just work as a filter. – Hans-Peter Störr Jan 19 '16 at 16:28
  • @hstoerr yes, same here. Sorry about that. I don't know who made that particular edit. Fixed now. – terdon Jan 19 '16 at 16:34
  • 4
    Be very careful executing some of the recursive commands in git repositories. For example, the solutions provided in section 1 of this answer will actually modify internal git files in a `.git` directory, and actually mess up your checkout. Better to operate within/on specific directories by name. – Pistos Apr 19 '16 at 14:44
  • In the first case with recursive directory search, how can I also include executables? – Arjun May 11 '16 at 14:17
  • @Mellkor that already does include executables. Executable files, be they binary or text, are still just files. – terdon May 11 '16 at 14:18
  • 2
    Just thought I'd mention **repren**, which is another approach to this. It is my own Python tool and so must be installed with pip, but covers most of these scenarios and several others (like dry runs, stats, and multiple simultaneous replacements) that you can't replicate with sed and perl: https://github.com/jlevy/repren – jlevy Aug 25 '16 at 19:35
  • @jlevy sounds like that might make a good answer. Why not post one with a few examples? – terdon Aug 25 '16 at 19:56
  • Apparently I can't (the question is protected). In any case, three short examples are here (search for repren): https://github.com/jlevy/the-art-of-command-line#processing-files-and-data – jlevy Aug 26 '16 at 00:53
  • The mentioned security issue should be considered as an operating system bug. Otherwise, there is no need for permissions on file level [Any file in an rw directory is writable by renaming the old file] – user877329 Sep 07 '16 at 07:55
  • Beware that #1 bullet 2 may damage your git index if you have one. – Raffi Khatchadourian Jan 05 '17 at 19:35
  • Recursive doesn't work for me. It's not the number of files; I moved down several directories and it still doesn't change the text. – felwithe Feb 26 '19 at 17:11
  • @felwithe please post a new question, explaining exactly what you are trying to do. These commands are pretty standard, so if they don't work, you're doing something wrong. – terdon Feb 26 '19 at 17:16
  • 1
    I suggest adding a section to the recursive option here to exclude `.hg`, `.git`, and `.svn` directories. That's the sort of thing one only thinks about when it's too late. – Faheem Mitha Dec 09 '20 at 11:25
105

A good replacement tool for Linux is rpl, which was originally written for the Debian project, so it is available with apt-get install rpl in any Debian-derived distro, and possibly in others; otherwise you can download the tar.gz file from SourceForge.

Simplest example of use:

 $ rpl old_string new_string test.txt

Note that if the string contains spaces it should be enclosed in quotation marks. By default rpl is case-sensitive and matches partial words, but you can change these defaults with the options -i (ignore case) and -w (whole words). You can also specify multiple files:

 $ rpl -i -w "old string" "new string" test.txt test2.txt

Or even specify the extensions (-x) to search or even search recursively (-R) in the directory:

 $ rpl -x .html -x .txt -R old_string new_string test*

You can also search/replace in interactive mode with the -p (prompt) option.

The output shows the number of files/strings replaced and the type of search (case-sensitive or case-insensitive, whole or partial words), but it can be silenced with the -q (quiet mode) option, or made more verbose, listing the line numbers that contain matches in each file and directory, with the -v (verbose mode) option.

Other options worth remembering are -e (honor escapes), which lets the strings contain escape sequences so that you can also search for tabs (\t), newlines (\n), etc. You can use -f to force permissions (of course, only when the user has write permissions) and -d to preserve the modification times.

Finally, if you are unsure what exactly will happen, use the -s (simulate mode).

Matthias Braun
Fran
  • 3
    So much better at the feedback and simplicity than sed. I just wish it allowed acting on file names, and then it'd be perfect as-is. – Kzqai Dec 23 '16 at 17:12
  • 1
    i like the -s (simulate mode) :-) – m3nda Jun 10 '18 at 11:08
  • 1
    sooo much better than `sed`. golly – Marc Compere Aug 03 '20 at 17:26
  • thanks so much for this. `sed` is fine for simple replacements, but terrible for more complicated and longer strings – yeah22 Sep 15 '20 at 19:02
  • 1
    For macOS, `rpl` is available from MacPorts. – murray Sep 23 '20 at 20:49
  • I get `UnicodeEncodeError: 'ascii' codec can't encode character u'\xed' in position 293: ordinal not in range(128)`. The command is `rpl -x .php -R -p -v ".frames['tudo']" "" *`. Any ideas? – Heitor Dec 31 '20 at 18:28
  • Check with single files, probably have some with the wrong encoding. Use `uconv` if needed. – Fran Dec 31 '20 at 23:55
  • rpl seems nice, but for various random files it gives me `decoding error (character maps to )`. I got lucky using verbose mode or I might not have noticed until much later... Perhaps it's because some of my README.md files have emojis in them? Either way, back to good ol find + sed. – Xunnamius Jan 03 '22 at 21:12
  • This is awesome, many thanks – anvd Feb 09 '22 at 16:55
28

How to do a search and replace over multiple files suggests:

You could also use find and sed, but I find that this little line of perl works nicely.

perl -pi -w -e 's/search/replace/g;' *.php
  • -e means execute the following line of code.
  • -i means edit in-place
  • -w write warnings
  • -p loop over the input file, printing each line after the script is applied to it.

My best results come from using perl and grep (to ensure that the files contain the search expression):

perl -pi -w -e 's/search/replace/g;' $( grep -rl 'search' )
19

I used this:

grep -r "old_string" -l | tr '\n' ' ' | xargs sed -i 's/old_string/new_string/g'
  1. List all files that contain old_string.

  2. Replace newlines in the result with spaces (so that the list of files can be fed to sed).

  3. Run sed on those files to replace old string with new.

Update: The above will fail on file names that contain whitespace. Instead, use:

grep --null -lr "old_string" | xargs --null sed -i 's/old_string/new_string/g'
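
To convince yourself that the NUL-delimited version copes with awkward names, here is a small sketch assuming GNU grep, xargs and sed; the file names are invented:

```shell
# Sketch: NUL-delimited file list, tested on a name containing a space.
# Directory and file names are hypothetical.
workdir=$(mktemp -d)
printf 'old_string here\n' > "$workdir/my notes.txt"
printf 'unrelated\n'       > "$workdir/other.txt"

# grep prints each matching file name followed by a NUL byte;
# xargs splits on NUL only, so spaces in names are harmless.
# --no-run-if-empty avoids calling sed when nothing matched.
grep --null -lr -- 'old_string' "$workdir" |
    xargs --null --no-run-if-empty sed -i -- 's/old_string/new_string/g'

cat "$workdir/my notes.txt"
```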

shas
o_o_o--
  • 1
    Note that this will fail if any of your file names contain spaces, tabs or newlines. Use `grep --null -lr "old_string" | xargs --null sed -i 's/old_string/new_string/g'` will make it deal with arbitrary file names. – terdon Oct 26 '15 at 17:07
  • thanks guys. added update and left the old code cause it's an interesting caveat that could be useful to someone unaware of this behavior. – o_o_o-- Oct 26 '15 at 20:59
19

You can use Vim in Ex mode:

replace string ALF with BRA in all files in the current directory?

for CHA in *
do
  ex -sc '%s/ALF/BRA/g' -cx "$CHA"
done

do the same recursively for sub directories?

find . -type f -exec ex -sc '%s/ALF/BRA/g' -cx {} ';'

replace only if the file name matches another string?

for CHA in *.txt
do
  ex -sc '%s/ALF/BRA/g' -cx "$CHA"
done

replace only if the string is found in a certain context?

ex -sc 'g/DEL/s/ALF/BRA/g' -cx file

replace if the string is on a certain line number?

ex -sc '2s/ALF/BRA/g' -cx file

replace multiple strings with the same replacement

ex -sc '%s/\vALF|ECH/BRA/g' -cx file

replace multiple strings with different replacements

ex -sc '%s/ALF/BRA/g|%s/FOX/GOL/g' -cx file
Zombo
9

From a user's perspective, a nice & simple Unix tool that does the job perfectly is qsubst. For example,

% qsubst foo bar *.c *.h

will replace foo with bar in all my C files. A nice feature is that qsubst will do a query-replace, i.e., it will show me each occurrence of foo and ask whether I want to replace it or not. [You can replace unconditionally (no asking) with -go option, and there are other options, e.g., -w if you only want to replace foo when it is a whole word.]

How to get it: qsubst was invented by der Mouse (from McGill) and posted to comp.unix.sources 11(7) in Aug. 1987. Updated versions exist. For example, the NetBSD version qsubst.c,v 1.8 2004/11/01 compiles and runs perfectly on my mac.

terdon
phs
  • Searched similar solution for Linux and found `wrg` script wrapper for ripgrep by lydell: https://github.com/BurntSushi/ripgrep/issues/74#issuecomment-309213936 – d9k Mar 20 '21 at 23:32
7

ripgrep (command name rg) is a grep tool, but supports search and replace as well.

$ cat ip.txt
dark blue and light blue
light orange
blue sky
$ # by default, line number is displayed if output destination is stdout
$ # by default, only lines that matched the given pattern are displayed
$ # 'blue' is search pattern and -r 'red' is replacement string
$ rg 'blue' -r 'red' ip.txt
1:dark red and light red
3:red sky

$ # --passthru option is useful to print all lines, whether or not it matched
$ # -N will disable line number prefix
$ # this command is similar to: sed 's/blue/red/g' ip.txt
$ rg --passthru -N 'blue' -r 'red' ip.txt
dark red and light red
light orange
red sky

rg doesn't have an in-place option, so you'll have to do it yourself:

$ # -N isn't needed here as output destination is a file
$ rg --passthru 'blue' -r 'red' ip.txt > tmp.txt && mv tmp.txt ip.txt
$ cat ip.txt
dark red and light red
light orange
red sky

See Rust regex documentation for regular expression syntax and features. The -P switch will enable PCRE2 flavor. rg supports Unicode by default.

$ # non-greedy quantifier is supported
$ echo 'food land bark sand band cue combat' | rg 'foo.*?ba' -r 'X'
Xrk sand band cue combat

$ # unicode support
$ echo 'fox:αλεπού,eagle:αετός' | rg '\p{L}+' -r '($0)'
(fox):(αλεπού),(eagle):(αετός)

$ # set operator example, remove all punctuation characters except . ! and ?
$ para='"Hi", there! How *are* you? All fine here.'
$ echo "$para" | rg '[[:punct:]--[.!?]]+' -r ''
Hi there! How are you? All fine here.

$ # use -P if you need even more advanced features
$ echo 'car bat cod map' | rg -P '(bat|map)(*SKIP)(*F)|\w+' -r '[$0]'
[car] bat [cod] map

Like grep, the -F option will allow fixed strings to be matched, a handy option which I feel sed should implement too.

$ printf '2.3/[4]*6\nfoo\n5.3-[4]*9\n' | rg --passthru -F '[4]*' -r '2'
2.3/26
foo
5.3-29

Another handy option is -U which enables multiline matching

$ # (?s) flag will allow . to match newline characters as well
$ printf '42\nHi there\nHave a Nice Day' | rg --passthru -U '(?s)the.*ice' -r ''
42
Hi  Day

rg can handle dos-style files too

$ # same as: sed -E 's/\w+(\r?)$/123\1/'
$ printf 'hi there\r\ngood day\r\n' | rg --passthru --crlf '\w+$' -r '123'
hi 123
good 123

Another advantage of rg is that it is likely to be faster than sed

$ # for small files, initial processing time of rg is a large component
$ time echo 'aba' | sed 's/a/b/g' > f1
real    0m0.002s
$ time echo 'aba' | rg --passthru 'a' -r 'b' > f2
real    0m0.007s

$ # for larger files, rg is likely to be faster
$ # 6.2M sample ASCII file
$ wget https://norvig.com/big.txt
$ time LC_ALL=C sed 's/\bcat\b/dog/g' big.txt > f1
real    0m0.060s
$ time rg --passthru '\bcat\b' -r 'dog' big.txt > f2
real    0m0.048s
$ diff -s f1 f2
Files f1 and f2 are identical

$ time LC_ALL=C sed -E 's/\b(\w+)(\s+\1)+\b/\1/g' big.txt > f1
real    0m0.725s
$ time rg --no-unicode --passthru -wP '(\w+)(\s+\1)+' -r '$1' big.txt > f2
real    0m0.093s
$ diff -s f1 f2
Files f1 and f2 are identical
Sundeep
6

I needed something that would provide a dry-run option and would work recursively with a glob, and after trying to do it with awk and sed I gave up and instead did it in python.

The script searches recursively all files matching a glob pattern (e.g. --glob="*.html") for a regex and replaces with the replacement regex:

find_replace.py [--dir=my_folder] \
    --search-regex=<search_regex> \
    --replace-regex=<replace_regex> \
    --glob=[glob_pattern] \
    --dry-run

Every long option such as --search-regex has a corresponding short option, e.g. -s. Run with -h to see all options.

For example, this will flip all dates from 2017-12-31 to 31-12-2017:

python find_replace.py --glob=myfile.txt \
    --search-regex="(\d{4})-(\d{2})-(\d{2})" \
    --replace-regex="\3-\2-\1" \
    --dry-run --verbose

import os
import fnmatch
import sys
import shutil
import re

import argparse

def find_replace(cfg):
    search_pattern = re.compile(cfg.search_regex)

    if cfg.dry_run:
        print('THIS IS A DRY RUN -- NO FILES WILL BE CHANGED!')

    for path, dirs, files in os.walk(os.path.abspath(cfg.dir)):
        for filename in fnmatch.filter(files, cfg.glob):

            if cfg.print_parent_folder:
                pardir = os.path.normpath(os.path.join(path, '..'))
                pardir = os.path.split(pardir)[-1]
                print('[%s]' % pardir)
            filepath = os.path.join(path, filename)

            # backup original file
            if cfg.create_backup:
                backup_path = filepath + '.bak'

                while os.path.exists(backup_path):
                    backup_path += '.bak'
                print('DBG: creating backup', backup_path)
                shutil.copyfile(filepath, backup_path)

            with open(filepath) as f:
                old_text = f.read()

            all_matches = search_pattern.findall(old_text)

            if all_matches:

                print('Found {} matches in file {}'.format(len(all_matches), filename))

                new_text = search_pattern.sub(cfg.replace_regex, old_text)

                if not cfg.dry_run:
                    with open(filepath, "w") as f:
                        print('DBG: replacing in file', filepath)
                        f.write(new_text)
                else:
                    for idx, matches in enumerate(all_matches):
                        print("Match #{}: {}".format(idx, matches))

                    print("NEW TEXT:\n{}".format(new_text))

            elif cfg.verbose:
                print('File {} does not contain search regex "{}"'.format(filename, cfg.search_regex))


if __name__ == '__main__':

    parser = argparse.ArgumentParser(description='''DESCRIPTION:
    Find and replace recursively from the given folder using regular expressions''',
                                     formatter_class=argparse.RawDescriptionHelpFormatter,
                                     epilog='''USAGE:
    {0} -d [my_folder] -s <search_regex> -r <replace_regex> -g [glob_pattern]

    '''.format(os.path.basename(sys.argv[0])))

    parser.add_argument('--dir', '-d',
                        help='folder to search in; by default current folder',
                        default='.')

    parser.add_argument('--search-regex', '-s',
                        help='search regex',
                        required=True)

    parser.add_argument('--replace-regex', '-r',
                        help='replacement regex',
                        required=True)

    parser.add_argument('--glob', '-g',
                        help='glob pattern, e.g. *.html',
                        default="*.*")

    parser.add_argument('--dry-run', '-dr',
                        action='store_true',
                        help="don't replace anything just show what is going to be done",
                        default=False)

    parser.add_argument('--create-backup', '-b',
                        action='store_true',
                        help='Create backup files',
                        default=False)

    parser.add_argument('--verbose', '-v',
                        action='store_true',
                        help="Show files which don't match the search regex",
                        default=False)

    parser.add_argument('--print-parent-folder', '-p',
                        action='store_true',
                        help="Show the parent info for debug",
                        default=False)

    config = parser.parse_args(sys.argv[1:])

    find_replace(config)

Here is an updated version of the script which highlights the search terms and replacements with different colors.

ccpizza
  • 2
    I don't understand why you would make something this complex. For recursion, use either bash's (or your shell's equivalent) `globstar` option and `**` globs or `find`. For a dry run, just use `sed`. Unless you use the `-i` option, it won't make any changes. For a backup use `sed -i.bak` (or `perl -i .bak`); for files that don't match, use `grep PATTERN file || echo file`. And why in the world would you have python expand the glob instead of letting the shell do it? Why `script.py --glob=foo*` instead of just `script.py foo*`? – terdon Nov 23 '17 at 09:34
  • 2
    My _why's_ are very simple: (1) above all, ease of debugging; (2) using only a single well documented tool with a supportive community (3) not knowing `sed` and `awk` well and being unwilling to invest extra time on mastering them, (4) readability, (5) this solution will also work on non-posix systems (not that I need that but somebody else might). – ccpizza Nov 23 '17 at 12:59
2

Update July 2022: I have a really robust wrapper tool, rgr, which stands for "RipGrep Replace" and which wraps the incredibly fast RipGrep tool (rg), which you should use instead. See my other answer here. My wrapper supports all rg options, while adding -R for actual, on-disk text replacements.


Here I use grep to tell if it is going to change a file (so I can count the number of lines changed, and replacements made, to output at the end), then I use sed to actually change the file. Notice the single line of sed usage at the very end of the Bash function below:

gs_replace_str Bash function

Update: the below code has been upgraded and is now part of my eRCaGuy_dotfiles project as "find_and_replace.sh" here. <-- I recommend you use this tool now instead.

Usage:

gs_replace_str "regex_search_pattern" "replacement_string" "file_path"

Bash Function:

# Usage: `gs_replace_str "regex_search_pattern" "replacement_string" "file_path"`
gs_replace_str() {
    REGEX_SEARCH="$1"
    REPLACEMENT_STR="$2"
    FILENAME="$3"

    num_lines_matched=$(grep -c -E "$REGEX_SEARCH" "$FILENAME")
    # Count number of matches, NOT lines (`grep -c` counts lines), 
    # in case there are multiple matches per line; see: 
    # https://superuser.com/questions/339522/counting-total-number-of-matches-with-grep-instead-of-just-how-many-lines-match/339523#339523
    num_matches=$(grep -o -E "$REGEX_SEARCH" "$FILENAME" | wc -l)

    # If num_matches > 0
    if [ "$num_matches" -gt 0 ]; then
        echo -e "\n${num_matches} matches found on ${num_lines_matched} lines in file"\
                "\"${FILENAME}\":"
        # Now show these exact matches with their corresponding line 'n'umbers in the file
        grep -n --color=always -E "$REGEX_SEARCH" "$FILENAME"
        # Now actually DO the string replacing on the files 'i'n place using the `sed` 
        # 's'tream 'ed'itor!
        # -E makes sed use the same extended regex dialect as the `grep -E` calls above.
        sed -i -E -- "s|${REGEX_SEARCH}|${REPLACEMENT_STR}|g" "$FILENAME"
    fi
}

Place that in your ~/.bashrc file, for instance. Close and reopen your terminal and then use it.
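
The function above reports both matching lines and total matches. The difference between the two counts can be seen in isolation with this small sketch, which uses made-up input:

```shell
# Sketch: `grep -c` counts matching lines, `grep -o | wc -l` counts matches.
# The sample text is hypothetical.
sample=$(printf 'do it\ndo or do not\nnothing\n')

# Number of lines that contain at least one match.
num_lines_matched=$(printf '%s\n' "$sample" | grep -c -E 'do')

# -o prints every match on its own line, so wc -l counts the matches.
num_matches=$(printf '%s\n' "$sample" | grep -o -E 'do' | wc -l)

echo "lines: $num_lines_matched, matches: $num_matches"
```

Here two lines match but there are three matches, because the second line contains the pattern twice.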

Example:

Replace do with bo so that "doing" becomes "boing" (I know, we should be fixing spelling errors not creating them :) ):

$ gs_replace_str "do" "bo" test_folder/test2.txt 

9 matches found on 6 lines in file "test_folder/test2.txt":
1:hey how are you doing today
2:hey how are you doing today
3:hey how are you doing today
4:hey how are you doing today  hey how are you doing today  hey how are you doing today  hey how are you doing today
5:hey how are you doing today
6:hey how are you doing today?

(Screenshot not reproduced here: in the terminal, the matched text is highlighted in red.)

References:

  1. https://superuser.com/questions/339522/counting-total-number-of-matches-with-grep-instead-of-just-how-many-lines-match/339523#339523
  2. https://stackoverflow.com/questions/12144158/how-to-check-if-sed-has-changed-a-file/61238414#61238414
Gabriel Staples
  • I tried to use your `find_and_replace.sh` to find `@Table(name = "` and replace it with `@Table(name = "V_` but I failed in finding the right way of escaping parenthesis and double quotes. Is it a bug due to the fact that it uses both `grep` and `sed`? – Pino Jul 15 '22 at 08:57
  • Hi @Pino, it would look like this: `find_and_replace.sh "path/to/some_file.txt" "" '@Table\(name = "' '@Table(name = "V_'`. Here we use single quotes (`'`) to surround the double quotes, and you have to escape parenthesis in regular expressions with a backslash (\\). See https://regex101.com/ to help you come up with regular expressions (regexs). If you copy and paste `@Table(name = "` into the regex box there, it will highlight the `(` in red, indicating it has a problem and needs to be escaped as `\(`. – Gabriel Staples Jul 16 '22 at 01:56
  • Once you escape it, you can read the "EXPLANATION" section at the top-right of https://regex101.com/ for details. – Gabriel Staples Jul 16 '22 at 01:58
  • @Pino, that being said, my Ripgrep (`rg`) wrapper called [`rg_replace`, or `rgr`](https://unix.stackexchange.com/a/684926/114401) for short, is a **much better** tool! See [my other answer here](https://unix.stackexchange.com/a/684926/114401). `rgr` is **waaaay** faster than both `grep` and `git grep`, and has a ton of features. It's my go-to tool. `rgr` gives you access to _all_ `rg` features while adding the `-R` option to do actual on-disk replacements. – Gabriel Staples Jul 16 '22 at 01:59
0

My RipGrep Replace wrapper, rgr, is now my go-to on-disk find-and-replace and grep-replacement tool, period. It is incredibly robust, fast, and thorough. Unlike grep, it handles a full feature-set of regular expression syntax. It is 0.348s/0.136s = ~2.6x faster than git grep, and 0.806s/0.136s = ~5.9x faster than GNU grep (see RipGrep speed tests here), and rgr supports on-disk find-and-replace!

My rgr wrapper around the incredibly fast RipGrep (rg) tool gives it the ability to do on-disk textual replacements via a new -R option. I call my wrapper rgr, for "RipGrep Replace", since it will do the find-and-replace feature on your disk via the -R option I added.

See the installation instructions at the top of rg_replace.sh, so you can use it as rgr.

For additional information on it, see my comments here and here.

Example usages as shown from the help menu, accessed via rgr -h or rgr --help, are here:

EXAMPLE USAGES:

    rgr foo -r boo
        Do a *dry run* to replace all instances of 'foo' with 'boo' in this folder and down.
    rgr foo -R boo
        ACTUALLY REPLACE ON YOUR DISK all instances of 'foo' with 'boo' in this folder and down.
    rgr foo -R boo file1.c file2.c file3.c
        Same as above, but only in these 3 files.
    rgr foo -R boo -g '*.txt'
        Use a glob filter to replace on your disk all instances of 'foo' with 'boo' in .txt files
        ONLY, inside this folder and down. Learn more about RipGrep's glob feature here:
        https://github.com/BurntSushi/ripgrep/blob/master/GUIDE.md#manual-filtering-globs
    rgr foo -R boo --stats
        Replace on your disk all instances of 'foo' with 'boo', showing detailed statistics.

See the rgr --help menu yourself for a full description.

Gabriel Staples