Python script to find files that contain a text string

Last updated on 23rd September 2015

This is a Python program to search for a specified text string in all files of a directory. The user inputs the directory path and the text string to search for. Additionally, by entering a file type, the search can be restricted to a subset of the files in the directory. The program outputs the names of the files that contain the search string, along with the line number and column number where the text appears in each file.

Program Flow

The program uses the os module to get the files from a directory.

The user is prompted to enter a directory path, a file type and a text string to search for. If the directory path does not end with a directory separator, which is a forward slash (/) on Linux and either a backslash (\) or a forward slash (/) on Windows, then a forward slash is appended to the search path. The resulting path is then validated. If the path is invalid or does not exist, the search path is set to the current directory.
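
The path handling described above can be sketched as a small function. This is only an illustration: the function name normalize_search_path is hypothetical, and the sketch uses os.path.isdir (rather than os.path.exists) and falls back to "./" so that a filename can be appended directly. Both of those are choices of this sketch.

```python
import os

def normalize_search_path(search_path):
    """Return a directory path ending in a separator.

    Hypothetical helper: falls back to the current directory
    when the given path is missing or is not a directory.
    """
    # Append a separator only if the path does not already end with one
    if not search_path.endswith(("/", "\\")):
        search_path = search_path + "/"

    # Fall back to the current directory for invalid paths
    if not os.path.isdir(search_path):
        search_path = "./"

    return search_path

print(normalize_search_path("."))  # prints ./
```

Passing a tuple to endswith() checks both separators in one call, which is equivalent to chaining two endswith() tests with "or".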

In the next step, each file in the directory is checked to see whether it matches the file type. The file type can be any text file format such as .txt, .log, .ini, .conf etc. If the user inputs a file type, for example .ini, the program checks whether the filename ends with the extension .ini. If no file type is input, the program searches all files in the directory.
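
A minimal sketch of the file type filter follows, using a hypothetical helper matches_file_type and made-up filenames. It also shows why an empty file type selects every file: endswith("") is true for any string.

```python
def matches_file_type(fname, file_type):
    """Return True if fname ends with file_type (hypothetical helper)."""
    # An empty file_type matches every name, because "".endswith("") and
    # "anything".endswith("") are both True
    return fname.endswith(file_type)

# Made-up filenames for illustration
names = ["app.log", "settings.ini", "notes.txt", "archive.log.gz"]

print([n for n in names if matches_file_type(n, ".log")])  # ['app.log']
print(len([n for n in names if matches_file_type(n, "")]))  # 4
```

Note that the comparison is case-sensitive, so a file named APP.LOG would not match the file type .log.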

The files that match the file type are opened and each line is read in a loop. The find() method is called to check whether the search string is present in a line. If it is found, find() returns the zero-based index of the first occurrence of the string in the line; otherwise it returns -1. The filename, line number, index and the whole line that contains the string are output to the user console.
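
The line-by-line search can be illustrated on an in-memory list of lines. The generator find_in_lines and the sample lines are assumptions made for this sketch, which uses enumerate() to number the lines instead of a manual counter.

```python
def find_in_lines(lines, search_str):
    """Yield (line_no, index, line) for each line containing search_str."""
    for line_no, line in enumerate(lines, start=1):
        index = line.find(search_str)
        # find() returns -1 when the string is absent; 0 is a valid match
        # at the start of the line, so compare against -1 explicitly
        if index != -1:
            yield line_no, index, line.rstrip("\n")

# Made-up lines standing in for the contents of a file
sample = ["Service started\n", "Unknown error occurred\n", "error at startup\n"]

for hit in find_in_lines(sample, "error"):
    print(hit)
# (2, 8, 'Unknown error occurred')
# (3, 0, 'error at startup')
```

Because find() reports only the first occurrence, a line that contains the search string several times is still listed once.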

Program Source

# Import os module
import os

# Ask the user for the directory path, file type and search string
search_path = input("Enter directory path to search : ")
file_type = input("File Type : ")
search_str = input("Enter the search string : ")

# Append a directory separator if not already present
if not (search_path.endswith("/") or search_path.endswith("\\")):
    search_path = search_path + "/"

# If path does not exist, set search path to current directory
# (with a trailing separator, so that the filename can be appended to it)
if not os.path.exists(search_path):
    search_path = "./"

# Repeat for each file in the directory
for fname in os.listdir(path=search_path):

    # Apply file type filter and skip subdirectories
    if fname.endswith(file_type) and os.path.isfile(search_path + fname):

        # Open file for reading
        fo = open(search_path + fname)

        # Read the first line from the file
        line = fo.readline()

        # Initialize counter for line number
        line_no = 1

        # Loop until EOF
        while line != '':
            # Search for string in line
            index = line.find(search_str)
            if index != -1:
                print(fname, "[", line_no, ",", index, "] ", line, sep="")

            # Read next line
            line = fo.readline()

            # Increment line counter
            line_no += 1

        # Close the file
        fo.close()

Sample Program Output

Run the program to find which log files in the directory c:\samples contain the word "error":


Enter directory path to search : c:\samples
File Type : .log
Enter the search string : error
LogFile-20150922.log[5,25] 22/09/2015 08:56 Unknown error occured.

Comments

Josef | November 12, 2019 10:52 AM |

thanks for that !

Nik | October 10, 2019 3:31 PM |

Is it possible to output to Outlook / CSV? Also, can you do a "fuzzy" search? Finally, can you add criteria to pick a range of terms and specify whether the results must contain all, must contain none, must contain one but not the others, etc.? New to Python but loving it :-)

Ana_Conda | October 11, 2019 10:12 AM |

Look at this article which explains reading and writing to CSV. www.opentechguides.com/how-to/article/python/181/csv-read-write.html

Juta | October 2, 2019 3:20 PM |

How can I run this script via a web browser, to enter the string in the browser?

Juta | October 2, 2019 2:51 PM |

Hello, I copied and pasted this script and ran it in a terminal as ./test.py. I got this error: ./py.py File "./py.py", line 43 print(fname, "[", line_no, ",", index, "] ", line, sep="") ^ How can I solve this?

Ramesh | January 24, 2018 7:28 AM |

Thanks a lot for this wonderful script! Just curious to know whether we can feed a text file as input, to match/find all the values in that file instead of manually providing the text string, and then get an output file (as .txt only) with all the details, such as which files contain the values mentioned in the input text file.

Patrick | April 2, 2019 10:55 AM |

@Don You need to add the code between line_no = 1 and while line != '' :

Don | February 23, 2019 12:38 AM |

@Patrick Where would I insert your code to read from a text file as an input, as Ramesh is attempting to do? I have a similar need. Thanks in advance.

Patrick | September 17, 2018 10:28 AM |

Open the file and loop through the lines as below:
with open(filename, 'r') as fh:
   for search_str in fh.read().splitlines():

Jay | September 14, 2018 9:55 PM |

I need a similar solution, but no luck so far.

ahmed | April 12, 2018 8:04 AM |

Team, I am having trouble with my custom requirement. Basically, I am searching for a string in about 100 files, and I want to write to a new file every time the searched string appears in these files. stdout shows the proper output, but I am not able to write the same output to a new file.

prog | January 17, 2018 3:00 PM |

how can we save all results to results.txt?

skp | January 18, 2018 9:51 AM |

Open a file for writing
fw = open(search_path + filename, 'w')
and inside the while loop
fw.write(fname + " " + str(line_no) + " " + str(index) + "\n")

Rajesh vishwakarma | May 12, 2016 9:14 AM |

This is a nice script, I learnt something from it.

Marcel | December 24, 2015 4:38 PM |

How to use two search strings? Print all lines including both search strings?

Marcel | December 24, 2015 4:38 PM |

Found it!
search_str1 = input("Enter the search1 string : ")
search_str2 = input("Enter the search2 string : ")
# Search for strings in line: index is -1 unless both strings are present
index = line.find(search_str1) if search_str2 in line else -1