Using recursion to find a string in all text files

Recently I had a need to work through a large number of text files looking for various bits of information. I decided to write a small program that would at least narrow down the number of files that I had to search through by telling me which files within a directory (and sub directories) contained the string I was looking for.

You can download this solution to use it yourself. Below are just a couple of pointers as to how it works.

The following piece of code performs the search for files within a given directory (and sub directories if the user has checked a box indicating that they want to include sub directories):

Firstly, in order to search for files within a directory, we use the System.IO.Directory.GetFiles method, which allows you to specify both the path that you want to search and, optionally, a file pattern, in this case text files (*.txt). The syntax is very straightforward, and a string array is returned containing the full paths of the files in that directory that match the pattern. Secondly, I have a check to see if the user has selected to include sub directories. If they have, then we make use of the System.IO.Directory.GetDirectories method, which returns a string array containing the paths of the directories immediately inside the given path. For each path returned I then call the same routine again in order to continue searching within that directory for files and further sub directories. A routine that calls itself in this way is said to be recursive.
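To make the steps above concrete, here is a minimal sketch of such a routine in C#. It is not the code from the downloadable solution: the class name FileSearchSketch, the method name SearchDirectory, the hits dictionary and the countOccurrences delegate (which stands in for the occurrencesOfSearchTerm method) are all names I have assumed for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class FileSearchSketch
{
    // Hypothetical routine: records every .txt file under 'path' for which
    // 'countOccurrences' reports at least one match.
    public static void SearchDirectory(
        string path,
        bool includeSubDirectories,
        Func<string, int> countOccurrences,   // stands in for occurrencesOfSearchTerm
        IDictionary<string, int> hits)
    {
        // GetFiles returns the full paths of the files in 'path' matching the pattern.
        foreach (string file in Directory.GetFiles(path, "*.txt"))
        {
            int count = countOccurrences(file);
            if (count > 0)
            {
                hits[file] = count;
            }
        }

        if (!includeSubDirectories)
        {
            return;
        }

        // GetDirectories returns only the directories immediately inside 'path',
        // so the routine calls itself to go one level deeper each time.
        foreach (string subDirectory in Directory.GetDirectories(path))
        {
            SearchDirectory(subDirectory, includeSubDirectories, countOccurrences, hits);
        }
    }
}
```

The recursion stops on its own when a directory has no sub directories, because the second loop then has nothing to iterate over. Directory.GetFiles also has an overload that takes SearchOption.AllDirectories and walks the whole tree in one call; writing the recursion by hand, as here, just keeps the sub directory check explicit.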

The code above calls out to a method called occurrencesOfSearchTerm which first opens the text file and then reads the entire contents before finally using a very basic regular expression to count the number of occurrences of the search term found within the file:
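As a rough illustration of that description, a counting method along these lines would do the job. Only the name occurrencesOfSearchTerm comes from the solution; the wrapper class OccurrenceCountSketch, the parameter names, the empty-term guard and the use of Regex.Escape are my assumptions for this sketch.

```csharp
using System.IO;
using System.Text.RegularExpressions;

public static class OccurrenceCountSketch
{
    // Counts how many times 'searchTerm' appears in the file at 'filePath'.
    public static int occurrencesOfSearchTerm(string filePath, string searchTerm)
    {
        // Assumption: an empty term counts as "not found". Without this guard
        // an empty pattern would match at every position in the text.
        if (string.IsNullOrEmpty(searchTerm))
        {
            return 0;
        }

        // Open the file and read its entire contents into memory.
        string contents = File.ReadAllText(filePath);

        // Assumption: the term is treated as literal text, so characters such
        // as '.' or '*' are escaped rather than interpreted as regex syntax.
        string pattern = Regex.Escape(searchTerm);

        // Matches returns every non-overlapping match; Count is the total.
        return Regex.Matches(contents, pattern).Count;
    }
}
```

This version is case sensitive. Passing RegexOptions.IgnoreCase as a third argument to Regex.Matches would make it case insensitive, if that is the behaviour you want.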

That’s it. Please download the solution to see the full set of code (not much more than what is posted here) and to test it out for yourself.