Finding Directories With Regular Expressions

Introduction

This article uses the File and Directory classes, together with regular expressions, to report the number of files of each file type in a specified directory path. The example also serves as a "clean-up" utility: when the program encounters a file that has the .bak filename extension (i.e., a backup file), it displays a message box asking the user whether that file should be removed, then responds appropriately to the user's input.

Use the following controls to perform the search in a WPF application.

Finddirectory.bmp

When the user presses the Enter key or clicks the Search Directory button, the program invokes the method "btnSearch_Click", which starts a recursive search of the directory path that the user provides.

If the user enters text in the TextBox, the Directory method "Exists" is called to determine whether that text names an existing directory. If it does not, the program notifies the user of the error.

If the user specifies a valid directory, the directory name is passed as an argument to the private method SearchDirectory. This method locates files whose names match the regular expression described below.

// check for user input; default is current directory
if (this.txtPath.Text != "")
{
    // verify that user input is a valid directory name
    if (Directory.Exists(this.txtPath.Text))
    {
        currentDirectory = this.txtPath.Text;

        // update the current-directory display
        tbkCurrentDirectory.Text = currentDirectory;
    } // end if
    else
    {
        // show error if user does not specify a valid directory
        MessageBox.Show("Invalid Directory", "Error",
            MessageBoxButton.OK, MessageBoxImage.Error);
    }
}
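
The validation step can be separated from the UI so that it is easier to exercise on its own. The following is a minimal sketch, not the article's code: the class PathInput and the method TryResolveDirectory are hypothetical names, and the sketch assumes that empty input should fall back to the current directory, as the comment in the listing suggests. It also treats whitespace-only input as empty, which the original comparison with "" does not.

```csharp
using System;
using System.IO;

public static class PathInput
{
    // Hypothetical helper: decides which directory to search.
    // Returns false when the user typed something that is not an existing
    // directory; in that case 'resolved' stays at the current directory.
    public static bool TryResolveDirectory(
        string userInput, string currentDirectory, out string resolved)
    {
        resolved = currentDirectory;

        // Empty input: keep searching the current directory.
        if (string.IsNullOrWhiteSpace(userInput))
            return true;

        // Directory.Exists returns false for missing or invalid paths.
        if (!Directory.Exists(userInput))
            return false;

        resolved = userInput;
        return true;
    }
}
```

A click handler could call this helper and show the "Invalid Directory" message box only when it returns false.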


The regular expression below matches one or more letters or digits, followed by a period and one or more word characters (letters, digits, or underscores). Notice the substring "(?<extension>\w+)" in the argument to the Regex constructor. It indicates that the part of the string that matches "\w+" (in other words, the filename extension that appears after a period in the file name) should be captured in the named group extension. The group's value is retrieved later from the Match object "matchResult" to obtain the filename extension so we can summarize the types of files in the directory.

// regular expression for extensions matching pattern
Regex regularExpression = new Regex(@"[a-zA-Z0-9]+\.(?<extension>\w+)");

// stores regular-expression match result
Match matchResult;
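
To see what the named group captures, the pattern can be tried on a few file names outside the application. This is a small sketch under stated assumptions: the class ExtensionMatcher and the method GetExtension are hypothetical, and the group is read through Match.Groups rather than the Match.Result call used in the article; the two should yield the same string for this pattern.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ExtensionMatcher
{
    // Same pattern as in the article.
    private static readonly Regex Pattern =
        new Regex(@"[a-zA-Z0-9]+\.(?<extension>\w+)");

    // Returns the text captured by the "extension" group,
    // or "[no extension]" when the pattern does not match.
    public static string GetExtension(string fileName)
    {
        Match match = Pattern.Match(fileName);
        return match.Success
            ? match.Groups["extension"].Value
            : "[no extension]";
    }
}
```

With this helper, "notes.txt" gives "txt" and "README" gives "[no extension]". Because the pattern is not anchored and Match returns the leftmost match, a name such as "archive.tar.gz" gives "tar" rather than "gz".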

Call the Directory method GetDirectories to retrieve the names of all subdirectories that belong to the current directory.

// get directories
directoryList = Directory.GetDirectories(currentDirectory);

Call the Directory method GetFiles to store the names of the files in the current directory in the string array fileArray.

// get list of files in current directory
fileArray = Directory.GetFiles(currentDirectory);
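
The two calls can be tried in isolation to see what they return. The sketch below is illustrative only; DirectoryListing and Describe are hypothetical names, and error handling is left out (both calls can throw, for example UnauthorizedAccessException, when a directory is not readable).

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class DirectoryListing
{
    // Returns one line per immediate subdirectory and file of 'directory'.
    public static List<string> Describe(string directory)
    {
        var lines = new List<string>();

        // The single-argument overloads look at the top level only; the
        // returned paths include the directory that was passed in.
        foreach (string subdirectory in Directory.GetDirectories(directory))
            lines.Add("[dir]  " + subdirectory);

        foreach (string file in Directory.GetFiles(directory))
            lines.Add("[file] " + file);

        return lines;
    }
}
```

Since neither call descends into subdirectories, the article's SearchDirectory method has to recurse explicitly.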

The first foreach loop processes every file in the current directory; a second loop then calls SearchDirectory recursively for each subdirectory. For each file, strip the directory path so that the regular expression is tested against the file name only. Use the Regex method "Match" to match the regular expression against the file name, then assign the result to the Match object matchResult. If the match is successful, use the Match method "Result" to assign to fileExtension the value of the named group extension from "matchResult". If the match is unsuccessful, set fileExtension to "[no extension]". The loop also counts each extension and offers to delete any file whose extension is bak.

// iterate through list of files
foreach (string myFile in fileArray)
{
    // remove directory path from file name
    fileName = myFile.Substring(myFile.LastIndexOf(@"\") + 1);

    // obtain result for regular-expression search
    matchResult = regularExpression.Match(fileName);

    // check for match
    if (matchResult.Success)
        fileExtension = matchResult.Result("${extension}");
    else
        fileExtension = "[no extension]";

    // store value from container
    if (found[fileExtension] == null)
        found.Add(fileExtension, "1");
    else
    {
        extensionCount = Int32.Parse(found[fileExtension]) + 1;
        found[fileExtension] = extensionCount.ToString();
    }

    // search for backup (.bak) files
    if (fileExtension == "bak")
    {
        // prompt user to delete (.bak) file
        MessageBoxResult result = MessageBox.Show("Found backup file " + fileName + ". Delete?", "Delete Backup", MessageBoxButton.YesNo, MessageBoxImage.Question);

        // delete file if user clicked 'yes'
        if (result == MessageBoxResult.Yes)
        {
            File.Delete(myFile);
            extensionCount = Int32.Parse(found["bak"]) - 1;
            found["bak"] = extensionCount.ToString();
        } // end if
    }
}

// recursive call to search files in subdirectory
foreach (string myDirectory in directoryList)
    SearchDirectory(myDirectory);

Class "FileSearchForm" uses an instance of the class "NameValueCollection" to store each filename-extension type and the number of files of that type. A NameValueCollection (namespace "System.Collections.Specialized") contains a collection of key-value pairs of strings, and provides the method "Add" to add a key-value pair to the collection. The indexer for this class can index by the order in which items were added or by key. Use the NameValueCollection found to determine whether this is the first occurrence of the filename extension: the expression found[fileExtension] returns null if the collection does not contain a key-value pair for the specified fileExtension. If this is the first occurrence, add that extension to "found" as a key with the value "1". Otherwise, increment the value associated with the extension in "found" to indicate another occurrence of that file extension, and assign the new value to the key-value pair. Because the collection stores strings, the count is parsed to an integer, incremented, and converted back to a string.

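// the collection is created once and shared by all calls to SearchDirectory
NameValueCollection found = new NameValueCollection();

// count the extension: add it on first occurrence, otherwise increment
if (found[fileExtension] == null)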
    found.Add(fileExtension, "1");
else
{
    extensionCount = Int32.Parse(found[fileExtension]) + 1;
    found[fileExtension] = extensionCount.ToString();
}
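
The parse-and-format round trip exists because NameValueCollection stores strings. As an alternative to the article's approach, and not what the article itself uses, the counts could be kept in a Dictionary<string, int>. The sketch below assumes that substitution; ExtensionCounter and Increment are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class ExtensionCounter
{
    // Adds one to the count for 'extension', creating the entry if needed.
    public static void Increment(Dictionary<string, int> counts, string extension)
    {
        // TryGetValue leaves 'current' at 0 when the key is absent.
        counts.TryGetValue(extension, out int current);
        counts[extension] = current + 1;
    }
}
```

The first-occurrence check and the increment collapse into the same two lines, at the cost of losing the NameValueCollection's index-by-position access.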

The following code determines whether fileExtension equals "bak", in other words whether the file is a backup file. If so then prompt the user to indicate whether the file should be removed; if the user clicks "Yes" then delete the file and decrement the value for the "bak" file type in "found".

if (fileExtension == "bak")
{
    // prompt user to delete (.bak) file
    MessageBoxResult result = MessageBox.Show("Found backup file " + fileName + ". Delete?", "Delete Backup", MessageBoxButton.YesNo, MessageBoxImage.Question);

    // delete file if user clicked 'yes'
    if (result == MessageBoxResult.Yes)
    {
        File.Delete(myFile);
        extensionCount = Int32.Parse(found["bak"]) - 1;
        found["bak"] = extensionCount.ToString();
    } // end if
}
} // end foreach
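
Because the deletion is tied to a message box, it is hard to try without the UI. One possible way to decouple the two, shown here only as a sketch, is to pass the confirmation in as a delegate. BackupCleaner and DeleteIfConfirmed are hypothetical names. Unlike the article's code, this version detects backups with Path.GetExtension and ignores case, so it also treats "OLD.BAK" as a backup file.

```csharp
using System;
using System.IO;

public static class BackupCleaner
{
    // Deletes the file only if it has a .bak extension and 'confirm'
    // returns true for its name. Returns true when the file was deleted.
    public static bool DeleteIfConfirmed(string filePath, Func<string, bool> confirm)
    {
        // Path.GetExtension includes the leading period, e.g. ".bak".
        bool isBackup = string.Equals(
            Path.GetExtension(filePath), ".bak",
            StringComparison.OrdinalIgnoreCase);

        if (!isBackup || !confirm(Path.GetFileName(filePath)))
            return false;

        File.Delete(filePath);
        return true;
    }
}
```

In the WPF application the delegate could wrap the MessageBox.Show call from the listing and compare its result with MessageBoxResult.Yes; the caller would decrement the "bak" count when the method returns true.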

Call the method SearchDirectory for each subdirectory. Recursion ensures that the program applies the same logic, including the check for .bak files, to every subdirectory. Once all subdirectories have been processed, the SearchDirectory method completes and the program displays the results.

// recursive call to search files in subdirectory
foreach (string myDirectory in directoryList)
    SearchDirectory(myDirectory);
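
The shape of the recursion is easier to see with the counting and prompting removed. The sketch below is a simplified illustration, not the article's SearchDirectory: BackupFinder and FindBackups are hypothetical names, it only collects the paths of .bak files, and it identifies them with Path.GetExtension (ignoring case) instead of the regular expression.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class BackupFinder
{
    // Returns the paths of all .bak files in 'directory' and below.
    public static List<string> FindBackups(string directory)
    {
        var backups = new List<string>();

        // Handle the files at this level first.
        foreach (string file in Directory.GetFiles(directory))
        {
            if (string.Equals(Path.GetExtension(file), ".bak",
                    StringComparison.OrdinalIgnoreCase))
                backups.Add(file);
        }

        // Then apply the same method to each subdirectory.
        foreach (string subdirectory in Directory.GetDirectories(directory))
            backups.AddRange(FindBackups(subdirectory));

        return backups;
    }
}
```

The recursion ends on its own: a directory without subdirectories yields an empty array from GetDirectories, so no further calls are made.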

Result

Finddirectory1.bmp

Summary

In the preceding example we learned how to search a directory tree recursively and how to use a regular expression to determine the type of each file found.
