Multiple file search

Dorcas

New Member
Credits
65
Dear All,

May I begin by thanking those who answered an earlier query I posted here. It is a noble thing to help the ignorant! This time I am asking if anyone knows of an app or program in any Linux library which can sequentially search files, find a text string, and go to it? I have kept a daily diary since 1960, each year being one file, and all the files kept in one folder. How can I find and go to, for example, the date on which I moved to a new house, or first set up my computer to use Linux? I am using Linux Mint (Sarah) at the moment.
 


JasKinasis

Well-Known Member
Credits
5,358
Personally, I use rednotebook for my diary.
Not that I've been particularly consistent with it. But I can search rednotebook using keywords, or use its built-in calendar control to pick a date with a daily entry.

If your diary is contained in a series of text files - I don't think there is a single command, or program that can do what you want.
But the good news is that you should be able to write a small script that can do it.

It could be as simple as using something like grep to find the name of the file and the line-number containing the date you're looking for. Then once you have that information, you can use a command to open a text editor and go to a particular line.

So you're probably looking at using grep to find the filename and line-number, then something like cut, awk, or sed to extract that information into separate variables.
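To make that a bit more concrete, here is a minimal sketch of the grep-then-split step on its own. The throwaway folder, the file names and the sample entries are all made up for illustration, and it assumes the paths contain no ":" characters.

```bash
#!/usr/bin/env bash
# Build a throwaway diary folder so the example is self-contained.
# (demoDir and the sample entries are invented for this sketch.)
demoDir=$(mktemp -d)
printf '12/24/1984\nWrapped presents.\n12/25/1984\nChristmas dinner.\n' > "$demoDir/1984.txt"
printf '12/25/1985\nQuiet day.\n' > "$demoDir/1985.txt"

# -r searches every file under the folder, -H always prints the file name,
# -n prints the line number. Each match has the form path:line:text
match=$(grep -rHn "12/25/1984" "$demoDir")

# Split the match on ":" to get the two pieces we need.
filePath=$(echo "$match" | cut -d ":" -f 1)
lineNumber=$(echo "$match" | cut -d ":" -f 2)

echo "File: $filePath"
echo "Line: $lineNumber"
```

With those two variables in hand, the only thing left is to pass them to an editor.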

Then you need to fire up your preferred text editor and go to the required line in the file.

Most text editors accept command-line parameters that will allow you to jump to a specific line in a file.

For example, vi/vim, nano, emacs and gedit all use +{line-number} to specify a line to jump to in the file.
e.g.
Bash:
gedit +35 /path/to/file

Kate uses --line {line-number} (or -l for short).
e.g.
Bash:
kate /path/to/file --line 35
So for the final part of your script, you'd need to take a look at the man page for the text editor you use to find out how to open a file at a particular line.

Then it's a case of putting it all together in a script.

So perhaps you'd want to invoke the script and pass it a date.
It doesn't really matter what format the date is in. But despite being a Brit myself, I will assume you're using the American date format mm/dd/yyyy - we have a lot of American users here, so I'll run with it for now. And I'll explain how to search using other date formats.

But this script idea I have will only work if the date-format is consistent throughout the files.

So, let's call this script "gotodiary".
It will take a single parameter - the date in mm/dd/yyyy format (for now).
So, if you wanted to view your diary entry for Christmas Day in 1984, and your dates are formatted as mm/dd/yyyy, you'd type something like this:
Bash:
gotodiary 12/25/1984
And the script could look something like this:
gotodiary:
Bash:
#!/usr/bin/env bash

# Search directory - IMPORTANT - edit this!
searchDir="/path/to/diaryfiles/"

# Find the filename and line-number for our date
results=$(\grep -RHn "$1" "$searchDir" 2> /dev/null)

# If we have no results - exit
if [[ -z "$results" ]] ; then
    echo "No entries found for $1"
    exit 1
fi

# if we got here - we found an entry with the specified date string

# Get the path/filename and the line-number
filePath=$(echo "$results" | awk -F ":" '{print $1}')
lineNumber=$(echo "$results" | awk -F ":" '{print $2}')

# Now open the file in your editor and jump to the line
# I use vim in the terminal, so I'd do this:
vim +"$lineNumber" "$filePath"
# IMPORTANT - replace the above line with a command to start your
# preferred text editor
# Some suggestions are below:

# Terminal-based text editor commands:
# vim +"$lineNumber" "$filePath"
# nano +"$lineNumber" "$filePath"
# emacs -nw +"$lineNumber" "$filePath"

# Graphical text editor commands
# gvim +"$lineNumber" "$filePath" &> /dev/null &
# emacs +"$lineNumber" "$filePath" &> /dev/null &
# gedit +"$lineNumber" "$filePath" &> /dev/null &
# kate "$filePath" --line "$lineNumber" &> /dev/null &
# etc etc.
# If you use a text editor that is not in the above list, then
# take a look at the man page for your editor and see what options you need
# to use
Save the above in your home directory as "gotodiary"

IMPORTANT NOTES:
1. Set the variable searchDir to the full path of the directory containing all of your diary files. e.g. /home/dorcas/Documents/diary/
2. Set the editor command for your chosen text editor.
3. The extra bits of code at the end of the commented-out lines for opening files in graphical text editors like kate and gedit do the following things:
- The &> /dev/null redirects all terminal output from these graphical programs, so their output doesn't clutter up your terminal.
- The final & at the end runs the graphical text editor in the background. This allows you to continue to use the terminal whilst the graphical text editor is still open.

I use vim in the terminal, so I don't perform any redirection and do not need to put vim in the background, because it is a terminal program anyway! Likewise, if you are using any other terminal based text editor, like nano, or emacs (with the -nw option), those extra redirections are not required.

But I would add those extra bits if I decided to open the file in gvim - the graphical version of vim, or any other graphical text editor.

4. Running the script:
4a - Firstly, make sure you have read the above notes and have set the path to your directory containing your diary files and set up a command for your text editor.
4b - Make the script executable using chmod +x gotodiary
4c - Run the script like this:
Bash:
./gotodiary 12/25/1984
Additional notes:
1. If you're using a different date format, you can simply use that instead.
So if you're British like me and the dates in your diary file are in dd/mm/yyyy format, you'd run it like this:
Bash:
./gotodiary 25/12/2020
If you use a different format that contains space characters, e.g. "Tues 25th December 1984", then you would need to enclose the date in double quotes:
e.g.
Bash:
./gotodiary "25th December 1984"
- In the above example - nobody is going to remember that the 25th of December in 1984 was a Tuesday (I looked it up), but the rest of the date is unique enough to correctly identify it.

2. The script assumes that all dates are in a uniform format and that all of the diary files are in the same directory.

3. The script assumes that there will only ever be one entry per date. It doesn't deal with instances where there are several entries for a particular date.

4. It performs no range-checking/date validation on the dates.
At the end of the day - this is just a quick and dirty script, completely off the top of my head. It may not even fit your needs. But hopefully it will give you some ideas!

5. You could use any other unique search string as a parameter.
The search parameter doesn't have to be a date - if you have some other unique string that only appears once in your diary - then you can use that to search.
e.g.
gotodiary "Dave married Jess"

Again - if it finds multiple instances of the search term - the above script would need to be modified.
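As a rough idea of what that modification might look like, here is one possible sketch. It sorts the matches by file name and line number and lets you pick the Nth one (the first by default). The function name nth_match, the demo folder and the sample entries are all hypothetical, and the -F option (treat the search text as a plain string) is an addition that the script above does not use. Like the original, it assumes the paths contain no ":" characters.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print "file<TAB>line" for the Nth match (default: first).
# Matches are sorted by file name, then numerically by line number, so with
# one file per year the earliest year comes first.
nth_match() {
    local pattern=$1 dir=$2 n=${3:-1}
    grep -rHnF -- "$pattern" "$dir" 2> /dev/null \
        | sort -t ":" -k1,1 -k2,2n \
        | sed -n "${n}p" \
        | awk -F ":" '{ print $1 "\t" $2 }'
}

# Throwaway demo data (invented for this sketch).
demoDir=$(mktemp -d)
printf 'Installed Linux today.\n' > "$demoDir/2003.txt"
printf 'Nothing much.\nFirst tried Linux.\n' > "$demoDir/2002.txt"

# Ask for the first match and split the result on the tab character.
result=$(nth_match "Linux" "$demoDir" 1)
filePath=$(echo "$result" | cut -f 1)
lineNumber=$(echo "$result" | cut -f 2)

echo "Would open $filePath at line $lineNumber"
# To actually open it, use your editor command, e.g.:
# vim +"$lineNumber" "$filePath"
```

Passing 2 instead of 1 as the last argument would give the second match, and so on, so you could step through each instance in turn.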
I hope this is in some way useful!
 

stan

Active Member
Credits
2,095
I have kept a daily diary since 1960, each year being one file, and all the files kept in one folder.
A simple method that you might find suitable is to use the grep command, but it does not fully do what you have asked. Here is an example:
[Attached screenshot: grep.png, showing the ls and grep output described below]


With a terminal open in the "diary" folder, the ls command shows the 5 files I created to simulate your method of one-file-per-year. The grep command uses 2 options together: -i makes the search string case insensitive, so that both "Linux" and "linux" are returned. The -r option makes the search recursive, so that all the files in the folder will be scanned.

You can quickly see that Linux is first mentioned here in 2002, and also in every year after that. At this point you would have to manually open the file for 2002 with an editor and find the Linux reference to determine the date of your diary entry. CTRL-F is usually available in many text editors to quickly find a word or phrase.

I don't know why the grep output for Linux shows the years out of sequence, but that is something for you to watch for if you try using this simple method, and if you are indeed searching for the first instance of your search term(s).
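In case the screenshot is not visible, the commands can be sketched roughly like this. The folder, file names and diary lines here are invented stand-ins, not the ones in the screenshot. As far as I can tell, grep -r simply reports files in whatever order the filesystem lists them, which is not necessarily alphabetical, and that would explain years appearing out of sequence.

```bash
#!/usr/bin/env bash
# Throwaway folder with one file per year (names and contents are made up).
demoDir=$(mktemp -d)
printf 'Bought a new computer.\n' > "$demoDir/2001.txt"
printf 'First tried Linux today.\n' > "$demoDir/2002.txt"
printf 'Still enjoying linux.\n' > "$demoDir/2003.txt"

# Show the files, as the ls command does in the screenshot.
ls "$demoDir"

# -i ignores case, so "Linux" and "linux" both match.
# -r searches every file in the folder.
matches=$(grep -ir "linux" "$demoDir")
echo "$matches"
```

Each line of output starts with the name of the file the match was found in, followed by the matching text.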


sequentially search files, find a text string, and go to it ?
Your wish for the search output to actually go to the found text might not be such a good idea, in some cases. If your search string is not narrowly focused enough, it might open up all of your files, or a great many of them. But I don't know how to accomplish that part anyway. The grep command will search the files and find the text strings, as I have shown. But there are always many ways to do things in Linux, so this is likely not the best way... it's just one way.

Good luck!
 

stan

Active Member
Credits
2,095
In fiddling around a little more, I have a little enhancement to what I offered above, but it still will not "go to the text" that is found by the search.
[Attached screenshot: grep.png, showing the output of grep -nir piped through sort]


This time I added the -n option, now grep -nir so that you see the line number in your diary file where your text string is located. That will help you to quickly find the string after you manually open the file in a text editor.

Then I "piped" the output of grep to the input of the sort command. This now puts the diary files in the proper order to help you to know when the first (or last) instance of a text string search occurred.

With my Linux distro, grep gives a colorful output, but these colors are lost after running through the sort command. That doesn't matter... at least the output comes closer to providing information on the search string you are looking for.
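Written out as text, the enhanced command looks roughly like the sketch below. Again, the folder and diary lines are invented for illustration. The colors most likely disappear because grep's automatic coloring only applies when its output goes straight to a terminal, not into a pipe.

```bash
#!/usr/bin/env bash
# Throwaway folder with one file per year (names and contents are made up).
demoDir=$(mktemp -d)
printf 'Still enjoying linux.\n' > "$demoDir/2003.txt"
printf 'Nothing much.\nFirst tried Linux today.\n' > "$demoDir/2002.txt"

# -n adds the line number, -i ignores case, -r searches the whole folder.
# Piping through sort puts the results in file-name (year) order.
sorted=$(grep -nir "linux" "$demoDir" | sort)
echo "$sorted"

# The first line of the sorted output is the earliest file that matched.
firstHit=$(echo "$sorted" | head -n 1)
echo "Earliest match: $firstHit"
```

Using sort -r instead would put the most recent year at the top, which helps if you are after the last mention rather than the first.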

[EDIT] Oops, I noticed rather late that @JasKinasis already mentioned using grep with line numbers, although he provides a much more sophisticated script that may satisfy your needs better. Again, good luck with your project, however you go. [/EDIT]
 

Dorcas

New Member
Credits
65
Thanks Stan, I am grateful for your help. It will take me a while to find out how your solution works. I would use short and specific search strings such as the name of a place or a person and would not mind looking at each instance. Dorcas
 

Dorcas

New Member
Credits
65
Hi Jas,
Thanks for the considerable time you have put in to suggest a practical solution. I am new to Linux and had hoped that one of the many existing apps/programs in the various Linux libraries would have done the job. I thought that the need to search multiple files and go to an identified string would have been common enough to have resulted in quite a few programs to do it. It will take me a while to get my head around what you have suggested, but I will do my best. Dorcas.
 
