bash script, find amount of strings in files

krdgroup23

Hello everyone! I am writing a script that must count the amount of strings in files in .z format. The folder name depends on the current date: if today is the 1st of the month, the script counts in the folder called 2210 (the previous month); if today is not the 1st of the month, the folder is called 2211 (the current month). For example, I have these catalogs with files:
Code:
WORK6\AXE\CNA5\LBN\2211\1.z # 16 strings in file
WORK6\AXE\TELLIN\2211\2.z # 16 strings in file
WORK6\AXE\TELLIN\2211\3.z # 16 strings in file

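The folder rule described above can be sketched as a small function. This is only an illustration, not the poster's code: the name report_folder is hypothetical, and it does plain arithmetic on a year, month and day passed as arguments, so it does not depend on any particular date implementation.

```shell
#!/bin/bash
# Hypothetical helper: print the yymm folder name for a given date.
# Usage: report_folder YEAR MONTH DAY   (for example: report_folder 2022 11 01)
report_folder() {
    year=$1
    month=${2#0}   # strip a leading zero so 08 and 09 are not read as octal
    day=${3#0}
    if [ "$day" -eq 1 ]; then
        # On the 1st, step back to the previous month.
        month=$((month - 1))
        if [ "$month" -eq 0 ]; then
            # January wraps around to December of the previous year.
            month=12
            year=$((year - 1))
        fi
    fi
    printf '%02d%02d\n' "$((year % 100))" "$month"
}
```

For today's date it could be called as report_folder $(date +'%Y %m %d').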
Here is my script:
Code:
#!/bin/bash
#assign a value to the variable
timestamp=$(date +%d)
#if today is 1-st date of month then previous month else current
if [[ $timestamp == 01 ]]; then
    folder=$(date -d " - $(date +%d) days" +%y%m)
else
    folder=$(date +%y%m)
fi
#find path to files
path=$(find ./ -type d -name "$folder")
#find files and count amount of strings
for file in $path/*
do
    $(zcat $file | wc –l)
done

It shows:
Code:
wc: –l: No such file or directory
gzip: ./AXE/CNA5/LBN/2211 is a directory -- ignored
wc: –l: No such file or directory
wc: –l: No such file or directory

How should I change my script so that the result will be:
Code:
16 WORK6\AXE\CNA5\LBN #  16 strings in file 1.z
32  WORK6\AXE\TELLIN # 32 strings in files  2.z+3.z
 


$path != $PATH
As I understand it, in the fragment path=$(find ./ -type d -name "$folder") I get several paths, and my loop tries to process them all at once. Is that why it doesn't work? And how can I fix it?
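A minimal sketch of one way to handle the several paths that find returns: read them one at a time and total the lines of the .z files in each. The function name count_lines_by_folder is hypothetical, and the sketch assumes that the .z files can be read by gzip (as the use of zcat in the script suggests) and that directory names contain no newlines. Two other details in the posted script are worth noting: the "wc: –l: No such file or directory" message appears because the option was typed with an en dash instead of a plain hyphen, and wrapping the pipeline in $( ) makes the shell try to run its output as a command.

```shell
#!/bin/bash
# Hypothetical helper: for every directory named FOLDER under ROOT,
# print "<total lines> <parent directory>".
# Usage: count_lines_by_folder ROOT FOLDER
count_lines_by_folder() {
    root=$1
    folder=$2
    # find may print several directories; handle them one per iteration.
    find "$root" -type d -name "$folder" | while IFS= read -r dir; do
        total=0
        for f in "$dir"/*.z; do
            [ -f "$f" ] || continue          # skip an unmatched glob
            n=$(gzip -dc < "$f" | wc -l)     # plain hyphen in -l
            total=$((total + n))
        done
        printf '%s %s\n' "$total" "$(dirname "$dir")"
    done
}
```

Called as count_lines_by_folder . 2211, it prints one line per matching directory, with the total first and the parent path second.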
 
Post #2 was a cryptic reference to the naming of variables which I'll leave for the reader. If the post is homework, I shall offer only a hint which may or may not be useful. I interpret the expression "amount of strings" actually as "number of lines" since it's the output of the "wc -l" command that appears to be the output sought. The following will output the number of lines in a listing of files:
Code:
for file in *
  do
     wc -l "$file"
  done
 
We'll wait on an answer on whether this is homework and subject to that, it may get moved to Command Line, where scripting answers are provided.

Avagudweegend

Wiz
 
wc counts the number of words, characters and lines in a file. Not the number of strings.
Your question states to find the number of strings in a file.

If English is a second language to you, then I’ll concede that perhaps ‘strings’ might be a bad translation.

However, if English is your first language, ‘strings’ would imply that you’re looking for string variables, NOT the number of words, characters, or lines in a file.

So for example:
Bash:
echo 'This is a literal string'
echo "this is another string"
Everything in between single, or double quotes in the above is a string.

So perhaps you need to be using the strings command to find all of the strings in a file and then pipe the results to wc -l to determine the total number of strings found?!

I haven’t got time to write an example snippet atm. I’m on my phone and it’s early in the morning. My brain’s not fully in gear yet. But give that a go and let us know how you get on.

The strings command will search any binary, or executable program for strings and will list each string it finds.
And a compressed .z file is a binary format.

It might also be worth looking at the man page for strings, because it may or may not have an option that counts the strings found instead of displaying them all, which would mean there is no need to pipe the results to wc.

I haven’t used the strings command for a while, so I can’t remember anything about its various command-line options/switches.

I hope this helps!
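The strings-then-wc idea from this post can be sketched as follows. The function name count_strings is hypothetical, and the sketch assumes the strings utility (usually shipped with binutils) is installed. Note that on compressed data strings reports incidental runs of printable bytes, not the lines of the original text.

```shell
#!/bin/bash
# Hypothetical helper: count the printable strings that the
# strings utility finds in a file.
# Usage: count_strings FILE
count_strings() {
    # strings prints one string per line, so wc -l gives the count.
    strings "$1" | wc -l
}
```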
 
"amount of strings" actually as "number of lines": yes, I mean the number of lines in the .z files, which are in parent folders of current/
No, it is not homework. I work as a database integrator with Teradata, Vertica, ODI, MSSQL and so on, and I am now studying bash scripting to do that job and improve my skills.
 
I finished my script:

Code:
todays_day=$(date +%d)
if ((todays_day==1)); then
  month="$(date --date='1 day ago' +%y%m)"
else
  month="$(date +%y%m)"
fi
for catalog in $(find ./ -type d -name "$month")
do
    echo "number of lines $(find $catalog -type f -name "*.z" | xargs zcat | awk 1 | wc -l);source $(find $catalog -type d );date $(date +"%d-%m-%Y %T")"
done


But it doesn't count the last line if there is no end-of-line character. How can I fix it? The awk 1 step only fixes this for one file in each catalog, not for the other files.
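A possible explanation and a sketch of a fix: wc -l counts newline characters, and when xargs zcat concatenates several files into one stream, awk 1 can only add the missing newline at the very end of that combined stream. An unterminated last line in any earlier file is joined to the first line of the next file and lost from the count. Counting each file separately avoids this. The names count_lines_z and total_lines_in_dir are hypothetical, and the sketch assumes the .z files can be read by gzip.

```shell
#!/bin/bash
# Hypothetical helper: count the lines of one compressed file.
# awk counts records, so a final line without a trailing newline
# is still counted.
count_lines_z() {
    gzip -dc < "$1" | awk 'END { print NR }'
}

# Hypothetical helper: total the per-file counts for all .z files
# directly inside one directory.
total_lines_in_dir() {
    total=0
    for f in "$1"/*.z; do
        [ -f "$f" ] || continue   # skip an unmatched glob
        total=$((total + $(count_lines_z "$f")))
    done
    echo "$total"
}
```

Inside the loop, total_lines_in_dir "$catalog" could then replace the find | xargs zcat | awk 1 | wc -l pipeline.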
 
The script you have provided looks germane to your system, with a few oddities, but I shan't go into it since I'm presenting an alternative for your consideration.

As I understand it, the task is to produce the line counts of a series of files in .gz format, with dates. What follows is a sort of "proof of concept", so it is quite primitive, but it outputs the intended result.

First, create a directory with a number of files to be compressed.

The contents of the files are:
Code:
[flip@flop ~/test] cat file1
the
quick
brown


[flip@flop ~/test] cat file2
fox
jumps
over
the

[flip@flop ~/test] cat file3
lazy
dog

The files are compressed:
Code:
[flip@flop ~/test] gzip file1 file2 file3
[flip@flop ~/test] ls
file1.gz  file2.gz  file3.gz

Create the script, called "gzlines", to output filenames, number of lines and date.
Code:
[flip@flop ~] vi gzlines
#!/bin/bash
# Print each file's name, its uncompressed line count, and the current date.
for file in $(find ~/test -type f -print)
  do
    echo -n "$file" "No. of lines= "
    zcat "$file" | wc -l
    date
    echo ""
  done

Make the gzlines file executable and run it for the output:
Code:
[flip@flop ~] chmod 775 gzlines
[flip@flop ~] ./gzlines
/home/flop/test/file3.gz No. of lines= 2
Tue 29 Nov 2022 20:36:33 AEDT

/home/flop/test/file2.gz No. of lines= 4
Tue 29 Nov 2022 20:36:33 AEDT

/home/flop/test/file1.gz No. of lines= 3
Tue 29 Nov 2022 20:36:33 AEDT

The intended information is produced in this output, and comparing the line counts in the files with the output shows that the script is accurate. This is a rather primitive script: it has only basic formatting and a hard-coded directory, which restricts its reach. These details can be addressed to produce a more sophisticated script.
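As one possible refinement, the hard-coded directory could become an argument. The sketch below is a variation on gzlines, not the script as posted: it wraps the loop in a function, quotes the file names, counts with awk so that an unterminated last line is included, and leaves out the date line to keep the output short. It assumes file names contain no newlines.

```shell
#!/bin/bash
# Variation on gzlines: the directory is passed as an argument
# (default: the current directory) instead of being hard-coded.
# Usage: gzlines [DIR]
gzlines() {
    dir=${1:-.}
    find "$dir" -type f -name '*.gz' | while IFS= read -r file; do
        # awk counts records, so a last line without a newline is included.
        count=$(gzip -dc < "$file" | awk 'END { print NR }')
        printf '%s No. of lines= %s\n' "$file" "$count"
    done
}
```

For the test directory above it would be called as gzlines ~/test.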
 
