Neat little script there Stan!
There are a few improvements that could be made here and there.
There are some bits of repetition that could be refactored into loops and a few other things I've spotted.
I hope you don't mind, but I've had a quick play with your script and boiled it down to this:
(dice2.sh in the attached zip)
Code:
#!/usr/bin/env bash
# Generates random passphrases using the Diceware(TM) Word list
# Diceware(TM) is a trademark of Arnold G. Reinhold. See more at http://www.diceware.com.
# WARNING! The Diceware(TM) author strongly discourages the use of electronic number generators such as this script, as the quality of the random number cannot be certain. He is right, of course. Do not use this script if you have any concerns about the security of the passphrase it generates. If you have extremely high security needs, use real dice. For my use, the script seems to be sufficiently random enough. The creation of this script was just for fun and learning.
# Dice version 1.0 completed 12 March 2018 by Stan Vandiver http://www.linuxgeeks.us. Copyright (C) 2017-2018
# Additional edits by Jason Trunks https://notabug.org/JasKinasis
# Quit with an error message
function die {
    echo >&2 "ERROR:" "$@"
    exit 1
}
# Generate a 5 digit random number based on the roll of a 6-sided dice
function get_word_code {
    # Start with an empty array (note: digits=[] would assign the string "[]")
    digits=()
    for digitcount in {0..4}; do
        digits[$digitcount]=$(( RANDOM % 6 + 1 ))
    done
    echo -n "${digits[@]}" | sed 's/ //g'
}
clear
printf "\nPlease visit diceware.com for more info about passphrase security.\n"
printf "\nHow many words should the passphrase contain? "
read number_of_words
printf "\n"
# Ensure user entered a valid number
number_of_words=$(echo "${number_of_words}" | awk '/^[0-9]+$/')
[[ $number_of_words ]] || die "Invalid number entered..."
printf "Dice\tWord list Match\n" > passphrase.txt
# Generate words
count=0
while [[ $count -lt "${number_of_words}" ]]
do
    \grep "$(get_word_code)" dicelist.txt | sed 's/ //g' >> passphrase.txt
    (( count++ ))
done
# Display results
cat passphrase.txt
printf "\n"
#rm passphrase.txt
# Uncomment the line above if you want the file to automatically delete itself after generating each passphrase.
For the sake of brevity in the post here, I've removed some of the longer comments at the top of the file which describe the operation of the script, but I've kept your original copyright/licensing information in case anybody else copies code from this post. The full comments are in the versions in the .zip I've attached.
So what changes are in the above?
Let's run through it:
First up we have a couple of functions.
The first function "die":
Code:
# Quit with an error message
function die {
    echo >&2 "ERROR:" "$@"
    exit 1
}
This is called in the event that an error condition occurs. All it does is output an error message (passed in by the caller) and exits with the value 1 - indicating that an error has occurred.
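As a quick illustration of how die might be used elsewhere, here is a minimal sketch that checks a file is readable before carrying on. The require_file helper and the file path are made up for this example and are not in the attached scripts.

```shell
#!/usr/bin/env bash
# Same die function as in the script above
function die {
    echo >&2 "ERROR:" "$@"
    exit 1
}

# Hypothetical helper: abort unless the given file exists and is readable
function require_file {
    [[ -r "$1" ]] || die "Cannot read file: $1"
}

# Run the check in a subshell so this demo carries on after the failure
( require_file "/no/such/wordlist.txt" )
echo "require_file exited with status $?"
```

Because die writes to stderr, the error message won't pollute anything that is capturing the script's normal output.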
The second function replaces get_digit and get_a_word.
Code:
# Generate a 5 digit random number based on the roll of a 6-sided dice
function get_word_code {
    # Start with an empty array (note: digits=[] would assign the string "[]")
    digits=()
    for digitcount in {0..4}; do
        digits[$digitcount]=$(( RANDOM % 6 + 1 ))
    done
    echo -n "${digits[@]}" | sed 's/ //g'
}
In the above, we create an empty array called digits.
We then perform 5 iterations in a for loop - counting from 0 to 4.
- each iteration creates a digit in the array.
We count from zero because bash arrays are indexed from 0.
As output from the function, we echo the entire digit-array, passing it through sed to strip out the spaces that get put between each value.
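If you'd rather avoid the extra sed process, one possible alternative is to let printf join the digits. This is only a sketch of the same function, not what is in the attached scripts; the use of local is my own addition.

```shell
#!/usr/bin/env bash
# Variant of get_word_code that joins the digits without calling sed
function get_word_code {
    local digits=() digitcount
    for digitcount in {0..4}; do
        digits[digitcount]=$(( RANDOM % 6 + 1 ))
    done
    # printf reuses the '%s' format for each argument, so no spaces are added
    printf '%s' "${digits[@]}"
}

# Print one code followed by a newline
get_word_code; echo
```

It behaves the same as the version above from the caller's point of view, so nothing else in the script would need to change.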
By using the modulus operator on the value from $RANDOM we have made the pseudo-random number generation more efficient. Because we are simulating the roll of a 6 sided die, we use modulus 6 to derive our digit.
The modulus operator divides the random number (from $RANDOM) by 6 and gives us the remainder, which will be a value between 0 and 5. Because we want values between 1 and 6, we simply add 1.
That way you don't need the tr and awk calls in the get_digit function and there will be no need for the inefficient "until...do...done" loops that were in your get_a_word function, because you are now guaranteed to get a pseudo-random value between 1 and 6 every single time.
This simple change has taken out huge chunks of duplicate code and has removed the need to write to the digits.txt file - so there are fewer disk operations being performed too.
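One small caveat with the modulus approach: $RANDOM produces values from 0 to 32767, which is 32768 possible values, and 32768 isn't an exact multiple of 6. That means faces 1 and 2 come up very slightly more often than the others. If that bothers you, a rough sketch of a fix is to throw away the two highest values and roll again. The roll_die helper below is hypothetical and not in the attached scripts, and it does nothing to make $RANDOM itself any stronger as a random source.

```shell
#!/usr/bin/env bash
# Hypothetical helper: roll one die, rejecting the two values of $RANDOM
# (32766 and 32767) that would make faces 1 and 2 slightly more likely
function roll_die {
    local r
    while :; do
        r=$RANDOM
        (( r < 32766 )) && break
    done
    echo $(( r % 6 + 1 ))
}

# Print a single roll
roll_die
```

The loop only repeats for 2 values out of 32768, so in practice it almost never goes round more than once.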
The next part is quite straightforward:
Code:
clear
printf "\nPlease visit diceware.com for more info about passphrase security.\n"
printf "\nHow many words should the passphrase contain? "
read number_of_words
printf "\n"
# Ensure user entered a valid number
number_of_words=$(echo "${number_of_words}" | awk '/^[0-9]+$/')
[[ $number_of_words ]] || die "Invalid number entered..."
When I was looking at the original code, I thought - Why only 6, 8 or 10 word passphrases? Wouldn't it be better to allow the user to simply enter the number of words to generate?
So here we ask the user to enter the number of words they wish to generate and then perform some checks on the user-entered value.
The line:
Code:
number_of_words=$(echo "${number_of_words}" | awk '/^[0-9]+$/')
Checks whether the user's input was purely numeric. If the entire line is numeric, the value is unchanged, but if there are any non-numeric characters, $number_of_words will be set to blank/"".
The next line checks that we still have a value - if $number_of_words is blank, we call the die function and pass an error message.
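For what it's worth, the same check could also be done in pure bash using the =~ regex operator, which would save the call to awk. A minimal sketch, where is_whole_number is a name I've made up for the example:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if the argument is one or more digits
function is_whole_number {
    [[ $1 =~ ^[0-9]+$ ]]
}

# Try a few candidate inputs, including an empty string
for candidate in 8 12a "" -3; do
    if is_whole_number "$candidate"; then
        echo "'$candidate' is valid"
    else
        echo "'$candidate' is rejected"
    fi
done
```

In the script itself this could be used as something like: is_whole_number "$number_of_words" || die "Invalid number entered..."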
Now we are on the home straight. The final part of the script generates the passphrase in passphrase.txt and cats it to the screen:
Code:
printf "Dice\tWord list Match\n" > passphrase.txt
# Generate words
count=0
while [[ $count -lt "${number_of_words}" ]]
do
    \grep "$(get_word_code)" dicelist.txt | sed 's/ //g' >> passphrase.txt
    (( count++ ))
done
# Display results
cat passphrase.txt
printf "\n"
#rm passphrase.txt
The first printf command writes a header to the passphrase file "passphrase.txt".
Then we set up a counter "count" and a while loop to generate our required number of words.
The magic which generates each word is contained in a single grep command inside the loop.
There is quite a lot going on in that single line:
Code:
\grep "$(get_word_code)" dicelist.txt | sed 's/ //g' >> passphrase.txt
Firstly - we are calling \grep (not grep) - this bypasses any grep alias that the user might have set up that could affect the formatting of grep's output. That way we know we aren't going to get line-numbers added to the output from grep.
The first parameter to grep is "$(get_word_code)" which is the captured result from a call to the "get_word_code" function. So before grep is actually called, get_word_code is called to create the random code-number for the word.
The second parameter to grep is the path to the dicelist.txt which is the file containing our wordlist.
The net effect of the first part of the line is to generate a 5 digit code which is used as the search pattern in a grep of the wordlist.
The line matched by grep is passed/piped to sed, to strip the leading spaces before the code-number and the remainder of the line is appended to passphrase.txt via output redirection.
We then increment our counter and continue to generate random words in this manner until we have the required number of words.
Finally, the content of "passphrase.txt" is displayed on-screen.
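One thing to be aware of with the grep approach is that it matches the code anywhere on the line, not just in the code column. I haven't checked whether the word list contains any entries where that would matter, but if you want an exact match on the code, something along these lines might do it. This is only a sketch: lookup_word is a hypothetical helper, and it assumes each line of the list is a code followed by a word, separated by whitespace.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the line whose first field is exactly the code.
# Assumes each line of the list is "<code> <word>" separated by whitespace.
function lookup_word {
    awk -v code="$1" '$1 == code { print $1 "\t" $2 }' "$2"
}

# Tiny stand-in word list for the demo (entries taken from the sample output)
sample_list=$(mktemp)
printf ' 32456\thead\n 34614\tjoust\n' > "$sample_list"

lookup_word 34614 "$sample_list"
rm -f "$sample_list"
```

Because awk splits on whitespace and ignores leading spaces, there would be no need for the sed call to tidy up the line afterwards.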
Here is example output from running the modified version of the script:
Code:
Please visit diceware.com for more info about passphrase security.
How many words should the passphrase contain? 8
Dice Word list Match
32456 head
34614 joust
13311 bandit
56435 tansy
64624 xyz
55136 spa
11516 akin
25444 filch
Note: The Dice-numbers are no longer duplicated in the output.
The previous version output like this:
Code:
Dice Word list Match
22333 22333 dane
41231 41231 lug
# snip snip
If the original output was intentional and you want it to continue to appear as above, you can make the following changes (as per dice3.sh in the .zip):
1. In get_word_code, pipe the output from sed to tee:
Code:
# Generate a 5 digit random number based on the roll of a 6-sided dice
function get_word_code {
    # Start with an empty array (note: digits=[] would assign the string "[]")
    digits=()
    for digitcount in {0..4}; do
        digits[$digitcount]=$(( RANDOM % 6 + 1 ))
    done
    echo -n "${digits[@]}" | sed 's/ //g' | tee -a passphrase.txt
}
The addition of the pipe to tee -a appends the word-code to passphrase.txt, while still passing it through as the function's output.
2. In the \grep command in the main while loop, remove the call to sed:
Code:
# Generate words
count=0
while [[ $count -lt "${number_of_words}" ]]
do
    \grep "$(get_word_code)" dicelist.txt >> passphrase.txt
    (( count++ ))
done
Those 2 changes will yield the original output.
I just thought it looked a bit odd having the number appear twice. But it's your script, so I'll leave that decision up to you.
It probably took me a lot longer to write this post and explain what I did than it did to actually make the changes to your script.
I had a bit of fun looking at the code and making my "improvements" - if they can be called that. At some point, I'm sure someone else will weigh in with things that could be done better still!
Anyway - I've uploaded a zip containing your original script, the word-list, plus two versions with my changes.
dice2.sh contains my original changes, without the duplication of the dice-numbers
dice3.sh contains my changes, but retains the original output format.
Feel free to use, or ignore whatever you like, after all it's your script.
If there's anything you don't understand, or want further clarification on, feel free to give me a shout!