Solved help with cut command

Solved issue

CaffeineAddict

I have a file with contents like this:

Aaronical /(@)/'r/A/n/I/k/@/l
Aaronite '/(@)/r/@/,n/aI/t
Aaronitic ,/(@)/r,/@/n/I/t/I/k
Aase '/A/s/i/
Ab /&/b
aba /@/'b/A/
abac '/eI/b/&/k
abaca ,/A/b/A/'k/A/
abaci '/&/b/@/,s/aI/
abaciscus ,/&/b/@/'s/I/sk/@/s
abacist '/&/b/@/s/I/st
aback /@/'b/&/k
Abaco '/&/b/@/,k/oU/

The objective is to keep only the first column, so that the garbage on the right is removed.

Bash:
cut -d' ' -f1 file.txt

Does not work as expected.

This is expected result:

Aaronical
Aaronite
Aaronitic
Aase
Ab
aba
abac
abaca
abaci
abaciscus
abacist
aback
Abaco

Why doesn't it work, and how can I fix it?
 


The cut command isn't working as expected because the delimiter you're using (-d' ') is a space, but the file contents have varying amounts of spaces and other characters between the columns.

You can try this with awk.

Code:
awk '{print $1}' file.txt

or sed.

Code:
sed 's/ .*//' file.txt

cut could work with this, if the first character after the first column was consistent.
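To illustrate the delimiter point, here is a minimal sketch of how cut and awk differ when the separator repeats. The sample line with two spaces is made up for the demonstration and is not taken from the file in the question.

```shell
#!/bin/bash
# Hypothetical sample line with two spaces between the columns.
line='abac  rest'

# cut treats every single space as a separator, so field 2 is the empty
# string that sits between the two spaces.
cut_field2=$(printf '%s\n' "$line" | cut -d' ' -f2)

# awk's default splitting collapses runs of blanks, so field 2 is "rest".
awk_field2=$(printf '%s\n' "$line" | awk '{print $2}')

# Field 1 is the same for both tools on this input.
cut_field1=$(printf '%s\n' "$line" | cut -d' ' -f1)

printf 'cut f2=[%s] awk f2=[%s] cut f1=[%s]\n' "$cut_field2" "$awk_field2" "$cut_field1"
```

Note that for the first field alone, both tools agree as long as the line does not start with a space.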
 
Sadly, neither of the two commands works. I'm using:

Bash:
sed 's/ .*//' mobypron.unc > mobypron.unc2
awk '{print $1}' mobypron.unc > mobypron.unc2

In the first case the output is the same as the input; in the second case, with awk, the output is just one line.
 
Spaces matter. Delimiter characters matter.

In your sed example, you are looking for (space)(period). But your string has no space, only a period.
Remove the space before the period, then it will work.
Code:
sed 's/\..*//' file.txt
(Since your original file had multiple lines, I changed it slightly to change all lines)

In your awk example, if you don't specify a delimiter, awk splits on whitespace by default. There is no space in your string.
So the whole string is $1.
Try it with either a space (i.e. mobypron unc) or else set a period as the delimiter.
Code:
awk -F'.' '{print $1}' file.txt
 
sed -En 's/[[:space:]]+.*$//gp' file.txt

This changes things a bit. What exactly are you trying to do?

Code:
Aaronical /(@)/'r/A/n/I/k/@/l
Aaronite '/(@)/r/@/,n/aI/t
Aaronitic ,/(@)/r,/@/n/I/t/I/k
Aase '/A/s/i/
Ab /&/b
aba /@/'b/A/
abac '/eI/b/&/k
abaca ,/A/b/A/'k/A/
abaci '/&/b/@/,s/aI/
abaciscus ,/&/b/@/'s/I/sk/@/s
abacist '/&/b/@/s/I/st
aback /@/'b/&/k
Abaco '/&/b/@/,k/oU/

Code:
sed 's/ .*//' test.file
Aaronical
Aaronite
Aaronitic
Aase
Ab
aba
abac
abaca
abaci
abaciscus
abacist
aback
Abaco
 
Code:
john.doe
jane.smith
alice.johnson
robert.brown
michael.davis
mary.miller
william.wilson
linda.moore
elizabeth.taylor
james.anderson

Code:
#!/bin/bash
# This script extracts the first names from a file with names in the format firstname.lastname

# Read the file and use 'cut' to extract the first names
cut -d '.' -f 1 sample_names.txt
 
I notice you're using -En in your last example. If you have more complex things, you may have to write a function that loops over each line.
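A per-line loop of the kind mentioned above could look like the following sketch. The function name first_words is hypothetical, and the loop assumes the input already uses ordinary line-feed line endings.

```shell
#!/bin/bash
# first_words: print the first space-separated word of every input line.
# (first_words is a hypothetical helper name, not something from the thread.)
first_words() {
    local line
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        # ${line%% *} removes the longest suffix that starts with a space.
        printf '%s\n' "${line%% *}"
    done
}

result=$(printf '%s\n' "Ab /&/b" "aba /@/'b/A/" | first_words)
printf '%s\n' "$result"
```

For a plain "keep the first column" job, sed, awk or cut are shorter; a loop like this mainly pays off when each line needs more involved handling.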
 
This command works
sed 's/ .*//' test.file

For the first text file example you gave.
Also, the commands in post #3 are working for me as well. (I did use different file names.)
 
This is strange: the sed command is as it should be, but why it isn't working is a mystery.

Hm... I don't know what the heck is wrong here, then. I'll try with just the sample I posted, because the actual file is much bigger.
 
OK, it's working with the sample I provided, but here is the full file, with which it doesn't:

If somebody can help figure out why it doesn't work I'd appreciate it!
 

Attachments

Alright, well... when I unzipped your text file, it didn't have hundreds of lines. It was all one huge text string without any line feeds in it at all. When I ran "cat mobypron.unc", all the output was one long line, thousands of characters long.

Also I noticed hundreds of ^M carriage returns. Was this file created in MS Windows?
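One way to confirm this kind of problem without opening an editor is to count the carriage-return and line-feed bytes. The sketch below builds a tiny CR-only sample for the demonstration; count_byte is a hypothetical helper name, and the sample entries are shortened from the ones in the thread.

```shell
#!/bin/bash
# count_byte FILE CHAR: count how often a single byte occurs in FILE.
# (count_byte is a hypothetical helper name.)
count_byte() {
    # tr -cd keeps only the given byte; wc -c counts what is left.
    tr -cd "$2" < "$1" | wc -c | tr -d ' '
}

# Build a small sample whose entries end in a carriage return only.
sample=$(mktemp)
printf '%s\r' 'Ab /&/b' "aba /@/'b/A/" "abac '/eI/b/&/k" > "$sample"

cr_count=$(count_byte "$sample" '\r')
lf_count=$(count_byte "$sample" '\n')
echo "CR: $cr_count  LF: $lf_count"

rm -f "$sample"
```

Many CRs with zero LFs points to CR-only line endings; roughly equal counts would suggest CRLF.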

First I had to run this script to fix the line endings.
Code:
#!/bin/bash

# Input file
input_file="mobypron.unc"
# Output file
output_file="output.txt"

# Replace carriage return characters with newline and save to output file
tr '\r' '\n' < "$input_file" > "$output_file"

echo "Replacement complete. Check the output file: $output_file"

I tried dos2unix first, but that didn't fix it. But once I fixed the carriage returns, this command worked.

Code:
sed 's/ .*//' output.txt
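The following sketch shows why line-oriented tools behave so differently before and after the conversion. It uses a made-up three-entry sample with CR-only endings, not the attached file itself.

```shell
#!/bin/bash
# Hypothetical three-entry sample with CR-only line endings.
sample=$(mktemp)
printf '%s\r' 'Ab /&/b' "aba /@/'b/A/" "abac '/eI/b/&/k" > "$sample"

# There are no line feeds, so wc -l reports 0 and sed sees one long line:
# everything after the first space is removed, leaving only "Ab".
lines_before=$(wc -l < "$sample" | tr -d ' ')
before=$(sed 's/ .*//' "$sample")

# After translating CR to LF, each entry is its own line.
after=$(tr '\r' '\n' < "$sample" | sed 's/ .*//')

echo "lines before: $lines_before"
printf 'before: %s\n' "$before"
printf 'after:\n%s\n' "$after"
rm -f "$sample"
```

dos2unix in its default mode targets CRLF pairs, which is consistent with it not helping on a file that contains lone CRs.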
 
This text file looked like this.
Code:
3-D '/T/r/i/d/i/^MA /eI/^Ma /eI/^Ma' /A/^Ma-be /eI/_b/i/^MA-Bomb '/eI/,b/A/m^Ma-c /eI/_s/i/^Ma-sea /eI/_s/i/^Ma-tiptoe /eI/_'t/I/p,t/oU/^Maa '/A//A/^MAachen '/A/k/@/n^MAalborg '/O/lb/O/rg^MAalesund '/O/l/@/,s/U/n^Maalii /A/'l/i//i/^MAalst /A/lst^MAalto '/A/lt/O/^MAandahl '/A/nd/A/l^MAaqbiye /A/k'b/i//j//E/^MAar /A/r^MAarau '/A/r/AU/^Maardvark '/A/rd,v/A/rk^Maardwolf '/A/rd,w/U/lf^MAargau '/A/rg/AU/^MAaron '/(@)/r/@/n^MAaron's-beard '/(@)/r/@/nz,b/i/rd^MAaronic /(@)/'r/A/n/I/k^MAaronical /(@)/'r/A/n/I/k/@/l^MAaronite '/(@)/r/@/,n/aI/t^MAaronitic ,/(@)/r,/@/n/I/t/I/k^MAase '/A/s/i/^MAb /&/b^Maba /@/'b/A/^Mabac '/eI/b/&/k^Mabaca ,/A/b/A/'k/A/^Mabaci '/&/b/@/,s/aI/^Mabaciscus ,/&/b/@/'s/I/sk/@/s^Mabacist '/&/b/@/s/I/st^Maback /@/'b/&/k^MAbaco '/&/b/@/,k/oU/^Mabaculus /@/'b/&/k/j//@/l/@/s^Mabacus '/&/b/@/k/@/s^Mabacus_major '/&/b/@/k/@/s_'m/eI//dZ//@/r^MAbadan /A/b/@/'d/A/n^MAbaddon /@/'b/&/d/-/n^Mabaft /@/'b/&/ft^MAbagtha /@/'b/&/g/T//@/^MAbailard Ab/eI/'lAR^Mabaised /@/'b/eI/st^MAbakumov /A/b/A/'k/u/m/O/f^Mabalone ,/&/b/@/'l/oU/n/i/^Mabalone_shell ,/&/b/@/'l/oU/n/i/_/S//E/l^Mabamp '/&/b,/&/mp^Mabampere /&/b'/&/mp/i/r^Mabandon /@/'b/&/nd/@/n^Mabandoned /@/'b/&/nd/@/nd^Mabandonedly /@/'b/&/nd/@/ndl/i/^Mabandoned_person /@/'b/&/nd/@/nd_'p/[@]/rs/@/n^Mabandoned_thing /@/'b/&/nd/@/nd_/T//I//N/^Mabandonee /@/,b/&/nd/@/'n/i/^Mabandon_hope /@/'b/&/nd/@/n_h/oU/p^MAbantes /@/'b/&/nt/i/z^Mabaptiston ,/&/b/&/p't/I/st/@/n^MAbarbarea /@/b/A/r'b/&/r/i//@/^MAbaris '/&/b/@/r/I/s^MAbas '/A/b/@/s^Mabase /@/'b/eI/s^Mabased /@/'b/eI/st^Mabase_yourself /@/'b/eI/s_/j//U/r's/E/lf^Mabash /@/'b/&//S/^Mabashedly /@/'b/&//S//I/dl/i/^Mabasia /@/'b/eI//Z//@/^Mabask /@/'b/A/sk^MAbassieh /A/b/A/'s/i//E/^Mabatage /A/b/@/'t/A//Z/^Mabate /@/'b/eI/t^Mabatement /@/'b/eI/tm/@/nt^Mabatic /@/'b/&/t/I/k^Mabatis '/&/b/@/,t/i/^Mabatises '/&/b/@/,t/I/s/I/z^Mabatjour AbA'/Z//u/R^Mabattage '/A/b/@/'t/&//Z/^Mabattoir ,/&/b/@/'tw/A/r^Mabaxial /&/b'/&/ks/i//@/l^Mabb /&/b^MAbba '/&/b/@/^Mabba /@/'b/A/^Mabbacy 
'/&/b/@/s/i/^MAbbai /A/'b/aI/^MAbbasid '/&/b/@/s/I/d^MAbbasside '/&/b/@/,s/aI/d^Mabbatial /@/'b/eI//S//@/l^MAbba_Eban '/A/b/A/_'/i/b/A/n^MAbbe '/&/b/i/^Mabbe /&/'b/eI/^Mabbes /&/'b/eI/z^Mabbess '/&/b/I/s^MAbbeville '/&/b/i/v/I/l^MAbbevillian /&/b'v/I/l/i//@/n^Mabbey '/&/b/i/^MAbbey '/&/b/i/^Mabbeystead '/&/b/i/,st/E/d^Mabbey_counter '/&/b/i/_'k/AU/nt/@/r^Mabbey_laird '/&/b/i/_l/(@)/rd^Mabbey_lubber '/&/b/i/_'l/@/b/@/r^MAbbe_condenser '/&/b/i/_k/@/n'd/E/ns/@/r^MAbbie '/&/b/i/^MAbbot '/&/b/@/t^Mabbot '/&/b/@/t^MAbbotsford '/&/b/@/tsf/A/rd^MAbbott '/&/b/@/t^MAbboud /A/'b/u/d^Mabbreviate /@/'br/i/v/i/,/eI/t^Mabbreviated /@/'br/i/v/i/,/eI/t/I/d^Mabbreviation /@/,br/i/v/i/'/eI//S//@/n^MAbby /&/b/i/^Mabb_wool /&/b_w/U/l^Mabcoulomb /&/b'k/u/l/A/m^MAbd-al-Kadir
 
When reading the thread and seeing it not working, I wondered if it was a case of the whole end-of-line thing. Lots of people use Windows when compiling these sorts of lists, so there are no traditional UNIX line endings.

Then I was like, "Nah... They'd have checked that already."

It's like a soap opera! Let's see what happens next!

Also, with a bit of knowledge, the Linux terminal has some great tools to manage text files. They're still useful today.
 
So the reason it didn't work in my case, despite running mac2unix, is that I merged the file with other files and then ran mac2unix on the bundled file.

Running mac2unix on the file prior to the merge made things work.

Bash:
mac2unix --force --newfile file1.txt file2.txt
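If mac2unix is not available, the same "normalize first, merge second" order can be sketched with tr alone. The helper name normalize_eol and the file names mac.txt and dos.txt are hypothetical. Note that tr -s also squeezes away blank lines, which is assumed to be acceptable for a word list like this one.

```shell
#!/bin/bash
# normalize_eol FILE: print FILE with CR and CRLF endings turned into LF.
# (Hypothetical helper; tr -s also removes empty lines as a side effect.)
normalize_eol() {
    tr -s '\r' '\n' < "$1"
}

workdir=$(mktemp -d)
printf '%s\r'   'Ab /&/b' 'aba /@/b'      > "$workdir/mac.txt"   # CR only
printf '%s\r\n' 'abac /eI/b' 'aback /@/b' > "$workdir/dos.txt"   # CRLF

# Normalize each part first, then merge and keep the first column.
merged=$(for f in "$workdir/mac.txt" "$workdir/dos.txt"; do
             normalize_eol "$f"
         done | cut -d' ' -f1)

printf '%s\n' "$merged"
rm -rf "$workdir"
```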

I appreciate your discovery @dos2unix :)
 
I was following along this thread from the top, not having peeked at the end to see that it was already solved, and at a glance, I couldn't fathom why cut didn't work. I would never have suspected a newline issue until I opened the file up in a text editor and saw that one whopping big line with a bunch of ctrl-Ms in it. Then I thought, "Hmph... one of those Macintosh people" :)

What's funny, to me, is that my text editor, on a typical text file, would have silently switched to Macintosh newline mode and displayed the file as expected and it would have been a small miracle if I had noticed the little "M" in the mode line. I guess because of all those "junk" characters, it didn't recognize it as a regular text file and so treated it as a binary with no newline translation.
 
Same here. I opened the file with Kate; it didn't appear as a single line, and there was no indication of the line-ending type. It appeared normal.
 

