How to extract URLs from network traffic on Linux?

postcd

Hello,

the web browser has no developer console, and the web page is built to stop working if one is enabled. The page source does not show any streamed video URL, so I was thinking I could run some Linux command to capture network traffic for, say, 5 minutes while the video plays and then extract all HTTP/HTTPS URLs from the capture (will I be able to read the URLs if the site uses HTTPS?). Does anyone know which command to use?


I can probably put together a sed command to extract the URLs, but I do not know which command to use for the network capture. I was not successful with the Wireshark GUI.
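On the capture side, tcpdump or tshark (the command-line counterpart of Wireshark) can record traffic for a fixed time. One caveat: with HTTPS the full URL is encrypted, so a packet capture only reveals hostnames (from DNS lookups and the TLS SNI field), not complete URLs. A minimal sketch, assuming tshark is installed and run with root privileges (-i any captures on all interfaces, 300 seconds = 5 minutes):

# plain-HTTP requests: print host and path while the video plays
tshark -i any -a duration:300 -Y http.request -T fields -e http.host -e http.request.uri

# HTTPS: only the server name from the TLS handshake is visible
tshark -i any -a duration:300 -Y tls.handshake.extensions_server_name -T fields -e tls.handshake.extensions_server_name | sort -u

On older Wireshark releases the TLS fields are prefixed ssl. instead of tls., so adjust the filter accordingly.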
 


You can download the source code of the website with wget or with curl. To extract the links, you can use grep.
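For example, a minimal sketch of the download step (the URL is a placeholder for the real page):

curl -s -o page.html 'https://example.com/video-page'
# or, equivalently, with wget:
wget -q -O page.html 'https://example.com/video-page'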

How to grep website links out of an HTML file, MULTIPLE METHODS:
grep -Eoi '<a [^>]+>'
or:
grep -Eo 'href="[^"]+"'
or:
grep -Eo '(http|https)://[^/"]+'
or:
grep -Eo "(http|https)://[a-zA-Z0-9./?_%:-]*" | sort -u
or:
grep -Po '(?<=href=")[^"]*(?=")'
or:
grep "href=" index.html | cut -d"/" -f3 | sort -u > links.txt
 
