Learning Outcomes
- To learn how to run Screaming Frog using the command line for Mac & Windows.
Screaming Frog (SF) is a fantastic desktop crawler that's available for Windows, Mac and Linux.
This tutorial is separated across multiple blog posts:
You'll learn not only how to easily automate SF crawls, but also how to automatically wrangle the .csv data using Python.
Then we'll create a data pipeline which will push all of the data into BigQuery to view it in Google Data Studio.
Finally, we'll step up the automation and upload our scripts to a Google Cloud virtual machine:
- The virtual machine will turn on every day at a specific time.
- Several Python scripts will automatically execute and will perform the following:
- A list of domains from a .txt file / via environment variables will be sequentially crawled.
- We'll wrangle the data and save it to BigQuery.
- Then the virtual machine will shut down after all of the domains have either completed or failed.
- The daily data will then be available via Google Data Studio.
In this blog post, you'll learn how to automate Screaming Frog with the command line!
The Command Line
Many daily activities such as opening/closing programs or requesting a web page can be completely automated via the command line.
If you'd like a detailed overview of the different types of commands you can use on your computer, I'd recommend viewing these guides:
- Mac/Linux Udemy Course
- Mac/Linux Cheatsheet
- Windows Udemy Course
- Windows Command Prompt Cheatsheet
Part 1 - Screaming Frog CLI
Mac Terminal + Screaming Frog
This part of the tutorial is only for macOS users. If you're using Windows, skip ahead to the Windows Command Prompt + Screaming Frog section.
Opening Terminal
First, open Terminal with the following steps:
- ⌘ Cmd + Space
- Type terminal
- Press enter
Useful Linux Commands:
Several useful commands include:
- cd ~ (cd allows you to change directory)
- pwd (pwd prints your current working directory)
- mkdir folder (mkdir allows you to create folders)
- clear (clear removes any previous text from your terminal)
How To Open Screaming Frog With The Terminal
Assuming that Screaming Frog is installed in the default location, you can run Screaming Frog with:
/Applications/Screaming\ Frog\ SEO\ Spider.app/Contents/MacOS/ScreamingFrogSEOSpiderLauncher
How To Create An Alias In Terminal
Now let's create a shortcut for the command that we just ran. This is called an alias.
All of your aliases need to be created inside either:
~/.bash_profile (Older Mac Terminals)
~/.zshrc (Newer Mac Terminals)
NB: You can easily find out whether you're on a new Mac terminal with:
echo $SHELL
If it says /bin/zsh, then you will need to update the .zshrc file instead.
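If you would rather make this check from a script, the rule above can be sketched in Python. This is only an illustration: rc_file_for_shell is a made-up helper name, and it assumes that any shell other than zsh should use .bash_profile.

```python
import os
from pathlib import Path


def rc_file_for_shell(shell_path: str) -> Path:
    """Return the startup file where aliases should go for the given shell."""
    shell_name = Path(shell_path).name  # "/bin/zsh" -> "zsh"
    if shell_name == "zsh":
        return Path.home() / ".zshrc"
    # Assumption: everything else is treated like an older bash terminal.
    return Path.home() / ".bash_profile"


if __name__ == "__main__":
    # SHELL is the same variable printed in the terminal check above;
    # the "/bin/bash" default is only a fallback for this sketch.
    print(rc_file_for_shell(os.environ.get("SHELL", "/bin/bash")))
```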
You can edit this file with either:
cd ~ && nano .bash_profile (Older Mac Terminals)
cd ~ && nano .zshrc (Newer Mac Terminals)
We'll create an alias called sf that will automatically run the Screaming Frog application.
Add the following to either your .bash_profile or .zshrc file (the backslashes escape the spaces in the application path):
alias sf="/Applications/Screaming\ Frog\ SEO\ Spider.app/Contents/MacOS/ScreamingFrogSEOSpiderLauncher"
Then hit:
CTRL + X (to exit nano), then Y (to confirm saving the file)
Enter
Now close your terminal and reopen it using:
- ⌘ Cmd + Space
- Type terminal
- Press enter
Now type:
sf
As you can see, we've now successfully created a shortcut for loading Screaming Frog.
How To See All Of The Commands:
You can easily get a list of all of the available commands with:
sf --help
How To Run A Crawl
If you want to open Screaming Frog and crawl a website use this:
sf --crawl {url}
For example if you wanted to crawl https://sempioneer.com:
sf --crawl https://sempioneer.com
You can use any URL or domain name that you'd like and the above commands will:
- Open your Screaming Frog Application.
- Crawl the desired domain.
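The same crawl can also be started from a Python script. Shell aliases such as sf are not visible to Python, so this minimal sketch calls the launcher by its full path, assuming the default install location used above. The names build_crawl_command and start_crawl are illustrative, not part of Screaming Frog. Passing the command as a list means the spaces in the application path need no escaping.

```python
import shlex
import subprocess
from typing import List

# Default install location from this tutorial (an assumption; adjust if yours differs).
SF_LAUNCHER = (
    "/Applications/Screaming Frog SEO Spider.app"
    "/Contents/MacOS/ScreamingFrogSEOSpiderLauncher"
)


def build_crawl_command(url: str, launcher: str = SF_LAUNCHER) -> List[str]:
    """Return the argument list equivalent to: sf --crawl <url>."""
    return [launcher, "--crawl", url]


def start_crawl(url: str) -> int:
    """Run the crawl and return the launcher's exit code."""
    return subprocess.run(build_crawl_command(url)).returncode


if __name__ == "__main__":
    # Only print the command here; call start_crawl(...) to actually run it.
    print(shlex.join(build_crawl_command("https://sempioneer.com")))
```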
How To Run Screaming Frog Headless (Without A Graphical User Interface)
It's possible for us to execute Screaming Frog without a graphical user interface by adding --headless:
sf --headless --crawl {url}
Additionally we can save the crawl by adding --save-crawl:
sf --headless --save-crawl --crawl {url}
NB: You will need to purchase a license to run Screaming Frog with the --save-crawl functionality.
An example would be:
sf --headless --save-crawl --crawl https://phoenixandpartners.co.uk/
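In a script, these flags can be switched on and off with ordinary function arguments. The sketch below only builds the argument list and does not run anything; build_headless_command is a hypothetical helper, and the launcher path is the default location assumed earlier in this post.

```python
from typing import List

# Default install location from this tutorial (an assumption; adjust if yours differs).
SF_LAUNCHER = (
    "/Applications/Screaming Frog SEO Spider.app"
    "/Contents/MacOS/ScreamingFrogSEOSpiderLauncher"
)


def build_headless_command(
    url: str, save_crawl: bool = False, launcher: str = SF_LAUNCHER
) -> List[str]:
    """Return the argument list for a headless crawl of url."""
    command = [launcher, "--headless"]
    if save_crawl:
        # --save-crawl needs a paid licence, as noted above.
        command.append("--save-crawl")
    # --crawl goes last so that the URL directly follows it.
    command += ["--crawl", url]
    return command


if __name__ == "__main__":
    for argument in build_headless_command(
        "https://phoenixandpartners.co.uk/", save_crawl=True
    ):
        print(argument)
```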
How To Export Data
Instead of only saving a crawl, we'll export the data to a specific folder by adding two extra arguments:
--output-folder (This argument lets you specify the folder where the crawl data will be exported).
--timestamped-output (This argument saves the output inside a time-stamped folder. Every crawl is saved as crawl.seospider, so the timestamp prevents a new crawl from overwriting an existing file).
- Find your username by typing whoami in Terminal (it is also the last part of the path printed by pwd in a new Terminal window). For example my username is: jamesaphoenix
- Go back to either your .bash_profile file or .zshrc file and create a new alias:
cd ~ && nano .bash_profile (Older Mac Terminals)
cd ~ && nano .zshrc (Newer Mac Terminals)
Then add the following alias to the bottom of your file:
alias sf-headless="/Applications/Screaming\ Frog\ SEO\ Spider.app/Contents/MacOS/ScreamingFrogSEOSpiderLauncher --headless --save-crawl --output-folder /Users/{username}/Desktop --timestamped-output --crawl"
Please remember to replace {username} with your true username!
Save your file and load up a new Terminal window and enter:
sf-headless example.org
You'll hopefully have a time-stamped folder on your desktop, and inside that folder you'll see a file called crawl.seospider
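Because --timestamped-output creates a new folder for every crawl, a script usually needs to find the newest one. Here is a minimal sketch, assuming the output folder mostly contains crawl folders. It picks the most recently modified sub-folder rather than parsing the folder name, since the exact timestamp format is not covered in this post. latest_output_folder is an illustrative helper name.

```python
from pathlib import Path
from typing import Optional


def latest_output_folder(output_folder: Path) -> Optional[Path]:
    """Return the most recently modified sub-folder, or None if there are none."""
    folders = [entry for entry in output_folder.iterdir() if entry.is_dir()]
    return max(folders, key=lambda entry: entry.stat().st_mtime, default=None)


if __name__ == "__main__":
    desktop = Path.home() / "Desktop"
    if desktop.is_dir():
        newest = latest_output_folder(desktop)
        if newest is not None:
            # crawl.seospider is the file name mentioned above.
            print(newest / "crawl.seospider")
```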
How To Export A Single Tab
As well as running a crawl, it's possible to automatically export the .csv files.
You can export any of the tabs shown in the Screaming Frog interface.
For example if we wanted to crawl a website and export a .csv file with all of the images without alt text, we would do the following:
sf --crawl {url} --output-folder /Users/{username}/Desktop/sf --export-tabs "Images:Missing Alt Text" --headless
The syntax for exporting tabs follows a generic structure:
--export-tabs "tab-parent:tab-child"
How To Export Multiple Tabs
You can also export multiple files at once by simply separating them with a comma:
"parent1:child1,parent2:child2,parent3:child3"
In order to see the parent:child relationships for the tabs, simply look at how they are nested inside the right panel of Screaming Frog.
Let's simultaneously extract duplicate title tags, missing title tags and missing meta descriptions:
sf --crawl {url} --timestamped-output --output-folder /Users/{username}/Desktop --export-tabs "Page Titles:Duplicate,Page Titles:Missing,Meta Description:Missing" --headless
For example my desired URL + username is phoenixandpartners.co.uk + jamesaphoenix:
sf --crawl phoenixandpartners.co.uk --timestamped-output --output-folder /users/jamesaphoenix/Desktop --export-tabs "Page Titles:Duplicate,Page Titles:Missing,Meta Description:Missing" --headless
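When the list of tabs lives in a script, the comma-separated value can be assembled rather than typed by hand. The helper below is a small sketch (export_tabs_value is a made-up name); the tab names are the ones used in the command above.

```python
from typing import Iterable, Tuple


def export_tabs_value(tabs: Iterable[Tuple[str, str]]) -> str:
    """Join (parent, child) pairs into the value expected by --export-tabs."""
    return ",".join(f"{parent}:{child}" for parent, child in tabs)


tabs = [
    ("Page Titles", "Duplicate"),
    ("Page Titles", "Missing"),
    ("Meta Description", "Missing"),
]
print(export_tabs_value(tabs))
```

If the command is later passed to Python's subprocess module as a list, this value goes in as a single list item, so the surrounding double quotes used in the terminal are not needed.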
How To Export Reports
You can also export reports.
The syntax is similar and uses the parent:child structure. However, if there is no child then only the parent name is required.
Here's an example where only the parent level is required:
sf --crawl {url} --timestamped-output --output-folder /Users/{username}/Desktop --save-report "Redirect & Canonical Chains" --headless
Here's an example where the parent:child structure is required:
sf --crawl {url} --timestamped-output --output-folder /Users/{username}/Desktop --save-report "Redirects:All Redirects" --headless
How To Perform Bulk Exports
We can also run the bulk exports!
An example where only a parent level is required:
sf --crawl {url} --timestamped-output --output-folder /Users/{username}/Desktop --bulk-export "All Images" --headless
An example where the parent:child structure is required:
sf --crawl {url} --timestamped-output --output-folder /Users/{username}/Desktop --bulk-export "AMP:All Inlinks" --headless
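Reports and bulk exports share the same naming rule: parent:child when there is a child, otherwise just the parent. That rule can be captured in a couple of small functions. This is a sketch with hypothetical helper names (export_name and export_arguments); the flags and export names are the ones shown above.

```python
from typing import List, Optional


def export_name(parent: str, child: Optional[str] = None) -> str:
    """Return "parent:child", or just "parent" when there is no child."""
    return f"{parent}:{child}" if child else parent


def export_arguments(flag: str, parent: str, child: Optional[str] = None) -> List[str]:
    """Return the flag and its value for --save-report or --bulk-export."""
    if flag not in ("--save-report", "--bulk-export"):
        raise ValueError(f"unexpected flag: {flag}")
    return [flag, export_name(parent, child)]


print(export_arguments("--save-report", "Redirect & Canonical Chains"))
print(export_arguments("--bulk-export", "AMP", "All Inlinks"))
```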
How To Create A Sitemap
If you're using a content management system such as WordPress, then I'd recommend using a plugin such as Yoast / TheSEOFramework / RankMath to automatically build your sitemap.xml files.
However if you're working with a headless CMS or a static website, you can automatically create sitemaps with Screaming Frog:
sf --crawl {url} --create-sitemap --output-folder /Users/{username}/Desktop --headless
How To Create Configuration Files
Configuration files allow you to tune the crawl speed, choose specific user agents, include or exclude specific pages and much more!
After changing the configuration inside of Screaming Frog, you can save it as a configuration file.
We can then apply that configuration file to a headless Screaming Frog crawl via the terminal.
Create Your Config File:
First open up Screaming Frog and go to Configuration > Spider > Extraction > Structured Data:
Then tick the following checkboxes:
- JSON-LD
- Microdata
- RDFa
Click OK.
Then you'll need to save the configuration file by:
File > Configuration > Save As
I will choose to call my file custom_crawl.seospiderconfig
Make sure to save it in a new folder on your desktop called config
How To Crawl With A Config File
Let's crawl the example site with our newly created configuration file:
sf --crawl {url} --config /Users/{username}/Desktop/config/{configname}.seospiderconfig --output-folder /Users/{username}/Desktop --save-report "Redirect & Canonical Chains" --headless
So in my example it would be:
sf --crawl https://phoenixandpartners.co.uk/ --config /users/jamesaphoenix/desktop/config/custom_crawl.seospiderconfig --output-folder /users/jamesaphoenix/desktop --save-report "Redirect & Canonical Chains" --headless
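A mistyped config path is easy to miss when everything runs unattended, so an automated script may want to check the file before starting the crawl. The following is a minimal sketch of such a check; config_arguments is an illustrative helper, and the path in the usage section simply mirrors the example above.

```python
from pathlib import Path
from typing import List


def config_arguments(config_path: Path) -> List[str]:
    """Return ["--config", path] after checking that the file looks usable."""
    if config_path.suffix != ".seospiderconfig":
        raise ValueError(f"not a .seospiderconfig file: {config_path}")
    if not config_path.is_file():
        raise FileNotFoundError(config_path)
    return ["--config", str(config_path)]


if __name__ == "__main__":
    path = Path.home() / "Desktop" / "config" / "custom_crawl.seospiderconfig"
    try:
        print(config_arguments(path))
    except FileNotFoundError:
        print(f"Save your configuration to {path} first")
```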
How To Crawl Text Files
It's possible to run Screaming Frog in list mode via the terminal.
Simply create a .txt file with a list of URLs that you'd like to crawl.
These can be from a single website or many websites. Save this .txt to your desktop.
The extra argument used here is --crawl-list, like so:
sf --export-tabs "Response Codes:Client Error (4xx)" --output-folder /users/{username}/desktop --headless --crawl-list /users/{username}/desktop/{filename}.txt
My example looks like this:
sf --export-tabs "Response Codes:Client Error (4xx)" --output-folder /users/jamesaphoenix/desktop --headless --crawl-list /users/jamesaphoenix/desktop/urlstocrawl.txt
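The URL list itself can be generated by a script. The sketch below writes one URL per line (an assumption about the expected file layout), dropping blank entries and duplicates on the way; write_url_list is a hypothetical helper name.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def write_url_list(urls: Iterable[str], path: Path) -> List[str]:
    """Write unique, non-empty URLs to path (one per line) and return them."""
    cleaned: List[str] = []
    for url in urls:
        url = url.strip()
        if url and url not in cleaned:
            cleaned.append(url)
    path.write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return cleaned


if __name__ == "__main__":
    # A temporary folder keeps this demo from touching your desktop.
    with tempfile.TemporaryDirectory() as folder:
        target = Path(folder) / "urlstocrawl.txt"
        urls = ["https://example.org/", "  ", "https://example.org/"]
        print(write_url_list(urls, target))
        # The file path is what you would pass after --crawl-list.
        print(["--crawl-list", str(target)])
```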
We've finished the Mac section. I hope that this post provides you with a good overview of how to get started with automating Screaming Frog.
Automation is powerful and I encourage you to practice your newfound super powers!
In the next post, you'll learn how to automatically wrangle your Screaming Frog data with Python + Pandas.
Windows Command Prompt + Screaming Frog
This section of the post is for Windows users. If you're using a Mac, see the Mac Terminal + Screaming Frog section above.
How To Use The Command Prompt
First, type Command Prompt into your Windows search bar and open it.
A Command Prompt window will appear.
Now that your Command Prompt is running, type start . and hit enter. This opens the current folder in File Explorer.
Creating Shortcuts In Windows
We'll create shortcuts that you can run via Command Prompt to automate Screaming Frog!
Let's store all of these shortcuts in a folder on our desktop.
Additionally we'll create a shortcut that will navigate to this specific shortcuts folder!
- Create a new folder on your desktop called screaming-frog-commands
- Go to your desktop, right click and then select New > Shortcut.
This will open a new window.
Change the following command so that {username} is replaced with your actual username:
"C:\Windows\System32\cmd.exe" /k cd "C:\Users\{username}\Desktop\screaming-frog-commands"
Then enter the command in the Type the location of the item box, click Next and save the shortcut as sf
An icon will have been saved onto your desktop.
After you click the icon, the shortcut that you entered above will be executed which will:
- Open command prompt.
- Navigate to the screaming-frog-commands folder on your desktop.
How To Open Screaming Frog
Next we need to figure out whether you're using a 32-bit or 64-bit version of Windows, as this changes the folder Screaming Frog is installed in.
Try the 32-bit location in Command Prompt:
cd "C:\Program Files\Screaming Frog SEO Spider"
If you receive the message "The system cannot find the path specified", then you'll need to use the 64-bit command:
cd "C:\Program Files (x86)\Screaming Frog SEO Spider"
To open Screaming Frog from your current working directory, type ScreamingFrogSEOSpiderCli.exe
Hopefully you'll now have just opened Screaming Frog from the command line!
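The "try one folder, then the other" check above can also be done from a script. This is a rough sketch that looks for the CLI executable in the two folders mentioned in this section; find_sf_cli is an illustrative name, and a custom install location would need to be added to the list.

```python
from pathlib import Path
from typing import Iterable, Optional

CLI_NAME = "ScreamingFrogSEOSpiderCli.exe"

# The two install folders mentioned above.
DEFAULT_FOLDERS = (
    Path(r"C:\Program Files\Screaming Frog SEO Spider"),
    Path(r"C:\Program Files (x86)\Screaming Frog SEO Spider"),
)


def find_sf_cli(folders: Iterable[Path] = DEFAULT_FOLDERS) -> Optional[Path]:
    """Return the CLI executable from the first folder that contains it, or None."""
    for folder in folders:
        candidate = Path(folder) / CLI_NAME
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    print(find_sf_cli() or "Screaming Frog was not found in the default folders")
```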
Close your Command Prompt and open the sf shortcut that we created earlier on. Then open this directory in the file explorer with:
start .
From this folder, let's create a new command line shortcut to speed up the process:
"C:\Windows\System32\cmd.exe" /k cd "C:\Program Files (x86)\Screaming Frog SEO Spider" & ScreamingFrogSEOSpiderCli.exe
NB: If you're running on a 32-bit version of Windows, simply change
"C:\Program Files (x86)\Screaming Frog SEO Spider" to "C:\Program Files\Screaming Frog SEO Spider"
Name this shortcut open-sf
Then close the Command Prompt and File Explorer, and navigate to your desktop.
Run the sf shortcut and enter open-sf.lnk
This should've opened Screaming Frog.
Basically how this works is:
- When we open our sf shortcut, we navigate into the screaming-frog-commands folder.
- Inside that folder is a shortcut called open-sf.lnk, which we ran by entering its name.
The open-sf shortcut is made up of two parts:
- "C:\Windows\System32\cmd.exe" /k (This opens Command Prompt)
- cd "C:\Program Files (x86)\Screaming Frog SEO Spider" & ScreamingFrogSEOSpiderCli.exe (This navigates to a specific folder and executes the Screaming Frog application).
Notice the & symbol, which ensures that the first command is executed and the second command is executed afterwards inside the Command Prompt.
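In a Python script, the cd "..." & program pattern is usually replaced by telling subprocess which working directory to use. The sketch below is one way to do that, not the only one; run_in_install_folder is a hypothetical helper. One difference worth knowing: if the folder does not exist, Python raises an error instead of carrying on to the second command.

```python
import subprocess
from pathlib import Path
from typing import List, Sequence


def run_in_install_folder(
    install_folder: Path, executable: str, arguments: Sequence[str] = ()
) -> subprocess.CompletedProcess:
    """Run install_folder/executable with install_folder as the working directory."""
    command: List[str] = [str(Path(install_folder) / executable), *arguments]
    # cwd= plays the role of the cd "..." step; the command is the part after &.
    return subprocess.run(
        command, cwd=str(install_folder), capture_output=True, text=True
    )


# Example (Windows), left commented out so nothing is launched by accident:
# result = run_in_install_folder(
#     Path(r"C:\Program Files (x86)\Screaming Frog SEO Spider"),
#     "ScreamingFrogSEOSpiderCli.exe",
#     ["--crawl", "https://example.org/"],
# )
# print(result.returncode)
```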
How To Run A Crawl
Now close Screaming Frog and the Command Prompt. Re-run your sf shortcut. In the following sections we'll be adding more arguments to our shortcut (open-sf.lnk) file:
Enter:
open-sf.lnk --crawl {url}
For example if you wanted to crawl https://phoenixandpartners.co.uk/ then it would be:
open-sf.lnk --crawl https://phoenixandpartners.co.uk/
Let's create another shortcut in the screaming-frog-commands folder and call it crawl:
"C:\Windows\System32\cmd.exe" /k cd "C:\Program Files (x86)\Screaming Frog SEO Spider" & ScreamingFrogSEOSpiderCli.exe --crawl
Also notice above how the last argument is --crawl, which means we will only need to pass a URL into this shortcut for it to successfully execute.
- Close everything down.
- Open your sf shortcut.
- Then enter:
crawl.lnk https://example.org/
This will then crawl from the above URL all via the shortcut!
How To Run Screaming Frog Headless (Without A Graphical User Interface)
We are going to add several extra arguments to our existing crawl shortcut:
It's possible for us to execute Screaming Frog without a graphical user interface (GUI), by adding the --headless argument:
1. Run the sf shortcut
2. Enter crawl.lnk {url} --headless
Additionally we can save the crawl by adding --save-crawl:
1. Run the sf shortcut
2. Enter crawl.lnk {url} --headless --save-crawl
NB: You will need to purchase a license to run Screaming Frog with the --save-crawl functionality.
How To Save A Crawl:
We can also save our crawls to a specific folder with the --output-folder argument. Additionally we can make sure that the created folder has a unique name by adding the --timestamped-output argument.
Let's see all of the commands in action without any shortcuts to easily see what's happening:
"C:\Windows\System32\cmd.exe" /k cd "C:\Program Files (x86)\Screaming Frog SEO Spider" & ScreamingFrogSEOSpiderCli.exe --headless --save-crawl --output-folder "C:\Users\{username}\Desktop" --timestamped-output --crawl
Then save this as a shortcut called save-screaming-frog-crawl
You can now easily access this by:
1. Run the sf shortcut
2. Enter save-screaming-frog-crawl.lnk {url}
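If you later drive this from Python rather than from shortcuts, the same command can be assembled with pathlib so that the backslashes and spaces are handled for you. This is a sketch under the assumptions used above: the (x86) install folder and a desktop at C:\Users\{username}\Desktop. build_save_crawl_command is an illustrative name.

```python
from pathlib import PureWindowsPath
from typing import List

# Install folder used in the shortcuts above (change it for the other location).
INSTALL_FOLDER = PureWindowsPath(r"C:\Program Files (x86)\Screaming Frog SEO Spider")


def build_save_crawl_command(
    url: str, username: str, install_folder: PureWindowsPath = INSTALL_FOLDER
) -> List[str]:
    """Return the headless save-crawl command as an argument list."""
    executable = install_folder / "ScreamingFrogSEOSpiderCli.exe"
    desktop = PureWindowsPath("C:/Users") / username / "Desktop"
    return [
        str(executable),
        "--headless",
        "--save-crawl",
        "--output-folder", str(desktop),
        "--timestamped-output",
        "--crawl", url,
    ]


if __name__ == "__main__":
    for part in build_save_crawl_command("https://example.org/", "jamesaphoenix"):
        print(part)
```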
Now that we've covered the basic crawling applications, let's explore how to export tabs, reports and bulk exports!
How To Export A Single Tab
As well as running a crawl, it's possible to automatically export the .csv files.
You can export any of the tabs shown in the Screaming Frog interface.
For example if we wanted to crawl a website and export a .csv file with all of the images without alt text, we would do the following:
1. Open your sf shortcut, then enter:
2. crawl.lnk {url} --output-folder "C:\Users\{username}\Desktop" --timestamped-output --export-tabs "Images:Missing Alt Text" --headless
The syntax for exporting tabs follows a generic structure:
--export-tabs "tab-parent:tab-child"
Exporting Multiple Tabs
You can easily export multiple tabs by separating them with a comma. Let's simultaneously extract duplicate title tags, missing title tags and missing meta descriptions:
--export-tabs "Page Titles:Duplicate,Page Titles:Missing,Meta Description:Missing"
1. Run your sf shortcut
2. crawl.lnk {url} --output-folder "C:\Users\{username}\Desktop" --export-tabs "Page Titles:Duplicate,Page Titles:Missing,Meta Description:Missing" --headless
To see the parent:child relationships for the tabs, simply look at how they are nested in the right panel of Screaming Frog.
For example my username is jamesaphoenix:
1. Run the sf shortcut.
2. crawl.lnk {url} --output-folder "C:\Users\jamesaphoenix\Desktop" --export-tabs "Page Titles:Duplicate,Page Titles:Missing,Meta Description:Missing" --headless
How To Export Reports
You can also export reports.
The syntax is similar and uses the parent:child structure. However, if there is no child then only the parent name is required.
Here's an example where only the parent level is required:
1. Run your sf shortcut
2. crawl.lnk {url} --timestamped-output --output-folder "C:\Users\{username}\Desktop" --save-report "Redirect & Canonical Chains" --headless
Here's an example where the parent:child structure is required:
1. Run your sf shortcut
2. crawl.lnk {url} --timestamped-output --output-folder "C:\Users\{username}\Desktop" --save-report "Redirects:All Redirects" --headless
How To Perform Bulk Exports
We can also run the bulk exports!
An example where only a parent level is required:
1. Run your sf shortcut
2. crawl.lnk {url} --timestamped-output --output-folder "C:\Users\{username}\Desktop" --bulk-export "All Images" --headless
An example where the parent:child structure is required:
1. Run your sf shortcut
2. crawl.lnk {url} --timestamped-output --output-folder "C:\Users\{username}\Desktop" --bulk-export "AMP:All Inlinks" --headless
How To Create A Sitemap
If you're using a content management system such as WordPress, then I'd recommend using a plugin such as Yoast / TheSEOFramework / RankMath to automatically build your sitemap.xml files.
However if you're working with a headless CMS or a static website, you can automatically create sitemaps with Screaming Frog:
crawl.lnk {url} --output-folder "C:\Users\{username}\Desktop" --headless --create-sitemap
How To Create Configuration Files
Configuration files allow you to tune the crawl speed, choose specific user agents, include or exclude specific pages and much more!
After changing the configuration inside of Screaming Frog, you can save it as a configuration file.
We can then apply that configuration file to a headless Screaming Frog crawl via the Command Prompt.
Create Your Config File:
First open up Screaming Frog and go to Configuration > Spider > Extraction > Structured Data:
Then tick the following checkboxes:
- JSON-LD
- Microdata
- RDFa
Click OK.
Then you'll need to save the configuration file by:
File > Configuration > Save As
I will choose to call my file custom_crawl.seospiderconfig
Make sure to save it in a new folder on your desktop called config
How To Crawl With A Config File
Let's crawl the example site with our newly created configuration file:
crawl.lnk {url} --config "C:\Users\{username}\Desktop\config\{configname}.seospiderconfig" --output-folder "C:\Users\{username}\Desktop" --save-report "Redirect & Canonical Chains" --headless
So in my example it would be:
crawl.lnk https://phoenixandpartners.co.uk/ --config "C:\Users\jamesaphoenix\Desktop\config\custom_crawl.seospiderconfig" --output-folder "C:\Users\jamesaphoenix\Desktop" --save-report "Redirect & Canonical Chains" --headless
How To Crawl Text Files
It's possible to run Screaming Frog in list mode via the Command Prompt.
Simply create a .txt file with a list of URLs that you'd like to crawl. These can be from a single website or many websites.
Save this .txt to your desktop.
The extra argument used here is --crawl-list. Because the crawl shortcut already ends with --crawl, we use the open-sf shortcut instead, like so:
open-sf.lnk --export-tabs "Response Codes:Client Error (4xx)" --output-folder "C:\Users\{username}\Desktop" --headless --crawl-list "C:\Users\{username}\Desktop\{filename}.txt"
My example looks like this:
open-sf.lnk --export-tabs "Response Codes:Client Error (4xx)" --output-folder "C:\Users\jamesaphoenix\Desktop" --headless --crawl-list "C:\Users\jamesaphoenix\Desktop\urlstocrawl.txt"
We've finished the Windows section. I hope that this post provides you with a good overview of how to get started with automating Screaming Frog.
Automation is powerful and I encourage you to practice your newfound super powers!
In the next post, you'll learn how to automatically wrangle your Screaming Frog data with Python + Pandas.