How To Prospect For Companies Without Google My Business Using Python

James Phoenix
James Phoenix

Google My Business is a local SEO directory and a vital marketing channel for local businesses as it helps them to acquire customers within their local search market.

Agencies are able to capitalise on business owners that still haven’t claimed their Google My Business listing.

These make for fresh, easy prospects to veteran SEO’s.

In this guide we will be creating a simple web scraper that will:

  • Find businesses that don’t have a knowledge panel with the Google Search Engine Results Page (SERP).

We’ll be using a mixture of selenium and pandas for this tutorial.


Loading Python Libraries

#Module Dependencies
import time
from time import sleep
import datetime
import selenium
import urllib
import re
from bs4 import BeautifulSoup
from random import randint
from random import uniform

#Import Selenium Dependencies
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.action_chains import ActionChains
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By

import pandas as pd

1. Let’s get the dataset that we need to scrape

df = pd.read_csv('website_data.csv')
df.head(15)

2. Let’s study a Google Search Query so that we can understand how to structure our target URL:

It’s important that we understand how to structure the URL so that we can dynamically inject our custom brand queries into google searches.

single_keyword_url = 'https://www.google.com/search?q=vidioh&oq=vidioh+&aqs=chrome..69i57j69i60j69i61l2j69i60l2.3654j0j1&sourceid=chrome&ie=UTF-8'

single_word_query = 'https://www.google.com/search?q={0}&oq={1}+&aqs=chrome..69i57j69i60j69i61l2j69i60l2.3654j0j1&sourceid=chrome&ie=UTF-8'.format('vidioh', 'vidioh')


multiple_keyword_url = 'https://www.google.com/search?q=video+brochures&oq=video+brochures&aqs=chrome..69i57j69i60j69i61.1545j0j1&sourceid=chrome&ie=UTF-8'

multiple_word_query = 'https://www.google.com/search?q={0}&oq={1}&aqs=chrome..69i57j69i60j69i61.1545j0j1&sourceid=chrome&ie=UTF-8'.format('video+brochures', 'video+brochures')

3. Identify a GMB knowledge Panel HTML Div

From my initial testing, this isn’t a 100% foolproof method for determining whether a business has a Google My Business listing, but for a 25 minute python script, it’s certainly a good start.

Method:

If there is a div with the class knowledge-panel on the web page, then we can assume that the business brand query is triggered by showcasing a Google My Business page. Therefore we can use this to help with our digital marketing prospecting for local businesses that have yet to invest within a Google My Business page.


4. Extract specific queries for every brand

We will remove all of the website extensions such as .org, .co.uk or .com by simply looking for the first mention of the character: .

df['Queries'] = df['Site'].apply(lambda x: x[0 : x.find('.')])
Awesome, so now we've got some brand names that can be used as queries inside of a custom google search via Selenium!

5. Scrape Google Search Engine Results Page with A BIG TIMER

driver = webdriver.Chrome(executable_path='chromedrivers/chromedriver')

#urls = ['ferryads' , 'matthewfuneralhome'] <-- I built the method to work on two search queries before moving to the entire 300+ list.

knowledge_panel_results = []

query_string = 'https://www.google.co.uk/search?source=hp&ei=_aegXcHTIsOVsAfNpLf4Bg&q={}&oq={}&gs_l=psy-ab.3..0i131j0l3j0i131j0l3j0i131j0.302585.302825..302898...0.0..0.46.172.4......0....1..gws-wiz.....0.GfS7vSMN0Qs&ved=0ahUKEwiBxtPaypTlAhXDCuwKHU3SDW8Q4dUDCAg&uact=5'

for url in list(df['Queries']):
    query = query_string.format(url, url)
    driver.get(query)
    
    try: 
        element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.CLASS_NAME, "knowledge-panel"))
            )
        knowledge_panel_results.append('True')
    except:
        knowledge_panel_results.append('False')
        
    sleep(randint(10,29))   

Then simply re-combine the list of True/False’s with your original dataframe and you’ll have some easy prospects to send digital marketing proposals to 🙂

Unleash Your Potential with AI-Powered Prompt Engineering!

Dive into our comprehensive Udemy course and learn to craft compelling, AI-optimized prompts. Boost your skills and open up new possibilities with ChatGPT and Prompt Engineering.

Embark on Your AI Journey Now!
df['Knowledge_Panel_Results'] = knowledge_panel_results

And voila! We can save the data to a CSV and forward it on to any business development executives / managers to start making prospecting calls.

df.to_csv('results.csv')

Taggedtutorial


More Stories

Cover Image for Soft Skills for Programmers: Why They Matter and How to Develop Them

Soft Skills for Programmers: Why They Matter and How to Develop Them

Overview You need a variety of soft skills in addition to technical skills to succeed in the technology sector. Soft skills are used by software professionals to collaborate with their peers effectively and profitably. Finding out more about soft skills and how they are used in the workplace will help you get ready for the job if you are interested in a…

James Phoenix
James Phoenix
Cover Image for What Are Webhooks? And How Do They Relate to Data Engineering?

What Are Webhooks? And How Do They Relate to Data Engineering?

Webhooks are a simple and powerful method for receiving real-time notifications when certain events occur. They enable a whole host of automated and interconnected applications. Broadly speaking, your apps can communicate via two main ways: polling and webhooks.  Polling is like going to a shop and asking for pizza – you have to ask whenever you want it. Webhooks are almost the…

James Phoenix
James Phoenix