Skip to content

Building an Automated AI Email Marketing Bot on Edurata.com

 

Part of the workflow
By Julian de Mourgues

Recently i acquired a dataset of german companies https://www.wer-zu-wem.de/ which is thousands of rows of public email addresses to market my product to. However i really didn’t want to go through all of them manually to send out a boilerplate message as i don’t think it honors the progress of AI in 2024.

The Idea

The core idea behind the bot was simple: automate the repetitive task of crafting and sending personalized emails. The objective was to extract email addresses stored in Airtable, analyze the domain of each recipient, scrape their website for relevant information, use ChatGPT to generate a customized pitch, send the email, and finally, update the records in Airtable and HubSpot. This automation would allow me to focus solely on responding to replies in my inbox, freeing up a significant portion of my time.

Tools and Technologies

1. Airtable: A versatile spreadsheet-database hybrid to store email addresses.
2. Edurata Platform: The automation platform to build and manage the bot.
3. Web Scraping Tools: Python libraries such as BeautifulSoup and Scrapy for extracting information from websites.
4. OpenAI’s ChatGPT: For generating personalized email content.
5. AWS SNS: For sending out emails.
6. HubSpot API: For updating and creating new contacts and companies.

The Workflow

As you might know, workflows are defined in yaml on the edurata platform. As we can see in the it runs on a regular schedule every hour from 8am to 8pm. It takes several inputs that are defined on the platform globally and pulled into the workflow, for example: ${variables.company_table_id}

Also we see a few steps defined, that we’ll have a closer look at later.

apiRevision: edurata.io/v1
name: pitch-bot
title: Pitch Bot
schedule: "0 8-20 * * *"
description: |
Pitch bot takes a list of companies, parses their website for info, generates pitches and then sends them to the company.
inputs:
company_table_id: ${variables.company_table_id}
sender_name: ${variables.ceo_name}
sender_company_name: ${variables.company_name}
sender_company_description: ${variables.company_description}
sender_email: ${variables.ceo_email}
interface:
inputs:
properties:
company_table_id:
type: string
description: |
The base id + table_id from which to take the companies
The table needs to have the following columns:
- name: string
- email: string
contact_info:
type: string
description: What is sent at the end of the email
sender_name:
type: string
description: The name of the sender
sender_company_name:
type: string
description: The company name
sender_email:
type: string
description: The email address from which the emails are sent
limit:
type: number
description: The amount of companies to send out emails for
default: 3
steps:
get-company-data:
...
add-website-from-email: # A bit of restructuring
...
extract-company-info:
...
generate-body:
...
generate-header:
...
send-message:
...
prepare_hubspot_contacts:
...
create-companies-hubspot:
...
create-contacts:
...
update-airtable:
...

Step-by-Step Implementation

1. Extracting Email Addresses from Airtable

The first step involved integrating Airtable with the Edurata platform to pull email addresses. Using Airtable’s API and a simple axios function, the first step is to pull a list of companies from a source table which then can be used further

The code in the workflow was referenced like this:

get-company-data:
source:
repoUrl: "https://github.com/Edurata/edurata-functions.git"
path: general/axios
dependencies:
url: "https://api.airtable.com/v0/${inputs.company_table_id}?filterByFormula=AND(NOT({processed}), NOT({email} = ''), NOT({company_name} = ''), NOT({first_name} = ''), NOT({last_name} = ''), NOT({title} = ''), NOT({MANational} = '0'))&maxRecords=${inputs.limit}"
headers:
Authorization: "Bearer ${secrets.AIRTABLE_API_KEY}"

This basically pulls the code at the repository here when deploying the workflow and provides the inputs, defined as dependencies

In practice this step produces output like this:

live output

2. Scraping the website

Once the email addresses were retrieved, the next task was to extract the domain from each email address (add-website-from-email) and scrape the corresponding website for only the text without the html tags. Note how it uses the foreach feature here in order to iterate over all entries that we prepared.

extract-company-info:
foreach: ${add-website-from-email.companies}
source:
repoUrl: "https://github.com/Edurata/edurata-functions.git"
path: etl/extract/extract-text-from-webpage
dependencies:
url: ${each.website}
CRAWL_PASSWORD: ${secrets.CRAWL_PASSWORD}

A glimpse into the code here using crawlbase as scraping provider and BeautifulSoup:

import requests
from bs4 import BeautifulSoup
import os
import urllib.parse

def call_proxy(url):
print("calling: " + url)
password = os.environ.get('CRAWL_PASSWORD')

oxy_url = f'https://api.crawlbase.com/?token={password}&url={urllib.parse.quote_plus(url)}&format=json'
print(oxy_url)
response = requests.get(oxy_url)
if response.status_code != 200:
print(f"Request failed with status code {response.status_code}")
print(response.reason)
return None
response_json = response.json()
return response_json["body"]

def extract_text_from_url(url):
html_content = call_proxy(url)
if not html_content:
return None

soup = BeautifulSoup(html_content, 'html.parser')
text_content = soup.get_text(separator='\n', strip=True)
return text_content

def handler(inputs):
url = inputs['url']
text_content = extract_text_from_url(url)
return {'text': text_content}

and the live output after deployment here:

the pure text content of the website of the recipient

3. Generating Personalized Pitches with ChatGPT

With the scraped data, the next step is to generate a personalized pitch using ChatGPT. The scraped content is sent to the AI model, which then crafted a bespoke email.

generate-body:
foreach: ${add-website-from-email.companies}
source:
repoUrl: https://github.com/Edurata/edurata-functions.git
path: etl/transform/chatgpt
dependencies:
API_KEY: ${secrets.OPENAI_API_KEY}
systemMessage: You are an assistant that extracts information from a text and generates an email body from it.
message: |
Ich bin ${inputs.sender_name}. Unser Unternehmen heißt ${inputs.sender_company_name}. Die Unternehmensbeschreibung lautet wie folgt:
Unser Unternehmen bietet flexible, kostengünstige Automatisierungslösungen für den Mittelstand im DACH-Raum auf eigener Plattform mit Allround Betrieb und Wartung. Dank unserer erfahrenen Berater und bausteinartigen Programmierung entwickeln wir skalierbare Software schnell und effizient, wobei der Quellcode stets bei Ihnen bleibt.


Ich möchte, dass du einen E-Mail-Text für einen unserer Kunden ${each.ceo_name} des Unternehmens ${each.company_name} generierst.

Analysiere den folgenden Text, der von der Website des Kunden ${each.company_name} extrahiert wurde.
Generiere einen E-Mail-Text auf deutsch in der folgenden Struktur:
- Angebot von kostenlosem einstündigem Beratungsgespräch durch mich zu Cloud und Softwarelösungen.
- Vorstellung von unserem Unternehmen und unseren Lösungen.
- Detaillierte Beispiele für potenzielle Lösungen durch uns für den Kunden.
- Die E-Mail sollte professionell sein aber auch spannend. Verwende keine Floskeln.
- Erwähne am Ende auch, dass die E-Mail von einer unserer KI-Lösungen generiert wurde und die tatsächlichen Anwendungsfälle in dem kostenlosen Beratungsgespräch besprochen werden können.
- Gib nur den E-Mail-Text mit Grußwort zurück ohne Betreff.

Text:
${extract-company-info[each.index].text}
Example output

As can be seen the email is not perfect and needs some prompt engineering as it still awkwardly reiterated the product of the recipient to themselves mostly..

4. Sending the Email

After generating the pitch, the bot uses a AWS SES function to send out the generated email

The function code is called here in the workflow with a few inputs, with metadata at the end of the email and the necessary secrets that are set outside the workflow in the platform globally. Notice also the foreach loop.

send-message:
foreach: ${generate-body}
source:
repoUrl: "https://github.com/Edurata/edurata-functions.git"
path: etl/load/send-ses
dependencies:
sender: ${inputs.sender_email}
to: ${get-company-data.response.data.records[each.index].fields.email}
subject: ${generate-header[each.index].response}
body: |
${each.response}
${variables.company_address},
Handelsregister: ${variables.company_registry}
VAT: ${variables.company_vat}
Web: ${variables.company_website}
AWS_ACCESS_KEY_ID: ${secrets.SHORT_STORY_KEY}
AWS_SECRET_ACCESS_KEY: ${secrets.SHORT_STORY_SECRET}

5. Updating Airtable and HubSpot

Finally, the bot updated the processed emails in Airtable and created new contacts and companies in HubSpot.

create-companies-hubspot:
source:
repoUrl: "https://github.com/Edurata/edurata-functions.git"
path: etl/load/hubspot/create-companies
dependencies:
HUBSPOT_API_KEY: ${secrets.HUBSPOT_API_KEY}
companies: ${prepare_hubspot_contacts.companies}
create-contacts:
foreach: ${prepare_hubspot_contacts.contacts}
source:
repoUrl: "https://github.com/Edurata/edurata-functions.git"
path: etl/load/hubspot/create-contacts
description: A function to create a contact or a list of contacts in HubSpot..
dependencies:
HUBSPOT_API_KEY: ${secrets.HUBSPOT_API_KEY}
contacts:
- email: ${each.email}
firstname: ${each.firstname}
lastname: ${each.lastname}
company_hubspot_id: ${create-companies-hubspot.result[each.index].company_id}
update-airtable:
foreach: ${send-message}
source:
repoUrl: https://github.com/Edurata/edurata-functions.git
path: general/axios
dependencies:
method: PATCH
url: https://api.airtable.com/v0/${inputs.company_table_id}/${get-company-data.response.data.records[each.index].id}
headers:
Authorization: Bearer ${secrets.AIRTABLE_API_KEY}
Content-Type: application/json
data:
fields:
processed: true

The hubspot functions can be found here

Benefits

This automated system has transformed my outreach process. Here are some key benefits:

1. Time Efficiency: The bot handles the entire process, allowing me to focus on responding to emails rather than crafting and sending them.
2. Personalization at Scale: Each email is tailored to the recipient, increasing the chances of engagement.
3. Accurate Tracking and Management: Automated updates to Airtable and HubSpot ensure that my records are always up to date.

Drawbacks

As can be seen in the generation process, the email is still not perfect and i am mostly spending time now honing the prompt in order for the email to avoid boilerplate sentences and boring information.

Also i deactivated the hubspot update after 1000 processed emails as i noticed that i am running in a higher tier at hubspot. So in the end i decided to update hubspot manually in case a contact was made.

Keep in touch!

Platform -> https://edurata.com

Consulting -> https://contact.edurata.com/de/consulting