My AI content machine in Google Sheets (Part 2 of 2)

Last week, I told you about my 4 AI agents that create 80% of my social media content.

If you missed that episode, you can find it here.

If you haven't read it, I suggest you do so first, as it provides the context for this episode.

In today's episode, I will share step-by-step instructions on how I built these agents so that you can build your own.

Requirements for building these agents:

  • Advanced SQL skills
  • An OpenAI API account plus an API key

Let's begin!


Building the 4 agents involves the following steps:

  • 1) Plan and manage Social Media content in Notion
  • 2) Connect Notion to a BigQuery Data Warehouse
  • 3) Build a data model in the Data Warehouse
  • 4) Connect Google Sheets to the Data Warehouse
  • 5) Install an Apps Script in Google Sheets to connect to OpenAI
  • 6) Set up the agents in Google Sheets

STEP 1: Plan and manage Social Media content in Notion

I use Notion to create, structure, plan, and document all my social media content.

This is what that looks like for me:

I won't go too deep into each of the fields you see on the Notion card, as you will likely want to use different fields.

Choose fields that fulfill the following purposes:

  • Filter the right type of content so that the AI agent uses the right input data
  • Filter content for recency 
  • Filter well-performing content
  • Filter content by status (e.g. published vs idea stage)

When I started, Notion seemed like the best option, but if I were to do it again, I would probably use Airtable, as it is better suited for a wider range of automations. I might migrate in the future.

Theoretically, you can also manage your content in Google Sheets, but then you cannot easily display the content on a calendar, which I use a lot.

Note: Managing your content in Google Sheets spares you steps 2-5, and you don't need SQL skills to build this. But you lose a lot of functionality.

STEP 2: Connect Notion to a BigQuery Data Warehouse

I use Fivetran to pull data from Notion into Google BigQuery.

I have a Love/Hate relationship with Fivetran.

On the one hand, it is incredibly easy to use and enables anyone to pull data from one system and transfer it into another.

Fivetran manages the connection for you, which means that even if Notion makes changes to its API, your data stream won't be affected.

It is also one of the very few comparable tools that allow you to do so for free.

On the other hand, it can get very expensive very quickly once you outgrow the free tier. For my data volumes, though, the free tier is generous enough that I will likely never need to pay anything for this service - which is awesome.

STEP 3: Build a data model in the Data Warehouse

This step is a lot trickier than Step 2, as it involves transforming the raw Notion data into a useful structure.

Luckily, the data structure in Notion is pretty straightforward.

Every piece of content will be a page.

I structured my content pieces in a Notion database called 'Content Collection'. 

Each database has its own set of database properties (e.g. a Main POV in my case, see the example Notion page above).

Each database property shows up as one page property on every page (e.g. Main POV is a multi-select field).

Based on this standardized data structure, you can replicate the same code to extract each property.

You can see how I extract the property 'Main POV' below.
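
To give you an idea, here is a simplified sketch of that extraction in BigQuery SQL. The table name (notion.page) and the JSON properties column are assumptions about how the Fivetran connector lands the raw data; your schema may look different.

-- Sketch: extract the multi-select property 'Main POV' for every Notion page.
-- Assumes the raw pages land in a table notion.page with the page properties
-- stored in a JSON string column called properties.
-- A multi-select can yield one row per selected option.
select
  page.id as page_id,
  json_value(pov_option, '$.name') as main_pov
from notion.page as page,
  unnest(json_query_array(page.properties, '$."Main POV".multi_select')) as pov_option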

I am using dbt to transform the data in my BigQuery Data Warehouse.

Notion is not the only data source that goes in there. 

For example, I also add data from Google Analytics and my email tool ConvertKit. In this way, I can add other success metrics to each content piece, such as newsletter subscribers and leads generated.

By connecting these tools I can "guide" the AI to focus on content that is good at generating leads or subscribers and not only impressions and engagements.
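
As a sketch, the enrichment is just a set of joins on a shared content key (all table and column names below are illustrative, not the exact names in my warehouse):

-- Sketch: put every success metric on the same row as the content piece it belongs to.
select
  c.content_id,
  c.title,
  li.impressions,
  li.engagements,
  ga.sessions,
  ck.new_subscribers
from content_pieces as c
left join linkedin_post_stats as li on li.content_id = c.content_id
left join ga_sessions_per_post as ga on ga.content_id = c.content_id
left join convertkit_signups_per_post as ck on ck.content_id = c.content_id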

This is when LLMs really start being fun - when they have enough context to guide their reasoning!

STEP 4: Connect Google Sheets to the Data Warehouse

A lot of people don't know that this is possible.

In my opinion, this is THE killer feature of the Google Cloud environment: You can connect Google Sheets to your BigQuery Data Warehouse and build Pivot tables in Google Sheets on billions of rows in seconds! 

Now, I don't really have billions of rows in my Data Warehouse (yet) but it makes me sleep better to know that I could.

The other huge advantage of connecting Google Sheets to BigQuery is that you only need to connect Sheets to ONE data source to get all the information you need (provided you have loaded the data into your BigQuery Data Warehouse and transformed it into a usable format).

A true single source of truth for all your decision-making and the AI agents.  

Connecting Google Sheets to BigQuery is incredibly straightforward as you can see in the image below:

Simply 

  • Open a Google Sheet
  • Find "Data" in the navigation bar
  • Select Data connectors > Connect to BigQuery
  • Enter the BigQuery project and dataset to connect to
  • Enter your user credentials
  • Done!

STEP 5: Install an Apps Script in Google Sheets to connect to OpenAI

Next, head over to this website and grab your Google Sheets OpenAI Apps Script for free.

You will find all instructions on how to use it there.

In short, you will need to install this script in Google Sheets to access the OpenAI API from within Google Sheets.

STEP 6: Set up the agents in Google Sheets

Last Step! The most difficult steps are done! If you've made it this far, you're almost there!

As mentioned in my previous episode, I currently have 4 agents.

In this post, I want to focus on the first agent. 

This agent takes a long post and repurposes it into a shorter, more readable version. You can see it in the image below.

Columns A-C contain a pivot table that sits directly on the BigQuery data source. This table captures all my LinkedIn posts and I can use filters to make different selections.

In this case, I selected all posts with a certain length (> 1500 characters) that reached at least 5000 impressions, were published in the last 12 months, and revolved around a certain topic (my Control vs Agility Framework).
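
The pivot table filters handle this selection for me, but expressed directly against the underlying model it boils down to a query like this (table and column names are placeholders):

-- Sketch: the same selection as SQL; assumes published_at is a DATE column.
select post_text, impressions, published_at
from linkedin_posts
where char_length(post_text) > 1500
  and impressions >= 5000
  and published_at >= date_sub(current_date(), interval 12 month)
  and topic = 'Control vs Agility Framework'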

Column D contains the OpenAI prompts that will generate a new post for each of the old posts.

By dragging the formula in D3 down, I can create dozens of new posts in seconds. 

You can find the prompt in D3 below.

=Q(A3,"Extract the key message from this LinkedIn post and turn it into a shorter LinkedIn post with less than 100 words. Use concise language. Each sentence in the post should ideally be less than 10 words, maximum 15 words. No hashtags or emojis. Use a highly readable and engaging structure by separating paragraphs and using bullet points wherever necessary. Never more than 6 bullet points in a row. Create two strong hooklines. The hooks must appeal to data analysts, data engineers, data scientists, data leaders and CTOs, primarily in startups and scaleups. They must also appeal to startup founders and CMOs who believe in building data-driven companies.

Rules for hooks: 

- The first line must be very short, ideally less than 6 words and definitely not longer than 8 words.  

- Ideally, it includes a specific number. 

- But: Under no circumstances are you allowed to make up a random number such as 90% of companies don't do XYZ, unless you have a reference for your number 

- It should evoke a strong emotion in the reader. Consider the common enemy of the reader, the ""Old Guard of Data Analytics"" as described below 

- It should be either opinionated and polarizing or relatable 

- Create a line break after the first hook line 

- The second line must intrigue the reader to continue reading. 

- It should also be ideally less than 6 words and no longer than 8 words 

- The tone of voice should be educational. Not too humorous.  

Common Enemy: Old Guard of Data Analytics
- Wants to stop stakeholders from using Excel and rely only on dashboards
- Insists on centralized data teams
- Wants to integrate data from millions of sources all at once
- Spends weeks writing concepts
- Starts every project with a solution
- Wants to solve every problem with quantitative data
- Sticks religiously to data modeling best practices
- Thinks that data should report into the CTO
- Pushes responsibility for data quality away to data providers
- Forces stakeholders to use terrible tools and processes to get what they need

End the post with a CTA from the following csv. The rules are as follows: The csv contains calls to action = CTAs. Each CTA is separated by the string NEW CTA. You can choose a CTA that you think fits best to the content of the post. Some CTAs contain a section {choose_a_post_title}. Replace this variable with a post title of less than 6 words that you can choose. Use only lower case words in your post title.

"&CTAs!$C$2)

Note that the prompt works with two data inputs:

  • One is the original post in column A
  • The other is cell C2 in the sheet CTAs, which contains a list of CTAs (calls to action) that have been evaluated as successful (one way to build that list is sketched below)
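
If you also keep the CTAs and their performance in the warehouse, one way to build that list is a small aggregation query (the table name ctas and the columns cta_text and is_successful are illustrative):

-- Sketch: concatenate the successful CTAs into one string, separated by the
-- marker the prompt expects.
select string_agg(cta_text, '\nNEW CTA\n') as cta_list
from ctas
where is_successful

The resulting value can then live in cell C2 of the CTAs sheet, where the formula picks it up via CTAs!$C$2.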

This retrieval-augmented generation (RAG) is what makes these AI agents so powerful, as the agents:

a) use only information from my existing LinkedIn posts in my new posts

b) focus on posts that performed well

c) use my tone of voice and writing style

d) can access a variety of databases for different tasks (e.g. the CTAs database to add CTAs)

e) can do all of this within a tool that is easy to use (Google Sheets)

Now, that's it! 

I hope you found this valuable!

Whenever you need me, here's how I can help you:

Data Action Mentor Masterclass: Create massive business impact with your data team.

This class is built for ambitious data professionals and data leaders. I spent 17 years building data teams for high-growth companies such as Rocket Internet, Zalando, Takeaway.com, Lazada (acquired by Alibaba), and many more. In this class, I am sharing all my lessons, failures, and successes so that you can make your stakeholders and CEO happy and accelerate your data career.  

Impactful Data Teams for Scale-ups

I build data infrastructure and data teams with immediate business impact for global B2C scale-ups and grown-ups in e-commerce, insurance, fintech, and consumer subscription. My proven approach has helped dozens of scale-ups. I build the infrastructure at a fixed price and then empower the business to move rapidly from data to action. If you know a consumer internet scale-up that needs an impactful data team, hit me up!

Data Audits

I run Data Audits to support you in transforming your data team into a strong business partner. You will get a tailor-made list with action items that will help you create massive business impact with your data team.

Knowledge Base

I am committed to adding actionable, free content to our Data Action Mentor knowledge base to help you on your journey to create massive business impact with your data team.